Gregory M. Turner wrote:
Hmm, that pure parser business... is that for reentrancy? Or is that a side-effect of some larger "pureness"?
If they aren't portably generated, just do it ahead of time -- betcha AJ will have no problem checking in some pre-generated stuff so long as the sources to regenerate them are available to those who have the right tools.
The pureness is about reentrancy, though it may also be necessary so that two parsers don't have conflicting global variable names. I also thought about pre-generated parser code... that's probably a good idea if Alexandre agrees.
- make the database read/write - not immediately important, but necessary later
does MSI do that? I thought that was done by some funky tool... or does it use msi.dll to do the work?
There are some tools you can download from MSDN; however, MSI.DLL does all the work of reading and writing the database.
I don't have lots of time to work on this at the moment, so if somebody wants to have a go, I'm happy to help them out.
well considering I'm the only recipient of this, would I be the "somebody" you have in mind? ;) (just kidding of course ... as for me, my plate is pretty full right now, but I will of course take a look at it when I get the chance and see if there's anything I can do to help. Since I need to brush up on my parsing skills if I ever want to help Ove with widl, maybe I can find something to do in that arena.)
It's probably better that somebody who's looking for stuff to do does it, seeing you've got other stuff to do. I might do some work on it periodically, but it seems like it might be a fun project for somebody to take up.
Mike
Are there any documents describing the MSI file format/structures or is MSDN it?
thanks -mike
On Thu, 2003-08-07 at 06:30, Mike McCormack wrote:
Gregory M. Turner wrote:
Hmm, that pure parser business... is that for reentrancy? Or is that a side-effect of some larger "pureness"?
If they aren't portably generated, just do it ahead of time -- betcha AJ will have no problem checking in some pre-generated stuff so long as the sources to regenerate them are available to those who have the right tools.
The pureness is about reentrancy, though it may also be necessary so that two parsers don't have conflicting global variable names. I also thought about pre-generated parser code... that's probably a good idea if Alexandre agrees.
- make the database read/write - not immediately important, but necessary later
does MSI do that? I thought that was done by some funky tool... or does it use msi.dll to do the work?
There are some tools you can download from MSDN; however, MSI.DLL does all the work of reading and writing the database.
I don't have lots of time to work on this at the moment, so if somebody wants to have a go, I'm happy to help them out.
well considering I'm the only recipient of this, would I be the "somebody" you have in mind? ;) (just kidding of course ... as for me, my plate is pretty full right now, but I will of course take a look at it when I get the chance and see if there's anything I can do to help. Since I need to brush up on my parsing skills if I ever want to help Ove with widl, maybe I can find something to do in that arena.)
It's probably better that somebody who's looking for stuff to do does it, seeing you've got other stuff to do. I might do some work on it periodically, but it seems like it might be a fun project for somebody to take up.
Mike
Hi Mike,
As far as I know, MSDN is it. The code I posted was made by examining various .msi files.
Here's what I found (informal description of the MSI format):
****
An .msi file is an OLE structured store file.
Each table in the database is a single stream in the structured storage file. The stream names are derived from the table names using a simple 8-bit <-> 6-bit encoding, similar to base64 encoding.
Strings in the database are represented by a number, which is an offset into the string table. The string can be determined using the streams _StringData and _StringPool. The offset points to two 2-byte integers in _StringPool, one being the length of the string in bytes, and the other the number of occurrences. The string is found in _StringData by adding up all the previous lengths to generate an offset.
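To make the lookup concrete, here is a rough Python sketch of it. It is only an illustration of the description above, not code from msi.dll or Wine. The function name lookup_string is made up, and it assumes that the string number is an entry index into _StringPool, that the integers are little-endian, and that the length comes before the occurrence count in each entry. None of those details are stated above.

```python
import struct


def lookup_string(pool, data, index):
    """Return the raw bytes of string number `index`.

    `pool` and `data` are the contents of the _StringPool and
    _StringData streams. Assumed layout (not confirmed by the post):
    each pool entry is two little-endian 16-bit integers,
    (length in bytes, number of occurrences).
    """
    # Split the pool into (length, occurrences) pairs, 4 bytes per entry.
    entries = [struct.unpack_from("<HH", pool, pos)
               for pos in range(0, len(pool) - 3, 4)]
    if not 0 <= index < len(entries):
        raise IndexError("string number %d is not in the pool" % index)

    # The string starts where all the previous strings end.
    offset = sum(length for length, _count in entries[:index])
    length, _count = entries[index]
    return data[offset:offset + length]
```

With a pool holding the lengths 3, 5 and 2 and a data stream of b"FooHelloOk", string number 1 starts at offset 3 and is 5 bytes long. Since every lookup adds up the earlier lengths, a real reader would probably compute the offsets once and cache them.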
All general tables contain 2- or 4-byte numbers that are either offsets into the string table, or just integers. A set high bit in an integer value means that it is positive, not negative.
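The text above only says that a set high bit marks a positive value. One encoding with that property is offset binary, where a fixed bias is added before the value is stored. The Python sketch below assumes that scheme; both the bias and the name decode_int are guesses for illustration and are not confirmed by anything above.

```python
def decode_int(raw, size=2):
    """Decode a stored integer field of `size` bytes (2 or 4).

    Assumption: the value is stored with a bias of 0x8000 (2 bytes)
    or 0x80000000 (4 bytes), so raw values with the high bit set
    decode to non-negative numbers and the rest decode to negatives.
    """
    if size not in (2, 4):
        raise ValueError("size must be 2 or 4")
    bias = 1 << (size * 8 - 1)
    if not 0 <= raw < 2 * bias:
        raise ValueError("raw value does not fit in %d bytes" % size)
    return raw - bias
```

Under this assumption a raw 2-byte value of 0x8005 decodes to 5, and 0x7FFF decodes to -1.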
All the values in the first column are stored sequentially, followed by all the values in the second column, etc. This means that row inserts cannot be performed without moving most of the data in a table... the MSI database format appears to be optimized for space, not speed.
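As a sketch of what reading such a column-by-column layout might look like, the Python below rebuilds rows from a table stream. The function name read_rows is hypothetical, the values are assumed to be little-endian, and the column sizes are passed in by the caller since they are not stored in the table stream itself.

```python
import struct


def read_rows(stream, column_sizes):
    """Rebuild rows from a table stream stored column by column.

    `column_sizes` lists the width in bytes (2 or 4) of each column.
    Values are assumed to be little-endian unsigned integers.
    """
    if any(size not in (2, 4) for size in column_sizes):
        raise ValueError("column sizes must be 2 or 4 bytes")
    row_size = sum(column_sizes)
    if row_size == 0 or len(stream) % row_size:
        raise ValueError("stream length does not match the column sizes")
    row_count = len(stream) // row_size

    columns = []
    offset = 0
    for size in column_sizes:
        # All values of one column sit next to each other in the stream.
        fmt = "<" + ("H" if size == 2 else "I") * row_count
        columns.append(struct.unpack_from(fmt, stream, offset))
        offset += size * row_count

    # Transpose the list of columns into a list of rows.
    return list(zip(*columns))
```

Adding a row means adding one value to the end of every column run, so every column after the first has to shift, which is the cost mentioned above.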
The type of a field is defined by the "Type" field in the "_Tables" table, which also gives each column's size, its encoding (string/number) and whether it is a primary key.
If a stream is saved into a database field, it is stored as a normal stream in the structured storage file. The encoding of the stream name is done such that it doesn't clash with table names.
****
I think the fields in the database are fairly well defined in MSDN, just not the actual structure of the database.
Mike
Mike Hearn wrote:
Are there any documents describing the MSI file format/structures or is MSDN it?