Gregory M. Turner wrote:
The pureness is about reentrancy, though it may be necessary so that two parsers don't have conflicting global variable names. I also thought about pre-generated parser code... that's probably a good idea if Alexandre agrees.
There are some tools you can download from MSDN; however, MSI.DLL does all the work of reading and writing the database.
It's probably better that somebody who's looking for stuff to do does it, seeing you've got other stuff to do. I might do some work on it periodically, but it seems like it might be a fun project for somebody to take up.
Mike
Are there any documents describing the MSI file format/structures or is MSDN it?
thanks -mike
On Thu, 2003-08-07 at 06:30, Mike McCormack wrote:
Hi Mike,
As far as I know, MSDN is it. The code I posted was made by examining various .msi files.
Here's what I found (informal description of the MSI format):
****
An .msi file is an OLE structured store file.
Each table in the database is a single stream in the structured storage file. The stream names are mapped from the name of the table, using a simple 8bit <-> 6bit encoding, similar to base64 encoding.
Strings in the database are represented by a number, which is an offset into the string table. The string can be determined using the streams _StringData and _StringPool. The offset points to two 2-byte integers in _StringPool, one being the length of the string in bytes, and the other the number of occurrences. The string is found in _StringData by adding up all the previous lengths and generating an offset.
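The lookup described above could be sketched roughly as follows. This is only an illustration of the description, not code from MSI.DLL or from the posted patch: the function name lookup_string is hypothetical, and the byte order (little-endian), the order of the two integers (length first, then occurrence count) and the numbering of entries from 0 are all assumptions.

```python
import struct

ENTRY_SIZE = 4  # two 2-byte integers per _StringPool entry


def lookup_string(pool, data, index):
    """Return the raw bytes of string number `index`.

    Assumptions (not stated in the description): little-endian
    integers, each entry laid out as (length, occurrences), and
    entries numbered from 0.
    """
    if index < 0 or (index + 1) * ENTRY_SIZE > len(pool):
        raise IndexError("string index out of range")

    # Add up the lengths of all earlier strings to find where this
    # one starts inside _StringData.
    offset = 0
    for i in range(index):
        length, _occurrences = struct.unpack_from("<HH", pool, i * ENTRY_SIZE)
        offset += length

    length, _occurrences = struct.unpack_from("<HH", pool, index * ENTRY_SIZE)
    return data[offset:offset + length]
```

Since every lookup walks all earlier entries, a real reader would probably compute the offsets once when the table is loaded and keep them in memory.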
All general tables contain 2 or 4 byte numbers that are either offsets into the string table, or just integers. The high bit of an integer value means that it is positive, not negative.
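One reading of the high-bit remark is that integers are stored with a bias (offset binary), so that a stored value with the high bit set decodes to zero or a positive number. The sketch below assumes exactly that; the bias values and the name decode_int are guesses for illustration, not something confirmed by MSDN.

```python
def decode_int(raw, width):
    """Decode a stored 2- or 4-byte unsigned value into a signed integer.

    Assumption: the stored value is the real value plus a bias equal to
    the high bit (0x8000 for 2 bytes, 0x80000000 for 4 bytes).
    """
    if width not in (2, 4):
        raise ValueError("width must be 2 or 4")
    bias = 1 << (8 * width - 1)
    return raw - bias
```

Under this assumption a stored 0x8001 in a 2-byte column decodes to 1 and a stored 0x7FFF decodes to -1.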
All the values in the first column are stored sequentially, followed by all the values in the second column, etc. This means that row inserts cannot be performed without moving most of the data in a table... the MSI database format appears to be optimized for space, not speed.
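To make the column-by-column layout concrete, here is a rough sketch of turning such a stream back into rows. It assumes the stream holds nothing but the column data (no header), that the column widths in bytes are already known from the column definitions, and that values are little-endian; read_rows is a hypothetical name.

```python
def read_rows(stream, widths):
    """Rebuild rows from a table stream stored one column after another.

    `widths` lists the size in bytes of each column. The row count is
    derived from the stream length, assuming there is no header.
    """
    row_size = sum(widths)
    if row_size == 0 or len(stream) % row_size != 0:
        raise ValueError("stream length does not match the column widths")
    row_count = len(stream) // row_size

    columns = []
    offset = 0
    for width in widths:
        # All values of one column sit next to each other.
        column = [
            int.from_bytes(stream[offset + r * width:offset + (r + 1) * width], "little")
            for r in range(row_count)
        ]
        columns.append(column)
        offset += width * row_count

    # Transpose the columns back into rows.
    return list(zip(*columns))
```

The sketch also shows why inserts are expensive: adding one row changes the starting offset of every column after the first.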
The type of a field is defined by the "Type" field in the "_Tables" table. It also defines the size of each column in the database, its type, encoding (string/number) and whether it is a primary key or not.
If a stream is saved into a database field, it is stored as a normal stream in the structured storage file. The encoding of the stream name is done such that it doesn't clash with table names.
****
I think the fields in the database are fairly well defined in MSDN, just not the actual structure of the database.
Mike
Mike Hearn wrote:
Are there any documents describing the MSI file format/structures or is MSDN it?