Friday, June 06, 2008
Useless Database programming, a dbase inspiration
Having fond memories of PC-File from Buttonware, the classic database app that helped kickstart the shareware movement, I’ve been hacking up a little bit of C code to do a micro-RDBMS. First off, I didn’t want to just do some kind of dBASE reader. There is a tonne of those in existence already; no, it had to be a new wheel on an old axle! Second, I didn’t want to have multiple files: one file should contain everything needed, data structures, tables, indexes, etc.
The first step was realizing that I was effectively writing a file system driver as the bottom layer, something to manage small chunks of data in random places in the file.
The second hack was having dual data structures: the on-disk structure and the in-memory structure.
Current data types are:
* Auto ID (32bit unsigned integer)
* Integer (signed 32bit integer)
* Double (double...)
* String (of max 64kb)
* Date, Time, DateTime (uses data structure of basic month/day/year, hour/minute/second.)
* Foreign Key (table ID / Field ID).
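As a rough sketch (not the actual CodPiece source), the in-memory side of those types could be a tagged union in C. Every name here (`field_type`, `cp_value`, `cp_make_integer`, `cp_make_string`) is made up for illustration, and the widths of the date, time and foreign key members are assumptions. A 16-bit string length is also an assumption, and it tops out one byte short of a full 64kb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags, one per data type in the list above. */
typedef enum {
    FT_AUTO_ID, FT_INTEGER, FT_DOUBLE, FT_STRING,
    FT_DATE, FT_TIME, FT_DATETIME, FT_FOREIGN_KEY
} field_type;

/* Assumed layouts: plain month/day/year and hour/minute/second. */
typedef struct { uint16_t year; uint8_t month, day; } cp_date;
typedef struct { uint8_t hour, minute, second; } cp_time;

/* In-memory value: a tag plus a union holding the payload. */
typedef struct {
    field_type type;
    union {
        uint32_t auto_id;                     /* 32-bit unsigned */
        int32_t  integer;                     /* 32-bit signed */
        double   real;
        struct { uint16_t len; const char *bytes; } string;
        cp_date  date;
        cp_time  time;
        struct { cp_date date; cp_time time; } datetime;
        /* Field ID fits in 8 bits because a table has at most 255 fields;
           the 32-bit table ID is an assumption. */
        struct { uint32_t table_id; uint8_t field_id; } foreign_key;
    } as;
} cp_value;

cp_value cp_make_integer(int32_t v) {
    cp_value out = { .type = FT_INTEGER };
    out.as.integer = v;
    return out;
}

cp_value cp_make_string(const char *bytes, size_t len) {
    assert(len <= UINT16_MAX);  /* length must fit the assumed 16-bit field */
    cp_value out = { .type = FT_STRING };
    out.as.string.len = (uint16_t)len;
    out.as.string.bytes = bytes;  /* borrowed pointer, not copied */
    return out;
}
```

The on-disk form would be a separate, packed layout; this only covers the in-memory half of the dual structures.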
The filesystem layer manages things in 8-byte pages. You basically ask this layer for X bytes and it returns you a buffer that you throw data into, which gets marked dirty and is later flushed to disk.
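A minimal sketch of that ask-for-bytes, get-a-dirty-buffer, flush-later cycle might look like the following. It assumes a naive bump allocator with no free list, and the names (`store`, `chunk`, `store_alloc`, `chunk_flush`) are hypothetical rather than taken from the real code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_SIZE 8u

/* Number of 8-byte pages needed to hold nbytes (rounds up). */
uint32_t pages_for(uint32_t nbytes) {
    return nbytes / PAGE_SIZE + (nbytes % PAGE_SIZE != 0);
}

typedef struct {
    uint32_t first_page;   /* page number where the chunk starts */
    uint32_t page_count;   /* how many pages it covers */
    int dirty;             /* nonzero until written to disk */
    unsigned char *data;   /* in-memory buffer the caller fills */
} chunk;

typedef struct {
    uint32_t next_free_page;  /* assumed: pages are handed out in order */
} store;

/* Hand out a zeroed buffer of whole pages, already marked dirty. */
chunk *store_alloc(store *s, uint32_t nbytes) {
    assert(s != NULL);
    uint32_t n = pages_for(nbytes);
    chunk *c = malloc(sizeof *c);
    if (!c) return NULL;
    c->data = calloc(n ? n : 1, PAGE_SIZE);
    if (!c->data) { free(c); return NULL; }
    c->first_page = s->next_free_page;
    c->page_count = n;
    c->dirty = 1;
    s->next_free_page += n;
    return c;
}

/* Write a dirty chunk at its page offset; returns 0 on success. */
int chunk_flush(chunk *c, FILE *fp) {
    if (!c->dirty) return 0;
    /* fseek takes a long, so very large files would need a 64-bit seek. */
    long offset = (long)c->first_page * (long)PAGE_SIZE;
    if (fseek(fp, offset, SEEK_SET) != 0) return -1;
    size_t len = (size_t)c->page_count * PAGE_SIZE;
    if (fwrite(c->data, 1, len, fp) != len) return -1;
    c->dirty = 0;
    return 0;
}
```

Asking for 20 bytes here yields three pages (24 bytes), so small requests waste at most 7 bytes each.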
The real limitation (remember this is not some honking 64-bit thing, it’s... micro, wee for a better term) is that the complete size of a row is 64kb max. Even though strings have a 64kb size limit and you can have 255 fields per table, the combined on-disk size of the row cannot exceed 64kb. Obviously changing the row size holder from uint16_t to uint32_t would alleviate this, but I don’t see the point.
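To make the row limit concrete, here is a hedged sketch of the kind of check it implies. `row_fits` is a hypothetical helper, and it assumes the limit is whatever a uint16_t can hold (65535 bytes) and that the field count is capped at 255.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the combined field sizes fit in a uint16_t row size. */
int row_fits(const uint32_t *field_sizes, size_t field_count) {
    assert(field_sizes != NULL || field_count == 0);
    if (field_count > 255) return 0;      /* at most 255 fields per table */
    uint32_t total = 0;
    for (size_t i = 0; i < field_count; i++) {
        /* total never exceeds UINT16_MAX, so the subtraction cannot wrap */
        if (field_sizes[i] > UINT16_MAX - total) return 0;
        total += field_sizes[i];
    }
    return 1;
}
```

So two 40000-byte strings in one row would be rejected even though each is under the per-string limit.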
The maximum size the underlying data manager can handle is 32gb (2^32 * 8). Originally it was 2^16 * 8, which is only a paltry 512kb, but 512kb would have fit just fine on an Amiga 880kb floppy or a PC 720kb floppy. Maybe I will revert to using a 16-bit block number with an 8-byte page size (512kb) or a 16-byte page size (1mb)…
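The arithmetic is just the number of addressable blocks times the page size. A tiny helper (the name `max_store_bytes` is made up here) shows the three cases mentioned:

```c
#include <assert.h>
#include <stdint.h>

/* Largest file size for a given block-number width and page size. */
uint64_t max_store_bytes(unsigned block_bits, uint64_t page_size) {
    assert(block_bits < 64);  /* shifting by 64 or more is undefined */
    return ((uint64_t)1 << block_bits) * page_size;
}

/* max_store_bytes(32, 8)  == 34359738368  (32gb)
   max_store_bytes(16, 8)  == 524288       (512kb)
   max_store_bytes(16, 16) == 1048576      (1mb)  */
```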
Database, Field and Table names are limited to 48 characters, a table can have 255 fields and 2^32 rows.
Currently you can create DBs, tables and fields. You can add rows. The system will flush dirty information to disk and you can load the schema up from disk.
The next task will be to read the row data back into memory, then do the indexing info. Most of the code for reading row data back already exists, so it should be quite simple.
Its big bottleneck is that it operates as a form of singly linked list. If I need to update row N, I need to load the row header information for row M (the row that points to it) so I can re-point it to row N when a new disk block is allocated. The presence of a row index will alleviate this, though.
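The cost is easier to see in a sketch. This assumes each row header stores its own block number and the block of the next row, and it uses a plain array to stand in for headers that would really be read off disk one at a time. The names `row_header`, `find_predecessor` and `relocate_row` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NO_BLOCK UINT32_MAX   /* marks the end of the chain */

typedef struct {
    uint32_t block;       /* where this row currently lives */
    uint32_t next_block;  /* block of the following row, or NO_BLOCK */
} row_header;

/* Scan for the row whose next_block points at target_block.
   Each array element visited stands in for one header read from disk.
   Returns its position, or -1 if the target is the head (or absent). */
int find_predecessor(const row_header *rows, int count, uint32_t target_block) {
    for (int i = 0; i < count; i++)
        if (rows[i].next_block == target_block)
            return i;
    return -1;
}

/* Move row n to new_block and re-point its predecessor at it. */
void relocate_row(row_header *rows, int count, int n, uint32_t new_block) {
    assert(n >= 0 && n < count);
    int m = find_predecessor(rows, count, rows[n].block);  /* linear walk */
    rows[n].block = new_block;
    if (m >= 0)
        rows[m].next_block = new_block;
}
```

With a row index that maps a row to its predecessor's location, the linear walk in `find_predecessor` would become a single lookup.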
No, I have no plans for writing a SQL parser or any other fancy things. I’m not going to reimplement SQLite or anything; this is just a fun thing to spend a day or two on.
The name of this marvel is CodPiece.