Making of the db4o LINQPad Driver: Meta-Data

This time I’m going to look at the meta-data required for the db4o driver. First I’ll show how to get the meta-data from db4o. Then I’ll explain some details of the driver’s internal representation of this information.

Side note: I’ve integrated lots of improvements into the driver. It should now work with (one-dimensional) arrays and auto-properties. Furthermore, you can optionally add the assemblies which contain the original classes.

Getting Meta-Data from db4o

The first step is to get the meta-data from the database. Most databases allow you to retrieve information about the stored data. In a relational database there are system tables which contain a description of all tables, views, constraints etc. db4o has an API to get this information. There are two main ways to get information about the stored data in db4o: the stored classes API and the known classes API.

What’s the difference between the stored classes and the known classes? The first only returns information about the classes of objects which are actually stored in the database. The second also contains information about classes which are otherwise known to db4o. Furthermore, the stored classes information is very low-level and minimalistic, while the known classes information is much richer. Also, db4o uses the information from the actual type if it can find it.

I use the known classes API to get the information I need.
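To make the two APIs a bit more concrete, here is a minimal sketch that lists both the stored and the known classes of a database. It assumes the db4o 8.x .NET API (Db4oEmbedded, Ext().StoredClasses(), Ext().KnownClasses()). The file name "database.db4o" and the class MetaDataDump are placeholders for illustration and are not taken from the driver.

```csharp
using System;
using Db4objects.Db4o;
using Db4objects.Db4o.Ext;
using Db4objects.Db4o.Reflect;

public static class MetaDataDump
{
    public static void Main(string[] args)
    {
        // "database.db4o" is a placeholder path, not the driver's configuration.
        using (IObjectContainer container = Db4oEmbedded.OpenFile("database.db4o"))
        {
            // Stored classes: the low-level view of what is in the file.
            foreach (IStoredClass stored in container.Ext().StoredClasses())
            {
                Console.WriteLine("stored: " + stored.GetName());
            }

            // Known classes: the richer view through db4o's reflection layer.
            foreach (IReflectClass known in container.Ext().KnownClasses())
            {
                Console.WriteLine("known: " + known.GetName());
                foreach (IReflectField field in known.GetDeclaredFields())
                {
                    // The field type may be missing for some entries, so guard against null.
                    IReflectClass fieldType = field.GetFieldType();
                    string typeName = fieldType == null ? "?" : fieldType.GetName();
                    Console.WriteLine("  " + field.GetName() + " : " + typeName);
                }
            }
        }
    }
}
```

The names printed here are the encoded type-names which the driver later has to parse.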

Converting to Driver Internal Representation

The first thing I do is to convert the meta-data to an internal representation. Why do I want to do that? The db4o driver cares about different things than db4o does. Therefore I want to have a representation which fits the driver’s needs. For example, the driver needs to know whether it knows the native type or whether it needs to generate a class. On the other hand, the driver doesn’t care about reflection abstractions, since that is done by LINQPad itself. It’s also handier to write tests against an internal representation which fits the driver’s needs.

Currently the implementation has three internal representations of a type. The first one is the ‘known’ type. That is used when the driver was able to find the original type, either because it’s a .NET type or because the user supplied the assemblies. The second one is the ‘simple’ class description, which represents types which couldn’t be found. That means we need to generate a replacement for them. The last one is the array type, which describes an array.
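The three representations could be modelled roughly like this. This is only a sketch to illustrate the idea. All names (ITypeDescription, KnownType, SimpleClassDescription, ArrayDescription, NeedsGeneratedClass) are hypothetical and the driver’s real classes may look different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names for illustration; not the driver's actual classes.
public interface ITypeDescription
{
    string Name { get; }
    // True when a replacement class has to be generated for this type.
    bool NeedsGeneratedClass { get; }
}

// The original type was found: a .NET type or one from a user-supplied assembly.
public sealed class KnownType : ITypeDescription
{
    public KnownType(Type clrType) { ClrType = clrType; }
    public Type ClrType { get; }
    public string Name { get { return ClrType.FullName; } }
    public bool NeedsGeneratedClass { get { return false; } }
}

// The original type could not be found, so only its name and fields are known.
public sealed class SimpleClassDescription : ITypeDescription
{
    public SimpleClassDescription(string name, IDictionary<string, ITypeDescription> fields)
    {
        Name = name;
        Fields = fields;
    }
    public string Name { get; }
    public IDictionary<string, ITypeDescription> Fields { get; }
    public bool NeedsGeneratedClass { get { return true; } }
}

// A one-dimensional array of some other described type.
public sealed class ArrayDescription : ITypeDescription
{
    public ArrayDescription(ITypeDescription elementType) { ElementType = elementType; }
    public ITypeDescription ElementType { get; }
    public string Name { get { return ElementType.Name + "[]"; } }
    // The array itself is never generated, but its element type might be.
    public bool NeedsGeneratedClass { get { return ElementType.NeedsGeneratedClass; } }
}
```

A nice side effect of such a representation is that tests can build descriptions by hand, for example new ArrayDescription(new KnownType(typeof(int))), without opening a database at all.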

Parsing Type-Names

Another strange thing the driver does: It parses the type-names. Why is that necessary? db4o encodes the assembly and generics information into the type-name, similar to the AssemblyQualifiedName of .NET types. With that name db4o can then find the original type when loading data from the database. Unfortunately the parse and resolve logic is deeply embedded in db4o and isn’t supposed to be a public API. That’s why I implemented my own type-name parser. For that I used the wonderful ‘Sprache’ library. It allows me to use LINQ to describe a parser. It’s perfect for cases where regular expressions are not powerful enough and you need a tiny parser. Take a look at this blog-post for an introduction.
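To give an impression of how a LINQ-style parser looks, here is a small sketch using Sprache. It is not the driver’s actual parser. It assumes names of the form "Namespace.Type, Assembly", with generic arguments in double brackets as in .NET’s assembly-qualified names; the exact format db4o uses may differ. The classes TypeName and TypeNameParser are hypothetical, and arrays are deliberately not handled.

```csharp
using System.Collections.Generic;
using System.Linq;
using Sprache;

// Hypothetical result type for illustration.
public sealed class TypeName
{
    public TypeName(string name, string assembly, IEnumerable<TypeName> genericArguments)
    {
        Name = name;
        Assembly = assembly;
        GenericArguments = genericArguments.ToList();
    }
    public string Name { get; }
    public string Assembly { get; }   // null when no assembly part is present
    public IList<TypeName> GenericArguments { get; }
}

public static class TypeNameParser
{
    // Namespace, type, or assembly name, e.g. "System.Collections.Generic.List".
    static readonly Parser<string> Identifier =
        Parse.Char(c => char.IsLetterOrDigit(c) || c == '.' || c == '_' || c == '+',
                   "identifier character").AtLeastOnce().Text();

    // Generic arity marker, e.g. "`1".
    static readonly Parser<int> Arity =
        from tick in Parse.Char('`')
        from digits in Parse.Number
        select int.Parse(digits);

    // One generic argument in brackets, e.g. "[Demo.Person, Demo]".
    // Parse.Ref defers the lookup, which allows the recursive definition.
    static readonly Parser<TypeName> BracketedType =
        from open in Parse.Char('[')
        from type in Parse.Ref(() => QualifiedType)
        from close in Parse.Char(']')
        select type;

    // The complete argument list, e.g. "[[A, X],[B, Y]]".
    static readonly Parser<IEnumerable<TypeName>> GenericArguments =
        from open in Parse.Char('[')
        from arguments in BracketedType.DelimitedBy(Parse.Char(',').Token())
        from close in Parse.Char(']')
        select arguments;

    // The trailing assembly, e.g. ", mscorlib".
    static readonly Parser<string> AssemblyPart =
        from comma in Parse.Char(',').Token()
        from name in Identifier
        select name;

    static readonly Parser<TypeName> QualifiedType =
        from name in Identifier
        from arity in Arity.Optional()
        from arguments in GenericArguments.Optional()
        from assembly in AssemblyPart.Optional()
        select new TypeName(
            name,
            assembly.IsDefined ? assembly.Get() : null,
            arguments.IsDefined ? arguments.Get() : Enumerable.Empty<TypeName>());

    // Throws Sprache's ParseException when the text doesn't match the grammar.
    public static TypeName ParseTypeName(string text)
    {
        return QualifiedType.End().Parse(text);
    }
}
```

With this sketch, parsing something like "System.Collections.Generic.List`1[[Demo.Person, Demo]], mscorlib" should yield a TypeName for the list with one generic argument describing Demo.Person from the assembly Demo.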

By the way: Currently I parse the type-names correctly to find out the generic arguments etc. However, generics aren’t properly handled in the further processing. Some cases work, others will lead to crashes.

Ready to Generate Types

So the LINQPad driver extracts meta-data from the database and then converts it to an internal representation. After that the driver is ready to generate the types and query-context required for LINQ-queries. But that’s the content of the next post =)
