There has been a considerable lack of blogging about deegree. Googling for it yields the results for 'degree blog' (note the missing e), and forcing the issue reveals only a blog that seems suitable only for adults. Sitting in Bolsena under the hot Italian sun and taking part in the Bolsena Code Sprint 2011 seems like a good occasion to change that.
So what's happening in the deegree world? Just now we're profiling our INSPIRE services. It turns out that the PostgreSQL JDBC driver handles bytea fields in a rather strange way. We're using these fields in our so-called BLOB storage (storing GML directly in the database with a couple of indexes). In theory, fetching the BLOBs (usually only a couple of KB each) should be fast enough, and it should be the actual GML parsing/exporting/rendering etc. that dominates the run time.
The nice thing about actually profiling things is that you find out where you have to replace the 'should' with 'does not'. For PostgreSQL 9 there seem to be two 'encodings' in which the actual bytes of a bytea field are fetched. Both transfer a string representation of each byte, one using an octal number, the other a hex number (the hex variant is a new 'feature' in PostgreSQL 9). Decoding involves a method call for each and every byte, in which the string is converted back into the actual byte value.
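To make the per-byte decoding described above a bit more concrete, here is a minimal sketch of what decoding the two text formats involves. This is not the driver's actual code: the class ByteaDecodeSketch and its methods are hypothetical names made up for illustration, and the sketch assumes the usual textual bytea layouts (hex: a leading \x followed by two hex digits per byte; octal: printable bytes as-is, a backslash doubled, everything else as a backslash plus three octal digits).

```java
// Hypothetical illustration only; NOT taken from the PostgreSQL JDBC driver.
final class ByteaDecodeSketch {

    // Converts one hex digit to its value. Called twice per decoded byte.
    static int hexDigit(char c) {
        int v = Character.digit(c, 16);
        if (v < 0) {
            throw new IllegalArgumentException("not a hex digit: " + c);
        }
        return v;
    }

    // Hex format: "\x" prefix, then two hex digits for every byte.
    static byte[] decodeHex(String s) {
        if (!s.startsWith("\\x") || s.length() % 2 != 0) {
            throw new IllegalArgumentException("not a hex bytea string");
        }
        byte[] out = new byte[(s.length() - 2) / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = hexDigit(s.charAt(2 + 2 * i));
            int lo = hexDigit(s.charAt(3 + 2 * i));
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Octal ("escape") format: literal bytes, "\\" for a backslash,
    // and "\ooo" (three octal digits) for everything else.
    static byte[] decodeEscape(String s) {
        byte[] out = new byte[s.length()]; // upper bound on the result size
        int n = 0;
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c != '\\') {
                out[n++] = (byte) c;
                i++;
            } else if (i + 1 < s.length() && s.charAt(i + 1) == '\\') {
                out[n++] = (byte) '\\';
                i += 2;
            } else {
                if (i + 3 >= s.length()) {
                    throw new IllegalArgumentException("truncated octal escape");
                }
                out[n++] = (byte) Integer.parseInt(s.substring(i + 1, i + 4), 8);
                i += 4;
            }
        }
        return java.util.Arrays.copyOf(out, n);
    }
}
```

Either way, the work grows with the number of bytes in the BLOB: every single byte costs at least one conversion from characters back to a value, which is exactly the kind of thing that shows up in a profiler once the BLOBs add up to tens of megabytes.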
Our test case was ~23000 features sized 1-2KB each. That results in something like 30 to 40 million calls to the method that decodes the bytes (roughly one call per byte of payload), and it consumes a lot of time.
So what are the options? PostgreSQL offers other ways to store large objects, so switching to one of these might be the better choice. But since PostgreSQL is open source, one might also try to have a closer look at the driver.
Another option would be to enjoy the beautiful view of the Lago di Bolsena more often, and not dig into other people's code...
For those who want to know more about deegree, have a look at our wiki. Posts that focus more on deegree itself will follow.
Stay tuned for other Bolsena Code Sprint stories!