Some Performance Numbers With Akka Async SQL

By gamlerhart June 21, 2012 English, java, Technical Wibbly Wobbly

A while back I wrote about my small Async SQL library for Akka and its asynchronous MySQL driver. Besides a few tiny fixes, I’ve added nothing new recently.
However, I want to share a few numbers from my performance experiments.

Racing Stuff

Good Benchmarking, a Science in Itself

Good benchmarking is a small science in itself: creating a balanced benchmark, ensuring equal conditions for multiple runs, cranking out usable numbers, doing a proper analysis, etc. I didn’t do that. I just wrote some code, ran that stuff and compared the numbers.
In the end it heavily depends on your scenario anyway. For example, when you run trivial queries against a local database, JDBC will probably win, since it only blocks a thread very briefly. If you add web-like latencies to the database communication, then my ADBCJ solution has a huge advantage.

The Benchmark Scenario

This is basically the benchmark scenario I measured. It simulates a web-like application.

A request comes in.
We query for the most important data and, at the same time, for some less important data. The more important data is returned asynchronously to the client first, while the other queries are still running. This should simulate a page which first shows the important stuff and loads the rest in the background, like many websites do.
When the first part of the result arrives at the client, another request is sent, which updates some data.
When all queries, updates and operations are complete, we’re done. We measure the time between issuing the request and completing everything.

For the plain JDBC implementation I used the Akka future API to run that work in the background and continue with the other operations. The ADBCJ implementation relies on its asynchronous API.
For these numbers I ran the database locally, so there is no huge round trip overhead. Only the queries need a bit of time to complete, from sub-milliseconds up to around 6 milliseconds.
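
To make the flow of the scenario concrete, here is a rough sketch of the blocking variant. It is not the original benchmark code: it uses Java's `CompletableFuture` instead of the Akka future API, the "queries" are simulated with sleeps, and the names `ScenarioSketch`, `blockingQuery` and `runRequest` as well as the 2/6/1 ms durations are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch of the benchmark scenario; the queries are simulated with sleeps. */
final class ScenarioSketch {

    // Hypothetical stand-in for a blocking JDBC call that takes `millis` to complete.
    static String blockingQuery(String name, long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return name;
    }

    /** Runs one simulated request and returns the elapsed time in milliseconds. */
    static long runRequest(ExecutorService pool) {
        long start = System.nanoTime();

        // Both queries are started at once; each one blocks its own pool thread.
        CompletableFuture<String> important =
                CompletableFuture.supplyAsync(() -> blockingQuery("important", 2), pool);
        CompletableFuture<String> secondary =
                CompletableFuture.supplyAsync(() -> blockingQuery("secondary", 6), pool);

        // As soon as the important data has arrived, the follow-up update is issued.
        CompletableFuture<String> update =
                important.thenApplyAsync(r -> blockingQuery("update", 1), pool);

        // The request is done when everything has completed.
        CompletableFuture.allOf(secondary, update).join();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println("request took " + runRequest(pool) + " ms");
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that every in-flight query occupies one thread of the pool for its whole duration, which is exactly the cost an asynchronous driver tries to avoid.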

Result, JDBC vs ADBCJ

Well, ADBCJ is, as expected, faster for this task, mainly because it can send new queries while still processing the streaming results of previous queries. Maybe the JDBC implementation also ends up exhausting the thread pool with blocked threads, but I didn’t check whether that’s the case:

	JDBC	ADBCJ
Min	36.00ms	20.00ms
Median	47.00ms	32.00ms
Mean	66.26ms	41.25ms
Max.	222.00ms	155.00ms
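
For reference, summary statistics like the ones in the table can be computed from the raw per-request timings along these lines. This is only a sketch: the class name `LatencySummary` is hypothetical, and the sample values in `main` are invented, not the measured data.

```java
import java.util.Arrays;

/** Min, median, mean and max of a set of latency samples in milliseconds. */
final class LatencySummary {
    final double min;
    final double median;
    final double mean;
    final double max;

    LatencySummary(double[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samplesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        min = sorted[0];
        max = sorted[n - 1];
        // For an even number of samples the median is the average of the two middle values.
        median = (n % 2 == 1) ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
        mean = Arrays.stream(sorted).sum() / n;
    }

    public static void main(String[] args) {
        // Invented sample timings, only to show the usage.
        LatencySummary s = new LatencySummary(new double[] {36, 47, 40, 222});
        System.out.printf("min=%.2f median=%.2f mean=%.2f max=%.2f%n",
                s.min, s.median, s.mean, s.max);
    }
}
```

A single spike pulls the mean well above the median, which is the pattern visible in the numbers above.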

Here’s also a box plot of the data. There is quite a high variance in the data, and a few latency spikes:

JDBC vs ADBCJ

JDBC with Connection Pool, ADBCJ via JDBC bridge

Now this is a little unfair, since you usually use a connection pool with JDBC. We also haven’t looked at the ADBCJ via JDBC fallback driver yet. So I also ran the benchmark with the BoneCP connection pool and with ADBCJ via JDBC.
Guess what: JDBC with a connection pool reacts the fastest. This means that a connection pool has a huge effect in this benchmark, since the benchmark opens and closes database connections often. The conclusion is that ADBCJ desperately needs its own connection pool implementation to handle similar scenarios.
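
Why a pool helps so much here can be seen in a tiny sketch. This is neither BoneCP's API nor anything that exists in ADBCJ; `SimplePool` is a hypothetical, minimal pool that opens a fixed number of connections up front and then hands them out again and again instead of opening a new one per request.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Minimal fixed-size pool: connections are created once and then reused. */
final class SimplePool<C> {
    private final BlockingQueue<C> idle;

    SimplePool(int size, Supplier<C> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // the expensive "open connection" happens only here
        }
    }

    /** Takes an idle connection, waiting if all of them are currently in use. */
    C acquire() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
    }

    /** Returns a connection to the pool; throws if more are released than were acquired. */
    void release(C connection) {
        idle.add(connection);
    }

    public static void main(String[] args) {
        AtomicInteger opened = new AtomicInteger();
        // The "connection" is just a string here; a real factory would open a database connection.
        SimplePool<String> pool = new SimplePool<>(2, () -> "conn-" + opened.incrementAndGet());
        for (int request = 0; request < 100; request++) {
            String c = pool.acquire();
            pool.release(c);
        }
        System.out.println("connections opened: " + opened.get());
    }
}
```

A production pool additionally has to validate connections, replace broken ones and time out waiting callers, which is left out here.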
Another interesting fact is that ADBCJ via JDBC performs the worst here. It’s expected to be a bit slower than the plain JDBC version, because there I manually split the JDBC calls reasonably across threads, while the ADBCJ to JDBC bridge has no insight into how to dispatch the blocking calls smartly. However, the huge difference between JDBC and ADBCJ via JDBC is surprising. This could indicate that there are some implementation issues in the ADBCJ to JDBC bridge.
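
The general idea behind such a bridge can be sketched as follows. This is not the actual ADBCJ bridge code; `BlockingBridge` is a hypothetical class that wraps any blocking call in a future and runs it on a fixed-size worker pool, so the number of calls blocking at the same time is capped.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Exposes blocking calls as futures by running them on a bounded worker pool. */
final class BlockingBridge implements AutoCloseable {
    private final ExecutorService workers;

    BlockingBridge(int maxConcurrentCalls) {
        // At most this many blocking calls run at once; further calls wait in the queue.
        workers = Executors.newFixedThreadPool(maxConcurrentCalls);
    }

    /** Schedules the blocking call and immediately returns a future for its result. */
    <T> CompletableFuture<T> submit(Callable<T> blockingCall) {
        CompletableFuture<T> result = new CompletableFuture<>();
        workers.execute(() -> {
            try {
                result.complete(blockingCall.call());
            } catch (Exception e) {
                result.completeExceptionally(e);
            }
        });
        return result;
    }

    @Override
    public void close() {
        workers.shutdown();
    }

    public static void main(String[] args) {
        try (BlockingBridge bridge = new BlockingBridge(2)) {
            // The lambda stands in for a blocking JDBC query.
            CompletableFuture<String> row = bridge.submit(() -> {
                Thread.sleep(5);
                return "row";
            });
            System.out.println(row.join());
        }
    }
}
```

A bridge like this treats every call the same way: it cannot know which queries belong together or which one should go first, whereas hand-written code can split the work across threads deliberately.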

	JDBC with Pool	ADBCJ	JDBC	ADBCJ via JDBC
Min	17.00ms	20.00ms	36.00ms	40.00ms
Median	30.00ms	32.00ms	47.00ms	72.00ms
Mean	41.37ms	41.25ms	66.26ms	85.35ms
Max.	156.00ms	155.00ms	222.00ms	197.00ms

Again a box plot of this data. Again high variance and quite a few spikes:

JDBC Connection Pool, ADBCJ via JDBC

High Load Performance

Well, I wanted to see whether the picture changes if the load is much higher. So I cranked up the number of executed requests, so that the machine is fairly busy. The database is still running locally. The picture doesn’t change much; maybe there is a tendency for the performance advantage of ADBCJ to get bigger. But I didn’t do more scaling tests to verify that.
Also, the ADBCJ to JDBC bridge crashed, because it just opened way too many connections. Well, as mentioned, the ADBCJ to JDBC bridge is more of a testing facility anyway.

	JDBC	ADBCJ
Min	41.00ms	20.00ms
Median	142.00ms	59.50ms
Mean	160.22ms	77.43ms
Max.	604.00ms	298.00ms

And again in a box plot:

Higher Load: JDBC vs ADBCJ

Conclusion

Well, don’t draw too many conclusions from these numbers. We can certainly see that ADBCJ can be faster for some scenarios. We can also clearly see that the missing connection pool for ADBCJ is a huge performance penalty for many applications.

Tagged on: akka, database, java, Scala
