Some Performance Numbers With Akka Async SQL

A while back I wrote about my small Async SQL library for Akka and its asynchronous MySQL driver. Besides a few tiny fixes, I’ve added nothing new recently.
However, I want to share a few numbers from my performance experiments.

Racing Stuff

Good Benchmarking, a Science in Itself

Good benchmarking is a small science in itself: creating a balanced benchmark, ensuring equal conditions across multiple runs, producing usable numbers, doing a proper analysis and so on. I didn’t do that. I just wrote some code, ran it and compared the numbers.
In the end it heavily depends on your scenario anyway. For example, when you run trivial queries against a local database, JDBC will probably win, since it only blocks a thread very briefly. If you add web-like latencies to the database communication, then my ADBCJ solution has a huge advantage.
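To make the latency argument concrete, here is a minimal sketch in plain Java. It is not the benchmark code and uses neither JDBC nor ADBCJ: `Thread.sleep` stands in for a blocking query, a timer stands in for a non-blocking one, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical comparison; sleeps and timers stand in for database latency.
class BlockingVsAsync {
    // Runs `calls` blocking "queries" of `latencyMs` each on a pool of `threads` threads.
    static long blockingMillis(int calls, int threads, long latencyMs) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            pending.add(CompletableFuture.runAsync(() -> sleep(latencyMs), pool));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        pool.shutdown();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Runs the same number of "queries" as timers, so no thread waits during the latency.
    static long asyncMillis(int calls, long latencyMs) {
        long start = System.nanoTime();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            pending.add(CompletableFuture.runAsync(() -> { },
                    CompletableFuture.delayedExecutor(latencyMs, TimeUnit.MILLISECONDS)));
        }
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Blocks the calling thread for roughly `ms` milliseconds.
    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With four calls of 50 ms on two threads, the blocking variant needs at least two rounds of waiting, while the timer variant can overlap all four. With sub-millisecond latencies the difference mostly disappears, which matches the local-database case described above.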

The Benchmark Scenario

This is basically the benchmark scenario I measured. It simulates a web-like application.

  • A request comes in
  • We query for the most important data and, at the same time, for some less important data. The more important data is returned asynchronously to the client first, while the other queries are still running. This should simulate a page that first shows the important stuff and loads the rest in the background, like many websites do.
  • When the first part of the result arrives at the client, another request is sent, which updates some data
  • When all queries, updates and operations complete, we’re done. We measure the time between issuing the request and completing everything
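The flow of one request can be sketched roughly like this. This is an illustration only, not the benchmark code: it uses plain `CompletableFuture` instead of Akka futures or ADBCJ, the queries are simulated with timers, and all names and delays are assumptions.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the benchmark flow; the "queries" are simulated
// with timers and do not touch a database, Akka, JDBC or ADBCJ.
class ScenarioSketch {
    // Simulates a query that completes with `result` after `millis`,
    // without blocking a thread while waiting.
    static CompletableFuture<String> simulatedQuery(String result, long millis) {
        Executor delayed = CompletableFuture.delayedExecutor(millis, TimeUnit.MILLISECONDS);
        return CompletableFuture.supplyAsync(() -> result, delayed);
    }

    // Runs one simulated request and returns the events in the order they completed.
    static List<String> runRequest() {
        List<String> events = new CopyOnWriteArrayList<>();

        // Start the important and the less important query at the same time.
        CompletableFuture<String> important = simulatedQuery("important", 2);
        CompletableFuture<Void> secondary =
                simulatedQuery("secondary", 20).thenAccept(events::add);

        // Once the important data has reached the client, issue the update.
        CompletableFuture<Void> update = important
                .thenCompose(data -> {
                    events.add(data);
                    return simulatedQuery("update", 2);
                })
                .thenAccept(events::add);

        // The request is done when every query and update has completed.
        CompletableFuture.allOf(secondary, update).join();
        return events;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> events = runRequest();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(events + " completed in " + elapsedMs + " ms");
    }
}
```

The measured time corresponds to the span between starting `runRequest` and the final `join`, which is the "issuing the request until everything completed" interval from the list above.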

For the plain JDBC implementation I used the Akka future API to run that work in the background and continue with the other operations. The ADBCJ implementation relies on its asynchronous API.
For these numbers I ran the database locally, so there is no huge round-trip overhead. Only the queries need a bit of time to complete, from under a millisecond up to around 6 milliseconds.

Result, JDBC vs ADBCJ

As expected, ADBCJ is faster for this task, mainly because it can send new queries while the results of previous queries are still streaming in. Maybe the JDBC implementation also ends up exhausting the thread pool with blocked threads, but I didn’t check whether that’s the case:

          JDBC        ADBCJ
Min       36.00ms     20.00ms
Median    47.00ms     32.00ms
Mean      66.26ms     41.25ms
Max.      222.00ms    155.00ms
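For reference, the four summary rows (minimum, median, mean, maximum) can be computed from raw latency samples with a few lines of Java. This is only a sketch with made-up names; it is not the script used to produce the numbers above.

```java
import java.util.Arrays;

// Hypothetical helper: summary statistics over latency samples in milliseconds.
class LatencyStats {
    static double min(double[] samples) {
        return Arrays.stream(samples).min().getAsDouble();
    }

    static double max(double[] samples) {
        return Arrays.stream(samples).max().getAsDouble();
    }

    static double mean(double[] samples) {
        return Arrays.stream(samples).average().getAsDouble();
    }

    // Middle value of the sorted samples; for an even count, the average
    // of the two middle values.
    static double median(double[] samples) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

A mean well above the median, as in the table, is what a handful of slow outliers produces: they pull the mean up but barely move the median.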

Here’s also a box plot of the data. There is quite a high variance in the data, and a few latency spikes:

JDBC vs ADBCJ

JDBC with Connection Pool, ADBCJ via JDBC bridge

Now this is a little unfair, since you usually use a connection pool for JDBC. We also haven’t looked at the ADBCJ via JDBC fallback driver yet. So I ran the benchmark with the BoneCP connection pool and with ADBCJ via JDBC as well.
Guess what, JDBC with a connection pool reacts the fastest on minimum and median, and is roughly on par with ADBCJ on mean and maximum. This means that a connection pool has a huge effect in this benchmark, since the benchmark opens and closes database connections often. The conclusion is that ADBCJ desperately needs its own connection pool implementation to handle similar scenarios.
Another interesting fact is that ADBCJ via JDBC performs the worst here. It’s no surprise that it is a bit slower than the plain JDBC version, because there I manually split the JDBC calls reasonably across threads, while the ADBCJ to JDBC bridge has no insight into how to dispatch the blocking calls smartly. However, the huge difference between JDBC and ADBCJ via JDBC is surprising. This could indicate that there are some implementation issues in the ADBCJ to JDBC bridge.
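To illustrate why a pool helps when connections are opened and closed often, here is a minimal sketch of the idea. It is neither BoneCP nor anything that exists in ADBCJ; the class, its methods and the generic connection type are hypothetical, and it blocks on borrow for simplicity, whereas a pool for an asynchronous driver would rather hand out connections through futures.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical minimal pool: not BoneCP and not part of ADBCJ.
class SimplePool<C> {
    private final BlockingQueue<C> idle;

    // Opens `size` connections up front via `factory`,
    // so the connect cost is paid once per connection.
    SimplePool(int size, Supplier<C> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Borrows a connection, runs `work` on it, and always puts the
    // connection back, even when `work` throws.
    <R> R withConnection(Function<C, R> work) {
        C conn;
        try {
            conn = idle.take(); // waits while all connections are in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
        try {
            return work.apply(conn);
        } finally {
            idle.add(conn);
        }
    }
}
```

With such a pool, ten requests against a pool of size two open two connections in total instead of ten. The fixed size also caps how many connections can be open at once.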

          JDBC with Pool    ADBCJ       JDBC        ADBCJ via JDBC
Min       17.00ms           20.00ms     36.00ms     40.00ms
Median    30.00ms           32.00ms     47.00ms     72.00ms
Mean      41.37ms           41.25ms     66.26ms     85.35ms
Max.      156.00ms          155.00ms    222.00ms    197.00ms

Again a box plot of this data. Again high variance and quite a few spikes:

JDBC Connection Pool, ADBCJ via JDBC

High Load Performance

I also wanted to see whether the picture changes when the load is much higher. So I cranked up the number of executed requests until the machine was fairly busy. The database is still running locally. The picture doesn’t change much; maybe there is a tendency for the performance advantage of ADBCJ to get bigger, but I didn’t do more scaling tests to verify that.
Also, the ADBCJ to JDBC bridge crashed, because it opened way too many connections. Well, as mentioned, the ADBCJ to JDBC bridge is more of a testing facility anyway.

          JDBC        ADBCJ
Min       41.00ms     20.00ms
Median    142.00ms    59.50ms
Mean      160.22ms    77.43ms
Max.      604.00ms    298.00ms

And again in a box plot:

Higher Load: JDBC vs ADBCJ

Conclusion

Well, don’t draw too many conclusions from these numbers. We can certainly see that ADBCJ can be faster for some scenarios. We can also clearly see that the missing connection pool for ADBCJ is a huge performance penalty for many applications.
