Some Performance Numbers With Akka Async SQL
A while back I wrote about my small Async SQL library for Akka and its asynchronous MySQL driver. Besides a few tiny fixes I’ve added nothing new recently.
However, I want to share a few numbers from my performance experiments.
Good Benchmarking, a Science in Itself
Good benchmarking is a small science in itself: creating a balanced benchmark, ensuring equal conditions across multiple runs, producing usable numbers, doing a proper analysis, and so on. I didn’t do that. I just wrote some code, ran it, and compared the numbers.
In the end it depends heavily on your scenario anyway. For example, when you run trivial queries against a local database, JDBC will probably win, since it only blocks a thread very briefly. If you add web-like latencies to the database communication, then my ADBCJ solution has a huge advantage.
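To make the latency argument concrete, here is a minimal sketch. It is not the benchmark code, and it uses plain Java `CompletableFuture` instead of Akka or ADBCJ. The 50 ms round trip, the 8 queries, and all names (`LatencySketch`, `blockingMillis`, `nonBlockingMillis`) are made up for illustration. The blocking variant parks one pool thread per in-flight query, while the non-blocking variant only simulates a driver that completes a future when the answer arrives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class LatencySketch {
    static final int QUERIES = 8;          // assumption: queries per run
    static final long LATENCY_MS = 50;     // assumption: made-up "web-like" round trip

    // Blocking style: each query occupies a pool thread for the whole round trip.
    static long blockingMillis(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < QUERIES; i++) {
                pending.add(CompletableFuture.runAsync(() -> sleep(LATENCY_MS), pool));
            }
            CompletableFuture.allOf(pending.toArray(new CompletableFuture<?>[0])).join();
            return elapsedMillis(start);
        } finally {
            pool.shutdown();
        }
    }

    // Non-blocking style (simulated): no thread waits, a timer completes each future.
    static long nonBlockingMillis() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            long start = System.nanoTime();
            List<CompletableFuture<Void>> pending = new ArrayList<>();
            for (int i = 0; i < QUERIES; i++) {
                CompletableFuture<Void> done = new CompletableFuture<>();
                timer.schedule(() -> { done.complete(null); }, LATENCY_MS, TimeUnit.MILLISECONDS);
                pending.add(done);
            }
            CompletableFuture.allOf(pending.toArray(new CompletableFuture<?>[0])).join();
            return elapsedMillis(start);
        } finally {
            timer.shutdown();
        }
    }

    static void sleep(long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    static long elapsedMillis(long startNanos) {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }

    public static void main(String[] args) {
        System.out.println("blocking, 2 threads: " + blockingMillis(2) + "ms");
        System.out.println("non-blocking:        " + nonBlockingMillis() + "ms");
    }
}
```

With two threads, the eight blocking queries have to run in four rounds, so the blocking run takes roughly four round trips, while the simulated non-blocking run needs about one. With a near-zero latency the difference mostly disappears, which matches the point about trivial local queries.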
The Benchmark Scenario
This is basically the benchmark scenario that I measured. It simulates a web-like application.
- A request comes in
- We query for the most important data. We also query for some less important data. The more important data is returned asynchronously to the client first, while the other queries are still running. This should simulate a page that first shows the important stuff and loads the rest in the background, like many websites do.
- When the first part of the result arrives at the client, another request is sent, which updates some data.
- When all queries, updates and operations complete, we’re done. We measure the time between issuing the request and completing everything.
For the plain JDBC implementation I used the Akka future API to run the blocking work in the background and continue with the other operations. The ADBCJ implementation relies on its asynchronous API.
For these numbers I ran the database locally, so there is no huge round-trip overhead. Only the queries need a bit of time to complete, from under a millisecond up to around 6 milliseconds.
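The shape of one such request can be sketched roughly as follows. This is not the benchmark code: it uses Java's `CompletableFuture` as a stand-in for the Akka future API, and `blockingQuery` is a hypothetical placeholder that just sleeps instead of talking to a database. The sleep times are made up, picked from the "sub-millisecond to about 6 milliseconds" range mentioned above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ScenarioSketch {
    // Hypothetical stand-in for a blocking JDBC call: sleeps, then returns a label.
    static String blockingQuery(String name, long millis) {
        try {
            TimeUnit.MILLISECONDS.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return name;
    }

    // Runs one simulated request and returns the elapsed time in milliseconds.
    static long runRequest(ExecutorService pool) {
        long start = System.nanoTime();

        // The important and the less important data are queried concurrently.
        CompletableFuture<String> important =
            CompletableFuture.supplyAsync(() -> blockingQuery("important", 2), pool);
        CompletableFuture<String> secondary =
            CompletableFuture.supplyAsync(() -> blockingQuery("secondary", 6), pool);

        // Once the important data has arrived, the client sends the update,
        // while the secondary query may still be running.
        CompletableFuture<String> update =
            important.thenApplyAsync(result -> blockingQuery("update", 1), pool);

        // The request is done when every query and the update have completed.
        CompletableFuture.allOf(secondary, update).join();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println("request took " + runRequest(pool) + "ms");
        } finally {
            pool.shutdown();
        }
    }
}
```

In the JDBC variant every one of these steps blocks a pool thread for its duration. An asynchronous driver would return the futures directly, without the `supplyAsync` wrapper around a blocking call.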
Result, JDBC vs ADBCJ
Well, ADBCJ is, as expected, faster for this task, mainly because it can send new queries while it is still processing the incoming, streamed results of previous queries. Maybe the JDBC implementation also ends up exhausting the thread pool with blocked threads, but I didn’t check whether that’s the case:
Statistic | JDBC | ADBCJ |
Min | 36.00ms | 20.00ms |
Median | 47.00ms | 32.00ms |
Mean | 66.26ms | 41.25ms |
Max. | 222.00ms | 155.00ms |
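For reference, the four rows of such a table can be computed from the raw per-request timings with a few lines of code. This is a minimal sketch, not the analysis code behind the table. `LatencySummary` is a hypothetical helper, the sample values in `main` are made up, and the median of an even number of samples is taken as the average of the two middle values (an assumption).

```java
import java.util.Arrays;

final class LatencySummary {
    final double min;
    final double median;
    final double mean;
    final double max;

    private LatencySummary(double min, double median, double mean, double max) {
        this.min = min;
        this.median = median;
        this.mean = mean;
        this.max = max;
    }

    // Summarizes request durations given in milliseconds.
    static LatencySummary of(double[] samplesMs) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("need at least one sample");
        }
        double[] sorted = samplesMs.clone();
        Arrays.sort(sorted);
        int n = sorted.length;

        // Middle element for odd n, average of the two middle elements for even n.
        double median = (n % 2 == 1)
            ? sorted[n / 2]
            : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

        double sum = 0;
        for (double sample : sorted) {
            sum += sample;
        }
        return new LatencySummary(sorted[0], median, sum / n, sorted[n - 1]);
    }

    public static void main(String[] args) {
        double[] madeUpSamples = {10, 20, 30, 100}; // illustrative values only
        LatencySummary s = LatencySummary.of(madeUpSamples);
        System.out.printf("min=%.2fms median=%.2fms mean=%.2fms max=%.2fms%n",
            s.min, s.median, s.mean, s.max);
    }
}
```

A single slow request pulls the mean well above the median, which is the same pattern the numbers above show.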
Here’s also a box plot of the data. There is quite a high variance in the data, and a few latency spikes:
JDBC with Connection Pool, ADBCJ via JDBC bridge
Now, this is a little unfair, since you usually use a connection pool with JDBC. We also haven’t looked at the ADBCJ via JDBC fallback driver yet. So I also ran the benchmark with the BoneCP connection pool and with ADBCJ via JDBC.
Guess what: JDBC with a connection pool is the fastest. The connection pool has a huge effect in this benchmark because the benchmark opens and closes database connections often. The conclusion is that ADBCJ desperately needs its own connection pool implementation to handle similar scenarios.
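Why a pool helps so much here can be shown with a tiny sketch. It is neither BoneCP nor anything ADBCJ ships: `TinyPool` is a hypothetical, deliberately minimal pool that opens a fixed number of connections up front and hands them out again and again. The "connections" in the example are plain strings, so only the bookkeeping is shown, not real database traffic.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

final class TinyPool<C> {
    private final BlockingQueue<C> idle;

    // Opens `size` connections once; later requests reuse them.
    TinyPool(int size, Supplier<C> open) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(open.get());
        }
    }

    // Borrows a connection, runs the work, and always returns the connection.
    <R> R withConnection(Function<C, R> work) {
        C conn;
        try {
            conn = idle.take(); // waits if all connections are in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        try {
            return work.apply(conn);
        } finally {
            idle.add(conn); // never overflows: capacity equals the number of connections
        }
    }

    public static void main(String[] args) {
        AtomicInteger opened = new AtomicInteger();
        TinyPool<String> pool =
            new TinyPool<>(2, () -> "conn-" + opened.incrementAndGet());
        for (int i = 0; i < 100; i++) {
            pool.withConnection(conn -> conn.length());
        }
        System.out.println("connections opened: " + opened.get());
    }
}
```

In this example 100 requests share two connections, so the cost of opening a connection is paid twice instead of 100 times. A real pool also has to deal with broken connections, timeouts and sizing, which this sketch ignores.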
Another interesting fact is that ADBCJ via JDBC performs the worst here. I expected it to be a bit slower than the plain JDBC version, because there I manually split the JDBC calls sensibly across threads, while the ADBCJ to JDBC bridge has no insight into how to dispatch the blocking calls smartly. However, the huge difference between JDBC and ADBCJ via JDBC is surprising. This could indicate that there are some implementation issues in the ADBCJ to JDBC bridge.
Statistic | JDBC with Pool | ADBCJ | JDBC | ADBCJ via JDBC |
Min | 17.00ms | 20.00ms | 36.00ms | 40.00ms |
Median | 30.00ms | 32.00ms | 47.00ms | 72.00ms |
Mean | 41.37ms | 41.25ms | 66.26ms | 85.35ms |
Max. | 156.00ms | 155.00ms | 222.00ms | 197.00ms |
Again a box plot of this data. Again high variance and quite a few spikes:
High Load Performance
Well, I wanted to see whether the picture changes when the load is much higher. So I cranked up the number of executed requests until the machine was fairly busy. The database is still running locally. The picture doesn’t change much. Maybe there is a tendency for the performance advantage of ADBCJ to get bigger, but I didn’t do more scaling tests to verify that.
Also, the ADBCJ to JDBC bridge crashed, because it just opened way too many connections. Well, as mentioned, the ADBCJ to JDBC bridge is more of a testing facility anyway.
Statistic | JDBC | ADBCJ |
Min | 41.00ms | 20.00ms |
Median | 142.00ms | 59.50ms |
Mean | 160.22ms | 77.43ms |
Max. | 604.00ms | 298.00ms |
And again in a box plot:
Conclusion
Well, don’t draw too many conclusions from these numbers. We can certainly see that ADBCJ can be faster in some scenarios. We can also clearly see that the missing connection pool for ADBCJ is a huge performance penalty for many applications.