Async SQL and Akka

By gamlerhart May 17, 2012 java, software-development

I’ve already written several times about Akka and async programming with it. For example doing access files and webservices with it. So async API stack is complete, right? We’ve got async network and web access via NIO and frameworks like Play 2.0, Spray, HTTPClient etc. And with NIO 2 we easily can access the file system in asynchronous way. But wait, what about your database? What if tons of your data sits in a good old relational database? Unfortunately the current JDBC model is blocking by definition. No asynchronous operation is allowed.

The standard way of doing things for async SQL operations is to use some background threads and do things there. And for many applications this is good enough. But if you do more and more db operations you start to consume more and more threads, which will be waiting for blocking operations. Wouldn’t it be nicer if the database operations are truly asynchronous like other operations?

JDBC is nice, but let’s you wait

So I’ve looked for an alternative to JDBC. I’ve found two approaches to provide a asynchronous JDBC alternative. The async-mysql-connector and ADBCJ. Unfortunate both are not really actively maintained. Anyhow I decided to go ahead with ADBCJ. More about that later. Let’s first take a look at how we to use everything.

Async SQL in Akka

Let’s take a short look how to use that API in Akka. First we need to configure the database in our Akka configuration:

myapp{ async-jdbc{ url = "adbcj:mysql://localhost/employees" username = "root" password = "" } }

After that we can use the Database Akka extension to create a connection. And with that connection we then can do our database operations. The API is very JDBC like, but with the difference that all operations run asynchronously. Since they return Akka futures, you can compose the operations like your used to with Akka futures. Here an example:

// grab the extension val dbSupport = Database(actorSystem) val result = for{ // open the connection asynchronously connection <- dbSupport.connect() // Execute a query quest newestEmployee <- connection.executeQuery( "SELECT first_name,last_name,hire_date FROM employees " + "ORDER BY hire_date DESC " + "LIMIT 0,5") // When the results arrived, close the connection _ <- connection.close() } yield newestEmployee // Process the results result.onSuccess{ case resultSet =>{ resultSet.foreach{ row => println(row("first_name").getString + " "+row("first_name").getString + " since " + row("hire_date").getString ) } } }.onFailure{ case e:Exception =>{ e.printStackTrace() } }

A regular query returns a immutable ResultSet. Alternatively you can pass a event handler to the query method. That handler will be called as the data streams in. This way you can assemble the data yourself. For example it can build up a string:

val resultAsString = for { // open the connection asynchronously connection <- dbSupport.connect() // Execute a query quest dataAsString <- connection.executeQuery( "SELECT first_name,last_name,hire_date FROM employees " + "ORDER BY hire_date DESC " + "LIMIT 0,5", "") { // Instead of creating a result set, we also can directly react // as the data streams in case StartRow(jsonToBuild) => jsonToBuild + "-" case AValue(value, jsonToBuild) => jsonToBuild + "," + value.getString case EndRow(jsonToBuild) => jsonToBuild + "\n" } // When the results arrived, close the connection _ <- connection.close() } yield dataAsString // Process the results resultAsString.onSuccess { case stringData => print(stringData) }.onFailure { case e: Exception => { e.printStackTrace() } }

Of course transactions, updates and prepared statements are also supported:

val result = for { // open the connection asynchronously connection <- dbSupport.connect() _ <- connection.beginTransaction() insertedInfo <- connection.executeUpdate( "INSERT INTO employees (emp_no, birth_date, first_name, last_name, gender, hire_date) " + "VALUES (42, '1986-05-07', 'Roman', 'Stoffel', 'M', '2012-05-17')") statement <- connection.prepareQuery("SELECT first_name FROM employees WHERE emp_no = ?") queryResult <- statement.execute(42) _ <- connection.rollback() _ <- connection.close() } yield queryResult

Of course it’s annoying to manually close the connection every time. Therefore a .withConnection method exists. It will open a connection for you, run your code and close the connection afterwards. Everything is done asynchronously, therefore closure has to return a future, and the connection is closed when that future completes:

val result = Database(actorSystem) .withConnection{ connection =>connection.executeQuery( "SELECT first_name,last_name,hire_date FROM employees " + "ORDER BY hire_date DESC " + "LIMIT 0,5") }

And also a .withTransaction exists. It does the same for the transaction. It will commit it if the given closure finishes regular. It rolls the transaction back if the closure or the returned future fails.

connection.withTransaction{ txConn => txConn.executeQuery( "SELECT first_name,last_name,hire_date FROM employees " + "ORDER BY hire_date DESC " + "LIMIT 0,5") }

ADBCJ calls you back

The ADBCJ Stuff

Now the stuff above is just a thin wrapper around the ADBCJ API. It’s basically a JDBC clone, where all operations are asynchronous. That also means that it requires a driver which implement the operations asynchronously. Originally three drivers where around. A MySQL driver, a PostgreeSQL driver and a Bridge to JDBC. Of course the JDBC bridge cannot remove the blocking calls. It just just executes them on worker threads. So it doesn’t bring a real benefit.

Now unfortunately the ADBCJ implementations were only proof of concept implementations. So I forked it on Github, fixed the worst issues and implemented a few missing features. But I did that only for MySQL. So what’s the exact state of it?

The MySQL and JDBC bridge are up to date. The basics work: Transactions, queries, prepared statements etc.
The PostgreeSQL compiles, but I haven’t maintained it. Didn’t bother to run the tests. Prepared statements not implemented.
Many data type implementations are missing.
The test suite is not in a good shape. I’ve improved it a lot, but it is still very minimalistic. It just doesn’t cover enough.
Connection pool for ADBCJ connections is missing. (There was some implementation, but not tests. Removed it for now until I’ve test etc im Place.)
It’s just far away from rock solid.

So for a real world application, ADBCJ would need a lot of work. First to improve the MySQL implementation and second to support other databases. Unfortunately this is tedious database protocol implementation work, which isn’t fun. So unless someone is really desperate, it will never be done.

Hopefully, JDBC version 42.0 will include async operations, and async operations can be done easily ;).

Get the Stuff

The tiny ‘wrapper’ is on my github repo here: https://github.com/gamlerhart/akka-async-apis. My fork ADBCJ is here: https://github.com/gamlerhart/adbcj. Also the stuff is on my github hosted Maven Snapshot repository. You can grab it via SBT or Maven:

Repository for Maven: https://github.com/gamlerhart/gamlor-mvn/raw/master/snapshots
GroupID: info.gamlor.akkaasync
ArtifactID: akka-dbclient_2.9.1
Version: 1.0-SNAPSHOT

And you also need the MySQL driver:

GroupID: org.adbcj
ArtifactID: mysql-async-driver
Version: 0.3-SNAPSHOT

So via SBT:

The Future

Well, most applications probably don’t have a real need for an asynchronous JDBC replacement. Connection pools and running enough threads is good enough. And many applications can use enough caching to reduce the database operation pressure. So the niche for this stuff is small =).

Next time I probably shine some more light on ADBCJ itself.

Tagged on: adbcj, akka, database, java, Scala

5 thoughts on “Async SQL and Akka”

Maurício Linhares May 21, 2012 at 1:23 am

This is definitely great ;D

I have started a side project to do something like that, but since AJDBC was dead for such a long time I decided to build it from scratch instead of reusing what was available, so I built an async PostgreSQL driver using Netty and it’s close to be feature complete, I’m missing a couple o type parsers and authentication options, but it’s already running common and prepared statements:

https://github.com/mauricio/postgresql-netty

Do you think it’s worthwhile to reuse AJDBC?
gamlerhart Post authorMay 21, 2012 at 1:45 am

Hi Maurício

Thanks for the link. I thing I already stumbled upon that project somewhere.

Well I used the AJDBC stuff to avoid building yet another dead project from scratch ;). Another small reason is that I think it would be nice to have the core in Java, so that it can be easily be used from Java, Scala, Groovy etc.

So is it worth reusing AJDBC? In the current state there not all that much to reuse, implementation wise. Especially for the PostgreeSQL driver, since I didn’t touch and improved that one. It is way behind your implementation. Besides that your implementation seems to have some, like Joda-Time. For now I would build on that.
Pingback: Async SQL and Akka | Gamlor | playframework | Scoop.it
Pingback: This week in #Scala (25/05/2012) | Cake Solutions Team Blog
Pingback: State of Asynchronous Database Access for Java | Gamlor

Gamlor

thoughts about programming, tv-series and other geeky stuff.