November 6, 2019

MVStore: Accessing Old Versions

A while back I blogged about MVStore basics. Let’s look at how MVStore keeps old revisions around. MVStore maps have a pair of methods, .getVersion() and .openVersion(). With those, you can read the values of an older version. A new version of the map is created on every commit, including auto-commits.

Figure 1. MVMap restores old versions of itself

Let’s try it:

MVMap<String, String> myMap = store.openMap("test-map");

// Store our initial key-value
myMap.put("key-1","first-version");
store.commit();
var version1 = myMap.getVersion();

// Update key-value and add a new key
myMap.put("key-1", "second-version");
myMap.put("a-new-key", "second-version");
store.commit();
var version2 = myMap.getVersion();

// Print out the different versions
System.out.println("Version "+version1+":");
printEntries(myMap.openVersion(version1));
System.out.println("Version "+version2+":");
printEntries(myMap.openVersion(version2));
System.out.println("Current:");
printEntries(myMap);


private static void printEntries(MVMap<String, String> map) {
    for (var entry : map.entrySet()) {
        System.out.println("- " + entry.getKey() + " -> " + entry.getValue());
    }
}

This results in:

Version 0:
- key-1 -> first-version
Version 1:
- a-new-key -> second-version
- key-1 -> second-version
Current:
- a-new-key -> second-version
- key-1 -> second-version

Ok, so we can get to old versions. Hmm, does that mean the file grows forever because MVStore never deletes old versions? Let’s create many versions to check.

var initial = myMap.getVersion();
var previous = initial;
// Create a bunch of versions
for (int i = 0; i < 20; i++) {
    // Create a new version
    myMap.put("key-1","iteration-"+i);
    store.commit();

    // Print the state of the versions
    var currentVersion = myMap.getVersion();
    System.out.println("Current " + currentVersion);
    System.out.println(myMap.get("key-1"));
    System.out.println("Previous " + previous);
    System.out.println(myMap.openVersion(previous).get("key-1"));
    System.out.println("Initial " + initial);
    System.out.println(myMap.openVersion(initial).get("key-1"));
    System.out.println("--------");

    previous = currentVersion;
}

After a few iterations, reading the oldest version results in a crash:

// ...
Initial 0
initial-version
--------
Current 5
iteration-4
Previous 4
iteration-3
Initial 0
Exception in thread "main" java.lang.IllegalArgumentException: Unknown version 0 [1.4.200/0]
   at org.h2.mvstore.DataUtils.newIllegalArgumentException(DataUtils.java:924)
   at org.h2.mvstore.MVMap.openVersion(MVMap.java:1097)
   at info.gamlor.testing.Main2.createNewVersions(Main2.java:49)
   at info.gamlor.testing.Main2.main(Main2.java:24)

Alright, there is a limit on the number of versions MVStore keeps. Indeed, you can check and change how many versions it keeps. The default is 5 versions.

var versionsKept = store.getVersionsToKeep(); // default is 5
System.out.println("MVStore keeps " + versionsKept + " versions");
store.setVersionsToKeep(30);

When we increase the limit we can read the old versions:

--------
Current 20
iteration-19
Previous 19
iteration-18
Initial 0
initial-version
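
To make the versions-to-keep limit concrete, here is a toy model in plain Java. It is a sketch under my own assumptions, not MVStore’s implementation: the class VersionedMapSketch and everything in it are made up for illustration, and it keeps whole snapshots in memory where MVStore works with pages in a file. It only mimics the visible behaviour: commit() creates a version, and openVersion() fails for a version that has fallen out of the window.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Toy model only (hypothetical class, not part of MVStore).
final class VersionedMapSketch {
    private final int versionsToKeep;
    // Committed snapshots, keyed by version number, oldest first.
    private final TreeMap<Long, Map<String, String>> history = new TreeMap<>();
    private final TreeMap<String, String> current = new TreeMap<>();
    private long nextVersion = 0;

    VersionedMapSketch(int versionsToKeep) {
        this.versionsToKeep = versionsToKeep;
    }

    void put(String key, String value) {
        current.put(key, value);
    }

    String get(String key) {
        return current.get(key);
    }

    // Freeze the current state as a new version, then drop the oldest
    // snapshots until at most versionsToKeep remain.
    long commit() {
        long version = nextVersion++;
        history.put(version, Collections.unmodifiableMap(new TreeMap<>(current)));
        while (history.size() > versionsToKeep) {
            history.pollFirstEntry();
        }
        return version;
    }

    // Read-only view of a committed version; fails if it was already dropped.
    Map<String, String> openVersion(long version) {
        Map<String, String> snapshot = history.get(version);
        if (snapshot == null) {
            throw new IllegalArgumentException("Unknown version " + version);
        }
        return snapshot;
    }
}
```

With a limit of 5, the sixth commit pushes the first version out of the window, which loosely mirrors the crash we saw above. The real store additionally takes the retention time into account, as the next section shows.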

Retention Time

We’ve seen that MVStore doesn’t do in-place updates. Instead, it creates new versions and it lets you access older versions. This goes further: MVStore doesn’t reuse the disk space of expired versions immediately. Let’s look at an example where we store many versions of a value:

// Default config, so 5 versions are kept
MVMap<String, String> myMap = store.openMap("test-map");
myMap.put("key-1", "initial-version");
store.commit();
var oldestVersion = myMap.getVersion();

for (int i = 0; i < 100; i++) {
    myMap.put("key-1", "iteration="+i);
    store.commit();
}

try {
    myMap.openVersion(oldestVersion);
} catch (IllegalArgumentException e) {
    System.out.println("Expected, since the version is gone");
}

The old values should be gone, but when we inspect the database file we still find them:

gamlor@gamlor-t470p ~/h/mv-store> strings database.mv | grep initial-version
]key-1ginitial-version
gamlor@gamlor-t470p ~/h/mv-store> strings database.mv | grep iteration=
]key-1citeration=0
]key-1citeration=1
]key-1citeration=2
]key-1citeration=3
]key-1citeration=4
]key-1citeration=5
]key-1citeration=6
]key-1citeration=7
]key-1citeration=8
]key-1citeration=9
]key-1diteration=10
// All old values are here

By design, MVStore doesn’t overwrite old values for 45 seconds. MVStore relies on this timeout to give the file system time to flush the data to permanent storage. It doesn’t do an fsync. You must not shorten this timeout unless you know what you are doing. If you want an fsync as extra security, you have to call it yourself.

// Get and change the retention time.
System.out.println("Keep old data: "+store.getRetentionTime()+"ms");
// Set retention time. Note MVStore relies on this timeout to give the File System time to flush the data
var minute = 60000; //millis
store.setRetentionTime(minute);

// If you don't trust the retention time approach and want an fsync, you have to call it yourself
store.getFileStore().sync();
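
The rule can be written down as a tiny predicate. This is a sketch based on my reading of the behaviour described above, not MVStore code: RetentionSketch and canOverwrite are hypothetical names, and the exact boundary (>=) is an assumption.

```java
// Toy model only (hypothetical class, not part of MVStore).
final class RetentionSketch {
    // The default retention time mentioned above: 45 seconds.
    static final long DEFAULT_RETENTION_MILLIS = 45_000L;

    // Old data may be overwritten only when no kept version references it
    // any more AND it has been unused for at least the retention time.
    static boolean canOverwrite(boolean stillReferenced,
                                long unusedSinceMillis,
                                long nowMillis,
                                long retentionMillis) {
        if (stillReferenced) {
            return false;
        }
        return nowMillis - unusedSinceMillis >= retentionMillis;
    }
}
```

Both conditions have to hold. That is why the 100 quick commits above left every old value in the file: the versions had expired, but the retention time had not passed yet.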

For example, with a shorter retention time and some waiting our old entries are wiped from the database file:

// Demo only: Don't set a short retention time on production databases.
store.setRetentionTime(100 /* milliseconds */);
// Slower inserts to let the time pass. Now old versions of the data will be evicted from the database file
for (int i = 0; i < 100; i++) {
    myMap.put("key-1", "iteration="+i);
    store.commit();
    Thread.sleep(50);
}

And the old data is gone:

gamlor@gamlor-t470p ~/h/mv-store> strings database.mv | grep initial-version
gamlor@gamlor-t470p ~/h/mv-store> strings database.mv | grep iteration=
]key-1diteration=97
]key-1diteration=98
]key-1diteration=99
]key-1diteration=93
]key-1diteration=94
]key-1diteration=95
]key-1diteration=96

Anyway, the retention time is an integral part of the MVStore. I recommend leaving it at the default value.

Retention Time, Versions to Keep and Iteration

The retention time and the versions-to-keep limit create an ugly edge case. Let me demonstrate it. One thread iterates over an MVMap while another thread updates the map concurrently. Assume the iteration takes a long time.

// Shorten the retention time so the issue shows up sooner.
store.setRetentionTime(1000);

MVMap<String, String> testMap = store.openMap("test-map");
// Keep on iterating over the map: This will fail at some point
Thread iteratorThread = runInThread("Iterate", () -> {
    for (int i = 0; i < 10000; i++) {
        long keyCount = 0;
        for (String k : testMap.keySet()) {
            keyCount++;
        }
        System.out.println("Run no " + i + " iterated over " + keyCount + " keys.");
    }
});
// Insert new entries concurrently
Thread insertThread = runInThread("Insert", () -> {
    long i = 0L;
    while (iteratorThread.isAlive()) {
        i++;
        testMap.put("key:" + i, "data: " + i);
    }
});

// Wait for the threads
iteratorThread.join();
insertThread.join();


private static Thread runInThread(String name, Runnable block) {
    Thread thread = new Thread(() -> {
        try {
            block.run();
            System.out.println(name + " ended");
        } catch (Exception e) {
            e.printStackTrace();
            System.err.println("Program failed, exit");
            System.exit(1); // non-zero exit code, since the program failed
        }
    }, name);
    thread.start();
    return thread;
}

It will crash with:

Run no 425 iterated over 16436272 keys.
java.lang.IllegalStateException: Chunk 13 not found [1.4.200/9]
   at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950)
   at org.h2.mvstore.MVStore.getChunk(MVStore.java:1230)
   at org.h2.mvstore.MVStore.readBufferForPage(MVStore.java:1214)
   at org.h2.mvstore.MVStore.readPage(MVStore.java:2209)
   at org.h2.mvstore.MVMap.readPage(MVMap.java:672)
   at org.h2.mvstore.Page$NonLeaf.getChildPage(Page.java:1043)
   at org.h2.mvstore.Cursor.hasNext(Cursor.java:53)
   at info.gamlor.testing.Main.lambda$versionIssues$0(Main.java:88)
   at info.gamlor.testing.Main.lambda$runInThread$2(Main.java:201)
   at java.base/java.lang.Thread.run(Thread.java:830)

That message looks like database corruption. However, if we open and check the database, everything is fine. What happens is this: when the iterator is created, it points to the start page of the data to read. While iterating, it doesn’t block any other database operations. As we continue to write, MVStore eventually starts overwriting old pages once they are past the retention time and the versions-to-keep limit. The iterator continues, but at some point it tries to read a page that has been wiped, and then we see that exception.

Figure 2. MVStore fails because the old versions are gone

You could increase the retention time and versions kept, but that is a guessing game. Instead, tell the MVStore that you are reading a specific version. MVStore will keep that version alive until all readers of that version are finished. If you iterate over a range or do some other long operation, then you should keep the version alive until you are done.

// Tell the store that we are reading the current version
var using = store.registerVersionUsage();
try {
    long keyCount = 0;
    for (String k : testMap.keySet()) {
        keyCount++;
    }
} finally {
    // We're done reading on the current version
    store.deregisterVersionUsage(using);
}
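
If you register version usage in many places, the try/finally can be folded into try-with-resources. The helper below is my own sketch, not something MVStore ships: VersionUsageGuard is a hypothetical name, and it is generic so that it doesn’t depend on the exact type registerVersionUsage() returns.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper (not part of MVStore): registers on construction,
// deregisters on close, so it can be used with try-with-resources.
final class VersionUsageGuard<T> implements AutoCloseable {
    private final T handle;
    private final Consumer<T> deregister;

    VersionUsageGuard(Supplier<T> register, Consumer<T> deregister) {
        this.handle = register.get();
        this.deregister = deregister;
    }

    @Override
    public void close() {
        // Runs even when the guarded block throws.
        deregister.accept(handle);
    }
}

// Assumed usage with the store and map from the example above:
//
// try (var guard = new VersionUsageGuard<>(store::registerVersionUsage,
//                                          store::deregisterVersionUsage)) {
//     for (String k : testMap.keySet()) {
//         keyCount++;
//     }
// }
```

The behaviour is the same as the try/finally version. The only gain is that the deregister call can’t be forgotten.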

Rolling Back

MVStore also supports rollbacks to a specified version. A rollback drops all changes made after that version. Note that MVStore has auto-commit: unless auto-commit is disabled, MVStore will commit and create new versions periodically.

MVMap<String, String> myMap = store.openMap("test-map");

// Store our initial key-value
myMap.put("key-1", "first-version");
var version1 = store.commit(); // or store.getCurrentVersion() also returns the current version

// Update key-value and add a new key
myMap.put("key-1", "second-version");
myMap.put("a-new-key", "second-version");
var version2 = store.commit(); // or store.getCurrentVersion() also returns the current version
System.out.println("Version before rollback is: "+version2);

// Roll back the store
store.rollbackTo(version1);

// We're back to the old version
System.out.println("Rolled back, we are back at Version: " + store.getCurrentVersion());
for (var entry : myMap.entrySet()) {
    System.out.println(entry.getKey() + " => " + entry.getValue());
}

This results in:

Version before rollback is: 2
Rolled back, we are back at Version: 1
key-1 => first-version

Of course, we can only roll back to versions that are still in the store. Again, that is governed by the retention time and the versions-to-keep limit. If we try to restore a version outside that limit, it will fail.

System.out.println("Version is: "+store.getCurrentVersion() + " and " + store.getVersionsToKeep() + " versions are kept.");
var versionsRolledBack = store.getVersionsToKeep() + 1;
var version = store.getCurrentVersion() - versionsRolledBack;
System.out.println("Rolling " + versionsRolledBack + " versions to version " + version);

// This rollback fails if the version is no longer kept and the retention time has passed
store.rollbackTo(version);

This results in:

Version is: 20 and 10 versions are kept.
Rolling 11 versions to version 9
Exception in thread "main" java.lang.IllegalArgumentException: Unknown version 9 [1.4.200/0]
   at org.h2.mvstore.DataUtils.newIllegalArgumentException(DataUtils.java:924)
   at org.h2.mvstore.DataUtils.checkArgument(DataUtils.java:911)
   at org.h2.mvstore.MVStore.rollbackTo(MVStore.java:2539)
   at info.gamlor.testing.Main.rollback2(Main.java:49)
   at info.gamlor.testing.Main.main(Main.java:30)
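
If a failed rollback shouldn’t take the program down, you can wrap the call. This is a minimal sketch under assumptions: RollbackSketch and tryRollback are hypothetical names, and the only thing taken from MVStore is that an unknown version surfaces as an IllegalArgumentException, as in the stack trace above.

```java
import java.util.function.LongConsumer;

// Hypothetical helper (not part of MVStore).
final class RollbackSketch {
    // Attempts the rollback and reports whether it went through.
    // Caveat: any IllegalArgumentException from the call counts as
    // "version no longer available", not only the unknown-version case.
    static boolean tryRollback(LongConsumer rollback, long version) {
        try {
            rollback.accept(version);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}

// Assumed usage with the store from the example above:
//
// if (!RollbackSketch.tryRollback(store::rollbackTo, version)) {
//     System.out.println("Version " + version + " is no longer available");
// }
```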

Transactions?

We’ve seen retention times, reading specific versions of maps and rollbacks. You can smell the transactions in the air. So, does the MVStore have transactions? Yes and no. The raw MVStore does not offer transactions. There is a TransactionStore in the MVStore library, which offers transactions on top of MVStore. That is a topic for another time =).

Tags: MVStore Java