ZDB as historical trade database – Part 2

Posted by Mattias on 2012-10-28

In a series of posts we will examine how you can use ZDB to build a historical trade database, and how to perform queries and aggregations in it. In the previous part we discussed how to create the database, and in this part we will discuss how to write data to it.

There are three different ways to write to a database, all having different advantages and use-patterns. The three different ways is a series of getTable-updates-putTable, just using putTable and finally using a TableAppender as returned by directAppend.

Using getTable-updates-putTable is the most generic way to update a table, since it allows you to perform updates and deletes as well as appending data. In this series we are only appending data, not updating or deleting, so using one of the two other methods will be more efficient since it will avoid unnecessary read operations.

The choice between putTable and directAppend is fairly straightforward, if all data is available one should use putTable and if data is available a bit at a time one should use directAppend. In the context of market data, where updates get available one row at a time, the most natural way is to use directAppend and write every update as they arrive. Unfortunately, the currently available implementations of ColumnLoader perform an open-write-close cycle for each column on every append so it is always a good idea to collect the update in batches.

Now that we know how to write a table, it is time to actually creating the table we will be writing. Again there are several possible ways to create a table, or rather create the columns containing the actual data that will then be combined to form a table. The simplest way to use the implementations found in the package com.zolltov.zdb.memory.

There are several ways to create a memory column containing data. Data can be copied from a java.util.Collection, an array or another column of the same type. One can also create a column of appropriate size and set all the values in it. All methods have roughly the same complexity so you should select one that best suits the data you have available.

Putting all of this together gives a code similar to this

private static void writeData(List beans) { DateColumn date = new MemoryDateColumn(beans.size()); StringColumn symbol = new MemoryStringColumn(beans.size()); StringColumn market = new MemoryStringColumn(beans.size()); StringColumn currency = new MemoryStringColumn(beans.size()); StringColumn isin = new MemoryStringColumn(beans.size()); StringColumn name = new MemoryStringColumn(beans.size()); DoubleColumn price = new MemoryDoubleColumn(beans.size()); DoubleColumn volume = new MemoryDoubleColumn(beans.size()); BoolColumn onexchange = new MemoryBoolColumn(beans.size()); TimeColumn time = new MemoryTimeColumn(beans.size()); for(int i=0; i < beans.size(); i++) { TradeBean tb = beans.get(i); date.setInt(i, tb.getDate().toInteger()); symbol.setString(i, tb.getSymbol()); market.setString(i, tb.getMarket()); currency.setString(i, tb.getCurrency()); isin.setString(i, tb.getIsin()); name.setString(i, tb.getName()); price.setDouble(i, tb.getPrice()); volume.setDouble(i, tb.getVolume()); onexchange.setBoolean(i, tb.isOnExchange()); time.setInt(i, tb.getTime().toInteger()); } MemoryTable table = new MemoryTable(); table.addColumn(, date); table.addColumn(Schema.symbol, symbol); table.addColumn(, market); table.addColumn(Schema.currency, currency); table.addColumn(Schema.isin, isin); table.addColumn(, name); table.addColumn(Schema.price, price); table.addColumn(Schema.volume, volume); table.addColumn(Schema.onexchange, onexchange); table.addColumn(Schema.time, time); db.putTable("trade", table, new Date(), new Date()); }