ZDB as historical trade database – Part 1

Posted by Mattias on 2012-09-02

In a series of posts we will examine how you can use ZDB to build a historical trade database, and how to perform queries and aggregations in it; in this part we are going to discuss how to create the database. In the examples we will use equity market data as reported by trading venues, such as stock exchanges, but it can fairly easily be adapted to other types of instruments.

First off is the instrument data, and we will be using the following fields


This is the unique symbol, or ticker, that your market data vendor uses to identify a listing. The content and format of this varies from vendor to vendor, so we will just save it as a string.


This is the ISIN, or International Securities Identification Number, that can be used to identify the traded security. This will also be saved as a string since it includes a country code.


The Market Identifier Code is also saved as a string.


The currency the instrument is traded in.


This is a string that contains the name of the company.

In the root of the hierarchy we have the company name, for example 'AstraZeneca PLC'. Each company has one or more ISIN codes. Each ISIN is in turn traded on one or more markets, and on each market you can trade it in one or more currencies.

For the actual trades, we will be using the following fields


Date it was traded. To make things easier, we are going to assume this is always todays date on the input feed.


Time the trade was done. Again we are going to make things simple for us, and assume that there are no late trades. We are also going to ignore timezones and daylight saving time.


The number of shares traded. This will be stored as a double.


The price per share the trade took place at. This will also be stored as a double.


This boolean will be set to true if this trade was crossed on the exchange, or false if it was done off exchange for various reasons.

Also, we are going to ignore trade corrections or cancels to keep things simple and tidy.

Now that we have defined all data that will be stored in the database, it is time for implementation. At the center of any file database is the FileDataBase class that manages all tables and also read and write data to file when needed. At its aid is an implementation of the interface ColumnLoader that performs the actual reading and writing of column files. In this case we are going to use the already existing class SimpleColumnLoader that is suitable for most needs.

The first step is to create the database directory structure. The quickly and easiest way is to instantiate an SimpleColumnLoader with the, preferable empty, root directory for the database, and then call the method createDataBase(). This will create all the necessary files and directories for an empty database that we can use later to store all our data in.

Then we use the previously created SimpleColumnLoader to create a FileDataBase object that will be used to create the table. It is important that we call createDataBase() before we pass it along to FileDataBase, since the later will fetch a list of tables and available dates among other things when it is constructed.

The table is created by a call to createPartitionedTable that takes a name for the table and a list of columns it will contain. The class ColumnInfo is used when creating a table to provide a name and type for each column. The types we will be using in this series is boolean, date, double, string and time but there are several other types available as well.

Putting this together give the following code snippet for creating the needed database in the directory given by the argument root

public void createDataBase(File root) { SimpleColumnLoader loader = new SimpleColumnLoader(root); loader.createDataBase(); FileDataBase db = new FileDataBase(loader); db.createPartitionedTable("trade", new ColumnInfo(Type.DatePartition, "date"), new ColumnInfo(Type.String, "symbol"), new ColumnInfo(Type.String, "market"), new ColumnInfo(Type.String, "currency"), new ColumnInfo(Type.String, "isin"), new ColumnInfo(Type.String, "name"), new ColumnInfo(Type.Double, "price"), new ColumnInfo(Type.Double, "volume"), new ColumnInfo(Type.Bool, "onExchange"), new ColumnInfo(Type.Time, "time")); }