RealDB

Master's project by Jason Winnebeck - Current Status | Download | Using RealDB | Progress Reports

Committee

Chairperson: Henry A. Etlinger
Reader: Alan Kaminsky
Observer: T.J. Borrelli

Abstract

Embedded sensor monitoring systems deal with large amounts of live time-sequenced stream data. The embedded system requires a lower overhead data store that can work with limited resources and be able to run reliably and unattended even in the face of power faults.

Relational database management systems (RDBMS) are a well-understood and powerful solution capable of storing time-sequenced data; however, many have a high overhead, are not sufficiently reliable and maintenance free, or are unable to maintain a hard size limit without adding substantial complexity.

RealDB is a specialized solution that capitalizes on the unique attributes of the data stream storage problem to maintain maximum reliability in an unstable environment while significantly reducing overhead from indexing, space allocation, and inter-process communication when compared to a traditional RDBMS-based solution.

Summary

To summarize the main topics on what this work is to accomplish:

Create a storage engine for live streaming data streams that performs better than traditional relational database systems, by making applicable assumptions and tradeoffs for the specific situation.
- Faster: constant-time insertion and deletion of records, with respect to the size of the database
- Faster: logarithmic lookup time for data within a data stream, with linear time to read records
- Smaller: tightly compacted dataset with little or no indexing required, data is assumed to come in-order
Represent data streams as a continuous value over time, rather than as a set of discrete samples or rows
- Smaller: allows reduction of data stored by allowing reconstruction of the signal from a reduced data set
- Smarter: allows calculation of averages, integrals, etc based on the reconstruction over timespans.

Current Status / Schedule

The RealDB project final report is tentatively finished and can be downloaded here; the project defense is scheduled for October 5, 2010. The final proposal can be found here (SVN revision 81, 7-Aug-2008).

Milestone 1: First design phase completed, creation of object model and some stub functionality and tests. Setup of environments and compilers such as GCJ, and prototypes for high-risk code.
Milestone 2: Creation of maintenance tools to create an empty database on disk, and simple storage implementation (single stream, no or incomplete space management)
Milestone 3: Completion of database metadata and functionality required for writing, including space management but excluding compressed data storage
Milestone 4: Completed research and implementation of compressed data storage and gathering, reconstruction algorithms, and read functionality including APIs and query tool
Milestone 5: Completion of proof-of-concept use for RealDB
Milestone 6: Design and implementation of RealDB version of performance tool and completed design for solving problem using other solutions
Milestone 7: Completion of performance tool versions for MySQL InnoDB, MySQL MyISAM, and Apache Derby

Project Schedule

Target	Planned	Completed	Percent Complete
Preproposal	2006-10-17	2006-10-17	100%
Preproposal Presentation	2006-10-17	2006-10-17	100%
Proposal Approved	2008-07-31	2008-08-11	100%
Milestone 1	2008-08-31	2008-08-28	100%
Milestone 2	2008-08-31	2008-09-29	100% (r123)
Milestone 3	2008-09-08	2009-01-29	100% (r199)
Milestone 4	2008-09-22	2010-02-28	100%
Milestone 5	2008-10-20	2010-04-11	100%
Milestone 6	2008-11-10	2010-07-27	100%
Milestone 7	2008-11-24	2010-07-27	100%
Report	2008-12-15	2010-09-19	100%
Defense	2009-01-15	2010-10-05	presentation 100%

Download

Latest Release: RealDB 1.0 (SVN revision 504), released on August 22, 2010.

realdb-1.0-bin.zip - Includes the binary versions of each RealDB component
realdb-1.0-deps.zip - Includes runtime dependencies for core library and all other components, plus source where required. Extract this archive to the same directory as realdb-1.0-bin (so that all of the jars are in the same directory) to make RealDB runnable. For build/test dependencies, please use Maven.
realdb-1.0-src.tar.gz - Source code for the entire RealDB project

Please note that unfortuantely the full dataset used for benchmarking in the final report cannot be published. A very small dataset usable for functionality demonstration is provided but is not suitable for benchmarking due to its size.

Javadoc for the realdb-core module (the "library" part) can be browsed online here. You may want to read on how to use RealDB.

RealDB is built with Maven. The outputs are published as a Maven repository at http://www.gillius.org/maven2/. You may download individual javadoc/source/binary jars there (but not the dependencies).

Report

The final report was completed on September 19, 2010, although it may go through minor revision after the defense based on feedback.

RDBFinalReport.pdf

Defense presentation: PowerPoint PDF Handouts (PDF)

License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, see <http://www.gnu.org/licenses>.

Additional permission under GNU GPL version 3 section 7

If you modify this Program, or any covered work, by linking or combining it with JUnit (or a modified version of that library), containing parts covered by the terms of Common Public License, 1.0, the licensors of this Program grant you additional permission to convey the resulting work.

Dependencies

The following libraries are used. Note that with the new Maven build system these would be downloaded for you:

Antlr 3.2 and its stringtemplate dependency - used to generate lexer and parser for RDL (RealDB Definition Language), Modified BSD license
JUnit 4.8.1, CPL licensed
JFreeChart 1.0.13, and its dependency JCommon, 1.0.15, LGPL licensed.
EasyMock, Apache 2.0 License

For running the main library, realdb-core, the only dependency is the Antlr runtime jar. For the browser and concept tools, JFreeChart is also needed. All other dependencies are used in building or unit tests.

For the benchmark program, it should be able to work with any JDBC driver, assuming the database server and driver configurations are set up properly. The benchmark as run in the final report uses the following:

MySQL Connector/J 5.1.12, GPLv2
Derby 10.5.3.0_1, Apache 2.0 License

Using RealDB

This section is very minimalistic and will be expanded in the future.

Running RealDB

Download the full release of RealDB; no compilation is needed. There are several ways to use RealDB:

Run the realdb-demo.jar in a GUI environment by double-clicking it (if your OS environment has an association for executable jars).
Run the realdb-cli.jar on the command line by executing java -jar RealDB-cli.jar in the directory where you uncompressed RealDB.
Reference realdb-core.jar in your own Java project and use the API calls to read and write data.
Run the RealDB Browser (DB viewer) in a GUI environment by double-clicking realdb-browser.jar
Run the RealDB proof-of-concept:
1. Execute java -cp realdb-concept-1.0.jar org.gillius.realdb.concept.VehicleDataSimulation data/Test.csv test.rdb -- note that this will make a 200MB database file!
2. Run the concept analysis tool in a GUI environment by double-clicking realdb-concept.jar
Run the RealDB benchmark on a Linux machine:
1. Follow the instructions in mysql-setup.txt to download and install "private" copy of MySQL.
2. Run mkdir -p db_data/realdb to create the DB directories.
3. Run java -jar realdb-benchmark-1.0.jar realdb_file myisam innodb derby
  1. If you run as "root" and have a raw partition, you can run the "realdb_raw" test:
    1. Modify data/realdb_raw.properties and setting the "dataFile" to the device you wish to use; the benchmark will replace any data on that partition/device!
    2. Change monitoredDir and monitoredPattern to match.
    3. Include "realdb_raw" as a parameter to the benchmark
4. If you are on Windows, you are mostly on your own. However the steps are:
  1. Follow equivalent steps to mysql-setup.txt.
  2. Modify the data/innodb.properties and data/myisam.properties to comment out the Linux form of the command line and uncomment the Windows form
  3. Don't include "realdb_raw" in the run line, since this mode of RealDB works only on Linux.

Note, the current version of RealDB may be unable to work with databases made with versions of RealDB before 1.0.

Compiling RealDB

Apache Maven is used for compilation. To compile RealDB yourself, download a recent version of Maven (2.2 is known to work), and run "mvn install" to locally install the RealDB packages. The resulting source, javadoc, and binary jars for each module will be in the respective target directory.

NetBeans and IntelliJ IDEA IDEs are capable of creating projects from the Maven files. IDEA 9 community edition project files are already generated and exist in the source code.

RealDB Definition Language

The RealDB Definition Language (RDL) is used to define RealDB databases. Here is an example RDL file:

SET blockSize     = 2048
SET fileSize      = 204800
SET maxStreams    = 3
SET dataBlockSize = 2

CREATE STREAM Test WITH ID 1 {
  value float NULL //will use SampledAlgorithm by default
}

CREATE STREAM CarSnapshots WITH ID 2 {
  rpm float WITH CODEC DeadbandAlgorithm PARAMS (deadband=50.0),
  speed float WITH CODEC DeadbandAlgorithm PARAMS (deadband=5),
  passengers uint8 WITH CODEC StepAlgorithm,
  driving boolean WITH CODEC StepAlgorithm
}

This section will be expanded more later. Brief descriptions of the elements in this file:

blockSize = sets the low-level block size of the file, must be greater than or a multiple of the native block size to ensure complete reliability.
fileSize = Size of the file, in bytes, should be a multiple of the blockSize; a partial block at the end would be wasted.
maxStreams = Amount of space to allocate in the file for data streams. This must be equal to or greater than the number of streams actually created. In the future, RealDB will support adding new streams to an existing database.
dataBlockSize = multiple of blockSize for a raw stream data block. The larger the dataBlockSize the more efficient the access, but the more data lost in a database/power/IO fault.

Creating Streams

CREATE STREAM <stream name> WITH ID <unique integer user ID>
<element name> <element type> [NULL] [WITH CODEC <codec name>] [PARAMS(a=value, b=value, ...)]
- If NULL is specified, the element is allowed to take on a null value, otherwise it must always have a defined value
<codec name> = one of the following options:
- SampledAlgorithm (the default) - Every record is written
- StepAlgorithm - Record is written if the value changed at all
- DeadbandAlgorithm - works only on numeric types; value written if |nextValue - previousValue| >= deadband; takes deadband as a parameter.
- A fully-qualified class name of a Java class that implements org.gillius.realdb.metadata.ElementCodec (allows user-defined codecs).
<element type>
- sint8, sint16, sint32, sint64, uint8, uint16, uint32 = signed and unsigned integer types of 8, 16, 32, or 64 bits.
- float, double = 32 bit and 64 bit types, supports IEEE-754 ranges and special states including denormals.
- boolean = element can be true or false

RealDB Concepts

DataStream

A DataStream is a time-ordered sequence of Records with a fixed format -- an ordered list of elements.

Record

A single entry in a DataStream, has a timestamp and can be a "discontinuity" marker.

Element

A specific field in the Record with a known name and type.

StreamInterval

A reconstructed continuous interval of a DataStream that comprises of 1 or more original (possibly dropped) records.

Discontinuities

A stream can be in a "discontinuity" state. This is different from elements individually being null and even all elements being null at the same time. The actual meaning can be defined by the user. For example a null could be "not applicable," or "collected but not present", whereas a discontinuity could signifiy ranges where the system was turned off and not collecting data at all. When a discontinuity is written, the stream is said to remain in that state until the next non-discontinuity record is written.

Timestamps

All records in RealDB are timestamped, and written in ascending order of time. The timestamp is a 63 bit unsigned integer. RealDB does not assume any particular unit or meaning for time other than it being linear (i.e. distance between 5 and 10 is the same as between 50 and 55), leaving the user to define both the unit (i.e. seconds versus milliseconds) and the epoch (the meaning of time 0).

Command Line Tool

The current command line tool supports the following commands. Use the help command to get more information:

bulkLoad	Bulk loads data into a stream from a tab-separated values file.
create	Creates a new RealDB Database
describe	Shows information about a database.
help	Prints out a list of all commands or detailed help on a single command
intervals	Reads intervals in a given time range on an element within a stream.
read	Reads all raw records or a range of raw records from a single stream and outputs a tab-delimited output with the first row as the column headers.
summary	Reads intervals in a given time range on an element within a stream, and summarizes them into a single report.

Graphical Browser

There is a graphical browser tool that can view information about a database and the streams within. See the Running RealDB section on how to start it.

Progress Reports

3-Oct-2010

Clear up some instructions about running the "realdb_raw" test in the "Running RealDB" section.

28-Sept-2010

The final report has been posted in the download section.

22-Aug-2010

The final code deliverables are up on the site. Milestones 6 and 7 were completed back in late June and July. The final report is completed except that it needs some proofreading and touchup on reference citations. The slides for the presentation are half complete.

After the defense presentation, the final report and the full set of results will be posted, and this site will be finalized a bit more in content.

The project deliverables are licensed under GPLv3.

26-May-2010

Milestones 6 and 7 are mostly completed. What is done is performance testing of the 5 configurations.

Dimensions

Implementations: RealDB File, RealDB Partition, Derby, MySQL MyISAM, MySQL InnoDB
size management: Y/N
Number of records: every 1 million

Implementation Notes

The size management applies significantly only to the non-RealDB configurations, where I implement a DELETE on every batch of 1000 records to delete all records older than the first by some amount of time (right now 1000 seconds). With no size management on SQL, I issue no deletes. For RealDB, the difference is a 100MB database versus a 200MB one.

For SQL databases, I implement the StepAlgorithm that RealDB uses to eliminate duplicate data points, to put both systems on equal footing. MySQL and Derby are used as they come "out of the box", except that on MySQL I use the JDBC parameter rewriteBatchedStatements=true to turn JDBC batches into MySQL multi-inserts, which offers a huge speed up (seems to be about 10x but I did not measure it directly).

The dataset I will describe in more detail later, but it includes raw data and the benchmark uses the same techniques and schema as the proof of concept to generate 2 boolean streams. 9203285 records are generated from this.

Disk statistics for the flash memory test are better than the hard drive test because the OS and database/RealDB binaries reside there. For the flash test I put ONLY the data store on the flash, not the programs.

Metrics Collected

Database size (assuming no filesystem overhead)
DB startup and creation:
- time
- disk sectors reads/writes
DB load and shutdown:
- time
- disk sectors read/write
- reads/writes disk millis
- user and kernel mode jiffies (incl children)

I probably won't be graphing each metric, but these are what I collect. I am running the benchmark on the Linux operating system, where I use /proc/self/stat to get CPU statistics, and /sys/block/DEVICE/stat (where DEVICE is the raw disk in use, for example sdc).

Configuration

Ubuntu GNU/Linux 9.10 (Karmic), kernel 2.6.31-21-generic
Intel Core 2 2.4 GHZ E6600 (CPU frequency scaling left on)
2GB RAM
Java 1.6: OpenJDK 6b16-1.6.1-3ubuntu3
Hard Disk test: RAID 1 with 2 disks, 10k Western Digital Raptor and 7.2K WD, filesystem ext4
Flash Memory test:
- Sandisk Ultra II CompactFlash "15 MB/sec" with 0.5GB unformatted partition (for RealDB Raw) and 3.5GB FAT32 partition (all others)
- Crappy generic USB 2.0 memory card reader "ALLIN1"
- 5.7MB/s writes, 6.8MB/s reads -- I know the reader itself is the bottleneck as I get better speeds on other machines.

Next Steps

To finish this milestone I will meet with my committee to find out what else I need to collect, if anything. The only major missing component from the benchmark is a reliability test. I have no test right now to quantify that -- for example I know that MyISAM requires a repair after crash and I know it needs free disk space equal to the largest table but how long does it take? Does Derby and InnoDB recover in constant time? Also, I should prove that RealDB also recovers, and in "constant" (for a given DB config) time.

I also may bundle up a 0.7 release when this is done.

25-Apr-2010

I am looking into how to best emulate the "circular buffer" behavior from RealDB in a standard JDBC data store. The problem is that there is no perfect decision, and the best decision differs on the database implementation. Here are some options:

Delete the oldest number of rows to keep count(*) under some limit.
- Requires a willingness to accept to set a per-table size limit rather than a per-database size limit, and live with the fact that one table could be 100% full and the other 0%.
- Works only if each row takes up a fixed space, which is not true if the database can get fragmented or uses pages. Either one must live with hoping there will be some amount of maximum "slack space", or take the database offline for some time with an operation like MySQL's OPTIMIZE TABLE when the fragmentation level reaches some threshold. On MyISAM, an optimize table requires a rebuild of the whole table.
- Deleting an exact number of rows is not possible in all SQL databases. On MySQL, you can use ORDER BY and LIMIT on a DELETE to delete an exact number of the oldest rows, but in Transact-SQL, DELETE TOP X deletes random rows, which is not useful. Apache Derby supports no such method. Alternatives include trying to delete rows based on time and guess a time range, or selecting a limit of rows to find the exact time range (slow).
  - (It may be possible to do it with cursors, but I am not aware of the portablility of cursor SQL syntax or performance).
- On some database formats, such as InnoDB, SELECT COUNT(*) is not constant-time. One could work around this partially by assuming they are the only client (which RealDB does, so it would be fair), and keeping track of counts manually, so that the SELECT COUNT(*) is only needed at startup, but then startup is not O(1) time.
Keep the database with a constant time range; for example store only the last week of data.
- Possible and very easy to do with standard SQL: delete all rows from all data tables where time is less than a week of the most recently inserted point.
- The amount of actual data "in the last week" can very very wildly. If you don't record any data in a week, you could delete virtually the whole database, or if there is a lot of activity, the size can be virtually unbounded.
Delete rows when the size of the database is large
- Deleting rows based on count described above has big problems.
- Getting the size of the database is not portable. Possible with extensions like MySQL's SHOW TABLE STATUS, although even in MySQL for some formats like InnoDB the reporting is not exact.
- In certain formats like MyISAM, deleting rows don't free space unless you run OPTIMIZE TABLE; the row is just flagged as deleted. You can hope that all reported free space can be used without extending the file.

Ultimately, I think that any real-world application that needs this type of functionality that chooses to use a SQL database as a backend will need a database-specific implementation (because portable SQL does not give you a standard way to control size), and also likely will need to compromise on strong guarantees of performance (constant time startup) and/or size control (even the proprietary methods I reviewed were not sufficient).

Given that there is no single solution that works everywhere and solves the whole problem, I decided for the RealDB benchmark run on JDBC databases will use the "constant time range" method for size control. This is simplest to implement and relies only on portable SQL. A real-world application may best be able to use this method practically if it has a way of summarizing when data was recorded and can assume some maximum data rate.

18-Apr-2010

The benchmark program is proceeding well. I have written a "generic JDBC" method for writing the data in a table format that mirrors the following:

streamId : 32-bit integer, part of primary key
time: 64-bit integer, second part of primary key, also in its own non-unique index
data columns matching those from the RealDB stream
discontinuity: boolean (or CHAR(1) depending on DB support)

This allows storage of multiple streams of data sharing the same structure in the same table. I felt that this is the most straightforward mapping of RealDB streams to SQL tables because typically in relational database design you would not create separate tables with the same structure. I also specified an index on the time column because in a relational database we need to delete old data to make room for new data by time.

I figured the most performant JDBC method would be to use PreparedStatement with batched execution, 1000 statements at a time. If there is a better and practical method I am open for suggestions.

Currently the benchmarking program has a way to specify a DB-specific properties (configuration) file and SQL script to set up the database. Currently I have tested the program against both MySQL MyISAM and Apache Derby databases. I've done preliminary runs on the three database systems on a data simulation that does not perform any deletes yet. Currently the only metrics I have gathered is time to run the simulation and amount of disk space used. I will need to collect more, such as I/O operations, memory usage, and ability for database recovery on system fault.

To complete milestones 5 and 6 I need to do the following:

Finish definitions of tests to run and metrics to collect.
Implement an algorithm to simulate the "circular buffer" behavior from RealDB (i.e. deleting old rows). I know on some database formats (InnoDB) it may be impossible to reclaim records to keep the database within a certain size; instead I may have to reclaim to rows or time limit (because "SELECT COUNT(*)" is not constant time) as a proxy.
Actually run the tests on solid state hardware.

11-Apr-2010

I completed the new transaction system below a few weeks back, and I have finished the proof-of-concept for milestone 5. While the new transaction system achieved my main goal of having a constant upper-bound on time and storage space for a given database configuration (linear in number of streams), I could not get it working to reduce the writes as much as I wanted. In the end I decided to move on and leave the optimization as future work. Essentially, only a single transaction can be outstanding at one time, which reduces performance (although it is still linear with data). Since a transaction is only needed when committing a data block (not a data point), the larger the data block, the lower the overhead. My eventual goal is to allow overlapping transactions, so that single writes to disk can commit multiple modifications.

The proof of concept for milestone 5 farily closely follows what was specified in the proposal. I have provided a data file with the following streams of data: RPM, FuelRate, AccelPedalPosition, BatteryVolts, FuelEcon, and Speed. For the small file included with the release, it is just fake generated data. The VehicleDataSimulation program in the realdb-concept module also generates two additional streams: VehicleStatus and HardAccelEvent. The VehicleStatus denotes where vehicle data was being collected (and not) via a boolean field. The HardAccelEvent stream denotes an event that is set to "true" at the start of 10 continuous seconds or more of accelerator input of 90%. The analysis is to be able to examine critical vehicle parameters during hard acceleration. An interface is presented to the user which allows them to select an incident (instance of the event) and show a graph of data on all of the other streams. A "smarter" event function than hard acceleration would certainly be possible, for example to detect anomalous voltages on the battery, or abnormally low fuel economy. Since the focus on this project is on the database and not diagnostic algorithms, I chose something simple that I knew would occur in the data.

With the milestone 5 release also comes a simple read-only GUI browser tool for generic databases. The proof-of-concept reuses components of this browser.

I believe I will be able to use the basis of the proof-of-concept code and database definition that loads databases as a strong foundation for milestones 6 and 7.

21-Mar-2010

Summary

I am still working towards milestone 5, but I wanted to give an update on the new transaction system I am also working on. The current system is naive and scales based on the amount of data in the database. The new (and final) system scales based on the number of data streams in the database, which should be relatively low and "fixed." I've finished the design (unless I find problems) and am part way through implementation.

Old Transaction System

Old transaction logging method: Recording a transfer of a data block:

From the unallocated pool to an index
From the free queue to an index
From one index to another (including from an index to itself)

The transaction log contains only the active transactions and nothing more. It was completely rewritten every time the transaction log is updated. A transaction object in Java recorded the sequence of steps so that a flush was delayed as much as possible, and when any step in the sequence was flushed, all steps before were flushed.

The recovery process was not appropriate, because it was linear on the size of the database. I wasn't sure properly how to safely examine the indexes to determine if the operation happened, so I scanned the entire affected indexes.

If the block appears in source and dest, we flag an error since this can't happen due to the order of operations
If the block appears only in dest, no recovery needed (transaction never happened)
If the block appears only in source, no recovery needed (transaction never happened)
If the block appears in neither, add it to a special "free index"

The recovery was done in a manner that even if the IO failed during recovery, you could re-try recovery later. So this process seems to work perfectly, but it does not scale. So I need a new transaction recovery method.

Researching

I spent some time reading the book Database Design and Implementation by Edward Sciore that I borrowed. I wanted to get some ideas specifically from the transaction management chapter (14). The idea in the book is much easier than mine -- basically create a forward-only writing log containing the modifications (block ID, offset, old data, new data) to physical blocks on disk. The recovery manager does not need to inspect anything. For rollbacks it writes the old data to the blocks, for commits it writes the new data. It might re-write something that was already done but it always ends up with the right result.

New System

The method from Sciore's book is very simple and elegant, but there are a few issues:

I try to separate the logic from the physical blocks in my system, so recording block changes is awkward and not efficient
RealDB doesn't allow arbitary transactions -- just "atomic moves" which cannot be rolled back -- so no need for an "undo" log -- can I do "better?"

But, some of the fundamental properties of the idea in Sciore are important, that I want to preserve. Specifically:

The transaction log is such that recovery can be idempotent (run multiple times and restarted in the middle and the end result is the same)
The transaction log only moves forward (no need to re-write old portions)

After much thought, I realized I can utilize a key ingredient of my database to ensure idempotency without huge overhead: the index sequence number. Given that all operations are ordered and that I have a reliable method of incrementing the sequence number, if I use the sequence number in my log, I can find out if the operation has completed simply by comparing the sequence numbers. This way I can track "deltas" which are smaller than logging the whole block's state without being afraid of getting out of sync. The next key step is to break apart the transfer into two logged operations: remove and add.

Tx log entry types:

Remove block (from free pool or another index) = { data block number (tx ID), source index, index tail sequence number}. "Opens" a transaction.
Add block = {tx ID, dest index, index head seq num}. "Closes" a transaction
Free block = { next data block number }, exists at the start of each log block as a synchronization point for replay

Tx log memory model:

Start and end block pointers to the disk (circular buffer)
Next data block number in free pool, or (free pool).length if no free blocks left
Set of open transactions = { IDs (block number), pointer to log block where it is opened}
- The pointer to the opening log block is there only as an assertion check that we don't overwrite the start of a transaction.
- The set of open transactions after a recovery is also the "free" queue.

Transaction Log Size: Given that we know that a stream can only be writing a single block at a time, we know that only that many transactions can be pending. Therefore we can size the log to the maximum number of streams times the size of the remove and add entries and know that we will never truncate an open transaction (although I keep track to assert that!)

Tx free pool to index:

Step	If step fails, after recovery
return next data block number
Queue Tx: remove block
(Optional) flush log	block stays free
Stream: Write data block to disk	block stays free
Queue Tx: add block to index
flush log	block was written but will be counted as free
flush index	replay will redo add to index

I'm not totally sure whether or not I need the "remove block" for this operation -- I think I do if I allocate free pool blocks 2, 3, 4 and 4 gets committed before 2 and 3, but this might not be possible if I delay the allocation internally until I need to commit.

Tx index to index:

Step	If step fails, after recovery
Find source index to get block
Queue Tx: remove block
Queue source index: remove block
(A) flush log	block stays allocated to source
(depends A) flush source index	will be replayed
(B depends A) flush data block	block enters free queue
(C depends B) Queue Tx: add block
(D depends B) Queue dest index: add block
(E depends C) flush log	block enters free queue
(depends E, D) flush dest index	will be replayed

Note the the order of the above can change some and still have the same effect (note dependencies).

Recovery Process: Read Tx log from beginning, on each event:

On free block entry: set next data block number
On remove block entry: if block == next data block number, increment data block number. Add transaction to open Tx set. Replay remove from index: if index tail sequence number matches tx, queue remove on dest index
On add block entry: Remove transaction from open Tx set. Queue add to index if index head sequence number matches.

28-Feb-2010

Milestone 4 release. I took a little extra time from my previous report to make the import/export file formats more robust to handle discontinuities, and I realized that I missed some API support for discontinuities in StreamIntervals.

Refer to the instructions above for how to download and run RealDB. For this release, you can run these commands:

create examples/test.rdl test.rdb
describe test.rdb
bulkLoad test.rdb 1 examples/test.txt
bulkLoad test.rdb 2 examples/CarSnapshots.txt
read test.rdb 1 10 20
read test.rdb 2
intervals test.rdb 2 rpm 50 70
summary test.rdb 2 passengers

To get the output below:

% java -jar RealDB-cli.jar
RealDB Tool Interactive mode
This version is extremely experimental and the commands will change in the next release.
You can also invoke this program with arguments to run a single command non-interactively.
Use 'help' command to get a list of commands, 'exit' to end
> create examples/test.rdl test.rdb
Created database at test.rdb
> describe test.rdb
Database /home/stu4/s30/jpw9607/realdb/realdb-2010-02-28-r318/test.rdb:
  Block size     : 2048
  Database size  : 100 blocks (204800 bytes)
  Data block size: 2
Database has 2 streams:
Test (1), with 1 elements:
  0 value (Float, optional) with codec SampledAlgorithm
Stream contains records from times -1 to -1
CarSnapshots (2), with 4 elements:
  0 rpm (Float) with codec DeadbandAlgorithm
  1 speed (Float) with codec DeadbandAlgorithm
  2 passengers (UInt8) with codec StepAlgorithm
  3 driving (Boolean) with codec StepAlgorithm
Stream contains records from times -1 to -1
> bulkLoad test.rdb 1 examples/test.txt
Wrote 35 records into stream Test (ID 1)
> bulkLoad test.rdb 2 examples/CarSnapshots.txt
Wrote 30 records into stream CarSnapshots (ID 2)
> read test.rdb 1 10 20
    stream       time      value
         1         10       null
         1         11       11.0
         1         12       12.0
         1         13       13.0
         1         14       14.0
         1         15       15.0
         1         --         --
         1         17       null
         1         18       18.0
         1         19       19.0
         1         20       20.0

Read 11 records.
> read test.rdb 2
    stream       time        rpm      speed passengers    driving
         2          5        0.0        0.0          0      false
         2         20        0.0        0.0          1      false
         2         35        0.0        0.0          2      false
         2         45        0.0        0.0          3      false
         2         50      500.0        0.0          3       true
         2         65      600.0        0.0          3       true
         2         70      950.0        5.0          3       true
         2         75     1250.0       10.0          3       true
         2         80     1500.0       20.0          3       true
         2         85      700.0       25.0          2       true
         2         90      850.0       30.0          2       true
         2         95     1000.0       35.0          2       true
         2        100     1200.0       40.0          2       true
         2        105     1400.0       45.0          2       true
         2        120        0.0        0.0          2      false
         2        135        0.0        0.0          0      false

Read 16 records.
> intervals test.rdb 2 rpm 50 70
    stream    element      start        end        min        avg        max   integral
         2        rpm         50         65      500.0      500.0      500.0     7500.0
         2        rpm         65         70      600.0      600.0      600.0     3000.0
         2        rpm         70         75      950.0      950.0      950.0     4750.0

Read 3 intervals.
> summary test.rdb 2 passengers
    stream    element      count      start        end        min        avg        max   integral
         2 passengers         16          5        135        0.0 0.25384615384615383        3.0      255.0

23-Feb-2010

I am almost completely done with milestone 4, and a release will be coming this week. Since the last progress report, I have completed all of the "next steps":

Finished implementation of design and implementation of the stream compression and reconstruction framework.
Implemented record searching -- O(log(n)) time search through the index to an index block, through the index block to a data block, and within a data block to the actual record. This allows to start pulling records between any arbitrary range.
Build user API calls to ranged (raw) record finding, and stream intervals (the "reconstructed" view of the stream)
Augment the Record data model to support discontinuities, and null record elements
Implementation of a rudimentary command-line tool for interacting with the database

I'm just polishing up a few things for the release:

Double check that the reconstructed view of the stream and stream intervals are handling discontinuities/nulls properly
Making the command-line tool a little more user-friendly (but still not so amazing), by being a little more robust with the bulk data import files and output format
Build a small "crash course" description of RDL (RealDB Definition Language) and example run

Here is an example RDL that I will be using for this run:

SET blockSize     = 2048
SET fileSize      = 204800
SET maxStreams    = 3
SET dataBlockSize = 2

CREATE STREAM Test WITH ID 1 {
  value float //will use SampledAlgorithm by default
}

CREATE STREAM CarSnapshots WITH ID 2 {
  rpm float WITH CODEC DeadbandAlgorithm PARAMS (deadband=50.0),
  speed float WITH CODEC DeadbandAlgorithm PARAMS (deadband=5),
  passengers uint8 WITH CODEC StepAlgorithm,
  driving boolean WITH CODEC StepAlgorithm
}

There is one caveat to the completion of milestone 4: the current transaction handling, while functioning perfectly (as far as I know), does not meet the scalability requirements in my proposal as the current implementation is naive. It is linear time on the size of the database, while it should be linear time on the uncommitted transactions, up to the maximum user-specified transaction log size.

The command line tool (org.gillius.realdb.tools.cli.RealDBTool) will provide at least the following commands:

Create Database
Describe Database
Bulk Load (from file)
Read raw records from a stream (over an optional time range)
Read reconstructed intervals from an element within a stream (over an optional time range)
Provide a summary (min, max, avg, integral, etc) of a element within a stream's reconstruction over a time range

Older Reports

See OldReports

Follow

RealDB

Committee

Abstract

Summary

Current Status / Schedule

Project Schedule

Download

Report

License

Dependencies

Using RealDB

Running RealDB

Compiling RealDB

RealDB Definition Language

Creating Streams

RealDB Concepts

DataStream

Record

Element

StreamInterval

Discontinuities

Timestamps

Command Line Tool

Graphical Browser

Progress Reports

3-Oct-2010

28-Sept-2010

22-Aug-2010

26-May-2010

Dimensions

Implementation Notes

Metrics Collected

Configuration

Next Steps

25-Apr-2010

18-Apr-2010

11-Apr-2010

21-Mar-2010

Summary

Old Transaction System

Researching

New System

28-Feb-2010

23-Feb-2010

Older Reports