So, You Want to Build a Database - Part 2: The Heart of a Database, CRUD(S)

Wednesday, February 10, 2016 - Jerry Sievert

All databases consist of a handful of operations: Create, Read, Update, Delete, and sometimes Scan. If this sounds a lot like reading and writing files on a disk, it is. At their core, databases need access to both data and metadata. An operating system provides both in a very basic form: the filesystem. As a simple approach to building a database, relying on a filesystem is key, as it provides a lot of metadata and lets us move forward with one less layer to deal with.

Filesystems allow for the naming of a file (the key), contents of a file (the value), and some metadata such as size and time written. This gives us a very powerful key/value store with very little work. What a filesystem does not provide is guarantees of writes, nor thread safety when reading and writing files. For now, the choice is to ignore that and build a system that works with the flaws by design. This is where the choice of Node.js comes in handy for this particular build; the idea is simplicity.

All database entries are assumed to be objects that are ready to be converted into a string for storage via JSON.stringify(), and converted back into an object via JSON.parse() for retrieval. If an error is encountered, the callback will be called with the error as the first argument, as per the Node.js convention.
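As a minimal sketch of that convention, the two conversions can be wrapped in a pair of helpers that report failures through the callback instead of throwing. The names `encode` and `decode` are hypothetical and exist only for illustration; they are not part of any library.

```javascript
// Hypothetical helpers, for illustration only.

// Convert an object into a string for storage.
function encode (value, callback) {
  var string;

  try {
    // JSON.stringify() throws on circular references
    string = JSON.stringify(value);
  } catch (err) {
    return callback(err);
  }

  callback(null, string);
}

// Convert a stored string back into an object.
function decode (string, callback) {
  var value;

  try {
    // JSON.parse() throws a SyntaxError on malformed input
    value = JSON.parse(string);
  } catch (err) {
    return callback(err);
  }

  callback(null, value);
}
```

In both helpers the callback is invoked outside of the try block, so an exception thrown by the caller's own code is never mistaken for a conversion error.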

Remember, this is a very simple interface for accessing data. A lot of assumptions are being made, so while this code is written as instructional, it is not meant to be used in production. There are better ways to accomplish these ideas and theories: for instance, you would never want to store a large number of entries in a directory as files.

Create

Creating an entry is pretty straightforward. The idea is to simply create (or overwrite) a file. The key becomes the filename, and the object is stringified and becomes the contents of the file. Thus, the method signature becomes:

function (key, value, callback) {
  // naive logic goes here
  var string = JSON.stringify(value);

  fs.writeFile(key, string, callback);
}

Read

Reading a value is also easy - read the file, attempt to transform it to an object, and return the results (or error).

function (key, callback) {
  // read the data in, again using the Node.js callback convention
  fs.readFile(key, function (err, data) {
    // if there is an error, just return it directly - typically no file, or a permissions error
    if (err) {
      return callback(err);
    }

    // attempt to decode the JSON
    var ret;
    try {
      ret = JSON.parse(data);
    } catch (e) {
      return callback(new Error("Unable to decode JSON object"));
    }

    // call back outside of the try block, so that an exception thrown by
    // the callback itself is not mistaken for a JSON error
    callback(null, ret);
  });
}
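To see create and read working as a pair, here is a minimal sketch that keeps every key inside a single directory. `SimpleStore` and its `filename` method are hypothetical names chosen for this example, and keys are assumed to be plain filenames without path separators.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical wrapper, for illustration only.
function SimpleStore (directory) {
  this.directory = directory;
}

// Map a key to a file inside the store's directory.
// Assumes the key is a plain filename (no "/" or "..").
SimpleStore.prototype.filename = function (key) {
  return path.join(this.directory, key);
};

// Create (or overwrite) an entry.
SimpleStore.prototype.create = function (key, value, callback) {
  fs.writeFile(this.filename(key), JSON.stringify(value), callback);
};

// Read an entry back and decode it.
SimpleStore.prototype.read = function (key, callback) {
  fs.readFile(this.filename(key), 'utf8', function (err, data) {
    if (err) {
      return callback(err);
    }

    var value;
    try {
      value = JSON.parse(data);
    } catch (e) {
      return callback(e);
    }

    callback(null, value);
  });
};
```

Reading a key that was never created surfaces the filesystem's own error (typically with `err.code` set to `'ENOENT'`) as the first argument of the callback.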

Update

Since a filesystem is being used, an update becomes writing a file, just like create.

function (key, value, callback) {
  // naive logic goes here
  var string = JSON.stringify(value);

  fs.writeFile(key, string, callback);
}

Delete

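Deletion is also fairly straightforward, but in order to do a clean deletion we must handle the case of a missing file: deleting a key that does not exist should not be reported as an error. Note that fs.unlink() reports failures through its callback, so wrapping the call in try/catch would not see them.

function (key, callback) {
  fs.unlink(key, function (err) {
    // a missing file (ENOENT) means the key is already gone, which is fine
    if (err && err.code !== 'ENOENT') {
      // anything else (permissions, for instance) is passed along
      return callback(err);
    }

    callback();
  });
}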

Scan

So far, a very simple key/value store has been built. But that is not quite enough for a queryable database. This is where additional functionality can be added: scan. Unfortunately, for those not familiar with Node.js, this is where things can get a little more complicated. In the spirit of making sure that this collection of blog posts can be followed by anyone, it will first be broken into pseudo-code and then written in real code:

scan:
  files = all filenames in directory

  for each file in files:
    data = read file
    output: { key: file, value: data }

In Node.js, this is a Stream. The convention is to create the Stream and return a reference to it for future action as an event emitter:

var stream = createDirectoryStream(directory);

stream.on('data', function (data) {
  // do something
});

stream.on('end', function () {
  // yay! we are done!
});

Using this, it is fairly straightforward for someone with Node.js experience to construct a simple stream to do the heavy lifting. I am not going to show the code in this instance, but instead refer to the [GitHub repository](https://github.com/JerrySievert/byod-store) with the source code.
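For readers who would like a feel for the shape of such a stream without leaving the page, here is a minimal sketch. It is not the code from the repository: this `createDirectoryStream()` is a hypothetical implementation written under a few assumptions, namely that the directory contains only JSON files and that the Node.js version is recent enough to provide `stream.destroy()`.

```javascript
var fs = require('fs');
var path = require('path');
var Readable = require('stream').Readable;

// Hypothetical implementation, for illustration only.
function createDirectoryStream (directory) {
  var files = null; // filenames still to be read; null until listed
  var stream = new Readable({ objectMode: true });

  // read the next file and push it as { key, value }
  function next () {
    if (files.length === 0) {
      // no more files: pushing null signals 'end'
      return stream.push(null);
    }

    var file = files.shift();

    fs.readFile(path.join(directory, file), 'utf8', function (err, data) {
      if (err) {
        return stream.destroy(err);
      }

      var value;
      try {
        value = JSON.parse(data);
      } catch (e) {
        return stream.destroy(e);
      }

      stream.push({ key: file, value: value });
    });
  }

  // called by the stream machinery whenever it wants more data
  stream._read = function () {
    if (files !== null) {
      return next();
    }

    // first call: list the directory, then start reading
    fs.readdir(directory, function (err, names) {
      if (err) {
        return stream.destroy(err);
      }

      files = names;
      next();
    });
  };

  return stream;
}
```

Each call to `_read` results in exactly one `push`, so files are read one at a time and only as fast as the consumer asks for them.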

Here, we can put it all together:

var Store = require('byod-store');

var db = new Store({ directory: 'myDirectory' });

db.add("book", {
  "title": "A Simple Document",
  "chapters": [
    "one",
    "two",
    "three"
  ]
}, function (err) {
  // check for error
});

At this point, a simple database storage engine has been created, relying on the filesystem to do most of the heavy lifting. This is the basis of a simple key/value store, or the simplest implementation of a NoSQL database. But what if we want to do more? The next entry will go further, into the realm of making simple queries.