db

databases

DynamoDB

DynamoDB is a scalable NoSQL key-value store from Amazon. Apart from the key attributes it doesn't enforce a schema, and a table can have multiple indexes to query the data by.

tables

Before data can be written to a database, a table must be created. Because items are located by hashing their keys, a theoretically unlimited amount of data can be stored within the table. One way to differentiate between different kinds of values is to compose the hash keys using prefixes. Generally you'll use a single table per application.

key design

Key design is crucial to getting good performance out of DynamoDB.

DynamoDB defines two key types:

  • hash keys: exact, bound to table
  • range keys: conditional, bound to hash key. Operators include <, >, BETWEEN and begins_with.

If you want to query data without using map-reduce you'll need to combine hash keys with range keys.

Say we want to create a bidirectional relationship between two actors. We'd compose the keys as follows:

hash key                 |   range key
-------------------------------------------------------------
content_type.entity_id   |   FAN_OF.content_type.entity_id
content_type.entity_id   |   FANNED_BY.content_type.entity_id
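The layout above can be sketched as a small helper that builds both rows of the relationship. This is only an illustration: the function names and the attribute names `hash` and `range` are made up for the example, and nothing is written to DynamoDB here.

```javascript
// Compose 'content_type.entity_id', e.g. keyPart('user', 1) gives 'user.1'.
function keyPart (contentType, entityId) {
  return contentType + '.' + entityId
}

// Build both directions of a "fan" relationship.
// The attribute names `hash` and `range` are hypothetical.
function fanItems (fan, target) {
  const fanKey = keyPart(fan.type, fan.id)
  const targetKey = keyPart(target.type, target.id)
  return [
    { hash: fanKey, range: 'FAN_OF.' + targetKey },
    { hash: targetKey, range: 'FANNED_BY.' + fanKey }
  ]
}

const items = fanItems({ type: 'user', id: 1 }, { type: 'band', id: 7 })
console.log(items)
// [ { hash: 'user.1', range: 'FAN_OF.band.7' },
//   { hash: 'band.7', range: 'FANNED_BY.user.1' } ]
```

Storing the relationship twice costs an extra write, but each side can then be looked up by its own hash key.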

DynamoDB capacity is scaled to the number of requests and spread over partitions. All writes to a single hash key end up on the same partition. If you're appending a ton of data to a single key, that one partition becomes the bottleneck while the rest sits idle, and you'll be in big trouble (e.g. using < 20% of resources, making DynamoDB 5x as expensive as need be).

Writes are 5x as expensive as reads.

queries

Queries can be created using either nested JS objects (the KeyConditions parameter) or key-condition expressions (the KeyConditionExpression parameter). Using key-condition expressions is recommended, as KeyConditions is a legacy parameter.

{
  TableName: 'string',                // required, table to read from
  KeyConditionExpression: 'string',   // condition the key(s) must match
  ExpressionAttributeValues: {},      // value mappings for the expression
  ConsistentRead: false,              // true for strongly consistent reads
}

Example using the dynamo-streams module:

db.createQueryStream({
  TableName: table,
  KeyConditionExpression: '#K = :key', // condition, written with placeholders
  ExpressionAttributeNames: {
    '#K': 'key' // alias for the column name, sidesteps AWS's reserved words.
                // Column name here is called 'key'
  },
  ExpressionAttributeValues: {
    ':key': { S: 'bar' }  // values for the placeholders in the expression.
                          // 'S' means 'type: String'
  }
})

KeyConditionExpressions

  • a = b: true if the attribute a is equal to the value b
  • a < b: true if a is less than b
  • a <= b: true if a is less than or equal to b
  • a > b: true if a is greater than b
  • a >= b: true if a is greater than or equal to b
  • a BETWEEN b AND c: true if a is greater than or equal to b, and less than or equal to c
  • begins_with (a, substr): true if the value of attribute a begins with a particular substring
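As a rough sketch of how these operators combine with the key layout from the key design section, the helper below builds query parameters that list everything an entity is a fan of: an equality condition on the hash key plus begins_with on the range key. The function name and the attribute names `hash` and `range` are assumptions made for the example; it only builds the parameter object and does not talk to AWS.

```javascript
// Build query parameters for "everything this entity is a fan of".
// The attribute names `hash` and `range` are hypothetical; the #H and #R
// aliases keep them from clashing with reserved words.
function fanOfQuery (table, contentType, entityId) {
  return {
    TableName: table,
    KeyConditionExpression: '#H = :hash AND begins_with(#R, :prefix)',
    ExpressionAttributeNames: { '#H': 'hash', '#R': 'range' },
    ExpressionAttributeValues: {
      ':hash': { S: contentType + '.' + entityId }, // exact match on hash key
      ':prefix': { S: 'FAN_OF.' }                   // prefix match on range key
    }
  }
}

const params = fanOfQuery('relations', 'user', 1)
console.log(params.KeyConditionExpression)
```

The resulting object has the same shape as the one passed to createQueryStream in the example above.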

resources

modules

  • dynamo-streams - A stream-flavored wrapper for the AWS DynamoDB JavaScript API
  • dynamo-client - A low-level client for accessing DynamoDB from node.js
  • dynamo-down - A leveldown API implementation on AWS DynamoDB
  • dynalite - A mock implementation of Amazon's DynamoDB built on LevelDB

leveldb

Towers of Hanoi abstraction. Merges are expensive, so write to small files and merge whenever n stores can be merged.

  • in memory skiplist (sorted by keys, like a linked list but multi links)
  • SST (Sorted String Table, sorted by keys)
  • ordered log file (by timestamp)

  • writes come in, dumped in a log
  • indexed in the in-memory skiplist
  • flushed to an on-disk SST once a threshold is reached
  • when n SSTs of a certain size exist, they are merged into a larger (and thus more efficient) SST

Sorted files (SSTs) can be searched using binary search. One large sorted file is faster to search than many small ones, but merging files is sort of slow (not really, but hey). So leveldb only merges them once in a while, which results in several levels of files.
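The flush-and-merge flow can be illustrated with a toy in-memory store. This is a simplified sketch and not leveldb's actual implementation: a Map stands in for the skiplist, sorted arrays stand in for SSTs, the log file and deletes are left out, and the names ToyStore, flushAt and mergeAt are made up for the example.

```javascript
// Toy log-structured merge store, for illustration only.
class ToyStore {
  constructor (flushAt = 2, mergeAt = 2) {
    this.flushAt = flushAt    // memtable size that triggers a flush
    this.mergeAt = mergeAt    // tables per level that trigger a merge
    this.memtable = new Map() // stands in for the in-memory skiplist
    this.levels = []          // levels[i] = sorted tables, newest last
  }

  put (key, value) {
    this.memtable.set(key, value)
    if (this.memtable.size >= this.flushAt) this.flush()
  }

  // Turn the memtable into a sorted table on level 0.
  flush () {
    const table = [...this.memtable].sort((a, b) => (a[0] < b[0] ? -1 : 1))
    this.memtable = new Map()
    this.addTable(0, table)
  }

  // Add a table to a level; merge the level upwards once it is full.
  addTable (level, table) {
    if (!this.levels[level]) this.levels[level] = []
    this.levels[level].push(table)
    if (this.levels[level].length < this.mergeAt) return
    const merged = new Map()
    // Oldest table first, so newer values overwrite older ones.
    for (const t of this.levels[level]) {
      for (const [k, v] of t) merged.set(k, v)
    }
    this.levels[level] = []
    this.addTable(level + 1, [...merged].sort((a, b) => (a[0] < b[0] ? -1 : 1)))
  }

  get (key) {
    if (this.memtable.has(key)) return this.memtable.get(key)
    // Lower levels hold newer data; within a level the newest table is last.
    for (const tables of this.levels) {
      for (let i = tables.length - 1; i >= 0; i--) {
        const hit = binarySearch(tables[i], key)
        if (hit !== undefined) return hit
      }
    }
    return undefined
  }
}

// Binary search over a table of [key, value] pairs sorted by key.
function binarySearch (table, key) {
  let lo = 0
  let hi = table.length - 1
  while (lo <= hi) {
    const mid = (lo + hi) >> 1
    if (table[mid][0] === key) return table[mid][1]
    if (table[mid][0] < key) lo = mid + 1
    else hi = mid - 1
  }
  return undefined
}

const store = new ToyStore(2, 2)
store.put('a', 1)
store.put('b', 2) // memtable full: flushed to level 0
store.put('a', 3)
store.put('c', 4) // second flush fills level 0: merged into level 1
console.log(store.get('a')) // 3, the newer value wins
```

Each lookup checks the memtable first and then runs one binary search per table, which is why fewer, larger tables are cheaper to read.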
