db
databases
DynamoDB
DynamoDB is a scalable NoSQL key-value database from Amazon. It doesn't enforce a schema, and a table can have multiple indexes.
tables
Before data can be written to a database, a table must be created. Because it uses hashes as keys, a theoretically infinite amount of data can be stored within a table. One way to differentiate between kinds of values is to compose the hash keys using prefixes. Generally you'll use a single table per application.
key design
Key design is crucial to a performant DynamoDB setup.
Dynamo defines 2 key types:
- hash keys: exact, bound to table
- range keys: conditional, bound to hash key. Operators: gt, lt.
If you want to query data without using map-reduce you'll need to combine hash keys with range queries.
Say we want to create a bidirectional relationship between two actors. We'd compose the keys as follows:
hash key               | range key
-----------------------|---------------------------------
content_type.entity_id | FAN_OF.content_type.entity_id
content_type.entity_id | FANNED_BY.content_type.entity_id
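The key layout above can be sketched as a small helper that produces both rows for one relationship. This is only an illustration: the function names, the `hash` and `range` attribute names, and the example entities are assumptions, not part of any DynamoDB API.

```javascript
// Hypothetical helper: compose a key from a content type and an entity id.
function entityKey (contentType, entityId) {
  return contentType + '.' + entityId
}

// Build the two items that record "fan is a fan of target" in both
// directions. The attribute names 'hash' and 'range' are assumptions.
function fanRelationItems (fan, target) {
  const fanKey = entityKey(fan.type, fan.id)
  const targetKey = entityKey(target.type, target.id)
  return [
    { hash: fanKey, range: 'FAN_OF.' + targetKey },
    { hash: targetKey, range: 'FANNED_BY.' + fanKey }
  ]
}

const items = fanRelationItems(
  { type: 'user', id: '42' },
  { type: 'band', id: '7' }
)
console.log(items)
```

Writing both rows means either side of the relationship can be looked up by its own hash key.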
Dynamo capacity is provisioned for an expected number of requests. Writes to a hash key are locked to a single process, so if you're appending a ton of data to a single key you'll be in big trouble (e.g. using < 20% of the provisioned resources, making DynamoDB 5x as expensive as it needs to be).
Writes are 5x as expensive as reads.
queries
Queries can be created using either nested JS objects (key conditions) or key-condition expressions. Key-condition expressions are recommended, as the older conditions syntax will be phased out.
{
  TableName: 'string', // required, table to read from
  KeyConditionExpression: 'string', // key value to be retrieved
  ExpressionAttributeValues: {}, // value mappings for the expression
  ConsistentRead: false, // consistency model
}
Example using the dynamo-streams module:
db.createQueryStream({
  TableName: table,
  KeyConditionExpression: '#K = :key', // target expression
  ExpressionAttributeNames: {
    '#K': 'key' // specify column, ignoring AWS's reserved words.
                // Column name here is called 'key'
  },
  ExpressionAttributeValues: {
    ':key': { S: 'bar' } // specify keys, maps onto expressions.
                         // 'S' means 'type: String'
  }
})
KeyConditionExpressions
- a = b: true if the attribute a is equal to the value b
- a < b: true if a is less than b
- a <= b: true if a is less than or equal to b
- a > b: true if a is greater than b
- a >= b: true if a is greater than or equal to b
- a BETWEEN b AND c: true if a is greater than or equal to b, and less than or equal to c
- begins_with (a, substr): true if the value of attribute a begins with a particular substring
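As a rough sketch of how these expressions combine a hash key with a range condition, the function below builds query parameters that select every FAN_OF relation of one entity. The function name, the table name and the `hash` / `range` attribute names are assumptions made for illustration; only the parameter keys come from the query shape shown earlier.

```javascript
// Build query params for "every FAN_OF relation of one entity".
// 'hash' and 'range' are assumed attribute names; they are referenced
// through #H and #R placeholders so reserved words cannot clash.
function fanOfQuery (tableName, entityKey) {
  return {
    TableName: tableName,
    // exact match on the hash key, prefix match on the range key
    KeyConditionExpression: '#H = :hash AND begins_with(#R, :prefix)',
    ExpressionAttributeNames: { '#H': 'hash', '#R': 'range' },
    ExpressionAttributeValues: {
      ':hash': { S: entityKey },
      ':prefix': { S: 'FAN_OF.' }
    },
    ConsistentRead: false
  }
}

const params = fanOfQuery('relations', 'user.42')
console.log(JSON.stringify(params, null, 2))
```

The resulting object could then be handed to a query call such as the createQueryStream example above.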
resources
- dynamodb/query-and-scan
- dynamodb/query-property
- dynamodb/api-query
- aws-blog/key-condition-expression
- dynamodb/key-condition-expression
modules
- dynamo-streams - A stream-flavored wrapper for the AWS DynamoDB JavaScript API
- dynamo-client - A low-level client for accessing DynamoDB from node.js
- dynamo-down - A leveldown API implementation on AWS DynamoDB
- dynalite - A mock implementation of Amazon's DynamoDB built on LevelDB
links
leveldb
Think of it as a Towers of Hanoi abstraction. Merges are expensive, so leveldb writes to small files and merges only when n stores can be merged.
- in-memory skiplist (sorted by keys; like a linked list, but with multiple links per node)
- SST (Sorted String Table, sorted by keys)
- ordered log file (by timestamp)

Writes come in and are dumped in a log, then:
- indexed in the in-memory skiplist
- flushed to an on-disk SST once a threshold is reached
- when n SSTs of a certain size exist, they are merged into a larger (and thus more efficient) SST
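The steps above can be sketched with a toy in-process store. This is not how leveldb is implemented: the class name, both thresholds, and the use of a Map and plain arrays in place of real in-memory and on-disk structures are all assumptions chosen to keep the example short.

```javascript
// Toy sketch only: names and thresholds are illustrative, not leveldb's.
const MEMTABLE_LIMIT = 2 // flush once the in-memory table holds this many keys
const MERGE_FANOUT = 2 // merge once this many tables share a level

// Comparator for [key, value] pairs, ordering by key.
const byKey = ([a], [b]) => (a < b ? -1 : a > b ? 1 : 0)

class ToyLsm {
  constructor () {
    this.log = [] // append-only log, ordered by arrival
    this.memtable = new Map() // stand-in for the in-memory sorted structure
    this.levels = [] // levels[i] holds the sorted tables of level i
  }

  put (key, value) {
    this.log.push([key, value]) // 1. dump the write in the log
    this.memtable.set(key, value) // 2. index it in memory
    if (this.memtable.size >= MEMTABLE_LIMIT) this.flush()
  }

  // 3. turn the in-memory table into a sorted table at level 0
  flush () {
    const table = [...this.memtable].sort(byKey)
    this.memtable = new Map()
    this.log = [] // flushed entries no longer need the log
    this.addTable(0, table)
  }

  // 4. when enough tables share a level, merge them into the next level
  addTable (level, table) {
    if (!this.levels[level]) this.levels[level] = []
    this.levels[level].push(table)
    if (this.levels[level].length < MERGE_FANOUT) return
    const merged = new Map()
    for (const t of this.levels[level]) { // oldest first, so newer values win
      for (const [k, v] of t) merged.set(k, v)
    }
    this.levels[level] = []
    this.addTable(level + 1, [...merged].sort(byKey))
  }
}

const db = new ToyLsm()
db.put('b', 1)
db.put('a', 2) // memtable full: flushed to level 0
db.put('c', 3)
db.put('a', 4) // second flush: level 0 is merged into level 1
console.log(JSON.stringify(db.levels))
```

Because merges only happen when a level fills up, most writes touch nothing but the log and the in-memory table.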
Sorted files can be searched using binary search. One large file is faster to search than many small ones, but merging files is sort of slow (not really, but hey). So leveldb only merges files once in a while, which results in several levels of files.
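A minimal sketch of the lookup side, assuming each sorted file is represented as an array of [key, value] pairs ordered by key. The function names and the sample tables are made up for illustration.

```javascript
// Find a key in one sorted table of [key, value] pairs using binary search.
function searchTable (table, key) {
  let lo = 0
  let hi = table.length - 1
  while (lo <= hi) {
    const mid = (lo + hi) >> 1
    const k = table[mid][0]
    if (k === key) return table[mid][1]
    if (k < key) lo = mid + 1
    else hi = mid - 1
  }
  return undefined
}

// Check tables from newest to oldest; the first hit wins.
function lookup (tables, key) {
  for (const table of tables) {
    const value = searchTable(table, key)
    if (value !== undefined) return value
  }
  return undefined
}

const newest = [['a', 4], ['c', 3]]
const oldest = [['a', 2], ['b', 1]]
const found = lookup([newest, oldest], 'a')
console.log(found)
```

Every extra table adds another binary search to a lookup, which is why fewer, larger tables are cheaper to read.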