Getting started with YDN-DB javascript module
Setup
YDN-DB is a pure javascript library which use HTML5 browser database as storage mechanisms, which include IndexedDB, WebDatabase (WebSQL) and WebStorage (localStorage). Most moden browsers including IE10, Chrome, Firefox and Safari support either IndexedDB or WebSQL. The library also work on enviroment supporting these database such as phonegap, WebView and UIWebView mobile clients.
Download the latest version of distribution. For development you should want to use ydn.db-x.x-dev.js, which enforce assertion, log warning and error.The namespace of this library is ydn.db. The script
provide main javascript class object call ydn.db.Storage.
Database connection
A simple way to initialize a database is by specifying a database name.
db = new ydn.db.Storage('db-name');
db.put('store-name', {message: 'Hello world!'}, 'id1');
db.get('store-name', 'id1').always(function(record) {
console.log(record);
});
The storage instance, ydn.db.Storage, connect to
suitable data storage mechanisms starting from IndexedDB to WebSQL to
localStore. It will open existing database or create a new database
with the given name.
All database operation methods are asynchronous and result are
immediately return as a promise object. It is also called future or
deferred object. The resulting deferred object accepts two
asynchronous functions: done to receive value
on success event and fail to received in case of
error. A convenient then accept both of them. In
case of error, this library always invokes error callback
with Error object with has name attribute
and message attribute.
goog.async.Deferred
class, which is also similar to JQuery
Deferred or WinJS.Promise.
Both of them provides advanced asynchronous
workflow. Storing data
Use put method to insert a new or update existing
record(s).
Out-of-line key
db.put('store1', {test: 'Hello World!'}, 123)
The first argument is store name. It is object
store name in IndexedDB and TABLE name in WebSQL. A table or
object store will be created if not exist.
The second argument is the record we want to store. It should be a
simple object. A structured
clone
of the object is stored in the database. A structured clone is
similar concept in JSON serialization. But it is more efficient and
more powerful. File and Blob are clonable, while DOM Element is
not. If the record fail to clone it, underlying database API shall
throw DataCloneError. None of the error from the
database API are caught by the library and hence not documented. In
some instance, dev version of the library may caught the error and
rethrow for providing better error message.
The third argument is primary key of the record. A key can be number, string, Date or simple array. Since we are given primary key separately from the record, it is called out-of-line key.
Use get method to retrieve it by the primary key.
req = db.get('store1', 123);
req.done(function(record) {
console.log(record);
});
req.fail(function(e) {
throw e;
});
Deferred object is not only used for asynchronous workflow, but also for error control. You should not be surprised that all of database operation methods expect some errors. And it should be handled, at least, as above by re-throwing the error object. Handling by single global error function is discouraged. If error is an exception, it will have full stack trace. However database operation error are coded error and not exception error.
In-line key
Primary key can be embedded into the record. Embedding key is better because it enforces consistency. We specify how to extract the primary key from the record by using keyPath.
record = {id: 'abc', message: 'Testing in line key'};
req = db.put({name: 'store2', keyPath: 'id'}, record);
req.done(function(key) {
console.log(key);
});
req.fail(function(e) {
throw e;
});
Here, we are giving store schema in the first argument instead of just store name. The two store schema attribute are 'name' and 'keyPath'. A record is inserted to 'store2' store by using primary key 'abc'. Since primary key is specified in the record object, it is called, in-line key.
Set autoIncrement of the store schema attribute
to true to let the database engine to generate a
increasing primary key when it is not provided. Auto key
generation works for both in-line and out-of-line key.
Retrieving
Keys are the most efficient way to retrieve a record. If we don't know the key, we must browse the whole store to find it. Let us add some more records and list them.
db.put('store2', [{id: 'a', message: 'a record'}, {id: 'b', message: 'b record'}]);
Notice multiple records are stored by using array of records in one transaction.
Listing use values database operation method.
db.values('store2', null, 10).done(function(records) {
console.log(records);
});
The results is sort by ascending order of primary key. The second parameter is a limit to the number of results. It is a good practice to specify a limit.
Listing is the most expensive database operations. Unused records retrieved from the database are termed as false-positive retrieval, of which we must minimized. Instead you will list only keys as follow.
db.keys('store2', null, 10).done(function(records) {
console.log(records);
});
In contrast to listing records, key listing is much faster because it
obviates serialization and also (in most case) IO operation. IO
operation takes 2 to 4 order of magnitudes slower than memory
operation. One of the main purpose of query is reducing false-positive
retrieval of record value value. Key is very important for effective
querying. And hence keys are carefully constructed. In addition to
primary key, there is secondary key, which is simply called
as index key.
Schema
While running the above codes, we modified database schema on creating new stores. It is not good in production usage, because modifying database schema is not a trivial process. It need to notified all connections on other tabs including worker thread as well. Additionally we should use a fixed schema through out a web application for consistency. Schema, that you just created, is retrieved as follow.
db.getSchema()
You will find database version is undefined, since we
are not giving a version. The database is said to be in auto-version
mode. The actual schema from the database connection is retrieved as
follow.
db.getSchema(function(schema) {
console.log(schema);
});
These two schema are similar.
A database schema is basically defining object stores or TABLE in WebSQL. An example is
var author_store_schema = {
name: 'author',
keyPath: 'email', // optional,
autoIncrement: false, // optional.
indexes: [
{
name: 'born', // optional
keyPath: 'born',
unique: false, // optional, default to false
multiEntry: false // optional, default to false
}, {
keyPath: 'company'
}, {
keyPath: 'hobby',
multiEntry: true
}
] // optional, list of index schema as array.
};
schema = {
stores: [author_store_schema]
};
The above schema define one object store. The name of object store is
'author'. Since keyPath is defined, it is using in-line
key. Since autoIncrement is false, all
records must have a valid key in its 'emial' field attribute.
The object store 'author' has three indexes, named 'born',
'company' and 'hobby'. In WebSQL, it is a column name. If name
is not defined, it is default to keyPath. An example
'author' record will be:
author_1 = {
email: 'me@aaronsw.com',
born: 531763200000,
first: 'Aaron',
last: 'Swartz',
company: 'Reddit',
hobby: ['programming', 'blogging', 'politics']
};
Defining a data type is recommended if it is intended to used in WebSQL. Being schema-less database system, a record could have arbitrary fields in addition to index fields. In WebSQL, these extra fields are stringified and stored in special default column of BLOB data type. BLOB data type is not queryable.
If index schema attribute, unique is true,
the underlying database engine will check for every new record write
to make sure that index key value is unique. If not unique, it will
throw ConstriantError.
The index schema attribute, multiEntry is meaningful
only for field value of array data type. The index 'hobby' has multiEntry
of true, so that each element in of the array hobby are
indexed individually.
In addition to 'stores' attribute, database schema take 'version' attribute. If version number is specified, the library will open with the given version. If the client browser do not have or lower than the given version, it will be upgraded as necessary. Client version must not be higher than given version. If client version is the same as given version, the database schema must be similar. If not similar, the library will refuse to connect the database. This library will not work, if schema is not known.
Let us generate some data for querying.
genAuthors = function(n) {
var out = [];
for (var i = 0; i < n; i++) {
out[i] = {
first: randName(),
last: randName(),
born: +(new Date(1900+Math.random()*70, 12*Math.random(), 30*Math.random())),
email: randEmail(),
company: pickOne(companyList),
hobby: pickMany(hobbyList)
};
}
return out;
};
db = new ydn.db.Storage('test-2', schema);
var authors = genAuthors(10000);
db.put('author', authors).then(
function(ids) {
console.log(ids.length + ' authors put.');
}, function(e) {
throw e;
}
);
put method uses multiple asynchronous threads and
generally very fast.
Basic query
If you don't know the exact key, you must enumerate the whole or a range of object store records. Here the process is called querying. Instead of enumerating records, you can only enumerate only key. This is called key only query. Key only query is very efficient because the most expensive operations of IO and de-serialization are avoid.
Counting
Generally NoSQL database does not provide counting, but IndexedDB API does.
db.count('author').done(function(x) {
console.log('Number of authors: ' + x);
});
It is not fast. It is recommended that the result be cached. However, interestingly, Firefox uses SQLite backend and counting is fast.
This is the only aggregrate function provided by the IndexedDB API.
Sorting
Keys are sorted in the database. Hence sorting is required, the field is indexed. The following illustrate iterating records sorted by 'born' date field.
var reverse = false; // sorted by ascending of date
var limit = 10;
db.values(new ydn.db.Cursors('author', 'born', null, reverse), limit).done(function(records) {
console.log(records);
});
This query result is sorted by 'email', the primary key.
var key_range = null;
db.keys('author', key_range, 100).done(function(records) {
console.log(records);
});
Instead of values, here keys methods is
used, so that only emails, primary keys, are listed.
Similarly, we use keys method to index key
iterator when we only want sorted secondary key. The following
example list hobby index. We set unique attribute to true,
so that we get unique list of hobby.
var unique = true;
var hobby_key_iter = new ydn.db.Cursors('author', 'hobby', null, false, unique);
db.keys(hobby_key_iter).done(function(hobby) {
console.log(hobby);
});
The above code snippet use key iterator, you can also use value iterator.
var hobby_iter = new ydn.db.IndexValueCursors('author', 'hobby', null, false, unique);
db.keys(hobby_iter).done(function(hobby) {
console.log(hobby);
});
Both iterators return effective keys, 'hobby'. The return values will
be different if keys method's complementary function, values,
is used.
db.values(hobby_key_iter).done(function(refs) {
console.log(refs);
});
db.values(hobby_iter).done(function(refs) {
console.log(refs);
});
values method returns reference value of the
iterator. The reference value of index key iterator is
primary key and that of index value iterator is record
value.
Filtering
The primary way of filtering is query by key range. More sophisticated filtering are iterated merging of key range results. We dedicate these sophisticated filtering on later sections.
The following query finds authors born in 1942 February.
var lower = + new Date(1942, 1, 1); // 1942 February 1
var upper = + new Date(1942, 2, 1); // 1942 March 1
var iter = ydn.db.IndexValueCursors.where('author', 'born', '>=', lower, '<', upper);
db.values(iter).done(function(records) {
console.log(records);
records.map(function(x) {
console.log(x.first + ' ' + x.last + ' ' + new Date(x.born));
});
});
A key range is a continuous segment of keys defined by interval by
using lower and upper bound.
To retrieve all keys within a certain interval, you can construct a key range as follow:
| Range | Code |
|---|---|
| All keys ? x | ydn.db.KeyRange.upperBound(x) |
| All keys < x | ydn.db.KeyRange.upperBound(x, true)
|
| All keys ? y | ydn.db.KeyRange.lowerBound(y) |
| All keys > y | ydn.db.KeyRange.lowerBound(y, true) |
| All keys ? x && ? y | ydn.db.KeyRange.bound(x, y) |
| All keys > x &&< y | ydn.db.KeyRange.bound(x, y,
true, true) |
| All keys > x && ? y | ydn.db.KeyRange.bound(x, y,
true, false) |
| All keys ? x &&< y | ydn.db.KeyRange.bound(x, y,
false, true) |
| The key = z | ydn.db.KeyRange.only(z) |
| All (string or array) keys start with a | ydn.db.KeyRange.starts(a) |
Paging
keys and values methods accept limit
and offset for paging. Iterator has position state
property. The position of an iterator is the last cursor
position defined by effective key and primary key.
The following query list all authors having hobby in 'camping' using
iterator and paging.
var iter = new ydn.db.IndexValueCursors('author', 'hobby', ydn.db.KeyRange.only('camping'));
var limit = 20;
db.values(iter, limit).then(function(records) {
console.log(records);
console.log('From ' + records[0].email + ' to ' + records[records.length - 1].email);
}, function(e) {
throw e;
});
Also notice that the result is sorted by primary key. To continue next page, call list method again.
db.values(iter, limit).then(function(records) {
console.log(records);
console.log('From ' + records[0].email + ' to ' + records[records.length - 1].email);
}, function(e) {
throw e;
});
Notice that offset value is not used.
Map
The library provides map database iteration methods. It can be used to emulate projection query.
var iter = new ydn.db.ValueCursors('author', ydn.db.KeyRange.starts('a'));
var names = [];
db.map(iter, function(value) {
names.push(value.first + ' ' + value.last + ' <' + value.email + '>');
}).then(function() {
console.log(names);
}, function(e) {
throw e;
});
Reduce
For aggregration query can be emulate by using reduce database iteration method.
The follow snippet calculate average age of authors.
var iter = new ydn.db.Cursors('author', 'born');
var initial = 0;
var now = +new Date();
db.reduce(iter, function(previous, current, index) {
var age_ms = now - current;
var age = age_ms / (1000 * 60 * 60 * 24 * 365);
return (age + previous * index) / (index + 1);
}, initial).then(function(result) {
console.log('Average age: ' + result);
}, function(e) {
throw e;
});
Updating
Use open method to update records.
var iter = new ydn.db.ValueCursors('author', ydn.db.KeyRange.starts('a'));
var mode = 'readwrite';
var updated = 0;
var deleted = 0;
db.open(iter, function(cursor) {
var author = cursor.value();
if (author.company == 'Oracle') {
cursor.clear().then(function(e) {
deleted++;
}, function(e) {
throw e;
});
} else if (author.category != 'A') {
author.category = 'A';
cursor.update(author).then(function(e) {
updated++;
}, function(e) {
throw e;
});
}
}, mode).then(function() {
console.log(updated + ' records updated, ' + deleted + ' deleted.');
}, function(e) {
throw e;
});
Debugging
Use dev version of the compile JS file during development. Put .map file in the same directory as js file. Turn on source map for debugging with full source code. Use build-in logging facility to detail logging.
var module = 'ydn.db';
var level = 'finer'; // warning, info, fine, finest, all
ydn.debug.log(module, level);
Summary
| Read | Description |
|---|---|
| get | Get a record |
| keys | Get a list ordered keys, or effective keys of an iterator |
| values | Get a list ordered records, or reference values of an
iterator |
| Write | Description |
| add | Add record(s) |
| clear | Delete record(s) |
| put | Add or update record(s) |
| Iterate | Description |
| map | Map reference values of an iterator |
| open | Open an iterator for read/write operation |
| reduce | Reduce reference values of an iterator |
| scan | Scan keys of an iterator |
| Aggreate | Description |
| count | Count records |
Exercises
Write a script that will throw DataCloneError.
There are two primary way that cause DataCloneError.
One is the data include unsupported data type.
db.put('store1', {ele: document.getElementsByTagName('div')[0]}, 2)Another one is circular referencing.
cir_array = [1]; cir_array.push(cir_array); db.put('store1', {data: cir_array}, 3);Write a script that will throw DataError.
Why a function cannot be persisted in the database? How will you re-store an javascript object?
Function is not persisted in the database because function is not serialized by structured clone algorithm.
Unlike JSON object, javascript object has methods, which are functions and will not be restored. The technique for restoring javascript object is similar to serialization in other programming language. A static factory method or the constructor that accept serialized object data is used.
How do you create an index to 'article' store to get primary keys sorted by 'first' followed by 'last'?
- How will you test a key is existed in an object store?The most effecient way to check existence of key is using
countmethod, which return 1 if the key exists in the object store.getmethod is not only inefficient, since it involves record retrieval, but also fail to distinguish between undefined record value and non-existing key. Explain indexing is useful for projection query or not? Plan a database schema to facilitate projection querying to 'first' and 'last' field of 'article' store.
How will you query a list of emails from a certain domain, such as 'gmail.com'?
IndexedDB API does not support indexing boolean value. Why? What are the implications?
Describe how listed index can be used for many-to-many relationship.