PouchDB 4.0.1 - Gotta Go Fast

By: Nolan Lawson
Published: 01 September 2015

I'm a big fan of classic video games. So when I hack on PouchDB, my inspiration is this guy:

In Sonic the Hedgehog, the titular hero taps his foot impatiently any time you stop moving. It's almost as if he's egging you on, urging you to go faster. Come on, what's the holdup?

With PouchDB, performance is one of our top concerns; we don't want any holdups either. So 4.0.1 is a patch release containing some bugfixes, but mostly a lot of speedups.

This blog post has a detailed performance report as well as some community news, but first the changelog:

Changelog

Bugfixes

Correctly support start_key and end_key aliases (#3833 #4154)
Fix memory leak in PouchDB constructor (#4157 #4168 #4182)
Update Uglify to patch security vulnerability (#4203)
Fix inconsistent Date serialization in IndexedDB (#3444)
Fix error for invalid doc_ids (#2204)
Better CORS warning message (#4189)
Fix memory adapter in IE10 by shimming Function.prototype.name (#4216)
Don't fail replication if fetching uuid fails (#4094)

Code cleanup, simplification

Update lie (#4130)
Deleted/simplified lots of code (#4103 #4121 #4160 #4144)

Performance improvements

Avoid extra HTTP call for destroy() (#4159)
Various fixes to improve performance of updating documents (#3921 #4140 #4149 #4151 #4162 #4183 #4110 #4185 #4188)
Improve performance of api.html page in docs (#4117)

Community news

Slack channel

PouchDB now has a Slack channel. It's linked to the IRC channel, so any messages will show up in both.

Please note: this isn't a vote against IRC. While we love open-source, decentralized software (such as PouchDB itself!), many folks find it easier and more familiar to collaborate with Slack. So we're happy to support both systems.

First timers

Inspired by Kent C. Dodds's article "First Timers Only", we set up two issues designed for first-time contributors, with detailed instructions to get started with the PouchDB source code.

We're happy to report that the experiment was a success. Both issues were fixed in no time flat – less than 24 hours each! And as a result, we've been pleased to welcome Charlotte Spencer and Nicolas Brugneaux as PouchDB contributors. Welcome to the gang, Charlotte and Nicolas!

If you'd like to join PouchDB's esteemed roster of contributors (161 and counting!), then hit us up in IRC, Slack, or GitHub, and we'll be happy to help you find an issue where you can pitch in. We also plan to continue this program by marking issues as "Help wanted" and "First-timers only".

PouchDB Server improvements

PouchDB Server and its core, express-pouchdb, have seen some notable improvements in the past month. In addition to a roughly 4% per-request performance improvement, the UI now supports uploading attachments and sports a snazzier version of the Fauxton UI:

This update fixes lots of UI and UX bugs with the Fauxton interface, and provides an easy way to test the new Mango query language, which is partially supported in PouchDB Server via pouchdb-find. You can get the latest version by running npm install -g pouchdb-server.

Many thanks to Nick Colley, Marten de Vries, and the formidable CouchDB Fauxton team for their help with this release!

Hoodie + PouchDB Server = ❤

In related news, Hoodie has added support for PouchDB Server, replacing CouchDB as their default backend.

This is a great vote of confidence for PouchDB Server, and shows how its combination of easy installation and near-complete API compatibility with CouchDB makes it a perfect drop-in replacement, especially during development and testing.

Performance report

We recently had a performance problem reported to us via StackOverflow. So I took it as an opportunity to thoroughly profile the codebase and experiment with some performance improvements.

Analysis of the problem

The part of the code that was causing a slowdown was related to updating documents. Per the CouchDB spec, whenever a document is updated, we need to merge its revision tree with that of the previous version. However, the merge algorithm was not very optimized, so PouchDB was spending a lot of time in JavaScript just traversing through tree structures, as well as allocating unnecessary memory along the way.

The most shocking illustration of the slowdown came when we ran performance tests using the in-memory adapter. Most of the time, PouchDB is bound by the underlying storage engine (LevelDB, IndexedDB, and WebSQL), but in this case even the in-memory adapter was chugging along much too slowly.

The fix is in

The fix involved some major overhauls to the merge algorithm:

Use well-known JavaScript optimization techniques, such as removing functions-within-functions and preferring for loops to forEach().
Where possible, cache the calculated metadata values such as the winning revision and whether or not the winner was deleted.
Simplify PouchDB's extend() and clone() methods, removing our long-in-the-tooth jQuery version.
Use less memory by avoiding extend() and clone() entirely when we don't need them.
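
To make the first two techniques a bit more concrete, here is a minimal sketch. It is not PouchDB's actual merge code: the node shape ({ pos, id, deleted, children }), the metadata object, and all function names are hypothetical and simplified for illustration. It walks the tree with an explicit stack and plain for loops rather than nested closures and forEach(), and it caches the winning revision so that repeated lookups don't re-traverse the tree.

```javascript
// Hypothetical, simplified node shape: { pos, id, deleted, children }.
// This is NOT PouchDB's internal revision tree format.

// Collect leaf revisions with an explicit stack and plain for loops,
// avoiding functions-within-functions and forEach().
function collectLeaves(root) {
  const leaves = [];
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.children.length === 0) {
      leaves.push(node);
      continue;
    }
    for (let i = 0; i < node.children.length; i++) {
      stack.push(node.children[i]);
    }
  }
  return leaves;
}

// CouchDB-style deterministic ordering: non-deleted leaves beat deleted
// ones, then the higher position wins, then the higher revision id.
// Returns a negative number when `a` should win over `b`.
function compareLeaves(a, b) {
  if (a.deleted !== b.deleted) {
    return a.deleted ? 1 : -1;
  }
  if (a.pos !== b.pos) {
    return b.pos - a.pos;
  }
  return a.id < b.id ? 1 : a.id > b.id ? -1 : 0;
}

function computeWinner(root) {
  const leaves = collectLeaves(root);
  let winner = leaves[0];
  for (let i = 1; i < leaves.length; i++) {
    if (compareLeaves(leaves[i], winner) < 0) {
      winner = leaves[i];
    }
  }
  return winner;
}

// Cache the winner on the (hypothetical) metadata object so that it is
// computed once per tree. The cache must be cleared whenever revTree changes.
function getWinner(metadata) {
  if (!metadata.cachedWinner) {
    metadata.cachedWinner = computeWinner(metadata.revTree);
  }
  return metadata.cachedWinner;
}
```

The important caveat with any cache like this is invalidation: in this sketch, whatever code replaces metadata.revTree would also need to reset metadata.cachedWinner.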

Of course, these improvements were justified by data! The Chrome Dev Tools profiler and Node.js flame graphs are a performance junkie's best friends.

Measuring the fix

To determine how much we improved over PouchDB v4.0.0, I ran the basic-updates test with 200 iterations, meaning that it inserted 100 documents and updated them 200 times. All numbers are in milliseconds, and were recorded on a 2013 MacBook Pro running OS X Yosemite.

Environment (adapter)	v4.0.0 (ms)	v4.0.1 (ms)	Improvement
Safari 9.0 (WebSQL)	24867	17232	30.70%
Firefox 40 (IndexedDB)	50519	26265	48.01%
Chrome 44 (IndexedDB)	49613	26667	46.25%
Chrome 44 (WebSQL)	46336	26358	43.12%
iojs 2.1.0 (LevelDB via LevelDOWN)	37593	15426	58.97%
iojs 2.1.0 (in-memory via MemDOWN)	33908	11433	66.28%
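
For anyone who wants to check the arithmetic, the Improvement column is the relative reduction in elapsed time, (before - after) / before. The short sketch below recomputes it from the timings in the table; the function and variable names are just for illustration.

```javascript
// Timings in milliseconds, copied from the table above.
const results = [
  { env: 'Safari 9.0 (WebSQL)', before: 24867, after: 17232 },
  { env: 'Firefox 40 (IndexedDB)', before: 50519, after: 26265 },
  { env: 'Chrome 44 (IndexedDB)', before: 49613, after: 26667 },
  { env: 'Chrome 44 (WebSQL)', before: 46336, after: 26358 },
  { env: 'iojs 2.1.0 (LevelDB via LevelDOWN)', before: 37593, after: 15426 },
  { env: 'iojs 2.1.0 (in-memory via MemDOWN)', before: 33908, after: 11433 }
];

// Relative reduction in elapsed time, as a percentage with two decimals.
function improvement(before, after) {
  return ((before - after) / before * 100).toFixed(2) + '%';
}

for (const r of results) {
  console.log(r.env + ': ' + improvement(r.before, r.after));
}
```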

Note that these numbers don't apply to PouchDB as a whole – for instance, you can't really say "PouchDB is now 46% faster in Chrome." This test applies to a very specific part of the codebase (updating documents), so basic reads, writes, and secondary indexes will be largely unaffected.

Just for fun, I also ran the above performance test against CouchDB 1.6.1, over HTTP. Surprisingly, it finished in 10956 milliseconds, which is even faster than our in-memory adapter! So clearly we've still got some work to do, and I encourage any JavaScript performance gurus to run the tests and lend a hand.

A note on PouchDB performance

PouchDB is a very fast database, and I think it holds up well compared to alternatives. It's worth noting, though, that we have a lot of factors that make performance a challenge.

First off, PouchDB is optimized for syncing. This means that we have some complicated data structures (such as revision trees) that add overhead compared to databases that don't need to retain old or conflicting revisions of documents.

Second, PouchDB supports multiple storage engines, and they have different performance characteristics. For instance, allDocs() can be 40x slower in IndexedDB than in WebSQL because joins and cursors are fairly slow in IndexedDB, whereas in WebSQL we can use one big SQL query to do it all.

There's been some discussion in the W3C about improving IndexedDB performance, and IndexedDB v2 offers some new methods like getAll() that could boost performance. However, this has limited value for PouchDB, because 1) it only solves half the problem (IndexedDB cursors being slow) while leaving the other half unresolved (joins being slow), and 2) methods like getAll() are only implemented in recent versions of Chrome and Firefox, so it would significantly complicate the codebase to support both versions.

Third, PouchDB relies heavily on these underlying storage engines. This means that, when PouchDB gives you a callback after a put(), data has actually been written to disk, and when you do an allDocs(), PouchDB is actually progressively fetching data from disk. This disk-heavy model allows PouchDB to run well on low-memory devices (such as phones), but also means that it runs better when the storage engine provides better tools, which is why WebSQL is still faster than IndexedDB for us.

What you can do about performance

Here are some tips to get the best performance out of PouchDB:

Don't update your documents too many times. For instance, if you're writing a text editor, don't update the document for every keystroke; set a debounce, or allow the user to explicitly save.
Don't continually read the entire database in and out of memory; use pagination and the changes feed.
Prefer allDocs() to either query() or pouchdb-find. Building up secondary indexes is expensive.
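
The debounce suggested in the first tip could look roughly like the sketch below. createDebouncedSaver is a hypothetical helper, not part of PouchDB; the schedule and cancel parameters default to the global timer functions and are injectable only so the helper can be exercised without real timers.

```javascript
// Hypothetical helper, not part of PouchDB: coalesce rapid edits into a
// single save that runs once the user has paused for `waitMs` milliseconds.
function createDebouncedSaver(save, waitMs, schedule = setTimeout, cancel = clearTimeout) {
  let timer = null;
  let latest;
  return function queueSave(value) {
    latest = value;        // keep only the most recent content
    if (timer !== null) {
      cancel(timer);       // an earlier pending save is now stale
    }
    timer = schedule(function () {
      timer = null;
      save(latest);
    }, waitMs);
  };
}

// Usage sketch, assuming `db` is a PouchDB instance and `doc` was
// previously fetched with db.get():
//
//   const queueSave = createDebouncedSaver(function (text) {
//     doc.text = text;
//     return db.put(doc).then(function (res) {
//       doc._rev = res.rev;   // remember the new revision for the next put()
//     });
//   }, 500);
//
//   // Call queueSave(currentText) from your editor's change handler.
```

With this in place, a burst of keystrokes produces one new revision instead of one per keystroke, which keeps the revision tree short.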

If you are not using PouchDB's sync capabilities, or if you require complicated queries (such as full-text search) on a large set of data (>1000 documents), I would also advise looking into other databases. PouchDB is designed for sync and it excels at that, but if you have different needs, then you should explore other libraries.

For instance, databases like LocalForage, Dexie, Lovefield, and YDN-DB run in more-or-less the same browsers as PouchDB, but have a much simpler data model, where updated/removed data is simply overwritten. These databases are particularly good for quick storage, lookup, and querying. If you are not using sync, you will probably also appreciate the cognitive benefit of not having to think about _rev!

If you need even more sophisticated querying capabilities and don't require a low memory footprint, then you could also look into in-memory databases such as AlaSQL and LokiJS. These databases can periodically write data to a storage engine like IndexedDB, but are primarily queried via in-memory methods, meaning they can be much faster than PouchDB because they don't have to touch disk. On the other hand, they have a high memory footprint and may drop data unless you are careful to flush the in-memory representation to disk.

Make no mistake: PouchDB is committed to being the fastest JavaScript database it can be. However, our primary goal is correctness, especially when it comes to sync. (The massive test suite should be a testament to that.) Every design decision comes with tradeoffs, and our decisions are driven by the goal of providing a free, open-source database that 1) never drops data, 2) syncs like a champ, and 3) runs flawlessly in every modern browser. If PouchDB can do all that while being fast, then that's just icing on the cake!

Get in touch

Please file issues or tell us what you think. And as always, a big thanks to all of our new and existing contributors!