Mango queries

Mango queries, also known as pouchdb-find or the find() API, are a structured query API that allows you to build secondary indexes beyond the built-in allDocs() and changes() indexes.

This API is useful for answering questions like:

find all documents where the type is 'user'
find all users whose age is greater than 21
find all Pokémon whose name starts with 'pika'
etc.

Installation

The find() API is currently offered as a separate plugin, meaning that you must install it on top of pouchdb.js. Here's how to do so:

Script tags

<script src="pouchdb.js"></script>
<script src="pouchdb.find.js"></script>

The pouchdb.find.js file is available in the pouchdb package in npm/Bower, on unpkg, or as a GitHub download. Note it must be placed after pouchdb.js.

npm

If you are using Node, Browserify, Webpack, Rollup, etc., then you can install it like so:

npm install --save pouchdb-find

Then in code:

const PouchDB = require('pouchdb');
PouchDB.plugin(require('pouchdb-find'));

Query language

The Mango query language is a DSL inspired by MongoDB, which allows you to define an index that is then used for querying. One quick way to understand how this works is to use the live query demo.

At a basic level, there are two steps to running a query: createIndex() (to define which fields to index) and find() (to query the index).

For instance, let's imagine a simple index to look up all documents whose name is "mario". First we'll create it:

db.createIndex({
  index: {fields: ['name']}
});

This returns a Promise that resolves once the index is created. At this point, we have an index based on the "name" field, so we can use it for lookup:

db.find({
  selector: {
    name: 'mario'
  }
});

This returns a Promise containing an array of all documents that match this selector. Note that this is equivalent to using the $eq (equals) operator:

db.find({
  selector: {
    name: {$eq: 'mario'}
  }
});

The important thing to understand is that, for a typical database, createIndex() is the expensive operation, because it is looping through all documents in the database and building a B-tree based on the name value.

Once the B-tree is built up, though, the find() is relatively cheap. (If this were not the case, then we would be better off just using allDocs() to iterate through the database ourselves!)

Once we have an index on name, we can also sort all documents by name:

db.find({
  selector: {
    name: {$gte: null}
  },
  sort: ['name']
});

Note that we are specifying that the name must be greater than or equal to null, which is a workaround for the fact that the Mango query language requires us to have a selector. In CouchDB collation order, null is the "lowest" value, and so this will return all documents regardless of their name value.

Pagination

Reading all documents in the database and sorting them by a particular value is neat, but we could do this ourselves with allDocs(), and it would have the same performance impact. Where it gets more interesting is when we use limit:

db.find({
  selector: {
    name: {$gte: null}
  },
  sort: ['name'],
  limit: 10
});

In this case, we only get 10 documents back, but they are the first 10 documents, sorted by name. This means that we have only read 10 documents out of the database into memory, which can be used for efficient pagination.

For instance, if we are displaying the first 10 results on a single page, and the user clicks "next" to see the next page, we can restructure our query based on the last result, to continue the pagination. Let's imagine the first 10 documents' names are:

[
  'abby', 'bertrand', 'clarice', 'don', 'emily',
  'fumiko', 'gunther', 'horatio', 'ike', 'joy'
]

For our next 10 pages of results, the query becomes:

db.find({
  selector: {
    name: {$gt: 'joy'}
  },
  sort: ['name'],
  limit: 10
});

Because we are now specifying that the name must be greater than 'joy', we are guaranteed to get the next-highest result after 'joy', which may (for instance) look like this:

[
  'kim', 'lin', 'maria', 'nell', 'oliver',
  'pat', 'quincy', 'roy', 'sam', 'tanya'
]

In this way, we can continue paginating by using the last value as our next starting point. At any given point in time, there are only 10 documents stored in memory at once, which is great for performance.

Indexing on more than one field

Sometimes an index is not as simple as "find all documents whose name is "mario". Sometimes you want to do something fancy, such as "find all documents whose name is "mario" and whose age is greater than 21". In those cases, you can index on more than one field:

db.createIndex({
  index: {
    fields: ['name', 'age']
  }
}).then(function () {
  return db.find({
    selector: {
      name: 'mario',
      age: {$gt: 21}
    }
  });
});

One thing to note is that the order of these fields matters when creating your index. For instance, the following would not work:

/* THIS WON'T WORK! */
db.createIndex({
  index: {
    fields: ['age', 'name']
  }
}).then(function () {
  return db.find({
    selector: {
      name: 'mario',
      age: {$gt: 21}
    }
  });
});

The reason for this is easy to understand if we imagine how this index would sort a hypothetical database:

name	age
Luigi	17
Luigi	28
Mario	18
Mario	21
Mario	22
Mario	26
Peach	17
Peach	21
Peach	25

In the above table, the documents are sorted by ['name', 'age'], and our "Marios above the age of 21" are very clearly grouped together.

However, if we were to change the order, and sort them by ['age', 'name'], it would look instead like this:

age	name
17	Luigi
17	Peach
18	Mario
21	Mario
21	Peach
22	Mario
25	Peach
26	Mario
28	Luigi

If we imagine our find() query as a "slice" of the data, it's obvious that there's no slice that corresponds to "all Marios whose age is greater than 21." Instead, our documents are sorted by age, and then documents with the same age are sorted by name.

This index may be good for answering questions like "find all 17-year-olds whose name starts with letters N-Z", but it's not very good for answering questions like "find all people with a certain name, older than a certain age."

This shows that it's important to carefully design an index before creating a query to use that index. Otherwise, the query planner may fall back to in-memory querying, which can be expensive.

Performance notes

The Mango query language is generally very permissive, and allows you to write queries that may not perform very well, but will run regardless. For instance, you may create an index with createIndex(), but then write a find() query that doesn't actually use that index. In general, the query planner tries to find the most appropriate index, but it may fall back to in-memory querying.

As a straightforward example, if you query using the _id field, then the query planner will automatically map that directly to an allDocs() query. However, if you query for a field that isn't yet indexed, then it will simply use allDocs() to read in all documents from the database (!) and then filter in-memory. This can lead to poor performance, especially if your database is large.

If you're ever wondering how the query planner is interpreting your query, you can use the explain endpoint:

db.explain({
  selector: {
    name: 'mario',
    age: {$gt: 21}
  }
})
.then(function (explained) {
  // detailed explained info can be viewed
});

In the console, the query planner will show a detailed explanation of how it has interpreted the query, whether it uses any indexes, and whether any parts of the query need to be executed in-memory.

You may also want to pay attention to the "warning" value included in your results set, indicating that there was no index that matched the given query. For instance, the warning may look like this:

{
  "docs": [ /* ... */ ],
  "warning": "No matching index found, create an index to optimize query time."
}

Set which index to use

When creating a query, by settings the use_index field, it is possible to tell pouchdb-find which index to use. The below example shows how to do that.

db.createIndex({
  index: {
    fields: ['age', 'name'],
    ddoc: "my-index-design-doc"
  }
}).then(function () {
  return db.find({
    selector: {
      name: 'mario',
      age: {$gt: 21},
    },
    use_index: 'my-index-design-doc'
  });
});