elasticsearch-odm
v1.1.0
Like Mongoose but for Elasticsearch. Define models, perform CRUD operations, and build advanced search queries.
Elasticsearch ODM
Like Mongoose but for Elasticsearch. Define models, perform CRUD operations, and build advanced search queries. Most commands and functionality that exist in Mongoose exist in this library. All asynchronous functions use Bluebird Promises instead of callbacks.
This is currently the only ODM/ORM library that exists for Elasticsearch on Node.js. Waterline has a plugin for Elasticsearch, but it is incomplete and doesn't exactly harness its searching power. Loopback has a storage plugin, but it also doesn't focus on important parts of Elasticsearch, such as mappings and efficient queries. This library automatically handles merging and updating Elasticsearch mappings based on your schema definition.
Installation
If you currently have the npm elasticsearch package installed, you can remove it. If you still need the raw client, you can access it through the client property of this library.
$ npm install elasticsearch-odm
Features
- Easy to use API that mimics Mongoose, but cuts out the extras.
- Models, Schemas and Elasticsearch specific type mapping.
- Add Elasticsearch specific type options to your Schema, like boost, analyzer or score.
- Utilizes bulk and scroll features from Elasticsearch when needed.
- Easy search queries without generating your own DSL.
- Seamlessly handles updating your Elasticsearch mappings based on your model's Schema.
Quick Start
You'll find the API is intuitive if you've used Mongoose or Waterline.
Example (no schema):
var elasticsearch = require('elasticsearch-odm');
var Car = elasticsearch.model('Car');
var car = new Car({
  type: 'Ford', color: 'Black'
});
elasticsearch.connect('my-index').then(function(){
  // be sure to call connect before bootstrapping your app.
  car.save().then(function(document){
    console.log(document);
  });
});
Example (using a schema):
var elasticsearch = require('elasticsearch-odm');
var carSchema = new elasticsearch.Schema({
type: String,
color: {type: String, required: true}
});
var Car = elasticsearch.model('Car', carSchema);
API Reference
- Core
- Document
- Model
.count()
.create(Object data)
.update(String id, Object data)
.remove(String id)
.removeByIds(Array ids)
.set(String id, Object data)
.find(Object/String match, Object queryOptions)
.findById(String id, Object queryOptions)
.findByIds(Array ids, Object queryOptions)
.findOne(Object/String match, Object queryOptions)
.findAndRemove(Object/String match, Object queryOptions)
.findOneAndRemove(Object/String match, Object queryOptions)
.makeInstance(Object data)
.toMapping()
- Query Options
- Schemas
Core
Core methods can be called directly on the Elasticsearch ODM instance. These include methods to configure, connect, and get information from your Elasticsearch database. Most methods act upon the official Elasticsearch client.
.connect(String/Object options)
-> Promise
Returns a promise that is resolved when the connection is complete. Can be passed a single index name, or a full configuration object. The default host is localhost:9200 when no host is provided, or when just an index name is used. This method should be called at the start of your application.
If the index name does not exist, it is automatically created for you.
You can also add any of the Elasticsearch specific options, like SSL configs.
Example:
// when bootstrapping your application
var fs = require('fs'); // needed to read the CA certificate below
var elasticsearch = require('elasticsearch-odm');
elasticsearch.connect({
  host: 'localhost:9200',
  index: 'my-index',
  logging: false, // true by default when NODE_ENV=development
  syncMapping: false, // see 'sync mapping' in Schemas documentation
  ssl: {
    ca: fs.readFileSync('./cacert.pem'),
    rejectUnauthorized: true
  }
});
// OR
elasticsearch.connect('my-index'); // default host localhost:9200
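The two call styles above can be read as one configuration shape. The following is a minimal sketch of that idea, not the library's actual code: normalizeConnectOptions is a hypothetical helper, and the only facts it relies on are the ones stated above (a string is an index name, and the default host is localhost:9200).

```javascript
// Hypothetical helper, for illustration only; not part of elasticsearch-odm.
var DEFAULT_HOST = 'localhost:9200'; // default host stated in this README

function normalizeConnectOptions(options) {
  // A bare string is treated as the index name.
  if (typeof options === 'string') {
    return { host: DEFAULT_HOST, index: options };
  }
  // Copy the caller's object so it is not mutated.
  var normalized = {};
  Object.keys(options || {}).forEach(function (key) {
    normalized[key] = options[key];
  });
  // Fall back to the default host when none was given.
  if (!normalized.host) {
    normalized.host = DEFAULT_HOST;
  }
  return normalized;
}

console.log(normalizeConnectOptions('my-index'));
console.log(normalizeConnectOptions({ index: 'my-index', syncMapping: false }));
```

Both calls print an object whose host is localhost:9200, which is why the short form and the full form can be used interchangeably for a local node.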
new Schema(Object options)
-> Schema
Returns a new schema definition to be used for models.
.model(String modelName, Optional/Schema schema)
-> Model
Creates and returns a new Model, like calling Mongoose.model(). Takes a type name; in MongoDB this is also known as the collection name. This is a global function and adds the model to the Elasticsearch ODM instance.
.client
-> Elasticsearch
The raw instance of the underlying Elasticsearch client. Not really needed, but it's there if you need it, for example to run queries that aren't provided by this library.
.stats()
Returns a promise that is resolved with index stats for the current Elasticsearch connections.
.removeIndex(String index)
Takes an index name, and completely destroys the index. Resolves the promise when it's complete.
.createIndex(String index, Object mappings)
Takes an index name, and a JSON string or object representing your mapping. Resolves the promise when it's complete.
Document
Like Mongoose, instances of models are considered documents, and are returned from calls like find() & create(). Documents include the following functions to make working with them easier.
.save()
-> Document
Saves or updates the document. If it doesn't exist it is created. Like Mongoose, Elasticsearch's internal '_id' is copied to 'id' for you. If you'd like to force a custom id, you can set the id property to something before calling save(). Every document gets a createdOn and updatedOn property set with an ISO-8601 formatted time.
.remove()
Removes the document and destroys the current document instance. No value is resolved, and missing documents are ignored.
.update(Object data)
-> Document
Partially updates the document. Data passed will be merged with the document, and the updated version will be returned. This also sets the current model instance with the new document.
.set(Object data)
-> Document
Completely overwrites the document with the data passed, and returns the new document. This also sets the current model instance with the new document.
Will remove any fields in the document that aren't passed.
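The difference between .update() and .set() is easiest to see on plain objects. This is a sketch only: mergeUpdate and overwriteSet are hypothetical names, the merge is assumed to be shallow, and keeping the id on overwrite is an assumption made for the example.

```javascript
// Illustration on plain objects only; the function names are hypothetical.
function mergeUpdate(current, data) {
  // .update(): keep existing fields, overwrite only those passed.
  // A shallow merge is assumed here.
  var result = {};
  Object.keys(current).forEach(function (k) { result[k] = current[k]; });
  Object.keys(data).forEach(function (k) { result[k] = data[k]; });
  return result;
}

function overwriteSet(current, data) {
  // .set(): replace the document body; fields not passed are dropped.
  // Keeping the id is an assumption made for this sketch.
  var result = { id: current.id };
  Object.keys(data).forEach(function (k) { result[k] = data[k]; });
  return result;
}

var car = { id: '1', type: 'Ford', color: 'Black' };
console.log(mergeUpdate(car, { color: 'Red' }));  // { id: '1', type: 'Ford', color: 'Red' }
console.log(overwriteSet(car, { color: 'Red' })); // { id: '1', color: 'Red' }
```

Note that the 'type' field survives the merge but is gone after the overwrite.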
.toObject()
Like Mongoose, strips all non-document properties from the instance and returns a raw object.
Model
Model definitions returned from .model() in core include several static functions to help query and manage documents. Most functions are similar to Mongoose, but due to the differences in Elasticsearch, querying includes some extra advanced features.
.count()
-> Object
Object returned includes a 'count' property with the number of documents for this Model (also known as _type in Elasticsearch). See Elasticsearch count.
.create(Object data)
-> Document
A helper function. Similar to calling new Model(data).save(). Takes an object, and returns the new document.
.update(String id, Object data)
-> Document
A helper function. Similar to calling new Model().update(data). Takes an id and a partial object to update the document with.
.remove(String id)
Removes the document by its id. No value is resolved, and missing documents are ignored.
.removeByIds(Array ids)
Helper function, see remove. Takes an array of ids.
.set(String id, Object data)
-> Document
Completely overwrites the document matching the id with the data passed, and returns the new document.
Will remove any fields in the document that aren't passed.
.find(Object/String match, Object queryOptions)
-> Document
There are four ways to call .find() and its siblings. You can mix and match styles.
- Passing only a match object like
.find({name:'Joe'})
- Passing only a string to match against all document fields
.find('some string')
- Passing Query Options (match can be set to null/empty)
.find({}, {must: {active: true}, sort: 'createdOn'})
- Use chaining options (alias for QueryOptions)
.find({}).must({active: true}).sort('createdOn').then(..)
Unlike Mongoose, finding exact matches requires the fields in your mapping to be set to 'not_analyzed'. By default, {index: 'not_analyzed'} is added to all string fields in your Schema unless you override it.
Depending on the analyzer in your mapping, find queries like must, not, and matches may not find any results.
match => Optional. An alias for the 'must' Query Option. Like Mongoose this matches name/value in documents. Also, instead of an object, just a string can be passed which will match against all document fields using the power of an Elasticsearch QueryStringQuery.
queryOptions => Optional (can also use chaining instead). An object with Query Options. Here you can specify paging, filtering, sorting and other advanced options. See the Query Options section for more details. You can set the first argument to null and only use filters from the query options if you want.
returns => Found documents, or null if nothing was found.
Example:
var Car = elasticsearch.model('Car');
// Simple query.
Car.find({color: 'blue'}).then(function(results){
console.log(results);
});
// Nested query (for nested documents/properties).
Car.find({'location.city': 'New York'})
// Find all by passing null or empty object to first argument
Car.find(null, {sort: 'createdOn'})
// Search all fields using a QueryStringQuery.
Car.find('some text')
// Chained query without using Query Options.
// Instead of Mongoose .exec(), we call .then()
Car.find()
.must({color: 'blue'})
.exists('owner')
.sort('createdOn')
.then(...)
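The call styles above all reduce to a single Query Options object. The sketch below shows one way that reduction could work; toQueryOptions is a hypothetical function written for this example and is not the library's implementation. It relies only on what is documented here: a string match behaves like the 'q' option, and an object match is an alias for 'must'.

```javascript
// Hypothetical normalizer; the library's internals may differ.
function toQueryOptions(match, queryOptions) {
  // Start from a copy of the explicit query options, if any.
  var options = {};
  Object.keys(queryOptions || {}).forEach(function (key) {
    options[key] = queryOptions[key];
  });
  if (typeof match === 'string') {
    // A string searches all fields, like the 'q' option.
    options.q = match;
  } else if (match && Object.keys(match).length > 0) {
    // An object is an alias for the 'must' option.
    var must = {};
    Object.keys(options.must || {}).forEach(function (k) { must[k] = options.must[k]; });
    Object.keys(match).forEach(function (k) { must[k] = match[k]; });
    options.must = must;
  }
  return options;
}

console.log(toQueryOptions({ color: 'blue' }));
console.log(toQueryOptions('some text', { sort: 'createdOn' }));
console.log(toQueryOptions(null, { sort: 'createdOn' }));
```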
.findById(String id, Object queryOptions)
-> Document
Finds a document by id. The optional queryOptions argument can include 'fields', which specifies the fields of the document you'd like to include.
.findByIds(Array ids, Object queryOptions)
-> Document
Same as .findById() but for multiple documents.
.findOne(Object/String match, Object queryOptions)
-> Document
Same arguments as .find(). Returns the first matching document.
.findAndRemove(Object/String match, Object queryOptions)
-> 'Object'
Same arguments as .find(). Removes all matching documents and returns their raw objects.
.findOneAndRemove(Object/String match, Object queryOptions)
-> 'Object'
Same arguments as .findAndRemove(). Removes the first found document.
.makeInstance(Object data)
-> Document
Helper function. Takes a raw object and creates a document instance out of it. The object would need at least an id property. The document returned can be used normally as if it were returned from other calls like .find().
.toMapping()
Returns a complete Elasticsearch mapping for this model based on its schema. If no schema was used, it returns nothing. Used internally, but it's there if you'd like it.
Query Options
The query options object includes several options that are normally included in Mongoose chained queries, like sort and paging (skip/limit), and also some advanced features from Elasticsearch. The Elasticsearch Query and Filter DSL is generated using best practices.
page & per_page
Type: Integer
For most use cases, paging is better suited than skip/limit, so this library includes this instead. Page 0 and page 1 are the same thing, so either can be used. When only one of page or per_page is set, the other uses its default: page defaults to the first page, and per_page defaults to 10.
Including page or per_page will result in the response being wrapped in a metadata object like the following. You can call toJSON and toObject on this response and it'll call that method on all document instances under the hits property.
// A paged response that is returned when page or per_page is set.
{
total: 0, // total documents found for the query.
hits: [], // a collection of document instances.
page: 0, // current page requested.
pages: 0 // total number of pages.
}
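The paging arithmetic can be sketched as follows. pageMeta is a hypothetical helper, not part of the library; it assumes that the total page count is the total hits divided by per_page, rounded up, and that paging is translated into Elasticsearch's from/size search parameters.

```javascript
// Hypothetical helper showing the paging arithmetic; not the library's code.
function pageMeta(total, page, perPage) {
  var size = perPage || 10;               // per_page defaults to 10
  var current = Math.max(page || 1, 1);   // page 0 and page 1 both mean the first page
  return {
    from: (current - 1) * size,           // offset of the first hit on this page
    size: size,                           // number of hits per page
    pages: Math.ceil(total / size)        // total number of pages (assumed rounding up)
  };
}

console.log(pageMeta(95, 0));     // { from: 0, size: 10, pages: 10 }
console.log(pageMeta(95, 3, 20)); // { from: 40, size: 20, pages: 5 }
```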
fields
Type: Array or String
A list of fields to include in the documents returned. For example, you could pass 'id' to only return the matching document ids. See Elasticsearch Fields.
// Query Options.
{
fields: ['name', 'age']
}
// Chained Query.
.find()
.fields(['name', 'age'])
.then(...)
sort
Type: Array or String
A list of fields to sort on. If multiple fields are passed then they are executed in order. Adding a '-' sign to the start of the field name makes it sort descending. Default is ascending. See Elasticsearch Sort.
Example:
// Query Options.
{
sort: ['name', 'createdOn']
}
// Chained Query.
.find()
.sort(['name', 'createdOn'])
.then(...)
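To illustrate the '-' prefix rule, here is a sketch of how a sort option could be turned into Elasticsearch sort clauses. toSortClauses is a hypothetical function for this example; the clause shape ({field: {order: 'asc'}}) is the standard Elasticsearch sort syntax, but the library may generate it differently.

```javascript
// Hypothetical converter from the 'sort' option to Elasticsearch sort clauses.
function toSortClauses(sort) {
  // Accept a single field name or an array of field names.
  var fields = Array.isArray(sort) ? sort : [sort];
  return fields.map(function (field) {
    // A leading '-' means descending; the default is ascending.
    var descending = field.charAt(0) === '-';
    var name = descending ? field.slice(1) : field;
    var clause = {};
    clause[name] = { order: descending ? 'desc' : 'asc' };
    return clause;
  });
}

console.log(JSON.stringify(toSortClauses(['name', '-createdOn'])));
// [{"name":{"order":"asc"}},{"createdOn":{"order":"desc"}}]
```

The clauses keep the order of the input array, which matches the rule that multiple sort fields are applied in order.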
q
Type: String
A string to search all document fields with using Elasticsearch QueryStringQuery. This can be expensive, so use it sparingly.
Example:
// Query Options.
{
q: 'Red dog run'
}
// Chained Query.
.find('Red dog run')
.then(...)
must
Type: Object
Key value pairs to match documents against. Essentially it's the same as the first argument passed to Mongoose .find(). This is also an alias to the first argument passed to .find() in this library. This is a 'must' Bool Filter.
Elasticsearch's internal Tokenizers are used, and fields are analyzed.
You can query nested fields using dot notation.
Example:
// Query Options.
{
must: {
name: 'Jim',
'location.country': 'Canada'
}
}
// Chained Query.
.find()
.must({name: 'Jim', 'location.country': 'Canada'})
.then(...)
not
Type: Object
The same as must, but matches documents where the key value pairs DON'T match. This is a 'must_not' Bool Filter query.
You can query nested fields using dot notation.
Example:
// Query Options.
{
not: {
name: 'Jim',
'location.country': 'Canada'
}
}
// Chained Query.
.find()
.not({name: 'Jim', 'location.country': 'Canada'})
.then(...)
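The must and not options map naturally onto the must and must_not sections of an Elasticsearch bool filter. The sketch below shows that mapping under one loud assumption: each key value pair becomes a term clause. The library may emit a different clause type, and toTermClauses and toBoolFilter are hypothetical names used only here.

```javascript
// Hypothetical builder; the clause type ('term') is an assumption.
function toTermClauses(pairs) {
  return Object.keys(pairs || {}).map(function (field) {
    var term = {};
    term[field] = pairs[field]; // dot notation keys are passed through as-is
    return { term: term };
  });
}

function toBoolFilter(options) {
  var bool = {};
  var must = toTermClauses(options.must);
  var mustNot = toTermClauses(options.not);
  if (must.length) { bool.must = must; }          // documents must match these
  if (mustNot.length) { bool.must_not = mustNot; } // documents must NOT match these
  return { bool: bool };
}

console.log(JSON.stringify(toBoolFilter({
  must: { name: 'Jim' },
  not: { 'location.country': 'Canada' }
})));
// {"bool":{"must":[{"term":{"name":"Jim"}}],"must_not":[{"term":{"location.country":"Canada"}}]}}
```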
missing
Type: Array or String
A single field name, or array of field names. Matches documents where these field names are missing. A field is considered missing when it is null, empty, or does not exist. See MissingFilter (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-missing-filter.html).
Example:
// Query Options.
{
missing: ['description', 'name']
}
// Chained Query.
.find()
.missing(['description', 'name'])
.then(...)
exists
Type: Array or String
A single field name, or array of field names. Matches documents where these field names exist. The opposite of missing.
Example:
// Query Options.
{
exists: ['description', 'name']
}
// Chained Query.
.find()
.exists(['description', 'name'])
.then(...)
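The missing and exists options each take one or more field names. As a sketch, they could be expanded into one filter clause per field, using the {missing: {field: ...}} and {exists: {field: ...}} shapes from the Elasticsearch filter DSL. toFieldFilters is a hypothetical helper, and whether the library builds its filters this way is an assumption.

```javascript
// Hypothetical helper; 'kind' is either 'missing' or 'exists'.
function toFieldFilters(kind, fields) {
  // Accept a single field name or an array of field names.
  var list = Array.isArray(fields) ? fields : [fields];
  return list.map(function (field) {
    var clause = {};
    clause[kind] = { field: field };
    return clause;
  });
}

console.log(JSON.stringify(toFieldFilters('missing', ['description', 'name'])));
// [{"missing":{"field":"description"}},{"missing":{"field":"name"}}]
console.log(JSON.stringify(toFieldFilters('exists', 'owner')));
// [{"exists":{"field":"owner"}}]
```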
Schemas
Models don't require schemas, but it's best to use them, especially if you'll be making search queries. Elasticsearch-odm will generate and update Elasticsearch with the proper mappings based on your schema definition. The schemas are similar to Mongoose, but several new field types have been added which Elasticsearch supports. These are: float, double, long, short, byte, binary, geo_point. Generally for numbers, only the Number type is needed (which converts to Elasticsearch integer). You can read more about Elasticsearch types in the Elasticsearch documentation.
NOTE
- Types can be defined in several ways. The regular Mongoose types exist, or you can use the actual type names Elasticsearch uses.
- You can also add any of the field options you see for Elasticsearch Core Types.
- String types will default to "index": "not_analyzed". See Custom Field Mappings. This is so the .find() call acts like it does in Mongoose by only finding exact matches. However, this prevents full text search on this field. Simply set {"index": "analyzed"} if you'd like full text search instead.
Example:
// Before saving a document with this schema, your Elasticsearch
// mappings will automatically be updated.
// Note the various ways you can define a schema field type.
var carSchema = new elasticsearch.Schema({
  // native type without options
  available: Boolean,
  // Elasticsearch type without options
  safetyRating: 'float',
  // native array type
  parts: [String],
  // Elasticsearch array type
  oldPrices: {type: ['double']},
  // with options
  color: {type: String, required: true},
  // a field named 'type' must be defined like the following.
  type: {type: String},
  // nested document
  owner: {
    name: String,
    age: Number,
    // force a required field
    location: {type: 'geo_point', required: true}
  },
  // nested document array
  inspections: [{
    date: Date,
    grade: Number
  }],
  // Enable full-text search of this field.
  // NOTE: when using 'analyzed', it's better to use the 'q' parameter in
  // queryOptions during searches instead of must/not or match.
  description: {type: String, index: 'analyzed'},
  // ignore_malformed is an Elasticsearch Core Type field option for numbers
  price: {type: 'double', ignore_malformed: true}
});
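To make the type rules above concrete, here is a small sketch of how a single flat schema field could be turned into an Elasticsearch field mapping. It is not the library's converter (see .toMapping() for that). fieldMapping and typeName are hypothetical names, arrays and nested documents are left out, and dropping 'required' from the mapping is an assumption.

```javascript
// Hypothetical converter for flat fields only; not elasticsearch-odm's code.
var NATIVE_TYPES = [
  [String, 'string'],
  [Number, 'integer'], // Number converts to Elasticsearch integer
  [Boolean, 'boolean'],
  [Date, 'date']
];

function typeName(type) {
  if (typeof type === 'string') { return type; } // already an Elasticsearch type name
  for (var i = 0; i < NATIVE_TYPES.length; i++) {
    if (NATIVE_TYPES[i][0] === type) { return NATIVE_TYPES[i][1]; }
  }
  throw new Error('Unsupported type in this sketch');
}

function fieldMapping(definition) {
  // A definition is either a bare type or an options object with a 'type' key.
  var isOptions = definition !== null && typeof definition === 'object' &&
    !Array.isArray(definition) && definition.type !== undefined;
  var options = isOptions ? definition : { type: definition };
  var mapping = { type: typeName(options.type) };
  Object.keys(options).forEach(function (key) {
    // Assumption: 'required' is a validation option, not a mapping option.
    if (key !== 'type' && key !== 'required') { mapping[key] = options[key]; }
  });
  // String fields default to not_analyzed unless overridden.
  if (mapping.type === 'string' && mapping.index === undefined) {
    mapping.index = 'not_analyzed';
  }
  return mapping;
}

console.log(fieldMapping({ type: String, required: true }));
console.log(fieldMapping({ type: String, index: 'analyzed' }));
console.log(fieldMapping({ type: 'double', ignore_malformed: true }));
```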
Hooks and Middleware
Schemas include pre and post hooks that function similar to Mongoose. Currently, there are pre/post hooks for 'save' and 'remove'.
Pre Hooks
Same conventions as Mongoose. The function takes a done() callback that must be called when your function is finished. this is scoped to the current document. Passing an Error to done() will cancel the current operation. For example, in a pre 'save' hook, passing an error to done() will cause the document not to be saved and will return your error to the save() caller's rejection handler.
var schema = new elasticsearch.Schema(...);
schema.pre('save', function(done){
console.log(this); // this = the current document
done(); // OR done(new Error('bad document'));
});
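The done() convention can be illustrated with a small hook runner. This is a sketch of the general pattern, not the library's internals: runPreHooks is a hypothetical function that calls each hook in order with this set to the document, and stops at the first Error.

```javascript
// Hypothetical runner; elasticsearch-odm's internals may differ.
function runPreHooks(hooks, document, callback) {
  var index = 0;
  function next(err) {
    if (err) { return callback(err); }                  // an Error cancels the operation
    if (index >= hooks.length) { return callback(null); } // all hooks passed
    var hook = hooks[index++];
    hook.call(document, next);                          // 'this' is the current document
  }
  next();
}

var hooks = [
  function (done) { this.checked = true; done(); },
  function (done) { done(this.color ? undefined : new Error('bad document')); }
];

runPreHooks(hooks, { color: 'blue' }, function (err) {
  console.log(err ? 'save cancelled: ' + err.message : 'save may proceed');
});
runPreHooks(hooks, {}, function (err) {
  console.log(err ? 'save cancelled: ' + err.message : 'save may proceed');
});
```

The first call prints "save may proceed" and the second prints "save cancelled: bad document", mirroring how a pre 'save' hook can reject a document.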
Post Hooks
Same conventions as Mongoose. Does not have a done() callback. Executed after the hooked method. The first argument is the current document which may or may not be a document instance (eg. post remove only receives the raw object as the document no longer exists).
var schema = new elasticsearch.Schema(...);
schema.post('remove', function(document){
console.log(document);
});
Static and Instance Methods
Add methods to your schema with the same convention as Mongoose.
// Instance method.
var schema = new elasticsearch.Schema(...);
schema.methods.getFullName = function(){
  return this.firstName + ' ' + this.lastName;
};
// Static method.
schema.statics.findByColor = function(color){
  return this.find({color: color});
};
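The difference between the two kinds of methods comes down to what this refers to. The sketch below uses a plain object as a stand-in for a Schema and a hypothetical buildModel function; it shows the general JavaScript pattern, not how elasticsearch-odm actually builds models.

```javascript
// Hypothetical model builder, for illustration only.
function buildModel(schema) {
  function Model(data) {
    var self = this;
    Object.keys(data || {}).forEach(function (key) { self[key] = data[key]; });
  }
  // Instance methods go on the prototype, so 'this' is the document.
  Object.keys(schema.methods).forEach(function (name) {
    Model.prototype[name] = schema.methods[name];
  });
  // Static methods go on the constructor, so 'this' is the model.
  Object.keys(schema.statics).forEach(function (name) {
    Model[name] = schema.statics[name];
  });
  return Model;
}

var personSchema = { methods: {}, statics: {} }; // stand-in for a Schema
personSchema.methods.getFullName = function () {
  return this.firstName + ' ' + this.lastName;
};
personSchema.statics.make = function (data) {
  return new this(data);
};

var Person = buildModel(personSchema);
var ada = Person.make({ firstName: 'Ada', lastName: 'Lovelace' });
console.log(ada.getFullName()); // prints "Ada Lovelace"
```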
Sync Mapping
By default, an attempt will be made on connection to convert your schema definitions into Elasticsearch mappings, and send a PUT mapping request to sync them. This can cause major issues if your schema's mappings have conflicting types.
If you'd like to disable sync mapping, or if your node has mappings already configured, you can do it like so.
elasticsearch.connect({
host: 'localhost:9200',
index: 'my-index',
syncMapping: false
});
CHANGELOG
CONTRIBUTING
This is a library Elasticsearch desperately needed for Node.js. Currently the official npm elasticsearch client has about 23,000 downloads per week, and many of those users would benefit from this library instead. Pull requests are welcome. There are Mocha and benchmark tests in the root directory.
TODO
- Browser build.
- Add support for querying nested document arrays with dot notation syntax.
- Add scrolling
- Add a wrapper to enable streaming of document results.
- Add snapshots/backups
- Allow methods to call Elasticsearch facets.
- Tune application performance, fix garbage collection issues, and do benchmark tests.
- Integrate npm 'friendly' for use with expanding/collapsing parent/child documents.
- Use source filtering instead of fields.