Partial Updates in Elasticsearch
Gain Huge Performance Increases in Elasticsearch When Updating Documents

Elasticsearch is a great tool for implementing high-performance, scalable search in web and mobile applications.  It is often used to index large documents in addition to text stored in a database.  For example, a law firm might want to index Microsoft Word documents so that they're searchable in a web interface as part of a case management tool.  One of the challenges with this approach is that most Elasticsearch libraries (there is a Rails gem, for example) don't support “partial updates” out of the box.  This causes significant performance problems when updating records that have large attachments: even if you only change a single text field in a record, the entire data set, including the Microsoft Word document in the example given, has to be resent and reindexed.

Elasticsearch 2 includes the _update endpoint.  The use case is simple: you already have a record in your datastore, and you already have an indexed document in Elasticsearch.  When you update the record in your datastore, you need to make a corresponding update to the ES document.  With a partial update you send only the attributes that have changed, rather than resending the entire record.  Elasticsearch still merges the change into the stored document and reindexes it internally, but your application no longer has to load, encode, and transmit the large attachment.  To implement it, make a POST to your ES instance with the index, type, and document ID in the URL, and the changed fields wrapped in a "doc" object:

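require 'httparty'
require 'json'
url = "#{ELASTIC_SEARCH_HOST}/cases/case/#{c.id}/_update"
# Wrap only the changed fields in "doc"; to_json guarantees well-formed JSON
payload = { :doc => { :last_name => "Smith" } }.to_json
HTTParty.post(url, :body => payload, :headers => { "Content-Type" => "application/json" })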

Note that this example uses HTTParty and Ruby, but you can use any method or language for POSTing to your ES instance, since Elasticsearch is exposed as an HTTP service.
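
To illustrate that point, here is a minimal sketch of the same request using only Ruby's standard library, plus a small step that works out which attributes actually changed.  The helper names changed_attributes and build_partial_update, the host http://localhost:9200, and the document ID 42 are all assumptions made for illustration; they are not part of Elasticsearch or of any gem.  The sketch builds the request without sending it, so it runs without an ES instance.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: keep only the attributes whose values differ
# between the previously indexed record and the updated one.
def changed_attributes(before, after)
  after.reject { |key, value| before.key?(key) && before[key] == value }
end

# Hypothetical helper: build (but do not send) the _update request.
# The index ("cases") and type ("case") mirror the example above.
def build_partial_update(host, id, changes)
  uri = URI("#{host}/cases/case/#{id}/_update")
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  # Only the changed fields go inside "doc"; unchanged fields are not sent.
  request.body = { doc: changes }.to_json
  [uri, request]
end

before = { 'first_name' => 'Jane', 'last_name' => 'Doe' }
after  = { 'first_name' => 'Jane', 'last_name' => 'Smith' }

changes = changed_attributes(before, after)
uri, request = build_partial_update('http://localhost:9200', 42, changes)

# Sending the request needs a running Elasticsearch instance (assumed host):
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

Because the large attachment field is unchanged, it never appears in the changes hash and so is never put on the wire.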

That's all there is to it.  Using this endpoint, you can get enormous performance increases on records that hold large amounts of data when only a small change is made.  For more information, check out the docs, and feel free to leave comments and questions below.