Update a document using a field other than _id in ElasticSearch

I would like to do a partial update of a document in ElasticSearch 2.3. The documentation shows:
POST /website/blog/1/_update
{
"doc" : {
"tags" : [ "testing" ],
"views": 0
}
}
Is there a way to identify the document by a field other than the _id (here 1) when updating it?

Use the update-by-query API (_update_by_query) with a query that selects the documents matching the other field. The query identifies the documents to update according to your own rules, and the update is applied to every match. Note that the change itself is expressed as a script rather than a partial doc.
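As a rough sketch of what such a request could look like, the Python snippet below builds an update-by-query body that selects documents with a term query on another field and applies the partial update through a script. The field name slug, its value, and the helper name are hypothetical. The script assumes Painless syntax and the source key, which is an assumption about the cluster version: older releases such as 2.x use a different scripting language and the inline key. On an analyzed text field a match query may be needed instead of term.

```python
import json

def build_update_by_query(field, value, changes):
    """Build an _update_by_query body: select by `field`, then set `changes`."""
    # One assignment per updated field; values are passed as params, not inlined.
    assignments = [
        "ctx._source['{0}'] = params['{0}']".format(name) for name in sorted(changes)
    ]
    return {
        "query": {"term": {field: value}},
        "script": {
            "lang": "painless",  # assumption: Painless is available on the cluster
            "source": "; ".join(assignments),
            "params": changes,
        },
    }

# Hypothetical identifying field "slug"; the changes mirror the question's doc.
body = build_update_by_query("slug", "my-first-post", {"tags": ["testing"], "views": 0})
# Would be sent as something like: POST /website/blog/_update_by_query
print(json.dumps(body, indent=2))
```

Passing the new values as params keeps the script text stable across calls, and it avoids quoting problems with values embedded in the script.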

Related

Use number field as date in Kibana date_histogram aggregation

I'm trying to use the Visualize feature in Kibana to plot a monthly date_histogram graph that counts the number of messages in my system. The message type has a sent_at field that is stored as a number (time since the epoch).
I can do that just fine with an Elasticsearch query:
POST /_all/message/_search?size=0
{
"aggs" : {
"monthly_message" : {
"date_histogram" : {
"field" : "sent_at",
"interval" : "month"
}
}
}
}
In Kibana, however, I ran into this error: No Compatible Fields: The "myindex" index pattern does not contain any of the following field types: date
Is there a way to get Kibana to use a number field as a date?
Not to my knowledge. Kibana uses the index mapping to find date fields; if none exist, it cannot infer one from the numeric fields.
What you can do is add another field called sent_at_date to your mapping, use the update-by-query API to copy the sent_at field into that new field, and finally recreate your index pattern in Kibana.
It goes basically like this:
# 1. add a new field to your mapping
PUT myindex/_mapping/message
{
"properties": {
"sent_at_date": {
"type": "date"
}
}
}
# 2. update all your documents
POST myindex/_update_by_query
{
"script": {
"source": "ctx._source.sent_at_date = ctx._source.sent_at"
}
}
And finally recreate your index pattern in Kibana. You should see a new field called sent_at_date of type date that you can use in Kibana.
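One thing worth checking before running step 2: with the default date format, a bare number is read as milliseconds since the epoch, so if sent_at holds seconds the copied value would need scaling. The sketch below builds both request bodies in Python. The in_seconds flag and the function names are hypothetical, and the multiplication assumes a Painless script.

```python
import json

def date_mapping(new_field):
    """Mapping body that adds a date field (step 1)."""
    return {"properties": {new_field: {"type": "date"}}}

def copy_script(src, dst, in_seconds=False):
    """Update-by-query body that copies src into dst (step 2)."""
    expr = "ctx._source.{0}".format(src)
    if in_seconds:
        # Assumption: the date field expects epoch milliseconds.
        expr += " * 1000L"
    return {"script": {"source": "ctx._source.{0} = {1}".format(dst, expr)}}

mapping = date_mapping("sent_at_date")
update = copy_script("sent_at", "sent_at_date", in_seconds=True)
print(json.dumps(mapping))  # body for: PUT myindex/_mapping/message
print(json.dumps(update))   # body for: POST myindex/_update_by_query
```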

Highlighting _source excluded fields in elasticsearch

I have a question about Elasticsearch highlighting. I am setting the mapping of an index so that some fields are not kept in Elasticsearch (they are excluded from _source).
For example:
PUT employee/_mapping/records
{
"_source": {
"includes": [
"id",
"name"
],
"excludes": [
"designation",
"age"
]
}
}
Now I'm able to search the excluded fields, but I can't highlight fields that are excluded from _source.
I've read this in the Elasticsearch documentation, but I want to confirm whether there's any way to highlight fields in such a scenario. Thanks.
From further research, I found that to highlight a field in search results, it must either be stored in ES (by setting store: yes in its mapping) or be included in the _source field.
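A minimal sketch of the first option, assuming the excluded field is marked as stored in the mapping and then requested in the highlight section of a search. The helper names are hypothetical, and the field type text is an assumption (older releases use string, and may spell the flag as yes rather than true).

```python
import json

def mapping_with_stored(excluded, field_type="text"):
    """Mapping that keeps `excluded` out of _source but stores each field separately."""
    return {
        "_source": {"excludes": list(excluded)},
        "properties": {
            name: {"type": field_type, "store": True} for name in excluded
        },
    }

def highlight_search(field, text):
    """Search body that matches on `field` and asks for highlights on it."""
    return {
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {}}},
    }

mapping = mapping_with_stored(["designation"])
search = highlight_search("designation", "engineer")
print(json.dumps(mapping))
print(json.dumps(search))
```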

How to specify certain fields only in the query property

I am using a service which wraps requests to Elastic Search. This service only allows me to send the query property to Elastic Search. I want to tell Elastic Search to look only for matches in a certain field in a document.
For example, if this is my document:
{
name: 'foo',
value: 'true'
}
Then I want to tell Elastic Search to look only for documents where name equals foo.
The Elastic Search documentation says to do this by using the fields property like so:
{
"multi_match" : {
"query" : "this is a test",
"fields" : [ "subject^3", "message" ]
}
}
But I can ONLY access the query property, so I can't specify fields. Lower down on the page, under best fields it says that this is equivalent to doing something like +first_name:will +first_name:smith. But when I put this, it's looking for text that actually matches +first_name:will +first_name:smith in the value, rather than looking for a first_name field that has a value will.
Is it possible to specify what field to search in with Elastic Search using only the query property?
This sounds like a perfect match for query_string (https://www.elastic.co/guide/en/elasticsearch/reference/1.x/query-dsl-query-string-query.html). You can do something like this with it:
"query_string" : {
"query" : "subject:whatever OR message:whatever"
}
So, if you can change multi_match to query_string this would be what you are looking for.
Lucene supports fielded data. When performing a search you can either specify a field or use the default field. The field names and the default field are implementation specific.
You can search any field by typing the field name followed by a colon ":" and then the term you are looking for.
{
"query": {
"query_string": {
"query": "Name:\"foo bar cook\"",
"default_operator" : "or"
}
}
}
Set default_operator to and to require all of the values (AND), or to or to match any of them (OR).
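To illustrate the field:value syntax, here is a small Python sketch that builds a query_string query restricted to one field. The helper name is hypothetical, and the escaping only covers backslashes and double quotes inside a quoted phrase, which is an assumption about what the input may contain.

```python
import json

def field_query(field, value):
    """Build a query_string query that searches only `field` for the phrase `value`."""
    # Escape backslashes first, then double quotes, and wrap the value as a phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return {"query_string": {"query": '{0}:"{1}"'.format(field, escaped)}}

q = field_query("name", "foo")
print(json.dumps(q))
```

Since the whole request fits inside the query property, this form works even when the wrapping service does not expose a fields parameter.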

How to add documents to existing index in elasticsearch

I am using Elasticsearch 1.4. I receive new data every hour that needs to be uploaded. My approach is to create an index, "demo", and upload the data, so the first hour's data gets inserted. My question is how to append the subsequent hours' data to this index.
PUT /demo/userdetails/1
{
"user" : "kimchy",
"message" : "trying out Elastic Search"
}
Now I am trying to add another document
{"user": "swarna","message":"hi"}
You simply need to PUT the additional documents. In your example above you did:
PUT /demo/userdetails/1
{ "user" : "kimchy", "message" : "trying out Elastic Search" }
Now you would do this:
PUT /demo/userdetails/2
{"user": "swarna","message":"hi"}
In your command, demo is the index, userdetails is the type, and the number is the document ID. If you omit the document ID (using POST instead of PUT), ES will generate one for you.
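The two cases (explicit ID versus generated ID) can be sketched as a small Python helper that returns the HTTP method, path, and body for one document. The function name is hypothetical, and the path layout assumes a version that still uses types, as in the question.

```python
import json

def index_request(index, doc_type, doc, doc_id=None):
    """Return (method, path, body) for indexing one document."""
    if doc_id is None:
        # No ID given: POST to the type and let Elasticsearch generate one.
        return "POST", "/{0}/{1}".format(index, doc_type), json.dumps(doc)
    # Explicit ID: PUT creates the document or replaces an existing one.
    return "PUT", "/{0}/{1}/{2}".format(index, doc_type, doc_id), json.dumps(doc)

doc = {"user": "swarna", "message": "hi"}
with_id = index_request("demo", "userdetails", doc, 2)
auto_id = index_request("demo", "userdetails", doc)
print(with_id[0], with_id[1])
print(auto_id[0], auto_id[1])
```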

Can I create a document with the update API if the document doesn't exist yet

I have a very simple question:
I want to update multiple documents in Elasticsearch. Sometimes the document already exists, sometimes not. I don't want to use a GET request to check whether the document exists (this hurts performance). I want my update request to index the document directly if it doesn't exist yet.
I know that we can use upsert to create a non-existing field when updating a document, but this is not what I want. I want to index the document if it doesn't exist. I don't know if upsert can do this.
Can you provide some explanation?
Thanks in advance!
This is doable using the update API. It does require that you define the ID of each document, since the update API uses the ID to determine whether the document exists.
Given an index created with the following documents:
PUT /cars/car/1
{ "color": "blue", "brand": "mercedes" }
PUT /cars/car/2
{ "color": "blue", "brand": "toyota" }
We can get the upsert functionality you want using the update API with the following call:
POST /cars/car/3/_update
{
"doc": {
"color" : "brown",
"brand" : "ford"
},
"doc_as_upsert" : true
}
This API call will add the document to the index since it does not exist.
Running the call a second time after changing the color of the car will update the existing document instead of creating a new one.
POST /cars/car/3/_update
{
"doc": {
"color" : "black",
"brand" : "ford"
},
"doc_as_upsert" : true
}
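The create-or-merge behavior of the two calls above can be modelled in a few lines of Python. This is only an in-memory illustration, not how Elasticsearch implements it: the function name is hypothetical, the index is a plain dict, and the merge is shallow, whereas a real partial update also merges nested objects.

```python
def update_with_upsert(store, doc_id, doc):
    """Model of _update with doc_as_upsert: create if missing, else merge."""
    if doc_id not in store:
        store[doc_id] = dict(doc)  # document did not exist: index it as-is
        return "created"
    store[doc_id].update(doc)      # document exists: shallow partial merge
    return "updated"

cars = {
    1: {"color": "blue", "brand": "mercedes"},
    2: {"color": "blue", "brand": "toyota"},
}
first = update_with_upsert(cars, 3, {"color": "brown", "brand": "ford"})
second = update_with_upsert(cars, 3, {"color": "black"})
print(first, second, cars[3])
```

The second call only sends the changed field, which is enough because the existing brand is kept by the merge.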
AFAIK when you index the documents (with a PUT call), the existing version gets replaced with the newer version. If the document did not exist, it gets created. There is no need to make a distinction between INSERT and UPDATE in ElasticSearch.
UPDATE: According to the documentation, if you use op_type=create, or a special _create version of the indexing call, then any call for a document which already exists will fail.
Quote from the documentation:
Here is an example of using the op_type parameter:
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1?op_type=create' -d '{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}'
Another option to specify create is to use the following uri:
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1/_create' -d '{
"user" : "kimchy",
"post_date" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}'
For the bulk API, use:
bulks.push({
update: {
_index: 'index',
_type: 'type',
_id: id
}
});
bulks.push({"doc_as_upsert":true, "doc": your_doc});
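For readers not using the JavaScript client, the same pairs of lines can be written out as the newline-delimited JSON that the bulk endpoint expects. The Python sketch below only builds the payload string; the helper name and the sample index, type, and document are assumptions, and the _type key applies only to versions that still use types.

```python
import json

def bulk_upsert_body(index, doc_type, docs):
    """Build an NDJSON _bulk body with one update/doc_as_upsert pair per document."""
    lines = []
    for doc_id, doc in docs.items():
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        # Payload line: partial document, created if it does not exist.
        lines.append(json.dumps({"doc": doc, "doc_as_upsert": True}))
    # The bulk API expects newline-delimited JSON ending with a newline.
    return "\n".join(lines) + "\n"

payload = bulk_upsert_body("cars", "car", {"3": {"color": "brown", "brand": "ford"}})
print(payload, end="")
```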
As of elasticsearch-model v0.1.4, upserts aren't supported. I was able to work around this by creating a custom callback.
after_commit on: :update do
begin
__elasticsearch__.update_document
rescue Elasticsearch::Transport::Transport::Errors::NotFound
__elasticsearch__.index_document
end
end
I think you want the "create" action.
Here's the bulk API documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
The index and create actions expect a source on the next line, and have the same semantics as the op_type parameter in the standard index API: create fails if a document with the same ID already exists in the target, index adds or replaces a document as necessary.
Difference between actions:
create: (Optional, string) Indexes the specified document if it does not already exist. The following line must contain the source data to be indexed.
index: (Optional, string) Indexes the specified document. If the document exists, replaces the document and increments the version. The following line must contain the source data to be indexed.
update: (Optional, string) Performs a partial document update. The following line must contain the partial document and update options.
doc: (Optional, object) The partial document to index. Required for update operations.
