Using the Delete By Query API and the Bulk API together in Elasticsearch

I couldn't find any documentation or examples about using the Delete By Query API together with the Bulk API in Elasticsearch.
Simply put, I want to delete all the documents that share the same value in field A and insert many documents just after that. If the delete fails, no documents should be inserted.
e.g.
POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
??? { "delete_by_query???" : { "_index" : "test", "_type" : "type1", "query"... } }
Is there any way to use them together?
Thanks.
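For context, the bulk action lines only accept index, create, update, and delete (by ID); there is no delete_by_query action, so the delete by query would have to be a separate request issued before the bulk call, and the bulk insert skipped if it fails. A sketch, assuming Elasticsearch 5.x or later and a placeholder value for field A:
POST test/_delete_by_query
{
  "query": { "term": { "A": "value_to_replace" } }
}

POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
Note that this two-step approach is not transactional: if the bulk request fails partway through, the documents removed by the delete by query are not restored.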

Related

Getting documents with a specific field criteria with criteria value common in all the types

I am trying to get the make mapping for different companies using Elasticsearch, where the different companies are defined in the type field. I want to fetch all the makes that occur for a particular company, and based on that, fetch the makes of other companies that are common with the company from the first search query.
I am new to Elasticsearch and would like some assistance with this.
I have tried breaking the problem down using Elasticsearch filters and aggregations, but I am still not able to get the required values.
Data for the vehicle mappings:
{
"_index" : "vehiclemapping",
"_id" : "fN1P-GwBjuCNVtK7BNxL",
"company":"abc1",
"make":"make1"
},
{
"_index" : "vehiclemapping",
"_id" : "fN1P-GwBjuCNVtK7BNx2",
"company":"abc2",
"make":"make2"
},
{
"_index" : "vehiclemapping",
"_id" : "fN1P-GwBjuCNVtK7BNx3",
"company":"abc3",
"make":"make3"
},
{
"_index" : "vehiclemapping",
"_id" : "fN1P-GwBjuCNVtK7BNx4",
"company":"abc1",
"make":"make2"
},
{
"_index" : "vehiclemapping",
"_id" : "fN1P-GwBjuCNVtK7BNx5",
"company":"abc2",
"make":"make1"
},
{
"_index" : "vehiclemapping",
"_id" : "fN1P-GwBjuCNVtK7BNx6",
"company":"abc2",
"make":"make3"
}
I am expecting to get all documents having company = 'abc1', along with the other documents whose makes match abc1's makes.
EXPECTED OUTPUT:
{
"make":"make1"
},
{"make":"make2"},
{"make":"make3"}

ElasticSearch Bulk with ingest plugin

I am using the Attachment Processor in a pipeline.
Everything works fine, but I wanted to do multiple posts, so I tried to use the Bulk API.
Bulk works fine too, but I can't find how to send the URL parameter "pipeline=attachment".
This single-document request works:
POST testindex/type1/1?pipeline=attachment
{
"data": "Y291Y291",
"name" : "Marc",
"age" : 23
}
This bulk works:
POST _bulk
{ "index" : { "_index" : "testindex", "_type" : "type1", "_id" : "2" } }
{ "name" : "jean", "age" : 22 }
But how can I index Marc with his data field in bulk to be understood by the pipeline plugin?
Thanks to Val's comment, I did the following and it works fine:
POST _bulk
{ "index" : { "_index" : "testindex", "_type" : "type1", "_id" : "2", "pipeline": "attachment"} } }
{"data": "Y291Y291", "name" : "jean", "age" : 22}

How do you bulk index documents into the default mapping of ElasticSearch?

The documentation for ElasticSearch 5.5 offers no examples of how to use the bulk operation to index documents into the default mapping of an index. It also gives no indication why this is not possible, unless I'm missing that somewhere else in the documentation.
The ES 5.5 documentation gives one explicit example of bulk indexing:
POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
But it also says that
The endpoints are /_bulk, /{index}/_bulk, and {index}/{type}/_bulk.
When the index or the index/type are provided, they will be used by
default on bulk items that don’t provide them explicitly.
So, the middle endpoint is valid, and it implies to me that a) you have to explicitly provide a type in the metadata for each document indexed, or b) that you can index documents into the default mapping ("_default_").
But I can't get this to work.
I've tried the /myindex/_bulk endpoint with no type specified in the metadata.
I've tried it with "_type": "_default_" specified.
I've tried /myindex/_default_/_bulk.
This has nothing to do with the _default_ mapping. This is about falling back to the default type that you specify in the URL. You can do the following:
POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
However, the following snippet is exactly the same:
POST /test/type1/_bulk
{ "index" : { "_id" : "1" } }
{ "field1" : "value1" }
And you can mix both:
POST foo/bar/_bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_id" : "1" } }
{ "field1" : "value1" }
In this example, one document would be indexed into foo and one into test.
Hope this makes sense.

Is the order of operations guaranteed in a bulk update?

I am sending delete and index requests to Elasticsearch in bulk (the example is adapted from the docs):
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
The sequence above is intended to first delete a possible document with _id=1, then index a new document with the same _id=1.
Is the order of the actions guaranteed? In other words, for the example above, can I be sure that the delete will not touch the document indexed afterwards (because the order would not be respected for a reason or another)?
The delete operation is useless in this scenario: if you simply index a document with the same ID, it will automatically and implicitly delete/replace the previous document with the same ID.
So if a document with ID=1 already exists, simply sending the command below will replace it (read: delete and re-index it):
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
According to an Elastic Team Member:
Elasticsearch is distributed and concurrent. We do not guarantee that requests are executed in the order they are received.
https://discuss.elastic.co/t/are-bulk-index-operations-serialized/83770/6

Cannot update path in timestamp value

Here is my problem: I'm trying to insert a bunch of data into Elasticsearch and visualize it using Kibana, but I have an issue with Kibana's timestamp recognition.
My time field is called "dateStart", and I tried to use it as a timestamp with the following command:
curl -XPUT 'localhost:9200/test/type1/_mapping' -d'{ "type1" :{"_timestamp":{"enabled":true, "format":"yyyy-MM-dd HH:mm:ss","path":"dateStart"}}}'
But this command gives me the following error message:
{"error":"MergeMappingException[Merge failed with failures {[Cannot update path in _timestamp value. Value is null path in merged mapping is missing]}]","status":400}
I'm not sure I understand what this command does, but what I would like is to tell Elasticsearch and Kibana to use my "dateStart" field as a timestamp.
Here is a sample of my insert file (I use bulk insert):
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1"} }
{ "dateStart" : "15-03-31 06:00:00", "score":0.9920092243874442}
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "2"} }
{ "dateStart" : "15-03-23 06:00:00", "score":0.0}
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "3"} }
{ "dateStart" : "15-03-29 12:00:00", "score":0.0}
