Elasticsearch Bulk API using curl and text file - elasticsearch

I'm a beginner with Elasticsearch and am following an "Essential Training" course on LinkedIn Learning. I'm trying to follow along with the bulk loading API; the instructor is using Linux, but I'm on Windows. He created a text file with the data using vi. I just created a text file, pasted the data into it, and removed the ".txt" extension. The contents of the file, called reqs, are:
{
"index":{
"_index":"my-test",
"_type":"my-type",
"_id":"1"
}
}{
"col1":"val1"
}{
"index":{
"_index":"my-test",
"_type":"my-type",
"_id":"2"
}
}{
"col1":"val2"
}{
"index":{
"_index":"my-test",
"_type":"my-type",
"_id":"3"
}
}{
"col1":"val3"
}
I've tried saving it with a carriage return (new line) after the last line and without. I saved this into my elasticsearch folder (C:\elasticsearch-7.12.0) which is the same directory I'm running the following command from:
c:\elasticsearch-7.12.0>curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@reqs"; echo
When I do this, I'm getting the following error:
{"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}

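The bulk endpoint expects newline-delimited JSON (NDJSON): every action and every document must sit on a single line of its own, and the last line must also end with a newline. Use the curl command below (double quotes also work in the Windows command prompt, single quotes do not):
curl -H "Content-Type: application/x-ndjson" -XPOST "localhost:9200/index-name/_bulk?pretty" --data-binary "@reqs.json"
reqs.json should look like this: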
{"index" : {"_index" : "my-test", "_type" : "my-type", "_id" : "1"}}
{"col1" : "val1"}
{"index" : {"_index" : "my-test", "_type" : "my-type", "_id" : "2"}}
{"col1" : "val2"}
{"index" : {"_index" : "my-test", "_type" : "my-type", "_id" : "3"}}
{"col1" : "val3"}
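The one-object-per-line layout can be produced from the pretty-printed file instead of retyping it. Below is a minimal Python sketch, assuming the file contains valid JSON objects written back to back as in the question; the function name to_ndjson is made up for illustration.

```python
import json


def to_ndjson(text):
    """Re-serialize back-to-back JSON objects as one compact object per line."""
    decoder = json.JSONDecoder()
    pos, lines = 0, []
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it manually.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        lines.append(json.dumps(obj, separators=(",", ":")))
    # Every line, including the last one, ends with a newline.
    return "".join(line + "\n" for line in lines)
```

The result can be written with open("reqs.json", "w", newline="\n") so that plain \n line endings are used on Windows as well, and then sent with --data-binary "@reqs.json".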

Related

Why do I have to PUT new documents to a nested URI, if mapping types have been removed?

I'm on Elasticsearch 7.14.0 where mapping types have been removed.
If I run the following:
curl -X PUT "localhost:9200/products/1?pretty" -H 'Content-Type: application/json' -d'
{
"name": "Toast"
}
'
I get
{
"error" : "Incorrect HTTP method for uri [/products/1?pretty] and method [PUT], allowed: [POST]",
"status" : 405
}
It seems that Elasticsearch wants me to PUT it to an /index/type/ URI:
curl -X PUT "localhost:9200/pop/products/1?pretty" -H 'Content-Type: application/json' -d'
{
"name": "Toast"
}
'
{
"_index" : "pop",
"_type" : "products",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
I am wondering why I must have a nested URI indicating a type, if mapping types have been removed?
You have to add _doc to the path of your PUT request, as shown below:
curl -X PUT "localhost:9200/products/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
"name": "Toast"
}
'
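As the official Elasticsearch documentation explains, after mapping types were removed in 7.x you need to put _doc in the path for the document index, get, and delete APIs. Here _doc does not represent a document type; it is the name of the endpoint. Without it, /products/1 is presumably still matched against the legacy /index/type route, which accepts only POST, hence the 405 error.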
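The path rule can be summarized in a few lines. This is only an illustrative Python sketch; index_request is a hypothetical helper, not part of any Elasticsearch client. It relies on the documented behaviour that PUT /<index>/_doc/<id> indexes a document under a given ID, while POST /<index>/_doc lets Elasticsearch generate one.

```python
from urllib.parse import quote


def index_request(index, doc_id=None):
    """Return (method, path) for indexing one document via the _doc endpoint."""
    base = "/" + quote(index, safe="") + "/_doc"
    if doc_id is None:
        # No ID supplied: POST lets Elasticsearch generate one.
        return "POST", base
    # Explicit ID: PUT to /<index>/_doc/<id>.
    return "PUT", base + "/" + quote(str(doc_id), safe="")
```

For the example in the question, index_request("products", 1) yields the method PUT and the path /products/_doc/1.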

Bulk index text field containing new lines with cURL

I am trying to bulk index a file with the following format to my elasticsearch index:
{"index":{"_index":"articles","_type":"_doc"}}
{"title":"My Article Title","text":"My article text. \nNext paragraph here."}
Using this command:
curl -s -XPOST -H 'Content-Type: application/x-ndjson' http://localhost:9200/_bulk --data-binary @/data.json
The problem is that the article text in my documents may contain new line characters \n, which breaks the formatting for a cURL bulk index, so I get this error:
{"error":{"root_cause":[{"type":"parse_exception","reason":"request body is required"}],"type":"parse_exception","reason":"request body is required"},"status":400}
I have been able to bulk index these documents using the javascript API, so I'm hoping it will be possible to do using cURL, as I want to index these documents into my docker image as a part of the build.
I managed to do it on Elasticsearch 7.3 and Red Hat Enterprise Linux 7 (7.7).
1) Renamed the file from .json to .txt, pressed Enter after the last line so that the file ends with a newline, saved it, and uploaded it to the server:
[root@host tmp]$ mv data.json data.txt
2) Made curl append a newline to its output:
[root@host tmp]$ echo '-w "\n"' >> ~/.curlrc
3) Sent the file to Elasticsearch with curl:
[root@host tmp]$ curl -s -XPOST -H 'Content-Type: application/x-ndjson' https://localhost:9200/_bulk -k -u user:pass --data-binary @data.txt
{"took":4,"errors":false,"items":[{"index":{"_index":"articles","_type":"_doc","_id":"QdsosG0B3nqkAGly3E6t","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":1,"_primary_term":1,"status":201}}]}
4) Result:
[root@host tmp]$ curl -XGET -H 'Content-Type: application/x-ndjson' https://localhost:9200/articles/_search?pretty -k -u user:pass
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "articles",
"_type" : "_doc",
"_id" : "QdsosG0B3nqkAGly3E6t",
"_score" : 1.0,
"_source" : {
"title" : "My Article Title",
"text" : "My article text. \nNext paragraph here."
}
}
]
}
}
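An escaped \n inside a JSON string is not a line break in the file, so it does not split the NDJSON line; what the bulk endpoint does need is a real newline after every line, including the last. Here is a minimal Python sketch of generating such a body, assuming the documents are available as dictionaries; bulk_index_body is a hypothetical helper name.

```python
import json


def bulk_index_body(index, docs):
    """Build an NDJSON bulk body with one index action per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        # json.dumps writes a newline inside a string as the two characters
        # backslash + n, so each document stays on one physical line.
        lines.append(json.dumps(doc))
    # Terminate the final line with a newline as well.
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file and posting it with --data-binary @file keeps the line breaks intact, whereas curl's plain -d option strips newlines when it reads from a file.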

Bulk action to remove values in Elasticsearch

I have these documents:
curl -XPOST -H "Content-Type: application/json" "http://localhost:9200/_bulk?pretty" --data-binary '
{"index":{"_index":"test","_type":"product"}}
{"id_product":"1", "categories":[1,2], "enums":[10,11,20,21] }
{"index":{"_index":"test","_type":"product"}}
{"id_product":"2", "categories":[1,2], "enums":[10,15,25,26] }
{"index":{"_index":"test","_type":"product"}}
{"id_product":"3", "categories":[2,3], "enums":[11,12,13,21,22,23,24] }
'
I need to remove the numbers 10 and 11 from the enums field in all documents with a bulk action.
After the update, the documents should look like this:
{"id_product":"1", "categories":[1,2], "enums":[20,21] }
{"id_product":"2", "categories":[1,2], "enums":[15,25,26] }
{"id_product":"3", "categories":[2,3], "enums":[12,13,21,22,23,24] }
Is it possible to do it?
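Yes, it is possible. Send one update action per document to the _bulk endpoint, each followed by a partial doc that contains the new enums array:
curl -XPOST -H "Content-Type: application/json" "http://localhost:9200/_bulk?pretty" --data-binary '
{ "update" : {"_id" : "1", "_type" : "product", "_index" : "test"} }
{ "doc" : {"enums" : [20,21]} }
{ "update" : {"_id" : "2", "_type" : "product", "_index" : "test"} }
{ "doc" : {"enums" : [15,25,26]} }
{ "update" : {"_id" : "3", "_type" : "product", "_index" : "test"} }
{ "doc" : {"enums" : [12,13,21,22,23,24]} }
'
This assumes the documents have the _id values 1, 2, and 3. For further details, read the Bulk API documentation.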
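With many documents, the update lines are easier to generate than to type. The following Python sketch assumes the current documents have already been fetched into a dictionary keyed by _id; bulk_remove_values and its parameters are hypothetical names chosen for illustration.

```python
import json


def bulk_remove_values(index, docs, field, unwanted, doc_type=None):
    """Build an NDJSON bulk body that rewrites `field` without the unwanted values."""
    lines = []
    for doc_id, source in docs.items():
        meta = {"_index": index, "_id": doc_id}
        if doc_type is not None:
            # Only needed on versions that still require a mapping type.
            meta["_type"] = doc_type
        kept = [value for value in source[field] if value not in unwanted]
        lines.append(json.dumps({"update": meta}))
        lines.append(json.dumps({"doc": {field: kept}}))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"
```

Because the new arrays are computed on the client, a concurrent change to a document between the read and the bulk update would be overwritten; that trade-off is an assumption of this sketch.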

Can fielddata_fields be used in mget request?

I am trying to get the fielddata from a not_analyzed field in Multi Get query. It is working fine in _search queries.
This is what I've tried with no luck:
curl -XGET "http://es:9200/articles/article/_mget/?pretty&fielddata_fields=url" -d '{"ids" : ["5763197951"]}'
curl -XGET "http://es:9200/articles/article/_mget/?pretty" -d '{"fielddata_fields": ["url"], "ids" : ["5763197951"]}'
curl -XGET "http://es:9200/articles/article/_mget/?pretty" -d '{"docs" : [{"_id" : "5763197951", "fielddata_fields": ["url"]}]}'
It looks like fielddata_fields is completely ignored, since I always get this result:
{
"docs" : [ {
"_index" : "articles",
"_type" : "article",
"_id" : "5763197951",
"_version" : 1,
"found" : true
} ]
}
I'm running ES version 1.4.4 with JVM: 1.8.0_31
Edit: I just tried the above with a test database running ES 2.2.2 with the same results...

Return document on update elasticsearch

Let's say I'm updating user data:
curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
"doc" : {
"name" : "new_name"
},
"fields": ["_source"]
}'
Here's an example of what I get back when I perform an update:
{
"_index" : "test",
"_type" : "type1",
"_id" : "1",
"_version" : 4
}
How do I perform an update that returns the given document post update?
The documentation is a little misleading about returning fields when performing an Elasticsearch update. The update API actually uses the same approach as the Index API: the parameter is passed on the URL, not as a field in the request body.
In your case you would submit:
curl -XPOST 'localhost:9200/test/type1/1/_update?fields=_source' -d '{
"doc" : {
"name" : "new_name"
}
}'
In my testing in Elasticsearch 1.2.1 it returns something like this:
{
"_index":"test",
"_type":"testtype",
"_id":"1","_version":9,
"get": {
"found":true,
"_source": {
"user":"john",
"body":"testing update and return fields",
"name":"new_name"
}
}
}
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html
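On the client side, the updated document sits under the "get" key of the response shown above. Here is a minimal Python sketch for pulling it out, assuming the response has that shape; updated_source is a hypothetical helper name.

```python
import json


def updated_source(response_text):
    """Return the post-update _source from an update response, or None if absent."""
    response = json.loads(response_text)
    # The "get" section is only present when the request asked for it.
    return response.get("get", {}).get("_source")
```

A response without the "get" section, like the one in the question, yields None.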
