Elasticsearch 6.6 bulk insert with Postman 6.7 - elasticsearch

I read various guides to bulk insert data into an index.
But whatever I do, the /n is not working. I guess there is some change in an update of Postman or ES?
I try to POST to
localhost:9200/urls/url/_bulk
In the JSON field, with JSON formatted:
{ "index" : {}} \n
{ "url" : "www.url1.com" } \n
{ "index" : {}} \n
{ "url" : "www.url2.com" } \n

Pretty weird but I got it.
The code needs an empty line at the end:
"
{ "index" : { "_index" : "test", "_type" : "_doc" } }
{ "url" : "www.url1.com" }
{ "index" : { "_index" : "test2", "_type" : "_doc" } }
{ "url" : "www.url1.com" }
"

Related

ElasticSearch Bulk with ingest plugin

I am using the Attachment Processor in a Pipeline.
All works fine, but I wanted to do multiple posts, then I tried to use the bulk API.
Bulk works fine too, but I can't find how to send the URL parameter "pipeline=attachment".
This put works:
POST testindex/type1/1?pipeline=attachment
{
  "data": "Y291Y291",
  "name" : "Marc",
  "age" : 23
}
This bulk works:
POST _bulk
{ "index" : { "_index" : "testindex", "_type" : "type1", "_id" : "2" } }
{ "name" : "jean", "age" : 22 }
But how can I index Marc with his data field in bulk to be understood by the pipeline plugin?
Thanks to Val's comment, I did that and it works fine:
POST _bulk
{ "index" : { "_index" : "testindex", "_type" : "type1", "_id" : "2", "pipeline": "attachment"} }
{"data": "Y291Y291", "name" : "jean", "age" : 22}

How do you bulk index documents into the default mapping of ElasticSearch?

The documentation for ElasticSearch 5.5 offers no examples of how to use the bulk operation to index documents into the default mapping of an index. It also gives no indication why this is not possible, unless I'm missing that somewhere else in the documentation.
The ES 5.5 documentation gives one explicit example of bulk indexing:
POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
But it also says that
The endpoints are /_bulk, /{index}/_bulk, and {index}/{type}/_bulk.
When the index or the index/type are provided, they will be used by
default on bulk items that don’t provide them explicitly.
So, the middle endpoint is valid, and it implies to me that a) you have to explicitly provide a type in the metadata for each document indexed, or b) that you can index documents into the default mapping ("_default_").
But I can't get this to work.
I've tried the /myindex/bulk endpoint with no type specified in the metadata.
I've tried it with "_type": "_default_" specified.
I've tried /myindex/_default_/bulk.
This has nothing to do with the _default_ mapping. This is about falling back to the default type that you specify in the URL. You can do the following
POST _bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
However the following snippet is exactly the same
POST /test/type1/_bulk
{ "index" : { "_id" : "1" } }
{ "field1" : "value1" }
And you can mix this
POST foo/bar/_bulk
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_id" : "1" } }
{ "field1" : "value1" }
In this example, one document would be indexed into foo and one into test.
Hope this makes sense.

How to upload mysql data to Elasticsearch

I am new to Elasticsearch.
I am trying to upload my existing MySql data to Elasticsearch. Elasticsearch bulk import uses json as the data format. That's why I converted my data to the json format.
employee.json:
[{"EmpId":"101", "Name":"John Doe", "Dept":"IT"}
{"EmpId":"102", "Name":"FooBar", "Dept":"HR"}]
But I am not able to upload my data using the following curl command:
post: curl -XPOST 'localhost:9200/_bulk?pretty' --data-binary @employee.json
I get a parsing exception message.
After reading a document (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html), I realized that the data format should be something like this:
action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n
I am still not sure how to format my data in the above format and perform the upload operation.
Basically I want to know the exact data format that is expected by the Elasticsearch bulk upload. And would also like to know whether my curl command is correct.
Your data should be in the form:
// if you want to use emp id as doc id specify it, otherwise don't add the _id part
{ "index" : { "_index" : "index_name", "_type" : "type_name", "_id" : "101" } }
{"EmpId":"101", "Name":"John Doe", "Dept":"IT"}
{ "index" : { "_index" : "index_name", "_type" : "type_name", "_id" : "102" } }
{"EmpId":"102", "Name":"FooBar", "Dept":"HR"}
....
Or you can use logstash: https://www.elastic.co/blog/logstash-jdbc-input-plugin
From the docs:
{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1"} }
{ "doc" : {"field2" : "value2"} }
So you would probably want your file to read something like
{ "update" : {"_id" : "101", "_type" : "foo", "_index" : "bar"} }
{"EmpId":"101", "Name":"John Doe", "Dept":"IT"}
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
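An added example, not taken from any of the posts above: it builds a bulk body from rows like the employee records and checks the points the first answer on this page and the answers to this question turn on (one JSON object per line, an action line before each source line, a newline after the last line). The function names, the `index_name`/`type_name` arguments used with it and the sample rows are assumptions made for illustration.

```python
import json

# Constructed sample rows mirroring the employee records in the question.
ROWS = [
    {"EmpId": "101", "Name": "John Doe", "Dept": "IT"},
    {"EmpId": "102", "Name": "FooBar", "Dept": "HR"},
]
WITH_SOURCE = {"index", "create", "update"}  # actions followed by a source line


def to_bulk(rows, index, doc_type, id_field=None):
    """Serialize rows as action/source line pairs, every line ending in a newline."""
    lines = []
    for row in rows:
        meta = {"_index": index, "_type": doc_type}
        if id_field is not None:
            meta["_id"] = row[id_field]
        # json.dumps escapes newlines inside strings, so each object stays on one line.
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(row))
    return "".join(line + "\n" for line in lines)


def check_bulk(body):
    """Return the format problems found in a bulk body; an empty list means none."""
    problems = []
    if not body.endswith("\n"):
        problems.append("body does not end with a newline")
    lines = body.split("\n")
    if lines[-1] == "":
        lines.pop()  # the piece after the final newline is not a line
    expect_source = False
    for number, line in enumerate(lines, start=1):
        try:
            obj = json.loads(line)
        except ValueError:
            problems.append(f"line {number}: not one JSON object on a single line")
            expect_source = not expect_source  # assume the bad line filled its slot
            continue
        if expect_source:
            expect_source = False
            continue
        action = next(iter(obj)) if isinstance(obj, dict) and len(obj) == 1 else None
        if action not in WITH_SOURCE and action != "delete":
            problems.append(f"line {number}: expected an action line")
            continue
        expect_source = action in WITH_SOURCE
    if expect_source:
        problems.append("last action has no source line")
    return problems
```

Write the string returned by `to_bulk` to a file and send it with `curl --data-binary @file`; `--data-binary` keeps the newlines that `-d` strips when reading from a file.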

Trying Elasticsearch with Shield to kibana dashboard Getting Error?

The following sample data I have used in my environment
Data:
{ "index" : { "_index" : "cases", "_type" : "case", "_id" : "101" } }
{ "admission" : "2015-01-03", "discharge" : "2015-01-04", "injury" : "broken arm" }
{ "index" : { "_index" : "cases", "_type" : "case", "_id" : "102" } }
{ "admission" : "2015-01-03", "discharge" : "2015-01-06", "injury" : "broken leg" }
{ "index" : { "_index" : "cases", "_type" : "case", "_id" : "103" } }
{ "admission" : "2015-01-06", "discharge" : "2015-01-07", "injury" : "broken nose" }
{ "index" : { "_index" : "cases", "_type" : "case", "_id" : "104" } }
{ "admission" : "2015-01-07", "discharge" : "2015-01-07", "injury" : "bruised arm" }
{ "index" : { "_index" : "cases", "_type" : "case", "_id" : "105" } }
{ "admission" : "2015-01-08", "discharge" : "2015-01-10", "injury" : "broken arm" }
{ "index" : { "_index" : "patients", "_type" : "patient", "_id" : "101" } }
{ "name" : "Adam", "age" : 28 }
{ "index" : { "_index" : "patients", "_type" : "patient", "_id" : "102" } }
{ "name" : "Bob", "age" : 45 }
{ "index" : { "_index" : "patients", "_type" : "patient", "_id" : "103" } }
{ "name" : "Carol", "age" : 34 }
{ "index" : { "_index" : "patients", "_type" : "patient", "_id" : "104" } }
{ "name" : "David", "age" : 14 }
{ "index" : { "_index" : "patients", "_type" : "patient", "_id" : "105" } }
{ "name" : "Eddie", "age" : 72 }
Indexed the data into the node
$ curl -X POST 'http://localhost:9200/_bulk' --data-binary @./hospital.json
[2015-02-12 08:18:01,347][INFO ][shield.license ] [node0] enabling license for [shield]
[2015-02-12 08:18:01,347][INFO ][license.plugin.core ] [node0] license for [shield] - valid
[2015-02-12 08:18:01,355][ERROR][shield.license ] [node0]
#
# Shield license will expire on [Saturday, March 14, 2015]. Cluster health, cluster stats and indices stats operations are
# blocked on Shield license expiration. All data operations (read and write) continue to work. If you
# have a new license, please update it. Otherwise, please reach out to your support contact.
#
Installed Shield and started as the above
The data is protected and I can see like below if I'm trying to access.
$ curl localhost:9200/cases/case/101?pretty=true
{
"error" : "AuthenticationException[missing authentication token for REST request [/cases/case/1]]",
"status" : 401
}
User and roles are added like below
$ elasticsearch/bin/shield/esusers useradd alice -r nurse
$ elasticsearch/bin/shield/esusers useradd bob -r doctor
I have edited the roles.yml and tried to add doctor and nurse according to the example mentioned above. The security did not work for me.
ubuntu@ip-10-142-247-183:~/elkproject/elasticsearch-1.4.4/config/shield$ curl --user alice:abc123 localhost:9200/_count?pretty=true
{
"error" : "AuthenticationException[unable to authenticate user [alice] for REST request [/_count?pretty=true]]",
"status" : 401
}
Note : I referred this blog http://blog.trifork.com/2015/03/05/shield-your-kibana-dashboards/
Any help would be highly appreciated
Did you install elasticsearch from a package (like a RPM or DEB)? If so, there may be an issue with the esusers tool putting the users in the wrong place. Right now, you have to configure your environment with the right location and add the users. If this is the case, you can move the $ES_HOME/config/shield directory to /etc/elasticsearch, which is the default configuration directory for RPM and DEB installations. When using the esusers commands in the future, just make sure the environment variables are set like shown in the link.
You can also remove Shield and start the install over following the full getting started guide and then start modifying the files as mentioned in the blog. To remove the existing Shield install: bin/plugin -r shield

Specify Routing on Index Alias's Term Lookup Filter

I am using Logstash, ElasticSearch and Kibana to allow multiple users to log in and view the log data they have forwarded. I have created index aliases for each user. These restrict their results to contain only their own data.
I'd like to assign users to groups, and allow users to view data for the computers in their group. I created a parent-child relationship between the groups and the users, and I created a term lookup filter on the alias.
My problem is, I receive a RoutingMissingException when I try to apply the alias.
Is there a way to specify the routing for the term lookup filter? How can I lookup terms on a parent document?
I posted the mapping and alias below, but a full gist recreation is available at this link.
curl -XPUT 'http://localhost:9200/accesscontrol/' -d '{
  "mappings" : {
    "group" : {
      "properties" : {
        "name" : { "type" : "string" },
        "hosts" : { "type" : "string" }
      }
    },
    "user" : {
      "_parent" : { "type" : "group" },
      "_routing" : { "required" : true, "path" : "group_id" },
      "properties" : {
        "name" : { "type" : "string" },
        "group_id" : { "type" : "string" }
      }
    }
  }
}'
# Create the logstash alias for cvializ
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
  "actions" : [
    { "remove" : { "index" : "logstash-2014.04.25", "alias" : "cvializ-logstash-2014.04.25" } },
    {
      "add" : {
        "index" : "logstash-2014.04.25",
        "alias" : "cvializ-logstash-2014.04.25",
        "routing" : "intern",
        "filter": {
          "terms" : {
            "host" : {
              "index" : "accesscontrol",
              "type" : "user",
              "id" : "cvializ",
              "path" : "group.hosts"
            },
            "_cache_key" : "cvializ_hosts"
          }
        }
      }
    }
  ]
}'
In attempting to find a workaround for this error, I submitted a bug to the ElasticSearch team, and received an answer from them. It was a bug in ElasticSearch where the filter is applied before the dynamic mapping, causing some erroneous output. I've included their workaround below:
PUT /accesscontrol/group/admin
{
  "name" : "admin",
  "hosts" : ["computer1","computer2","computer3"]
}
PUT /_template/admin_group
{
  "template" : "logstash-*",
  "aliases" : {
    "template-admin-{index}" : {
      "filter" : {
        "terms" : {
          "host" : {
            "index" : "accesscontrol",
            "type" : "group",
            "id" : "admin",
            "path" : "hosts"
          }
        }
      }
    }
  },
  "mappings": {
    "example" : {
      "properties": {
        "host" : {
          "type" : "string"
        }
      }
    }
  }
}
POST /logstash-2014.05.09/example/1
{
  "message":"my sample data",
  "@version":"1",
  "@timestamp":"2014-05-09T16:25:45.613Z",
  "type":"example",
  "host":"computer1"
}
GET /template-admin-logstash-2014.05.09/_search
