Initial script for Elasticsearch - elasticsearch

Is it possible to create an initial script for Elasticsearch?
For example, I prepare one JSON file that creates an index and indexes 20 users and 20 books.
I want to load it with a single request.
Example file:
PUT eyes
{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"_doc" : {
"properties" : {
"name" : { "type" : "text" },
"color" : { "type" : "text" }
}
}
}
}
PUT eyes/_doc/1
{
"name": "XXX",
"color" : "red"
}
PUT eyes/_doc/2
{
"name": "XXXX",
"color" : "blue"
}

You can use the bulk API to populate your index in a single call.
https://www.elastic.co/guide/en/elasticsearch/reference/master/docs-bulk.html
PUT /eyes/_doc/_bulk
{"index":{"_id":1}}
{"name":"XXX","color":"red"}
{"index":{"_id":2}}
{"name":"XXX","color":"blue"}
{"index":{"_id":3}}
{"name":"XXX","color":"green"}
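If the documents already live in a file or a list, the bulk body can be generated instead of typed by hand. The following is only a minimal sketch in Python using the standard library; the helper name build_bulk_body and the sample dictionary are assumptions made for illustration. It produces the newline-delimited JSON shape used above: one action line, then one source line per document, with a trailing newline at the end.

```python
import json


def build_bulk_body(docs):
    """Build a newline-delimited JSON body for the bulk API.

    `docs` maps document ids to source dicts. Each document becomes two
    lines: an "index" action line, then the document source itself.
    """
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline character.
    return "\n".join(lines) + "\n"


# Hypothetical sample data mirroring the documents above.
eyes = {
    1: {"name": "XXX", "color": "red"},
    2: {"name": "XXX", "color": "blue"},
}

body = build_bulk_body(eyes)
print(body, end="")
```

The resulting string can then be sent as the request body of the bulk call shown above, typically with the Content-Type header set to application/x-ndjson.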

Related

How to use nested in Elasticsearch 7.10

The following are the steps I used to set up a nested field in Elasticsearch.
First step:
curl -XPUT 'localhost:9200/my_index/my_type/1?pretty' -d'
{
"group" : "fans",
"user" : [ // 1
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}'
Second step:
curl -XPUT 'localhost:9200/my_index?pretty' -d'
{
"mappings": {
"my_type": {
"properties": {
"user": {
"type": "nested" // 1
}
}
}
}
}'
Before I copied the code, I deleted all the indices on my machine.
However, after running step 2, something went wrong, as shown below.
{
"error" : {
"root_cause" : [
{
"type" : "resource_already_exists_exception",
"reason" : "index [my_index/yHhgr8iEQqGnHo5Ugex2dA] already exists",
"index_uuid" : "yHhgr8iEQqGnHo5Ugex2dA",
"index" : "my_index"
}
],
"type" : "resource_already_exists_exception",
"reason" : "index [my_index/yHhgr8iEQqGnHo5Ugex2dA] already exists",
"index_uuid" : "yHhgr8iEQqGnHo5Ugex2dA",
"index" : "my_index"
},
"status" : 400
}
I really don't know what to do about this. (I have also tried creating the nested field first. It also went wrong.)
I'm new to Elasticsearch and really need help. Thank you very much!
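Since you are using Elasticsearch version 7.10, you cannot include a mapping type (such as my_type) in the index mapping definition. Refer to this to know more about the removal of mapping types.
Your first step indexed a document, which automatically created my_index with a dynamic mapping; that is why the create-index request in step 2 fails with resource_already_exists_exception. You cannot change the type of a field that is already mapped, so you need to delete the index and index the data again with the new mapping, or reindex into a new index with the new mapping.
You need to first create the index with the following mapping: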
PUT /my_index
{
"mappings": {
"properties": {
"user": {
"type": "nested" // 1
}
}
}
}
Then index the documents into the index. Refer to this official documentation to learn more about the nested type.
PUT /my_index/_doc/1
{
"group" : "fans",
"user" : [ // 1
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
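To see why the nested mapping matters for this document, here is a rough Python illustration (not Elasticsearch code) of the difference between the default object handling and the nested type. The function names are made up for this sketch, and it ignores text analysis entirely; it only mimics the idea that, without nested, the inner objects are flattened into multi-valued fields and the pairing between first and last is lost.

```python
def flatten_objects(doc, field):
    """Mimic how an array of objects is indexed WITHOUT the nested type:
    each inner key becomes a flat multi-valued field, so the pairing
    between values of the same inner object is lost."""
    flat = {}
    for obj in doc[field]:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def matches_flat(doc, field, first, last):
    """Match on the flattened view: both values only need to occur somewhere."""
    flat = flatten_objects(doc, field)
    return first in flat[f"{field}.first"] and last in flat[f"{field}.last"]


def matches_nested(doc, field, first, last):
    """Match on the nested view: both values must occur in the same object."""
    return any(o["first"] == first and o["last"] == last for o in doc[field])


doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}

print(matches_flat(doc, "user", "Alice", "Smith"))    # True: cross-object match
print(matches_nested(doc, "user", "Alice", "Smith"))  # False: no such user
```

With the nested mapping in place, a nested query on the path user behaves like matches_nested here, so "Alice Smith" no longer matches this document.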

Aggregating Nested Fields in Kibana / Elasticsearch

I have defined an index in Elasticsearch 6:
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"user": {
"type": "nested"
}
}
}
}
}
and loaded some sample data as follows:
PUT my_index/_doc/1
{
"group" : "coach",
"user" : [
{
"first" : "John",
"last" : "Frank"
},
{
"first" : "Hero",
"last" : "tim"
}
]
}
PUT my_index/_doc/2
{
"group" : "team",
"user" : [
{
"first" : "John",
"last" : "term"
},
{
"first" : "david",
"last" : "gayle"
}
]
}
Now I am trying to search in the Discover page or the Visualize page, but I get blank results.
After a bit of trial and error and googling around, I found that Kibana does not support the nested type for aggregation and search out of the box. To enable this you must install a plugin; the best one I found is listed below.
https://ppadovani.github.io/knql_plugin/overview/
The plugin provides all the features from the discover tab to the visualization tab.
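Independently of Kibana, the Elasticsearch search API itself can aggregate over nested fields through a nested aggregation. The snippet below is a small sketch in Python that only builds such a request body; build_nested_terms_agg is a hypothetical helper, and the field user.first.keyword is an assumption that holds only if first was dynamically mapped as text with a keyword sub-field.

```python
import json


def build_nested_terms_agg(path, field, size=10):
    """Build a search body that counts distinct values of `field`
    inside the nested objects stored under `path`."""
    return {
        "size": 0,  # return only aggregation results, no hits
        "aggs": {
            "nested_docs": {
                "nested": {"path": path},
                "aggs": {
                    "values": {"terms": {"field": field, "size": size}}
                },
            }
        },
    }


body = build_nested_terms_agg("user", "user.first.keyword")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a search request against my_index to check that the nested data is there, even when the Kibana pages show nothing.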

Specify Routing on Index Alias's Term Lookup Filter

I am using Logstash, ElasticSearch and Kibana to allow multiple users to log in and view the log data they have forwarded. I have created index aliases for each user. These restrict their results to contain only their own data.
I'd like to assign users to groups, and allow users to view data for the computers in their group. I created a parent-child relationship between the groups and the users, and I created a term lookup filter on the alias.
My problem is, I receive a RoutingMissingException when I try to apply the alias.
Is there a way to specify the routing for the term lookup filter? How can I lookup terms on a parent document?
I posted the mapping and alias below, but a full gist recreation is available at this link.
curl -XPUT 'http://localhost:9200/accesscontrol/' -d '{
"mappings" : {
"group" : {
"properties" : {
"name" : { "type" : "string" },
"hosts" : { "type" : "string" }
}
},
"user" : {
"_parent" : { "type" : "group" },
"_routing" : { "required" : true, "path" : "group_id" },
"properties" : {
"name" : { "type" : "string" },
"group_id" : { "type" : "string" }
}
}
}
}'
# Create the logstash alias for cvializ
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions" : [
{ "remove" : { "index" : "logstash-2014.04.25", "alias" : "cvializ-logstash-2014.04.25" } },
{
"add" : {
"index" : "logstash-2014.04.25",
"alias" : "cvializ-logstash-2014.04.25",
"routing" : "intern",
"filter": {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "user",
"id" : "cvializ",
"path" : "group.hosts"
},
"_cache_key" : "cvializ_hosts"
}
}
}
}
]
}'
In attempting to find a workaround for this error, I submitted a bug to the ElasticSearch team, and received an answer from them. It was a bug in ElasticSearch where the filter is applied before the dynamic mapping, causing some erroneous output. I've included their workaround below:
PUT /accesscontrol/group/admin
{
"name" : "admin",
"hosts" : ["computer1","computer2","computer3"]
}
PUT /_template/admin_group
{
"template" : "logstash-*",
"aliases" : {
"template-admin-{index}" : {
"filter" : {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "group",
"id" : "admin",
"path" : "hosts"
}
}
}
}
},
"mappings": {
"example" : {
"properties": {
"host" : {
"type" : "string"
}
}
}
}
}
POST /logstash-2014.05.09/example/1
{
"message":"my sample data",
"@version":"1",
"@timestamp":"2014-05-09T16:25:45.613Z",
"type":"example",
"host":"computer1"
}
GET /template-admin-logstash-2014.05.09/_search
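For readers unfamiliar with the terms lookup used in the alias filter, here is a rough Python model of what it does conceptually: fetch one document, read the list at the given path, and keep only the events whose field value is in that list. This is an illustration only, not how Elasticsearch implements it; the function names and the second log event are invented for the sketch.

```python
def get_path(doc, path):
    """Follow a dotted path such as "hosts" or "a.b" through a dict."""
    value = doc
    for key in path.split("."):
        value = value[key]
    return value


def terms_lookup_filter(events, field, lookup_doc, path):
    """Keep only events whose `field` value appears in the list found
    at `path` inside the looked-up document."""
    allowed = set(get_path(lookup_doc, path))
    return [e for e in events if e.get(field) in allowed]


# Stand-in for the document stored at accesscontrol/group/admin.
admin_group = {"name": "admin", "hosts": ["computer1", "computer2", "computer3"]}

# Hypothetical log events; only the first comes from the example above.
events = [
    {"message": "my sample data", "host": "computer1"},
    {"message": "other data", "host": "computer9"},
]

visible = terms_lookup_filter(events, "host", admin_group, "hosts")
print([e["message"] for e in visible])  # ['my sample data']
```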

Elasticsearch not storing field, what am I doing wrong?

I have something like the following template in my Elasticsearch. I only want certain parts of the data returned, so I turned the source off and explicitly set store for the fields I want.
{
"template_1" : {
"order" : 20,
"template" : "test*",
"settings" : { },
"mappings" : {
"_default_" : {
"_source" : {
"enabled" : false
}
},
"type_1" : {
"mydata" : {
"store" : "yes",
"type" : "string"
}
}
}
}
}
However, when I query the data, I don't get the fields back. The query works, however, if I enable the _source field. I am just starting with Elasticsearch, so I am not quite sure what I am doing wrong. Any help would be appreciated.
Field definitions should be wrapped in the properties section of your mapping:
"type_1" : {
"properties": {
"mydata" : {
"store" : "yes",
"type" : "string"
}
}
}
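If the template is generated by code, the same fix can be applied programmatically. The following is just a sketch in Python; wrap_fields_in_properties is a hypothetical helper, and the rule it uses to recognize a field definition (a key that does not start with an underscore and whose value is a dict with a "type" key) is a heuristic assumed for this example, not something Elasticsearch defines.

```python
import copy


def wrap_fields_in_properties(type_mapping):
    """Return a copy of a type mapping where bare field definitions are
    moved under "properties".

    Heuristic (an assumption of this sketch): a key is treated as a field
    definition if it does not start with "_" and its value is a dict that
    contains a "type" key.
    """
    fixed = copy.deepcopy(type_mapping)
    properties = fixed.setdefault("properties", {})
    for name in list(fixed):
        value = fixed[name]
        is_field = (
            name != "properties"
            and not name.startswith("_")
            and isinstance(value, dict)
            and "type" in value
        )
        if is_field:
            properties[name] = fixed.pop(name)
    return fixed


broken = {"mydata": {"store": "yes", "type": "string"}}
print(wrap_fields_in_properties(broken))
```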

ElasticSearch - Statistical facets on length of the list

I have the following sample mapping:
{
"book" : {
"properties" : {
"author" : { "type" : "string" },
"title" : { "type" : "string" },
"reviews" : {
"properties" : {
"url" : { "type" : "string" },
"score" : { "type" : "integer" }
}
},
"chapters" : {
"include_in_root" : 1,
"type" : "nested",
"properties" : {
"name" : { "type" : "string" }
}
}
}
}
}
I would like to get a facet on number of reviews - i.e. length of the "reviews" array.
For instance, put in words, the results I need are: "100 documents with 10 reviews, 20 documents with 5 reviews, ..."
I'm trying the following statistical facet:
{
"query" : {
"match_all" : {}
},
"facets" : {
"stat1" : {
"statistical" : {"script" : "doc['reviews.score'].values.size()"}
}
}
}
but it keeps failing with:
{
"error" : "SearchPhaseExecutionException[Failed to execute phase [query_fetch], total failure; shardFailures {[mDsNfjLhRIyPObaOcxQo2w][facettest][0]: QueryPhaseExecutionException[[facettest][0]: query[ConstantScore(NotDeleted(cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter#a2a5984b)))],from[0],size[10]: Query Failed [Failed to execute main query]]; nested: PropertyAccessException[[Error: could not access: reviews; in class: org.elasticsearch.search.lookup.DocLookup]
[Near : {... doc[reviews.score].values.size() ....}]
^
[Line: 1, Column: 5]]; }]",
"status" : 500
}
How can I achieve my goal?
ElasticSearch version is 0.19.9.
Here is my sample data:
{
"author" : "Mark Twain",
"title" : "The Adventures of Tom Sawyer",
"reviews" : [
{
"url" : "amazon.com",
"score" : 10
},
{
"url" : "www.barnesandnoble.com",
"score" : 9
}
],
"chapters" : [
{ "name" : "Chapter 1" }, { "name" : "Chapter 2" }
]
}
{
"author" : "Jack London",
"title" : "The Call of the Wild",
"reviews" : [
{
"url" : "amazon.com",
"score" : 8
},
{
"url" : "www.barnesandnoble.com",
"score" : 9
},
{
"url" : "www.books.com",
"score" : 5
}
],
"chapters" : [
{ "name" : "Chapter 1" }, { "name" : "Chapter 2" }
]
}
It looks like you are using curl to execute your query and this curl statement looks like this:
curl localhost:9200/my-index/book -d '{....}'
The problem here is that because you are using apostrophes to wrap the body of the request, you need to escape all apostrophes that it contains. So, your script should become:
{"script" : "doc['\''reviews.score'\''].values.size()"}
or
{"script" : "doc[\"reviews.score\"].values.size()"}
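If the curl command is assembled by a program rather than typed by hand, the quoting can be left to a library. As a hedged example, the Python sketch below uses shlex.quote from the standard library to escape the embedded apostrophes for a POSIX shell; the variable names are arbitrary, and the URL is simply the one used in the commands in this answer.

```python
import json
import shlex

# Request body containing single quotes inside the script string.
body = json.dumps(
    {
        "query": {"match_all": {}},
        "facets": {
            "stat1": {
                "statistical": {"script": "doc['reviews.score'].values.size()"}
            }
        },
    }
)

url = "localhost:9200/test-idx/book/_search?search_type=count&pretty"

# shlex.quote wraps each argument in single quotes and escapes any
# embedded single quotes, so a POSIX shell passes it to curl unchanged.
command = " ".join(["curl", shlex.quote(url), "-d", shlex.quote(body)])
print(command)

# Round trip: splitting the command the way a POSIX shell would
# recovers the original URL and body as single arguments.
assert shlex.split(command) == ["curl", url, "-d", body]
```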
The second issue is that, from your description, it looks like you are looking for a histogram facet or a range facet rather than a statistical facet. So, I would suggest trying something like this:
curl "localhost:9200/test-idx/book/_search?search_type=count&pretty" -d '{
"query" : {
"match_all" : {}
},
"facets" : {
"histo1" : {
"histogram" : {
"key_script" : "doc[\"reviews.score\"].values.size()",
"value_script" : "doc[\"reviews.score\"].values.size()",
"interval" : 1
}
}
}
}'
The third problem is that the script in the facet will be called for every single record in the result list, and if you have a lot of results it might take a really long time. So, I would suggest indexing an additional field called number_of_reviews that your client populates with the number of reviews. Then your query would simply become:
curl "localhost:9200/test-idx/book/_search?search_type=count&pretty" -d '{
"query" : {
"match_all" : {}
},
"facets" : {
"histo1" : {
"histogram" : {
"field" : "number_of_reviews",
"interval" : 1
}
}
}
}'
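On the client side, populating number_of_reviews before indexing could look roughly like the Python sketch below. The helper add_review_count is a made-up name, the documents are trimmed copies of the sample data from the question, and the Counter at the end only imitates locally what a histogram with interval 1 on that field would count.

```python
from collections import Counter


def add_review_count(book):
    """Return a copy of the document with a precomputed number_of_reviews."""
    enriched = dict(book)
    enriched["number_of_reviews"] = len(book.get("reviews", []))
    return enriched


books = [
    {"title": "The Adventures of Tom Sawyer",
     "reviews": [{"url": "amazon.com", "score": 10},
                 {"url": "www.barnesandnoble.com", "score": 9}]},
    {"title": "The Call of the Wild",
     "reviews": [{"url": "amazon.com", "score": 8},
                 {"url": "www.barnesandnoble.com", "score": 9},
                 {"url": "www.books.com", "score": 5}]},
]

enriched = [add_review_count(b) for b in books]

# Client-side equivalent of a histogram with interval 1 on number_of_reviews.
histogram = Counter(b["number_of_reviews"] for b in enriched)
print(sorted(histogram.items()))  # [(2, 1), (3, 1)]
```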
