Elasticsearch How do I get a metadata using Image Plugin - image

I defined matadata by the mapping of the Elasticsearch image Plugin.
Mapping:
"photo" : {
"mappings" : {
"scenery" : {
"properties" : {
"my_img" : {
"type" : "image",
"feature" : {"FCTH" : { }, ... },
"metadata" : {
"jpeg.image_height" : {"type" : "string","store" : true},
"jpeg.image_width" : {"type" : "string","store" : true}
}
}
}
}
}
}
After an index, although searched, metadata does not return.
How do I get a metadata?
I tried:
curl -XPOST 'localhost:9200/photo/scenery/_search' -d '{
"query":{
"image":{
"my_img":{
"feature":"CEDD",
"index":"photo",
"type":"scenery",
"id":"0",
"path":"my_img",
"hash":"BIT_SAMPLING"
}
}
}
}'
Result:
{"took":14,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":5,"max_score":1.0,"hits":[{"_index":"photo","_type":"scenery","_id":"0","_score":1.0, "_source" : {"file_name": "376423.jpg", "my_img": "/9j/4AAQSkZJRgABAQ...

Perhaps, the original data (base64 encoded image) will be returned _source field. You can use that instead, the fields option.
Try this query.
curl -XPOST 'localhost:9200/photo/scenery/_search' -d '{
"query":{
...
},
"fields": ["my_img.metadata.jpeg.image_height","my_img.metadata.jpeg.image_width" ]
}'

Related

Setting enabled to true Elasticsearch

I'm new to elasticsearch. I have an index type as follows
{
"myindex" : {
"mappings" : {
"systemChanges" : {
"_all" : {
"enabled" : false
},
"properties" : {
"autoChange" : {
"type" : "boolean"
},
"changed" : {
"type" : "object",
"enabled" : false
},
"created" : {
"type" : "date",
"format" : "strict_date_optional_time||epoch_millis"
}
}
}
}
}
}
I'm unable to fetch the details having changed.new = completed. After some research i have found that it's because the changed field is set to enabled : false. and I need to change the same. I tried as follows
curl -X PUT "localhost:9200/myindex/" -H 'Content-Type: application/json' -d' {
"mappings": {
"systemChanges" : {
"properties" : {
"changed" : {
"enabled" : true
}
}
}
}
}'
But I'm getting error as following.
{"error":{"root_cause":[{"type":"index_already_exists_exception","reason":"already exists","index":"myindex"}],"type":"index_already_exists_exception","reason":"already exists","index":"myindex"},"status":400}
How can I change the enabled to true in order to fetch the details of the changed.new field?
you are trying to add an index again with the same name and hence the error.
See the below link for updating a mapping
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
The enabled setting can be updated on existing fields using the PUT mapping API.

How do I configure elastic search to use the icu_tokenizer?

I'm trying to search a text indexed by elasticsearch and the icu_tokenizer but can't get it working.
My testcase is to tokenize the sentence “Hello. I am from Bangkok”, in thai สวัสดี ผมมาจากกรุงเทพฯ, which should be tokenized to the five words สวัสดี, ผม, มา, จาก, กรุงเทพฯ. (Sample from Elasticsearch - The Definitive Guide)
Searching using any of the last four words fails for me. Searching using any of the space separated words สวัสดี or ผมมาจากกรุงเทพฯ works fine.
If I specify the icu_tokenizer on the command line, like
curl -XGET 'http://localhost:9200/icu/_analyze?tokenizer=icu_tokenizer' -d "สวัสดี ผมมาจากกรุงเทพฯ"
it tokenizes to five words.
My settings are:
curl http://localhost:9200/icu/_settings?pretty
{
"icu" : {
"settings" : {
"index" : {
"creation_date" : "1474010824865",
"analysis" : {
"analyzer" : {
"nfkc_cf_normalized" : [ "icu_normalizer" ],
"tokenizer" : "icu_tokenizer"
}
}
},
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "tALRehqIRA6FGPu8iptzww",
"version" : {
"created" : "2040099"
}
}
}
}
The index is populated with
curl -XPOST 'http://localhost:9200/icu/employee/' -d '
{
"first_name" : "John",
"last_name" : "Doe",
"about" : "สวัสดี ผมมาจากกรุงเทพฯ"
}'
Searching with
curl -XGET 'http://localhost:9200/_search' -d'
{
"query" : {
"match" : {
"about" : "กรุงเทพฯ"
}
}
}'
Returns nothing ("hits" : [ ]).
Performing the same search with one of สวัสดี or ผมมาจากกรุงเทพฯ works fine.
I guess I've misconfigured the index, how should it be done?
The missing part is:
"mappings": {
"employee" : {
"properties": {
"about":{
"type": "text",
"analyzer": "icu_analyzer"
}
}
}
}
In the mapping, the document field have to be specified the analyzer to be using.
[Index] : icu
[type] : employee
[field] : about
PUT /icu
{
"settings": {
"analysis": {
"analyzer": {
"icu_analyzer" : {
"char_filter": [
"icu_normalizer"
],
"tokenizer" : "icu_tokenizer"
}
}
}
},
"mappings": {
"employee" : {
"properties": {
"about":{
"type": "text",
"analyzer": "icu_analyzer"
}
}
}
}
}
test the custom analyzer using followings DSLJson
POST /icu/_analyze
{
"text": "สวัสดี ผมมาจากกรุงเทพฯ",
"analyzer": "icu_analyzer"
}
The result should be [สวัสดี, ผม, มา, จาก, กรุงเทพฯ]
My suggestion would be :
Kibana : Dev Tool could help you for effective query crafting

Elasticsearch does not found an existing document using a the DSL

I dont know why, using the URI Search way to search a document is returning the right document, but the document is not found if I use the API DSL.
To reproduce the issue:
Without any index created, I insert this document:
curl http://localhost:9299/integrationtest-index/searchable/ID_XXXX2 -d '{ "ref" : "XXXX2", "field1" : "value1" }'
So the index is created automatically with the default mapping (type searchable):
curl http://localhost:9299/integrationtest-index?pretty
{
"integrationtest-index" : {
"aliases" : { },
"mappings" : {
"searchable" : {
"properties" : {
"field1" : {
"type" : "string"
},
"ref" : {
"type" : "string"
}
}
}
},
"settings" : {
"index" : {
"field1" : "value1",
"ref" : "XXXX2",
"number_of_shards" : "5",
"creation_date" : "1466780216631",
"number_of_replicas" : "1",
"uuid" : "GBj2VF-wQy6JP74AqoIn5g",
"version" : {
"created" : "2020099"
}
}
},
"warmers" : { }
}
}
This query return one document:
curl http://localhost:9299/integrationtest-index/searchable/_search?q=ref:XXXX2
But this other query response that does not exist:
curl -XPOST http://localhost:9299/integrationtest-index/searchable/_search/exists -d '
{
"query": {
"term" : {
"ref" : "XXXX2"
}
}
}'
Why the last query said that the document does not exist?
Environment:
ElasticSearch 2.2.0
Ubuntu 16.04 LTS
OpenJDK Runtime Environment (build 1.8.0_91-8u91-b14-0ubuntu4~16.04.1-b14)
I have the same problem every few months, so I decided response myself and share my stupids errors.
By default, elasticsearch use index:analyzed, so the query with term does not found any document.
If you use the URI Search way, elasticsearch is executing a query_string and not a term query.
This query is working:
curl -XPOST http://localhost:9299/integrationtest-index/searchable/_search/exists -d '
{
"query": {
"match" : {
"ref" : "XXXX2"
}
}
}'
More information in the documentation, in the section Why doesn’t the term query match my document?

How to tell ElasticSearch to create nested fields

I'm putting data in ES and check the mapping which is created,
I'm executing this:
curl -XPOST 'http://localhost:9200/testnested2/type1/0' -d '{
"p1": ["1","2","3","4"],
"users" : {
"first" : "John",
"last" : "Sm11ith"
}
}'
and this is the schema it created:
{
"testnested2":{
"mappings":{
"type1":{
"properties":{
"p1":{"type":"string"},
"users":{
"properties":{
"first":{"type":"string"},
"last":{"type":"string"}
}
}
}
}
}
}
}
I was wondering if it's possible to tell it that "users" is nested or I have to create the mapping for myself.
I would like that ES could create an shema like this:
curl -XPOST http://180.5.5.93:9200/testnested3 -d '{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"type1" : {
"properties" : {
"propiedad1" : { "type" : "string"},
"users" : {
"type" : "nested",
"include_in_parent": true,
"properties": {
"first" : {"type": "string" },
"last" : {"type": "string" }
}
}
}
}
}
}'
By default, the dynamic mapping feature of ElasticSearch will map users as an object instead of nested.
If you want to tune this behavior, you have to define explicitely a users attribute as nested either in :
the type1 mapping
the default mapping of your index. This way, for any type created, the users attribute will be set automatically to nested(see here for default mapping information)

Specify Routing on Index Alias's Term Lookup Filter

I am using Logstash, ElasticSearch and Kibana to allow multiple users to log in and view the log data they have forwarded. I have created index aliases for each user. These restrict their results to contain only their own data.
I'd like to assign users to groups, and allow users to view data for the computers in their group. I created a parent-child relationship between the groups and the users, and I created a term lookup filter on the alias.
My problem is, I receive a RoutingMissingException when I try to apply the alias.
Is there a way to specify the routing for the term lookup filter? How can I lookup terms on a parent document?
I posted the mapping and alias below, but a full gist recreation is available at this link.
curl -XPUT 'http://localhost:9200/accesscontrol/' -d '{
"mappings" : {
"group" : {
"properties" : {
"name" : { "type" : "string" },
"hosts" : { "type" : "string" }
}
},
"user" : {
"_parent" : { "type" : "group" },
"_routing" : { "required" : true, "path" : "group_id" },
"properties" : {
"name" : { "type" : "string" },
"group_id" : { "type" : "string" }
}
}
}
}'
# Create the logstash alias for cvializ
curl -XPOST 'http://localhost:9200/_aliases' -d '
{
"actions" : [
{ "remove" : { "index" : "logstash-2014.04.25", "alias" : "cvializ-logstash-2014.04.25" } },
{
"add" : {
"index" : "logstash-2014.04.25",
"alias" : "cvializ-logstash-2014.04.25",
"routing" : "intern",
"filter": {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "user",
"id" : "cvializ",
"path" : "group.hosts"
},
"_cache_key" : "cvializ_hosts"
}
}
}
}
]
}'
In attempting to find a workaround for this error, I submitted a bug to the ElasticSearch team, and received an answer from them. It was a bug in ElasticSearch where the filter is applied before the dynamic mapping, causing some erroneous output. I've included their workaround below:
PUT /accesscontrol/group/admin
{
"name" : "admin",
"hosts" : ["computer1","computer2","computer3"]
}
PUT /_template/admin_group
{
"template" : "logstash-*",
"aliases" : {
"template-admin-{index}" : {
"filter" : {
"terms" : {
"host" : {
"index" : "accesscontrol",
"type" : "group",
"id" : "admin",
"path" : "hosts"
}
}
}
}
},
"mappings": {
"example" : {
"properties": {
"host" : {
"type" : "string"
}
}
}
}
}
POST /logstash-2014.05.09/example/1
{
"message":"my sample data",
"#version":"1",
"#timestamp":"2014-05-09T16:25:45.613Z",
"type":"example",
"host":"computer1"
}
GET /template-admin-logstash-2014.05.09/_search

Resources