Elasticsearch - Search fails for 2 string fields

I am running into a search problem and need some help. I have an article index with id, title, artist, and genre fields. When I run this query I get zero results:
POST /d3acampaign/article/_search
{
"query": {
"filtered": {
"query": {
"match": {"genre": "metal"}
},
"filter": {
"term": {"artist": "Somanath"}
}
}
}
}
But if I change the query to filter on id instead:
POST /d3acampaign/article/_search
{
"query": {
"filtered": {
"query": {
"match": {"genre": "metal"}
},
"filter": {
"term": {"id": "7"}
}
}
}
}
I get the following result:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.4054651,
"hits": [
{
"_index": "d3acampaign",
"_type": "article",
"_id": "7",
"_score": 1.4054651,
"_source": {
"id": "7",
"title": "The Last Airbender",
"artist": "Somanath",
"genre": "metal"
}
}
]
}
}
Clarification: the search fails whenever I filter on a string field such as artist or title.

You get empty hits because a term query matches the exact term stored in the index, including its case, while the default analyzer lowercases text at index time. The index therefore holds somanath, not Somanath. From the documentation:
There are many ways to analyze text: the default standard analyzer drops most punctuation, breaks up text into individual words, and lower cases them.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
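The mismatch can be reproduced without a cluster. The following is a minimal Python sketch, not Elasticsearch's actual implementation: standard_analyze is a hypothetical stand-in that only splits on non-alphanumeric characters and lowercases, and the helper names are made up for illustration.

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def build_inverted_index(docs, field):
    """Map each analyzed token of `field` to the set of document ids containing it."""
    index = {}
    for doc_id, doc in docs.items():
        for token in standard_analyze(doc[field]):
            index.setdefault(token, set()).add(doc_id)
    return index

def term_lookup(index, term):
    """Like a term query: the term is looked up as-is, with no analysis."""
    return index.get(term, set())

def match_lookup(index, text):
    """Like a match query: the query text is analyzed before the lookup."""
    hits = set()
    for token in standard_analyze(text):
        hits |= index.get(token, set())
    return hits

docs = {"7": {"artist": "Somanath", "genre": "metal"}}
artist_index = build_inverted_index(docs, "artist")

print(term_lookup(artist_index, "Somanath"))   # exact term is not in the index
print(term_lookup(artist_index, "somanath"))   # lowercased term is
print(match_lookup(artist_index, "Somanath"))  # analyzed query text matches
```

The id filter in the question works because "7" is unchanged by lowercasing, so the indexed term and the query term are identical.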
The solution is to lowercase the term in the filter:
{
"query": {
"filtered": {
"query": {
"match": {"genre": "metal"}
},
"filter": {
"term": {"artist": "somanath"}
}
}
}
}
The other solution is to change the mapping so that the field is not_analyzed, which indexes the value exactly as given.
The full example:
#!/bin/bash
curl -XDELETE "localhost:9200/testindex"
curl -XPUT "localhost:9200/testindex/?pretty" -d '
{
"mappings":{
"test":{
"properties":{
"name":{
"index":"not_analyzed",
"type":"string"
}
}
}
}
}'
curl -XGET "localhost:9200/testindex/_mapping?pretty"
curl -XPOST "localhost:9200/testindex/test/1" -d '{
"name": "Jack"
}'
sleep 1
echo -e
echo -e
echo -e
echo -e "Filtered Query Search in not_analyzed index:"
echo -e
curl -XGET "localhost:9200/testindex/test/_search?pretty" -d '{
"query": {
"filtered": {
"filter": {
"term": {"name": "Jack"}
}
}
}
}'
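The script above uses the mapping syntax of Elasticsearch 1.x/2.x. From 5.0 on, the string type with "index": "not_analyzed" was replaced by the keyword type. As a small sketch, a hypothetical helper (exact_match_mapping is not part of any client library) that produces either form of the properties block:

```python
import json

def exact_match_mapping(field, legacy=True):
    """Return the `properties` entry that indexes `field` without analysis."""
    if legacy:
        # Elasticsearch 1.x/2.x style, as used in the script above.
        return {field: {"type": "string", "index": "not_analyzed"}}
    # Elasticsearch 5.0+ style: keyword fields are indexed verbatim.
    return {field: {"type": "keyword"}}

print(json.dumps({"properties": exact_match_mapping("name")}, indent=2))
print(json.dumps({"properties": exact_match_mapping("name", legacy=False)}, indent=2))
```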

Related

Elasticsearch filter query does not return any document

I cannot understand why I get nothing back when I query Elasticsearch with a filter like this:
curl -H'content-type: application/json' "localhost:9200/.kibana/_search" -d '{
"query": {
"bool": {
"filter": [
{
"term": {
"type": "index-pattern"
}
}
]
}
}
}'
{"took":0,"timed_out":false,"_shards":{"total":4,"successful":4,"skipped":0,"failed":0},"hits":{"total":{"value":0,"relation":"eq"},"max_score":null,"hits":[]}}
As you can see, the result set is empty.
Yet I do have a document whose "type" field equals "index-pattern":
{
"_index": ".kibana",
"_type": "_doc",
"_id": "index-pattern:c37de740-7e94-11eb-b6c2-4302716621be",
"_score": 0,
"_source": {
"index-pattern": {
"title": "r*",
"timeFieldName": "@timestamp",
"fields": "<omitted - too long>"
},
"type": "index-pattern",
"references": [],
"migrationVersion": {
"index-pattern": "7.6.0"
},
"updated_at": "2021-03-06T15:58:18.062Z"
}
}
What's wrong with my query?
If the type field is mapped as text (the default for strings), the hyphen prevents a term query from matching: text is analyzed by the standard analyzer, which splits on hyphens and other special characters at index time, so index-pattern is stored as the two terms index and pattern. A term query, on the other hand, only returns documents that contain the exact term (special characters included), which is why your original query returned nothing.
So target the .keyword multi-field instead:
curl -H'content-type: application/json' "localhost:9200/.kibana/_search" -d '{
"query": {
"bool": {
"filter": [
{
"term": {
"type.keyword": "index-pattern"
}
}
]
}
}
}'
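To see why the hyphen matters, here is a rough Python sketch. text_tokens only approximates the standard analyzer, and whether a type.keyword sub-field exists depends on the index mapping, so treat that field name as an assumption.

```python
import re

def text_tokens(value):
    """Approximation of how a text field is indexed: split on punctuation, lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", value)]

def keyword_tokens(value):
    """A keyword field stores the whole value as one term."""
    return [value]

def term_filter(field, value):
    """Build a bool/filter/term request body for the given field."""
    return {"query": {"bool": {"filter": [{"term": {field: value}}]}}}

print(text_tokens("index-pattern"))     # two separate terms
print(keyword_tokens("index-pattern"))  # one exact term
print(term_filter("type.keyword", "index-pattern"))
```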

How to include search suggestions grouped by entity type?

How would you search an index in Elasticsearch so that the results include the different matching indexes/entities? I need complex search suggestions, grouped per entity. An image speaks a thousand words, so the following image describes pretty much what I want to achieve:
How should I model my indexes to achieve the above?
Right now my order index looks like this:
{
"_index": "mango",
"_type": "order",
"_id": "4",
"_score": 1,
"_source": {
"number": "000000004",
"customer": {
"id": 14,
"firstName": "Jean",
"lastName": "Hermann",
"email": "lucinda90@example.com"
}
}
}
When I search for the text example.com, I need a response that looks somewhat like this (hits left out for readability):
{
"hits": {
"hits": []
},
"aggregations": {
"customers": [
{
"id": 1,
"firstName": "Mille",
"lastName": "VonRueden",
"email": "shickle@example.com"
},
{
"id": 2,
"firstName": "Clint",
"lastName": "Effertz",
"email": "briana91@example.com"
}
]
}
}
What would my search query look like to get such a response?
I have tried the following search query, but it just returns an empty bucket:
{
"size": 1,
"aggs": {
"customers": {
"nested": {
"path": "customer"
},
"aggs": {
"name": {
"terms": {
"field": "customer.id"
}
}
}
}
}
}
This is the mapping of my order index (in YAML format):
order:
mappings:
number: ~
createdAt:
type: date
customer:
type: nested
properties:
id :
type : integer
index: not_analyzed
firstName:
type: string
index: not_analyzed
The easiest would be to have one index and mapping type per entity. The screenshot you are showing could be modeled like this:
index: companies and mapping type company
index: groups and mapping type group
index: features and mapping type feature
index: skills and mapping type skill
Here are some sample commands you'd use to create those indices and mapping types for each of the entities:
curl -XPUT localhost:9200/companies -d '{
"mappings": {
"company": { "properties": { ... }}
}
}'
curl -XPUT localhost:9200/groups -d '{
"mappings": {
"group": { "properties": { ... }}
}
}'
curl -XPUT localhost:9200/features -d '{
"mappings": {
"feature": { "properties": { ... }}
}
}'
curl -XPUT localhost:9200/skills -d '{
"mappings": {
"skill": { "properties": { ... }}
}
}'
Then when searching, you'd need to search on all indices and mapping types and order the results by _type (asc) and _score (desc):
curl -XPOST 'localhost:9200/_search?q=blog' -d '{
"sort":[
{"_type":"asc"},
{"_score":"desc"}
]
}'
Finally, you simply need to read your sorted results and show them according to their type.
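That last step could look like the following Python sketch. The hits list is made-up sample data in the shape of the hits.hits array of a search response, and group_hits_by_type is a hypothetical helper.

```python
from itertools import groupby

def group_hits_by_type(hits):
    """Group hit ids by _type, best score first within each type."""
    # groupby only merges adjacent items, so sort defensively even though
    # the search request already sorts by _type and _score.
    ordered = sorted(hits, key=lambda h: (h["_type"], -h["_score"]))
    return {
        doc_type: [h["_id"] for h in group]
        for doc_type, group in groupby(ordered, key=lambda h: h["_type"])
    }

# Made-up hits for illustration only.
hits = [
    {"_type": "company", "_id": "c1", "_score": 2.0},
    {"_type": "skill", "_id": "s1", "_score": 1.5},
    {"_type": "company", "_id": "c2", "_score": 3.1},
]

for doc_type, ids in group_hits_by_type(hits).items():
    print(doc_type, ids)
```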
UPDATE
Following up on your comments, if you wish to do it through aggregations, you're on the right path, you'd just need to add a top_hits sub-aggregation:
{
"size": 1,
"aggs": {
"customers": {
"nested": {
"path": "customer"
},
"aggs": {
"name": {
"terms": {
"field": "customer.id"
},
"aggs": { <--- add this
"hits": {
"top_hits": {}
}
}
}
}
}
}
}
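Reading the customers back out of such a response might look like this in Python. This is a sketch that assumes the aggregation names used above (customers, name, hits) and that each top hit's _source carries the customer object. The sample response is made up for illustration and abbreviated.

```python
def customers_from_aggs(response):
    """Collect one customer object per terms bucket from the top_hits sub-aggregation."""
    buckets = response["aggregations"]["customers"]["name"]["buckets"]
    customers = []
    for bucket in buckets:
        top = bucket["hits"]["hits"]["hits"]  # agg name "hits" -> hits -> hits array
        if top:
            customers.append(top[0]["_source"])
    return customers

# Made-up, abbreviated response for illustration only.
response = {
    "aggregations": {
        "customers": {
            "doc_count": 2,
            "name": {
                "buckets": [
                    {"key": 1, "doc_count": 1, "hits": {"hits": {"hits": [
                        {"_source": {"id": 1, "firstName": "Mille", "lastName": "VonRueden"}}
                    ]}}},
                    {"key": 2, "doc_count": 1, "hits": {"hits": {"hits": [
                        {"_source": {"id": 2, "firstName": "Clint", "lastName": "Effertz"}}
                    ]}}},
                ]
            },
        }
    }
}

print(customers_from_aggs(response))
```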

Elastic Search fulltext search query and filters

I wanna perform a full-text search, but I also wanna use one or many possible filters. The simplified structure of my document, when searching with /things/_search?q=*foo*:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "things",
"_type": "thing",
"_id": "63",
"_score": 1,
"fields": {
"name": [
"foo bar"
],
"description": [
"this is my description"
],
"type": [
"inanimate"
]
}
}
]
}
}
This works well enough, but how do I combine filters with a query? Let's say I wanna search for "foo" in an index with multiple documents, but I only want to get those with type == "inanimate"?
This is my attempt so far:
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*foo*"
}
},
"filter": {
"bool": {
"must": {
"term": { "type": "inanimate" }
}
}
}
}
}
}
When I remove the filter part, it returns an accurate set of document hits. But with this filter-definition it does not return anything, even though I can manually verify that there are documents with type == "inanimate".
Since you have not defined an explicit mapping, the type field is analyzed, whereas a term query looks for an exact match against the indexed terms. Add "index": "not_analyzed" to the mapping of the type field and your query will work.
A match query will give you the correct documents:
{
"query": {
"match": {
"type": "inanimate"
}
}
}
But this is only a workaround; the real fix is the explicit mapping described above.
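As a side note, the filtered query used in the attempt was removed in Elasticsearch 5.0; the equivalent is a bool query with must and filter clauses. Below is a sketch of that rewrite in Python, where filtered_to_bool is a hypothetical helper that only handles the simple shape shown in the question:

```python
def filtered_to_bool(body):
    """Rewrite {"query": {"filtered": {...}}} into the equivalent bool query."""
    filtered = body["query"]["filtered"]
    bool_query = {}
    if "query" in filtered:
        bool_query["must"] = filtered["query"]      # scored full-text part
    if "filter" in filtered:
        bool_query["filter"] = filtered["filter"]   # non-scoring filter part
    rest = {k: v for k, v in body.items() if k != "query"}
    return {**rest, "query": {"bool": bool_query}}

old_body = {
    "query": {
        "filtered": {
            "query": {"query_string": {"query": "*foo*"}},
            "filter": {"term": {"type": "inanimate"}},
        }
    }
}

print(filtered_to_bool(old_body))
```

The mapping issue described in the answer applies to both forms: the term filter still needs a field that is indexed without analysis.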

Search query for elasticsearch when child element is array of string

I created a document in Elasticsearch in the following format:
curl -XPUT "http://localhost:9200/my_base.main_candidate/" -d'
{
"specific_location": {
"location_name": "Mumbai",
"location_tags": [
"Mumbai"
],
"tags": [
"Mumbai"
]
}
}'
My requirement is to search for location_tags containing one of the given options like ["Mumbai", "Pune"]. How do I do this?
I tried:
curl -XGET "http://localhost:9200/my_base.main_candidate/_search" -d '
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"terms": {
"specific_location.location_tags" : ["Mumbai"]
}
}
}
}
}'
which didn't work.
I got this output:
{
"took": 72,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
There are several ways to solve this. Perhaps the most immediate one is to search for mumbai instead of Mumbai.
If I create the index with no mapping,
curl -XDELETE "http://localhost:9200/my_base.main_candidate/"
curl -XPUT "http://localhost:9200/my_base.main_candidate/"
then add a doc:
curl -XPUT "http://localhost:9200/my_base.main_candidate/doc/1" -d'
{
"specific_location": {
"location_name": "Mumbai",
"location_tags": [
"Mumbai"
],
"tags": [
"Mumbai"
]
}
}'
then run your query with the lower-case term
curl -XPOST "http://localhost:9200/my_base.main_candidate/_search" -d'
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"terms": {
"specific_location.location_tags": [
"mumbai"
]
}
}
}
}
}'
I get back the expected doc:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "my_base.main_candidate",
"_type": "doc",
"_id": "1",
"_score": 1,
"_source": {
"specific_location": {
"location_name": "Mumbai",
"location_tags": [
"Mumbai"
],
"tags": [
"Mumbai"
]
}
}
}
]
}
}
This is because, since no explicit mapping was used, Elasticsearch uses defaults, which means the location_tags field will be analyzed with the standard analyzer, which will convert terms to lower-case. So the term Mumbai does not exist, but mumbai does.
If you want to be able to use upper-case terms in your query, you will need to set up an explicit mapping that tells Elasticsearch not to analyze the location_tags field. Maybe something like this:
curl -XDELETE "http://localhost:9200/my_base.main_candidate/"
curl -XPUT "http://localhost:9200/my_base.main_candidate/" -d'
{
"mappings": {
"doc": {
"properties": {
"specific_location": {
"properties": {
"location_tags": {
"type": "string",
"index": "not_analyzed"
},
"tags": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}'
curl -XPUT "http://localhost:9200/my_base.main_candidate/doc/1" -d'
{
"specific_location": {
"location_name": "Mumbai",
"location_tags": [
"Mumbai"
],
"tags": [
"Mumbai"
]
}
}'
curl -XPOST "http://localhost:9200/my_base.main_candidate/_search" -d'
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"terms": {
"specific_location.location_tags": [
"Mumbai"
]
}
}
}
}
}'
Here is all the above code in a handy place:
http://sense.qbox.io/gist/74844f4d779f7c2b94a9ab65fd76eb0ffe294cbb
[EDIT: by the way, I used Elasticsearch 1.3.4 when testing the above code]
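The behaviour of the terms filter on an array field, with and without analysis, can be mimicked in a few lines of Python. This is only a sketch: index_terms approximates analysis by lowercasing whitespace-separated words, and the helper names are hypothetical.

```python
def index_terms(values, analyzed):
    """Terms stored for an array field: lowercased words if analyzed, raw values otherwise."""
    if analyzed:
        return {tok.lower() for v in values for tok in v.split()}
    return set(values)

def terms_filter(docs, field_path, options, analyzed):
    """Return ids of docs whose field holds at least one of the given options."""
    matched = []
    for doc_id, doc in docs.items():
        value = doc
        for key in field_path.split("."):
            value = value[key]
        if index_terms(value, analyzed) & set(options):
            matched.append(doc_id)
    return matched

docs = {"1": {"specific_location": {"location_tags": ["Mumbai"]}}}
path = "specific_location.location_tags"

print(terms_filter(docs, path, ["Mumbai", "Pune"], analyzed=True))   # default mapping
print(terms_filter(docs, path, ["mumbai", "pune"], analyzed=True))   # lowercased options
print(terms_filter(docs, path, ["Mumbai", "Pune"], analyzed=False))  # not_analyzed mapping
```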

Return the most recent record from ElasticSearch index

I would like to return the most recent record (top 1) from an Elasticsearch index, similar to the SQL query below:
SELECT TOP 1 Id, name, title
FROM MyTable
ORDER BY Date DESC;
Can this be done?
Do you have _timestamp enabled in your doc mapping?
{
"doctype": {
"_timestamp": {
"enabled": "true",
"store": "yes"
},
"properties": {
...
}
}
}
You can check your mapping here:
http://localhost:9200/_all/_mapping
If so I think this might work to get most recent:
{
"query": {
"match_all": {}
},
"size": 1,
"sort": [
{
"_timestamp": {
"order": "desc"
}
}
]
}
For information: _timestamp has been deprecated since 2.0.0-beta2.
Use a date field in your mapping instead.
A simple date mapping, taken from the date datatype documentation:
{
"mappings": {
"my_type": {
"properties": {
"date": {
"type": "date"
}
}
}
}
}
You can also add a format field in date:
{
"mappings": {
"my_type": {
"properties": {
"date": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
}
}
}
}
}
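The || in the format string separates alternatives that are tried in turn. A rough Python equivalent of the three formats above is sketched below. parse_es_date is a hypothetical helper, and treating values without a zone as UTC is an assumption of this sketch.

```python
from datetime import datetime, timezone

def parse_es_date(value):
    """Parse a value against yyyy-MM-dd HH:mm:ss || yyyy-MM-dd || epoch_millis."""
    # epoch_millis: a number (or digit string) of milliseconds since 1970-01-01 UTC.
    if isinstance(value, str) and value.isdigit():
        value = int(value)
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value / 1000, tz=timezone.utc)
    # Try the two textual patterns in the order they appear in the mapping.
    for pattern in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, pattern).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"no configured format matches {value!r}")

print(parse_es_date("2015-01-01 12:10:30"))
print(parse_es_date("2015-01-01"))
print(parse_es_date(1420070400000))
```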
Get the last ID by date (without _timestamp)
Sample URL : http://localhost:9200/deal/dealsdetails/
Method : POST
Query :
{
"fields": ["_id"],
"sort": [{
"created_date": {
"order": "desc"
}
},
{
"_score": {
"order": "desc"
}
}
],
"size": 1
}
result:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 9,
"max_score": null,
"hits": [{
"_index": "deal",
"_type": "dealsdetails",
"_id": "10",
"_score": 1,
"sort": [
1478266145174,
1
]
}]
}
}
You can use sort on date field and size=1 parameter.
Does it help?
If you are using the Python elasticsearch5 module or curl, make sure that:
each document that gets inserted has a timestamp field mapped as a date,
and the timestamp value increases monotonically from one document to the next.
From Python you can then do:
es = elasticsearch5.Elasticsearch('my_host:my_port')
es.search(
index='my_index',
size=1,
sort='my_timestamp:desc'
)
If your documents are not inserted with a field mapped as a date, then I don't believe you can get the N "most recent".
Since this question was originally asked and answered, some of the inner workings of Elasticsearch have changed, particularly around timestamps. Here is a full example showing how to query for the single latest record. Tested on ES 6/7.
1) Tell Elasticsearch to map the timestamp field as a date
curl -XPUT "localhost:9200/my_index?pretty" -H 'Content-Type: application/json' -d '{"mappings":{"message":{"properties":{"timestamp":{"type":"date"}}}}}'
2) Put some test data into the index
curl -XPOST "localhost:9200/my_index/message/1" -H 'Content-Type: application/json' -d '{ "timestamp" : "2019-08-02T03:00:00Z", "message" : "hello world" }'
curl -XPOST "localhost:9200/my_index/message/2" -H 'Content-Type: application/json' -d '{ "timestamp" : "2019-08-02T04:00:00Z", "message" : "bye world" }'
3) Query for the latest record
curl -X POST "localhost:9200/my_index/_search" -H 'Content-Type: application/json' -d '{"query": {"match_all": {}},"size": 1,"sort": [{"timestamp": {"order": "desc"}}]}'
4) Expected results
{
"took":0,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"skipped":0,
"failed":0
},
"hits":{
"total":2,
"max_score":null,
"hits":[
{
"_index":"my_index",
"_type":"message",
"_id":"2",
"_score":null,
"_source":{
"timestamp":"2019-08-02T04:00:00Z",
"message":"bye world"
},
"sort":[
1564718400000
]
}
]
}
}
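The sort value in the response is the timestamp in epoch milliseconds. The Python sketch below, with hypothetical helper names, reproduces that value from the test data and picks the latest document client-side, which is what sort desc with size 1 does on the server.

```python
from datetime import datetime, timezone

def to_epoch_millis(iso_utc):
    """Convert a timestamp like 2019-08-02T04:00:00Z to epoch milliseconds."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

def latest(docs, field="timestamp"):
    """Client-side equivalent of sorting by `field` descending with size 1."""
    return max(docs, key=lambda d: to_epoch_millis(d[field]))

docs = [
    {"timestamp": "2019-08-02T03:00:00Z", "message": "hello world"},
    {"timestamp": "2019-08-02T04:00:00Z", "message": "bye world"},
]

newest = latest(docs)
print(newest["message"], to_epoch_millis(newest["timestamp"]))
```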
I used @timestamp instead of _timestamp:
{
"size": 1,
"query": {
"match_all": {}
},
"sort": [{"@timestamp": {"order": "desc"}}]
}
_timestamp didn't work for me, but this query does (as in mconlin's answer, with @timestamp as the sort field):
{
"query": {
"match_all": {}
},
"size": "1",
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
]
}
This may be trivial, but the _timestamp answer gave neither an error nor a useful result.
Hope this helps someone.
(kibana/elastic 5.0.4)
