Elasticsearch query not returning results - elasticsearch

I have an Elasticsearch query that is not returning data. Here are two examples of the query: the first one works and returns a few records, but the second one returns nothing. What am I missing?
Example 1 works:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "data.case.field1": "ABC123"
    }
  }
}
'
Example 2 not working:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": {
        "term": { "data.case.field1": "ABC123" }
      }
    }
  }
}
'

This is happening because of the difference between match and term queries. A match query is analyzed: the same analyzer that was applied to the field at index time is applied to the search term. A term query is not analyzed; it is meant for exact searches, so the search term does not go through the analysis process.
Official doc of term query
Returns documents that contain an exact term in a provided field.
Official doc of match query
Returns documents that match a provided text, number, date or boolean
value. The provided text is analyzed before matching.
If you are using a text field for data.case.field1 without any explicit analyzer, the default analyzer for text fields (standard) is applied, which lowercases the text and stores the resulting tokens.
For your text, the standard analyzer produces the token below; refer to the Analyze API for more details.
{
  "text": "ABC123",
  "analyzer": "standard"
}
And the generated token:
{
  "tokens": [
    {
      "token": "abc123",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
Now, when you use a term query, the search term is not analyzed and is used as-is. Since it is in capital letters (ABC123), it doesn't match the lowercased token in the index, and hence returns no results.
PS: refer to my other SO answer for more details on term and match queries.
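If the field was dynamically mapped with the defaults, there is usually also a keyword sub-field (data.case.field1.keyword) that stores the value unanalyzed. Assuming that sub-field exists, a term query against it should match the exact value:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": {
        "term": { "data.case.field1.keyword": "ABC123" }
      }
    }
  }
}
'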

What is your mapping for data.case.field1? If it is of type text, you should use a match query instead of term.
See the warning at the top of this page: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html#query-dsl-term-query
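You can check the mapping with the get-mapping API; your-index below is a placeholder for the actual index name:
curl -X GET "localhost:9200/your-index/_mapping?pretty"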

Unless we know whether the mapping type is text or keyword, we are answering somewhat in the dark without all the variables involved. You can try the following; the filter clause is appropriate if the field's datatype is keyword:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": {
        "term": { "data.case.field1": "ABC123" }
      }
    }
  }
}
'

Related

Searching documents indexed via ingest-attachment in elasticsearch

I want to search documents indexed via ingest-attachment in Elasticsearch on the basis of attributes.
I'm new to Elasticsearch and wanted to index files with attributes like Author, Title, Subject, Category, Community, etc.
How far I got:
I was able to create an attachment pipeline and ingest different docs into Elasticsearch with attributes. Here is how I did it:
1) Created the pipeline with the following request:
{
  "description": "Extract attachment information",
  "processors": [
    {
      "attachment": {
        "field": "data"
      }
    }
  ]
}
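(For reference, a pipeline body like this is normally registered under a name of your choosing via the ingest API; attachment_pipeline below is just a placeholder name:)
curl -X PUT "localhost:9200/_ingest/pipeline/attachment_pipeline" -H 'Content-Type: application/json' -d'
{
  "description": "Extract attachment information",
  "processors": [
    { "attachment": { "field": "data" } }
  ]
}
'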
2) Uploaded an attachment with the following document:
{
  "filename": "Presentations-Tips.ppt",
  "Author": "Jaspreet",
  "Category": "uploading ppt",
  "Subject": "testing ppt",
  "fileUrl": "",
  "attributes": {
    "attr11": "attr11value",
    "attr22": "attr22value",
    "attr33": "attr33value"
  },
  "data": "here_base64_string_of_file"
}
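(Such a document is then indexed through the pipeline with the pipeline request parameter; the index, type, id, and pipeline name below are placeholders, and the body is the one shown above, abbreviated here:)
curl -X PUT "localhost:9200/my_index/doc/1?pipeline=attachment_pipeline" -H 'Content-Type: application/json' -d'
{
  "filename": "Presentations-Tips.ppt",
  "Author": "Jaspreet",
  "data": "here_base64_string_of_file"
}
'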
3) Then I was able to search freely on all the above attributes and on the file content as well:
{
  "query": {
    "query_string": {
      "query": "*test*"
    }
  }
}
Now what I want is to narrow down the searches with filters, for example:
search on specific parameters, e.g. all documents whose Author must be "Rohan";
then all whose Author must be "Rohan" and Category must be "Education";
then all whose Author contains letters like "han" and Category contains letters like "Tech";
then all whose Author is "Rohan", combined with a full-text search for "progress" across all fields, i.e. first narrow down by Author and then run the full-text search on that result set.
Please help me with the proper query syntax and call URL; for the full-text search above I used 'GET /my_index/_search'.
After spending some time I finally got the answer:
curl -X POST \
http://localhost:9200/my_index/_search \
-H 'Content-Type: application/json' \
-d '{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "progress"
}
},
{
"wildcard": {
"Author": "Rohan"
}
},
{
"wildcard": {
"Title": "q*"
}
}
]
}
}
}'
Now, in the above, you can remove or add any object in the must array as per your need.
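For example, the "contains"-style requirements (Author containing "han", Category containing "Tech") could be sketched with wildcard patterns. Note that wildcard patterns are not analyzed, so against analyzed text fields they usually need to be lowercase to match the indexed tokens; the field names below are the ones from the indexed document above:
curl -X POST \
  http://localhost:9200/my_index/_search \
  -H 'Content-Type: application/json' \
  -d '{
  "query": {
    "bool": {
      "must": [
        { "wildcard": { "Author": "*han*" } },
        { "wildcard": { "Category": "*tech*" } }
      ]
    }
  }
}'
Keep in mind that leading wildcards can be slow on large indices.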

Cannot get only the number of hits in Elasticsearch

I'm using the _msearch API to send multiple queries to Elasticsearch.
I only need to know how many hits each query generates.
From what I understood, you can set the size parameter to "0" in order to only get the count. However, I still get results with all the found documents. Here is my query:
{"index":"myindex","type":"things","from":0,,"size":0}
{"query":{"bool":{"must":[{"match_all":{}}],"must_not":[],{"match":
{"firstSearch":true}}]}}}, "size" : 0}
{"index":"myindex","type":"things","from":0,,"size":0}
{"query":{"bool":{"must":[{"match_all":{}}],"must_not":[],{"match":
{"secondSearch":true}}]}}}, "size" : 0}
I'm using curl to send the requests, like this:
curl -H "Content-Type: application/x-ndjson" -XGET localhost:9200/_msearch?pretty=1 --data-binary "@requests"; echo
Setting "size" to zero tells Elasticsearch not to return any documents in the hits array; the response still includes the total hit count.
You can additionally let Elasticsearch know that you do not need the document sources by sending "_source" as false.
Example:
{
  "query": { "match_all": {} },
  "_source": false
}
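For _msearch specifically, here is a sketch of the requests file using the index, type, and fields from the question: put "size": 0 inside each search body, and read hits.total from each response to get the counts.
{"index":"myindex","type":"things"}
{"size":0,"query":{"match":{"firstSearch":true}}}
{"index":"myindex","type":"things"}
{"size":0,"query":{"match":{"secondSearch":true}}}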
You can use the _count API:
GET /indexname/type/_count
{
  "query": { "match_all": {} }
}
Please see the documentation for more details: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-count.html
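As a curl call, using the index and type names from the question, that would look roughly like this:
curl -H "Content-Type: application/json" -XGET "localhost:9200/myindex/things/_count?pretty" -d'
{
  "query": { "match": { "firstSearch": true } }
}
'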

Elasticsearch - icu_folding and utf-8 searching

Even after hours of trying to understand Elasticsearch, I cannot figure out how to get the same results when searching text with special characters.
What am I doing wrong with icu_folding? How can I achieve the same results for "Škoda" and "Skoda"? Is it even possible?
https://github.com/pavoltravnik/examples/blob/master/elastic_search_settings.sh
You're applying the icu_folding token filter on the name.sort sub-field and not on the name field itself, so your queries need to be like this instead:
# 1 result, as expected
curl -XGET 'localhost:9200/my_index/_search?pretty' -d'
{
  "query": { "match": { "name.sort": "Škoda" } }
}'
# also 1 result, since icu_folding is applied on name.sort
curl -XGET 'localhost:9200/my_index/_search?pretty' -d'
{
  "query": { "match": { "name.sort": "Skoda" } }
}'
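If you want the plain name field itself to behave this way, a possible sketch (not from the original gist) is to define a custom analyzer that applies icu_folding and assign it to name in the mapping; folded_analyzer and my_type below are placeholder names, and the analysis-icu plugin must be installed:
curl -XPUT 'localhost:9200/my_index' -d'
{
  "settings": {
    "analysis": {
      "analyzer": {
        "folded_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "icu_folding"]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "name": { "type": "string", "analyzer": "folded_analyzer" }
      }
    }
  }
}'
With such a mapping, match queries on name for "Škoda" and "Skoda" produce the same folded tokens and return the same results.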

Elasticsearch email search mismatch with match query using com

GET candidates1/candidate/_search
{
"fields": ["contactInfo.emails.main"],
"query": {
"bool": {
"must": [
{
"match": {
"contactInfo.emails.main": "com"
}
}
]
}
}
}
GET candidates1/candidate/_search
{
"size": 5,
"fields": [
"contactInfo.emails.main"
],
"query": {
"match": {
"contactInfo.emails.main": "com"
}
}
}
Hi,
When I use the above query I get results like ['nraheem@dbtech1.com', 'arelysf456@gmai1.com', 'ron@rgb52.com'], but I am not getting emails like ['pavann.aryasomayajulu@gmail.com', 'kumar@gmail.com', 'raj@yahoo.com'].
But when I use the query to match "gmail.com", I do get results which have gmail.com.
So my question is: when I use "com" in the first query, I expect results that include gmail.com, since "com" is present in gmail.com. But that is not happening.
Note: we have almost 2 million email IDs and most of them are gmail.com, yahoo.com or hotmail; only a few are of other types.
"contactInfo.emails.main" fields seem to be an analyzed field.
In elasticsearch all string fields are analyed using Standard Analyzer and are converted into tokens.You can see how your text is getting analyzed using analyze api. Email Ids mentioned by you ending in number before com are getting analyzed as nraheem , dbtech1 , com. Use following query to see the tokens.
curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer": "standard",
  "text": "nraheem@dbtech1.com"
}'
As you can see, a separate term com is created. If you analyze kumar@gmail.com instead, you will get tokens like kumar and gmail.com; no separate com token is created in that case.
This is because the standard analyzer splits terms when it encounters special characters like @ or ? and also at boundaries involving numbers. You can create a custom analyzer to meet your requirement.
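One possible sketch (not from the original answer) is a custom analyzer built on the pattern tokenizer, which splits on any non-alphanumeric character, so every email yields separate tokens including com; the index and analyzer names below are placeholders:
curl -XPUT 'localhost:9200/new_index' -d'
{
  "settings": {
    "analysis": {
      "analyzer": {
        "email_parts_analyzer": {
          "type": "custom",
          "tokenizer": "pattern",
          "filter": ["lowercase"]
        }
      }
    }
  }
}'
With that analyzer mapped on contactInfo.emails.main (and the data reindexed), both kumar@gmail.com and nraheem@dbtech1.com produce a com token, so a match query for "com" finds both.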
Hope this helps!!

Elasticsearch hyphen issue with term filter

I have the following Elasticsearch query with only a term filter. My query is much more complex, but I am just trying to show the issue here.
{
  "filter": {
    "term": {
      "field": "update-time"
    }
  }
}
When I pass a hyphenated value to the filter, I get zero results back, but if I try an unhyphenated value I get results back. I am not sure the hyphen is the issue here, but my scenario makes me believe so.
Is there a way to escape the hyphen so the filter returns results? I have tried escaping the hyphen with a backslash, which I read about on the Lucene forums, but that didn't help.
Also, if I pass a GUID value into this field which is hyphenated and surrounded by curly braces, something like {ASD23-34SD-DFE1-42FWW}, would I need to lowercase the alphabetic characters and escape the curly braces too?
Thanks
I would guess that your field is analyzed, which is the default setting for string fields in Elasticsearch. As a result, when it is indexed, it is not indexed as the single term "update-time" but as two terms: "update" and "time". That's why your term search cannot find it. If this field will always contain values that have to be matched exactly as-is, it would be best to define it in the mapping as not analyzed. You can do that by recreating the index with a new mapping:
curl -XPUT http://localhost:9200/your-index -d '{
"mappings" : {
"your-type" : {
"properties" : {
"field" : { "type": "string", "index" : "not_analyzed" }
}
}
}
}'
curl -XPUT http://localhost:9200/your-index/your-type/1 -d '{
"field" : "update-time"
}'
curl -XPOST http://localhost:9200/your-index/your-type/_search -d'{
"filter": {
"term": {
"field": "update-time"
}
}
}'
Alternatively, if you want some flexibility in finding records based on this field, you can keep this field analyzed and use text queries instead:
curl -XPOST http://localhost:9200/your-index/your-type/_search -d'{
"query": {
"text": {
"field": "update-time"
}
}
}'
Please keep in mind that if your field is analyzed, this record will also be found by searching for just the word "update" or the word "time".
The accepted answer didn't work for me with Elasticsearch 6.1. I solved it using the keyword sub-field that Elasticsearch provides by default on string fields.
{
"filter": {
"term": {
"field.keyword": "update-time"
}
}
}
Based on the answer by @imotov: if you're using spring-data-elasticsearch, all you need to do is mark your field as
@Field(type = FieldType.String, index = FieldIndex.not_analyzed)
instead of
@Field(type = FieldType.String)
The caveat is that you need to drop the index and re-instantiate it with the new mappings.
