How to make use of `gt` and `fields` in the same query in Elasticsearch

In my previous question, I was introduced to the fields parameter of a query_string query and how it can help me search nested fields of a document.
{
"query": {
"query_string": {
"fields": ["*.id","id"],
"query": "2"
}
}
}
But it only works for matching. What if I want to do a comparison? After some reading and testing, it seems queries like range do not support fields. Is there any way I can perform a range query, e.g. on a date, over a field that can be scattered anywhere in the document hierarchy?
For example, consider the following document:
{
"id" : 1,
"Comment" : "Comment 1",
"date" : "2016-08-16T15:22:36.967489",
"Reply" : [ {
"id" : 2,
"Comment" : "Inner comment",
"date" : "2016-08-16T16:22:36.967489"
} ]
}
Is there a query searching over the date field (like date > '2016-08-16T16:00:00.000000') which matches the given document, because of the nested field, without explicitly giving the address to Reply.date? Something like this (I know the following query is incorrect):
{
"query": {
"range" : {
"date" : {
"gte" : "2016-08-16T16:00:00.000000",
},
"fields": ["date", "*.date"]
}
}
}

The range query itself doesn't support it. However, you can use the query_string query (again): it lets you wildcard field names and it supports range syntax, which together achieve what you need. Note that the backslash escaping the wildcard must itself be escaped inside the JSON string:
{
"query": {
"query_string": {
"query": "\\*date:[2016-08-16T16:00:00.000Z TO *]"
}
}
}
The above query will return your document because Reply.date matches *date.
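As a rough illustration of the escaping involved, here is a minimal Python sketch that builds the same request body. The helper name `wildcard_range_query` is made up for this example, and it only constructs the JSON; sending it to a cluster is left out.

```python
import json


def wildcard_range_query(field_pattern, lower, upper="*"):
    """Build a query_string body with a range over a wildcarded field name.

    Hypothetical helper for illustration only.
    """
    # Escape the wildcard in the field name so query_string reads it as
    # part of the field pattern.
    escaped = field_pattern.replace("*", "\\*")
    return {
        "query": {
            "query_string": {
                "query": f"{escaped}:[{lower} TO {upper}]"
            }
        }
    }


body = wildcard_range_query("*date", "2016-08-16T16:00:00.000Z")
# json.dumps doubles the backslash when serializing the string.
print(json.dumps(body, indent=2))
```

The serialized output contains `\\*date`, which is the form the raw JSON request body needs.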

Related

Inconsistent behavior of ElasticSearch not_analyzed field

I am using ES version 2.3. I have indexed some documents with a structure like this:
{
"BUSINESSLINE" :"ABC CORP",
"NAME" : "John"
....
...
}
The field BUSINESSLINE is a not_analyzed string.
The problem is that this query returns results:
{
"query": {
"multi_match" : {
"query": "ABC",
"fields": [ "_all" ]
}
}
}
But this one does not (It shows no hits!):
{
"query": {
"multi_match" : {
"query": "ABC",
"fields": [ "BUSINESSLINE " ]
}
}
}
Any help is appreciated. I tried to google and research, but I am not able to find any reason for this.
Thanks!
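Yes, you are correct. The first query matches the document because of the _all field, which is one big string built by concatenating all fields with a space separator. It is also analyzed, so "ABC" ends up as its own token and your query matches. BUSINESSLINE, being not_analyzed, is indexed as the single term "ABC CORP", so a query for "ABC" alone finds nothing there.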
You can read more about it here.
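To make the difference concrete, here is a small Python model of the two cases. It is only a sketch: `analyzed_tokens` is a crude stand-in for a real analyzer (lowercase plus whitespace split), not what Elasticsearch actually runs, and all function names are invented for the example.

```python
def analyzed_tokens(text):
    # Crude stand-in for an analyzed field: lowercase, then split on whitespace.
    return text.lower().split()


def not_analyzed_tokens(text):
    # A not_analyzed field indexes the whole value as one unchanged term.
    return [text]


def matches(query, indexed_value, analyzer):
    # The query text goes through the same analyzer as the field; a hit needs
    # at least one query term to be present among the indexed terms.
    indexed = set(analyzer(indexed_value))
    return any(term in indexed for term in analyzer(query))


all_field = "ABC CORP John"  # _all concatenates the field values
business_line = "ABC CORP"

print(matches("ABC", all_field, analyzed_tokens))               # True
print(matches("ABC", business_line, not_analyzed_tokens))       # False
print(matches("ABC CORP", business_line, not_analyzed_tokens))  # True
```

Under this model, only the complete value "ABC CORP" finds the document through BUSINESSLINE.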

Is it possible to return a specific field when running a query in sense for elasticsearch

I have loaded some data into Elasticsearch and written a query against it; however, the results contain all of the data for the matching records. Is it possible to filter the results to show a particular field?
Example
Query to find all records for a specific country but to return a list of registration numbers.
All the data is available in Elasticsearch; however, I get a full JSON record back for each match.
I'm running this query in SENSE (within Kibana 4.5.0).
The query is...
GET _search
{
filter_path=reg_no.*,
"fields" : ["reg_no"],
"query" : {
"fields" : ["country_cd", "oprg_stat"],
"query" : "956 AND 9074"
}
}
If I remove the two lines
filter_path=reg_no.*,
"fields" : ["reg_no"],
the query runs but brings back all the data.
Try this query:
POST _search
{
"_source": [
"reg_no"
],
"query": {
"bool": {
"filter": [
{
"term": {
"country_cd": "956"
}
},{
"term": {
"oprg_stat": "9074"
}
}
]
}
}
}
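The `_source` list limits which fields come back for each hit, while the bool filter does the matching. If you build the body from code, a minimal Python sketch could look like this; `build_search` is a hypothetical helper name, and the snippet only assembles the body without sending it anywhere.

```python
import json


def build_search(source_fields, term_filters):
    """Assemble a search body that returns only `source_fields`.

    Hypothetical helper: `term_filters` maps field name -> exact value.
    """
    return {
        # Source filtering: only these fields are returned per hit.
        "_source": list(source_fields),
        "query": {
            "bool": {
                # One term clause per field; all of them must match.
                "filter": [
                    {"term": {field: value}}
                    for field, value in term_filters.items()
                ]
            }
        },
    }


body = build_search(["reg_no"], {"country_cd": "956", "oprg_stat": "9074"})
print(json.dumps(body, indent=2))
```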

ElasticSearch - Search for complete phrase only

I am trying to create a search that returns exactly what I requested.
For instance, let's say I have 2 documents with a field named 'Val'.
The first doc has a value of 'a - Copy'; the second is 'a - Copy (2)'.
My goal is to search for exactly the value 'a - Copy' and get only the first document back, not both of them with different similarity rankings.
When I try most of the usual queries, like:
GET test/_search
{
"query": {
"match": {
"Val": {
"query": "a - copy",
"type": "phrase"
}
}
}
}
or:
GET /test/doc/_search
{
"query": {
"query_string": {
"default_field": "Val",
"query": "a - copy"
}
}
}
I get both documents all the time.
There is very good documentation on finding exact values in ES:
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html
It shows you how to use the term filter and it mentions problems with analyzed fields, too.
To put it in a nutshell you need to run a term filter like this (I've put your values in):
GET /test/doc/_search
{
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"term" : {
"Val" : "a - copy"
}
}
}
}
}
However, this doesn't work with analyzed fields. You won't get any results.
To prevent this from happening, we need to tell Elasticsearch that
this field contains an exact value by setting it to be not_analyzed.
There are multiple ways to achieve that, e.g. custom field mappings.
Yes, you are getting that because your field is, most likely, analyzed and split into tokens.
You need an analyzer similar to this one
"custom_keyword_analyzer": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
which uses the keyword tokenizer and the lowercase filter (I noticed you indexed upper case letters, but expect to search with lowercase letters).
And then use a term filter to search your documents.
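Here is a rough Python model of why the phrase query returns both documents while a keyword-tokenized field returns only one. The tokenizers below are simplified stand-ins (a `\w+` regex instead of the real standard analyzer), and every function name is invented for the illustration.

```python
import re


def standard_like(text):
    # Simplified stand-in for an analyzed field: lowercase word tokens,
    # punctuation such as "-" and "(" ")" is dropped.
    return re.findall(r"\w+", text.lower())


def keyword_lowercase(text):
    # keyword tokenizer + lowercase filter: the whole value is one term.
    return [text.lower()]


def phrase_match(query_tokens, doc_tokens):
    # A phrase matches when the query tokens appear contiguously, in order.
    n = len(query_tokens)
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


docs = ["a - Copy", "a - Copy (2)"]
query = "a - copy"

phrase_hits = [
    d for d in docs if phrase_match(standard_like(query), standard_like(d))
]
# A term lookup does not analyze the query; it compares it to the stored term.
term_hits = [d for d in docs if keyword_lowercase(d) == [query]]

print(phrase_hits)  # ['a - Copy', 'a - Copy (2)']
print(term_hits)    # ['a - Copy']
```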

match or term query on a long property for exact match?

My document has the following mapping property:
"sid" : {"type" : "long", "store": "yes", "index": "not_analyzed"},
This property has only one long value for each record. I would like to query this property. I tried the following two queries:
{
"query" : {
"term" : {
"sid" : 10
}
}
}
{
"query" : {
"match" : {
"sid" : 10
}
}
}
Both queries work and return the target document. My question: which one is more efficient? And why?
You want to use a term query, and if you want to be even more efficient, use a filtered query so your results get cached.
GET index1/test/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"sid": 10
}
}
}
}
}
Both work the same way, as you mentioned. Unlike the match query, the term query matches documents whose fields contain the exact term (not analyzed!). So in my opinion the term query is more efficient in your case, because no analysis has to be done. See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
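One caveat worth keeping in mind: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, where a bool query with a filter clause plays the same role. The Python sketch below builds either form; `term_filter_body` and its `legacy` flag are names made up for this example.

```python
def term_filter_body(field, value, legacy=False):
    """Build a non-scoring term lookup as a search body.

    Hypothetical helper: `legacy=True` emits the older filtered form,
    otherwise the bool/filter form is produced.
    """
    term = {"term": {field: value}}
    if legacy:
        return {"query": {"filtered": {"filter": term}}}
    return {"query": {"bool": {"filter": [term]}}}


print(term_filter_body("sid", 10, legacy=True))
print(term_filter_body("sid", 10))
```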

Favor exact matches over nGram in elasticsearch

I am trying to map a field as nGram and 'exact' match, and make the exact matches appear first in the search results. This is an answer to a similar question, but I am struggling to make it work.
No matter what boost value I specify for the 'exact' field I get the same results order each time. This is how my field mapping looks:
"name" : {
"type" : "multi_field",
"fields" : {
"name" : {
"type" : "string",
"boost" : 2.0,
"analyzer" : "ngram"
},
"exact" : {
"type" : "string",
"boost" : 4.0,
"analyzer" : "simple",
"include_in_all" : false
}
}
}
And this is what the query looks like:
{
"query": {
"filtered": {
"query": {
"query_string": {
"fields":["name","name.exact"],
"query":"Woods"
}
}
}
}
}
Understanding how the score is calculated
Elasticsearch has an option for producing an explanation with every search result: set the explain parameter to true.
POST <Index>/<Type>/_search?explain&format=yaml
{
"query" : " ....."
}
It will produce a lot of output for every hit, which can be overwhelming, but it is worth taking some time to understand what it all means.
The output of explain can be hard to read in JSON, so adding format=yaml makes it easier to read.
Understanding why a document is matched or not
You can run the query against a specific document, as below, to see an explanation of how the matching is done.
GET <Index>/<type>/<id>/_explain
{
"query": "....."
}
The multi_field mapping is correct, but the search query needs to be changed like this:
{
"query": {
"filtered": {
"query": {
"multi_match": { # changed from "query_string"
"fields": ["name","name.exact"],
"query": "Woods",
# added this so the engine does a "sum of" instead of a "max of"
# this is deprecated in the latest versions but works with 0.x
"use_dis_max": false
}
}
}
}
}
Now the results take the 'exact' match into account and add it to the score.
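To see why "sum of" versus "max of" changes the ordering, here is a toy Python calculation. The per-field scores are made up, and real scoring involves further factors that this sketch ignores; it only contrasts the two ways of combining field scores.

```python
def dis_max_score(field_scores, tie_breaker=0.0):
    # "max of": the best field score, plus tie_breaker times the rest.
    best = max(field_scores)
    return best + tie_breaker * (sum(field_scores) - best)


def summed_score(field_scores):
    # "sum of": every matching field contributes to the total.
    return sum(field_scores)


# Hypothetical per-field scores as [name (ngram), name.exact].
docs = {
    "Woods": [2.0, 1.0],     # matches both the ngram and the exact field
    "Woodside": [2.0, 0.0],  # matches the ngram field only
}


def ranking(score_fn):
    # Highest score first; ties keep their original order.
    return sorted(docs, key=lambda d: score_fn(docs[d]), reverse=True)


for name, scores in docs.items():
    print(name, dis_max_score(scores), summed_score(scores))
print(ranking(summed_score))
```

With the max-based combination both documents tie, so the exact match cannot pull ahead; with the sum it does.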
