Elasticsearch compare two fields - elasticsearch

For example, take this MySQL query:
SELECT fields_a, fields_b FROM table WHERE fields_a > fields_b;
I am trying to implement the same thing in Elasticsearch. I've tried the following:
$where = [ "query" => [ "filter" => [ "script" => [ "script" => "doc[\"fields_a\"].value > doc[\"fields_b\"].value" ] ] ] ];
What am I missing?

This should work:
{
"query": {
"bool": {
"must": [{
"script": {
"script": "doc['field_a'].value > doc['field_b'].value"
}
}]
}
}
}
To return only a few fields instead of the whole source document, add "stored_fields": ["field_a","field_b"] to the request and make sure those fields are mapped with store=true.
{
"stored_fields": ["field_a","field_b"],
"query": {
"bool": {
"must": [{
"script": {
"script": "doc['field_a'].value > doc['field_b'].value"
}
}]
}
}
}
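The request body above can also be assembled programmatically. The following is a minimal Python sketch, not tied to any particular client library; the helper name compare_fields_query and its parameters are made up for illustration, and it simply reproduces the script shape used in the answer.

```python
import json

ALLOWED_OPERATORS = {">", ">=", "<", "<=", "=="}


def compare_fields_query(left, right, op=">", stored_fields=None):
    """Build a search body matching documents where `left <op> right`.

    Hypothetical helper: it only builds a dict, it does not talk to a cluster.
    """
    if op not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    # Same script shape as in the answer: compare two doc values.
    script = f"doc['{left}'].value {op} doc['{right}'].value"
    body = {"query": {"bool": {"must": [{"script": {"script": script}}]}}}
    if stored_fields:
        # Only useful if these fields are mapped with store=true.
        body["stored_fields"] = list(stored_fields)
    return body


body = compare_fields_query("field_a", "field_b",
                            stored_fields=["field_a", "field_b"])
print(json.dumps(body, indent=2))
```

Restricting the operator to a fixed set keeps arbitrary text out of the script string.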

Related

How to search by a specific value or where the field does not exist in Elasticsearch

I'm trying to write a query that matches when the field does not exist or when its value is in an array of values. In Mongo it would be something like this:
$or: [
{
field: {
$exists: false
}
},
{
field: [val1, val2]
}
]
I've tried a couple of variations but can't seem to find the solution for this.
Edit1:
Basically, in my mind, the query should look like this:
query: {
"bool": {
"should": [
"must": {...logic}
"must_not": {...logic}
]
}
}
This returns:
"must_not" malformed
You need to do it like this:
{
"query": {
"bool": {
"should": [
{
"terms": {
"field": [
"val1",
"val2"
]
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "field"
}
}
}
}
]
}
}
}
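As a rough illustration of the same structure, here is a small Python sketch that builds this "value in list OR field missing" body for any field. The function name in_values_or_missing is hypothetical; the clauses themselves (terms, exists, must_not inside a nested bool) are the ones from the answer.

```python
def in_values_or_missing(field, values):
    """Match documents where `field` is one of `values` or is absent."""
    return {
        "query": {
            "bool": {
                # With no must/filter clause present, at least one
                # should clause has to match.
                "should": [
                    {"terms": {field: list(values)}},
                    # must_not has to be wrapped in its own bool query;
                    # it cannot appear directly inside the should array.
                    {"bool": {"must_not": {"exists": {"field": field}}}},
                ]
            }
        }
    }


body = in_values_or_missing("field", ["val1", "val2"])
```

The wrapping bool around must_not is what the "malformed" attempt in the question was missing.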

"update by query" not working as expected with straight calls

I have a script that calls Elasticsearch with some _update_by_query requests.
Here I update the item with id=299966 and set the trash flag to trash=0:
_update_by_query
{
"query": {
"query": {
"bool": {
"must": [
{
"terms": {
"_id": [
299966
]
}
}
],
"should": [
]
}
}
},
"script": {
"inline": "ctx._source.trash=0"
}
}
Then I set the item with id=299966 (the same item as above) to trash=1:
_update_by_query
{
"query": {
"query": {
"bool": {
"must": [
{
"terms": {
"_id": [
299966
]
}
}
],
"should": [
]
}
}
},
"script": {
"inline": "ctx._source.trash=1"
}
}
The thing is that after these two operations, if I search for the item with id=299966, I get trash=0, when it should be trash=1 because that update ran last. I always maintain the order, and my own log shows that the update with trash=0 is executed first, followed by the one with trash=1.
Is there anything in the update_by_query logic that prevents making two calls in a row? Do I have to wait a few seconds or something before making the second update_by_query?
PS: Never mind the doubled query key in the code. It works fine.
Thanks in advance.
The solution I found is to call _flush after every _update or every _update_by_query.
myindex/_update_by_query
{
"query": {
"query": {
"bool": {
"must": [
{
"terms": {
"_id": [
299966
]
}
}
],
"should": [
]
}
}
},
"script": {
"inline": "ctx._source.trash=0"
}
}
myindex/_flush
myindex/_update_by_query
{
"query": {
"query": {
"bool": {
"must": [
{
"terms": {
"_id": [
299966
]
}
}
],
"should": [
]
}
}
},
"script": {
"inline": "ctx._source.trash=1"
}
}
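The ordering described in this answer (update, flush, update, flush) can be sketched as a list of requests. This is only an illustration in Python: the helper names are invented, nothing is sent over the network, and the query is written with a single query key rather than the doubled one from the question.

```python
def trash_update(doc_id, trash):
    """Body for an _update_by_query call that sets the trash flag."""
    return {
        "query": {"terms": {"_id": [doc_id]}},
        # "inline" is the script key used in the question.
        "script": {"inline": f"ctx._source.trash={int(trash)}"},
    }


def update_sequence(index, doc_id, trash_values):
    """Return (method, path, body) tuples, flushing after every update."""
    steps = []
    for value in trash_values:
        steps.append(("POST", f"{index}/_update_by_query",
                      trash_update(doc_id, value)))
        steps.append(("POST", f"{index}/_flush", None))
    return steps


steps = update_sequence("myindex", 299966, [0, 1])
for method, path, _ in steps:
    print(method, path)
```

Each tuple could then be passed to whatever HTTP client the script already uses.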

Search from multiple nested level fields in elasticsearch

I want to search across multiple nested-level fields, with a query like:
select * from product where brand='brand1' and category='category1'.
In Elasticsearch I have two nested-level mappings: one is category and the other is brand.
If I query only brand or only category, it returns the correct result, but how do I combine both in the following query?
$params = [
'index' => 'my_index',
'type' => 'product',
'body' => [
"query"=>[
"filtered"=>[
"filter"=>[
"bool"=>[
"must"=>[
"bool"=>[
"must"=>[
[
"query"=>[
"match"=>[
"brand"=>[
"query"=>"brand1",
"type"=>"phrase"
]
]
]
],
[
"query"=>[
"match"=>[
"category"=>[
"query"=>"category1",
"type"=>"phrase"
]
]
]
]
]
]
]
]
]
]
]
]
];
With the above query I get 0 results.
Try the query below; it should return the expected results:
GET /product/ur_type/_search
{
"from": 0,
"size": 200,
"query": {
"filtered": {
"filter": {
"bool": {
"must": {
"bool": {
"must": [
{
"query": {
"match": {
"brand": {
"query": "brand1",
"type": "phrase"
}
}
}
},
{
"query": {
"match": {
"category": {
"query": "category1",
"type": "phrase"
}
}
}
}
]
}
}
}
}
}
}
}
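The filtered query used above was removed in Elasticsearch 5.0, so on later versions a different shape is needed. The Python sketch below swaps in a plain bool query with match_phrase clauses; this is a substitute, not the answer's own method. The helper name all_phrases is hypothetical, and it assumes brand and category are ordinary text fields (fields mapped as type nested would need a nested query, which this sketch does not cover).

```python
def all_phrases(criteria, size=200):
    """Require every field in `criteria` to contain its phrase (AND)."""
    must = [
        {"match_phrase": {field: text}}
        for field, text in criteria.items()
    ]
    return {"from": 0, "size": size, "query": {"bool": {"must": must}}}


body = all_phrases({"brand": "brand1", "category": "category1"})
```

Every entry in the must array has to match, which corresponds to the AND in the SQL version of the question.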

elasticsearch bool query combine must with OR

I am currently trying to migrate a solr-based application to elasticsearch.
I have this lucene query:
((
name:(+foo +bar)
OR info:(+foo +bar)
)) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)
As far as I understand this is a combination of must clauses combined with boolean OR:
Get all documents containing (foo AND bar in name) OR (foo AND bar in info). After that filter results by condition state=1 and boost documents that have an image.
I have been trying to use a bool query with must but I am failing to get boolean OR into must clauses. Here is what I have:
GET /test/object/_search
{
"from": 0,
"size": 20,
"sort": {
"_score": "desc"
},
"query": {
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"match": {
"name": "bar"
}
}
],
"must_not": [],
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
}
}
}
As you can see, must conditions for info are missing.
** UPDATE **
I have updated my elasticsearch query and got rid of that function score. My base problem still exists.
OR is spelled should
AND is spelled must
NOR is spelled must_not
Example:
You want to see all the items that are (round AND (red OR blue)):
{
"query": {
"bool": {
"must": [
{
"term": {"shape": "round"}
},
{
"bool": {
"should": [
{"term": {"color": "red"}},
{"term": {"color": "blue"}}
]
}
}
]
}
}
}
You can also do more complex versions of OR. For example, if you want to match at least 3 out of 5, you can specify 5 options under "should" and set "minimum_should_match" to 3.
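To make the nesting easier to see, here is a small Python sketch that composes both examples from tiny helper functions. The helpers bool_query and term are invented for illustration, and the tag field with values a to e is a made-up stand-in for the "5 options".

```python
def term(field, value):
    """A single term clause."""
    return {"term": {field: value}}


def bool_query(must=None, should=None, minimum_should_match=None):
    """Build a bool clause, leaving out any part that was not given."""
    clause = {}
    if must:
        clause["must"] = list(must)
    if should:
        clause["should"] = list(should)
    if minimum_should_match is not None:
        clause["minimum_should_match"] = minimum_should_match
    return {"bool": clause}


# round AND (red OR blue): the OR is a bool nested inside must.
round_and_red_or_blue = {
    "query": bool_query(must=[
        term("shape", "round"),
        bool_query(should=[term("color", "red"), term("color", "blue")]),
    ])
}

# At least 3 of 5 options (hypothetical "tag" field and values).
at_least_three = bool_query(
    should=[term("tag", t) for t in ("a", "b", "c", "d", "e")],
    minimum_should_match=3,
)
```

Because bool_query returns an ordinary clause, it can be placed anywhere another clause is allowed, which is all that nesting amounts to.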
Thanks to Glen Thompson and Sebastialonso for finding where my nesting wasn't quite right before.
Thanks also to Fatmajk for pointing out that "term" becomes a "match" in ElasticSearch Version 6.
I finally managed to create a query that does exactly what I wanted:
A filtered nested boolean query.
I am not sure why this is not documented. Maybe someone here can tell me?
Here is the query:
GET /test/object/_search
{
"from": 0,
"size": 20,
"sort": {
"_score": "desc"
},
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"state": 1
}
}
]
}
},
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"match": {
"name": "bar"
}
}
],
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
}
},
{
"bool": {
"must": [
{
"match": {
"info": "foo"
}
},
{
"match": {
"info": "bar"
}
}
],
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
}
}
],
"minimum_should_match": 1
}
}
}
}
}
In pseudo-SQL:
SELECT * FROM /test/object
WHERE
((name=foo AND name=bar) OR (info=foo AND info=bar))
AND state=1
Please keep in mind that how name=foo is handled internally depends on your field analysis and mappings. The behavior can range from fuzzy to strict.
"minimum_should_match": 1 means that at least one of the should clauses must match.
The should clause below means that any document in the result set containing has_image:1 is boosted by a factor of 100, which changes the result ordering.
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
Have fun guys :)
This is how you can nest multiple bool queries in one outer bool query, using Kibana:
bool indicates we are using a boolean query
must is for AND
should is for OR
GET my_index/my_type/_search
{
"query" : {
"bool": { //bool indicates we are using boolean operator
"must" : [ //must is for **AND**
{
"match" : {
"description" : "some text"
}
},
{
"match" :{
"type" : "some Type"
}
},
{
"bool" : { //here it's a nested boolean query
"should" : [ //should is for **OR**
{
"match" : {
// your query
}
},
{
"match" : {}
}
]
}
}
]
}
}
}
This is how you can nest a query in ES.
The bool query supports more clause types as well, such as:
filter
must_not
I recently had to solve this problem too, and after a LOT of trial and error I came up with this (in PHP, but maps directly to the DSL):
'query' => [
'bool' => [
'should' => [
['prefix' => ['name_first' => $query]],
['prefix' => ['name_last' => $query]],
['prefix' => ['phone' => $query]],
['prefix' => ['email' => $query]],
[
'multi_match' => [
'query' => $query,
'type' => 'cross_fields',
'operator' => 'and',
'fields' => ['name_first', 'name_last']
]
]
],
'minimum_should_match' => 1,
'filter' => [
['term' => ['state' => 'active']],
['term' => ['company_id' => $companyId]]
]
]
]
Which maps to something like this in SQL:
SELECT * from <index>
WHERE (
name_first LIKE '<query>%' OR
name_last LIKE '<query>%' OR
phone LIKE '<query>%' OR
email LIKE '<query>%'
)
AND state = 'active'
AND company_id = <company_id>
The key in all this is the minimum_should_match setting. Without it, the should clauses become optional as soon as a filter is present, so the filter alone decides which documents match.
Hope this helps someone!
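The effect of minimum_should_match can be illustrated with a toy model in Python. This is not Elasticsearch code: it only handles term clauses, ignores must_not and scoring, and uses exact term clauses in place of the prefix clauses from the answer. It mirrors the default rule that should clauses are optional when a must or filter clause is present and otherwise need at least one match.

```python
def term_matches(doc, clause):
    """True if the document has exactly this value for the field."""
    field, value = next(iter(clause["term"].items()))
    return doc.get(field) == value


def bool_matches(doc, bool_clause):
    """Toy evaluation of a bool clause made only of term clauses."""
    required = bool_clause.get("must", []) + bool_clause.get("filter", [])
    should = bool_clause.get("should", [])
    # Default: 0 when must/filter exist (or there is no should), else 1.
    default_minimum = 0 if required or not should else 1
    minimum = bool_clause.get("minimum_should_match", default_minimum)
    if not all(term_matches(doc, c) for c in required):
        return False
    hits = sum(term_matches(doc, c) for c in should)
    return hits >= minimum


doc = {"state": "active", "name_first": "bob"}
query = {
    "should": [{"term": {"name_first": "alice"}}],
    "filter": [{"term": {"state": "active"}}],
}

print(bool_matches(doc, query))                                  # True
print(bool_matches(doc, {**query, "minimum_should_match": 1}))   # False
```

In the first call the document matches on the filter alone even though no should clause matches; setting minimum_should_match to 1 makes the should clause count again.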
If you were using Solr's default or Lucene query parser, you can pretty much always put it into a query string query:
POST test/_search
{
"query": {
"query_string": {
"query": "(( name:(+foo +bar) OR info:(+foo +bar) )) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)"
}
}
}
That said, you may want to use a boolean query, like the one you already posted, or even a combination of the two.
$filterQuery = $this->queryFactory->create(QueryInterface::TYPE_BOOL, ['must' => $queries,'should'=>$queriesGeo]);
Put the query conditions that should be combined with AND into the must array, and the conditions that should be combined with OR into the should array.
You can check this: https://github.com/Smile-SA/elasticsuite/issues/972

Elasticsearch not storing geoip data from logstash

I'm trying to add the geoip map to Kibana, following the intro to Logstash.
I can see the correct output from the rubydebug codec:
"geoip" => {
"location" => [
[0] -122.3426,
[1] 47.739599999999996
],
But when I query Elasticsearch (using the query from Kibana) for anything with a "geoip.location" field, I get all the results, and none of them has a geoip field.
{
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "*"
}
}
]
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"@timestamp": {
"from": 1409025267221,
"to": 1409111667222
}
}
},
{
"exists": {
"field": "geoip.location"
}
}
]
}
}
}
},
"fields": [
"geoip.location",
"_id"
],
"size": 1000,
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
]
}
Never mind, the dates were out of range. When I added some recent data, it showed up on the map.
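The range filter in the query above spans roughly 24 hours in epoch milliseconds, so older events fall outside it. A minimal Python sketch of that check follows; the helper names are hypothetical, and the default field name @timestamp is an assumption based on the usual Logstash timestamp field.

```python
from datetime import datetime, timedelta, timezone


def epoch_ms(moment):
    """Convert an aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def last_hours_filter(now, hours=24, field="@timestamp"):
    """Range filter covering the `hours` before `now`."""
    start = now - timedelta(hours=hours)
    return {"range": {field: {"from": epoch_ms(start), "to": epoch_ms(now)}}}


def in_window(event_time, range_filter, field="@timestamp"):
    """True if the event's timestamp falls inside the filter's bounds."""
    bounds = range_filter["range"][field]
    return bounds["from"] <= epoch_ms(event_time) <= bounds["to"]


now = datetime(2014, 8, 27, 4, 0, tzinfo=timezone.utc)
window = last_hours_filter(now)
print(in_window(now - timedelta(hours=1), window))   # recent event
print(in_window(now - timedelta(days=365), window))  # old event
```

Checking a sample event's timestamp against the window this way is a quick test for the "dates out of range" situation described above.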
