ElasticSearch Delete By Query - delete multiple values

I am using the following Elasticsearch query to delete all documents with sourceId 1:
POST http://{{elasticip}}:9200/index2/index2_doc/_delete_by_query
{
"query": {
"match": {
"sourceId": 1
}
}
}
What is the proper syntax of the body if I wanted to delete from sourceId 1, 2, and 3 all at once?

Use a terms filter inside a bool query:
{ "query" : { "bool" : { "filter" : { "terms" : { "sourceId" : [1,2,3] } } } } }
Note: jaspreet's advice is also correct.
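
As a rough sketch of the body above, the following Python helper builds the same bool/terms filter for any list of values. The name delete_by_query_body is made up for illustration; the function only constructs the JSON body and does not talk to Elasticsearch.

```python
import json


def delete_by_query_body(field, values):
    """Build a _delete_by_query body matching documents whose field equals any of the values."""
    # "terms" is an implicit OR over the listed values
    return {"query": {"bool": {"filter": {"terms": {field: list(values)}}}}}


body = delete_by_query_body("sourceId", [1, 2, 3])
# Serialized form, ready to be sent as the POST body
payload = json.dumps(body)
```

The resulting payload would be posted to the same _delete_by_query endpoint as in the question.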

Related

Elasticsearch - use a field match to boost only and not to fetch the document

I have a query phrase that needs to match in either of the fields - name, summary or description or the exact match on the name field.
Now, I have one more new field brand. Match in this field should be used only to boost results. Meaning if there is a match only in the brand field, the doc should not be in the result set.
To solve this without the brand field, I have the query below:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{ "multi_match": {
"query": "Cadbury chocklate milk",
"fields": ["name", "summary", "description"]
} },
{ "term": {
"name_keyword": {
"value": "Cadbury chocklate milk"
}
} }
]
}
}
}
This works fine for me.
How do I fetch the data using the same query but boost docs that have brand:cadbury, without increasing the recall set (that is, without matching on brand:cadbury alone)?
Thanks!
Using a bool inside must should work for you.
multi_match has multiple types and for phrase you have to use type:phrase.
{
"query": {
"bool": {
"must": [
{ "bool" :
{ "should" : [ {
"multi_match" :{
"type" : "phrase",
"query" : "Cadbury chocklate milk",
"fields" : ["name", "summary", "description"]
} }, {
"term": {
"name_keyword": {
"value": "Cadbury chocklate milk"
} }
}
]
}
}
],
"should" : {
"term" : {
"brand" : {
"value" : "cadbury"
}
}
}
}
}
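
The reason this works: when a bool query has a must clause, its should clauses are optional and only add to the score of documents that already match. A minimal Python sketch of that pattern follows; the helper name boosted_query is hypothetical, and the explicit minimum_should_match in the inner bool is an addition for clarity (an inner bool with only should clauses already requires one match).

```python
def boosted_query(required_alternatives, boost_clause):
    """Require at least one alternative; use boost_clause for scoring only."""
    return {
        "query": {
            "bool": {
                # Recall set: a document must match at least one alternative
                "must": [
                    {
                        "bool": {
                            "should": list(required_alternatives),
                            "minimum_should_match": 1,
                        }
                    }
                ],
                # Optional clause: raises the score, never adds documents
                "should": [boost_clause],
            }
        }
    }


phrase = "Cadbury chocklate milk"
body = boosted_query(
    [
        {"multi_match": {"type": "phrase", "query": phrase,
                         "fields": ["name", "summary", "description"]}},
        {"term": {"name_keyword": {"value": phrase}}},
    ],
    {"term": {"brand": {"value": "cadbury"}}},
)
```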

Multiple OR filter in Elasticsearch

Hello, I'm having trouble verifying the correctness of the following query for multiple ORs in Elasticsearch. I want to select all the unique values (not a count, but all rows).
My best attempt at this as an Elasticsearch query is:
GET mystash/_search
{
"aggs": {
"uniques":{
"filter":
{
"or":
[
{ "term": { "url.raw" : "/a.json" } },
{ "term": { "url.raw" : "/b.json" } },
{ "term": { "url.raw" : "/c.json"} },
{ "term": { "url.raw" : "/d.json"} }
]
},
"aggs": {
"unique" :{
"terms" :{
"field" : "id.raw",
"size" : 0
}
}
}
}
}
}
The equivalent SQL would be
SELECT DISTINCT id
FROM json_record
WHERE
json_record.url = 'a.json' OR
json_record.url = 'b.json' OR
json_record.url = 'c.json' OR
json_record.url = 'd.json'
I was wondering whether the query above is correct, since the data will be needed for report generation.
Some remarks:
You should use a query filter instead of an aggregation filter. Your query loads all documents.
You can replace your or+term filter with a single terms filter.
You could set size=0 at the root of the request to get only the aggregation result and not the search hits.
Example code:
{"size":0,
"query" :{"filtered":{"filter":{"terms":{"url":["a", "b", "c"]}}}},
"aggs" :{"unique":{"terms":{"field":"id", "size" :0}}}
}
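
The example above targets old Elasticsearch versions. The filtered query was removed in later releases and, as far as I know, "size": 0 is no longer accepted in a terms aggregation there. A hedged Python sketch of the same three remarks for newer versions follows; unique_ids_body and the max_buckets default of 10000 are assumptions for illustration, and the field names are taken from the question.

```python
def unique_ids_body(urls, max_buckets=10000):
    """Filter on url.raw with one terms filter, then list distinct id.raw values."""
    return {
        # Remark 3: no search hits, aggregation result only
        "size": 0,
        # Remarks 1 and 2: filter in the query, single terms filter instead of or+term
        "query": {"bool": {"filter": {"terms": {"url.raw": list(urls)}}}},
        # Explicit bucket limit (assumed value) instead of "size": 0
        "aggs": {"unique": {"terms": {"field": "id.raw", "size": max_buckets}}},
    }


body = unique_ids_body(["/a.json", "/b.json", "/c.json", "/d.json"])
```

If there can be more distinct ids than the bucket limit, the result is truncated, so the limit has to be chosen with the data in mind.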

ElasticSearch : IN equivalent operator in ElasticSearch

I am trying to find the Elasticsearch query equivalent to IN / NOT IN in SQL.
I know we can use a QueryString query with multiple ORs to get the same answer, but that ends up with a lot of ORs.
Can anyone share the example?
Similar to what Chris suggested as a comment, the analogous replacement for IN is the terms filter (queries imply scoring, which may improve the returned order).
SELECT * FROM table WHERE id IN (1, 2, 3);
The equivalent Elasticsearch 1.x filter would be:
{
"query" : {
"filtered" : {
"filter" : {
"terms" : {
"id" : [1, 2, 3]
}
}
}
}
}
The equivalent Elasticsearch 2.x+ filter would be:
{
"query" : {
"bool" : {
"filter" : {
"terms" : {
"id" : [1, 2, 3]
}
}
}
}
}
The important takeaway is that the terms filter (and query for that matter) work on exact matches. It is implicitly an or operation, similar to IN.
If you wanted to invert it, you could use the not filter, but I would suggest using the slightly more verbose bool/must_not filter (to get in the habit of also using bool/must and bool/should).
{
"query" : {
"bool" : {
"must_not" : {
"terms" : {
"id" : [1, 2, 3]
}
}
}
}
}
Overall, the bool compound query syntax is one of the most important filters in Elasticsearch, as are the term (singular) and terms filters (plural, as shown).
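
To make the IN / NOT IN mapping concrete, here is a small Python sketch that builds the two 2.x+ bodies shown above. The helper names sql_in and sql_not_in are invented for this example and are not part of any client library.

```python
def sql_in(field, values):
    """WHERE field IN (values): terms inside bool/filter (exact match, no scoring)."""
    return {"query": {"bool": {"filter": {"terms": {field: list(values)}}}}}


def sql_not_in(field, values):
    """WHERE field NOT IN (values): the same terms clause under bool/must_not."""
    return {"query": {"bool": {"must_not": {"terms": {field: list(values)}}}}}


in_body = sql_in("id", [1, 2, 3])
not_in_body = sql_not_in("id", [1, 2, 3])
```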
1 terms
You can use the terms query in Elasticsearch, which acts like IN.
The terms query checks whether the field value matches any of the values in the provided array.
2 must_not
must_not can be used as NOT in Elasticsearch.
Example:
GET my_index/my_type/_search
{
"query" : {
"bool" : {
"must":[
{
"terms": {
"id" : ["1234","12345","123456"]
}
},
{
"bool" : {
"must_not" : [
{
"match":{
"id" : "123"
}
}
]
}
}
]
}
}
}
exists
Also, if it helps, you can use the "exists" query to check whether a field exists.
For example,
check if the field exists
"exists" : {
"field" : "mobileNumber"
}
check if a field does not exist
"bool":{
"must_not" : [
{
"exists" : {
"field" : "mobileNumber"
}
}
]
}
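
The two exists fragments above are clauses, not complete request bodies; they still need to sit under a top-level "query" key. A minimal Python sketch of that wrapping follows, with hypothetical helper names (exists_clause, missing_clause, search_body).

```python
def exists_clause(field):
    """Clause matching documents where the field has a value."""
    return {"exists": {"field": field}}


def missing_clause(field):
    """Clause matching documents where the field has no value."""
    return {"bool": {"must_not": [exists_clause(field)]}}


def search_body(clause):
    """Wrap a single clause into a complete search request body."""
    return {"query": clause}


has_mobile = search_body(exists_clause("mobileNumber"))
no_mobile = search_body(missing_clause("mobileNumber"))
```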
I saw what you requested.
And I wrote the source code as below.
I hope this helps you solve your problem.
SQL query:
select * from tablename where fieldname in ('AA','BB');
Elasticsearch:
{
"query": {
"bool": {
"must": [{
"script": {
"script": {
"inline": "(doc['fieldname'].value.toString().substring(0,2).toUpperCase() in ['AA','BB']) == true"
}
}
}],
"should": [],
"must_not": []
}
}
}

Elastic(search): How to structure nested queries correctly?

I'm currently quite confused about the structuring of queries in Elasticsearch. Let me explain what I mean with the following template that works fine for me:
{
"template" : {
"query" : {
"filtered" : {
"query" : {
"bool" : {
"must" : [
{ "match" : {
"user" : "{{param_user}}"
} },
{ "match" : {
"session" : "{{param_session}}"
} },
{ "range" : {
"date" : {
"gte" : "{{param_from}}",
"lte" : "{{param_to}}"
}
} }
]
}
}
}
}
}
}
OK, so I want to get the entries of a specific session of a user in a certain time period. Now if you take a look at this link http://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html you can find the following query:
{
"query" : {
"filtered" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
In this example, the "filter" keyword comes right after "filtered". However, if I exchange my second "query" with a "filter" as in the example, my template won't work anymore. This is really counterintuitive and I spent a lot of time figuring this out. (Struck out by the author: Also I don't understand why we need to put every filter in separate { } even though they are already separated by the array syntax.)
Another issue I had was that I assumed that, to match several fields, I could just type something like:
{
"query" : {
"match" : {
"user" : "{{param_user}}",
"session" : "{{param_session}}"
}
}
}
but it seemed that I have to use a bool query, which I didn't know of, so I searched for 'elastic multi match' but got something completely different.
My question: where can I find how to structure a query properly (something like a PEG)? The documentation only gives basic examples but doesn't state what we can actually do and how.
Best regards,
Jan
Edit: Ok I just found by accident that I cannot exchange "query" with "filter" as "match" is a query and not a filter. But then again what about "range"? It seems to be a query as well as a filter... Is there a summary of keywords specifying in which context they can be used?
Is there a summary of keywords specifying in which context they can be used?
I wouldn't consider those keywords. It's just that there are both queries and filters with the same names (but not all of them).
Here is everything you need. For example, there is both a range query and a range filter. All you need is to understand the difference between filters and queries.
For example, if you want to move the range section from the query to the filter, you can do that as shown in the code below (not tested). Since your code already contains a filtered query, you can just create a filter section right after the query section.
{
"template": {
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"user": "{{param_user}}"
}
},
{
"match": {
"session": "{{param_session}}"
}
}
]
}
},
"filter": {
"range": {
"date": {
"gte": "{{param_from}}",
"lte": "{{param_to}}"
}
}
}
}
}
}
}
Just remember that you can only filter on fields that are not analyzed.
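
On Elasticsearch 2.x and later, the filtered query is replaced by a bool query with a must clause for the query part and a filter clause for the filter part. The Python sketch below performs that rewrite on a dict; filtered_to_bool is a hypothetical helper name, and the literal user, session and date values stand in for the template parameters.

```python
def filtered_to_bool(query):
    """Rewrite {"filtered": {"query": q, "filter": f}} as {"bool": {"must": q, "filter": f}}."""
    filtered = query["filtered"]
    bool_body = {}
    if "query" in filtered:
        # Scored part: influences relevance
        bool_body["must"] = filtered["query"]
    if "filter" in filtered:
        # Unscored part: only includes or excludes documents
        bool_body["filter"] = filtered["filter"]
    return {"bool": bool_body}


old_query = {
    "filtered": {
        "query": {"bool": {"must": [
            {"match": {"user": "jan"}},
            {"match": {"session": "abc"}},
        ]}},
        "filter": {"range": {"date": {"gte": "2015-01-01", "lte": "2015-01-31"}}},
    }
}
new_query = filtered_to_bool(old_query)
```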

Elasticsearch DSL query from an SQL statement

I'm new to Elasticsearch. I don't think I fully understand the concept of queries and filters. In my case I just want to use filters, as I don't want to use advanced features like scoring.
How would I convert the following SQL statement into elasticsearch query?
SELECT * FROM advertiser
WHERE company like '%com%'
AND sales_rep IN (1,2)
What I have so far:
curl -XGET 'localhost:9200/advertisers/advertiser/_search?pretty=true' -d '
{
"query" : {
"bool" : {
"must" : {
"wildcard" : { "company" : "*com*" }
}
}
},
"size":1000000
}'
How do I add the OR filters on the sales_rep field?
Thanks
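Add a "should" clause after your must clause. In a bool query that also has a must clause, should clauses are optional by default and only affect scoring, so set "minimum_should_match" to 1 to require that at least one of them matches (some older versions also accept the spelling "minimum_number_should_match"). Check out the bool query docs.
For your case, this should work, together with "minimum_should_match": 1 in the same bool query.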
"should" : [
{
"term" : { "sales_rep_id" : "1" }
},
{
"term" : { "sales_rep_id" : "2" }
}
],
The same concept works for bool filters. Just change "query" to "filter". The bool filter docs are here.
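
Putting the pieces of this answer together, a complete body could look like the Python sketch below. The helper name must_and_any_of is made up; minimum_should_match is set to 1 explicitly so that the OR over sales_rep_id is mandatory regardless of the default.

```python
def must_and_any_of(must_clause, field, values):
    """AND one mandatory clause with an OR over term clauses on a single field."""
    return {
        "query": {
            "bool": {
                "must": must_clause,
                # One term clause per allowed value
                "should": [{"term": {field: value}} for value in values],
                # At least one of the should clauses has to match
                "minimum_should_match": 1,
            }
        }
    }


body = must_and_any_of({"wildcard": {"company": "*com*"}}, "sales_rep_id", ["1", "2"])
```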
I came across this post 4 years too late...
Anyway, perhaps the following code could be useful...
{
"query": {
"filtered": {
"query": {
"wildcard": {
"company": "*com*"
}
},
"filter": {
"bool": {
"should": [
{
"terms": {
"sales_rep_id": [ "1", "2" ]
}
}
]
}
}
}
}
}
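
On versions where the filtered query is no longer available, both conditions can go into a bool filter, which also fits the wish to avoid scoring altogether. A rough Python sketch follows, assuming the field names from the answers above (company, sales_rep_id); advertiser_filter_body is a hypothetical name.

```python
def advertiser_filter_body(company_pattern, rep_ids):
    """WHERE company LIKE pattern AND sales_rep_id IN (rep_ids), as unscored filters."""
    return {
        "query": {
            "bool": {
                # All entries of the filter list must match (AND)
                "filter": [
                    {"wildcard": {"company": company_pattern}},
                    {"terms": {"sales_rep_id": list(rep_ids)}},
                ]
            }
        }
    }


body = advertiser_filter_body("*com*", ["1", "2"])
```

Note that a wildcard pattern with a leading * can be slow on large indices, since it cannot use the start of the term to narrow the lookup.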
