A simple AND query with Elasticsearch - elasticsearch

I am trying to do a simple query for two specified fields, and the manual and google is proving to be of little help. Example below should make it pretty clear what I want to do.
{
"query": {
"and": {
"term": {
"name.family_name": "daniel",
"name.given_name": "tyrone"
}
}
}
}
As a bonus question, why does it find "Daniel Tyrone" with "daniel", but NOT if I search for "Daniel". It behaves like a realy weird anti case sensitive search.

Edit: Updated, sorry. You need a separate Term object for each field, inside of a Bool query:
{
"query": {
"bool": {
"must" : [
{
"term": {
"name.family_name": "daniel"
}
},
{
"term": {
"name.given_name": "tyrone"
}
}
]
}
}
}
Term queries are not analyzed by ElasticSearch, which makes them case sensitive. A Term query says to ES "look for this exact token inside your index, including case and punctuation".
If you want case insensitivity, you could add a keyword + lowercase filter to your analyzer. Alternatively, you could use a query that analyzes your text at query time (like a Match query)
Edit2: You could also use And or Bool filters too.

I found a solution for at least multiple text comparisons on the same field:
{
"query": {
"match": {
"name.given_name": {
"query": "daniel tyrone",
"operator": "and"
}
}
}
And I found this for multiple fields, is this the correct way?
{
"query": {
"bool": {
"must": [
{
"match": {
"name.formatted": {
"query": "daniel tyrone",
"operator": "and"
}
}
},
{
"match": {
"display_name": "tyrone"
}
}
]
}
}
}

If composing the json with PHP, these 2 examples worked for me.
$activeFilters is just a comma separated string like: 'attractions, limpopo'
$articles = Article::searchByQuery(array(
'match' => array(
'cf_categories' => array(
'query' => $activeFilters,
'operator' =>'and'
)
)
));
// The below code is also working 100%
// Using Query String https://www.elastic.co/guide/en/elasticsearch/reference/1.4/query-dsl-query-filter.html
// https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
/* $articles = Article::searchByQuery(array(
'query_string' => array(
'query' => 'cf_categories:attractions AND cf_categories:limpopo'
)
)); */

This worked for me: minimum_should_match is set to 2 since the number of parameters for the AND query are 2.
{
"query": {
"bool": {
"should": [
{"term": { "name.family_name": "daniel"}},
{"term": { "name.given_name": "tyrone" }}
],
"minimum_should_match" : 2
}
}
}

Related

Elastic_search function for deleting all records except a spastic set

I want to delete all records from a specified index except for a specific set. Is there such a function in elastic_search?
I tried to use delete_by_query function but could not get it to work as desired. Below is a snippet of what I tried. I basically want to have an array of ids instead of only one id at a time.
POST /myindex/_delete_by_query
{
"query": {
"bool": {
"must_not": [
{
"match": {
"id": {
"query": [12345,67890]
}
}
}
]
}
}
}
I am new to elastic_search, but in SQL terms I want to something like the following query:
DELETE * FROM <my-index> WHERE <id> != <listOfIds>
Good start!! You can do it like you suggest with a terms query:
POST /myindex/_delete_by_query
{
"query": {
"bool": {
"must_not": [
{
"terms": {
"id": [
12345,
67890
]
}
}
]
}
}
}

Is it possible to access a query term in a script field?

I would like to construct an elasticsearch query in which I can search for a term and on-the-fly compute a new field for each found document, which is calculated based on some existing fields as well as the query term. Is this possible?
For example, let's say in my EL query I am searching for documents which have the keyword "amsterdam" in the "text" field.
"filter": [
{
"match_phrase": {
"text": {
"query": "amsterdam"
}
}
}]
Now I would also like to have a script field in my query, which computes some value based on other fields as well as the query.
So far, I have only found how to access the other fields of a document though, using doc['someOtherField'], for example
"script_fields" : {
"new_field" : {
"script" : {
"lang": "painless",
"source": "if (doc['citizens'].value > 10000) {
return "large";
}
return "small";"
}
}
}
How can I integrate the query term, e.g. if I wanted to add to the if statement "if the query term starts with a-e"?
You're on the right track but script_fields are primarily used to post-process your documents' attributes — they won't help you filter any docs because they're run after the query phase.
With that being said, you can use scripts to filter your documents through script queries. Before you do that, though, you should explore alternatives.
In other words, scripts should be used when all other mechanisms and techniques have been exhausted.
Back to your example. I see three possibilities off the top of my head.
Match phrase prefix queries as a group of bool-should subqueries:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match_phrase_prefix": {
"text_field": "a"
}
},
{
"match_phrase_prefix": {
"text_field": "b"
}
},
{
"match_phrase_prefix": {
"text_field": "c"
}
},
... till the letter "e"
]
}
}
]
}
}
}
A regexp query:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"regexp": {
"text_field": "[a-e].+"
}
}
]
}
}
}
Script queries using .charAt comparisons:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"script": {
"script": {
"source": """
char c = doc['text_field.keyword'].value.charAt(0);
return c >= params.gte.charAt(0) && c <= params.lte.charAt(0);
""",
"params": {
"gte": "a",
"lte": "e"
}
}
}
}
]
}
}
}
If you're relatively new to ES and would love to see real-world examples, check out my recently released Elasticsearch Handbook. One chapter is dedicated to scripting and as it turns out, you can achieve a lot with scripts (if of course executed properly).

Elastic search query using python list

How do I pass a list as query string to match_phrase query?
This works:
{"match_phrase": {"requestParameters.bucketName": {"query": "xxx"}}},
This does not:
{
"match_phrase": {
"requestParameters.bucketName": {
"query": [
"auditloggingnew2232",
"config-bucket-123",
"web-servers",
"esbck-essnap-1djjegwy9fvyl",
"tempexpo",
]
}
}
}
match_phrase simply does not support multiple values.
You can either use a should query:
GET _search
{
"query": {
"bool": {
"should": [
{
"match_phrase": {
"requestParameters.bucketName": {
"value": "auditloggingnew2232"
}
}
},
{
"match_phrase": {
"requestParameters.bucketName": {
"value": "config-bucket-123"
}
}
}
]
},
...
}
}
or, as #Val pointed out, a terms query:
{
"query": {
"terms": {
"requestParameters.bucketName": [
"auditloggingnew2232",
"config-bucket-123",
"web-servers",
"esbck-essnap-1djjegwy9fvyl",
"tempexpo"
]
}
}
}
that functions like an OR on exact terms.
I'm assuming that 1) the bucket names in question are unique and 2) that you're not looking for partial matches. If that's the case, plus if there are barely any analyzers set on the field bucketName, match_phrase may not even be needed! terms will do just fine. The difference between term and match_phrase queries is nicely explained here.

How to transform a Kibana query to `elasticsearch_dsl` query

I have a query
GET index/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"key1": "value"
}
},
{
"wildcard": {
"key2": "*match*"
}
}
]
}
}
}
I want to make the same call with elasticsearch_dsl package
I tried with
s = Search(index=index).query({
"bool": {
"should": [
{
"match": {
"key1": "value"
}
},
{
"wildcard": {
"key2": "*match*"
}
}
]
}
})
s.using(self.client).scan()
But the results are not same, am I missing something here
Is there a way to represent my query with elasticsearch_dsl
tried this, no results
s = Search(index=index).query('wildcard', key2='*match*').query('match', key1=value)
s.using(self.client).scan()
it seems to me that you forgot the stars in the query.
s = Search(index=index).query('wildcard', key='*match*').query('match', key=value)
This query worked for me
s = Search(index=index).query('match', key1=value)
.query('wildcard', key2='*match*')
.source(fields)
also, if key has _ like key_1 elastic search behaves differently and query matches results even which do not match your query. So try to choose your key which do not have underscores.

Elasticsearch : filter results based on the date range

I'm using Elasticsearch 6.6, trying to extract multiple results/records based on multiple values (email_address) passed to the query (Bool) on a date range. For ex: I want to extract information about few employees based on their email_address (annie#test.com, charles#test.com, heman#test.com) and from the period i.e project_date (2019-01-01).
I did use should expression but unfortunately it's pulling all the records from elasticsearch based on the date range i.e. it's even pulling other employees information from project_date 2019-01-01.
{
"query": {
"bool": {
"should": [
{ "match": { "email_address": "annie#test.com" }},
{ "match": { "email_address": "chalavadi#test.com" }}
],
"filter": [
{ "range": { "project_date": { "gte": "2019-08-01" }}}
]
}
}
}
I also tried must expression but getting no result. Could you please help me on finding employees using their email_address with the date range?
Thanks in advance.
Should(Or) clauses are optional
Quoting from this article.
"In a query, if must and filter queries are present, the should query occurrence then helps to influence the score. However, if bool query is in a filter context or has neither must nor filter queries, then at least one of the should queries must match a document."
So in your query should is only influencing the score and not actually filtering the document. You must wrap should in must, or move it in filter(if scoring not required).
GET employeeindex/_search
{
"query": {
"bool": {
"filter": {
"range": {
"projectdate": {
"gte": "2019-01-01"
}
}
},
"must": [
{
"bool": {
"should": [
{
"term": {
"email.raw": "abc#text.com"
}
},
{
"term": {
"email.raw": "efg#text.com"
}
}
]
}
}
]
}
}
}
You can also replace should clause with terms clause as in #AlwaysSunny's answer.
You can do it with terms and range along with your existing query inside filter in more shorter way. Your existing query doesn't work as expected because of should clause, it makes your filter weaker. Read more here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
{
"query": {
"bool": {
"filter": [
{
"terms": {
"email_address.keyword": [
"annie#test.com", "chalavedi#test.com"
]
}
},
{
"range": {
"project_date": {
"gte": "2019-08-01"
}
}
}
]
}
}
}

Resources