ElasticSearch: how to search for %like% values in a field

I have a search query (run from Postman/Chrome) that returns a list of companies, but I need to filter the results by a specific pattern. Which filter should I use, and how?
I need to filter the query results for company_id LIKE %50%.
Here is what I run:
{
"fields": [
"company_id"
],
"query": {
"bool": {
"must": [
{"term": {"app.raw": "AAA"}},
{"wildcard": {"cat.raw": "RS"}}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 5,
"sort": [],
"facets": {}
}
I get back something like:
"hits": [
{
...
"fields": {
"company_id": [
"745"
]
}
},
{
...
"fields": {
"company_id": [
"5056"
]
}
},
{
...
"fields": {
"company_id": [
"7765"
]
}
},
{
...
"fields": {
"company_id": [
"5044"
]
}
},
{
...
"fields": {
"company_id": [
"501"
]
}

First of all, I am not fully sure what issue you are facing. Are you not getting the expected results? Please include the mapping you used, because the right query depends on it.
Wildcard searches are expensive. For partial matching (the equivalent of %like%), you can use an ngram token filter in your analyzer and run a term search. The n-grams take care of matching partial strings.
You can define an analyzer like this:
{
"settings":{
"analysis":{
"analyzer":{
"Like":{
"type":"custom",
"tokenizer":"keyword",
"filter":[ "lowercase", "ngram" ]
}
},
"filter":{
"ngram":{
"type":"ngram",
"min_gram":2,
"max_gram":15
}
}
}
}
}
Then set "Like" (defined above) as the analyzer for cat.raw in your mapping.
Once the field is analyzed with n-grams, you can replace the wildcard clause with a simple term query. A term query is not analyzed, and the analyzer above lowercases its tokens, so pass the cat.raw value in lower case:
"query": {
"bool": {
"must": [
{"term": {"app.raw": "AAA"}},
{"term": {"cat.raw": "rs"}}
],
"must_not": [],
"should": []
}
}
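To make the matching behaviour concrete, here is a minimal Python sketch that approximates what the "Like" analyzer emits (keyword tokenizer, then lowercase, then ngram with min_gram 2 and max_gram 15) and how a term lookup behaves against those tokens. The function names and the sample value "CARS" are illustrative assumptions, and this is a simulation rather than Elasticsearch itself.

```python
def like_analyzer(value, min_gram=2, max_gram=15):
    """Approximate keyword tokenizer + lowercase filter + ngram filter."""
    token = value.lower()  # the keyword tokenizer keeps the whole value as one token
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def term_matches(indexed_value, term):
    """A term query looks the term up verbatim among the indexed tokens."""
    return term in like_analyzer(indexed_value)


# "CARS" is a hypothetical cat.raw value used only for illustration.
print(sorted(like_analyzer("CARS")))
print(term_matches("CARS", "rs"))  # lowercased substring
print(term_matches("CARS", "RS"))  # not lowercased, so no indexed token equals it
```

Every substring of 2 to 15 characters becomes its own token at index time, which is why a cheap exact-token lookup can stand in for a wildcard scan at query time.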
EDIT: updated answer based on comment
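OK, now it is clear what you want to do.
One way is to map company_id as a string and use a prefix query, which matches values that start with 50. The cat.raw value is lowercased here because term queries are not analyzed and the "Like" analyzer lowercases its tokens:
"query": {
"bool": {
"must": [
{"term": {"app.raw": "AAA"}},
{"term": {"cat.raw": "rs"}},
{"prefix":{"company_id": "50"}}
],
"must_not": [],
"should": []
}
}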
Another alternative is to apply an edge n-gram (edge_ngram) token filter to company_id in your analyzer and use a term filter/query.
Note: for searching cat.raw, it is still better to use a term query against an analyzer with ngram than a wildcard query.
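As a rough illustration of the two options for company_id, the Python sketch below simulates a prefix match and a term lookup against edge n-grams, using the IDs from the question's sample output. The helper names and the min_gram/max_gram values are assumptions, and the code only mimics the matching logic.

```python
# company_id values taken from the sample hits in the question.
company_ids = ["745", "5056", "7765", "5044", "501"]


def prefix_query(values, prefix):
    """Mimic a prefix query: keep values that start with the prefix."""
    return [v for v in values if v.startswith(prefix)]


def edge_ngrams(value, min_gram=1, max_gram=10):
    """Mimic an edge n-gram filter: leading substrings of the value."""
    longest = min(max_gram, len(value))
    return [value[:n] for n in range(min_gram, longest + 1)]


def term_on_edge_ngrams(values, term):
    """Mimic a term query against a field indexed with edge n-grams."""
    return [v for v in values if term in edge_ngrams(v)]


print(prefix_query(company_ids, "50"))
print(edge_ngrams("5056"))
print(term_on_edge_ngrams(company_ids, "50"))
```

Both approaches select the same documents. The difference is where the work happens: the prefix query scans terms at search time, while edge n-grams move that cost to index time. Neither matches a value such as "7650", where 50 appears later in the string, so a true %50% match would need the full ngram filter.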

Related

ElasticSearch query is not working with only 2 characters

I have a field with this mapping definition
identifierNumber: {
type: "keyword",
fields: { text: { type: "text" } },
},
The values of this field look like 22-001, 22-002, etc.
I am making the following query to ElasticSearch
{
"query": {
"bool": {
"filter": [
{
"term": {
"status": "NEW"
}
}
],
"must": [
{
"simple_query_string": {
"query": "22 22~",
"fields": [
"title^3",
"identifierNumber^2"
],
"lenient": true
}
}
]
}
},
"sort": []
}
This query returns 0 results.
Changing the simple_query_string query to 22001 or 22-001 returns relevant results.
Can someone explain why the original query with only 2 characters does not work?
I think you need to add the field "identifierNumber.text" to the simple_query_string clause:
"simple_query_string": {
"query": "22 22~",
"fields": [
"title^3",
"identifierNumber.text"
],
"lenient": true
}
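The reason the sub-field helps can be sketched in Python: a keyword field indexes the whole value as a single term, while a text field is split into smaller terms. The tokenizer below is a deliberately simplified stand-in for the standard analyzer (it just splits on non-alphanumeric characters), and the function names are assumptions for illustration.

```python
import re


def keyword_terms(value):
    """A keyword field indexes the whole value as one term."""
    return [value]


def text_terms(value):
    """Simplified stand-in for the standard analyzer:
    split on non-alphanumeric characters and lowercase."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", value) if t]


def term_hits(terms, query_term):
    """A lookup succeeds only if the query term equals an indexed term."""
    return query_term in terms


value = "22-001"
print(keyword_terms(value), term_hits(keyword_terms(value), "22"))
print(text_terms(value), term_hits(text_terms(value), "22"))
print(term_hits(keyword_terms(value), "22-001"))
```

Under this approximation, "22" is never a complete term of identifierNumber, but it is a term of identifierNumber.text.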

Elasticsearch Query NOT searching in the specified fields

I am struggling with an Elasticsearch query. In the fields option we have specified '*', which should make it search all fields, while also giving higher weights to a few fields. But it isn't working as it should.
This query was written by my colleague, so it would be great if you could explain it as well as point out the solution. Here's my query:
{
"query": {
"bool": {
"must": [
{
"simple_query_string": {
"query": "Atoms for Peace",
"default_operator": "AND",
"flags": "PREFIX|PHRASE|NOT|AND|OR|FUZZY|WHITESPACE",
"fields": [
"*",
"systemNumber^5",
"global_search",
"objectType^2",
"partTypes.text",
"partTypes.id",
"gs_am_people^2",
"gs_am_person^2",
"gs_am_org^2",
"gs_title^2",
"_currentLocation.displayName",
"briefDescription",
"physicalDescription",
"summaryDescription",
"_flatPersonsNameId",
"_flatPeoplesNameId",
"_flatOrganisationsNameId",
"_primaryDate",
"_primaryDateEarliest",
"_primaryDateLatest"
]
}
}
]
}
}
}
Your query is fine, but it will not work on fields with the "nested" data type.
From the documentation:
Searching across all eligible fields does not include nested documents. Use a nested query to search those documents.
You need to add a nested query:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"simple_query_string": {
"query": "Atoms for Peace",
"default_operator": "AND",
"flags": "PREFIX|PHRASE|NOT|AND|OR|FUZZY|WHITESPACE",
"fields": [
"*",
"systemNumber^5",
"global_search",
"objectType^2",
"partTypes.text",
"partTypes.id",
"gs_am_people^2",
"gs_am_person^2",
"gs_am_org^2",
"gs_title^2",
"_currentLocation.displayName",
"briefDescription",
"physicalDescription",
"summaryDescription",
"_flatPersonsNameId",
"_flatPeoplesNameId",
"_flatOrganisationsNameId",
"_primaryDate",
"_primaryDateEarliest",
"_primaryDateLatest"
]
}
},
{
"nested": {
"path": "record",
"query": {
"simple_query_string": {
"query": "Atoms for Peace",
"default_operator": "AND",
"flags": "PREFIX|PHRASE|NOT|AND|OR|FUZZY|WHITESPACE",
"fields": [
"*"
]
}
}
}
}
]
}
}
}
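Since the same simple_query_string settings appear twice in the body above, one way to keep them in sync is to build the request programmatically. The Python sketch below does that with plain dictionaries. The helper names are assumptions, the field list is shortened for brevity, and "record" is the nested path used in the answer above.

```python
import json

FLAGS = "PREFIX|PHRASE|NOT|AND|OR|FUZZY|WHITESPACE"


def simple_query(text, fields):
    """Build one simple_query_string clause with the shared settings."""
    return {
        "simple_query_string": {
            "query": text,
            "default_operator": "AND",
            "flags": FLAGS,
            "fields": fields,
        }
    }


def top_level_or_nested(text, fields, nested_path):
    """Match either the top-level fields or the nested documents."""
    return {
        "query": {
            "bool": {
                "minimum_should_match": 1,
                "should": [
                    simple_query(text, fields),
                    {"nested": {"path": nested_path,
                                "query": simple_query(text, ["*"])}},
                ],
            }
        }
    }


# Shortened field list, for illustration only.
body = top_level_or_nested("Atoms for Peace", ["*", "systemNumber^5"], "record")
print(json.dumps(body, indent=2))
```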

Elastic : search two terms, one on _all, other one on a field

I would like to mix a search on a whole document (eg "developer") and a search on some field for another term (eg "php").
I can do each search separately but I can't mix them.
Here is my example (simplified to show only my issue):
{
"query": {
"function_score": {
"query": {
"match": {
"_all": "developer"
},
"multi_match": {
"query": "php",
"fields": [
"skills.description",
"skills.description",
"skills.details"
],
"operator": "or",
"type": "most_fields"
}
}
}
}
If I run this example, I get an error:
Parse Failure [Failed to parse source
Is there a way to search on both _all and specific fields with two terms?
Thanks.
Yes, you're almost there, you need to combine them into a bool/must query:
{
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"match": {
"_all": "developer"
}
},
{
"multi_match": {
"query": "php",
"fields": [
"skills.description",
"skills.description",
"skills.details"
],
"operator": "or",
"type": "most_fields"
}
}
]
}
}
}
}
}
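The original request fails because a query object may hold only one query type, and it placed match and multi_match side by side. The Python sketch below captures that rule with a small helper that wraps any number of single-type clauses in bool/must. The helper name is an assumption, and the check is a simplified client-side guard, not Elasticsearch's own validation.

```python
def bool_must(*clauses):
    """Combine clauses with AND semantics; each clause must name exactly one query type."""
    for clause in clauses:
        if len(clause) != 1:
            raise ValueError("each clause needs exactly one query type: %r" % (clause,))
    return {"bool": {"must": list(clauses)}}


match_all_fields = {"match": {"_all": "developer"}}
match_skills = {
    "multi_match": {
        "query": "php",
        "fields": ["skills.description", "skills.details"],
        "operator": "or",
        "type": "most_fields",
    }
}

body = {"query": {"function_score": {"query": bool_must(match_all_fields, match_skills)}}}
print(body)
```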

Deep Elasticsearch wildcard query for given time range and with AND operator

I managed to build a query that matches all exact values in a given time range and supports the AND operator.
Now I want to extend the query to support partial matching, but I am struggling to do that. Any advice would be appreciated.
Mapping
"event": {
"properties": {
"alarmId": {
"type": "string",
"index": "not_analyzed"
},
"startTimestamp": {
"type": "long"
},
...
}
}
Current query
{
"bool": {
"must":[
{"range": {"endTimestamp": {"gte": ?0}}},
{"range": {"startTimestamp": {"lte": ?1}}}
],
"should": [
{"match": {"_all": {"query": "?2", "zero_terms_query": "all", "operator": "and"}}}
],
"minimum_should_match" : 1
}
}
Answer (thanks to Val)
{
"bool": {
"must":[
{"range": {"endTimestamp": {"gte": ?0}}},
{"range": {"startTimestamp": {"lte": ?1}}}
],
"should": [
{"query_string": {"query": "?2"}}
],
"minimum_should_match" : 1
}
}
Now the query supports grouping, wildcards, and more.
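As a sketch of how the placeholders might be filled in, the Python function below builds the same bool query from three arguments standing in for ?0, ?1 and ?2. The function name and sample values are assumptions. The "default_operator" setting is not part of the answer above: query_string combines terms with OR unless told otherwise, so it is added here on the assumption that the earlier AND behaviour should be kept.

```python
def events_query(range_start, range_end, text, default_operator="AND"):
    """Events overlapping [range_start, range_end] that match the query string."""
    return {
        "bool": {
            "must": [
                {"range": {"endTimestamp": {"gte": range_start}}},   # ?0
                {"range": {"startTimestamp": {"lte": range_end}}},   # ?1
            ],
            "should": [
                # ?2, with wildcards and grouping allowed by query_string syntax
                {"query_string": {"query": text,
                                  "default_operator": default_operator}},
            ],
            "minimum_should_match": 1,
        }
    }


# Hypothetical timestamps and search text, for illustration only.
query = events_query(1000, 2000, "alarm* (critical OR major)")
print(query)
```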

Minimum should match on filtered query

Is it possible to have a query like this
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and"
}
}
}
}
with a "minimum_should_match": "2" setting?
I know that I can use a plain query (I've tried it, and it works), but I don't need the score to be computed. My goal is just to filter documents that contain 2 of the values.
Does scoring generally have a heavy impact on the time needed to retrieve documents?
Using this query:
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and",
"minimum_should_match": "2"
}
}
}
}
I got this error:
QueryParsingException[[my_db] [terms] filter does not support [minimum_should_match]]
Minimum should match is not a parameter for the terms filter. If that is the functionality you are looking for, I might rewrite your query like this, to use the bool query wrapped in a query filter:
{
"filter": {
"query": {
"bool": {
"should": [
{
"term": {
"names": "Anna"
}
},
{
"term": {
"names": "Mark"
}
},
{
"term": {
"names": "Joe"
}
}
],
"minimum_should_match": 2
}
}
}
}
This matches documents that contain at least two of the three names, including documents that contain all three. (In a bool query, must clauses act as an implicit AND; should combined with minimum_should_match relaxes that to "at least N of these".) No score is computed, because the query is executed as a filter.
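The Python sketch below generates the should clauses from a list of names instead of writing each term by hand, and includes a small local check that mirrors the "at least N of these" rule. Both helper names and the sample documents are assumptions, and the check simulates the semantics rather than calling Elasticsearch.

```python
def at_least_n_terms(field, values, minimum):
    """Build a bool query that requires at least `minimum` of the values to match."""
    return {
        "bool": {
            "should": [{"term": {field: value}} for value in values],
            "minimum_should_match": minimum,
        }
    }


def would_match(doc_values, values, minimum):
    """Local simulation: count how many of the wanted values the document holds."""
    return len(set(doc_values) & set(values)) >= minimum


names = ["Anna", "Mark", "Joe"]
body = {"filter": {"query": at_least_n_terms("names", names, 2)}}
print(body)

print(would_match(["Anna", "Joe", "Sue"], names, 2))  # two of three present
print(would_match(["Anna", "Sue"], names, 2))         # only one present
```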
