Query and exclude in Elasticsearch

I'm trying to use the match_phrase_prefix query together with an exclusion, so that it matches all terms except the ones to be excluded. I have it working as a basic URI query, but not as a regular JSON query. How do I convert this URI query into a JSON query?
"http://127.0.0.1:9200/topics/_search?q=name:"
+ QUERY + "* AND !name=" + CURRENT_TAGS
Where CURRENT_TAGS is a list of tags that must not be matched.
This is what I have so far:
{
"query": {
"bool": {
"must": {
"match_phrase_prefix": {
"name": "a"
}
},
"filter": {
"terms": {
"name": [
"apple"
]
}
}
}
}
}
However, when I do this apple is still included in the results. How do I exclude apple?

You are almost there. Use must_not, which is part of the bool query, to exclude the documents you don't want. Below is a working example based on your sample.
Index mapping
{
"mappings": {
"properties": {
"name": {
"type": "text"
}
}
}
}
Index sample docs named apple and amazon (two of the world's biggest companies), both of which match your search prefix :)
Search query to exclude apple
{
"query": {
"bool": {
"must": {
"match_phrase_prefix": {
"name": "a"
}
},
"must_not": {
"match": {
"name": "apple"
}
}
}
}
}
Search results
"hits": [
{
"_index": "matchprase",
"_type": "_doc",
"_id": "2",
"_score": 0.6931471,
"_source": {
"name": "amazon"
}
}
]
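Since CURRENT_TAGS in the question is a list, the same must_not clause can be generated once per tag. The following is a minimal Python sketch, not part of the original answer: the function name build_prefix_exclude_query and its parameters are made up for illustration, and it only builds the request body without contacting a cluster.

```python
import json


def build_prefix_exclude_query(prefix, excluded, field="name"):
    """Build a bool query: prefix match on `field`, minus the excluded values."""
    return {
        "query": {
            "bool": {
                # Documents must match the phrase prefix...
                "must": {"match_phrase_prefix": {field: prefix}},
                # ...and must not match any of the excluded values.
                "must_not": [{"match": {field: value}} for value in excluded],
            }
        }
    }


body = build_prefix_exclude_query("a", ["apple"])
print(json.dumps(body, indent=2))
```

With a single excluded tag this produces the same body as the search query above.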

Related

combine terms and bool query in elasticsearch

I would like to do a search in an elasticsearch index but only for a list of ids. I can select the ids with a terms query
{
"query": {
"terms": {
"_id": list_of_ids
}
}
}
Now I want to search in the resulting list, which can be done with a query like this
{
"query": {
"bool": {
"must": {}
}
}
}
My question is how can I combine those two queries?
One solution I found is to add the ids as term clauses to the bool query like this
{
"query": {
"bool": {
"must": {},
"should": [
{
"term": {
"_id": id1
}
},
{
"term": {
"_id": id2
}
}
]
}
}
}
which works fine. However, if the list of ids is very large it can lead to errors.
elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'failed to create query:
I am wondering whether there is a more compact way to write such a query? I think the error above is caused by my query just being too long since I added thousands of term searches... there must be a way to just provide an array, like in the terms query?
solved it
{
"query": {
"bool": {
"must": {},
"filter": {
"terms": {
"_id": list_of_ids
}
}
}
}
}
sorry I am a bit of a newbie to elasticsearch...
You can also use the ids query, which returns documents based on their IDs.
Adding a working example with index data, search query, and search result.
Index Data:
{
"name":"buiscuit",
"cost":"55",
"discount":"20"
}
{
"name":"multi grain bread",
"cost":"55",
"discount":"20"
}
Search Query:
{
"query": {
"bool": {
"must": {
"match": {
"name": "bread"
}
},
"filter": {
"ids": {
"values": [
"1",
"2",
"4"
]
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65431114",
"_type": "_doc",
"_id": "1",
"_score": 0.5754429,
"_source": {
"name": "multi grain bread",
"cost": "55",
"discount": "20"
}
}
]
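If the ID list is very large, one option is to send it in batches so that each request body stays small. A minimal Python sketch under that assumption: build_ids_query, batched_ids_queries and the batch size of 1000 are hypothetical choices, and the code only builds request bodies.

```python
def build_ids_query(text, ids, field="name"):
    """Full-text match on `field`, restricted to the given document IDs."""
    return {
        "query": {
            "bool": {
                "must": {"match": {field: text}},
                # Filter context: restricts the hits without affecting the score.
                "filter": {"ids": {"values": list(ids)}},
            }
        }
    }


def batched_ids_queries(text, ids, batch_size=1000):
    """Yield one query body per batch of at most `batch_size` IDs."""
    ids = list(ids)
    for start in range(0, len(ids), batch_size):
        yield build_ids_query(text, ids[start:start + batch_size])


bodies = list(batched_ids_queries("bread", [str(i) for i in range(2500)]))
print(len(bodies))
```

The caller would still need to merge the hits returned for each batch.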

How to search array of fields in elasticsearch

I have an index in Elasticsearch called professor.
Across different fields I need an AND condition.
Within the same field (an array of values) I need an OR condition.
I need to search for subject Physics or Accounting; this is an array of values (OR).
I need type to be Permanent (AND).
I need Location to be NY (AND).
There is a chance that type also arrives as a list, e.g. {'type':['Contract','Guest']}.
test = [{'id':1,'name': 'A','subject': ['Maths','Accounting'],'type':'Contract', 'Location':'NY'},
{ 'id':2,'name': 'AB','subject': ['Physics','Engineering'],'type':'Permanent','Location':'NY'},
{'id':3,'name': 'ABC','subject': ['Maths','Engineering'],'type':'Permanent','Location':'NY'}]
My query is below. The third condition (Location) works. How do I add the first and second?
content_search = es.search(index="professor", body={
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": [
{
"term": {
"Location.keyword": "NY"
}
}
]
}
}
})
content_search['hits']['hits']
Expected output is the document with id 2: [{ 'id':2,'name': 'AB','subject': ['Physics','Engineering'],'type':'Permanent','Location':'NY'}]
You need to use a bool query to wrap all your conditions.
Adding a working example with index data (the same as in the question), search query, and search result.
Search Query:
{
"query": {
"bool": {
"must": [
{
"match": {
"type.keyword": "Permanent"
}
},
{
"match": {
"Location.keyword": "NY"
}
}
],
"should": [
{
"match": {
"subject.keyword": "Accounting"
}
},
{
"match": {
"subject.keyword": "Physics"
}
}
],
"minimum_should_match": 1,
"boost": 1.0
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64370980",
"_type": "_doc",
"_id": "2",
"_score": 1.8365774,
"_source": {
"id": 2,
"name": "AB",
"subject": [
"Physics",
"Engineering"
],
"type": "Permanent",
"Location": "NY"
}
}
]
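The "AND across fields, OR within a field" rule can also be expressed with one terms filter per field, which treats a single value and a list of values the same way. This is a swapped-in variant, not the query from the answer above. build_filter_query and matches are hypothetical helpers, the .keyword sub-fields are assumed to exist as in the question, and matches is only a local approximation of the filter logic for checking the expected result.

```python
def build_filter_query(criteria):
    """One `terms` filter per field: OR within a field, AND across fields."""
    filters = []
    for field, wanted in criteria.items():
        values = wanted if isinstance(wanted, list) else [wanted]
        filters.append({"terms": {field + ".keyword": values}})
    return {"query": {"bool": {"filter": filters}}}


def matches(doc, criteria):
    """Local approximation of the filter logic, for sanity-checking only."""
    for field, wanted in criteria.items():
        wanted = wanted if isinstance(wanted, list) else [wanted]
        have = doc[field] if isinstance(doc[field], list) else [doc[field]]
        if not set(wanted) & set(have):
            return False
    return True


test = [
    {"id": 1, "name": "A", "subject": ["Maths", "Accounting"], "type": "Contract", "Location": "NY"},
    {"id": 2, "name": "AB", "subject": ["Physics", "Engineering"], "type": "Permanent", "Location": "NY"},
    {"id": 3, "name": "ABC", "subject": ["Maths", "Engineering"], "type": "Permanent", "Location": "NY"},
]
criteria = {"subject": ["Physics", "Accounting"], "type": "Permanent", "Location": "NY"}

body = build_filter_query(criteria)
matched_ids = [doc["id"] for doc in test if matches(doc, criteria)]
print(matched_ids)
```

Because every clause sits in filter context, no relevance score is computed, which fits a pure yes/no selection like this one.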

Using a Kibana view query from application

I used the following filter and then searched for query string using Lucene to get the view that I was looking for.
{
"query": {
"match": {
"eventSource": {
"query": "ec2.amazonaws.com",
"type": "phrase"
}
}
}
}
I do not want to return event names that start with the word Describe or Get. All other event names from the ec2 event source should be returned.
!(eventName.keyword: Describe* OR eventName.keyword: Get* )
The question is how to combine these 2 search requests into one?
I need to use that query from my application.
Update:
The Inspect menu of the Kibana Discover tab generates this query. I am just trying to rewrite the query_string part with the usual match or match_phrase queries using a boolean OR clause.
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "!(eventName.keyword: Describe* OR eventName.keyword: Get* )",
"analyze_wildcard": true
}
},
{
"match_phrase": {
"eventSource": {
"query": "ec2.amazonaws.com"
}
}
},
{
"range": {
"@timestamp": {
"format": "strict_date_optional_time",
"gte": "2020-07-09T08:39:15.947Z",
"lte": "2020-07-24T08:39:15.947Z"
}
}
}
],
"filter": [],
"should": [],
"must_not": []
}
}
You can use the bool query's must_not clause to exclude the documents you don't want from your search results. You can add as many must_not clauses as you need, and it can all be done in a single query.
Please refer to the examples in the bool query documentation for more info. I created a sample locally to show you the correct query. Note that instead of a wildcard I am using the prefix query, which performs better and serves your use case.
Create index mapping
{
"mappings": {
"properties": {
"eventName": {
"type": "keyword"
}
}
}
}
Index sample doc
{
"eventName" : "Describe the events"
}
{
"eventName" : "the Describe events"
}
{
"eventName" : "Get the event"
}
{
"eventName" : "event Get"
}
Now the search query, which returns only the 2nd and 4th docs, as per your requirement
{
"query": {
"bool": {
"must_not": [
{
"prefix": {
"eventName": "Desc"
}
},
{
"prefix": {
"eventName": "Get"
}
}
]
}
}
}
Search result
"hits": [
{
"_index": "ngramkey",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"eventName": "the Describe events"
}
},
{
"_index": "ngramkey",
"_type": "_doc",
"_id": "4",
"_score": 0.0,
"_source": {
"eventName": "event Get"
}
}
]
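Documents 2 and 4 survive because eventName is a keyword field: the prefix is compared against the start of the whole stored value, not against the individual words in it. Here is a small local Python illustration of that behaviour, using str.startswith as a stand-in for the prefix query. It is an approximation rather than Elasticsearch itself, and it assumes the sample docs were indexed with IDs 1 to 4 in the order shown.

```python
# Sample docs keyed by the IDs they are assumed to have been indexed with.
docs = {
    "1": "Describe the events",
    "2": "the Describe events",
    "3": "Get the event",
    "4": "event Get",
}
excluded_prefixes = ("Desc", "Get")

# Keep a document only if its whole value starts with none of the prefixes.
kept = [doc_id for doc_id, name in docs.items()
        if not name.startswith(excluded_prefixes)]
print(kept)
```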
As suggested by the user "Opster Elasticsearch Ninja", I have merged the must_not bool query into the generated query like this:
{
"query": {
"bool": {
"must": [
{
"bool": {
"must_not": [
{
"prefix": {
"eventName.keyword": "Desc"
}
},
{
"prefix": {
"eventName.keyword": "Get"
}
}
]
}
},
{
"match_phrase": {
"eventSource": {
"query": "ec2.amazonaws.com"
}
}
},
{
"range": {
"@timestamp": {
"format": "strict_date_optional_time",
"gte": "2020-07-09T08:39:15.947Z",
"lte": "2020-07-24T08:39:15.947Z"
}
}
}
],
"filter": [],
"should": [],
"must_not": []
}
}
}
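The nested bool inside must is not strictly required: the prefix clauses can sit directly in the outer must_not, which the generated query leaves empty, and that should select the same documents. A minimal Python sketch of that variant follows. build_event_query and its parameters are hypothetical names, and the time range is left out for brevity.

```python
def build_event_query(event_source, excluded_prefixes, field="eventName.keyword"):
    """Match one event source while excluding event names by prefix."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {"eventSource": {"query": event_source}}}
                ],
                # Each prefix becomes its own must_not clause.
                "must_not": [{"prefix": {field: p}} for p in excluded_prefixes],
            }
        }
    }


body = build_event_query("ec2.amazonaws.com", ["Describe", "Get"])
print(body["query"]["bool"]["must_not"])
```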

Returning documents that match multiple wildcard string queries

I'm new to Elasticsearch and would greatly appreciate help on this.
In the query below I only want the first document to be returned, but instead both documents are returned. How can I write a query that searches for two wildcard strings on two separate fields, but only returns documents that match both?
I think what's being returned currently is score dependent, but I don't need the score.
POST /pr/_doc/1
{
"type": "Type ONE",
"currency":"USD"
}
POST /pr/_doc/2
{
"type": "Type TWO",
"currency":"USD"
}
GET /pr/_search
{
"query": {
"bool": {
"must": [
{
"simple_query_string": {
"query": "Type ON*",
"fields": ["type"],
"analyze_wildcard": true
}
},
{
"simple_query_string": {
"query": "US*",
"fields": ["currency"],
"analyze_wildcard":true
}
}
]
}
}
}
Use the query below, which uses query_string with default_operator set to AND. See the documentation for both for in-depth information and further reading.
Search query
{
"query": {
"query_string": {
"query": "(Type ON*) AND (US*)",
"fields" : ["type", "currency"],
"default_operator" : "AND"
}
}
}
With your sample docs indexed, it returns only the expected doc:
"hits": [
{
"_index": "multiplequery",
"_type": "_doc",
"_id": "1",
"_score": 2.1823215,
"_source": {
"type": "Type ONE",
"currency": "USD"
}
}
]
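query_string also accepts field-scoped terms of the form field:(...), which ties each wildcard to its own field instead of searching both patterns in both fields. A minimal Python sketch of building such a query: build_field_wildcard_query is a hypothetical helper, it does not escape reserved characters, and it has not been run against a live cluster.

```python
def build_field_wildcard_query(patterns):
    """Combine per-field wildcard patterns with AND in one query_string."""
    parts = ["{}:({})".format(field, pattern)
             for field, pattern in patterns.items()]
    return {
        "query": {
            "query_string": {
                "query": " AND ".join(parts),
                # Words inside the parentheses are combined with AND as well.
                "default_operator": "AND",
            }
        }
    }


body = build_field_wildcard_query({"type": "Type ON*", "currency": "US*"})
print(body["query"]["query_string"]["query"])
```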

How to do a wildcard or regex match on _id in elasticsearch?

From below sample elasticsearch data I want to apply wildcard say *.000ANT.* on _id so as to fetch all docs whose _id contains 000ANT. Please help.
"hits": [
{
"_index": "data_collector",
"_type": "agents",
"_id": "Org000LAN_example1.com",
"_score": 1,
"fields": {
"host": [
"example1.com"
]
}
},
{
"_index": "data_collector",
"_type": "agents",
"_id": "000BAN_example2.com",
"_score": 1,
"fields": {
"host": [
"example2.com"
]
}
},
{
"_index": "data_collector",
"_type": "agents",
"_id": "000ANT_example3.com",
"_score": 1,
"fields": {
"host": [
"example3.com"
]
}
}
]
This is just an extension on Andrei Stefan's answer
{
"query": {
"script": {
"script": "doc['_id'][0].indexOf('000ANT') > -1"
}
}
}
Note: I do not know the performance impact of such a query, most probably this is a bad idea. Use with caution and avoid if possible.
You can use a wildcard query like this, though it's worth noting that it is not advised to start a wildcard term with * as performance will suffer.
{
"query": {
"wildcard": {
"_uid": "*000ANT*"
}
}
}
Also note that if the wildcard term you're searching for matches the type name of your documents, using _uid will not work as intended, because _uid is simply the concatenation of the type and the id: type#id
Try this
{
"filter": {
"bool": {
"must": [
{
"regexp": {
"_uid": {
"value": ".*000ANT.*"
}
}
}
]
}
}
}
Allow your mapping for the id to be indexed:
{
"mappings": {
"agents": {
"_id": {
"index": "not_analyzed"
}
}
}
}
And use a query_string to search for it:
{
"query": {
"query_string": {
"query": "_id:(*000ANT*)",
"lowercase_expanded_terms": false
}
}
}
Or like this (with scripts and still querying only the _id):
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "org.elasticsearch.index.mapper.Uid.splitUidIntoTypeAndId(new org.apache.lucene.util.BytesRef(doc['_uid'].value))[1].utf8ToString().contains('000ANT')"
}
}
}
}
}
You have two options here. The first is partial matching, which is easiest to do by wrapping the search term in wildcards, similar to the other answers. This works on not_analyzed fields and is case sensitive.
POST /my_index/my_type/_search
{
"query": {
"wildcard": {
"_id": {
"value": "*000ANT*"
}
}
}
}
The second option is to use Elasticsearch analyzers and a proper mapping to get the functionality you are looking for; both are described in the Elasticsearch documentation.
The basic premise is that you introduce an analyzer in your mapping whose tokenizer breaks strings down into smaller tokens that can then be matched. A simple query for "000ANT" on the tokenized _id field will then return all results containing that string.
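To make the tokenizer idea concrete, here is a small local Python sketch of the tokens an n-gram style tokenizer would produce. It is an illustration only, not Elasticsearch's implementation. The function name ngrams and the gram length of 6 are assumptions chosen to fit the 000ANT example.

```python
def ngrams(text, size):
    """Return every substring of `text` with exactly `size` characters."""
    return [text[i:i + size] for i in range(len(text) - size + 1)]


ids = ["Org000LAN_example1.com", "000BAN_example2.com", "000ANT_example3.com"]

# An ID matches when the search term is one of its indexed tokens.
hits = [doc_id for doc_id in ids if "000ANT" in ngrams(doc_id, 6)]
print(hits)
```

The trade-off is a larger index in exchange for plain term lookups at query time instead of leading-wildcard scans.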

Resources