Elasticsearch: restrict results to documents with exact match

I am currently trying to restrict the results of Elasticsearch (5.4) with the following query:
{
"query": {
"bool": {
"must": {
"multi_match": {
"query": "apache log Linux",
"type": "most_fields",
"fields": [
"message",
"type"
]
}
},
"filter": {
"term": {
"client": "test"
}
}
}
}
}
This returns every document that contains "apache", "log", or "linux". I want to restrict the results to documents whose "client" field has exactly the specified value, in this case "test". However, this query returns all documents that contain "test" in that field, so a document with "client": "test client" is also returned.
I want the restriction to be exact, so only documents with "client": "test" should be returned, and not those with "client": "test client".
After testing a bunch of different queries and lots of searching, I cannot find a solution to my problem. What am I missing?

Just use the keyword sub-field of your client field. Since this is 5.x, a field that was mapped dynamically already has a client.keyword sub-field by default:
"filter": {
"term": {
"client.keyword": "test"
}
}
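To see why the term filter on client behaves differently from the one on client.keyword, the difference can be modelled in a few lines of Python. This is only a rough sketch: analyze_text is a hypothetical stand-in that approximates the standard analyzer by lowercasing and splitting on non-alphanumeric characters; it is not Elasticsearch code.

```python
import re


def analyze_text(value):
    # Rough approximation of an analyzed text field (assumption):
    # lowercase the value and split it into word tokens.
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]


def term_matches_text(value, term):
    # A term query on a text field compares the term against each token.
    return term in analyze_text(value)


def term_matches_keyword(value, term):
    # A term query on a keyword field compares against the whole value.
    return value == term


docs = ["test", "test client"]
text_hits = [d for d in docs if term_matches_text(d, "test")]
keyword_hits = [d for d in docs if term_matches_keyword(d, "test")]

print(text_hits)     # ['test', 'test client']
print(keyword_hits)  # ['test']
```

In this model the text field matches both documents because "test" is one of the tokens of "test client", while the keyword field only matches the document whose whole value is "test".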

Set a mapping on your index that declares your client field as a keyword datatype.
The mapping request could look like this:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"client": {
"type": "keyword"
}
}
}
}
}

Related

How does multi field mapping work in Elastic Search

I want to support both text search (match query) as well as exact match (term query) on a single field in my elasticsearch index.
Following is the mapping that I have created:
PUT multi_mapping_test/_mapping
{
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
However, the term query is not behaving as I expect it to (maybe my understanding of it is wrong).
For example, here are a couple of sample documents that I indexed:
POST multi_mapping_test/_doc
{
"name": "abc llc"
}
POST multi_mapping_test/_doc
{
"name": "def llc"
}
The following term query yields no results:
GET multi_mapping_test/_search
{
"query": {
"term": {
"name": {
"value": "abc llc"
}
}
}
}
Am I doing anything wrong or is my understanding of exact matches with term query incorrect?
P.S. The term query works fine when I put mapping for only keyword type.
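Term query: Returns documents that contain an exact term in a provided field.
The name field is of type text, so "abc llc" is analyzed into the separate tokens "abc" and "llc", and the term "abc llc" matches neither of them. When you are searching for an exact match, you should query the keyword sub-field, like the following: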
GET multi_mapping_test/_search
{
"query": {
"term": {
"name.keyword": {
"value": "abc llc"
}
}
}
}
In addition, you can use a bool query to combine text search (match query) and exact match (term query) on your Elasticsearch index.
GET multi_mapping_test/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"name": "abc llc"
}
},
{
"term": {
"name.keyword": {
"value": "abc llc"
}
}
}
],
"minimum_should_match": 1
}
}
}
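If the same field and value are queried in several places, the request body above can be generated instead of written by hand. A minimal Python sketch follows; the helper name text_and_exact_query and the keyword_subfield parameter are made up for illustration, and the sketch only builds the dictionary, it does not send anything to a cluster.

```python
import json


def text_and_exact_query(field, value, keyword_subfield="keyword"):
    # Combine a full-text match on the text field with an exact term
    # match on its keyword sub-field; at least one clause must match.
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {field: value}},
                    {"term": {field + "." + keyword_subfield: {"value": value}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }


body = text_and_exact_query("name", "abc llc")
print(json.dumps(body, indent=2))
```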
Note: You can also use the match_bool_prefix query if you need an autocomplete feature.
Details: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-bool-prefix-query.html
The _score of "abc llc" will be higher than that of "def llc" because it matches both the match and the term query.

Elasticsearch 5.X Percolate: How to autogenerate copy_to fields?

In ES 2.3.3, many queries in the system I'm working on use the _all field. Sometimes these are registered to a percolate index, and when running percolator on the doc, _all is generated automatically.
In converting to ES 5.X, _all is being deprecated, so _all has been replaced with a copy_to field that contains the components we actually care about, and it works great for those searches.
Registering the same query to a percolate index with the same document mapping, including the copy_to fields, works fine. However, sending a percolate query with the document never results in a hit on a copy_to field.
Manually building the copy_to field via simple string concatenation seems to work, but I'd expect Query -> DocIndex and Doc -> PercolateQuery to give the same result. So I'm looking for a way to have ES generate the copy_to fields automatically on a document being percolated.
It turned out there was nothing wrong with ES, of course; I'm posting here in case it helps someone else. I figured it out while building a simpler example to post here. The issue came down to this: percolating a document of a type that doesn't exist in the percolate index returns no error, but ES seems to apply all percolate queries without applying any mappings. That was confusing because it worked for simple test cases but not for complex ones. Here's an example:
From the copy_to docs, generate an index with a copy_to mapping. See that a query to the copy_to field works.
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
}
PUT my_index/my_type/1
{
"first_name": "John",
"last_name": "Smith"
}
GET my_index/_search
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
Create a percolate index with the same type
PUT /my_percolate_index
{
"mappings": {
"my_type": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
},
"queries": {
"properties": {
"query": {
"type": "percolator"
}
}
}
}
}
Register a percolate query that mirrors the earlier search on the copy_to field, and a second query that only matches on a basic, unmodified field
PUT /my_percolate_index/queries/1?refresh
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
PUT /my_percolate_index/queries/2?refresh
{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}
Search, but with the wrong type... there will be a hit on the basic field (first_name: John) even though no document mappings match the request
GET /my_percolate_index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document_type" : "non_type",
"document" : {
"first_name": "John",
"last_name": "Smith"
}
}
}
}
{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":1,"max_score":0.2876821,"hits":[{"_index":"my_percolate_index","_type":"queries","_id":"2","_score":0.2876821,"_source":{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}}]}}
Send in the correct document_type and see both matches as expected
GET /my_percolate_index/_search
{
"query" : {
"percolate" : {
"field" : "query",
"document_type" : "my_type",
"document" : {
"first_name": "John",
"last_name": "Smith"
}
}
}
}
{"took":7,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":2,"max_score":0.51623213,"hits":[{"_index":"my_percolate_index","_type":"queries","_id":"1","_score":0.51623213,"_source":{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}},{"_index":"my_percolate_index","_type":"queries","_id":"2","_score":0.2876821,"_source":{
"query": {
"match": {
"first_name": {
"query": "John"
}
}
}
}}]}}

Case insensitive query

I'm using Elasticsearch, and I have this field:
"name": {
"type": "string",
"index": "not_analyzed"
},
I run this query to get, for example, all employees with the name "Charles":
GET company_employee/employee/_search
{
"query": {
"bool": {
"must": [
{
"match_phrase": {
"name": "Charles"
}
}
]
}
}
}
The issue with this is that I need a case-insensitive search. I need to retrieve all "Charles", even if the value I provide to the query is ChaRleS, charles, CHarles, etc. What do I need to do?
If reindexing is not an option, this leaves altering your query.
Although the regexp approach doesn't allow for case-insensitive searching as such, you can spell out both cases "manually".
If only the first letter can vary in case, you can get by with this:
GET company_employee/employee/_search
{
"query": {
"regexp": { "name": "[Cc]harles" }
}
}
Otherwise, for "true" case-insensitivity:
GET company_employee/employee/_search
{
"query": {
"regexp": { "name": "[Cc][Hh][Aa][Rr][Ll][Ee][Ss]" }
}
}
This is in no way efficient, but matches your constraint of not altering the index.
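Writing such a pattern by hand gets tedious for longer names, so it can be generated. The following Python sketch is one way to do it; case_insensitive_regexp is a hypothetical helper name, it only handles ASCII letters, and the RESERVED set reflects my understanding of the characters the regexp query treats as special, so treat that list as an assumption.

```python
# Characters assumed to be reserved in the regexp query syntax.
RESERVED = set('.?+*|{}[]()"\\#@&<>~')


def case_insensitive_regexp(value):
    parts = []
    for ch in value:
        if ch.isascii() and ch.isalpha():
            # Accept either case for each ASCII letter, e.g. "c" -> "[Cc]".
            parts.append("[" + ch.upper() + ch.lower() + "]")
        elif ch in RESERVED:
            # Escape special characters so they match literally.
            parts.append("\\" + ch)
        else:
            parts.append(ch)
    return "".join(parts)


pattern = case_insensitive_regexp("Charles")
body = {"query": {"regexp": {"name": pattern}}}

print(pattern)  # [Cc][Hh][Aa][Rr][Ll][Ee][Ss]
```

The resulting body has the same shape as the regexp query above and carries the same performance caveat.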
Ideally, your name field should be analyzed, i.e. mapped like this:
"name": {
"type": "text",
"analyzer": "standard"
}
With such a mapping, all your values are lowercased at index time, and the query text is analyzed the same way, so the search becomes case-insensitive.

Exact match in elastic search query

I want to exactly match the string ":Feed:" in a message field and pull all such records from the last day. The JSON I have also seems to match the plain word " feed ". I am not sure where I am going wrong. Do I need to add "constant_score" to this query JSON? The JSON I currently have is shown below:
{
"query": {
"bool": {
"must": {
"query_string": {
"fields": ["message"],
"query": "\\:Feed\\:"
}
},
"must": {
"range": {
"timestamp": {
"gte": "now-1d",
"lte": "now"
}
}
}
}
}
}
As stated here: Finding Exact Values. Since the field was analyzed when it was indexed, you have no way of exact-matching characters such as ":" that the analyzer did not keep as tokens. Whenever such characters should be searchable, the mapping should be "not_analyzed" and the data needs to be re-indexed.
If you want to be able to easily match only ":feed:" inside the message field, you might want to customize an analyzer that doesn't split on ":", so that you can query the field with a simple "match" query instead of wildcards.
I was not able to do this with query_string, but managed to do so by creating a custom normalizer and then using a "match" or "term" query.
The following steps worked for me.
Create a custom normalizer (available in 5.2 and later)
"settings": {
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"filter": ["lowercase"]
}
}
}
}
Create a mapping with a "keyword" sub-field that uses the normalizer
{
"mappings": {
"default": {
"properties": {
"title": {
"type": "text",
"fields": {
"normalize": {
"type": "keyword",
"normalizer": "my_normalizer"
},
"keyword" : {
"type": "keyword"
}
}
}
}
}
}
Use a match or term query
{
"query": {
"bool": {
"must": [
{
"match": {
"title.normalize": "string to match"
}
}
]
}
}
}
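The effect of the three steps above can be modelled in plain Python. This is a sketch under assumptions, not Elasticsearch code: my_normalizer stands in for the custom normalizer with its lowercase filter, and the match query is assumed to pass the search text through the same normalizer before comparing whole values.

```python
def my_normalizer(value):
    # The lowercase filter is applied to the whole value; unlike an
    # analyzer, a normalizer does not split the value into tokens.
    return value.lower()


def matches_normalized_keyword(stored_value, search_text):
    # Both sides are normalized, then compared as complete strings.
    return my_normalizer(stored_value) == my_normalizer(search_text)


titles = ["String To Match", "string to match", "another string to match"]
hits = [t for t in titles if matches_normalized_keyword(t, "STRING to MATCH")]

print(hits)  # ['String To Match', 'string to match']
```

In this model the comparison ignores case but still requires the whole value to be equal, which is why "another string to match" is not a hit.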
Use a match_phrase query:
GET /_search
{
"query": {
"match_phrase": {
"message": "7000-8900"
}
}
}
In Java, use matchPhraseQuery from QueryBuilders:
QueryBuilders.matchPhraseQuery(fieldName, searchText);
A simple solution:
Use a term query.
GET /_search
{
"query": {
"term": {
"message.keyword": "7000-8900"
}
}
}
Use a term query instead of match_phrase.
match_phrase matches the words of the sentence stored in the document, in order. It is not an exact match on the whole field value: any document whose text contains those words as a phrase will match.
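The difference between the two suggestions can be illustrated with a small Python model. It is a rough sketch: tokens approximates the analysis of a text field by lowercasing and keeping runs of letters and digits (so "7000-8900" becomes the two tokens "7000" and "8900"), and the sample messages are invented for the example.

```python
import re


def tokens(value):
    # Approximation of an analyzed text field (assumption).
    return re.findall(r"[0-9a-z]+", value.lower())


def phrase_matches(stored, phrase):
    # match_phrase model: the phrase tokens must appear consecutively
    # and in order somewhere in the stored value.
    doc, want = tokens(stored), tokens(phrase)
    n = len(want)
    return n > 0 and any(doc[i:i + n] == want for i in range(len(doc) - n + 1))


def term_matches_keyword(stored, term):
    # term on a keyword sub-field: the whole value must be identical.
    return stored == term


messages = ["7000-8900", "range 7000-8900 reserved"]
phrase_hits = [m for m in messages if phrase_matches(m, "7000-8900")]
term_hits = [m for m in messages if term_matches_keyword(m, "7000-8900")]

print(phrase_hits)  # ['7000-8900', 'range 7000-8900 reserved']
print(term_hits)    # ['7000-8900']
```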

I don't get any documents back from my elasticsearch query. Can someone point out my mistake?

I thought I had figured out Elasticsearch but I suspect I have failed to grok something, and hence this problem:
I am indexing products, which have a huge number of fields, but the ones in question are:
{
"show_in_catalogue": {
"type": "boolean",
"index": "no"
},
"prices": {
"type": "object",
"dynamic": false,
"properties": {
"site_id": {
"type": "integer",
"index": "no"
},
"currency": {
"type": "string",
"index": "not_analyzed"
},
"value": {
"type": "float"
},
"gross_tax": {
"type": "integer",
"index": "no"
}
}
}
}
I am trying to return all documents where "show_in_catalogue" is true, and there is a price with site_id 1:
{
"filter": {
"term": {
"prices.site_id": "1",
"show_in_catalogue": true
}
},
"query": {
"match_all": {}
}
}
This returns zero results. I also tried an "and" filter with two separate terms - no luck.
A subset of one of the documents returned if I have no filters looks like:
{
"prices": [
{
"site_id": 1,
"currency": "GBP",
"value": 595,
"gross_tax": 1
},
{
"site_id": 2,
"currency": "USD",
"value": 745,
"gross_tax": 0
}
]
}
I hope I am OK to omit so much of the document here; I don't believe it to be contingent but I cannot be certain, of course.
Have I missed a vital piece of knowledge, or have I done something terminally thick? Either way, I would be grateful for an expert's knowledge at this point. Thanks!
Edit:
At the suggestion of J.T. I also tried reindexing the documents so that prices.site_id was indexed - no change. Also tried the bool/must filter below to no avail.
To clarify, the reason I'm using an empty query is that the web interface may supply a query string, but the same code is used to simply filter all products. Hence I left in the query, but empty, since that's what Elastica seems to produce with no query string.
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"term": {
"show_in_catalogue": true
}
},
{
"term": {
"prices.site_id": 1
}
}
]
}
}
}
}
}
You have site_id set as {"index": "no"}. This tells Elasticsearch to exclude the field from the index, which makes it impossible to query or filter on that field. The data will still be stored. Likewise, you can set a field to only be in the index and searchable, but not stored. Note that show_in_catalogue has the same setting in your mapping.
I'm new to Elasticsearch as well and can't always grok the questions! I'm actually confused by your query. If you are going to "just filter", then you don't need a query. What I don't understand is your use of two fields inside the term filter. I've never done this. I guess it acts as an OR? Also, if nothing matches, it seems to return everything. If you wanted a query with the results of that query filtered, then you would want to use a filtered query:
-d '{
"query": {
"filtered": {
"query": {},
"filter": {}
}
}
}'
If you just want to apply filters, this is the filter that should work, without any "query" necessary:
-d '{
"filter": {
"bool": {
"must": [
{
"term": {
"show_in_catalogue": true
}
},
{
"term": {
"prices.site_id": 1
}
}
]
}
}
}'
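The bool/must filter above can also be generated from a plain mapping of field names to values. The sketch below is in Python; bool_term_filter is a hypothetical helper, and it reproduces the old-style top-level "filter" body used in this answer. It only builds the body, and the filter can only match once the fields involved are actually indexed.

```python
import json


def bool_term_filter(terms):
    # One term clause per field; all of them must match (logical AND).
    return {
        "filter": {
            "bool": {
                "must": [{"term": {field: value}} for field, value in terms.items()]
            }
        }
    }


body = bool_term_filter({"show_in_catalogue": True, "prices.site_id": 1})
print(json.dumps(body, indent=2))
```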
