How does multi-field mapping work in Elasticsearch?

I want to support both text search (match query) as well as exact match (term query) on a single field in my elasticsearch index.
Following is the mapping that I have created:
PUT multi_mapping_test/_mapping
{
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
However, the term query is not behaving as I expect (maybe my understanding of it is wrong).
For example, here are a couple of sample documents I indexed:
POST multi_mapping_test/_doc
{
"name": "abc llc"
}
POST multi_mapping_test/_doc
{
"name": "def llc"
}
The following term query yields no results:
GET multi_mapping_test/_search
{
"query": {
"term": {
"name": {
"value": "abc llc"
}
}
}
}
Am I doing anything wrong or is my understanding of exact matches with term query incorrect?
P.S. The term query works fine when I map the field as keyword type only.

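The term query returns documents that contain an exact term in the given field, and it does not analyze the search value.
Your name field is of type text, so "abc llc" is analyzed at index time (with the default standard analyzer, into the terms abc and llc), and no single indexed term equals "abc llc". When you are searching for an exact match, query the keyword sub-field instead, like the following: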
GET multi_mapping_test/_search
{
"query": {
"term": {
"name.keyword": {
"value": "abc llc"
}
}
}
}
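To illustrate why the lookup on name finds nothing while name.keyword does, here is a small Python sketch. It is not Elasticsearch code: it assumes a simplified analyzer that only lowercases and splits on whitespace (the real standard analyzer does more), and the helper names analyze_text, build_indexes, and term_query are made up for this example.

```python
from collections import defaultdict


def analyze_text(value):
    # Simplified stand-in for the standard analyzer (assumption):
    # lowercase the value and split it on whitespace.
    return value.lower().split()


def build_indexes(docs):
    """Return (text_index, keyword_index), each mapping term -> set of doc ids."""
    text_index = defaultdict(set)     # models the analyzed `name` field
    keyword_index = defaultdict(set)  # models the `name.keyword` sub-field
    for doc_id, name in docs.items():
        for token in analyze_text(name):
            text_index[token].add(doc_id)
        keyword_index[name].add(doc_id)  # the whole value is stored as one term
    return text_index, keyword_index


def term_query(index, value):
    # A term query looks the value up as-is, without analyzing it.
    return index.get(value, set())


docs = {1: "abc llc", 2: "def llc"}
text_index, keyword_index = build_indexes(docs)

print(sorted(text_index))                    # ['abc', 'def', 'llc']
print(term_query(text_index, "abc llc"))     # set()
print(term_query(keyword_index, "abc llc"))  # {1}
```

In this model the text index never contains the term "abc llc", only its individual tokens, which is why the exact lookup only succeeds against the keyword index.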
In addition, you can use a bool query to combine text search (match query) and exact match (term query) on the same field:
GET multi_mapping_test/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"name": "abc llc"
}
},
{
"term": {
"name.keyword": {
"value": "abc llc"
}
}
}
],
"minimum_should_match": 1
}
}
}
The _score of "abc llc" will be higher than that of "def llc" because it matches both the match clause and the term clause.
Note: You can also use the match_bool_prefix query if you need an autocomplete feature.
Details: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-bool-prefix-query.html

Related

Elasticsearch match query on each separate value in a multi value field without nested

For a multi-valued field, like this:
PUT match_test
{
"mappings": {
"properties": {
"companies": {
"type": "text"
}
}
}
}
POST match_test/_doc/1
{
"companies": ["bank of canada", "japan games and movies", "microsoft canada"]
}
This query returns the document we inserted above:
GET match_test/_search
{
"query": {
"match": {
"companies": {
"query": "canada games",
"operator": "and"
}
}
}
}
Is there any way to tell Elasticsearch to match against each item in the list separately?
I want the doc to match "bank of", "of America", "bank", "games", "Canada", but not "Microsoft games".
I do not want to use nested documents or scripts.
If you want to find words that are far apart from each other but still within the same array element, you can use position_increment_gap.
When creating the mapping, set position_increment_gap of the field to 100. Elasticsearch then leaves a gap of 100 positions between the last token of one array element and the first token of the next.
Then write a match_phrase query with a slop of 99, which is too small to bridge that gap, so terms from different elements cannot match together:
PUT match_test
{
"mappings": {
"properties": {
"companies": {
"type": "text",
"position_increment_gap": 100
}
}
}
}
GET match_test/_search
{
"query": {
"match_phrase": {
"companies": {
"query": "japan movies",
"slop":99
}
}
}
}
Read more about it here https://www.elastic.co/guide/en/elasticsearch/reference/current/position-increment-gap.html
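To make the position arithmetic concrete, here is a small Python sketch. It is a model, not Elasticsearch code. It assumes whitespace tokenization, that the first token of each later array element lands at the previous token's position + gap + 1, and a simplified rule that an in-order two-word phrase needs a slop equal to the number of positions between the two words. The function names are hypothetical.

```python
def index_positions(values, gap):
    """Assign a position to every token across all array values (assumed model)."""
    positions = []  # list of (token, position)
    pos = -1
    for value_index, value in enumerate(values):
        if value_index > 0:
            pos += gap  # extra jump between two array elements
        for token in value.lower().split():
            pos += 1
            positions.append((token, pos))
    return positions


def required_slop(positions, first, second):
    """Smallest slop for the in-order phrase `first second`, or None if impossible."""
    best = None
    for tok_a, pos_a in positions:
        if tok_a != first:
            continue
        for tok_b, pos_b in positions:
            if tok_b == second and pos_b > pos_a:
                slop = pos_b - pos_a - 1
                if best is None or slop < best:
                    best = slop
    return best


companies = ["bank of canada", "japan games and movies", "microsoft canada"]
positions = index_positions(companies, gap=100)

print(dict(positions)["japan"])                      # 103
print(required_slop(positions, "japan", "movies"))   # 2   (same element, fits in slop 99)
print(required_slop(positions, "canada", "games"))   # 101 (different elements, exceeds 99)
```

Under these assumptions, any pair of words from the same element needs only a small slop, while any pair that spans two elements needs at least 100, so a slop of 99 keeps matches inside one element.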

Elasticsearch fuzzy query and match with fuzziness

So I saw these two queries.
The first one is a match query with the fuzziness option:
{
"query": {
"match": {
"user": {
"query": "ki",
"fuzziness": "AUTO"
}
}
}
}
The second one is a plain fuzzy query:
{
"query": {
"fuzzy": {
"user": {
"value": "ki"
}
}
}
}
The results are pretty much the same. But my question is: do the two queries really do the same thing, and which one is best practice for fuzzy search?
In your example the results are the same. However, the fuzzy query behaves like a term query, so it does not analyze the search value beforehand, whereas the match query does.
So if you searched an address field containing "pigeon street", indexed with the standard analyzer, this query would match:
GET my-index/_search
{
"query": {
"match": {
"address": {
"query": "wigeon street",
"fuzziness": 1
}
}
}
}
but this one would not:
GET my-index/_search
{
"query": {
"fuzzy": {
"address": {
"value": "wigeon street"
}
}
}
}
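The difference can be sketched in a few lines of Python. This is a toy model, not Elasticsearch code: it assumes an analyzer that lowercases and splits on whitespace, uses plain Levenshtein distance (Elasticsearch also counts transpositions by default), and treats the match query as matching when any analyzed token is close enough to an indexed term. All function names are made up for the example.

```python
def levenshtein(a, b):
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def analyze(text):
    # Simplified analyzer (assumption): lowercase and split on whitespace.
    return text.lower().split()


indexed_terms = analyze("pigeon street")  # terms stored for the address field


def match_with_fuzziness(query, max_edits):
    # The match query analyzes the input, then fuzzy-matches each token.
    return any(levenshtein(token, term) <= max_edits
               for token in analyze(query)
               for term in indexed_terms)


def fuzzy(value, max_edits):
    # The fuzzy query compares the whole, unanalyzed value against each term.
    return any(levenshtein(value, term) <= max_edits for term in indexed_terms)


print(match_with_fuzziness("wigeon street", 1))  # True
print(fuzzy("wigeon street", 2))                 # False
```

In the second call the 13-character string "wigeon street" is compared as one term against "pigeon" and "street", so it is far more than two edits away from either.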

Elasticsearch 6.3 query with space in a keyword field and not returning all documents

I have the following part of a mapping:
"name": {
"store": "true",
"type": "keyword"
}
and this query:
{
"query":{
"query_string":{
"query":"+(name:John Doe)",
"fields":[
]
}
},
"aggregations":{
"name":{
"terms":{
"field":"name",
"size":10
}
}
}
}
The query should return over 100 results however it only returns a few. If I add quotes to John Doe like this: \"John Doe\" then it returns all the desired results.
I'm wondering why this happens. Isn't it enough that the field is mapped as keyword, so that John Doe is analyzed as a whole and no quotes need to be added? Also, why would it return fewer items without quotes?
Note: In ES 1.4 the same query seems to work fine (although is not the same data to be honest, and it uses facets instead of aggregations).
The documentation for query string query clearly states:
If the field is a keyword field the analyzer will create a single term ...
So you don't need to add quotes to your search string. Instead, you need to scope your query correctly. Currently your query tries to find the term John in the name field and the term Doe in the default fields rather than in name. So you must rewrite your query in one of the following ways:
Add parentheses around your search terms so the query parser can "understand" that all words must be found in the name field:
{
"query": {
"query_string": {
"query": "+(name:(John Doe))",
"fields": [
]
}
},
"aggregations": {
"name": {
"terms": {
"field": "name",
"size": 10
}
}
}
}
Specify the field name in the fields array rather than in the query string:
{
"query": {
"query_string": {
"query": "+(John Doe)",
"fields": [
"name"
]
}
},
"aggregations": {
"name": {
"terms": {
"field": "name",
"size": 10
}
}
}
}

Search returning different value in fuzzy Query- Elasticsearch

I have an Elasticsearch fuzzy query as below:
GET /resume/candidate/_search
{
"query": {
"fuzzy": {
"name": {
"value": "Tam",
"fuzziness": 2,
"max_expansions": 50
}
}
}
}
I have the names Tom, Roy, Maxwell in my Index. The name Tom is matched as per the request, but the name Roy also gets returned. How is this happening?
The full names are:
Roy M Lovejoy III
Tom Atwell
Also, if I set the fuzziness to 1, I do not get any results. Shouldn't Tom be matched, as only 1 character is different?
Mapping:
{
"resume": {
"aliases": {},
"mappings": {
"candidate": {
"properties": {
"name": {
"type": "text"
}
}
}
}
}
}
I also have an analyzer, but it is not used in the name field
Analyzer:
"analysis": {
"analyzer": {
"case_insensitive": {
"filter": [
"lowercase"
],
"tokenizer": "keyword"
}
}
}
"Tam" is not a fuzzy match with "roy"; it matches the middle initial "m", which is at an edit distance of 2 (delete "T" and "a").
You get no result for "tom" with an edit distance of 1 because your indexed names are analyzed, and thus lowercased, while the value in a fuzzy query is not. "Tam" is therefore two edits away from "tom" (T to t, a to o), not one. You could lowercase your query value, or you could use a match query with fuzziness, which analyzes the query text:
"query": {
"match": {
"name": {
"query": "Tam",
"fuzziness": "AUTO"
}
}
}
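The edit distances involved can be checked with a short Python sketch. It is only a model: it assumes the name field is tokenized by lowercasing and splitting on whitespace, uses plain Levenshtein distance, and ignores other fuzzy-query options such as max_expansions. The names edit_distance and fuzzy_matches are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def edit_distance(a, b):
    """Plain Levenshtein distance, computed recursively with memoization."""
    if not a:
        return len(b)
    if not b:
        return len(a)
    cost = 0 if a[0] == b[0] else 1
    return min(edit_distance(a[1:], b) + 1,         # delete first char of a
               edit_distance(a, b[1:]) + 1,         # insert first char of b
               edit_distance(a[1:], b[1:]) + cost)  # substitute or keep


names = ["Roy M Lovejoy III", "Tom Atwell"]
# Indexed terms under the simplified analyzer (assumption): lowercased words.
indexed_terms = sorted({word for name in names for word in name.lower().split()})


def fuzzy_matches(value, max_edits):
    # The fuzzy query value is compared as-is, without lowercasing.
    return [term for term in indexed_terms if edit_distance(value, term) <= max_edits]


print(fuzzy_matches("Tam", 2))  # ['m', 'tom']
print(fuzzy_matches("Tam", 1))  # []
print(fuzzy_matches("tam", 1))  # ['tom']
```

With the capital "T" kept, both "m" and "tom" are exactly two edits away, which explains why Roy's document appears at fuzziness 2 and nothing appears at fuzziness 1.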

Elasticsearch: restrict result to documents with exact match

Currently I am trying to restrict the results of Elasticsearch (5.4) with the following query:
{
"query": {
"bool": {
"must": {
"multi_match": {
"query": "apache log Linux",
"type": "most_fields",
"fields": [
"message",
"type"
]
}
},
"filter": {
"term": {
"client": "test"
}
}
}
}
}
This returns every document that contains "apache", "log", or "linux". I want to restrict the results to documents that have a field "client" with the exact specified value, this case: "test". However, this query returns all the documents that contain "test" as value. A document with "client": "test client" will also be returned.
I want the restriction to be exact, so only documents with "client": "test" should be returned, and not those with "client": "test client".
After testing a bunch of different queries and lots of searching, I cannot find a solution to my problem. What am I missing?
Just use the keyword sub-field of your client field, since this is 5.x and, by default, dynamic mapping has already created it:
"filter": {
"term": {
"client.keyword": "test"
}
}
Alternatively, set a mapping on your index specifying that your client field is a keyword datatype.
The mapping request could look like this:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"client": {
"type": "keyword"
}
}
}
}
}
