Elasticsearch bool search matching incorrectly - elasticsearch

So I have an object with an Id field which is populated by a Guid. I'm doing an elasticsearch query with a "Must" clause to match a specific Id in that field. The issue is that elasticsearch is returning a result which does not match the Guid I'm providing exactly. I have noticed that the Guid I'm providing and one of the results that Elasticsearch is returning share the same digits in one particular part of the Guid.
Here is my query source (I'm using the Elasticsearch head console):
{
  "query": {
    "bool": {
      "must": [
        {
          "text": {
            "couchbaseDocument.doc.Id": "5cd1cde9-1adc-4886-a463-7c8fa7966f26"
          }
        }
      ],
      "must_not": [],
      "should": []
    }
  },
  "from": 0,
  "size": 10,
  "sort": [],
  "facets": {}
}
And it is returning two results. One with ID of
5cd1cde9-1adc-4886-a463-7c8fa7966f26
and the other with ID of
34de3d35-5a27-4886-95e8-a2d6dcf253c2
As you can see, they both share the same middle term "-4886-". However, I would expect this query to only return a record if the record were an exact match, not a partial match. What am I doing wrong here?

The query is (probably) correct.
What you're almost certainly seeing is the work of the standard analyzer, which is used by default at index time. This analyzer tokenizes the input (splits it into terms) on hyphens ('-'), among other characters. That's why a match is found.
To remedy this, you want to set your couchbaseDocument.doc.Id field to not_analyzed.
See: How to not-analyze in ElasticSearch? and the links from there into the official docs.
Mapping would be something like:
{
  "yourType": {
    "properties": {
      "couchbaseDocument.doc.Id": { "type": "string", "index": "not_analyzed" }
    }
  }
}
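Once the field is mapped as not_analyzed and the documents are reindexed, the exact match should be retrievable with a term query instead of text. A minimal sketch, reusing the Id from the question:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "couchbaseDocument.doc.Id": "5cd1cde9-1adc-4886-a463-7c8fa7966f26"
          }
        }
      ]
    }
  }
}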

Related

elasticsearch searching where nth character of a field matches the parameter

My Elasticsearch index has JSON docs like this:
{
  "id": "ABC120379013",
  "name": "Matlo Jonipa",
  "jobs": { nested data }
}
When defining the index schema, I register id as a keyword.
Now I need to write a query that returns all docs where the 4th character of the id field value is the digit 9.
# matched docs
id: ABC920379013,
id: Zxr900013000
...
I have two questions for this use case.
1- Am I indexing it right when I set id as a keyword field? I feel I should be using some analyzer, but I don't know which one.
2- Can someone guide me on how to write a query that matches the nth character of a field?
I can use this regex to match a string whose 4th character is 9, but can I use it in an Elasticsearch query?
/^...9.*$/
or
/^.{3}9.*$/
The query below should work for your use case. I've used a Script Query.
Also, yes, you are doing it right: you need to make sure that the field id is of type keyword. Note that the keyword type doesn't make use of analyzers.
POST regexindex/_search
{
  "query": {
    "bool": {
      "must": {
        "script": {
          "script": {
            "source": "doc['id'].value.indexOf('9') == 3",
            "lang": "painless"
          }
        }
      }
    }
  }
}
I've used .indexOf('9') == 3 because the index starts from 0.
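Since id is a keyword field, a regexp query is another option worth trying; Elasticsearch regular expressions are anchored to the whole term, so no ^ or $ is needed. A sketch using the second pattern from the question:
POST regexindex/_search
{
  "query": {
    "regexp": {
      "id": ".{3}9.*"
    }
  }
}
Unlike indexOf, which looks at the first occurrence of '9', this still matches an id that also has a '9' before position 3.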

Make a prefix query on whole field in Elasticsearch

Hi, I have a field called text_field which contains two documents:
1. lubricant
2. air lube
I have used an edge n-gram analyzer with a terms query, but I get the results below when I search with lub.
Terms query over the field analyzed with the edge n-gram analyzer:
{
  "terms": {
    "text_field": [ "lub" ]
  }
}
Prefix query over the field analyzed with the keyword tokenizer:
{
  "prefix": {
    "text_field": {
      "prefix": "lub"
    }
  }
}
With both of these queries I'm getting two results in the result set:
"lubricant",
"air lube"
I don't want air lube in the results, since it starts with the word air. Is there any way to make the prefix query match against the whole field? It looks like it's checking individual terms here. Is there any way to sort this out?
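One possible approach (a sketch, using current mapping syntax and an assumed raw sub-field name): keep an un-analyzed keyword sub-field next to the analyzed field and run the prefix query against it, so the prefix is checked against the whole value rather than against individual terms.
{
  "mappings": {
    "properties": {
      "text_field": {
        "type": "text",
        "fields": {
          "raw": { "type": "keyword" }
        }
      }
    }
  }
}
{
  "query": {
    "prefix": {
      "text_field.raw": "lub"
    }
  }
}
With this, "lubricant" matches the prefix lub but "air lube" does not, because the whole value is indexed as a single term.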

ElasticSearch filter on exact url

Let's say I create this document in my index:
PUT /nursery/rhyme/1
{
  "url": "http://example.com/mary",
  "text": "Mary had a little lamb"
}
Why does this query not return anything?
POST /nursery/rhyme/_search
{
  "query": {
    "match_all": {}
  },
  "filter": {
    "term": {
      "url": "http://example.com/mary"
    }
  }
}
The Term Query finds documents that contain the exact term specified in the inverted index. When you save the document, the url property is analyzed, and (with the default analyzer) it results in the following terms: [http, example, com, mary].
So what you currently have in your inverted index is that bunch of terms, none of which is http://example.com/mary.
What you want is either to not analyze the url property or to use a Match Query, which will split the query into terms just like at index time.
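For the second option, a match query would look like the sketch below; note that it matches documents containing those terms (ranked by relevance) rather than enforcing a strict exact match:
POST /nursery/rhyme/_search
{
  "query": {
    "match": {
      "url": "http://example.com/mary"
    }
  }
}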
Exact match does not work for an analyzed field. A string is analyzed by default, which means the string http://example.com/mary will be split and stored in the inverted index as http, example, com, mary. That's why your query returns no results.
You can make your field not_analyzed:
{
  "url": {
    "type": "string",
    "index": "not_analyzed"
  }
}
but for this you will have to reindex your index.
Read about not_analyzed and the term query here.
Hope this helps
In ElasticSearch 7.x you have to use type "keyword" in the mapping properties, which is not analyzed: https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html
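A minimal 7.x sketch of that approach, reusing the index and field from the question (the mapping must be in place before indexing, or the data reindexed):
PUT /nursery
{
  "mappings": {
    "properties": {
      "url": { "type": "keyword" },
      "text": { "type": "text" }
    }
  }
}
POST /nursery/_search
{
  "query": {
    "term": {
      "url": "http://example.com/mary"
    }
  }
}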

Elasticsearch: how to query a long field for exact match

My document has the following mapping property:
"sid" : {"type" : "long", "store": "yes", "index": "no"},
This property has only one value for each record. I would like to query this property. I tried the following queries:
{
  "query": {
    "term": {
      "sid": 10
    }
  }
}
{
  "query": {
    "match": {
      "sid": 10
    }
  }
}
However, I got no results. I do have a document with sid equal to 10. Is anything I did wrong? I would like to query this property for an exact match.
Thanks and regards.
Quote from the documentation:
index: Set to analyzed for the field to be indexed and searchable after being broken down into tokens using an analyzer. not_analyzed means that it's still searchable, but does not go through any analysis process or get broken down into tokens. no means that it won't be searchable at all (as an individual field; it may still be included in _all). Setting to no disables include_in_all. Defaults to analyzed.
So, by setting index to no you cannot search that field individually. You either need to remove no from index and choose something else, or you can use "include_in_all": "yes" and a different type of query:
"query": {
"match": {
"_all": 10
}
}
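If you instead remove "index": "no" so that the field is indexed (which requires reindexing your data), the term query from the question should then work. A sketch of the adjusted mapping and query:
"sid": { "type": "long", "store": "yes" }
{
  "query": {
    "term": {
      "sid": 10
    }
  }
}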

Full-text schema in ElasticSearch

I'm (extremely) new to ElasticSearch so forgive my potentially ridiculous question. I currently use MySQL to perform full-text searches, and want to move this to ElasticSearch. Currently my table has a fulltext index spanning three columns:
title,description,tags
In ES, each document would therefore have title, description and tags fields, allowing me to do a fulltext search for a general phrase, or filter on a given tag.
I also want to add further searchable fields such as username (so I can retrieve posts by a given user). So, how do I specify that a fulltext search should match title OR description OR tags but not username?
From the OR filter example, I'd assume I'd have to use something like this:
{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "or": [
        { "term": { "title": "foobar" } },
        { "term": { "description": "foobar" } },
        { "term": { "tags": "foobar" } }
      ]
    }
  }
}
Coming at this new, it doesn't seem like this is very efficient. Is there a better way of doing this, or do I need to move the username field to a separate index?
This is fine.
In general, I would suggest getting familiar with ElasticSearch mapping types and options.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping.html
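For the full-text part itself (match title OR description OR tags, but not username), a multi_match query limited to those fields is one straightforward way to express it; a sketch:
{
  "query": {
    "multi_match": {
      "query": "foobar",
      "fields": [ "title", "description", "tags" ]
    }
  }
}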

Resources