Do Elasticsearch proximity queries work well? - elasticsearch

I ran into a scenario in Elasticsearch where proximity search is not working as expected. Let me explain.
When I searched for "apple samsung"~1, I got around 10 results from my local cluster. But when the proximity term was "samsung apple"~1, I got only 2 results.
According to the Elasticsearch documentation at the URL below, both terms should return the same number of results:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_proximity_searches
Could anybody help me with this?
Thanks in advance,
Manoj

It's going to depend on your data - if your query is:
curl -XGET 'http://localhost:9200/test/message/_search?pretty' -d '{
"query": {
"query_string": { "query":"\"apple samsung\"~1"}
}
}'
this is a slop of 1, so it would match
"apple samsung"
"apple xxxx samsung"
Swapping two adjacent terms costs two position moves, so to match "samsung apple", where "samsung" appears first, you'd need to specify a slop of 2:
curl -XGET 'http://localhost:9200/test/message/_search?pretty' -d '{
"query": {
"query_string": { "query":"\"apple samsung\"~2"}
}
}'
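To see why reversing the two terms needs a larger slop, here is a minimal Python sketch of the position-move model that Lucene's documentation describes for sloppy phrases. It is a simplified illustration, not Elasticsearch code: `min_slop` is a hypothetical helper, it tokenizes on whitespace only, and it ignores analyzers and repeated-term edge cases.

```python
from itertools import product


def min_slop(phrase, text):
    """Smallest slop at which `phrase` would match `text` under a simplified
    position-move model. Returns None if a phrase term is missing."""
    query_terms = phrase.lower().split()
    doc_tokens = text.lower().split()
    if not query_terms:
        return None
    # For every query term, collect the positions where it occurs in the text.
    positions = [
        [i for i, tok in enumerate(doc_tokens) if tok == term]
        for term in query_terms
    ]
    if any(not p for p in positions):
        return None
    best = None
    for combo in product(*positions):
        # Shift each position by the term's offset inside the phrase; an exact
        # phrase match makes all shifted values equal (width 0).
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        width = max(shifted) - min(shifted)
        if best is None or width < best:
            best = width
    return best


if __name__ == "__main__":
    for doc in ("apple samsung", "apple xxxx samsung", "samsung apple"):
        print(doc, "->", min_slop("apple samsung", doc))
```

Under this model the in-order text with one word in between needs a slop of 1, while the reversed pair needs 2, which would explain why the two searches return different result counts.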

Related

Elastic Search Multimatch: Is there a way to search all fields except one?

We have an Elastic Search structure that specifies fields in a multi_match query like this:
"multi_match": {
"query": "find this string",
"fields": ["*_id^20", "*_name^20", "*"]
}
This works great, except under certain circumstances, such as when the query is "Find NOWAK". This is because "NOW" is a reserved word for date searching, and the field "*" matches fields that are defined as dates.
So what I would like to do is ignore fields that match "*_at".
Is there a way to tell Elasticsearch to ignore certain fields in a multi_match query?
If the answer to that is "no", then the follow-up question is how to escape the search term so that it won't trigger keywords.
Running version 6.7.
Try this (from "Exclude a field on an Elasticsearch query"): add an explicit "fields" list to the query.
curl -XGET 'localhost:9200/testidx/items/_search?pretty=true' -d '{
"query" : {
"query_string": {
"fields": ["title", "field2", "field3"],
"query": "Titulo"
}},
"_source" : {
"exclude" : ["*.body"]
}
}'
Apparently the answer is "No: there is not a way to tell ElasticSearch to ignore certain fields in a multi_match query"
For my particular issue I found an inexpensive way to find the necessary white-listed fields (this is performed outside the scope of ElasticSearch otherwise I would post it here) and list those in place of the "*" when building the query.
I am hopeful someone will tell me I'm wrong, but I don't think I am.
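The white-listing approach described above could look roughly like the following Python sketch. Everything here is an assumption for illustration: the field names are made up, `whitelist_fields` and `build_multi_match` are hypothetical helpers, and the list of all field names is assumed to have been obtained elsewhere (for example from the index mapping).

```python
from fnmatch import fnmatchcase


def whitelist_fields(all_fields, exclude_patterns):
    """Keep only the fields that match none of the glob-style exclude patterns."""
    return [
        field for field in all_fields
        if not any(fnmatchcase(field, pattern) for pattern in exclude_patterns)
    ]


def build_multi_match(query, all_fields, exclude_patterns=("*_at",)):
    """Build a multi_match clause that lists explicit fields instead of "*"."""
    kept = whitelist_fields(all_fields, exclude_patterns)
    # Mirror the boosts from the question: *_id and *_name get ^20.
    boosted = [
        field + "^20" if field.endswith(("_id", "_name")) else field
        for field in kept
    ]
    return {"multi_match": {"query": query, "fields": boosted}}


if __name__ == "__main__":
    # Hypothetical field names, not taken from the original index.
    fields = ["user_id", "user_name", "description", "created_at", "updated_at"]
    print(build_multi_match("Find NOWAK", fields))
```

Because the date fields never appear in "fields", the query text is never parsed against them.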

ElasticSearch queries for Ngramm

I am trying to build a search for the following case.
For example, I have these documents:
1) "There are a lot of diesel cars in the city"
2) "Cars have diesel engines"
3) "Bob sold diesel car"
and I want to find doc 1 and doc 3.
If I write a query like this:
{
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "should": [
            {
              "query_string": {
                "fields": ["text"],
                "query": "\"diesel car\"~1^5"
              }
            }
          ]
        }
      }
    }
  }
}
I find doc 1 but not doc 3.
If I use an ngram analyzer, will this query also work for doc 3?
Or maybe there are other solutions?
Proximity search works only for exact terms: if even one character in a word changes, it no longer matches. Maybe ES has another solution for that?
I found the solution:
1) Add an English stemmer to the settings and mapping
2) Use a simple query like
(diesel AND car)^5
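As a rough illustration of those two steps, the Python sketch below builds an index body that applies the built-in "english" analyzer (which includes stemming, so "cars" and "car" reduce to the same term) and the query body from step 2. The field name "text" comes from the question; the typeless mapping layout is an assumption that fits Elasticsearch 7 and later, and older versions would need a mapping type name around "properties". The helper names are hypothetical.

```python
import json


def english_index_body(field="text"):
    """Index body mapping `field` to the built-in "english" analyzer."""
    return {
        "mappings": {
            "properties": {
                field: {"type": "text", "analyzer": "english"}
            }
        }
    }


def stemmed_query(field="text"):
    """Search body using the boolean query string from the answer."""
    return {
        "query": {
            "query_string": {
                "fields": [field],
                "query": "(diesel AND car)^5",
            }
        }
    }


if __name__ == "__main__":
    # These bodies would be sent with PUT /<index> and POST /<index>/_search.
    print(json.dumps(english_index_body(), indent=2))
    print(json.dumps(stemmed_query(), indent=2))
```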

Using a string to build Query DSL for Elasticsearch

I'm using Meteor (so Javascript, Node, NPM, etc) and would like to provide a simple text input for users to search via Elasticsearch. I would like to be able to use modifiers on the text like + and "" and search for a specific field. I'm looking for something that can convert a plain text input into Elasticsearch Query DSL.
These would be some example queries:
This query would mean that the keyword "tatooine" must exist:
stormtrooper +tatooine
This would mean that "death star" should be one keyword:
stormtrooper "death star"
This would search for the keyword "bloopers" only in the category field:
stormtrooper category=bloopers
Is there a library that can do this? Can a generic solution exist or is this why I can't find any existing answers to this?
simple_query_string supports + and "" out of the box. Field-scoped terms are not part of its syntax, though, so for category=bloopers use query_string and write it as category:bloopers instead:
curl -XPOST localhost:9200/your_index/_search -d '{
"query": {
"query_string": {
"query": "stormtrooper category:bloopers"
}
}
}'
curl -XPOST localhost:9200/your_index/_search -d '{
"query": {
"simple_query_string": {
"query": "stormtrooper +tatooine"
}
}
}'
You can also send the query in the query string directly like this:
curl -XPOST 'localhost:9200/your_index/_search?q=stormtrooper%20%22death%20star%22'
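The rewrite from category=bloopers to category:bloopers can be done before the text is handed to Elasticsearch. The Python sketch below is one hedged way to do it: `to_query_string` is a hypothetical helper, it targets the query_string query (which accepts field:value terms), and its regular expression is deliberately naive, so it would also rewrite an "=" that appears inside a quoted phrase.

```python
import re

# A word immediately followed by "=" and a non-space character, e.g. "category=b".
_FIELD_EQ = re.compile(r"\b(\w+)=(?=\S)")


def to_query_string(user_input):
    """Turn plain user input into a query_string search body,
    rewriting field=value into the field:value form."""
    rewritten = _FIELD_EQ.sub(r"\1:", user_input)
    return {"query": {"query_string": {"query": rewritten}}}


if __name__ == "__main__":
    for text in (
        "stormtrooper +tatooine",
        'stormtrooper "death star"',
        "stormtrooper category=bloopers",
    ):
        print(to_query_string(text))
```

The + and "" modifiers pass through untouched because query_string already understands them.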

Determining which words were matched in a fuzzy search

I'm running a fuzzy search, and need to see which words were matched. For example, if I am searching for the query testing, and it matches a field with the sentence The boy was resting, I need to be able to know that the match was due to the word resting.
I tried setting the parameter explain = true, but it doesn't seem to contain the information I need. Any thoughts?
Alright, this is what I was looking for:
After a bit of research, I found the Highlighting feature of elasticsearch.
By default it returns a snippet of context surrounding the match, but you can set the fragment size to the query length to return only the exact match. For example:
{
  "query": query,
  "highlight": {
    "fields": {
      "text": {
        "fragment_size": query.length
      }
    }
  }
}
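Once highlighting is enabled, the matched words can be pulled out of the response by reading what sits between the highlight tags. This Python sketch assumes the default `<em>` and `</em>` tags and a field named "text"; the sample response is hand-written for illustration, and `matched_words` is a hypothetical helper.

```python
import re

# Default highlight tags wrap each matched term in <em>...</em>.
_EM = re.compile(r"<em>(.*?)</em>")


def matched_words(response, field="text"):
    """Collect the highlighted terms for `field` from a search response dict."""
    words = []
    for hit in response.get("hits", {}).get("hits", []):
        for fragment in hit.get("highlight", {}).get(field, []):
            words.extend(_EM.findall(fragment))
    return words


if __name__ == "__main__":
    # Hand-written sample shaped like a search response, not real output.
    sample = {
        "hits": {
            "hits": [
                {"_id": "1", "highlight": {"text": ["The boy was <em>resting</em>"]}}
            ]
        }
    }
    print(matched_words(sample))
```

Reading the tags this way works regardless of the fragment size, so the full snippet can be kept for display.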
Using explain should give you some clues, although they are not easy to extract.
If you run the following, also available at https://www.found.no/play/gist/daa46f0e14273198691a , you should see e.g. description: "weight(text:nesting^0.85714287 in 1) […], description: "weight(text:testing in 1) [PerFieldSimilarity] […] and so on in the hit's _explanation.
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"text":"The boy was resting"}
{"index":{"_index":"play","_type":"type"}}
{"text":"The bird was testing while nesting"}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"match": {
"text": {
"query": "testing",
"fuzziness": 1
}
}
},
"explain": true
}
'
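The weight(...) descriptions quoted above can be collected programmatically by walking the nested _explanation of a hit. The Python sketch below assumes each node is a dict with "description" and "details" keys and that term matches show up as "weight(field:term ...". The description wording can differ between versions, so the pattern is an assumption, and `explained_terms` is a hypothetical helper.

```python
import re

# Captures field and term from e.g. "weight(text:nesting^0.85714287 in 1)".
_WEIGHT = re.compile(r"weight\(([^:()\s]+):([^\s^()]+)")


def explained_terms(explanation):
    """Recursively collect (field, term) pairs from an _explanation tree."""
    found = []
    match = _WEIGHT.search(explanation.get("description", ""))
    if match:
        found.append((match.group(1), match.group(2)))
    for child in explanation.get("details") or []:
        found.extend(explained_terms(child))
    return found


if __name__ == "__main__":
    # Hand-written sample shaped like the descriptions quoted above.
    sample = {
        "description": "sum of:",
        "details": [
            {"description": "weight(text:nesting^0.85714287 in 1) [PerFieldSimilarity], result of:",
             "details": []},
            {"description": "weight(text:testing in 1) [PerFieldSimilarity], result of:",
             "details": []},
        ],
    }
    print(explained_terms(sample))
```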

Keyword search in ElasticSearch with no regards to the schema

Is it possible to use ElasticSearch to do keyword searches, exactly like in a search engine?
Let me rephrase:
As far as I understand, an Elasticsearch term query requires you to specify which field(s) to search for keywords.
Given that Elasticsearch can be "schemaless", I would like to declare a query that can search for keywords in any field.
Is there a syntax for that?
You're looking for the behavior provided by the _all field, which happens to be on by default:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-all-field.html
Here's a runnable example: https://www.found.no/play/gist/14688f48c75b9931272b
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"foo":"bar"}
{"index":{"_index":"play","_type":"type"}}
{"something_else":"foo bar"}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
"query": {
"match": {
"_all": {
"query": "bar"
}
}
}
}
'
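For building that search body in application code, here is a small Python sketch. The `_all` branch mirrors the query above. The alternative branch is an assumption for clusters where `_all` is not available (to my knowledge it was deprecated in 6.0 and removed in 7.0) and uses a multi_match over the "*" field pattern, which may need extra care when the index contains date or numeric fields. `any_field_query` is a hypothetical helper.

```python
def any_field_query(keyword, use_all_field=True):
    """Build a search body that looks for `keyword` without naming a field."""
    if use_all_field:
        # Same shape as the match query on _all shown above.
        return {"query": {"match": {"_all": {"query": keyword}}}}
    # Assumed fallback: match against every field via the "*" pattern.
    return {"query": {"multi_match": {"query": keyword, "fields": ["*"]}}}


if __name__ == "__main__":
    print(any_field_query("bar"))
    print(any_field_query("bar", use_all_field=False))
```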
