SuggestionBuilder with BoolQueryBuilder in Elasticsearch

I am currently using BoolQueryBuilder to build a text search, and I am having an issue with misspellings. When someone searches for "chiar" instead of "chair", I have to show them some suggestions.
I have gone through the documentation and found that SuggestionBuilder can be used to get suggestions.
Can I send everything in a single request, so that I can show the suggestions when the search returns zero results?

There is no need to send several search terms (chair, chiar, and so on) to get suggestions. It is inefficient, and you cannot know every way a user might misspell a word.
Instead, use the fuzzy query, or the fuzziness parameter of the match query, which can be used inside a bool query.
Here is an example using the match query with the fuzziness parameter.
Index definition
{
"mappings": {
"properties": {
"product": {
"type": "text"
}
}
}
}
Index sample doc
{
"product" : "chair"
}
Search query with the misspelled term chiar (tune fuzziness for your application; Elasticsearch accepts 0, 1, 2, or AUTO)
{
"query": {
"match" : {
"product" : {
"query" : "chiar",
"fuzziness" : "AUTO"
}
}
}
}
Search result
"hits": [
{
"_index": "so_fuzzy",
"_type": "_doc",
"_id": "1",
"_score": 0.23014566,
"_source": {
"product": "chair"
}
}
]
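As a rough sketch of how the fuzzy match could sit inside a bool query, and how a term suggester could travel in the same search request, the Python below only builds the request body as a dictionary. The helper name build_search_body and the suggestion label did_you_mean are made up for this illustration; the field name product comes from the mapping above. Sending the body to a cluster is left out.

```python
import json


def build_search_body(user_text, field="product"):
    """Build one search body with a fuzzy bool query and a term suggester."""
    return {
        "query": {
            "bool": {
                "must": [
                    # Fuzzy match clause, as in the answer above.
                    {"match": {field: {"query": user_text, "fuzziness": "AUTO"}}}
                ]
            }
        },
        # "did_you_mean" is an arbitrary label chosen for this sketch.
        "suggest": {
            "did_you_mean": {"text": user_text, "term": {"field": field}}
        },
    }


body = build_search_body("chiar")
print(json.dumps(body, indent=2))
```

The same structure should map onto BoolQueryBuilder and SuggestBuilder in the Java client, but that mapping is not shown or verified here.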

Is there a difference between "match" and "simple_query_string" if no special characters?

Elastic Search 7.9
I'm searching a single field with a textbox exposed to users through a web UI.
{
match: {
body: {
query: 'beer pretzels',
}
}
}
I'm debating whether to use simple_query_string instead.
{
simple_query_string: {
query: 'beer pretzels',
}
}
My initial thought was to switch to simple_query_string if I detect special characters in the keywords. But now I wonder why I'd use match at all.
My questions:
Are there any differences between match and simple_query_string for the simple case where the keywords contain no special characters?
Any reason why I would not use simple_query_string all the time?
Simple Query string returns documents based on a provided query
string, using a parser with a limited but fault-tolerant syntax.
The documentation gives a more detailed explanation, which states that:
The simple_query_string query is a version of the query_string query
that is more suitable for use in a single search box that is exposed
to users because it replaces the use of AND/OR/NOT with +/|/-,
respectively, and it discards invalid parts of a query instead of
throwing an exception if a user makes a mistake.
It uses a simplified, operator-based syntax to interpret the text; refer to this article for detailed information about how simple query string works.
Match Query returns documents that match a provided text, number, date
or boolean value. The provided text is analyzed before matching.
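Refer to this part of the ES documentation and this blog to understand how the match query works.
I ran the same search text through both simple query string and the match query. Note the different scores: simple_query_string treats the leading - in -baz as a negation operator, while match analyzes -baz into the plain term baz, so the match query scores three matching terms and simple_query_string only two.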
Index Data
{
"content":"foo bar -baz"
}
Search Query using simple query string:
{
"query": {
"simple_query_string": {
"fields": [ "content" ],
"query": "foo bar -baz"
}
}
}
Search Result:
"hits": [
{
"_index": "stof_63937563",
"_type": "_doc",
"_id": "1",
"_score": 0.5753642, <-- note this
"_source": {
"content": "foo bar -baz"
}
}
]
Search Query using match query:
{
"query": {
"match": {
"content": {
"query": "foo bar -baz"
}
}
}
}
Search Result:
"hits": [
{
"_index": "stof_63937563",
"_type": "_doc",
"_id": "1",
"_score": 0.8630463, <-- note this
"_source": {
"content": "foo bar -baz"
}
}
]
Please also refer to this SO answer, which explains the difference between multi_match and query_string.
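The asker's idea of switching query types when special characters are present could be sketched roughly as below. The operator set is taken from the simple_query_string syntax (+ | - " * ( ) ~); the function names and the default field body are assumptions for this illustration.

```python
# Characters that simple_query_string interprets as operators.
SQS_OPERATORS = set('+|-"*()~')


def has_sqs_operators(text):
    """Return True if the text contains any simple_query_string operator."""
    return any(ch in SQS_OPERATORS for ch in text)


def build_query(text, field="body"):
    """Pick simple_query_string when operators are present, else match."""
    if has_sqs_operators(text):
        return {"simple_query_string": {"query": text, "fields": [field]}}
    return {"match": {field: {"query": text}}}


print(build_query("beer pretzels"))
print(build_query("foo bar -baz"))
```

This check is deliberately blunt: a hyphenated word such as well-known also triggers the simple_query_string branch, so treat it as a starting point only.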

Is there any way to match similar match in Elastic Search

I have a large Elasticsearch document.
I am searching with the query below:
{"size": 1000, "query": {"query_string": {"query": "( string1 )"}}}
Let's say string1 = Product. If someone accidentally types prduct, or leaves out the o, is there any way to match that as well?
For example, the following query should return the results for both prdct and product:
{"size": 1000, "query": {"query_string": {"query": "( prdct )"}}}
You can use a fuzzy query, which returns documents that contain terms similar to the search term. Refer to this blog for a detailed explanation of fuzzy queries.
prdct is two edits away from product (the letters o and u are missing), so the query needs a fuzziness of 2. The fuzziness parameter accepts an edit distance of 0, 1 or 2, or AUTO, which picks the distance from the term length:
0..2 characters = must match exactly
3..5 characters = one edit allowed
More than 5 characters = two edits allowed
Index Data:
{
"title":"product"
}
{
"title":"prdct"
}
Search Query:
{
"query": {
"fuzzy": {
"title": {
"value": "prdct",
"fuzziness": 2,
"transpositions": true,
"boost": 5
}
}
}
}
Search Result:
"hits": [
{
"_index": "my-index1",
"_type": "_doc",
"_id": "2",
"_score": 3.465736,
"_source": {
"title": "prdct"
}
},
{
"_index": "my-index1",
"_type": "_doc",
"_id": "1",
"_score": 2.0794415,
"_source": {
"title": "product"
}
}
]
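To make the edit counting concrete, here is a small sketch that computes an edit distance with adjacent transpositions (the optimal string alignment variant) and applies the length-based rule listed above. It is an illustration of the idea only, not Lucene's actual implementation; the function names are made up.

```python
def osa_distance(a, b):
    """Edit distance with insert, delete, substitute and adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swapping two adjacent characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def auto_fuzziness(term):
    """Allowed edits under the length-based rule: 0..2 -> 0, 3..5 -> 1, else 2."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


print(osa_distance("prdct", "product"), auto_fuzziness("prdct"))
print(osa_distance("chiar", "chair"), auto_fuzziness("chiar"))
```

Under this sketch prdct is two edits from product but, being five characters long, would only be granted one edit by the length-based rule, which is why an explicit edit distance is needed for that term.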
There are many solutions to this problem:
Suggestions (did you mean X instead).
Fuzziness (edits from your original search term).
Partial matching with autocomplete (if someone types "pr" and you provide the available search terms, they can click on the correct results right away) or n-grams (matching groups of letters).
All of those have tradeoffs in index / search overhead as well as the classic precision / recall problem.
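As a toy illustration of the n-gram option, the sketch below splits terms into character bigrams and measures how many they share. The function names and the choice of Jaccard overlap are assumptions for this example; a real index would use an n-gram tokenizer or token filter instead.

```python
def char_ngrams(term, n=2):
    """Return the set of character n-grams of a lowercased term."""
    term = term.lower()
    return {term[i:i + n] for i in range(len(term) - n + 1)}


def ngram_overlap(a, b, n=2):
    """Jaccard overlap between the n-gram sets of two terms."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


print(sorted(char_ngrams("prdct")))
print(ngram_overlap("prdct", "product"))
```

The misspelled term still shares some bigrams with the correct one, which is what lets n-gram matching recover it, at the cost of a larger index and more loosely related hits.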

how to properly use wildcard on a elastic search find?

I just jumped into ES and I don't have a lot of experience with it, so I might be missing something.
I found this documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html that basically explains how to do a wildcard search.
I am trying to look for all messages inside my documents that match a certain pattern.
So, using Kibana Sense (the Elasticsearch query UI), I did this:
GET _search
{
"query": {
"wildcard" : {
"model.message": "my*"
}
}
}
With this I am trying to obtain all the messages that start with "my",
but I get no results...
Here is a copy of my document structure (or at least the first lines):
"_index": "my_index",
"_type": "my_type",
"_id": "123456",
"_source": {
"model": {
"id": "123456",
"message": "my message",
Any idea what could be wrong?
Your sample document actually contains the model.content.message field but not the model.message field, so the following query should work:
GET _search
{
"query": {
"wildcard" : {
"model.content.message": "my*"
}
}
}
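One way to avoid guessing the field name is to list the dotted paths that actually exist in a document's _source. The sketch below does that for a plain dictionary; the sample source is a hypothetical shape that follows the description in this answer, not the asker's real document, and field_paths is a made-up helper.

```python
def field_paths(obj, prefix=""):
    """Return the dotted path of every leaf field in a nested dict."""
    paths = []
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths.extend(field_paths(value, path))
        else:
            paths.append(path)
    return paths


# Hypothetical _source, shaped as this answer describes it.
source = {"model": {"id": "123456", "content": {"message": "my message"}}}
paths = field_paths(source)
print(paths)
```

If the path used in the wildcard query is not in that list, the query has nothing to match against.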
Can you share your mapping? It looks like you need to use a nested query:
GET /_search
{
"query": {
"nested" : {
"path" : "model",
"score_mode" : "avg",
"query" : {
"wildcard" : {
"model.message": "my*"
}
}
}
}
}
You can read more about nested queries in the Elasticsearch documentation.

Elastic Search Term Query Not Matching URL's

I am a beginner with Elasticsearch and have been working on a POC since last week.
My documents have a URL field that contains URLs in the following format: "http://www.example.com/foo/navestelre-04-cop".
I cannot define a mapping for the whole object, as every object has different keys except the URL.
Here is how I am creating my index:
POST
{
"settings" : {
"number_of_shards" : 5,
"mappings" : {
"properties" : {
"url" : { "type" : "string","index":"not_analyzed" }
}
}
}
}
I am keeping my URL field as not_analyzed because I have learned that marking a field as not_analyzed prevents it from being tokenized, so I can look for an exact match on that field in a term query.
I have also tried the whitespace analyzer, since the URL value does not contain any whitespace characters, but again I am unable to get a successful hit.
Below is my term query:
{
"query":{
"constant_score": {
"filter": {
"term": {
"url":"http://www.example.com/foo/navestelre-04-cop"
}
}
}
}
}
I am guessing the problem is somewhere in the analyzers and tokenizers, but I am unable to find a solution. Any help would improve my understanding and help me reach one.
Thanks in Advance.
You have the right idea, but it looks like some small mistakes in your settings request are leading you astray. Here is the final index request:
POST /test
{
"settings": {
"number_of_shards" : 5
},
"mappings": {
"url_test": {
"properties": {
"url": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
Notice the added url_test type in the mapping. This lets ES know that your mapping applies to this document type. Also, settings and mappings are separate keys of the root object, so mappings must not be nested inside settings. Because your initial settings request was malformed, ES just ignored the mapping and used the standard analyzer on your document, which is why your term query could not match it. See the ES mapping docs for details.
We can index two documents to test with:
POST /test/url_test/1
{
"url":"http://www.example.com/foo/navestelre-04-cop"
}
POST /test/url_test/2
{
"url":"http://stackoverflow.com/questions/37326126/elastic-search-term-query-not-matching-urls"
}
And then execute your unmodified search query:
GET /test/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"url": "http://www.example.com/foo/navestelre-04-cop"
}
}
}
}
}
Yields this result:
"hits": [
{
"_index": "test",
"_type": "url_test",
"_id": "1",
"_score": 1,
"_source": {
"url": "http://www.example.com/foo/navestelre-04-cop"
}
}
]
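To see why the term query found nothing while the field was still being analyzed, the sketch below uses a crude stand-in for an analyzer: lowercase the text and split on anything that is not a letter or digit. This is only an approximation chosen for illustration (the real standard analyzer splits somewhat differently, for example around dots), and crude_analyze is a made-up name.

```python
import re


def crude_analyze(text):
    """Very rough stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


url = "http://www.example.com/foo/navestelre-04-cop"

analyzed_terms = crude_analyze(url)  # what an analyzed field would index
not_analyzed_terms = [url]           # a not_analyzed field indexes the value as-is

print(analyzed_terms)
print(url in analyzed_terms)         # a term query for the full URL finds nothing
print(url in not_analyzed_terms)     # an exact term lookup succeeds
```

A term query looks up the exact term it is given, so it can only match when the whole URL was indexed as a single term.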

Elastic Search - Querying on values

I have an elasticsearch index with the following values
{
"_index": "article",
"_type": "articleId",
"_id": "10970",
"_score": 1,
"_source": {
"url": "http%3A%2F%2Fwww.tomshardware.com%2Fnews%2FAir-Traffic-Software-DoS-Attacks%2C16471.html%23xtor%3DRSS-181",
"title": "Air%20Traffic%20Software%20Vulnerable%20to%20DoS%20Attacks",
"publicationId": "888",
"text": "%20%3Cp%3E%3Cstrong%3EA%20security%20researcher%20revealed%20a%20flaw%20in%20commonly%20used%20air%20traffic%20control%20software%20that%20would%20allow%20an%20attacker%20to%20create%20an%20unlimited%20number%20of%20phantom%20flights.%3C%2Fstrong%3E%3C%2Fp%3E%20%3Cp%3E%3Ca%20target%3D%22_blank%22%3E%3C%2Fa%3E%3C%2Fp%3E%20%3Cp%3EAccording%20to%20Andrei%20Costin%2C%20%242%2C000%20in%20equipment%20and%20%22modest%20tech%20skills%22%20are%20enough%20to%20throw%20an%20air%20traffic%20control%20system%20of%20virtually%20any%20airport%20into%20complete%20disarray.%20The%20ADS-B%20system%20that%20is%20used%20across%20the%20world%20is%20vulnerable%20as%20it%20does%20not%20verify%20that%20incoming%20traffic%20signals%20as%20genuine.%20%3C%2Fp%3E%20%3Cp%3ECostin%20says%20that%20a%20hacker%20could%20inject%20flights%20that%20do%20not%20exist%20and%20could%20confuse%20an%20air%20controller%20station.%20Air%20controllers%20could%20cross-check%20flights%20with%20flight%20schedules%2C%20but%20if%20the%20number%20of%20phantom%20flights%20is%20high%20enough%2C%20there%20is%20no%20way%20that%20cross-checks%20would%20work.%20Consider%20it%20like%20an%20DoS%20attack%20on%20an%20air%20traffic%20control%20system.%3C%2Fp%3E%20%3Cp%3ECostin%20noted%20that%20rogue%20signals%20from%20the%20ground%20can%20be%20generally%20identified%20and%20ruled%20out%20as%20malicious%20signals%2C%20but%20there%20is%20no%20way%20to%20do%20the%20same%20for%20robotic%20aircraft%2C%20for%20example.%20He%20also%20noted%20that%20data%20sent%20from%20airplanes%20to%20air%20traffic%20controllers%20is%20unencrypted%20and%20can%20be%20captured%20by%20unidentified%20sources.%20Since%20this%20applies%20to%20any%20aircraft%2C%20it%20is%20in%20theory%20possible%20to%20deploy%20airplane%20tracking%20devices%20to%20track%20specific%20aircraft.%3C%2Fp%3E%20%3C%2Fp%3E%3Cp%3E%20%3Cp%3E%3Ca%20target%3D%22_blank%22%20href%3D%22mailto%3Anews-us%40bestofmedia.com%3Fsubject%3DNews%2520Article%2520Feedback%22%3E%3Cem%3E%3Cs
ub%3EContact%20Us%20for%20News%20Tips%2C%20Corrections%20and%20Feedback%3C%2Fsub%3E%3C%2Fem%3E%3C%2Fa%3E%3C%2Fp%3E",
"keywords": {
"air": "3.4965034965034962",
"traffic": "3.4965034965034962",
"flights": "2.797202797202797",
"": "2.797202797202797",
"Costin": "2.097902097902098",
"aircraft": "2.097902097902098",
"signals": "2.097902097902098",
"control": "2.097902097902098",
"system": "2.097902097902098",
"there": "1.3986013986013985"
}
}
}
I am trying to write a query that checks whether a document in this index has the keyword flights (which this one does), but I am having difficulty.
It's straightforward to run a match query on one of the other fields, such as text, but I run into problems when trying to do the same or similar for keywords.
Is there a way of performing this search with the current setup or should I add the keywords in differently?
If I understood you correctly, you would like to find all records that have the field keywords.flights, where the value of this field is not important. You can do it using a query string:
curl "http://localhost:9200/_search?q=keywords.flights:*"
Or using the exists filter:
curl "http://localhost:9200/_search" -d '{
"query": {
"constant_score" : {
"filter" : {
"exists" : { "field" : "keywords.flights" }
}
}
}
}'
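If the request body is built in application code rather than typed into curl, the same exists filter could be assembled as below. This is a minimal sketch: exists_query and has_field are made-up helper names, and has_field is only a local check on a dictionary that mirrors what the filter tests, not a call to Elasticsearch.

```python
import json


def exists_query(field):
    """Build a constant_score query that matches documents containing the field."""
    return {"query": {"constant_score": {"filter": {"exists": {"field": field}}}}}


def has_field(source, dotted_path):
    """Local check: does the nested dict contain the dotted path?"""
    current = source
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    return True


body = exists_query("keywords.flights")
print(json.dumps(body))
print(has_field({"keywords": {"flights": "2.797202797202797"}}, "keywords.flights"))
```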
