Searching HTML content in elastic search - elasticsearch

I am storing HTML data in Elasticsearch. When I search for a word or phrase, I want to list all records that contain a match, with the matched words wrapped in a pre tag and post tag so they can be highlighted. Below is an example of the stored HTML.
<p>This is paragraph tag<span>This is span tag</span></p>
Search word: "span tag"
What query should I pass to get the list of records that match the search word?
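One possible approach, as a minimal sketch: analyze the field with the built-in html_strip character filter so the markup is not indexed as words, search with match_phrase, and ask for highlighting with pre_tags and post_tags. The field name content, the analyzer name html_text and the <em> tags are assumptions made for illustration, not something given in the question.

```python
import json

# Hypothetical index definition: "html_text" and "content" are made-up names.
# The html_strip char filter removes HTML markup before tokenizing.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "html_text": {
                    "type": "custom",
                    "char_filter": ["html_strip"],
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "content": {"type": "text", "analyzer": "html_text"}
        }
    },
}


def build_highlight_search(phrase, field="content"):
    """Build a search body that matches the phrase and highlights it."""
    return {
        "query": {"match_phrase": {field: phrase}},
        "highlight": {
            "pre_tags": ["<em>"],    # assumed pre tag
            "post_tags": ["</em>"],  # assumed post tag
            # 0 fragments asks for the whole field value, not snippets
            "fields": {field: {"number_of_fragments": 0}},
        },
    }


search_body = build_highlight_search("span tag")
print(json.dumps(search_body, indent=2))
```

The printed body would be sent to the _search endpoint of the index. Whether the highlighted fragments keep the original markup intact depends on the highlighter and mapping in use, so it is worth checking against real data.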

Related

Nested document full text query with filter capability

My index mappings and sample data are as follows
I need full-text search on this type of document with the following criteria:
country is one input of this search.
If I search "Alex 4455" and country is "xxx", this document should match and be returned.
If I search "Landing" and country is "xxx", this document should match and be returned.
If I search "Martin 4455" and country is "xxx", the result should be empty.
In other words, I need combined_field on a nested document, with filter capability.
I tried combined_field and saw that it does not work well for nested documents. I also tried query_string and found that it does not fit my needs.

query the first element of a list in ElasticSearch

In my Elasticsearch index I have fields that are lists of strings:
"city" = ["Boston","NY","Chicago"]
I need to write a query that searches only the first element of the list.
I have accomplished this by adding a new field that contains only the first element.
"city_first"="Boston"
I would like to avoid creating a new field. Is there a way to write a query that searches only the first element of the list in Elasticsearch?
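One option that avoids a stored extra field, sketched under assumptions: define a runtime field in the search request that reads the first list element from _source and query that instead. This assumes an Elasticsearch version that supports runtime_mappings and that _source is enabled; the runtime field name city_first_rt is made up. It is evaluated per document at query time, so it is likely slower than the indexed city_first field.

```python
import json


def build_first_element_search(field, value, runtime_name="city_first_rt"):
    """Search body matching documents whose FIRST element of `field` equals value.

    `runtime_name` is a hypothetical name for the query-time field.
    """
    # Painless script: read the original array from _source and emit only
    # its first element (or the single value if the field is not a list).
    script = (
        "def v = params._source['" + field + "'];"
        " if (v instanceof List) { if (!v.isEmpty()) { emit(v[0].toString()); } }"
        " else if (v != null) { emit(v.toString()); }"
    )
    return {
        "runtime_mappings": {
            runtime_name: {"type": "keyword", "script": {"source": script}}
        },
        "query": {"term": {runtime_name: value}},
    }


body = build_first_element_search("city", "Boston")
print(json.dumps(body, indent=2))
```

Reading from _source matters here because doc values for a multi-valued keyword field are not kept in the original order, so indexing into them would not reliably give "Boston".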

Elasticsearch : Query on one of the fields given in the list

Elasticsearch has documents indexed with the following fields:
{"id":"1", "title":"test", "locale_1_title":"locale_test"}
Given a query, the following behaviour is needed for each document:
1) If the locale_1_title field is not empty (""), search only on the locale_1_title field. Do not search on the title field.
2) If the locale_1_title field is empty, search on the title field.
What would be a simple Elasticsearch query to get the above behaviour?
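A minimal sketch of one way to express the two rules with a bool query: one branch matches on locale_1_title, the other matches on title but only for documents whose locale_1_title is missing or empty. The locale_1_title.keyword sub-field used to detect the empty string is an assumption about the mapping (it exists with default dynamic mappings, but may not in a custom one).

```python
import json


def build_locale_fallback_query(text, locale_field="locale_1_title",
                                default_field="title"):
    """Search locale_field when it has content, otherwise default_field."""
    # True for documents where the locale field is absent or the empty string.
    # ASSUMPTION: a ".keyword" sub-field exists so "" can be matched exactly.
    locale_is_empty = {
        "bool": {
            "should": [
                {"bool": {"must_not": {"exists": {"field": locale_field}}}},
                {"term": {locale_field + ".keyword": ""}},
            ],
            "minimum_should_match": 1,
        }
    }
    return {
        "query": {
            "bool": {
                "should": [
                    # Rule 1: a match here implies the locale field is non-empty.
                    {"match": {locale_field: text}},
                    # Rule 2: fall back to title only if the locale field is empty.
                    {
                        "bool": {
                            "must": {"match": {default_field: text}},
                            "filter": locale_is_empty,
                        }
                    },
                ],
                "minimum_should_match": 1,
            }
        }
    }


body = build_locale_fallback_query("test")
print(json.dumps(body, indent=2))
```

With the sample document above, the text "test" would not match through title, because locale_1_title is non-empty and only that field is consulted.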

elasticsearch match all words from document in the search query

We can search for ALL words in a specific document.field like this:
{ "query" : { "match" : { "title": { "query" : "Black Nike Mens", "operator" : "and" } } } }
This searches for the words Black, Nike and Mens in the title field, so that only documents containing ALL of these words in the title are returned.
But what I am trying to do is a little different.
I want a lookup that returns a document only if all the words in that document's title field are present in my search query.
For example:
Suppose there is a document with the title "Nike Free Sparq Mens White" in the Elasticsearch index.
If I search with the query "Nike Free Sparq 09 - Mens - White/Black/Varsity Red", it should return this document, because all the words in document.title exist in my query.
But if I search with the query "Nike Free Lebron - Mens - White/Black", it should NOT return the document, because the word Sparq is missing from my query.
This is a sort of reverse AND-operator search.
Is this possible? If yes, then how?
I finally got it to work, but not with a direct method!
This is what I do:
1) Create a clean list of words from the source query by changing it to lower case, replacing any special characters and punctuation with spaces, and removing duplicate words.
2) Search using a normal match with the OR operator, with the words joined into a single string.
3) This returns the most relevant hits.
4) Take those hits one by one and do a word-by-word check in PHP (or whatever programming language you use). The check compares every word of the hit's title against the words in the source query, and keeps the hit only if all of its words are present in the source query string.
This worked well enough for me!
That is, unless someone has a direct method using the Elasticsearch query language.
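The workaround above could look roughly like this in Python instead of PHP. It is only a sketch: the function names are made up, and the simple regex tokenizer is an assumption that will not exactly mirror whatever analyzer the title field uses.

```python
import re


def tokenize(text):
    """Lowercase, treat punctuation as separators, drop duplicate words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_or_query(search_text, field="title"):
    """Step 2: a normal match query with the OR operator over the clean words."""
    words = sorted(tokenize(search_text))
    return {
        "query": {
            "match": {field: {"query": " ".join(words), "operator": "or"}}
        }
    }


def filter_hits(search_text, hit_titles):
    """Step 4: keep only hits whose title words ALL occur in the search text."""
    query_words = tokenize(search_text)
    kept = []
    for title in hit_titles:
        title_words = tokenize(title)
        # Subset test: every word of the title must be in the query.
        if title_words and title_words <= query_words:
            kept.append(title)
    return kept


hits = ["Nike Free Sparq Mens White"]  # titles returned by the OR search
print(filter_hits("Nike Free Sparq 09 - Mens - White/Black/Varsity Red", hits))
print(filter_hits("Nike Free Lebron - Mens - White/Black", hits))
```

The first call keeps the document and the second returns an empty list, because "sparq" is not among the words of the second search string.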
The Percolate query should help here. You'd register your documents as queries, making "Nike Free Sparq Mens White" a match query with an AND operator.
Then your query can become a document like one having "Nike Free Sparq 09 - Mens - White/Black/Varsity Red" as content. You should get "Nike Free Sparq Mens White" back, because it matches all terms.
Unfortunately, this won't scale well (e.g. if you have millions of documents, it might get slow).
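As a rough sketch of that percolator setup, here are the three request bodies involved, built as Python dicts. The field name query for the percolator field is an assumption; title is taken from the question.

```python
import json

# Index mapping: "query" holds the stored queries (percolator type), and
# "title" must be mapped too because the stored queries refer to it.
index_body = {
    "mappings": {
        "properties": {
            "query": {"type": "percolator"},
            "title": {"type": "text"},
        }
    }
}


def title_as_stored_query(title):
    """Document to index: the product title registered as an AND match query."""
    return {
        "query": {"match": {"title": {"query": title, "operator": "and"}}}
    }


def percolate_search(search_text):
    """Search body: the user's search text is sent as a document to percolate."""
    return {
        "query": {
            "percolate": {
                "field": "query",
                "document": {"title": search_text},
            }
        }
    }


stored = title_as_stored_query("Nike Free Sparq Mens White")
search = percolate_search("Nike Free Sparq 09 - Mens - White/Black/Varsity Red")
print(json.dumps(stored, indent=2))
print(json.dumps(search, indent=2))
```

Each stored query matches only when every one of its title words appears in the percolated document, which is the reverse AND behaviour asked for.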

ElasticSearch Match Multiple Prefix Terms

I am trying to give ElasticSearch a query with multiple terms and then be given matching documents where the terms specified are anywhere in the target field. The terms may be full words or word prefixes.
Example document:
{
"msg": "hello I am a text message"
}
Example query string:
"hello message"
The words "hello" and "message" appear in the text so I want the document returned. The same query should also return the document if the query string is:
"hel mes"
What is the most performant way to query ElasticSearch to achieve this goal?
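One straightforward way to get this behaviour, as a sketch rather than a benchmarked answer: split the query string into terms and require a prefix query for each one inside a bool must. The helper name is made up, and lowercasing the terms assumes the msg field is analyzed with a lowercasing analyzer such as the default standard one.

```python
import json


def build_prefix_query(query_string, field="msg"):
    """Require every whitespace-separated term to match as a word prefix."""
    # Prefix queries are not analyzed, so lowercase the terms ourselves
    # (ASSUMPTION: the indexed terms are lowercased).
    terms = query_string.lower().split()
    return {
        "query": {
            "bool": {
                "must": [{"prefix": {field: {"value": term}}} for term in terms]
            }
        }
    }


body = build_prefix_query("hel mes")
print(json.dumps(body, indent=2))
```

A full word is also a prefix of itself, so "hello message" goes through the same query. For better query-time performance, commonly suggested alternatives are indexing edge n-grams of each word or enabling the index_prefixes option on the text field, at the cost of a larger index.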
