Custom highlights in Elasticsearch

I am a newbie to Elasticsearch. I have a task where I have to highlight certain queries with specific tags.
I am using a query similar to the one in the Elasticsearch intervals documentation. The problem is that I have to highlight "my favourite food" with one HTML tag, say "favorite", and "cold porridge" / "hot water" with a different HTML tag, say "state".
How can I do that?
POST _search
{
"query": {
"intervals" : {
"my_text" : {
"all_of" : {
"ordered" : true,
"intervals" : [
{
"match" : {
"query" : "my favourite food",
"max_gaps" : 0,
"ordered" : true
}
},
{
"any_of" : {
"intervals" : [
{ "match" : { "query" : "hot water" } },
{ "match" : { "query" : "cold porridge" } }
]
}
}
]
},
"boost" : 2.0,
"_name" : "favourite_food"
}
}
}
}

You can use the Highlighting feature in Elasticsearch as follows:
GET /index_name/_search
{
"query": {},
"highlight": {
"fields": {
"content": {
"type": "unified",
"number_of_fragments": 0,
"pre_tags": [
"<first_filter>",
"<second_filter>",
"<third_filter>"
],
"post_tags": [
"</first_filter>",
"</second_filter>",
"</third_filter>"
]
}
}
}
}
The order in which the tags are applied depends on the order in which the filters are applied. Also note that setting number_of_fragments to 0 returns the entire content with the hits tagged.
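As a rough sketch of how the pre_tags and post_tags arrays pair up, the Python helper below builds the highlight section from a list of tag names. The helper name build_highlight is hypothetical and not part of any Elasticsearch client; it only constructs the request body and does not contact a cluster. Whether tags after the first one are actually used may depend on the highlighter type and Elasticsearch version, so treat that as something to verify.

```python
import json


def build_highlight(field, tag_names):
    """Build a highlight section with matching opening/closing tags.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {
        "fields": {
            field: {
                "type": "unified",
                # 0 fragments: return the whole field with hits tagged
                "number_of_fragments": 0,
                "pre_tags": [f"<{name}>" for name in tag_names],
                "post_tags": [f"</{name}>" for name in tag_names],
            }
        }
    }


highlight = build_highlight("content", ["first_filter", "second_filter"])
print(json.dumps(highlight, indent=2))
```

Generating both arrays from one list keeps each closing tag in step with its opening tag.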

Related

How to get the best matching document in Elasticsearch?

I have an index where I store all the places used in my documents. I want to use this index to see if the user mentioned one of the places in the text query I receive.
Unfortunately, I have two documents whose name is similar enough to trick Elasticsearch scoring: Stockholm and Stockholm-Arlanda.
My test phrase is intyg stockholm and this is the query I use to get the best matching document.
{
"size": 1,
"query": {
"bool": {
"should": [
{
"match": {
"name": "intyg stockholm"
}
}
],
"must": [
{
"term": {
"type": {
"value": "4"
}
}
},
{
"terms": {
"name": [
"intyg",
"stockholm"
]
}
},
{
"exists": {
"field": "data.coordinates"
}
}
]
}
}
}
As you can see, I use a terms query to find the interesting documents and I use a match query in the should part of the root bool query to use scoring to get the document I want (Stockholm) on top.
This code worked locally (where I run ES in a container) but it broke when I started testing on a cluster hosted in AWS (where I have the exact same dataset). I found an explanation of what happens, and adding the search type argument does fix the issue.
Since that workaround is best not used in production, I'm looking for another way to get the expected result.
Here are the two documents:
// Stockholm
{
"type" : 4,
"name" : "Stockholm",
"id" : "42",
"searchableNames" : [
"Stockholm"
],
"uniqueId" : "Place:42",
"data" : {
"coordinates" : "59.32932349999999,18.0685808"
}
}
// Stockholm-Arlanda
{
"type" : 4,
"name" : "Stockholm-Arlanda",
"id" : "1832",
"searchableNames" : [
"Stockholm-Arlanda"
],
"uniqueId" : "Place:1832",
"data" : {
"coordinates" : "59.6497622,17.9237807"
}
}

Elasticsearch - use a field match to boost only and not to fetch the document

I have a query phrase that needs to match in any of the fields name, summary or description, or match exactly on the name field.
Now I have one more field, brand. A match in this field should be used only to boost results, meaning that if there is a match only in the brand field, the doc should not be in the result set.
To solve this without brand, I have the query below:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"multi_match": {
"query": "Cadbury chocklate milk",
"fields": ["name", "summary", "description"]
}
},
{
"term": {
"name_keyword": {
"value": "Cadbury chocklate milk"
}
}
}
]
}
}
}
This works fine for me.
How do I fetch the data using the same query but boost docs that have brand:cadbury, without enlarging the recall set (that is, without matching docs on brand:cadbury alone)?
Thanks!
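Wrapping your existing should clauses in a bool inside must should work for you. The brand term then goes in the outer should, where it only affects scoring.
multi_match has several types; for phrase matching you have to use "type": "phrase".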
{
"query": {
"bool": {
"must": [
{ "bool" :
{ "should" : [ {
"multi_match" :{
"type" : "phrase",
"query" : "Cadbury chocklate milk",
"fields" : ["name", "summary", "description"]
} }, {
"term": {
"name_keyword": {
"value": "Cadbury chocklate milk"
} }
}
]
}
}
],
"should" : {
"term" : {
"brand" : {
"value" : "cadbury"
}
}
}
}
}
}
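The structure of that answer can be sketched in Python as a function that separates the clauses that decide whether a doc matches from the clauses that only boost. The function name boost_only_query is made up for this sketch, and it only builds the request body. It relies on two bool behaviours: a bool with only should clauses needs at least one of them to match, and should clauses next to a must are optional and only add to the score.

```python
import json


def boost_only_query(matching_clauses, boost_clauses):
    """Build a bool query where boost_clauses never widen the result set.

    Hypothetical helper for illustration.
    """
    return {
        "query": {
            "bool": {
                # Inner bool: at least one of these must match.
                "must": [{"bool": {"should": matching_clauses}}],
                # Outer should: optional, contributes to score only.
                "should": boost_clauses,
            }
        }
    }


phrase = "Cadbury chocklate milk"
body = boost_only_query(
    matching_clauses=[
        {
            "multi_match": {
                "type": "phrase",
                "query": phrase,
                "fields": ["name", "summary", "description"],
            }
        },
        {"term": {"name_keyword": {"value": phrase}}},
    ],
    boost_clauses=[{"term": {"brand": {"value": "cadbury"}}}],
)
print(json.dumps(body, indent=2))
```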

Highlight not working along with term lookup filter

I'm new to Elasticsearch and have been exploring it for the past few days. My requirement is to get the matched keywords highlighted.
So I have 2 indices
http://localhost:9200/lookup/type/1?pretty
Output
{
"_index" : "lookup",
"_type" : "type",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source":{"terms":["Apache Storm","Kafka","MR","Pig","Hive","Hadoop","Mahout"]}
}
And another one as follows:
http://localhost:9200/skillsetanalyzer/resume/_search?fields=keySkills
output
{"took":19,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"skillsetanalyzer","_type":"resume","_id":"1","_score":1.0,"fields":{"keySkills":["Core Java","J2EE","Struts 1.x","SOAP based Web Services using JAX-WS","Maven","Ant","JMS","Apache Storm","Kafka","RDBMS (MySQL","Tomcat","Weblogic","Eclipse","Toad","TIBCO product Suite (Administrator","Business Work","Designer","EMS)","CVS","SVN"]}},
The query below returns the correct results but does not highlight the matched keywords.
curl -XGET 'localhost:9200/skillsetanalyzer/resume/_search?pretty' -d '
{
"query":
{"filtered":
{"filter":
{"terms":
{"keySkills":
{"index":"lookup",
"type":"type",
"id":"1",
"path":"terms"
},
"_cache_key":"1"
}
}
}
},
"highlight": {
"fields":{
"keySkills":{}
}
}
}'
The field "keySkills" is not analyzed and its type is string. I can't work out what is wrong with the query.
Please help by providing the necessary pointers.
~Shweta
Highlighting works against the query, but you are only filtering the results, so the highlighter has nothing to match against. You need to specify highlight_query along with your filter, like this:
{
"query": {
"filtered": {
"filter": {
"terms": {
"keySkills": [
"MR","Pig","Hive"
]
}
}
}
},
"highlight": {
"fields": {
"keySkills": {
"highlight_query": {
"terms": {
"keySkills": [
"MR","Pig","Hive"
]
}
}
}
}
}
}
I hope this helps.
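To keep the filter and the highlight_query from drifting apart, the request can be generated from a single list of terms. Here is a minimal Python sketch; the function name filter_with_highlight is hypothetical and it only builds the request body. It keeps the legacy "filtered" query syntax used in this thread, which was deprecated in Elasticsearch 2.0 and removed in 5.0, so newer versions would need a bool query with a filter clause instead.

```python
import json


def filter_with_highlight(field, terms):
    """Build a request that filters on `terms` and highlights the same terms.

    Hypothetical helper; uses the legacy "filtered" syntax from the thread.
    """
    terms = list(terms)
    return {
        "query": {"filtered": {"filter": {"terms": {field: terms}}}},
        "highlight": {
            "fields": {
                # The highlighter gets its own copy of the terms query.
                field: {"highlight_query": {"terms": {field: list(terms)}}}
            }
        },
    }


request = filter_with_highlight("keySkills", ["MR", "Pig", "Hive"])
print(json.dumps(request, indent=2))
```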

Elastic Search NEST - How to have multiple levels of filters in search

I would like to have multiple levels of filters to derive a result set using the NEST API in Elasticsearch. Is it possible to query the results of another filter? If yes, can I do that at multiple levels?
My requirement is that a user is allowed to select / unselect options for various fields.
Example: There are 1000 documents in total in my index 'people'. There may be 3 list boxes: 1) City 2) Favourite Food 3) Favourite Colour. If the user selects a city, it narrows the results to 600 documents. Out of those 600 documents I would like to filter by favourite food, which may leave some 300 documents. Then I would like to filter further by favourite colour to retrieve 50 documents out of the previously derived 300.
You don't need to query within filters to achieve what you want. Just use a filtered query (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html) and provide several filters. In your case I assume you would do something like this for your first query:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
}
]
}
}
}
You would then return the results from that and display them. You'd then let the user select the next filter and do the following:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
},
{
"term" : {
"food" : "some food"
}
}
]
}
}
}
You'd then rinse and repeat for the third filter parameter:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
},
{
"term" : {
"food" : "some food"
}
},
{
"term" : {
"colour" : "some colour"
}
}
]
}
}
}
I haven't tested this, but the principle is sound and will work.
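The three queries above differ only in how many term filters sit in the "and" array, so they can be generated from whatever the user has currently selected. A small Python sketch of that idea follows; the function name build_filtered_query and the selections dict are assumptions for illustration, and it only builds the query body in the same legacy "filtered" syntax.

```python
import json


def build_filtered_query(selections):
    """Turn {field: value} selections into one filtered query.

    Hypothetical helper; one term filter is added per selected field.
    """
    term_filters = [{"term": {field: value}} for field, value in selections.items()]
    return {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {"and": term_filters},
        }
    }


selections = {"city": "some city"}
q1 = build_filtered_query(selections)   # after the first list box

selections["food"] = "some food"
q2 = build_filtered_query(selections)   # after the second list box

selections["colour"] = "some colour"
q3 = build_filtered_query(selections)   # after the third list box

print(json.dumps(q3, indent=2))
```

Each request is sent as a fresh search with all active filters, rather than searching inside the previous result set; unselecting an option just means removing its key before rebuilding.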

ElasticSearch Aggregation Query possible?

I have the following JSON structures which I want to run an aggregation query in ElasticSearch upon:
{
"user_id": 1,
"gender_id": 1,
"locale_id": 6,
"age": 38,
"hometown_city_id": 1002,
"current_city_id": 672,
"books": [
{
"b_id": 2065,
"b_name": "aut qui assumenda ut",
"b_cat": 56
},
{
"b_id": 2527,
"b_name": "libero et laudantium",
"b_cat": 132
},
...
]
}
What I basically want to do is, for example, show the top 5 books read by users with "gender_id": 1 (male) who have also read the book with "b_id": 2065.
Being a beginner with ElasticSearch, I haven't come across a solution yet. I know there's the Aggregation Module (https://github.com/elasticsearch/elasticsearch/issues/3300) coming with v1.0, but I couldn't get it running.
If somebody has already implemented something similar, please help! Thanks a lot in advance!
Maybe something like this (not sure about the nested books.field):
{
"query" : {
"match_all" : { }
},
"facets" : {
"filter_one" : {
"filter" : {
"and" : [
{"term" : { "gender_id" : 1 }},
{"term" : { "books.b_id" : 2065 }}
]
}
},
"book_cnt" : {
"terms" : {
"field" : "books.b_name",
"size" : 5
}
}
}
}
Thanks @mconlin for the hint. The query I got working is the following:
{
"query": {
"filtered": {
"filter": {
"and" : [
{ "term": { "gender_id": 1 } },
{ "term": { "books.b_id": 2065} }
]
}
}
},
"facets": {
"book_cnt": {
"terms": {
"field": "books.b_id",
"size": 5
}
}
}
}
Turns out it's better to use the filter in the query itself instead of as a separate facet.
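To make clear what this query computes, here is a small Python sketch that reproduces the same logic in memory: keep only users matching both filters, then count how many of those users have each book. The function name top_books and the sample users are made up for illustration; this does not talk to Elasticsearch.

```python
from collections import Counter


def top_books(users, gender_id, must_have_book_id, size=5):
    """Count books among users who match the gender and have read a given book.

    Hypothetical in-memory equivalent of the filter plus terms facet.
    """
    counts = Counter()
    for user in users:
        book_ids = {book["b_id"] for book in user["books"]}
        if user["gender_id"] == gender_id and must_have_book_id in book_ids:
            # Each user counts once per distinct book.
            counts.update(book_ids)
    return counts.most_common(size)


# Made-up sample data in the shape of the documents above.
users = [
    {"user_id": 1, "gender_id": 1, "books": [{"b_id": 2065}, {"b_id": 2527}]},
    {"user_id": 2, "gender_id": 1, "books": [{"b_id": 2065}]},
    {"user_id": 3, "gender_id": 2, "books": [{"b_id": 2065}, {"b_id": 2527}]},
]

result = top_books(users, gender_id=1, must_have_book_id=2065)
print(result)  # [(2065, 2), (2527, 1)]
```

Note that the book used in the filter always comes out on top, since every matching user has read it, so you may want to request one extra entry and drop it.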

Resources