Elasticsearch multi_match with phrase_prefix not working

I am running the below search query on my index
{
"_source": "false",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["email","name", "company", "phone"],
"query": "tes",
"type" : "phrase_prefix"
}
}
]
}
},
"highlight": {
"fields": {"name": {}, "company" : {}, "email" : {}, "phone" : {}}
}
}
I have some sample data with the field values
name: test paddy
name : test user
name : test logger
name : test
When I run the above query, I do not get any results, but when I change it to "query": "test", I get one result, "test". I was expecting to see all of the above names in both cases. Am I missing something here?
UPDATE
I also noticed that this works with text fields but fails with keyword fields, long fields, etc. Also, when I tried
{ "query": {
"prefix" : { "phone" : 99 }
}
}
with number fields and keyword fields, it works.
So do multi_match and prefix not work well with keyword and number fields?

The issue was that I was running this on keyword fields. I changed them to text and it worked like a beauty. I should have read the documentation more carefully!
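As a rough sketch of the fix described above, the fields can be mapped as text (optionally keeping a keyword sub-field for exact matching) before reindexing. The `raw` sub-field name, the typeless mapping format (Elasticsearch 7 or later), and the helper names are assumptions made for illustration, not taken from the question.

```python
import json

SEARCH_FIELDS = ["email", "name", "company", "phone"]


def build_mapping(fields):
    """Map each field as analyzed text, keeping a keyword sub-field for exact matches."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    # "raw" is a hypothetical sub-field name for exact lookups.
                    "fields": {"raw": {"type": "keyword"}},
                }
                for field in fields
            }
        }
    }


def build_prefix_search(text, fields):
    """Build the phrase_prefix search from the question against the text fields."""
    return {
        "_source": False,
        "query": {
            "multi_match": {
                "query": text,
                "fields": list(fields),
                "type": "phrase_prefix",
            }
        },
        "highlight": {"fields": {field: {} for field in fields}},
    }


if __name__ == "__main__":
    print(json.dumps(build_mapping(SEARCH_FIELDS), indent=2))
    print(json.dumps(build_prefix_search("tes", SEARCH_FIELDS), indent=2))
```

The two dictionaries can be passed to whichever client is in use as the index-creation body and the search body respectively.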

Related

Elasticsearch: Ordering of Highlighted results does not work

I am trying to order the highlighted results returned by Elasticsearch. As per the documentation, here is how I do so:
res = es.search(
index="my-index",
size=30,
body=
{
"query":
{
"multi_match":
{
"fields":["chapter_name","chapter_id","subchapter_name","subchapter_id","range_name","range_id","item_name","item_id"],
"query": "diamond core bit adapters" ,
"type":"best_fields",
"fuzziness": "1",
"tie_breaker": 0.3
}
},
"highlight" :
{
"type":"unified",
"order": "score",
"fields" :
{
"chapter_name" : {},
"chapter_id" : {},
"subchapter_name":{},
"subchapter_id":{},
"range_name":{},
"range_id":{},
"item_name":{},
"item_id":{}
}
},
})
However, as part of my results I get something like this:
{u'item_name': [u'<em>Core</em> <em>bit</em> <em>adapter</em> DDBU 1 14 UNC'], u'subchapter_name': [u'<em>Diamond</em> Drilling Accessories'], u'chapter_name': [u'<em>Diamond</em> <em>Coring</em> Sawing'], u'range_name': [u'<em>Diamond</em> <em>core</em> <em>bit</em> <em>adapters</em>']}
Clearly, the field 'range_name' has a higher number of highlighted fragments, but it appears lower down the order.
Can anyone help me out with this?

Inconsistent behavior of ElasticSearch not_analyzed field

I am using ES version 2.3. I have indexed some documents with a structure like this:
{
"BUSINESSLINE" :"ABC CORP",
"NAME" : "John"
....
...
}
The field BUSINESSLINE is not_analyzed string.
The problem is that this query returns results :
{
"query": {
"multi_match" : {
"query": "ABC",
"fields": [ "_all" ]
}
}
}
But this one does not (It shows no hits!):
{
"query": {
"multi_match" : {
"query": "ABC",
"fields": [ "BUSINESSLINE " ]
}
}
}
Any help is appreciated. I tried to google and research this, but I am not able to find any reason for it.
Thanks!
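Yes, you are correct. The first query matches the document because of the _all field, which is one big string built by concatenating all fields with a space separator. It is also analyzed, which is why your query matches. The BUSINESSLINE field, being not_analyzed, is indexed as the single term "ABC CORP", so a query for "ABC" alone does not match it.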
You can read more about it here.
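The difference can be illustrated with a small simulation. This is only a simplified model: the lowercase-and-split function below is a rough stand-in for an analyzed field, not Elasticsearch's actual analyzer, and all function names are made up for the example.

```python
def analyze_standard(value):
    """Rough stand-in for an analyzed field: lowercase and split on whitespace."""
    return value.lower().split()


def analyze_not_analyzed(value):
    """A not_analyzed field indexes the whole value as one unmodified term."""
    return [value]


def matches(query, index_terms, analyzer):
    """Analyze the query text the same way as the field, then look up its terms."""
    return any(term in index_terms for term in analyzer(query))


doc = {"BUSINESSLINE": "ABC CORP", "NAME": "John"}

# _all concatenates the field values with spaces and is analyzed.
all_terms = set(analyze_standard(" ".join(doc.values())))
# BUSINESSLINE keeps the exact string as its only term.
businessline_terms = set(analyze_not_analyzed(doc["BUSINESSLINE"]))

print(matches("ABC", all_terms, analyze_standard))                    # True
print(matches("ABC", businessline_terms, analyze_not_analyzed))       # False
print(matches("ABC CORP", businessline_terms, analyze_not_analyzed))  # True
```

In this model, searching BUSINESSLINE only succeeds with the full, exact value, which mirrors the behavior seen in the question.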

How to make use of `gt` and `fields` in the same query in Elasticsearch

In my previous question, I was introduced to the fields in a query_string query and how it can help me to search nested fields of a document.
{
"query": {
"query_string": {
"fields": ["*.id","id"],
"query": "2"
}
}
}
But it only works for matching, what if I want to do some comparison? After some reading and testing, it seems queries like range do not support fields. Is there any way I can perform a range query, e.g. on a date, over a field that can be scattered anywhere in the document hierarchy?
i.e. considering the following document:
{
"id" : 1,
"Comment" : "Comment 1",
"date" : "2016-08-16T15:22:36.967489",
"Reply" : [ {
"id" : 2,
"Comment" : "Inner comment",
"date" : "2016-08-16T16:22:36.967489"
} ]
}
Is there a query searching over the date field (like date > '2016-08-16T16:00:00.000000') which matches the given document, because of the nested field, without explicitly giving the address to Reply.date? Something like this (I know the following query is incorrect):
{
"query": {
"range" : {
"date" : {
"gte" : "2016-08-16T16:00:00.000000",
},
"fields": ["date", "*.date"]
}
}
}
The range query itself doesn't support it, however, you can leverage the query_string query (again) and the fact that you can wildcard fields and that it supports range queries in order to achieve what you need:
{
"query": {
"query_string": {
"query": "\\*date:[2016-08-16T16:00:00.000Z TO *]"
}
}
}
The above query will return your document because Reply.date matches *date.
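When the request body is built from Python, the backslash that escapes the wildcard in the field name has to survive JSON encoding. A minimal sketch, assuming a hypothetical helper named `wildcard_date_range`:

```python
import json


def wildcard_date_range(field_pattern, start):
    """Build a query_string range over every field matching field_pattern.

    A '*' in a field name is escaped with a backslash in query_string
    syntax; json.dumps then doubles that backslash in the JSON body.
    """
    escaped = field_pattern.replace("*", "\\*")
    return {"query": {"query_string": {"query": f"{escaped}:[{start} TO *]"}}}


body = wildcard_date_range("*date", "2016-08-16T16:00:00.000Z")
print(json.dumps(body))
```

Building the body as a dictionary and letting `json.dumps` serialize it avoids hand-writing the double backslash in a raw JSON string.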

Different boosting for the same field in different types in Elasticsearch 2.x with multi_match query

I am trying to do the following as described in the documentation (which is maybe outdated at present date).
https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping.html
I will adapt the scenario described there to what I want to achieve.
Imagine that we have two types in our index: blog_t1 for blog posts
about Topic 1, and blog_t2 for blog posts about Topic 2. Both types
have a title field.
Then, I want to apply query boosting to the title field for blog_t1
only.
In previous versions of Elasticsearch, you could reference the field
from the type by using blog_t1.title and blog_t2.title. So boosting
one of them was as simple as blog_t1.title^2.
But since Elasticsearch 2.x, some old support for types has been removed (for good reasons, like removing ambiguity). Those changes are described here.
https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking_20_mapping_changes.html
So my question is, how can I do that boosting for the title, just for the type blog_t1, and not blog_t2, with Elasticsearch 2.x, in a multi_match query?
The query would be something like this, but this obviously does not work as type.field is not a thing anymore.
GET /my_index/_search
{
"query": {
"multi_match": {
"query": "Hello World",
"fields": [
"blog_t1.title^2",
"blog_*.title",
"author",
"content"
]
}
}
}
FYI, the only solution I found so far is to give the titles different names, like title_boosted for blog_t1 and just title for the others, which is problematic when making use of the information, as I can no longer use the "title" as a unique thing.
Thanks.
What about adding another "optional" constraint on the document type, so that docs matching it score higher (you can tune it with boosting), like this:
{
"query" : {
"bool" :
{
"must" :
[
{"match" : {"title" : "Hello world"}}
],
"should" :
[
{"match" : {"_type" : "blog_t1"}}
]
}
}
}
Or with score functions:
{
"query": {
"function_score": {
"query": {
"match": {
"title": "Hello world"
}
},
"boost_mode": "multiply",
"functions": [
{
"filter": {
"term": {
"_type": "blog_t1"
}
},
"weight": 2
},
{
"filter": {
"term": {
"_type": "blog_t2"
}
},
"weight": 3
}
]
}
}
}
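To see what the function_score variant does to the ranking, the arithmetic can be modeled in a few lines. This is a simplified sketch, not Elasticsearch code: it assumes each document has exactly one type, so at most one filter matches, and that a document matching no filter keeps its original score. The function name and the sample score of 1.5 are made up for illustration.

```python
def final_score(query_score, doc_type, type_weights):
    """Model boost_mode "multiply": query score times the matching filter's weight."""
    # Each document has one _type, so at most one filter matches; with no
    # match the score is assumed to be left unchanged (factor 1.0).
    return query_score * type_weights.get(doc_type, 1.0)


weights = {"blog_t1": 2.0, "blog_t2": 3.0}

print(final_score(1.5, "blog_t1", weights))  # 3.0
print(final_score(1.5, "blog_t2", weights))  # 4.5
print(final_score(1.5, "other", weights))    # 1.5
```

To boost only blog_t1, as the question asks, a single function with a filter on that type would be enough under this model.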

Highlight not working along with term lookup filter

I'm new to elastic search and have started exploring it from the past few days. My requirement is to get the matched keywords highlighted.
So I have 2 indices
http://localhost:9200/lookup/type/1?pretty
Output
{
"_index" : "lookup",
"_type" : "type",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source":{"terms":["Apache Storm","Kafka","MR","Pig","Hive","Hadoop","Mahout"]}
}
And another one as following:-
http://localhost:9200/skillsetanalyzer/resume/_search?fields=keySkills
output
{"took":19,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"skillsetanalyzer","_type":"resume","_id":"1","_score":1.0,"fields":{"keySkills":["Core Java","J2EE","Struts 1.x","SOAP based Web Services using JAX-WS","Maven","Ant","JMS","Apache Storm","Kafka","RDBMS (MySQL","Tomcat","Weblogic","Eclipse","Toad","TIBCO product Suite (Administrator","Business Work","Designer","EMS)","CVS","SVN"]}},
And below query returns the correct results but does not highlight the matched keywords.
curl -XGET 'localhost:9200/skillsetanalyzer/resume/_search?pretty' -d '
{
"query":
{"filtered":
{"filter":
{"terms":
{"keySkills":
{"index":"lookup",
"type":"type",
"id":"1",
"path":"terms"
},
"_cache_key":"1"
}
}
}
},
"highlight": {
"fields":{
"keySkills":{}
}
}
}'
Field "KeySkills" is not analyzed and its type is String. I'm not able to make out what is wrong with the query.
Please help in providing the necessary pointers.
~Shweta
Highlighting works against the query, but you are only filtering the results. You need to specify a highlight_query alongside your filter, like this:
{
"query": {
"filtered": {
"filter": {
"terms": {
"keySkills": [
"MR","Pig","Hive"
]
}
}
}
},
"highlight": {
"fields": {
"keySkills": {
"highlight_query": {
"terms": {
"keySkills": [
"MR","Pig","Hive"
]
}
}
}
}
}
}
I hope this helps.
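Since the same terms have to appear in both the filter and the highlight_query, it can help to build the request from one list so the two cannot drift apart. A minimal sketch with a hypothetical helper name; it uses a bool/filter clause instead of the older filtered query, which is an assumption for Elasticsearch 5 or later:

```python
import json


def terms_search_with_highlight(field, terms):
    """Filter on `terms` and repeat them in highlight_query so matches are highlighted."""
    terms_clause = {"terms": {field: list(terms)}}
    return {
        # bool/filter stands in for the older "filtered" query (an assumption
        # for Elasticsearch 5 or later); the highlight part follows the answer.
        "query": {"bool": {"filter": [terms_clause]}},
        "highlight": {"fields": {field: {"highlight_query": terms_clause}}},
    }


body = terms_search_with_highlight("keySkills", ["MR", "Pig", "Hive"])
print(json.dumps(body, indent=2))
```

With the terms lookup from the question, the list would first have to be fetched from the lookup document, because the highlight_query in the answer is given the terms explicitly.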
