Elasticsearch highlight issue on aggregated fields

I am using Elasticsearch and want highlighted fields in the aggregated results of a search query.
I do not need the search hits themselves, so I set size to 0, which returns only the aggregation results.
Now I want to apply highlighting to the aggregated results, but it is not working. I am using a terms aggregation with a top_hits aggregation as its sub-aggregation. The ES docs say that the top_hits aggregation supports highlighting.
My query is structured like this:
{
size:0,
query:{
.......
},
aggregation:{
name-of-agg:{
term:{
....
},
aggregation:{
name-of-sub-agg:{
top-hits:{
....
}
}
}
},
highlight:{
fields:{
fieldname:{
}
}
}
}
}

You have to set highlight inside the top_hits aggregation's own properties, not as a sibling entry inside aggregations.
Here is a working minimal example:
echo create index
curl -XPUT 'http://127.0.0.1:9010/files?pretty=1' -d '
{
"settings": {
}
}'
echo create type
curl -XPUT 'http://127.0.0.1:9010/files/_mapping/file?pretty=1' -d'
{
"properties":{
"fileName":{
"type":"string",
"term_vector":"with_positions_offsets"
}
}
}
'
echo insert files
curl -XPUT 'http://127.0.0.1:9010/files/file/1?pretty=1' -d'
{
"fileName":"quick brown fox"
}
'
echo flush
curl -XPOST 'http://127.0.0.1:9010/files/_flush?pretty=1'
echo search brown tophits
curl -XGET 'http://127.0.0.1:9010/files/file/_search?pretty=1' -d '
{
"size" : 0,
"query":{
"match":{
"fileName":"brown"
}
},
"aggregations" : {
"docs" : {
"top_hits" : {
"highlight": {
"fields": {
"fileName": {}
}
}
}
}
}
}'
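For the structure in the question (a terms aggregation with a top_hits sub-aggregation), the same rule can be sketched in Python as a plain request body. This is only a sketch: the aggregation names by_term and top_docs and the helper build_highlighted_agg_query are made up for illustration, and the body is printed rather than sent to a server.

```python
import json


def build_highlighted_agg_query(match_field, match_text, bucket_field):
    """Build a size-0 search body whose top_hits sub-aggregation carries the highlight."""
    return {
        "size": 0,  # no top-level hits, only aggregation results
        "query": {"match": {match_field: match_text}},
        "aggregations": {
            "by_term": {  # hypothetical aggregation name
                "terms": {"field": bucket_field},
                "aggregations": {
                    "top_docs": {  # hypothetical sub-aggregation name
                        "top_hits": {
                            "size": 1,
                            # highlight is a property of top_hits itself,
                            # not a sibling of the aggregations
                            "highlight": {"fields": {match_field: {}}},
                        }
                    }
                },
            }
        },
    }


body = build_highlighted_agg_query("fileName", "brown", "fileName")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the -d payload of the search request shown above.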

Elasticsearch mappings doesn't seem to apply for query

I use Elasticsearch with a Spring Boot application. In this application
I have an index customer, and each customer contains a field secretKey. This secret key is a string built from letters and numbers in the form FOOBAR-000.
My goal was to select exactly one customer by their secret key, so I changed the mapping to NOT ANALYZE that field, but it does not seem to work. What am I doing wrong?
Here's my mapping:
curl -X GET 'http://localhost:9200/customer/_mapping'
{
"customer": {
"mappings": {
"customer": {
"properties": {
"secretKey": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
but when I run this query
curl -XGET "http://localhost:9200/customer/_validate/query?explain" -d'
{
"query": {
"query_string": {
"query": "FOOBAR-3121"
}
}
}'
I get the following explanation:
"explanations": [
{
"index": "customer",
"valid": true,
"explanation": "_all:foobar _all:3121"
}
]
From my understanding you have an index called "customer" and, within this index, a document containing a "customer" field. In your case the secretKey should be nested in the "customer" field. For some reason Elasticsearch behaves strangely if you encapsulate objects without specifying that they are of nested type. The documentation on nested types explains this behaviour in detail. If you specify it as follows:
{
"customer": {
"mappings": {
"_doc": {
"properties": {
"customer": {
"type": "nested"
}
}
}
}
}
}
Then it should work with your query.
You need to specify a field name in your query. Without one, Elasticsearch runs the query against all fields, which is why you see _all in the explanation. Try this one:
curl -XGET "http://localhost:9200/customer/_validate/query?explain" -d'
{
"query": {
"term": {
"secretKey": {
"value": "FOOBAR-3121"
}
}
}
}'
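To see where the "_all:foobar _all:3121" explanation comes from, here is a rough Python sketch. It assumes the analyzer behaves roughly like "split on non-alphanumeric characters and lowercase", which is only an approximation of the standard analyzer; the function names are hypothetical and nothing here calls Elasticsearch.

```python
import re


def rough_standard_tokens(text):
    """Very rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def explain_query_string(text, field="_all"):
    """Mimic the shape of the _validate/query explanation for an analyzed field."""
    return " ".join(f"{field}:{tok}" for tok in rough_standard_tokens(text))


def explain_term(text, field):
    """A term query is not analyzed, so the value is looked up as one whole term."""
    return f"{field}:{text}"


print(explain_query_string("FOOBAR-3121"))       # _all:foobar _all:3121
print(explain_term("FOOBAR-3121", "secretKey"))  # secretKey:FOOBAR-3121
```

The first line shows the secret key being broken into two lowercase tokens against _all, while the term query on secretKey keeps the value intact, which is what a not_analyzed field needs.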
My goal was to select exactly one customer by his secret key
Your requirement is strict, so use a match query to select ONLY the matching customer!
curl -XGET "http://localhost:9200/customer/_validate/query?explain" -d'
{
"query": {
"match": {
"secretKey": "FOOBAR-3121"
}
}
}'

Elastic Search Percolate Boolean Queries

I am trying to get boolean queries which are stored in ES using Percolate API.
Index mapping is given below
curl -XPUT 'localhost:9200/my-index' -d '{
"mappings": {
"taggers": {
"properties": {
"content": {
"type": "string"
}
}
}
}
}'
I am inserting records like this (Queries contain proper boolean format (AND, OR, NOT etc) as given in below example)
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
"query" : {
"match" : {
"content" : "Audi AND BMW"
}
}
}'
And then I am posting a document to get matched queries.
curl -XGET 'localhost:9200/my-index/my-type/_percolate' -d '{
"doc" : {
"content" : "I like audi very much"
}
}'
In the above case no queries should be returned, because the boolean query is "Audi AND BMW", but the query is still returned. It seems the AND condition is being ignored. I cannot figure out why this does not work for boolean queries.
You need to register this query instead. match queries do not understand the AND operator (they treat it like the ordinary token "and"), but query_string does.
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
"query" : {
"query_string" : {
"query" : "Audi AND BMW",
"default_field": "content"
}
}
}'
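The difference can be illustrated with a toy model in Python. This is not how Elasticsearch implements either query; it only assumes that match analyzes the whole text into tokens combined with OR, while query_string treats an uppercase AND as an operator. All function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercased word tokens, as a rough stand-in for analysis."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_match(query_text, doc_text):
    """match with default OR: any shared token is enough; 'AND' is just the token 'and'."""
    return bool(tokens(query_text) & tokens(doc_text))


def toy_query_string_and(query_text, doc_text):
    """query_string-like: treat uppercase AND as an operator and require every operand."""
    operands = [part.strip() for part in query_text.split(" AND ")]
    doc = tokens(doc_text)
    return all(tokens(op) <= doc for op in operands)


doc = "I like audi very much"
print(toy_match("Audi AND BMW", doc))             # True
print(toy_query_string_and("Audi AND BMW", doc))  # False
```

In the toy model the match version fires because "audi" alone overlaps with the document, which mirrors the unexpected result in the question.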

Elasticsearch find ONLY perfect match

I've been trying to do this for some time, read and searched a lot and I haven't found any definitive answer or solution.
Let's say we add some documents:
$ curl -XPUT http://localhost:9200/tm/entries/1 -d '{"item": "foo" }'
{"_index":"tm","_type":"entries","_id":"1","_version":1,"created":true}
$ curl -XPUT http://localhost:9200/tm/entries/2 -d '{"item": "foo bar" }'
{"_index":"tm","_type":"entries","_id":"2","_version":1,"created":true}
$ curl -XPUT http://localhost:9200/tm/entries/3 -d '{"item": "foo bar foo" }'
{"_index":"tm","_type":"entries","_id":"3","_version":1,"created":true}
After this, I want to find ONLY the document(s) that perfectly match the search query:
$ curl -XGET http://localhost:9200/_search?q=foo
The result contains all 3 documents and I only want to get the one which matches "foo" only and nothing else.
Also,
$ curl -XGET http://localhost:9200/_search?q=bar foo
Should not return any results.
Can Elasticsearch do that?
How?
Update:
Existing mapping:
{
"tm": {
"mappings": {
"entries": {
"properties": {
"item": {
"type": "string"
}
}
}
}
}
}
Use the following mapping.
{
"tm": {
"mappings": {
"entries": {
"properties": {
"item": {
"type": "string",
"index" : "not_analyzed"
}
}
}
}
}
}
Then use a term query to find an exact match. Term queries are not analyzed.
curl -XGET "http://localhost:9200/tm/entries/_search" -d'
{
"query": {
"term": {
"item": {
"value": "foo bar"
}
}
}
}'
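Why not_analyzed plus term gives an exact match can be sketched with a toy inverted index in Python. The assumption here is a simplified model: an analyzed field is indexed as separate lowercase words, a not_analyzed field as one whole term. build_index and term_lookup are hypothetical helpers, not Elasticsearch APIs.

```python
from collections import defaultdict

docs = {1: "foo", 2: "foo bar", 3: "foo bar foo"}


def build_index(docs, analyzed):
    """Map each indexed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        # analyzed: one term per word; not analyzed: the whole value is one term
        terms = value.lower().split() if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index


def term_lookup(index, term):
    """A term query looks the given term up as-is, without analyzing it."""
    return sorted(index.get(term, set()))


analyzed = build_index(docs, analyzed=True)
exact = build_index(docs, analyzed=False)
print(term_lookup(analyzed, "foo"))   # [1, 2, 3]
print(term_lookup(exact, "foo"))      # [1]
print(term_lookup(exact, "bar foo"))  # []
print(term_lookup(exact, "foo bar"))  # [2]
```

With the whole value stored as a single term, "foo" finds only document 1 and "bar foo" finds nothing, which is the behaviour asked for in the question.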
Try adding "index" : "not_analyzed" in the mapping.
And the query should be something like:
{
"match_phrase": {
"item": "foo"
}
}
You should use match query instead of query_string. It'll solve your issue.
{
"match" : {
"item" : "bar foo"
}
}
Also, make sure the terms you are searching for are actually present in the indexed field. For that you need to use the "keyword" analyzer.
Thanks
If you are trying to search with a GET request, I think this might help:
$ curl -XGET http://localhost:9200/tm/entries/_search?q=item:foo
The syntax is _search?q=<field>:<value>.
The documentation for this is under URI Search.
Also, if you want to filter, it is good to have the field mapped as not_analyzed (as described above).
For more complex queries:
curl -XPOST "http://localhost:9200/tm/entries/_search" -d'
{
"filter": {
"term": {
"item": "foo"
}
}
}'
Hope this helps.

How to make query_string search exact phrase in ElasticSearch

I put 2 documents in Elasticsearch:
curl -XPUT "http://localhost:9200/vehicles/vehicle/1" -d'
{
"model": "Classe A"
}'
curl -XPUT "http://localhost:9200/vehicles/vehicle/2" -d'
{
"model": "Classe B"
}'
Why does this query return both documents:
curl -XPOST "http://localhost:9200/vehicles/_search" -d'
{
"query": {
"query_string": {
"query": "model:\"Classe A\""
}
}
}'
And why does this one return only the second document:
curl -XPOST "http://localhost:9200/vehicles/_search" -d'
{
"query": {
"query_string": {
"query": "model:\"Classe B\""
}
}
}'
I want Elasticsearch to match on the exact phrase I pass in the query parameter, WITH the whitespace. How can I do that?
What you need to look at is the analyzer you're using. If you don't specify one, Elasticsearch will use the Standard Analyzer. It is great for the majority of cases with plain text input, but it doesn't work for the use case you mention.
The standard analyzer splits your string into words and then converts them to lowercase.
If you want to match the whole string "Classe A" and distinguish it from "Classe B", you can use the Keyword Analyzer. This keeps the entire field as one string.
Then you can use the match query, which will return the results you expect.
Create the mapping:
PUT vehicles
{
"mappings": {
"vehicle": {
"properties": {
"model": {
"type": "string",
"analyzer": "keyword"
}
}
}
}
}
Perform the query:
POST vehicles/_search
{
"query": {
"match": {
"model": "Classe A"
}
}
}
If you want to use the query_string query, you can set the default operator to AND:
POST vehicles/vehicle/_search
{
"query": {
"query_string": {
"query": "Classe B",
"default_operator": "AND"
}
}
}
Additionally, using query_string with escaped quotes will also return an exact phrase:
POST _search
{
"query": {
"query_string": {
"query": "\"Classe A\""
}
}
}
Use a match_phrase query as shown below:
GET /company/employee/_search
{
"query" : {
"match_phrase" : {
"about" : "rock climbing"
}
}
}
It seems that in the latest versions of ES you can just use .keyword:
POST vehicles/_search
{
"query": {
"term": {
"model.keyword": "Classe A"
}
}
}
It will match exactly the string "Classe A".
Fields that ES dynamically maps as text get a 'keyword' subfield, which is very useful for these cases:
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html
Another nice solution is to use match with minimum_should_match, giving the percentage of the words that must match. It can be 100%, in which case the results contain at least all the words of the given text.
Note that this approach does NOT take the order of the words into account.
{
"query":{
"bool":{
"should":[
{
"match":{
"my_text":{
"query":"I want to buy a new new car",
"minimum_should_match":"90%"
}
}
}
]
}
}
}
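As a back-of-the-envelope model of the percentage, here is a small Python sketch. It relies on the documented rule that the number computed from a positive percentage is rounded down, and it assumes (for illustration only) that every whitespace-separated query word counts as one optional clause; both helper names are hypothetical.

```python
def required_clauses(optional_clauses, percent):
    """Clauses that must match for a positive percentage (rounded down)."""
    return (optional_clauses * percent) // 100


def toy_min_should_match(query_text, doc_text, percent):
    """Toy check: enough query words appear in the document, in any order."""
    query_terms = query_text.lower().split()
    doc_terms = set(doc_text.lower().split())
    matched = sum(1 for term in query_terms if term in doc_terms)
    return matched >= required_clauses(len(query_terms), percent)


print(required_clauses(8, 90))                              # 7
print(toy_min_should_match("Classe A", "a classe", 100))    # True, order ignored
print(toy_min_should_match("Classe A", "Classe B", 100))    # False
```

The second call shows why this approach is weaker than a phrase match: the words are all present, so the document qualifies even though their order differs.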

No query registered for [match]

I'm working through some examples in the ElasticSearch Server book and trying to write a simple match query:
{
"query" : {
"match" : {
"displayname" : "john smith"
}
}
}
This gives me the error:
{\"error\":\"SearchPhaseExecutionException[Failed to execute phase [query],
....
SearchParseException[[scripts][4]: from[-1],size[-1]: Parse Failure [Failed to parse source
....
QueryParsingException[[kb.cgi] No query registered for [match]]; }
I also tried
{
"match" : {
"displayname" : "john smith"
}
}
as per examples on http://www.elasticsearch.org/guide/reference/query-dsl/match-query/
EDIT: I think the remote server I'm using is not running the latest version (0.20.5), because using "text" instead of "match" seems to allow the query to work.
I've seen a similar issue reported here: http://elasticsearch-users.115913.n3.nabble.com/Character-escaping-td4025802.html
It appears the remote server I'm using is not running the latest 0.20.5 version of Elasticsearch; consequently the "match" query is not supported. The equivalent there is "text", which works.
I came to this conclusion after seeing a similar issue reported here: http://elasticsearch-users.115913.n3.nabble.com/Character-escaping-td4025802.html
Your first query looks fine, but perhaps the way you send it in the request is not correct. Here is a complete example that works:
curl -XDELETE localhost:9200/test-idx
curl -XPUT localhost:9200/test-idx -d '{
"settings": {
"index": {
"number_of_shards": 1,
"number_of_replicas": 0
}
},
"mappings": {
"doc": {
"properties": {
"name": {
"type": "string", "index": "analyzed"
}
}
}
}
}
'
curl -XPUT localhost:9200/test-idx/doc/1 -d '{
"name": "John Smith"
}'
curl -XPOST localhost:9200/test-idx/_refresh
echo
curl "localhost:9200/test-idx/_search?pretty=true" -d '{
"query": {
"match" : {
"name" : "john smith"
}
}
}
'
echo
