Elasticsearch doesn't return highlight results

I'm sending a request like this:
{
"from": 0,
"query": {
"match": {
"_all": "presidencia"
}
}
,
"aggs": {
//... some aggregations
}
,
"highlight": {
"fields": {
"nomeOrgaoSuperior": {}
}
}
}
But my response doesn't include a highlight field.
Response:
{
"took": 68,
"timed_out": false,
"_shards": {"total": 15, "successful": 15, "failed": 0},
"hits": {
"total": 692785,
"max_score": 0.48536316,
"hits": [
//Some hits...
]
},
"aggregations": {
//some aggs ...
}
}
Do I need some extra configuration on my index?

Found the problem. I was trying to highlight a field that wasn't analyzed by my analyzer. The search text was analyzed, but the field I wanted highlighted was not, so the highlighter never found a match.
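One way to check for this kind of mismatch is to query and highlight the same analyzed field. The sketch below only builds the request body as a Python dict. The helper name build_highlight_request is hypothetical, and it assumes nomeOrgaoSuperior is mapped as an analyzed text field, which the original post does not show.

```python
import json


def build_highlight_request(field, text):
    """Build a search body that queries and highlights the same field.

    Assumption: `field` is analyzed at index time with the same analyzer
    that is applied to the query text, so the highlighter can find the terms.
    """
    return {
        "from": 0,
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {}}},
    }


body = build_highlight_request("nomeOrgaoSuperior", "presidencia")
print(json.dumps(body, indent=2))
```

If the query has to stay on _all, the highlighter's require_field_match option may also be worth checking, depending on the Elasticsearch version in use.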

Related

Elasticsearch not finding match for document that contains query

I am trying to search an index for documents whose exception field contains both "semaphore" AND "RabbitMQ.Client.Impl".
Example exception:
System.ObjectDisposedException: The semaphore has been disposed.
at System.Threading.SemaphoreSlim.Release(Int32 releaseCount)
at RabbitMQ.Client.Impl.AsyncConsumerWorkService.WorkPool.HandleConcurrent(Work work, IModel model, SemaphoreSlim limiter)
When I search for "semaphore" - document is returned - great!
POST /logs-2023-01/_search?pretty=true
{
"query": {
"bool": {
"must": [
{
"match": {
"exception": "semaphore"
}
},
{
"range": {
"logDate": {
"gte": "now-43200m"
}
}
}
]
}
},
"size": 1000
}
Query above returns:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 7.5582323,
"hits": [
{
"_index": "logs-2023-01",
"_type": "record",
"_id": "q21yk4UBAdlSjmEEw5gy",
"_score": 7.5582323,
"_source": {
"applicationName": "k8s-application",
"logDate": "2023-01-08T22:13:59.873",
"logLevel": "Error",
"loggerName": "TaskScheduler.UnobservedTaskException.Logger",
"machineName": "k8s-pod-6755d4997c-rztgl",
"threadId": "2",
"message": "An unobserved task exception occurred. The semaphore has been disposed.",
"exception": """
System.ObjectDisposedException: The semaphore has been disposed.
at System.Threading.SemaphoreSlim.Release(Int32 releaseCount)
at RabbitMQ.Client.Impl.AsyncConsumerWorkService.WorkPool.HandleConcurrent(Work work, IModel model, SemaphoreSlim limiter)
""",
"sortDate": "2023-01-08T22:13:59.000027026"
}
}
]
}
}
However, when I run the same search with the query "RabbitMQ.Client.Impl" (which is definitely contained in the exception), I get nothing. Why?
POST /logs-2023-01/_search?pretty=true
{
"query": {
"bool": {
"must": [
{
"match": {
"exception": "RabbitMQ.Client.Impl"
}
},
{
"range": {
"logDate": {
"gte": "now-43200m"
}
}
}
]
}
},
"size": 1000
}
Query above returns:
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
TL;DR
A match query looks for exact tokens.
Solution
Tokens are generated at ingestion time by the analyzer.
The default (standard) analyzer lowercases the text and splits it on word boundaries such as whitespace and parentheses, but it does not split on a period between letters.
This means rabbitmq.client.impl.asyncconsumerworkservice.workpool.handleconcurrent is indexed as a single token, which does not match the token rabbitmq.client.impl produced from the query RabbitMQ.Client.Impl.
You can use match_phrase_prefix instead, with the following query:
GET 75236255/_search
{
"query": {
"match_phrase_prefix": {
"exception": "RabbitMQ.Client.Impl"
}
}
}
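To see why the prefix query helps, the tokenization can be imitated in a few lines of Python. This is only a rough approximation of the standard analyzer (the real tokenizer follows Unicode text segmentation rules), and the names approx_standard_tokens and TOKEN_RE are made up for this sketch.

```python
import re

# Rough approximation of the standard analyzer: lowercase, keep runs of
# letters/digits, and keep a period when it sits between two such runs.
# This is an assumption-laden sketch, not the real tokenizer.
TOKEN_RE = re.compile(r"[a-z0-9]+(?:\.[a-z0-9]+)*")


def approx_standard_tokens(text):
    return TOKEN_RE.findall(text.lower())


line = ("at RabbitMQ.Client.Impl.AsyncConsumerWorkService.WorkPool."
        "HandleConcurrent(Work work, IModel model, SemaphoreSlim limiter)")
indexed = approx_standard_tokens(line)
query = approx_standard_tokens("RabbitMQ.Client.Impl")

# match: the query token must be equal to an indexed token.
exact_hit = any(q in indexed for q in query)
# match_phrase_prefix: the last query token only has to be a prefix.
prefix_hit = any(tok.startswith(query[-1]) for tok in indexed)

print(indexed[1])
print("match finds it:", exact_hit)
print("match_phrase_prefix finds it:", prefix_hit)
```

The dotted namespace survives as one long token, so an exact comparison fails while a prefix comparison succeeds.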

How do I use the whitespace analyzer correctly?

I am currently having an issue where I cannot search for UUIDs in my logs. For instance, I have a field named "log" that holds a full log line, for example:
"log": "time=\"2022-10-10T07:46:00Z\" level=info msg=\"message to endpoint (outgoing)\" message=\"{8503fb5a-3899-4305-8480-6ddc0f5df296 2022-10-10T09:45:59+02:00}\"\n",
I want to find this log in Elasticsearch, so I send this via Postman:
{
"query": {
"match": {
"log": {
"analyzer": "whitespace",
"query": "8503fb5a-3899-4305-8480-6ddc0f5df296"
}
}
},
"size": 50,
"from": 0
}
As a response I get:
{
"took": 930,
"timed_out": false,
"num_reduce_phases": 2,
"_shards": {
"total": 581,
"successful": 581,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
}
}
But when I search on "8503fb5a" alone, then I get the wanted results. This means the dashes are still causing issues, but I thought using the whitespace analyzer should fix this? Am I doing something wrong?
These are the fields I have.
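You are not required to use the whitespace analyzer. If the log field is indexed with the default standard analyzer, the UUID is stored as separate tokens split at the dashes, so the single token produced by the whitespace analyzer at search time matches nothing.
You have two options for searching for the entire UUID.
First, you can use a match query with operator set to and: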
{
"query": {
"match": {
"log":{
"query": "8503fb5a-3899-4305-8480-6ddc0f5df296",
"operator": "and"
}
}
}
}
Second, you can use a match_phrase query, which requires the tokens to appear next to each other and in the same order:
{
"query": {
"match_phrase": {
"log": "8503fb5a-3899-4305-8480-6ddc0f5df296"
}
}
}
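The difference between the two search-time analyzers can be illustrated with a small Python imitation. It assumes the log field was indexed with the default standard analyzer (the mapping is not shown in the question), and both helper functions are simplified stand-ins rather than the real analyzers.

```python
import re


def approx_standard_tokens(text):
    # Rough stand-in for the standard analyzer: lowercase and split on
    # anything that is not a letter or digit, so dashes separate tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def whitespace_tokens(text):
    # The whitespace analyzer splits on whitespace only and keeps dashes.
    return text.split()


log = (
    'time="2022-10-10T07:46:00Z" level=info msg="message to endpoint (outgoing)" '
    'message="{8503fb5a-3899-4305-8480-6ddc0f5df296 2022-10-10T09:45:59+02:00}"'
)
uuid = "8503fb5a-3899-4305-8480-6ddc0f5df296"

indexed = set(approx_standard_tokens(log))

# Search-time whitespace analyzer: one long token that was never indexed.
ws_hit = all(tok in indexed for tok in whitespace_tokens(uuid))
# Default search analyzer with operator "and": every UUID part must be present.
and_hit = all(tok in indexed for tok in approx_standard_tokens(uuid))

print("whitespace analyzer at search time:", ws_hit)
print("match with operator and:", and_hit)
```

Note that operator and only checks that all parts are present somewhere in the field, while match_phrase additionally checks their order and adjacency.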

Elasticsearch aggregation limitation

When I run an aggregation query, what scope is it applied to: all entries in the index or just the first 10000?
For example, here is a response I got for a script metric aggregation:
{
"took": 76,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": null,
"hits": []
},
"aggregations": {
"number_of_operations_in_progress": {
"value": 2
}
}
}
hits->total->value is 10000, which makes me think that the aggregate function is applied to the first 10000 entries only, not to the whole data set in the index.
Is my understanding correct? If yes, is there a way to apply an aggregate function to all entries?
Aggregations are always applied to the whole document set that is selected by the query.
hits.total.value only gives a hint at how many documents match the query. In this case, the relation gte means that at least 10,000 documents match.
You can use track_total_hits to control how the total number of hits is tracked:
POST index1/_search
{
"track_total_hits": true,
"query": {
"match_all": {}
},
"aggs": {
"groupbyk1": {
"terms": {
"field": "k1"
}
}
}
}
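When reading such a response in client code, the relation field tells whether the count is exact or a lower bound. A minimal Python sketch, with describe_total as a hypothetical helper name:

```python
def describe_total(total):
    """Turn the hits.total object of a search response into a readable string."""
    value, relation = total["value"], total["relation"]
    if relation == "gte":
        # Counting stopped at the tracking threshold, so the real number
        # of matches is at least `value`.
        return f"at least {value} matching documents"
    return f"exactly {value} matching documents"


capped = {"value": 10000, "relation": "gte"}
print(describe_total(capped))
```

Either way, the aggregation result itself is computed over every document that matches the query, not only over the counted hits.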

Aggregation Field Missing in output of ElasticSearch

I am new to aggregations in Elasticsearch.
Below is my query in Kibana:
GET /vehicles/cars/_search
{
"aggs": {
"popular_cars": {
"terms": {"field": "make.keyword","size": 1000
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
My output from Elasticsearch:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 16,
"max_score": 1,
"hits": [Here it contains the long list of hits]
}
}
}
My confusion is that the aggregations field is not displayed in the output. Am I missing something here?

How to return number of matches according to specific term in search query?

In my search query I have this:
...
term: { CategoryId: [1,2,3] }
...
I need to return how many matches were found for each category. Currently only the total number of matches is returned. Is this possible? I think it might be related to aggregations, but I can't find the right solution...
A sample query could be:
POST /test/products/_search
{
"size": 0,
"aggs": {
"category": {
"terms": {
"field": "category"
}
}
}
}
The response then looks like this:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 10,
"max_score": 0,
"hits": []
},
"aggregations": {
"category": {
"buckets": [
{
"key": "1",
"doc_count": 10
},
{
"key": "2",
"doc_count": 12
}
]
}
}
}
This gives the number of documents for each category.
Hope this helps!!
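On the client side, the buckets can be turned into a simple mapping from category to count. The Python sketch below uses a trimmed copy of the sample response above; counts_per_bucket is a hypothetical helper name.

```python
def counts_per_bucket(response, agg_name):
    """Map each bucket key of a terms aggregation to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Trimmed copy of the sample response shown above.
response = {
    "hits": {"total": 10, "max_score": 0, "hits": []},
    "aggregations": {
        "category": {
            "buckets": [
                {"key": "1", "doc_count": 10},
                {"key": "2", "doc_count": 12},
            ]
        }
    },
}

counts = counts_per_bucket(response, "category")
print(counts)
```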
