Elasticsearch: return unique strings from an array field after a given filter

How can I get all the ids with a given prefix from my Elasticsearch records, keeping only the unique values?
Records
PUT items/1
{ "ids" : [ "apple_A", "orange_B" ] }
PUT items/2
{ "ids" : [ "apple_A", "apple_B" ] }
PUT items/3
{ "ids" : [ "apple_C", "banana_A" ] }
What I need is to find all the unique ids for a given prefix. For example, if the input is apple, the output should be ["apple_A", "apple_B", "apple_C"].
What I have tried so far is a terms aggregation. With the following query I was able to filter down to the documents that have an id with the given prefix, but the aggregation returns every id in those documents.
{
"aggregations": {
"filterIds": {
"filter": {
"bool": {
"filter": [
{
"prefix": {
"ids.keyword": {
"value": "apple"
}
}
}
]
}
},
"aggregations": {
"uniqueIds": {
"terms": {
"field": "ids.keyword"
}
}
}
}
}
}
It returns the aggregation list [ "apple_A", "orange_B", "apple_B", "apple_C", "banana_A" ] when the prefix input is apple, i.e. every id from the documents that match the filter.
Is there a way to get only the ids that match the prefix, rather than all the ids in the arrays of the matching documents?

You can limit the returned values using the include parameter:
POST items/_search
{
"size": 0,
"aggregations": {
"filterIds": {
"filter": {
"bool": {
"filter": [
{
"prefix": {
"ids.keyword": {
"value": "apple"
}
}
}
]
}
},
"aggregations": {
"uniqueIds": {
"terms": {
"field": "ids.keyword",
"include": "apple.*"
}
}
}
}
}
}
Do check this other thread which deals with using regex within include -- it's very similar to your use case.
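To see why include changes the result, here is a rough Python sketch that mimics the two steps on the sample records from the question. It is only a simulation, not Elasticsearch itself: the function name unique_ids is made up for illustration, and Python's re module stands in for the Lucene regex syntax that include actually uses (the two agree for a simple pattern like apple.*).

```python
import re

# Sample documents from the question.
docs = [
    {"ids": ["apple_A", "orange_B"]},
    {"ids": ["apple_A", "apple_B"]},
    {"ids": ["apple_C", "banana_A"]},
]

def unique_ids(docs, prefix, include=None):
    """Mimic the filter + terms aggregation, optionally with `include`."""
    pattern = re.compile(include) if include else None
    seen = set()
    for doc in docs:
        # The prefix filter selects whole documents, not individual values.
        if not any(value.startswith(prefix) for value in doc["ids"]):
            continue
        for value in doc["ids"]:
            # `include` has to match the whole term, hence fullmatch.
            if pattern is None or pattern.fullmatch(value):
                seen.add(value)
    return sorted(seen)

print(unique_ids(docs, "apple"))                     # every id of the matching docs
print(unique_ids(docs, "apple", include="apple.*"))  # only the ids with the prefix
```

The filter works on documents and the include pattern works on individual terms, which is why both are needed here.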

Related

Checking if item has been indexed from an Array

I have a list of items to be added to Elasticsearch, but when I checked the count I found that there are fewer items in Elasticsearch than in the database.
So I created an array with all the ids in the database, and I want to know how I can compare it with Elasticsearch.
{
"size": 100,
"query": {
"bool": {
"should": [
{
"bool": {
"must_not": {
"terms": {
"ID": [
10400,
11024,
10401,
11026,
11053,
11061
]
}
}
}
}
]
}
}
}
You could use an aggregation query to list buckets for document IDs.
The following query will not include buckets for IDs that are not present in your index.
If you want buckets for IDs that are not in the index, then you may want to use a filter aggregation and write one filter query for each ID you are searching for.
POST test_index/_search
{
"size": 0,
"aggs":{
"matching_values_field": {
"filter": {
"terms" : { "id" : [
10400,
11024,
10401,
11026,
11053,
11061
]}
},
"aggs": {
"myfield" : {
"terms" : {
"field" : "id"
}
}
}
}
}
}
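Once you have the buckets, finding the missing IDs is a set difference on the client side. A minimal sketch in Python, assuming the response has the shape produced by the query above; the helper name missing_ids and the sample response are made up for illustration.

```python
def missing_ids(expected_ids, response,
                agg="matching_values_field", sub="myfield"):
    """Return the expected ids that have no bucket in the aggregation."""
    buckets = response["aggregations"][agg][sub]["buckets"]
    found = {bucket["key"] for bucket in buckets}
    return sorted(set(expected_ids) - found)

db_ids = [10400, 11024, 10401, 11026, 11053, 11061]

# Hypothetical response: only four of the six ids were indexed.
sample_response = {
    "aggregations": {
        "matching_values_field": {
            "doc_count": 4,
            "myfield": {
                "buckets": [
                    {"key": 10400, "doc_count": 1},
                    {"key": 11024, "doc_count": 1},
                    {"key": 10401, "doc_count": 1},
                    {"key": 11053, "doc_count": 1},
                ]
            },
        }
    }
}

print(missing_ids(db_ids, sample_response))
```

Keep in mind that a terms aggregation returns only 10 buckets by default, so for a longer list of IDs you would need to set "size" on the inner terms aggregation.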

Terms query not working in Elasticsearch when the value has a space in it

We need to get the data based on multiple values, so I am trying to use a terms query in Elasticsearch on the modelNumber field.
But it is not working as expected. Can anyone let me know what is wrong with the query?
POST index_name/_search
{
"query": {
"bool": {
"must": [
{
"terms": {
"modelNumber": [
"test 1234rthg-1234-1234512-2345",
"testMode11l-123-rtyu-xyz11"
]
}
},
{
"terms": {
"userId": [
"123",
"VALUE2"
]
}
}
]
}
}
}
A terms query returns documents that contain one or more exact terms in a provided field.
If you have not explicitly defined an index mapping, modelNumber is mapped as a text field and analyzed with the standard analyzer, so the original value is split into tokens and never matches as a whole. In that case you need to query the modelNumber.keyword sub-field, which stores the value as a single unanalyzed term (notice the ".keyword" after the modelNumber field).
{
"query": {
"bool": {
"must": [
{
"terms": {
"modelNumber.keyword": [ // note this
"test 1234rthg-1234-1234512-2345",
"testMode11l-123-rtyu-xyz11"
]
}
},
{
"terms": {
"userId": [
"123",
"VALUE2"
]
}
}
]
}
}
}
Or you need to modify the mapping of the modelNumber field as follows:
{
"mappings": {
"properties": {
"modelNumber": {
"type": "keyword"
}
}
}
}
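To illustrate why the exact value fails against an analyzed field, here is a crude Python approximation of what the standard analyzer does to the first model number. It is an assumption-laden sketch (lowercase, then split on anything that is not a letter or digit), not the real Unicode-based tokenizer, and rough_standard_tokens is a made-up helper name.

```python
import re

def rough_standard_tokens(text):
    """Crude stand-in for the standard analyzer: lowercase and
    split on every character that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

value = "test 1234rthg-1234-1234512-2345"
tokens = rough_standard_tokens(value)

print(tokens)           # the terms stored for an analyzed text field
print(value in tokens)  # the terms query looks for the whole value as one term
```

None of the stored tokens equals the full string, so a terms query on the analyzed field finds nothing, while the keyword field keeps the whole value as one term.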

How to filter a specific value within a dictionary?

Let's say I have this dictionary:
{
"name": "Jorje",
"surname": "Costali",
"extra_information": {
"real_name": "mamino",
"fake_name": "bambino",
"age": "43",
"gang": "gang34"
}
}
How can I query to get all entries that have "extra_information.gang":"gang34"? I would like to know how to filter by an exact term or by a match.
I have tried:
{
"size": 20,
"query": {
"bool": {
"filter": [
{
"terms": {
"extra_information.gang": [
"gang34"
]
}
}
]
}
}
}
but it does not return any entries.
I have tried:
GET _search
{
"query": {
"bool": {
"must": [
{
"match": {
"extra_information.gang" : "gang34"
}
}
]
}
}
}
and it works, but I want to make it a filter, not a simple match query.
Did you try using .keyword? Like this:
"terms": {
"extra_information.gang.keyword": [
"gang34"
]
}
I tried your query on a document of mine with a nested dictionary, and it works for me this way.

Terms aggregation across two fields in Elasticsearch

I'm not sure what I want to do is possible. I have data that looks like this:
{
"Actor1Name": "PERSON",
"Actor2Name": "OTHERPERSON"
}
I use copy_to in order to populate a secondary field, ActorNames, with both values.
I am trying to build a typeahead capability where a user can start to type a name and it will populate with the top hits for that prefix. I want it to search across both actor fields. The only problem is when I search across ActorNames, I get both values even if only one matches. That means if I'm searching for prefix O that I will get both OTHERPERSON (desired) and PERSON (undesired) in my results based on the above document.
My current solution is to run 2 aggregations and combine them client side, but is it possible to do this purely in ES?
Current query:
{
"query": {
"prefix": {
"ActorNames": "O"
}
},
"aggs": {
"actor1": {
"filter": {
"prefix": {
"Actor1Name": "O"
}
},
"aggs": {
"actor1": {
"terms": {
"field": "Actor1Name"
}
}
}
},
"actor2": {
"filter": {
"prefix": {
"Actor2Name": "O"
}
},
"aggs": {
"actor2": {
"terms": {
"field": "Actor2Name"
}
}
}
}
}
}
If you want to check the prefix condition on both fields, why not AND a prefix query on each field? Like:
GET /my_index/_search
{
"query": {
"bool": {
"must": [
{
"prefix": {
"Actor1Name": "O"
}
},
{
"prefix": {
"Actor2Name": "O"
}
}
]
}
}
}
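For comparison, the client-side merge of the two aggregations described in the question could look roughly like this in Python. This is only a sketch under assumptions: merge_buckets is a made-up helper, and the bucket lists below are invented sample data in the shape a terms aggregation returns.

```python
from collections import Counter

def merge_buckets(*bucket_lists, size=10):
    """Sum doc_count per key across several terms-aggregation bucket
    lists and return the top `size` (key, count) pairs."""
    totals = Counter()
    for buckets in bucket_lists:
        for bucket in buckets:
            totals[bucket["key"]] += bucket["doc_count"]
    return totals.most_common(size)

# Hypothetical buckets from the "actor1" and "actor2" aggregations.
actor1_buckets = [
    {"key": "OLIVIA", "doc_count": 2},
    {"key": "OTHERPERSON", "doc_count": 1},
]
actor2_buckets = [
    {"key": "OTHERPERSON", "doc_count": 2},
]

print(merge_buckets(actor1_buckets, actor2_buckets))
```

Because each aggregation is already restricted to its own field's prefix matches, the merged list contains only names that start with the typed prefix.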

elasticsearch inner join

I have an index with several fields. Documents that contain valid "category" data also contain "url" data (an analyzed field) but do not contain respsize.
On the other hand, documents that contain "respsize" data (greater than 0) also contain "url" data but do not contain "category" data.
In other words, I need a join or intersection: a query that returns all documents containing respsize and category for the same url.
Here is what I have done so far (the url field is analyzed, the rest are not_analyzed).
Here are the documents that have category:
And here are the other documents, which have respsize, that I need to combine with them based on url:
I need a DSL query that returns records sharing the same url token (in this scenario it will be www.domainname.com), with category and respsize merged.
I simply want the field "category":"27" from the first image to appear in the documents of the second image, along with all their other fields.
Here is my query, which does not work:
GET webproxylog/accesslog/_search
{
"query": {
"filtered": {
"filter" : {
"and" : {
"filters": [
{
"not": {
"filter": {
"terms": {
"category": [
"-",
"-1",
"0"
]
},
"term": {
"respsize": "0"
}
}
},
"term": {
"category": "www.hurriyet.com.tr"
}
}
],
"_cache" : true
}
}
}
},
"sort": [
{
"respsize": {
"order": "desc"
}
}
]
}
You can try the query below. It requires the url field to be the one you specify (the must clause), and then at least one of the next two clauses (the should clauses) has to be true: either category is not one of the given terms, or respsize is greater than 0.
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"url": "www.hurriyet.com.tr"
}
}
],
"should": [
{
"not": {
"terms": {
"category": [
"-",
"-1",
"0"
]
}
}
},
{
"range": {
"respsize": {
"gt": 0
}
}
}
]
}
}
}
}
}
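The condition described above can be written out as a plain Python predicate, which may make it easier to check against your own documents. This is only a sketch of the logic as stated in this answer: matches is a made-up function, url is compared as an exact string (ignoring analysis), and the sample documents are invented.

```python
EXCLUDED_CATEGORIES = {"-", "-1", "0"}

def matches(doc, url="www.hurriyet.com.tr"):
    """must: url equals the given value;
    should: category not excluded OR respsize > 0."""
    if doc.get("url") != url:
        return False
    # A document without a category field also passes this clause.
    category_not_excluded = doc.get("category") not in EXCLUDED_CATEGORIES
    has_respsize = doc.get("respsize", 0) > 0
    return category_not_excluded or has_respsize

# Hypothetical documents.
samples = [
    {"url": "www.hurriyet.com.tr", "category": "27"},
    {"url": "www.hurriyet.com.tr", "category": "0", "respsize": 512},
    {"url": "www.hurriyet.com.tr", "category": "-", "respsize": 0},
    {"url": "www.example.com", "category": "27"},
]

for doc in samples:
    print(doc, matches(doc))
```

Note that this selects both kinds of documents for the same url but returns them separately; merging category and respsize into one record would still have to happen on the client side.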
