Query for value in object - elasticsearch

I have multiple documents like:
{
labels: {
label1Key: "label1Value",
label2Key: "label2Value",
...
},
...
}
The keys of the labels object are arbitrary. I would like to query for the existence of specific values in the labels object without knowing the key, e.g. I want all data that contain label2Value as a value in the labels object.
I've tried to solve this via an exists query, but this way I can only access the key of an object. Is there a way to query for values?

With a Multimatch query you can use wildcards on the field names
Ingest data
POST test_bene/_doc
{
"labels": {
"label1Key": "label1Value",
"label2Key": "label2Value"
}
}
Query
POST test_bene/_search
{
"query": {
"multi_match": {
"query": "label1Value",
"fields": ["labels.*"]
}
}
}
Response
{
"took" : 24,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "test_bene",
"_type" : "_doc",
"_id" : "RtBd_ncB46EpgstaHy3Y",
"_score" : 0.2876821,
"_source" : {
"labels" : {
"label1Key" : "label1Value",
"label2Key" : "label2Value"
}
}
}
]
}
}

Related

How to do a fields query using query string search on elastic search?

I want to convert this query:
GET demo-index/_search
{
"fields": [
"*"
]
}
into something like this:
GET demo-index/_search?fields=*
is it possible to do fields queries in this way or a way similar to it without using a json body for the request?
You can try with filter_path :
If we've have the next return:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "demo-index",
"_type" : "_doc",
"_id" : "32223223d3e23dd23d2x23",
"_score" : 1.0,
"_source" : {
"username" : "Mike",
"date" : "2022-04-04"
}
}
]
}
}
And we would like to return all fields,we should write the path as follows:
GET demo-index/_search?filter_path=hits.hits._source.*
If we would like the specifics fields like "username", we should write the path as follows
GET demo-index/_search?filter_path=hits.hits._source.username

Query consecutive words using match_phrase elasticsearch works unexpected

I have the parameter name as a text:
{
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
}
}
}
}
Because of the nature of text type in ElasticSearch, matchs every word on the phrase. That's why in some cases I get the next results:
POST /example-tags/_search
{
"query": {
"match": {
"name": "Jordan Rudess was born in 1956"
}
}
}
// Results
{
"took" : 28,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.1596613,
"hits" : [
{
"_index" : "example-tags",
"_type" : "_doc",
"_id" : "6101e538bc8ec610aff699e4",
"_score" : 4.1596613,
"_source" : {
"name" : "Jordan Rudess"
}
},
{
"_index" : "example-tags",
"_type" : "_doc",
"_id" : "610123538bc8ec61034ff699e4",
"_score" : 4.1796613,
"_source" : {
"name" : "Alice in Chains"
}
},
]
}
}
As you can see, in the text Jordan Rudess was born in 1956 I get the result Alice in Chains just for the word in. I want to avoid this behaviour.
If I try:
POST /example-tags/_search
{
"query": {
"match_phrase": {
"name": "Dream Theater keyboardist's Jordan Rudess was born in 1956"
}
}
}
// Results
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
So, in the past example I was expecting to get the Jordan Rudess tag name but I get empty results.
I need to get the maximum ocurrences in tag.name of consecutive words in a phrase. How can I achieve that?

Elasticsearch: retrieve only document _id where field doesn't exist

I would like to retrieve all document _ids (without other fields) where field "name" doesn't exist:
I know I can search for where field "name" doesn't exist like this:
"query": {
"bool": {
"must_not": {
"exists": {
"field": "name"
}
}
}
}
and I think that to get the _id of the document only without any fields i need to use (correct me if I'm wrong):
"fields": []
How do I combine these 2 parts to make a query that works?
You can just add _source and set to false as Elasticsearch will return the entire JSON object in that field by default
"_source": false,
"query":{
...
}
and this will retrieve just the metadata from your specified index, so your hits array will contain _index, _type, _id and _score for each result
e.g
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 12,
"successful" : 12,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 20,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "filebeat-7.8.1-2021.01.28",
"_type" : "_doc"
"_id" : "SomeUniqeuId86aa",
"_score" : 1.0
},
{
"_index" : "filebeat-7.8.1-2021.01.28",
"_type" : "_doc"
"_id" : "An0therrUniqueiD",
"_score" : 1.0
}
]
}
}

No matches when querying Elastic Search

I'm trying to run a query elastic search. When run this query
GET accounts/_search/
{
"query": {
"term": {
"address_line_1": "1000"
}
}
}
I get back multiple records like
"hits" : [
{
"_index" : "accounts",
"_type" : "_doc",
"_id" : "...",
"_score" : 8.355149,
"_source" : {
"state_id" : 35,
"first_name" : "...",
"last_name" : "...",
"middle_name" : "P",
"dob" : "...",
"status" : "ACTIVE",
"address_line_1" : "1000 BROADROCK CT",
"address_line_2" : "",
"address_city" : "PARMA",
"address_zip" : "",
"address_zip_plus_4" : ""
}
},
But when I try to expand it to include the more like below I don't get any matches
GET accounts/_search/
{
"query": {
"term": {
"address_line_1": "1000 B"
}
}
}
The response is
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
The term query is looking for exact matches. Your address_line_* fields were most probably indexed with the standard analyzer which lowercase-s all the letters which in turn prevents the query from matching.
So either use
GET accounts/_search/
{
"query": {
"match": { <--
"address_line_1": "1000 B"
}
}
}
which does not really 'care' about B being lower/upper case or adjust your field analyzers such that the capitalization is preserved.

How to compare 2 field in elasticsearch

Ok, I have example result on my data in elastic search :
"hits" : [
{
"_index" : "solutionpedia_data",
"_type" : "doc",
"_id" : "nyODP24BA840z5O6WguE",
"_score" : 46.63439,
"_source" : {
"ID" : "1",
"PRODUCT_NAME" : "ATM",
"UPDATEDATE" : "13-FEB-18",
"PROPOSAL" : [
{
}
],
"MARKETING_KIT" : [ ],
"VIDEO" : [ ]
}
},
{
"_index" : "classification",
"_type" : "doc",
"_id" : "5M-r5m4BNYha4zuWalJa",
"_score" : 39.25268,
"_source" : {
"productId" : "1",
"productName" : "ATM",
"productIconUrl" : "media/8ae0f0c3-1402-4559-901e-7ec9b874ce68-prod032.webp",
"type" : "nonconnectivity",
"businessLineId" : "",
"subsidiaries" : "",
"segment" : [],
"productType" : "Efisien",
"tariff" : null,
"tags" : [ ],
"contact" : [],
"mediaId" : [
"Med391"
],
"documentId" : [
"doc260",
"doc261"
],
"createdAt" : "2019-09-22T05:22:46.956Z",
"updatedAt" : "2019-09-22T05:22:46.956Z",
"totalClick" : 46
}
}
]
this is a result of my alias. can we search for the same data based on 2 different fields, the example above is the ID and productId fields. Can we make these 2 objects in one bucket or compare?
i was try with some aggregate but nothing :
{
"query": {
"match_all": {}
},
"size": 0,
"aggregations": {
"product catalog": {
"terms": {
"field": "productId.keyword",
"min_doc_count": 2,
"size": 100
},
"aggregations": {
"product solped": {
"terms": {
"field": "ID.keyword",
"min_doc_count": 2
}
}
}
}
}
}
result :
{
"took" : 9,
"timed_out" : false,
"_shards" : {
"total" : 10,
"successful" : 10,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1276,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"product catalog" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ ]
}
}
}
You can achieve this with a Scripted Bucket Aggregation, using script logic to define your buckets (pseudo code: if field a exists value of field a, if field b exists value of field b).
Another (and better) way to achieve this is to change your data model and indexing logic on Elasticsearch side and store the information in a field of the same name.
You could also consider the alias data type to make fields with different names in different indices accessible under one common field name. This is also the approach Elastic takes with the Elastic Common Schema specification.

Resources