Elasticsearch: retrieve only document _id where field doesn't exist

I would like to retrieve all document _ids (without other fields) where the field "name" doesn't exist.
I know I can search for documents where the field "name" doesn't exist like this:
"query": {
"bool": {
"must_not": {
"exists": {
"field": "name"
}
}
}
}
and I think that to get only the _id of each document, without any fields, I need to use (correct me if I'm wrong):
"fields": []
How do I combine these 2 parts to make a query that works?

You can just add _source to the request body and set it to false. By default Elasticsearch returns the entire original JSON document in the _source field of each hit, and setting it to false suppresses that:
"_source": false,
"query":{
...
}
This retrieves just the metadata from your specified index, so your hits array will contain only _index, _type, _id and _score for each result, e.g.:
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 12,
"successful" : 12,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 20,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "filebeat-7.8.1-2021.01.28",
"_type" : "_doc",
"_id" : "SomeUniqeuId86aa",
"_score" : 1.0
},
{
"_index" : "filebeat-7.8.1-2021.01.28",
"_type" : "_doc",
"_id" : "An0therrUniqueiD",
"_score" : 1.0
}
]
}
}
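Putting the two parts from the question together, the request body might look like the following minimal sketch. It builds the body as a Python dict and prints it as JSON; `build_ids_only_query` is a hypothetical helper name, not part of any Elasticsearch client, and the printed body would be sent to the index's _search endpoint.

```python
import json


def build_ids_only_query(field: str) -> dict:
    """Build a search body matching documents that lack `field`, with _source disabled."""
    return {
        # Suppress the original document so each hit carries only metadata (_index, _id, ...).
        "_source": False,
        "query": {
            "bool": {
                "must_not": {
                    "exists": {"field": field}
                }
            }
        },
    }


body = build_ids_only_query("name")
print(json.dumps(body, indent=2))
```

The _id of each matching document can then be read from hits.hits[]._id in the response.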

Related

How to do a fields query using query string search on elastic search?

I want to convert this query:
GET demo-index/_search
{
"fields": [
"*"
]
}
into something like this:
GET demo-index/_search?fields=*
Is it possible to do fields queries in this way, or in a similar way, without using a JSON body for the request?
You can try the filter_path query parameter.
Suppose the search returns the following response:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "demo-index",
"_type" : "_doc",
"_id" : "32223223d3e23dd23d2x23",
"_score" : 1.0,
"_source" : {
"username" : "Mike",
"date" : "2022-04-04"
}
}
]
}
}
If we would like to return all fields of _source, we should write the path as follows:
GET demo-index/_search?filter_path=hits.hits._source.*
If we would like only a specific field such as "username", we should write the path as follows:
GET demo-index/_search?filter_path=hits.hits._source.username
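As a rough sketch of how such request paths could be assembled programmatically, the snippet below builds the query string with Python's standard library. `search_url` is a hypothetical helper; the only Elasticsearch feature it relies on is the filter_path parameter shown above, which also accepts several comma-separated paths.

```python
from urllib.parse import urlencode


def search_url(index: str, *paths: str) -> str:
    """Return a _search path whose response is trimmed by filter_path."""
    # Keep '*' and ',' unescaped so wildcard and multi-path filters stay readable.
    query = urlencode({"filter_path": ",".join(paths)}, safe="*,")
    return f"/{index}/_search?{query}"


print(search_url("demo-index", "hits.hits._source.*"))
print(search_url("demo-index", "hits.hits._id", "hits.hits._source.username"))
```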

How to set _id to a field from property in mapping in elasticsearch

I am trying to define a mapping in Elasticsearch in which _id is set to one of the fields declared under properties in the mapping.
Every time I post data, it should automatically extract this field and use it as the _id.
But every time I save data, a new random _id is generated. Is this the correct way to set _id when defining mappings in Elasticsearch?
PUT /index00001
{
"mappings": {
"_meta":{
"_id" : "userid"
},
"properties": {
"userid": {
"type": "text"
},
"nickname": {
"type": "text"
}
}
},
"settings" : {
"number_of_shards" : 1,
"number_of_replicas" : 0
}
}
POST /index00001/_doc
{
"userid": "6009001",
"nickname": "nick"
}
{
"took" : 438,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "index00001",
"_type" : "_doc",
"_id" : "IeKqnn0BuUqEU88H_tlq",
"_score" : 1.0,
"_source" : {
"userid" : "6009001",
"nickname" : "nick"
}
},
{
"_index" : "index00001",
"_type" : "_doc",
"_id" : "JeKrnn0BuUqEU88HNtnu",
"_score" : 1.0,
"_source" : {
"userid" : "6009001",
"nickname" : "amit"
}
}
]
}
}
Why is my _id not set to the userid field from properties?
This is Elasticsearch version 7.8.0, lucene_version 8.5.1.
It used to be possible to have ES automatically use a field value as the ID of the document in ES 1.X, but it is not possible anymore since ES 2.0.
Now you need to explicitly pass the ID of your documents when indexing them, otherwise one will be generated for you.
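A minimal sketch of what passing the ID explicitly could look like: the client reads the userid value itself and puts it in the request path (PUT /<index>/_doc/<id>). `index_request` is a hypothetical helper that only builds the method, path and body; it does not talk to a cluster.

```python
import json
from typing import Tuple
from urllib.parse import quote


def index_request(index: str, doc: dict, id_field: str) -> Tuple[str, str, str]:
    """Build an index request that uses the value of `id_field` as the document _id."""
    # Escape the ID so values containing '/' or spaces still form a valid path segment.
    doc_id = quote(str(doc[id_field]), safe="")
    return "PUT", f"/{index}/_doc/{doc_id}", json.dumps(doc)


method, path, body = index_request(
    "index00001", {"userid": "6009001", "nickname": "nick"}, "userid"
)
print(method, path)
print(body)
```

With this approach, indexing a second document with the same userid replaces the first one instead of creating a new document with a random _id.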

Query for value in object

I have multiple documents like:
{
labels: {
label1Key: "label1Value",
label2Key: "label2Value",
...
},
...
}
The keys of the labels object are arbitrary. I would like to query for the existence of specific values in the labels object without knowing the key, e.g. I want all data that contain label2Value as a value in the labels object.
I've tried to solve this via an exists query, but this way I can only access the key of an object. Is there a way to query for values?
With a multi_match query you can use wildcards in the field names:
Ingest data
POST test_bene/_doc
{
"labels": {
"label1Key": "label1Value",
"label2Key": "label2Value"
}
}
Query
POST test_bene/_search
{
"query": {
"multi_match": {
"query": "label1Value",
"fields": ["labels.*"]
}
}
}
Response
{
"took" : 24,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "test_bene",
"_type" : "_doc",
"_id" : "RtBd_ncB46EpgstaHy3Y",
"_score" : 0.2876821,
"_source" : {
"labels" : {
"label1Key" : "label1Value",
"label2Key" : "label2Value"
}
}
}
]
}
}

No matches when querying Elastic Search

I'm trying to run a query against Elasticsearch. When I run this query
GET accounts/_search/
{
"query": {
"term": {
"address_line_1": "1000"
}
}
}
I get back multiple records like
"hits" : [
{
"_index" : "accounts",
"_type" : "_doc",
"_id" : "...",
"_score" : 8.355149,
"_source" : {
"state_id" : 35,
"first_name" : "...",
"last_name" : "...",
"middle_name" : "P",
"dob" : "...",
"status" : "ACTIVE",
"address_line_1" : "1000 BROADROCK CT",
"address_line_2" : "",
"address_city" : "PARMA",
"address_zip" : "",
"address_zip_plus_4" : ""
}
},
But when I try to expand the search text to include more of the address, as below, I don't get any matches
GET accounts/_search/
{
"query": {
"term": {
"address_line_1": "1000 B"
}
}
}
The response is
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
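The term query looks for exact matches and does not analyze the search text. Your address_line_* fields were most probably indexed with the standard analyzer, which splits the text into tokens and lowercases all the letters, so the indexed tokens (e.g. 1000, broadrock, ct) never equal "1000 B" and the query cannot match.
So either use a match query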
GET accounts/_search/
{
"query": {
"match": {
"address_line_1": "1000 B"
}
}
}
which does not really 'care' whether B is lower or upper case, because the search text is analyzed the same way as the field, or adjust your field analyzers so that the capitalization is preserved.
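The difference between the two queries can be illustrated with a small simulation. This is only a rough approximation of the standard analyzer (split on non-alphanumeric characters, then lowercase), written in plain Python; `simple_analyze`, `term_matches` and `match_matches` are hypothetical names, and real analysis in Elasticsearch handles many more cases.

```python
import re
from typing import List


def simple_analyze(text: str) -> List[str]:
    """Approximate the standard analyzer: split on non-alphanumerics and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


# Tokens stored in the index for the address from the question.
indexed_tokens = set(simple_analyze("1000 BROADROCK CT"))


def term_matches(value: str) -> bool:
    """term: the search value is compared to the indexed tokens as-is."""
    return value in indexed_tokens


def match_matches(text: str) -> bool:
    """match: the search text is analyzed first; any matching token is enough (OR)."""
    return any(token in indexed_tokens for token in simple_analyze(text))


print(sorted(indexed_tokens))    # ['1000', 'broadrock', 'ct']
print(term_matches("1000"))      # True
print(term_matches("1000 B"))    # False
print(match_matches("1000 B"))   # True
```

In this simulation, "1000 B" matches through its "1000" token only, which mirrors the default OR behaviour of the match query.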

How to compare 2 fields in elasticsearch

Ok, here is an example result from my data in Elasticsearch:
"hits" : [
{
"_index" : "solutionpedia_data",
"_type" : "doc",
"_id" : "nyODP24BA840z5O6WguE",
"_score" : 46.63439,
"_source" : {
"ID" : "1",
"PRODUCT_NAME" : "ATM",
"UPDATEDATE" : "13-FEB-18",
"PROPOSAL" : [
{
}
],
"MARKETING_KIT" : [ ],
"VIDEO" : [ ]
}
},
{
"_index" : "classification",
"_type" : "doc",
"_id" : "5M-r5m4BNYha4zuWalJa",
"_score" : 39.25268,
"_source" : {
"productId" : "1",
"productName" : "ATM",
"productIconUrl" : "media/8ae0f0c3-1402-4559-901e-7ec9b874ce68-prod032.webp",
"type" : "nonconnectivity",
"businessLineId" : "",
"subsidiaries" : "",
"segment" : [],
"productType" : "Efisien",
"tariff" : null,
"tags" : [ ],
"contact" : [],
"mediaId" : [
"Med391"
],
"documentId" : [
"doc260",
"doc261"
],
"createdAt" : "2019-09-22T05:22:46.956Z",
"updatedAt" : "2019-09-22T05:22:46.956Z",
"totalClick" : 46
}
}
]
This is a result from my alias. Can we search for the same data based on 2 different fields? In the example above these are the ID and productId fields. Can we put these 2 objects in one bucket, or compare them?
I tried some aggregations, but got nothing:
{
"query": {
"match_all": {}
},
"size": 0,
"aggregations": {
"product catalog": {
"terms": {
"field": "productId.keyword",
"min_doc_count": 2,
"size": 100
},
"aggregations": {
"product solped": {
"terms": {
"field": "ID.keyword",
"min_doc_count": 2
}
}
}
}
}
}
Result:
{
"took" : 9,
"timed_out" : false,
"_shards" : {
"total" : 10,
"successful" : 10,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 1276,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"product catalog" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ ]
}
}
}
You can achieve this with a Scripted Bucket Aggregation, using script logic to define your buckets (pseudo code: if field a exists value of field a, if field b exists value of field b).
Another (and better) way to achieve this is to change your data model and indexing logic on Elasticsearch side and store the information in a field of the same name.
You could also consider the alias data type to make fields with different names in different indices accessible under one common field name. This is also the approach Elastic takes with the Elastic Common Schema specification.
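The bucketing logic from the pseudo code above can be sketched client-side to show the intended result. This is plain Python over two trimmed sample hits from the question, not a Painless script and not an aggregation request; `bucket_key` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Optional, Sequence


def bucket_key(source: dict, fields: Sequence[str] = ("ID", "productId")) -> Optional[str]:
    """Return the value of the first field that exists in the document, if any."""
    for field in fields:
        if source.get(field) is not None:
            return source[field]
    return None


# Trimmed copies of the two hits shown in the question.
hits = [
    {"_index": "solutionpedia_data", "_source": {"ID": "1", "PRODUCT_NAME": "ATM"}},
    {"_index": "classification", "_source": {"productId": "1", "productName": "ATM"}},
]

buckets = defaultdict(list)
for hit in hits:
    key = bucket_key(hit["_source"])
    if key is not None:
        buckets[key].append(hit["_index"])

print(dict(buckets))  # {'1': ['solutionpedia_data', 'classification']}
```

Both documents land in the same bucket because the key is taken from whichever of the two fields is present, which is the effect a common field name would give directly.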

Resources