How to search exact text in a nested document in Elasticsearch

I have an index like this:
"_index": "test",
"_type": "products",
"_id": "URpYIFBAQRiPPu1BFOZiQg",
"_score": null,
"_source": {
"currency": null,
"colors": [],
"api": 1,
"sku": 9999227900050002,
"category_path": [
{
"id": "cat00000",
"name": "B1"
},
{
"id": "abcat0400000",
"name": "Cameras & Camcorders"
},
{
"id": "abcat0401000",
"name": "Digital Cameras"
},
{
"id": "abcat0401005",
"name": "Digital SLR Cameras"
},
{
"id": "pcmcat180400050006",
"name": "DSLR Package Deals"
}
],
"price": 1034.99,
"status": 1,
"description": null
}
I want to search for only the exact text "Camcorders" in the category_path field.
I tried a match query, but it returns every product that has "Camcorders" as part of the text. Can someone help me solve this?
Thanks

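To search in a nested field, use a term query like the following (the value is lowercase because a term query is not analyzed, while the default analyzer lowercases the indexed text):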
{
"query": {
"term": {
"category_path.name": {
"value": "b1"
}
}
}
}
Hope it helps!

You could add one more field inside category_path, raw_name, mapped with "index": "not_analyzed", and match against it.
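A minimal sketch of the raw_name idea, written as Python dictionaries that mirror the request bodies. Several things here are assumptions rather than part of the answer: raw_name is added as a sub-field of name (a multi-field) instead of a separate property, category_path is treated as a plain object field rather than one mapped as type nested, and the helper names are hypothetical. The not_analyzed setting applies to string fields before Elasticsearch 5.0; later versions use the keyword type for the same purpose.

```python
# Mapping fragment: keep the analyzed "name" field and add an unanalyzed
# sub-field next to it (the sub-field layout is an assumption).
MAPPING = {
    "products": {
        "properties": {
            "category_path": {
                "properties": {
                    "name": {
                        "type": "string",
                        "fields": {
                            "raw_name": {"type": "string", "index": "not_analyzed"}
                        },
                    }
                }
            }
        }
    }
}


def exact_name_query(value):
    """Build a term query against the unanalyzed sub-field (hypothetical helper)."""
    return {"query": {"term": {"category_path.name.raw_name": {"value": value}}}}


def matches_exact(source, value):
    """Local stand-in for what the term query matches: whole-string equality."""
    return any(c["name"] == value for c in source.get("category_path", []))
```

With this mapping, a search for "Camcorders" would no longer return a product whose category name is "Cameras & Camcorders", because the unanalyzed sub-field holds the whole string as a single term.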

Related

Update large set of documents without knowing _id

I would like to update a large set of documents in Elasticsearch at once.
One document looks like this:
{
"_index": "vue_storefront_magento_1_1587559712",
"_type": "product",
"_id": "1123",
"_version": 56,
"_score": 7.7135754,
"_source": {
"sku": "381735",
"score": "1"
},
"fields": {
"updated_at": [
1589880769000
]
},
"highlight": {
"type_id": [
"#kibana-highlighted-field#configurable#/kibana-highlighted-field#"
],
"sku": [
"#kibana-highlighted-field#381735#/kibana-highlighted-field#"
]
}
}
I have a JSON file that contains the data I want to update. It has no _id field, only an SKU. I want to use this JSON to build the update request to Elasticsearch.
[
{ "sku": 381735, "score": 2 },
{ "sku": 381736, "score": 3 },
{ "sku": 381737, "score": 4 }
]
I would like to update all of the score fields based on the SKU field in the _source.
Is this possible? I already looked at the update by query API but can't figure it out :-/
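One possible approach, sketched under assumptions: send one _update_by_query request per entry, selecting documents with a term query on sku and setting score through a script parameter. The helper name build_update_by_query is hypothetical, and the sketch only builds the request bodies without sending them. It assumes sku is indexed so that a term query on the string value matches, and that the cluster accepts the "source" script key (older 5.x releases used "inline"). The sample document stores sku and score as strings, so the values are converted to strings here.

```python
import json

UPDATES = [
    {"sku": 381735, "score": 2},
    {"sku": 381736, "score": 3},
    {"sku": 381737, "score": 4},
]


def build_update_by_query(item):
    """Body for POST <index>/_update_by_query (hypothetical helper)."""
    return {
        # Select the documents by SKU instead of by _id.
        "query": {"term": {"sku": str(item["sku"])}},
        # Pass the new value as a parameter so the script text stays constant.
        "script": {
            "lang": "painless",
            "source": "ctx._source.score = params.score",
            "params": {"score": str(item["score"])},
        },
    }


bodies = [build_update_by_query(item) for item in UPDATES]
print(json.dumps(bodies[0], indent=2))
```

Keeping the script source identical across requests and varying only params lets Elasticsearch reuse the compiled script.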

How to remove a field from json field in Elastic Search

I would like to remove member2 from members. I saw this script
ctx._source.list_data.removeIf{list_item -> list_item.list_id == remove_id}
for a list, but it does not work in my case. Is that possible?
"_index": "test",
"_type": "test",
"_id": "5",
"_score": 1.0,
"_source": {
"id": "1",
"description": "desc",
"name": "ss",
"members": {
"member1": {
"id": "2",
"role": "owner"
},
"member2": {
"role": "owner",
"id": "3"
}
}
}
}
You can use the update API:
POST test/_update/5
{
"script": "ctx._source.members.remove('member2')"
}
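removeIf is for lists. Your members field is an object, so you need to use remove. To remove member2 only when its id is '3':
POST test/_update/5
{
"script": "if (ctx._source.members.member2 != null && ctx._source.members.member2.id == '3') { ctx._source.members.remove('member2') }"
}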
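The conditional removal can also take the key and id as script parameters instead of hard-coding them. The sketch below builds such a request body and mirrors the script's logic on a plain Python dict to show the intended effect. The helper names and the containsKey guard are assumptions for illustration, not part of the answer above.

```python
def build_remove_member_body(key, member_id):
    """Body for POST test/_update/5 (hypothetical helper)."""
    return {
        "script": {
            "lang": "painless",
            "source": (
                "if (ctx._source.members.containsKey(params.key) "
                "&& ctx._source.members[params.key].id == params.id) "
                "{ ctx._source.members.remove(params.key) }"
            ),
            "params": {"key": key, "id": member_id},
        }
    }


def remove_member(source, key, member_id):
    """Same logic applied locally to a dict, for illustration only."""
    members = source.get("members", {})
    if key in members and members[key].get("id") == member_id:
        members.pop(key)  # dict.pop plays the role of Map.remove
    return source
```

The guard on the key means the script does nothing, rather than failing, when the member has already been removed.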

ElasticSearch NEST Reindex, edit name fields

I have an Index with nested Objects something like
"_index": "originindex",
"_source": {
"message": "",
"environment": "",
"nestedObj": {
"field1": "field1",
"field2": 1 },
"anotherfield": 1}
And I want to reindex it to something like
"_index": "newindex",
"_source": {
"message": "",
"nestedObj-field1":"field1",
"nestedObj-field2": 1 ,
"anotherfield": 1}
I'm new to all of this. I'm using NEST on .NET 4.5; it offers a Reindex API, but I don't know how to use it for this purpose.
Thank you!
POST _reindex
{
"source": {
"index": "originindex"
},
"dest": {
"index": "newindex"
},
"script": {
"source": "def obj = ctx._source.remove('nestedObj'); if (obj != null) { ctx._source['nestedObj-field1'] = obj.field1; ctx._source['nestedObj-field2'] = obj.field2; }"
}
}
Just make sure your mappings are in place on the dest index before you execute this.
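The shape change the question asks for can be sketched locally in Python: every key of nestedObj becomes a top-level key prefixed with "nestedObj-", and nestedObj itself is dropped. The function name is hypothetical, and this only illustrates the transformation on a dict; it is not a NEST or Elasticsearch call.

```python
def flatten_nested_obj(source, field="nestedObj", sep="-"):
    """Return a copy of source with source[field] lifted to top-level keys."""
    # Copy every field except the nested object itself.
    flat = {k: v for k, v in source.items() if k != field}
    # Lift each inner key to the top level as "<field><sep><key>".
    for key, value in (source.get(field) or {}).items():
        flat[f"{field}{sep}{key}"] = value
    return flat


origin = {
    "message": "",
    "environment": "",
    "nestedObj": {"field1": "field1", "field2": 1},
    "anotherfield": 1,
}
print(flatten_nested_obj(origin))
```

Because the new keys contain a hyphen, a script on the Elasticsearch side has to address them with bracket syntax rather than dot syntax.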

Name searching in ElasticSearch

I have an index in Elasticsearch with a field called name, where I store a person's whole name: name and surname. I want to perform full-text search over that field, so I have indexed it with an analyzer.
My issue now is that if I search:
"John Rham Rham"
and the index contains "John Rham Rham Luck", that value gets a higher score than "John Rham Rham".
Is there any possibility to give the exact match a better score than a field with more words in the string?
Thanks in advance!
I worked out a small example (assuming you're running on ES 5.x, because of the difference in scoring):
DELETE test
PUT test
{
"settings": {
"similarity": {
"my_bm25": {
"type": "BM25",
"b": 0
}
}
},
"mappings": {
"test": {
"properties": {
"name": {
"type": "text",
"similarity": "my_bm25",
"fields": {
"length": {
"type": "token_count",
"analyzer": "standard"
}
}
}
}
}
}
}
POST test/test/1
{
"name": "John Rham Rham"
}
POST test/test/2
{
"name": "John Rham Rham Luck"
}
GET test/_search
{
"query": {
"function_score": {
"query": {
"match": {
"name": {
"query": "John Rham Rham",
"operator": "and"
}
}
},
"functions": [
{
"script_score": {
"script": "_score / doc['name.length'].getValue()"
}
}
]
}
}
}
This code does the following:
Replace the default BM25 implementation with a custom one, setting the b parameter (field length normalisation) to 0
-- You could also change the similarity to 'classic' to go back to TF/IDF, which doesn't have this normalisation
Create an inner field for your name field, which counts the number of tokens inside your name field.
Divide the score by the number of tokens in the field
This will result in:
"hits": {
"total": 2,
"max_score": 0.3596026,
"hits": [
{
"_index": "test",
"_type": "test",
"_id": "1",
"_score": 0.3596026,
"_source": {
"name": "John Rham Rham"
}
},
{
"_index": "test",
"_type": "test",
"_id": "2",
"_score": 0.26970196,
"_source": {
"name": "John Rham Rham Luck"
}
}
]
}
}
Not sure if this is the best way of doing it, but it may point you in the right direction :)
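The two scores in the output are consistent with both documents receiving the same base score (with b set to 0, field length does not affect BM25) and then being divided by their token counts, 3 and 4. A small check of that arithmetic follows; the base score is inferred from the output above, not taken from Elasticsearch.

```python
import math

score_doc1 = 0.3596026   # "John Rham Rham", 3 tokens
score_doc2 = 0.26970196  # "John Rham Rham Luck", 4 tokens

# Undo the division done by script_score to recover the base score.
base1 = score_doc1 * 3
base2 = score_doc2 * 4

print(base1, base2)
print(math.isclose(base1, base2, rel_tol=1e-6))
```

So the shorter name wins only because of the division by name.length; without the function_score wrapper both documents would tie.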

Elastic search range query

Consider two documents in an index like the ones below:
{
"_index": "32",
"_type": "places",
"_id": "_FqlAzzSRN6Ge_294D5Mwg",
"_score": 1,
"_source": {
"name_3": "xxxx",
"id_3": "xxxxx",
"name_2": "xxxx",
"id_2": "xxx",
"name_1": "xxx",
"id_1": "xxx",
"tempid": "xxxxx",
"field1": 316.6666666666667,
"type": "processeddata"
}
},
{
"_index": "32",
"_type": "places",
"_id": "3RCO-zHeSr2nWFZd8W-MDg",
"_score": 1,
"_source": {
"name_3": "yyyy",
"id_3": "yyy",
"name_2": "yyy",
"id_2": "yyy",
"name_1": "yyyy",
"id_1": "yyy",
"tempid": "yy",
"field2": 400.6666666666667,
"type": "processeddata"
}
}
I want to construct a query for the following scenario. I have to find the documents whose fields fall in particular ranges:
field1: 200-400
field2: 300-400, so both of the documents above should be returned.
My query is as follows:
"query": {
"bool": {
"must": [
{
"range": {
"field1": {
"gte": 200,
"lte": 400
}
},"range": {
"field2": {
"gte": 300,
"lte": 400
}
}
}
]
}
}
But the above query looks for both fields in a single document, so no result comes back. I need the search to return a document if any of the fields satisfies its range. Please share your ideas. Thanks in advance.
You need to use bool should and not bool must. That way a document matches if it satisfies at least one of the conditions.
NOTE: Your second condition won't match the second document, as 400.66 does not fall in the range [300, 400].
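A minimal sketch of the should-based query, built as a Python dict, together with a local stand-in that applies the same "any range matches" rule to the two sample documents. The helper names are hypothetical, and minimum_should_match is set explicitly here as an assumption to make the "at least one" requirement visible.

```python
def build_any_range_query(ranges):
    """bool/should with one range clause per field (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"range": {field: {"gte": lo, "lte": hi}}}
                    for field, (lo, hi) in ranges.items()
                ],
                "minimum_should_match": 1,
            }
        }
    }


def matches_any(source, ranges):
    """Local stand-in: True if any listed field lies inside its range."""
    return any(
        field in source and lo <= source[field] <= hi
        for field, (lo, hi) in ranges.items()
    )


RANGES = {"field1": (200, 400), "field2": (300, 400)}
query = build_any_range_query(RANGES)

doc1 = {"field1": 316.6666666666667}
doc2 = {"field2": 400.6666666666667}
print(matches_any(doc1, RANGES), matches_any(doc2, RANGES))
```

Each range clause sits in its own object inside the should array, which also avoids repeating the "range" key within a single object as the query in the question does.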
