aioes 'delete_by_query' method doesn't work - elasticsearch

I have several words in my Elasticsearch index, and they show up when I search with a 'match' query:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 10,
"successful": 10,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.30685282,
"hits": [
{
"_index": "my_words_pack",
"_type": "work_g1",
"_id": "AVetfhx1AM1sow6PcrL0",
"_score": 0.30685282,
"_source": {
"keyword": "morteza"
}
}
]
}
}
But when I try to remove them by '_id', it doesn't work and shows me this error:
es.delete_by_query(index='my_words_pack', doc_type='work_g1', body={"query": {"match": {"_id": "AVetfhx1AM1sow6PcrL0"}}})
Error:
aioes.exception.NotFoundError: TransportError(404, '{"found":false,"_index":"my_words_pack","_type":"work_g1","_id":"_query","_version":1,"_shards":{"total":2,"successful":1,"failed":0}}')

Elasticsearch removed delete-by-query from the core in version 2.0 and moved it into a plugin that you must install if you want to use it. Without the plugin, the request is treated as a delete of a document whose ID is "_query", which is why the error reports "found":false for "_id":"_query".
Since you already have the document IDs, it's better to delete these documents by ID rather than by query. I think the way to do it in the Python client is:
es.delete(index="my_words_pack",doc_type="work_g1",id="AVetfhx1AM1sow6PcrL0")
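If the IDs come from a search, as in the question, they can be read from the response and then deleted one at a time with es.delete, or batched into a bulk request. The following is only a minimal sketch of preparing such a batch: the helper names ids_from_search and bulk_delete_body are made up for illustration, and sending the body is left out because how aioes exposes the bulk endpoint is not covered here.

```python
import json


def ids_from_search(response):
    """Collect (_index, _type, _id) triples from a search response dict."""
    return [(h["_index"], h["_type"], h["_id"]) for h in response["hits"]["hits"]]


def bulk_delete_body(triples):
    """Build a newline-delimited bulk body with one delete action per document."""
    lines = [
        json.dumps({"delete": {"_index": i, "_type": t, "_id": d}})
        for i, t, d in triples
    ]
    # The bulk format expects every line, including the last, to end with "\n".
    return "".join(line + "\n" for line in lines)


# Trimmed copy of the search response from the question.
response = {
    "hits": {
        "hits": [
            {
                "_index": "my_words_pack",
                "_type": "work_g1",
                "_id": "AVetfhx1AM1sow6PcrL0",
                "_source": {"keyword": "morteza"},
            }
        ]
    }
}

body = bulk_delete_body(ids_from_search(response))
print(body, end="")
```

For a single document, the plain es.delete call above is simpler; the batch form only pays off when many IDs have to go at once.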

Related

Delete Indexes by index name and type using elasticSearch 2.3.3 in java

I have a Java project where I index data using Elasticsearch 2.3.3. The indexed documents are of two types.
My indexed documents look like this:
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "movies",
"_id": "uReb0g9KSLKS18sTATdr3A",
"_score": 1,
"_source": {
"genre": "Thriller"
}
},
{
"_index": "test_index",
"_type": "drama",
"_id": "cReb0g9KSKLS18sTATdr3B",
"_score": 1,
"_source": {
"genre": "SuperNatural"
}
},
{
"_index": "index1",
"_type": "drama",
"_id": "cReb0g9KSKLS18sT76ng3B",
"_score": 1,
"_source": {
"genre": "Romance"
}
}
]
}
}
I need to delete only the documents with a particular index name and type.
For example, from the documents above I want to delete those with index name "test_index" and type "drama".
So the result should look like:
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "movies",
"_id": "uReb0g9KSLKS18sTATdr3A",
"_score": 1,
"_source": {
"genre": "Thriller"
}
},
{
"_index": "index1",
"_type": "drama",
"_id": "cReb0g9KSKLS18sT76ng3B",
"_score": 1,
"_source": {
"genre": "Romance"
}
}
]
}
}
Solutions tried:
client.admin().indices().delete(new DeleteIndexRequest("test_index")).actionGet();
But it deletes everything in "test_index", regardless of type.
I have also tried various queries in the Sense (beta) plugin, like:
DELETE /test_index/drama
It gives the error: No handler found for uri [/test_index/drama] and method [DELETE]
DELETE /test_index/drama/_query?q=_id:*&analyze_wildcard=true
It also doesn't work.
At the time I send the delete request, the document IDs are unknown, so I have to delete by index name and type only.
How can I delete the required documents using the Java API?
This was possible before ES 2.0 using the Delete Mapping API; since 2.0, however, that API no longer exists.
To do this you will have to install the Delete by Query plugin. Then you can simply run a match_all query against your index and type and delete every document it matches.
The query will look something like this:
DELETE /test_index/drama/_query
{
"query": {
"match_all": {}
}
}
Also keep in mind that this will delete the documents in the mapping and not the mapping itself. If you want to remove the mapping too you'll have to reindex without the mapping.
This might be able to help you with the java implementation
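As a rough illustration of the request described above, here is a small sketch that only assembles the method, path, and body for a given index and type. It assumes the Delete by Query plugin is installed and that it accepts a body with a single top-level "query" key; delete_type_request is a made-up helper name, and actually sending the request is left to whichever HTTP or Elasticsearch client is in use.

```python
import json


def delete_type_request(index, doc_type):
    """Return (method, path, body) for deleting every document of one type.

    Assumes an ES 2.x cluster with the delete-by-query plugin installed.
    """
    path = "/{}/{}/_query".format(index, doc_type)
    body = {"query": {"match_all": {}}}
    return "DELETE", path, json.dumps(body)


method, path, body = delete_type_request("test_index", "drama")
print(method, path)
print(body)
```

Because the path names both the index and the type, documents of type "movies" in "test_index" and documents of type "drama" in "index1" are not targeted.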

I want to use a wildcard query for url in elasticsearch. I am using elasticsearch 2.3.0

My index looks like this:
GET pibtest1/_search
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 11,
"max_score": 1,
"hits": [
{
"_index": "pibtest1",
"_type": "SearchTech",
"_id": "_update",
"_score": 1,
"_source": {
"script": "ctx._source.remove(\"wiki_collection\")"
}
},
{
"_index": "pibtest1",
"_type": "SearchTech",
"_id": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1",
"_score": 1,
"_source": {
"extension": {
"X-Parsed-By": "org.apache.tika.parser.DefaultParser",
"Content-Encoding": "ISO-8859-1",
"resourceName": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1"
},
"keywords": "keywords-NOT-PROVIDED",
"default_collection": true,
"wiki_collection": false,
"description": "description-NOT-PROVIDED",
"connectorSpecific": {
"discoveredBy": "http://www.searchtechnologies.com/",
"xslt": "false",
"pathFromSeed": "E",
"md5": "OKTGVLEWTE5V4PWXUBM2RK3KMQ"
},
"title": "Title-NOT-PROVIDED",
"url": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1",
"remove": "wiki_collection",
"UD": "http://www.searchtechnologies.com/bundles/jquery?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1",
Now I want to use a wildcard query to search for URLs that match some pattern (e.g. http://www.searchtechnologies.com/bundles).
This is my wildcard query:
GET pibtest1/_search
{
"query": {
"wildcard": {
"url": {
"value": "http://www.searchtechnologies.com/bundles*"
}
}
}
}
I am using the "*" wildcard, which matches any character sequence, but I am not getting any results. My output looks like this:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
I want my results to include the URLs that match the "http://www.searchtechnologies.com/bundles" pattern. Any help would be appreciated.
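Based on the comments, your url field is an analyzed field. So when you insert data, it is tokenized into something like ["www.searchtechnologies.com", "v", "jquery", "gOdOgfykTFJnypePAvGweyMPwl", ...]. A wildcard query is not analyzed: its pattern is compared against each indexed term as a whole, and no single term starts with http://www.searchtechnologies.com/bundles, so your query won't match this field.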
1. Delete your index.
2. Create it again with a mapping that specifies the url field as not analyzed: {"index":"not_analyzed"}
3. Insert your data.
4. Run the wildcard query.
If you don't want to delete your index because of the downtime, check: https://www.elastic.co/blog/changing-mapping-with-zero-downtime
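To make the reasoning concrete, here is a small simulation in plain Python, with no Elasticsearch involved. It is only a sketch: crude_analyze is a made-up, very rough stand-in for an analyzer (the real tokens will differ), and fnmatchcase approximates the wildcard match. The point it shows is that the pattern has to match one whole indexed term. The MAPPING dict at the end is what the not_analyzed mapping could look like in ES 2.x syntax, using the type name from the question.

```python
import re
from fnmatch import fnmatchcase

URL = ("http://www.searchtechnologies.com/bundles/jquery"
       "?v=gOdOgfykTFJnypePAvGweyMPwl-krhx8ntIhefPKelg1")
PATTERN = "http://www.searchtechnologies.com/bundles*"


def crude_analyze(text):
    """Rough stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def wildcard_hits(terms, pattern):
    """Return the indexed terms that the wildcard pattern matches as a whole."""
    return [t for t in terms if fnmatchcase(t, pattern)]


analyzed_terms = crude_analyze(URL)  # many small terms
not_analyzed_terms = [URL]           # one term: the whole URL

print(wildcard_hits(analyzed_terms, PATTERN))      # no single term matches
print(wildcard_hits(not_analyzed_terms, PATTERN))  # the full URL matches

# Mapping body that declares url as not_analyzed (ES 2.x string field).
MAPPING = {
    "mappings": {
        "SearchTech": {
            "properties": {
                "url": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}
```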

How to get a new field in results for the searched query in Elasticsearch?

So I’m using elasticsearch V2.3.1. Below is my elasticsearch query:
GET pibtest1/_search?q=white
{
"size": 1,
"fields": ["U", "UE", "UD", "T"]
}
I get the following result after running the above query:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 85,
"max_score": 0.15116164,
"hits": [
{
"_index": "pibtest1",
"_type": "SearchTech",
"_id": "1",
"_score": 0.15116164,
"fields": {
"UE": [
"Some value1"
],
"U": [
"Some value2"
],
"T": [
"Some value3"
],
"UD": [
"Some value4"
]
}
}
]
}
}
As you can see in the results, Elasticsearch doesn’t provide any information about the query that was searched. In my case, the query is “white”. So is there any way to get the searched query (“white”) in the result? For example, I would like to get something like this in the result ->
“query”: “white”
I checked the explain API of Elasticsearch. It does provide the details of how the score gets computed but it doesn’t explicitly contain any field for searched query. Thank you everyone.
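One possible workaround, offered only as a client-side sketch and not as something Elasticsearch does for you: the caller already knows the query text, so it can be attached to the response after the search returns. Both search_with_echo and fake_search are made-up names, and fake_search stands in for the real client call.

```python
def search_with_echo(run_search, query_text):
    """Run a search and return the response with the query text attached.

    run_search is any callable that takes the query text and returns the
    response as a dict; it is a placeholder for the real client call.
    """
    response = dict(run_search(query_text))  # shallow copy, original untouched
    response["query"] = query_text
    return response


def fake_search(query_text):
    """Stand-in for a real Elasticsearch call, returning a trimmed response."""
    return {"took": 2, "hits": {"total": 85, "hits": []}}


result = search_with_echo(fake_search, "white")
print(result["query"])
```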

Append to array in Elasticsearch

I am currently struggling a bit with how to append a value to an array in Elasticsearch.
The document looks something like this:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "iethreads",
"_type": "thread",
"_id": "AVRk6WRMU5h_y_zwo4s0",
"_score": 1,
"fields": {
"links": [
"[\"https://somelink123.net/thread-714222&page=1\", \"https://somelink123.net/thread-714222&page=2\", \"https://somelink123.net/thread-714222&page=3\", \"https://somelink123.net/thread-714222&page=4\"]"
]
}
}
]
}
}
then I run the following update query
POST _update
{
"script" : "ctx._source.links+=new_posts",
"params" : {
"new_posts":"blabliblub"
}
}
and I get this:
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "iethreads",
"_type": "thread",
"_id": "AVRk6WRMU5h_y_zwo4s0",
"_score": 1,
"fields": {
"links": [
"[\"https://somelink123.net/thread-714222&page=1\", \"https://somelink123.net/thread-714222&page=2\", \"https://somelink123.net/thread-714222&page=3\", \"https://somelink123.net/thread-714222&page=4\"]blabliblub"
]
}
}
]
}
}
So for me this looks like the array is treated like a string and it just appends the string - this is not what I want.
How would I append the "blabliblub" as a new element to the array ?
It seems your links field actually holds a single string element instead of an array of links. For your update to succeed, the structure must look like this:
"fields": {
"links": [
"https://somelink123.net/thread-714222&page=1",
"https://somelink123.net/thread-714222&page=2",
"https://somelink123.net/thread-714222&page=3",
"https://somelink123.net/thread-714222&page=4"
]
}
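The difference between the two shapes can be shown in plain Python. This is a minimal sketch under the assumption that the stored string is valid JSON, as it appears to be in the question; as_link_list is a made-up helper that unpacks the encoded array so that appending adds a new element instead of concatenating onto the string.

```python
import json


def as_link_list(links):
    """Return links as a flat list of strings.

    Handles the shape from the question, where the field holds a single
    JSON-encoded string such as '["a", "b"]' instead of the values themselves.
    """
    if isinstance(links, str):
        links = [links]
    flat = []
    for item in links:
        if item.lstrip().startswith("["):
            flat.extend(json.loads(item))  # unpack the encoded array
        else:
            flat.append(item)
    return flat


# Shortened version of the stored value: one string that contains an array.
stored = ['["https://somelink123.net/thread-714222&page=1", '
          '"https://somelink123.net/thread-714222&page=2"]']

links = as_link_list(stored)
links.append("blabliblub")  # a real list append, not string concatenation
print(links)
```

Reindexing the document with the flattened list would give the update script an actual array to append to.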

How to filter out elements from an array that doesn’t match the query?

Stack Overflow won't let me post that much example code, so I put it on a gist.
So I have this index
with this mapping
here is a sample document I insert into newly created mapping
this is my query
GET products/paramSuggestions/_search
{
"size": 10,
"query": {
"filtered": {
"query": {
"match": {
"paramName": {
"query": "col",
"operator": "and"
}
}
}
}
}
}
this is the unwanted result I get from previous query
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.33217794,
"hits": [
{
"_index": "products",
"_type": "paramSuggestions",
"_id": "1",
"_score": 0.33217794,
"_source": {
"productName": "iphone 6",
"params": [
{
"paramName": "color",
"value": "white"
},
{
"paramName": "capacity",
"value": "32GB"
}
]
}
}
]
}
}
and finally the wanted result, i.e. what I want the query result to look like
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.33217794,
"hits": [
{
"_index": "products",
"_type": "paramSuggestions",
"_id": "1",
"_score": 0.33217794,
"_source": {
"productName": "iphone 6",
"params": [
{
"paramName": "color",
"value": "white"
}
]
}
}
]
}
}
What should the query look like to achieve the wanted result, with the array field filtered down to the items that match the query? In other words, all non-matching array items should not appear in the final result.
The final result is the _source document that you indexed. There is no feature that lets you mask field elements of your document out of the Elasticsearch response.
That said, depending on your goal, you can look into how Highlighters and Suggesters identify result terms matching the query, or possibly, roll-your-own client-side masking using info returned from setting "explain": true in your query.
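A roll-your-own client-side mask could look roughly like the sketch below. It does not use the explain output; it simply re-applies the match on the client, under the assumption that the query "col" is meant to match parameter names by prefix (which depends on the analyzer in the mapping, not shown here). mask_params is a made-up helper name.

```python
import copy


def mask_params(response, prefix):
    """Return a copy of the response keeping only params whose name starts with prefix."""
    masked = copy.deepcopy(response)
    for hit in masked["hits"]["hits"]:
        source = hit["_source"]
        source["params"] = [
            p for p in source.get("params", [])
            if p["paramName"].lower().startswith(prefix.lower())
        ]
    return masked


# Trimmed copy of the unwanted result from the question.
response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {
                    "productName": "iphone 6",
                    "params": [
                        {"paramName": "color", "value": "white"},
                        {"paramName": "capacity", "value": "32GB"},
                    ],
                },
            }
        ]
    }
}

masked = mask_params(response, "col")
print(masked["hits"]["hits"][0]["_source"]["params"])
```

The original response is left untouched, so the full document is still available if it is needed elsewhere.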
