Percolate not returning results as expected - elasticsearch

We're trying to set up and use percolate, but we aren't quite getting results as expected.
First, I register a few queries:
curl -XPUT 'localhost:9200/index-234234/.percolator/query1' -d '{
"query" : {
"range" : {
"price" : { "gte": 100 }
}
}
}'
curl -XPUT 'localhost:9200/index-234234/.percolator/query2' -d '{
"query" : {
"range" : {
"price" : { "gte": 200 }
}
}
}'
And then, when I try to match it against 150, which should ideally match only query1, instead it matches both queries:
curl -XGET 'localhost:9200/index-234234/message/_percolate' -d '{
"doc" : {
"price" : 150
}
}'
{"took":4,"_shards":{"total":5,"successful":5,"failed":0},"total":2,"matches":[{"_index":"index-234234","_id":"query1"},{"_index":"index-234234","_id":"query2"}]}
Any pointers as to why this is happening would be much appreciated.

The problem is that you registered your percolator queries before setting up the mapping for the document type. The percolator then has to register each query without a defined mapping, which can be an issue, particularly for range queries.
Start over by deleting the index, then run this mapping command first:
curl -XPOST localhost:9200/index-234234 -d '{
"mappings" : {
"message" : {
"properties" : {
"price" : {
"type" : "long"
}
}
}
}
}'
Then run your previous commands again (register the two percolator queries, then percolate one document), and you will get the following correct response:
{"took":3,"_shards":{"total":5,"successful":5,"failed":0},"total":1,"matches":[{"_index":"index-234234","_id":"query1"}]}
You may find this discussion from a couple of years ago helpful:
http://grokbase.com/t/gg/elasticsearch/124x6hq4ev/range-query-in-percolate-not-working
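For anyone scripting this, the ordering is the important part. Here is a minimal Python sketch that only builds the requests in the order described above (mapping first, then the queries, then the percolate call) and prints them as curl commands. The helper name build_plan is made up for illustration, and nothing is sent to a cluster:

```python
import json

INDEX = "index-234234"  # index name taken from the question


def build_plan():
    """Return (method, path, body) tuples in the order they should be sent."""
    mapping = {"mappings": {"message": {"properties": {"price": {"type": "long"}}}}}
    # 1. Create the index with an explicit mapping for "price".
    plan = [("POST", f"/{INDEX}", mapping)]
    # 2. Register the percolator queries only after the mapping exists.
    for query_id, bound in (("query1", 100), ("query2", 200)):
        body = {"query": {"range": {"price": {"gte": bound}}}}
        plan.append(("PUT", f"/{INDEX}/.percolator/{query_id}", body))
    # 3. Percolate the document.
    plan.append(("GET", f"/{INDEX}/message/_percolate", {"doc": {"price": 150}}))
    return plan


for method, path, body in build_plan():
    print(f"curl -X{method} 'localhost:9200{path}' -d '{json.dumps(body)}'")
```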

Not a solution, but this works for me (I don't know why):
Register both percolator queries.
Send the _percolate request (returns your result: "total": 2).
Register both percolator queries again (both are now at version 2).
Send the _percolate request again (returns the right result: "total": 1).

Related

Bulk delete elasticsearch

I am using Elasticsearch 2.2.
Here is the document count:
curl 'xxxxxxxxx:9200/_cat/indices?v'
yellow open app 5 1 28019178 5073 11.4gb 11.4gb
In the "app" index we have two types of document.
"log"
"syslog"
Now I want to delete all the documents of type "syslog", so I tried the following command:
curl -XDELETE "http://xxxxxx:9200/app/syslog"
But I am getting the following error:
No handler found for uri [/app/syslog]
I have installed the delete-by-query plugin as well. Is there any way I can do a bulk delete operation?
For now, I am deleting records one at a time by fetching their IDs:
curl -XDELETE "http://xxxxxx:9200/app/syslog/A121312"
It took around 5 minutes to delete 10,000 records, and I have more than 1,000,000 documents that need to be deleted. Please help.
[EDIT -1]
I ran the query below to delete the syslog documents:
curl -XDELETE 'http://xxxxxx:9200/app/syslog/_query' -d'
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
]
}
}
}'
And result is below
{"found":false,"_index":"app","_type":"syslog","_id":"_query","_version":1,"_shards":{"total":2,"successful":1,"failed":0}}
I used a query to fetch this document from the index:
{
"_index" : "app",
"_type" : "syslog",
"_id" : "AVckPMQnKYIebrQhF556",
"_score" : 1.0,
"_source" : {
"message" : "some test message",
"@version" : "1",
"@timestamp" : "2016-09-13T15:49:04.562Z",
"type" : "syslog",
"host" : "1.2.3.4",
"priority" : 0,
"severity" : 0,
"facility" : 0,
"facility_label" : "kernel",
"severity_label" : "Emergency"
}
}
[EDIT 2]
The delete-by-query plugin is listed as installed:
sudo /usr/share/elasticsearch/bin/plugin list
Installed plugins in /usr/share/elasticsearch/plugins/node1:
- delete-by-query
I had a similar problem after filling Elasticsearch with 77 million unwanted documents over the last couple of days. Setting a timeout on the request is your friend. curl has its own time limit that you can raise as well (-m 3600):
curl --request DELETE \
--url 'http://127.0.0.1:9200/nadhled/tree/_query?timeout=60m' \
--header 'content-type: application/json' \
-m 3600 \
--data '{"query":{
"filtered":{
"filter":{
"range":{
"timestamp":{
"lt":1564826247,
"gt":1564527660
}
}
}
}
}
}'
I know this is not the bulk delete you asked about, but I found this page during my own research, so I am posting it here. I hope it helps you too.
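One thing to watch for when a range needs both a lower and an upper bound: keep both bounds inside a single object for the field. Writing the field name twice in the same JSON object is fragile, because many parsers keep only one of the entries. The sketch below shows this with Python's json module, which keeps the last one. How Elasticsearch's own parser handles a repeated key is not verified here, and range_filter is a hypothetical helper name:

```python
import json

# The same key written twice inside one JSON object.
duplicated = '{"timestamp": {"lt": 1564826247}, "timestamp": {"gt": 1564527660}}'
parsed = json.loads(duplicated)
print(parsed)  # Python's json module keeps only the last "timestamp" entry


def range_filter(field, gt, lt):
    """Build a range clause with both bounds under a single key."""
    return {"range": {field: {"gt": gt, "lt": lt}}}


combined = range_filter("timestamp", 1564527660, 1564826247)
print(json.dumps(combined))
```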
In newer Elasticsearch versions (5.2), you can use _delete_by_query:
curl -XPOST "http://localhost:9200/index/type/_delete_by_query" -d'
{
"query":{
"match_all":{}
}
}'
The delete-by-query API is new and should still be considered experimental. The API may change in ways that are not backwards compatible.
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
I would suggest creating a new index instead and reindexing only the documents you want to keep.
But if you want to use delete-by-query, you should use this:
curl -XDELETE 'http://xxxxxx:9200/app/syslog/_query' -d '
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
]
}
}
}'
but then you'll be left with the mapping for the type.
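If you already have the document IDs, as in the one-by-one deletes from the question, another option is to batch them through the _bulk endpoint rather than sending one DELETE per document. Below is a rough Python sketch that only builds the newline-delimited request body. The second ID is made up, the helper name bulk_delete_body is hypothetical, and you would still need to POST the result to /_bulk yourself:

```python
import json


def bulk_delete_body(index, doc_type, ids):
    """Build a newline-delimited _bulk body with one delete action per id."""
    lines = [
        json.dumps({"delete": {"_index": index, "_type": doc_type, "_id": doc_id}})
        for doc_id in ids
    ]
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


ids = ["A121312", "A121313"]  # the second id is hypothetical
body = bulk_delete_body("app", "syslog", ids)
print(body, end="")
```

Sending a few thousand actions per request is a common starting point, but the right batch size depends on your cluster.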

Elastic Search Percolate Boolean Queries

I am trying to retrieve boolean queries stored in ES using the Percolate API.
The index mapping is given below:
curl -XPUT 'localhost:9200/my-index' -d '{
"mappings": {
"taggers": {
"properties": {
"content": {
"type": "string"
}
}
}
}
}'
I am inserting records like this (the queries use boolean operators such as AND, OR, and NOT, as in the example below):
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
"query" : {
"match" : {
"content" : "Audi AND BMW"
}
}
}'
Then I percolate a document to get the matching queries:
curl -XGET 'localhost:9200/my-index/my-type/_percolate' -d '{
"doc" : {
"content" : "I like audi very much"
}
}'
In the above case nothing should be returned, because the boolean query is "Audi AND BMW", but the query is still matched. It seems the AND condition is being ignored. I cannot figure out why it does not work for boolean queries.
You need to register this query instead. match queries do not understand the AND operator (they treat it as the ordinary token "and"), but query_string does.
curl -XPUT 'localhost:9200/my-index/.percolator/1' -d '{
"query" : {
"query_string" : {
"query" : "Audi AND BMW",
"default_field": "content"
}
}
}'
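To make the difference concrete, here is a toy Python simulation. It is not how Elasticsearch is implemented: analyze is a crude stand-in for an analyzer, and match_or and all_terms are made-up names. It shows why a match query with its default OR operator hits the document, while the intended "both terms must be present" logic does not:

```python
import re


def analyze(text):
    """Very rough stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def match_or(query_text, doc_text):
    """Like a match query with its default OR operator: any query token may hit."""
    doc_tokens = set(analyze(doc_text))
    return any(tok in doc_tokens for tok in analyze(query_text))


def all_terms(terms, doc_text):
    """What "Audi AND BMW" is meant to express: every term must be present."""
    doc_tokens = set(analyze(doc_text))
    return all(tok in doc_tokens for term in terms for tok in analyze(term))


doc = "I like audi very much"
print(match_or("Audi AND BMW", doc))    # "AND" is just another token here
print(all_terms(["Audi", "BMW"], doc))  # "bmw" is missing from the document
```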

Is there any analyzer that performs exact word matching in elastic search

Is there any analyzer that performs exact word matching in Elasticsearch?
For example, if I have words like "America", "American", and "America's" and I search for "America", I should get only the first one. With the standard analyzer I get all three.
I want to do this only at query time, without making changes to the existing index. Please help me.
Set your mapping to not_analyzed:
curl -XPOST localhost:9200/test -d '{
"mappings" : {
"type1" : {
"properties" : {
"field1" : { "type" : "string", "index" : "not_analyzed" }
}
}
}
}'
That leaves your input string untouched while still making it searchable. Note that if you do this, searching for "america" will not match "America", because the case differs.
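If you want to be able to match those, be aware that the built-in keyword analyzer on its own also emits the input unchanged as a single token, so you would need a custom analyzer that combines the keyword tokenizer with a lowercase filter. For reference, this is how the keyword analyzer is set on a field: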
curl -XPOST localhost:9200/test -d '{
"mappings" : {
"type1" : {
"properties" : {
"field1" : { "type" : "string", "analyzer" : "keyword" }
}
}
}
}'
You don't need to worry about the analyzer. At query time, use term or terms queries and it will behave as you asked.

Fuzzy string matching using Levenshtein algorithm in Elasticsearch

I have just started exploring Elasticsearch. I created a document as follows:
curl -XPUT "http://localhost:9200/cities/city/1" -d'
{
"name": "Saint Louis"
}'
I then tried to do a fuzzy search on the name field with a Levenshtein distance of 5, as follows:
curl -XGET "http://localhost:9200/_search" -d'
{
"query": {
"fuzzy": {
"name" : {
"value" : "St. Louis",
"fuzziness" : 5
}
}
}
}'
But it's not returning any match. I expect the Saint Louis record to be returned. How can I fix my query?
Thanks.
The problem with your query is that only a maximum edit distance of 2 is allowed.
In the case above what you probably want to do is have a synonym for St. to Saint, and that would match for you. Of course, this would depend on your data as St could also be "street".
If you just want to test fuzzy searching, you could try this example:
curl -XGET "http://localhost:9200/_search" -d'
{
"query": {
"fuzzy": {
"name" : {
"value" : "Louiee",
"fuzziness" : 2
}
}
}
}'
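To see how far apart the two spellings from the question really are, here is a small Python sketch of the classic Levenshtein distance. It compares the whole strings only. Elasticsearch's fuzzy query works against individual indexed terms and may count edits slightly differently (for example by treating transpositions as one edit), so take this as an illustration of the limit of 2 mentioned above, not as a reproduction of the engine:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


MAX_EDITS = 2  # the limit mentioned in the answer

distance = levenshtein("St. Louis", "Saint Louis")
print(distance, distance <= MAX_EDITS)  # prints: 4 False
```

Even before tokenization and lowercasing come into play, "St. Louis" is four edits away from "Saint Louis", which is why a synonym is the more practical route here.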

ElasticSearch has_parent query

I am experimenting with Elasticsearch parent/child with some simple examples from fun-with-elasticsearch-s-children-and-nested-documents/. I am able to query child elements by running the query in the blog
curl -XPOST localhost:9200/authors/bare_author/_search -d '{
However, I could not adapt the example to a has_parent query. Can someone please point out what I am doing wrong? I keep getting 0 results.
This is what I tried:
#Returns 0 hits
curl -XPOST localhost:9200/authors/book/_search -d '{
"query": {
"has_parent": {
"type": "bare_author",
"query" : {
"filtered": {
"query": { "match_all": {}},
"filter" : {"term": { "name": "Alastair Reynolds"}}
}
}
}
}
}'
#did not work either
curl -XPOST localhost:9200/authors/book/_search -d '{
"query": {
"has_parent" : {
"type" : "bare_author",
"query" : {
"term" : {
"name" : "Alastair Reynolds"
}
}
}
}
}'
This works with match, but it only matches the first name:
#works but matches just first name
curl -XPOST localhost:9200/authors/book/_search -d '{
"query": {
"has_parent" : {
"type" : "bare_author",
"query" : {
"match" : {"name": "Alastair"}
}
}
}
}'
I suppose you are using the default mappings, so the name field is analyzed with the standard analyzer. The term query and term filter, on the other hand, don't analyze their input, so you are searching for the single token Alastair Reynolds, while the index contains alastair and reynolds as two separate, lowercased tokens.
The match query returns results because its input is analyzed, and therefore lowercased, before matching. You can simply change your term query to a match query: it will find matches even with multiple terms, because the input is tokenized on whitespace and turned into a boolean or dismax query over the resulting terms.
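The difference can be mimicked in a few lines of Python. This is only a toy model under simplifying assumptions: analyze stands in for the standard analyzer by lowercasing and splitting on whitespace, and term_query and match_query are made-up helper names, not Elasticsearch APIs:

```python
def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


# Tokens stored in the index for the parent document's name field.
indexed_tokens = set(analyze("Alastair Reynolds"))


def term_query(value):
    """A term query looks the value up as-is, with no analysis."""
    return value in indexed_tokens


def match_query(text):
    """A match query analyzes its input, then ORs the resulting tokens."""
    return any(tok in indexed_tokens for tok in analyze(text))


print(term_query("Alastair Reynolds"))   # the whole string is not a stored token
print(term_query("alastair"))            # a single lowercased token is
print(match_query("Alastair Reynolds"))  # analyzed into alastair, reynolds
```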
