Register and call query in ElasticSearch - elasticsearch

Is it possible to register queries (as the percolator does) and then execute them by name?
I am building an application that lets users save search queries under a label. I would like to store the query generated by the filter in ES.
If I save the query in an index, I have to call ES first to retrieve it, extract the field containing the query, and then call ES again to execute it. Can I do it in one call?
The other solution is to register the queries (the labels) with _percolator, using an identifier that includes the user:
/_percolate/transaction/user1_label1
{
  "userId": "user1",
  "query": {
    "term": { "field1": "foo" }
  }
}
When a new document arrives, I would run the percolator in non-indexing mode (filtered by userId) to find which queries match, add a field "label": ["user1_label1", "user1_label2"] to the document, and finally index it. The labelling is then done at indexing time.
What do you think?
Thanks in advance.

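Try filtered aliases. The filter is stored with the alias and applied to every search that goes through it, so the saved query runs in a single call: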
curl -XPOST 'http://localhost:9200/_aliases' -H 'Content-Type: application/json' -d '
{
  "actions" : [
    {
      "add" : {
        "index" : "the_real_index",
        "alias" : "user1",
        "filter" : { "term" : { "field1" : "foo" } }
      }
    }
  ]
}'
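As a rough illustration of how this could map onto the saved-label use case from the question, the sketch below builds the _aliases request body for one user label and the path you would then search. The helper names (label_alias_action, search_path) and the "<user>_<label>" alias naming scheme are assumptions made for illustration, not part of Elasticsearch. The code only constructs JSON and makes no network calls.

```python
import json


def label_alias_action(index, user_id, label, query_filter):
    """Build an _aliases body that registers one saved query as a filtered alias.

    The alias name "<user_id>_<label>" is a hypothetical naming scheme.
    """
    alias = f"{user_id}_{label}"
    return {
        "actions": [
            {"add": {"index": index, "alias": alias, "filter": query_filter}}
        ]
    }


def search_path(user_id, label):
    """Path to search once the alias exists; ES applies the stored filter."""
    return f"/{user_id}_{label}/_search"


body = label_alias_action("the_real_index", "user1", "label1",
                          {"term": {"field1": "foo"}})
print(json.dumps(body, indent=2))      # send this as the body of POST /_aliases
print(search_path("user1", "label1"))  # then search this path by name
```

With one alias per saved label, executing a saved query is a single search against the alias name, which is the "one call" asked for in the question.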

Related

Is it possible to check that specific data matches the query without loading it to the index?

Imagine that I have a specific data string and a specific query. The simple way to check that the query matches the data is to load the data into the Elastic index and run the online query. But can I do it without putting it into the index?
Maybe there are open-source libraries that implement the Elasticsearch functionality offline, so I can call something like getScore(data, query)? Or is it possible to do this with specific API endpoints?
Thanks in advance!
What you can do is to leverage the percolator type.
What this allows you to do is to store the query instead of the document and then test whether a document would match the stored query.
For instance, you first create an index with a field of type percolator that will contain your query (you also need to add in the mapping any field used by the query so ES knows what their types are):
PUT my_index
{
  "mappings": {
    "properties": {
      "query": {
        "type": "percolator"
      },
      "message": {
        "type": "text"
      }
    }
  }
}
Then you can index a real query, like this:
PUT my_index/_doc/match_value
{
  "query" : {
    "match" : {
      "message" : "bonsai tree"
    }
  }
}
Finally, you can use the percolate query to check whether the query you've just stored would match a given document:
GET /my_index/_search
{
  "query" : {
    "percolate" : {
      "field" : "query",
      "document" : {
        "message" : "A new bonsai tree in the office"
      }
    }
  }
}
So all you need to do is to only store the query (not the documents), and then you can use the percolate query to check if the documents would have been selected by the query you stored, without having to store the documents themselves.
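The check described above could be wrapped in client code roughly as follows. This is a minimal sketch: percolate_request and matching_query_ids are hypothetical helper names, and the sample response is a hand-written stand-in that keeps only the part of a search response the code reads (a real response carries more fields). No request is actually sent.

```python
def percolate_request(document, field="query"):
    """Build the search body that percolates one in-memory document."""
    return {"query": {"percolate": {"field": field, "document": document}}}


def matching_query_ids(search_response):
    """Return the IDs of the stored queries that matched the document."""
    return [hit["_id"] for hit in search_response["hits"]["hits"]]


request = percolate_request({"message": "A new bonsai tree in the office"})

# Hand-written stand-in for the response to GET /my_index/_search.
sample_response = {"hits": {"hits": [{"_id": "match_value"}]}}

print(request)
print(matching_query_ids(sample_response))
```

An empty list from matching_query_ids would mean that none of the stored queries select the document.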

Delete by Query with Sort in Elasticsearch

I want to delete the most recent item in my Elasticsearch index, sorted by myDateField, which is a date type. Is that possible? I want something like the query below, but it would delete all matching items even though I set size to 1.
{
  "query" : {
    "match_all" : {}
  },
  "size" : 1,
  "sort" : [
    {
      "myDateField" : {
        "order" : "desc"
      }
    }
  ]
}
Delete by query does not appear to support sorting.
If you send this request to Delete By Query, you get the error: request does not support [sort]. I couldn't find any documentation stating explicitly that the "sort" parameter is unsupported in delete by query.
Here is one way to do it, though I don't know whether it is the best:
Step 1: Run a normal search with your conditions and sorting, and collect the IDs of the hits.
Step 2: Build a bulk request that deletes, by ID, the documents retrieved in Step 1.
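The two steps could look roughly like this in client code. It is only a sketch: the function names are made up for illustration, the sample hits are hand-written rather than real output, and nothing is sent to a cluster. The first function builds the body for a normal search request, and the second turns the returned hits into a newline-delimited body for the _bulk endpoint.

```python
import json


def newest_docs_search(date_field, size=1):
    """Step 1: search body that returns only the IDs of the newest documents."""
    return {
        "query": {"match_all": {}},
        "size": size,
        "sort": [{date_field: {"order": "desc"}}],
        "_source": False,  # only the IDs are needed, not the documents
    }


def bulk_delete_body(hits):
    """Step 2: NDJSON body for _bulk, one delete action per hit.

    Each line ends with a newline, as the bulk API requires.
    """
    lines = [
        json.dumps({"delete": {"_index": hit["_index"], "_id": hit["_id"]}})
        for hit in hits
    ]
    return "".join(line + "\n" for line in lines)


search_body = newest_docs_search("myDateField")

# Hand-written stand-in for response["hits"]["hits"] from Step 1.
sample_hits = [{"_index": "my_index", "_id": "42"}]
print(bulk_delete_body(sample_hits), end="")
```

Note that the two steps are not atomic: a newer document indexed between the search and the bulk request would not be the one deleted.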

Protecting data in elastic search

I have an Elasticsearch instance running locally with an index that contains data from multiple customers. When a customer makes a query, is there a way to dynamically add the customer ID to the filter criteria so that a customer cannot access other customers' records?
Yes, you can achieve that using filtered aliases. So you'd create one alias per customer like this:
POST /_aliases
{
  "actions" : [
    {
      "add" : {
        "index" : "customer_index",
        "alias" : "customer_1234",
        "filter" : { "term" : { "customer_id" : "1234" } }
      }
    }
  ]
}
Then your customer can simply query the alias customer_1234, and only that customer's data will come back.

Elasticsearch - How to delete a list of documents?

I have an array of _id.
On this page I found out how to retrieve a list of documents from it :
GET ads/_mget
{
  "ids": [
    "586213440e7d2c7f10fe2574",
    "586213440e7d2c7f10fe2575",
    "586213450e7d2c7f10fe2576",
    "586213450e7d2c7f10fe2577"
  ]
}
This works and returns a list of 4 full documents, as expected.
(sidenote)
I find it weird to have to write "ids" in the query, when it actually acts on the "_id" field.
(end sidenote)
Now I can't figure out how to DELETE these documents from the same _id list.
I tried DELETE ads/_mget but I get an error : No handler found for uri [/ads/_mget] and method [DELETE]
I tried _mdelete instead of _mget but it doesn't seem to exist.
I also tried
DELETE ads
{
  "ids": [
    "586213440e7d2c7f10fe2574",
    "586213440e7d2c7f10fe2575",
    "586213450e7d2c7f10fe2576",
    "586213450e7d2c7f10fe2577"
  ]
}
...but this... just deletes EVERYTHING and I have to reindex the database.
You can use the Delete By Query API and supply a payload like:
POST ads/_delete_by_query
{
  "query" : {
    "terms" : {
      "_id" : [
        "586213440e7d2c7f10fe2574",
        "586213440e7d2c7f10fe2575",
        "586213450e7d2c7f10fe2576",
        "586213450e7d2c7f10fe2577"
      ]
    }
  }
}
For more information about the terms query, see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-terms-query.html
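If the ID list comes from application code, the request body can be generated rather than typed by hand. The sketch below is one possible way to do that. It swaps in the ids query, which Elasticsearch provides for matching on _id, in place of the terms query on _id shown above. The function name and the empty-list guard are assumptions added for illustration, the guard being there because of the accidental "deletes EVERYTHING" experience described in the question.

```python
def delete_by_ids_body(doc_ids):
    """Build a _delete_by_query body that targets exactly the given IDs."""
    unique_ids = list(dict.fromkeys(doc_ids))  # drop duplicates, keep order
    if not unique_ids:
        # Never emit a request whose query could be misread as "match all".
        raise ValueError("refusing to build a delete request with no IDs")
    return {"query": {"ids": {"values": unique_ids}}}


body = delete_by_ids_body([
    "586213440e7d2c7f10fe2574",
    "586213440e7d2c7f10fe2575",
    "586213440e7d2c7f10fe2574",  # duplicate, removed by the helper
])
print(body)  # send as the body of POST ads/_delete_by_query
```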

Full-text schema in ElasticSearch

I'm (extremely) new to ElasticSearch so forgive my potentially ridiculous question. I currently use MySQL to perform full-text searches, and want to move this to ElasticSearch. Currently my table has a fulltext index spanning three columns:
title,description,tags
In ES, each document would therefore have title, description and tags fields, allowing me to do a fulltext search for a general phrase, or filter on a given tag.
I also want to add further searchable fields such as username (so I can retrieve posts by a given user). So, how do I specify that a fulltext search should match title OR description OR tags but not username?
From the OR filter example, I'd assume I'd have to use something like this:
{
  "filtered" : {
    "query" : {
      "match_all" : {}
    },
    "filter" : {
      "or" : [
        {
          "term" : { "title" : "foobar" }
        },
        {
          "term" : { "description" : "foobar" }
        },
        {
          "term" : { "tags" : "foobar" }
        }
      ]
    }
  }
}
Coming at this new, it doesn't seem like this is very efficient. Is there a better way of doing this, or do I need to move the username field to a separate index?
This is fine.
In general, I would suggest getting familiar with ElasticSearch mapping types and options:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping.html
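For readers on newer versions, note that the filtered query and the or filter used in the question were removed in Elasticsearch 5.0, and that term clauses do not analyze the search text. One common alternative, offered here as a sketch and not as what the answer above prescribes, is a multi_match query restricted to the three text fields, with an optional term filter on username. The function name is hypothetical, and the filter assumes username is mapped as an exact-value (keyword-style) field.

```python
def fulltext_search_body(text, username=None):
    """Search title, description and tags for text; never search username.

    If username is given, it is applied as an exact-match filter only
    (assumes the field is mapped as an exact-value field).
    """
    fields = ["title", "description", "tags"]
    bool_query = {
        "must": [{"multi_match": {"query": text, "fields": fields}}]
    }
    if username is not None:
        bool_query["filter"] = [{"term": {"username": username}}]
    return {"query": {"bool": bool_query}}


print(fulltext_search_body("foobar"))
print(fulltext_search_body("foobar", username="alice"))
```

Because the full-text clause lists its fields explicitly, username can live in the same index without ever being matched by a general phrase search.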
