Check for id existence in param Array with Elasticsearch custom script field - elasticsearch

Is it possible to add a custom script field that is a Boolean and returns true if the document's id exists in an array that is sent as a param?
Something like this https://gist.github.com/2437370
What would be the correct way to do this with MVEL?
Update:
Having trouble getting it to work as specified in Imotov's answer.
Mapping:
place: {
  properties: {
    _id: { index: "not_analyzed", store: "yes" },
  }
}
Sort:
:sort=>{:_script=>{:script=>"return friends_visits_ids.contains(_fields._id.value)", :type=>"string", :params=>{:friends_visits_ids=>["4f8d425366eaa71471000011"]}, :order=>"asc"}}}
I don't get any errors; the documents just don't get sorted correctly.
Update 2
Oh, and I do get this back on the documents:
"sort"=>["false"]

You were on the right track. If the list of ids is large, it might be more efficient to store it in a map instead of an array.
"sort" : {
  "_script" : {
    "script" : "return friends_visits_ids.containsKey(_fields._id.value)",
    "type" : "string",
    "params": {
      "friends_visits_ids": { "1" : {}, "2" : {}, "4" : {}}
    }
  }
}
Make sure that the _id field is stored. Otherwise _fields._id.value will return null for all records.
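When the ids arrive as a plain list on the client side, the map-shaped param can be built before sending the request. The following is only a sketch: the helper name build_membership_sort is hypothetical, and the script string is copied from the answer above.

```python
def build_membership_sort(ids, order="asc"):
    """Build a script sort clause keyed on whether a document's _id is in `ids`.

    Hypothetical helper: it only assembles the request body as a dict.
    """
    # Using the ids as map keys lets the script call containsKey
    # instead of scanning an array for every document.
    id_map = {str(doc_id): {} for doc_id in ids}
    return {
        "_script": {
            "script": "return friends_visits_ids.containsKey(_fields._id.value)",
            "type": "string",
            "params": {"friends_visits_ids": id_map},
            "order": order,
        }
    }


sort_clause = build_membership_sort(["4f8d425366eaa71471000011"])
```

The resulting dict would go under the "sort" key of the search body.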

Related

Search for empty/present arrays in elasticsearch

I'm currently using Elasticsearch 6.5.4 and I'm trying to query for all docs in an index with an empty array in a specific field. I found that Elasticsearch has an exists query in its DSL that is supposed to cover the empty array case.
The problem is: when I query with must exists, no doc is returned, and when I query with must_not exists, all documents are returned.
Since I can't share the actual mapping for legal reasons, this is the closest I can give you:
{
  "foo_production" : {
    "mappings" : {
      "foo" : {
        "properties" : {
          "bar" : {
            "type" : "text",
            "index" : false
          }
        }
      }
    }
  }
}
And the query I am performing is:
GET foo_production/_search
{
  "query": {
    "bool": {
      "must": {
        "exists": {
          "field": "bar"
        }
      }
    }
  }
}
Can you guys tell me where the problem is?
Note: Upgrading the elasticsearch version is not a viable solution
Enable indexing for the field bar by setting "index" : true. As the documentation puts it:
The index option controls whether field values are indexed. It accepts true or false and defaults to true. Fields that are not indexed are not queryable.
Source : https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-index.html
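As a sketch of the fix, the mapping and both query variants can be written as plain Python dicts. The index name foo_production_v2 and the helper exists_query are hypothetical. The sketch assumes a new index is created and the data reindexed, since the index setting of an existing field generally cannot be changed in place.

```python
# Hypothetical name for a new index created with the corrected mapping.
NEW_INDEX = "foo_production_v2"

# Same mapping as in the question, but with indexing enabled for "bar".
mapping_body = {
    "mappings": {
        "foo": {
            "properties": {
                "bar": {"type": "text", "index": True}
            }
        }
    }
}


def exists_query(field, negate=False):
    """Return a bool query matching docs that have (or lack) a value for `field`."""
    occur = "must_not" if negate else "must"
    return {"query": {"bool": {occur: {"exists": {"field": field}}}}}


has_bar = exists_query("bar")                  # docs with at least one value in bar
missing_bar = exists_query("bar", negate=True)  # docs with no value (or an empty array)
```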

Mapping to limit length of Array datatype in Elasticsearch

I'm trying to create an elasticsearch Mapping which limits the length of an array datatype to x number of items.
mapping = """
{
  "mappings": {
    "document": {
      "properties": {
        "pages": {
          "type": "text"
        }
      }
    }
  }
}
"""
In this case, how do I set the "pages" array to have a maximum of 1,000 list items? Also, is there a way to "ignore" insert errors triggered by ES when this limit has been reached?
Elasticsearch has no such limit; you'd have to enforce it in your application.
As for ignoring errors, look at the ignore_malformed option, which many field types support.
Hope this helps!
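Enforcing the limit in the application could look roughly like the sketch below. The names MAX_PAGES, cap_pages and build_index_action are hypothetical, and the action dict only follows the shape that the Python client's bulk helper accepts; nothing is sent to a cluster here.

```python
MAX_PAGES = 1000  # the limit from the question


def cap_pages(pages, limit=MAX_PAGES):
    """Split `pages` into the part that fits the limit and the overflow."""
    return list(pages[:limit]), list(pages[limit:])


def build_index_action(index, doc_type, pages):
    """Build a bulk-helper style action whose pages array never exceeds the limit."""
    kept, dropped = cap_pages(pages)
    action = {
        "_op_type": "index",
        "_index": index,
        "_type": doc_type,
        "_source": {"pages": kept},
    }
    # The overflow is returned so the caller can log or ignore it,
    # instead of relying on Elasticsearch to raise an error.
    return action, dropped


action, dropped = build_index_action("myIndex", "document", ["p%d" % i for i in range(1005)])
```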
Thanks, Honza!
I had eventually assumed as much. To expand on your answer, here's how I'm inserting/indexing documents now:
data = {
    "_op_type": "index",
    "_index": "myIndex",
    "_type": "document",
    "script": {
        # Append only while the array holds fewer than 1,000 items, so it never exceeds the limit.
        "inline": "if (ctx._source.pages.length < 1000) { ctx._source.pages.add(params.page); }",
        "params": {
            "page": "{}".format(item["page"])
        }
    }
}
I'm using the script field, combined with the "painless" language to check the field length before indexing the document.
Note, I'm using Python Elasticsearch library's bulk helper in the above example, which is why you see the "_op_type" field.

Exists query for objects inside fields

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html says that it is possible to query for documents that have at least one non-null value in the original field.
If the value of the original field is an object, is it possible to query for the existence of a key in the object?
Example: a document is
{
  "user": {
    "name": "XY",
    "passport_id": 1234
  }
}
Can one make an exists query for user.name? I tried
{
  "query": {
    "exists" : { "field" : "user.name" }
  }
}
but it does not give any results.

Elasticsearch: conditionally sort on 2 fields, 1 replaces the other if it exists

Without scripting, I need to sort records based on rating. The system-rating exists for all records, but a user-rating may or may not exist. If a user-rating does exist I want to use that value in the sort instead of the system-rating, for that particular record and only for that record.
Tried looking into the missing setting but it only allows _first, _last or a custom value (that will be used for missing docs as the sort value):
{
  "sort" : [
    { "user_rating" : {"missing" : "_last"} }
  ],
  "query" : {
    "term" : { "meal" : "cabbage" }
  }
}
...but is there a way to specify the custom value should be system_rating when user_rating is missing?
I can do the following:
query_hash[:sort] = []
if user_rating.exist?
  query_hash[:sort] << {
    "user_rating" => {
      "order": sort_direction,
      "unmapped_type": "long",
      "missing": "_last",
    }
  }
end
query_hash[:sort] << {
  "system_rating" => {
    "order": sort_direction,
    "unmapped_type": "long",
  }
}
...but that will always sort user rated records on top regardless of the user_rating value.
I know that scripting will allow me to do it but we cannot use scripting. Is it possible?
The only ways to do this are scripting or building a custom field at indexing time that already contains the value to sort on.
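The index-time approach could be sketched as follows. The field name sort_rating and the helper with_sort_rating are hypothetical; the idea is simply to resolve the fallback once, before the document is indexed, and then sort on the precomputed field.

```python
def with_sort_rating(doc):
    """Return a copy of `doc` with a precomputed sort_rating field.

    sort_rating holds user_rating when present, otherwise system_rating.
    """
    enriched = dict(doc)
    user_rating = doc.get("user_rating")
    # Compare against None so a legitimate rating of 0 is not treated as missing.
    enriched["sort_rating"] = user_rating if user_rating is not None else doc["system_rating"]
    return enriched


# The search then sorts on the single precomputed field, no script needed.
search_body = {
    "sort": [{"sort_rating": {"order": "desc"}}],
    "query": {"term": {"meal": "cabbage"}},
}

docs = [with_sort_rating(d) for d in (
    {"meal": "cabbage", "system_rating": 3},
    {"meal": "cabbage", "system_rating": 3, "user_rating": 5},
    {"meal": "cabbage", "system_rating": 4, "user_rating": 0},
)]
```

The trade-off is that the document has to be reindexed (or updated) whenever a user rating is added or changed.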

Elasticsearch document aliases

I have multiple mappings which come from the same datasource but have small differences, like the example below.
{
  "type_A" : {
    "properties" : {
      "name" : {
        "type" : "string"
      },
      "meta_A" : {
        "type" : "string"
      }
    }
  }
}
{
  "type_B" : {
    "properties" : {
      "name" : {
        "type" : "string"
      },
      "meta_B" : {
        "type" : "string"
      }
    }
  }
}
What I want to be able to do is:
Directly query specific fields (like meta_A)
Directly query all documents from the datasource
Query all documents from a specific mapping
What I was looking into is the type filter, so preferably I could write a query like this:
{
  "query": {
    "filtered" : {
      "filter" : {
        "type" : { "value" : "unified_type" }
      }
    }
    // other query clauses
  }
}
So instead of listing "type_A" and "type_B" in an or clause in the type filter, I would like to have this "unified_type", but without giving up the possibility to directly query "type_A".
How could I achieve this?
I don't think that it's possible. However, you could use the copy_to functionality: your fields would stay as they are now, and their values would also be copied into a unified field.
The copy_to parameter allows you to create custom _all fields. In other words, the values of multiple fields can be copied into a group field, which can then be queried as a single field. For instance, the first_name and last_name fields can be copied to the full_name field.
So you'd be copying both "meta_A" and "meta_B" into some "unified_meta" field and query this one.
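Applied to the two mappings from the question, this might look like the sketch below. The helper mapping_with_unified_meta is hypothetical, and the "string" type simply mirrors the question's mappings.

```python
def mapping_with_unified_meta(type_name, meta_field, unified_field="unified_meta"):
    """Build a type mapping whose meta field is also copied into a shared field."""
    return {
        type_name: {
            "properties": {
                "name": {"type": "string"},
                # The original field stays queryable on its own;
                # copy_to additionally feeds its value into the shared field.
                meta_field: {"type": "string", "copy_to": unified_field},
                unified_field: {"type": "string"},
            }
        }
    }


mapping_a = mapping_with_unified_meta("type_A", "meta_A")
mapping_b = mapping_with_unified_meta("type_B", "meta_B")
```

A query against unified_meta would then cover both types, while meta_A and meta_B remain available for type-specific queries.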
