Elasticsearch - how to append term?

Is there a way to append term into an array of values?
For example if my document looks like this:
{
"items": ["item1", "item2", "item3"]
}
I want to append "item4" and "item5" to it.
Must I do it in two queries: one to load the current list of values, and one to update that list? Or is there a more elegant way that lets me append those items in a single query?
I am trying to do it with elastic4s like this:
client.execute(
  ElasticDsl.update id id in indexName / documentType script {
    script(s"ctx._source.items += tag").params(Map("tag" -> "item4"))
  }
)
In order to use the above snippet I need to enable Groovy scripting, and I am not sure how to do it with multiple items.
Any idea?

Here is a full example of how you could achieve this. Merge the new values into the array and de-duplicate it afterwards:
DELETE test/test/1

POST test/test/1
{
  "terms": ["item1", "item2", "item3"]
}

GET test/test/1

POST test/test/1/_update
{
  "script": "ctx._source.terms << newItems; ctx._source.terms = ctx._source.terms.flatten().unique()",
  "params": {
    "newItems": ["a", "b"]
  }
}
Make sure you have scripting enabled in the server config:
user:/etc/elasticsearch# head elasticsearch.yml
script.inline: true
script.indexed: true
...

Try using a 'terms' filter in your code. If you are using NEST, the following link may be useful: https://nest.azurewebsites.net/nest/writing-queries.html

Related

$elemMatch with $in Spring Data Mongo query

I am in the process of attempting to create a method that will compose a query using Spring Data, and I have a couple of questions. I am trying to perform a query using top-level attributes of a document (i.e. the id field) as well as attributes of a subarray.
To do so I am using a query similar to this:
db.getCollection("journeys").find({ "_id._id": "0104", "journeyDates": { $elemMatch: { "period": { $in: [ 1,2 ] } } } })
As you can see, I would also like to filter using $in for the values of the subarray. Running the above query, though, returns wrong results, as if the $elemMatch were ignored completely.
Running a similar but slightly different query like this:
db.getCollection("journeys").find({ "_id._id": { $in: [ "0104" ] } }, { journeyDates: { $elemMatch: { period: { $in: [ 1, 2 ] } } } })
does seem to yield better results, but it returns only the first found element matching the $in of the subarray filter.
Now my question is, how can I query using both top-level attributes as well as subarrays using $in? Preferably I would like to avoid aggregations. Secondly, how can I translate this native Mongo query to a Spring Data Query object?
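A minimal sketch of how the first native query might be expressed as a Spring Data Query object (the Journey entity and mongoTemplate names are hypothetical; collection and field names follow the question):

import java.util.Arrays;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

// Sketch only: combines the top-level "_id._id" criterion with an
// $elemMatch on the journeyDates subarray, using $in for period.
Query query = new Query(
    Criteria.where("_id._id").is("0104")
            .and("journeyDates").elemMatch(
                Criteria.where("period").in(Arrays.asList(1, 2))
            )
);
// List<Journey> journeys = mongoTemplate.find(query, Journey.class);

As in the native query, the $elemMatch here is part of the filter, so it selects whole documents; limiting the returned array to only the matching elements would require a projection or an aggregation.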

Set hint for update to use indexes

As per the documentation, it is possible to provide a hint to an update.
I'm using the Java mongo client and MongoCollection to do an update.
For this update I cannot find any way to provide a hint about which index to use.
I can see in the logs that the update is doing a COLSCAN, so I want to provide the hint.
this.collection.updateOne(
    or(eq("_id", "someId"), eq("array1.id", "someId")),
    and(
        addToSet("array1", new Document()),
        addToSet("array2", new Document())
    )
);
Indexes are available for both _id and array1.id
I found out in the logs the query for this update is using a COLSCAN to find the document.
Can anyone point me in the right direction?
I'm using AWS DocumentDB, which is compatible with MongoDB v3.6.
Let's consider a document with an array of embedded documents:
{ _id: 1, arr: [ { fld1: "x", fld2: 43 }, { fld1: "r", fld2: 80 } ] }
I created an index on arr.fld1; this is a multikey index (as indexes on array fields are called). The _id field already has the default unique index.
The following query uses the indexes on both fields - arr.fld1 and the _id. The query plan generated using explain() on the query showed an index scan (IXSCAN) for both fields.
db.test.find( { $or: [ { _id: 2 }, { "arr.fld1": "m" } ] } )
Now the same query filter is used for the update operation also. So, the update where we add two sub-documents to the array:
db.test.update(
{ $or: [ { _id: 1 }, { "arr.fld1": "m" } ] },
{ $addToSet: { arr: { $each: [ { "fld1": "xx" }, { "fld1": "zz" } ] } } }
)
Again, the query plan showed that both indexes are used for the update operation. Note that I have not used a hint for either the find or the update query.
I cannot come to a conclusion about what the issue is with your code or indexes (see Note 1 below).
NOTES:
1. The above observations are based on queries run on a MongoDB server version 4.0 (they are valid for version 3.6 also, as far as I know).
2. The explain method is used as follows for find and update: db.collection.explain().find( ... ) and db.collection.explain().update( ... ). Note that you cannot generate a query plan using explain() for the updateOne method; it is only available for the findAndModify() and update() methods. You can get a list of methods that can generate a query plan by running db.collection.explain().help() in the mongo shell.
Note on Java code:
The Java code to update an array field by adding multiple sub-documents is as follows:
collection.updateOne(
    or(eq("_id", 1), eq("arr.fld1", "m")),
    addEachToSet("arr", Arrays.asList(new Document("fld1", "value-1"), new Document("fld1", "value-2")))
);
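If you are able to move past the 3.6 compatibility level, newer drivers do let you attach a hint to an update. A sketch, assuming a recent Java driver (3.12+ or 4.x) and a server that honors update hints (MongoDB 4.2+; DocumentDB at the 3.6 level will likely reject it):

import org.bson.Document;
import com.mongodb.client.model.UpdateOptions;
import static com.mongodb.client.model.Filters.*;
import static com.mongodb.client.model.Updates.*;

// Sketch only: UpdateOptions.hint(...) is a newer-driver feature and is not
// available on MongoDB 3.6 / DocumentDB 3.6 compatibility. Updates.combine
// merges the two $addToSet operations into a single update document.
collection.updateOne(
    or(eq("_id", "someId"), eq("array1.id", "someId")),
    combine(
        addToSet("array1", new Document()),
        addToSet("array2", new Document())
    ),
    new UpdateOptions().hint(new Document("array1.id", 1))
);

On servers that do not support update hints, the practical alternative is to make the filter index-friendly (as in the example above) and verify the plan with explain() on an equivalent find().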

Correct syntax for adding a filter aggregation in a Kibana visualization as a JSON input (filtering for a specific property value)

I am trying to perform the simplest filter for a specific property value, as a JSON input, in a Kibana visualization, so far without success.
To my surprise, I can't find concrete examples of doing that (I have been searching for a couple of minutes now).
Say we have a document with the following structure:
{
  "a": true,
  "b": 10
}
How can I add a filter aggregation for all documents with a = true?
I tried using the "script", "query", and "filters" APIs, but all give me parse errors. My filter JSONs are all valid; my problem is with the exact syntax Elasticsearch is expecting. All the examples I found out there and tried give me parsing errors (after making the amendments for my index structure).
Kibana's version: 6.4.3
How is this accomplished?
An example:
POST /sales/_search?size=0
{
  "aggs": {
    "docs": {
      "filter": { "term": { "a": "true" } }
    }
  }
}
Here is the link to the official documentation with example.

Proper groovy script for sum of fields in Elasticsearch documents

This question is a followup to this question.
If my documents look like this:
{
  "documentid": 1,
  "documentStats": [ {"foo_1_1": 1}, {"foo_2_1": 5}, {"boo_1_1": 3} ]
}
what would be the correct Groovy script to use in a script_field to return, per document, the sum of all documentStats values whose keys match a particular pattern (e.g., contain _1_)?
Similar to the referred question, there's a one-liner that does the same thing with your new structure:
{
"query" : {
...
},
"script_fields" : {
"sum" : {
"script" : "_source.documentStats.findAll{ it.keySet()[0] =~'_1_' }.collect{it.values()}.flatten().sum()"
}
}
}
I don't know ES, but in pure Groovy you would do:
document.documentStats.collectMany { Map entry ->
// assumes each entry has a single key and a single int value
def item = entry.entrySet()[0]
item.key.contains('_1_') ? [item.value] : []
}.sum()
Hope this helps.

How would you implement these queries efficiently in MongoDB?

Links have one or more tags, so at first it might seem natural to embed the tags:
link = { title: 'How would you implement these queries efficiently in MongoDB?',
         url: 'http://stackoverflow.com/questions/3720972',
         tags: ['ruby', 'mongodb', 'database-schema', 'database-design', 'nosql'] }
How would these queries be implemented efficiently?
Get links that contain one or more given tags (for searching links with given tags)
Get a list of all tags without repetition (for search box auto-completion)
Get the most popular tags (to display top 10 tags or a tag cloud)
The idea to represent the link as above is based on the MongoNY presentation, slide 38.
Get links that contain "value" tag:
db.col.find({tags: "value"});
Get links that contain "val1", "val2" tags:
db.col.find({tags: { $all : [ "val1", "val2" ] }});
Get list of all tags without repetition:
db.col.distinct("tags");
Get the most popular tags: this isn't something that can be queried directly on the existing collection. What you need to do is add a popularity field, update it whenever a query fetches the document, and then do a query sorted on the popularity field.
Update: proposed solution for popularity feature.
Try adding the following collection, let's call it tags.
doc = { tag: String, pop: Integer }
Now, once you do a query, you collect all the tags that were shown (this can be aggregated and done asynchronously). Let's say you end up with the following tags: "tag1", "tag2", "tag3".
You then call the update method and increment the pop field value:
db.tags.update({tag: { $in: ["tag1", "tag2", "tag3"] }}, { $inc: { pop: 1 }});
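A small sketch of the same scheme using the MongoDB Java driver (the database name "mydb" is a placeholder; the tags collection and its tag/pop fields follow the proposal above):

import java.util.Arrays;
import org.bson.Document;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import static com.mongodb.client.model.Filters.in;
import static com.mongodb.client.model.Updates.inc;
import static com.mongodb.client.model.Sorts.descending;

// Sketch only: assumes the "tags" collection already holds one
// { tag: ..., pop: ... } document per tag.
MongoCollection<Document> tags = MongoClients.create()
        .getDatabase("mydb").getCollection("tags");

// Increment the counters for the tags that were just shown.
tags.updateMany(in("tag", Arrays.asList("tag1", "tag2", "tag3")), inc("pop", 1));

// Read back the 10 most popular tags for the top-10 list or tag cloud.
for (Document d : tags.find().sort(descending("pop")).limit(10)) {
    System.out.println(d.getString("tag") + " -> " + d.getInteger("pop"));
}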
You can also use $addToSet instead of $push to change your tag array. This doesn't modify the document when the tag already exists.
This will be a bit more efficient if you modify your tags frequently (as the documents won't grow that much).
Here is an example:
> db.tst_tags.remove()
> db.tst_tags.update({'name':'test'},{'$addToSet':{'tags':'tag1'}}, true)
> db.tst_tags.update({'name':'test'},{'$addToSet':{'tags':'tag1'}}, true)
> db.tst_tags.update({'name':'test'},{'$addToSet':{'tags':'tag2'}}, true)
> db.tst_tags.update({'name':'test'},{'$addToSet':{'tags':'tag2'}}, true)
> db.tst_tags.update({'name':'test'},{'$addToSet':{'tags':'tag3'}}, true)
> db.tst_tags.find()
{ "_id" : ObjectId("4ce244548736000000003c6f"), "name" : "test",
"tags" : [ "tag1", "tag2", "tag3" ] }
