I am trying to write a method that composes a query using Spring Data, and I have a couple of questions. I want to query on top-level attributes of a document (e.g. the id field) as well as on attributes of a subarray.
To do so I am using a query similar to this:
db.getCollection("journeys").find({ "_id._id": "0104", "journeyDates": { $elemMatch: { "period": { $in: [ 1,2 ] } } } })
As you can see, I would also like to filter the subarray values using $in. However, running the query above gives wrong results, as if the $elemMatch were ignored completely.
Running a similar but slightly different query like this:
db.getCollection("journeys").find({ "_id._id": { $in: [ "0104" ] } }, { journeyDates: { $elemMatch: { period: { $in: [ 1, 2 ] } } } })
does seem to yield better results, but it returns only the first element of the subarray that matches the $in filter.
Now my question is: how can I query using both top-level attributes and subarrays using $in? Preferably I would like to avoid aggregations. Secondly, how can I translate this native Mongo query to a Spring Data Query object?
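The difference between the two queries above can be modelled without a database. The following is a minimal plain-Python sketch, not MongoDB or Spring Data code; the sample documents, the simplified flat _id and the helper names are assumptions made for illustration. It models $elemMatch in the filter position (whole documents are kept when any array element matches) and in the projection position (only the first matching element is returned).

```python
# Plain-Python model of $elemMatch semantics (no MongoDB driver involved).
# Sample documents and helper names are hypothetical.

def elem_match_filter(docs, array_field, predicate):
    """Model of $elemMatch in the query filter: keep a document, whole,
    if at least one element of the array satisfies the predicate."""
    return [d for d in docs if any(predicate(e) for e in d.get(array_field, []))]

def elem_match_projection(doc, array_field, predicate):
    """Model of $elemMatch in the projection: return _id plus only the
    first matching array element of an already selected document."""
    first = next((e for e in doc.get(array_field, []) if predicate(e)), None)
    projected = {"_id": doc["_id"]}
    if first is not None:
        projected[array_field] = [first]
    return projected

def period_in_1_2(element):
    # Stands in for { period: { $in: [1, 2] } }
    return element.get("period") in (1, 2)

journeys = [
    {"_id": "0104", "journeyDates": [{"period": 1}, {"period": 2}, {"period": 3}]},
    {"_id": "0200", "journeyDates": [{"period": 5}]},
]

filtered = elem_match_filter(journeys, "journeyDates", period_in_1_2)
projected = elem_match_projection(journeys[0], "journeyDates", period_in_1_2)
```

In this model the filter form selects document "0104" but leaves its journeyDates array untouched, including the element with period 3, which may be why the $elemMatch looks as if it were ignored. The projection form trims the array, but only down to the first matching element.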
If I have documents defined as
[
{
"title": "title",
"tags": [ "cool", "amazing", "funny" ]
},
{
"title": "another title",
"tags": [ "nice", "amazing", "funny" ]
}
]
I'd like to be able to query with MongoTemplate, passing a list of values like ["cool","amazing"], and get back the first document above, not the second.
For what I want to achieve, the $in condition doesn't seem to be enough.
I tried the $all condition, and from the Mongo console it works as I need, but in my code something isn't working and the query takes forever to run. With the $in operator my code runs fast.
My method in my repository (for other reasons, I have to do an aggregation as below):
public Page<MyDocument> findByProperties(String title, List<ObjectId> tags, Pageable page) {
final List<Criteria> criteria = new ArrayList<>();
Aggregation aggregation = null;
if (title != null && !title.isEmpty()) {
criteria.add(Criteria.where("title").is(title));
}
if (tags != null && !tags.isEmpty()) {
criteria.add(Criteria.where("MyBook.tags").all(tags));
}
if (!criteria.isEmpty()) {
aggregation = Aggregation.newAggregation(
Aggregation.lookup("from_collection", "_id", "idParent", "MyBook"),
Aggregation.unwind("MyBook"),
Aggregation.match(new Criteria().andOperator(criteria.toArray(new Criteria[0]))),
Aggregation.skip(page.getOffset()),
Aggregation.limit(page.getPageSize()),
Aggregation.sort(page.getSort())
);
} else {
aggregation = Aggregation.newAggregation(
Aggregation.lookup("from_collection", "_id", "idParent", "MyBook"),
Aggregation.unwind("MyBook"),
Aggregation.skip(page.getOffset()),
Aggregation.limit(page.getPageSize()),
Aggregation.sort(page.getSort())
);
}
List<MyDocument> results = mongoTemplate.aggregate(aggregation, "my_document", MyDocument.class).getMappedResults();
return PageableExecutionUtils.getPage(results, page,
() -> (long)results.size());
}
Looking at this answer, I tried with
criteria.add(Criteria.where("MyBook.tags").in(tags).all(tags));
But nothing changed: the query still takes forever and does not return the expected output.
Any ideas please? Thank you!
To check whether the tags array field contains all the members of your array, you can use the first query below; if you need it in an aggregation pipeline, use the second.
Your query has more stages and more code. If you want to ask about those, please give example data and the expected output in JSON, and say what you want to do, so people can help.
Query1
find using $all operator
db.collection.find({"tags": {"$all": ["cool","amazing"]}})
Query2
aggregation solution with set difference
db.collection.aggregate(
[{"$match":
{"$expr":
{"$eq": [{"$setDifference": [["cool", "amazing"], "$tags"]}, []]}}}])
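To see why $in is not enough here and why the two queries above select the same documents, the three conditions can be modelled in plain Python. This is only a sketch of the operator semantics, assuming the two sample documents from the question; the function names are made up for illustration and no MongoDB driver is involved.

```python
# Plain-Python model of $in, $all and the $setDifference check on an array field.
docs = [
    {"title": "title", "tags": ["cool", "amazing", "funny"]},
    {"title": "another title", "tags": ["nice", "amazing", "funny"]},
]
wanted = ["cool", "amazing"]

def matches_in(tags, wanted):
    """$in: at least one wanted value is present in tags."""
    return any(t in wanted for t in tags)

def matches_all(tags, wanted):
    """$all: every wanted value is present in tags."""
    return all(w in tags for w in wanted)

def matches_set_difference(tags, wanted):
    """$setDifference: [wanted, "$tags"] keeps the wanted values that are
    missing from tags; the document matches when nothing is missing."""
    missing = [w for w in wanted if w not in tags]
    return missing == []

in_titles = [d["title"] for d in docs if matches_in(d["tags"], wanted)]
all_titles = [d["title"] for d in docs if matches_all(d["tags"], wanted)]
diff_titles = [d["title"] for d in docs if matches_set_difference(d["tags"], wanted)]
```

In this model the $in condition keeps both documents because each contains "amazing", while the $all and set-difference conditions keep only the first one.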
According to the documentation, it is possible to provide a hint to an update.
I'm using the Java Mongo client and a Mongo collection to do an update.
For this update I cannot find any way to provide a hint for which index to use.
The logs show that the update does a COLLSCAN, so I want to provide the hint.
this.collection.updateOne(
or(eq("_id", "someId"), eq("array1.id", "someId")),
and(
addToSet("array1", new Document()),
addToSet("array2", new Document())
)
);
Indexes are available for both _id and array1.id, yet the logs show that the query for this update uses a COLLSCAN to find the document.
Can anyone point me in the right direction?
I'm using AWS DocumentDB, which is compatible with MongoDB v3.6.
Let's consider a document with an array of embedded documents:
{ _id: 1, arr: [ { fld1: "x", fld2: 43 }, { fld1: "r", fld2: 80 } ] }
I created an index on arr.fld1; this is a multikey index (the name for an index on an array field). The _id field already has the default unique index.
The following query uses the indexes on both fields - arr.fld1 and the _id. The query plan generated using explain() on the query showed an index scan (IXSCAN) for both fields.
db.test.find( { $or: [ { _id: 2 }, { "arr.fld1": "m" } ] } )
Now the same query filter is used for the update operation. Here is an update that adds two sub-documents to the array:
db.test.update(
{ $or: [ { _id: 1 }, { "arr.fld1": "m" } ] },
{ $addToSet: { arr: { $each: [ { "fld1": "xx" }, { "fld1": "zz" } ] } } }
)
Again, the query plan showed that both indexes are used for the update operation. Note that I did not use a hint for either the find or the update query.
I cannot come to a conclusion about what the issue is with your code or indexes (see note 1 below).
NOTES:
1. The above observations are based on queries run on MongoDB server version 4.0 (valid for version 3.6 as well, as far as I know).
2. The explain method is used as follows for find and update: db.collection.explain().find( ... ) and db.collection.explain().update( ... ). Note that you cannot generate a query plan using explain() for the updateOne method; it is only available for the findAndModify() and update() methods. You can get a list of the methods that can generate a query plan by running this command in the mongo shell: db.collection.explain().help().
Note on Java Code:
The Java code to update an array field by adding multiple sub-documents is as follows:
// Assumes static imports of Filters.or, Filters.eq and Updates.addEachToSet
// from com.mongodb.client.model.
collection.updateOne(
    or(eq("_id", 1), eq("arr.fld1", "m")),
    addEachToSet("arr", Arrays.asList(new Document("fld1", "value-1"), new Document("fld1", "value-2")))
);
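For readers unsure what $addToSet with $each (addEachToSet in the Java driver) does to the array, here is a rough plain-Python model. It is an illustration only: the function name is hypothetical, and Python dict equality ignores key order, which is a simplification compared with how the server compares embedded documents.

```python
# Plain-Python model of $addToSet with $each on an array of sub-documents.
def add_each_to_set(array, new_items):
    """Return a copy of array with every new item appended,
    skipping items that are already present."""
    result = list(array)
    for item in new_items:
        if item not in result:
            result.append(item)
    return result

arr = [{"fld1": "x", "fld2": 43}, {"fld1": "r", "fld2": 80}]
to_add = [{"fld1": "xx"}, {"fld1": "zz"}]

once = add_each_to_set(arr, to_add)
twice = add_each_to_set(once, to_add)  # a second run adds nothing new
```

Running the same update twice leaves the array unchanged the second time, which is the point of using $addToSet rather than $push.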
Say I have an index called blog which has 10 documents of type article. Each article is a JSON document, and one of its properties is views, which is initialized to 0.
I was wondering if there's a good way of updating the views counter every time the document is explicitly fetched via the _search endpoint using its document id, so that I can sort by views in my other queries.
Or is that something that has to be taken care of at the application layer?
My feeble attempt at the query DSL so far:
let options = {
  index: 'blog',
  body: {
    query: {
      function_score: {
        query: {
          match: { _id: req.params.articleID }
        },
        weight: 2,
        score_mode: "sum",
        script_score: {
          script: {
            // the counter field is named "views" in the document
            inline: "(2 + doc['views'].value)"
          }
        }
      }
    }
  }
};
I have been trying an inline script, but that would require me to send two separate requests: first a search, then an update if the document is found. I was wondering if I could do it in a single query, i.e. have the views counter increase by one automatically every time I query via _search.
So, since I'm obviously too dumb to figure this out myself, I'll ask you better folks here on SO instead.
Basically I have a data structure that looks like the following:
....,
{
"id": 12345
....
"policy_subjects": [
{
"compiled": "^(user|max|anonymous)$",
"template": "<user|max|anonymous>"
},
{
"compiled": "^max$",
"template": "max"
}
]
....
}
compiled is a "compiled" regex
template is the same regex without regex-modifiers
What I want is to do a simple query in RethinkDB using the "compiled" value and matching that against a string, say "max".
Basically
r.table("regex_policies").filter(function(policy_row) {
return "max".match("(?i)"+policy_row("policy_subjects")("compiled"))
})
is what I want to do (plus case-insensitive search).
There are of course lots of policy_subjects in the database, so in this example the result should be the whole record (1 result) that matches "max", since "max" matches both regexes in this case (one match would have been enough).
"foobar" would likewise yield 0 results in this example, since none of the compiled regexes matches "foobar".
Does anyone know how to do this relatively simple query?
You definitely want to use r.expr here and I got this example to work:
r.expr([{
"id": 12345,
"policy_subjects": [
{
compiled: "^(user|max|anonymous)$",
template: "<user|max|anonymous>"
},
{
compiled: "^max$",
template: "max"
}
]
}]).merge(function(policy_row) {
return {
"policy_subjects": policy_row("policy_subjects").filter(function(item){
return r.expr("max").match(r.expr("(?i)").add(item("compiled"))).ne(null);
})
}
})
Changing max to something else that does not match returns the document with no elements inside policy_subjects.
For example, changing max to wat (my favorite test string of all time) looks like this:
.merge(function(policy_row) {
return {
"policy_subjects": policy_row("policy_subjects").filter(function(item){
return r.expr("wat").match(r.expr("(?i)").add(item("compiled"))).ne(null);
})
}
})
And results in this:
[
{
"id": 12345 ,
"policy_subjects": [ ]
}
]
I think your logic for reducing to the one policy_subject document you want might be specific to your use case, so I'm not sure what the right answer is, but you can use .reduce(...) to return just the right-most value.
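The matching rule the question asks for (keep the whole row if any compiled regex matches the string, case-insensitively) can be sketched outside RethinkDB with Python's re module. This is only a model of the logic, with hypothetical function names and the sample row from the question; it is not ReQL, and Python's regex dialect is assumed to agree with RethinkDB's for these simple patterns.

```python
import re

# Sample row from the question; the helper names below are hypothetical.
policies = [
    {
        "id": 12345,
        "policy_subjects": [
            {"compiled": "^(user|max|anonymous)$", "template": "<user|max|anonymous>"},
            {"compiled": "^max$", "template": "max"},
        ],
    }
]

def subject_matches(subject, text):
    """True if the compiled regex matches text, ignoring case."""
    return re.search("(?i)" + subject["compiled"], text) is not None

def matching_policies(rows, text):
    """Keep a row, whole, if at least one of its policy_subjects matches."""
    return [
        row for row in rows
        if any(subject_matches(s, text) for s in row["policy_subjects"])
    ]

max_hits = matching_policies(policies, "max")
upper_hits = matching_policies(policies, "MAX")
foobar_hits = matching_policies(policies, "foobar")
```

In ReQL the same "any element matches" test would presumably go inside a filter on the table rather than a merge, but that variant is not tested here.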
I’m playing around with ElasticSearch in combination with NEST in my C# project. My use case includes several indices with different document types, which I have queried separately so far. Now I want to implement a global search function that queries all existing indices and document types and scores the results properly.
So my question: How do I accomplish that by using NEST?
Currently I’m using the function SetDefaultIndex but how can I define multiple indices?
For a better understanding, this is the query I want to build with NEST:
{
"query": {
"indices": {
"indices": [
"INDEX_A",
"INDEX_B"
],
"query": {
"term": {
"FIELD": "VALUE"
}
},
"no_match_query": {
"term": {
"FIELD": "VALUE"
}
}
}
}
}
TIA
You can explicitly tell NEST to use multiple indices:
client.Search<MyObject>(s=>s
.Indices(new [] {"Index_A", "Index_B"})
...
)
If you want to search across all indices
client.Search<MyObject>(s=>s
.AllIndices()
...
)
Or if you want to search one index (that's not the default index)
client.Search<MyObject>(s=>s
.Index("Index_A")
...
)
Remember that since Elasticsearch 0.19.8 you can also specify wildcards in index names
client.Search<MyObject>(s=>s
.Index("Index_*")
...
)
As for your indices_query
client.Search<MyObject>(s=>s
.AllIndices()
.Query(q=>q
.Indices(i=>i
.Indices(new [] { "INDEX_A", "INDEX_B"})
.Query(iq=>iq.Term("FIELD","VALUE"))
.NoMatchQuery(iq=>iq.Term("FIELD", "VALUE"))
)
)
);
UPDATE
These tests show off how you can make C#'s covariance work for you:
https://github.com/Mpdreamz/NEST/blob/master/src/Nest.Tests.Integration/Search/SubClassSupport/SubClassSupportTests.cs
In your case, if the types do not all derive from a shared base class, you can still use 'object'
i.e:
.Search<object>(s=>s
.Types(typeof(Product),typeof(Category),typeof(Manufacturer))
.Query(...)
);
This will search on /yourdefaultindex/products,categories,manufacturers/_search and set up a default ConcreteTypeSelector that understands what type each returned document is.
Using ConcreteTypeSelector(Func<dynamic, Hit<dynamic>, Type>) you can manually return a type based on some json value (on dynamic) or on the hit metadata.