RethinkDB - Filter using Match() by value in same dataset and table - go

So, since I'm obviously too dumb to figure this out myself, I'll ask you better folks here on SO instead.
Basically, I have a data structure that looks like the following:
....,
{
"id": 12345,
....
"policy_subjects": [
{
"compiled": "^(user|max|anonymous)$",
"template": "<user|max|anonymous>"
},
{
"compiled": "^max$",
"template": "max"
}
]
....
}
compiled is a "compiled" regex
template is the same regex without regex-modifiers
What I want is to do a simple query in RethinkDB using the "compiled" value and matching that against a string, say "max".
Basically
r.table("regex_policies").filter(function(policy_row) {
return "max".match("(?i)"+policy_row("policy_subjects")("compiled"))
})
is what I want to do (plus a case-insensitive search).
There are of course lots of policy_subjects in the database, so in this example the result should be the whole document (1 result) that matches "max". Here "max" matches both regexes, although one match would have been enough.
"foobar" would likewise yield 0 results in this example, since none of the compiled regexes match "foobar".
Does anyone know how to do this relatively simple query?

You definitely want to use r.expr here and I got this example to work:
r.expr([{
"id": 12345,
"policy_subjects": [
{
compiled: "^(user|max|anonymous)$",
template: "<user|max|anonymous>"
},
{
compiled: "^max$",
template: "max"
}
]
}]).merge(function(policy_row) {
return {
"policy_subjects": policy_row("policy_subjects").filter(function(item){
return r.expr("max").match(r.expr("(?i)").add(item("compiled"))).ne(null);
})
}
})
Changing max to something that does not match returns the document with no elements inside policy_subjects.
For example, changing max to wat (my favorite test string of all time) looks like this:
.merge(function(policy_row) {
return {
"policy_subjects": policy_row("policy_subjects").filter(function(item){
return r.expr("wat").match(r.expr("(?i)").add(item("compiled"))).ne(null);
})
}
})
And results in this:
[
{
"id": 12345 ,
"policy_subjects": [ ]
}
]
How you reduce to the one policy_subject you want probably depends on your use case, so I'm not sure what the right answer is, but you can use .reduce(...) to return just the right-most value.
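If the goal is to keep or drop whole documents rather than trim policy_subjects, the test becomes "does at least one compiled regex match the input". Here is a minimal sketch of that logic in plain JavaScript, outside RethinkDB. The function name matchingPolicies and the sample data are made up for illustration, and the RegExp "i" flag stands in for the (?i) prefix, which JavaScript's own regex engine rejects at the start of a pattern. In ReQL the same idea would presumably be a filter whose predicate calls contains on policy_subjects with the match(...).ne(null) test from above, but that variant is untested here.

```javascript
// Sample rows shaped like the documents in the question.
const policies = [
  {
    id: 12345,
    policy_subjects: [
      { compiled: "^(user|max|anonymous)$", template: "<user|max|anonymous>" },
      { compiled: "^max$", template: "max" },
    ],
  },
];

// Keep a row when at least one of its compiled regexes matches the subject,
// ignoring case. One match is enough; extra matches change nothing.
function matchingPolicies(rows, subject) {
  return rows.filter((row) =>
    row.policy_subjects.some((s) => new RegExp(s.compiled, "i").test(subject))
  );
}

console.log(matchingPolicies(policies, "max").length);    // 1
console.log(matchingPolicies(policies, "foobar").length); // 0
```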

Related

$elemMatch with $in SpringData Mongo Query

I am attempting to create a method that composes a query using Spring Data, and I have a couple of questions. I am trying to query using top-level attributes of a document (i.e. the id field) as well as attributes of a subarray.
To do so I am using a query similar to this:
db.getCollection("journeys").find({ "_id._id": "0104", "journeyDates": { $elemMatch: { "period": { $in: [ 1,2 ] } } } })
As you can see, I would also like to filter the values of the subarray using $in. Running the above query, though, returns wrong results, as if the $elemMatch were ignored completely.
Running a similar but slightly different query like this:
db.getCollection("journeys").find({ "_id._id": { $in: [ "0104" ] } }, { journeyDates: { $elemMatch: { period: { $in: [ 1, 2 ] } } } })
does seem to yield better results, but it returns only the first element matching the $in of the subarray filter.
Now my question is: how can I query using both top-level attributes and subarrays using $in? Preferably, I would like to avoid aggregations. Secondly, how can I translate this native Mongo query into a Spring Data Query object?
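The difference between the two queries most likely comes down to where $elemMatch sits. As a query condition it only selects documents and returns journeyDates whole, which can look as if it were ignored. As a projection it trims the array to the first matching element. Below is a rough plain-JavaScript model of both behaviours, plus the "keep every matching element" result the question is after. The sample data and all helper names are hypothetical; this is not the MongoDB or Spring Data API.

```javascript
// Made-up documents in the shape the question implies.
const journeys = [
  { _id: { _id: "0104" }, journeyDates: [{ period: 1 }, { period: 2 }, { period: 3 }] },
  { _id: { _id: "0105" }, journeyDates: [{ period: 3 }] },
];

const periodIn = (allowed) => (d) => allowed.includes(d.period);

// Query-style $elemMatch: selects documents that have at least one matching
// element, but returns the array untouched.
function findWithElemMatch(docs, id, allowed) {
  return docs.filter(
    (doc) => doc._id._id === id && doc.journeyDates.some(periodIn(allowed))
  );
}

// Projection-style $elemMatch: keeps only the first matching element and
// omits the field when nothing matches.
function projectFirstMatch(doc, allowed) {
  const first = doc.journeyDates.find(periodIn(allowed));
  return first ? { _id: doc._id, journeyDates: [first] } : { _id: doc._id };
}

// What the question wants: every matching element, done here on the client.
function keepAllMatches(doc, allowed) {
  return { ...doc, journeyDates: doc.journeyDates.filter(periodIn(allowed)) };
}

const [found] = findWithElemMatch(journeys, "0104", [1, 2]);
console.log(found.journeyDates.length);                            // 3
console.log(projectFirstMatch(found, [1, 2]).journeyDates.length); // 1
console.log(keepAllMatches(found, [1, 2]).journeyDates.length);    // 2
```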

Terms query not returning results for list of strings

I have this Elasticsearch query, which fails to return the desired results for terms.letter_score. I'm certain there are matching documents in the index. The same query without letter_score returns the expected filtered results, but with letter_score it returns nothing. The only difference, as far as I can tell, is that the cat_id values are a list of integers while the letter_score values are strings. Any ideas what the issue could be? I'm basically trying to match ANY value from the letter_score list.
Thanks
{
"size": 10,
"query": {
"bool": {
"filter": [
{
"terms": {
"cat_id": [
1,
2,
4
]
}
},
{
"terms": {
"letter_score": [
"A",
"B",
"E"
]
}
}
]
}
}
}
It sounds like your letter_score field is of type text and has therefore been analyzed, so the tokens A, B and E have been stored as a, b and e. The terms query does no analysis of its own, so it won't match them.
Also, if that's the case, the token a may have been dropped at indexing time if the field's analyzer removes stop words. Note that the standard analyzer (the default) only does this when it is configured with a stop-word list.
A first approach is to use a match query instead of terms, like this:
{
"match": {
"letter_score": "A B E"
}
}
If that still doesn't work, I suggest changing the mapping of your letter_score field to keyword (this requires reindexing your data); your query will then work as it is now.
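To see why the terms filter comes back empty, it helps to model what ends up in the index. The sketch below is a toy stand-in, not Elasticsearch code: analyzeText lowercases and splits on whitespace roughly the way an analyzed text field would, analyzeKeyword stores the value unchanged, and all four function names are invented for this example.

```javascript
// Toy model: a text field is lowercased and tokenized at index time.
const analyzeText = (value) => value.toLowerCase().split(/\s+/).filter(Boolean);
// A keyword field is stored as-is.
const analyzeKeyword = (value) => [value];

// terms query: exact comparison against the stored tokens, no analysis.
const termsMatch = (storedTokens, wanted) =>
  wanted.some((term) => storedTokens.includes(term));

// match query: the query text is analyzed too, so case no longer matters.
const matchQuery = (storedTokens, queryText) =>
  analyzeText(queryText).some((token) => storedTokens.includes(token));

const asText = analyzeText("A");       // ["a"]
const asKeyword = analyzeKeyword("A"); // ["A"]

console.log(termsMatch(asText, ["A", "B", "E"]));    // false
console.log(matchQuery(asText, "A B E"));            // true
console.log(termsMatch(asKeyword, ["A", "B", "E"])); // true
```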

couchDB- complex query on a view

I am using cloudantDB and want to query a view which looks like this
function (doc) {
  if (doc.name !== undefined) {
    emit([doc.name, doc.age], doc);
  }
}
What would be the correct way to get a result if I have a list of names (for which I would use the keys=[] option) and an age range (for which startkey and endkey should be used)?
Example: I want to get persons named "john", "mark", "joseph" or "santosh" whose age is between 20 and 30.
If I go for the list of names, the query should be keys=["john", ....]
and if I go for age, the query should use startkey and endkey.
I want to do both :)
Thanks
Unfortunately, you can't do that. The keys parameter returns only the rows whose key equals one of the specified keys. For example, you can't simply send keys=["John","Mark"]&startkey=[null,20]&endkey=[{},30]. This query would only, and ONLY, return the documents having the name John or Mark with a null age.
In your question you specified CouchDB, but if you are using Cloudant, its index queries might be of interest to you.
You could use something like this:
{
"selector": {
"$and": [
{
"name": {
"$in":["Mark","John"]
}
},
{
"age": {
"$gt": 20,
"$lt": 30
}
}
]
},
"fields": [
"name",
"age"
]
}
As for CouchDB, you need to either split the request (one request for the age and one for the people) or do the filtering locally.
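Both CouchDB options can be sketched in a few lines of plain JavaScript. Because the view emits keys of the form [name, age], one way to split the request is to issue one startkey/endkey range per name; the other is to filter rows on the client after fetching them. The helper names rangesFor and filterLocally and the sample rows are hypothetical, and the HTTP calls themselves are left out.

```javascript
// Option 1: one contiguous key range per name, e.g.
// startkey=["john",20]&endkey=["john",30] against the [name, age] view.
function rangesFor(names, minAge, maxAge) {
  return names.map((name) => ({
    startkey: JSON.stringify([name, minAge]),
    endkey: JSON.stringify([name, maxAge]),
  }));
}

// Option 2: filter view rows that were already fetched.
function filterLocally(rows, names, minAge, maxAge) {
  const wanted = new Set(names);
  return rows.filter(
    ({ key: [name, age] }) => wanted.has(name) && age >= minAge && age <= maxAge
  );
}

// Made-up rows in the shape a view response uses.
const rows = [
  { key: ["john", 25], value: { name: "john", age: 25 } },
  { key: ["john", 41], value: { name: "john", age: 41 } },
  { key: ["peter", 22], value: { name: "peter", age: 22 } },
];

console.log(rangesFor(["john", "mark"], 20, 30)[0].startkey);       // ["john",20]
console.log(filterLocally(rows, ["john", "mark"], 20, 30).length);  // 1
```

The per-name ranges cost one request per name (or one batched request where the server supports it), while local filtering costs bandwidth, so which is cheaper depends on how many names and rows are involved.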

Elasticsearch: Contains query

I have a column in my mapping that holds an array of strings
col1
["asd","fgh","wer"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk","fsdfd"]
["asd","trth","fdf"]
The column col1 is not analyzed in the index and I do not want to change the mapping.
"col1":
{
"type":"string",
"index":"not_analyzed"
}
Now, I want to retrieve all records where the string asd appears, so in this case I want the first and fourth records. I tried using the query
query: {
wildcard:{
"col1":"asd"
}
}
with
POST localhost:9200/indexName/test/_search
but that gives me empty results. Which query should I use in this case?
Edit
So I was able to solve the above problem. Here is a follow-up. Consider that this was my data:
col1
["asd fd","fgh bn","wer kl"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk wewe","fsdfd rtr"]
["asd","trth","fdf"]
So now the array contains some strings that have multiple words. I still want to return the first and fourth records. If I go with the solution that I posted, I only get the fourth one. How can I apply the contains logic to each element of the array in col1?
Note
A partial solution is
{ "query": { "match_phrase_prefix": { "col1": "asd" } } }
so again, for the data
col1
["asd fd","fgh bn","wer kl"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk wewe","fsdfd rtr"]
["asd","trth","fdf"]
it returns the first and fourth records. However, if I have
col1
["fd asd","fgh bn","wer kl"]
["qwer","cvbvbn","popop"]
["cvbml","fhjhfrjk wewe","fsdfd rtr"]
["asd","trth","fdf"]
then once again it only returns the fourth one, which is understandable, as asd is no longer a prefix of that value in the first record.
Is there a way to do a contains-type match instead of just a prefix match?
You can use a simple term query and it should work:
POST localhost:9200/indexName/test/_search
{
"query": {
"term": { "col1" : "asd" }
}
}
So, here is the proper query:
{
fields : ["col1","col2"],
query: {
filtered: {
query: {
match_all: {}
},
filter: {
terms: {
col1: ["asd"]
}
}
}
}
}
Final Answer
query: {
wildcard:{
col1:{
value:"*asd*"
}
}
}
:)
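To make the difference between the three attempts concrete, here is a small plain-JavaScript model of how each query treats the elements of a not_analyzed array: exact equality for term, startsWith for a prefix match, and includes for the *asd* wildcard. It is an illustration only, using the last data set from the question; hits is a made-up helper, not an Elasticsearch API.

```javascript
const records = [
  ["fd asd", "fgh bn", "wer kl"],
  ["qwer", "cvbvbn", "popop"],
  ["cvbml", "fhjhfrjk wewe", "fsdfd rtr"],
  ["asd", "trth", "fdf"],
];

// Returns the 1-based positions of the records where any element passes.
const hits = (test) =>
  records.flatMap((values, i) => (values.some(test) ? [i + 1] : []));

console.log(hits((v) => v === "asd"));         // term:   [4]
console.log(hits((v) => v.startsWith("asd"))); // prefix: [4]
console.log(hits((v) => v.includes("asd")));   // *asd*:  [1, 4]
```

Keep in mind that a wildcard pattern with a leading * has to scan many terms, so it can be slow on a large index.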

Rethinkdb: Including a subdocument for nested doc

I am performing an operation, and it works, but I want to know if there is a better or more efficient way to do what I want.
I have an object in my db that looks like this:
{
"id": "testId",
"name": "testName",
"products": [
{
"name": "product1",
"info": "sampleInfo",
"templateIds": [
"asdf-1",
"asdf-2"
]
},
{
"name": "product2",
"info": "sampleInfo",
"templateIds": [
"asdf-1",
"asdf-2"
]
}
]
}
As you can see, each "product" in the "products" array has a sub-array of templateIds. These match templates stored in another table. What I want to do is create a query that merges those templates onto each product object before I send it all back.
Currently I am doing this with sub-merges:
r.table('suites').get('testId').merge(function(suite){
return {
products: suite('products').merge(function(product){
return {
templates: r.expr(product('templateIds')).map(function(id) {
return r.table('templates').get(id)
})
}
})
}
})
My question is: is there a more efficient way to do this? Or is there a completely different way of thinking I should employ to do this?
Thanks guys!
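That looks right to me. The only thing I can think of is that r.table('templates').getAll(r.args(product('templateIds'))) (get_all in the Python and Ruby drivers) is shorter than product('templateIds').map(function(id){ return r.table('templates').get(id); }) and might well be faster. You will probably need to add .coerceTo('array') to the getAll result before embedding it in the merged object.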
EDIT: If you have a small number of templates, another thing that would make this run faster would be to do the substitution in the client instead and cache the retrieved templates by ID. RethinkDB will have to do a separate read for each template ID, even if it sees the same one over and over again, because it doesn't know enough to know whether or not caching those values is safe.
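A minimal sketch of the client-side substitution, assuming the suite document has already been read and the templates are fetched separately by whatever driver call you use. The helpers uniqueTemplateIds and attachTemplates are hypothetical names, and the Map stands in for a cache of fetched templates keyed by ID.

```javascript
// Collect each template ID once, however many products reference it.
function uniqueTemplateIds(suite) {
  return [...new Set(suite.products.flatMap((p) => p.templateIds))];
}

// Attach templates from a Map<id, template> without mutating the suite.
// IDs missing from the map come back as null.
function attachTemplates(suite, templatesById) {
  return {
    ...suite,
    products: suite.products.map((product) => ({
      ...product,
      templates: product.templateIds.map((id) => templatesById.get(id) ?? null),
    })),
  };
}

const suite = {
  id: "testId",
  products: [
    { name: "product1", templateIds: ["asdf-1", "asdf-2"] },
    { name: "product2", templateIds: ["asdf-1", "asdf-2"] },
  ],
};

// Stand-in for the templates fetched for uniqueTemplateIds(suite).
const fetched = new Map([
  ["asdf-1", { id: "asdf-1" }],
  ["asdf-2", { id: "asdf-2" }],
]);

console.log(uniqueTemplateIds(suite).length);                              // 2
console.log(attachTemplates(suite, fetched).products[1].templates.length); // 2
```

With this split, each distinct template is read once per request (or once overall if the Map is kept around), instead of once per product that references it.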
