In Couchbase or N1QL how can I check if the values in an array match - view

In Couchbase I have the following document structure:
{
name: "bob",
permissions: [
2,
4,
6
]
}
I need to be able to create a view, or N1QL query which will check if the permissions for "bob" are contained within a given array.
E.g., I have an array with the contents
[1,2,3,4,5,6]
I need the "bob" document to be returned because my array contains 2, 4 and 6, which are all of the permissions of "bob".
If my array contained 1,3,4,5,6, "bob" should not be selected because my array does not contain 2.
Essentially I want to match any documents whose permission entries are all contained in my array.
The solution can be either a view or an N1QL query.

Using N1QL, you can do the following:
SELECT * FROM my_bucket WHERE EVERY p IN permissions SATISFIES p IN [ 1,2,3,4,5,6 ] END;
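To make the semantics of this query concrete, here is a minimal Python sketch of the check that EVERY ... SATISFIES ... END performs for each document. It only illustrates the logic and is not how Couchbase evaluates the query; the function name permissions_allowed and the sample data are hypothetical.

```python
def permissions_allowed(doc, allowed):
    """Return True if every entry in doc["permissions"] is also in `allowed`."""
    allowed_set = set(allowed)
    # all() over an empty list is True, mirroring EVERY on an empty array.
    return all(p in allowed_set for p in doc["permissions"])


bob = {"name": "bob", "permissions": [2, 4, 6]}

print(permissions_allowed(bob, [1, 2, 3, 4, 5, 6]))  # True
print(permissions_allowed(bob, [1, 3, 4, 5, 6]))     # False
```

Note that EVERY is also true for an empty array, so a document with permissions [] would be selected. If at least one element is required, N1QL's ANY AND EVERY can be used in place of EVERY.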

Related

Efficient data-structure to searching data only in documents a user can access

Problem description:
The goal is to efficiently query strings from a set of JSON documents while respecting document-level security, such that a user is only able to retrieve data from documents they have access to.
Suppose we have the following documents:
Document document_1, which has no restrictions:
{
"id": "document_1",
"set_of_strings_1": [
"the",
"quick",
"brown"
],
"set_of_strings_2": [
"fox",
"jumps",
"over"
],
"isPublic": true
}
Document document_2, which can only be accessed by 3 users:
{
"id": "document_2",
"set_of_strings_1": [
"the",
"lazy"
],
"set_of_strings_2": [
"dog"
],
"isPublic": false,
"allowed_users": [
"Alice",
"Bob",
"Charlie"
]
}
Now suppose user Bob (has access to both documents) makes the following query:
getStrings(
user_id: "Bob",
set_of_strings_id: "set_of_strings_1"
)
The correct response should be the union of set_of_strings_1 from both documents:
["the", "quick", "brown", "lazy"]
Now suppose user Dave (has access to document_1 only) makes the following query:
getStrings(
user_id: "Dave",
set_of_strings_id: "set_of_strings_1"
)
The correct response should be set_of_strings_1 from document_1:
["the", "quick", "brown"]
A further optimization is to handle prefix tokens. E.g. for the query
getStrings(
user_id: "Bob",
set_of_strings_id: "set_of_strings_1",
token: "t"
)
The correct response should be:
["the"]
Note: empty token should match all strings.
However, I am happy to perform a simple in-memory prefix-match after the strings have been retrieved. The bottleneck here is expected to be the number of documents, not the number of strings.
What I have tried:
Approach 1: Naive approach
The naive solution here would be to:
put all the documents in a SQL database
perform a full-table scan to get all the documents (we can have millions of documents)
iterate through all the documents to figure out user permissions
filter out the set of documents the user can access
iterate through the filtered list to get all the strings
This is too slow.
Approach 2: Inverted indices
Another approach considered is to create an inverted index from users to documents, e.g.
| users  | documents_they_can_see             |
|--------|------------------------------------|
| user_1 | document_1, document_2, document_3 |
| user_2 | document_1                         |
| user_3 | document_1, document_4             |
This will efficiently give us the document ids, which we can use against some other index to construct the string set.
If this next step is done naively, it still involves a linear scan through all the documents the user is able to access. To avoid this, we can create another inverted index mapping document_id#set_of_strings_id to the corresponding set of strings. Then we just take the union of all the sets to get the result and run the prefix match afterwards. However, this involves taking the union of a large number of sets.
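The two inverted indices described in Approach 2 can be sketched in Python as follows. This is only an in-memory illustration under stated assumptions: the helper names build_indices and get_strings are hypothetical, and keeping public documents in a separate set (rather than listing them under every user) is a design choice of this sketch, not something prescribed above.

```python
from collections import defaultdict

documents = [
    {"id": "document_1", "isPublic": True,
     "set_of_strings_1": ["the", "quick", "brown"],
     "set_of_strings_2": ["fox", "jumps", "over"]},
    {"id": "document_2", "isPublic": False,
     "set_of_strings_1": ["the", "lazy"],
     "set_of_strings_2": ["dog"],
     "allowed_users": ["Alice", "Bob", "Charlie"]},
]


def build_indices(docs):
    """Build user -> document ids and document_id#set_id -> strings indices."""
    public_docs = set()
    user_to_docs = defaultdict(set)
    strings_by_key = {}
    for doc in docs:
        if doc.get("isPublic"):
            public_docs.add(doc["id"])
        for user in doc.get("allowed_users", []):
            user_to_docs[user].add(doc["id"])
        for field, value in doc.items():
            if field.startswith("set_of_strings_"):
                strings_by_key[f'{doc["id"]}#{field}'] = set(value)
    return public_docs, user_to_docs, strings_by_key


def get_strings(indices, user_id, set_of_strings_id, token=""):
    """Union the string sets of all visible documents, then prefix-match."""
    public_docs, user_to_docs, strings_by_key = indices
    visible = public_docs | user_to_docs.get(user_id, set())
    result = set()
    for doc_id in visible:
        result |= strings_by_key.get(f"{doc_id}#{set_of_strings_id}", set())
    # An empty token matches every string, as required in the question.
    return sorted(s for s in result if s.startswith(token))


indices = build_indices(documents)
print(get_strings(indices, "Bob", "set_of_strings_1"))       # ['brown', 'lazy', 'quick', 'the']
print(get_strings(indices, "Dave", "set_of_strings_1"))      # ['brown', 'quick', 'the']
print(get_strings(indices, "Bob", "set_of_strings_1", "t"))  # ['the']
```

The union step is still linear in the number of documents the user can see, which is the cost the question is concerned about; the sketch only shows the data layout, not a way around that cost.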
Approach 3: Caching
Use Redis with the following data model:
| key                       | value    |
|---------------------------|----------|
| user_id#set_of_strings_id | [String] |
Then we perform prefix match in-memory on the set of strings we get from the cache.
We want this data to be fairly up-to-date so the source-of-truth datastore still needs to be performant.
I don't want to reinvent the wheel. Is there a data structure or some off-the-shelf system that does what I am trying to do?

Couchbase Filter Query -> number in range between two numbers using Spring Data Couchbase (SpEL notation)

I'm trying to make a query in a Couchbase Database. The idea is to retrieve the elements which are in the range of two numbers. I'm using Spring Data Couchbase.
My query looks like this:
@Query("#{#n1ql.selectEntity} WHERE #{#n1ql.filter} AND $age BETWEEN minAge AND maxAge ")
Optional<Room> findByMinAgeAndMaxAge(@Param("age") int age);
But I get the following error:
Unable to execute query due to the following n1ql errors:
{"msg":"No index available on keyspace bucketEx that matches your query. Use CREATE INDEX or CREATE PRIMARY INDEX to create an index, or check that your expected index is online.","code":4000}
This is what I get in the console:
SELECT META(`bucketEx`).id AS _ID, META(`bucketEx`).cas AS _CAS, `bucketEx`.* FROM `bucketEx` WHERE `docType` = \"com.rccl.middleware.engine.repository.model.salon\" AND $age BETWEEN minAge AND maxAge ","$age":7,"scan_consistency":"statement_plus"}
My question is whether I have to create indexes for the two fields (minAge and maxAge) or whether there is another issue with my query. I'm just starting with Couchbase and am not sure what is happening.
My document looks like this:
{
"salons": [
{
"name": "salon_0",
"id": "salon-00",
"maxAge": 6,
"minAge": 3
}
],
"docType": "com.rccl.middleware.engine.repository.model.salon"
}
The age range you are looking for is inside the salons array. If you want the document returned when any one of the array objects matches, you should use an array index on one of the fields.
CREATE INDEX ix1 ON bucketEx(DISTINCT ARRAY v.maxAge FOR v IN salons END)
WHERE `docType` = "com.rccl.middleware.engine.repository.model.salon";
SELECT META( b ).id AS _ID, META( b ).cas AS _CAS, b.*
FROM `bucketEx` AS b
WHERE b.`docType` = "com.rccl.middleware.engine.repository.model.salon" AND
ANY v IN b.salons SATISFIES $age BETWEEN v.minAge AND v.maxAge END;
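As an illustration of what the ANY ... SATISFIES ... END predicate checks, here is a minimal Python sketch. It is not the query engine's implementation; the function name any_salon_matches is hypothetical, and the sample document is the one from the question. BETWEEN in N1QL is inclusive on both ends, which the chained comparison mirrors.

```python
def any_salon_matches(doc, age):
    """Return True if at least one salon's [minAge, maxAge] range contains age."""
    return any(s["minAge"] <= age <= s["maxAge"] for s in doc.get("salons", []))


doc = {
    "salons": [
        {"name": "salon_0", "id": "salon-00", "maxAge": 6, "minAge": 3}
    ],
    "docType": "com.rccl.middleware.engine.repository.model.salon",
}

print(any_salon_matches(doc, 5))  # True
print(any_salon_matches(doc, 7))  # False
```

With $age set to 7, as in the logged request, this sample document would not match, because its only salon covers ages 3 to 6.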

Appsync graphql: How to filter based on entry in an array field

In my code I have created a filter as:
const myFilter: TableMyEntityFilterInput = {targets: {contains: 'username'}};
'targets' field is an array:
targets?: Array | null;
My objective is to fetch those records which have 'username' as an entry in the 'targets' field.
But it doesn't work. An empty array is fetched. If I use a similar criterion on a simple string field, it works.
How can I get it working for an array field?
Edit:
'targets' sample value:
[ { "S" : "[\"Messi\",\"Ronaldo\"]" }]
CONTAINS is supported for lists: When evaluating "a CONTAINS b", "a" can be a list; however, "b" cannot be a set, a map, or a list.
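One possible reading of the sample value above is that 'targets' is stored as a list whose single element is a JSON-encoded string, rather than a list of separate names. If that is the case, a list CONTAINS check compares 'Messi' against the whole string and finds no match. The Python sketch below only illustrates that reading; the function name list_contains is hypothetical and this is not AppSync or DynamoDB code.

```python
import json


def list_contains(a, b):
    """Mimic list CONTAINS: true only if some element of `a` equals `b`."""
    return b in a


# The sample value: one string element that happens to hold a JSON array.
stored = ['["Messi","Ronaldo"]']
print(list_contains(stored, "Messi"))   # False

# If each name were stored as its own list element, the check would succeed.
decoded = json.loads(stored[0])
print(list_contains(decoded, "Messi"))  # True
```

Under this assumption, storing each name as a separate list element would let the filter behave as expected.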

Is there an equivalent of the mongo addToSet command in rethinkdb?

Say I have a users table with an embedded followers array property.
{
"followers": [
"bar"
],
"name": "foo"
}
In RethinkDB, what's the best way to add a username to that followers property, i.e. add a follower to a user?
In mongodb I would use the addToSet command to add a unique value to the embedded array. Should I use the merge command?
RethinkDB has a setInsert command. You can write table.get(ID).update({followers: r.row('followers').setInsert(FOLLOWER)}).
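The effect of setInsert on the followers array can be sketched in plain Python. This is only a rough illustration of the set-like behaviour (the value is added once and the result holds distinct values), not RethinkDB driver code; the function name set_insert is hypothetical.

```python
def set_insert(values, new_value):
    """Return a list of distinct values containing `values` plus `new_value`."""
    result = []
    for v in values + [new_value]:
        if v not in result:
            result.append(v)
    return result


user = {"name": "foo", "followers": ["bar"]}

user["followers"] = set_insert(user["followers"], "baz")  # new follower is added
user["followers"] = set_insert(user["followers"], "bar")  # existing one is not duplicated
print(user["followers"])  # ['bar', 'baz']
```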

ElasticSearch, how to search for a document containing a specific array element

I am having a little problem with Elasticsearch and wonder if someone can help me solve it.
I have a document containing an array of tuples (publications).
Something like:
{
....
publications: [
{
item1: 385294,
item2: 11
},
{
item1: 395078,
item2: 1
}
]
....
}
The problem I have is retrieving documents which contain a specific tuple, for example (item1 = 395078 AND item2 = 1).
Whatever I try, it seems to always treat item1 and item2 separately. I fail to tell Elasticsearch that item1 and item2 must have specific values inside the same tuple, not across the whole array.
Is there something I'm missing here?
Thanks
This is not possible in a straightforward way.
Elasticsearch flattens the array before checking the condition.
This means Elasticsearch matches the condition
a=x AND b=y1 against [{a=x,b=y},{a=x1,b=y1}], which would not happen with conventional per-element array checking.
What you can do here is one of the following:
Use the nested type - https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html (but for each element in the array, an extra document would be created)
Store the array as
publications: [
{
385294:11
},
{
395078:1
}
]
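The difference between flattened matching and per-element (nested) matching can be sketched in Python. This only models the behaviour described in the answer and is not Elasticsearch code; the function names flattened_match and nested_match are hypothetical.

```python
def flattened_match(doc, item1, item2):
    """Match the way a flattened array behaves: each field is checked across the whole array."""
    item1_values = {p["item1"] for p in doc["publications"]}
    item2_values = {p["item2"] for p in doc["publications"]}
    return item1 in item1_values and item2 in item2_values


def nested_match(doc, item1, item2):
    """Match per element: both conditions must hold inside the same tuple."""
    return any(p["item1"] == item1 and p["item2"] == item2
               for p in doc["publications"])


doc = {
    "publications": [
        {"item1": 385294, "item2": 11},
        {"item1": 395078, "item2": 1},
    ]
}

print(flattened_match(doc, 385294, 1))  # True, although no single tuple is (385294, 1)
print(nested_match(doc, 385294, 1))     # False
print(nested_match(doc, 395078, 1))     # True
```

The first call shows the false positive the question describes: item1 and item2 each occur somewhere in the array, but never in the same tuple.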
