DynamoDB query with a list parameter

I don't know exactly what I am doing wrong with this query. I am querying a dynamo table with entries that look like this:
| id | name | price | tags |
Tags are a list:
[ { "S" : "tag1" }, { "S" : "tag2" }, { "S" : "tag3" }]
I want to query for all entries that have any tag within another list of tags. For example, get every entry that has "tag1", or every entry that has "tag1" or "tag2".
The query I am trying to test with right now is:
{"TableName": "products-lambda-table",
"FilterExpression": "tags IN (:tags1)",
"ExpressionAttributeValues": {
":tags1": "tag1"
}
This throws back nada (using node, AWS.DynamoDB.DocumentClient().query(params))
Is there something I'm missing? This is all fairly brand new to me.

Solved on my own, was overthinking something/there was an error elsewhere that was throwing me off.
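For anyone landing here with the same symptom: `IN` in a FilterExpression compares the whole attribute against the listed operands, so it never matches individual elements of a list attribute. A common approach is to use `contains()` once per search tag and OR the conditions together. A minimal sketch, assuming the table and attribute names from the question (the helper function name is hypothetical):

```javascript
// Build DynamoDB params that match items whose "tags" list contains
// ANY of the given search tags, using one contains() clause per tag.
function buildTagFilterParams(searchTags) {
  const placeholders = searchTags.map((_, i) => `:tag${i}`);
  return {
    TableName: "products-lambda-table",
    // e.g. "contains(tags, :tag0) OR contains(tags, :tag1)"
    FilterExpression: placeholders
      .map((p) => `contains(tags, ${p})`)
      .join(" OR "),
    ExpressionAttributeValues: Object.fromEntries(
      placeholders.map((p, i) => [p, searchTags[i]])
    ),
  };
}

const params = buildTagFilterParams(["tag1", "tag2"]);
// params can then be passed to DocumentClient().scan(params) or,
// together with a KeyConditionExpression, to query(params)
```

Note that a FilterExpression only filters results after the key condition (or scan) has read them; it does not make the read itself cheaper.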


Match keys with a sibling object in JSONata

I have a JSON object with the structure below. When looping over key_two, I want to create a new object that I will return. The returned object should contain a title with the value from key_one's name where the id of key_one matches the node of the current key_two entry.
Both objects contain other keys that will also be included, but the first step I can't figure out is how to grab data from a sibling object while looping and match it to the current value.
{
  "key_one": [
    {
      "name": "some_cool_title",
      "id": "value_one",
      ...
    }
  ],
  "key_two": [
    {
      "node": "value_one",
      ...
    }
  ]
}
This is a good example of a 'join' operation (in SQL terms). JSONata supports this in a path expression. See https://docs.jsonata.org/path-operators#-context-variable-binding
So in your example, you could write:
key_one#$k1.key_two[node = $k1.id].{
"title": $k1.name
}
You can then add extra fields into the resulting object by referencing items from either of the original objects. E.g.:
key_one#$k1.key_two[node = $k1.id].{
"title": $k1.name,
"other_one": $k1.other_data,
"other_two": other_data
}
See https://try.jsonata.org/--2aRZvSL
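To make the semantics of the context-binding join concrete, here is the same operation written out in plain JavaScript. This is just an illustrative sketch using the shape from the question, not JSONata itself:

```javascript
// Sample input matching the question's structure.
const data = {
  key_one: [{ name: "some_cool_title", id: "value_one" }],
  key_two: [{ node: "value_one" }],
};

// key_one#$k1 binds each key_one entry as $k1; the predicate
// [node = $k1.id] keeps the key_two entries whose node matches,
// and the final object constructor pulls "title" from $k1.
const result = data.key_one.flatMap((k1) =>
  data.key_two
    .filter((t) => t.node === k1.id)
    .map(() => ({ title: k1.name }))
);
// result: [{ title: "some_cool_title" }]
```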
I seem to have found a solution for this.
[key_two].$filter($$.key_one, function($v, $k){
$v.id = node
}).{"title": name ? name : id}
Gives:
[
{
"title": "value_one"
},
{
"title": "value_two"
},
{
"title": "value_three"
}
]
Leaving this here in case someone has a similar issue in the future.

Filter nested array using jmes query

I have to get the names of the companies in which 'John' worked in the 'sales' department. My JSON looks like this:
[
  {
    "name": "John",
    "company": [
      { "name": "company1", "department": "sales" },
      { "name": "company2", "department": "backend" },
      { "name": "company3", "department": "sales" }
    ],
    "phone": "1234"
  }
]
And my jmesquery is like this:
jmesquery: "[? name=='John'].company[? department=='sales'].{Company: name}"
But with this query, I'm getting a null array.
This is because your first filter, [?name=='John'], creates a projection (more specifically, a filter projection) that you have to reset in order to filter it further.
Resetting a projection can be achieved using pipes.
Projections are an important concept in JMESPath. However, there are times when projection semantics are not what you want. A common scenario is when you want to operate on the result of a projection rather than projecting an expression onto each element in the array.
For example, the expression people[*].first will give you an array containing the first names of everyone in the people array. What if you wanted the first element in that list? If you tried people[*].first[0], you would just evaluate first[0] for each element in the people array, and because indexing is not defined for strings, the final result would be an empty array, []. To accomplish the desired result, you can use a pipe expression, <expression> | <expression>, to indicate that a projection must stop.
Source: https://jmespath.org/tutorial.html#pipe-expressions
So, here would be a first step in your query:
[?name=='John'] | [].company[?department=='sales'].{Company: name}
That said, this still ends in an array of arrays:
[
[
{
"Company": "company1"
},
{
"Company": "company3"
}
]
]
Because you could end up with multiple people named John in a sales department, you get one outer array for the matching users, each containing an inner array for their companies/departments.
In order to fix this, you can use the flatten operator: [].
So we end with:
[?name=='John'] | [].company[?department=='sales'].{Company: name} []
Which gives:
[
{
"Company": "company1"
},
{
"Company": "company3"
}
]
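To see why the pipe and the trailing flatten are both needed, the same steps can be written out in plain JavaScript. This is only an illustrative sketch of the query's semantics, using the sample data from the question:

```javascript
const people = [
  {
    name: "John",
    company: [
      { name: "company1", department: "sales" },
      { name: "company2", department: "backend" },
      { name: "company3", department: "sales" },
    ],
    phone: "1234",
  },
];

// [?name=='John'] | [].company[?department=='sales'].{Company: name}
// The pipe stops the filter projection; the result is one inner
// array of matches per matching person.
const perPerson = people
  .filter((p) => p.name === "John")
  .map((p) =>
    p.company
      .filter((c) => c.department === "sales")
      .map((c) => ({ Company: c.name }))
  );
// perPerson: [[{ Company: "company1" }, { Company: "company3" }]]

// The final [] flattens one level, merging the per-person arrays.
const flat = perPerson.flat();
// flat: [{ Company: "company1" }, { Company: "company3" }]
```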

ES query over different types and join over a common field

I'm trying to find out whether this is possible in Elasticsearch. Since I want to decouple the data as much as possible, I would like to have different types inside an index, one for each field.
Question 1: how bad is this decision?
Now, if I've type name:
/myindex/name
{
"name" : "mat",
"id":1
}
and surname
/myindex/surname
{
"surname" : "txt",
"id":1
}
Question 2: how can I create a search that is name='mat' AND surname='txt' and returns the id?
If I run the query as this (on the index, without specifying the type):
/myindex/_search
{
"query" : {
"bool" : {
"should" : [{
"term" : { "name" : "mat" }
},
{
"term" : { "surname" : "txt" }
}]
}
}
}
it returns (obviously) two documents. Can I say something like "join by id"?
If you're going to use Elasticsearch (or any other NoSQL database), you really need to change your mindset around "joins". We don't do joins in NoSQL...at all...ever. Instead, what you should do is put duplicate data in each record.
Take the following example: Let's say I have a product, and each product has a product type.
In SQL, the 2 records would look like this:
| Product ID | Product Name | Product Type ID |
| ---------- | ------------ | --------------- |
| 1 | product name 1 | 99 |
| 2 | product name 2 | 99 |
In elasticsearch, we would store the data like this:
products/product/1
{
  "name": "product name 1",
  "type": {
    "type name": "product type 99"
  }
}
products/product/2
{
  "name": "product name 2",
  "type": {
    "type name": "product type 99"
  }
}
Notice how the product type information is duplicated in both documents. As a general rule, NoSQL has the advantage of being much faster, at the cost of data duplication. This means that it's much harder to change the name "product type 99", because I'd need to change it in N records.
Hopefully that helps.
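Applied to the question: once name and surname live on a single document, the original search becomes one bool query, with no join needed. A sketch, assuming the index name from the question and a hypothetical person type holding both fields:

```json
POST /myindex/person/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "name": "mat" } },
        { "term": { "surname": "txt" } }
      ]
    }
  }
}
```

Each hit's _source (or _id) then gives you the id directly.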

CouchDB - complex query on a view

I am using Cloudant and want to query a view which looks like this:
function (doc) {
  if (doc.name !== undefined) {
    emit([doc.name, doc.age], doc);
  }
}
What should be the correct way to get a result if I have a list of names (I will use the 'keys=[]' option for it) and a range of ages (for which startkey and endkey should be used)?
Example: I want to get persons named "john", "mark", "joseph", or "santosh" whose age lies between 20 and 30.
If i go for list of names, query should be keys=["john", ....]
and if I go for age query should use startkey and endkey
I want to do both :)
Thanks
Unfortunately, you can't do both. The keys parameter queries the documents with exactly the specified keys. For example, you can't send keys=["John","Mark"]&startkey=[null,20]&endkey=[{},30]: that query would only, and ONLY, return the documents having the name John or Mark with a null age.
In your question you mentioned CouchDB, but since you are using Cloudant, a query index might be interesting for you.
You could have something like this:
{
  "selector": {
    "$and": [
      {
        "name": {
          "$in": ["Mark", "John"]
        }
      },
      {
        "age": {
          "$gt": 20,
          "$lt": 30
        }
      }
    ]
  },
  "fields": [
    "name",
    "age"
  ]
}
As for plain CouchDB, you need to either split the request (one request for the age range and one for the names) or do the filtering locally.
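The client-side fallback for plain CouchDB can be sketched like this: query the view with keys=[...] for the names only, then apply the age range locally. The row shape follows the view's emit([doc.name, doc.age], doc); the sample data is hypothetical:

```javascript
// Rows as a keys=["john","mark","joseph"] view query might return them.
const rows = [
  { key: ["john", 25], value: { name: "john", age: 25 } },
  { key: ["mark", 45], value: { name: "mark", age: 45 } },
  { key: ["joseph", 22], value: { name: "joseph", age: 22 } },
];

// The age is the second element of the emitted complex key,
// so the range filter can run entirely on the client.
const inRange = rows.filter(({ key: [, age] }) => age >= 20 && age <= 30);
// keeps john (25) and joseph (22), drops mark (45)
```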

Elasticsearch bulk or search

Background
I am working on an API that allows the user to pass in a list of details about a member (name, email addresses, ...). I want to use this information to match account records in my Elasticsearch database and return a list of potential matches.
I thought this would be as simple as doing a bool query on the fields I want, however I seem to be getting no hits.
I'm relatively new to Elasticsearch, my current _search request looks like this.
Example Query
POST /member/account/_search
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            { "term": { "email": "jon.smith@gmail.com" } },
            { "term": { "email": "samy@gmail.com" } },
            { "term": { "email": "bo.blog@gmail.com" } }
          ]
        }
      }
    }
  }
}
Question
How should I update this query to return records that match any of the email addresses?
Am I able to prioritise records that match email and another field? Example "family_name".
Will this be a problem if I need to do this against a few hundred emails addresses?
Well, you need to make the change on the index side rather than the query side.
By default, your email ID is broken up while indexing:
jon.smith@gmail.com => [ jon , smith , gmail , com ]
Now, when you search using a term query, the analyzer is not applied, so it tries to find an exact match for jon.smith@gmail.com, which, as you can see, won't work.
Even if you use a match query, you will end up getting all documents as matches.
Hence you need to change the mapping to index the email ID as a single token rather than tokenizing it.
So using not_analyzed would be the best solution here.
When you define the email field as not_analyzed, the following happens while indexing:
jon.smith@gmail.com => [ jon.smith@gmail.com ]
After changing the mapping and reindexing all your documents, you can freely run the above query.
I would suggest using a terms query, as follows -
{
  "query": {
    "terms": {
      "email": [
        "jon.smith@gmail.com",
        "samy@gmail.com",
        "bo.blog@gmail.com"
      ]
    }
  }
}
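For reference, the mapping change described above might look like the following sketch. This uses the pre-5.x string/not_analyzed syntax that the question's filtered query implies; the index and type names follow the question:

```json
PUT /member
{
  "mappings": {
    "account": {
      "properties": {
        "email": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
```

On Elasticsearch 5.x and later, the equivalent is "type": "keyword". Note that a mapping change like this requires reindexing existing documents.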
To answer the second part of your question: you are looking for boosting, and I would recommend going through the function score query.
