DynamoDB query using begins_with in Ruby

Below is my DynamoDB data:
Table Name: Inventory
PartitionKey: Name (String)
SortKey: ID (String)
Below is some sample data:
Name ID
Fruits Mango
Fruits Mango-Green
Fruits Mango-Green-10
Fruits Mango-Green-20
Fruits Apple
Fruits Apple-Red
Veggie Onion
Veggie Onion-White
Veggie Onion-White-10
How can I add a search to the code below so it returns all the rows whose ID begins_with "Mango-Green"? I can't modify the keys or the table data now.
table_name: 'Inventory',
key_condition_expression: "#Name = :Name AND #ID = :ID",
select: "ALL_ATTRIBUTES",
expression_attribute_names: {
  "#Name" => "Name",
  "#ID" => "ID"
},
expression_attribute_values: {
  ":ID" => ID,
  ":Name" => 'Fruits'
}

More info about the Ruby interface is linked in the code comments:
require 'dotenv/load'
require 'aws-sdk-dynamodb'

# https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/DynamoDB.html
client_dynamodb = Aws::DynamoDB::Client.new(
  region: ENV['AWS_REGION'],
  credentials: Aws::Credentials.new(ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY'])
)

# https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/DynamoDB/Client.html#query-instance_method
fruits = client_dynamodb.query({
  table_name: "Inventory",
  projection_expression: "name, id",
  key_conditions: {
    "id" => {
      attribute_value_list: ["Mango-Green"],
      comparison_operator: "BEGINS_WITH",
    }
  },
  scan_index_forward: false, # true => asc ; false => desc
}).items
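For reference, a begins_with condition on the sort key can be expressed through key_condition_expression rather than the legacy key_conditions. A minimal sketch, assuming the Inventory table described above (Name is a DynamoDB reserved word, hence the expression attribute names):

# EQ on the partition key plus begins_with on the sort key.
fruits = client_dynamodb.query({
  table_name: "Inventory",
  key_condition_expression: "#name = :name AND begins_with(#id, :prefix)",
  expression_attribute_names: { "#name" => "Name", "#id" => "ID" },
  expression_attribute_values: { ":name" => "Fruits", ":prefix" => "Mango-Green" }
}).items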

Related

Elastic Search Nest Fuzzy per Field

I am struggling with Elasticsearch using NEST for C#.
Let's assume an index of UserAccounts which looks like this:
[{
  AccountId: 1,
  Name: "Test Account",
  Email: "test#test.com",
  Phone: "01234/5678",
  Street: "test Street 1",
  Zip: "12345"
},
{
  AccountId: 2,
  Name: "Test Akkount",
  Email: "test#gmail.com",
  Phone: "0987/6543",
  Street: "test Street 1",
  Zip: "54321"
},
{
  AccountId: 3,
  Name: "Bla Bla",
  Email: "qwer#yahoo.com",
  Phone: null,
  Street: "bla Street 3",
  Zip: "45678"
},
{
  AccountId: 4,
  Name: "Foo",
  Email: "asdf#msn.com",
  Phone: null,
  Street: "ghjk Street 9",
  Zip: "65487"
}]
Now I want to get all accounts similar to my query:
string name = "aggount";
string email = "test#gmail.com";
string phone = "0987/6543";
string street = "test Str 1";
string zip = "54321";
But each field has its own criteria.
Field "name" should match over fuzzy logic.
Field "email" should match to 100% but not when its null or empty.
Field "phone" should match to 100% but not when its null or empty.
Field "street" should only match with fuzzy, when "zip" matches to 100%.
I want a list of account with possibilities. If name matches but email not, than there should be a result because of name. Do elastic trim always the provided values?
If it is possible to get a score per field. But this is a nice to have.
My code do not work because when I provide a email and the email is not matching, elastic skip the match over name.
var response = elasticClient.Search<Accounts>(search => search
    .Index(INDEX_NAME_ACCOUNT)
    .Query(q => q.Bool(b =>
    {
        if (name != null)
        {
            b = b.Should(s => s.Match(m => m.Query(name).Field(f => f.Name).Boost(1.5).Fuzziness(Fuzziness.EditDistance(3))));
        }
        if (street != null && zip != null)
        {
            b = b.Should(s =>
                s.Match(m => m.Query(street).Field(f => f.MainAddress.Street).Boost(0.5).Fuzziness(Fuzziness.EditDistance(3))) &&
                s.Match(m => m.Query(zip).Field(f => f.MainAddress.Zip).Boost(0.7).Fuzziness(Fuzziness.EditDistance(0)))
            );
        }
        if (string.IsNullOrEmpty(name) && string.IsNullOrEmpty(street))
        {
            b = b.Should(s => s.MatchNone());
        }
        b = b.MustNot(s => s.Match(m => m.Query(null).Field(f => f.DeletedTimestamp)));
        return b;
    }))
    .Explain()
    .Human()
);
Thank you in advance

Expect orWhere() to work with andWhere() instead of where()

I have a query:
topics = await topicRepository.createQueryBuilder('topic')
  .leftJoinAndSelect('topic.user', 'user', 'topic.userId = user.id')
  .where('topic.categoryId = :id', {
    id: categoryId,
  })
  .andWhere('topic.title like :search', { search: `%${searchKey}%` })
  // It should take the first where
  .orWhere('user.pseudo like :search', { search: `%${searchKey}%` })
  .addOrderBy(filter === 'latest' ? 'topic.created_at' : 'topic.repliesCount', 'DESC')
  .take(limit)
  .skip(skip)
  .getMany();
The generated SQL query is:
SELECT DISTINCT distinctAlias.topic_id AS "ids_topic_id", distinctAlias.topic_created_at
FROM (
  SELECT topic.id AS topic_id, topic.title AS topic_title, topic.content AS topic_content, topic.created_at AS topic_created_at,
         topic.views AS topic_views, topic.repliesCount AS topic_repliesCount, topic.categoryId AS topic_categoryId,
         topic.userId AS topic_userId, topic.surveyId AS topic_surveyId, user.id AS user_id, user.email AS user_email,
         user.pseudo AS user_pseudo, user.password AS user_password, user.rank AS user_rank, user.avatar AS user_avatar,
         user.createdAt AS user_createdAt, user.lastActivity AS user_lastActivity, user.signature AS user_signature,
         user.post_count AS user_post_count, user.updatedAt AS user_updatedAt
  FROM topic topic
  LEFT JOIN user user ON user.id = topic.userId AND (topic.userId = user.id)
  WHERE topic.categoryId = '5' AND topic.title like '%admin%' OR topic.user.pseudo like '%admin%'
) distinctAlias
ORDER BY distinctAlias.topic_created_at DESC, topic_id ASC
LIMIT 20
The problem is here:
WHERE topic.categoryId = '5' AND topic.title like '%admin%' OR topic.user.pseudo like '%admin%')
I expected:
WHERE (topic.categoryId = '5' AND topic.title like '%admin%') OR (topic.categoryId = '5' AND topic.user.pseudo like '%admin%')
I want the .orWhere to be OR'd with the .andWhere instead of with the .where.
I can't find any documentation / issue about this use case.
The precedence of query conditions can be controlled by using the Brackets class:
topics = await topicRepository.createQueryBuilder('topic')
  .leftJoinAndSelect('topic.user', 'user', 'topic.userId = user.id')
  .where('topic.categoryId = :id', {
    id: categoryId,
  })
  .andWhere(new Brackets(qb => {
    qb.where('topic.title like :search', { search: `%${searchKey}%` })
      .orWhere('user.pseudo like :search', { search: `%${searchKey}%` });
  }))
  .addOrderBy(filter === 'latest' ? 'topic.created_at' : 'topic.repliesCount', 'DESC')
  .take(limit)
  .skip(skip)
  .getMany();
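With the Brackets wrapper, the generated WHERE clause should group as WHERE topic.categoryId = '5' AND (topic.title like '%admin%' OR user.pseudo like '%admin%'), which by the distributive law is equivalent to the expected query above.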

Querying DynamoDB table by hash and range key

I want to query a DynamoDB table by hash and range key, using the AWS SDK for Ruby V2. The following code works:
dynamodb = Aws::DynamoDB::Client.new(region: 'somewhere')
dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      attribute_value_list: ['foo'],
      comparison_operator: 'EQ'
    }
  }
)
But I want to set multiple values in the range key condition, like this:
dynamodb = Aws::DynamoDB::Client.new(region: 'somewhere')
dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      attribute_value_list: ['foo', 'bar'],
      comparison_operator: 'EQ'
    }
  }
)
This code raises lib/ruby/gems/2.2.0/gems/aws-sdk-core-2.0.48/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call': One or more parameter values were invalid: Invalid number of argument(s) for the EQ ComparisonOperator (Aws::DynamoDB::Errors::ValidationException).
I've tried using the IN operator:
dynamodb = Aws::DynamoDB::Client.new(region: 'somewhere')
dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      attribute_value_list: ['foo', 'bar'],
      comparison_operator: 'IN'
    }
  }
)
It raises lib/ruby/gems/2.2.0/gems/aws-sdk-core-2.0.48/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call': Attempted conditional constraint is not an indexable operation (Aws::DynamoDB::Errors::ValidationException).
How do I query a DynamoDB table by one hash key and multiple range key values?
The Query operation only allows the following operators on the Range Key:
EQ | LE | LT | GE | GT | BEGINS_WITH | BETWEEN
For a Query operation, Condition is used for specifying the
KeyConditions to use when querying a table or an index. For
KeyConditions, only the following comparison operators are supported:
EQ | LE | LT | GE | GT | BEGINS_WITH | BETWEEN
Source:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html
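For example, a bounded range on the sort key is allowed; a sketch in the same key_conditions style as the question (BETWEEN is inclusive and takes a lower and an upper bound):

dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      attribute_value_list: ['bar', 'foo'], # lower bound, upper bound
      comparison_operator: 'BETWEEN'
    }
  }
)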
You can still meet the requirements by using a FilterExpression:
filter_expression: "RANGE_KEY_NAME IN (:id1, :id2)",
expression_attribute_values: { ":id1" => "foo", ":id2" => "bar" }
However, the consumed provisioned throughput will be based on the items the query reads, rather than on the filtered result set.
Another option would be to send multiple GetItem requests (each one with a possible Range Key value) via BatchGetItem. The result would contain only the matching records:
resp = dynamodb.batch_get_item(
  # required
  request_items: {
    "TableName" => {
      # required
      keys: [
        {
          "AttributeName" => "value", # <Hash,Array,String,Numeric,Boolean,nil,IO,Set>
        },
      ],
      attributes_to_get: ["AttributeName", '...'],
      consistent_read: true,
      projection_expression: "ProjectionExpression",
      expression_attribute_names: { "ExpressionAttributeNameVariable" => "AttributeName" },
    },
  },
  return_consumed_capacity: "INDEXES|TOTAL|NONE",
)
Source: http://docs.aws.amazon.com/sdkforruby/api/Aws/DynamoDB/Client.html
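As a concrete sketch of the BatchGetItem approach, assuming the TABLE_NAME, HASH_KEY_NAME and RANGE_KEY_NAME constants from the question:

# One composite key per candidate range value; only keys that
# actually exist come back in the response.
resp = dynamodb.batch_get_item(
  request_items: {
    TABLE_NAME => {
      keys: [
        { HASH_KEY_NAME => 'hoge', RANGE_KEY_NAME => 'foo' },
        { HASH_KEY_NAME => 'hoge', RANGE_KEY_NAME => 'bar' }
      ]
    }
  }
)
matching_items = resp.responses[TABLE_NAME]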

Compare three arrays of hashes and get the result without duplicates in Ruby?

I'm using the fql gem to retrieve data from Facebook. The original array of hashes is here. When I compare these three arrays of hashes, I want to get the final result in this way:
{
  "photo" => [
    {
      "owner" : "1105762436",
      "src_big" : "https://fbcdn-sphotos-b-a.akamaihd.net/hphotos-ak-xap1/t31.0-8/q71/s720x720/10273283_10203050474118531_5420466436365792507_o.jpg",
      "caption" : "Rings...!!\n\nView Full Screen.",
      "created" : 1398953040,
      "modified" : 1398953354,
      "like_info" : {
        "can_like" : true,
        "like_count" : 22,
        "user_likes" : true
      },
      "comment_info" : {
        "can_comment" : true,
        "comment_count" : 2,
        "comment_order" : "chronological"
      },
      "object_id" : "10203050474118531",
      "pid" : "4749213500839034982"
    }
  ],
  "comment" => [
    {
      "text" : "Wow",
      "text_tags" : [],
      "time" : 1398972853,
      "likes" : 1,
      "fromid" : "100001012753267",
      "object_id" : "10203050474118531"
    },
    {
      "text" : "Woww..",
      "text_tags" : [],
      "time" : 1399059923,
      "likes" : 0,
      "fromid" : "100003167704574",
      "object_id" : "10203050474118531"
    }
  ],
  "users" => [
    {
      "id" : "1105762436",
      "name" : "Nilanjan Joshi",
      "username" : "NilaNJan219"
    },
    {
      "id" : "1105762436",
      "name" : "Ashish Joshi",
      "username" : "NilaNJan219"
    }
  ]
}
Here is my attempt:
datas = File.read('source2.json')
all_data = JSON.parse(datas)
photos = all_data[0]['fql_result_set'].group_by { |x| x['object_id'] }.to_a
comments = all_data[1]['fql_result_set'].group_by { |x| x['object_id'] }.to_a
@photos_comments = []
@comments_users = []
@photo_users = []
photos.each do |a|
  comments.each do |b|
    if a.first == b.first
      @photos_comments << { 'photo' => a.last, 'comment' => b.last }
    else
      @comments_users << { 'photo' => a.last, 'comment' => '' } unless @photos_comments.include?(a.last)
    end
  end
end
@photo_users = @photos_comments | @comments_users
@photo_comment_users = { photos_comments: @photo_users }
Here is the final result I'm getting.
There are still duplicates in the final array. I've grouped the array by object_id, which is common between the photo and comment arrays. But the problem is it only takes photos that have comments; I can't work out how to find the photos that don't have comments.
Also, in order to find the details of the person who commented, I have the users array, and the common attributes between comments and users are fromid and id. I can't work out how to get the user details either.
I think this is what you want:
photos = all_data[0]['fql_result_set']
comments = all_data[1]['fql_result_set'].group_by { |x| x['object_id'] }
@photo_comment_users = photos.map do |p|
  { 'photo' => p, 'comment' => comments[p['object_id']] || '' }
end
For each photo it takes all the comments with the same object_id, or returns '' if none exist.
If you want to connect the users too, you can map them by id, and select the relevant ones by the comment:
users = Hash[all_data[2]['fql_result_set'].map { |x| [x['id'], x] }]
@photo_comment_users = photos.map do |p|
  { 'photo' => p,
    'comment' => comments[p['object_id']] || '',
    'user' => (comments[p['object_id']] || []).map { |c| users[c['fromid']] } }
end
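Note that because comments was built with group_by, comments[p['object_id']] is an array of comment hashes, so 'user' comes back as an array of the matching user hashes, one per comment.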

How do I extract values from nested JSON?

After parsing some JSON:
data = JSON.parse(data)['info']
puts data
I get:
[
  {
    "title" => "CEO",
    "name" => "George",
    "columns" => [
      {
        "display_name" => "Salary",
        "value" => "3.85"
      },
      {
        "display_name" => "Bonus",
        "value" => "994.19"
      },
      {
        "display_name" => "Increment",
        "value" => "8.15"
      }
    ]
  }
]
columns has nested data within it.
I want to save the data in a database or CSV file with the columns:
title, name, value_Salary, value_Bonus, value_Increment
I'm not concerned about getting display_name; I just want the value of the first column, the second column, and so on.
OK, I tried data.map after converting to a hash, and hash.flatten, but couldn't find a way out: .map { |x| x['columns'] }.map { |s| s["value"] }. I tried to get at least the values separately, but couldn't.
This is a simple problem; it resolves down to a couple of nested map blocks.
Here's the data retrieved from JSON, plus an extra row to demonstrate how easy it is to handle a more complex JSON response:
data = [
  {
    "title" => "CEO",
    "name" => "George",
    "columns" => [
      { "display_name" => "Salary",    "value" => "3.85" },
      { "display_name" => "Bonus",     "value" => "994.19" },
      { "display_name" => "Increment", "value" => "8.15" },
    ]
  },
  {
    "title" => "CIO",
    "name" => "Fred",
    "columns" => [
      { "display_name" => "Salary",    "value" => "3.84" },
      { "display_name" => "Bonus",     "value" => "994.20" },
      { "display_name" => "Increment", "value" => "8.15" },
    ]
  }
]
Here's the code:
records = data.map { |record|
  title, name = record.values_at('title', 'name')
  values = record['columns'].map { |column| column['value'] }
  [title, name, *values]
}
Here's the resulting data structure, an array of arrays:
records
# => [["CEO", "George", "3.85", "994.19", "8.15"],
# ["CIO", "Fred", "3.84", "994.20", "8.15"]]
Saving it into a database or CSV is left for you to figure out, but Ruby's CSV class makes it trivial to write a file, and an ORM like Sequel makes it really easy to insert the data into a database.
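For example, the CSV side could be as small as this sketch (the salaries.csv filename is just an illustration):

require 'csv'

# Write the array-of-arrays with a header row matching the desired columns.
CSV.open('salaries.csv', 'w') do |csv|
  csv << %w[title name value_Salary value_Bonus value_Increment]
  records.each { |row| csv << row }
end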
