Querying DynamoDB table by hash and range key - ruby

I want to query a DynamoDB table by hash and range key, using the AWS SDK for Ruby V2. The following code works:
dynamodb = Aws::DynamoDB::Client.new(region: 'somewhere')
dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      attribute_value_list: ['foo'],
      comparison_operator: 'EQ'
    }
  }
)
But I want to match multiple values in the range key condition, like this:
dynamodb = Aws::DynamoDB::Client.new(region: 'somewhere')
dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      attribute_value_list: ['foo', 'bar'],
      comparison_operator: 'EQ'
    }
  }
)
This code raises lib/ruby/gems/2.2.0/gems/aws-sdk-core-2.0.48/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call': One or more parameter values were invalid: Invalid number of argument(s) for the EQ ComparisonOperator (Aws::DynamoDB::Errors::ValidationException).
I've also tried the IN operator:
dynamodb = Aws::DynamoDB::Client.new(region: 'somewhere')
dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      attribute_value_list: ['foo', 'bar'],
      comparison_operator: 'IN'
    }
  }
)
It raises lib/ruby/gems/2.2.0/gems/aws-sdk-core-2.0.48/lib/seahorse/client/plugins/raise_response_errors.rb:15:in `call': Attempted conditional constraint is not an indexable operation (Aws::DynamoDB::Errors::ValidationException).
How do I query DynamoDB table by one hash key and multiple range keys?

The Query operation only allows the following operators on the Range Key:
EQ | LE | LT | GE | GT | BEGINS_WITH | BETWEEN
For a Query operation, Condition is used for specifying the
KeyConditions to use when querying a table or an index. For
KeyConditions, only the following comparison operators are supported:
EQ | LE | LT | GE | GT | BEGINS_WITH | BETWEEN
Source:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html
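For contrast, here is the same query shape with an operator the Range Key does accept; a sketch reusing the question's placeholder names. Note that BETWEEN takes an inclusive lower and upper bound, so this only illustrates the operator list and does not by itself answer the multi-value question:
dynamodb.query(
  table_name: TABLE_NAME,
  key_conditions: {
    HASH_KEY_NAME => {
      attribute_value_list: ['hoge'],
      comparison_operator: 'EQ'
    },
    RANGE_KEY_NAME => {
      # matches every sort key from 'bar' through 'foo', not just those two values
      attribute_value_list: ['bar', 'foo'],
      comparison_operator: 'BETWEEN'
    }
  }
)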
You can still meet the requirement by using a FilterExpression:
filter_expression: "RANGE_KEY_NAME in (:id1, :id2)",
expression_attribute_values: { ":id1" => "foo", ":id2" => "bar" }
However, the consumed provisioned throughput will be based on the items the query reads, rather than on the filtered result set.
Another option would be to send multiple GetItem requests (one per candidate Range Key value) via BatchGetItem; the result contains only the matching records. Using the placeholder names from the question:
resp = dynamodb.batch_get_item(
  request_items: {
    TABLE_NAME => {
      keys: [
        # one key hash per candidate range key value
        { HASH_KEY_NAME => 'hoge', RANGE_KEY_NAME => 'foo' },
        { HASH_KEY_NAME => 'hoge', RANGE_KEY_NAME => 'bar' }
      ],
      consistent_read: true
    }
  },
  return_consumed_capacity: 'TOTAL'
)
Source : http://docs.aws.amazon.com/sdkforruby/api/Aws/DynamoDB/Client.html
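The matching items can then be read from the response, keyed by table name; a minimal sketch with the placeholder names above:
resp.responses[TABLE_NAME].each do |item|
  puts item # each item is a plain hash of attribute names to values
end
# Keys the service could not process this round come back in
# resp.unprocessed_keys and should be retried.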

Related

Laravel 8 json column where with array of objects

I would like to query a table where the column is of JSON type and contains an array of objects, but only if a certain condition is met.
This is my current code:
$initial_results = DB::table('toys')->select('id', 'name')->where(['name' => 'sammy', 'email' => 'whateveremail']);
if ($sk === 'yes') {
    $results = $initial_results->whereRaw('JSON_CONTAINS(`info`, \'{"sku":"B07V3SSLN11"}\')')
        ->whereRaw('JSON_CONTAINS(`info`, \'{"asin":"DTI-LALF3-EA18"}\')')
        ->get();
} else {
    $results = $initial_results->get();
}
But I always get 0 results when the condition is met, even though the info I want to query does exist in the database. What is the proper way to query a JSON column that contains an array of objects? See my example data:
[
  {
    "sku": "DTI-LALF3-EA18",
    "adId": 244077676726655,
    "asin": "B07V3SSLN11",
    "cost": 0
  },
  {
    "sku": "DTI-LALF3-EA18",
    "adId": 242968940906362,
    "asin": "B07V3SSLN11",
    "cost": 10,
    .........
I even tried
$initial_results = DB::table('toys')->select('id', 'name')
    ->where(['name' => 'sammy', 'email' => 'whateveremail'])
    ->whereIn(DB::raw("JSON_EXTRACT(info, '$[*].asin')"), ['B07V3SSLN11']);
Thanks in advance
You can query JSON columns by using the -> operator in your where clause:
->where('info->asin', 'DTI-LALF3-EA18')
JSON Where Clauses Docs
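Since info here holds an array of objects rather than a single object, a sketch using whereJsonContains may be a better fit (column and values are taken from the question's sample data; this is a hedged sketch, not tested against the question's schema, and requires a JSON-capable database such as MySQL 5.7+):
$results = DB::table('toys')
    ->select('id', 'name')
    ->where(['name' => 'sammy', 'email' => 'whateveremail'])
    // matches rows whose info array contains an object with this asin
    ->whereJsonContains('info', ['asin' => 'B07V3SSLN11'])
    ->get();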

Dynamo DB Query using begins_with in ruby

Below is my DynamoDB data.
Table Name : Inventory
PartitionKey: Name (string)
SortKey : ID (String)
Below is sample data:
Name    ID
Fruits  Mango
Fruits  Mango-Green
Fruits  Mango-Green-10
Fruits  Mango-Green-20
Fruits  Apple
Fruits  Apple-Red
Veggie  Onion
Veggie  Onion-White
Veggie  Onion-White-10
How can I change the search in the code below so it returns all the rows whose ID begins_with "Mango-Green"? I can't modify the keys or the table data now.
table_name: 'Inventory',
key_condition_expression: "#Name = :Name AND #ID = :ID",
select: "ALL_ATTRIBUTES",
expression_attribute_names: {
  "#ID" => "ID",
  "#product" => "product"
},
expression_attribute_values: {
  ":ID" => ID,
  ":Name" => 'Fruits'
}
More info about the Ruby interface is linked in the comments in the code below.
require 'dotenv/load'
require 'aws-sdk-dynamodb'

# https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/DynamoDB.html
client_dynamodb = Aws::DynamoDB::Client.new(
  region: ENV['AWS_REGION'],
  credentials: Aws::Credentials.new(ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY'])
)

# https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/DynamoDB/Client.html#query-instance_method
# A query must pin the partition key with an equality test; begins_with can
# then narrow the sort key. "Name" is a DynamoDB reserved word, so both
# attributes are aliased via expression_attribute_names.
fruits = client_dynamodb.query({
  table_name: "Inventory",
  projection_expression: "#name, #id",
  key_condition_expression: "#name = :name AND begins_with(#id, :prefix)",
  expression_attribute_names: { "#name" => "Name", "#id" => "ID" },
  expression_attribute_values: { ":name" => "Fruits", ":prefix" => "Mango-Green" },
  scan_index_forward: false # true => asc; false => desc
}).items
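A single Query call returns at most 1 MB of data; if more rows may match, the client has to follow last_evaluated_key. A minimal sketch with the same client and names as above:
params = {
  table_name: "Inventory",
  key_condition_expression: "#name = :name AND begins_with(#id, :prefix)",
  expression_attribute_names: { "#name" => "Name", "#id" => "ID" },
  expression_attribute_values: { ":name" => "Fruits", ":prefix" => "Mango-Green" }
}
items = []
loop do
  resp = client_dynamodb.query(params)
  items.concat(resp.items)
  break unless resp.last_evaluated_key
  params[:exclusive_start_key] = resp.last_evaluated_key # resume at the next page
end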

How to display unique count with latest value from Elasticsearch in Kibana

I'm new to ELK. I have created an index named "ordersatus" which stores the statuses published by our logistic partner.
Whenever the logistic partner updates an order's status, the new status is pushed into Elasticsearch.
So every order has multiple entries, with order statuses like "ORDER CONFIRM", "APPOINTMENT SCHEDULED", "OUT FOR DELIVERY", etc.
The problem arises when I need to see how many orders are in each status.
The total order count is 2, but for order status I get a total count of 4, because older values are counted too, as you can see in the attached screenshots.
I want to display every unique order status along with the count of orders currently in that status, i.e.:
ORDER STATUS          | TOTAL COUNT
APPOINTMENT_CONFIRMED | 1
ASSIGN_FOR_DELIVERY   | 1
As of now it displays the order status "CONFIRMED" with count 2, which is the older value of these 2 orders.
[Screenshot 1 and Screenshot 2 omitted]
You want the distinct statuses and their counts in the table from the second screenshot? You have two selections in the buckets, timestamp and status. Remove the timestamp bucket, since it splits the result into every unique timestamp; what you want is to bucket only by status.
This was shown on a different dataset, but you get the idea (example screenshot omitted).
As a workaround, you can add the snippet below to the Logstash config to maintain a weight field per status. When the status transitions from a previous status, it adds -1 to the previous status's weight and 1 to the new status's weight.
if [status] == "ORDER CONFIRM" {
  mutate { add_field => { "order_confirm_weight" => "1" } }
  mutate { convert => { "order_confirm_weight" => "integer" } }
} else if [status] == "APPOINTMENT SCHEDULED" {
  mutate { add_field => { "order_confirm_weight" => "-1" } }
  mutate { convert => { "order_confirm_weight" => "integer" } }
  mutate { add_field => { "appointment_scheduled_weight" => "1" } }
  mutate { convert => { "appointment_scheduled_weight" => "integer" } }
} else if [status] == "OUT FOR DELIVERY" {
  mutate { add_field => { "appointment_scheduled_weight" => "-1" } }
  mutate { convert => { "appointment_scheduled_weight" => "integer" } }
  mutate { add_field => { "out_for_delivery_weight" => "1" } }
  mutate { convert => { "out_for_delivery_weight" => "integer" } }
}
Once the configuration is added to Logstash, you can get the count of each status in a Kibana visualization using the following aggregating functions:
ORDER CONFIRM Count = Sum (order_confirm_weight)
APPOINTMENT SCHEDULED Count = Sum (appointment_scheduled_weight)
OUT FOR DELIVERY Count = Sum (out_for_delivery_weight)

Add/modify text between parentheses

I'm trying to make a classified text, and I'm having trouble turning
(class1 (subclass1) (subclass2 item1 item2))
into
(class1 (subclass1 item1) (subclass2 item1 item2))
I have no idea how to turn the text above into the one below without caching subclass1 in memory. I'm using Perl on Linux, so any solution using shell script or Perl is welcome.
Edit: I've tried using grep, saving the whole subclass1 in a variable, then modifying it and appending it to the list; but the list may get larger, and that approach would use a lot of memory.
I have no idea how to turn the text above into the one below
The general approach:
Parse the text.
You appear to have lists of space-separated lists and atoms. If so, the result could look like the following:
{
  type => 'list',
  value => [
    { type => 'atom', value => 'class1' },
    {
      type => 'list',
      value => [
        { type => 'atom', value => 'subclass1' },
      ],
    },
    {
      type => 'list',
      value => [
        { type => 'atom', value => 'subclass2' },
        { type => 'atom', value => 'item1' },
        { type => 'atom', value => 'item2' },
      ],
    },
  ],
}
It's possible that something far simpler could be generated, but you were light on details about the format.
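As a sketch of the parsing step, here is a minimal recursive parser producing the structure above. It assumes atoms contain no parentheses or whitespace; the parse helper is illustrative and not from the original answer:
use strict;
use warnings;

sub parse {
    my ($text) = @_;
    # tokenize into parens and atoms
    my @tokens = $text =~ /\(|\)|[^()\s]+/g;
    die "expected (" unless @tokens && shift(@tokens) eq '(';
    my $walk;
    $walk = sub {
        my @value;
        while (@tokens) {
            my $tok = shift @tokens;
            return { type => 'list', value => \@value } if $tok eq ')';
            if ($tok eq '(') {
                push @value, $walk->();          # nested list
            } else {
                push @value, { type => 'atom', value => $tok };
            }
        }
        die "unbalanced parentheses";
    };
    return $walk->();
}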
Extract the necessary information from the tree.
You were light on details about the data format, but it could be as simple as the following if the above data structure was created by the parser:
my $item = $tree->{value}[2]{value}[1]{value};
Perform the required modifications.
Again, it could be as simple as the following, given the data structure above:
my $new_atom = { type => 'atom', value => $item };
push @{ $tree->{value}[1]{value} }, $new_atom;
Serialize the data structure.
For the above data structure, you could use the following:
sub serialize {
    my ($node) = @_;
    return $node->{type} eq 'list'
        ? "(" . join(" ", map { serialize($_) } @{ $node->{value} }) . ")"
        : $node->{value};
}
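Putting the four steps together on the question's input (using the parse sketch above):
my $tree = parse('(class1 (subclass1) (subclass2 item1 item2))');
my $item = $tree->{value}[2]{value}[1]{value};          # 'item1'
push @{ $tree->{value}[1]{value} }, { type => 'atom', value => $item };
print serialize($tree), "\n";   # (class1 (subclass1 item1) (subclass2 item1 item2))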
Other approaches could be available depending on the specifics.

Multiple limit condition in mongodb

I have a collection in which one of the fields is "type". I want to get some values of each type, subject to a condition that is the same for all types: for example, 2 documents for type A, 2 for type B, and so on.
How can I do this in a single query? I am using Ruby Active Record.
Generally, what you are describing is a relatively common question around the MongoDB community, which we could call the "top n results problem": given some input that is likely sorted in some way, how do you get the top n results per group without relying on arbitrary index values in the data?
MongoDB has the $first operator which is available to the aggregation framework which deals with the "top 1" part of the problem, as this actually takes the "first" item found on a grouping boundary, such as your "type". But getting more than "one" result of course gets a little more involved. There are some JIRA issues on this about modifying other operators to deal with n results or "restrict" or "slice". Notably SERVER-6074. But the problem can be handled in a few ways.
Popular implementations of the rails Active Record pattern for MongoDB storage are Mongoid and Mongo Mapper, both allow access to the "native" mongodb collection functions via a .collection accessor. This is what you basically need to be able to use native methods such as .aggregate() which supports more functionality than general Active Record aggregation.
Here is an aggregation approach with Mongoid, though the general code does not change once you have access to the native collection object:
require "mongoid"
require "pp";
Mongoid.configure.connect_to("test");
class Item
include Mongoid::Document
store_in collection: "item"
field :type, type: String
field :pos, type: String
end
Item.collection.drop
Item.collection.insert( :type => "A", :pos => "First" )
Item.collection.insert( :type => "A", :pos => "Second" )
Item.collection.insert( :type => "A", :pos => "Third" )
Item.collection.insert( :type => "A", :pos => "Forth" )
Item.collection.insert( :type => "B", :pos => "First" )
Item.collection.insert( :type => "B", :pos => "Second" )
Item.collection.insert( :type => "B", :pos => "Third" )
Item.collection.insert( :type => "B", :pos => "Forth" )
res = Item.collection.aggregate([
  { "$group" => {
    "_id" => "$type",
    "docs" => {
      "$push" => { "pos" => "$pos", "type" => "$type" }
    },
    "one" => {
      "$first" => { "pos" => "$pos", "type" => "$type" }
    }
  }},
  { "$unwind" => "$docs" },
  { "$project" => {
    "docs" => {
      "pos" => "$docs.pos",
      "type" => "$docs.type",
      "seen" => { "$eq" => [ "$one", "$docs" ] }
    },
    "one" => 1
  }},
  { "$match" => { "docs.seen" => false }},
  { "$group" => {
    "_id" => "$_id",
    "one" => { "$first" => "$one" },
    "two" => {
      "$first" => { "pos" => "$docs.pos", "type" => "$docs.type" }
    },
    "splitter" => {
      "$first" => { "$literal" => ["one", "two"] }
    }
  }},
  { "$unwind" => "$splitter" },
  { "$project" => {
    "_id" => 0,
    "type" => {
      "$cond" => [
        { "$eq" => [ "$splitter", "one" ] },
        "$one.type",
        "$two.type"
      ]
    },
    "pos" => {
      "$cond" => [
        { "$eq" => [ "$splitter", "one" ] },
        "$one.pos",
        "$two.pos"
      ]
    }
  }}
])

pp res
The naming in the documents is not actually used by the code; the titles "First", "Second", etc. in the data are just there to illustrate that you are indeed getting the "top 2" documents from the listing as a result.
So the approach here is essentially to create a "stack" of the documents "grouped" by your key, such as "type". The very first thing here is to take the "first" document from that stack using the $first operator.
The subsequent steps match the "seen" elements from the stack and filter them out, then take the "next" document off the stack, again using the $first operator. The final steps are really just to return the documents to the original form as found in the input, which is generally what is expected from such a query.
So the result is of course, just the top 2 documents for each type:
{ "type"=>"A", "pos"=>"First" }
{ "type"=>"A", "pos"=>"Second" }
{ "type"=>"B", "pos"=>"First" }
{ "type"=>"B", "pos"=>"Second" }
There was a longer discussion and version of this as well as other solutions in this recent answer:
Mongodb aggregation $group, restrict length of array
Essentially the same thing despite the title; that case was looking to match up to 10 top entries or more. There is some pipeline generation code there for dealing with larger matches, as well as some alternate approaches that may be considered depending on your data.
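On MongoDB 3.2 or later, the stack juggling above collapses considerably, since $slice became available as an aggregation expression. A hedged sketch with the same Item model as above (not part of the original answer):
res = Item.collection.aggregate([
  { "$group" => {
    "_id" => "$type",
    "docs" => { "$push" => { "pos" => "$pos", "type" => "$type" } }
  }},
  # keep only the first two stacked documents per type
  { "$project" => { "docs" => { "$slice" => [ "$docs", 2 ] } } },
  { "$unwind" => "$docs" }
])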
You will not be able to do this directly with only the type field and the constraint that it must be one query. However, there is (as always) a way to accomplish it.
To find documents of different types, you would need some additional value that, on average, distributes the types according to how you want the data back.
db.users.insert({type: 'A', index: 1})
db.users.insert({type: 'B', index: 2})
db.users.insert({type: 'A', index: 3})
db.users.insert({type: 'B', index: 4})
db.users.insert({type: 'A', index: 5})
db.users.insert({type: 'B', index: 6})
Then, when querying for items with db.users.find({index: {$gt: 2, $lt: 7}}), you will get the right distribution of items.
Though I'm not sure this is what you were looking for.
