I can't seem to wrap my head around the AWS Ruby SDK documentation for DynamoDB (or more specifically the concepts of the DynamoDB data model).
Specifically I've been reading: http://docs.aws.amazon.com/AWSRubySDK/latest/frames.html#!AWS/DynamoDB.html
Note: I have read through the Data Model documentation as well and it's still not sinking in; I'm hoping a proper example in Ruby will clear up my confusion.
In the following code snippet, I create a table called "my_books" which has a primary_key called "item_id" and it's a Hash key (not a Hash/Range combination)...
dyn = AWS::DynamoDB::Client::V20120810.new
# => #<AWS::DynamoDB::Client::V20120810>
dyn.create_table({
:attribute_definitions => [
{ :attribute_name => "item_id", :attribute_type => "N" }
],
:table_name => "my_books",
:key_schema => [
{ :attribute_name => "item_id", :key_type => "HASH" },
],
:provisioned_throughput => {
:read_capacity_units => 10,
:write_capacity_units => 10
}
})
# => {:table_description=>{:attribute_definitions=>[{:attribute_name=>"item_id", :attribute_type=>"N"}], :table_name=>"my_books", :key_schema=>[{:attribute_name=>"item_id", :key_type=>"HASH"}], :table_status=>"ACTIVE", :creation_date_time=>2014-11-24 16:59:47 +0000, :provisioned_throughput=>{:number_of_decreases_today=>0, :read_capacity_units=>10, :write_capacity_units=>10}, :table_size_bytes=>0, :item_count=>0}}
dyn.list_tables
# => {:table_names=>["my_books"]}
dyn.scan :table_name => "my_books"
# => {:member=>[], :count=>0, :scanned_count=>0}
I then try to populate the table with a new item. My understanding is that I should specify the numerical value for item_id (which is the primary key) and then I can specify other attributes for the new item/record/document I'm adding to the table...
dyn.put_item(
:table_name => "my_books",
:item => {
"item_id" => 1,
"item_title" => "My Book Title",
"item_released" => false
}
)
But that last command returns the following error:
expected hash value for value at key item_id of option item
So although I don't quite understand what the hash will be made of, I try doing that:
dyn.put_item(
:table_name => "my_books",
:item => {
"item_id" => { "N" => 1 },
"item_title" => "My Book Title",
"item_released" => false
}
)
But this now returns the following error...
expected string value for key N of value at key item_id of option item
I've tried different variations, but can't seem to figure out how this works.
EDIT/UPDATE: as suggested by Uri Agassi - I changed the value from 1 to "1". I'm not really sure why this has to be quoted as I've defined the type to be a number and not a string, but OK let's just accept this and move on.
I've finally figured out most of what I needed to understand the data model of DynamoDB and using the Ruby SDK.
Below is my example code, which hopefully will help someone else, and I've got a fully fleshed out example here: https://gist.github.com/Integralist/9f9f2215e001b15ac492#file-3-dynamodb-irb-session-rb
# https://github.com/BBC-News/alephant-harness can automate the below set-up when using Spurious
# API Documentation http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Operations.html
# Ruby SDK API Documentation http://docs.aws.amazon.com/AWSRubySDK/latest/frames.html#!AWS/DynamoDB/Client/V20120810.html
require "aws-sdk"
require "dotenv"
require "spurious/ruby/awssdk/helper"
Spurious::Ruby::Awssdk::Helper.configure
# => <AWS::Core::Configuration>
Dotenv.load(
File.join(
File.dirname(__FILE__), "config", "development", "env.yaml"
)
)
# => {"AWS_REGION"=>"eu-west-1", "AWS_ACCESS_KEY_ID"=>"development_access", "AWS_SECRET_ACCESS_KEY"=>"development_secret", "DYNAMO_LU"=>"development_lookup", "DYNAMO_SQ"=>"development_sequence", "SQS_QUEUE"=>"development_queue", "S3_BUCKET"=>"development_bucket"}
# Either of these returns a client for the 2012-08-10 API version
dyn = AWS::DynamoDB::Client.new :api_version => "2012-08-10"
dyn = AWS::DynamoDB::Client::V20120810.new
# => #<AWS::DynamoDB::Client::V20120810>
dyn.create_table({
# This section requires us to define our primary key
# Which will be called "item_id" and it must be a numerical value
:attribute_definitions => [
{ :attribute_name => "item_id", :attribute_type => "N" }
],
:table_name => "my_books",
# The primary key will be a simple Hash key (not a Hash/Range which requires both key types to be provided)
# The attributes defined above must be included in the :key_schema Array
:key_schema => [
{ :attribute_name => "item_id", :key_type => "HASH" }
],
:provisioned_throughput => {
:read_capacity_units => 10,
:write_capacity_units => 10
}
})
# => {:table_description=>{:attribute_definitions=>[{:attribute_name=>"item_id", :attribute_type=>"N"}], :table_name=>"my_books", :key_schema=>[{:attribute_name=>"item_id", :key_type=>"HASH"}], :table_status=>"ACTIVE", :creation_date_time=>2014-11-24 16:59:47 +0000, :provisioned_throughput=>{:number_of_decreases_today=>0, :read_capacity_units=>10, :write_capacity_units=>10}, :table_size_bytes=>0, :item_count=>0}}
dyn.list_tables
# => {:table_names=>["my_books"]}
dyn.scan :table_name => "my_books"
# => {:member=>[], :count=>0, :scanned_count=>0}
dyn.put_item(
:table_name => "my_books",
:item => {
"item_id" => { "N" => "1" }, # numbers are passed as Strings, even though the type is "N"
"item_title" => { "S" => "My Book Title"},
"item_released" => { "B" => "false" } # "B" is the Binary type, so this is stored as binary data, not as a boolean
}
)
# Note: if you use an "item_id" that already exists, the existing item is replaced,
# unless you use the "expected" conditional feature
dyn.put_item(
:table_name => "my_books",
:item => {
"item_id" => { "N" => "1" }, # numbers are passed as Strings, even though the type is "N"
"item_title" => { "S" => "My Book Title"},
"item_released" => { "B" => "false" }
},
# The :expected key specifies the conditions of our "put" operation.
# If "item_id" isn't NULL (i.e. it exists) then our condition has failed.
# This means we only write the value when the key "item_id" hasn't been set.
:expected => {
"item_id" => { :comparison_operator => "NULL" }
}
)
# AWS::DynamoDB::Errors::ConditionalCheckFailedException: The conditional check failed
dyn.scan :table_name => "my_books"
# => {:member=>[{"item_id"=>{:n=>"1"}, "item_title"=>{:s=>"My Book Title"}, "item_released"=>{:b=>"false"}}], :count=>1, :scanned_count=>1}
dyn.query :table_name => "my_books", :consistent_read => true, :key_conditions => {
"item_id" => {
:comparison_operator => "EQ",
:attribute_value_list => [{ "n" => "1" }]
},
"item_title" => {
:comparison_operator => "EQ",
:attribute_value_list => [{ "s" => "My Book Title" }]
}
}
# => {:member=>[{"item_id"=>{:n=>"1"}, "item_title"=>{:s=>"My Book Title"}, "item_released"=>{:b=>"false"}}], :count=>1, :scanned_count=>1}
dyn.query :table_name => "my_books",
:consistent_read => true,
:select => "SPECIFIC_ATTRIBUTES",
:attributes_to_get => ["item_title"],
:key_conditions => {
"item_id" => {
:comparison_operator => "EQ",
:attribute_value_list => [{ "n" => "1" }]
},
"item_title" => {
:comparison_operator => "EQ",
:attribute_value_list => [{ "s" => "My Book Title" }]
}
}
# => {:member=>[{"item_title"=>{:s=>"My Book Title"}}], :count=>1, :scanned_count=>1}
dyn.delete_item(
:table_name => "my_books",
:key => {
"item_id" => { "n" => "1" }
}
)
# => {:member=>[], :count=>0, :scanned_count=>0}
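The typed attribute format used in the put_item calls above can be produced by a small helper. This is only a minimal sketch: to_attribute_values is a hypothetical name, not part of the AWS SDK, and it handles only numbers and strings. It shows why "1" ends up quoted: DynamoDB numbers are sent as strings and the "N" key carries the type.

```ruby
# Hypothetical helper (not part of the AWS SDK): wraps plain Ruby values in
# the typed attribute-value hashes that the low-level client expects.
def to_attribute_values(item)
  item.each_with_object({}) do |(name, value), typed|
    typed[name] =
      case value
      when Numeric then { "N" => value.to_s } # numbers are sent as strings
      when String  then { "S" => value }
      else
        raise ArgumentError, "unsupported type for #{name}: #{value.class}"
      end
  end
end

item = to_attribute_values("item_id" => 1, "item_title" => "My Book Title")
p item
```

The result could then be passed as the :item option of put_item, assuming the same client as above.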
I have a hash that I use in Watir automation; each value is true or false depending on whether an element is present in the UI. Rather than returning the entire hash, can I return only the entries that evaluate to false?
elements = {
"Title" => @b.title == 'Details',
"Name" => @b.div(:class => 'rpt-name').present?,
"Address" => @b.div(:class => 'rpt-address').present?,
"Stats" => @b.div(:class => 'rpt-stats-container-top').present?,
"Employee Information" => @b.div(:class => 'rpt-stats-container').present?,
"Reports" => @b.div(:class => 'rpt-ribbon-container').present?
}
if elements.values.include? false
puts "ERROR: Page Validation Failed. #{elements.inspect}"
valid = false
else
valid = true
end
valid
You can use select to return all the entries whose values are false:
elements.select {|_key, value| !value }
Your code would then look like:
elements = {
"Title" => @b.title == 'Details',
"Name" => @b.div(:class => 'rpt-name').present?,
"Address" => @b.div(:class => 'rpt-address').present?,
"Stats" => @b.div(:class => 'rpt-stats-container-top').present?,
"Employee Information" => @b.div(:class => 'rpt-stats-container').present?,
"Reports" => @b.div(:class => 'rpt-ribbon-container').present?
}.select { |_key, value| !value }
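As a rough sketch of how the filtered hash could also drive the valid flag and the error message from the question, here is the same idea with plain booleans standing in for the Watir checks (an assumption made so the example runs without a browser). reject with a truthy test is equivalent to the select shown above.

```ruby
# Plain booleans stand in for the Watir checks (assumption for illustration).
elements = {
  "Title"   => true,
  "Name"    => false,
  "Address" => true
}

# Keep only the entries whose value is false or nil.
failed = elements.reject { |_key, value| value }
valid  = failed.empty?

puts "ERROR: Page Validation Failed. Missing: #{failed.keys.join(', ')}" unless valid
```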
I'm new to Logstash and I'm trying to sum the values of two columns from my database to generate a new metric.
I've exhausted all my alternatives.
This is my conf file. The new variable that I'm creating is 'tot_ped'.
I created another variable, 'valor', to try to understand what is happening with the value of 'TOT_PROD'.
The two columns that I'm trying to sum are 'TOT_PROD' and 'TOT_SERV'.
input {
jdbc {
jdbc_driver_library => "jtds-1.3.1.jar"
jdbc_driver_class => "Java::net.sourceforge.jtds.jdbc.Driver"
jdbc_connection_string => "jdbc:jtds:sqlserver://xxxxxx:1433/dbPHXPSS"
jdbc_user => "readonly"
jdbc_password => "xxxxx"
statement => "SELECT [NVENDA]
,[CPROJETO]
,[TECNOLOGIA]
,[PREVISAO]
,[APROVACAO]
,[STATUS]
,[CLIENTE]
,[TITULO]
,[TOT_PROD]
,[TOT_SERV]
,[QTD_H_FE]
,[QTD_H_SE]
,[QTD_H_PM]
,[QTD_H_DES]
,[TOT_DESPESA]
,[VENDEDOR]
,[TIPO_SOLICITACAO]
,[TECNOLOGIAPROJ]
FROM [dbPHXPSS].[dbo].[VW_PROVISAOPROJETOS]
where QTD_H_PM IS NOT NULL"
}
}
filter {
ruby {
code =>"
hash = event.to_hash
hash.each do |k,v|
if v == nil
event.set(k,'0')
end
if k == 'TOT_PROD'
event.set(teste, v)
end
end
# testing the content of the variable 'TOT_PROD'
event.set('valor', event.get('teste'))
"
}
mutate {
convert => ["TOT_PROD","float_eu"]
}
ruby {
code =>"
# adding the values to 'tot_ped'
event.set('tot_ped', (event.get('TOT_PROD').to_f + event.get('TOT_SERV').to_f ))
"
}
}
output {
elasticsearch {
hosts => "localhost"
index => "phoenix"
document_type => "phxdb"
}
stdout {}
}
This is the output from Logstash.
What I've noticed is that the variable 'tot_ped' is not adding the values, and the test variable 'valor' shows the value of 'TOT_PROD' as nil.
{
"@version" => "1",
"cliente" => "OAB SP ",
"vendedor" => "Sxxxxxxxx",
"status" => "MEDIA",
"tipo_solicitacao" => "0",
"aprovacao" => "0",
"tot_serv" => 0.0,
"cprojeto" => "0",
"qtd_h_se" => 132.0,
"qtd_h_pm" => 24.0,
"valor" => nil,
"tot_ped" => 0.0,
"qtd_h_des" => "0",
"@timestamp" => 2019-03-15T13:42:15.243Z,
"tot_prod" => 134133.7195,
"tot_despesa" => "0",
"titulo" => "Projeto Wifi",
"previsao" => "0",
"tecnologia" => "VSF",
"tecnologiaproj" => "0",
"qtd_h_fe" => 56.0,
"nvenda" => 20361.0
}
Thank you in advance.
I fixed it!
The problem was that I was using uppercase field names. The event fields are lowercase (as the output above shows), so I changed the names to lowercase and it worked!
ruby {
code =>"
event.set('tot_ped', ((event.get('tot_prod').to_f * 2.25) + event.get('tot_serv').to_f ))
"
}
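To illustrate why the uppercase names produced 0.0 rather than an error, here is a minimal sketch in plain Ruby. A Hash stands in for the Logstash event (an assumption, since the real event object is only available inside the pipeline): a lookup with the wrong case yields nil, and nil.to_f is 0.0, so the sum silently becomes 0.0.

```ruby
# A plain Hash stands in for the Logstash event (assumption for illustration);
# the keys are lowercase, as in the output shown earlier.
row = { "tot_prod" => 134133.7195, "tot_serv" => 0.0 }

wrong = row["TOT_PROD"].to_f + row["TOT_SERV"].to_f # both lookups return nil
right = row["tot_prod"].to_f + row["tot_serv"].to_f # both lookups find a value

puts wrong
puts right
```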
I have a hash which looks like this:
hash = {
'key1' => ['value'],
'key2' => {
'sub1' => ['string'],
'sub2' => ['string'],
},
'shippingInfo' => {
'shippingType' => ['Calculated'],
'shipToLocations' => ['Worldwide'],
'expeditedShipping' => ['false'],
'oneDayShippingAvailable' => ['false'],
'handlingTime' => ['3'],
}
}
I need to convert each value which is a single string inside an array so that it ends up like this:
hash = {
'key1' => 'value' ,
'key2' => {
'sub1' => 'string' ,
'sub2' => 'string' ,
},
'shippingInfo' => {
'shippingType' => 'Calculated' ,
'shipToLocations' => 'Worldwide' ,
'expeditedShipping' => 'false' ,
'oneDayShippingAvailable' => 'false' ,
'handlingTime' => '3' ,
}
}
I found this but couldn't get it to work:
https://gist.github.com/chris/b4138603a8fe17e073c6bc073eb17785
What about something like:
def deep_transform_values(hash)
return hash unless hash.is_a?(Hash)
hash.transform_values do |val|
if val.is_a?(Array) && val.length == 1
val.first
else
deep_transform_values(val)
end
end
end
Tested with something like:
hash = {
'key1' => ['value'],
'key2' => {
'sub1' => ['string'],
'sub2' => ['string'],
},
'shippingInfo' => {
'shippingType' => ['Calculated'],
'shipToLocations' => ['Worldwide'],
'expeditedShipping' => ['false'],
'oneDayShippingAvailable' => ['false'],
'handlingTime' => ['3'],
'an_integer' => 1,
'an_empty_array' => [],
'an_array_with_more_than_one_elements' => [1,2],
'a_symbol' => :symbol,
'a_string' => 'string'
}
}
Gives:
{
"key1"=>"value",
"key2"=>{
"sub1"=>"string",
"sub2"=>"string"
},
"shippingInfo"=> {
"shippingType"=>"Calculated",
"shipToLocations"=>"Worldwide",
"expeditedShipping"=>"false",
"oneDayShippingAvailable"=>"false",
"handlingTime"=>"3",
"an_integer"=>1,
"an_empty_array"=>[],
"an_array_with_more_than_one_elements"=>[1, 2],
"a_symbol"=>:symbol,
"a_string"=>"string"
}
}
Following your question in the comments, I guess the logic would change a bit:
class Hash
def deep_transform_values
self.transform_values do |val|
next(val.first) if val.is_a?(Array) && val.length == 1
next(val) unless val.respond_to?(:deep_transform_values)
val.deep_transform_values
end
end
end
hash = {
'key1' => ['value'],
'key2' => {
'sub1' => ['string'],
'sub2' => ['string'],
},
'shippingInfo' => {
'shippingType' => ['Calculated'],
'shipToLocations' => ['Worldwide', 'Web'],
'expeditedShipping' => ['false'],
'oneDayShippingAvailable' => ['false'],
'handlingTime' => ['3'],
}
}
def recurse(hash)
hash.transform_values do |v|
case v
when Array
v.size == 1 ? v.first : v
when Hash
recurse v
else
# raise exception
end
end
end
recurse hash
#=> {"key1"=>"value",
# "key2"=>{
# "sub1"=>"string",
# "sub2"=>"string"
# },
# "shippingInfo"=>{
# "shippingType"=>"Calculated",
# "shipToLocations"=>["Worldwide", "Web"],
# "expeditedShipping"=>"false",
# "oneDayShippingAvailable"=>"false",
# "handlingTime"=>"3"
# }
# }
As an alternative, consider using an object and allowing the initializer to deconstruct some of the keys for you.
One of the reasons a lot of people like myself started using Ruby in favour of Perl was because of the better expression of objects in place of primitives like arrays and hashes. Use it to your advantage!
class ShippingStuff # You've kept the data vague
def initialize key1:, key2:, shippingInfo:
@blk = -> val {
val.respond_to?(:push) && val.size == 1 ?
val.first :
cleankeys(val)
}
@key1 = cleankeys key1
@key2 = cleankeys key2
@shippingInfo = shippingInfo
end
attr_reader :key1, :key2, :shippingInfo
# basically a cut down version of what
# Sebastian Palma answered with
def cleankeys data
if data.respond_to? :transform_values
data.transform_values &@blk
else
@blk.call(data)
end
end
end
hash = {
'key1' => ['value'],
'key2' => {
'sub1' => ['string'],
'sub2' => ['string'],
},
'shippingInfo' => {
'shippingType' => ['Calculated'],
'shipToLocations' => ['Worldwide'],
'expeditedShipping' => ['false'],
'oneDayShippingAvailable' => ['false'],
'handlingTime' => ['3'],
}
}
# The double splat passes the hash as keyword arguments (required on Ruby 3+)
shipper = ShippingStuff.new(**hash.transform_keys!(&:to_sym))
shipper.key1
# "value"
shipper.key2
# {"sub1"=>"string", "sub2"=>"string"}
shipper.shippingInfo
# {"shippingType"=>["Calculated"], "shipToLocations"=>["Worldwide"], "expeditedShipping"=>["false"], "oneDayShippingAvailable"=>["false"], "handlingTime"=>["3"]}
In the same vein, I'd even make an Info class for the shippingInfo data.
You may run into a different problem if key1 and key2 are dynamic, but there are ways around that too (double splat for one).
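The double splat idea could look roughly like the following. This is a sketch under assumptions, not a drop-in replacement: FlexibleShipping, extras and unwrap are hypothetical names, and only shippingInfo is treated as a known key while everything else is collected by **rest.

```ruby
# Hypothetical variant: only shippingInfo is named; any other keys are
# collected by the double splat (**rest) instead of being listed one by one.
class FlexibleShipping
  attr_reader :shipping_info, :extras

  def initialize(shippingInfo:, **rest)
    @shipping_info = shippingInfo
    # Unwrap single-element arrays in the remaining (dynamic) keys.
    @extras = rest.transform_values { |val| unwrap(val) }
  end

  private

  # Single-element arrays become their element; hashes are walked recursively.
  def unwrap(val)
    case val
    when Array then val.size == 1 ? val.first : val
    when Hash  then val.transform_values { |inner| unwrap(inner) }
    else val
    end
  end
end

data = {
  'key1' => ['value'],
  'anotherKey' => { 'sub1' => ['string'] },
  'shippingInfo' => { 'handlingTime' => ['3'] }
}

shipper = FlexibleShipping.new(**data.transform_keys(&:to_sym))
p shipper.extras
```

As in the class above, shippingInfo is stored untouched here; it could be handed to its own Info class instead.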
I noticed a lot of answers with unnecessary recursion. Current version of Ruby 2.7.x with ActiveSupport (I tested with 6.1.4.4) will allow you to do this:
Input data:
hash = {
'key1' => ['value'],
'key2' => {
'sub1' => ['string'],
'sub2' => ['string']},
'shippingInfo' => {
'shippingType' => ['Calculated'],
'shipToLocations' => ['Worldwide', 'Web'],
'expeditedShipping' => ['false'],
'oneDayShippingAvailable' => ['false'],
'handlingTime' => ['3']}}
Solution:
hash.deep_transform_values do |value|
# whatever you need to do to any nested value, like:
if value == value.to_i.to_s
value.to_i
else
value
end
end
The example above converts any String value that holds an integer (such as '3') to an Integer and leaves other string values unchanged.
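For readers without ActiveSupport, the behaviour can be approximated in plain Ruby. The sketch below uses a hypothetical helper, deep_map_values, and assumes (as I understand ActiveSupport's implementation) that arrays are descended into element by element, so the block only ever sees leaf values.

```ruby
# Hypothetical plain-Ruby stand-in for a deep value transform: hashes and
# arrays are walked recursively and the block is applied to each leaf value.
def deep_map_values(object, &block)
  case object
  when Hash  then object.transform_values { |v| deep_map_values(v, &block) }
  when Array then object.map { |v| deep_map_values(v, &block) }
  else block.call(object)
  end
end

hash = {
  'key1' => ['value'],
  'shippingInfo' => {
    'shipToLocations' => ['Worldwide', 'Web'],
    'handlingTime' => ['3']
  }
}

result = deep_map_values(hash) do |value|
  value == value.to_i.to_s ? value.to_i : value
end
p result
```

Note that under this assumption the single-element arrays stay arrays ('handlingTime' becomes [3], not 3), so unwrapping them would still need a step like the ones in the other answers.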
I currently have items in the following structure:
[{
"category" => ["Alcoholic Beverages", "Wine", "Red Wine"],
"name" => "Robertson Merlot",
"barcode" => '123456789-000',
"wine_farm" => "Robertson Wineries",
"price" => 60.00
}]
I have made up this data, but the data I am using is in the same structure and I cannot change the data coming in.
I have > 100 000 of these.
Each product is nested between 1 and n (unlimited) categories.
Because of the tabular nature of this data, the categories are repeated. I want to use tree data to prevent this repetition and cut down the file size by 25 to 30%.
I am aiming at a tree structure something like this:
{
"type" => "category",
"properties" => {
"name" => "Alcoholic Beverages"
},
"children" => [{
"type" => "category",
"properties" => {
"name" => "Wine"
},
"children" => [{
"type" => "category",
"properties" => {
"name" => "Red Wine"
},
"children" => [{
"type" => "product",
"properties" => {
"name" => "Robertson Merlot",
"barcode" => '123456789-000',
"wine_farm" => "Robertson Wineries",
"price" => 60.00
}
}]
}]
}]
}
I can't seem to think of an efficient algorithm to get this right. I would appreciate any help in the right direction.
Should I be generating IDs and adding the parent ID to each node? I am concerned that using IDs will add more length to the text, which I am trying to shorten.
Although I have simplified it a bit from your requested structure, you can use the logic to get an idea of how it could be done:
require 'pp'
x = [{
"category" => ["Alcoholic Beverages", "Wine", "Red Wine"],
"name" => "Robertson Merlot",
"barcode" => '123456789-000',
"wine_farm" => "Robertson Wineries",
"price" => 60.00
}]
result = {}
x.each do |entry|
# Save current level in a variable
current_level = result
# We want some special logic for the last item, so let's store that.
item = entry['category'].pop
# For each category, check if it exists, else add a category hash.
entry['category'].each do |category|
unless current_level.has_key?(category)
current_level[category] = {'type' => 'category', 'children' => {}, 'name' => category}
end
current_level = current_level[category]['children'] # Set the new current level of the hash.
end
# Finally add the item:
entry.delete('category')
entry['type'] = 'product'
current_level[item] = entry
end
pp result
And it gives us:
{"Alcoholic Beverages"=>
{"type"=>"category",
"children"=>
{"Wine"=>
{"type"=>"category",
"children"=>
{"Red Wine"=>
{"name"=>"Robertson Merlot",
"barcode"=>"123456789-000",
"wine_farm"=>"Robertson Wineries",
"price"=>60.0,
"type"=>"product"}},
"name"=>"Wine"}},
"name"=>"Alcoholic Beverages"}}
There are probably easier ways of doing this, but this is all I can think of for now; it should match your structure.
require 'json'
# Initial set up, it seems the root keys are always the same looking at your structure
products = {
'type' => 'category',
'properties' => {
'name' => 'Alcoholic Beverages'
},
'children' => []
}
data = [{
"category" => ['Alcoholic Beverages', 'Wine', 'Red Wine'],
"name" => 'Robertson Merlot',
"barcode" => '123456789-000',
"wine_farm" => 'Robertson Wineries',
"price" => 60.00
}]
data.each do |item|
# Make sure we set the current to the top-level again
curr = products['children']
# Remove first entry as it's always 'Alcoholic Beverages'
item['category'].shift
item['category'].each do |category|
# Get the index for the category if it exists
index = curr.index {|x| x['type'] == 'category' && x['properties']['name'] == category}
# If it exists then change current hash level to the child of that category
if index
curr = curr[index]['children']
# Else add it in
else
curr << {
'type' => 'category',
'properties' => {
'name' => category
},
'children' => []
}
# We can use last as we know it'll be the last index.
curr = curr.last['children']
end
end
# Delete category from the item itself
item.delete('category')
# Add the item as product type to the last level of the hash
curr << {
'type' => 'product',
'properties' => item
}
end
puts JSON.pretty_generate(products)
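If the root category is not always the same, the approach above can be generalised. The following is a sketch under assumptions: build_tree is a hypothetical helper that returns an array of root nodes, it leaves the input untouched, and the second product is made-up data added only to show categories being shared.

```ruby
require 'json'

# Hypothetical helper: builds a forest (array of root category nodes) so the
# root category does not have to be known up front. The input is not mutated.
def build_tree(items)
  roots = []
  items.each do |item|
    level = roots
    item['category'].each do |name|
      # Reuse the category node at this level if it already exists.
      node = level.find { |n| n['type'] == 'category' && n['properties']['name'] == name }
      unless node
        node = { 'type' => 'category', 'properties' => { 'name' => name }, 'children' => [] }
        level << node
      end
      level = node['children']
    end
    # Everything except the category path becomes the product's properties.
    properties = item.reject { |key, _| key == 'category' }
    level << { 'type' => 'product', 'properties' => properties }
  end
  roots
end

data = [
  { 'category' => ['Alcoholic Beverages', 'Wine', 'Red Wine'], 'name' => 'Robertson Merlot', 'price' => 60.00 },
  { 'category' => ['Alcoholic Beverages', 'Wine', 'White Wine'], 'name' => 'Example Chardonnay', 'price' => 55.00 }
]

tree = build_tree(data)
puts JSON.pretty_generate(tree)
```

The find call scans the siblings at each level; for very large inputs a lookup hash per level might be faster, at the cost of extra memory.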
Here is what it looks like:
{
"groups" => [
{ "venues" => [
{ "city" => "Madrid",
"address" => "Camino de Perales, s/n",
"name" => "Caja Mágica",
"stats" => {"herenow"=>"0"},
"geolong" => -3.6894333,
"primarycategory" => {
"iconurl" => "http://foursquare.com/img/categories/arts_entertainment/stadium.png",
"fullpathname" => "Arts & Entertainment:Stadium",
"nodename" => "Stadium",
"id" => 78989 },
"geolat" => 40.375045,
"id" => 2492239,
"distance" => 0,
"state" => "Spain" }],
"type" => "Matching Places"}]
}
Big and ugly... I just want to grab the id out. How would I go about doing this?
h = { "groups" => ......... }
The two ids are:
h["groups"][0]["venues"][0]["primarycategory"]["id"]
h["groups"][0]["venues"][0]["id"]
If the hash stores one id (assuming the value is stored in a variable called hash):
hash["groups"][0]["venues"][0]["primarycategory"]["id"] rescue nil
If the hash stores multiple ids then:
ids = Array(hash["groups"]).map do |g|
Array(g["venues"]).map do |v|
v["primarycategory"]["id"] rescue nil
end.compact
end.flatten
ids now holds the array of ids.
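On Ruby 2.3 or later, Hash#dig and Array#dig give the same safe access without rescue nil. A minimal sketch, using a trimmed-down copy of the hash from the question:

```ruby
# Trimmed-down copy of the structure from the question.
h = {
  "groups" => [
    { "venues" => [
        { "primarycategory" => { "nodename" => "Stadium", "id" => 78989 },
          "id" => 2492239 }
      ],
      "type" => "Matching Places" }
  ]
}

# dig returns nil as soon as any step along the path is missing.
venue_id    = h.dig("groups", 0, "venues", 0, "id")
category_id = h.dig("groups", 0, "venues", 0, "primarycategory", "id")
missing     = h.dig("groups", 1, "venues", 0, "id")

# Collect the category id of every venue in every group.
all_category_ids = h.fetch("groups", []).flat_map do |group|
  group.fetch("venues", []).map { |venue| venue.dig("primarycategory", "id") }
end.compact

p venue_id, category_id, missing, all_category_ids
```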