I have the following Ruby hash key-value pairs:
[
{
"trait_type"=>"Status",
"value"=>"Unbuilt",
"display_type"=>nil,
"max_value"=>nil,
"trait_count"=>4866,
"order"=>nil
}
]
What I need to check is whether the following key-value pairs are both present:
{
"value"=>"Unbuilt",
"trait_type"=>"Status"
}
Essentially wanting something to the effect of...
traits = [{"trait_type"=>"Status", "value"=>"Unbuilt", "display_type"=>nil, "max_value"=>nil, "trait_count"=>4866, "order"=>nil}]
filter_traits = {"value"=>"Unbuilt", "trait_type"=>"Status"}
traits.include? filter_traits
If you are using Ruby >= 2.3, there is a fancy new Hash#>= operator (a superset test between two hashes) that is conceptually similar to a hypothetical contains? method.
Using your traits array:
trait = traits[0]
trait >= {"trait_type" => "Status", "value" => "Unbuilt"}
# => true
trait >= {"trait_type" => "Status", "value" => "Built"}
# => false
So you could try something like:
traits.select{|trait|
trait >= filter_traits
}.length > 0
# => true
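If only a yes/no answer is needed, the same check can probably be written with any?, which stops at the first matching hash instead of building an intermediate array. This is a minimal sketch; the shortened traits data and the names has_match and no_match are made up for illustration:

```ruby
traits = [{ "trait_type" => "Status", "value" => "Unbuilt", "trait_count" => 4866 }]
filter_traits = { "value" => "Unbuilt", "trait_type" => "Status" }

# Hash#>= (Ruby 2.3+) is true when the receiver contains every pair of the argument.
has_match = traits.any? { |trait| trait >= filter_traits }
no_match  = traits.any? { |trait| trait >= { "value" => "Built" } }

puts has_match
puts no_match
```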
arr = [
{
"trait_type"=>"Status",
"value"=>"Unbuilt",
"display_type"=>nil,
"max_value"=>nil,
"trait_count"=>4866,
"order"=>nil
}
]
h1 = { "value"=>"Unbuilt", "trait_type"=>"Status" }
h2 = { "value"=>"Rebuilt", "trait_type"=>"Status" }
Here are three solutions.
arr[0].slice(*h1.keys) == h1
#=> true
arr[0].slice(*h2.keys) == h2
#=> false
arr[0].values_at(*h1.keys) == h1.values
#=> true
arr[0].values_at(*h2.keys) == h2.values
#=> false
arr[0] == arr[0].merge(h1)
#=> true
arr[0] == arr[0].merge(h2)
#=> false
We also have the aptly named all? and any? methods that should do exactly what you're looking for in a very logical way.
All we need to do is loop through your filter_traits hash and test whether all (or any) of its key-value pairs are equal to (==) the corresponding key-value pairs of a hash inside your traits array:
traits = [{"trait_type"=>"Status", "value"=>"Unbuilt", "display_type"=>nil, "max_value"=>nil, "trait_count"=>4866, "order"=>nil}]
filter_traits = {"value"=>"Unbuilt", "trait_type"=>"Status"}
filter_traits.all? {|k, v| filter_traits[k] == traits[0][k]}
#=> true
filter_traits = {"value"=>"Built", "trait_type"=>"Status"}
filter_traits.all? {|k, v| filter_traits[k] == traits[0][k]}
#=> false
filter_traits.any? {|k, v| filter_traits[k] == traits[0][k]}
#=> true
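To run that test against every hash in the array rather than only traits[0], the two methods can be combined. The sketch below assumes a hypothetical helper named matches?; it also checks key? so that a filter value of nil does not accidentally match a key that is simply absent:

```ruby
# Hypothetical helper: true when every pair in +filter+ also appears in +hash+.
def matches?(hash, filter)
  filter.all? { |key, value| hash.key?(key) && hash[key] == value }
end

traits = [
  { "trait_type" => "Status", "value" => "Unbuilt", "order" => nil },
  { "trait_type" => "Color",  "value" => "Red",     "order" => nil }
]

found   = traits.any? { |trait| matches?(trait, { "value" => "Unbuilt", "trait_type" => "Status" }) }
missing = traits.any? { |trait| matches?(trait, { "value" => "Built", "trait_type" => "Status" }) }
# A filter value of nil only matches when the key is really present.
nil_key = matches?(traits[0], { "display_type" => nil })
```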
Example.
If input is
Output should be
So each row must contain only identical values, or undef where the original unsorted column had no such string. The values in each column should be sorted alphabetically.
How can this kind of sorting be implemented?
P.S. Original task - we have some modules and we want to compare them visually for functions with similar names.
Something like this maybe?
use warnings;
use strict;
my @data = (
    { name => 'Foo', funcs => [qw/abc def ghi xyz/] },
    { name => 'Bar', funcs => [qw/def jkl mno uvw xyz/] },
    { name => 'Baz', funcs => [qw/abc uvw xyz/] },
);
my %allfuncs = ( map { map {$_=>undef} @{$$_{funcs}} } @data );
$$_{funcs} = { %allfuncs, map {$_=>1} @{$$_{funcs}} } for @data;
use Data::Dump;
dd @data;
# just for output:
use List::Util qw/max/;
my $maxlen = max map {length} map({$$_{name}} @data), keys(%allfuncs);
my $fmt = join(' ', ("%${maxlen}s") x @data)."\n";
printf $fmt, map { $$_{name} } @data;
for my $f (sort keys %allfuncs) {
    printf $fmt, map { $$_{funcs}{$f}?$f:'' } @data;
}
Output:
(
{
funcs => { abc => 1, def => 1, ghi => 1, jkl => undef, mno => undef, uvw => undef, xyz => 1 },
name => "Foo",
},
{
funcs => { abc => undef, def => 1, ghi => undef, jkl => 1, mno => 1, uvw => 1, xyz => 1 },
name => "Bar",
},
{
funcs => { abc => 1, def => undef, ghi => undef, jkl => undef, mno => undef, uvw => 1, xyz => 1 },
name => "Baz",
},
)
Foo Bar Baz
abc     abc
def def
ghi
    jkl
    mno
    uvw uvw
xyz xyz xyz
Update: If your input data comes in the form of an AoA, this takes @table and produces the same @data as above (it basically transposes the AoA and then produces the hash structure):
my @table = ( [qw/Foo Bar Baz/], [qw/abc def abc/], [qw/def jkl uvw/],
    [qw/ghi mno xyz/], [qw/xyz uvw/], [undef, qw/xyz/] );
my @data;
for my $col ( 0 .. $table[0]->$#* )
    { push @data, [ map {$_->[$col]//()} @table ] }
@data = map { {name=>shift @$_, funcs=>$_} } @data;
And if you need your output format to be an AoA as well:
my @out = ( [map {$$_{name}} @data] );
for my $f (sort keys %allfuncs)
    { push @out, [ map {$$_{funcs}{$f}?$f:undef} @data ] }
Produces @out:
(
["Foo", "Bar", "Baz"],
["abc", undef, "abc"],
["def", "def", undef],
["ghi", undef, undef],
[undef, "jkl", undef],
[undef, "mno", undef],
[undef, "uvw", "uvw"],
["xyz", "xyz", "xyz"],
)
A little manual but hopefully clear approach to "fill in" the missing spots: Gather a sorted reference of all values and use it to "pad" (with empty) for missing elements in each arrayref-column
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd);
use List::MoreUtils qw(uniq);
my @data = (
    [ qw(abc def ghi xyz) ],
    [ qw(def jkl mno uvw xyz) ],
    [ qw(abc uvw xyz) ]
);
my @all = uniq sort { $a cmp $b } map { @$_ } @data;  # reference
# Changes @data in place. Use on deep copy to preserve the original
for my $ary (@data) {
    my $cmp_at = 0;
    my @res;
    for my $i (0..$#all) {
        # the defined check avoids an "uninitialized" warning once a column is exhausted
        if (defined $ary->[$cmp_at] and $ary->[$cmp_at] eq $all[$i]) {
            push @res, $ary->[$cmp_at];
            ++$cmp_at;
        }
        else {
            push @res, undef;
        }
    }
    $ary = \@res;  # overwrite arrayref in @data
}
dd \@data;
Another way is to go row-wise and inspect and print for each element, again using a reference list. The column-wise preprocessing used here should be more flexible and general, I think.
The above prints
[
["abc", "def", "ghi", undef, undef, undef, "xyz"],
[undef, "def", undef, "jkl", "mno", "uvw", "xyz"],
["abc", undef, undef, undef, undef, "uvw", "xyz"],
]
where the bare undef printed by Data::Dump marks array entries that have no value.
Now @data can be used to print in a chosen format, for example
use List::Util qw(max);
my $fmt = '%' . (max map { length } @all) . 's';
say join "\t", map { sprintf $fmt, $_ } qw(Foo Bar Baz);
for my $i (0..$#{$data[0]}) {
    say join "\t", map { sprintf $fmt, $_->[$i]//'' } @data;
}
which prints a table in the fashion of the desired "output." For quick alignment I use the width of the longest word for all fields; this is most easily improved by using a text-formatting module.
If this goes to a spreadsheet, like tables in the question, then just join fields with a comma
open my $fh, '>', 'out.csv' or die "Can't open: $!";
say $fh join ',', qw(Foo Bar Baz);
foreach my $i (0..$#{$data[0]}) {
    say $fh join ',', map { $_->[$i]//'' } @data;
}
The following code should do it, output format is CSV:
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS qw( );
my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 2 });
my @headers = qw(Foo Bar Baz);
my @columns = (
    [qw/abc def ghi xyz/],
    [qw/def jkl mno uvw xyz/],
    [qw/abc uvw xyz/],
);
my %output;
my $N = scalar(@columns);
my @new_row = ('') x $N;
foreach my $index (0..$N-1) {
    my $column = $columns[$index];
    foreach my $key (@{ $column }) {
        $output{$key} ||= [ @new_row ];
        $output{$key}->[$index] = $key;
    }
}
$csv->say(\*STDOUT, \@headers);
# NOTE: add {....} after sort for special sorting requirements
foreach my $key (sort keys %output) {
$csv->say(\*STDOUT, $output{$key});
}
Example output:
$ perl dummy.pl
Foo,Bar,Baz
abc,,abc
def,def,
ghi,,
,jkl,
,mno,
,uvw,uvw
xyz,xyz,xyz
NOTE: the above is with the default sort algorithm, but you can insert your own, e.g.
# reverse sort
foreach my $key (sort { $b cmp $a } keys %output) {
Thanks for the question. This was fun :-)
EDIT2: guessing from the format of the question the input might also be in CSV, so a row-based variant of the algorithm might be more appropriate.
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS qw( );
my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 2 });
my $headers;
my @new_row;
my %keys;
my $line = 0;
while ( my $row = $csv->getline(\*STDIN) ) {
    if ($line == 0) {
        $headers = $row;
        @new_row = ('') x @$row;
    } else {
        foreach my $index (0..$#$row) {
            my $key = $row->[$index];
            $keys{$key} ||= [ @new_row ];
            $keys{$key}->[$index] = $key;
        }
    }
    $line++;
}
# delete "undefined" key
delete $keys{''};
$csv->say(\*STDOUT, $headers);
# NOTE: add {....} after sort for special sorting requirements
$csv->say(\*STDOUT, $keys{$_}) foreach (sort keys %keys);
Example output:
$ cat input.csv
Foo,Bar,Baz
abc,def,abc
def,jkl,uvw
ghi,mno,xyz
xyz,uvw,
,xyz,
$ perl dummy.pl <input.csv
Foo,Bar,Baz
abc,,abc
def,def,
ghi,,
,jkl,
,mno,
,uvw,uvw
xyz,xyz,xyz
Just another solution, this time using hashes, arrays and Hash::Merge. Looking at your example, I concluded that each value from the table can be stored as a single hash entry as long as you keep track of which columns it belongs to. After the unsorted hash has been created, the program finishes with an example print algorithm that extracts the data in the format you want.
use warnings;
use strict;
use Hash::Merge ;
my @data = (
    { name => 'Foo', funcs => [qw/abc def ghi xyz/] },
    { name => 'Bar', funcs => [qw/def jkl mno uvw xyz/] },
    { name => 'Baz', funcs => [qw/abc uvw xyz/] },
);
my $merger = Hash::Merge->new('RETAINMENT_PRECEDENT');
my $unsorted = {} ;
for my $i ( 0..$#data) {
    my $tmpH = {} ;
    foreach( @{$data[$i]->{funcs}} ) {
        if( exists $tmpH->{ $_ } ) {
            push @{$tmpH->{ $_ }}, $i ;
        } else {
            $tmpH->{ $_ } = [ $i ] ;
        }
    } ;
    $unsorted = $merger->merge( $unsorted, $tmpH ) ;
}
print "Foo\tBar\tBaz\n" ;
foreach ( sort keys %{$unsorted} ) {
    my @txt;
    @txt[@{$unsorted->{ $_ }}] = ($_) x @{$unsorted->{ $_ }} ;
    {
        no warnings 'uninitialized';
        printf("%s\t%s\t%s\n", $txt[0], $txt[1], $txt[2]) ;
    }
}
__END__
Foo     Bar     Baz
abc             abc
def     def
ghi
        jkl
        mno
        uvw     uvw
xyz     xyz     xyz
I have an array of hashes like this
arr_of_hashes = [
{"started_at"=>"2018-07-11", "stopped_at"=>"2018-07-11"},
{"started_at"=>"2018-07-13", "stopped_at"=>"2018-07-13"},
{"started_at"=>"2018-07-13", "stopped_at"=>"2018-07-13"},
{"started_at"=>"2018-07-16", "stopped_at"=>"2018-07-16"},
{"started_at"=>"2018-07-16", "stopped_at"=>"2018-07-16"},
{"started_at"=>"2018-07-16", "stopped_at"=>"still active"}
]
I want to remove duplicates. Also, among:
{"started_at"=>"2018-07-16", "stopped_at"=>"2018-07-16"},
{"started_at"=>"2018-07-16", "stopped_at"=>"still active"}
I want to keep only the last line. How can I do that?
I tried to do:
sorted_arr = arr_of_hashes.uniq
arr_of_hashes.reverse.uniq { |hash| hash["started_at"] }.reverse
See the documentation on passing a block to uniq and on reverse.
#result
[
{"started_at"=>"2018-07-11", "stopped_at"=>"2018-07-11"},
{"started_at"=>"2018-07-13", "stopped_at"=>"2018-07-13"},
{"started_at"=>"2018-07-16", "stopped_at"=>"still active"}
]
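The double reverse matters because uniq keeps the first element it sees for each block value. A small sketch, using a shortened two-row array (rows, first_kept and last_kept are illustrative names), shows the difference:

```ruby
rows = [
  { "started_at" => "2018-07-16", "stopped_at" => "2018-07-16" },
  { "started_at" => "2018-07-16", "stopped_at" => "still active" }
]

# uniq with a block keeps the first element for each distinct block value.
first_kept = rows.uniq { |h| h["started_at"] }
# Reversing first makes the last element win; the final reverse restores the order.
last_kept  = rows.reverse.uniq { |h| h["started_at"] }.reverse

p first_kept
p last_kept
```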
Something like this?
[2] pry(main)> arr_of_hashes.reject { |h| h['started_at'] == h['stopped_at'] }
[
[0] {
"started_at" => "2018-07-16",
"stopped_at" => "still active"
}
]
It's not clear from your question what output you want to get.
arr_of_hashes.each_with_object({}) { |g,h| h.update(g["started_at"]=>g) }.values
#=> [{"started_at"=>"2018-07-11", "stopped_at"=>"2018-07-11"},
# {"started_at"=>"2018-07-13", "stopped_at"=>"2018-07-13"},
# {"started_at"=>"2018-07-16", "stopped_at"=>"still active"}]
See Hash#update (a.k.a. merge!) and note that values's receiver is as follows.
arr_of_hashes.each_with_object({}) { |g,h| h.update(g["started_at"]=>g) }
#=> {"2018-07-11"=>{"started_at"=>"2018-07-11", "stopped_at"=>"2018-07-11"},
# "2018-07-13"=>{"started_at"=>"2018-07-13", "stopped_at"=>"2018-07-13"},
# "2018-07-16"=>{"started_at"=>"2018-07-16", "stopped_at"=>"still active"}}
Here is a structure of hash of arrays:
[
{
"key1" => [
"value1",
{"key2" => ["value2"]},
{"key3" => [
"value3",
{
"key4" => "value4"
}
]
}
]
},
{
"anotherKey1" => [],
}
]
I want desired output for that structure like filepaths:
/key1/value1
/key1/key2/value2
/key3/value3
/key3/key4/value4
How can I do that without reinventing the wheel? Simple recursion could help, but are there any ready-to-go modules?
I do not think you would be reinventing any wheels to do this. You would like to traverse a nested structure of arrays and hashes and react completely differently to the elements depending on whether something is an Array or a Hash. No library function is going to do exactly that for you, as you would need to vary more than one thing with blocks in order to be as flexible as you might like to be.
In short: write your recursive function to do this.
(Btw: The top level of your data structure is an array of hashes, not a hash of arrays …)
I decided to write my own wheel (thanks to Patru, upvoted).
And I have this function:
def flat_hash_of_arrays(hash, string = "", delimiter = "/", result = [])
  hash.each do |key, value|
    # string + delimiter + key builds a new String, so the caller's prefix is not mutated
    new_string = string + delimiter + key
    # only array values are walked; any other value is skipped
    if value.is_a?(Array)
      value.each do |element|
        if element.is_a?(String)
          # a string is a leaf: no recursion needed
          result << new_string + delimiter + element
        elsif element.is_a?(Hash)
          # a hash needs recursion with the extended prefix
          flat_hash_of_arrays(element, new_string, delimiter, result)
        end
      end
    end
  end
  # return the accumulated paths so the return value is usable too
  result
end
and test it:
flatten_hash = {
"key1" => [
"value1",
{"key2" => ["value2"]},
{"key3" => [
"value3",
{
"key4" => "value4"
}
]
},
"value4",
{
"key4" => ["value5"],
}
]
}
result = []
flat_hash_of_arrays(flatten_hash,"","/",result)
puts result
output is:
/key1/value1
/key1/key2/value2
/key1/key3/value3
/key1/value4
/key1/key4/value5
fine!
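Note that the function above skips hash values that are not arrays, which is why "key4" => "value4" produces no line. If such values should count as leaves as well, one possible variant is sketched below; nested_paths is a hypothetical name, and it returns the paths instead of filling an array passed in by the caller:

```ruby
# Hypothetical variant: walks hashes and arrays, treating everything else as a leaf.
def nested_paths(node, prefix = "", delimiter = "/")
  case node
  when Hash
    # each key extends the current prefix
    node.flat_map { |key, value| nested_paths(value, prefix + delimiter + key.to_s, delimiter) }
  when Array
    # array elements share the prefix of the key that holds the array
    node.flat_map { |element| nested_paths(element, prefix, delimiter) }
  else
    [prefix + delimiter + node.to_s]
  end
end

data = {
  "key1" => [
    "value1",
    { "key2" => ["value2"] },
    { "key3" => ["value3", { "key4" => "value4" }] }
  ]
}
paths = nested_paths(data)
puts paths
```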
There are workable answers for sorting an array of hashes and for natural sorting, but what is the best way to do both at once?
my_array = [ {"id":"some-server-1","foo":"bar"},{"id":"some-server-2","foo":"bat"},{"id":"some-server-10","foo":"baz"} ]
I would like to sort on "id" such that the final ordering is:
some-server-1
some-server-2
some-server-10
I feel like there must be a clever and efficient way to do this, though personally I don't need to break any speed records and will only be sorting a few hundred items. Can I implement a comparison function in sort_by?
First of all, your my_array is JavaScript/JSON so I'll assume that you really have this:
my_array = [
{"id" => "some-server-1", "foo" => "bar"},
{"id" => "some-server-2", "foo" => "bat"},
{"id" => "some-server-10", "foo" => "baz"}
]
Then you just need to sort_by the numeric suffix of the 'id' values:
my_array.sort_by { |e| e['id'].sub(/^some-server-/, '').to_i }
If the "some-server-" prefixes aren't always "some-server-" then you could try something like this:
my_array.sort_by { |e| e['id'].scan(/\D+|\d+/).map { |x| x =~ /\d/ ? x.to_i : x } }
That would split the 'id' values into numeric and non-numeric pieces, convert the numeric pieces to integers, and then compare the mixed string/integers arrays using the Array <=> operator (which compares component-wise); this will work as long as the numeric and non-numeric components always match up. This approach would handle this:
my_array = [
{"id" => "some-server-1", "foo" => "bar"},
{"id" => "xxx-10", "foo" => "baz"}
]
but not this:
my_array = [
{"id" => "11-pancakes-23", "foo" => "baz"},
{"id" => "some-server-1", "foo" => "bar"}
]
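To see why the second pair of ids fails, it may help to look at the chunk arrays directly. In this sketch, chunks is a hypothetical helper that wraps the same scan/map expression used in the sort_by block:

```ruby
# Split an id into alternating non-digit and digit runs; digit runs become Integers.
def chunks(id)
  id.scan(/\D+|\d+/).map { |piece| piece =~ /\d/ ? piece.to_i : piece }
end

a = chunks("some-server-2")    # ["some-server-", 2]
b = chunks("some-server-10")   # ["some-server-", 10]
c = chunks("11-pancakes-23")   # [11, "-pancakes-", 23]

same_shape  = a <=> b          # -1, because 2 < 10 as integers
mixed_shape = c <=> a          # nil, because 11 and "some-server-" cannot be compared

p same_shape
p mixed_shape
```

When Array#<=> returns nil for a pair of keys, sort_by has no ordering for them and raises an ArgumentError.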
If you need to handle this last case then you'd need to compare the arrays entry-by-entry by hand and adjust the comparison based on what you have. You could still get some of the advantages of the sort_by Schwartzian Transform with something like this (not very well tested code):
class NaturalCmp
  include Comparable
  attr_accessor :chunks
  def initialize(s)
    @chunks = s.scan(/\D+|\d+/).map { |x| x =~ /\d/ ? x.to_i : x }
  end
  def <=>(other)
    i = 0
    @chunks.inject(0) do |cmp, e|
      oe = other.chunks[i]
      i += 1
      if(cmp == 0)
        cmp = e.class == oe.class \
          ? e <=> oe \
          : e.to_s <=> oe.to_s
      end
      cmp
    end
  end
end
my_array.sort_by { |e| NaturalCmp.new(e['id']) }
The basic idea here is to push the comparison noise off to another class to keep the sort_by from degenerating into an incomprehensible mess. Then we use the same scanning as before to break the strings into pieces and implement the array <=> comparator by hand. If we have two things of the same class then we let that class's <=> deal with it otherwise we force both components to String and compare them as such. And we only care about the first non-0 result.
@mu gives a more than adequate answer for my case, but I also figured out the syntax for introducing arbitrary comparisons:
def compare_ids(a,b)
# Whatever code you want here
# Return -1, 0, or 1
end
sorted_array = my_array.sort { |a,b| compare_ids(a["id"],b["id"]) }
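As one possible way to fill in that placeholder, here is a sketch of a natural-order comparison. The body of compare_ids is an assumption of mine, not something from the answers above: digit runs are compared as integers and everything else as plain strings.

```ruby
# Hypothetical body for compare_ids: natural ordering of two id strings.
def compare_ids(a, b)
  a_parts = a.scan(/\D+|\d+/)
  b_parts = b.scan(/\D+|\d+/)
  a_parts.zip(b_parts).each do |x, y|
    return 1 if y.nil?                       # a has extra trailing chunks, so it sorts last
    both_numeric = x.match?(/\A\d+\z/) && y.match?(/\A\d+\z/)
    cmp = both_numeric ? x.to_i <=> y.to_i : x <=> y
    return cmp unless cmp.zero?
  end
  a_parts.size <=> b_parts.size              # equal prefix: the shorter id sorts first
end

my_array = [
  { "id" => "some-server-10", "foo" => "baz" },
  { "id" => "some-server-2",  "foo" => "bat" },
  { "id" => "11-pancakes-23", "foo" => "qux" },
  { "id" => "some-server-1",  "foo" => "bar" }
]
sorted_ids = my_array.sort { |a, b| compare_ids(a["id"], b["id"]) }.map { |e| e["id"] }
puts sorted_ids
```

Unlike sort_by, a sort block is called for every comparison, so the scan runs repeatedly; for a few hundred items that should not matter.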
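I think that if you are sorting on the id field and the ids are purely numeric strings, you could try this (String#to_i returns 0 for a string such as "some-server-1" that does not start with digits, so it will not order the ids from the question):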
my_array.sort { |a,b| a["id"].to_i <=> b["id"].to_i }