Best way to remove duplicated columns in linux - bash

It should be like running uniq command, but by columns. For example:
A B C B
A C B C
A A A A
The second and fourth columns are identical. What is the best way to obtain the following result?
A B C
A C B
A A A
However, at first it is unknown which columns are identical, much like with the uniq command for rows.

Perl to the rescue!
perl -lane '
    push @{ $c[$_] }, $F[$_] for 0 .. $#F;
    }{
    for (@c) {
        $s = join "|", @$_;
        $seen{$s}++ or push @r, $_;
    }
    print join " ", map shift @$_, @r while @{ $r[0] }
' -- inputfile
The first line pivots the input, i.e. it creates the following structure:
@c = ( [ 'A', 'A', 'A' ],
       [ 'B', 'C', 'A' ],
       [ 'C', 'B', 'A' ],
       [ 'B', 'C', 'A' ] );
The }{ (called "Eskimo greeting") separates the code run for each line from the code run after the whole input has been processed. The second part walks the @c array and keeps only the unique columns, by creating a string from each of them (A|A|A, B|C|A, etc.) and storing it in the %seen hash.
The structure will be
@r = ( [ 'A', 'A', 'A' ],
       [ 'B', 'C', 'A' ],
       [ 'C', 'B', 'A' ] );
and the hash will look like
%seen = ( 'B|C|A' => 2,
'A|A|A' => 1,
'C|B|A' => 1
);
The last print shifts the first element of each column, i.e. it pivots the result back.
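For comparison, the same pivot, deduplicate, pivot-back idea can be sketched in Ruby. This is only an illustration, not part of the Perl answer: `unique_columns` is a hypothetical helper name, and the sketch assumes whitespace-separated fields with the same number of fields on every line (Array#transpose raises otherwise).

```ruby
# Drop duplicate columns from whitespace-separated text.
# Assumption: every non-empty line has the same number of fields.
def unique_columns(text)
  rows = text.lines.map(&:split).reject(&:empty?)
  # transpose turns columns into rows, uniq keeps the first copy of each,
  # and the second transpose pivots the result back.
  rows.transpose.uniq.transpose.map { |row| row.join(' ') }.join("\n")
end

input = <<~TEXT
  A B C B
  A C B C
  A A A A
TEXT

puts unique_columns(input)
```

As with the Perl version, the first occurrence of each column is kept and the column order is preserved.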

Related

How do I read lines from a text file and pass them to a Ruby method, exactly as written?

class Test
def self.take_test( question, options, answer )
puts question
options.each_with_index { |option, idx| puts "#{ idx + 1 }: #{ option}" }
print "Answer: "
reply = gets.to_i
if answer == reply
puts "Correct!"
else
puts "Wrong. The answer is: " + answer.to_s
end
end
end
file = File.open("Matematik.txt", "r")
This is what I tried to do:
IO.foreach("Matematik.txt") { |line| Test.take_test(line) }
This is how the questions are set up in the file:
'What is 2+2?', [ '2', '3', '4', '5', ], 4
'What is 3+3?', [ '3', '6', '9', ], 6
I get the error: take_test wrong number of arguments given (given 1, expected 3) (ArgumentError)
It seems to read the whole line as one argument. Is there a way to read each line exactly as it stands and pass it in like this?
#Test.take_test('What is 2+2?', [ '2', '3', '4', '5', ], 4)
In theory, yes, with eval("Test.take_test(#{line})"). However, eval is evil, and to be avoided if at all possible.
It would be much easier if you changed your file format so that it can be deconstructed easily. (Parsing your current format is not impossible, but it takes a lot of unnecessary work compared to a simpler format.) For example, given lines formatted like CSV:
"What is 2+2?",2,3,4,5,4
it is very easy to do the following:
require 'csv'
Question = Struct.new(:text, :options, :answer)
questions = CSV.read("Matematik.csv").map { |text, *options, answer|
Question.new(text, options, answer.to_i)
}
questions[0].text
# => "What is 2+2?"
questions[0].options
# => ["2", "3", "4", "5"]
questions[0].answer
# => 4
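To connect this back to the question, here is a minimal sketch of how the parsed rows could be handed to Test.take_test. It parses an inline string instead of a file so that it runs on its own; `parse_questions` and the second sample row are assumptions made for illustration, and the call to Test.take_test is left as a comment because that method waits for keyboard input.

```ruby
require 'csv'

Question = Struct.new(:text, :options, :answer)

# Parse CSV text where each row is: question, option..., answer.
def parse_questions(csv_text)
  CSV.parse(csv_text).map do |text, *options, answer|
    Question.new(text, options, answer.to_i)
  end
end

# Hypothetical sample data in the CSV layout suggested above.
sample = <<~CSV
  "What is 2+2?",2,3,4,5,4
  "What is 3+3?",3,6,9,6
CSV

questions = parse_questions(sample)
questions.each do |q|
  # With the Test class from the question loaded, this line would be:
  #   Test.take_test(q.text, q.options, q.answer)
  puts "#{q.text} #{q.options.inspect} (answer: #{q.answer})"
end
```

Each row now supplies the three separate arguments that take_test expects, which is what the original IO.foreach call was missing.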

Using variable inside a Hash as a key to another Hash

I have some data to sort as an array in the form ['a1', 'b321', 'a33', 'c', ...].
I want to put all the 'aN' into sorted_data[:a] etc.
The code below loops through the data, and correctly runs the regexp on them.
What it doesn't do is put them in the right place: sorted_data[filter[:key]] is nil.
How do I use filter[:key] as a key to sorted_data?
Thanks.
sorted_data = { a: Array.new,
b: Array.new,
c: -1 }
filters = [{ re: /^a\d+$/, key: 'a' },
{ re: /^b\d+$/, key: 'b' },
{ re: /^c$/, key: 'c' }]
['a1', 'b321', 'a33', 'c', 'b', 'b1'].each {|cell|
filters.each {|filter|
if cell.match(filter[:re])
puts "#{cell} should go in #{filter[:key]}" + '....[' + sorted_data[filter[:key]].to_s + ']....'
break
end
}
}
The output of the above is
# a1 should go in a....[]....
# b321 should go in b....[]....
# a33 should go in a....[]....
# c should go in c....[]....
# b1 should go in b....[]....
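The lookup returns nil because the filter keys are strings ('a') while sorted_data is keyed by symbols (:a). I believe the program below, which groups by first character instead of by regexp, produces the desired output: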
l = ['a1', 'b321', 'a33', 'c', 'b', 'b1']
sorted_data = Hash.new { |hash, key| hash[key] = [] }
l.each do |item|
first_char = item[0].to_sym
sorted_data[first_char].push item
end
puts sorted_data
Output:
{:a=>["a1", "a33"], :b=>["b321", "b", "b1"], :c=>["c"]}
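If the regexp filters from the question need to stay (for example so that a bare 'b' is rejected), a minimal sketch is to make the filter keys symbols so they match the hash keys. `sort_cells` is a hypothetical helper name, and dropping cells that match no filter is an assumption about the desired behaviour.

```ruby
# Filters as in the question, but with symbol keys.
filters = [
  { re: /^a\d+$/, key: :a },
  { re: /^b\d+$/, key: :b },
  { re: /^c$/,    key: :c }
]

# Group each cell under the key of the first filter it matches.
# Assumption: cells that match no filter are silently skipped.
def sort_cells(cells, filters)
  sorted = Hash.new { |hash, key| hash[key] = [] }
  cells.each do |cell|
    filter = filters.find { |f| cell.match?(f[:re]) }
    sorted[filter[:key]] << cell if filter
  end
  sorted
end

result = sort_cells(['a1', 'b321', 'a33', 'c', 'b', 'b1'], filters)
puts result.inspect
```

Unlike the first-character version, this leaves 'b' out of the :b group, because it does not match /^b\d+$/.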

Replace multiple characters at the same time in Ruby

Is there a better way to write the code below?
I am looking for clean and minimal code.
val.gsub!('A', 'Q')
val.gsub!('B', 'W')
val.gsub!('C', 'E')
val.gsub!('D', 'R')
val.gsub!('E', 'T')
val.gsub!('F', 'Y')
Use tr, it's purpose-built for the problem you're describing:
> val = "ASDFGHJKL"
=> "ASDFGHJKL"
> val.tr("ABCDEF", "QWERTY")
=> "QSRYGHJKL"
Without using any other methods than the ones you already know about, you could build a key/value mapping, and then iterate over the pairs:
{ 'A' => 'Q', 'B' => 'W', 'C' => 'E' ...}.each { |x,y| val.gsub!(x, y) }
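One difference between the two approaches is worth spelling out: tr replaces every character in a single pass, while chained gsub! calls feed each result into the next, so in the question's code a 'C' becomes 'E' and is then rewritten to 'T'. The sketch below contrasts the two; `substitute_once` and `substitute_chained` are hypothetical names, and the hash-based gsub is offered as an alternative to tr, not as something either answer proposed.

```ruby
MAPPING = {
  'A' => 'Q', 'B' => 'W', 'C' => 'E',
  'D' => 'R', 'E' => 'T', 'F' => 'Y'
}.freeze

# Single pass: each matched character is looked up once in the hash.
def substitute_once(str, mapping = MAPPING)
  str.gsub(Regexp.union(mapping.keys), mapping)
end

# Chained: each substitution sees the output of the previous one.
def substitute_chained(str, mapping = MAPPING)
  mapping.reduce(str) { |acc, (from, to)| acc.gsub(from, to) }
end

puts substitute_once('ABCDEF')    # same result as 'ABCDEF'.tr('ABCDEF', 'QWERTY')
puts substitute_chained('ABCDEF') # the 'E' produced from 'C' is replaced again
```

Which behaviour is wanted depends on the intent; if the replacements are meant to be independent, the single-pass form (or tr) avoids the double substitution.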

How do you define element uniqueness by multiple keys/attributes?

I have queried my database, which gave me an array of hashes whose keys are the column names. I want to keep only the hashes (array elements) that are unique according to multiple columns (three of them). I have tried:
array.uniq { |item| item[:col1], item[:col2], item[:col3] }
as well as
array = array.inject([{}]) do |res, item|
if !res.any? { |h| h[:col1] == item[:col1] &&
h[:col2] == item[:col2] &&
h[:col3] == item[:col3] }
res << item
end
end
Does anyone have any ideas as to what's wrong or another way of going about this?
Thanks
It's unclear to me what you're asking for. My best guess is that given the array of single-association Hashes:
array = [{:col1 => 'aaa'}, {:col2 => 'bbb'}, {:col3 => 'aaa'}]
You'd like to have only one Hash per hash value; that is, remove the last Hash because both it and the first one have 'aaa' as their value. If so, then this:
array.uniq{|item| item.values.first}
# => [{:col1=>"aaa"}, {:col2=>"bbb"}]
Does what you want.
The other possibility I'm imagining is that given an array like this:
array2 = [{:col1 => 'a', :col2 => 'b', :col3 => 'c', :col4 => 'x'},
{:col1 => 'd', :col2 => 'b', :col3 => 'c', :col4 => 'y'},
{:col1 => 'a', :col2 => 'b', :col3 => 'c', :col4 => 'z'}]
You'd like to exclude the last Hash for having the same values for :col1, :col2, and :col3 as the first Hash. If so, then this:
array2.uniq{|item| [item[:col1], item[:col2], item[:col3]]}
# => [{:col1=>"a", :col2=>"b", :col3=>"c", :col4=>"x"},
# {:col1=>"d", :col2=>"b", :col3=>"c", :col4=>"y"}]
Does what you want.
If neither of those guesses is really what you're looking for, you'll need to clarify what you're asking for, preferably including some sample input and desired output.
I'll also point out that it's quite possible that you can accomplish what you want at the database query level, depending on many factors not presented.
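As a sketch of the second guess in a reusable form, Hash#values_at can build the comparison key for any list of columns. `uniq_by_keys` is a hypothetical helper name. For context, the first attempt in the question fails because a block cannot return a bare comma-separated list (it has to be wrapped in an array), and the inject attempt appears to fail because its block returns nil whenever the `if` is false.

```ruby
# Keep the first row for each distinct combination of the given keys.
def uniq_by_keys(rows, *keys)
  rows.uniq { |row| row.values_at(*keys) }
end

rows = [
  { col1: 'a', col2: 'b', col3: 'c', col4: 'x' },
  { col1: 'd', col2: 'b', col3: 'c', col4: 'y' },
  { col1: 'a', col2: 'b', col3: 'c', col4: 'z' }
]

unique = uniq_by_keys(rows, :col1, :col2, :col3)
puts unique.map { |row| row[:col4] }.join(', ')
```

The third row is dropped because its :col1, :col2 and :col3 values match those of the first row.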
If the number of columns is constant (three here), you are better off building a three-level hash like the one below, where the value you want to store sits at the third level.
out_hash = Hash.new
array.each do |value|
  c1, c2, c3 = value[:col1], value[:col2], value[:col3]
  if out_hash[c1].nil?
    # First time this col1 value is seen: create all three levels.
    out_hash[c1] = { c2 => { c3 => value } }
  elsif out_hash[c1][c2].nil?
    # Known col1, new col2: create the two inner levels.
    out_hash[c1][c2] = { c3 => value }
  elsif out_hash[c1][c2][c3].nil?
    # Known col1 and col2, new col3: store the row.
    out_hash[c1][c2][c3] = value
  end
end
I have not tested the code above; it is only meant to give you the idea.

Merging two arrays in one line of ruby code in a particular structure

I have two arrays
default = ['0', '0', '0', '0'] # this is fixed
new = ['2', '3', ''] # it can be of many variants like ['', '1'] or
# ['1', '', '', ''], but will never have
# more than 4 elements
I want to get a resultant array from above two arrays as
['2', '3', '0', '0']
How can I achieve this in one line of simple Ruby code? I can already do it in multiple lines or with the help of inject/reduce.
default.zip(new).map { |d,n| (n.nil? or n.empty?) ? d : n }
If you're using rails --
n = 4 # number of elements you need
n.times.map{|x| new[x].presence || default[x] }
If not
n = 4 # number of elements you need
n.times.map{|x| (new[x].nil? || new[x] == "") ? default[x] : new[x] }
If I understand correctly, you want to replace the blanks/nils in the "new" array with the corresponding values from the "default" array.
Try this:
default.each_index.collect {|i| (new[i].nil? || new[i] == '') ? default[i] : new[i]}
This should work for any length of "default" array. The returned array will be of the same length as "default"
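The zip-based one-liner can also be wrapped in a small method to make the behaviour easy to check. This is a sketch only: `fill_blanks`, `defaults` and `overrides` are hypothetical names, and it assumes the elements are strings or nil.

```ruby
# Replace nil or empty entries in `overrides` with the value at the same
# position in `defaults`. The result always has the length of `defaults`.
def fill_blanks(defaults, overrides)
  defaults.zip(overrides).map do |fallback, value|
    value.nil? || value.empty? ? fallback : value
  end
end

defaults = ['0', '0', '0', '0']
puts fill_blanks(defaults, ['2', '3', '']).inspect
puts fill_blanks(defaults, ['', '1']).inspect
```

Because zip pads the shorter array with nil, a short "new" array needs no special handling.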
