Copy the lines of a file into a hashmap in Ruby

I have a file with multiple lines. Each line contains two words and a number, separated by commas, for example a, b, 1. This means that string A and string B have the key 1. I wrote the following piece of code:
File.open(ARGV[0], 'r') do |f1|
while line = f1.gets
puts line
end
end
I'm looking for a way to split each line and store its parts so that the first two words are the value and the trailing number is the key in the hashmap.

Does this work for you?
hash = {}
File.readlines(ARGV[0]).each do |line|
  # chomp drops the trailing newline so the key is '1', not "1\n"
  var = line.chomp.gsub(' ', '').split(',')
  hash[var[2]] = var[0], var[1]
end
This would give:
hash['1'] = ['a','b']
I don't know whether you want to store the number as an integer or a string. If you want an integer, just call var[2].to_i before storing it.
I modified your code a little; I think it's shorter this way. If I'm wrong in any way, do let me know.
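A minimal sketch of the same idea with integer keys, in case that is what is wanted. The helper name build_hash is hypothetical, and the sketch assumes every non-blank line has exactly the form word, word, number (Integer raises on anything else). Lines sharing a number overwrite each other, as in the answer above.

```ruby
# Hypothetical helper: turns lines like "a, b, 1" into { 1 => ["a", "b"] }.
def build_hash(lines)
  lines.each_with_object({}) do |line, hash|
    first, second, number = line.strip.split(/\s*,\s*/)
    next if number.nil?                 # skip blank or incomplete lines
    hash[Integer(number)] = [first, second]
  end
end

p build_hash(["a, b, 1\n", "c, d, 2\n"])
```

With a real file, something like build_hash(File.foreach(ARGV[0])) should work, since the helper only needs an object that can be iterated line by line.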

Related

Take the user input of numbers and print equivalent in the form of a -

I'm trying to take the user's input (numbers separated by commas, e.g., "5,8,11"), and return the equivalent number of "-"s. For example, if the user inputs "4,2,4,5", then the output should be the following:
----
--
----
-----
with each on a new line. I need to take an input string, split it at the commas, which will turn it into an array, and then iterate through the array and print the corresponding number of dashes for each element.
I tried this,
puts "Enter some numbers"
input = gets.chomp
input.split(',')
input.each do |times|
puts "-" * times
end
which raises a NoMethodError. I'm not sure where I went wrong.
Any help would be greatly appreciated.
You need integers for that. Try
input = gets.chomp.split(',').map(&:to_i)
Couple of things...
input.split(',')
This DOES split input, but it doesn't change the contents of the input variable.
What would work...
input = input.split(',')
Secondly, the result will be an array of strings, not integers, so better would be...
input = input.split(',').map(&:to_i)
This will map the string array into an integer array
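Putting both points together, a small sketch could look like the following. The method name dash_lines is made up for illustration, and a literal string stands in for gets.chomp so the snippet runs without user input.

```ruby
# Hypothetical helper: "4,2" -> ["----", "--"]
def dash_lines(input)
  # split returns a new array of strings; to_i converts each one
  # so that String#* receives an integer repeat count.
  input.split(",").map { |n| "-" * n.to_i }
end

puts dash_lines("4,2,4,5")
```

puts prints each array element on its own line, which gives the layout shown in the question.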

Ruby script which can replace a string in a binary file to a different, but same length string?

I would like to write a Ruby script (repl.rb) which can replace a string in a binary file (string is defined by a regex) to a different, but same length string.
It works like a filter, outputs to STDOUT, which can be redirected (ruby repl.rb data.bin > data2.bin), regex and replacement can be hardcoded. My approach is:
#!/usr/bin/ruby
fn = ARGV[0]
regex = /\-\-[0-9a-z]{32,32}\-\-/
replacement = "--0ca2765b4fd186d6fc7c0ce385f0e9d9--"
blk_size = 1024
File.open(fn, "rb") {|f|
  while not f.eof?
    data = f.read(blk_size)
    data.gsub!(regex, replacement)
    print data
  end
}
My problem is that the string may be positioned in the file so that it straddles a block boundary. For example, when blk_size=1024 and the first occurrence of the string begins at byte position 1000, I will not find it in the "data" variable, and the same happens in the next read cycle. Should I process the whole file twice with different block sizes to avoid this worst-case scenario, or is there another approach?
I would posit that a tool like sed might be a better choice for this. That said, here's an idea: Read block 1 and block 2 and join them into a single string, then perform the replacement on the combined string. Split them apart again and print block 1. Then read block 3 and join block 2 and 3 and perform the replacement as above. Split them again and print block 2. Repeat until the end of the file. I haven't tested it, but it ought to look something like this:
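File.open(fn, "rb") do |f|
  # Start with block 1; each pass appends the next block.
  this_block = f.read(blk_size) || ""
  until f.eof?
    data = (this_block + f.read(blk_size)).gsub(regex, replacement)
    # The replacement has the same length, so the first blk_size bytes are final.
    print data.slice!(0, blk_size)
    this_block = data
  end
  # The last block (or the only one) may still need the substitution.
  print this_block.gsub(regex, replacement)
end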
There's probably a nontrivial performance penalty for doing it this way, but it could be acceptable depending on your use case.
Maybe a cheeky
f.pos = f.pos - replacement.size
at the end of the while loop, just before reading the next chunk.
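Rewinding inside the question's loop would print the overlapping bytes twice, so a variant of the same idea is to hold back the tail of each chunk instead of seeking. The sketch below is untested against real binary files; replace_stream and its parameters are hypothetical names, and it assumes every match has exactly the byte length of the replacement.

```ruby
require "stringio"

# Hypothetical helper: copies io_in to io_out, replacing fixed-length matches
# even when a match straddles a chunk boundary.
def replace_stream(io_in, io_out, regex, replacement, blk_size: 1024)
  overlap = replacement.bytesize - 1   # longest partial match that can dangle at a chunk end
  carry = "".b                         # bytes held back from the previous chunk
  while (chunk = io_in.read(blk_size))
    data = (carry + chunk).gsub(regex, replacement)
    keep = [overlap, data.bytesize].min
    io_out.write(data.byteslice(0, data.bytesize - keep))
    carry = data.byteslice(data.bytesize - keep, keep)
  end
  io_out.write(carry)                  # flush whatever is still held back
end

src = ("x" * 1000 + "--" + "a" * 32 + "--" + "y" * 100).b
out = StringIO.new("".b)
replace_stream(StringIO.new(src), out, /--[0-9a-z]{32}--/,
               "--0ca2765b4fd186d6fc7c0ce385f0e9d9--")
```

Here the match starts at byte 1000 and crosses the 1024-byte boundary, which is the case described in the question. Back-to-back matches that share their "--" delimiters might be handled differently than by a single gsub over the whole file.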

Find and print lines in a file exactly matching string or regexp (Ruby)

In ruby 1.9.3, I'm trying to write a program that will find all words with n number of characters taken from an arbitrary set of characters. So for instance, if I'm given the characters [ b, a, h, s, v, i, e, y, k, s, a ] and n = 5, I need to find all 5-letter words that can be made using only those characters. Using the 2of4brif.txt word list from http://wordlist.sourceforge.net/ (to include British words and spellings, too), I have attempted the following code:
a = %w[b a h s v i e y k s a]
a.permutation(5).map(&:join).each do |x|
File.open('2of4brif.txt').each_line do |line|
puts line if line.match(/^[#{x}]+$/)
end
end
This does nothing (no error message, no output, as if frozen). I have also attempted variations based on the following threads:
What's the best way to search for a string in a file?
Ruby find string in file and print result
How to search for exact matching string in a text file using Ruby?
Finding lines in a text file matching a regular expression
Match a content with regexp in a file?
How to open a file and search for a word?
Every variation I have tried has resulted in either:
1) Freezing;
2) Printing all words from the list that contain the 5-character permutations (I assume that's what it's doing; I didn't go through and check all of the thousands of printed words); or
3) Printing all 5-character permutations found within words in the list (again, I assume that's what it's doing).
Again, I'm not looking for words that contain the 5-character permutations, I'm looking for 5-character permutations that are complete words in and of themselves, so a line in the text file should only be printed if it is a perfect match with a permutation.
What am I doing wrong? Thanks in advance!
You're not really using regular expressions here. Your program is very inefficient, not only because you're re-opening the file for every single permutation, as has been pointed out (and there are about 55k of them!), but above all because all you need to do is test the pattern
/^[bahsvieyksa]{5}$/
against each line of the file.
I would thus suggest:
File.open('2of4brif.txt').each_line do |line|
puts line if line.match(/^[bahsvieyksa]{5}$/)
end
as a much more efficient alternative
This works for me using the english.0 file on that page (sorry, I couldn't find the specific file you mentioned):
a = %w[b a h s v i e y k s a l d n]
dict = {}
a.permutation(5).each do |p|
dict[p.join('')] = true
end
File.open('english.0').each_line do |line|
  # chomp! returns nil when there is nothing to strip, so avoid chaining bang methods
  line = line.chomp.downcase
  puts line if dict[line]
end
The structure should be pretty clear - I build the dictionary of permutations up front in one giant hash (you may need to revisit this depending on input sizes, but memory is cheap these days), and then I used the fact that the input was "one word per line" to simply key into that hash.
Also note, in my version, I read through the file only once. In yours you scan the file once per permutation, and there are thousands of permutations.
Simpler is to just count the occurrence of each char and compare:
a = %w[b a h s v i e y k s a l d n]
File.read('2of4brif.txt').split("\n").each do |line|
puts line if line.size == 5 && line.chars.all?{|x| line.count(x) <= a.count(x)}
end
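The counting approach can also be written with an explicit tally of the available letters, which makes the rule "use each letter at most as often as it is given" visible. This is only a sketch: buildable? and matching_words are hypothetical names, and an in-memory list stands in for the word file. Unlike a bare character class, it rejects a word such as "sassy", which needs three s's when only two are available.

```ruby
# True if word can be spelled using each entry of letters at most once.
def buildable?(word, letters)
  available = Hash.new(0)
  letters.each { |ch| available[ch] += 1 }
  # Use up one letter per character; a negative count means it ran out.
  word.each_char.all? { |ch| (available[ch] -= 1) >= 0 }
end

# Hypothetical helper: keeps the words of the given length that are buildable.
def matching_words(words, letters, length)
  words.map { |w| w.chomp.downcase }
       .select { |w| w.size == length && buildable?(w, letters) }
end

letters = %w[b a h s v i e y k s a]
sample  = ["shake\n", "sassy\n", "bikes\n", "hello\n", "has\n"]
puts matching_words(sample, letters, 5)
```

For the real word list, passing File.foreach('2of4brif.txt') as the first argument should read the file only once.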
For me the following worked out
File.open('file.txt').each_line do |line|
puts line if line[/<regexp>/]
end

Double "gsub" Variable

Is it possible to use variables in both fields of the gsub method?
I'm trying to get this piece of code to work:
$I = 0
def random_image
$I.to_s
random = rand(1).to_s
logo = File.read('logo-standart.txt')
logo_aleatoire = logo.gsub(/#{$I}/, random)
File.open('logo-standart.txt', "w") {|file| File.puts logo_aleatoire}
$I.to_i
$I += 1
end
Thanks in advance!
filecontents = File.read('logo-standart.txt')
filecontents.gsub!(/\d+/){rand(100)}
File.open("logo-standart.txt","w"){|f| f << filecontents }
The magic line is the second line.
The gsub! function modifies the string in-place, unlike the gsub function, which would return a new string and leave the first string unmodified.
The single parameter that I passed to gsub! is the pattern to match. Here, the goal is to match any string of one or more digits -- this is the number that you're going to replace. There's no need to loop through all of the possible numbers running gsub on each one. You can even match numbers as high as a googol (or higher) without your program taking longer and longer to run.
The block that gsub! takes is evaluated each time the pattern matches to programmatically generate a replacement number. So each time, you get a different random number. This is different from the more usual form of gsub! that takes two parameters -- there the second parameter is evaluated once, before any pattern matching occurs, and all matches are replaced by the same string.
Note that the way this is structured, you get a new random number for each match. So if the number 307 appears twice, it turns into two different random numbers.
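The difference between the two forms can be seen in a small sketch. A counter stands in for rand here (an assumption made purely so the output is predictable), and the sample text is invented.

```ruby
text = "a10 b20 c30"

# Two-argument form: the replacement expression is evaluated once, before matching.
counter = 0
same = text.gsub(/\d+/, (counter += 1).to_s)

# Block form: the block runs again for every match.
counter = 0
fresh = text.gsub(/\d+/) { counter += 1 }

puts same   # a1 b1 c1
puts fresh  # a1 b2 c3
```

The block's return value is converted to a string before substitution, which is why the answer's block can return the integer from rand(100) directly.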
If you wanted to map 307 to the same random number each time, you could do the following:
filecontents = File.read('logo-standart.txt')
randomnumbers = Hash.new{|h,k| h[k]=rand(100)}
filecontents.gsub!(/\d+/){|match| randomnumbers[match]}
File.open("logo-standart.txt","w"){|f| f << filecontents }
Here, randomnumbers is a hash that lets you look up a number and find the random number it corresponds to. The block passed when constructing the hash tells the hash what to do when it is asked for a number it hasn't seen before -- in this case, generate a new random number and remember the mapping. So gsub!'s block just asks the hash to map numbers for it, and randomnumbers takes care of generating a new random number when you encounter a new number from the original file.

How do I extract the right most number in a string?

I have strings like this:
https://www.facebook.com/username_with_number_14/posts/101505775425654414
https://www.facebook.com/username/posts/101505775425654466
I need to extract the number on the end of the string in Ruby. In the first string, it is the second and last number, whereas in the second string it is the first, only and last number.
At the moment I am extracting the number like this:
int1 = Regexp.new('.*?(\\d+)',Regexp::IGNORECASE).match()[1]
But when this is applied to the first string, it extracts the number part of the username, not the desired number.
How can I do it so that it will work on both strings?
text = <<ENDTEXT
https://www.facebook.com/username_with_number_14/posts/101505775425654414
https://www.facebook.com/username/posts/101505775425654466
ENDTEXT
p text.lines.map{|line| line.scan(/\d+/).last}
#=> ["101505775425654414", "101505775425654466"]
A regexp like this works for me:
^.*?(\d+)$
look here: http://rubular.com/r/CJzsgjedqJ
Try this
int1 = Regexp.new('.*\\/(\\d+)$',Regexp::IGNORECASE).match()[1]
The $ matches the end of the string, so capturing group 1 contains all the digits from the last / to the end of the string.
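For reference, a compact sketch of the end-anchored approach using String#[] with a regexp. The helper name trailing_number is hypothetical, and the sketch assumes the URL has no trailing newline or whitespace (strip it first otherwise).

```ruby
# Hypothetical helper: returns the run of digits at the very end of the string,
# or nil if the string does not end in a digit.
def trailing_number(url)
  url[/\d+\z/]
end

urls = [
  "https://www.facebook.com/username_with_number_14/posts/101505775425654414",
  "https://www.facebook.com/username/posts/101505775425654466"
]
p urls.map { |u| trailing_number(u) }
```

The result is a string; calling to_i on it gives an integer if one is needed.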
