Need help printing the contents of a Ruby hash as a table - ruby

I have a file that contains this:
PQRParrot, Quagga, Raccoon
DEFDo statements, Else statements, For statements
GHIGeese, Hippos, If statements
YZ Yak, Zebra
JKLJelly Fish, Kudu, Lynx
MNOManatee, Nautilus, Octopus
ABCApples, Boas, Cats
VWXVulture, While statements, Xmen
STUSea Horse, Tapir, Unicorn
I need to display it in a table like this:
Key Data
ABC Apples, Boas, Cats
DEF Do statements, Else statements, For statements
GHI Geese, Hippos, If statements
JKL Jelly Fish, Kudu, Lynx
MNO Manatee, Nautilus, Octopus
PQR Parrot, Quagga, Raccoon
STU Sea Horse, Tapir, Unicorn
VWX Vulture, While statements, Xmen
YZ Yak, Zebra
Here is the code that I have so far:
lines = File.open("file.txt").read.split
fHash = {}
lines.each do |line|
next if line == ""
fHash[line[0..2]] = line[3..-1]
end
f = File.open("file.txt")
fHash = {}
loop do
x = f.gets
break unless x
fHash[x[0..2]] = x[3..-1]
end
fHash = fHash.to_a.sort.to_h
puts fHash
f.close
And this is what the code outputs:
{ "ABC" => "Apples, Boas, Cats\n",
"DEF" => "Do statements, Else statements, For statements\n",
"GHI" => "Geese, Hippos, If statements\n",
"JKL" => "Jelly Fish, Kudu, Lynx\n",
"MNO" => "Manatee, Nautilus, Octopus\n",
"PQR" => "Parrot, Quagga, Raccoon\n",
"STU" => "Sea Horse, Tapir, Unicorn\n",
"VWX" => "Vulture, While statements, Xmen\n",
"YZ " => "Yak, Zebra\n"
}
So what I'm trying to do is read the contents of the file, use the first three characters of each line as the key and the rest as the data, sort the hash by key, then display the data as a table.
I have looked around, found a few things similar to my issue but nothing worked out for me.

I think you're overthinking this. If you have a file with those contents, to print the table all you need to do is insert a space after the third character of each line and then sort them (or the other way around). That's pretty simple:
lines = File.foreach("homework02.txt")
.map {|line| line.insert(3, " ") }
puts "Key Data"
puts lines.sort
If instead you want to build a Hash from the lines of the file, all you have to do is this:
hsh = File.foreach("homework02.txt")
.map {|line| [ line.slice!(0,3), line ] }
.sort.to_h
This builds an array of two-element arrays whose first element is the first three characters of each line and whose second is the rest of the line, then sorts it and turns it into a hash.
Then, to print your table:
puts "Key Data"
puts hsh.map {|key, val| "#{key} #{val}" }
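If you want the columns to line up even when a key is shorter than three characters (as with "YZ "), here is a minimal sketch. The method name table_rows is hypothetical, and it assumes, as in the question, that the key is always the first three characters of each line:

```ruby
# Hypothetical helper: turns "KEYdata" lines into aligned table rows.
# Assumes the key is always the first three characters of the line.
def table_rows(lines)
  pairs = lines.map(&:chomp).reject(&:empty?).map do |line|
    [line[0, 3].strip, line[3..-1].to_s.strip]
  end
  # Pad the key column to the widest key (or the header, if that is wider).
  width = (pairs.map { |key, _| key.length } + ['Key'.length]).max
  [['Key', 'Data'], *pairs.sort].map { |key, data| "#{key.ljust(width)} #{data}" }
end
```

Since File.foreach without a block returns an enumerator, something like puts table_rows(File.foreach("homework02.txt")) should print the table.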

I got it to work by changing the way it sorts. Updated code below.
lines = File.open("homework02.txt").read.split
fHash = {}
lines.each do |line|
next if line == ""
fHash[line[0..2]] = line[3..-1]
end
f = File.open("homework02.txt")
fHash = {}
loop do
x = f.gets
break unless x
fHash[x[0..2]] = x[3..-1]
end
fHash = Hash[fHash.sort_by { |k, v| k }]
print "Key ", " Data \n"
fHash.each do |key, val|
print key, " ", val
end
f.close

I have assumed that every line begins with one or more capital letters, followed by an optional space, followed by a capital letter, followed by a lowercase letter.
Code
R = /
\A[A-Z]+ # Match start of string followed by one or more capital letters
\K # Forget everything matched so far
(?=[A-Z][a-z]) # Match a capital letter followed by a lowercase letter
# in a positive lookahead
/x # Extended/free-spacing regex definition mode
Read the file, line by line, format each line, partition each line on the first space and sort:
def make_string(fname)
File.foreach(fname).map { |s| s.gsub(R, ' ').chomp.partition(' ') }.
sort.
map(&:join)
end
If you instead wish to create the specified hash, you could write:
def make_hash(fname)
File.foreach(fname).map { |s| s.gsub(R, ' ').chomp.partition(' ') }.
sort.
map { |f,_,l| [f,l] }.
to_h
end
In the regex the first part of the string cannot be matched in a positive lookbehind because the match is variable-length. That's why I used \K, which does not have that limitation.
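To see what \K does in isolation, here is a small sketch that splits a single line into key and data. The names KEY_BOUNDARY and split_key are hypothetical, and the same assumptions about the line format apply:

```ruby
# Zero-width match at the boundary between the all-caps key and the data.
# \K discards the key from the match, so sub only inserts a space there.
KEY_BOUNDARY = /\A[A-Z]+\K(?=[A-Z][a-z])/

# Hypothetical helper: returns [key, data] for one line.
def split_key(line)
  key, _, data = line.chomp.sub(KEY_BOUNDARY, ' ').partition(' ')
  [key, data]
end
```

For lines that already contain a space after the key (such as "YZ Yak, Zebra") the regex does not match, so the line is left alone and partition splits on the existing space.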
Examples
First, let's create the file:
str = <<_
PQRParrot, Quagga, Raccoon
DEFDo statements, Else statements, For statements
GHIGeese, Hippos, If statements
YZ Yak, Zebra
JKLJelly Fish, Kudu, Lynx
MNOManatee, Nautilus, Octopus
ABCApples, Boas, Cats
VWXVulture, While statements, Xmen
STUSea Horse, Tapir, Unicorn
_
FName = 'temp'
File.write(FName, str)
#=> 265
Then
puts make_string(FName)
ABC Apples, Boas, Cats
DEF Do statements, Else statements, For statements
GHI Geese, Hippos, If statements
JKL Jelly Fish, Kudu, Lynx
MNO Manatee, Nautilus, Octopus
PQR Parrot, Quagga, Raccoon
STU Sea Horse, Tapir, Unicorn
VWX Vulture, While statements, Xmen
YZ Yak, Zebra
make_hash(FName)
#=> {"ABC"=>"Apples, Boas, Cats",
# "DEF"=>"Do statements, Else statements, For statements",
# "GHI"=>"Geese, Hippos, If statements",
# "JKL"=>"Jelly Fish, Kudu, Lynx",
# "MNO"=>"Manatee, Nautilus, Octopus",
# "PQR"=>"Parrot, Quagga, Raccoon",
# "STU"=>"Sea Horse, Tapir, Unicorn",
# "VWX"=>"Vulture, While statements, Xmen",
# "YZ"=>"Yak, Zebra"}
As a second example, suppose:
str = <<_
PQRSTUVParrot, Quagga, Raccoon
DEFDo statements, Else statements, For statements
Y Yak, Zebra
_
FName = 'temp'
File.write(FName, str)
#=> 94
Then
puts make_string(FName)
DEF Do statements, Else statements, For statements
PQRSTUV Parrot, Quagga, Raccoon
Y Yak, Zebra
make_hash(FName)
# => {"DEF"=>"Do statements, Else statements, For statements",
# "PQRSTUV"=>"Parrot, Quagga, Raccoon", "Y"=>"Yak, Zebra"}

Related

Ruby regex for checking if 2 words exist with correct capital of first letter of each

I have achieved checking if capitalization exists with the first letter of each word with the below,
NAME_CAPS = /^\p{Lu}\S*{2,}(?:[[:space:]]+\p{Lu}\S*{2,})*$/
While keeping that check, I would also like to verify that two words are entered; you can see my attempt above, where I added {2,}. The problem is that input with only one correctly capitalized word currently passes, but two words must be entered.
More relevant code:
def valid_name?(name)
!!name.match(NAME_CAPS)
end
puts "Now, go for it!"
while (name=gets)
names = name.split(" ", 2)
if valid_name?(name)
puts "Correct."
break
else
puts "Wrong."
end
end
I know you're trying to do this with a regex, but you can also do this:
def valid_name?(name)
names = name.split
names.size == 2 && names.all?{|n| n == n.capitalize}
end
valid_name?("john doe") #=> false
valid_name?("John Doe") #=> true
valid_name?("John") #=> false
valid_name?("John Q Doe") #=> false
You can change zero-or-more repetition (*) to one-or-more repetition (+) in the second word group:
NAME_CAPS = /^\p{Lu}\S*{2,}(?:[[:space:]]+\p{Lu}\S*{2,})+$/
Btw, this pattern will match longer sequences as well:
pry> 'Joe'.match(NAME_CAPS)
=> nil
pry> 'Joe Doe'.match(NAME_CAPS)
=> #<MatchData "Joe Doe">
pry> 'Joe Doe Zoe'.match(NAME_CAPS)
=> #<MatchData "Joe Doe Zoe">
To avoid this (and simplify the pattern), you can drop the repetition:
NAME_CAPS = /^\p{Lu}\S*{2,}(?:[[:space:]]+\p{Lu}\S*{2,})$/
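For comparison, here is a stripped-down sketch of a pattern for exactly two capitalized words. It is not the asker's pattern: the names TWO_CAPITALIZED_WORDS and two_capitalized_words? are hypothetical, it drops the {2,} quantifiers, and it anchors with \A and \z instead of ^ and $ so that multi-line input cannot slip through:

```ruby
# Hypothetical pattern: exactly two whitespace-separated words,
# each starting with an uppercase letter.
TWO_CAPITALIZED_WORDS = /\A\p{Lu}\S*[[:space:]]+\p{Lu}\S*\z/

def two_capitalized_words?(input)
  # strip removes the trailing newline that gets leaves on the input.
  input.strip.match?(TWO_CAPITALIZED_WORDS)
end
```

Note that match? requires Ruby 2.4 or later; on older versions !!(input.strip =~ TWO_CAPITALIZED_WORDS) should behave the same way.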

Replace matched lines in a file but ignore commented-out lines using Ruby

How can I replace text in a file in Ruby without touching commented-out lines? To be more specific, I want to change a variable in a configuration file. An example would be:
irb(main):014:0> string = "#replaceme\n\t\s\t\s# replaceme\nreplaceme\n"
=> "#replaceme\n\t \t # replaceme\nreplaceme\n"
irb(main):015:0> puts string.gsub(%r{replaceme}, 'replaced')
#replaced
# replaced
replaced
=> nil
irb(main):016:0>
Desired output:
#replaceme
# replaceme
replaced
I don't fully understand the question. To do a find and replace in each line, disregarding text following a pound sign, one could do the following.
def replace_em(str, source, replacement)
str.split(/(\#.*?$)/).
map { |s| s[0] == '#' ? s : s.gsub(source, replacement) }.
join
end
str = "It was known that # that dog has fleas, \nbut who'd know that that dog # wouldn't?"
replace_em(str, "that", "the")
#=> "It was known the # that dog has fleas, \nbut who'd know the the dog # wouldn't?"
str = "#replaceme\n\t\s\t\s# replaceme\nreplaceme\n"
replace_em(str, "replaceme", "replaced")
#=> "#replaceme\n\t \t # replaceme\nreplaced\n"
For the string
str = "It was known that # that dog has fleas, \nbut who'd know that that dog # wouldn't?"
source = "that"
replacement = "the"
the steps are as follows.
a = str.split(/(\#.*?$)/)
#=> ["It was known that ", "# that dog has fleas, ",
# "\nbut who'd know that that dog ", "# wouldn't?"]
Note that the body of the regular expression must be put in a capture group in order that the text used to split the string be included as elements in the resulting array. See String#split.
b = a.map { |s| s[0] == '#' ? s : s.gsub(source, replacement) }
#=> ["It was known the ", "# that dog has fleas, ",
# "\nbut who'd know the the dog ", "# wouldn't?"]
b.join
#=> "It was known the # that dog has fleas, \nbut who'd know the the dog # wouldn't?"
How about this?
puts string.gsub(%r{^replaceme}, 'replaced')
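If the goal is only to skip lines that are commented out entirely (possibly after leading whitespace), a line-by-line sketch may be enough. The method name replace_outside_comments is hypothetical, and unlike replace_em above it does not protect a trailing comment on a line that also contains code:

```ruby
# Hypothetical helper: applies gsub to every line except those whose
# first non-whitespace character is '#'.
def replace_outside_comments(text, source, replacement)
  text.each_line.map do |line|
    line.lstrip.start_with?('#') ? line : line.gsub(source, replacement)
  end.join
end
```

To rewrite a file, one could read it with File.read, pass the contents through this method, and write the result back with File.write.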

How do I extract the part of a string whose individual words begin with letters?

I'm using Ruby 2.4. Let's say I have a string that has a number of spaces in it
str = "abc def 123ffg"
How do I capture all the consecutive words at the beginning of the string that begin with a letter? So for example, in the above, I would want to capture
"abc def"
And if I had a string like
"aa22 b cc 33d ff"
I would want to capture
"aa22 b cc"
but if my string were
"66dd eee ff"
I would want to return nothing because the first word of that string does not begin with a letter.
If you can spare the extra spaces between words, you could then split the string and iterate the resulting array with take_while, using a regex to get the desired output; something like this:
str = "abc def 123ffg"
str.split.take_while { |word| word[0] =~ /[[:alpha:]]/ }
#=> ["abc", "def"]
The output is an array, but if a string is needed, you could use join at the end:
str.split.take_while { |word| word[0] =~ /[[:alpha:]]/ }.join(" ")
#=> "abc def"
More examples:
"aa22 b cc 33d ff".split.take_while { |word| word[0] =~ /[[:alpha:]]/ }
#=> ["aa22", "b", "cc"]
"66dd eee ff".split.take_while { |word| word[0] =~ /[[:alpha:]]/ }
#=> []
The Regular Expression
There's usually more than one way to match a pattern, although some are simpler than others. A relatively simple regular expression that works with your inputs and expected outputs is as follows:
/(?:(?:\A|\s*)\p{L}\S*)+/
This matches one or more strings when all of the following conditions are met:
start-of-string, or zero or more whitespace characters
followed by a Unicode category of "letter"
followed by zero or more non-whitespace characters
The first item in the list, which is the second non-capturing group, is what allows the match to be repeated until a word starts with a non-letter.
The Proofs
regex = /(?:(?:\A|\s*)\p{L}\S*)+/
regex.match 'aa22 b cc 33d ff' #=> #<MatchData "aa22 b cc">
regex.match 'abc def 123ffg' #=> #<MatchData "abc def">
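regex.match '66dd eee ff' #=> #<MatchData "dd eee ff">
# Because the pattern is not anchored, this last match starts at "dd" rather than being empty.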
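Because the pattern above is not anchored to the start of the string, it can begin matching in the middle of a string such as '66dd eee ff'. Here is a sketch of an anchored variant that returns an empty string in that case; the names LEADING_WORDS and leading_letter_words are hypothetical:

```ruby
# Anchored at \A: the first word must start with a letter, and each
# following word is included only while it also starts with a letter.
LEADING_WORDS = /\A\s*\p{L}\S*(?:\s+\p{L}\S*)*/

def leading_letter_words(str)
  # String#[] with a regex returns nil when nothing matches; to_s maps that to "".
  str[LEADING_WORDS].to_s.strip
end
```

Unlike the split-based approach, this keeps whatever whitespace separated the words in the original string.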
The sub method can be used to replace everything that needs to be removed with an empty string ''.
In this case, a first sub is needed to remove the whole text if it starts with a digit. Then a second sub removes everything from the first later word that starts with a digit. Note that this only handles words that start with digits, not other non-letter characters.
Answer:
str.sub(/^\d+.*/, '').sub(/\s+\d+.*/, '')
Outputs:
str = "abc def 123ffg"
# => "abc def"
str = "aa22 b cc 33d ff"
# => "aa22 b cc"
str = "66dd eee ff"
# => ""

If a line matches the regex it becomes a key; all lines after it become its value until the next match

I have data that looks like
>header\n
something1\n
something2\n
something3\n
>header2\n ...
I want every line that starts with > to be a key, and all the following lines to be its value until the next header. How can I create a hash in Ruby in this format?
{{:>header=>"something1something2something3"},
{:>header2=>"something4something5something6"}, ...}
Regex
This regex scans for a line beginning with >, followed by other lines with characters that aren't >:
str = ">header
something1
something2
something3
>header2
>header3
something4
something5"
p str.scan(/(^>[^\n]+)([^>]+)/m).map { |header, lines| [header.to_sym, lines.strip] }.to_h
# {:">header"=>"something1\nsomething2\nsomething3", :">header2"=>"", :">header3"=>"something4\nsomething5"}
File.foreach
If you're working with huge files, it might be a good idea to read them line by line with File.foreach:
headers = {}
last_header = nil
File.foreach('headers.txt') do |line|
if line =~ /^>/
last_header = line.chomp.to_sym
headers[last_header] = ""
elsif last_header
headers[last_header] << line
end
end
p headers
#=> {:">header"=>"something1\nsomething2\nsomething3\n", :">header2"=>"", :">header3"=>"something4\nsomething5\n"}
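Another option is Enumerable#slice_before, which starts a new chunk at every header line. This is only a sketch: the method name parse_records is hypothetical, it uses string keys rather than symbols, and it joins the value lines without newlines, as in the format the question asks for:

```ruby
# Hypothetical helper: groups lines into { header => joined body } pairs.
# `lines` can be an array of lines or an enumerator such as File.foreach(path).
def parse_records(lines)
  lines
    .slice_before { |line| line.start_with?('>') }
    .select { |chunk| chunk.first.start_with?('>') } # drop lines before the first header
    .map { |header, *body| [header.chomp, body.map(&:chomp).join] }
    .to_h
end
```

With a file, parse_records(File.foreach('headers.txt')) should read line by line, although the resulting hash still holds all of the data in memory.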

Ruby regex to get text blocks including delimiters

When using scan in Ruby, we are searching for a block within a text file.
Sample file:
sometextbefore
begin
sometext
end
sometextafter
begin
sometext2
end
sometextafter2
We want the following result in an array:
["begin\nsometext\nend","begin\nsometext2\nend"]
With this scan method:
textfile.scan(/begin\s.(.*?)end/m)
we get:
["sometext","sometext2"]
We want the begin and end still in the output, not cut off.
Any suggestions?
You may remove the capturing group completely:
textfile.scan(/begin\s.*?end/m)
The String#scan method returns captured values only if you have capturing groups defined inside the pattern, thus a non-capturing one should fix the issue.
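The difference is easy to see side by side. The patterns below are simplified stand-ins rather than the ones from the question, and the constant names are hypothetical; they assume that begin and end each sit on a line of their own with no indentation:

```ruby
TEXT = "sometextbefore\nbegin\nsometext\nend\nsometextafter\n" \
       "begin\nsometext2\nend\nsometextafter2\n"

# With a capturing group, scan returns only the captures (one array per match).
WITH_GROUP = TEXT.scan(/^begin\n(.*?)\nend$/m)
# => [["sometext"], ["sometext2"]]

# Without any group, scan returns each whole match as a string.
WITHOUT_GROUP = TEXT.scan(/^begin\n.*?\nend$/m)
# => ["begin\nsometext\nend", "begin\nsometext2\nend"]
```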
UPDATE
If the lines inside the blocks must be trimmed from leading/trailing whitespace, you can just use a gsub against each matched block of text to remove all the horizontal whitespace (with the help of \p{Zs} Unicode category/property class):
textfile.scan(/begin\s.*?end/m).map { |s| s.gsub(/^\p{Zs}+|\p{Zs}+$/, "") }
Here, each match is passed to a block where /^\p{Zs}+|\p{Zs}+$/ matches either the start of a line with 1+ horizontal whitespace(s) (see ^\p{Zs}+), or 1+ horizontal whitespace(s) at the end of the line (see \p{Zs}+$).
Here's another approach, using Ruby's flip-flop operator. I cannot say I would recommend this approach, but Rubyists should understand how the flip-flop operator works.
First let's create a file.
str =<<_
some
text
at beginning
begin
 some
 text
 1
end
some text
between
begin
 some
 text
 2
end
some text at end
_
#=> "some\ntext\nat beginning\nbegin\n some\n text\n 1\nend\n...at end\n"
FName = "text"
File.write(FName, str)
Now read the file line-by-line into the array lines:
lines = File.readlines(FName)
#=> ["some\n", "text\n", "at beginning\n", "begin\n", " some\n", " text\n",
# " 1\n", "end\n", "some text\n", "between\n", "begin\n", " some\n",
# " text\n", " 2\n", "end\n", "some text at end\n"]
We can obtain the desired result as follows.
lines.chunk { |line| true if line =~ /^begin\s*$/ .. line =~ /^end\s*$/ }.
map { |_,arr| arr.map(&:strip).join("\n") }
#=> ["begin\nsome\ntext\n1\nend", "begin\nsome\ntext\n2\nend"]
The two steps are as follows.
First, select and group the lines of interest, using Enumerable#chunk with the flip-flop operator.
a = lines.chunk { |line| true if line =~ /^begin\s*$/ .. line =~ /^end\s*$/ }
#=> #<Enumerator: #<Enumerator::Generator:0x007ff62b981510>:each>
We can see the objects that will be generated by this enumerator by converting it to an array.
a.to_a
#=> [[true, ["begin\n", " some\n", " text\n", " 1\n", "end\n"]],
# [true, ["begin\n", " some\n", " text\n", " 2\n", "end\n"]]]
Note that the flip-flop operator is distinguished from a range definition by making it part of a logical expression. For that reason we cannot write
lines.chunk { |line| line =~ /^begin\s*$/ .. line =~ /^end\s*$/ }.to_a
#=> ArgumentError: bad value for range
The second step is the following:
b = a.map { |_,arr| arr.map(&:strip).join("\n") }
#=> ["begin\nsome\ntext\n1\nend", "begin\nsome\ntext\n2\nend"]
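For readers who find the flip-flop hard to follow, roughly the same selection can be written with explicit state. This is a sketch, not a drop-in replacement: the method name extract_blocks is hypothetical, and it assumes that begin and end appear alone on unindented lines, as in the example file:

```ruby
# Hypothetical helper: collects the lines from each "begin" line through
# the next "end" line, stripped and joined with newlines.
def extract_blocks(lines)
  blocks = []
  current = nil # nil means we are outside a block
  lines.each do |line|
    current = [] if current.nil? && line.match?(/\Abegin\s*\z/)
    next unless current

    current << line.strip
    if line.match?(/\Aend\s*\z/)
      blocks << current.join("\n")
      current = nil
    end
  end
  blocks
end
```

The current variable plays the role of the flip-flop's hidden state: it is switched on by a begin line and off again after the matching end line.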
Ruby has some great methods in Enumerable. slice_before and slice_after can help with this sort of problem:
string = <<EOT
sometextbefore
begin
sometext
end
sometextafter
begin
sometext2
end
sometextafter2
EOT
ary = string.split # => ["sometextbefore", "begin", "sometext", "end", "sometextafter", "begin", "sometext2", "end", "sometextafter2"]
.slice_after(/^end/) # => #<Enumerator: #<Enumerator::Generator:0x007fb1e20b42a8>:each>
.map{ |a| a.shift; a } # => [["begin", "sometext", "end"], ["begin", "sometext2", "end"], []]
ary.pop # => []
ary # => [["begin", "sometext", "end"], ["begin", "sometext2", "end"]]
If you want the resulting sub-arrays joined then that's an easy step:
ary.map{ |a| a.join("\n") } # => ["begin\nsometext\nend", "begin\nsometext2\nend"]

Resources