Filter through a text file - ruby

I want to sort through a text file and leave only a certain section. I have this text in the text file:
{
"id"=>”0000001”,
"type"=>”cashier”,
"summary"=>”Henock”,
"self"=>"https://google.com/accounts/0000001”,
"html_url"=>"https://google.com/accounts/0000001”
}
{
"id"=>”0000002”,
"type"=>”cashier”,
"summary"=>”Vic”,
"self"=>"https://google.com/accounts/0000002”,
"html_url"=>"https://google.com/accounts/0000002”
}
{
"id"=>”0000003”,
"type"=>”cashier”,
"summary"=>”Mo”,
"self"=>"https://google.com/accounts/0000003”,
"html_url"=>"https://google.com/accounts/0000003”
}
How would I sort it so that only the information with person "Mo" is shown?
This is what I tried:
somefile.readlines("filename.txt").grep /Mo}/i
but it is useless.

Code
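def retrieve_block(fname, summary_target)
  arr = []
  File.foreach(fname) do |line|
    next if line.strip.empty?
    arr << line
    # Each record spans exactly 7 lines: "{", five key/value lines and "}"
    next unless arr.size == 7
    # The "summary" entry is the fourth line of a record
    return arr.join if arr[3].match?(/"summary"=>"#{Regexp.escape(summary_target)}"/)
    arr = []
  end
  nil
end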
Example
Let's first create a file.
text =<<_
{
"id"=>"0000001",
"type"=>"cashier",
"summary"=>"Henock",
"self"=>"https://google.com/accounts/0000001",
"html_url"=>"https://google.com/accounts/0000001"
}
{
"id"=>"0000003",
"type"=>"cashier",
"summary"=>"Mo",
"self"=>"https://google.com/accounts/0000003",
"html_url"=>"https://google.com/accounts/0000003"
}
_
All of the keys and values represented in this string are surrounded with double-quotes. In the question however, many of these keys and values are surrounded by special characters that have a superficial appearance of a double quote. I have assumed that those characters would be converted to double quotes in a pre-processing step.
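The pre-processing step assumed above could be sketched as follows. This is only an illustration: normalize_quotes is a hypothetical helper name, and it assumes the only offending characters are the typographic quotes “ and ”.

```ruby
# Hypothetical helper (not part of the original answer): replaces the
# typographic double quotes with plain ASCII double quotes.
def normalize_quotes(text)
  # When the replacement set is shorter than the source set, String#tr
  # pads it with its last character, so both curly quotes map to '"'.
  text.tr('“”', '"')
end

puts normalize_quotes('"summary"=>”Mo”,')
```

Running every line of the input through such a helper before writing the file would give text in the form used in the rest of this example.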
FName = "test"
File.write(FName, text)
#=> 325
puts retrieve_block(FName, "Mo")
{
"id"=>"0000003",
"type"=>"cashier",
"summary"=>"Mo",
"self"=>"https://google.com/accounts/0000003",
"html_url"=>"https://google.com/accounts/0000003"
}
This should work because of the consistent format of the file.
To return a hash, rather than a string, a slight modification is required.
def retrieve_block(fname, summary_target)
  h = {}
  File.foreach(fname) do |line|
    line.strip!
    next if line.empty? || line == '{'
    if line == '}'
      if h["summary"] == summary_target
        break h
      else
        h = {}
      end
    else
      k, v = line.delete('",').split("=>")
      h[k] = v
    end
  end
end
retrieve_block(FName, "Mo")
#=> {"id"=>"0000003",
# "type"=>"cashier",
# "summary"=>"Mo",
# "self"=>"https://google.com/accounts/0000003",
# "html_url"=>"https://google.com/accounts/0000003"}
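As an alternative, and assuming the file is small enough to read into memory and that no value contains a closing brace, the records could be pulled apart with a regular expression. blocks_with_summary is a hypothetical name, and this is a sketch rather than a replacement for the methods above. It also hints at why the attempt in the question finds nothing: grep tests one line at a time, and no single line contains Mo followed by }.

```ruby
# Hypothetical helper: returns every {...} record whose "summary" value
# equals summary_target. Assumes plain double quotes and no "}" in values.
def blocks_with_summary(text, summary_target)
  wanted = /"summary"=>"#{Regexp.escape(summary_target)}"/
  # /m lets "." match newlines; ".*?" stops at the first closing brace.
  text.scan(/\{.*?\}/m).select { |block| block.match?(wanted) }
end

sample = <<~RECORDS
  {
  "id"=>"0000001",
  "summary"=>"Henock"
  }
  {
  "id"=>"0000003",
  "summary"=>"Mo"
  }
RECORDS

puts blocks_with_summary(sample, "Mo")
```

With a real file, the text argument would come from File.read(fname).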

Related

Simplify similar loops into one

I am writing a braille converter. I have this method to handle the top line of a braille character:
def top(input)
  braille = ""
  @output_first = ""
  @top.each do |k, v|
    input.chars.map do |val|
      if k.include?(val)
        braille = val
        braille = braille.gsub(val, v)
        @output_first = @output_first + braille
      end
    end
  end
  @output_first
end
I'm repeating the same each loop for the middle and bottom lines of a character. The only difference from the method above is that @top is replaced with @mid and @bottom to correspond to the respective lines.
I'm trying to figure out a way to simplify the each loop so I can call it on the top, mid and bottom lines.
You can put the loop in a separate method.
def top(input)
  @output_first = handle_line(@top, input)
end
# line is one of the lookup hashes (@top, @mid or @bottom);
# input is passed in explicitly so the helper can see it
def handle_line(line, input)
  result = ''
  line.each do |k, v|
    input.chars.each do |val|
      if k.include?(val)
        braille = val
        braille = braille.gsub(val, v)
        result = result + braille
      end
    end
  end
  result
end
You can then call handle_line in your @mid and @bottom processing.
I'm not sure what's in the @top variable, but I believe braille has a limited number of characters, so I would consider a map structure instead:
BRAILLE_MAP = {
  'a' => ['..',' .','. '], # just an example top,mid,bot line for character
  'b' => ['..','..',' '],
  # ... whole map
}
def lines(input)
  top = '' # representation of each line
  mid = ''
  bot = ''
  input.each_char do |c|
    representation = BRAILLE_MAP[c]
    next unless representation # handle invalid char
    top << representation[0] # add representation to each line
    mid << representation[1]
    bot << representation[2]
  end
  [top,mid,bot] # return the lines
end
There may be a better way to handle those 3 variables, but I can't think of one right now.
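One way to avoid juggling the three variables, sketched under the assumption that every entry in the map is a three-element array of equal-width strings, is to collect the per-character arrays and let Array#transpose regroup them by row. SAMPLE_MAP below is an illustrative two-letter table (0 for a raised dot, . for a flat one), and braille_rows is a made-up name.

```ruby
# Illustrative table only: each value is [top, middle, bottom].
SAMPLE_MAP = {
  'a' => ['0.', '..', '..'],
  'b' => ['0.', '0.', '..']
}.freeze

# Hypothetical helper: returns [top, middle, bottom] for the whole input.
def braille_rows(input, map = SAMPLE_MAP)
  cells = input.each_char.map { |c| map[c] }.compact # skip unknown chars
  return ['', '', ''] if cells.empty?                # nothing to transpose
  cells.transpose.map(&:join)
end

puts braille_rows('ab')
```

Because transpose turns a list of [top, middle, bottom] triples into three lists, the same code would keep working if the map were extended to the full alphabet.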

Transposing a string

Given a (multiline) string, where each line is separated by "\n" and may not be necessarily of the same length, what is the best way to transpose it into another string as follows? Lines shorter than the longest one should be padded with space (right padding in terms of the original, or bottom padding in terms of the output). Applying the operation on a string twice should be idempotent modulo padding.
Input string
abc
def ghi
jk lm no
Output string
adj
bek
cf
  l
 gm
 h
 in
  o
Here are five approaches. (Yes, I got a bit carried away, but I find that trying to think of different ways to accomplish the same task is good exercise for the grey cells.)
#1
An uninteresting, brute-force method:
a = str.split("\n")
l = a.max_by(&:size).size
puts a.map { |b| b.ljust(l).chars }
.transpose
.map { |c| c.join.rstrip }.join("\n")
adj
bek
cf
  l
 gm
 h
 in
  o
#2
This method and all that follow avoid the use of ljust and transpose, and make use of the fact that if e is an empty array, e.shift returns nil and leaves e an empty array. (Aside: I am often reaching for the non-existent method String#shift. Here it would have avoided the need to convert each line to an array of characters.)
a = str.split("\n").map(&:chars)
a.max_by(&:size).size.times.map { a.map { |e| e.shift || ' ' }.join.rstrip }
#3
This and the remaining methods avoid the need to compute the length of the longest string:
a = str.split("\n").map(&:chars)
a_empty = Array.new(a.size, [])
[].tap { |b| b << a.map { |e| e.shift || ' ' }.join.rstrip while a != a_empty }
#4
This method makes use of Enumerable#lazy, which has been available since Ruby 2.0.
a = str.split("\n").map(&:chars)
(0..Float::INFINITY).lazy.map do |i|
a.each { |e| e.shift } if i > 0
a.map { |e| e.first || ' ' }.join.rstrip
end.take_while { c = a.any? { |e| !e.empty? } }.to_a
(I initially had a problem getting this to work, as I was not getting the last element of the output ("  o"). The fix was adding the third line and changing the line that follows from a.map { |e| e.shift || ' ' }.join.rstrip to what I have now. I mention this because it seems like it may be a common problem when using lazy.)
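The behaviour described in the parenthetical above follows from how lazy enumerators interleave their stages: each element passes through map and then take_while before the next element is produced, so side effects inside map are visible to the take_while block for the same element. A small sketch, unrelated to the transposition itself and using the made-up name trace_lazy, records the order of the calls:

```ruby
# Records the order in which the lazy stages run (illustrative only).
def trace_lazy
  log = []
  result = (1..Float::INFINITY).lazy
    .map { |i| log << "map #{i}"; i * 2 }
    .take_while { |x| log << "check #{x}"; x < 6 }
    .to_a
  [result, log]
end

result, log = trace_lazy
p result
puts log
```

The log alternates between "map" and "check" entries, which is why shifting inside map emptied the arrays before take_while had a chance to inspect them.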
#5
Lastly, use recursion:
def recurse(a, b=[])
  # stop once every line has been consumed
  return b if a.all?(&:empty?)
  b << a.map { |e| e.shift || ' ' }.join.rstrip
  recurse(a, b)
end
a = str.split("\n").map(&:chars)
recurse(a)
I would write it like this:
def transpose s
  lines = s.split(?\n)
  longest = lines.map { |l| l.length }.max
  # exclusive range: indices 0 up to longest - 1
  (0...longest).map do |index|
    lines.map { |l| l[index] || ' ' }.join
  end * ?\n
end
This one works
s = "abc\ndef ghi\njk lm no\n"
s = s.split("\n")
s2 = ''
i = 0
while true
line = ''
s.each do |row|
line += (row[i] or ' ')
end
if line.strip == ''
break
end
s2 += line + "\n"
i += 1
end
puts s2
This one also works
s = "abc\ndef ghi\njk lm no\n"
s = s.split("\n")
maxlen = s.inject(0) {|m,r| m=[m, r.length].max}
s.map! {|r| r.ljust(maxlen).split(//)}
s = s.transpose.map {|r| r.join('')}.join("\n")
puts s
A play on what Chron did, for an earlier version of Ruby (e.g., 1.8.x). The example is based on your original input, which showed the newline characters:
str="abc\\n
def ghi\\n
jk lm no\\n"
def transpose s
  lines = s.gsub("\\n","").split("\n")
  longest = lines.map { |line| line.length }.max
  (0...longest).map do |char_index|
    lines.map { |line| line.split('')[char_index] || ' ' }.join
  end * "\\n\n"
end
puts transpose(str)
I would write it like this:
def transpose_text(text)
  # split the text into lines
  text = text.split("\n")
  # find the length of the longest line
  max_line_length = text.map(&:size).max
  # pad each line with white space and convert them to character arrays
  text.map! { |line| line.ljust(max_line_length).chars }
  # transpose the character arrays and then join them all into one string
  text.transpose.map(&:join).join("\n")
end
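The requirement in the question that applying the operation twice gives back the original, up to padding, can be checked with a short sketch. It repeats the pad-and-transpose approach in compact form so that it runs on its own; strip_padding is a helper name introduced here, and the input is assumed to be non-empty.

```ruby
# Compact restatement of the pad-and-transpose approach.
def transpose_text(text)
  lines = text.split("\n")
  width = lines.map(&:size).max
  lines.map { |line| line.ljust(width).chars }.transpose.map(&:join).join("\n")
end

# Helper introduced for this check: drops trailing spaces on every line.
def strip_padding(text)
  text.split("\n").map(&:rstrip).join("\n")
end

original = "abc\ndef ghi\njk lm no"
once  = transpose_text(original)
twice = transpose_text(once)

puts once
puts strip_padding(twice) == original
```

The second transposition returns lines padded to the length of the longest one, so the comparison is made after removing trailing spaces.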

split array element before . eg li.mean-array

I'm new to Ruby and would like to know how I can split an element that contains a special character.
I have the following array:
my_array = ["sh.please-word", ".things-to-do" , "#cool-stuff", "span.please-word-not"]
my_array.slice!(0..1)
puts my_array
=>#cool-stuff
=>span.please-word-not
I want to split the array elements that don't start with either a dot (.) or a hash (#) and return the list like this:
.please-word
.things-to-do
#cool-stuff
.please-word-not
I tried the slice method on a string, which works perfectly, but when I try it on the array elements it doesn't work.
This is what I have done so far:
list_of_selectors = []
file = File.open("my.txt")
file.each_line do |line|
list_of_selectors << line.split(' {')[0] if line.start_with? '.' or line.start_with? '#'
end
while line = file.gets
puts line
end
i = 0
while i < list_of_selectors.length
puts "#{list_of_selectors[i]}"
i += 1
end
list = []
list_of_selectors.each { |x|
list.push(x.to_s.split(' '))
}
list_of_selectors = list
puts list_of_selectors
list_of_selectors.map! { |e| e[/[.#].*/]}
puts list_of_selectors
result_array = my_array.map { |x| x[/[.#].*/] }
# => [".please-word", ".things-to-do", "#cool-stuff", ".please-word-not"]
The above uses a regular expression to extract the text beginning with either a dot (.) or a hash sign (#), and returns it in the resulting array.
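For elements that contain neither character, String#[] with a regular expression returns nil, so a slightly more defensive sketch drops those entries with compact. The extra "div" element and the method name extract_selectors are assumptions added for illustration.

```ruby
# Hypothetical helper: keeps the part of each element starting at the
# first "." or "#", and drops elements that contain neither.
def extract_selectors(elements)
  elements.map { |e| e[/[.#].*/] }.compact
end

my_array = ["sh.please-word", ".things-to-do", "#cool-stuff",
            "span.please-word-not", "div"]
p extract_selectors(my_array)
```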

selective replacing of printf statements

I am trying to search for a bunch of print statements that I want to filter as follows:
I want to select all dbg_printfs.
Out of all of those I want to select those that have value.stringValue().
Out of those I only want those that do not have value.stringValue().value().
Finally, I want to replace those lines with value.stringValue() to value.stringValue().value().
I don't know why my current code isn't working?
fileObj = File.new(filepath, "r")
while (line = fileObj.gets)
  line.scan(/dbg_printf/) do
    line.scan(/value.stringValue()/) do
      if !line.scan(/\.value\(\)/)
        line.gsub!(/value.stringValue()/, 'value.stringValue().value()')
      end
    end
  end
end
fileObj.close
Primarily, your problem seems to be that you expect altering the string returned from gets to alter the contents of the file. There isn't actually that kind of relationship between strings and files. You need to explicitly write the modifications to the file. Personally, I would probably write that code like this:
modified_contents = IO.readlines(filepath).map do |line|
  if line =~ /dbg_printf/
    # This regex just checks for value.stringValue() when not followed by .value()
    line.gsub(/value\.stringValue\(\)(?!\.value\(\))/, 'value.stringValue().value()')
  else
    line
  end
end
File.open(filepath, 'w') {|file| file.puts modified_contents }
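Two details of the code in the question are worth illustrating in addition to the missing write-back. String#scan returns an array, and every array is truthy in Ruby, so !line.scan(...) is always false; and an unescaped () in a regular expression is an empty group, not a literal pair of parentheses. The sketch below shows both and applies the lookahead-based substitution to in-memory strings. add_value_call is a hypothetical name and the sample lines are invented.

```ruby
# An empty array is still truthy, so negating a scan result is always false.
p !"no match here".scan(/\.value\(\)/)

# Unescaped "()" is an empty group: the match stops before the parentheses.
p "value.stringValue()"[/value.stringValue()/]

# Hypothetical helper: on dbg_printf lines only, append .value() to calls
# of value.stringValue() that are not already followed by .value().
def add_value_call(line)
  return line unless line.include?('dbg_printf')
  line.gsub(/value\.stringValue\(\)(?!\.value\(\))/, 'value.stringValue().value()')
end

puts add_value_call('dbg_printf("%s", value.stringValue());')
puts add_value_call('dbg_printf("%s", value.stringValue().value());')
puts add_value_call('printf("%s", value.stringValue());')
```

Only the first sample line should change; the second already has the extra call and the third is not a dbg_printf line.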
The problem is that you are not writing the changed lines back to the same file or to a new file. To do so, read the file into an array, change the array, and then write it back to the same or a different file (the latter being the more prudent). Here's one way to do that with a few lines of code.
Code
fin_name and fout_name are the names (with paths) of the input and output files, respectively.
def filter_array(fin_name, fout_name)
arr_in = File.readlines(fin_name)
arr_out = arr_in.map { |l| (l.include?('dbg_printfs') &&
l.include?('value.stringValue()') &&
!l.include?('value.stringValue().value()')) ?
'value.stringValue() to value.stringValue().value()' : l }
File.open(fout_name, 'w') { |f| f.puts arr_out }
end
Because you are reading code files, they will not be so large that reading them all at once into memory will be a problem.
Example
First, we'll construct an input file:
array = ["My dbg_printfs was a value.stringValue() as well.",
"Her dbg_printfs was a value.stringValue() but not " +
"a value.stringValue().value()",
"value.stringValue() is one of my favorites"]
fin_name = 'fin'
fout_name = 'fout'
File.open(fin_name, 'w') { |f| f.puts array }
We can confirm its contents with:
File.readlines(fin_name).map { |l| puts l }
Now try it:
filter_array(fin_name, fout_name)
Read the output file to see if it worked:
File.readlines(fout_name).map { |l| puts l }
#=> value.stringValue() to value.stringValue().value()
# Her dbg_printfs was a value.stringValue() but not a value.stringValue().value()
# value.stringValue() is one of my favorites
It looks OK.
Explanation
def filter_array(fin_name, fout_name)
arr_in = File.readlines(fin_name)
arr_out = arr_in.map { |l| (l.include?('dbg_printfs') &&
l.include?('value.stringValue()') &&
!l.include?('value.stringValue().value()')) ?
'value.stringValue() to value.stringValue().value()' : l }
File.open(fout_name, 'w') { |f| f.puts arr_out }
end
For the above example,
arr_in = File.readlines('fin')
#=> ["My dbg_printfs was a value.stringValue() as well.\n",
# "Her dbg_printfs was a value.stringValue() but not a value.stringValue().value()\n",
# "value.stringValue() is one of my favorites\n"]
The first element of arr_in passed to map is:
l = "My dbg_printfs was a value.stringValue() as well.\n"
We have
l.include?('dbg_printfs') #=> true
l.include?('value.stringValue()') #=> true
!l.include?('value.stringValue().value()') #=> true
so that element is mapped to:
"value.stringValue() to value.stringValue().value()"
Neither of the other two elements is replaced by this string, because
!l.include?('value.stringValue().value()') #=> false
and
l.include?('dbg_printfs') #=> false
respectively. Hence,
arr_out = arr_in.map { |l| (l.include?('dbg_printfs') &&
l.include?('value.stringValue()') &&
!l.include?('value.stringValue().value()')) ?
'value.stringValue() to value.stringValue().value()' : l }
#=> ["value.stringValue() to value.stringValue().value()",
# "Her dbg_printfs was a value.stringValue() but not a value.stringValue().value()\n",
# "value.stringValue() is one of my favorites\n"]
The final step is writing arr_out to the output file.

Confused about Ruby length property

Why is my foo() function printing out the string "YAAR" as having a length of 5?
def foo()
map = Hash.new
File.open('dictionary.txt').each_line{ |s|
word = s.split(',')
if word.any? { |b| b.include?('AA') }
puts word.last
puts word.last.length
end
}
end
dictionary.txt
265651,YAAR
265654,YAARS
output
YAAR
5
YAARS
6
You're getting a newline ("\n") at the end of both strings. So, your split is receiving:
"265651,YAAR\n"
"265654,YAARS\n"
When reading from a file, you'll get a newline character (\n) at the end of every line (except maybe the last one).
Instead of
word = s.split(',')
in your loop, use this:
word = s.split(',').map(&:chomp)
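The effect is easy to reproduce without a file. The sketch below uses StringIO as a stand-in for the dictionary file (an assumption made so the example is self-contained) and compares the length of the last field with and without chomp. word_lengths is a made-up name.

```ruby
require 'stringio'

# Hypothetical helper: for each line, returns the word together with its
# length before and after the trailing newline is removed.
def word_lengths(io)
  io.each_line.map do |line|
    raw     = line.split(',').last        # still carries the newline
    chomped = line.chomp.split(',').last  # newline removed before splitting
    [chomped, raw.length, chomped.length]
  end
end

# Stand-in for the dictionary file; each line ends with "\n".
sample = StringIO.new("265651,YAAR\n265654,YAARS\n")
p word_lengths(sample)
```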
