I want to be able to read the file into an associative array where I can access the elements by the column head name.
My file is formatted as follows:
KeyName Val1Name Val2Name ... ValMName
Key1 Val1-1 Val2-1 ... ValM-1
Key2 Val1-2 Val2-2 ... ValM-2
Key3 Val1-3 Val2-3 ... ValM-3
.. .. .. .. ..
KeyN Val1-N Val2-N ... ValM-N
The only problem is I don't have a clue how to do it. So far I have:
scores = File.read("scores.txt")
lines = scores.split("\n")
lines.each { |x|
y = x.to_s.split(' ')
}
This gets close to what I want, but I'm still unable to get the data into a format that is usable for me.
f = File.open("scores.txt") #get an instance of the file
first_line = f.gets.chomp #get the first line in the file (header)
first_line_array = first_line.split(/\s+/) #split the first line in the file via whitespace(s)
array_of_hash_maps = f.readlines.map do |line|
Hash[first_line_array.zip(line.split(/\s+/))]
end
#read the remaining lines of the file via `IO#readlines` into an array, split each read line by whitespace(s) into an array, and zip the first line with them, then convert it into a `Hash` object, and return a collection of the `Hash` objects
f.close #close the file
puts array_of_hash_maps #print the collection of the Hash objects to stdout
Can be done in 3 lines (This is why I love Ruby)
scores = File.readlines('/scripts/test.txt').map{|l| l.split(/\s+/)}
headers = scores.shift
scores.map!{|score|Hash[headers.zip(score)]}
Now scores contains your array of hashes.
Here is a verbose explanation
#open the file and read
#then split on new line
#then create an array of each line by splitting on space and stripping additional whitespace
scores = File.open('scores.txt', &:read).split("\n").map{|l| l.split(" ").map(&:strip)}
#shift the array to capture the header row
headers = scores.shift
#initialize an Array to hold the score hashes
scores_hash_array = []
#loop through each line
scores.each do |score|
#map the header value based on index with the line value
scores_hash_array << Hash[score.map.with_index{|l,i| [headers[i],l]}]
end
#=>[{"KeyName"=>"Key1", "Val1Name"=>"Val1-1", "Val2Name"=>"Val2-1", "..."=>"...", "ValMName"=>"ValM-1"},
{"KeyName"=>"Key2", "Val1Name"=>"Val1-2", "Val2Name"=>"Val2-2", "..."=>"...", "ValMName"=>"ValM-2"},
{"KeyName"=>"Key3", "Val1Name"=>"Val1-3", "Val2Name"=>"Val2-3", "..."=>"...", "ValMName"=>"ValM-3"},
{"KeyName"=>"..", "Val1Name"=>"..", "Val2Name"=>"..", "..."=>"..", "ValMName"=>".."},
{"KeyName"=>"KeyN", "Val1Name"=>"Val1-N", "Val2Name"=>"Val2-N", "..."=>"...", "ValMName"=>"ValM-N"}]
scores_hash_array now has a hash for each row in the sheet.
You can try something like this:-
fh = File.open("scores.txt","r")
rh={} #result Hash
fh.readlines.each{|line|
kv=line.split(/\s+/)
puts kv.length
rh[kv[0]] = kv[1..kv.length-1].join(",") #***store the values joined by ","
}
puts rh.inspect
fh.close
If you want an array of values instead, replace the last line in the loop with:
rh[kv[0]] = kv[1..kv.length-1]
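To look up a value by both its row key and its column name, the ideas in the answers above can be combined into a hash of hashes. This is only a sketch: it assumes whitespace-separated columns with the row key in the first column, and it uses an inline string (the three-column `text` sample is made up to match the layout in the question) instead of reading scores.txt.

```ruby
# Sample data in the same layout as the question (assumed, shortened to three columns).
text = <<~DATA
  KeyName Val1Name Val2Name
  Key1 Val1-1 Val2-1
  Key2 Val1-2 Val2-2
DATA

# Split every line on whitespace; the first row holds the column names.
rows = text.lines.map(&:split)
headers = rows.shift

# Build { row_key => { column_name => value } }.
table = rows.to_h { |row| [row.first, headers.zip(row).to_h] }

puts table["Key2"]["Val1Name"] # prints Val1-2
```

With a real file, `File.readlines("scores.txt").map(&:split)` would take the place of `text.lines.map(&:split)`.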
I have a text file that has around 100 plus entries like out.txt:
domain\1esrt
domain\2345p
yrtfj
tkpdp
....
....
I have to read out.txt, line-by-line and check whether the strings like "domain\1esrt" are present in any of the files under a different directory. If present delete only that string occurrence and save the file.
I know how to read a file line-by-line and also know how to grep for a string in multiple files in a directory but I'm not sure how to join those two to achieve my above requirement.
You can create an array with all the words or strings you want to find and then delete/replace:
strings_to_delete = ['aaa', 'domain\1esrt', 'delete_me']
Then read the file and use map to build an array of all the lines that don't match any of the elements in the array created above:
# read the file 'text.txt'
lines = File.open('text.txt', 'r').map do|line|
# unless the line matches with some value on the strings_to_delete array
line unless strings_to_delete.any? do |word|
word == line.strip
end
# then remove the nil elements
end.reject(&:nil?)
Then open the file again, this time for writing, and write back all the lines that didn't match any value in the strings_to_delete array:
File.open('text.txt', 'w') do |file|
  lines.each do |element|
    # each element still ends with its original newline
    file.write element
  end
end
The txt file looks like:
aaa
domain\1esrt
domain\2345p
yrtfj
tkpdp
....
....
delete_me
I don't know how it'll work with a bigger file, anyways, I hope it helps.
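For a longer list of strings, one possible variation is to keep them in a Set so each lookup stays cheap, and to stream the file with File.foreach. This is a sketch, not a benchmarked solution: the Tempfile stands in for text.txt, and the sample contents are the ones shown above.

```ruby
require "set"
require "tempfile"

strings_to_delete = Set['aaa', 'domain\1esrt', 'delete_me']

# Stand-in for text.txt so the example is self-contained.
file = Tempfile.new("text")
file.write(<<~'TEXT')
  aaa
  domain\1esrt
  domain\2345p
  yrtfj
  tkpdp
  delete_me
TEXT
file.close

# Keep only the lines whose stripped content is not in the set.
kept = File.foreach(file.path).reject { |line| strings_to_delete.include?(line.strip) }

# Write the surviving lines back; each one still carries its newline.
File.write(file.path, kept.join)
result = File.read(file.path)
puts result
```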
I would suggest using gsub here. It will run a regex search on the string and replace it with the second parameter. So if you only have to replace any single string, I believe you can simply run gsub on that string (including the newline) and replace it with an empty string:
new_file_text = text.gsub(/regex_string\n/, "")
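To tie this back to the original question (strings listed in out.txt, files in another directory), a rough sketch could look like the following. The helper name `delete_strings_in_dir`, the temporary directory, and the sample file contents are all made up for illustration. Regexp.union escapes the strings, so the backslash in "domain\1esrt" is matched literally.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: removes every occurrence of the target strings
# from each regular file directly inside `dir`.
def delete_strings_in_dir(dir, targets)
  pattern = Regexp.union(targets) # strings are escaped automatically
  Dir.glob(File.join(dir, "*")).each do |path|
    next unless File.file?(path)
    text = File.read(path)
    cleaned = text.gsub(pattern, "")
    File.write(path, cleaned) unless cleaned == text # only rewrite changed files
  end
end

# Demo in a throwaway directory (assumed layout, not from the question).
dir = Dir.mktmpdir
sample = File.join(dir, "a.txt")
File.write(sample, "user=domain\\1esrt;\nowner=yrtfj\n")

# In practice the targets would come from out.txt,
# e.g. File.readlines("out.txt", chomp: true).
targets = ['domain\1esrt', 'yrtfj']
delete_strings_in_dir(dir, targets)

cleaned_sample = File.read(sample)
FileUtils.remove_entry(dir)
puts cleaned_sample
```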
I need to parse a file according to different rules.
The file contains several lines.
I go through the file line by line. When I find a specific string, I have to store the data present in the next lines until a specific character is found.
Example of file:
start {
/* add comment */
first_step {
sub_first_step {
};
sub_second_step {
code = 50,
post = xxx (aaaaaa,
bbbbbb,
cccccc,
eeeeee),
number = yyyy (fffffff,
gggggg,
jjjjjjj,
ppppppp),
};
So, in this case:
File.open(file_to_convert, "r").each_line do |line|
In "line" I have my current line. I need to:
1) find when the line contains the string "xxx"
if line.include?("xxx") then
Correct?
2) store the next values (e.g.: aaaa, bbbb, ccccc,eeee) in an array until I find the character ")". This highlights that the section is finished.
I think that when I reach the line with the string "xxx", I have to iterate over the next lines inside the "if" block.
Try this:
file_contents = File.read(file_to_convert)
lines = file_contents[/xxx \(([^)]+)\)/, 1].split
# => ["aaaaaa,", "bbbbbb,", "cccccc,", "eeeeee"]
The regex (xxx \(([^)]+)\)) takes all the text after xxx ( until the next ), and split splits it into its items.
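Note that the items above keep their trailing commas. If you want them stripped, one option is to split on a comma followed by optional whitespace instead. A small sketch, with an inline string standing in for the file contents:

```ruby
# Inline stand-in for the relevant part of the file.
text = <<~TEXT
  post = xxx (aaaaaa,
  bbbbbb,
  cccccc,
  eeeeee),
  number = yyyy (fffffff,
TEXT

# Capture everything between "xxx (" and the next ")".
captured = text[/xxx \(([^)]+)\)/, 1]

# Split on commas plus any following whitespace; fall back to [] when nothing matched.
items = captured ? captured.split(/,\s*/) : []
p items # => ["aaaaaa", "bbbbbb", "cccccc", "eeeeee"]
```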
I think this is what you are looking for:
looking = true
results = []
File.open(file_to_convert, "r").each_line do |line|
if looking
if line.include?("xxx")
looking = false
results << line[/\(([^,]*)/, 1] # first item, taken from between "(" and the first ","
end
else
if line.include?(")")
results << line.strip.delete('),')
break
else
results << line.strip.delete(',')
end
end
end
puts results
I'm trying to parse an XML file with REXML on Ruby.
What I want is to print all values with the corresponding element names as headers. The issue I have
is that some nodes have child elements that are repeated and share the same XPath, so I want the values of those
elements printed in the same column. For the small sample below, the desired output
for the elements of Node_XX would be:
Output I'm looking for:
RepVal|YurVal|CD_val|HJY_val|CD_SubA|CD_SubB
MTSJ|AB01-J|45|01|87|12
||34|11|43|62
What I have so far is the code below, but I don't know how to get repeated
elements printed in the same column.
Thanks in advance for any help.
Code I have so far:
#!/usr/bin/env ruby
require 'rexml/document'
include REXML
xmldoc = Document.new File.new("input.xml")
arr_H_Xpath = [] # Array to store each XPath only once (no repeated XPaths)
arr_H_Values = [] # Array for headers (each child element's name)
arr_Values = [] # Values of each child element.
xmldoc.elements.each("//Node_XYZ") {|element|
element.each_recursive do |child|
if (child.has_text? && child.text =~ /^[[:alnum:]]/) && !arr_H_Xpath.include?(child.xpath.gsub(/\[.\]/,"")) # Check if element has text and Xpath is stored in arr_H_Xpath.
arr_H_Xpath << child.xpath.gsub(/\[.\]/,"") #Remove the [..] for repeated XPaths
arr_H_Values << child.xpath.gsub(/\/\w.*\//,"") #Get only name of child element to use it as header
arr_Values << child.text
end
print arr_H_Values + "|"
arr_H_Values.clear
end
puts arr_Values.join("|")
}
The input.xml is:
<TopNode>
<NodeX>
<Node_XX>
<RepCD_valm>
<RepVal>MTSJ</RepVal>
</RepCD_valm>
<RepCD_yur>
<Yur>
<YurVal>AB01-J</YurVal>
</Yur>
</RepCD_yur>
<CodesDif>
<CD_Ranges>
<CD_val>45</CD_val>
<HJY_val>01</HJY_val>
<CD_Sub>
<CD_SubA>87</CD_SubA>
<CD_SubB>12</CD_SubB>
</CD_Sub>
</CD_Ranges>
</CodesDif>
<CodesDif>
<CD_Ranges>
<CD_val>34</CD_val>
<HJY_val>11</HJY_val>
<CD_Sub>
<CD_SubA>43</CD_SubA>
<CD_SubB>62</CD_SubB>
</CD_Sub>
</CD_Ranges>
</CodesDif>
</Node_XX>
<Node_XY>
....
....
....
</Node_XY>
</NodeX>
</TopNode>
Here's one way to solve your problem. It is probably a little unusual, but I was experimenting. :)
First, I chose a data structure that can store the headers as keys and multiple values per key to represent the additional row(s) of data: a MultiMap. It is like a hash that can store multiple values under the same key.
With the multimap, you can store the elements as key-value pairs:
data = Multimap.new
doc.xpath('//RepVal|//YurVal|//CD_val|//HJY_val|//CD_SubA|//CD_SubB').each do |elem|
data[elem.name] = elem.inner_text
end
The content of data is:
{"RepVal"=>["MTSJ"],
"YurVal"=>["AB01-J"],
"CD_val"=>["45", "34"],
"HJY_val"=>["01", "11"],
"CD_SubA"=>["87", "43"],
"CD_SubB"=>["12", "62"]}
As you can see, this was a simple way to collect all the information you need to create your table. Now it is just a matter of transforming it to your pipe-delimited format. For this, or any delimited format, I recommend using CSV:
out = CSV.generate(col_sep: "|") do |csv|
columns = data.keys.to_a.uniq
csv << columns
while !data.values.empty? do
csv << columns.map { |col| data[col].shift }
end
end
The output is:
RepVal|YurVal|CD_val|HJY_val|CD_SubA|CD_SubB
MTSJ|AB01-J|45|01|87|12
||34|11|43|62
Explanation:
CSV.generate creates a string. If you wanted to create an output file directly, use CSV.open instead. See the CSV class for more information. I added the col_sep option to delimit with a pipe character instead of the default of a comma.
Getting a list of columns would just be the keys if data was a hash. But since it is a Multimap which will repeat key names, I have to call .to_a.uniq on it. Then I add them to the output using csv << columns.
In order to create the second row (and any subsequent rows), we slice down and get the first value for each key of data. That's what the data[col].shift does: it actually removes the first value from each value in data. The loop is in place to keep going as long as there are more values (more rows).
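If you would rather not pull in a multimap library, roughly the same idea can be sketched with a plain Hash whose default value is an array. This is an assumption-laden illustration, not the code above: the `pairs` array is a hand-written stand-in for the (element name, text) pairs that the XPath query would yield, and the rows are joined by hand rather than through CSV.

```ruby
# Stand-in for the (name, text) pairs found in the XML, in document order.
pairs = [
  ["RepVal", "MTSJ"], ["YurVal", "AB01-J"],
  ["CD_val", "45"], ["HJY_val", "01"], ["CD_SubA", "87"], ["CD_SubB", "12"],
  ["CD_val", "34"], ["HJY_val", "11"], ["CD_SubA", "43"], ["CD_SubB", "62"]
]

# Collect every value under its element name.
data = Hash.new { |hash, key| hash[key] = [] }
pairs.each { |name, value| data[name] << value }

columns = data.keys
row_count = data.values.map(&:size).max

# Row i takes the i-th value of every column; missing values are nil
# and become empty fields when joined.
rows = (0...row_count).map { |i| columns.map { |col| data[col][i] } }

table = ([columns] + rows).map { |row| row.join("|") }.join("\n")
puts table
```

Unlike the shift-based loop, this version leaves `data` untouched, which can be handy if you need the values again afterwards.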
I'm trying to read through a file, find a certain pattern, and then grab a set number of lines of text after the line that contains that pattern. I'm not really sure how to approach this.
If you want the n number of lines after the line matching pattern in the file filename:
lines = File.open(filename) do |file|
line = file.readline until line =~ /pattern/ || file.eof;
file.eof ? nil : (1..n).map { file.eof ? nil : file.readline }.compact
end
This should handle all cases, such as the pattern not being present in the file (returns nil) or there being fewer than n lines after the matching line (the resulting array then contains the remaining lines of the file).
First parse the file into lines. Open, read, split on the line break
lines = File.open(file_name).read.split("\n")
Then get index
index = lines.index{|x| x.match(/regex_pattern/)}
Where regex_pattern is the pattern that you are looking for. Use the index as a starting point and then the second argument is the number of lines (in this case 5)
lines[index, 5]
It will return an array of 'lines'
You could combine it a bit more to reduce the number of lines, but I was attempting to keep it readable.
If you're not tied to Ruby, grep -A 12 trivet will show the 12 lines after any line with trivet in it. Any regex will work in place of "trivet"
matched = false
num = 0
res = ""
# File.foreach reads one line at a time and closes the file when the block exits
File.foreach(filename) do |line|
  if matched
    res += line # each line already ends with its own newline
    num += 1
    break if num == num_lines_desired
  elsif line.match(/regex/)
    matched = true
  end
end
This has the advantage of not needing to read the whole file in the event of a match.
When done, res will hold the desired lines.
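The same "stop reading once you have enough" behaviour can also be sketched with a lazy enumerator. The helper name `lines_after` is made up for this example, and a StringIO stands in for the file so the snippet runs on its own.

```ruby
require "stringio"

# Hypothetical helper: returns up to `count` lines that follow the
# first line matching `pattern`, without the trailing newlines.
def lines_after(io, pattern, count)
  io.each_line
    .lazy
    .drop_while { |line| line !~ pattern }
    .drop(1)
    .first(count)
    .map(&:chomp)
end

sample = StringIO.new("alpha\nbeta trivet\ngamma\ndelta\nepsilon\n")
after_match = lines_after(sample, /trivet/, 2)
p after_match # => ["gamma", "delta"]
```

With a real file this would be called as `File.open(filename) { |f| lines_after(f, /pattern/, n) }`. It returns an empty array when the pattern never matches.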
in rails (only difference is how I generate the file object)
file = File.open(File.join(Rails.root, 'lib', 'file.json'))
#convert file into an array of strings, with \n as the separator
line_ary = file.readlines
line_count = line_ary.count
i = 0
#or however far up the document you want to be...you can get very fancy with this or just do it manually
hsh = {}
(line_count / 2).times do # each iteration consumes two lines (child, then parent)
child_id = JSON.parse(line_ary[i])
i += 1
parent_ary = JSON.parse(line_ary[i])
i += 1
hsh[child_id] = parent_ary
end
haha I've said too much that should definitely get you started
How can I in Ruby read a string from a file into an array and only read and save in the array until I get a certain marker such as ":" and stop reading?
Any help would be much appreciated =)
For example:
10.199.198.10:111 test/testing/testing (EST-08532522)
10.199.198.12:111 test/testing/testing (EST-08532522)
10.199.198.13:111 test/testing/testing (EST-08532522)
Should only read the following and be contained in the array:
10.199.198.10
10.199.198.12
10.199.198.13
This is a rather trivial problem, using String#split:
results = open('a.txt').map { |line| line.split(':')[0] }
p results
Output:
["10.199.198.10", "10.199.198.12", "10.199.198.13"]
String#split breaks a string at the specified delimiter and returns an array; so line.split(':')[0] takes the first element of that generated array.
In the event that there is a line without a : in it, String#split will return an array with a single element that is the whole line. So if you need to do a little more error checking, you could write something like this:
results = []
open('a.txt').each do |line|
results << line.split(':')[0] if line.include? ':'
end
p results
which will only add split lines to the results array if the line has a : character in it.
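An alternative sketch that folds the check and the extraction into one step uses a regex with a lookahead together with filter_map (available from Ruby 2.7). The `lines` array stands in for the file contents, and the middle line is an invented example of a line without a colon.

```ruby
# Stand-in for the lines of a.txt, plus one line without a colon.
lines = [
  "10.199.198.10:111 test/testing/testing (EST-08532522)",
  "no colon on this line",
  "10.199.198.13:111 test/testing/testing (EST-08532522)"
]

# Take everything from the start of the line up to (not including) the first ":".
# Lines without a ":" produce nil, which filter_map drops.
hosts = lines.filter_map { |line| line[/\A[^:]+(?=:)/] }
p hosts # => ["10.199.198.10", "10.199.198.13"]
```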