Parse YAML to key value and include yaml categories - ruby

I am looking to parse a YAML file into plain key=value strings.
I have some initial code, but I also want to include the parent keys (categories) from the YAML.
test:
  line1: "line 1 text"
  line2: "line 2 text"
  line3: "line 3 text"
options:
  item1: "item 1 text"
  item2: "item 2 text"
  item3: "item 3 text"
Ruby:
File.open("test.yml") do |f|
  f.each_line do |line|
    line.chomp
    if line =~ /:/
      line.chop
      line.sub!('"', "")
      line.sub!(": ", "=")
      line.gsub!(/\A"|"\Z/, '')
      printline = line.strip
      puts "#{printline}"
      target.write( "#{printline}")
    end
  end
end
The results currently look like
test:
line1=line 1 text
line2=line 2 text
line3=line 3 text
options:
item1=item 1 text
item2=item 2 text
item3=item 3 text
But I am looking to prefix each line with its category, like:
test/line1=line 1 text
test/line2=line 2 text
test/line3=line 3 text
options/item1=item 1 text
options/item2=item 2 text
options/item3=item 3 text
What is the best way to include the category for each line?

You could use YAML.load_file, iterate over each category, and adapt the result to your needs:
require 'yaml'
foo = YAML.load_file('file.yaml').map do |key, value|
  value.map { |k, v| "#{key}/#{k}=#{v}" }
end
foo.each { |value| puts value }
# test/line1=line 1 text
# test/line2=line 2 text
# test/line3=line 3 text
# options/item1=item 1 text
# options/item2=item 2 text
# options/item3=item 3 text
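If the YAML can nest more than two levels deep, the same idea can be applied recursively. The following is a minimal sketch, not part of the original answer: `flatten_yaml` is a hypothetical helper name, and the inline document stands in for test.yml so the example runs on its own.

```ruby
require 'yaml'

# Hypothetical helper: walks a nested hash and returns "a/b/c=value"
# strings, joining the key levels with "/".
def flatten_yaml(node, prefix = nil)
  node.flat_map do |key, value|
    path = prefix ? "#{prefix}/#{key}" : key.to_s
    value.is_a?(Hash) ? flatten_yaml(value, path) : ["#{path}=#{value}"]
  end
end

# Inline stand-in for test.yml (assumption: same shape as the question's file).
doc = <<~DOC
  test:
    line1: "line 1 text"
    line2: "line 2 text"
  options:
    item1: "item 1 text"
DOC

lines = flatten_yaml(YAML.load(doc))
puts lines
# test/line1=line 1 text
# test/line2=line 2 text
# options/item1=item 1 text
```

To write the result to a file instead of the console, something like File.write('out.txt', lines.join("\n")) should do.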

You can easily convert YAML to a hash:
#test.yml
test:
  line1: "line 1 text"
  line2: "line 2 text"
  line3: "line 3 text"
options:
  item1: "item 1 text"
  item2: "item 2 text"
  item3: "item 3 text"
#ruby
require 'yaml'
hash = YAML.load File.read('test.yml')
Now you can do anything you want with the hash: get the keys, the values, etc.
hash['options']['item1'] #=> "item 1 text"
hash['test']['line1'] #=> "line 1 text"

Related

How to get the desired output using a bash script?

I am trying to produce the output shown further below. I don't know how to get it; I searched the internet, but I didn't know the right keywords to search for, so I am posting my question here.
I have a CSV file, data.csv. What I have tried so far (my MWE) is:
cat data.csv|sed 's/\n.*//g'
The contents of data.csv are:
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,
line 5 text
10,1,6,"<J>
line 6 text"
10,1,7,"line 7 text"
10,1,8,"
line 8 text"
10,1,9,"line 9 text"
I want the output as shown below
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,"line 5 text"
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
With GNU awk for multi-char RS, RT, and gensub() you can just describe each record as a series of 4 comma-separated fields ending in newline and then remove the newlines and spaces around them:
$ awk -v RS='([^,]*,){3}[^,]*\n' '{$0=gensub(/\s*\n\s*/,"","g",RT)} 1' file
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,line 5 text
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
and to ensure quotes around the last field:
$ awk -v RS='([^,]*,){3}[^,]*\n' '{$0=gensub(/\s*\n\s*/,"","g",RT); $0=gensub(/,([^",]*)$/,",\"\\1\"",1)} 1' file
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,"line 5 text"
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
Note that this will work no matter how many lines your 4th field is split over:
$ cat file
10,1,1,"line 1 text"
10,1,2,
foo
line
2
text
bar
10,1,3,"line 3 text"
$ awk -v RS='([^,]*,){3}[^,]*\n' '{$0=gensub(/\s*\n\s*/,"","g",RT); $0=gensub(/,([^",]*)$/,",\"\\1\"",1)} 1' file
10,1,1,"line 1 text"
10,1,2,"fooline2textbar"
10,1,3,"line 3 text"
In addition to Cyrus's answer, to ensure 'line 5 text' is surrounded with double-quotes you can add additional expressions to replace the ', ' with ',"' and lines that do not end in '"' with a '"', e.g.
sed -e '/".*"$/!{N;s/\n *//}' -e 's/, /,"/' -e '/"$/!{s/$/"/}' file
The first expression is exactly the same. This would provide your requested output of:
$ sed -e '/".*"$/!{N;s/\n *//}' -e 's/, /,"/' -e '/"$/!{s/$/"/}' file
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5,"line 5 text"
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
With GNU sed:
sed '/".*"$/!{N;s/\n *//}' file
If a line does not match regex ".*"$ append next line (N) to sed's pattern space and replace newline followed by none, one or more white spaces with nothing (s/\n *//).
Output:
10,1,1,"line 1 text"
10,1,2,"line 2 text"
10,1,3,"line 3 text"
10,1,4,"line 4 text"
10,1,5, line 5 text
10,1,6,"<J>line 6 text"
10,1,7,"line 7 text"
10,1,8,"line 8 text"
10,1,9,"line 9 text"
I did not add the missing quotation marks in line 5.
See: man sed and The Stack Overflow Regular Expressions FAQ
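For comparison, the same record-joining idea can be sketched in Ruby. This is only an illustration under stated assumptions: `merge_records` and `RECORD_START` are hypothetical names, and a new record is assumed to begin with three comma-terminated numeric fields, as in the sample data. Any other line is treated as a continuation of the previous record.

```ruby
# Assumption: a new record starts with three comma-terminated numeric
# fields, as in the sample data; any other line continues the previous one.
RECORD_START = /\A\d+,\d+,\d+,/

# Hypothetical helper: returns one string per logical record.
def merge_records(text)
  records = []
  text.each_line do |raw|
    line = raw.strip
    next if line.empty?
    if records.empty? || line.match?(RECORD_START)
      records << line
    else
      records.last << line # glue the continuation onto the open record
    end
  end
  # Wrap the fourth field in double quotes when it has none.
  records.map do |record|
    *head, last = record.split(',', 4)
    last = %("#{last}") unless last.start_with?('"')
    (head << last).join(',')
  end
end

sample = <<~CSV
  10,1,4,"line 4 text"
  10,1,5,
   line 5 text
  10,1,6,"<J>
   line 6 text"
  10,1,8,"
   line 8 text"
CSV

merged = merge_records(sample)
puts merged
# 10,1,4,"line 4 text"
# 10,1,5,"line 5 text"
# 10,1,6,"<J>line 6 text"
# 10,1,8,"line 8 text"
```

Unlike the awk version, this sketch does not try to be a general CSV parser; it relies entirely on the shape of the first three fields.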

How to search and replace multi line text AWK

I have a file with the following content (snippet). The text could be anywhere in the file.
More text here
Things-I-DO-NOT-NEED:
name: "Orange"
count: 8
count: 10
Things-I-WANT:
name: "Apple"
count: 3
count: 4
More text here
I would like to replace (including indentation):
Things-I-WANT:
name: "Apple"
count: 3
count: 4
with
Things-I-WANT:
name: "Banana"
count: 7
Any suggestions on achieving it using awk/sed? Thanks!
You can do this in awk:
#!/usr/bin/env awk
# Helper variable
{DEFAULT = 1}
# Matches a line that begins with a letter
/^[[:alpha:]]+/ {
    # This matches "Things-I-WANT:"
    if ($0 == "Things-I-WANT:") {
        flag = 1
    }
    # Matches a line that begins with a letter and which is
    # right after the "Things-I-WANT:" block
    else if (flag == 1) {
        print "\tname: \"Banana\""
        print ""
        print "\tcount: 7"
        print ""
        flag = 0
    }
    # Matches any other line that begins with a letter
    else {
        flag = 0
    }
    print $0
    DEFAULT = 0
}
# If the line does not begin with a letter, do this
DEFAULT {
    # Print any line that's not within the "Things-I-WANT:" block
    if (flag == 0) {
        print $0
    }
}
You can run this in bash using:
$ awk -f test.awk test.txt
The output of this script will be:
More text here
Things-I-DO-NOT-NEED:
name: "Orange"
count: 8
count: 10
Things-I-WANT:
name: "Banana"
count: 7
More text here
As you can see, the Things-I-WANT: block has been replaced.
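The same flag-based approach can be sketched in Ruby. This is an illustration only: `replace_block` is a hypothetical helper, and, like the awk script, it assumes a block ends at the next line that starts with a letter in column 0 (so the block body must be indented or blank).

```ruby
# Hypothetical helper: replaces the body under `header` with `new_body`.
# Assumption: a block ends at the next line starting with a letter.
def replace_block(text, header, new_body)
  out = []
  skipping = false
  text.each_line(chomp: true) do |line|
    if line == header
      out << line
      out.concat(new_body)
      skipping = true
    elsif skipping && line !~ /\A[[:alpha:]]/
      next # drop the old body lines (indented or blank)
    else
      skipping = false
      out << line
    end
  end
  out.join("\n")
end

input = <<~TEXT
  More text here
  Things-I-DO-NOT-NEED:
  \tname: "Orange"
  \tcount: 8
  Things-I-WANT:
  \tname: "Apple"
  \tcount: 3
  \tcount: 4
  More text here
TEXT

result = replace_block(input, 'Things-I-WANT:', ["\tname: \"Banana\"", "\tcount: 7"])
puts result
```

The replacement lines are inserted as soon as the header is seen, so the sketch also works when the target block is the last one in the file.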

How to add lines after a pattern using sed

In shell script, how can I add lines after a certain pattern? Say I have the following file and I want to add two lines after block 1 and blk 2.
abc
def
[block 1]
apples = 3
grapes = 4
[blk 2]
banana = 2
apples = 3
[block 1] and [blk 2] will be present in the file.
The output I am expecting is below.
abc
def
[block 1]
oranges = 5
pears = 2
apples = 3
grapes = 4
[blk 2]
oranges = 5
pears = 2
banana = 2
apples = 3
I thought of doing this with sed. I tried the below command but it does not work on my Mac. I checked these posts but I couldn't find what I am doing wrong.
$sed -i '/\[block 1\]/a\n\toranges = 3\n\tpears = 2' sample2.txt
sed: 1: "sample2.txt": unterminated substitute pattern
How can I fix this? Thanks for your help!
[Edit]
I tried the below and these didn't work on my Mac.
$sed -E '/\[block 1\]|\[blk 2\]/r\\n\\toranges = 3\\n\\tpears = 2' sample2.txt
abc
def
[block 1]
apples = 3
grapes = 4
[blk 2]
banana = 2
apples = 3
$sed -E '/\[block 1\]|\[blk 2\]/r\n\toranges = 3\n\tpears = 2' sample2.txt
abc
def
[block 1]
apples = 3
grapes = 4
[blk 2]
banana = 2
apples = 3
Awk attempt:
$awk -v RS= '/\[block 1\]/{$0 = $0 ORS "\toranges = 3" ORS "\tpears = 2" ORS}
/\[blk 2\]/{$0 = $0 ORS "\toranges = 5" ORS "\tpears = 2" ORS} 1' sample2.txt
abc
def
[block 1]
apples = 3
grapes = 4
[blk 2]
banana = 2
apples = 3
oranges = 3
pears = 2
oranges = 5
pears = 2
Note that the text provided to the a command has to be on a separate line:
sed '/\[block 1\]/ {a\
\toranges = 3\n\tpears = 2
}' file
and all embedded newlines have to be escaped. Another way to write it (probably more readable):
sed '/\[block 1\]/ {a\
oranges = 3\
pears = 2
}' file
Also, consider the r command as an alternative to the a command when larger amounts of text have to be inserted (e.g. more than one line). It will read data from a text file provided:
sed '/\[block 1\]/r /path/to/text' file
To handle multiple sections with one sed program, you can use the alternation operator (available in ERE, notice the -E flag):
sed -E '/\[block 1\]|\[blk 2\]/r /path/to/text' file
This awk should work with empty RS. This breaks each block into a single record.
awk -v RS= '/\[block 1\]/{$0 = $0 ORS "\toranges = 3" ORS "\tpears = 2" ORS}
/\[blk 2\]/{$0 = $0 ORS "\toranges = 5" ORS "\tpears = 2" ORS} 1' file
abc
def
[block 1]
apples = 3
grapes = 4
oranges = 3
pears = 2
[blk 2]
banana = 2
apples = 3
oranges = 5
pears = 2
This might work for you (GNU sed):
sed '/^\[\(block 1\|blk 2\)\]\s*$/{n;h;s/\S.*/oranges = 5/p;s//pears = 2/p;x}' file
Locate the required match, print it and then store the next line in the hold space. Replace the first non-space character to the end of the line with the first required line, repeat for the second required string and then revert to the original line.
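If sed portability between GNU and BSD (macOS) is a concern, the insertion can also be sketched in Ruby. This is a minimal illustration rather than one of the original answers; `insert_after` is a hypothetical helper name, and the header pattern is an assumption based on the sample file.

```ruby
# Hypothetical helper: emits `new_lines` right after every line that
# matches `pattern`, leaving all other lines untouched.
def insert_after(text, pattern, new_lines)
  text.each_line(chomp: true).flat_map do |line|
    line.match?(pattern) ? [line, *new_lines] : [line]
  end.join("\n")
end

sample = <<~TEXT
  abc
  [block 1]
  apples = 3
  [blk 2]
  banana = 2
TEXT

# Assumption: section headers sit alone on their line.
headers = /\A\[(block 1|blk 2)\]\s*\z/
updated = insert_after(sample, headers, ["\toranges = 5", "\tpears = 2"])
puts updated
```

Reading the file with File.read and writing the result back with File.write would give the in-place behavior that sed -i was meant to provide.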

Print out the duplicates and the amount of duplicates in ruby arrays

If I gave you an array:
['apples', 'bananas', 'apples','apples','apples', 'cat', 'dog', 'dog', 'troll']
and said:
Print out the name of each item and how often it appears, such that the output is:
apples 4
bananas 1
cat 1
dog 2
troll 1
How would you do this? It seems simple, but it is stumping me.
Do it as below:
array = [
  'apples', 'bananas', 'apples', 'apples',
  'apples', 'cat', 'dog', 'dog', 'troll'
]
array.group_by(&:to_s).each do |k, v|
  puts "#{k} #{v.size}"
end
# >> apples 4
# >> bananas 1
# >> cat 1
# >> dog 2
# >> troll 1
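On Ruby 2.7 or newer, Enumerable#tally does the counting directly. A short sketch, assuming that Ruby version is available:

```ruby
array = %w[apples bananas apples apples apples cat dog dog troll]

# tally returns a hash of element => occurrence count, in first-seen order.
counts = array.tally
counts.each { |name, count| puts "#{name} #{count}" }
# apples 4
# bananas 1
# cat 1
# dog 2
# troll 1
```

To list only the duplicates, counts.select { |_, count| count > 1 } keeps the entries that appear more than once.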

Ruby variable scoping is killing me

I have a parser that reads files. Inside a file, you can declare a filename and the parser will go and read that one, then when it is done, pick up right where it left off and continue. This can happen as many levels deep as you want.
Sounds pretty easy so far. All I want to do is print out the file names and line numbers.
I have a class called FileReader that looks like this:
class FileReader
  attr_accessor :filename, :lineNumber
  def initialize(filename)
    @filename = filename
    @lineNumber = 0
  end
  def readFile()
    # pseudocode to make this easy
    open @filename
    while (lines)
      @lineNumber = @lineNumber + 1
      if(line startsWith ('File:'))
        FileReader.new(line).readFile()
      end
      puts 'read ' + @filename + ' at ' + @lineNumber.to_s()
    end
    puts 'EOF'
  end
end
Simple enough. So let's say I have a file that refers to other files like this: File1->File2->File3. This is what the output looks like:
read File1 at 1
read File1 at 2
read File1 at 3
read File2 at 1
read File2 at 2
read File2 at 3
read File2 at 4
read File3 at 1
read File3 at 2
read File3 at 3
read File3 at 4
read File3 at 5
EOF
read File3 at 5
read File3 at 6
read File3 at 7
read File3 at 8
EOF
read File2 at 4
read File2 at 5
read File2 at 6
read File2 at 7
read File2 at 8
read File2 at 9
read File2 at 10
read File2 at 11
And that doesn't make any sense to me.
File 1 has 11 lines
File 2 has 8 lines
File 3 has 4 lines
I would assume creating a new object would have its own scope that doesn't affect a parent object.
class FileReader
  def initialize(filename)
    @filename = filename
  end
  def read_file
    # with_index(1) makes the reported line numbers start at 1
    File.readlines(@filename).map.with_index(1) {|l, i|
      next "read #{@filename} at #{i}" unless l.start_with?('File:')
      FileReader.new(l.gsub('File:', '').chomp).read_file
    }.join("\n") << "\nEOF"
  end
end
puts FileReader.new('File1').read_file
or just
def File.read_recursively(filename)
  # with_index(1) makes the reported line numbers start at 1
  readlines(filename).map.with_index(1) {|l, i|
    next "read #{filename} at #{i}" unless l.start_with?('File:')
    read_recursively(l.gsub('File:', '').chomp)
  }.join("\n") << "\nEOF"
end
puts File.read_recursively('File1')
I agree that something in your rewritten code has obscured the problem. Yes, those instance variables should be local to the instance.
Watch out for places where a block of code or a conditional may return a value that gets assigned to the instance variable. For example, if your open statement takes a block and somehow returns the filename: @filename = open(line) {}
I say this because the filename obviously didn't change back after the EOF.
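To illustrate that each object keeps its own instance variables, here is a small self-contained sketch. It is not the asker's real code: the `FILES` hash is a hypothetical in-memory stand-in for files on disk, and the shared `log` array exists only so the output can be inspected.

```ruby
# Hypothetical in-memory stand-in for files on disk.
FILES = {
  'File1' => ['Line 1', 'File:File2', 'Line 3'],
  'File2' => ['Line 1', 'Line 2']
}.freeze

class FileReader
  attr_reader :filename, :line_number, :log

  def initialize(filename, log = [])
    @filename = filename
    @line_number = 0
    @log = log
  end

  def read_file
    FILES.fetch(@filename).each do |line|
      @line_number += 1
      if line.start_with?('File:')
        # The nested reader has its own @filename and @line_number.
        FileReader.new(line.delete_prefix('File:'), @log).read_file
      end
      @log << "read #{@filename} at #{@line_number}"
    end
    @log << 'EOF'
    self
  end
end

reader = FileReader.new('File1').read_file
puts reader.log
# read File1 at 1
# read File2 at 1
# read File2 at 2
# EOF
# read File1 at 2
# read File1 at 3
# EOF
```

After the nested reader finishes, the outer reader resumes with its own filename and counter, which is the behavior the question expected.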
This is what I came up with. It's not pretty but I tried to stay as close to your code as possible while Ruby-fying it too.
file_reader.rb
#!/usr/bin/env ruby
class FileReader
  attr_accessor :filename, :lineNumber
  def initialize(filename)
    @filename = filename
    @lineNumber = 0
  end
  def read_file
    File.open(@filename,'r') do |file|
      while (line = file.gets)
        line.strip!
        @lineNumber += 1
        if line.match(/^File/)
          FileReader.new(line).read_file()
        end
        puts "read #{@filename} at #{@lineNumber} : Line = #{line}"
      end
    end
    puts 'EOF'
  end
end
fr = FileReader.new("File1")
fr.read_file
And File1, File2, and File3 look like this:
Line 1
Line 2
Line 3
File2
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10
Line 11
Output:
read File1 at 1 : Line = Line 1
read File1 at 2 : Line = Line 2
read File1 at 3 : Line = Line 3
read File2 at 1 : Line = Line 1
read File2 at 2 : Line = Line 2
read File2 at 3 : Line = Line 3
read File2 at 4 : Line = Line 4
read File3 at 1 : Line = Line 1
read File3 at 2 : Line = Line 2
read File3 at 3 : Line = Line 3
read File3 at 4 : Line = Line 4
EOF
read File2 at 5 : Line = File3
read File2 at 6 : Line = Line 6
read File2 at 7 : Line = Line 7
read File2 at 8 : Line = Line 8
EOF
read File1 at 4 : Line = File2
read File1 at 5 : Line = Line 5
read File1 at 6 : Line = Line 6
read File1 at 7 : Line = Line 7
read File1 at 8 : Line = Line 8
read File1 at 9 : Line = Line 9
read File1 at 10 : Line = Line 10
read File1 at 11 : Line = Line 11
EOF
To reiterate, we really have to see your code to know where the problem is.
I understand you thinking it has something to do with variable scoping so the question makes sense to me.
For the other people. Please be a little more kind to the novices trying to learn. This is supposed to be a place for helping. Thank you. </soapbox>
