Why can't I split a string on control characters? - ruby

I am trying to split a line when I find the characters ^C or ^B together. For some reason it is not splitting properly.
I have been on Rubular and tested this and supposedly it should split it.
The lines that I am reading in and trying to split look something like this:
SOME_KEY^CSOME_VALUE^BSOME_KEY^CSOME_VALUE
The code is:
final_array = []
temp_array = []
array__with_all_of_the_data.each do |x|
temp_array = x.split(/\^C/)
temp_array.each do |y|
final_array << y.split(/\^B/)
end
#final_array << final_array.join(",")
end

Split using the regular expression /\^[BC]/:
>> 'SOME_KEY^CSOME_VALUE^BSOME_KEY^CSOME_VALUE'.split(/\^[BC]/)
=> ["SOME_KEY", "SOME_VALUE", "SOME_KEY", "SOME_VALUE"]
If you want to replace ^B / ^C with a comma, use gsub instead of split + join:
>> 'SOME_KEY^CSOME_VALUE^BSOME_KEY^CSOME_VALUE'.gsub(/\^[BC]/, ',')
=> "SOME_KEY,SOME_VALUE,SOME_KEY,SOME_VALUE"
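The patterns above match a literal caret followed by B or C. If the file actually contains the control characters themselves (many editors and terminals only display byte 0x03 as ^C and byte 0x02 as ^B), there is no caret to match, which would explain why the original split seems to do nothing. A minimal sketch under that assumption; the sample line is made up to mirror the question:

```ruby
# Assumption: the data holds real control bytes (0x03 and 0x02),
# which are merely *displayed* as ^C and ^B.
line = "SOME_KEY\x03SOME_VALUE\x02SOME_KEY\x03SOME_VALUE"

# No literal "^" exists in the data, so this returns the line unsplit.
literal_caret = line.split(/\^[BC]/)

# Matching the bytes themselves splits at every separator.
control_bytes = line.split(/[\x02\x03]/)

# Splitting on 0x02 first and 0x03 second keeps each key next to its value.
pairs = line.split("\x02").map { |pair| pair.split("\x03") }

puts control_bytes.inspect
puts pairs.inspect
```

Running line.bytes or line.inspect on one of the real input lines is a quick way to check which of the two cases applies.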

Related

How to split a string between 2 parameters in Ruby

Hi, I am trying to separate input like this: <Text1><Text2><Text2>..<TextN>
into an array that holds one text per index. How can I use split with two delimiters?
I tried a double split, but it doesn't work:
request = client.gets.chomp
dev = request.split("<")
split_doble(dev)
dev.each do |devo|
puts devo
end
def split_doble (str1)
str1.each do |str2|
str2.split(">")
end
end
When you have a string like this
string = "<text1><text2><textN>"
then you can extract the text between the < and > chars like that:
string.scan(/\w+/)
#=> ["text1", "text2", "textN"]
/\w+/ is a regular expression that matches a sequence of word characters (letter, number, underscore) and therefore ignores the < and > characters.
Also see docs about String#scan.
In the string "<text1><text2><textN>" the leading < and ending > are in the way, so get rid of them by slicing them off. Then just split on "><".
str = "<text1><text2><textN>"
p str[1..-2].split("><") # => ["text1", "text2", "textN"]
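Both answers assume simple contents. If the text between the brackets may contain spaces or punctuation, /\w+/ would break it into pieces. A small sketch using a capture group instead; the input string here is hypothetical:

```ruby
# Hypothetical input: the bracketed texts contain a space and a hyphen.
request = "<hello world><a-b><text3>"

# The capture group keeps only what sits between "<" and ">".
# With a group, scan returns one single-element array per match, hence flatten.
texts = request.scan(/<([^>]*)>/).flatten

puts texts.inspect
```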

Ruby: Using an array list in order to select specific columns

I'm new to Ruby.
Here is the script. I would like to use the selector array instead of fields[0] etc.
How can I do that?
For this example, the data is embedded.
Don't hesitate to correct me if I'm doing something wrong when opening or writing a file, or anything else; I like to learn.
#!/usr/bin/ruby
filename = "/tmp/log.csv"
selector = [0, 3, 5, 7]
out = File.open(filename + ".rb.txt", "w")
DATA.each_line do |line|
fields = line.split("|")
columns = fields[0], fields[3], fields[5], fields[7]
puts columns.join("|")
out.puts(columns.join("|"))
end
out.close
__END__
20180704150930|rtsp|645645643|30193|211|KLM|KLM00SD624817.ts|172.30.16.34|127299264|VERB|01780000|21103|277|server01|OK
20180704150931|api|456456546|30130|234|VC3|VC300179201139.ts|172.30.16.138|192271838|VERB|05540000|23404|414|server01|OK
20180704150931|api|465456786|30154|443|BAD|BAD004416550.ts|172.30.16.50|280212202|VERB|04740000|44301|18|server01|OK
20180704150931|api|5437863735|30157|383|VSS|VSS0011062009.ts|172.30.16.66|312727922|VERB|05700000|38303|381|server01|OK
20180704150931|api|3453432|30215|223|VAE|VAE00TF548197.ts|172.30.16.74|114127126|VERB|05060000|22305|35|server01|OK
20180704150931|api|312121|30044|487|BOV|BOVVAE00549424.ts|172.30.16.58|69139448|VERB|05300000|48708|131|server01|OK
20180704150931|rtsp|453432123|30127|203|GZD|GZD0900032066.ts|172.30.16.58|83164150|VERB|05460000|20303|793|server01|OK
20180704150932|api|12345348|30154|465|TYH|TYH0011224259.ts|172.30.16.50|279556843|VERB|04900000|46503|241|server01|OK
20180704150932|api|4343212312|30154|326|VAE|VAE00TF548637.ts|172.30.16.3|28966797|VERB|04740000|32601|969|server01|OK
20180704150932|api|312175665|64530|305|TTT|TTT000000011852.ts|172.30.16.98|47868183|VERB|04740000|30501|275|server01|OK
You can get the fields at specific indices using Ruby's splat operator and Array#values_at, like so:
columns = fields.values_at(*selector)
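As a quick illustration of what that line does, here is a sketch using a shortened copy of the first data row from the question:

```ruby
selector = [0, 3, 5, 7]

# Shortened copy of the first row of the sample data.
line = "20180704150930|rtsp|645645643|30193|211|KLM|KLM00SD624817.ts|172.30.16.34|127299264\n"

fields = line.chomp.split("|")

# *selector expands the array into separate arguments,
# so this is the same as fields.values_at(0, 3, 5, 7).
columns = fields.values_at(*selector)

puts columns.join("|")

# An index past the end of the row yields nil rather than raising.
out_of_range = fields.values_at(0, 99)
```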
A couple of coding style suggestions:
1. You may want to make selector a constant, since it's unlikely that you'll want to mutate it further down in your code base.
2. Opening out, writing each line and calling out.close can all be condensed into a CSV.open block:
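require 'csv'
CSV.open(filename, 'wb') do |csv|
  DATA.each_line do |line|
    # csv << expects an array of fields, one array per output row
    csv << line.chomp.split('|').values_at(*selector)
  end
end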
You can also specify a custom delimiter (a pipe | in your case) via the col_sep option:
...
CSV.open(filename, 'wb', col_sep: '|') do |csv|
...
Let's begin with a more manageable example. First note that if your string is held by the variable data, each line of the string contains the same number (14) of vertical bars ('|'). Let's reduce that to the first 4 lines of data, with each line terminated immediately before the 6th vertical bar:
str = data.each_line.map { |line| line.split("|").first(6).join("|") }.first(4).join("\n")
puts str
20180704150930|rtsp|645645643|30193|211|KLM
20180704150931|api|456456546|30130|234|VC3
20180704150931|api|465456786|30154|443|BAD
20180704150931|api|5437863735|30157|383|VSS
We need to also modify selector (arbitrarily):
selector = [0, 3, 4]
Now on to answering the question.
There is no need to divide the string into lines, split each line on the vertical bars, select the elements of interest from the resulting array, join the latter with a vertical bar and then lastly join the whole shootin' match with a newline (whew!). Instead, simply use String#gsub to remove all unwanted characters from the string.
terms_per_row = str.each_line.first.count('|') + 1
#=> 6
r = /
  (?:^|\|)  # match the beginning of a line or a vertical bar in a non-capture group
  [^|\n]+   # match one or more characters other than a vertical bar or newline
/x          # free-spacing regex definition mode
line_idx = -1
new_str = str.gsub(r) do |s|
line_idx += 1
selector.include?(line_idx % terms_per_row) ? s : ''
end
puts new_str
20180704150930|30193|211
20180704150931|30130|234
20180704150931|30154|443
20180704150931|30157|383
Lastly, we write new_str to file:
File.write(fname, new_str)
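For comparison, the line-by-line approach that this answer avoids is short as well. A sketch that rebuilds the reduced sample from above and is expected to give the same four lines:

```ruby
# The reduced sample from the answer: first 4 rows, first 6 fields each.
str = [
  "20180704150930|rtsp|645645643|30193|211|KLM",
  "20180704150931|api|456456546|30130|234|VC3",
  "20180704150931|api|465456786|30154|443|BAD",
  "20180704150931|api|5437863735|30157|383|VSS"
].join("\n")

selector = [0, 3, 4]

# Split each line on "|", keep the selected fields, and glue everything back together.
new_str = str.each_line.map do |line|
  line.chomp.split("|").values_at(*selector).join("|")
end.join("\n")

puts new_str
```

Unlike the gsub version, this one does not depend on every field being non-empty or on the first column being part of selector.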

ruby: Grab numbers only within quotes

I would like the following sub-string
1100110011110000
from
foo = "bar9-9 '11001100 11110000 A'A\n"
I have so far used the below, which yields
puts foo.split(',').map!(&:strip)[0].gsub(/\D/, '')
>> 991100110011110000
Getting rid of the 2 leading 9's is not too difficult in this scenario, but I would like a general solution which grabs numbers only within the ' ' single quotes
You can find the quoted part first with scan and then remove non-digits:
> results = "bar9-9 '11001100 11110000 A'A\n".scan(/'[^']*'/).map{|m| m.gsub(/\D/, '')}
# => ["1100110011110000"]
> results[0]
# => "1100110011110000"
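The same idea can be wrapped in a small helper that returns one digit string per quoted section. This is only a sketch; digits_in_quotes is a made-up name, and the second input is an invented variation with two quoted sections:

```ruby
# Hypothetical helper: returns the digits found inside each pair of single quotes.
def digits_in_quotes(text)
  # The capture group grabs the content between two quotes;
  # delete("^0-9") then drops every character that is not a digit.
  text.scan(/'([^']*)'/).flatten.map { |quoted| quoted.delete("^0-9") }
end

foo = "bar9-9 '11001100 11110000 A'A\n"
single = digits_in_quotes(foo)

two_sections = digits_in_quotes("bar9-9 '11001100 11110000 A'A'10x1y'\n")

puts single.inspect
puts two_sections.inspect
```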
The zeros and ones within the quoted string can be extracted using String#gsub with a regular expression, as opposed to methods that convert the string to an array of strings, modify the array and convert it back to a string. Here are three ways of doing that.
str ="bar9-9 '11001100 11110000 A'A\n"
#1: Extract the substring of interest and then remove characters other than zero and one
def extract(str)
  # Range from just after the first quote up to (not including) the last quote
  str[str.index("'")+1...str.rindex("'")].gsub(/[^01]/,'')
end
extract str
#=> "1100110011110000"
#2 Use a flag to indicate when zeroes and ones are to be kept
def extract(str)
found = false
str.gsub(/./m) do |c|
found = !found if c == "'"
(found && (c =~ /[01]/)) ? c : ''
end
end
extract str
#=> "1100110011110000"
Here the regular expression requires the m modifier (to enable multiline mode) so that . also matches the newline character, which the block then converts to an empty string. (One could alternatively write str.chomp.gsub(/./)....)
Notice that this second method works when there are multiple single-quoted substrings.
extract "bar9-9 '11001100 11110000 A'A'10x1y'\n"
#=> "1100110011110000101"
#3 Use the flip-flop operator (variant of #2)
def extract(str)
  str.gsub(/./m) do |c|
    # Three dots: the flip-flop stays on from an opening quote through its closing quote
    next '' unless (c=="'") ... (c=="'")
    c =~ /[01]/ ? c : ''
  end
end
extract str
#=> "1100110011110000"
extract "bar9-9 '11001100 11110000 A'A'10x1y'\n"
#=> "1100110011110000101"
foo.slice(/'.*?'/).scan(/\d+/).join
#=> "1100110011110000"

how do you use multiple arguments with gsub? (Ruby)

I need to add multiple arguments to the gsub parenthesis, but whatever I try it doesn't seem to work.
# encoding: utf-8
# !/usr/bin/ruby
# create an empty array
original_contents = []
# open file to read and write
f = File.open("input.txt", "r")
# pass each line through the array
f.each_line do |line|
# push edited text to the array
original_contents << line.gsub(/[abc]/, '*')
end
f.close
new_file = File.new("output.txt", "r+")
new_file.puts(original_contents)
new_file.close
I need it so I can do a lot of different search and replaces like this:
original_contents << line.gsub(/[abc]/, '*' || /[def]/, '&' || /[ghi]/, '£')
Of course I know this code doesn't work but you get the idea. I've tried using an array for each argument but it ends up printing the text into the output file multiple times.
Any ideas?
As Holger Just said, I also suggest you run gsub multiple times. You can make the code a bit prettier when you store the replacements in a hash and then iteratively apply them to the string with Enumerable#reduce.
replacements = {
/[abc]/ => '*',
/[def]/ => '&',
/[ghi]/ => '£'
}
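f = File.open("input.txt", "r")
# IO#lines was removed in Ruby 3.0; each_line without a block returns an enumerator
original_contents = f.each_line.map do |line|
  replacements.reduce(line) do |memo, (pat, replace)|
    memo.gsub(pat, replace)
  end
end
f.close
# "w" creates or truncates the file; "r+" fails if output.txt does not exist yet
new_file = File.new("output.txt", "w")
new_file.puts(original_contents)
new_file.close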
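When every pattern is a plain character class, as in the question, a single pass is also possible, because gsub accepts a hash as its second argument and looks up each match in it. A sketch under that assumption; the groups hash and the substitute lambda are illustrative names, not part of the original code:

```ruby
# Illustrative mapping: each group of characters shares one replacement.
groups = { "abc" => "*", "def" => "&", "ghi" => "£" }

# Expand to a per-character lookup table: {"a"=>"*", "b"=>"*", ..., "i"=>"£"}
table = groups.flat_map { |chars, rep| chars.chars.map { |ch| [ch, rep] } }.to_h

# One pattern that matches any of the characters.
pattern = Regexp.union(table.keys)

# gsub with a hash replaces each match with the value stored under that key.
substitute = ->(line) { line.gsub(pattern, table) }

result = substitute.call("a bad dog high")
puts result
```

One difference from chaining several gsub calls: in a single pass, a replacement string is never scanned again, so an earlier substitution cannot be rewritten by a later rule.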

Replace in Ruby commas by different strings

I have a string called "example", like this:
192.168.1.40,8.8.8.8,12.34.45.56,408,-,1812
192.168.1.128,192.168.101.222,12.34.45.56,384,-,1807
and I would like to obtain this output:
{"string1":"192.168.1.40","string2":"8.8.8.8","string3":"12.34.45.56","string4":408,"string5":"-","string6":1812}
{"string1":"192.168.1.128","string2":"192.168.101.222","string3":"12.34.45.56","string4":384,"string5":"-","string6":1807}
I did this:
example = example.gsub("\n","}\n{\"string1\": \"")
example = example.insert(0, "{\"string1\": \"")
example = example.concat("}")
and I obtained:
{"string1":"192.168.1.40,8.8.8.8,12.34.45.56,408,-,1812}
{"string1":"192.168.1.128,192.168.101.222,12.34.45.56,384,-,1807}
but I don't know how to make the other changes. Thanks!
Well, to get it as a Ruby hash, which you can then output as JSON or whatever you need:
out = {}
your_input_data.split(",").each_with_index { |val, i| out["string#{i + 1}"] = val }
(You would need to do this for each line, e.g. input.lines.each { |line| ... do the above here }. It is not clear to me whether you want a list of maps.)
I made the assumption that you didn't want values that were just numbers to be double-quoted.
DATA.each_line do |line|
l = line.chomp.split(',').map.with_index do |v, i|
v = v =~ /^\d+$/ ? v : "\"#{v}\""
"\"string#{i+1}\":#{v}"
end
print "{", l.join(','), "}\n"
end
__END__
192.168.1.40,8.8.8.8,12.34.45.56,408,-,1812
192.168.1.128,192.168.101.222,12.34.45.56,384,-,1807
Result:
{"string1":"192.168.1.40","string2":"8.8.8.8","string3":"12.34.45.56","string4":408,"string5":"-","string6":1812}
{"string1":"192.168.1.128","string2":"192.168.101.222","string3":"12.34.45.56","string4":384,"string5":"-","string6":1807}
It seems from the code you wrote that you are looking for a single string as output rather than a more elaborate Ruby data structure or output to a printed stream.
This is working for me:
example = '192.168.1.40,8.8.8.8,12.34.45.56,408,-,1812
192.168.1.128,192.168.101.222,12.34.45.56,384,-,1807'
result = example.split("\n").map do |line|
n = 0
line.split(',').map{|s| %Q|"string#{n+=1}":"#{s}"|}.join(',')
end.map{|c| "{#{c}}"}.join("\n")
puts result
{"string1":"192.168.1.40","string2":"8.8.8.8","string3":"12.34.45.56","string4":"408","string5":"-","string6":"1812"}
{"string1":"192.168.1.128","string2":"192.168.101.222","string3":"12.34.45.56","string4":"384","string5":"-","string6":"1807"}
This splits the input into lines, splits each line into separate strings, prefixes each string with its JSON key, and finally reassembles everything with join, first with commas and then with newlines. If you'd rather have lists than reassembled strings, just omit the respective join.
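Since the desired output is JSON, another option is to build a hash per line and let Ruby's standard json library handle the quoting. A sketch that assumes, as one of the answers above does, that purely numeric values should become numbers:

```ruby
require 'json'

example = "192.168.1.40,8.8.8.8,12.34.45.56,408,-,1812\n" \
          "192.168.1.128,192.168.101.222,12.34.45.56,384,-,1807"

json_lines = example.each_line.map do |line|
  # Assumption: values made only of digits are emitted as integers.
  values = line.chomp.split(',').map { |v| v.match?(/\A\d+\z/) ? v.to_i : v }
  keys = (1..values.size).map { |i| "string#{i}" }
  # zip pairs each key with its value; to_json takes care of quoting and escaping.
  keys.zip(values).to_h.to_json
end

puts json_lines
```

This avoids assembling the quotes by hand, which matters once a value itself contains a double quote or a backslash.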
