Split omitting the text before - ruby

I'm trying to go through a string:
str = "foo chapter 1 bar v1 baz v2 qux chapter 2 quux v1"
and find chapter numbers and verse numbers, e.g. ("chapter 1 foo v1"). When I find a verse number, I want to add the text:
"id=\"(current chapter number)_(current verse number)\""
My expected output is:
"foo chapter 1 bar id=\"chapter_1_v1\" baz id=\"chapter_1_v2\" qux chapter 2 quux id=\"chapter_2_v1\""
With split, I lose the text that comes before the first match of the split pattern. This is my code:
str.split(/(?=chapter \d+)/).each do |c|
c.scan(/(chapter) (\d+)/) {|chap, num| puts c.gsub(/(v\d+)/, 'id="' + chap.to_s + '_' + num.to_s + '_\1"')}
end
How do I keep the text before the split? Or is there a better way of achieving this result?

Instead of splitting, you could use gsub! to replace the text directly for every match. The catch: when the match is chapter \d+, just store the chapter number and leave the text unchanged (replace it with the whole match).
I'll use the following regex to either match a chapter or a verse:
/\bchapter (\d+)|\b(v\d+)\b/
Code:
c = "foo chapter 1 bar v1 baz v2 qux chapter 2 quux v1"
current_chapter = "1"
c.gsub!(/\bchapter (\d+)|\b(v\d+)\b/) { |match|
  if ($1)
    # chapter match: remember the number, keep the text as it is
    current_chapter = $1
    match
  else
    # verse match: replace it with the id attribute
    "id=\"chapter_" + current_chapter + "_#$2\""
  end
}
puts c
Output:
foo chapter 1 bar id="chapter_1_v1" baz id="chapter_1_v2" qux chapter 2 quux id="chapter_2_v1"
Disclaimer: I never code in Ruby, so please focus on the logic I used; the script itself can more than likely be improved. All edits are more than welcome!
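For reference, the same logic can be written as a non-mutating method. This is only a sketch: the method name add_verse_ids and the named groups chapter and verse are my own choices, and it assumes a verse that appears before any chapter should be left untouched rather than defaulting to chapter 1.

```ruby
# Sketch: tag each verse with the most recently seen chapter number.
# add_verse_ids and the group names are illustrative, not from the original answer.
def add_verse_ids(text)
  current_chapter = nil
  text.gsub(/\bchapter (?<chapter>\d+)|\b(?<verse>v\d+)\b/) do |match|
    m = Regexp.last_match
    if m[:chapter]
      # Chapter match: remember the number, return the text unchanged.
      current_chapter = m[:chapter]
      match
    elsif current_chapter
      # Verse match inside a known chapter: emit the id attribute.
      %(id="chapter_#{current_chapter}_#{m[:verse]}")
    else
      # Verse before any chapter (assumption): leave it alone.
      match
    end
  end
end

puts add_verse_ids("foo chapter 1 bar v1 baz v2 qux chapter 2 quux v1")
```

Because it uses gsub rather than gsub!, the input string is not modified.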

Related

No errors but no result

I need to find each occurrence of "$" and change it to a number using a count, e.g. str = "foo $ bar $ foo $ bar $" * run code here * => "foo 1 bar 2 foo 3 bar 4"
It feels like this should be a lot easier than I'm making it out to be. Here's my code:
def counter(file)
f = File.open(file, "r+")
count = 0
contents = f.readlines do |s|
if s.scan =~ /\$/
count += 1
f.seek(1)
s.sub(/\$/, count.to_s)
else
puts "Total changes: #{count}"
end
end
end
However I'm not sure if I'm meant to be using .match, .scan, .find or whatever else.
When I run this it doesn't come up with any errors, but it doesn't change anything either.
Your call to scan is incorrect (scan requires a pattern argument), and it would raise an error if that line ever ran.
You can try something along this line:
count = 0
str = "foo $ bar $ foo $ bar $ "
occurrences = str.scan('$')
# => ["$", "$", "$", "$"]
occurrences.size.times { str.sub!('$', (count += 1).to_s) }
str
# => "foo 1 bar 2 foo 3 bar 4 "
Explanation:
I find all occurrences of $ in the string, then call sub! once per occurrence, because sub! replaces only the first remaining occurrence each time.
Note: You may want to refine the pattern by using a regex that requires boundaries around the $ instead of a plain "$", because the plain version also replaces a $ inside a word. E.g. exa$mple would become something like exa1mple.
Why is your code not throwing an error?
If you read the description about readlines, you will find:
Reads the entire file specified by name as individual lines, and
returns those lines in an array.
Because it reads the entire file at once, there is no point in passing a block to this method: the block is silently ignored. The following example makes this clearer:
contents = f.readlines do |s|
puts "HELLO"
end
# => ["a\n", "b\n", "c\n", "d\n", "asdasd\n", "\n"] #lines of file f
As you can see "HELLO" never gets printed, showing the block code is never executed.
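The numbering can also be done in a single pass with the block form of gsub. This is a sketch, and the method name number_dollars is mine, not from the question.

```ruby
# Sketch: replace each "$" with a running count.
# Returns the new text and the number of replacements made.
def number_dollars(text)
  count = 0
  # A String pattern is matched literally, so "$" needs no escaping here.
  numbered = text.gsub("$") { (count += 1).to_s }
  [numbered, count]
end

text, total = number_dollars("foo $ bar $ foo $ bar $")
puts text
puts "Total changes: #{total}"
```

To apply this to a file, one option is to read it with File.read, pass the contents through the method, and write the result back with File.write.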

Ruby Count lines in file including last line(empty)

I'm trying to count the lines of a file with ruby but I can't get either IO or File to count the last line.
What do I mean by last line?
Here's a screenshot of Atom editor getting that last line
Ruby returns 20 lines, I need 21 lines. Here is such file
https://copy.com/cJbiAS4wxjsc9lWI
Interesting question (although your example file is cumbersome). Your editor shows a 21st line because the 20th line ends with a newline character. Without a trailing newline character, your editor would show 20 lines.
Here's a simpler example:
a = "foo\nbar"
b = "baz\nqux\n"
A text editor would show:
# file a
1 foo
2 bar
# file b
1 baz
2 qux
3
Ruby, however, sees 2 lines in both cases:
a.lines #=> ["foo\n", "bar"]
a.lines.count #=> 2
b.lines #=> ["baz\n", "qux\n"]
b.lines.count #=> 2
You could trick Ruby into recognizing the trailing newline by adding an arbitrary character:
(a + '_').lines #=> ["foo\n", "bar_"]
(a + '_').lines.count #=> 2
(b + '_').lines #=> ["baz\n", "qux\n", "_"]
(b + '_').lines.count #=> 3
Or you could use a Regexp that matches either end of line ($) or end of string (\Z):
a.scan(/$|\Z/) #=> ["", ""]
a.scan(/$|\Z/).count #=> 2
b.scan(/$|\Z/) #=> ["", "", ""]
b.scan(/$|\Z/).count #=> 3
Ruby's lines method doesn't count the last empty line.
As a trick, you can append an arbitrary character to the end of your stream.
Ruby's lines returns 2 lines for this example:
1 Hello
2 World
3
It returns 3 lines in this case instead:
1 Hello
2 World
3 *
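If the goal is simply the number an editor displays, counting newline characters and adding one gives the same result without modifying the string. This is a sketch; the method name editor_line_count is hypothetical, and it assumes lines are terminated by "\n" (which also covers "\r\n").

```ruby
# Sketch: count lines the way a text editor numbers them.
# An editor shows one more line than there are newline characters.
def editor_line_count(text)
  text.count("\n") + 1
end

puts editor_line_count("foo\nbar")     # no trailing newline
puts editor_line_count("baz\nqux\n")   # trailing newline adds an empty last line
```

For a file, the text could come from File.read(path).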

ruby regex - how replace nth instance of a match in a string

In my app I need to be able to find all number substrings, then scan each one, find the first one that matches a range (such as between 5 and 15) and replace THAT instance with another string "X".
My test string s = "1 foo 100 bar 10 gee 1"
My initial pattern is any string of 1 or more digits, eg, re = Regexp.new(/\d+/)
matches = s.scan(re) gives ["1", "100", "10", "1"]
If I want to replace the Nth match, and only the Nth match, with "X" how do I?
For example if I want to replace the third match "10" (matches[2]) I can't just say
s[matches[2]] = "X" because that does two replacements
"1 foo X0 bar X gee 1"
Any help would be appreciated!
String#gsub has a form that takes a block. It yields to the block for each match, and replaces the match with the result of the block. So:
first = true
"1 foo 100 bar 10 gee 1 12".gsub(/\d+/) do |digits|
  number = digits.to_i
  if number >= 5 && number <= 15 && first
    # do the replacement
    first = false
    'X'
  else
    # don't replace; i.e. replace with itself
    digits
  end
end
# => "1 foo 100 bar X gee 1 12"
An alternate way is to construct number range using character class (if it is not too complicated)
>> s = "1 foo 100 bar 10 gee 1"
=> "1 foo 100 bar 10 gee 1"
>> s.sub(/(?<!\d)([5-9]|1[0-5])(?!\d)/, 'X')
=> "1 foo 100 bar X gee 1"
The negative lookarounds ensure that part of a longer digit sequence is not matched.
You can use \b instead of the lookarounds if the numbers cannot be part of words like abc12ef or 8foo.
([5-9]|1[0-5]) matches the numbers from 5 to 15.
Initially, the title led me to think that you wanted to replace the Nth occurrence, e.g. N=2 means replace the second occurrence of any digit sequence. For that you can use this:
# here the number inside {} will be N-1
>> s.sub(/(\d+.*?){1}\K\d+/, 'X')
=> "1 foo X bar 10 gee 1"
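For a general Nth-match replacement, the block form of gsub with a counter avoids building a new regex for each N. This is a sketch; the method name replace_nth and its parameters are my own.

```ruby
# Sketch: replace only the n-th (1-based) match of pattern in text.
def replace_nth(text, pattern, n, replacement)
  seen = 0
  text.gsub(pattern) do |match|
    seen += 1
    # Every match other than the n-th is replaced with itself.
    seen == n ? replacement : match
  end
end

s = "1 foo 100 bar 10 gee 1"
puts replace_nth(s, /\d+/, 3, "X")
```

If n is larger than the number of matches, the string comes back unchanged.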

How do I print Arrays?

I have the array:
example = ['foo', 'bar', 'quux']
I want to iterate over it and print it so it comes out like: foo bar quux, not ['foo', 'bar', 'quux'] which would be the case if I used each or for.
Note: I can't just do: example[0];example[1], etc. because the length of the array is variable.
How do I do this?
Here:
puts example.join(' ') # the separator string is a single space
example.join(" ") #=> "foo bar quux"
If you used each to print, it would work fine:
example.each {|item| print item; print " " } #=> foo bar quux
However, if what you want is a string with the items separated by spaces, that's what the join method is for:
example.join(' ') #=> "foo bar quux"
I suspect your problem is that you're confusing printing with iterating, as each just returns the original array — if you want things printed inside of it, you need to actually print like I did in the example above.
If the items may be printed underneath each other, just use
puts example
=>
foo
bar
quux
Otherwise, use the solutions from the other answers:
puts example.join(" ")

Ruby, iterating through strings, matching exact patterns and replacing each but the first one

I have a column of strings (cities) in a csv file. I'd need to go through the list, iterate through all matching patterns, keep only the first one and replace all similar ones with blank lines.
I am no programmer, but if I could do this that would help me a lot at work!
I have notions of Ruby and notions of regexp in Emacs.
Is this feasible? Can anyone help?
Thank you in advance!
File looks like this:
Bordeaux
Bordeaux
Paris
Paris
Paris
Riom
File should look like this:
Bordeaux
(blank)
Paris
(blank)
(blank)
Riom
Keeping the empty lines:
file_in = File.open('test_villes_ruby.txt','r')
file_out = File.open('test_villes_ruby_stripped.txt','w')
memo = ""
file_in.each do |city|
  if city == memo then
    file_out << "\n"
  else
    file_out << city
    memo = city
  end
end
file_in.close
file_out.close
For such simple tasks, you can also pass your Ruby script directly to the interpreter using the -e command line parameter. If you combine it with -n or -p, your script is run on every line of the input in turn. The variable $_ then holds the content of the line currently being processed.
So, if your input file looks like this:
jablan-mbp:dev $ cat test1.txt
foo
foo
foo
bar
bar
foo
bar
bar
bar
bar
foo
You can execute a simple script this way:
jablan-mbp:dev $ ruby -n -e 'puts(@memo == $_ ? "" : @memo = $_)' < test1.txt
foo
(blank)
(blank)
bar
(blank)
foo
bar
(blank)
(blank)
(blank)
foo
Solution:
File.open('cities', 'r') do |f_in|
  File.open('cities_uniq', 'w') do |f_out|
    f_in.inject("") { |o, c| f_out.puts o == c ? "\n" : c ; c}
  end
end
Input:
Bordeaux
Bordeaux
Paris
Paris
Paris
Riom
Riom
Riom
Frankfurt
Wien
Wien
Output:
Bordeaux
(blank)
Paris
(blank)
(blank)
Riom
(blank)
(blank)
Frankfurt
Wien
(blank)
Note: There's an empty line after the final "Wien", but I can't get it to display here...
Probably the simplest way is just to use a set (or SortedSet if you need sorted order)
require 'set'
cities = Set.new
cities_in_csv.each do |city|
  cities.add(city)
end
Nothing extra. Sets by definition do not contain duplicate elements.
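A set on its own drops the duplicates instead of leaving blank lines in their place. A sketch that uses a set to blank every repeat, consecutive or not, might look like this; the method name blank_repeats is hypothetical, and it assumes the input is an array of lines with or without trailing newlines.

```ruby
require "set"

# Sketch: keep the first occurrence of each city, blank out later repeats.
def blank_repeats(lines)
  seen = Set.new
  lines.map do |line|
    city = line.chomp
    # Set#add? returns nil when the element is already in the set.
    seen.add?(city) ? city : ""
  end
end

puts blank_repeats(%w[Bordeaux Bordeaux Paris Paris Paris Riom])
```

Unlike the memo-based answers above, this also blanks a city that reappears later in the file after other cities.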

Resources