Ruby replace text within single quotes or backticks for html tag - ruby

Hello I am trying to build a simple action in Ruby that takes one string like
result = "This is my javascript variable 'var first = 1 + 1;' and here is another 'var second = 2 + 2;' and that's it!"
So basically I would like to take the text within single quotes ' or backticks ` and replace it with:
<code>original text</code> (note that I'm wrapping it in an opening and a closing code tag)
Just like in markdown
so I would have a result like
result = "This is my javascript variable <code>var first = 1 + 1;</code> and here is another <code>var second = 2 + 2;</code> and that's it!"
If it's possible to run this natively without the need of any extra gem it would be great :)
Thanks a lot

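I guess you'll need to iterate over the string and parse it. A single regex substitution is possible, e.g. result.gsub!(/'([^']*)'/, '<code>\1</code>') (the negated class [^'] stops at the next quote, much like a non-greedy match), but it will misbehave in corner cases, such as a stray apostrophe that appears before the quoted code.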
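A minimal sketch of a single-pass substitution that handles both delimiters from the question. The helper name codify is hypothetical, and the lookarounds are an assumption added here: they require that a delimiter does not touch a word character on its outer side, which is one way to skip apostrophes such as the one in "that's".

```ruby
# Hypothetical helper: wrap text delimited by single quotes or backticks in <code> tags.
#   (?<!\w)  the opening delimiter must not directly follow a word character
#   (['`])   group 1: the delimiter itself
#   (.+?)    group 2: the code, matched non-greedily
#   \1       the closing delimiter must be the same character as the opening one
#   (?!\w)   the closing delimiter must not be directly followed by a word character
def codify(text)
  text.gsub(/(?<!\w)(['`])(.+?)\1(?!\w)/, '<code>\2</code>')
end

result = "This is my javascript variable 'var first = 1 + 1;' and here is another `var second = 2 + 2;` and that's it!"
puts codify(result)
```

This still relies on a heuristic: code that itself contains a quote next to whitespace or punctuation can end the match early, so it is not a full parser.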

Without any other advanced requirement
>> result.gsub(/\s+'/,"<code>").gsub(/'\s+/,"</code>")
=> "This is my javascript variable<code>var first = 1 + 1;</code>and here is another<code>var second = 2 + 2;</code>and that's it!"

You will need to come up with a delimiter character for your code that you don't use anywhere else.
Why? Because of all the corner cases. E.g. the following string
result = "This's my javascript variable 'var first = 1 + 1;' and here is another 'var second = 2 + 2;' and that's it!"
which would otherwise produce:
"This<code>s my javascript variable </code>var first = 1 + 1;<code> and here is another </code>var second = 2 + 2;<code> and that</code>s it!"
Total garbage out.
However, if you use a unique delimiter character that's not used anywhere else, you can write a non-greedy regexp that does the search/replace,
e.g. using a # character to delimit the code:
"This's my javascript variable #var first = 1 + 1;# and here is another #var second = 2 + 2;# and that's it!"

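A short sketch of the delimiter idea from the answer above, assuming # never appears in the text except as a code delimiter. The variable names are illustrative.

```ruby
# Assumption: '#' is reserved as the code delimiter and does not occur elsewhere.
text = "This's my javascript variable #var first = 1 + 1;# and here is another #var second = 2 + 2;# and that's it!"

# .*? is non-greedy, so each match stops at the first closing '#'.
html = text.gsub(/#(.*?)#/, '<code>\1</code>')
puts html
```

Because the apostrophes are no longer delimiters, "This's" and "that's" pass through untouched.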
Related

Extract values after pattern in Ruby string

I have a string like this:
"<root><some ProdCode=\"40\" ProducerName=\"demo1\" ProdCode=\"40\" Need_Confirmation=\"1\"/><some ProdCode=\"40\" ProducerName=\"demo1\" ProdCode=\"40\" Need_Confirmation=\"1\"/></root>"
I'm trying to pull the content from this string which is between =\"content\" and put it in an array, like ["40","demo1","40","1",40......]
Use String#scan to select elements matching a regexp pattern, then remove the quote characters.
string.scan(/"[^"]+"/).map { |element| element.delete('\\"') }
Explanation of pattern:
/ – regexp starts
" – first char should be "
[^"]+ – next should be any char except ". + sign says that number of such chars should be at least 1.
" – next should be again "
/ – regexp ends
So string.scan(/"[^"]+"/) would return:
["\"40\"", "\"demo1\"", "\"40\"", "\"1\"", "\"40\"", "\"demo1\"", "\"40\"", "\"1\""]
Then we can just strip the " characters with String#delete.
A convenient tool for building regexps is http://rubular.com/
When your string is this simple you can use scan + regular expression like this:
result = html.scan(/ProdCode="\d+?"/)
If it is more complex you can use a html parser like nokogiri or oga.
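As an alternative sketch to scan followed by delete, a capture group lets scan return only the text between the quotes. The variable names here are illustrative, and the sample is a shortened version of the string from the question.

```ruby
xml = '<root><some ProdCode="40" ProducerName="demo1" ProdCode="40" Need_Confirmation="1"/></root>'

# With one capture group, scan returns an array of one-element arrays,
# so flatten turns it into a flat list of attribute values.
values = xml.scan(/="([^"]*)"/).flatten
p values
```

Using [^"]* instead of [^"]+ also keeps empty attribute values, which appear as empty strings in the result.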

RegEx to remove new line characters and replace with comma

I scraped a website using Nokogiri and after using xpath I was left with the following string (which is a few td's pushed into one string).
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t"
My goal is to make this into an array that looks like the following(it will be a nested array):
["Total First Downs", "359", "274"]
The issue is writing a regex that removes the escaped whitespace characters and substitutes a single "," for each run, but does not add a "," after the last set of integers. If a comma after the last set of integers is unavoidable, I could use #compact to get rid of the nil that ends up in the array. In case you need the code I used to scrape the website, here it is (please note that I saved the webpage locally for testing so that my IP address doesn't get burned during the trial phase):
require 'nokogiri'
f = File.open('page')
doc = Nokogiri::HTML(f)
f.close
number = doc.xpath('//tr[@class="tbdy1"]').count
stats = Array.new(number) { Array.new }
i = 0
# Push each matching row's text into its own inner array.
doc.xpath('//tr[@class="tbdy1"]').each do |tr|
  stats[i] << tr.text
  i += 1
end
Thanks for your help
I don't fully understand your problem, but the result can be easily achieved with this:
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t"
.split(/[\n\t]+/)
# => ["Total First Downs", "359", "274"]
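Try gsub with a regex literal (not a quoted string), stripping the trailing whitespace first so that no comma is left at the end:
"Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t".strip.gsub(/[\n\t]+/, ",")
# => "Total First Downs,359,274"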

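A sketch of how the split approach could produce the nested array the question asks for. The row_texts array stands in for the tr.text values; its second entry is made up purely for illustration.

```ruby
# Hypothetical row texts standing in for tr.text; the second row is invented sample data.
row_texts = [
  "Total First Downs\n\t\t\t\t\t\t\t\t359\n\t\t\t\t\t\t\t\t274\n\t\t\t\t\t\t\t",
  "Total Yards\n\t\t5000\n\t\t4200\n\t"
]

# split drops trailing empty fields, so the trailing whitespace adds no extra element.
stats = row_texts.map { |text| text.split(/[\n\t]+/) }
p stats
```

Splitting each row directly avoids the intermediate comma-separated string altogether.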
Testing for string containing variable characters in ruby

Can I do something like this?
string = "123" + rand(10).to_s + "abc"
=> "1237abc"
string.include?("123" + string.any_character(1) + "abc") # any_character(digit)
=> true
The point is to know part of a string, say an HTML tag string, and test for something like Name Changes Every Day,
so that I can easily find the title in the source every time, no matter what it is. In my case the unknown part will always be one character. Any help?
Try this code:
string = "123" + rand(10).to_s + "abc"
string =~ /123\dabc/
Here =~ returns 0, meaning the match starts at index 0; it returns nil when the pattern doesn't match. To match only the whole text, change /123\dabc/ to /\A123\dabc\z/ (in Ruby, ^ and $ anchor to line boundaries rather than to the whole string).
You can just use regular expressions with ruby. Example:
string = "123" + rand(10).to_s + "abc"
string.match /123\dabc/
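A small sketch that pulls the two answers together: testing the whole string, extracting the variable character, and locating a match inside a longer string. The variable names are illustrative; String#match? assumes Ruby 2.4 or later.

```ruby
string = "123" + rand(10).to_s + "abc"

# \A and \z anchor to the start and end of the whole string.
whole = /\A123\dabc\z/
puts string.match?(whole)

# Capture the variable character instead of only testing for it.
digit = string[/\A123(\d)abc\z/, 1]
puts digit

# Without anchors, =~ reports where the match starts inside a longer string.
index = "x1237abcx" =~ /123\dabc/
puts index
```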

Issue dealing with white space with Ruby regular expressions

I'm trying to write a simple script expression that allows me to identify the java files in a directory that have a private constructor. I have had some luck but I want my script to acknowledge there is white space between the access modifier and the constructor name but not care if it is a space or n spaces or a tab or n tabs etc.
I am trying to use...
"private\s+"+object_name
but the + (1 or more) is not finding a constructor with 2 spaces between the modifier and the constructor name.
I know I am missing something. Any help would be greatly appreciated.
Thanks.
Here is the full code if it helps...
#!/usr/bin/ruby
path = ARGV[0]
if path.nil?
  puts "missing path argument"
  exit
end
entries = Dir.entries( path )
entries.each do |file_name|
  file_name = file_name.rstrip
  if ( file_name.end_with? "java" )
    text = File.read( path+file_name )
    object_name = file_name.chomp( ".java" )
    search_str = "private\s+"+object_name
    matches = text.match( Regexp.escape( search_str ) )
    if ( !matches.nil? && matches.length > 0 )
      puts matches
    end
  end
end
I think you want to escape the \ in your Ruby string and also Regexp.escape your object name and not the whole regex including the whitespace matcher, e.g.,
[...]
search_regex = Regexp.new("private\\s+" + Regexp.escape(object_name))
matches = text.match(search_regex)
As @LBg also points out, if you want to use + concatenation, it is better to use single quotes, which don't require escaping the \. Or use double quotes with interpolation, as in:
search_regex = Regexp.new("private\\s+#{Regexp.escape(object_name)}")
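In a double-quoted string, "\s" is read as a plain space, so "private\s+" is really the string private + (a space followed by a plus sign). That alone is not a problem, but single quotes are preferable here. Regexp.escape strips the special meaning from every regex metacharacter in the string, so private + becomes private\ \+, and match then looks for the literal text private +object_name, which is not what you want. Remove the Regexp.escape and it should work, although the pattern then matches runs of spaces only, not tabs; use 'private\s+' in single quotes to match any whitespace.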
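The difference can be checked directly. This is a small sketch in which the class name Foo and the sample Java text are made up for illustration:

```ruby
object_name = "Foo"                                   # hypothetical class name
text = "public class Foo {\n  private  \tFoo() {}\n}" # two spaces and a tab after "private"

# Escaping the whole string: "\s" is already a space, and both the space
# and the + are then escaped, leaving a purely literal pattern.
escaped_whole = Regexp.escape("private\s+" + object_name)
puts escaped_whole

# Escaping only the object name keeps \s+ as a whitespace matcher.
search_regex = Regexp.new('private\s+' + Regexp.escape(object_name))

literal_match = text.match(escaped_whole)
regex_match = text.match(search_regex)
p literal_match
p regex_match
```

The first match is nil because the text never contains the literal sequence "private +Foo", while the second pattern accepts any mix of spaces and tabs.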

How do I parse a quoted string inside another string?

I want to extract the quoted substrings from inside a string. This is an example:
string = 'aaaa' + string_var_x + 'bbbb' + string_var_y
The output after parsing should be:
["'aaaa'", "'bbbb'"]
The initial solution was to string.scan /'\w'/ which is almost ok.
Still, I can't get it working on more complex strings, since inside '...' there can be any kind of character (including numbers and !@#$%^&*(), whatever).
Any ideas?
I wonder if there's some way to make /'.*'/ working, but make it less greedy?
Lazy should fix this:
/'.*?'/
Another possibility is to use this:
/'[^']*'/
An alternate way to do it is:
>> %{string = 'aaaa' + string_var_x + 'bbbb' + string_var_y}.scan(/'[^'].+?'/)
#=> ["'aaaa'", "'bbbb'"]
String#scan gets overlooked a lot.
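To see why the greedy pattern fails and the other two work, here is a quick comparison. The sample string is a variation on the one in the question, with punctuation added inside the second literal for illustration.

```ruby
source = %q{string = 'aaaa' + string_var_x + 'bb 12!?' + string_var_y}

# Greedy: .* runs to the LAST quote, swallowing everything in between.
greedy = source.scan(/'.*'/)

# Lazy: .*? stops at the first closing quote.
lazy = source.scan(/'.*?'/)

# Negated class: [^']* can never cross a quote, so no backtracking is needed.
negated = source.scan(/'[^']*'/)

p greedy
p lazy
p negated
```

The lazy and negated-class patterns return the same two literals here; the negated class is the stricter of the two, since it cannot span a quote under any circumstances.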
