Regex to match value only once in text value - ruby

I am dealing with a dirty data source that has some key-value pairs I have to extract. For example:
First Name = John Last Name = Smith Home Phone = 555-333-2345 Work Phone = Email = john.doe@email.com Zip From = 11772 Zip To = 11782 First Name = John First Name = John
To extract the First Name, I am using this regular expression:
/First Name = ([a-zA-Z]*)/
How do I prevent multiple matches in the case where the First Name is duplicated as shown above?
Here is a version of this on Rubular.

match will only get the first match (you would use scan to get all):
str.match(/First Name = ([a-zA-Z]*)/).captures.first
#=> "John"
(given your string is in str)
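To make the difference between match and scan concrete, here is a small sketch that runs both against a shortened version of the sample line (the shortening is only for readability):

```ruby
# Shortened sample line with "First Name" repeated three times.
str = "First Name = John Last Name = Smith Zip To = 11782 " \
      "First Name = John First Name = John"

pattern = /First Name = ([a-zA-Z]*)/

# match stops at the first occurrence and returns a MatchData object.
first_only = str.match(pattern).captures.first

# scan keeps going and returns one array of captures per occurrence.
all_matches = str.scan(pattern)

p first_only   # "John"
p all_matches  # [["John"], ["John"], ["John"]]
```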

[] will also give you the first match:
str[/First Name = ([a-zA-Z]*)/, 1]
The 1 means the first capture group
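With a dirty data source the key may be missing altogether. The sketch below, which assumes a line without any First Name pair, shows that String#[] simply returns nil in that case, while the match version needs a nil guard (here the safe navigation operator, available from Ruby 2.3) before captures can be called:

```ruby
str = "Last Name = Smith Zip To = 11782"  # no "First Name" key here

pattern = /First Name = ([a-zA-Z]*)/

# String#[] returns nil when the pattern does not match.
via_brackets = str[pattern, 1]

# match also returns nil, so &. avoids a NoMethodError on nil.
via_match = str.match(pattern)&.captures&.first

p via_brackets  # nil
p via_match     # nil
```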

/^First Name = ([a-zA-Z]*)/
This will work too: just add ^ to anchor the match at the start of a line. It only helps when the pair you want begins the line, as it does in the sample.
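One caveat worth illustrating: in Ruby, ^ matches at the start of every line, not only at the start of the string (that is what \A does). The two-line input below is made up for the illustration:

```ruby
# Hypothetical input where "First Name" opens the second line.
text = "Last Name = Smith\nFirst Name = John First Name = John"

# ^ matches at the start of any line, so the second line qualifies.
line_start = text[/^First Name = ([a-zA-Z]*)/, 1]

# \A matches only at the very start of the string, so nothing is found.
string_start = text[/\AFirst Name = ([a-zA-Z]*)/, 1]

# Even with scan, the anchored pattern matches only once per line.
anchored_count = text.scan(/^First Name = ([a-zA-Z]*)/).length

p line_start      # "John"
p string_start    # nil
p anchored_count  # 1
```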

Related

I wrote a code to update the Lettering of the first name in Zoho but it's not working

Here's the Deluge script that should capitalize the first letter of the name and make the other letters lowercase, but it isn't working:
a = zoho.crm.getRecordById("Contacts",input.ID);
d = a.get("First_Name");
firstChar = d.subString(0,1);
otherChars = d.removeFirstOccurence(firstChar);
Name = firstChar.toUppercase() + otherChars.toLowerCase();
mp = map();
mp.put("First_Name",d);
b = zoho.crm.updateRecord("Contacts", Name,{"First_Name":"Name"});
info Name;
info b;
I tried capitalizing the first letter of the name and making the other letters lowercase, but it isn't working as expected.
Try using concat:
Name = firstChar.toUppercase().concat( otherChars.toLowerCase() );
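Try removing the double quotes from the Name value in the following statement. The reason is that Name is a variable holding the case-adjusted name, but "Name" is the literal string "Name".
From:
b = zoho.crm.updateRecord("Contacts", Name,{"First_Name":"Name"});
To:
b = zoho.crm.updateRecord("Contacts", Name,{"First_Name":Name});
Note also that the second argument of zoho.crm.updateRecord is expected to be the record ID (input.ID in this script), not the name, so that argument likely needs changing as well.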

How to get the script to only return one letter of the first name they Input?

I am trying to figure out how to get only the first letter of the first name the user inputs, but nothing I have tried or found in similar questions has helped.
This should work:
name = input("name:")
first_letter = name[0]
print(first_letter)

#scan suddenly returns an empty array

I am creating a scraper for articles from www.dev.to, which should read in the title, author and body of the article. I am using #scan to get rid of whitespace and other characters after the author name. At first I assumed the author name would consist of a first name and a last name, then realized some authors have only one name listed. Now that I have changed the regex accordingly, the method has stopped working and #scan returns an empty array. How can I fix this?
def scrape_post(path)
url = "https://dev.to/#{path}"
html_content = open(url).read
doc = Nokogiri::HTML(html_content)
doc.search('.article-wrapper').each do |element|
title = element.search('.crayons-article__header__meta').search('h1').text.strip
author_raw = element.search('.crayons-article__subheader').text.strip
author = author_raw.scan(/\A\w+(\s|\w)\w+/).first
body = doc.at_css('div#article-body').text.strip
@post = Post.new(id: @next_id, path: path, title: title, author: author, body: body, read: false)
end
@post
end
Example of input data:
path = rahxuls/preventing-copying-text-in-a-webpage-4acg
Expected output:
title = "Preventing copying text in a webpage 😁"
author_raw = "Rahul\n \n\n \n Nov 6\n\n\n ・2 min read"
author = "Rahul"
From the scan docs.
If the pattern contains no groups, each individual result consists of the matched string, $&. If the pattern contains groups, each individual result is itself an array containing one entry per group.
By adding the parentheses to the middle of your regex, you created a capturing group. Scan will return whatever that group captures. In the example you gave, it will be 'u'.
"Rahul\n \n\n \n Nov 6\n\n\n ・2 min read".scan(/\A\w+(\s|\w)\w+/) #=> [["u"]]
The group can be marked as non-capturing to get your old behavior back:
"Rahul\n \n\n \n Nov 6\n\n\n ・2 min read".scan(/\A\w+(?:\s|\w)\w+/) #=> ["Rahul"]
The ?: right after the opening parenthesis is what makes the group non-capturing.
Or you can add a named capture group around what you actually want to extract:
"Rahul\n \n\n \n Nov 6\n\n\n ・2 min read".match(/\A(?<name>\w+(\s|\w)\w+)/)[:name] #=> "Rahul"
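If the goal is to accept both one-word and multi-word author names, a pattern without any capturing group sidesteps the problem entirely. This is only a sketch: the pattern below is an assumption about what an author name looks like (words separated by single spaces), and the "Jane Doe" input is made up.

```ruby
# One or more words separated by single spaces, anchored at the start.
# No capturing groups, so String#[] returns the whole match.
author_pattern = /\A\w+(?: \w+)*/

one_name  = "Rahul\n \n\n \n Nov 6\n\n\n ・2 min read"
two_names = "Jane Doe\n \n\n \n Nov 6\n\n\n ・2 min read"  # hypothetical author

single = one_name[author_pattern]
double = two_names[author_pattern]

p single  # "Rahul"
p double  # "Jane Doe"
```

The match stops at the first newline because \n is neither a word character nor the single space the pattern allows between words.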

best way to find substring in ruby using regular expression

I have a string https://stackoverflow.com. I want a new string that contains the domain from the given string using regular expressions.
Example:
x = "https://stackoverflow.com"
newstring = "stackoverflow.com"
Example 2:
x = "https://www.stackoverflow.com"
newstring = "www.stackoverflow.com"
"https://stackoverflow.com"[/(?<=:\/\/).*/]
#⇒ "stackoverflow.com"
(?<=..) is a positive lookbehind: it requires :// immediately before the match without including it in the result.
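Because .* runs to the end of the string, a URL with a path would come back with the path attached. The sketch below uses a made-up URL to show one way to stop at the first slash after the scheme:

```ruby
# Hypothetical URL with a path, for illustration only.
url = "https://www.stackoverflow.com/questions/tagged/ruby"

# .* runs to the end of the string, so the path is included.
with_path = url[/(?<=:\/\/).*/]

# [^\/]+ stops at the first slash after the scheme, leaving only the host.
host_only = url[/(?<=:\/\/)[^\/]+/]

p with_path  # "www.stackoverflow.com/questions/tagged/ruby"
p host_only  # "www.stackoverflow.com"
```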
If string = "http://stackoverflow.com",
a really easy way is string.split("http://")[1]. But this isn't regex.
A regex solution would be as follows:
string.scan(/^http:\/\/(.+)$/).flatten.first
To explain:
String#scan returns every match of the regex as an array; because the regex has a group, each element is itself an array of captures.
The regex:
^ matches beginning of line
http: matches those characters
\/\/ matches //
(.+) sets a "match group" containing one or more of any character. This is the value returned by the scan.
$ matches end of line
.flatten.first extracts the results from String#scan, which in this case returns a nested array.
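The nested array is easier to see when the intermediate result is printed. The second half of this sketch adds an optional s to the scheme, which is a small extension so the https URLs from the question match too:

```ruby
string = "http://stackoverflow.com"

# scan returns one inner array of captures per match.
raw = string.scan(/^http:\/\/(.+)$/)
domain = raw.flatten.first

p raw     # [["stackoverflow.com"]]
p domain  # "stackoverflow.com"

# Allowing an optional "s" covers the https URLs from the question as well.
secure = "https://www.stackoverflow.com".scan(/^https?:\/\/(.+)$/).flatten.first

p secure  # "www.stackoverflow.com"
```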
You might want to try this:
#!/usr/bin/env ruby
str = "https://stackoverflow.com"
if mtch = str.match(/(?::\/\/)(\S+)/)
  f1 = mtch.captures.first
end
There are two groups in the regex: the first is a non-capturing group that matches the :// separator, and the second captures every non-whitespace character after it. The captures method returns the captured values as an array, so captures.first assigns the desired result to f1.
I hope this solves your problem.
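For comparison, here is a non-regex alternative using Ruby's standard uri library. It is not what the question asked for, but it can serve as a cross-check for whichever regex you pick:

```ruby
require "uri"

# URI.parse splits the URL into components; host is the domain part.
plain_host = URI.parse("https://stackoverflow.com").host
www_host   = URI.parse("https://www.stackoverflow.com").host

p plain_host  # "stackoverflow.com"
p www_host    # "www.stackoverflow.com"
```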

How do I extract the right most number in a string?

I have strings like this:
https://www.facebook.com/username_with_number_14/posts/101505775425654414
https://www.facebook.com/username/posts/101505775425654466
I need to extract the number on the end of the string in Ruby. In the first string, it is the second and last number, whereas in the second string it is the first, only and last number.
At the moment I am extracting the number like this:
int1 = Regexp.new('.*?(\\d+)',Regexp::IGNORECASE).match()[1]
But when this is applied to the first string, it extracts the number part of the username, not the desired number.
How can I do it so that it will work on both strings?
text = <<ENDTEXT
https://www.facebook.com/username_with_number_14/posts/101505775425654414
https://www.facebook.com/username/posts/101505775425654466
ENDTEXT
p text.lines.map{|line| line.scan(/\d+/).last}
#=> ["101505775425654414", "101505775425654466"]
A regexp like this works for me:
^.*?(\d+)$
See it on Rubular: http://rubular.com/r/CJzsgjedqJ
Try this (given your string is in str):
int1 = Regexp.new('.*\\/(\\d+)$', Regexp::IGNORECASE).match(str)[1]
The $ matches the end of the line, so capture group 1 holds all the digits between the last / and the end of the string.
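A shorter variant of the same idea, sketched against the two URLs from the question, anchors the digits directly to the end of the string with \z and needs no capture group:

```ruby
urls = [
  "https://www.facebook.com/username_with_number_14/posts/101505775425654414",
  "https://www.facebook.com/username/posts/101505775425654466"
]

# \d+ followed by \z only matches digits that run up to the end of the string,
# so the 14 inside the username is skipped.
ids = urls.map { |url| url[/\d+\z/] }

p ids  # ["101505775425654414", "101505775425654466"]
```

If the strings come from a file and still carry a trailing newline, strip them first or use \Z, which tolerates a final newline.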
