I have an array of string pairs.
For example: [["vendors", "users"], ["jobs", "venues"]]
I have a list of files within a directory:
folder/
-478_accounts
-214_vendors
-389_jobs
I need to rename the files, replacing the first value of each pair with the second, so that the directory looks like this:
folder/
-478_accounts
-214_users
-389_venues
How do I resolve the problem?
folder = %w| -478_accounts -214_vendors -389_jobs |
#=> ["-478_accounts", "-214_vendors", "-389_jobs"]
h = [["vendors", "users"], ["jobs", "venues"]].to_h
#=> {"vendors"=>"users", "jobs"=>"venues"}
r = Regexp.union(h.keys)
folder.each { |f| File.rename(f, f.sub(r,h)) if f =~ r }
I've used the form of String#sub that employs a hash to make the substitution.
You might want to refine the regex to require the string to be replaced to follow an underscore and be at the end of the string.
r = /
(?<=_) # match an underscore in a positive lookbehind
#{Regexp.union(h.keys)} # match one of the keys of `h`
\z # match end of string
/x # free-spacing regex definition mode
#=> /
# (?<=_) # match an underscore in a positive lookbehind
# (?-mix:vendors|jobs) # match one of the keys of `h`
# \z # match end of string
# /x
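As a rough sketch of how the anchored pattern behaves, the helper below computes the target name without touching the filesystem, so the mapping can be checked before calling File.rename. The names new_name and RENAMES are hypothetical and not part of the answer above, and the file name -7_jobs_old is an invented test case.

```ruby
# Hypothetical helper: maps an old file name to its new name.
RENAMES = { "vendors" => "users", "jobs" => "venues" }.freeze

def new_name(name, renames = RENAMES)
  # A key is replaced only if it follows an underscore and ends the string.
  r = /(?<=_)#{Regexp.union(renames.keys)}\z/
  name.sub(r, renames)
end

names = %w[-478_accounts -214_vendors -389_jobs -7_jobs_old]
renamed = names.map { |n| new_name(n) }
puts renamed.inspect
```

Names that do not match come back unchanged, so a caller could skip the rename whenever new_name(f) == f.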
You don't have to use a regex.
keys = h.keys
folder.each do |f|
prefix, sep, suffix = f.partition('_')
File.rename(f, prefix+sep+h[suffix]) if sep == '_' && keys.include?(suffix)
end
I am busy working through some problems I have found on the net and I feel like this should be simple but I am really struggling.
Say you have the string 'AbcDeFg' and the next string of 'HijKgLMnn', I want to be able to find the same characters in the string so in this case it would be 'g'.
Perhaps I wasn't giving enough information - I am doing Advent of Code and I am on day 3. I just need help with the first bit which is where you are given a string of characters - you have to split the characters in half and then compare the 2 strings. You basically have to get the common character between the two. This is what I currently have:
file_data = File.read('Day_3_task1.txt')
arr = file_data.split("\n")
finals = []
arr.each do |x|
len = x.length
divided_by_two = len / 2
second = x.slice!(divided_by_two..len).split('')
first = x.split('')
count = 0
(0..len).each do |z|
first.each do |y|
if y == second[count]
finals.push(y)
end
end
count += 1
end
end
finals = finals.uniq
Hope that helps in terms of clarity :)
Did you try converting both strings to arrays with the String#chars method and taking the intersection of those arrays?
Like this:
string_one = 'AbcDeFg'.chars
string_two = 'HijKgLMnn'.chars
string_one & string_two # => ["g"]
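Applied to the puzzle described in the question, the same idea might look like the sketch below. It assumes each line has an even length and is split exactly in half; common_items is a hypothetical name and the input string is made up.

```ruby
# Hypothetical helper: characters shared by the two halves of a line.
def common_items(line)
  half = line.length / 2
  first  = line[0...half].chars   # first half as an array of characters
  second = line[half..-1].chars   # second half as an array of characters
  first & second                  # Array#& returns the unique common elements
end

puts common_items("abcdXYcZ").inspect
```

Because Array#& already removes duplicates, no separate call to uniq is needed for a single line.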
One way to do that is to use the method String#scan with the regular expression
rgx = /(.)(?!.*\1.*_)(?=.*_.*\1)/
I'm not advocating this approach. I merely thought some readers might find it interesting.
Suppose
str1 = 'AbcDgeFg'
str2 = 'HijKgLMnbn'
Now form the string
str = "#{str1}_#{str2}"
#=> "AbcDgeFg_HijKgLMnbn"
I've assumed the strings contain letters only, in which case they are separated in str with any character other than a letter. I've used an underscore. Naturally, if the strings could contain underscores a different separator would have to be used.
We then compute
str.scan(rgx).flatten
#=> ["b", "g"]
Array#flatten is needed because
str.scan(rgx)
#=>[["b"], ["g"]]
The regular expression can be written in free-spacing mode to make it self-documenting:
rgx =
/
(.) # match any character, save to capture group 1
(?! # begin a negative lookahead
.* # match zero or more characters
\1 # match the contents of capture group 1
.* # match zero or more characters
_ # match an underscore
) # end the negative lookahead
(?= # begin a positive lookahead
.* # match zero or more characters
_ # match an underscore
.* # match zero or more characters
\1 # match the contents of capture group 1
) # end the positive lookahead
/x # invoke free-spacing regex definition mode
Note that if a character appears more than once in str1 and at least once in str2 the negative lookahead ensures that only the last one in str1 is matched, to avoid returning duplicates.
Alternatively, one could write
str.gsub(rgx).to_a
This uses the (fourth) form of String#gsub, which takes a single argument and no block and returns an enumerator.
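For readers who want to try the approach, here is a small sketch that wraps it in a method. The name common_chars is hypothetical, and the method assumes, as the answer does, that neither string contains an underscore.

```ruby
# Pattern from the answer: a character whose last occurrence before the
# underscore is here, and which also occurs after the underscore.
RGX = /(.)(?!.*\1.*_)(?=.*_.*\1)/

# Hypothetical wrapper; assumes neither string contains '_'.
def common_chars(str1, str2)
  # With a pattern and no block, gsub returns an enumerator of the matches.
  "#{str1}_#{str2}".gsub(RGX).to_a
end

puts common_chars('AbcDgeFg', 'HijKgLMnbn').inspect
```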
I have a large file, and I want to be able to check if a word is present twice.
puts "Enter a word: "
$word = gets.chomp
if File.read('worldcountry.txt') # do something if the word entered is present twice...
How can I check whether the file worldcountry.txt contains the $word I entered twice?
I found what I needed here: count-the-frequency-of-a-given-word-in-text-file-in-ruby
In Gerry's post, with this code:
word_count = 0
my_word = "input"
File.open("texte.txt", "r") do |f|
f.each_line do |line|
line.split(' ').each do |word|
word_count += 1 if word == my_word
end
end
end
puts "\n" + word_count.to_s
Thanks, I will pay more attention next time.
If the file is not overly large, it can be gulped into a string. Suppose:
str = File.read('cat')
#=> "There was a dog 'Henry' who\nwas pals with a dog 'Buck' and\na dog 'Sal'."
puts str
There was a dog 'Henry' who
was pals with a dog 'Buck' and
a dog 'Sal'.
Suppose the given word is 'dog'.
Confirm the file contains at least two instances of the given word
One can attempt to match the regular expression
r1 = /\bdog\b.*\bdog\b/m
str.match?(r1)
#=> true
Confirm the file contains exactly two instances of the given word
Using a regular expression to determine whether the file contains exactly two instances of the given word is somewhat more complex. Let
r2 = /\A(?:(?:(?!\bdog\b).)*\bdog\b){2}(?!.*\bdog\b)/m
str.match?(r2)
#=> false
The two regular expressions can be written in free-spacing mode to make them self-documenting.
r1 = /
\bdog\b # match 'dog' surrounded by word breaks
.* # match zero or more characters
\bdog\b # match 'dog' surrounded by word breaks
/m # cause . to match newlines
r2 = /
\A # match beginning of string
(?: # begin non-capture group
(?: # begin non-capture group
(?! # begin negative lookahead
\bdog\b # match 'dog' surrounded by word breaks
) # end negative lookahead
. # match one character that does not begin 'dog'
) # end non-capture group
* # execute preceding non-capture group zero or more times
\bdog\b # match 'dog' surrounded by word breaks
) # end non-capture group
{2} # execute preceding non-capture group twice
(?! # begin negative lookahead
.* # match zero or more characters
\bdog\b # match 'dog' surrounded by word breaks
) # end negative lookahead
/xm # cause . to match newlines and invoke free-spacing mode
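If a single pattern is not required, counting the matches is arguably easier to read. The sketch below is an alternative to the expressions above, not a restatement of them; word_count, at_least_twice? and exactly_twice? are hypothetical names, and the word is escaped in case it contains regex metacharacters.

```ruby
# Hypothetical helpers: count whole-word occurrences of a word in a text.
def word_count(text, word)
  text.scan(/\b#{Regexp.escape(word)}\b/).size
end

def at_least_twice?(text, word)
  word_count(text, word) >= 2
end

def exactly_twice?(text, word)
  word_count(text, word) == 2
end

str = "There was a dog 'Henry' who\nwas pals with a dog 'Buck' and\na dog 'Sal'."
puts word_count(str, 'dog')
puts at_least_twice?(str, 'dog')
puts exactly_twice?(str, 'dog')
```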
I am looking to find method names for python functions. I only want to find method names if they aren't after "def ". E.g.:
"def method_name(a, b):" # (should not match)
"y = method_name(1,2)" # (should find `method_name`)
My current regex is /\W(.*?)\(/.
str = "def no_match(a, b):\ny = match(1,2)"
str.scan(/(?<!def)\s+\w+(?=\()/).map(&:strip)
#⇒ ["match"]
The regex comments:
negative lookbehind for def,
followed by spaces (will be stripped later),
followed by one or more word symbols \w,
followed by positive lookahead for parenthesis.
Sidenote: one should never use regexps to parse long strings for any purpose.
I have assumed that lines that do not contain "def" are of the form "[something]=[zero or more spaces][method name]".
R1 = /
\bdef\b # match 'def' surrounded by word breaks
/x # free-spacing regex definition mode
R2 = /
[^=]+ # match any characters other than '='
= # match '='
\s* # match >= 0 whitespace chars
\K # forget everything matched so far
[a-z_] # match a lowercase letter or underscore
[a-z0-9_]* # match >= 0 lowercase letters, digits or underscores
[!?]? # possibly match '!' or '?'
/x
def match?(str)
(str !~ R1) && str[R2]
end
match?("def method_name1(a, b):") #=> false
match?("y = method_name2(1,2)") #=> "method_name2"
match?("y = method_name") #=> "method_name"
match?("y = method_name?") #=> "method_name?"
match?("y = def method_name") #=> false
match?("y << method_name") #=> nil
I chose to use two regexes to be able to deal with both my first and penultimate examples. Note that the method returns either a method name or a falsy value, but the latter may be either false or nil.
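A line-based variant of the same idea is sketched below: lines containing the word def are dropped first, then every identifier directly followed by an opening parenthesis is collected. This is a different, looser heuristic than the two regexes above (it does not require an = sign), and called_names is a hypothetical name.

```ruby
# Hypothetical helper: names that are called (followed by "(") on lines
# that do not contain the keyword `def`.
def called_names(source)
  source.each_line
        .reject { |line| line =~ /\bdef\b/ }
        .flat_map { |line| line.scan(/\b[A-Za-z_]\w*(?=\()/) }
end

src = "def no_match(a, b):\ny = match(1,2)"
puts called_names(src).inspect
```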
So I've got a string that's an improperly formatted name. Let's say, "Jean-paul Bertaud-alain".
I want to use a regex in Ruby to find the first character after every dash and make it uppercase. So, in this case, I want to apply a method that would yield: "Jean-Paul Bertaud-Alain".
Any help?
String#gsub can take a block argument, so this is as simple as:
str = "Jean-paul Bertaud-alain"
str.gsub(/-[a-z]/) {|s| s.upcase }
# => "Jean-Paul Bertaud-Alain"
Or, more succinctly:
str.gsub(/-[a-z]/, &:upcase)
Note that the regular expression /-[a-z]/ only matches letters in the a-z range, so it won't match e.g. à. In Ruby versions before 2.4, String#upcase did not change characters with diacritics anyway, because capitalization is language-dependent (e.g. i is capitalized differently in Turkish than in English); since Ruby 2.4 it applies Unicode case mapping by default. Read this answer for more information: https://stackoverflow.com/a/4418681
"Jean-paul Bertaud-alain".gsub(/(?<=-)\w/, &:upcase)
# => "Jean-Paul Bertaud-Alain"
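For names with accented letters, a variant using the Unicode property \p{Ll} (any lowercase letter) could be used. This sketch assumes Ruby 2.4 or later, where String#upcase handles non-ASCII letters, and UTF-8 strings; capitalize_after_hyphen is a hypothetical name.

```ruby
# Hypothetical helper: upcase any lowercase letter that directly follows a hyphen.
# Assumes Ruby >= 2.4 (Unicode-aware upcase) and UTF-8 encoded input.
def capitalize_after_hyphen(name)
  name.gsub(/(?<=-)\p{Ll}/, &:upcase)
end

puts capitalize_after_hyphen("Jean-paul Bertaud-alain")
```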
I suggest you make the test more demanding by requiring the letter to be upcased to 1) be preceded by a capitalized word followed by a hyphen and 2) be followed by lowercase letters followed by a word break.
r = /
\b # Match a word break
[A-Z] # Match an upper-case letter
[a-z]+ # Match >= 1 lower-case letters
\- # Match hyphen
\K # Forget everything matched so far
[a-z] # Match a lower-case letter
(?= # Begin a positive lookahead
[a-z]+ # Match >= 1 lower-case letters
\b # Match a word break
) # End positive lookahead
/x # Free-spacing regex definition mode
"Jean-paul Bertaud-alain".gsub(r) { |s| s.upcase }
#=> "Jean-Paul Bertaud-Alain"
"Jean de-paul Bertaud-alainM".gsub(r) { |s| s.upcase }
#=> "Jean de-paul Bertaud-alainM"
Given a string of an array in Ruby with some items in quotes that contain commas:
my_string.inspect
# => "\"hey, you\", 21"
How can I get an array of:
["hey, you", " 21"]
The Ruby standard CSV library's String#parse_csv does exactly this.
require 'csv'
"\"hey, you\", 21".parse_csv
# => ["hey, you", " 21"]
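If the leading space in " 21" is unwanted, the fields can be stripped after parsing. The following is a minimal sketch, assuming the csv library is available and that the opening quote starts its field, as it does in the example; split_quoted is a hypothetical name.

```ruby
require 'csv'

# Hypothetical helper: split a line on commas outside quotes, then trim each field.
def split_quoted(line)
  # parse_line returns nil for an empty line, and nil for an empty field.
  (CSV.parse_line(line) || []).map { |field| field.to_s.strip }
end

puts split_quoted("\"hey, you\", 21").inspect
```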
Yes, using CSV::parse_line or String#parse_csv (which require 'csv' adds to String's instance methods) is the way to go here, but you could also do it with a regex:
r = /
(?: # Begin non-capture group
(?<=\") # Match a double quote in a positive lookbehind
.+? # Match one or more characters lazily
(?=\") # Match a double quote in a positive lookahead.
) # End non-capture group
| # Or
\s\d+ # Match a whitespace character followed by one or more digits
/x # Extended mode
str = "\"hey, you\", 21"
str.scan(r)
#=> ["hey, you", " 21"]
If you'd prefer to have "21" rather than " 21", just remove \s.