Regexp argument to readlines - ruby

I'm trying to pass in /\!|\.|\?/ to the separator argument for readlines. It seems it's not possible. Or is it?
f.readlines(/\!|\.|\?/)
I know the alternative is to use read and split, which accepts Regexp, but I want to know if this is also possible with readlines

IO#readlines expects a string, not a regular expression. But the desired behaviour might be easily achieved with read + split since according to the documentation readlines “reads the entire file”:
f.read.split(/\!|\.|\?/)
Please also read the valuable comment by #tom-lord with a significant improvement suggestion.
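To make the read + split approach concrete, here is a minimal sketch. It uses a StringIO in place of a real file so that it runs on its own; the helper name split_sentences and the sample text are made up for illustration.

```ruby
require 'stringio'

# Hypothetical helper: read everything from an IO-like object and split
# the text on any of the terminators "!", "." or "?".
def split_sentences(io)
  io.read.split(/[!.?]/)
end

f = StringIO.new("Hello there! How are you? Fine.")
p split_sentences(f)
# => ["Hello there", " How are you", " Fine"]
```

Note that the delimiters are dropped and the leading spaces are kept, so you may want to call strip on each piece afterwards.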

Related

What is the simplest way to get UTF-8 substring in Julia

A UTF-8 string in Julia cannot be sliced by character with the slice operator, because the operator indexes the string by byte, not by character. For example:
s = "ポケットモンスター"
s[1:4]
s[1:4] will be "ポケ" not "ポケット".
I would like to know the simplest and most readable way to get a UTF-8 substring in Julia.
Perhaps this question calls attention to some missing functions in the standard string library (which is supposed to undergo changes in the next version of Julia). In the meantime, if we define:
substr(s,i,j) = s[chr2ind(s,i):chr2ind(s,j)]
Then,
substr(s,1,4)
Would be "ポケット"
You might want to consider using UTF32String instead of UTF8String, if you are going to be doing this a lot, and only converting to UTF8String if necessary, when you are finished.

Changing "word" to "Word" using a RegEx like [A-Z]([a-z]*)\b

The title sums up my conundrum pretty well. I've been searching around the net for a while, and being new to Ruby and Regular Expressions as a whole, I'm stuck trying to figure out how to alter the case of a single word string using a RegEx "filter" such as [A-Z]([a-z]*)\b.
Basically I want the flow to be
input: woRD
filter: [A-Z]([a-z]*)\b
output: Word
I already have the words filtered into a list, so I don't need to match words; I only need to filter the case of the word using a RegEx filter.
I do not want to use standard capitalization methods, I want this to be done using Regular Expressions.
You can use
"woRD".downcase.capitalize
Ruby provides predefined methods for this type of functionality. Try to use them instead of a regex, which saves coding time!
Well, for some reason you want to use regexps. Here you go:
# prepare hashes for gsub
to_down = (to_upper = Hash[('a'..'z').zip('A'..'Z')]).invert
# convert to downcase
downcased = 'woRD'.gsub(/[A-Z]/, to_down)
# ⇒ 'word'
titlecased = downcased.gsub(/^\w/, to_upper)
# ⇒ 'Word'
Hope it helps. Note the usage of String#gsub(re, hash) method.
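If the lookup hashes feel heavy, a block passed to sub is another way to keep the regexp in charge of the matching. This is only a sketch: the helper name regex_capitalize is hypothetical, and the case mapping itself is still done with upcase and downcase inside the block.

```ruby
# Hypothetical helper: the regexp captures the first letter and the rest
# of a single word; the block rebuilds the word with the desired case.
def regex_capitalize(word)
  word.sub(/\A([[:alpha:]])([[:alpha:]]*)\z/) do
    first = Regexp.last_match(1)
    rest  = Regexp.last_match(2)
    first.upcase + rest.downcase
  end
end

puts regex_capitalize("woRD")  # prints "Word"
```

Because the pattern is anchored with \A and \z, a string that is not a single alphabetic word is returned unchanged.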
You can't use a regex on its own to alter the string the way you want.
Please read carefully this topic: How to change case of letters in string using regex in Ruby.
The best way to solve your problem is to use:
"woRD".downcase.capitalize
or
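name_of_your_variable.capitalize!
if you want to change the string stored in your variable in place, without assigning the result to another variable. Avoid chaining downcase!.capitalize!: downcase! returns nil when the string is already lowercase, so the chain can raise NoMethodError.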

Ruby: Rubeque: Variable in regexp?

I'm solving http://www.rubeque.com/problems/a-man-comma--a-plan-comma--a-canal--panama-excl-/solutions but I'm a bit confused about treating #{} as comment in regexp.
My code looks like this now:
def longest_palindrome(txt)
  txt[/#{txt.reverse}/]
end
I tried txt[/"#{txt.reverse}"/] or txt[#{txt.reverse}] but nothing works as I wish. How should I interpolate a variable into a regexp?
This is not something you can do with a regex.
While you could use variable interpolation in the construction of a regex (see the other answers/comments), that wouldn't help you here. You could only use that to reverse a literal string, not a regex match result. Even if you could, you still wouldn't have solved the "find the longest palindrome" part, at least not with acceptable runtime performance.
Use a different approach to the problem.
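One possible "different approach" is a plain brute-force search over all substrings. This is a sketch under assumptions: it compares characters exactly (no case folding or punctuation stripping, which the Rubeque problem may or may not require), and it is quadratic in the number of substrings, so it only suits short inputs.

```ruby
# Brute force: try every substring and keep the longest one that reads
# the same forwards and backwards. Ties go to the earliest candidate.
def longest_palindrome(txt)
  best = ""
  (0...txt.length).each do |start|
    (start...txt.length).each do |stop|
      candidate = txt[start..stop]
      next unless candidate.length > best.length
      best = candidate if candidate == candidate.reverse
    end
  end
  best
end

p longest_palindrome("xabbay")  # => "abba"
```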
It is hard to tell what you want to happen without examples, but I suppose you are after
txt[/#{Regexp.escape(txt.reverse)}/]
See the Regexp.escape method (it is a class method).
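To see why the escaping matters, here is a small sketch comparing the two forms. The method names find_reversed and find_reversed_unescaped are made up for this example. Inside a regexp literal, #{} is ordinary interpolation, not a comment.

```ruby
# Interpolates the reversed string as literal text.
def find_reversed(txt)
  txt[/#{Regexp.escape(txt.reverse)}/]
end

# Interpolates the reversed string as a pattern, so characters such as
# "+" or "." keep their special regexp meaning.
def find_reversed_unescaped(txt)
  txt[/#{txt.reverse}/]
end

p find_reversed("a+a")            # => "a+a"
p find_reversed_unescaped("a+a")  # => nil, because /a+a/ needs two consecutive "a"s
```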

What do Ruby's printf arguments mean?

Can someone please help me understand the following expression?
printf("%3d - %s\n", counter, name)
That line prints something like this: 6 - Install Adobe software
I have looked up information and read the reference but I can't find a simple answer and I'm a bit confused. If you can refer me to a good reference, please do so.
%3d Ok, according to what I could understand, %3d is the number of characters or spaces. Please point me to a reference that explains it.
%s\n I couldn't figure out what this does. I guess \n is a newline or something similar, but by looking at the output it doesn't seem to work like that.
Why are counter and name variables separated by commas?
By looking at the output it seems that %3d is kind of replaced by counter and %s\n is replaced by name. I'm not sure how it works but I would like to understand it.
For syntax look at any printf docs, but check the sprintf docs on ruby-doc.
They're separated by commas because they're separate parameters to the function, but that's more or less syntactic sugar. Think varargs.
Not sure what you mean with the %s\n thing, it's a string then a newline: that's what it outputs.
If your question is specifically "how does the code turn a formatting string and a group of arguments into output" I'd probably search for source, for example, a tiny embedded printf. Nutshell version is that the format string is searched for formatting options, they consume their associated parameters, outputting an appropriately-formatted string. It's a tiny little DSL.
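A short sketch may make the format string easier to read. It uses format (an alias of sprintf) so the result can be inspected as a string; the helper name task_line is hypothetical. %3d means "an integer, right-aligned in a field at least 3 characters wide", %s means "a string", and \n is a newline.

```ruby
# Hypothetical helper: builds the same line the printf call would print.
def task_line(counter, name)
  format("%3d - %s\n", counter, name)
end

p task_line(6, "Install Adobe software")
# => "  6 - Install Adobe software\n"
p task_line(1234, "Reboot")
# => "1234 - Reboot\n"
```

The width in %3d is a minimum: short numbers are padded with spaces on the left, longer ones are printed in full. Each % directive consumes the next argument in order, which is why counter and name are passed as separate, comma-separated parameters.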

Ruby regex: extract a list of urls from a string

I have a string of images' URLs and I need to convert it into an array.
http://rubular.com/r/E2a5v2hYnJ
How do I do this?
URI.extract(your_string)
That's all you need if you already have it in a string. I can't remember, but you may have to put require 'uri' in there first. Gotta love that standard library!
Here's the link to the docs: URI.extract
Scan returns an array
myarray = mystring.scan(/regex/)
See here on regular-expressions.info
The best answer will depend very much on exactly what input string you expect.
If your test string is accurate then I would not use a regex, do this instead (as suggested by Marnen Laibow-Koser):
mystring.split('?v=3')
If you really don't have constant fluff between your useful strings then regex might be better. Your regex is greedy. This will get you part way:
mystring.scan(/https?:\/\/[\w.\/-]*?\.(?:jpe?g|gif|png)/)
Note the '?' after the '*' in the part capturing the server and path pieces of the URL, this makes the regex non-greedy.
The problem with this is that if your server name or path contains any of .jpg, .jpeg, .gif or .png then the result will be wrong in that instance.
Figuring out what is best needs more information about your input string. You might for example find it better to pattern match the fluff between your desired URLs.
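Here is a runnable sketch of the scan approach. The sample string, the example.com URLs, and the names IMAGE_URL and extract_image_urls are all assumptions made for illustration; the real input may need a different pattern.

```ruby
# Non-greedy match up to the first image extension. The extension group is
# non-capturing, (?:...), because String#scan returns only the captured
# groups when the pattern contains capturing groups.
IMAGE_URL = %r{https?://[\w./-]*?\.(?:jpe?g|gif|png)}

def extract_image_urls(text)
  text.scan(IMAGE_URL)
end

sample = "http://example.com/a.jpg?v=3http://example.com/img/b.png?v=3"
p extract_image_urls(sample)
# => ["http://example.com/a.jpg", "http://example.com/img/b.png"]
```

With a capturing group such as (jpe?g|gif|png), the same call would return [["jpg"], ["png"]] instead of the full URLs.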
Use String#split (see the docs for details).
Part of the problem is that in Rubular you are using https instead of http. This gets you closer to what you want if the other answers don't work for you:
http://rubular.com/r/cIjmjxIfz5
