Methods to concatenate strings on separate lines - ruby

This produces newlines:
%(https://api.foursquare.com/v2/venues/search
?ll=80.914207,%2030.328466&radius=200
&v=20161201&m=foursquare&categoryId=4d4b7105d754a06374d81259
&intent=browse)
This produces spaces:
"https://api.foursquare.com/v2/venues/search
?ll=80.914207,%2030.328466&radius=200
&v=20161201&m=foursquare&categoryId=4d4b7105d754a06374d81259
&intent=browse"
This produces one string:
"https://api.foursquare.com/v2/venues/search"\
"?ll=80.914207,%2030.328466&radius=200"\
"&v=20161201&m=foursquare&categoryId=4d4b7105d754a06374d81259"\
"&intent=browse"
When I want to separate one string on multiple lines to read it better on screen, is it preferred to use the escape character?
My IDE complains that I should use single quoted strings rather than double quoted strings since there is no interpolation.

Normally you'd put something like this on one line, readability be damned, because the alternatives are going to be problematic. There's no way of declaring a string with whitespace ignored, but you can do this:
url = %w[ https://api.foursquare.com/v2/venues/search
?ll=80.914207,%2030.328466&radius=200
&v=20161201&m=foursquare&categoryId=4d4b7105d754a06374d81259
&intent=browse
].join
Where you explicitly remove the whitespace.
I'd actually suggest avoiding this whole mess by properly composing this URI:
uri = url("https://api.foursquare.com/v2/venues/search",
ll: [ 80.914207,30.328466 ],
radius: 200,
v: 20161201,
m: 'foursquare',
categoryId: '4d4b7105d754a06374d81259',
intent: 'browse'
)
Where you have some kind of helper function that properly encodes that using URI or other tools. By keeping your parameters as data, not as encoded strings, for as long as possible you make it easier to spot bugs as well as make last-second changes to them.
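A minimal sketch of such a helper, assuming Ruby's standard uri library. The name build_url is hypothetical (it stands in for the url helper above), and joining array values with a comma is an assumption about how the API expects ll to be passed:

```ruby
require "uri"

# Hypothetical helper: builds a URL from a base address and a hash of parameters.
def build_url(base, params)
  uri = URI.parse(base)
  # Assumption: array values (such as coordinates) are sent comma-separated.
  flat = params.transform_values { |v| v.is_a?(Array) ? v.join(",") : v }
  # encode_www_form percent-encodes every key and value.
  uri.query = URI.encode_www_form(flat)
  uri.to_s
end

url = build_url(
  "https://api.foursquare.com/v2/venues/search",
  ll: [80.914207, 30.328466],
  radius: 200,
  v: 20161201,
  m: "foursquare",
  categoryId: "4d4b7105d754a06374d81259",
  intent: "browse"
)
puts url
```

The parameters stay plain Ruby data until the last moment, so changing one of them does not involve editing an already-encoded string.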

The answer by @tadman definitely suggests the proper way to do it; I'll post another approach just for the sake of diversity:
query = "https://api.foursquare.com/v2/venues/search" \
"?ll=80.914207,%2030.328466&radius=200" \
"&v=20161201&m=foursquare&categoryId=4d4b7105d754a06374d81259" \
"&intent=browse"
Yes, without any visible concatenation: 4 strings in quotes, one after another. This example won't work in irb/pry (due to its REPL nature), but the above is the most efficient way to concatenate strings in Ruby without producing any intermediate result.
Contrived example to test in pry/irb:
value = "a" "b" "c" "d"

Related

Get the same results from string.start_with? and string[ ]

Basically, I want to check if a string (main) starts with another string (sub), using both of the above methods. For example, following is my code:
main = gets.chomp
sub = gets.chomp
p main.start_with? sub
p main[/^#{sub}/]
And, here is an example with I/O - Try it online!
If I enter simple strings, then both of them work exactly the same, but when I enter strings like "1\2" on stdin, I get errors in the Regexp variant, as seen in the TIO example.
I guess this is because the string passed into the second one isn't raw. So, I tried passing sub.dump into the second one - Try it online!
which gives me a nil result. How do I do this correctly?
As a general rule, you should never ever blindly execute inputs from untrusted sources.
Interpolating untrusted input into a Regexp is not quite as bad as interpolating it into, say, Kernel#eval, because the worst thing an attacker can do with a Regexp is to construct an Evil Regex to conduct a Regular expression Denial of Service (ReDoS) attack (see also the section on Performance in the Regexp documentation), whereas with eval, they could execute arbitrary code, including but not limited to deleting the entire file system, scanning memory for unencrypted passwords / credit card information / PII and exfiltrating that via the network, etc.
However, it is still a bad idea. For example, when I say "the worst thing that can happen is a ReDoS", that assumes that there are no bugs in the Regexp implementation (Onigmo in the case of YARV, Joni in the case of JRuby and TruffleRuby, etc.). Ruby's Regexps are quite powerful and thus Onigmo, Joni and co. are large and complex pieces of code, and may very well have their own security holes that could be used by a specially crafted Regexp.
You should properly sanitize and escape the user input before constructing the Regexp. Thankfully, the Ruby core library already contains a method which does exactly that: Regexp::escape. So, you could do something like this:
p main[/^#{Regexp.escape(sub)}/]
The reason why your attempt at using String#dump didn't work, is that String#dump is for representing a String the same way you would have to write it as a String literal, i.e. it is escaping String metacharacters, not Regexp metacharacters and it is including the quote characters around the String that you need to have it recognized as a String literal. You can easily see that when you simply try it out:
sub.dump
#=> "\"1\\\\2\""
# equivalent to '"1\\2"'
So, that means that String#dump
includes the quotes (which you don't want),
escapes characters that don't need escaping in Regexp just because they need escaping in Strings (e.g. # or "), and
doesn't escape Regexp metacharacters, because they don't need escaping in Strings (e.g. [, ., ?, *, +, ^, -).
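A small sketch that puts the two checks side by side, with hard-coded strings standing in for the stdin input. Using \A instead of ^ is a substitution of mine: ^ matches at the start of any line, while \A matches only at the start of the string, which is closer to what start_with? does.

```ruby
main = '1\2 and more' # single quotes: the backslash is a literal character
sub  = '1\2'

p main.start_with?(sub)       # => true

# Regexp.escape backslash-escapes every Regexp metacharacter in sub.
pattern = /\A#{Regexp.escape(sub)}/
p Regexp.escape(sub)          # => "1\\\\2"
p main[pattern]               # => "1\\2"
```

The slice returns the matched prefix (or nil), so wrap it in !! or use match? if a boolean is needed.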

YAML multiline wrap without space

I tried to understand the specifications here but they're actually quite difficult to understand.
http://www.yaml.org/spec/1.2/spec.html#id2779048
As far as I can see, there are three ways of wrapping text but their function is very similar... in fact so similar that I don't get the point in having all of them instead of one or two.
Well my problem is that I have some String that is really long (~700 characters) but has no whitespaces.
Now of course I want to put it into multiple lines but there seems to be no way to do so without having any linefeeds or space characters that I do not want.
So is this actually possible?
---
aTest:
hereComes
SomeText
ThatShould
NotHave
AnyWhitespaces
It's possible. See: "Is there a way to represent a long string that doesn't have any whitespace on multiple lines in a YAML document?"
Quoted example:
"abcdefghi\
jklmnopqr\
stuvwxyz"
Single quotes may also work depending on the parsing library, but single-quoted YAML scalars do not process backslash escapes, so YMMV.
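A quick way to check this, sketched here with Ruby's bundled yaml library (the choice of parser is an assumption; other parsers should treat the double-quoted form the same way). The key and text are taken from the question:

```ruby
require "yaml"

# Single-quoted heredoc, so the backslashes reach the YAML parser unchanged.
doc = <<'EOS'
aTest: "hereComes\
  SomeText\
  ThatShould\
  NotHave\
  AnyWhitespaces"
EOS

value = YAML.safe_load(doc)["aTest"]
p value # => "hereComesSomeTextThatShouldNotHaveAnyWhitespaces"
```

The trailing backslash escapes the line break, and the indentation at the start of each continuation line is dropped, so no space or linefeed ends up in the value.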

Regex can this be achieved

Am I too ambitious, or is there a way to do this:
add a string if it is not present?
and
remove the same string if it is present?
Do all of this using regex and avoid the if/else statement.
Here is an example.
I have the string
"admin,artist,location_manager,event_manager"
so can the substring location_manager be added or removed with regards to above conditions
basically I'm looking to avoid the if else statement and do all of this plainly in regex
"admin,artist,location_manager,event_manager".test(/some_regex/)
The some_regex would remove location_manager from the string if present, or else add it.
Am I being overly ambitious?
You will need to use some sort of logic.
str += ',location_manager' unless str.gsub!(/location_manager,/,'')
I'm assuming that if it's not present you append it to the end of the string
Regex will not actually add or remove anything in any language that I am aware of. It is simply used to match. You must use some other language construct (a regex based replacement function for example) to achieve this functionality. It would probably help to mention your specific language so as to get help from those users.
Here's one kinda off-the-wall solution. It doesn't use regexes, but it also doesn't use any if/else statements either. It's more academic than production-worthy.
Assumptions: Your string is a comma-separated list of titles, and that these are a unique set (no duplicates), and that order doesn't matter:
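require 'set'  # needed on older Rubies, harmless on newer ones
str = "admin,artist,location_manager,event_manager"
titles = Set.new(str.split(','))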
#=> #<Set: {"admin", "artist", "location_manager", "event_manager"}>
titles_to_toggle = ["location_manager"]
#=> ["location_manager"]
titles ^= titles_to_toggle
#=> #<Set: {"admin", "artist", "event_manager"}>
titles ^= titles_to_toggle
#=> #<Set: {"location_manager", "admin", "artist", "event_manager"}>
titles.to_a.join(",")
#=> "location_manager,admin,artist,event_manager"
All this assumes that you're using a string as a kind of set. If so, you should probably just use a set. If not, and you actually need string-manipulation functions to operate on it, there's probably no way around using if-else, or a variant such as the ternary operator, or unless, or Bergi's answer.
Also worth noting regarding regex as a solution: Make sure you consider the edge cases. If 'location_manager' is in the middle of the string, will you remove the extraneous comma? Will you handle removing commas correctly if it's at the beginning or the end of the string? Will you correctly add commas when it's added? For these reasons treating a set as a set or array instead of a string makes more sense.
No. Regex can only match/test whether "a string" is present (or not). Then, the function you've used can do something based on that result, for example replace can remove a match.
Yet, you want to do two actions (each can be done with regex), remove if present and add if not. You can't execute them sequentially, because they overlap - you need to execute either the one or the other. This is where if-else structures (or ternary operators) come into play, and they are required if there is no library/native function that contains them to do exactly this job. I doubt there is one in Ruby.
If you want to avoid the if-else statement (for one-liners or expressions), you can use the ternary operator. Or, you can use a lambda expression returning the correct value:
# kind of pseudo code
string.replace(/location,?|$/, function($0) return $0 ? "" : ",location" )
This matches the string "location" (with optional comma) or the string end, and replaces that with nothing if a match was found or the string ",location" otherwise. I'm sure you can adapt this to Ruby.
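One possible Ruby adaptation of that idea, as a sketch only. The method name toggle is hypothetical, and it assumes the name is never the last item when it gets removed (otherwise a trailing comma is left behind, which is exactly the kind of edge case the set-based answer warns about):

```ruby
# Hypothetical helper: removes name (plus a following comma) if present,
# otherwise appends ",name" at the end of the string.
def toggle(str, name)
  # The pattern matches either the name or, failing that, the end of the string.
  str.sub(/#{Regexp.escape(name)},?|\z/) do |match|
    match.empty? ? ",#{name}" : ""
  end
end

roles = "admin,artist,location_manager,event_manager"
removed = toggle(roles, "location_manager")
added   = toggle(removed, "location_manager")
p removed # => "admin,artist,event_manager"
p added   # => "admin,artist,event_manager,location_manager"
```

The conditional has not disappeared; it has only moved into the replacement block, which is the point the answer makes.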
Removing something that matches a pattern is really easy:
(admin,?|artist,?|location_manager,?|event_manager,?)
Then choose the string to replace the match with (in your case an empty string) and pass everything to the replace method.
The other operation you suggested is more difficult to achieve with regex only. Maybe someone knows a better answer.

Split string suppressing all null fields

I want to split a string suppressing all null fields
Command:
",1,2,,3,4,,".split(',')
Result:
["", "1", "2", "", "3", "4"]
Expected:
["1", "2", "3", "4"]
How to do this?
Edit
Ok. Just to sum up all the good answers posted.
What I wanted was for the split method (or some other method) not to generate empty strings. It looks like that isn't possible.
So, the solution is a two-step process: split the string as usual, and then delete the empty strings from the resulting array.
The second part is exactly this question
(and its duplicate)
So I would use
",1,2,,3,4,,".split(',').delete_if(&:empty?)
The solution proposed by Nikita Rybak and by user229426 is to use the reject method. According to the docs, reject returns a new array, while delete_if modifies the array in place, which is more efficient since I don't want a copy. Using select, as proposed by Mark Byers, is even more inefficient.
steenslag proposed to replace commas with space and then use split by space:
",1,2,,3,4,,".gsub(',', ' ').split(' ')
Actually, the documentation says that a single space stands for whitespace. But results of "split(/\s/)" and "split(' ')" are not the same. Why's that?
Mark Byers proposed another solution: just using regular expressions. This seems like what I need, although it implies that you have to be a master of regexps. But it is a great solution! For example, if I need spaces to be separators as well as any non-alphanumeric symbol, I can rewrite this to
",1,2, ,3 3,4 4 4,,".scan(/\w+[\s*\w*]*/)
the result is:
["1", "2", "3 3", "4 4 4"]
But again, regexps are very unintuitive and they take experience.
Summary
I expect that split to work with whitespaces as if whitespaces were a comma or even regexp. I expect it to do not produce empty strings. I think this is a bug in ruby or my misunderstanding.
Made it a community question.
There's a reject method in Array:
",1,2,,3,4,,".split(',').reject { |s| s.empty? }
Or if you prefer Symbol#to_proc:
",1,2,,3,4,,".split(',').reject(&:empty?)
Hoping to illuminate a bit here:
But results of "split(/\s/)" and "split(' ')" are not the same. Why's that?
If you look at the docs for String#split you'll see that split with ' ' is a special case:
If pattern is a single space, str is split on whitespace,
with leading whitespace and runs of contiguous whitespace characters ignored.
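The difference is easy to see on a small input. This is only an illustration with a made-up string, not something taken from the docs:

```ruby
text = " a  b\tc "

p text.split(" ")   # => ["a", "b", "c"]
p text.split(/\s/)  # => ["", "a", "", "b", "c"]
p text.split(/\s+/) # => ["", "a", "b", "c"]
```

With a regexp, every single whitespace character (or run, for \s+) is a delimiter, so the leading space still produces an empty first field. Only the " " special case skips leading whitespace.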
You also mention:
I expect it to do not produce empty strings. I think this is a bug in ruby or my misunderstanding.
The problem probably lies between the keyboard and the chair. ;-)
split will happily produce empty strings as it should, because there are times when you would definitely want this ability, and there are plenty of easy ways to work around it. Consider if you were splitting a csv from an Excel file. Anywhere you see ',,' would be an empty column, not a column you should just get rid of.
Regardless, you've seen a bunch of solutions - and here's another one that might show you the things you can do with ruby and split!
It seems you want to split up data between multiple commas, so why not try that and see what happens?
a = ",1,2,,3,4,,5,,,,6,,,".split(/,+/)
It's a simple enough regular expression: /,+/ means one or more commas, so we'll split on that.
This almost gives you what you want, except that you also want to ignore the leading empty field. You'll note that split ignores the empty field on the end because (from the String#split docs):
If the limit parameter is omitted, trailing null fields are suppressed.
So that means we can either use something that will remove that empty string at the front of the array or just remove the initial commas. We can use gsub for that:
a = ",1,2,,3,4,,5,,,,6,,,".gsub(/^,+/,'')
If you print that out you'll see that the leading commas, and with them our leading empty "field", are now gone. So we can combine them all in one line:
a = ",1,2,,3,4,,5,,,,6,,,".gsub(/^,+/,'').split(/,+/)
And you have another solution!
And incidentally, this points out another possibility: we can just clean up our string entirely before sending it to split if we want a simple split. I'll leave it to you to figure out what this one is doing:
a = ",1,2,,3,4,,5,,,,6,,,".gsub(/,+/,',').gsub(/^,/,'').split(',')
There's lots of ways to do things in ruby. If it seems that ruby isn't doing what you want, then take a look at the docs and realize that it probably works the way that it does for a reason (there are plenty of people who would be upset if split wasn't able to spit out empty fields :)
Hope that helps!
You could use split followed by select:
",1,2,,3,4,,".split(',').select{|x|!x.empty?}
Or you could use a regular expression to match what you want to keep instead of splitting on the delimiter:
",1,2,,3,4,,".scan(/[^,]+/)
",1,2,,3,4,,".split(/,/).reject(&:empty?)
",1,2,,3,,,4,,".squeeze(",").sub(/^,*|,*$/,"").split(",")
String#split(pattern) behaves as desired when pattern is a single space (ruby-doc).
",1,2,,3,4,,".gsub(',', ' ').split(' ')
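To compare the approaches from this thread on the question's input, here is a short sketch. The hash labels are only for display:

```ruby
input = ",1,2,,3,4,,"

results = {
  "split + reject"  => input.split(",").reject(&:empty?),
  "scan"            => input.scan(/[^,]+/),
  "strip + split"   => input.gsub(/^,+/, "").split(/,+/),
  "squeeze + split" => input.squeeze(",").sub(/^,*|,*$/, "").split(","),
  "spaces + split"  => input.gsub(",", " ").split(" ")
}

results.each { |name, fields| puts "#{name}: #{fields.inspect}" }

# A negative limit keeps the trailing empty fields that split normally drops.
p input.split(",", -1) # => ["", "1", "2", "", "3", "4", "", ""]
```

All five variants give ["1", "2", "3", "4"] for this input. They differ once fields may contain spaces: the last variant would then split on those as well.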

Using regex to replace all spaces NOT in quotes in Ruby

I'm trying to write a regex to replace all spaces that are not included in quotes so something like this:
a = 4, b = 2, c = "space here"
would return this:
a=4,b=2,c="space here"
I spent some time searching this site and I found a similar q/a ( Split a string by spaces -- preserving quoted substrings -- in Python ) that would replace all the spaces inside quotes with a token that could be re-substituted in after wiping all the other spaces...but I was hoping there was a cleaner way of doing it.
It's worth noting that any regular expression solution will fail in cases like the following:
a = 4, b = 2, c = "space" here"
While it is true that you could construct a regexp to handle the three-quote case specifically, you cannot solve the problem in the general sense. This is a mathematically provable limitation of simple DFAs, of which regexps are a direct representation. To perform any serious brace/quote matching, you will need the more powerful pushdown automaton, usually in the form of a text parser library (ANTLR, Bison, Parsec).
With that said, it sounds like regular expressions should be sufficient for your needs. Just be aware of the limitations.
This seems to work:
result = string.gsub(/( |(".*?"))/, "\\2")
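Applied to the input from the question, that looks as follows. The block form is an alternative spelling of the same idea that I am adding for comparison; it is not from the answer:

```ruby
input = 'a = 4, b = 2, c = "space here"'

# Group 2 is empty for a bare space and holds the quoted text otherwise,
# so replacing every match with \2 drops spaces and keeps quoted parts.
with_backref = input.gsub(/( |(".*?"))/, "\\2")

# Same idea with a block: return quoted matches unchanged, drop bare spaces.
with_block = input.gsub(/"[^"]*"| /) { |m| m == " " ? "" : m }

puts with_backref # prints: a=4,b=2,c="space here"
puts with_block   # prints: a=4,b=2,c="space here"
```

Neither version copes with unbalanced or escaped quotes, in line with the caveat in the first answer.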
I consider this very clean:
mystring.scan(/((".*?")|([^ ]))/).map { |x| x[0] }.join
I doubt gsub could do any better (assuming you want a pure regex approach).
Try this one; strings in single or double quotes are also matched (so you need to filter them out if you only want the spaces):
/( |("([^"\\]|\\.)*")|('([^'\\]|\\.)*'))/

Resources