How can I split a string by the character " - Go

I tried to split the string like this (using Go):
idPost = strings.Split(idPost,'"')
but the compiler reported IncompatibleAssign for '"'.

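Use a back quote instead of an apostrophe. In Go, single quotes denote a rune literal, while back quotes denote a raw string literal, and strings.Split expects a string as its separator. For example: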
idPost = strings.Split(idPost,`"`)

You need to escape it, so your question is really how to escape characters in Go.
Escapes and multiline strings
You will find this is similar across many languages:
idPost = strings.Split(idPost,"\"")
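
To compare the two answers side by side, here is a minimal sketch. The helper name splitOnQuote and the sample input are assumptions made for illustration; the question does not show the real value of idPost. Note that strings.Split returns a []string, so the result cannot be assigned back to a string variable.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOnQuote is a hypothetical helper that splits s on every double quote.
// The separator is written as a raw string literal (back quotes).
func splitOnQuote(s string) []string {
	return strings.Split(s, `"`)
}

func main() {
	// Hypothetical input, not taken from the original question.
	idPost := `id="12345"`

	// The result is a slice, so it goes into its own variable.
	parts := splitOnQuote(idPost)
	fmt.Printf("%q\n", parts) // ["id=" "12345" ""]

	// The raw literal, the escaped literal, and the converted rune
	// all describe the same one-character string.
	fmt.Println(`"` == "\"", `"` == string('"')) // true true
}
```

Either spelling of the separator works; the raw string literal simply avoids the backslash.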

Related

Regex that considers apostrophes as word characters? [duplicate]

I want to split a string in Java on any non-alphanumeric characters.
Currently I am doing it like this:
words = Str.split("\\W+");
However, I want to keep apostrophes ("'") in there. Is there a regular expression that preserves apostrophes but discards the rest of the junk? Thanks.
words = Str.split("[^\\w']+");
Just add it to the character class. \W is equivalent to [^\w], which you can then add ' to.
Do note, however, that \w also actually includes underscores. If you want to split on underscores as well, you should be using [^a-zA-Z0-9'] instead.
For basic English characters, use
words = Str.split("[^a-zA-Z0-9']+");
If you want to include English words with special characters (such as fiancé) or for languages that use non-English characters, go with
words = Str.split("[^\\p{L}0-9']+");

Escaping an apostrophe in golang

How can I escape an apostrophe in golang?
I have a string
s = "I've this book"
and I want to make it
s = "I\'ve this book"
How to achieve this?
Thanks in advance.
Escaping a character is only necessary if it can be interpreted in two or more ways. The apostrophe in your string can only be interpreted as an apostrophe, escaping is therefore not necessary as such. This is probably why you see the error message unknown escape sequence: '.
If you need to escape the apostrophe because it is inserted into a database, first consider using library functions for escaping or inserting data directly. Incorrect escaping has been the culprit of many security problems in the last decades. If you write the escaping yourself, you will almost certainly do it wrong.
Having said that, you have to escape the backslash itself to do what you want:
fmt.Println("\\'") // outputs \'
As you're using Cassandra, you can use a package like gocql, which provides parametrized queries:
session.Query(`INSERT INTO sometable (text) VALUES (?)`, "'escaping'").Exec()
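
To connect the backslash escape to the string from the question, here is a minimal sketch. The helper name escapeApostrophes is an assumption made for illustration, and the code only shows how the literal is written; it is not a safe way to build database queries, for which the parametrized form remains the better choice.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeApostrophes is a hypothetical helper that puts a backslash
// in front of every apostrophe in s.
func escapeApostrophes(s string) string {
	// `\'` is a raw string literal: a backslash followed by an apostrophe.
	// The same value in an interpreted literal would be "\\'".
	return strings.ReplaceAll(s, "'", `\'`)
}

func main() {
	s := "I've this book" // no escape needed inside double quotes
	fmt.Println(escapeApostrophes(s)) // I\'ve this book
}
```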

Converting String to Regex string

How can I transform a string into a regex string, properly escaping all regex-specific characters? I am using interpolation to build the regex string so that users can customize the regex without having to touch the code (and without needing to know regex).
Example
custom_text = "Hello"
my_regex = /#{custom_text}:\s*(\d+)/i
Which results in the following regex when my code uses it
/Hello:\s*(\d+)/i
This allows users to perhaps provide language localizations without having to worry about figuring out where my regex is used, how it's used, or whether they will break the script if they changed something.
However, if they wanted to include things like periods or question marks, as in Hello?, I would need to escape them first.
Use Regexp.escape:
my_regex = /#{Regexp.escape(custom_text)}:\s*(\d+)/i
For example:
>> puts /#{Regexp.escape('Hello?')}/.inspect
/Hello\?/

Using regexes in ruby with a need to match lots of * and /

I need to find strings containing * and / using regexes, and I am writing in Ruby. The reason is that I am building a tokenizer for a language whose multi-line comments use the C style (/* */). I have the single-line comments handled already.
Is there a way to write a regex without the two forward slashes that delimit a regular expression? I am finding it impossible to spot my mistakes because of the insane amount of escaping. Or can someone give me advice on how to handle the escaping in a sane manner? I already tried writing the sequence first and then escaping it.
Thank you for your time and advice.
One trick that might help is the %r literal:
%r{http://www\.google\.com}
I like to use pipes myself, when they're not in the regex.
%r|http://www\.google\.com|
You can also create new instances of Regexp via Regexp.new and pass a string.
Finally, you might also look at Regexp.quote (an alias for Regexp.escape):
Escapes any characters that would have special meaning in a regular expression. Returns a new escaped string, or self if no characters are escaped. For any string, Regexp.new(Regexp.escape(str))=~str will be true.

gsub ASCII code characters from a string in ruby

I am using nokogiri to screen scrape some HTML. In some occurrences, I am getting some weird characters back, I have tracked down the ASCII code for these characters with the following code:
#parser.leads[0].phone_numbers[0].each_byte do |c|
puts "char=#{c}"
end
The characters in question have an ASCII code of 194 and 160.
I want to somehow strip these characters out while parsing.
I have tried the following code but it does not work.
#parser.leads[0].phone_numbers[0].gsub(/160.chr/,'').gsub(/194.chr/,'')
Can anyone tell me how to achieve this?
I found this question while trying to strip out invisible characters when "trimming" a string.
s.strip did not work for me, and I found that the invisible character had the ord number 194.
None of the methods above worked for me, but then I found the question "Convert non-breaking spaces to spaces in Ruby", which says:
Use /\u00a0/ to match non-breaking spaces: s.gsub(/\u00a0/, ' ') converts all non-breaking spaces to regular spaces
Use /[[:space:]]/ to match all whitespace, including Unicode whitespace like non-breaking spaces. This is unlike /\s/, which matches only ASCII whitespace.
So glad I found that! Now I'm using:
s.gsub(/[[:space:]]/,'')
This doesn't answer the question of how to gsub specific character codes, but if you're just trying to remove whitespace it seems to work pretty well.
Your problem is that you want to do a method call but instead you're creating a Regexp. You're searching and replacing strings consisting of the string "160" followed by any character and then the string "chr", and then doing the same except with "160" replaced with "194".
Instead, do gsub(160.chr, '').
Update (2018): This code does not work in current Ruby versions. Please refer to other answers.
You can also try
s.gsub(/\xA0|\xC2/, '')
or
s.delete 160.chr+194.chr
My first thought: should you be using gsub! instead of gsub?
gsub returns a new string, while gsub! performs the substitution in place.
I was getting an "invalid multibyte escape" error while trying the above solution, but in a different situation. Google was returning \xA0 when the number was greater than 999, and I wanted to remove it. So I used return_value.gsub(/[\xA0]/n,"") instead, and it worked perfectly fine for me.
