I am trying to split a String in Ruby based on a regex.
The String has the following pattern:
00 0 0 00 00 0
I want to be able to split the string on every second space but I am quite new to Ruby and my experience with Regexes is limited.
I have tried the following:
line.split(/[0-9._%+-]+\s+[0-9._%+-]+/)
But this just returns an array of blank values. I have tried various different combinations of the regex pattern but have not got close to what I want. The result should be an array like this:
Array[0] => '00 0'
Array[1] => '0 00'
Array[2] => '00 0'
Could anyone explain how I could best do this with a regex? If possible, please also explain why my attempt doesn't work and why your working example does. I want to increase my knowledge of regexes by solving this problem.
Use String#scan
line = "00 0 0 00 00 0"
line.scan(/[0-9._%+-]+\s+[0-9._%+-]+/)
#=> ["00 0", "0 00", "00 0"]
When you use String#split, you pass a regex to match the things that you don't want in your output. That is, the things that should be in between the strings in the output array.
When using RegEx to split a given string, the matched part is removed from the result set. Therefore you cannot use the RegEx syntax to split those numbers and keep the values at the same time.
Use the following code instead:
"00 0 0 00 00 0".scan(/[0-9]+\s[0-9]+/)
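To make the difference between the two methods concrete, here is a minimal sketch that runs the same kind of pattern through both String#split and String#scan. The variable names split_result and scan_result are only illustrative.

```ruby
line = "00 0 0 00 00 0"

# split: the pattern describes the separators, and every match is discarded.
# Here the pattern matches the number pairs themselves, so only the gaps
# between the pairs survive (plus a leading empty string).
split_result = line.split(/[0-9]+\s+[0-9]+/)
p split_result #=> ["", " ", " "]

# scan: the pattern describes the pieces to keep.
scan_result = line.scan(/[0-9]+\s+[0-9]+/)
p scan_result #=> ["00 0", "0 00", "00 0"]
```

This is why the original attempt appeared to return only blank values: the interesting text was consumed as the delimiter.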
I need to check whether the last digit of ${string} is equal to 9, and set it to 9 if it is not.
The strings or numbers I have to handle look like 831, 519 or 1351.
However, I don't know how to do it properly. I have already tried something like:
${string?replace((string.length)-1,"9")}
In the end, 831 should become 839, 1351 should become 1359, and so on.
Any suggestions on how I can achieve this?
By the way, if I use the function above, this error message comes up:
Script error: You have used ?number on a string-value which is not a number (or is empty or contains spaces).
I also tried:
code snippet
Because the original number is something like 831.896.
You could use string slicing to keep all characters except for the last one like this:
<#assign string = "1234">
<#assign string = string[0..<string?length-1] + "9">
${string}
Results in:
1239
Since you want to replace that thing, use ?replace. This replaces the last character with 9, if the last character is a digit that's not already 9:
${s?replace("[0-8]$", '9', 'r')}
I am currently working with UDP packets and I need to craft and send custom data. Because it is easier for me to read I work with strings representing hexadecimal values. I have something like this :
a = "12"
b = "15"
header = "c56b4040003300" + a + "800401" + b + "90000000"
Now, what I want to do is convert my header variable into raw bytes (not the hexadecimal value of every character in header). That is, if I write my header variable to a file and open it with a hexadecimal editor, I want to see
c5 6b 40 40 00 33 00 12 80 04 01 15 90 00 00 00
I don't know Ruby well and I haven't found a way to do it so far. As far as I can tell, the pack function converts characters to hex, but does not turn a string of hexadecimal digits into the corresponding byte values. And doing something like
header = "\xc5\x6b\x40\x40\x00\x33\x00\x" + a + "\x80\x04\x01\x" + b + "\x90\x00\x00\x00"
throws an error saying "invalid hex escape" (which makes sense).
So if you have a solution to this problem, please tell me (if possible without using any external library).
# String#scan returns an array of two-character chunks.
# (StringScanner#scan returns a single String, so it cannot be chained with map.)
header.scan(/../).map { |x| x.hex.chr }.join
String#to_i takes a base argument that will do what you want:
["c56b4040003300", a, "800401", b, "90000000"].join.to_i(16)
But it may not make sense to represent your data as a large integer. If you just want a blob of binary data, you can concatenate everything together and use Array#pack:
[["c56b4040003300", a, "800401", b, "90000000"].join].pack('H*')
Or you can pack the individual components and concatenate the results:
["c56b4040003300", a, "800401", b, "90000000"].map { |s| [s].pack('H*') }.join
Or you can just work with an array of bytes throughout your program:
bytes = "c56b4040003300".scan(/../)
bytes << a
bytes.concat "800401".scan(/../)
bytes << b
bytes.concat "90000000".scan(/../)
bytes.pack('H*' * bytes.size) # Array#pack: one H* directive per two-digit element
Hope that helps!
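As a quick sanity check, the following sketch packs the header from the question and then verifies the result by converting it back. It assumes Ruby 2.4 or later for String#unpack1. The names packed and hex_dump are only illustrative.

```ruby
a = "12"
b = "15"
header = ["c56b4040003300", a, "800401", b, "90000000"].join

# 'H*' reads the whole string as hex digits, high nibble first.
packed = [header].pack('H*')

# 32 hex digits become 16 raw bytes.
puts packed.bytesize #=> 16

# Format each byte the way a hex editor would display it.
hex_dump = packed.bytes.map { |byte| format('%02x', byte) }.join(' ')
puts hex_dump #=> c5 6b 40 40 00 33 00 12 80 04 01 15 90 00 00 00

# Unpacking with the same directive returns the original hex string.
puts packed.unpack1('H*') == header #=> true
```

When writing packed to a file, opening the file in binary mode (for example with File.binwrite) avoids any newline or encoding translation.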
That's what I am doing:
c.scan(/[1-9]|1[0-2]/)
For some reason, it returns only numbers from 1 to 9, ignoring the second part. I tried experimenting a little bit, it seems that the method will search for 10-12 only if 1 is excluded from [1-9] part, e.g., c.scan(/[2-9]|1[0-2]/) will do. What is the reason?
P.S. I know that this method lacks lookbehinds and will search for numbers and "part of numbers" as well
Change the order of your patterns and add word boundaries if necessary.
c.scan(/\b(?:1[0-2]|[1-9])\b/)
Alternation is tried from left to right at each position, and the first alternative that matches wins. With 1[0-2] placed first, the regex matches 10, 11 and 12 as whole numbers before falling back to [1-9] for the single digits 1 to 9. Note that without anchoring, the pattern would also match the 9 in 59. So I suggest putting the alternatives inside a capturing or non-capturing group and adding a word boundary \b (which matches between a word character and a non-word character) before and after that group.
| matches left to right, and the first part of the right side (1) is always matched by the left side. Reverse them:
c.scan(/1[0-2]|[1-9]/)
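The effect of the ordering, and of the word boundaries, can be seen in a small sketch. The sample strings and variable names below are made up for illustration.

```ruby
c = "1 10 12 9"

# [1-9] comes first, so the "1" of "10" and "12" is consumed on its own
# and the second alternative never gets a chance to match.
digit_first = c.scan(/[1-9]|1[0-2]/)
p digit_first #=> ["1", "1", "1", "2", "9"]

# With the two-digit alternative first, 10 and 12 are matched whole.
two_digit_first = c.scan(/1[0-2]|[1-9]/)
p two_digit_first #=> ["1", "10", "12", "9"]

# Without word boundaries, digits inside larger numbers still match.
unbounded = "59 7 11".scan(/1[0-2]|[1-9]/)
p unbounded #=> ["5", "9", "7", "11"]

# With \b on both sides, only standalone numbers from 1 to 12 match.
bounded = "59 7 11".scan(/\b(?:1[0-2]|[1-9])\b/)
p bounded #=> ["7", "11"]
```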
Here's another way you might consider extracting numbers between 1 and 12 (assuming that's what you want to do):
c = '14 0 11x 15 003 y12'
c.scan(/\d+/).map(&:to_i).select { |n| (1..12).cover?(n) }
#=> [11, 3, 12]
I've returned an array of integers, rather than strings, thinking that probably would be more useful, but if you want strings:
c.scan(/\d+/).map { |s| s.to_i.to_s }
.select { |s| ['10', '11', '12', *'1'..'9'].include?(s) }
#=> ["11", "3", "12"]
I see several advantages to this approach, versus using a single regex:
it's easy to understand;
the regex is simple;
it's easy to modify if the permissible values change; and
it can be broken into three pieces to facilitate testing.
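One possible way to split the chain into three separately testable pieces is sketched below. The method names digit_runs, to_integers and within_range are hypothetical and not part of the answer above.

```ruby
# Step 1: pull out every run of digits as a string.
def digit_runs(text)
  text.scan(/\d+/)
end

# Step 2: convert the digit strings to integers (this drops leading zeros).
def to_integers(runs)
  runs.map(&:to_i)
end

# Step 3: keep only the values inside the permitted range.
def within_range(numbers, range = 1..12)
  numbers.select { |n| range.cover?(n) }
end

c = '14 0 11x 15 003 y12'
result = within_range(to_integers(digit_runs(c)))
p result #=> [11, 3, 12]
```

Changing the permitted values then only requires passing a different range to the last step.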
I am having a very difficult time with this:
# contained within:
"MA\u008EEIKIAI"
# should be
"MAŽEIKIAI"
# nature of string
$ p string3
"MA\u008EEIKIAI"
$ puts string3
MAEIKIAI
$ string3.inspect
"\"MA\\u008EEIKIAI\""
$ string3.bytes
#<Enumerator: "MA\u008EEIKIAI":bytes>
Any ideas on where to start?
Note: this is not a duplicate of my previous question.
\u008E means that the Unicode character with code point 8E (in hex) appears at that point in the string. This character is the control character “SINGLE SHIFT TWO” (see the code chart (pdf)). The character Ž is at code point U+017D. However, it is at position 8E in the Windows CP-1252 encoding. Somehow you’ve got your encodings mixed up.
The easiest way to “fix” this is probably just to open the file containing the string (or the database record or whatever) and edit it to be correct. The real solution will depend on where the string in question came from and how many bad strings you have.
Assuming the string is in UTF-8 encoding, \u008E will consist of the two bytes c2 and 8e. Note that the second byte, 8e, is the same as the encoding of Ž in CP-1252. One way to convert the string would be something like this:
string3.force_encoding('BINARY') # treat the string just as bytes for now
string3.gsub!(/\xC2/n, '') # remove the C2 byte
string3.force_encoding('CP1252') # give the string the correct encoding
string3.encode('UTF-8') # convert to the desired encoding
Note that this isn’t a general solution to fix all issues like this. Not all CP-1252 characters, when mangled and expressed in UTF-8 this way, will be amenable to conversion like this. Some will be two bytes c2 xx where xx is the correct byte (like in this case), others will be c3 yy where yy is a different byte.
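To see the bytes involved, here is a small sketch that dumps the mangled string and wraps the steps above in a non-mutating helper. The helper name fix_c2_mangling is hypothetical, and it only handles the c2 xx case described above.

```ruby
mangled = "MA\u008EEIKIAI"

# In UTF-8, U+008E is stored as the two bytes c2 8e.
hex_bytes = mangled.bytes.map { |b| format('%02x', b) }.join(' ')
puts hex_bytes #=> 4d 41 c2 8e 45 49 4b 49 41 49

# Hypothetical helper: works on a copy so the caller's string is untouched.
def fix_c2_mangling(str)
  raw = str.dup.force_encoding('BINARY') # treat the copy as plain bytes
  raw.gsub!(/\xC2/n, '')                 # drop the c2 lead byte
  raw.force_encoding('Windows-1252')     # the remaining bytes are CP-1252
  raw.encode('UTF-8')                    # convert to the desired encoding
end

fixed = fix_c2_mangling(mangled)
puts fixed #=> MAŽEIKIAI
```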
What about using a Regexp and Array#pack to convert the Unicode escape?
str = "MA\\u008EEIKIAI"
puts str #=> MA\u008EEIKIAI
str.gsub!(/\\u(.{4})/) do |match|
[$1.to_i(16)].pack('U')
end
puts str #=> MA EIKIAI
If I have a string like
6d7411014f
I want to read the first two digits that occur and put the resulting number in a variable.
Based on the above example, my variable would contain 67
more examples:
d550dfe10a
variable would be 55
What I've tried is \d, but that gives me 6. How do I get the second number?
I'd use scan for this sort of thing:
n = my_string.scan(/\d/)[0,2].join.to_i
You'd have to decide what you want to do if there aren't two numbers though.
For example:
>> '6d7411014f'.scan(/\d/)[0,2].join.to_i
=> 67
>> 'd550dfe10a'.scan(/\d/)[0,2].join.to_i
=> 55
>> 'pancakes'.scan(/\d/)[0,2].join.to_i
=> 0
>> '6 pancakes'.scan(/\d/)[0,2].join.to_i
=> 6
References:
String#scan
Array#[]
Array#join
I really can't answer this exactly in Ruby, but a regex to do it is:
/^\D*(\d)\D*(\d)/
Then you have to concatenate $1 and $2 (or whatever they are called in Ruby).
Building off of sidyll's answer,
string = '6d7411014f'
matched_vals = string.match(/^\D*(\d)\D*(\d)/)
extracted_val = matched_vals[1].to_i * 10 + matched_vals[2].to_i
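Note that String#match returns nil when the string has fewer than two digits, so calling matched_vals[1] would raise an error in that case. A minimal sketch that guards against this is shown below. The method name first_two_digits is hypothetical, and returning nil for "not enough digits" is just one possible choice.

```ruby
# Returns the first two digits of text combined into an Integer,
# or nil when text contains fewer than two digits.
def first_two_digits(text)
  m = text.match(/\A\D*(\d)\D*(\d)/)
  m && (m[1] + m[2]).to_i
end

p first_two_digits('6d7411014f') #=> 67
p first_two_digits('d550dfe10a') #=> 55
p first_two_digits('6 pancakes') #=> nil
p first_two_digits('pancakes')   #=> nil
```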