I want to take the characters of a string up to, but not including, a specific character.
I have this string:
a='2.452811139617034,42.10874821716908|3.132087902867818,42.028314077306646|-0.07934861041448178,41.647538468746916|-0.07948265046522918,41.64754863599606'
How can I get everything up to the | character, without including that character,
so that the result looks like this:
2.452811139617034,42.10874821716908
You can avoid creating an unnecessary Array (as String#split does) or using a Regex (as with String#gsub) by using String#[] with a start and a length:
a = "2.452811139617034,42.10874821716908|3.132087902867818,42.028314077306646|-0.07934861041448178,41.647538468746916|-0.07948265046522918,41.64754863599606"
a[0,a.index('|')]
#=> "2.452811139617034,42.10874821716908"
This selects the characters from position 0 up to, but not including, the first pipe (|). Strictly speaking, it starts at position 0 and takes n characters, where n is the index of the pipe; this works because Ruby uses 0-based indexing, so the pipe's index equals the number of characters before it.
As @CarySwoveland astutely pointed out, the string may not contain a pipe, in which case my solution would need to change to
# to return the entire string
a[0,a.index('|') || a.size]
# or
(b = a.index(?|)) ? a[0,b] : a
# or to return empty string
a[0,a.index('|').to_i]
# or to return nil
a[0,a.index(?|) || -1]
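The variants above could be folded into one small helper. This is only a sketch: the method name prefix_before and its fallback: keyword are made up for illustration and are not part of Ruby's String API.

```ruby
# Hypothetical helper: returns the part of str before the first delimiter.
# When the delimiter is absent, fallback decides what comes back instead.
def prefix_before(str, delimiter, fallback: :whole)
  idx = str.index(delimiter)
  return str[0, idx] if idx

  case fallback
  when :whole then str # entire string
  when :empty then ""  # empty string
  else nil             # any other value: nil
  end
end

# Shortened version of the string from the question.
a = "2.452811139617034,42.10874821716908|3.132087902867818,42.028314077306646"
puts prefix_before(a, "|")
p prefix_before("no pipe here", "|")
p prefix_before("no pipe here", "|", fallback: :empty)
p prefix_before("no pipe here", "|", fallback: :none)
```

Keeping the fallback explicit at the call site makes it harder to forget that String#index returns nil when nothing is found.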
You can simply do it like this:
a.split('|').first
a[/[^|]+/]
#=> "2.452811139617034,42.10874821716908"
The regular expression simply matches as many characters other than '|' that it can.
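One thing worth checking is how the regex behaves on edge cases compared with the index-based slice. The sample strings below are invented for illustration; the sketch only prints both results side by side.

```ruby
# Compare the regex slice with the index-based slice on a few edge cases.
samples = ["1,2|3,4", "no pipe", "|leading pipe", ""]

results = samples.map do |s|
  by_regex = s[/[^|]+/]            # first run of non-pipe characters, or nil
  idx = s.index("|")
  by_index = idx ? s[0, idx] : s   # everything before the first pipe
  [s, [by_regex, by_index]]
end.to_h

results.each do |s, (by_regex, by_index)|
  puts "#{s.inspect}: regex=#{by_regex.inspect} index=#{by_index.inspect}"
end
```

The two agree when the string starts with a non-pipe character. They differ for a leading pipe (the regex skips it and returns the next run) and for an empty string (the regex returns nil). Anchoring the pattern as /\A[^|]*/ should make the regex behave like the index version in both of those cases.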
I'm always curious as to the performance of various options and so I took the liberty of looping through some of the approaches suggested. I named them as follows:
split_all = a.split('|').first
partition = a.partition('|').first
split_two = a.split('|', 2).first
string_brack_args = a[0,a.index('|')]
string_brack_range = a[0...a.index('|')]
gsub_regex = a.gsub(/\|.*$/, "")
plain_regex = a[/[^|]+/]
The results were as follows (slowest at top):
                         user     system      total        real
gsub_regex           0.170000   0.000000   0.170000 (  0.162666)
split_all            0.110000   0.000000   0.110000 (  0.109498)
split_two            0.040000   0.000000   0.040000 (  0.041792)
partition            0.040000   0.000000   0.040000 (  0.037161)
string_brack_range   0.030000   0.000000   0.030000 (  0.034021)
plain_regex          0.040000   0.000000   0.040000 (  0.033468)
string_brack_args    0.020000   0.000000   0.020000 (  0.022455)
That's roughly a seven-fold difference in real time between the slowest and fastest. Even some of the little things make a huge difference. Granted, I looped through these 100_000 times, so the difference is pretty small for a single call, but even something as simple as using split(str, 2) vs split(str) is well over twice as fast. In fact, the faster half of the approaches listed average about 3 times as fast as the slower half.
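The original benchmark script is not shown, so the following is only a guess at how such a comparison could be set up with Ruby's standard Benchmark module. The hash of lambdas and the shortened input string are assumptions made for this sketch; absolute timings will differ by machine and Ruby version, and the lambda call adds a small constant overhead to every approach.

```ruby
require "benchmark"

# Shortened version of the string from the question.
a = "2.452811139617034,42.10874821716908|3.132087902867818,42.028314077306646"
n = 100_000 # iteration count mentioned in the text

approaches = {
  "split_all"          => -> { a.split("|").first },
  "partition"          => -> { a.partition("|").first },
  "split_two"          => -> { a.split("|", 2).first },
  "string_brack_args"  => -> { a[0, a.index("|")] },
  "string_brack_range" => -> { a[0...a.index("|")] },
  "gsub_regex"         => -> { a.gsub(/\|.*$/, "") },
  "plain_regex"        => -> { a[/[^|]+/] }
}

# 20 is the label column width, wide enough for the longest name.
Benchmark.bm(20) do |bm|
  approaches.each do |name, code|
    bm.report(name) { n.times { code.call } }
  end
end
```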
One approach uses a gsub replacement:
a = "2.452811139617034,42.10874821716908|3.132087902867818,42.028314077306646|-0.07934861041448178,41.647538468746916|-0.07948265046522918,41.64754863599606"
output = a.gsub(/\|.*$/, "")
puts output # 2.452811139617034,42.10874821716908
I am trying to multiply a string value, which is 33.000000, but I am getting error ORA-01722.
Can you please advise how to properly convert/cast this field in order to multiply it?
select c1.name as Model,
p22.value as basisAMT,
p23.value as TotalAMT
MODEL BasisAMT TotalAMT
Auto 0.000000 0.000000
Auto 22.000000 33.000000
Auto 0.000000 0.000000
Auto 0.000000 0.000000
But when I then try this:
select c1.name as Model,
100 * p22.value as basisAMT,
p23.value as TotalAMT
The error appears.
Can you debug code that you cannot see? No? So why do you expect that of others?
It is illogical to do math on a character string. The fact that the characters in the string are all numeric doesn't change the fact that computers treat them as characters, no different than
'my name is fred' * 23
And we don't even know that all the characters are numeric. It could be padded with space characters.
You need to look at the TO_NUMBER() function.
I am trying to extract measurements from file names, and they are very inconsistent; for example:
FSTCAR.5_13UNC_1.00
FSTCAR.5_13UNC_1.00GR5P
FSTCAR.5_13UNC_1.00SS316
I have to be able to match all numbers (with decimals, and with or without leading zeros). I think I have that working with this:
/\d*\.?\d+/i
However, I also want to be able to exclude numbers preceded by SS or GR. Something like this seems to partially work:
/(?<!GR|SS)\d*\.?\d+/i
That will exclude the 5 from FSTCAR.5_13UNC_1.00GR5P above, but anything more than a single digit is not excluded, so the 16 from the 316 would be a match. I am doing this in Ruby.
Any time you have to pick apart floating-point number strings, it's not a trivial feat.
This just takes your last regex and adds an extra condition to the lookbehind.
The extra (?<![.\d]) stops the engine from skipping the first digits of a number just to find a position where the regex matches.
# (?<!GR)(?<!SS)(?<![.\d])\d*\.?\d+
# (?<! GR | SS | [.\d] )
(?<! GR )
(?<! SS )
(?<! [.\d] )
\d* \.? \d+
Perl test case
@ary = (
'FSTCAR.5_13UNC_1.00 ',
'FSTCAR.5_13UNC_1.00GR5P',
'FSTCAR.5_13UNC_1.00SS316'
);
foreach $fname (@ary)
{
print "filename: $fname\n";
while ( $fname =~ /(?<!GR)(?<!SS)(?<![.\d])\d*\.?\d+/ig ) {
print " found $&\n";
}
}
Output >>
filename: FSTCAR.5_13UNC_1.00
found .5
found 13
found 1.00
filename: FSTCAR.5_13UNC_1.00GR5P
found .5
found 13
found 1.00
filename: FSTCAR.5_13UNC_1.00SS316
found .5
found 13
found 1.00
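Since the question is about Ruby, the same check can be sketched with String#scan. This version drops the i flag on the assumption that the GR and SS prefixes are always upper case, as they are in the sample names; the variable names are made up for illustration.

```ruby
# Same pattern as the Perl test, minus the /i flag (assumes upper-case GR/SS).
number = /(?<!GR)(?<!SS)(?<![.\d])\d*\.?\d+/

names = [
  "FSTCAR.5_13UNC_1.00",
  "FSTCAR.5_13UNC_1.00GR5P",
  "FSTCAR.5_13UNC_1.00SS316"
]

# scan returns every non-overlapping match as an array of strings.
found = names.map { |name| [name, name.scan(number)] }.to_h

found.each do |name, numbers|
  puts "filename: #{name}"
  numbers.each { |num| puts "  found #{num}" }
end
```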
To fix the SS and GR exclusion, try this:
/(?<!GR|SS)[\d\.]+/i
I'm not sure exactly what your layout is, but using this would be faster for your negative look behind:
(?<![GRS]{2})
Edit: the + still isn't greedy enough.
You might need to use two regexes: one to remove the GR/SS numbers, and one to match (note: I'm not very familiar with Ruby):
val = val.gsub(/[GRS]{2}[\d\.]+/, '')
val =~ /[\d\.]+/
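A two-pass version of this idea might look like the following in Ruby. This is a sketch under a few assumptions: the helper name measurements is hypothetical, the prefix pattern is narrowed to the literal GR and SS rather than [GRS]{2}, and the number pattern is the one from the question.

```ruby
# Hypothetical helper implementing the two-pass idea.
def measurements(name)
  # Pass 1: drop GR or SS together with the digits/dots that follow.
  cleaned = name.gsub(/(?:GR|SS)[\d.]+/, "")
  # Pass 2: collect every remaining number.
  cleaned.scan(/\d*\.?\d+/)
end

p measurements("FSTCAR.5_13UNC_1.00")
p measurements("FSTCAR.5_13UNC_1.00GR5P")
p measurements("FSTCAR.5_13UNC_1.00SS316")
```

The first pass is greedy, so all digits after the prefix are removed at once, which is what the single lookbehind could not guarantee.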
Say I have a string like this:
May 12 -
Where what I want to end up with is:
May 12
I tried doing a gsub(/\s+\W/, '') and that works to strip out the trailing space and the last hyphen.
But I am not sure how I remove the first space before the M.
Thoughts?
Use match instead of gsub (i.e. extract the relevant string, instead of trying to strip irrelevant parts), using the regex /\w+(?:\W+\w+)*/:
" May 12 - ".match(/\w+(?:\W+\w+)*/).to_s # => "May 12"
Note that this is vastly more efficient than using gsub – pitting my match regex against the suggested gsub regex, I get these benchmarks (on 5 million repetitions):
                  user     system      total        real
match:       19.520000   0.060000  19.580000 ( 22.046307)
gsub:        31.830000   0.120000  31.950000 ( 35.781152)
Adding a strip! step as suggested does not significantly change this:
                  user     system      total        real
match:       19.390000   0.060000  19.450000 ( 20.537461)
gsub.strip!: 30.800000   0.110000  30.910000 ( 34.140044)
Use .strip! on your result.
" May 12".strip! # => "May 12"
How about:
/^\s+|\s+\W+$/
explanation:
/ : regex delim
^ : beginning of string
\s+ : 1 or more spaces
| : OR
\s+\W+ : 1 or more spaces followed by 1 or more non word char
$ : end of string
/ : regex delim
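Applied with gsub, the pattern could be used as in the sketch below. Note that in Ruby ^ and $ match at the start and end of each line rather than of the whole string, so the second variant swaps in \A and \z for input that may contain newlines; that swap is an addition here, not part of the answer above.

```ruby
s = " May 12 - "

# Pattern as given: strip leading whitespace, and trailing
# whitespace followed by non-word characters.
cleaned = s.gsub(/^\s+|\s+\W+$/, "")

# Variant anchored to the whole string instead of to each line.
cleaned_anchored = s.gsub(/\A\s+|\s+\W+\z/, "")

p cleaned
p cleaned_anchored
```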
I'm trying to loop through a Ruby string containing many lines using the each_line method, but I also want to change them. I'm using the following code, but it doesn't seem to work:
string.each_line{|line| line=change_line(line)}
I suppose that Ruby is passing a copy of my line and not the line itself, but unfortunately there is no each_line! method. I also tried the gsub! method, using /^.*$/ to detect each line, but it seems that it calls the change_line method only once and replaces all lines with its result. Any ideas how to do that?
Thanks in advance :)
@azlisum: You are not storing the result of your concatenation. Use:
output = string.lines.map{|line|change_line(line)}.join
Comparing four ways to process by line in a string:
# Inject method (proposed by @steenslag)
output = string.each_line.inject(""){|s, line| s << change_line(line)}
# Join method (proposed by @Lars Haugseth)
output = string.lines.map{|line|change_line(line)}.join
# REGEX method (proposed by @olistik)
output = string.gsub!(/^(.*)$/) {|line| change_line(line)}
# String concatenation += method (proposed by @Erik Hinton)
output = ""
string.each_line{|line| output += change_line(line)}
The timing with Benchmark:
                  user     system      total        real
Inject Time:  7.920000   0.010000   7.930000 (  7.920128)
Join Time:    7.150000   0.010000   7.160000 (  7.155957)
REGEX Time:  11.660000   0.010000  11.670000 ( 11.661059)
+= Time:      7.080000   0.010000   7.090000 (  7.076423)
As @steenslag pointed out, 's += a' generates a new string for each concatenation and is therefore usually not the best choice.
So given that, and given the times, your best bet is:
output = string.lines.map{|line|change_line(line)}.join
Also, this is the cleaner looking choice IMHO.
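To make the difference from the original attempt concrete, here is a small runnable sketch. The change_line below is a hypothetical stand-in (it just upcases the line), since the asker's real method is not shown.

```ruby
# Hypothetical stand-in for the asker's change_line.
def change_line(line)
  line.upcase
end

string = "first line\nsecond line\n"

# Reassigning the block variable only rebinds the local name;
# the result is thrown away and string stays as it was.
string.each_line { |line| line = change_line(line) }

# Collecting the changed lines and joining them builds the new string.
output = string.lines.map { |line| change_line(line) }.join

puts string
puts output
```

Each line yielded by lines keeps its trailing newline, so join with no separator rebuilds the text with the original line breaks.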
Notes:
Using Benchmark
Ruby-Doc: Benchmark
You could also start with an empty string, iterate over the original with each_line, and append each changed line to the empty string.
output = ""
string.each_line{|line| output += change_line(line)}
In your original example, you are correct: your changes are occurring, but they are not being saved anywhere. Iterating with each in Ruby does not alter anything by default.
You could use gsub! passing a block to it:
string.gsub!(/^(.*)$/) {|line| change_line(line)}
source: String#gsub!
String#each_line is meant for reading lines in a string, not writing them. You can use this to get the result you want like so:
changed_string = ""
string.each_line{ |line| changed_string += change_line(line) }
If you don't give each_line a block, you'll get an enumerator, which has the inject method.
str = <<HERE
smestring dsfg
line 2
HERE
res = str.each_line.inject(""){|m,line|m << line.upcase}
Does Oracle have built-in string character class constants (digits, letters, alphanum, upper, lower, etc)?
My actual goal is to efficiently return only the digits [0-9] from an existing string.
Unfortunately, we still use Oracle 9, so regular expressions are not an option here.
Examples
The field should contain zero to three letters, 3 or 4 digits, then zero to two letters. I want to extract the digits.
String --> Result
ABC1234YY --> 1234
D456YD --> 456
455PN --> 455
No string constants, but you can do:
select translate
( mystring
, '0'||translate (mystring, 'x0123456789', 'x')
, '0'
)
from mytable;
For example:
select translate
( mystring
, '0'||translate (mystring, 'x0123456789', 'x')
, '0'
)
from
( select 'fdkhsd1237ehjsdf7623A#L:P' as mystring from dual);
TRANSLAT
--------
12377623
If you want to do this often you can wrap it up as a function:
create function only_digits (mystring varchar2) return varchar2
is
begin
return
translate
( mystring
, '0'||translate (mystring, 'x0123456789', 'x')
, '0'
);
end;
/
Then:
SQL> select only_digits ('fdkhsd1237ehjsdf7623A#L:P') from dual;
ONLY_DIGITS('FDKHSD1237EHJSDF7623A#L:P')
-----------------------------------------------------------------
12377623
You can check the list of predefined datatypes in Oracle here, but you are not going to find what you are looking for.
To extract the numbers from a string you can use some combination of these functions:
TO_NUMBER, to convert a string to a number.
REPLACE, to remove occurrences.
TRANSLATE, to convert chars.
If you provide a more concrete example, it will be easier to give you a detailed solution.
If you are able to use PL/SQL here, another approach is to write your own regular expression matcher function. One starting point is Rob Pike's elegant, very tiny regular expression matcher in Chapter 1 of Beautiful Code. One of the exercises for the reader is to add character classes. (You'd first need to translate his 30 lines of C code into PL/SQL.)