When using regular expressions in Ruby, what is the difference between $1 and \1?
\1 is a backreference which will only work in the same sub or gsub method call, e.g.:
"foobar".sub(/foo(.*)/, '\1\1') # => "barbar"
$1 is a global variable which can be used in later code:
if "foobar" =~ /foo(.*)/
  puts "The matching word was #{$1}"
end
Output:
"The matching word was bar"
# => nil
Keep in mind there's a third option, the block form of sub. Sometimes you need it. Say you want to replace some text with the reverse of that text. You can't use $1, because the argument is evaluated before sub performs the match, so $1 is not yet bound to this match:
"foobar".sub(/(.*)/, $1.reverse) # WRONG: either uses a PREVIOUS value of $1,
# or gives an error if $1 is unbound
You also can't use \1, because sub simply substitutes the captured text for \1 in the replacement string, after your expression has already been evaluated. There's no magic taking place here:
"foobar".sub(/(.*)/, '\1'.reverse) # WRONG: returns '1\'
So if you want to do anything fancy, you should use the block form of sub ($1, $2, $`, $' etc. will be available):
"foobar".sub(/.*/){|m| m.reverse} # => returns 'raboof'
"foobar".sub(/(...)(...)/){$1.reverse + $2.reverse} # => returns 'oofrab'
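To compare the three forms from the answer above side by side, here is a minimal sketch. The method names (double_capture, capture_after_match, reverse_halves) are made up for illustration and are not part of Ruby.

```ruby
# Hypothetical helper names, used only to keep the three forms apart.

# Backreference: \1 is expanded by sub itself, inside the replacement string.
def double_capture(str)
  str.sub(/foo(.*)/, '\1\1')
end

# Global: $1 is set by the last successful match and is readable afterwards.
def capture_after_match(str)
  $1 if str =~ /foo(.*)/
end

# Block form: the block runs after the match, so $1 and $2 are already set.
def reverse_halves(str)
  str.sub(/(...)(...)/) { $1.reverse + $2.reverse }
end

puts double_capture("foobar")       # barbar
puts capture_after_match("foobar")  # bar
puts reverse_halves("foobar")       # oofrab
```

The block form is the only one of the three where arbitrary Ruby code can run on the captured text before it is substituted.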
I'm trying to get an array back from Perl into bash.
My Perl script has an array, and I use return(@arr).
From my bash script I use
VAR = `perl....
When I echo VAR,
I get the array as one long string, with all the array elements joined together with no spaces.
Thanks
In the shell (and in Perl), backticks (``) capture the output of a command. However, Perl's return is normally for returning variables from subroutines - it does not produce output, so you probably want print instead. Also, in bash, array variables are declared with parentheses. So this works for me:
$ ARRAY=(`perl -wMstrict -le 'my @array = qw/foo bar baz/; print "@array"'`); \
echo "<${ARRAY[*]}> 0=${ARRAY[0]} 1=${ARRAY[1]} 2=${ARRAY[2]}"
<foo bar baz> 0=foo 1=bar 2=baz
In Perl, interpolating an array into a string (like "@array") will join the array with the special variable $" in between elements; that variable defaults to a single space. If you simply print @array, then the array elements will be joined by the variable $,, which is undef by default, meaning no space between the elements. This probably explains the behavior you mentioned ("the array vars connected with no spaces").
Note that the above will not work the way you expect if the elements of the array contain whitespace, because bash will split them into separate array elements. If your array does contain whitespace, then please provide an MCVE with sample data so we can perhaps make an alternative suggestion of how to return that back to bash. For example:
( # subshell so IFS is only affected locally
IFS=$'\n'
ARRAY=(`perl -wMstrict -e 'my @array = ("foo","bar","quz baz"); print join "\n", @array'`)
echo "0=<${ARRAY[0]}> 1=<${ARRAY[1]}> 2=<${ARRAY[2]}>"
)
Outputs: 0=<foo> 1=<bar> 2=<quz baz>
Here is one way, using Bash word splitting. It splits the string on whitespace into the new array named array:
array_str=$(perl -E '@a = 1..5; say "@a"')
array=( $array_str )
for item in "${array[@]}"; do
  echo ": $item"
done
Output:
: 1
: 2
: 3
: 4
: 5
I have a free form string which I need to sanitize in bash in order to produce safe-and-nice filenames.
Example:
STAGE_NAME="Some usafe name 2/2#"
Expected sanitized result:
"some-unsafe-name-2-2"
Logic:
lowercase chars
replace all unsupported or unsafe chars with dash (including spaces)
remove duplicated dashes
remove any dashes from prefix or suffix
Use of external tools like sed is allowed as long as they are portable (not using options that are not available on BSD/OSX/...).
You can use this pure bash function for this sanitization:
shopt -s extglob                  # the +(-) pattern below needs extended globbing
sanitize() {
  local s="${1?need a string}"    # receive input in first argument
  s="${s//[^[:alnum:]]/-}"        # replace every non-alphanumeric character with -
  s="${s//+(-)/-}"                # collapse runs of - into a single -
  s="${s/#-}"                     # remove - from start
  s="${s/%-}"                     # remove - from end
  echo "${s,,}"                   # convert to lowercase (requires bash 4 or later)
}
Then call it as:
sanitize "///Some usafe name 2/2##"
some-usafe-name-2-2
sanitize "Some usafe name 2/2#"
some-usafe-name-2-2
Just for an academic exercise here is an awk one-liner doing the same:
awk -F '[^[:alnum:]]+' -v OFS=- '{$0=tolower($0); $1=$1; gsub(/^-|-$/, "")} 1'
When using regular expressions in Ruby, what is the difference between
$1
and
#{$1}
?
NOTE:
markup =~ /(\d+)/
@a = $1
s = "<div> ... '#{$1}' ... </div>"
my_function(par_1,#{$1},par_3)
NOTE 2:
I try again ...
regular expression: /(\d+)/
string: 123
The value of $1 is 123, correct ?
If I want to pass the value of the $1 variable to a function, shall I write
my_function(par_1,#{$1},par_3)
or
my_function(par_1,$1,par_3)
If I want to pass the address of the $1 variable to a function, shall I write
my_function(par_1,#{$1},par_3)
or
my_function(par_1,$1,par_3)
Last question: any reference where I could learn more ?
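$1 is a global variable that holds the text matched by the first capture group of the most recent successful match.
#{$1} is not a different variable: #{...} is string interpolation, so inside a double-quoted string it inserts the value of $1. Outside a string literal, # starts a comment, so my_function(par_1,#{$1},par_3) does not do what you want; write my_function(par_1,$1,par_3) instead.
For a named capture group you would normally write $~[:named], or r[:named] if r holds the MatchData.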
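A minimal sketch of how $1 and #{$1} relate, assuming a stand-in definition of my_function (the question never shows its body, so the one below is purely hypothetical) and a made-up sample value for markup:

```ruby
# my_function is a hypothetical stand-in for the function in the question.
def my_function(par_1, value, par_3)
  "#{par_1}-#{value}-#{par_3}"
end

markup = "<p>123</p>"  # sample input, not from the question

if markup =~ /(\d+)/
  plain = $1                       # the captured text itself
  html  = "<div>'#{$1}'</div>"     # #{...} interpolates $1 into a string
  puts plain
  puts html
  puts my_function("a", $1, "c")   # pass $1 directly as an argument
end
```

Inside the string, #{$1} and $1 refer to the same value; the #{...} wrapper only tells Ruby to insert it into the surrounding text.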
I need to change the prices in an HTML file. I search for them and store them in an array, but I then have to change them and save the result in /nuevo-focus.html.
price=( `cat /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html | grep -oiE '([$][0-9.]{1,7})'|tr '\n' ' '` )
price2=( $90.880 $0 $920 $925 $930 $910 $800 $712 $27.220 $962 )
sub (){
  for item in "${price[@]}"; do
    for x in ${price2[@]}; do
      sed s/$item/$x/g > /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html
    done
  done
}
sub
The output of "cat /home/.../nuevo-focus.html|grep -oiE '([$][0-9.]{1,7})'|tr '\n' ' '" is:
$86.880 $0 $912 $908 $902 $897 $882 $812 $25.725 $715
In bash, $0 is the name of the script being run and $1 through $9 are its command line arguments. In the line:
price2=( $90.880 $0 $920 $925 $930 $910 $800 $712 $27.220 $962 )
each $ followed by a digit is expanded as one of those parameters, so you get either an empty string or the corresponding value, followed by the remaining characters. For example, $90.880 is read as $9 followed by the literal text 0.880.
Try doing this instead:
price2=( '$90.880' '$0' '$920' '$925' '$930' '$910' '$800' '$712' '$27.220' '$962' )
EDIT for part two of question
If what you are trying to do with the sed line is replace the prices in the file, overwriting the old ones, then you should do this:
sed -i "s/$item/$x/g" /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html
This will perform the substitution in place (-i), modifying the input file.
EDIT for part three of the question
I just realized that your nested loop does not really make sense. I am assuming that what you want to do is replace each price from price with the corresponding price in price2.
If that is the case, then you should use a single loop, looping over the indices of the array:
for i in ${!price[*]}
do
  sed -i "s/${price[$i]}/${price2[$i]}/g" /home/delkav/info-sitioweb/html/productos/autos/nuevo-focus.html
done
I'm not able to test that right now, but I think it should accomplish what you want.
To explain it a bit:
${!price[*]} gives you all of the indices of your array (e.g. 0 1 2 3 4 ...)
For each index we then replace the corresponding old price with the new one. There is no need for a nested loop as you have done. When you execute that, what you are basically doing is this:
replace every occurrence of "foo" with "bar"
# at this point, there are no more occurrences of "foo" in your file
# so all of the other replacements do nothing
replace every occurrence of "foo" with "baz"
replace every occurrence of "foo" with "spam"
replace every occurrence of "foo" with "eggs"
replace every occurrence of "foo" with "qux"
replace every occurrence of "foo" with "whatever"
etc...
Could anybody tell me how this expression works?
output = "#{output.gsub(/grep .*$/,'')}"
Before that operation, the value of output is
"df -h | grep /mnt/nand\r\n/dev/mtdblock4 248.5M 130.7M 117.8M 53% /mnt/nand\r\n"
but after the operation it becomes
"df -h | \n/dev/mtdblock4 248.5M 248.5M 130.7M 117.8M 53% /mnt/nand\r\n "
Please help me.
Your expression is equivalent to:
output.gsub!(/grep .*$/,'')
which is much easier to read.
The . in the regular expression matches all characters except newline (\n) by default. So, in the string provided, it matches "grep /mnt/nand\r" (the \r is not a newline, so . consumes it too) and substitutes an empty string for that. The result is the provided string without the matched substring.
Here is a simpler example:
"hello\n\n\nworld".gsub(/hello.*$/,'') # => "\n\n\nworld"
In both your provided regex, and the example above, the $ is not necessary. It is used as an anchor to match the end of a line, but since the pattern immediately before it (.*) matches everything up to a newline, it is redundant (but does not cause harm).
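The behaviour described above can be checked with a short sketch. The sample string is a shortened version of the one in the question, and strip_grep is a hypothetical helper name wrapping the same regex:

```ruby
# strip_grep is a hypothetical helper; the regex is the one from the question.
def strip_grep(text)
  text.gsub(/grep .*$/, '')
end

# Shortened sample, modelled on the string in the question.
sample = "df -h | grep /mnt/nand\r\n/dev/mtdblock4 53% /mnt/nand\r\n"

# . stops at \n but does match \r, so the \r after "nand" is removed as well.
p strip_grep(sample)

# Dropping the $ anchor gives the same result here.
p sample.gsub(/grep .*/, '') == strip_grep(sample)  # true
```

This also explains why the \r before the first \n disappears in the question's output: it was part of the match.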
Since gsub returns a string, your first line is exactly the same as
output = output.gsub(/grep .*$/, '')
which takes the string and removes any occurrence of the regexp pattern
/grep .*$/
i.e. all parts of the string that start with 'grep ' until the end of the string or a line break.
There's a good regexp tester/reference here. This one matches the word "grep", then a space, then any number of characters until the next line break (\n). "." by itself means any character other than \n, and ".*" together means any number of them, as many as possible. "$" means the end of a line.
For the '$', see here http://www.regular-expressions.info/reference.html
".*$" means "take every character up to the end of the string"; but the regex engine treats "\n" as the end of a line, so the match stops there.