How do I recursively replace part of a string with another given string in bash? - bash

I need to write a bash script that converts a string of only integers, "intString", to :id. intString always appears after a /, never contains any other character types (create_step2 is not a valid intString), and ends at either a second / or the end of the line. intString may be 1-8 characters long. The script needs to run on every line in a given file.
For example:
/sample/123456/url should be converted to /sample/:id/url
and /sample_url/9 should be converted to /sample_url/:id; however, /sample_url_2/ should remain the same.
Any help would be appreciated!

It seems like the long way around the problem to go recursive but then I don't know what problem you are solving. It seems like a good sed command like
sed -E 's/\/[0-9]{1,}/\/:id/g'
could do it in one shot, but if you insist on being recursive then it might go something like this ...
#!/bin/bash
# Replace one /digits segment per call and recurse until none remain.
restring() {
    local s
    s="$(printf '%s\n' "$1" | sed -E 's/\/[0-9]{1,}/\/:id/')"
    if printf '%s\n' "$s" | grep -qE '/[0-9]{1,}'; then
        restring "$s"
    else
        printf '%s\n' "$s"
    fi
}
restring "$1"
Now run it:
$ ./restring.sh "/foo/123/bar/456/baz/45435/andstuff"
/foo/:id/bar/:id/baz/:id/andstuff
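Note that the one-shot sed command above also rewrites a segment that merely starts with digits (for example /123abc) and puts no upper bound on the length. A minimal sketch that follows the question's rules more closely (1-8 digits, ending at a / or at end of line) might look like this. The function name normalize_ids is made up for illustration, and the sed loop (:a ... ta) is there so that back-to-back numeric segments such as /1/2 are both replaced.

```shell
#!/bin/bash
# normalize_ids: hypothetical helper name, not from the original answer.
# Reads lines on stdin and replaces every all-digit path segment
# (1 to 8 digits, followed by "/" or end of line) with ":id".
normalize_ids() {
    sed -E -e ':a' -e 's#/[0-9]{1,8}(/|$)#/:id\1#' -e 'ta'
}

printf '%s\n' '/sample/123456/url' '/sample_url/9' '/sample_url_2/' | normalize_ids
```

To process a whole file, something like normalize_ids < input.txt should do, since sed already works line by line.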

Related

How to prepend to a string that comes out of a pipe

I have two strings saved in a bash variable, delimited by :. I want to extract the second string, prepend THIS_VAR= to it, and append the result to a file named saved.txt.
For example if myVar="abc:pqr", THIS_VAR=pqr should be appended to saved.txt.
This is what I have so far,
myVar="abc:pqr"
echo $myVar | cut -d ':' -f 2 >> saved.txt
How do I prepend THIS_VAR=?
printf 'THIS_VAR=%q\n' "${myVar#*:}"
See Shell Parameter Expansion and run help printf.
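As a minimal sketch of this approach applied to the question: ${myVar#*:} removes the shortest prefix matching *: and so leaves the second field. The example uses %s instead of %q, on the assumption that the value should be written without shell quoting, and it appends to a temporary file (the variable out) as a stand-in for saved.txt so that it can be run without touching real files.

```shell
myVar="abc:pqr"
out=$(mktemp)    # stand-in for saved.txt in this sketch

# ${myVar#*:} strips everything up to and including the first ':'
printf 'THIS_VAR=%s\n' "${myVar#*:}" >> "$out"

cat "$out"
```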
The more general solution, in addition to @konsolebox's answer, is piping into a compound statement, where you can perform arbitrary operations:
echo This is in the middle | {
echo This is first
cat
echo This is last
}

sh random error when saving function output

I'm trying to make a script that takes a .txt file containing lines like:
davda103:David:Davidsson:800104-1234:TNCCC_1:TDDB46 TDDB80:
and then sorts them, etc. That's just the background; my problem lies here:
#!/bin/sh -x
cat $1 |
while read a
do
testsak = `echo $a | cut -f 1 -d :`; <---**
echo $testsak;
done
Where the arrow is, when I try to run this code I get some kind of weird error.
+ read a
+ cut -f+ echo 1 -d :davda103:David:Davidsson:800104-1234:TNCCC_1:TDDB46
TDDB80:
+ testsak = davda103
scriptTest.sh: testsak: Det går inte att hitta
+ echo
(My Linux is in Swedish because of school.) Anyway, that error ("Det går inte att hitta", i.e. "cannot be found") just says that it can't find... something. Any ideas what could be causing my problem?
You have extra spaces around the assignment operator, remove them:
testsak=`echo $a | cut -f 1 -d :`
The spaces around the equals sign in
testsak = `echo $a | cut -f 1 -d :`; <---**
cause the shell to interpret this as a command named testsak with the arguments = and the result of the command substitution. Removing the spaces will fix the immediate error.
A much more efficient way to extract the value from a is to let read do it (and use input redirection instead of cat):
while IFS=: read -r testsak the_rest; do
    echo "$testsak"
done < "$1"
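A small runnable sketch of the read-based approach, using the sample line from the question. The helper name first_field and the temporary file are assumptions made for the demonstration only.

```shell
# first_field: hypothetical helper; prints the first colon-separated
# field of each line in the file given as $1.
first_field() {
    while IFS=: read -r testsak the_rest; do
        printf '%s\n' "$testsak"
    done < "$1"
}

tmp=$(mktemp)
printf '%s\n' 'davda103:David:Davidsson:800104-1234:TNCCC_1:TDDB46 TDDB80:' > "$tmp"
first_field "$tmp"
rm -f "$tmp"
```

Setting IFS=: only for the read command keeps the change local to that one command, so the rest of the script still splits on whitespace as usual.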

sed backreferences and command interpolation

I am having an interesting issue using only sed to substitute short month strings (e.g. "Oct") with the corresponding number value (e.g. "10"), given a string such as the following:
Oct 14 09:23:35 some other input
To be replaced directly via sed with:
14-10-2013 09:23:35 some other input
None of the following is actually relevant to solving the trivial problem of month string -> number conversion; I'm more interested in understanding some weird behavior I encountered while trying to solve this problem entirely with sed.
Without any attempt of this string substitution (the echo statement is in lieu of the actual input in my script):
...
MMM_DD_HH_mm_ss="([A-Za-z]{3}) ([0-9]{2}) (.+:[0-9]{2})"
echo "Oct 14 09:23:35 some other input" | sed -r "s/$MMM_DD_HH_mm_ss (.+)/\2-\1-\3 \4/"
The next question is how to transform the backreference \1 into a number. One naturally thinks of using command substitution with the backreference as an argument:
...
TestFunc()
{
echo "received input $1$1"
}
...
echo "Oct 14 09:23:35 some other input" | sed -r "s/$MMM_DD_HH_mm_ss (.+)/\2-$(TestFunc \\1)-\3 \4/"
Where TestFunc would be a variation of the date command (as proposed by Jotne below) with the echo'd date-time group as an input. Here TestFunc is just an echo because I'm much more interested in the behavior of what the function believes to be the value of $1.
In this case the sed with TestFunc produces the output:
14-received input OctOct-09:23:35 some other input
Which suggests that sed actually is inserting backreference \1 into the command substitution $(...) for handling by TestFunc (which appears to receive \1 as the local variable $1).
However, all attempts to do anything more with the local $1 fail. For example:
TestFunc()
{
echo "processed: $1$1" > tmp.txt # Echo 1
if [ "$1" == "Oct" ]; then
echo "processed: 10"
else
echo "processed: $1$1" # Echo 2
fi
}
Returns:
14-processed: OctOct-09:23:35 some other input
$1 has been substituted into Echo 2, yet tmp.txt contains the value processed: \1\1; as if the backreference is not being inserted into the command substitution. Even weirder, the if condition fails with $1 != "Oct", yet it falls through to an echo statement which indicates $1 = "Oct".
My question is why is the backreference insertion working in the case of Echo 2 but not Echo 1? I suspect that the backreference insertion isn't working at all (given the failure of the if statement in TestFunc) but rather something subtle is going on that makes the substitution appear to work correctly in the case of Echo 2; what is that subtlety?
Solution
On further reflection I believe I understand what is going on:
\\1 is passed to the command substitution (the child function) as the literal string \1. This is why the equality test within the child function fails.
However, echo simply prints that literal string, so echo "aa$1aa" hands aa\1aa back to sed as the result of the command substitution. Other commands such as rev also "see" $1 as \1.
sed then interpolates \1 in aa\1aa as Oct (or whatever the backreference matched) and returns aaOctaa to the user.
Since command substitution within regexes clearly works, it would be really cool if sed replaced the value of \\1 (or \1, whatever) with the backreference before executing the command substitution $(...); this would significantly increase sed's power...
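The order of evaluation can be made visible by building the sed script in a variable and printing it before sed ever sees it. This is only an illustrative sketch: the variable name script is made up here, the regex is written out inline rather than taken from $MMM_DD_HH_mm_ss, and -E is used in place of -r.

```shell
TestFunc() { echo "received input $1$1"; }

# The shell runs TestFunc first, passing it the literal two-character string \1.
script="s/([A-Za-z]{3}) ([0-9]{2}) (.+:[0-9]{2}) (.+)/\2-$(TestFunc \\1)-\3 \4/"

# What sed actually receives: the backreferences are still unexpanded text.
printf '%s\n' "$script"

# Only now does sed replace each \1 with the captured month.
echo "Oct 14 09:23:35 some other input" | sed -E "$script"
```

The printed script contains "received input \1\1", which shows that the function never had access to the matched text.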
This might work for you (GNU sed):
sed -r 's/$/\nJan01...Oct10Nov11Dec12/;s/(...) (..) (..:..:.. .*)\n.*\1(..).*/\2-\4-2013 \3/;s/\n.*//' file
Add a lookup to the end of the line and use the back reference to match on it making sure to remove the lookup table in all cases.
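For reference, here is a runnable sketch of the same lookup-table idea with the month table written out in full. The complete table, the function name month_to_number, and the sample input are assumptions filled in for the demonstration; the year 2013 is hard-coded as in the answer, and GNU sed is assumed (for \n in the replacement text).

```shell
# month_to_number: hypothetical wrapper around the lookup-table technique.
# Requires GNU sed, where \n in the replacement inserts a newline.
month_to_number() {
    sed -E 's/$/\nJan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12/
            s/(...) (..) (..:..:.. .*)\n.*\1(..).*/\2-\4-2013 \3/
            s/\n.*//'
}

echo "Oct 14 09:23:35 some other input" | month_to_number
```

The third substitution only does something when the second one failed to match, in which case it strips the table that would otherwise be left on the line.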
Here's an example of passing a backreference to a function:
f(){ echo "x$1y$1z"; }
echo a b c | sed -r 's/(.) (.) (.)/'"$(f \\2)"'/'
returns:
xbybz
HTH
Use the correct tool:
date -d "Oct 14 09:23:35" +"%d-%m-%Y %H:%M:%S"
14-10-2013 09:23:35
date reads your input and converts it to any format you like. Note that when the input contains no year, GNU date assumes the current year.

Bash - extracting a string between two points

For example:
((
extract everything here, ignore the rest
))
I know how to ignore everything within, but I don't know how to do the opposite. Basically, it'll be a file and it needs to extract the data between the two points and then output it to another file. I've tried countless approaches, and all seem to tell me the indentation I'm stating doesn't exist in the file, when it does.
If somebody could point me in the right direction, I'd be grateful.
If your data are "line oriented", so the marker is alone (as in the example), you can try some of the following:
function getdata() {
cat - <<EOF
before
((
extract everything here, ignore the rest
someother text
))
after
EOF
}
echo "sed - with two seds"
getdata | sed -n '/((/,/))/p' | sed '1d;$d'
echo "Another sed solution"
getdata | sed -n '1,/((/d; /))/,$d;p'
echo "With GNU sed"
getdata | gsed -n '/((/{:a;n;/))/b;p;ba}'
echo "With perl"
getdata | perl -0777 -pe "s/.*\(\(\s*\\n(.*)?\)\).*/\$1/s"
P.S.: yes, it looks like a dance of crazy toothpicks.
Assuming you want to extract the string inside (( and )):
VAR="abc((def))ghi"
echo "$VAR"
VAR=${VAR##*((}
VAR=${VAR%%))*}
echo "$VAR"
Here ## cuts away the longest match from the beginning, # cuts away the shortest match from the beginning, %% cuts away the longest match from the end, and % cuts away the shortest match from the end.
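A quick way to see the difference between the four operators is to apply them to one value. The sample path below is made up for the illustration.

```shell
path="archive/2013/logs.tar.gz"

echo "${path#*/}"    # shortest */ removed from the front -> 2013/logs.tar.gz
echo "${path##*/}"   # longest */ removed from the front  -> logs.tar.gz
echo "${path%.*}"    # shortest .* removed from the end   -> archive/2013/logs.tar
echo "${path%%.*}"   # longest .* removed from the end    -> archive/2013/logs
```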
The file :
$ cat /tmp/l
((
extract everything here, ignore the rest
someother text
))
The script
$ awk '$1=="((" {p=1;next} $1=="))" {p=0;next} p' /tmp/l
extract everything here, ignore the rest
someother text
sed -n '/^((/,/^))/ { /^((/b; /^))/b; p }'
Brief explanation:
/^((/,/^))/: range addressing (inclusive)
{ /^((/b; /^))/b; p }: sequence of 3 commands
1. skip line with ^((
2. skip line with ^))
3. print
The line skipping is required to make the range selection exclusive.
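To try the range-plus-skip command on the sample data from the question, it can be wrapped in a function and fed a few lines. The name between_markers is hypothetical, and a ; has been added before the closing brace, which some sed implementations require.

```shell
# between_markers: hypothetical helper; prints the lines strictly between
# a line starting with "((" and the next line starting with "))".
between_markers() {
    sed -n '/^((/,/^))/ { /^((/b; /^))/b; p; }'
}

printf '%s\n' 'before' '((' 'extract everything here, ignore the rest' \
    'someother text' '))' 'after' | between_markers
```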

Capturing multiple line output into a Bash variable

I've got a script 'myscript' that outputs the following:
abc
def
ghi
in another script, I call:
declare RESULT=$(./myscript)
and $RESULT gets the value
abc def ghi
Is there a way to store the result either with the newlines, or with '\n' character so I can output it with 'echo -e'?
Actually, RESULT contains what you want — to demonstrate:
echo "$RESULT"
What you show is what you get from:
echo $RESULT
As noted in the comments, the difference is that (1) the double-quoted version of the variable (echo "$RESULT") preserves internal spacing of the value exactly as it is represented in the variable — newlines, tabs, multiple blanks and all — whereas (2) the unquoted version (echo $RESULT) replaces each sequence of one or more blanks, tabs and newlines with a single space. Thus (1) preserves the shape of the input variable, whereas (2) creates a potentially very long single line of output with 'words' separated by single spaces (where a 'word' is a sequence of non-whitespace characters; there needn't be any alphanumerics in any of the words).
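The difference is easy to reproduce without the original script. In this sketch, printf stands in for ./myscript.

```shell
RESULT=$(printf 'abc\ndef\nghi\n')

echo $RESULT      # unquoted: word splitting collapses the newlines into spaces
echo "$RESULT"    # quoted: the three lines are printed exactly as stored
```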
Another pitfall with this is that command substitution — $() — strips trailing newlines. Probably not always important, but if you really want to preserve exactly what was output, you'll have to use another line and some quoting:
RESULTX="$(./myscript; echo x)"
RESULT="${RESULTX%x}"
This is especially important if you want to handle all possible filenames (to avoid undefined behavior like operating on the wrong file).
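The effect of the sentinel character can be checked by comparing string lengths. Here myscript is a stand-in function (an assumption for the demo) that prints abc followed by three newlines.

```shell
myscript() { printf 'abc\n\n\n'; }

plain="$(myscript)"              # trailing newlines are stripped
RESULTX="$(myscript; echo x)"    # the x protects them from stripping
RESULT="${RESULTX%x}"            # remove the sentinel again

echo "${#plain}"     # length without the trick
echo "${#RESULT}"    # length with the trailing newlines preserved
```

The first length is 3 (just abc), the second is 6 (abc plus the three newlines).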
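In case you're interested in specific lines, use a result array (note that this splits on any whitespace, so the elements only correspond to lines when no line contains blanks):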
declare RESULT=($(./myscript)) # (..) = array
echo "First line: ${RESULT[0]}"
echo "Second line: ${RESULT[1]}"
echo "N-th line: ${RESULT[N]}"
In addition to the answer given by @l0b0, I just had a situation where I needed both to keep any trailing newlines output by the script and to check the script's return code.
The problem with l0b0's answer is that the 'echo x' was resetting $? back to zero, so I managed to come up with this very cunning solution:
RESULTX="$(./myscript; echo x$?)"
RETURNCODE=${RESULTX##*x}
RESULT="${RESULTX%x*}"
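A self-contained way to try this out, with myscript replaced by a stand-in function (an assumption for the demo) that prints two lines plus an extra newline and returns 3:

```shell
myscript() { printf 'abc\ndef\n\n'; return 3; }

RESULTX="$(myscript; echo "x$?")"   # output, then x, then the exit status
RETURNCODE="${RESULTX##*x}"         # everything after the last x
RESULT="${RESULTX%x*}"              # everything before the last x

echo "$RETURNCODE"
echo "${#RESULT}"
```

Because ## removes the longest prefix and % the shortest suffix, both expansions cut at the last x, so an x inside the script's own output does no harm.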
Parsing multiple output
Introduction
So your myscript outputs 3 lines; it could look like:
myscript() { echo $'abc\ndef\nghi'; }
or
myscript() { local i; for i in abc def ghi ;do echo $i; done ;}
OK, this is a function, not a script (no need for the ./ path), but the output is the same:
myscript
abc
def
ghi
Considering result code
To check for result code, test function will become:
myscript() { local i;for i in abc def ghi ;do echo $i;done;return $((RANDOM%128));}
1. Storing multiple output in one single variable, showing newlines
Your operation is correct:
RESULT=$(myscript)
About result code, you could add:
RCODE=$?
even in same line:
RESULT=$(myscript) RCODE=$?
Then
echo $RESULT $RCODE
abc def ghi 66
echo "$RESULT"
abc
def
ghi
echo ${RESULT@Q}
$'abc\ndef\nghi'
printf '%q\n' "$RESULT"
$'abc\ndef\nghi'
but for showing variable definition, use declare -p:
declare -p RESULT RCODE
declare -- RESULT="abc
def
ghi"
declare -- RCODE="66"
2. Parsing multiple output in array, using mapfile
Storing answer into myvar variable:
mapfile -t myvar < <(myscript)
echo ${myvar[2]}
ghi
Showing $myvar:
declare -p myvar
declare -a myvar=([0]="abc" [1]="def" [2]="ghi")
Considering result code
In case you have to check for result code, you could:
RESULT=$(myscript) RCODE=$?
mapfile -t myvar <<<"$RESULT"
declare -p myvar RCODE
declare -a myvar=([0]="abc" [1]="def" [2]="ghi")
declare -- RCODE="40"
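A deterministic variant of the above, for trying it out: here myscript is a stand-in that always returns 7 instead of a random value, and the status is captured with || so that the snippet also behaves under set -e. Both choices are assumptions of this sketch, not part of the answer.

```shell
myscript() { printf '%s\n' abc def ghi; return 7; }

RCODE=0
RESULT=$(myscript) || RCODE=$?    # keep the output and remember a non-zero status
mapfile -t myvar <<<"$RESULT"     # one array element per line

echo "${#myvar[@]} lines, second is ${myvar[1]}, exit status $RCODE"
```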
3. Parsing multiple output by consecutive reads in a command group
{ read firstline; read secondline; read thirdline;} < <(myscript)
echo $secondline
def
Showing variables:
declare -p firstline secondline thirdline
declare -- firstline="abc"
declare -- secondline="def"
declare -- thirdline="ghi"
I often use:
{ read foo;read foo total use free foo ;} < <(df -k /)
Then
declare -p use free total
declare -- use="843476"
declare -- free="582128"
declare -- total="1515376"
Considering result code
Same prepended step:
RESULT=$(myscript) RCODE=$?
{ read firstline; read secondline; read thirdline;} <<<"$RESULT"
declare -p firstline secondline thirdline RCODE
declare -- firstline="abc"
declare -- secondline="def"
declare -- thirdline="ghi"
declare -- RCODE="50"
After trying most of the solutions here, the easiest thing I found was the obvious - using a temp file. I'm not sure what you want to do with your multiple line output, but you can then deal with it line by line using read. About the only thing you can't really do is easily stick it all in the same variable, but for most practical purposes this is way easier to deal with.
./myscript.sh > /tmp/foo
while read line ; do
echo 'whatever you want to do with $line'
done < /tmp/foo
Quick hack to make it do the requested action:
result=""
./myscript.sh > /tmp/foo
while read line ; do
result="$result$line\n"
done < /tmp/foo
echo -e $result
Note this adds an extra line. If you work on it you can code around it, I'm just too lazy.
EDIT: While this case works perfectly well, people reading this should be aware that a command inside the while loop can easily swallow your stdin, giving you a script that processes one line, drains stdin, and exits. I think ssh does this. I saw it recently; other code examples are here: https://unix.stackexchange.com/questions/24260/reading-lines-from-a-file-with-bash-for-vs-while
One more time! This time with a different filehandle (stdin, stdout, stderr are 0-2, so we can use &3 or higher in bash).
result=""
./test>/tmp/foo
while read line <&3; do
result="$result$line\n"
done 3</tmp/foo
echo -e $result
You can also use mktemp, but this is just a quick code example. Usage for mktemp looks like:
filenamevar=`mktemp /tmp/tempXXXXXX`
./test > $filenamevar
Then use $filenamevar like you would the actual name of a file. Probably doesn't need to be explained here but someone complained in the comments.
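Putting the pieces together, a sketch that combines mktemp with the separate file descriptor might look like this. The printf line stands in for ./myscript.sh, and the variable names are chosen for the example; real newlines are appended instead of \n so that no echo -e is needed.

```shell
tmpfile=$(mktemp)                        # unique temp file, name chosen by mktemp
printf '%s\n' abc def ghi > "$tmpfile"   # stand-in for: ./myscript.sh > "$tmpfile"

result=""
while IFS= read -r line <&3; do          # read from fd 3, leaving stdin untouched
    result+="$line"$'\n'
done 3< "$tmpfile"
rm -f "$tmpfile"

printf '%s' "$result"
```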
How about this? It reads each line into a variable that can then be used subsequently.
Say the output of myscript is redirected to a file called myscript_output:
awk 'BEGIN { while ((getline var < "myscript_output") > 0) { print var } close("myscript_output") }'
