String manipulation required - shell

Here is a sample string. I would like to get output from it in the format shown below.
String:
/vob/TEST/.##/main/ch_vobsweb/1/VOBSWeb/main/ch_vobsweb/4/VobsWebUI/main/ch_vobsweb/2/VaultWeb/main/ch_vobsweb/2/func.js
filename;path to file
func.js;VOBSWeb/VobsWebUI/VaultWeb/func.js
The filename is at the end of the whole string, and its path is built from the component that follows each numeric element (e.g. /1/VOBSWeb/, then /4/VobsWebUI/, then /2/VaultWeb/).

One way:
$ string="/vob/TEST/.##/main/ch_vobsweb/1/VOBSWeb/main/ch_vobsweb/4/VobsWebUI/main/ch_vobsweb/2/VaultWeb/main/ch_vobsweb/2/func.js"
$ path=$(echo "$string" | sed "s|\/[0-9]\/|\n|g"|sed 's|\/.*||' | tr "\n" "/"|sed 's/\/$//')
$ echo ${path##*/}
func.js
$ echo ${path%\/*}
/VOBSWeb/VobsWebUI/VaultWeb
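If you want the exact filename;path line from the question in one go, here is a rough awk-based sketch. It assumes that every path component you want to keep directly follows a purely numeric component, and that no real directory is named with digits only; the variable names path, file and result are just for illustration.

```shell
string="/vob/TEST/.##/main/ch_vobsweb/1/VOBSWeb/main/ch_vobsweb/4/VobsWebUI/main/ch_vobsweb/2/VaultWeb/main/ch_vobsweb/2/func.js"

# Split on "/" and keep every field whose predecessor is all digits.
path=$(printf '%s\n' "$string" | awk -F'/' '{
    out = ""
    for (i = 2; i <= NF; i++)
        if ($(i-1) ~ /^[0-9]+$/)
            out = out (out == "" ? "" : "/") $i
    print out
}')

file=${path##*/}          # everything after the last slash
result="$file;$path"
echo "$result"
```

Unlike the sed pipeline above, this also copes with version numbers of more than one digit.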

Related

sed capture to get string between slashes

I have a filepath like this: /bing/foo/bar/bin and I want to extract only the string between bing/ and the next slash.
So /bing/foo/bar/bin should just produce "foo".
I tried the following:
echo "/bing/foo/bar/bin" | sed -r 's/.*bing\/(.*)\/.*/\1/'
but this produces "foo/bar" instead of "foo".
Try this command
echo "/bing/foo/bar/bin" | sed -r 's|.*bing/([^/]*)/.*|\1|'
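Your attempt returns "foo/bar" because (.*) is greedy and matches as much as it can; [^/]* stops at the next slash. Using | as the delimiter instead of / is also cleaner here, since the pattern itself contains slashes (see "Delimiters in sed substitution").
sed can use any character as a delimiter: it takes whichever character follows the s.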
or
echo "/bing/foo/bar/bin" | grep -oP "/bing/\K(\w+)"
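If the path is already in a shell variable, you may not need sed or grep at all. A minimal sketch using plain parameter expansion, assuming the string contains /bing/ exactly as in the question (the variable names p, rest and first are made up for the example):

```shell
p="/bing/foo/bar/bin"

rest=${p#*/bing/}    # drop everything up to and including the first "/bing/"
first=${rest%%/*}    # drop everything from the next slash onwards

echo "$first"
```

This avoids spawning an extra process, which can matter inside a loop.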

How to find string from txt file and store in variable in BASH

Requirement is to find a string from txt file and store it to variable.
file look like this(rsa.txt)
Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
Required output (variable name : encstring):
encstring = $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
I tried below code but showing no result
encstring=$(grep -oE '$ENC[^()]*==)' <<< rsa.txt)
With awk, you could try the following. It selects lines matching Encrypted string whose last field contains $ENC, and prints that last field using $NF.
encstring=$(awk '/Encrypted string/ && $NF~/\$ENC/{print $NF}' rsa.txt)
You can use
encstring=$(sed -n 's/.*\(\$ENC(.*)\).*/\1/p' rsa.txt)
# OR
encstring=$(grep -oP '\$ENC\(.*?\)' rsa.txt)
See an online demo:
s='Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)'
encstring=$(sed -n 's/.*\(\$ENC(.*)\).*/\1/p' <<< "$s")
echo "$encstring"
# => $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
The sed -n 's/.*\(\$ENC(.*)\).*/\1/p' command does the following:
-n suppresses the default line output
s/.*\(\$ENC(.*)\).*/\1/ - finds any text, then captures $ENC(...) into Group 1 and then matches the rest of the string, and replaces the match with the Group 1 value
p - prints the result of the substitution.
The grep -oP '\$ENC\(.*?\)' command extracts all $ENC(...) matches, with any chars, as few as possible, between ( and ).
You are searching for ENC followed by 0 or more occurrences of something which is not an opening or closing parenthesis. However, in your input file there is an opening parenthesis right after ENC, so [^()]* matches the empty string. After this you expect the string ==). This would match only for the input ENC==).
You need to escape $ as \$, as an unescaped $ means "end of line" with -E.
Also note that <<< rsa.txt feeds grep the literal text rsa.txt, not the contents of the file; pass the file name as an argument instead.
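Putting those fixes together, a corrected version of the original grep attempt might look like the sketch below. It assumes your grep supports -o (GNU and BSD grep do), and it writes the sample to a temporary file only so the example is self-contained; in your case you would pass rsa.txt directly.

```shell
# Stand-in for rsa.txt so the example runs on its own.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
EOF

# \$ matches a literal dollar sign, \( and \) match literal parentheses,
# and [^()]* matches the payload between them.
encstring=$(grep -oE '\$ENC\([^()]*\)' "$tmp")
rm -f "$tmp"

echo "$encstring"
```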

How to use sed to extract a string [duplicate]

I need to extract a number from the output of a command: cmd. The output is type: 1000
So my question is how to execute the command, store its output in a variable and extract 1000 in a shell script. Also how do you store the extracted string in a variable?
This question has been answered in pieces here before (see "Store output of sed into a variable" and "String contains in Bash"). It would be something like this:
line=$(sed -n '2p' myfile)
echo "$line"
if echo "$line" | grep -q 'type: 1000'; then
echo "It's there!"
fi
EDIT: sed is very limited, you would need to use bash, perl or awk for what you need.
This is a typical use case for grep:
output=$(cmd | grep -o '[0-9]\+')
You can write the output of a command or even a pipeline of commands into a shell variable using so called command substitution:
variable=$(cmd);
In the comments it turned out that the output of cmd contains more lines than just type: 1000. In this case I would suggest sed:
output=$(cmd | sed -n 's/^type: \([0-9]\+\).*/\1/p')
You tagged your question as sed, but the question description does not rule out other tools, so here's a solution using awk. Note that there must be no spaces around = in a shell assignment, and that ': ' is used as the field separator so the number comes out without a leading space.
output=`cmd | awk -F': ' '/type: [0-9]+/{print $2}'`
Alternatively, you can use the newer $( ) syntax. Some find the newer syntax preferable, and it can be conveniently nested without the need for escaping backticks.
output=$(cmd | awk -F': ' '/type: [0-9]+/{print $2}')
If the output is rigidly restricted to "type: " followed by a number, you can just use cut.
var=$(echo 'type: 1000' | cut -f 2 -d ' ')
Obviously you'll have to pipe the output of your command to cut, I'm using echo as a demo.
In addition, I'd use grep and then cut if the string you are searching is more complex. If we assume there can be all kind of numbers in the text, but only one occurrence of "type: " followed by a number, you can use the command:
>> var=$(echo "hello 12 type: 1000 foo 1001" | grep -oE "type: [0-9]+" | cut -f 2 -d ' ')
>> echo $var
1000
You can use the | operator to send the output of one command to another, like so:
printf ' 1\n 2\n 3\n' | grep "2"
This sends three lines to the grep command, which prints the line containing 2. (printf is used because a plain echo in bash does not turn \n into newlines.) It sounds like you might want to do something like:
cmd | grep "type"
Here is a plain sed solution that uses a regular expression to find the number in your string:
cmd | sed 's/^.*type: \([0-9]\+\)/\1/g'
^ anchors the match at the start of the line
.* matches any sequence of characters (including none)
\([0-9]\+\) captures one or more digits
\1 refers to the first captured group (the only one here) and is used as the replacement for the whole match
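To tie the pieces together (run the command, pick out the number, keep it in a variable), here is a small self-contained sketch. The cmd function is only a stand-in for your real command, and its three-line output is invented for the demonstration; [0-9][0-9]* is used instead of \+ so it does not depend on GNU sed.

```shell
# Hypothetical stand-in for the real command.
cmd() {
    printf 'name: demo\ntype: 1000\nsize: 42\n'
}

# -n suppresses normal output; p prints only the line where the
# substitution succeeded, reduced to the captured digits.
output=$(cmd | sed -n 's/^type: \([0-9][0-9]*\).*/\1/p')

echo "$output"
```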

passing sed backreference to base64 command

What I am trying to achieve is pass the Base64 encoded value captured in the sed regex to the base64 and have it decoded.
But the problem is, even though it seems like the correct value is being passed to the function using backreference, base64 complains that the input is invalid.
Following is my script -
#!/bin/bash
decodeBaseSixtyFour() {
echo "$1 is decoded to `echo $1 | base64 -d`"
}
echo Passing direct value ...
echo SGVsbG8gQmFzZTY0Cg== | sed -r "s/(.+)$/$(decodeBaseSixtyFour SGVsbG8gQmFzZTY0Cg==)/"
echo Passing captured value ...
echo SGVsbG8gQmFzZTY0Cg== | sed -r "s/(.+)$/$(decodeBaseSixtyFour \\1)/"
And when ran it produces the following output -
Passing direct value ...
SGVsbG8gQmFzZTY0Cg== is decoded to Hello Base64
Passing captured value ...
base64: invalid input
SGVsbG8gQmFzZTY0Cg== is decoded to
I think the output explains what I mean.
Is it possible to do what I am trying to do? If not, why?
Perl s/// can do what you want, but I don't think what you're asking for is what you need.
$ echo SGVsbG8gQmFzZTY0Cg== | perl -MMIME::Base64 -pe 's/(.+)/decode_base64($1)/e'
Hello Base64
What's actually happening:
echo SGVsbG8gQmFzZTY0Cg== | sed -r "s/(.+)$/$(decodeBaseSixtyFour \\1)/"
Before sed starts reading input, the shell notices the command substitution in the double-quoted string
the decodeBaseSixtyFour function is called with the string \1 (the shell has already reduced \\1 to \1)
base64 chokes on the input \1 and emits the error message
the function returns the string "\1 is decoded to "
now the sed script is 's/(.+)$/\1 is decoded to /' which is how you get the last line.
As I commented, sed cannot do an equivalent of replace_callback, which is essentially what you're trying to do.
The following awk comes close to what you're trying to do (base64 -D is the macOS spelling; GNU coreutils uses base64 -d):
s="My string is SGVsbG8gQmFzZTY0Cg== something"
awk '{for(i=1; i<=NF; i++) if ($i~/==$/) "base64 -D<<<"$i|getline $i}1'<<<"$s"
My string is Hello Base64 something
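If you would rather stay in the shell, one option is to do the "callback" yourself in three steps: extract the token, decode it, then splice the result back in. This is only a sketch under a few assumptions: the string contains a single Base64 token ending in ==, grep supports -o, and base64 accepts -d (GNU coreutils). The variable names are made up.

```shell
s="My string is SGVsbG8gQmFzZTY0Cg== something"

# 1. Pull out the Base64-looking token.
token=$(printf '%s\n' "$s" | grep -oE '[A-Za-z0-9+/]+==')

# 2. Decode it; command substitution strips the trailing newline.
decoded=$(printf '%s' "$token" | base64 -d)

# 3. Splice: text before the token + decoded value + text after it.
#    Quoting "$token" makes the expansion treat it literally.
result=${s%%"$token"*}$decoded${s#*"$token"}

echo "$result"
```

The splice uses parameter expansion rather than sed so that characters such as / or + in the token cannot be misread as part of a sed script.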

How to split a string in bash delimited by tab

I'm trying to split a tab-delimited field in bash.
I am aware of this answer: how to split a string in shell and get the last field
But that does not cover the tab character.
I want to get the part of a string before the tab character, so I'm doing this:
x=`head -1 my-file.txt`
echo ${x%\t*}
But the \t is matching on the letter 't' and not on a tab. What is the best way to do this?
Thanks
If your file looks something like this (with tab as separator):
1st-field 2nd-field
you can use cut to extract the first field (operates on tab by default):
$ cut -f1 input
1st-field
If you're using awk, there is no need to use tail to get the last line. Changing the input to:
1:1st-field 2nd-field
2:1st-field 2nd-field
3:1st-field 2nd-field
4:1st-field 2nd-field
5:1st-field 2nd-field
6:1st-field 2nd-field
7:1st-field 2nd-field
8:1st-field 2nd-field
9:1st-field 2nd-field
10:1st-field 2nd-field
Solution using awk:
$ awk 'END {print $1}' input
10:1st-field
Pure bash-solution:
#!/bin/bash
while read a b;do last=$a; done < input
echo $last
outputs:
$ ./tab.sh
10:1st-field
Lastly, a solution using sed:
$ sed -n '$s/^\([^\t]*\).*$/\1/p' input
10:1st-field
Here, $ is an address that selects the last line only; -n together with the p flag prints just that line.
For your original question, use a literal tab in the pattern, i.e.
x="1st-field 2nd-field"
echo ${x% *}
outputs:
1st-field
Use $'ANSI-C' strings in the parameter expansion:
$ x=$'abc\tdef\tghi'
$ echo "$x"
abc def ghi
$ echo ">>${x%%$'\t'*}<<"
>>abc<<
read field1 field2 <<< ${tabDelimitedField}
or
read field1 field2 <<< $(command_producing_tab_delimited_output)
Use awk.
echo $yourfield | awk '{print $1}'
or, in your case, for the first field from the last line of a file
tail yourfile | awk '{x=$1}END{print x}'
There is an easy way for a tab-separated string: convert it to an array.
Create a string with tabs (the leading $ makes bash interpret \t as a tab):
AAA=$'ABC\tDEF\tGHI'
Split the string into an array using parentheses (note that this splits on spaces and newlines as well as tabs):
BBB=($AAA)
Get access to any element :
echo ${BBB[0]}
ABC
echo ${BBB[1]}
DEF
echo ${BBB[2]}
GHI
x=first$'\t'second
echo "${x%$'\t'*}"
See QUOTING in man bash
The answer from https://stackoverflow.com/users/1815797/gniourf-gniourf hints at the use of built-in field parsing in bash, but does not really complete the answer. Using the IFS shell parameter to set the input field separator completes the picture and lets you parse tab-delimited files with a fixed number of fields in pure bash.
echo -e "a\tb\tc\nd\te\tf" > myfile
while IFS='<literaltab>' read f1 f2 f3;do echo "$f1 = $f2 + $f3"; done < myfile
a = b + c
d = e + f
Where, of course, <literaltab> is replaced by a real tab, not \t. Often, Control-V Tab does this in a terminal.
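If typing a literal tab is awkward, you can also build one with printf and keep it in a variable. A minimal sketch of the same loop, using a temporary file in place of myfile so it runs on its own (tab, tmp and out are names chosen for the example):

```shell
tab=$(printf '\t')          # a real tab character, no Control-V needed

tmp=$(mktemp)
printf 'a\tb\tc\nd\te\tf\n' > "$tmp"

# IFS is set only for the read command, so the rest of the script
# keeps the default field splitting.
out=$(while IFS="$tab" read -r f1 f2 f3; do
    echo "$f1 = $f2 + $f3"
done < "$tmp")
rm -f "$tmp"

echo "$out"
```

One caveat: tab counts as IFS whitespace, so consecutive tabs are treated as a single separator and empty fields are not preserved.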
