How to split a string by underscore and extract an element as a variable in bash? - bash

Suppose I have a string such as s=DNA128533_mutect2_filtered.vcf.gz. How can I extract DNA128533 into an ID variable?
I tried
id=(cut -d_ -f1 <<< ${s}
echo $id
It doesn't seem to work. Any suggestions? Thanks.

There is no need to fork a sub-shell to call cut -d'_' -f1, or to rely on the Bash-specific <<< "$s" here-string.
POSIX shell parameter expansion can strip the trailing elements directly, so the following also works in shells other than Bash:
#!/usr/bin/env sh
s=DNA128533_mutect2_filtered.vcf.gz
id=${s%%_*}
echo "$id"
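As a small sketch of how the related expansion operators behave on the same sample string (the variable names first, head, tail and last are made up for illustration):

```shell
#!/usr/bin/env sh
s=DNA128533_mutect2_filtered.vcf.gz

first=${s%%_*}   # remove the longest suffix matching "_*"  -> DNA128533
head=${s%_*}     # remove the shortest suffix matching "_*" -> DNA128533_mutect2
tail=${s#*_}     # remove the shortest prefix matching "*_" -> mutect2_filtered.vcf.gz
last=${s##*_}    # remove the longest prefix matching "*_"  -> filtered.vcf.gz

printf '%s\n' "$first" "$head" "$tail" "$last"
```

A doubled operator (%% or ##) is greedy; a single one (% or #) removes as little as possible.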

You want to filter the DNA... part out of the filename. Therefore:
s="DNA128533_mutect2_filtered.vcf.gz"
id=$(echo "$s" | cut -d'_' -f1)
echo "$id"
If you want to use your way of doing it (with <<<), do this:
id=$(cut -d'_' -f1 <<< "$s")
echo "$id"
Your command has some syntax issues, like you are missing ).
And you want the output of the command to be stored in variable id, so you have to make it run via the $( ) syntax.

IFS is the shell's field separator. Setting it just for the read command splits the string into an array:
IFS='_' read -r -a array <<< "a_b_c_d"
echo "${array[0]}"
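Applied to the string from the question, this could look like the sketch below. It needs Bash (arrays and here-strings are not POSIX), and the array name parts is only an illustrative choice:

```shell
#!/usr/bin/env bash
s=DNA128533_mutect2_filtered.vcf.gz

# IFS is changed only for this one read command, not for the rest of the script.
IFS='_' read -r -a parts <<< "$s"

id=${parts[0]}          # first field
count=${#parts[@]}      # number of fields

echo "$id"
echo "$count"
```

This is mostly useful when you need several of the fields; for the first field alone, ${s%%_*} is shorter.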

Related

Script : substitute value in script and display it

I have this sql command : insert into users(username, password) values ($username, $password)
I want to display this line for every user
this is my script
#!/bin/bash
for name in $(cat /etc/passwd | cut -d: -f1)
do
pass= sudo grep -w $name /etc/shadow | cut -d: -f2
echo 'insert into `users`(`username`, `password`) values ($name, $pass)'
done
But when I execute the script, it doesn't do the substitution.
As root (sudo -s):
#!/bin/bash
while read name; do
pass=$(grep -w "$name" /etc/shadow | cut -d: -f2)
echo "INSERT INTO \`users\`(\`username\`, \`password\`) VALUES ($name, $pass)"
done < <(cut -d: -f1 /etc/passwd)
Notes
If you are a bash beginner, here are some good pointers to start learning:
FAQ,
Guide,
Ref,
bash hackers,
quotes,
Check your script
And avoid recommendations to learn from the tldp.org web site: the TLDP Bash guide (ABS in particular) is outdated, and in some cases just plain wrong. The BashGuide and the bash-hackers' wiki are far more reliable.
Learn how to quote properly in shell, it's very important:
"Double quote" every literal that contains spaces/metacharacters and every expansion: "$var", "$(command "$var")", "${array[@]}", "a & b". Use 'single quotes' for code or literal $'s: 'Costs $5 US', ssh host 'echo "$HOSTNAME"'. See
http://mywiki.wooledge.org/Quotes
http://mywiki.wooledge.org/Arguments
http://wiki.bash-hackers.org/syntax/words
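The quoting rule above is what the original script tripped over. A minimal sketch of the difference, with placeholder values for name and pass:

```shell
#!/usr/bin/env sh
name=alice
pass=secret   # placeholder value, not a real password hash

single='values ($name, $pass)'   # single quotes: nothing is expanded
double="values ($name, $pass)"   # double quotes: variables are expanded

printf '%s\n' "$single" "$double"
```

The first line printed still contains the literal text $name and $pass; only the second contains the values.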
Here I have fixed the issues with your script. Please check if it works now :
#!/bin/bash
while read name
do
pass=$( sudo grep -w "$name" /etc/shadow | awk -F':' '{print $2}' )
echo "insert into 'users'('username', 'password') values ($name, $pass)"
done <<< "$(awk -F':' '{print $1}' /etc/passwd)"
Regards!
Try this Shellcheck-clean pure Bash code, which needs to be run as root:
#! /bin/bash -p
while IFS=: read -r name pass _ ; do
printf "insert into users (username, password) values ('%s', '%s')\\n" \
"$name" "$pass"
done </etc/shadow
/etc/shadow should contain the same users as /etc/passwd, so the code doesn't use /etc/passwd.
See BashFAQ/001 (How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?) for an explanation of how while IFS=: read -r ... works. It also explains the use of _ as a "junk variable".
See the accepted, and excellent, answer to Why is printf better than echo? for an explanation of why the code uses printf instead of echo.
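To try the field-splitting loop without touching /etc/shadow, here is a sketch that feeds it made-up sample lines. The function name emit_inserts and the HASH1/HASH2 values are hypothetical:

```shell
#!/usr/bin/env sh
# emit_inserts (hypothetical helper): reads colon-separated lines on stdin
# and prints one SQL statement per line.
emit_inserts() {
    # name gets field 1, pass gets field 2, _ swallows everything else.
    while IFS=: read -r name pass _; do
        printf "insert into users (username, password) values ('%s', '%s')\n" \
            "$name" "$pass"
    done
}

# Sample input in the same layout as /etc/shadow.
emit_inserts <<'EOF'
alice:HASH1:19000:0:99999:7:::
bob:HASH2:19000:0:99999:7:::
EOF
```

Note that this does no SQL escaping, so it is only suitable for values known not to contain single quotes.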

bash to extract and store after second underscore

I am trying to use bash to extract after the second _ and store that in a variable pref. I am using a loop so the below is not completely accurate, but the file structure/format is.
I can extract everything before the first _ using pref=${bname%%_*}, but can't seem to change it to work for the second _. Thank you :).
file to extract from
00-0000_Last-First_base_counts_FBN1.txt
desired output
00-0000_Last-First
bash
pref=${bname%%_}; pref=${bname%%_*.txt}
Using cut with _ as delimiter get 1st and 2nd fields:
s='00-0000_Last-First_base_counts_FBN1.txt'
cut -d_ -f1-2 <<< "$s"
00-0000_Last-First
To store in a variable:
pref=$(cut -d_ -f1-2 <<< "$s")
GNU sed and grep
$ sed -r 's/([^_]+_[^_]*).*/\1/' <<<"00-0000_Last-First_base_counts_FBN1.txt"
00-0000_Last-First
$ sed 's/_[^_]*//2g' <<< "00-0000_Last-First_base_counts_FBN1.txt"
00-0000_Last-First
$ grep -o "^[^_]*_\?[^_]*" <<<"00-0000_Last-First_base_counts_FBN1.txt"
00-0000_Last-First
To store in variable
somevar="00-0000_Last-First_base_counts_FBN1.txt";
pref=$(sed 's/_[^_]*//2g' <<< "$somevar")
As well as with sed, awk and cut, you can achieve this with expr:
$ filename=00-0000_Last-First_base_counts_FBN1.txt
$ echo $(expr match "$filename" '^\([^_]*_[^_]*\)')
00-0000_Last-First
This echoes the capture group within the \( and \) of the regular expression.
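For completeness, the same result can be had with parameter expansion alone, without an external command. This is only a sketch and assumes the name contains at least one underscore; first and rest are illustrative variable names:

```shell
#!/usr/bin/env sh
bname=00-0000_Last-First_base_counts_FBN1.txt

first=${bname%%_*}           # text before the first underscore
rest=${bname#*_}             # text after the first underscore
pref=${first}_${rest%%_*}    # first field, underscore, second field

echo "$pref"
```

The idea is to peel off the first field, then take the first field of what remains.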

How to split the contents of `$PATH` into distinct lines?

Suppose echo $PATH yields /first/dir:/second/dir:/third/dir.
Question: How does one echo the contents of $PATH one directory at a time as in:
$ newcommand $PATH
/first/dir
/second/dir
/third/dir
Preferably, I'm trying to figure out how to do this with a for loop that issues one instance of echo per instance of a directory in $PATH.
echo "$PATH" | tr ':' '\n'
Should do the trick. This simply takes the output of echo "$PATH" and replaces every colon with a newline.
Note that the quotation marks around $PATH prevent word splitting, so multiple successive spaces in the value of $PATH are not collapsed in the output.
As an additional option (and in case you need the entries in an array for some other purpose) you can do this with a custom IFS and read -a:
IFS=: read -r -a patharr <<<"$PATH"
printf %s\\n "${patharr[@]}"
Or since the question asks for a version with a for loop:
for dir in "${patharr[@]}"; do
echo "$dir"
done
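A quick way to convince yourself that the array version copes with spaces is to run it on a made-up value instead of the real $PATH. This is a Bash-only sketch, and sample_path is an invented variable:

```shell
#!/usr/bin/env bash
# Invented sample value; the second entry contains a space on purpose.
sample_path='/first/dir:/second dir:/third/dir'

IFS=: read -r -a patharr <<< "$sample_path"

# Quoting "${patharr[@]}" keeps each entry as one word.
for dir in "${patharr[@]}"; do
    printf '%s\n' "$dir"
done
```

Each entry is printed on its own line, including the one with the embedded space.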
How about this:
echo "$PATH" | sed -e 's/:/\n/g'
(See sed's s command; sed -e 'y/:/\n/' will also work, and is equivalent to the tr ":" "\n" from some other answers.)
It's preferable not to complicate things unless absolutely necessary: a for loop is not needed here. There are other ways to execute a command for each entry in the list, more in line with the Unix Philosophy:
This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
such as:
echo "$PATH" | sed -e 's/:/\n/g' | xargs -n 1 echo
This is functionally equivalent to a for-loop iterating over the PATH elements, executing that last echo command for each element. The -n 1 tells xargs to supply only 1 argument to its command; without it we would get the same output as echo "$PATH" | sed -e 'y/:/ /'.
Since this uses xargs, which has built-in support to split the input, and echoes the input if no command is given, we can write that as:
echo -n "$PATH" | xargs -d ':' -n 1
The -d ':' tells xargs to use : to separate its input rather than a newline, and the -n tells echo not to write a newline; otherwise we end up with a blank trailing line.
here is another shorter one:
echo -e ${PATH//:/\\n}
You can use tr (translate) to replace the colons (:) with newlines (\n), and then iterate over that in a for loop.
directories=$(echo $PATH | tr ":" "\n")
for directory in $directories
do
echo $directory
done
My idea is to use echo and awk.
echo "$PATH" | awk 'BEGIN {FS=":"} {for (i=1; i<=NF; i++) print $i}'
EDIT
This command is better than my former idea.
echo "$PATH" | awk 'BEGIN {FS=":"; OFS="\n"} {$1=$1; print $0}'
If you can guarantee that PATH does not contain embedded spaces, you can:
for dir in ${PATH//:/ }; do
echo $dir
done
If there are embedded spaces, this will fail badly.
# preserve the existing internal field separator
OLD_IFS=${IFS}
# define the internal field separator to be a colon
IFS=":"
# do what you need to do with $PATH
for DIRECTORY in ${PATH}
do
echo ${DIRECTORY}
done
# restore the original internal field separator
IFS=${OLD_IFS}
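A variation on the same idea, sketched here as an assumption rather than a recommendation from the answer: run the loop in a subshell so that IFS never needs restoring, and turn off globbing so entries containing * or ? are not expanded. split_path is a hypothetical function name:

```shell
#!/usr/bin/env sh
# The function body is a subshell ( ... ), so the IFS and set -f changes
# disappear when the function returns.
split_path() (
    IFS=:
    set -f                      # no pathname expansion on the unquoted $1
    for dir in $1; do           # deliberately unquoted: split on colons
        printf '%s\n' "$dir"
    done
)

split_path "$PATH"
```

Because only colons are in IFS, directories with embedded spaces stay intact.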

easy way to split a string like this in bash?

Now I have strings in the form "temp:10" and I use temp=$(echo $str|awk '{split($0,array,":")} END{print array[1]}') to split them, which is overkill and slow. There must be a simpler way to do this?
Use bash's parameter expansion with suffix removal:
temp=${str%%:*}
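If the part after the colon is wanted too, the matching prefix removal gives it. A small sketch, where name and value are just illustrative variable names:

```shell
#!/usr/bin/env sh
str="temp:10"

name=${str%%:*}    # drop everything from the first colon on
value=${str#*:}    # drop everything up to and including the first colon

echo "$name"
echo "$value"
```

Neither expansion starts an external process, which is where the speed-up over awk comes from.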
There's also the read command:
$ str="temp:10"
$ IFS=: read before after <<< "$str"
$ echo "$before"
temp
$ echo "$after"
10
If I understand you right, you need the value before the :, temp in this example. If so, then you can use the cut command:
temp=$(echo "$str" | cut -d':' -f1)

How to handle variables that contain ";"?

I have a configuration file that contains lines like "hallo;welt;" and I want to do a grep on this file.
Whenever I try something like grep "$1;$2" my.config, echo "$1;$2", or even line="$1;$2", my script fails with something like:
: command not found95: line 155: =hallo...
How can I tell bash to ignore ; while evaluating "..." blocks?
EDIT: an example of my code.
# find entry
$line=$(grep "$1;$2;" $PERMISSIONSFILE)
# splitt line
reads=$(echo $line | cut -d';' -f3)
writes=$(echo $line | cut -d';' -f4)
admins=$(echo $line | cut -d';' -f5)
# do some stuff on the permissions
# replace old line with new line
nline="$1;$2;$reads;$writes;$admins"
sed -i "s/$line/$nline/g" $TEMPPERM
my script should be called like this: sh script "table" "a.b.*.>"
EDIT: another, simpler example
$test=$(grep "$1;$2;" temp.authorization.config)
the temp file:
table;pattern;read;write;stuff
the call sh test.sh table pattern results in: : command not foundtable;pattern;read;write;stuff
Don't use $ on the left side of an assignment in bash -- if you do it'll substitute the current value of the variable rather than assigning to it. That is, use:
test=$(grep "$1;$2;" temp.authorization.config)
instead of:
$test=$(grep "$1;$2;" temp.authorization.config)
Edit: also, variable expansions should be in double-quotes unless there's a good reason otherwise. For example, use:
reads=$(echo "$line" | cut -d';' -f3)
instead of:
reads=$(echo $line | cut -d';' -f3)
This doesn't matter for semicolons, but does matter for spaces, wildcards, and a few other things.
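To see the effect of the missing quotes, here is a sketch with an invented line whose third field contains runs of spaces:

```shell
#!/usr/bin/env sh
# Invented sample line; field 3 has leading and doubled spaces.
line='table;pattern;  two  spaces;w;a'

unquoted=$(echo $line | cut -d';' -f3)     # word splitting collapses the spaces
quoted=$(echo "$line" | cut -d';' -f3)     # value is passed through unchanged

printf '[%s]\n' "$unquoted" "$quoted"
```

The brackets in the output make it visible that the unquoted version lost whitespace.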
A ; inside quotes has no meaning at all for bash. However, if $1 contains a doublequote itself, then you'll end up with
grep "something";$2"
which'll be parsed by bash as two separate commands:
grep "something" ; other"
^---command 1----^ ^----command 2---^
Please show exactly what your script is doing around the spot where the error occurs, and what data you're feeding into it.
Counter-example:
$ cat file.txt
hello;welt;
hello;world;
hell;welt;
$ cat xx.sh
grep "$1;$2" file.txt
$ bash -x xx.sh hello welt
+ grep 'hello;welt' file.txt
hello;welt;
$
You have not yet classified your problem accurately.
If you try to assign the result of grep to a variable (like I do) your example breaks.
Please show what you mean. Using the same data file as before and doing an assignment, this is the output I get:
$ cat xx.sh
grep "$1;$2" file.txt
output=$(grep "$1;$2" file.txt)
echo "$output"
$ bash -x xx.sh hello welt
+ grep 'hello;welt' file.txt
hello;welt;
++ grep 'hello;welt' file.txt
+ output='hello;welt;'
+ echo 'hello;welt;'
hello;welt;
$
Seems to work for me. It also demonstrates why the question needs an explicit, complete, executable, minimal example so that we can see what the questioner is doing that is different from what people answering the question think is happening.
I see you've provided some sample code:
# find entry
$line=$(grep "$1;$2;" $PERMISSIONSFILE)
# splitt line
reads=$(echo $line | cut -d';' -f3)
writes=$(echo $line | cut -d';' -f4)
admins=$(echo $line | cut -d';' -f5)
The line $line=$(grep ...) is wrong. You should omit the $ before line. The shell does not treat $line=... as an assignment, because $line is not a valid variable name on the left-hand side. Instead it expands $line (which is empty at that point) and then tries to run the resulting word, which starts with =, as a command. That is where the 'command not found' message containing =hallo... comes from.
For safety if nothing else, I would also enclose the $line values in double quotes in the echo lines. It may not strictly be necessary, but it is simple protective programming.
The changes lead to:
# find entry
line=$(grep "$1;$2;" $PERMISSIONSFILE)
# split line
reads=$( echo "$line" | cut -d';' -f3)
writes=$(echo "$line" | cut -d';' -f4)
admins=$(echo "$line" | cut -d';' -f5)
The rest of your script was fine.
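What the stray $ does can be checked in isolation. In this sketch the faulty form runs in a subshell with its error message hidden, and only the exit status is kept:

```shell
#!/usr/bin/env sh
unset line

# Faulty form: with line unset this expands to the word "=hello",
# which the shell then tries to run as a command.
( $line=hello ) 2>/dev/null
status=$?            # 127 is the shell's "command not found" status

# Correct form: a plain assignment, no $ on the left-hand side.
line=hello

echo "$status"
echo "$line"
```

Without the 2>/dev/null you would see an error of the same shape as the one quoted in the question.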
It seems like you are trying to read a semicolon-delimited file and identify a line starting with 'table;pattern;', where table is a string you specify and pattern is a regular expression grep will understand. Once the line is identified, you wish to replace the 3rd, 4th and 5th fields with different data and write the updated line back to the file.
Does this sound correct?
If so, try this code
#!/bin/bash
in_table="$1"
in_pattern="$2"
file="$3"
while IFS=';' read -r -d$'\n' tuple pattern reads writes admins ; do
line=$(cut -d: -f1<<<"$tuple")
table=$(cut -d: -f2<<<"$tuple")
# do some stuff with the variables
# e.g., update the values
reads=1
writes=2
admins=12345
# replace the old line with the new line
sed -i'' "$line"'{i\
'"$table;$pattern;$reads;$writes;$admins"'
;d;}' "$file"
done < <(grep -n '^'"${in_table}"';'"${in_pattern}"';' "${file}")
I chose to update by line number here to avoid problems of unknown characters in the left hand of the substitution.
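The replace-by-line-number idea can also be sketched with awk on a throwaway file, which avoids sed's multi-line i\ syntax. This is an alternative technique, not the answer's code; it assumes exactly one matching line and a replacement without backslashes (awk -v interprets escape sequences). The file contents and the variable names n and new are invented:

```shell
#!/usr/bin/env sh
# Build a small sample file (remove it with rm -f "$file" when done).
file=$(mktemp)
printf '%s\n' 'table;pattern;read;write;stuff' 'other;x;r;w;a' > "$file"

# Find the line number of the entry, then build the replacement line.
n=$(grep -n '^table;pattern;' "$file" | cut -d: -f1)
new='table;pattern;1;2;12345'

# Print the new text for line n and every other line unchanged.
awk -v n="$n" -v new="$new" 'NR == n { print new; next } { print }' "$file" \
    > "$file.new" && mv "$file.new" "$file"

cat "$file"
```

Writing to a second file and renaming it keeps the original intact if awk fails part-way.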
