Replacing part of a string in bash using sed [duplicate] - bash

This question already has answers here:
unix sed substitute nth occurence misfunction?
(3 answers)
Closed 4 years ago.
In bash, suppose I have the input:
ATGTGSDTST
and I want to print:
AT
ATGT
ATGTGSDT
ATGTGSDTST
which means that I need to find every leading substring (prefix) that ends with 'T' and print it.
I thought I should use sed inside a for loop, but I don't understand how to use sed correctly in this case.
Any help?
Thanks

The following script uses sed:
#!/usr/bin/env bash
pattern="ATGTGSDTST"
sub="T"
# Get number of T in $pattern:
num=$(grep -o "T" <<< "$pattern" | wc -l)
i=1
text=$(sed -n "s/T.*/T/p" <<< "$pattern")
echo "$text"
while [ "$i" -lt "$num" ]; do
    text=$(sed -n "s/\($sub[^T]\+T\).*/\1/p" <<< "$pattern")
    sub=$text
    echo "$text"
    ((i++))
done
gives output:
AT
ATGT
ATGTGSDT
ATGTGSDTST

No sed needed, just use parameter expansion:
#! /bin/bash
string=ATGTGSDTST
length=${#string}
prefix=''
while (( ${#prefix} != $length )) ; do
    sub=${string%%T*}
    sub+=T
    echo "$prefix$sub"
    string=${string#$sub}
    prefix+=$sub
done
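To make the two expansions in the loop easier to follow, here is a quick interactive trace with the same string (illustration only; the loop above is unchanged):
$ string=ATGTGSDTST
$ echo "${string%%T*}T"   # text before the first T, with the T re-appended
AT
$ echo "${string#AT}"     # remainder after stripping the prefix just printed
GTGSDTST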

Related

How to use multiple variable in for loop in sh? [duplicate]

This question already has answers here:
How do I pipe a file line by line into multiple read variables?
(3 answers)
Closed 18 days ago.
I want to use multiple variables at once in a for loop in sh.
I have a query like this:
top -n 1 -b -c| awk -vOFS="\t" '{print $1,$2,$9}'
I know how to use a for loop in bash like this:
for i in {2..10}
do
    echo "output: $i"
done
What I want to try is:
for x y z in $(top -n 1 -b -c | awk -vOFS="\t" '{print $1,$2,$9}')
do
    echo "output: $x $y $z"
done
Pipe to a while read loop:
top -n 1 -b -c | awk -vOFS="\t" '{print $1,$2,$9}' | while IFS=$'\t' read -r x y z
do
    echo "output: $x $y $z"
done
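Note that the pipe runs the while loop in a subshell, so any variables set inside it are gone once the loop ends. If you need the values afterwards, one option is process substitution (a sketch, same top/awk pipeline assumed):
while IFS=$'\t' read -r x y z
do
    echo "output: $x $y $z"
done < <(top -n 1 -b -c | awk -vOFS="\t" '{print $1,$2,$9}')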

How to cut variables which are between quotes from a string

I had a problem with cutting variables from a string between " quotes. I have some scripts to write for my sys classes, and I had a problem with one in which I had to read input from the user in the form of (a="var1", b="var2").
I tried the code below
#!/bin/bash
read input
a=$($input | cut -d '"' -f3)
echo $a
it returns a "command not found" error on line 3. I tried double parentheses like
a=$(($input | cut -d '"' -f3)
but it's still wrong.
In a comment the OP gave a working answer (should post it as an answer):
#!/bin/bash
read input
a=$(echo $input | cut -d '"' -f2)
b=$(echo $input | cut -d '"' -f4)
echo sum: $(( a + b))
echo difference: $(( a - b))
This will work for user input that is exactly like a="8", b="5".
Never trust input.
You might want to add the check
if [[ ${input} =~ ^[a-z]+=\"[0-9]+\",\ [a-z]+=\"[0-9]+\"$ ]]; then
    echo "Use your code"
else
    echo "Incorrect input"
fi
And when you add a check, you might want to execute the input (after replacing the comma with a semicolon).
input='testa="8", testb="5"'
if [[ ${input} =~ ^[a-z]+=\"[0-9]+\",\ [a-z]+=\"[0-9]+\"$ ]];
then
    eval $(tr "," ";" <<< ${input})
    set | grep -E "^test[ab]="
else
    echo no
fi
EDIT:
@PesaThe commented correctly about BASH_REMATCH:
When you use bash and a test on the input you can use
if [[ ${input} =~ ^[a-z]+=\"([0-9]+)\",\ [a-z]+=\"([0-9]+)\"$ ]];
then
    a="${BASH_REMATCH[1]}"
    b="${BASH_REMATCH[2]}"
fi
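For example, a quick check of the capture groups (sample input assumed to match the format from the question):
input='a="8", b="5"'
if [[ ${input} =~ ^[a-z]+=\"([0-9]+)\",\ [a-z]+=\"([0-9]+)\"$ ]]; then
    echo "a=${BASH_REMATCH[1]}, b=${BASH_REMATCH[2]}"   # prints: a=8, b=5
fi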
To extract the digit 1 from a string "var1", you would most likely use a Bash substring replacement:
$ s="var1"
$ echo "${s//[^0-9]/}"
1
Or,
$ a="${s//[^0-9]/}"
$ echo "$a"
1
This works by replacing any non-digits in the string with nothing, which works in your example with a single number field in the string but may not be what you need if you have multiple number fields:
$ s2="1 and a 2 and 3"
$ echo "${s2//[^0-9]/}"
123
In this case, you would use sed, grep, awk, or a Bash regex to capture the individual number fields and keep them distinct:
$ echo "$s2" | grep -o -E '[[:digit:]]+'
1
2
3
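If the individual numbers are needed in shell variables rather than just printed, one option (a sketch, assuming bash 4+ for mapfile) is to collect the grep output into an array:
s2="1 and a 2 and 3"
mapfile -t nums < <(grep -o -E '[[:digit:]]+' <<< "$s2")
echo "${nums[1]}"   # prints: 2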

Remove trailing `,` from bash output [duplicate]

This question already has answers here:
How can I join elements of a Bash array into a delimited string?
(34 answers)
Closed 4 years ago.
I have a script that parses a command:
while read line; do
    # The third token is either IP or protocol name with '['
    token=`echo $line | awk '{print $3}'`
    last_char_idx=$((${#token}-1))
    last_char=${token:$last_char_idx:1}
    # Case 1: It is the protocol name
    if [[ "$last_char" = "[" ]]; then
        # This is a protocol. Therefore, port is token 4
        port=`echo $line | awk '{print $4}'`
        # Shave off the last character
        port=${port::-1}
    else
        # token is ip:port. Awk out the port
        port=`echo $token | awk -F: '{print $2}'`
    fi
    PORTS+=("$port")
done < <($COMMAND | egrep "^TCP open")
for p in "${PORTS[@]}"; do
    echo -n "$p, "
done
This prints out ports like:
80,443,8080,
The problem is the trailing `,`.
How can I get the output to not have a trailing `,` after the last port?
Thanks
${array[*]} uses the first character in IFS to join elements.
IFS=,
echo "${PORTS[*]}"
If you don't want to change IFS, you can instead use:
printf -v ports_str '%s,' "${PORTS[@]}"
echo "${ports_str%,}"
...or, simplified from a suggestion by Stefan Hamcke:
printf '%s' "${PORTS[0]}"; printf ',%s' "${PORTS[@]:1}"
...changing the echo to printf '%s' "${ports_str%,}" if you don't want a trailing newline after the last port. (echo -n is not recommended; see discussion in the APPLICATION USAGE of the POSIX spec for echo).
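Alternatively, the IFS change can be confined to a subshell so the parent shell's IFS is left untouched (a minimal sketch, with an example PORTS array for illustration):
PORTS=(80 443 8080)
( IFS=','; echo "${PORTS[*]}" )   # prints: 80,443,8080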
How about:
$ echo "${PORTS[@]}" | tr ' ' ','
Why not simply:
( for p in "${PORTS[@]}"; do
    echo -n "$p, "
done ) | sed -e 's/, $//'

How to extract multiple text and numbers from a string using sed? [duplicate]

This question already has answers here:
Linux bash: Multiple variable assignment
(6 answers)
Closed 7 years ago.
How can I extract 3 or more separate pieces of text from a line using 'sed'?
I have the following line:
echo "<MX><[Mike/DOB-029/Post-555/Male]><MX>"
So far I am able to extract the 'DOB-029' by doing
sed -n 's/.*\(DOB-[0-9]*\).*/\1/p'
but I am not getting the other texts such as the name or the post.
My expected output should be Mike DOB-029 Post-555
Edited
Say I have a list within a file and I want to extract specific text/IDs from the entire list and save it to a .txt file
sed 's/.*\[\(.*\).\(DOB-[0-9]*\).\(Post-[0-9]*\).*/\1 \2 \3/' should do the trick!
Parts in between \( and \) are captured groups that can be referred to using \i, with i the index of the group.
Script for custom use:
#! /bin/bash
fields=${1:-123}
file='/path/to/input'
name=$(sed 's/.*\[\([^\/]*\)\/.*/\1/' $file)
dob=$(sed 's/.*\(DOB-[0-9]*\).*/\1/' $file)
post=$(sed 's/.*\(Post-[0-9]*\).*/\1/' $file)
[[ $fields =~ .*1.* ]] && output=$name
[[ $fields =~ .*2.* ]] && output="$output $dob"
[[ $fields =~ .*3.* ]] && output="$output $post"
echo $output
Set the file variable to the file containing the line you want to parse (I can add more functionality, such as supplying the file as an argument or getting the line from a larger file, if you like). Then execute the script with an integer argument: if the integer contains '1' it will display the name, if it contains '2' it will display the DOB, and '3' will output the post information. You can combine them, e.g. '123' or '32', in whichever combination you like.
Stdin
If you want to read from stdin, use following script:
#! /usr/bin/env bash
line=$(cat /dev/stdin)
fields=${1:-123}
name=$(echo $line | sed 's/.*\[\([^\/]*\)\/.*/\1/')
dob=$(echo $line | sed 's/.*\(DOB-[0-9]*\).*/\1/')
post=$(echo $line | sed 's/.*\(Post-[0-9]*\).*/\1/')
[[ $fields =~ .*1.* ]] && output=$name
[[ $fields =~ .*2.* ]] && output="$output $dob"
[[ $fields =~ .*3.* ]] && output="$output $post"
echo $output
Example usage:
$ chmod +x script.sh
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 123
Mike DOB-029 Post-555
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 12
Mike DOB-029
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh 32
DOB-029 Post-555
$ echo '<MX><[Mike/DOB-029/Post-555/Male]><MX>' | ./script.sh
Mike DOB-029 Post-555
A solution with awk:
echo "<MX><[Mike/DOB-029/Post-555/Male]><MX>" | awk -F[/[] '{print $2, $3, $4}'
We set the field delimiter to / or [ (-F[/[]). Then we just print the fields $2, $3 and $4, which are the 2nd, 3rd and 4th fields respectively.
With sed:
echo "<MX><[Mike/DOB-029/Post-555/Male]><MX>" | sed 's/\(^.*\[\)\(.*\)\(\/[^/]*$\)/\2/; s/\// /g'
Use the bash substitution builtins.
line="<MX><[Mike/DOB-029/Post-555/Male]><MX>";
linel=${line/*[/}; liner=${linel%\/*}; echo ${liner//\// }
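To see what each expansion leaves behind, a quick interactive trace (same sample line assumed):
$ echo "$linel"          # everything after the opening [
Mike/DOB-029/Post-555/Male]><MX>
$ echo "$liner"          # trailing /Male]><MX> stripped
Mike/DOB-029/Post-555
$ echo ${liner//\// }    # remaining slashes turned into spaces
Mike DOB-029 Post-555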

Randomizing arg order for a bash for statement

I have a bash script that processes all of the files in a directory using a loop like
for i in *.txt
do
ops.....
done
There are thousands of files and they are always processed in alphanumerical order because of '*.txt' expansion.
Is there a simple way to randomize the order and still ensure that I process all of the files exactly once?
Assuming the filenames do not have spaces, just substitute the output of List::Util::shuffle.
for i in `perl -MList::Util=shuffle -e'$,=$";print shuffle<*.txt>'`; do
....
done
If filenames do have spaces but don't have embedded newlines or backslashes, read a line at a time.
perl -MList::Util=shuffle -le'$,=$\;print shuffle<*.txt>' | while read i; do
....
done
To be completely safe in Bash, use NUL-terminated strings.
perl -MList::Util=shuffle -0 -le'$,=$\;print shuffle<*.txt>' |
while read -r -d '' i; do
....
done
Not very efficient, but it is possible to do this in pure Bash if desired. sort -R does something like this, internally.
declare -a a # create an integer-indexed array
for i in *.txt; do
    j=$RANDOM # find an unused slot
    while [[ -n ${a[$j]} ]]; do
        j=$RANDOM
    done
    a[$j]=$i # fill that slot
done
for i in "${a[@]}"; do # iterate in index order (which is random)
....
done
Or use a traditional Fisher-Yates shuffle.
a=(*.txt)
for ((i=${#a[*]}; i>1; i--)); do
    j=$[RANDOM%i]
    tmp=${a[$j]}
    a[$j]=${a[$[i-1]]}
    a[$[i-1]]=$tmp
done
for i in "${a[@]}"; do
....
done
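The same Fisher-Yates idea can be wrapped in a function if that reads more clearly; shuffle_into_a below is a made-up helper name, not a standard utility (a sketch):
shuffle_into_a() {
    # shuffle "$@" into the global array a (Fisher-Yates)
    a=("$@")
    local i j tmp
    for ((i=${#a[@]}; i>1; i--)); do
        j=$((RANDOM % i))
        tmp=${a[j]}
        a[j]=${a[i-1]}
        a[i-1]=$tmp
    done
}
shuffle_into_a *.txt
for i in "${a[@]}"; do
    echo "$i"
done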
You could pipe your filenames through the sort command:
ls | sort --random-sort | xargs ....
Here's an answer that relies on very basic functions within awk so it should be portable between unices.
ls -1 | awk '{print rand()*100, $0}' | sort -n | awk '{print $2}'
EDIT:
ephemient makes a good point that the above is not space-safe. Here's a version that is:
ls -1 | awk '{print rand()*100, $0}' | sort -n | sed 's/[0-9\.]* //'
If you have GNU coreutils, you can use shuf:
while read -d '' f
do
# some stuff with $f
done < <(shuf -ze *)
This will work with files with spaces or newlines in their names.
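If you would rather collect the shuffled names into an array than consume them one at a time, readarray can read the NUL-delimited output directly (a sketch, assuming bash 4.4+ for readarray -d ''):
readarray -d '' -t files < <(shuf -ze *)
for f in "${files[@]}"; do
    # some stuff with "$f"
    echo "$f"
done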
Off-topic Edit:
To illustrate SiegeX's point in the comment:
$ a=42; echo "Don't Panic" | while read line; do echo $line; echo $a; a=0; echo $a; done; echo $a
Don't Panic
42
0
42
$ a=42; while read line; do echo $line; echo $a; a=0; echo $a; done < <(echo "Don't Panic"); echo $a
Don't Panic
42
0
0
The pipe causes the while to be executed in a subshell and so changes to variables in the child don't flow back to the parent.
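In bash 4.2 and later there is also shopt -s lastpipe, which runs the last command of a pipeline in the current shell so the assignment survives; note it only takes effect when job control is off, as in a non-interactive script (a sketch):
#!/usr/bin/env bash
shopt -s lastpipe
a=42
echo "Don't Panic" | while read -r line; do echo "$line"; a=0; done
echo "$a"   # prints 0: the while loop ran in the current shell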
Here's a solution with standard unix commands:
for i in $(ls); do echo $RANDOM-$i; done | sort | cut -d- -f 2-
Here's a Python solution, if it's available on your system:
import glob
import random

files = glob.glob("*.txt")
if files:
    random.shuffle(files)  # shuffles the list in place (returns None)
    for file in files:
        print(file)
