Grep command returns nothing in shell script - bash

I am trying to extract the rows of one file that match strings listed in another file, but the grep command returns nothing.
#!/bin/bash
input="export.txt"
file="filename.csv"
val=`head -n 1 $file`
echo $val>export.csv
cat export.txt | while read line
do
val=`echo $line | tr -d '\n'`
echo $val
valu=`grep $val $file`
echo $valu
done

You can simply do this:
grep -f list.txt input.txt
This extracts every line of input.txt that matches any pattern listed in list.txt.
If you want to save each match, you can store them in a Bash array:
IFS=$'\n' read -d '' -a values <<< "$( grep -f list.txt input.txt )"
You can then print a particular match (here the second one, since indexing starts at 0):
echo "${values[1]}"
Regards!
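If the entries in list.txt are literal strings rather than regular expressions, adding -F may be safer, because characters such as . are then matched literally. The following is only a minimal sketch: the sample data is invented for illustration and written to a temporary directory.

```shell
#!/bin/bash
# Hypothetical sample data, created in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'id,name' '1.5,alpha' '105,beta' '2.0,gamma' > "$tmp/input.txt"
printf '%s\n' '1.5' > "$tmp/list.txt"

# Without -F, "." is a regex wildcard, so the pattern 1.5 also matches 105.
regex_matches=$(grep -f "$tmp/list.txt" "$tmp/input.txt")

# With -F, each line of list.txt is treated as a fixed string.
fixed_matches=$(grep -F -f "$tmp/list.txt" "$tmp/input.txt")

printf 'regex:\n%s\n' "$regex_matches"
printf 'fixed:\n%s\n' "$fixed_matches"
rm -rf "$tmp"
```

Note that an empty line in the pattern file matches every input line, so it is worth checking list.txt for stray blank lines.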

Related

Sed replace substring only if expression exist

In a bash script, I am trying to remove the directory name from filenames:
documents/file.txt
direc/file5.txt
file2.txt
file3.txt
So I first check whether there is a "/" and, if so, delete everything before it:
for i in **/*.scss *.scss; do
echo "$i" | sed -n '^/.*\// s/^.*\///p'
done
But it doesn't work for files in the current directory; it gives me a blank string.
I get:
file.txt
file5.txt
When you only want the filename, use basename instead of sed.
# basename /path/to/file
returns file. See the basename man page for details.
Your sed attempt is basically fine, but you should print regardless of whether you performed a substitution; take out the -n and the p at the end. (Also there was an unrelated syntax error.)
Also, don't needlessly loop over all files.
printf '%s\n' **/*.scss *.scss |
sed 's%^.*/%%'
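The effect of -n together with the p flag can be checked on the sample paths from the question. A minimal sketch; the variable names are chosen here purely for illustration:

```shell
#!/bin/bash
# Sample paths taken from the question.
paths='documents/file.txt
direc/file5.txt
file2.txt
file3.txt'

# -n plus the p flag prints only lines where a substitution happened,
# so names without a slash are dropped.
only_substituted=$(printf '%s\n' "$paths" | sed -n 's%^.*/%%p')

# Without -n and p, every line is printed, changed or not.
all_lines=$(printf '%s\n' "$paths" | sed 's%^.*/%%')

printf '%s\n' "$only_substituted"
printf '%s\n' '--'
printf '%s\n' "$all_lines"
```

The first command yields only file.txt and file5.txt, while the second yields all four names.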
This can also be done with awk.
Example:
echo "1/2/i.py" | awk 'BEGIN {FS="/"} {print $NF}'
output: i.py
Eventually, I did this:
for i in **/*.scss *.scss; do
# for i in *.scss; do
# for i in _hm-globals.scss; do
name=${i##*/} # remove dir name
name=${name%.scss} # remove extension
name=${name#_hm-} # remove a leading _hm-, if present
if [[ $name = *"."* ]]; then
name=${name/./-} # replace the first . with -
fi
echo "$name" >&2
done

Bash script to stdout stuck with redirect

My bash script is the following:
#!/bin/bash
if [ ! -f "$1" ]; then
exit
fi
while read line;do
str1="[GAC]*T"
num=$"(echo $line | tr -d -c 'T' | wc -m)"
for((i=0;i<$num;i++))do
echo $line | sed "s/$str1/&\n/" | head -n1 -q
str1="${str1}[GAC]*T"
done
str1="[GAC]*T"
done < "$1
It works as it should when printing to the terminal: it takes the filename as input and prints each line up to the first letter T, then up to the next T, and so on.
Input:
GATTT
ATCGT
Output:
GAT
GATT
GATTT
AT
ATCGT
When I'm using the script with | tee outputfile the outputfile is correct but when using the script with > outputfile the terminal hangs / is stuck and does not finish. Moreover it works with bash -x scriptname inputfile > outputfile but is stuck with bash scriptname inputfile > outputfile.
I made some modifications to your original script; please try:
if [ ! -f "$1" ]; then
exit
fi
while IFS='' read -r line || [[ -n "$line" ]];do
str1="[GAC]*T"
num=$(echo $line | tr -d -c 'T' | wc -m)
for((i=0;i<$num;i++));do
echo $line | sed "s/$str1/&\n/" | head -n1 -q
str1="${str1}[GAC]*T"
done
str1="[GAC]*T"
done < "$1"
For input:
GATTT
ATCGT
This script outputs:
GAT
GATT
GATTT
AT
ATCGT
Modifications made to your original script were:
Line while read line; do changed to while IFS='' read -r line || [[ -n "$line" ]]; do. Why I did this is explained here: Read a file line by line assigning the value to a variable
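Line num=$"(echo $line | tr -d -c 'T' | wc -m)" changed to num=$(echo $line | tr -d -c 'T' | wc -m). The original $"..." form is a locale-translated string, not a command substitution, so num held the literal text instead of a count.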
Line for((i=0;i<$num;i++))do changed to for((i=0;i<$num;i++));do
Line done < "$1 changed to done < "$1"
Now you can do: ./scriptname inputfile > outputfile
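The first modification matters when the last line of the input has no trailing newline. A minimal sketch of the difference, assuming a hypothetical temporary input file (the counter names are invented for illustration):

```shell
#!/bin/bash
# Hypothetical input whose last line has no trailing newline.
tmp=$(mktemp)
printf 'GATTT\nATCGT' > "$tmp"

# Plain read returns non-zero at EOF, so the unterminated last line is skipped.
plain_count=0
while read -r line; do
    plain_count=$((plain_count + 1))
done < "$tmp"

# With the extra test, a non-empty partial line is still processed.
safe_count=0
while IFS='' read -r line || [[ -n "$line" ]]; do
    safe_count=$((safe_count + 1))
done < "$tmp"

echo "plain loop saw $plain_count line(s), safe loop saw $safe_count"
rm -f "$tmp"
```

Here the plain loop processes one line and the guarded loop processes two.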
Try:
sed -r 's/([^T]*T+)/\1\n/g' gatc.txt > outputfile
instead of your script.
It takes some optional non-Ts, followed by at least one T and inserts a newline after the T.
cat gatc.txt
GATGATTGATTTATATCGT
sed -r 's/([^T]*T+)/\1\n/g' gatc.txt
GAT
GATT
GATTT
AT
AT
CGT
For multi-line input, delete the resulting empty lines at the end:
echo "GATTT
ATCGT" | sed -r 's/([^T]*T+)/\1\n/g;' | sed '/^$/d'
GATTT
AT
CGT

how do I split a string on the nth delimiter?

For every line in my file, I want to print everything on that line before the 4th dash.
Input:
TCGA-HC-8216-10A-11D-A323-01
TCGA-J4-8200-10A-11D-A323-01
TCGA-EJ-A65E-10A-11D-A323-01
and I want to split each line on the fourth dash "-"
Output:
TCGA-HC-8216-10A
TCGA-J4-8200-10A
TCGA-EJ-A65E-10A
I know I can split on every dash like this:
#!/usr/bin/env bash
IN="TCGA-HC-8216-01A-11D-A323-01
TCGA-J4-8200-10A-11D-A323-01
TCGA-EJ-A65E-10A-11D-A323-01"
arr=$(echo $IN | tr "-" "\n")
for x in $arr
do
echo "> [$x]"
done
but this splits and prints each part of the string between every dash.
Use cut
cut -d- -f1-4 <<'EOF'
TCGA-HC-8216-01A-11D-A323-01
TCGA-J4-8200-10A-11D-A323-01
TCGA-EJ-A65E-10A-11D-A323-01
EOF
This splits each line on the delimiter - (the -d option) and returns fields one through four (the -f1-4 option).
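The same idea generalizes to any delimiter and any N, and cut can also return the part after the Nth delimiter. A minimal sketch; the helper name split_at_nth and the variables before and after are hypothetical, not part of the answer above:

```shell
#!/bin/bash
# split_at_nth is a hypothetical helper name, used only for this sketch.
# Usage: split_at_nth STRING DELIMITER N
# Sets the globals "before" (fields 1..N) and "after" (fields N+1 onward).
split_at_nth() {
    local str=$1 delim=$2 n=$3
    before=$(printf '%s\n' "$str" | cut -d"$delim" -f1-"$n")
    after=$(printf '%s\n' "$str" | cut -d"$delim" -f"$((n + 1))"-)
}

split_at_nth 'TCGA-HC-8216-01A-11D-A323-01' '-' 4
echo "$before"
echo "$after"
```

For this input, before holds TCGA-HC-8216-01A and after holds 11D-A323-01.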
#!/bin/bash
IN="TCGA-HC-8216-01A-11D-A323-01
TCGA-J4-8200-10A-11D-A323-01
TCGA-EJ-A65E-10A-11D-A323-01"
arr=$(echo "$IN" | cut -d '-' -f1-4)
echo "$arr"
Prints:
TCGA-HC-8216-01A
TCGA-J4-8200-10A
TCGA-EJ-A65E-10A
Using pure bash and pattern matching:
#!/bin/bash
IN="TCGA-HC-8216-01A-11D-A323-01
TCGA-J4-8200-10A-11D-A323-01
TCGA-EJ-A65E-10A-11D-A323-01"
re='([^-]+-){3}[^-]+'
for line in $IN
do
if [[ $line =~ $re ]]; then
trunc=${BASH_REMATCH[0]}
fi
echo "$trunc"
done
Output:
TCGA-HC-8216-01A
TCGA-J4-8200-10A
TCGA-EJ-A65E-10A
Using grep with ERE:
arr=$(echo "$IN" | grep -oE "^([^-]*-){3}[^-]*")
With BRE:
arr=$(echo "$IN" | grep -o "^\([^-]*-\)\{3\}[^-]*")
Example:
#!/bin/bash
IN="TCGA-HC-8216-01A-11D-A323-01
TCGA-J4-8200-10A-11D-A323-01
TCGA-EJ-A65E-10A-11D-A323-01"
arr=$(echo "$IN" | grep -oE "^([^-]*-){3}[^-]*")
for x in $arr
do
echo "> [$x]"
done
Output:
> [TCGA-HC-8216-01A]
> [TCGA-J4-8200-10A]
> [TCGA-EJ-A65E-10A]

assign stat|grep|awk to a variable in bash

I have a file of filenames, and I need to be able to get the size of these files using bash.
I have the following script which does that, but it prints the filename and the size on different lines. I'd prefer it to print both on one line if possible.
#!/bin/sh
filename="$1"
while read -r line
do
name=$line
vars=(`echo $name | tr '.' ' '`)
echo $name
stat -x $name | grep Size: | awk '{ print $2 }'
done < "$filename"
I'd love to have it of the form:
filename: $size
How can I do this?
(I am using OSX hence the slightly odd version of stat.)
Pass -n to the echo to prevent a trailing newline from being added. So change
echo $name
to
echo -n $name
and, to add the : separator between the file name and the file size, use
echo -n ${name}": "
This should do the trick with GNU stat (the BSD stat shipped with OS X uses -f %z instead of -c %s):
while read -r f
do
echo "${f} : $(stat -L -c %s "${f}")"
done < "${filename}"
echo $name: $(stat -x $name | sed -n '/^Size:/s///p')
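Because the stat options differ between GNU and BSD/OS X, one portable alternative is to avoid stat altogether and count bytes with wc -c. This is a swapped-in technique, not the method from the answers above, and the function name print_sizes is hypothetical. A minimal sketch with invented sample files:

```shell
#!/bin/bash
# print_sizes is a hypothetical helper: it reads file names, one per line,
# from the list file given as $1 and prints "name: size" for each.
print_sizes() {
    local name size
    while IFS= read -r name; do
        [ -f "$name" ] || continue        # skip entries that are not regular files
        size=$(( $(wc -c < "$name") ))    # arithmetic expansion strips any padding wc adds
        printf '%s: %s\n' "$name" "$size"
    done < "$1"
}

# Demo with invented sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'hello' > "$tmp/a.txt"
: > "$tmp/b.txt"
printf '%s\n' "$tmp/a.txt" "$tmp/b.txt" > "$tmp/list"
sizes=$(print_sizes "$tmp/list")
printf '%s\n' "$sizes"
rm -rf "$tmp"
```

The trade-off is that wc -c may read the whole file to count its bytes, so stat is likely the better choice for very large files.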

Redirect output to a bash array

I have a file containing the string
ipAddress=10.78.90.137;10.78.90.149
I'd like to place these two IP addresses in a bash array. To achieve that I tried the following:
n=$(grep -i ipaddress /opt/ipfile | cut -d'=' -f2 | tr ';' ' ')
This extracts the values all right, but the size of the array is reported as 1, and both values end up in the first element. That is,
echo ${n[0]}
returns
10.78.90.137 10.78.90.149
How do I fix this?
Thanks for the help!
Do you really need an array?
bash
$ ipAddress="10.78.90.137;10.78.90.149"
$ IFS=";"
$ set -- $ipAddress
$ echo $1
10.78.90.137
$ echo $2
10.78.90.149
$ unset IFS
$ echo $@ #this is "array"
If you want to put them into an array:
$ a=( "$@" )
$ echo ${a[0]}
10.78.90.137
$ echo ${a[1]}
10.78.90.149
@OP, regarding your method: set your IFS to a space
$ IFS=" "
$ n=( $(grep -i ipaddress file | cut -d'=' -f2 | tr ';' ' ' | sed 's/"//g' ) )
$ echo ${n[1]}
10.78.90.149
$ echo ${n[0]}
10.78.90.137
$ unset IFS
Also, there is no need to use so many tools. You can just use awk, or simply the bash shell:
#!/bin/bash
declare -a arr
while IFS="=" read -r caption addresses
do
case "$caption" in
ipAddress*)
addresses=${addresses//[\"]/}
arr=( ${arr[@]} ${addresses//;/ } )
esac
done < "file"
echo ${arr[@]}
output
$ more file
foo
bar
ipAddress="10.78.91.138;10.78.90.150;10.77.1.101"
foo1
ipAddress="10.78.90.137;10.78.90.149"
bar1
$./shell.sh
10.78.91.138 10.78.90.150 10.77.1.101 10.78.90.137 10.78.90.149
gawk
$ n=( $(gawk -F"=" '/ipAddress/{gsub(/\"/,"",$2);gsub(/;/," ",$2) ;printf $2" "}' file) )
$ echo ${n[@]}
10.78.91.138 10.78.90.150 10.77.1.101 10.78.90.137 10.78.90.149
This one works:
n=(`grep -i ipaddress filename | cut -d"=" -f2 | tr ';' ' '`)
EDIT: (improved, nestable version as per Dennis)
n=($(grep -i ipaddress filename | cut -d"=" -f2 | tr ';' ' '))
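The difference between the question's assignment and the parenthesized form can be checked by counting elements. A minimal sketch, using a temporary file as a hypothetical stand-in for /opt/ipfile (the variable names scalar and array are invented for illustration):

```shell
#!/bin/bash
# Hypothetical stand-in for /opt/ipfile, created in a temporary location.
tmp=$(mktemp)
echo 'ipAddress=10.78.90.137;10.78.90.149' > "$tmp"

# Plain $(...) assigns one string, so only element 0 exists.
scalar=$(grep -i ipaddress "$tmp" | cut -d'=' -f2 | tr ';' ' ')

# Wrapping the substitution in ( ) word-splits the result into elements.
array=($(grep -i ipaddress "$tmp" | cut -d'=' -f2 | tr ';' ' '))

echo "scalar has ${#scalar[@]} element(s): ${scalar[0]}"
echo "array has ${#array[@]} element(s): ${array[0]} and ${array[1]}"
rm -f "$tmp"
```

The unquoted substitution is also subject to pathname expansion, which is harmless for IP addresses but worth keeping in mind for other data.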
A variation on a theme:
$ line=$(grep -i ipaddress /opt/ipfile)
$ saveIFS="$IFS" # always save it and put it back to be safe
$ IFS="=;"
$ n=($line)
$ IFS="$saveIFS"
$ echo ${n[0]}
ipAddress
$ echo ${n[1]}
10.78.90.137
$ echo ${n[2]}
10.78.90.149
If the file has no other contents, you may not need the grep and you could read in the whole file.
$ saveIFS="$IFS"
$ IFS="=;"
$ n=($(</opt/ipfile))
$ IFS="$saveIFS"
A Perl solution:
n=($(perl -ne 's/ipAddress=(.*);/$1 / && print' filename))
which tests for and removes the unwanted characters in one operation.
You can do this by using IFS in bash.
First, read the first line from the file.
Second, split that line into an array with = as the delimiter.
Third, split the value into an array with ; as the delimiter.
That's it!
#!/bin/bash
IFS= read -r lstr < "a.txt"
IFS='=' read -r -a lstr_arr <<< "$lstr"
IFS=';' read -r -a ip_arr <<< "${lstr_arr[1]}"
echo "${ip_arr[0]}"
echo "${ip_arr[1]}"
