How to get unique string from variable in bash

Hello, I would like to ask how I can get each unique string from a variable.
while read -r line
do
route=$(echo $line | awk -F'[:]' '{print $2}') #get path from log file
if [ "`dirname "$route"`" == "`xrealpath "$PWD"`" ]; then #compare path from log file with $PWD
name=$(echo $line | awk -F'[:]' '{print $1}') #take name from 1st column in log file
fi
if ! [ "$name" == "$help_name" ]; then
echo $name
help_name=$name
pom=$pom:$name
fi
done < $WEDI_RC
Sample logfile:
proj.sh:/Users/Tom/Documents/proj.sh:2015-03-21:1
proj1.sh:/Users/Tom/Documents/proj.sh:2015-03-21:1
proj.sh:/Users/Tom/Documents/proj.sh:2015-03-21:2
proj1.sh:/Users/Tom/Documents/proj.sh:2015-03-21:2
proj.sh:/Users/Tom/Documents/proj.sh:2015-03-21:3
proj1.sh:/Users/Tom/Documents/proj.sh:2015-03-21:3
How can I echo each unique name just once?
My output now looks something like this:
proj.sh
proj1.sh
proj.sh
proj1.sh
proj.sh
:proj.sh:proj1.sh:proj.sh:proj1.sh:proj.
Expecting output:
proj.sh
proj1.sh
I don't know how many files will be read in the while loop.
We cannot use any temporary files.
Thank you

Answer for Original Version of This Question
This uses the associative array seen to keep track of what names have been seen:
declare -A seen
while read -r line
do
... blabla ...
if [ -z "${seen[$name]}" ]; then
echo "$name"
seen["$name"]=1
pom=$pom:$name
fi
done < "$WEDI_RC"
Working Example (No blabla)
Let us start with this file:
$ cat file
proj.sh
proj1.sh
proj.sh
proj1.sh
proj.sh
We will run this code (note that ...blabla... has been removed and the loop now reads in name directly):
$ cat script.sh
declare -A seen
while read -r name
do
if [ -z "${seen[$name]}" ]; then
echo $name
seen["$name"]=1
pom=$pom:$name
fi
done < file
declare -p pom
This is the result:
$ bash script.sh
proj.sh
proj1.sh
declare -- pom=":proj.sh:proj1.sh"
Answer for Revised Question
In the revised question, the following code appears:
route=$(echo $line | awk -F'[:]' '{print $2}') #get path from log file
if [ "`dirname "$route"`" == "`xrealpath "$PWD"`" ]; then #compare path from log file with $PWD
name=$(echo $line | awk -F'[:]' '{print $1}') #take name from 1st column in log file
This means that, as the code runs, name may never be set depending on the current directory when the script is run. This would explain the error messages reported in the comments.
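One way to avoid the stale or unset name is to skip a line entirely when its directory does not match, and only then consult the seen array. The following is a minimal sketch under a few assumptions: the function name unique_names_in_dir is made up for illustration, the directory to compare against is passed as an argument instead of being computed with xrealpath, and the directory part of the path is taken with ${route%/*} rather than dirname.

```shell
#!/bin/bash
# Sketch only: print each name whose logged path lies directly in directory $1,
# exactly once. Reads lines of the form name:path:date:count from stdin.
# unique_names_in_dir is a hypothetical name, not part of the original script.
unique_names_in_dir() {
    local dir=$1 name route rest
    local -A seen=()                  # names that were already printed
    pom=""                            # global, as in the question
    while IFS=: read -r name route rest; do
        [ -n "$name" ] || continue                  # ignore empty or malformed lines
        [ "${route%/*}" = "$dir" ] || continue      # entry belongs to another directory
        if [ -z "${seen[$name]}" ]; then
            echo "$name"
            seen["$name"]=1
            pom=$pom:$name
        fi
    done
}
```

It would be called as unique_names_in_dir "$PWD" < "$WEDI_RC". Because the input comes from a redirection rather than a pipe, pom is still set after the call.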

Related

Why am I getting this file error and a cat error even though I am passing the text file in the argument?

The assignment asks us to read a directory full of text files and parse the data from those files into their respective attributes.
Then, once the data is set, we load a general template that contains those attributes as placeholders.
I'm using a sed command to replace the specific attributes, but only if the number of students is greater than 50. In that case the script runs the sed command and writes the result to a file in the output directory.
But I am getting this error when I run
test3.sh ./data assign4.template 12/16/2021 ./output
Error
cat: assign4.template: No such file or directory
test3.sh: line 62: output/MAT3103.crs: No such file or directory
The current file is MAT4353.crs
My guess for the "No such file or directory" error is that the script is looking inside that folder for a file with that name,
but I'm not entirely sure how to resolve that.
As for the cat: template error, I don't understand it, since I am passing the template on the command line.
As for the other parameters: the date is also substituted by the sed command, and all output files should be written to the directory defined by the last argument. This directory may or may not already exist. Each file should be named by the course's department code and number, with the extension .warn
Here is the complete code
#!/bin/bash
# check that the user has passed exactly four arguments
if [ $# -ne 4 ]
then
echo "Exactly 4 arguments should be passed"
exit 1
fi
# check whether the output directory exists
if [ -d output ]
then
# if output directory exists will get deleted
echo "output directory already exists. So removing its contents"
rm -f output/*
else
# output directory does not exist, so gets created here
echo "output directory does not exist. So creating a new directory"
mkdir output
fi
max_students=50
template=$2
dt=$3
cd $1
for i in *; do
echo The current file is ${i}
dept_code=$(awk 'NR==2 {print $1 ; exit}' $i)
echo $dept_code
dept_name=$(awk 'NR==2 {print $2 ; exit}' $i)
echo $dept_name
course_name=$(awk 'FNR==2' $i)
echo $course_name
course_sched=$(awk 'FNR==3' $i | awk '{print $1}')
echo $course_sched
course_start=$(awk 'FNR==3' $i | awk '{print $2}')
echo $course_start
course_end=$(awk 'FNR==3' $i | awk '{print $3}')
echo $course_end
credit_hours=$(awk 'FNR==4' $i)
echo $credit_hours
num_students=$(awk 'FNR==5' $i)
echo $num_students
# checking if number of students currently enrolled > max students
if (( $(echo "$num_students > $max_students" |bc -l) ))
then
# output filename creation
out_file=${i}
# using example Template and sed command to replace the variables
cat $template | sed -e "s/\[\\[\dept_code\]\]/$dept_code/" | sed -e "s/\[\\[\dept_name\]\]/$dept_name/" | sed -e "s|\[\[course_name\]\]|$course_name|" | sed -e "s|\[\[course_start\]\]|$$
fi
done
You define the variable as
template=$2
and since your second parameter is assign4.template, this is the value of the variable template. Then you run
cat $template
which is, first of all, unnecessary, since you can use input redirection on sed instead, but above all it requires that the file exists in your working directory. Because you have previously done a
cd $1
the relative name is now looked up inside data, and the file data/assign4.template does not exist. Either create this file there before you use your script, or pass the template as an absolute path. The error for output/MAT3103.crs presumably has the same cause: the output directory was created before the cd, so it is not found relative to data.
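A common way around this is to turn the relative arguments into absolute paths before changing directory. The sketch below is only an illustration: to_absolute and demo are hypothetical names, the template content is invented, and the file name MAT3103.warn merely follows the naming rule described in the question.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): make a path absolute,
# relative to the directory the shell is in when the helper is called.
to_absolute() {
    case $1 in
        /*) printf '%s\n' "$1" ;;         # already absolute, keep as is
        *)  printf '%s\n' "$PWD/$1" ;;    # anchor at the current directory
    esac
}

# Self-contained demonstration in a throwaway directory.
demo() {
    local work
    work=$(mktemp -d) || return 1
    mkdir "$work/data"
    echo 'Course: [[course_name]]' > "$work/assign4.template"
    (
        cd "$work" || exit 1
        # Resolve both paths while still in the starting directory ...
        template=$(to_absolute assign4.template)
        out_dir=$(to_absolute output)
        mkdir -p "$out_dir"
        # ... and only then enter the data directory.
        cd data || exit 1
        sed -e 's|\[\[course_name\]\]|Algebra|' < "$template" > "$out_dir/MAT3103.warn"
        cat "$out_dir/MAT3103.warn"
    )
    rm -rf "$work"
}

demo    # prints: Course: Algebra
```

In the script from the question this would mean computing template and the output directory from $2 and $4 before the cd $1 line.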
Use single quotes in your positional arguments:
test3.sh './data' 'assign4.template' '12/16/2021' './output'
or
test3.sh data assign4.template '12/16/2021' output

How to use variable with awk when being read from a file

I have a file with the following entries:
foop07_bar2_20190423152612.zip
foop07_bar1_20190423153115.zip
foop08_bar2_20190423152612.zip
foop08_bar1_20190423153115.zip
where
foop0* = host
bar* = fp
I would like to read the file and create 3 variables, the whole file name, host and fp (which stands for file_path_differentiator).
I am using read to take each line as the whole file name. I thought I could then feed this into awk to extract the other two variables; however, the first method of passing the variable produces an error and the second gives me the values from every line.
I would like to loop over each line, as I wish to use these variables to ssh to the host and fetch the file.
#!/bin/bash
while read -r FILE
do
echo ${FILE}
host=`awk 'BEGIN { FS = "_" } ; { print $1 }'<<<<"$FILE"`
echo ${host}
path=`awk -v var="${FILE}" 'BEGIN { FS = "_" } ; { print $2 }'`
echo ${path}
done <zips_not_received.csv
Expected Result
foop07_bar2_20190423152612.zip
foop07
bar2
foop07_bar1_20190423153115.zip
foop07
bar1
Actual Result
foop07_bar2_20190423152612.zip
/ : No such file or directoryfoop07_bar2_20190423152612.zip
bar2 bar1 bar2 bar1
You can do this with bash alone, without using any external tool.
while read -r file; do
[[ $file =~ (.*)_(.*)_.*\.zip ]] || { echo "invalid file name"; exit 1; }
host="${BASH_REMATCH[1]}"
path="${BASH_REMATCH[2]}"
echo "$file"
echo "$host"
echo "$path"
done < zips_not_received.csv
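The same split can also be sketched with read and a temporary IFS instead of a regular expression. This is only an illustration, assuming every name has the form host_fp_rest; split_name is a hypothetical helper, and it stores the second part in fp (the name used in the question) rather than path.

```shell
#!/bin/bash
# Sketch: split "host_fp_timestamp.zip" on "_" using read alone.
# split_name is a hypothetical helper; it sets the variables host and fp.
split_name() {
    local rest
    # IFS=_ applies to this read only.
    IFS=_ read -r host fp rest <<< "$1"
    # Succeed only if all three parts are present.
    [ -n "$host" ] && [ -n "$fp" ] && [ -n "$rest" ]
}

while read -r file; do
    if split_name "$file"; then
        echo "$file"
        echo "$host"
        echo "$fp"
    else
        echo "invalid file name: $file" >&2
    fi
done <<'EOF'
foop07_bar2_20190423152612.zip
foop08_bar1_20190423153115.zip
EOF
```

In a real run the here-document would be replaced by < zips_not_received.csv.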
typical...
Managed to work a solution after posting...
#!/bin/bash
while read -r FILE
do
echo ${FILE}
host=`echo "$FILE" | awk -F"_" '{print $1}'`
echo $host
path=`echo "$FILE" | awk -F"_" '{print $2}'`
echo ${path}
done <zips_not_received.csv
I'm not sure about its elegance or correctness, as I am using echo to build the variables, but I have it working.
Assuming neither the host nor the path part of the file name contains a space or an underscore,
you can simply split each line beforehand with sed, awk, etc. and let read use its default whitespace separator (or set _ as the field separator in bash instead). I also drop empty lines as a basic safeguard, given your sample.
sed '/^[[:blank:]]*$/d;s/_/ /g' zips_not_received.csv \
| while read host path Ignored
do
echo "${host}"
echo "${path}"
done

Bash script: Eqw status

I've been trying to run this code on a server, but the job always ends up with an 'Eqw' status.
I want to process a file that contains three columns, one with a path, and the others are information about that file.
Can anyone please help me to figure out what's wrong in my code?
#!/bin/bash
source $HOME/.bashrc
source /share/apps/Profiles/share-profile.sh
module load compilers/python-2.7.15
MANIFEST=path with the file with three columns
TRIM=trimming program that does work properly
#Run pyadapter_trim.py to trim adapter sequences
COUNTER=0;
#for each file in the input manifest, run trimming program
while read LINE
do
READ=`echo "$LINE" | awk -F"\t" '{print $1}'`
SAMPLE=`echo "$LINE" | awk -F"\t" '{print $2}'`
GENOME=`echo "$LINE" | awk -F"\t" '{print $3}'`
WDIR=`(dirname $READ1)`
cd $WDIR
QCDIR=$WDIR"/ats"
mkdir -p $QCDIR
echo -e "Processing sample ${SAMPLE}"
echo -e "${READ1}..."
python $TRIM -a $READ
echo ""
COUNTER=`expr $COUNTER + 1`
done < "$MANIFEST"
echo "$COUNTER Files Processed"
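A side note on the loop above: it assigns READ but then uses $READ1, which is never set in the script as posted, so dirname is called without an argument. Whether that is what leads to the 'Eqw' status cannot be told from the post alone. As a rough sketch of the column handling only, the three tab-separated fields can be read directly by read; process_manifest and the variable names are hypothetical, and the trimming step is replaced by an echo.

```shell
#!/bin/bash
# Sketch: read a three-column, tab-separated manifest without awk.
# process_manifest is a hypothetical name; the real trimming call is omitted.
process_manifest() {
    local manifest=$1 read_path sample genome wdir counter=0
    # Tab counts as IFS whitespace, so empty columns would be collapsed.
    while IFS=$'\t' read -r read_path sample genome; do
        [ -n "$read_path" ] || continue        # skip empty lines
        wdir=$(dirname "$read_path")           # directory that holds the reads
        echo "Processing sample $sample ($genome) in $wdir"
        counter=$((counter + 1))
    done < "$manifest"
    echo "$counter Files Processed"
}
```

Using one variable name consistently for the path avoids the READ/READ1 mismatch.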

How to set variables from each line store in file

I want to read a file line by line and pass the first and second fields as arguments to a bash script, then iterate to the next line and do the same thing.
My file is pwd.out:
/path/dir/name1/date name1
/path/dir/name2/date name2
I have tried the following without success:
while read line; do dir=`awk '{print $1}'`; name=`awk '{print $2}'`; echo "./myprogram $dir somethingHere $name"; done < pwd.out
where it outputs:
./myprogram /path/dir/name1/date /path/dir/name2/date somethingHere
I think that somehow $dir is getting the values from all the lines and $name is not being set.
What I would like to have is:
./myprogram /path/dir/name1/date somethingHere name1
./myprogram /path/dir/name2/date somethingHere name2
Thanks in advance
You don't need awk for this. Just read the variables in the order they come, such as this:
while read dir name
do
./myprogram "$dir" somethingHere "$name"
done < pwd.out
Test
See an example in which I just echo dir=$dir, name=$name with your given file:
$ while read dir name; do echo "dir=$dir, name=$name"; done < pwd.out
dir=/path/dir/name1/date, name=name1
dir=/path/dir/name2/date, name=name2
Your awk command was not working because you never passed $line to it. With no input of its own, awk read from the loop's standard input (pwd.out) instead.
It would work if you did the following, although it is unnecessary to call an external command like awk for something that bash handles perfectly well, as you can see above.
while read line
do
dir=$(awk '{print $1}' <<< "$line")
name=$(awk '{print $2}' <<< "$line")
echo "./myprogram $dir somethingHere $name"
done < pwd.out
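The effect of a command sharing the loop's standard input can be reproduced in isolation. The sketch below uses made-up helper names (count_iterations, swallow, harmless) and feeds the loop from a here-document instead of pwd.out.

```shell
#!/bin/bash
# Sketch: a command inside a "while read" loop inherits the loop's stdin.
# count_iterations runs the command given as arguments once per input line
# and prints how many times the loop body was executed.
count_iterations() {
    local n=0 line
    while read -r line; do
        n=$((n + 1))
        "$@"                 # shares the loop's stdin unless it redirects its own
    done <<'EOF'
/path/dir/name1/date name1
/path/dir/name2/date name2
EOF
    echo "$n"
}

swallow()  { cat > /dev/null; }               # reads everything left on stdin
harmless() { cat < /dev/null > /dev/null; }   # gets its own, empty stdin

count_iterations swallow     # prints 1: the second line was eaten by cat
count_iterations harmless    # prints 2: both lines reach read
```

Giving the inner command its own input, with a here-string as above or with < /dev/null, keeps the loop's input intact.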
while read dir name; do
echo "dir=$dir name=$name"
done < pwd.out

How can I tokenize $PATH by using awk?

How can I tokenize $PATH by using awk?
I tried for 3 hours, but I could not get it to work.
#!/bin/bash
i=1
while true; do
token=$($echo $PATH | awk -F ':' '{print $"$i"}')
if [ -z "$token" ]; then
break
fi
((i++))
if [ -a "$TOKEN/$1" ]; then
echo "$TOKEN/$1"
break
fi
break
done
When I run this code, I got
/home/$USERID/bin/ff: line 6: /home/$USERID/bin:/usr/local/symlinks:/usr/local/scripts:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/$USERID/bin: No such file or directory
How can I change my program?
What are you trying to do?
This will let you iterate against the individual paths:
echo $PATH | tr ':' '\n' | while read line; do echo $line; done
As @SiegeX notes, an even shorter version works (a trailing ':' is added so that the last entry is not dropped):
printf '%s:' "$PATH" | while read -r -d ':' line; do echo "$line"; done
Do the whole thing in awk
#!/bin/bash
awk -v addPath="$1" 'BEGIN{RS=":";ORS=addPath "\n"}{$1=$1}1' <<< $PATH
Proof of Concept
$ addPath="/foo"
$ awk -v addPath="$addPath" 'BEGIN{RS=":";ORS=addPath "\n"}{$1=$1}1' <<< $PATH
/usr/local/bin/foo
/usr/bin/foo
/bin/foo
/usr/games/foo
/usr/lib/java/bin/foo
/usr/lib/qt/bin/foo
/usr/share/texmf/bin/foo
./foo
/sbin/foo
/usr/sbin/foo
/usr/local/sbin/foo
I think a simple tr : \\n would suffice. Pipe it into sed 's#$#blabla#g' to append something to each line, and that's it.
You don't need to use external tools such as awk or tr to tokenize the PATH. Bash is capable of doing so:
#!/bin/bash
IFS=:
for p in $PATH
do
if [ -e "$p/$1" ]; then
echo "$p/$1"
break
fi
done
IFS (the internal field separator) is a shell variable that bash uses to decide where to split words after expansion and when reading input with read.
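If changing IFS for the whole script is undesirable, the split can be confined to a single read. This is a minimal sketch, not the answerer's code: find_in_path is a hypothetical name, and the optional second argument (a colon-separated list that defaults to $PATH) exists only to make the function easy to try out.

```shell
#!/bin/bash
# Sketch: look up a file name in a colon-separated directory list.
# find_in_path is a hypothetical name; $2 defaults to the current $PATH.
find_in_path() {
    local name=$1 list=${2-$PATH} dir
    local -a dirs
    # IFS is set for this read only, so the rest of the script is unaffected.
    IFS=: read -r -a dirs <<< "$list"
    for dir in "${dirs[@]}"; do
        [ -n "$dir" ] || dir=.        # an empty entry stands for the current directory
        if [ -e "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1                          # not found in any directory
}
```

For example, find_in_path ls prints the first match for ls in $PATH, if there is one, and returns a non-zero status otherwise.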
