remove delimiter if condition not satisfied and substitute a string on condition - shell

Consider the below file.
HEAD~XXXX
XXX~XXX~XXX~XXX~XXX~XXX~~WIN~SCRIPT~~~
XXX~XXX~XXX~XXX~XXX~XXX~~WIN~TPSCRI~~~
XXX~XXX~XXX~XXX~XXX~XXX~~WIN~RSCPIT~~~
TAIL~20
I want the output to look like this:
HEAD~XXXX
XXX~XXX~XXX~XXX~XXX~XXX~~WIN~SCRIPT~~~
XXX~XXX~XXX~XXX~XXX~XXX~~~~~~
XXX~XXX~XXX~XXX~XXX~XXX~~~~~~
TAIL~20
If the 9th field is not SCRIPT, I want both the 8th and 9th fields to be empty, like the 10th. Lines containing the words HEAD or TAIL have to be excluded from this condition (i.e., lines where NF != 13): I need the header and footer exactly as they are in the input.
I have tried the below, but there should be a smarter way.
awk -F'~' -v OFS='~' '($9 != "Working line takeover with change of CP" {$9 = ""}) && ($9 != "Working line takeover with change of CP" {$8 = ""}) {NF=13; print}' file
the above doesn't work
head -1 file > head
tail -1 file > tail
sed -i '/HDR/d' file
sed -i '/TLR/d' file
sed -i '/^\s*$/d' file
awk -F'~' -v OFS='~' '$9 != "Working line takeover with change of CP" {$9,$8 = ""} {NF=13; print}' file >> file.tmp //syntax error
cat file.tmp >> head
cat tail >> head
echo "" >> head
mv head file1
I'm trying to write a UNIX shell script with the requirements below.
Consider a file like this:
XXX~XXX~XXX~XXX~XXX~XXX~~XXX~~SCRIPT~~~
XXX~XXX~XXX~XXX~XXX~XXX~~XXX~~OTHERS~~~~
XXX~XXX~XXX~XXX~XXX~XXX~~XXX~~OTHERS~~~
Each line should have 12 fields (with ~ as the delimiter); if it has more, the extra ~ has to be removed.
If anything other than the string SCRIPT is present in the 10th field, the field has to be emptied.
I tried the below in /bin/bash, I know I'm not doing it so well. I'm feeding line to sed & awk commands.
while read readline
echo "entered while"
do
fieldcount=`echo $readline | awk -F '~' '{print NF}'`
echo "Field count printed"
if [ $fieldcount -eq 13 ] && [ $fieldcount -ne 12 ]
then
echo "entering IF & before deletion"
#remove delimiter at the end of line
#echo "$readline~" >> $S_DIR/$1.tmp
#sed -i '/^\s*$/d' $readline
sed -i s'/.$//' $readline
echo "after deletion"
if [ awk '/SCRIPT/' $readline -ne "SCRIPT"]
then
#sed -i 's/SCRIPT//' $readline
replace_what="OTHERS"
#awk -F '~' -v OFS=~ '{$'$replace_what'=''; print }'
sed -i 's/[^,]*//' $replace_what
echo "$readline" >> $S_DIR/$1.tmp
fi
else
echo "$readline" >> $S_DIR/$1.tmp
fi
done < $S_DIR/$1

awk -F'~' -v OFS='~' '$10 != "SCRIPT" {$10 = ""} {NF=12; print}' file
XXX~XXX~XXX~XXX~XXX~XXX~~XXX~~SCRIPT~~
XXX~XXX~XXX~XXX~XXX~XXX~~XXX~~~~
XXX~XXX~XXX~XXX~XXX~XXX~~XXX~~~~
In bash, I would write:
(
# execute in a subshell, so the IFS setting is localized
IFS='~'
while read -ra fields; do
[[ ${fields[9]} != "SCRIPT" ]] && fields[9]=''
echo "${fields[*]:0:12}"
done < file
)
Your followup question:
awk -F'~' -v OFS='~' '
$1 == "HEAD" || $1 == "TAIL" {print; next}
$9 != "SCRIPT" {$8 = $9 = ""}
{NF=13; print}
' file
If you have further questions, please create a new question instead of editing this one.
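A minimal, self-contained sketch of the follow-up logic, run against the sample data from the question. The function name `blank_non_script` is hypothetical, and the sketch assumes header and trailer lines start with HEAD or TAIL. It leaves the field count untouched instead of forcing `NF`, which is an assumption about what the sample data needs.

```shell
# Hypothetical helper: reads ~-delimited lines on stdin.
blank_non_script() {
    awk -F'~' -v OFS='~' '
        # Pass header and trailer lines through unchanged.
        $1 == "HEAD" || $1 == "TAIL" { print; next }
        # Empty fields 8 and 9 unless field 9 is exactly SCRIPT.
        $9 != "SCRIPT" { $8 = $9 = "" }
        { print }
    '
}

# Usage with the sample input from the question:
printf '%s\n' \
    'HEAD~XXXX' \
    'XXX~XXX~XXX~XXX~XXX~XXX~~WIN~SCRIPT~~~' \
    'XXX~XXX~XXX~XXX~XXX~XXX~~WIN~TPSCRI~~~' \
    'TAIL~20' | blank_non_script
```

Because the SCRIPT line is never modified, awk prints it byte for byte; only the modified lines are rebuilt with `OFS`.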

Related

sed command within a while loop doesn't write output

I have this input file
gb|KY798440.1|
gb|KY842329.1|
MG082893.1
MG173246.1
and I want to get all the characters that are between the "|" or the full line if there is no "|". That is a desired output that looks like
KY798440.1
KY842329.1
MG082893.1
MG173246.1
I wrote:
while IFS= read -r line; do
if [[ $line == *\|* ]] ; then
sed 's/.*\|\(.*\)\|.*/\1/' <<< $line >> output_file
else echo $line >> output_file
fi
done < input_file
Which gives me
empty line
empty line
MG082893.1
MG173246.1
(Note: "empty line" means an actual empty line; the text "empty line" is not literally written.)
The sed command works on a single example (i.e. sed 's/.*\|\(.*\)\|.*/\1/' <<< "gb|KY842329.1|" outputs KY842329.1) but within the loop it just does a line return. The else echo $line >> output_file seems to work.
Bare sed:
$ sed 's/^[^|]*|\||[^|]*$//g' file
Output:
KY798440.1
KY842329.1
MG082893.1
MG173246.1
You could do
sed '/|/s/[^|]*|\([^|]*\)|.*/\1/' input
or
awk 'NF>1 {print $2} NF < 2 { print $1}' FS=\| input
or
sed -e 's/[^|]*|//' -e 's/|.*//' input
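One likely reason the loop prints empty lines, assuming GNU sed: in a basic regular expression GNU sed treats `\|` as alternation, so `.*\|\(.*\)\|.*` matches the whole line through its first branch and `\1` stays empty. A literal pipe is written as a plain `|`. As a sketch, the loop can also avoid sed entirely by using shell parameter expansion (the function name `extract_ids` is made up for illustration):

```shell
# Hypothetical helper: prints the text between the first two "|" characters,
# or the whole line when it contains no "|".
extract_ids() {
    while IFS= read -r line; do
        case $line in
            *"|"*)
                rest=${line#*"|"}                # drop up to and including the first |
                printf '%s\n' "${rest%%"|"*}"    # drop from the next | to the end
                ;;
            *)
                printf '%s\n' "$line"
                ;;
        esac
    done
}

# Usage:
printf '%s\n' 'gb|KY798440.1|' 'MG082893.1' | extract_ids
```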

shell script : comma in the beginning instead of end

This is a part of my shell script.
for line in `cat $1`
do
startNum=`echo $line | awk -F "," '{print $1}'`
endNum=`echo $line | awk -F "," '{print $2}'`
operator=`echo $line | awk -F "," '{print $3}'`
termPrefix=`echo $line | awk -F "," '{print $4}'`
if [[ "$endNum" == 81* ]] || [[ "$endNum" == 33* ]] || [[ "$endNum" == 55* ]]
then
areaCode="${endNum:0:2}"
series="${endNum:2:4}"
startCLI="${startNum:6:4}"
endCLI="${endNum:6:4}"
else
areaCode="${endNum:0:3}"
series="${endNum:3:3}"
startCLI="${startNum:6:4}"
endCLI="${endNum:6:4}"
fi
echo "Add,${areaCode},${series},${startCLI},${endCLI},${termPrefix},"
#>> ${File}
done
The input is a CSV file containing many rows like these:
5557017101,5557017101,102,1694
5515585614,5515585614,102,084
Output of the shell script:
,dd,55,5701,7101,7101,1694
,dd,55,1558,5614,5614,0848
I'm not sure why the comma appears at the start of the output; according to the shell script it should come at the end.
Please help.
Here is a suggested awk command that should replace all of your shell+awk code. This awk also takes care of trailing \r:
awk 'BEGIN{FS=OFS=","} {sub(/\r$/, "")} NF>3{
startNum=$1; endNum=$2; termPrefix=$4;
if (endNum ~ /^(81|33|55)/) {
areaCode=substr(endNum,1,2); series=substr(endNum,3,4)
}
else {
areaCode=substr(endNum,1,3); series=substr(endNum,4,3)
}
startCLI=substr(startNum,7,4); endCLI=substr(endNum,7,4);
print "Add", areaCode, series, startCLI, endCLI, termPrefix
}' file
Add,55,5701,7101,7101,1694
Add,55,1558,5614,5614,084
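The leading comma is most likely a display effect of Windows line endings: if each input line ends in a carriage return, `$termPrefix` ends with `\r`, the terminal moves the cursor back to column one, and the final comma overwrites the `A` of `Add`. A minimal sketch that strips the carriage returns before splitting, assuming that diagnosis is right (the function names are hypothetical, and only two of the fields are printed to keep it short):

```shell
# Remove every carriage return from stdin.
strip_cr() {
    tr -d '\r'
}

# Split each cleaned line on commas with read; no awk call per field is needed.
print_rows() {
    strip_cr | while IFS=, read -r startNum endNum operator termPrefix; do
        printf 'Add,%s,%s,\n' "$endNum" "$termPrefix"
    done
}

# Usage: a CRLF-terminated line now keeps its comma at the end.
printf '5557017101,5557017101,102,1694\r\n' | print_rows
```

Running `od -c` on the first few lines of the input file is a quick way to confirm whether `\r` is really present.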

Unix - How do I have my shell script process more than one file from the command line?

I'm trying to modify an existing script I have to take up to three text files and transform them. Currently the script will only transform the text from one file. Here's the existing script I have:
if [ $# -eq 1 ]
then
if [ -f $1 ]
then
name="My Name"
echo $name
date
starting_data=$1
sed '/^id/ d' $starting_data > raw_data3
sed 's/-//g' raw_data3 > raw_data4
cut -f1 -d, raw_data4 > cutfile1.col1
cut -f2 -d, raw_data4 > cutfile1.col2
cut -f3 -d, raw_data4 > cutfile1.col3
sed 's/$/:/' cutfile1.col2 > last
sed 's/^ //' last > last2
sed 's/^ //' cutfile1.col3 > first
paste -d' ' first last2 cutfile1.col1 > final
cat final
else
echo "$1 cannot be found."
fi
else
echo "Please enter a filename."
fi
All those temp files are unnecessary. awk can do all of what sed and cut can do, so this should be what you want (pending the output field separator question)
if [ $# -eq 0 ]; then
echo "usage: $0 file ..."
exit 1
fi
for file in "$@"; do
if ! [ -f "$file" ]; then
echo "file not found: $file"
continue
fi
name="My Name"
echo "$name"
date
awk -F, -v OFS=" " '
/^id/ {next}
{
gsub(/-/, "")
sub(/^ /, "", $2)
sub(/^ /, "", $3)
print $3, $2 ":", $1
}
' "$file" > final
cat final
done
Note all my double quotes: those are required.
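A short sketch of why the quotes matter: `"$@"` expands to one word per argument, so a file name containing spaces stays intact. The function name `process_files` and its messages are illustrative only.

```shell
# Hypothetical helper: loops over every argument it is given.
process_files() {
    if [ "$#" -eq 0 ]; then
        echo "usage: process_files file ..." >&2
        return 1
    fi
    for file in "$@"; do
        if [ ! -f "$file" ]; then
            echo "file not found: $file"
            continue
        fi
        echo "processing: $file"
    done
}

# Usage: the first name contains a space and is still handled as one file.
demo_dir=$(mktemp -d)
: > "$demo_dir/my data.txt"
process_files "$demo_dir/my data.txt" "$demo_dir/missing.txt"
rm -rf "$demo_dir"
```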

Shell Script - Extract number at X column in current line in file

I am reading a file (test.log.csv) line by line until the end of the file, and I want to extract the value at 4th column of current line read then output the value to a text file. (output.txt)
For example, right now I read until 2nd line (INSERT,SLT_TEST_1,TEST,1127192896,0,DEBUG1) and I want to extract the number at column 4 in the current line and output to a text file named as output.txt.
test.log.csv
INSERT,SLT_TEST_1,TEST,1127192896,0,DEBUG1
INSERT,SLT_TEST_1,TEST,1127192896,0,DEBUG1
INSERT,SLT_TEST_1,TEST,1127192896,0,DEBUG1
The desired output is
output.txt
1127192896
1127192896
1127192896
Right now my script is as below
#! /bin/bash
clear
rm /home/mobaxterm/Script/output.txt
while IFS= read -r line
do
if [[ $line == *"INSERT"* ]] && [[ $line == *"$1"* ]]
then
echo $line >> /home/mobaxterm/Script/output.txt
lastID=$(awk -F "," '{if (NR==curLine) { print $4 }}' curLine="${lineCount}")
echo $lastID
else
if [ lastID == "$1" ]
then
echo $line >> /home/mobaxterm/Script/output.txt
fi
fi
lineCount=$(($lineCount+1))
done < "/home/mobaxterm/Script/test.log.csv"
The parameter ($1) will be 1127192896
I tried declaring a counter in the loop and compare NR with the counter, but the script just stopped after it found the first one.
Find all the lines where the 4th field is 1127192896 and output the 4th field:
awk -F, -v SEARCH="1127192896" '$4 ~ SEARCH {print $4}' test.log.csv
1127192896
1127192896
1127192896
Find all the lines containing the word "INSERT" and where the 4th field is 1127192896
awk -F, -v SEARCH="1127192896" '$4 ~ SEARCH && /INSERT/ {print $4}' test.log.csv
If you have the number you want to look for in a variable called $1, put that in place of the 1127192896, like this:
awk -F, -v SEARCH="$1" '$4 ~ SEARCH && /INSERT/ {print $4}' test.log.csv
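Note that `~` performs a regular-expression match, so a search value of 112 would also match 1127192896. A hedged variant that uses an exact comparison instead (the function name `extract_fourth` is hypothetical; appending `""` to both sides forces a string comparison):

```shell
# Hypothetical helper: $1 is the value to look for; CSV lines come from stdin.
extract_fourth() {
    awk -F, -v search="$1" '/INSERT/ && ($4 "") == (search "") { print $4 }'
}

# Usage:
printf '%s\n' \
    'INSERT,SLT_TEST_1,TEST,1127192896,0,DEBUG1' \
    'INSERT,SLT_TEST_1,TEST,1127192896,0,DEBUG1' | extract_fourth 1127192896
```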
You can combine parameter substitution with an array definition (array indexes are zero-based, so the 4th column is index 3):
array_variable=( ${line//,/ } )
sth_you_need=${array_variable[3]}
Or you can just use awk or cut:
sth_you_need=$(echo "$line" | awk -F, '{print $4}')
# or
sth_you_need=$(echo "$line" | cut -d, -f4)

put awk code in bash and sort the result

I have an awk script that combines 2 files and appends the result to the end of file.txt using ">>".
My code:
NR==FNR && $2!=0 {two[$0]++;j=1; next }{for(i in two) {split(i,one,FS); if(one[3] == $NF){x=$4;sub( /[[:digit:]]/, "A", $4); print j++,$1,$2,$3,x,$4 | "column -t" ">>" "./Desktop/file.txt"}}}
I want to put my awk code into a bash script, then sort file.txt and save the sorted result back to file.txt using >.
I tried this:
#!/bin/bash
command=$(awk '{NR==FNR && $2!=0 {two[$0]++;j=1; next }{for(i in two) {split(i,one,FS); if(one[3] == $NF){x=$4;sub( /[[:digit:]]/, "A", $4); print $1,$2,$3,$4 | "column -t" ">>" "./Desktop/file.txt"}}}}')
echo -e "$command" | column -t | sort -s -n -k4 > ./Desktop/file.txt
But it gives me the error "for reading (no such a file or directory)".
Where is my mistake?
Thanks in advance
1) you aren't specifying the input files for your awk script. This:
command=$(awk '{...stuff...}')
needs to be:
command=$(awk '{...stuff...}' file1 file2)
2) You moved your awk condition "NR == ..." inside the action part, so it no longer behaves as a condition.
3) Your awk script output is going into "file.txt" so "command" is empty when you echo it on the subsequent line.
4) You have unused variables x and j
5) You pass the arg FS to split() unnecessarily.
etc...
I THINK what you want is:
command=$( awk '
NR==FNR && $2!=0 { two[$0]++; next }
{
for(i in two) {
split(i,one)
if(one[3] == $NF) {
sub(/[[:digit:]]/, "A", $4)
print $1,$2,$3,$4
}
}
}
' file1 file2 )
echo -e "$command" | column -t >> "./Desktop/file.txt"
echo -e "$command" | column -t | sort -s -n -k4 >> ./Desktop/file.txt
but it's hard to tell.
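If the goal is to sort file.txt and write the result back to the same file, note that `sort file.txt > file.txt` truncates the file before sort reads it. A sketch using sort's `-o` option, which may name one of the input files as its output (the function name `sort_in_place` is hypothetical; `-s` is a common extension rather than POSIX):

```shell
# Hypothetical helper: $1 is the file to sort numerically and stably on column 4.
sort_in_place() {
    sort -s -n -k4 -o "$1" "$1"
}

# Usage on a temporary file:
demo_file=$(mktemp)
printf '%s\n' 'a b c 3' 'd e f 1' 'g h i 2' > "$demo_file"
sort_in_place "$demo_file"
cat "$demo_file"
rm -f "$demo_file"
```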
