I have a file with some data like this:
1, 3, 0, 0, 0
0, 4, 5, 0, 5
2, 6, 0, 1, 0
I would like to write a shell script to delete all the lines where the 3rd field is 0 (here lines 1 and 3).
I already know that
sed -i '/0/d' file.txt
but I don't know how to test only the 3rd field, whatever the other fields contain.
Do you have an idea?
Awk may be a better tool than sed for this task:
awk -F' *, *' '$3 != 0 {print}' FILE
But sed can do it:
sed -i '/^[0-9][0-9]*, [0-9][0-9]*, 0,/d' FILE
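The awk approach generalizes to any column. Below is a minimal sketch, assuming the same "comma plus optional spaces" separator as above; the helper name drop_zero_field and the scratch file are made up for illustration.

```shell
# drop_zero_field COL [FILE...]  (hypothetical helper name)
# Prints only the lines whose COL-th comma-separated field is not 0.
drop_zero_field() {
    col=$1
    shift
    awk -F' *, *' -v col="$col" '$col != 0' "$@"
}

# Sample data from the question, written to a scratch file.
data=$(mktemp)
printf '%s\n' '1, 3, 0, 0, 0' '0, 4, 5, 0, 5' '2, 6, 0, 1, 0' > "$data"

drop_zero_field 3 "$data"    # prints: 0, 4, 5, 0, 5
rm -f "$data"
```

Because the result goes to standard output, the original file stays untouched until you decide to overwrite it.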
sed -E -i '/^([^,]+,){2} 0,/d' file
Note that the two options must be given separately: sed reads -iE as -i with the backup suffix E, so extended regexps would not be enabled.
The parts of the command, explained (originally drawn with explain.py):
sed       run sed
-E        extended regexp (parentheses and braces without backslash escaping)
-i        in-place changes
/         start of the pattern for the delete command
^         at the beginning of the line
(         start of a group
[^,]      a character that is not a comma
+         at least one of them
,         followed by a comma
){2}      two times this inner pattern
 0,       followed by blank, zero, comma
/d        end of the pattern, delete command
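Before letting sed modify the file in place, it can be worth running the same expression without -i and looking at the output. A small sketch of such a dry run, assuming a sed that supports -E (GNU and BSD sed both do); the function name delete_third_zero is hypothetical.

```shell
# delete_third_zero [FILE...]  (hypothetical helper name)
# Same delete expression as above, but without -i: the result goes to
# standard output and no file is modified.
delete_third_zero() {
    sed -E '/^([^,]+,){2} 0,/d' "$@"
}

# Dry run on the sample data from the question.
printf '%s\n' '1, 3, 0, 0, 0' '0, 4, 5, 0, 5' '2, 6, 0, 1, 0' | delete_third_zero
# prints: 0, 4, 5, 0, 5
```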
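awk, emulating in-place editing (similar to sed -i) through a temporary file; redirecting to FILENAME while awk is still reading that file can truncate the input:
awk '$3!="0,"' file > file.tmp && mv file.tmp file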
Ruby(1.9+)
ruby -ane 'print unless $F[2]=="0," ' file
I have an output like this:
| Value | Value2 | Name1 | Type | Date | Status |
| Value1 | Value1 | Name1 | Type1 | Date | Success |
| Value2 | Value2 | Name2 | Type1 | Date | Failed |
| Value2 | Value2 | Name3 | Type1 | Date | Pending |
I want to get each column values in variables for each line containing status "Pending" in the last column.
Here the matching line would be:
| Value2 | Value2 | Name3 | Type1 | Date | Pending |
I want to get each column of this line in a variable:
myvar1=Value2
myvar2=Value2
myvar3=Name3
myvar4=Type1
myvar5=Date
What is the best way to do that?
Thanks
Simply:
while IFS= read -r line ;do
IFS='|' read -r foo myvar{1..6} foo <<<"$line"
[ "${myvar6}" ] && [ -z "${myvar6//*Pending*}" ] && echo "$line"
done <inputfile ;
Will print:
| Value2 | Value2 | Name3 | Type1 | Date | Pending |
I'm going to assume the output you mention comes from a command named your_command. If you have it in a file, for example, that command could be cat that_file.
I think that a case statement inside a loop is a legible, elegant solution.
your_command | (
while read line; do
case $line in
*'Pending |')
IFS='|' read -ra myvar <<< "$line"
echo ${myvar[1]}
echo ${myvar[2]}
echo ${myvar[3]}
echo ${myvar[4]}
echo ${myvar[5]}
;;
*)
echo ...IGNORED $line
;;
esac
done
)
The output with the example you have given is the following
...IGNORED | Value | Value2 | Name1 | Type | Date | Status |
...IGNORED | Value1 | Value1 | Name1 | Type1 | Date | Success |
...IGNORED | Value2 | Value2 | Name2 | Type1 | Date | Failed |
Value2
Value2
Name3
Type1
Date
If you don't want to use an array, for whatever reason, you can replace the IFS='|' read -ra myvar <<< "$line" line with
myvar1=$(echo $line | cut -d'|' -f 2)
myvar2=$(echo $line | cut -d'|' -f 3)
myvar3=$(echo $line | cut -d'|' -f 4)
myvar4=$(echo $line | cut -d'|' -f 5)
myvar5=$(echo $line | cut -d'|' -f 6)
First you can select the line. If it is only one ending with "Pending", this would work:
line=$(grep '| Pending |$' file.txt | sed 's/\s*|\s*/|/g' | sed 's/^|//g')
The variable line now holds only the values separated by the pipe symbol, without the spaces around it and without a pipe symbol at the beginning of the line.
Then, if you do not use an array, you can manually assign the variables like
myvar1=$(echo $line | awk -F'|' '{print $1}')
myvar2=$(echo $line | awk -F'|' '{print $2}')
...
If there are many lines containing the keyword "Pending" you have to use an array or a dynamic structure instead of static variable names.
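For the case of several "Pending" rows, one option is to let awk select and trim the rows and then read the fields in a loop, one row per iteration. This is only a sketch under the assumption that the table looks like the one in the question (each row starts and ends with "|" and no value contains "|"); pending_rows is a made-up helper name.

```shell
# pending_rows [FILE...]  (hypothetical helper name)
# Prints the five data columns of every row whose status is "Pending",
# joined by "|" and with the surrounding blanks removed.
pending_rows() {
    awk -F'|' '
        {
            # Trim blanks around every field.
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        }
        # Field 1 and field NF are empty because each row starts and ends
        # with "|", so the status sits in field NF-1.
        NF >= 7 && $(NF-1) == "Pending" {
            print $2 "|" $3 "|" $4 "|" $5 "|" $6
        }' "$@"
}

# Sample table from the question, written to a scratch file.
table=$(mktemp)
cat > "$table" <<'EOF'
| Value | Value2 | Name1 | Type | Date | Status |
| Value1 | Value1 | Name1 | Type1 | Date | Success |
| Value2 | Value2 | Name2 | Type1 | Date | Failed |
| Value2 | Value2 | Name3 | Type1 | Date | Pending |
EOF

# One iteration per matching row; the variables are reassigned each time.
pending_rows "$table" |
while IFS='|' read -r myvar1 myvar2 myvar3 myvar4 myvar5; do
    printf 'myvar3=%s myvar4=%s\n' "$myvar3" "$myvar4"
done
rm -f "$table"
```

The variables only exist inside the loop body, because the loop runs at the end of a pipeline; do the per-row work there.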
First, what you asked for:
$: while read -r myvar1 myvar2 myvar3 myvar4 myvar5 Pending
> do echo "myvar1=[$myvar1] myvar2=[$myvar2] myvar3=[$myvar3] myvar4=[$myvar4] myvar5=[$myvar5]"
> done < <( sed -n '/[|]\s*Pending\s*[|]\s*$/{ s,[ |], ,g; s/^ //; s/ $//; p; }' file )
myvar1=[Value2] myvar2=[Value2] myvar3=[Name3] myvar4=[Type1] myvar5=[Date]
The sed selects only the records you want (/[|]\s*Pending\s*[|]\s*$/), converts all the delimiter-crap to spaces (s,[ |], ,g;, which breaks if you have any spaces embedded in your data), strips a leading and a trailing space (s/^ //; s/ $//;), and prints the result (-n says don't print by default, p; says do print this record now).
But I think you should seriously reconsider the spaces around your delimiter, unless you wanted to keep them as part of the data. I'd leave off the leading and maybe the trailing delimiter. I'd also really consider putting them into an array, though I do understand you may want to use the fields by name...just don't call them myvar1, etc.
Another option is this:
awk 'BEGIN { FS=OFS="|" } $(NF-1)~/Pending/ { gsub(/^\s*\|\s*/, "", $0); NF-=2; print $0; }' file.txt | while IFS='|' read myVar1 myVar2 myVar3 myVar4 myVar5
do
#Do something
done
Maybe I'm using the wrong tool for the job here...
My data looks like this (this is from a json file which has been converted to a csv):
"hostname1",1,""
"hostname2",1,""
"hostname3",0,"yay_some_text
more_text
more_text
"
The first column is the hostname, second is the exit code and the third the result. I usually do something like this and make a moderately pretty table:
cat tmp.file | ( while read line
do
name=$(echo $line | awk -F "," '{print $1}')
exit_code=$(echo $line | awk -F "," '{print $2}')
output=$(echo $line | awk -F "," '{print $3}')
#I can then do stuff with the output here and ultimately do this:
echo -e "|${name}\t|${exit_code}\t|${output}\t|"
done
)
However, the third column is causing me no end of problems; I think that regardless of what I do, the read line bit will make this impossible. Does anyone have a better method of sorting this out? I'd ideally like to keep the line breaks, but if that's going to be too hard, I'll happily replace them with commas.
Desired output (either is fine):
| hostname1 | 1 | |
| hostname2 | 1 | |
| hostname3 | 0 | yay_some_text
more_text
more_text |
| hostname1 | 1 | |
| hostname2 | 1 | |
| hostname3 | 0 | yay_some_text, more_text, more_text |
Whichever of these you prefer will work robustly* and efficiently using any awk in any shell on every UNIX box:
$ cat tst.awk
{ rec = rec $0 ORS }
/"$/ {
gsub(/[[:space:]]*"[[:space:]]*/,"",rec)
gsub(/,/," | ",rec)
printf "| %s |\n", rec
rec = ""
}
.
$ awk -f tst.awk file
| hostname1 | 1 | |
| hostname2 | 1 | |
| hostname3 | 0 | yay_some_text
more_text
more_text |
.
$ cat tst.awk
{ rec = rec $0 RS }
/"$/ {
gsub(/[[:space:]]*"[[:space:]]*/,"",rec)
gsub(/,/," | ",rec)
gsub(RS,", ",rec)
printf "| %s |\n", rec
rec = ""
}
.
$ awk -f tst.awk file
| hostname1 | 1 | |
| hostname2 | 1 | |
| hostname3 | 0 | yay_some_text, more_text, more_text |
*robustly assuming your quoted strings never contain commas or escaped double quotes, i.e. the input looks like the example you provided, which is what your existing code relies on too.
$ gawk -v RS='"\n' -v FPAT='[^,]*|"[^"]*"' -v OFS=' | ' '
{gsub(/"/,""); $1=$1; print OFS $0 OFS}' file
| hostname1 | 1 | |
| hostname2 | 1 | |
| hostname3 | 0 | yay_some_text
more_text
more_text
|
In your case, one way is to transform the file into a simpler structure before using your loop:
awk '/[^"]$/ { printf("%s", $0); next } 1' tmp.file | ( while read line
do
name=$(echo $line | awk -F ',' '{print $1}')
exit_code=$(echo $line | awk -F ',' '{print $2}')
output=$(echo $line | awk -F ',' '{print $3}')
#I can then do stuff with the output here and ultimately do this:
echo -e "|${name}\t|${exit_code}\t|${output}\t|"
done
)
If all you want to do is display the data as a table, you can use the column utility:
awk '/[^"]$/ { printf("%s", $0); next } 1' tmp.file | column -t -o " | " -s ,
If you are particular about the starting and ending separator '|', you can simply pipe the output of this command to sed or awk.
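As a rough sketch of that last step, a single sed call can add the outer separators to every line. The helper name add_borders is made up, and the input lines below only imitate what the column command might produce.

```shell
# add_borders  (hypothetical helper name)
# Wraps every line read from standard input in a leading "| " and a
# trailing " |".
add_borders() {
    sed 's/^/| /; s/$/ |/'
}

# Example with two rows that already use " | " between the columns.
printf '%s\n' 'hostname1 | 1 | ' 'hostname2 | 1 | ' | add_borders
```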
I have this String:
"a | a | a | a | a | a | a | a"
and I want to replace every " | " with an incrementing value like so:
"a0a1a2a3a4a5a6a"
I know I can use gsub to replace strings:
> echo "a | a | a | a | a | a | a | a" | awk '{gsub(/\ \|\ /, ++i)}1'
a1a1a1a1a1a1a1a
But it seems gsub only increments after each newline, so my solution for now would be first putting a newline after each " | ", then using gsub and deleting the newlines again:
> echo "a | a | a | a | a | a | a | a" | awk '{gsub(/\ \|\ /, " | \n")}1' | awk '{gsub(/\ \|\ /, ++i)}1' | tr -d '\n'
a1a2a3a4a5a6a7a
Which is honestly just disgusting...
Is there a better way to do this?
If perl is okay:
$ echo 'a | a | a | a | a | a | a | a' | perl -pe 's/ *\| */$i++/ge'
a0a1a2a3a4a5a6a
*\| * match | surrounded by zero or more spaces
e modifier allows to use Perl code in replacement section
$i++ use value of $i and increment (default value 0)
You can use awk like this:
s="a | a | a | a | a | a | a | a"
awk -F ' *\\| *' -v OFS="" '{s=""; for(i=1; i<NF; i++) s = s $i i-1; print s $i}' <<< "$s"
a0a1a2a3a4a5a6a
-F ' *\\| *' sets | surrounded by optional spaces as the input field separator.
The for loop goes through each field except the last and appends the field's zero-based position after it; the last field is printed as is.
If using just sh is an option, then perhaps substitute until a fixed point is reached:
s=$1 # first argument passed to script, "a | a | a |..."
n=0
while true
do
prev=$s
s=${s%" | a"}
test "$s" = "$prev" && break
result=$result${n}"a"
n=$((n + 1))
done
echo $s$result
If this program lives in script file digits.sh,
$ sh digits.sh "a | a | a | a | a | a | a | a"
a0a1a2a3a4a5a6a
$
Another solution using awk
echo "a | a | a | a | a | a | a | a" |
awk -v RS="[ ]+[|][ ]+" '{printf "%s%s",(f?NR-2:""),$0; f=1}'
you get,
a0a1a2a3a4a5a6a
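The reason the gsub attempt in the question prints the same number everywhere is that the replacement argument is evaluated once, before gsub starts replacing. A minimal sketch of a way around that, using sub (which replaces only the first match) in a loop; number_separators is a hypothetical name, and the sketch assumes the data itself never contains " | ".

```shell
# number_separators  (hypothetical helper name)
# Replaces each " | " on a line with 0, 1, 2, ... from left to right.
number_separators() {
    awk '{
        n = 0
        # sub() replaces only the first remaining match, so the replacement
        # value is evaluated again on every pass of the loop.
        while (sub(/ \| /, n)) n++
        print
    }'
}

echo "a | a | a | a | a | a | a | a" | number_separators
# prints: a0a1a2a3a4a5a6a
```

The counter restarts at 0 for every input line; move n = 0 into a BEGIN block if it should keep counting across lines.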
Part of my Bash assignment includes reading a text file, then separating each line into words and using them.
The words are separated by |, lines are separated by \n. We were told to use the tr command, but I couldn't find an elegant solution.
An example:
Hello | My | Name | Is | Bill
should give:
Hello
My
Name
Is
Bill
One word per iteration.
You only need one invocation of tr to do the job:
$ echo "Hello | My | Name | Is | Bill" | tr -cs '[:alpha:]' '\n'
Hello
My
Name
Is
Bill
$
The -c option is for 'the complement' of the characters in the first pattern; the -s option 'squeezes' out duplicate replacement characters. So, anything that's not alphabetic is converted to a newline, but consecutive newlines are squeezed to a single newline.
Clearly, if you need to keep 'Everyone else | can | call | me | Fred' with the two words in the first line of output, then you have to work considerably harder:
$ echo "Everyone else | can | call | me | Fred" |
> tr '|' '\n' |
> sed 's/ *$//;s/^ *//'
Everyone else
can
call
me
Fred
$
The sed script here removes leading and trailing blanks, leaving intermediate blanks unchanged. You can replace multiple blanks with a single blank if you need to, and so on and so forth. You can't use tr to conditionally replace a given character (to change some blanks and leave others alone, for example).
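Since the question asks for one word per iteration, the tr and sed pipeline above can feed a while read loop. A sketch, with split_words as a made-up helper name and [[:blank:]] used so that tabs are trimmed as well as spaces:

```shell
# split_words  (hypothetical helper name)
# Turns "a | b c | d" into one field per line, trimming the blanks
# around each field but keeping blanks inside it.
split_words() {
    tr '|' '\n' | sed 's/^[[:blank:]]*//; s/[[:blank:]]*$//'
}

# One field per loop iteration.
echo "Everyone else | can | call | me | Fred" | split_words |
while IFS= read -r word; do
    printf 'word: %s\n' "$word"
done
```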
some other options:
awk:
awk -F'\\| ' -v OFS="\n" '$1=$1'
example:
kent$ echo "Hello | My | Name | Is | Bill" |awk -F'\\| ' -v OFS="\n" '$1=$1'
Hello
My
Name
Is
Bill
grep
grep -o '[^ |]*'
example:
kent$ echo "Hello | My | Name | Is | Bill"|grep -o '[^ |]*'
Hello
My
Name
Is
Bill
sed
sed 's/ | /\n/g'
example:
kent$ echo "Hello | My | Name | Is | Bill" |sed 's/ | /\n/g'
Hello
My
Name
Is
Bill
My favorite perl :)
echo "Hello | My | Name | Is | Bill" | perl -pe 's/\s*\|\s*/\n/g'
will remove the excessive spaces too, so
echo "Hello | My | Name | Is | Bill" | perl -pe 's/\s*\|\s*/\n/g' | cat -vet
will print
Hello$
My$
Name$
Is$
Bill$
Using tr:
echo "Hello | My | Name | Is | Bill" | tr -s '\| ' '\n'
OR if you decide to give awk a chance:
echo "Hello | My | Name | Is | Bill" | awk -F '|' '{for (i=1; i<=NF; i++) {
gsub(/^ +| +$/, "", $i); print $i}}'
This code should do it: it converts '|' to a newline and removes every blank, so it only suits fields that are single words:
echo "Hello | My | Name | Is | Bill" | tr '|' '\n' | tr -d '[:blank:]'
File temp: Hello | My | Name | Is | Bill
$ cat temp | tr '|' '\n' | sed 's/^ *//g'
Hello
My
Name
Is
Bill
$
The sed part gets rid of leading spaces (because there is a space between the '|' and the word). This will also work for "Hello everyone | My | Name | Is | Bill":
$ cat temp | tr '|' '\n' | sed 's/^ *//g'
Hello everyone
My
Name
Is
Bill
$
Need to optimize the UNIX shell one liner
cat ${TEMPFILE} | cut -d ' ' -f1 | sed '/^$/d'| sed '1,4d'| sed 's/$/|ON_ICE|OFF_ICE/g' > ${MYREPORT}
as this is causing performance issues.
Call sed only once:
cat ${TEMPFILE}|cut -d ' ' -f1|sed '/^$/d;1,4d;s/$/|ON_ICE|OFF_ICE/g'>${MYREPORT}
use awk as follows:
awk '{ $0 = $1 } NF && ++rec > 4 { print $0 "|ON_ICE|OFF_ICE" }' "${TEMPFILE}" > "${MYREPORT}"
awk '/^$/ || ++count <= 4 {next} {print $1 "|ON_ICE|OFF_ICE"}' "$TEMPFILE" > "$MYREPORT"
In
cat ${TEMPFILE} | cut -d ' ' -f1 | sed '/^$/d' | sed '1,4d' | sed 's/$/|ON_ICE|OFF_ICE/g' > ${MYREPORT}
clearly you can replace sed '1,4d' with tail -n +5 (the output has to start at line 5 for the first four lines to be dropped)
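A quick way to check which tail offset matches sed '1,4d' is to run both on a few numbered lines. The sketch below assumes a POSIX tail, where -n +N means "start output at line N"; the two function names are made up.

```shell
# Two ways to drop the first four lines of the input
# (hypothetical helper names).
drop_first_four_sed()  { sed '1,4d'; }
drop_first_four_tail() { tail -n +5; }

# Both calls print the lines "5" and "6".
printf '%s\n' 1 2 3 4 5 6 | drop_first_four_sed
printf '%s\n' 1 2 3 4 5 6 | drop_first_four_tail
```

Deleting lines 1 to 4 leaves line 5 as the first line of output, so the matching offset is +5, not +4.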