Removing current line of a file - bash

I'm facing a problem that looks easy, but I can't find the answer:
The goal of this function is to remove all the lines that contain 3 commas ',':
while read -r line; do
    COUNT=$(echo "$line" | grep -o "," | wc -l)
    if [ "$COUNT" -ne 3 ]; then
        remove line
    fi
done < tmp.txt
I can't find how to remove the current line. Can you help me?
I extracted this tmp.txt from a larger file with grep. If the data were in a variable instead of tmp.txt, would the approach be the same?
while read -r line; do
    COUNT=$(echo "$line" | grep -o "," | wc -l)
    if [ "$COUNT" -ne 3 ]; then
        remove line
    fi
done <<< "$toto"
Thanks in advance

A sed-only solution:
sed '/^\([^,]*,\)\{3\}[^,]*$/d' infile
This deletes every line in which the comma character occurs exactly 3 times.
Or using awk:
awk -F, 'NF!=4' infile
Or, with either tool, reading from a variable:
sed '/^\([^,]*,\)\{3\}[^,]*$/d' <<<"$variable"
awk -F, 'NF!=4' <<<"$variable"
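As a quick sanity check, the two filters can be compared on a small sample. The data below is made up for illustration, and the sketch pipes the variable in with printf instead of a here-string so that it also runs in a plain POSIX sh:

```shell
# Hypothetical sample input; only the second line has exactly 3 commas.
variable='one,two
a,b,c,d
v,w,x,y,z'

# sed: delete lines made of three comma-terminated fields plus one final field.
sed_out=$(printf '%s\n' "$variable" | sed '/^\([^,]*,\)\{3\}[^,]*$/d')

# awk: with "," as the separator, 3 commas means 4 fields; keep all other lines.
awk_out=$(printf '%s\n' "$variable" | awk -F, 'NF!=4')

printf '%s\n' "$sed_out"
```

Both variables should end up holding the first and third lines only.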

A simple awk solution
awk 'gsub(/,/,",")!=3' file
gsub replaces the pattern with the specified string and it returns the number of substitutions/replacements made.
Here we replace , with , so the line is unchanged and gsub returns the number of commas in it.
Example :
Input file
hello this line has 1 ,
This line, has, 3 ,
This line, has, 4 , commas , Thanks
Output
$ awk 'gsub(/,/,",")!=3' file
hello this line has 1 ,
This line, has, 4 , commas , Thanks

I would have done it the other way around:
while read -r line; do
    COUNT=$(echo "$line" | grep -o "," | wc -l)
    if [ "$COUNT" -eq 3 ]; then
        echo "$line" >> "$tempofile"
    fi
done < tmp.txt
If the line is matched, keep it, otherwise get to next line.
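A minimal sketch of the same keep-what-matches idea, wrapped in a function that rewrites the file through a temporary copy. The function name filter_three_commas is hypothetical, and counting is done by stripping every non-comma character rather than with grep and wc:

```shell
# filter_three_commas FILE: rewrite FILE so only lines with exactly 3 commas remain.
# (Hypothetical helper name, for illustration only.)
filter_three_commas() {
    tmp=$(mktemp) || return 1
    while IFS= read -r line || [ -n "$line" ]; do
        # Delete everything except commas; the remaining length is the comma count.
        commas=$(printf '%s' "$line" | tr -cd ',')
        if [ "${#commas}" -eq 3 ]; then
            printf '%s\n' "$line" >> "$tmp"
        fi
    done < "$1"
    # Copy the result back over the original, then drop the temporary file.
    cat "$tmp" > "$1" && rm -f "$tmp"
}
```

Calling filter_three_commas tmp.txt would then leave only the three-comma lines in tmp.txt. For large files the awk one-liners are likely to be much faster than a shell loop.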

This simple command removes every line that contains the character 3 (it tests for the digit itself, not for three commas):
$ awk '!/3/' file_name

Related

how to awk pattern as variable and loop the result?

I assign a keyword to a variable and need to use awk to search a file for that variable, then loop over the results. The file has millions of lines.
I have tried the code below.
DEVICE="DEV2"
while read -r line
do
echo $line
X_keyword=`echo $line | cut -d ',' -f 2 | grep -w "X" | cut -d '=' -f2`
echo $X_keyword
done <<< "$(grep -w $DEVICE $config)"
log="Dev2_PRT.log"
while read -r file
do
VALUE=`echo $file | cut -d '|' -f 1`
HEADER=`echo $VALUE | cut -c 1-4`
echo $file
if [[ $HEADER = 'PTR:' ]]; then
VALUE=`echo $file | cut -d '|' -f 4`
echo $VALUE
XCOORD+=($VALUE)
((X++))
fi
done <<< "awk /$X_keyword/ $log"
Expected result:
The log file contains many lines like the ones below:
PTR:1|2|3|4|X_keyword
PTR:1|2|3|4|Y_rest .....
Filter on the X_keyword and get field number 4.
Unfortunately your shell script is simply the wrong approach to this problem (see https://unix.stackexchange.com/q/169716/133219 for some of the reasons why) so you should set it aside and start over.
To demonstrate the solution, let's create a sample input file:
$ seq 10 | tee file
1
2
3
4
5
6
7
8
9
10
and a shell variable to hold a regexp that's a character list of the chars 5, 6, or 7:
$ var='[567]'
Now, given the above input, here is how to g/re/p with a pattern held in a variable and count the results:
$ awk -v re="$var" '$0~re{print; c++} END{print "---" ORS c+0}' file
5
6
7
---
3
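Applied to the pipe-delimited log from the question, the same technique might look like the sketch below. The log lines are invented from the sample in the question, and keyword stands in for whatever value was extracted from the config:

```shell
# Hypothetical log lines modelled on the sample in the question.
log=$(mktemp)
printf '%s\n' \
    'PTR:1|2|3|4|X_keyword' \
    'PTR:5|6|7|8|Y_rest' \
    'ABC:1|2|3|9|X_keyword' > "$log"

keyword='X_keyword'   # assumed to have been read from the config already

# Split on "|", keep PTR: lines that match the keyword, and print field 4.
values=$(awk -F'|' -v re="$keyword" '/^PTR:/ && $0 ~ re { print $4 }' "$log")

printf '%s\n' "$values"
rm -f "$log"
```

A single awk pass like this avoids spawning cut and grep once per line, which matters when the file has millions of lines.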
If that's not all you need then please edit your question to clarify your requirements and provide concise, testable sample input and expected output.

Bash : How to check in a file if there are any word duplicates

I have a file with 6 character words in every line and I want to check if there are any duplicate words. I did the following but something isn't right:
#!/bin/bash
while read line
do
name=$line
d=$( grep '$name' chain.txt | wc -w )
if [ $d -gt '1' ]; then
echo $d $name
fi
done <$1
Assuming each word is on a new line, you can achieve this without looping:
$ cat chain.txt | sort | uniq -c | grep -v " 1 " | cut -c9-
You can use awk for that:
awk -F'\n' 'found[$1] {print}; {found[$1]++}' chain.txt
Set the field separator to newline, so that we look at the whole line. Then, if the line already exists in the array found, print the line. Finally, add the line to the found array.
Note: a line is suppressed only once, so if the same line appears, say, 6 times, it will be printed 5 times.
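If each duplicated word should be reported only once, one possible variation is to print a line only on its second occurrence. The word list here is made up for illustration:

```shell
# Hypothetical word list: "banana" appears three times, "cherry" twice.
words='orange
banana
cherry
banana
cherry
banana'

# seen[$0]++ evaluates to the count before this line, so "== 1" is true
# only the second time a line is seen.
dups=$(printf '%s\n' "$words" | awk 'seen[$0]++ == 1')

printf '%s\n' "$dups"
```

The duplicates come out in the order in which their second occurrence appears in the input.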

How to find a line containing a word in a file and echo the date from that line to the end of the file using a shell script

cat "file.log"| grep -q '2013-11-10'
while read line
do
echo file_content_time=`echo $line | sed -e 's/\([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]\).*/\1/'`
if [ $? -eq 0 ]
then
echo comparison_start_date=`date -d "$file_content_time" +%Y%m%d`
fi
done < 'file.log'
/* Here I am trying to find the line containing '2013-11-10' and display the date from that line onwards. */
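To output everything from a line containing a pattern up to the end of the file, all you need is:
awk '/2013-11-10/,/pattern-not-in-file/' file.log
Or set a flag on the first match and print every line while the flag is set:
awk '/pattern/{p=1}p' your_file
To capture only the first matching timestamp in a variable:
initial_time=$(grep -o -m1 "2013-11-10 [0-9][0-9]:[0-9][0-9]:[0-9][0-9]" file.log)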
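A small end-to-end sketch of the flag approach and the timestamp extraction, run against an invented three-line log. It assumes a grep that supports -o and -m (GNU and BSD grep both do):

```shell
# Hypothetical log file with one line before the date of interest.
log=$(mktemp)
printf '%s\n' \
    '2013-11-09 23:59:59 old entry' \
    '2013-11-10 08:15:00 first match' \
    '2013-11-11 09:00:00 later entry' > "$log"

# Print from the first line containing the date through the end of the file.
tail_part=$(awk '/2013-11-10/ { p = 1 } p' "$log")

# Extract only the timestamp of the first matching line.
initial_time=$(grep -o -m1 '2013-11-10 [0-9][0-9]:[0-9][0-9]:[0-9][0-9]' "$log")

printf '%s\n' "$tail_part"
rm -f "$log"
```

Here tail_part holds the second and third log lines, and initial_time holds the timestamp of the first match.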

How to delete all lines containing more than three characters in the second column of a CSV file?

How can I delete all of the lines in a CSV file which contain more than 3 characters in the second column? E.g.:
cave,ape,1
tree,monkey,2
The second line contains more than 3 characters in the second column, so it will be deleted.
awk -F, 'length($2)<=3' input.txt
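A quick way to try this on the sample from the question. The third row is added here purely for illustration:

```shell
# Sample rows from the question plus one hypothetical extra row.
csv='cave,ape,1
tree,monkey,2
den,fox,3'

# Keep rows whose second comma-separated field has at most 3 characters.
kept=$(printf '%s\n' "$csv" | awk -F, 'length($2) <= 3')

printf '%s\n' "$kept"
```

Note that some awk implementations may count bytes rather than characters in length, which could matter if the second column contains non-ASCII text.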
You can use this command:
grep -vE "^[^,]+,[^,]{4,}," test.csv > filtered.csv
Breakdown of the grep syntax:
-v = remove lines matching
-E = extended regular expression syntax (also -P is perl syntax)
Shell redirection:
> filename = overwrite/create a file and fill it with the standard output
Breakdown of the regex syntax:
"^[^,]+,[^,]{4,},"
^ = beginning of line
[^,] = anything except commas
[^,]+ = 1 or more of anything except commas
, = comma
[^,]{4,} = 4 or more of anything except commas
And please note that the above is simplified and would not work if the first 2 columns contained commas in the data. (it does not know the difference between escaped commas and raw ones)
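Two further edge cases follow from the regex breakdown: because of the + the first column must be non-empty, and because of the final comma a third column must exist. The sketch below contrasts the pattern above with a looser variant (an assumption, not part of the original answer) on made-up rows:

```shell
# Hypothetical rows: an empty first column and a row with only two columns.
csv='cave,ape,1
tree,monkey,2
,orangutan,3
nest,eagle'

# Pattern from the answer: needs a non-empty first column and a third column.
strict=$(printf '%s\n' "$csv" | grep -vE '^[^,]+,[^,]{4,},')

# Looser variant: first column may be empty, no trailing comma required.
loose=$(printf '%s\n' "$csv" | grep -vE '^[^,]*,[^,]{4,}')

printf 'strict:\n%s\nloose:\n%s\n' "$strict" "$loose"
```

With this input the strict pattern removes only the monkey row, while the looser one leaves only the ape row.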
No one has supplied a sed answer yet, so here it is:
sed -e '/^[^,]*,[^,]\{4\}/d' animal.csv
And here's some test data.
>animal.csv cat <<'.'
cave,ape,0
,cat,1
,orangutan,2
large,wolf,3
,dog,4,happy
tree,monkey,5,sad
.
And now to test:
sed -i'' -e '/^[^,]*,[^,]\{4\}/d' animal.csv
cat animal.csv
Only ape, cat and dog should appear in the output.
This is a filter script for your type of data. It assumes your data is in utf8
#!/bin/bash
function px {
local a="$1"
local i=0
while [ $i -lt ${#a} ]
do
printf \\x${a:$i:2}
i=$(($i+2))
done
}
(iconv -f UTF8 -t UTF16 | od -x | cut -b 9- | xargs -n 1) |
if read utf16header
then
px $utf16header
cnt=0
out=''
st=0
while read line
do
if [ "$st" -eq 1 ] ; then
cnt=$(($cnt+1))
fi
if [ "$line" == "002c" ] ; then
st=$(($st+1))
fi
if [ "$line" == "000a" ]
then
out=$out$line
if [[ $cnt -le 3+1 ]] ; then
px $out
fi
cnt=0
out=''
st=0
else
out=$out$line
fi
done
fi | iconv -f UTF16 -t UTF8

bash: grep only lines with certain criteria

I am trying to grep out the lines in a file where the third field matches certain criteria.
I tried using grep but had no luck in filtering out by a field in the file.
I have a file full of records like this:
12794357382;0;219;215
12795287063;0;220;215
12795432063;0;215;220
I need to grep only the lines where the third field is equal to 215 (in this case, only the third line)
Thanks a lot in advance for your help!
Put down the hammer.
$ awk -F ";" '$3 == 215 { print $0 }' <<< $'12794357382;0;219;215\n12795287063;0;220;215\n12795432063;0;215;220'
12795432063;0;215;220
grep:
grep -E "^[^;]*;[^;]*;215;.*" yourFile
in this case, awk would be easier:
awk -F';' '$3==215' yourFile
A solution in pure bash for the pre-processing, still needing a grep:
while read line; do
    OLD_IFS=$IFS; IFS=";"
    line_array=( $line )
    IFS=$OLD_IFS
    test "${line_array[2]}" = 215 && echo "$line"
done < file | grep _your_pattern_
Simple egrep (=grep -E)
egrep ';215;[0-9][0-9][0-9]$' /path/to/file
or
egrep ';215;[[:digit:]]{3}$' /path/to/file
How about something like this:
cat your_file | while read line; do
    if [ "$(echo "$line" | cut -d ";" -f 3)" == "215" ]; then
        # This is the line you want
        echo "$line"
    fi
done
Here is the sed version to grep for lines where 3rd field is 215:
sed -n '/^[^;]*;[^;]*;215;/p' file.txt
Simplify your problem by putting the 3rd field at the beginning of the line:
cut -d ";" -f 3 file | paste -d ";" - file
then grep for the lines matching the 3rd field and remove the 3rd field at the beginning:
grep "^215;" | cut -d ";" -f 2-
and then you can grep for whatever you want. So the complete solution is:
cut -d ";" -f 3 file | paste -d ";" - file | grep "^215;" | cut -d ";" -f 2- | grep _your_pattern_
Advantage: Easy to understand; drawback: many processes.
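For comparison, here is a sketch that takes the value to match from a shell variable (the name want is hypothetical), once with awk and once with a shell-only loop that lets read do the splitting. The records are the three from the question:

```shell
records='12794357382;0;219;215
12795287063;0;220;215
12795432063;0;215;220'
want=215   # hypothetical variable holding the value to match

# awk: pass the shell value in with -v and compare it with field 3.
awk_hits=$(printf '%s\n' "$records" | awk -F';' -v want="$want" '$3 == want')

# Shell only: read splits each record on ";" into three fields plus the rest.
sh_hits=$(printf '%s\n' "$records" | while IFS=';' read -r f1 f2 f3 rest; do
    if [ "$f3" = "$want" ]; then
        printf '%s;%s;%s;%s\n' "$f1" "$f2" "$f3" "$rest"
    fi
done)

printf '%s\n' "$awk_hits"
```

Both approaches should select only the third record. The awk form is a single process, whereas the loop is mainly useful when each matching record needs further shell-side handling.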
