I have a lot of *.csv files. I want to delete everything after a specific line: all lines that come after the ones dated 20031231.
How can I do this with a few lines of shell script?
Test,20031231,000107,0.74843,0.74813
Test,20031231,000107,0.74838,0.74808
Test,20031231,000108,0.74841,0.74815
Test,20031231,000108,0.74835,0.74809
Test,20031231,000110,0.74842,0.74818
Test,20040101,000100,0.73342,0.744318
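Quick and dirty, since there is no other information about constraints. Note that this stops printing at the first line after line 1 that matches 20031231, so with several lines for that date (as in the sample) it cuts too early: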
sed '1,/20031231/p;d' YourFile
If you want to use a shell script, awk is the best tool for this. This will do the trick:
awk 'BEGIN {FS=","} {if ($2 == "20031231") print $0}' input.csv > output.csv
This writes only the lines whose second field is 20031231 to a different file.
This version ignores empty lines and lines dated after 20031231.
awk file:
$ cat awk.awk
{
if($2<="20031231" && $0!=""){
print $0
}else{
next
}
}
execution:
$ awk -F',' -f awk.awk input
Test,20031231,000107,0.74843,0.74813
Test,20031231,000107,0.74838,0.74808
Test,20031231,000108,0.74841,0.74815
Test,20031231,000108,0.74835,0.74809
Test,20031231,000110,0.74842,0.74818
one liner:
$ awk -F',' '{if($2<="20031231" && $0!=""){print $0}else{next}}' input
Test,20031231,000107,0.74843,0.74813
Test,20031231,000107,0.74838,0.74808
Test,20031231,000108,0.74841,0.74815
Test,20031231,000108,0.74835,0.74809
Test,20031231,000110,0.74842,0.74818
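With Miller (http://johnkerl.org/miller/doc/), the filter below selects the lines after 20031231, i.e. the ones to drop; change the comparison to $2<=20031231 to keep the lines up to that date instead: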
mlr --nidx --fs "," filter '$2>20031231' input
gives you
Test,20040101,000100,0.73342,0.744318
With awk please try:
awk -F, '$2<=20031231' input.csv
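Since the question mentions many *.csv files, the same filter can be wrapped in a loop. The following is only a minimal sketch, assuming the second field always holds a date in YYYYMMDD form; the function name truncate_after and its parameters are made up for illustration and do not come from the answers above.

```shell
#!/bin/sh
# Hypothetical helper: keep only rows whose 2nd field is <= the cutoff date.
# Usage: truncate_after CUTOFF FILE...
truncate_after() {
    cutoff=$1
    shift
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # NF skips empty lines; adding 0 forces a numeric comparison.
        if awk -F, -v max="$cutoff" 'NF && $2+0 <= max+0' "$f" > "$tmp"; then
            # Copy back instead of mv so the original file keeps its permissions.
            cat "$tmp" > "$f"
            rm -f "$tmp"
        else
            rm -f "$tmp"
            return 1
        fi
    done
}
```

Calling truncate_after 20031231 *.csv would rewrite every CSV file in the current directory in place, so it is prudent to try it on copies first.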
In my project, I have two files.
The content of userid is:
6534
4524
4522
6635
The content of userpwinfo.txt is:
nsgg315_RJ:x:4520:100::/home-gg/users/nsgg315_RJ:/bin/bash
nsgg316_ZJY:x:4521:100::/home-gg/users/nsgg316_ZJY:/bin/bash
nsgg317_CPA:x:4522:100::/home-gg/users/nsgg317_CPA:/bin/bash
nsgg318_ZRL:x:4523:100::/home-gg/users/nsgg318_ZRL:/bin/bash
nsgg319_YYM:x:4524:100::/home-gg/users/nsgg319_YYM:/bin/bash
Now I want to print the usernames whose id is listed in userid. I wrote a bash script like this:
for i in $(cat userid)
do
#username=`awk -F: '{if($3=="$i") print $1}' /root/userpwinfo.txt`
#username=`awk -F: '$3=="$i" {print $1}' /root/userpwinfo.txt`
#username=`awk -F: '{if($3~/$i/) print $1}' /root/userpwinfo.txt`
username=`awk -F: '{if($3==$i) print $1}' /root/userpwinfo.txt`
echo $username
done
Unfortunately, it prints nothing. The correct result should be:
nsgg319_YYM
nsgg317_CPA
I have tried this on the command line:
awk -F: '{if($3==4524) print $1}' /root/userpwinfo.txt
and it works.
Maybe if($3==$i) is wrong inside a shell script. Can anyone help me?
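Your $i is a shell variable, but it is inside single quotes ('), so the shell does not expand it and awk interprets it instead. To awk, i is an uninitialized variable, so $i means $0 (the whole line), and $3 never equals that.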
Try this:
username=`awk -F: '{if($3=='$i') print $1}' /root/userpwinfo.txt`
Note that $i now sits between two ' marks, which means it is outside the part passed literally to awk, so the shell expands it.
Also note that if $i is ever empty, the awk program becomes if($3==), which is invalid and yields a syntax error.
I'd also like to point out that an awk rule consists of a pattern (the filter) and an action block. You shouldn't need to write an if inside the block unless you want something unusual. Your command would be more appropriately written as:
username=`awk -F: '($3=='$i'){print $1}' /root/userpwinfo.txt`
Note that even this is not a very good solution (passing the value with awk -v is more robust), but you already have plenty to think about with just these changes. When you're more familiar with awk, come back and check the comments. ;)
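A more robust way to hand a shell value to awk is the -v option, which avoids splicing shell text into the awk program. This is a sketch rather than a drop-in replacement; the function name name_for_id is hypothetical, and the file layout is assumed to be the passwd-style one from the question.

```shell
#!/bin/sh
# Hypothetical helper: print the user name whose numeric id (field 3) equals $1.
# $2 is a passwd-style file such as the userpwinfo.txt shown in the question.
name_for_id() {
    # -v copies the shell value into the awk variable "id" before the
    # program runs, so an empty value cannot break the awk syntax.
    awk -F: -v id="$1" '$3 == id { print $1 }' "$2"
}
```

Inside the loop from the question, this would be called as name_for_id "$i" /root/userpwinfo.txt.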
If the usernames are all you need from the two files, you could try:
$ cat userpwinfo.txt
nsgg315_RJ:x:4520:100::/home-gg/users/nsgg315_RJ:/bin/bash
nsgg316_ZJY:x:4521:100::/home-gg/users/nsgg316_ZJY:/bin/bash
nsgg317_CPA:x:4522:100::/home-gg/users/nsgg317_CPA:/bin/bash
nsgg318_ZRL:x:4523:100::/home-gg/users/nsgg318_ZRL:/bin/bash
nsgg319_YYM:x:4524:100::/home-gg/users/nsgg319_YYM:/bin/bash
$ cat userid.txt
6534
4524
4522
6635
$ awk -F":" ' { if( NR==FNR ) { a[$3]=$1; next } ; if(a[$1]) print a[$1] }' userpwinfo.txt userid.txt
nsgg319_YYM
nsgg317_CPA
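The NR==FNR test is what lets a single awk process read both files. Here is a commented sketch of the same idea; the wrapper name names_for_ids is invented for illustration, and it uses the "in" operator instead of testing the array value directly.

```shell
#!/bin/sh
# Hypothetical wrapper; argument order: passwd-style file first, id list second.
names_for_ids() {
    awk -F: '
        # First file: NR (global line count) equals FNR (per-file line count),
        # so remember the user name under its numeric id.
        NR == FNR { name[$3] = $1; next }
        # Second file: each line is an id; print the name if one was stored.
        $1 in name { print name[$1] }
    ' "$1" "$2"
}
```

With the files above, names_for_ids userpwinfo.txt userid.txt would be the equivalent call.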
The awk command below (copied and pasted from Stack Overflow) works fine from the command line but doesn't print anything when aliased:
awk '/WORD/ {print $3}' log.log | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'
alias getperc="awk '/WORD/ {print \$3}' log.log | awk 'BEGIN{c=0} length(\$0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'"
I am fairly new to using bash. What am I missing here?
Don't use aliases. They require an additional layer of quoting, which is troublesome (as here), and they prevent you from being able to usefully parameterize or add conditional logic to your code.
A simple transliteration to a function is:
getperc() { awk '/WORD/ {print $3}' log.log | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'; }
A slightly more capable one, which will still use log.log by default, but which will also let you provide an alternate input file name (as in getperc alternate.log) or pipe to your function (as in cat alternate.log | getperc):
getperc() {
[[ -t 0 || $1 ]] || set -- - # use "-" (stdin) as input file if not a TTY
# ...this will let you pipe to your function.
awk '/WORD/ {print $3}' "${1:-log.log}" | awk 'BEGIN{c=0} length($0){a[c]=$0;c++}END{p5=(c/100*5); p5=p5%1?int(p5)+1:p5; print a[c-p5-1]}'
}
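The extra quoting layer is easy to get wrong. In the alias as posted, the $0 in a[c]=$0 is not escaped, so the shell expands it when the double-quoted alias string is built, which is a likely reason the alias prints nothing. A small sketch of the effect, using two made-up variable names:

```shell
#!/bin/sh
# Build the same kind of double-quoted string twice: once with an unescaped $0
# (as in the a[c]=$0 part of the alias) and once with every $0 escaped.
unescaped="length(\$0){a[c]=$0;c++}"
escaped="length(\$0){a[c]=\$0;c++}"
printf '%s\n' "$unescaped"   # the shell has already replaced the second $0
printf '%s\n' "$escaped"     # both $0 reach awk untouched
```

The first line shows the name of the running shell or script where awk expected $0, while the second keeps the awk program intact.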
I think bash is getting confused about $3 and $0: it treats them as its own positional parameters instead of passing them to awk. You can verify this
by trying the following in bash:
alias ech="echo {print \$3}"
Running ech will then print just
{print }
but now try
alias ech="echo {print \$\3}"
Now ech will print what you expected:
{print $3}
Let me know if this solves your problem
I am trying to pass a for loop index i into awk but keep getting unexpected token awk errors.
First I tried using the -v option within awk:
for i in "${myarray}"
awk -v var=$i '/var/{print}' myfile.dat
done
I also tried calling the variable directly using single quotes:
for i in "${myarray}"
awk '/'"$i"'/{print}' myfile.dat
done
My end goal is to learn how to pass a for loop index variable through awk as the search pattern. I'd like the above code to search through myfile.dat and print lines which contain the strings in myarray.
There are 3 problems:
The for loop is missing do, which is what causes the unexpected token error.
The array should be traversed like this: for i in "${myarray[@]}"
awk treats the text between /.../ as a regex literal; to match against a variable, use $0 ~ var.
Your code should be:
for i in "${myarray[@]}"; do
awk -v var="$i" '$0 ~ var' myfile.dat
done
{print} is the default action in awk, so you can omit it as shown above.
You can do the same without a loop as well, e.g.:
echo "${myarray[@]}" | tr ' ' '|' | awk 'NR==FNR{pat=$0; next} $0 ~ pat' - file
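The tr approach breaks when an array element itself contains a space. A possible variant joins the elements with | through IFS instead; this is only a sketch, and the function name match_any is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print lines of FILE that match any of the given patterns.
# Usage: match_any FILE PATTERN...
# The body runs in a subshell ( ... ) so the IFS change does not leak out.
match_any() (
    file=$1
    shift
    IFS='|'                  # "$*" joins the arguments with the first char of IFS
    awk -v pat="$*" '$0 ~ pat' "$file"
)
```

In bash it could be called as match_any myfile.dat "${myarray[@]}". Two caveats: awk -v interprets backslash escapes in the value, and an empty pattern list matches every line.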
I'm writing a script that involves generating Awk programs and running them via awk $(...), as in
[lynko@hephaestus] ~ % awk $(echo 'BEGIN { print "hello!" }')
The generated program is going to be more complicated in the end, but first I want to make sure this is possible. In the past I've done
[lynko@hephaestus] ~ % program=$(echo 'BEGIN { print "hello!" }')
[lynko@hephaestus] ~ % awk "$program"
hello!
where the grouping is unsurprising. But the first example (under GNU awk, which gives a more helpful error message than mawk, the default on my other machine) gives
[lynko@hephaestus] ~ % awk $(echo 'BEGIN { print "hello!" }')
awk: cmd. line:1: BEGIN blocks must have an action part
presumably because this is executed as awk BEGIN { print "hello!" } rather than awk 'BEGIN { print "hello!" }'. Is there a way I can force $(...) to remain as one group? I'd rather not use "$()" since I'd have to escape all the double-quotes in the program generator.
I'm running Bash 4.2.37 and mawk 1.3.3 on Crunchbang Waldorf.
Put quotes around it. You don't need to escape the double quotes inside it:
awk "$(echo 'BEGIN { print "hello!" }')"
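The difference comes from word splitting: an unquoted $(...) is split on whitespace into several arguments, while "$(...)" stays a single argument, and the quotes inside the parentheses start their own quoting context. A quick way to see this, using a made-up helper called count_args:

```shell
#!/bin/sh
# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: split into BEGIN, {, print, "hello!" and }
count_args $(echo 'BEGIN { print "hello!" }')     # prints 5
# Quoted: the whole program arrives as one argument
count_args "$(echo 'BEGIN { print "hello!" }')"   # prints 1
```

In the unquoted case awk receives only BEGIN as its program text, which matches the error message quoted in the question.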
I'm also wondering why you are using an echo statement. Awk doesn't need one.
awk 'BEGIN { print "Awk SQUAWK!" }'
That will work perfectly.