Separate and add numbers from an external file with .sh - bash
Question #1
How can I read a column from a file and add up its entries using a .sh script?
Example file:
10000:max:100:1,2:3,4
10001:jill:50:7,8:3,2
10002:fred:300:5,6:7,8
How can I use IFS=':' in a .sh script to read that file line by line and add up the third part, so that it outputs the sum, e.g. 450?
$ ./myProgram myFile.txt
450
A simple awk one-liner command would do this job.
$ awk -F: '{sum+=$3}END{print sum}' file
450
For each line, awk adds the column 3 value to the variable sum. Printing sum in the END block gives you the total. -F: sets the field separator to a colon.
It's simple. Try using awk like:
awk -F':' '{sum+=$3} END {print sum}' myfile.txt
Here -F sets the delimiter: it tells awk that fields in myfile.txt are delimited by a colon ":". For each line we add the value of $3 to sum, and once all lines have been read, the END block prints the value of sum.
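Since the question specifically asks about IFS=':', here is a minimal pure-bash sketch as well; the field names (id, name, amount) are just placeholders for the columns in the example file:

#!/bin/bash
# Sum the third colon-separated field of the file given as $1.
# Usage: ./myProgram myFile.txt
sum=0
while IFS=':' read -r id name amount rest; do
    sum=$((sum + amount))
done < "$1"
echo "$sum"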
Related
AWK remove blank lines and append empty columns to all csv files in the directory
Hi, I am looking for a way to combine all of the below commands together:
1. Remove blank lines in the csv file (comma delimited)
2. Add multiple empty columns to each line, up to the 100th column
3. Perform actions 1 & 2 on all the files in the folder
I am still learning and this is the best I could get:
awk '!/^[[:space:]]*$/' x.csv > tmp && mv tmp x.csv
awk -F"," '($100="")1' OFS="," x.csv > tmp && mv tmp x.csv
They work individually, but I don't know how to put them together, and I am looking for a way to have them run through all the files under the directory. Looking for concrete AWK code or a shell script calling AWK. Thank you!
An example input would be:
a,b,c
x,y,z
Expected output would be:
a,b,c,,,,,,,,,,
x,y,z,,,,,,,,,,
You can combine them in one script, without any loops:
$ awk 'BEGIN{FS=OFS=","} FNR==1{close(f); f=FILENAME".updated"} NF{$100=""; print > f}' files...
It won't overwrite the original files; each input file gets its own .updated copy.
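A hypothetical invocation (the file names here are made up for illustration):

$ awk 'BEGIN{FS=OFS=","} FNR==1{close(f); f=FILENAME".updated"} NF{$100=""; print > f}' a.csv b.csv
$ ls
a.csv  a.csv.updated  b.csv  b.csv.updated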
You can pipe the output of the first command into the other:
awk '!/^[[:space:]]*$/' x.csv | awk -F"," '($100="")1' OFS="," > new_x.csv
If you wanted to run the above on all the files in your directory, you would do:
shopt -s nullglob
for f in yourdirectory/*.csv; do
    awk '!/^[[:space:]]*$/' "${f}" | awk -F"," '($100="")1' OFS="," > "${f%/*}/new_${f##*/}"
done
This writes the result next to each original as new_<name> (redirecting to new_"${f}" would fail, since that path would point into a non-existent new_yourdirectory). The shopt -s nullglob is so that an empty directory won't give you a literal *; that tip is quoted from a good source on looping through files.
With a recent enough GNU awk you could:
$ gawk -i inplace 'BEGIN{FS=OFS=","}/\S/{NF=100;$1=$1;print}' *
Explained:
$ gawk -i inplace '    # using GNU awk and in-place file editing
BEGIN {
    FS=OFS=","         # set the input and output delimiters to a comma
}
/\S/ {                 # gawk-specific regex operator that matches any character that is not a space
    NF=100             # set the field count to 100, which truncates fields above it
    $1=$1              # touch the first field to rebuild the record and actually get the extra commas
    print              # output the record
}' *
Some test data (the first empty record is truly empty; the second contains a space and a tab, which you cannot see here):
$ cat file
1,2,3

1,2,3,4,5,6,

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101
Output of cat file after the execution of the GNU awk program:
1,2,3,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
Shell script copying all columns of text file instead of specified ones
I am trying to copy 3 columns from one text file and paste them into a new text file. However, whenever I execute this script, all of the columns in the original text file get copied. Here is the code I used:
cut -f 1,2,6 PROFILES.1.0.profile > compiledfile.txt
paste compiledfile.txt > myNewFile
Any suggestions as to what I'm doing wrong? Also, is there a simpler way to do this? Thanks!
Let's suppose that the input is comma-separated:
$ cat File
1,2,3,4,5,6,7
a,b,c,d,e,f,g
We can extract columns 1, 2, and 6 using cut:
$ cut -d, -f 1,2,6 File
1,2,6
a,b,f
Note the use of the option -d, to specify that the column separator is a comma. By default, cut uses a tab as the column separator, which is likely why your command printed every column: cut passes lines that contain no delimiter through unchanged. If the separator in your file is anything other than a tab, you must use the -d option.
Using awk:
awk -vFS=your_delimiter_here -vOFS=your_delimiter_here '{print $1,$2,$6}' PROFILES.1.0.profile > compiledfile.txt
should do it (note the braces around the print statement; without them awk reports a syntax error). For comma-separated fields the solution would be:
awk -vFS=, -vOFS=, '{print $1,$2,$6}' PROFILES.1.0.profile > compiledfile.txt
FS is an awk built-in variable which stands for field separator; similarly, OFS stands for output field separator. The handy -v option lets you assign a value to an awk variable.
You could use awk to do this:
awk -F "delimiter" '{
    print $1,$2,$3    # where $1, $2, and so on are column numbers
}' filename > newfile
How do I pass a stored value as the column number parameter to edit in awk?
I have a .dat file with | as the separator, and I want to change the value of the column that is defined by a number passed as an argument and stored in a var. My code is:
awk -v var="$value" -F'|' '{ FS = OFS = "|" } $1=="$id" {$"\{$var}"=8}1' myfile.dat > tmp && mv tmp myfiletemp.dat
This changes the whole line to 8, so it obviously doesn't work. I was wondering what the right way is to write the part {$"\{$var}"=8}1. For example, if I want to change the fourth column to 8 and I have value=4, how do I get {$4=8}?
The other answer is mostly correct, but I just wanted to add a couple of notes, in case it wasn't totally clear.
Referring to a variable with a $ in front of it turns it into a reference to the column, so i=3; print $i; print i will print the third column and then the number 3.
Putting all your variables on the command line avoids any problems with trying to include bash variables inside your single-quoted awk code, which won't work.
You can let awk do the output to the specific file instead of relying on bash to redirect output and move files.
The -F option on the command line specifies FS for you, so there is no need to redeclare it in your code.
Here's how I would do this:
#!/bin/bash
column=4
value=8
id=1
awk -v col="$column" -v val="$value" -v id="$id" -F"|" '
BEGIN {OFS="|"}
{if ($1 == id) $col = val; print > "myfiletemp.dat"}
' myfile.dat
You can refer to the awk variable directly by its name. A slight rewrite of your script with a correct reference to the column number var:
awk -F'|' -v var="$value" 'BEGIN{OFS=FS} $1=="$id"{$var=8}1'
This should work as long as $value is a number. If id is another bash variable, pass it the same way, as an awk variable:
awk -F'|' -v var="$value" -v id="$id" 'BEGIN{OFS=FS} $1==id{$var=8}1'
Not only can you use a number in a variable by putting a $ in front of it, you can also put a $ in front of an expression!
$ date | tee /dev/stderr | awk '{print $(2+2)}'
Mon Aug 3 12:47:39 CDT 2020
12:47:39
egrep -v match lines containing some same text on each line
So I have two files.
Example of file 1 content:
/n01/mysqldata1/mysql-bin.000001
/n01/mysqldata1/mysql-bin.000002
/n01/mysqldata1/mysql-bin.000003
/n01/mysqldata1/mysql-bin.000004
/n01/mysqldata1/mysql-bin.000005
/n01/mysqldata1/mysql-bin.000006
Example of file 2 content:
/n01/mysqlarch1/mysql-bin.000004
/n01/mysqlarch1/mysql-bin.000001
/n01/mysqlarch2/mysql-bin.000005
I want to match based only on mysql-bin.00000X, and not the rest of the file path, since the paths differ between file1 and file2. Here's the command I'm trying to run:
cat file1 | egrep -v file2
The output I'm hoping for here would be:
/n01/mysqldata1/mysql-bin.000002
/n01/mysqldata1/mysql-bin.000003
/n01/mysqldata1/mysql-bin.000006
Any help would be much appreciated.
Just compare based on everything after the last /:
$ awk -F/ 'FNR==NR {a[$NF]; next} !($NF in a)' f2 f1
/n01/mysqldata1/mysql-bin.000002
/n01/mysqldata1/mysql-bin.000003
/n01/mysqldata1/mysql-bin.000006
Explanation
This reads file2 into memory and then compares it with file1.
-F/ sets the field separator to /.
FNR==NR {a[$NF]; next} applies while reading the first file (file2): it stores every last piece into the array a[]. Since we set the field separator to /, this is the mysql-bin.00000X part.
!($NF in a) applies while reading the second file (file1): it checks whether the last field (the mysql-bin.00000X part) is in the array a[]. If it is not, the line is printed.
"I'm having one problem that I've noticed when testing. If file2 is empty, nothing is returned at all, whereas I would expect every line in file1 to be returned. Is this something you could help me with please?" – user2841861
The problem there is that FNR==NR also matches while reading the second file when the first one is empty. To prevent this, cross-check that the "read into the a[] array" action is only done on the first file (note the uppercase ARGV; awk variable names are case-sensitive):
awk -F/ 'FNR==NR && ARGV[1]==FILENAME {a[$NF]; next} !($NF in a)' f2 f1
From man awk:
ARGV: The command-line arguments available to awk programs are stored in an array called ARGV. ARGC is the number of command-line arguments present. See section Other Command Line Arguments. Unlike most awk arrays, ARGV is indexed from zero to ARGC - 1.
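Since the question mentions egrep, here is an alternative sketch using grep with bash process substitution; it assumes GNU grep, where an empty pattern file matches nothing, so an empty file2 leaves every line of file1 in the output:

# Strip the directory part of each file2 line, then use the resulting
# basenames as fixed-string patterns to filter file1.
grep -v -F -f <(sed 's|.*/||' file2) file1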
Processing CSV items one by one using awk
I am using the following script to access CSV items:
#!/bin/bash
awk -F "|" 'NR > 0 {print $1}' UserAgents.csv
When running the script I get the correct output, i.e. the entire set of values in the first 'column' of the CSV is printed to the terminal. What I would like to add is to read these items one by one, perform some operation on them, like concatenating each with a string, and then output them (to file, pipe, or terminal) one by one.
This should make it clear what your awk script is doing:
awk -F '|' '{ print NR, NF, $1, "with some trailing text" }' UserAgents.csv
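If you would rather do the per-item work in the shell than inside awk, a minimal sketch is to pipe awk's output into a while read loop; the prefix/suffix strings here are just placeholders for your real operation:

# Read the first field of each line and decorate it one item at a time.
awk -F '|' '{print $1}' UserAgents.csv |
while IFS= read -r item; do
    printf 'prefix-%s-suffix\n' "$item"
done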