Cannot print in awk command in bash script

I am trying to read values from a file and print specific items into a variable which I will use later.
cat /dir1/file1 | while read blmbline2
do
BLMBFILE2=`print $blmbline2 | awk '{$1=""; print $0}'`
echo $BLMBFILE2
done
When I run that same code at the command line, it runs as expected, but, when I run it in a bash script called testme.sh, I get this error:
./testme.sh: line 3: print: command not found
If I run print by itself at the command prompt, I don't get an error (just a blank line).
If I run "bash" and then print at the command prompt, I get command not found.
I can't figure out what I'm doing wrong. Can someone suggest?
Update: I see some other posts that say to use echo or printf. Is there a difference I need to be concerned about when using one of those in bash?

Since awk can read files, you may be able to do away with the cat | while read and just use awk. Using a sample file containing:
1 2 3 4 5 6
1 2 3 4 5 6
1 2 3 4 5 6
1 2 3 4 5 6
1 2 3 4 5 6
1 2 3 4 5 6
Declare your bash array variable and populate with the output from awk:
arr=() ; arr=($(awk '{$1=""; print $0}' /dir1/file1))
Use the following to display array size and contents:
printf "array length: %d\narray contents: %s\n" "${#arr[#]}" "${arr[*]}"
Output:
array length: 30
array contents: 2 3 4 5 6 2 3 4 5 6 2 3 4 5 6 2 3 4 5 6 2 3 4 5 6 2 3 4 5 6
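A caveat: the unquoted $( ) above word-splits the awk output, which is why the array ends up with 30 single-number elements. If you want one element per input line instead, a mapfile sketch (bash 4+) would do it:

mapfile -t arr < <(awk '{$1=""; print $0}' /dir1/file1)
printf 'array length: %d\n' "${#arr[@]}"    # 6 for the sample file: one element per line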

print is a ksh/zsh builtin, not a bash one, which is likely why it works at your interactive prompt but not inside the bash script. Change print to echo in your shell script. With printf you can format the data; with echo it will print the entire line of the file. Also, create an array so you can store multiple items:
BLMBFILE2=()
while IFS= read -r line
do
    item=$(echo "$line" | awk '{$1=""; print $0}')   # drop the first field, keep the rest
    BLMBFILE2+=("$item")
    echo "$item"
done < /dir1/file1
echo "Items found:"
for value in "${BLMBFILE2[#]}"
do
echo $value
done
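As for the echo-versus-printf update in the question: both are bash builtins. echo simply prints its arguments followed by a newline, while printf applies a format string and adds no newline unless you include \n. A quick illustration:

line="one two three"
echo "$line"             # one two three
printf '%s\n' "$line"    # same output, newline is explicit
printf '[%5s]\n' hi      # [   hi] -- width formatting echo cannot do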

Related

Bash how to add a sum of unknown numbers from a user [duplicate]

Say, for example, a user has a set of numbers. How can I make bash add them together?
For example, in one go the user enters (how many numbers they enter is up to them and unknown in advance):
bash file 3 1 5 2 2 4
How can I make bash return 17 directly from that example?
I tried
#!/usr/bin/env sh
sum=0
while read number && [ -n "$number" ]; do
sum=$((sum + ${number/#-}))
echo "$sum"
done
But this is not clean and it is returning
$ bash file
3
3
1
4
5
9
2
11
2
13
4
17
Instead, I want the user to supply all their numbers in one go, without having to keep entering more numbers one at a time. Rather than having them execute the command like this:
bash file
1
3
4
etc
I want it to work in one go:
bash file 1 3 5 6
How?
You can loop through all the script arguments and calculate the sum:
#!/usr/bin/env sh
sum=0
for i in "$@"; do
    sum=$(( sum + i ))
done
echo "$sum"
Running with your example:
$ bash sum 3 1 5 2 2 4
17
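As an aside, if the arguments are guaranteed to be integers, there is a compact variant that joins them with + inside a single arithmetic expansion (a sketch, relying on the fact that $* joins the positional parameters on the first character of IFS):

#!/usr/bin/env bash
IFS=+                 # $* now joins the arguments with '+'
echo $(( $* ))        # 3 1 5 2 2 4 -> 3+1+5+2+2+4 -> 17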

Efficient way of indexing a specific number from a text file

I have a text file containing a line of various numbers (e.g. 2 4 1 7 12 1 4 4 3 1 1 2).
I'm trying to get the index for each occurrence of 1. This is my code for what I'm currently doing (subtracting 1 from each count since my indexing starts at 0).
eq='0'
gradvec=()
count=0
length=0
for item in `cat file`
do
((count++))
if (("$item"=="$eq"))
then
((length++))
if (("$length"=='1'))
then
gradvec=$((count -1))
else
gradvec=$gradvec' '$((count - 1))
fi
fi
done
Although the code works, I was wondering if there was a shorter way of doing this? The result is the gradvec variable being
2 5 9 10
Consider this as the input file:
$ cat file
2 4 1 7 12 1
4 4 3 1 1 2
To get the indices of every occurrence of 1 in the input file:
$ awk '$1==1 {print NR-1}' RS='[[:space:]]+' file
2
5
9
10
How it works:
$1==1 {print NR-1}
If the value in any record is 1, print the record number minus 1.
RS='[[:space:]]+'
Define the record separator as one or more of any kind of space, so every whitespace-separated number becomes its own record. (A regex record separator like this is a GNU awk extension; POSIX awk only honours the first character of RS.)
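If GNU awk is not available, a plain-bash sketch of the same idea (word-split the file and track a zero-based index) could be:

idx=0
for item in $(cat file); do    # unquoted substitution splits on whitespace
    [ "$item" = 1 ] && echo "$idx"
    ((idx++))
done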

How to add multi column data using Bash

I have a data file, say input.dat, which looks like the following:
1 2 4 6
2 3 6 9
3 4 8 12
I want the data in the 2nd, 3rd and 4th columns to be added together and printed to an output.dat file like the following:
1 12
2 18
3 24
How can this be achieved in bash?
Using awk you can do this:
awk '{print $1, $2+$3+$4}' input.dat
And if you prefer bash, it can be done like this (at least if the numbers are integers), run as bash sum.sh < input.dat, where sum.sh is:
while read -r v1 v2 v3 v4
do
    echo "$v1" $(( v2 + v3 + v4 ))
done
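In either case, to get the result into output.dat as asked, just redirect standard output:

awk '{print $1, $2+$3+$4}' input.dat > output.dat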

Randomly sample lines retaining commented header lines

I'm attempting to randomly sample lines from a (large) file, while always retaining a set of "header lines". Header lines are always at the top of the file and unlike any other lines, begin with a #.
The actual file format I'm dealing with is a VCF, but I've kept the question general
Requirements:
Output all header lines (identified by a # at line start)
The command / script should (have the option to) read from STDIN
The command / script should output to STDOUT
For example, consider the following sample file (file.in):
#blah de blah
1
2
3
4
5
6
7
8
9
10
An example output (file.out) would be:
#blah de blah
10
2
5
3
4
I have a working solution (in this case selecting 5 non-header lines at random) using bash. It is capable of reading from STDIN (I can cat the contents of file.in into the rest of the command); however, it writes to a named file rather than to STDOUT:
cat file.in | tee >(awk '$1 ~ /^#/' > file.out) | awk '$1 !~ /^#/' | shuf -n 5 >> file.out
By using process substitution (thanks Tom Fenech), both commands are seen as files.
Then using cat we can concatenate these "files" together and output to STDOUT.
cat <(awk '/^#/' file) <(awk '!/^#/' file | shuf -n 10)
Input
#blah de blah
1
2
3
4
5
6
7
8
9
10
Output
#blah de blah
1
9
8
4
7
2
3
10
6
5
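One caveat with the cat <(...) <(...) answer: it reads the file twice, so it cannot take its input from STDIN as the requirements ask. A sketch that buffers STDIN into a temporary file first (the script name sample.sh below is illustrative):

tmp=$(mktemp)                      # buffer STDIN so it can be read twice
cat > "$tmp"
awk '/^#/' "$tmp"                  # header lines first
awk '!/^#/' "$tmp" | shuf -n 5     # then the random sample
rm -f "$tmp"

Run as: cat file.in | bash sample.sh > file.out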

setting awk variables through inlining

I've got this:
./awktest -v fields=`cat testfile`
which ought to set the fields variable to '1 2 3 4 5', which is all that testfile contains.
It returns:
gawk: ./awktest:9: fatal: cannot open file `2' for reading (No such file or directory)
When I do this it works fine.
./awktest -v fields='1 2 3 4 5'
printing fields at the time of error yields:
1
printing fields in the second instance yields:
1 2 3 4 5
When I try it with 12345 instead of 1 2 3 4 5 it works fine for both, so it's a problem with the whitespace. What is the problem, and how do I fix it?
This is most likely not an awk question; your shell is the culprit.
For example, if awktest is:
#!/bin/bash
i=1
for arg in "$@"; do
printf "%d\t%s\n" $i "$arg"
((i++))
done
Then you get:
$ ./awktest -v fields=`cat testfile`
1 -v
2 fields=1
3 2
4 3
5 4
6 5
You see that the file contents are not being handled as a single word.
Simple solution: use double quotes on the command line:
$ ./awktest -v fields="$(< testfile)"
1 -v
2 fields=1 2 3 4 5
The $(< file) construct is a bash shortcut for `cat file` that does not need to spawn an external process.
Or, read the first line of the file in the awk BEGIN block
awk '
BEGIN {getline fields < "testfile"}
rest of awk program ...
'
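Once fields is set by either method, you would typically split it inside awk to get at the individual numbers; a sketch:

awk -v fields="$(< testfile)" 'BEGIN {
    n = split(fields, f, " ")              # f[1]..f[n] hold the values
    for (i = 1; i <= n; i++) print i, f[i]
}'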
./awktest -v fields="`cat testfile`"
#note that:
#./awktest -v fields='`cat testfile`'
#does not work
