prevent duplicate variable and print using awk statement - shell

I am iterating through a file and printing a set of values using awk:
echo $value | awk ' {print $4}' >> 'some location'
The command works fine, but I want to prevent duplicate values from being stored in the file.
Thanks in advance.

Instead of processing the file line by line, you should use a single awk command for the entire file.
For example:
awk '!a[$4]++{print $4}' file >> 'some location'
This prints each distinct value of the fourth column only once, at its first occurrence.
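A minimal sketch of how the !a[$4]++ idiom behaves. The sample data and the variable name unique are invented for illustration; the input is assumed to be whitespace-separated with the value of interest in the fourth field.

```shell
# Hypothetical sample input: the fourth field repeats on lines 1 and 3.
input='a b c red
d e f blue
g h i red'

# a[$4]++ evaluates to 0 (false) the first time a value is seen, so
# !a[$4]++ is true only for the first occurrence of each value.
unique=$(printf '%s\n' "$input" | awk '!a[$4]++ {print $4}')

printf '%s\n' "$unique"
```

Note that the array only remembers values seen during one awk run, so appending with >> on repeated runs can still store a value that an earlier run already wrote.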

Using a single instance of awk, as suggested by user000001, is certainly the right thing to do. Since the question gives very little detail, what follows is speculation, but the simplest solution may be a trivial refactor of your loop. For example, if the current code is:
while ...; do
    ...
    echo $value | awk ...
    ...
done
You can simply change it to:
while ...; do
    ...
    echo $value >&5
    ...
done 5>&1 | awk '!a[$4]++{print $4}' >> /p/a/t/h
Note that although this is a "simple" fix in terms of code to change, it is almost certainly not the correct fix! Removing the while loop completely and just using awk is the right thing to do.

Related

How to get '2f8b547d..eb94967a' string from the log 'Updating 2f8b547d..eb94967a Fast-forward....' in shell?

I am building a shell script.
The script gets git log such as:
"Updating 2f8b547d..eb94967a Fast-forward...."
but I want to extract only the 2f8b547d..eb94967a part.
I am new to the shell, so thanks for your help.
Update:
I also want to use the snippet as a parameter, because I will execute:
git log 2f8b547d..eb94967a
You can pipe it to awk like so:
echo "Updating 2f8b547d..eb94967a Fast-forward...." | awk '{print $2}'
Your result will be 2f8b547d..eb94967a.
If it is a script, say, abc.sh that had such output, then you can run:
$> ./abc.sh | awk '{print $2}'
awk takes the output and splits each line into fields on whitespace. Updating is $1, 2f8b547d..eb94967a is $2, and so on. In the above example, we ask awk to print the 2nd field.
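To use the extracted value as a parameter, as the update to the question asks, the field can be captured with command substitution. This is only a sketch: the variable names line and range are assumptions, and the final command is printed rather than executed so the example runs outside a repository.

```shell
# Log line copied from the question.
line='Updating 2f8b547d..eb94967a Fast-forward....'

# Capture the second whitespace-separated field in a shell variable.
range=$(printf '%s\n' "$line" | awk '{print $2}')

# Inside a repository the value could be passed on as: git log "$range"
printf 'git log %s\n' "$range"
```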
As an alternative to awk (don't get me wrong, awk is super for this job as well), you can simply use cut with a space delimiter to extract the second field, e.g.
cut -d' ' -f2 yourgit.log
You can also pipe output to cut, or redirect the input file to it using <. It essentially does the same as the awk command; it is just a different choice of utility.
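One difference is worth knowing: cut treats every single space as a delimiter, while awk collapses runs of whitespace. The sketch below uses the log line from the question plus a hypothetical variant with two spaces after the first word; all variable names are assumptions.

```shell
line='Updating 2f8b547d..eb94967a Fast-forward....'

# Single spaces: cut and awk agree on the second field.
from_cut=$(printf '%s\n' "$line" | cut -d' ' -f2)

# Hypothetical variant with two spaces after "Updating".
spaced='Updating  2f8b547d..eb94967a Fast-forward....'

# cut sees an empty field between the two spaces; awk skips the extra space.
cut_spaced=$(printf '%s\n' "$spaced" | cut -d' ' -f2)
awk_spaced=$(printf '%s\n' "$spaced" | awk '{print $2}')

printf 'cut: [%s] awk: [%s]\n' "$cut_spaced" "$awk_spaced"
```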
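Here is another alternative, using a bash here-string:
read u hash rest <<< "Updating 2f8b547d..eb94967a Fast-forward...."
After this, the string you are looking for is stored in the variable hash:
echo $hash
In bash, piping echo into read would not work here: read would run in a subshell, and hash would be empty afterwards.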

Swap in shell from file

Through cut -d":" -f1,3 I made a new file which looks like this:
username1:number1
username2:number2
username3:number3
But I want my file to look like this:
number1:username1
number2:username2
number3:username3
I tried cut -d":" -f3,1, but it still gives me username1:number1, even though I want the 3rd column printed first and the 1st column printed last. Any help with that?
cut -f3,1 will print the same as cut -f1,3, because cut always outputs the selected fields in their input order. Use awk:
awk -F: '{print $3 FS $1}' file
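A small sketch of the difference, using made-up three-field records (the names, the x placeholder and the numbers are invented for illustration):

```shell
# Hypothetical colon-separated input with at least three fields per line.
input='alice:x:1000
bob:x:1001'

# cut ignores the order given to -f and keeps the input order.
with_cut=$(printf '%s\n' "$input" | cut -d: -f3,1)

# awk prints the fields in whatever order the program asks for;
# FS is the field separator set by -F, so the colon is reused on output.
with_awk=$(printf '%s\n' "$input" | awk -F: '{print $3 FS $1}')

printf '%s\n' "$with_cut" "$with_awk"
```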
I like awk for this sort of thing. You've already got an awk answer though.
To do this in pure bash, you'd do the following:
while IFS=: read -r one two; do printf '%s:%s\n' "$two" "$one"; done < input.txt
The IFS variable is the field separator that read uses to split the input into separate variables, and I'm using printf to give us predictably formatted output.
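The same loop, fed from a string instead of input.txt so it can be tried directly. The sample records follow the username:number layout from the question; the variable name swapped is an assumption.

```shell
# Sample data in the two-column layout produced by the earlier cut command.
input='username1:number1
username2:number2'

# IFS=: is set only for the read command, so the rest of the script
# keeps the default field splitting.
swapped=$(printf '%s\n' "$input" |
  while IFS=: read -r one two; do
    printf '%s:%s\n' "$two" "$one"
  done)

printf '%s\n' "$swapped"
```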

grep launches background processes

I have an input file that contains several paths, including one referring to an initial solution. The corresponding line is the following:
initial_solution_file = ../../INIT/foo
What I would like is an alias that displays this path, so that when I type "init" the shell returns "the initial solution is: ../../INIT/foo".
What I have tried is:
grep initial_solution_file input_file | awk '{print $3}' | echo "the initial solution is:" `xargs echo`
It provides the desired output, but I additionally get something like:
[6] 48201 48202
What is this, and how can I prevent it from happening?
Thanks in advance
echo "the initial solution is: $(awk '/initial_solution_file/{print $3}' input_file)"
the initial solution is: ../../INIT/foo
There is no need for pipes; you can use command substitution with the $(...) construct. Also, the work of grep and awk can be done by awk alone.
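Wrapped up so that typing init prints the message, this could look roughly like the sketch below. It uses a shell function rather than an alias; the optional file argument and the temporary file are assumptions made so the example is self-contained.

```shell
# Throwaway input file containing the line from the question.
input_file=$(mktemp)
printf 'initial_solution_file = ../../INIT/foo\n' > "$input_file"

# "init" is the name requested in the question. The file argument is an
# assumption of this sketch and defaults to a file named input_file.
init() {
  printf 'the initial solution is: %s\n' \
    "$(awk '/initial_solution_file/ {print $3}' "${1:-input_file}")"
}

message=$(init "$input_file")
printf '%s\n' "$message"
rm -f "$input_file"
```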

Shell Script to generate specific columns as separate files

I want to print my first column and 2nd column from radius.dat and save it to rad.2.out, first column with 3rd column as rad.3.out, and so on.
However, this script doesn't seem to be working.
#!/bin/bash
for i in {2..30}
do
awk '{print $1, $i}' radius.dat > 'rad.'$i'.out'
done
Using awk you can do:
awk '{for(i=2;i<=NF;i++) print $1, $i > ("rad."i".out")}' radius.dat
The only caveat is that it keeps one output file open per column. This is unlikely to be a problem unless you are on an ancient awk.
The loop runs i over the columns starting from the second and, for each one, prints the first column together with column i to an output file named after i, following the naming convention you want.
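A self-contained sketch of that one-liner on a tiny, made-up radius.dat with three columns; the temporary directory and the variable names are assumptions.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf '1 10 100\n2 20 200\n' > "$workdir/radius.dat"

# For each line, write column 1 paired with column i to rad.i.out.
( cd "$workdir" &&
  awk '{for (i = 2; i <= NF; i++) print $1, $i > ("rad." i ".out")}' radius.dat )

rad2=$(cat "$workdir/rad.2.out")
rad3=$(cat "$workdir/rad.3.out")
printf '%s\n' "$rad2" "$rad3"
rm -rf "$workdir"
```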
Update (based on your comment to your question):
If you get a "too many open files" error, you can do:
awk '{
    for (i=2; i<=NF; i++) {
        print $1, $i >> ("rad."i".out");
        close("rad."i".out")
    }
}' file
Notice that in the second option we use >> instead of >. Because we close the file after each iteration, > would truncate it every time it is reopened, so we append instead.
Your quoting is off: inside single quotes the shell does not expand $i, so awk never gets the column number. Try this:
#!/bin/bash
for i in {2..30}; do
awk "{print \$1, \$$i;}" radius.dat > "rad.$i.out"
done
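Another way to get the shell value into awk, not used in the answers here, is the -v option, which keeps the awk program in single quotes. In this sketch the awk variable name col, the two-column range and the sample data are all assumptions.

```shell
workdir=$(mktemp -d)
printf '1 10 100\n2 20 200\n' > "$workdir/radius.dat"

for i in 2 3; do
  # -v copies the shell value of i into the awk variable col,
  # so $col inside the program is field number i.
  awk -v col="$i" '{print $1, $col}' "$workdir/radius.dat" > "$workdir/rad.$i.out"
done

rad3_v=$(cat "$workdir/rad.3.out")
printf '%s\n' "$rad3_v"
rm -rf "$workdir"
```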

Substring extraction using bash shell scripting and awk

So, I have a file called 'dummy' which contains the string:
"There is 100% packet loss at node 1".
I also have a small script that I want to use to grab the percentage from this file. The script is below.
result=`grep 'packet loss' dummy` |
awk '{ first=match($0,"[0-9]+%")
last=match($0," packet loss")
s=substr($0,first,last-first)
print s}'
echo $result
I want the value of $result to basically be 100% in this case. But for some reason, it just prints out a blank string. Can anyone help me?
You would need to put the closing backtick after the end of the awk command, but it's preferable to use $() instead:
result=$( grep 'packet loss' dummy |
awk '{ first=match($0,"[0-9]+%")
last=match($0," packet loss")
s=substr($0,first,last-first)
print s}' )
echo $result
but you could just do:
result=$( grep 'packet loss' dummy | grep -o "[0-9]\+%" )
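A runnable sketch of the two-grep version, assuming a grep that supports the -o option. It uses -E with + instead of \+ for the repetition, and a temporary file stands in for dummy.

```shell
# Temporary stand-in for the file called dummy in the question.
dummy=$(mktemp)
printf 'There is 100%% packet loss at node 1\n' > "$dummy"

# First grep selects the line, second prints only the matching part.
result=$(grep 'packet loss' "$dummy" | grep -oE '[0-9]+%')

printf '%s\n' "$result"
rm -f "$dummy"
```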
Try
awk '{print $3}'
instead.
The solution below can be used when you don't know where the percentage appears in the line (and there is no need to combine awk with grep):
$ results=$(awk '/packet loss/{for(i=1;i<=NF;i++)if($i~/[0-9]+%$/)print $i}' file)
$ echo $results
100%
You could do this without awk or grep, using expr (an external utility, so not strictly bash alone).
i=`expr "There is 98.76% packet loss at node 1" : '[^0-9.]*\([0-9.]*%\)[^0-9.]*'`; echo $i;
This extracts the substring matching the regex within \( \).
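For comparison, a sketch that uses only the shell's own parameter expansion, with no external command. It assumes the percentage is the first thing in the line followed by a % sign and is preceded by a space; the variable names are invented.

```shell
line='There is 98.76% packet loss at node 1'

# Drop everything from the first % sign onwards: "There is 98.76".
before=${line%%\%*}

# Drop everything up to and including the last space: "98.76".
number=${before##* }

pct="${number}%"
printf '%s\n' "$pct"
```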
Here I'm assuming that the output lines you're interested in adhere strictly to your example, with the percentage value being the only variation.
With that assumption, you really don't need anything more complicated than:
awk '/packet loss/ { print $3 }' dummy
This quite literally means "print the 3rd field of any lines containing 'packet loss' in them". By default awk treats whitespace as field delimiters, which is perfect for you.
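A quick way to try this, with a temporary file standing in for dummy and two extra made-up lines to show that non-matching lines are skipped:

```shell
dummy=$(mktemp)
printf '%s\n' \
  'There is 100% packet loss at node 1' \
  'Link is up at node 2' \
  'There is 25% packet loss at node 3' > "$dummy"

# Only lines containing "packet loss" reach the print action.
losses=$(awk '/packet loss/ { print $3 }' "$dummy")

printf '%s\n' "$losses"
rm -f "$dummy"
```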
If you are doing more than simply printing the percentage, you could save the results to a shell variable using backticks, or redirect the output to a file. But your sample code simply echoes the percentages to stdout, and then exits. The one-liner does the exact same thing. No need for backticks or $() or any other shell machinations whatsoever.
NB: In my experience, piping the output of grep to awk is usually doing something that awk can do all by itself.
