Bash script - For loop multiple arguments and awk

I'm trying to write a BASH script for patching my CentOS servers. My goal is that when I run the script, it outputs the package name, the currently installed version, and new package version. Something like this:
nspr.x86_64 / 4.8.8-3.el6 / 4.8.9-3.el6_2
This way, if I ever need to downgrade because a package broke something, I have a record of this.
The yum check-update command gives me the 1st piece of information (the package name) and the 3rd (the new version), and I can use awk to separate the two. It's easy after that: just run rpm -q {package} to get the 2nd piece of information, then concatenate them.
However, I'm running into trouble with the for loop in my code. I need to pass multiple arguments into the loop (the package name and newer version) so I can echo them later.
Here's my code:
for package in `/usr/bin/yum --cacheonly --noplugins check-update | awk '{print $1, $2}'`;
do
OLD_VER=`rpm -q ${package}` # ${package} should actually be $1 from the awk statement above
NEW_VER=${2} # This is $2 from the awk statement above
echo "${package} / ${OLD_VER} / ${NEW_VER}"
done
Pardon the obvious errors in the code; I'm new to BASH scripting. My confusion mostly stems from awk; I'm not very familiar with it, nor with how it interacts with a BASH script. However, I think it's clear what I'm trying to do. Does package get passed as an array? Is there a better way to do this?

You want to read the whole line:
/usr/bin/yum --cacheonly --noplugins check-update |
while read -r line; do
set -- $line
echo first: $1 second: $2
done
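Applied to the original task, that might look like this (a sketch; it assumes each output line is "name version repo", yum's usual column layout):
/usr/bin/yum --cacheonly --noplugins check-update |
while read -r line; do
set -- $line # split the line into $1 (package), $2 (new version), ...
[ -n "${1:-}" ] || continue # skip blank lines in the output
old_ver=$(rpm -q "$1")
echo "$1 / $old_ver / $2"
done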

Try this:
/usr/bin/yum --cacheonly --noplugins check-update \
| awk '{package=$1; newVer=$2; cmd = "rpm -q " package; cmd | getline oldVer; close(cmd); print package " / " oldVer " / " newVer}'
The issue with your script is that although both bash and awk use $ to reference variables, they are different variables: you can't reference awk's $2 from bash or vice versa. Writing everything in either awk or bash solves the issue.
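One caveat worth adding (mine, not from the original answer): check-update also prints header and blank lines, so you may want to keep only rows that look like package entries, e.g. three fields with a dotted name.arch in the first; the "Obsoleting Packages" section may still need special-casing. A hedged variant:
/usr/bin/yum --cacheonly --noplugins check-update \
| awk 'NF==3 && $1 ~ /\./ {cmd = "rpm -q " $1; cmd | getline oldVer; close(cmd); print $1 " / " oldVer " / " $2}'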

Related

How to make awk command run faster on large data files

I used the awk command below to create a new UUID column in a table in my existing .dat files.
$ awk '("uuidgen" | getline uuid) > 0 {print uuid "|" $0} {close("uuidgen")}' $filename > ${filename}.pk
The problem is that my .dat files are pretty big (like 50-60 GB) and this awk command takes hours even on small data files (like 15MB).
Is there any way to increase the speed of this awk command?
I wonder if you might save time by not having awk open and close uuidgen every line.
$ function regen() { while true; do uuidgen; done; }
$ coproc regen
$ awk -v f="$filename" '!(getline line < f){exit} {print $0,line}' OFS="|" < /dev/fd/${COPROC[0]} > "$filename".pk
This has awk reading your "real" filename from a variable, and the uuid from stdin, because the call to uuidgen is handled by a bash "coprocess". The funky bit around the getline is to tell awk to quit once it runs out of input from $filename. Also, note that awk is taking input from input redirection instead of reading the file directly. This is important; the file descriptor at /dev/fd/## is a bash thing, and awk can't open it.
This should theoretically save you time doing unnecessary system calls to open, run and close the uuidgen binary. On the other hand, the coprocess is doing almost the same thing anyway by running uuidgen in a loop. Perhaps you'll see some improvement in an SMP environment. I don't have a 50GB text file handy for benchmarking. I'd love to hear your results.
Note that coproc is a feature that was introduced with bash version 4. And use of /dev/fd/* requires that bash is compiled with file descriptor support. In my system, it also means I have to make sure fdescfs(5) is mounted.
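If the script might also run under older shells, a quick guard (a trivial sketch) makes it fail fast instead of mysteriously:
if (( BASH_VERSINFO[0] < 4 )); then
echo "this script needs bash >= 4 for coproc" >&2
exit 1
fi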
I just noticed the following on my system (FreeBSD 11):
$ /bin/uuidgen -
usage: uuidgen [-1] [-n count] [-o filename]
If your uuidgen also has a -n option, then adding it to your regen() function with ANY value might be a useful optimization, to reduce the number of times the command needs to be reopened. For example:
$ function regen() { while true; do uuidgen -n 100; done; }
This would result in uuidgen being called only once every 100 lines of input, rather than for every line.
And if you're running Linux, depending on how you're set up, you may have an alternate source for UUIDs. Note:
$ awk -v f=/proc/sys/kernel/random/uuid '{getline u<f; close(f); print u,$0}' OFS="|" "$filename" > "$filename".pk
This doesn't require the bash coproc, it just has awk read a random uuid directly from a Linux kernel function that provides them. You're still closing the file handle for every line of input, but at least you don't have to exec the uuidgen binary.
YMMV. I don't know what OS you're running, so I don't know what's likely to work for you.
Your script is calling shell to call awk to call shell to call uuidgen. Awk is a tool for manipulating text; it's not a shell (an environment to call other tools from), so don't do that. Just call uuidgen from the shell:
$ cat file
foo .*
bar stuff
here
$ xargs -d $'\n' -n 1 printf '%s|%s\n' "$(uuidgen)" < file
5662f3bd-7818-4da8-9e3a-f5636b174e94|foo .*
5662f3bd-7818-4da8-9e3a-f5636b174e94|bar stuff
5662f3bd-7818-4da8-9e3a-f5636b174e94|here
I'm just guessing that the real problem here is that you're running a sub-process for each line. You could read your file explicitly line by line and read output from a batch-uuidgen line by line, and thus only have a single subprocess to handle at once. Unfortunately, uuidgen doesn't work that way.
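That said, on platforms whose uuidgen does take -n (see the FreeBSD note above), you can get close to a single-subprocess pipeline; this is an untested sketch:
lines=$(wc -l < "$filename")
paste -d'|' <(uuidgen -n "$lines") "$filename" > "${filename}.pk"
One uuidgen call emits exactly as many UUIDs as the file has lines, and paste glues them onto the data with the '|' delimiter.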
Maybe another solution?
perl -MData::UUID -ple 'BEGIN{ $ug = Data::UUID->new } $_ = lc($ug->to_string($ug->create)) . " | " . $_' $filename > ${filename}.pk
Might this be faster?

Assign nmap result to an array in bash

I made a bash script to insert the result of the nmap command into an array. The script works on bash 4.3.30, but it does not work when I try to run it on bash 4.4.12. It looks like the array is empty or it only has the first value.
Here is my code:
#!/bin/bash
declare -a IP_ARRAY
NMAP_OUTPUT=`nmap -sL $1 | grep "Nmap scan report" | awk '{print $NF}'`
read -a IP_ARRAY <<< $NMAP_OUTPUT
printf '%s\n' "${IP_ARRAY[@]}"
With bash 4.3, the values of the string NMAP_OUTPUT are copied into the array IP_ARRAY as expected. With the other version they are not, and I can't find the error.
The string NMAP_OUTPUT looks like:
10.0.0.0 10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4 10.0.0.5 10.0.0.6 10.0.0.7 10.0.0.8 10.0.0.9 10.0.0.10
Instead of using my code above, this code works:
IP_ARRAY=(${NMAP_OUTPUT})
I would like to understand why my previous code works on one version and not on the other.
Thank you very much!!!
Your script has multiple issues that could be fixed; the whole thing can be done very simply, in a minimal number of steps.
You are using NMAP_OUTPUT as a plain variable, even though bash supports arrays, which are made for storing lists. Also, unquoted expansions undergo word-splitting by the shell; as a consequence, if an entry contains spaces, there is no way to tell whether it is a separate word or part of a larger one.
Storing the command output in a variable and later parsing it into an array is a roundabout way; you can read the output directly into an array.
Using grep and awk together is not needed; awk can do whatever grep can do.
Always quote shell variable and array expansions. Never use an unquoted expansion (as in <<< $NMAP_OUTPUT); it can have adverse effects when words contain spaces.
Always use lower-case names for user-defined variables, functions, and arrays.
Use the mapfile built-in
Bash 4.0 onwards provides the built-ins mapfile/readarray to read directly from a file or from the output of a command.
All your script needs is
mapfile -t nmapOutput < <(nmap -sL "$1" | awk '/Nmap scan report/{print $NF}')
printf '%s\n' "${nmapOutput[@]}"
I could not infer why your script would behave differently between the bash versions you've indicated; I was able to run your script on the given input under bash 4.4.12.
But the crux of the problem seems to be using variables and arrays interchangeably in the wrong way.
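For what it's worth, one documented change may explain the difference: the bash 4.4 CHANGES file notes that bash no longer splits the expansion of here-strings, as the documentation had always said. Under 4.3, the unquoted <<< $NMAP_OUTPUT was word-split and re-joined with spaces, so read -a saw every address on a single line; under 4.4 the newlines survive and read -a stops at the first one. A small illustration (mine, not from the answer above):
nmap_output=$'10.0.0.1\n10.0.0.2\n10.0.0.3'
read -a arr <<< "$nmap_output" # newlines preserved: read stops after the first line
echo "${#arr[@]}" # prints 1
mapfile -t arr <<< "$nmap_output"
echo "${#arr[@]}" # prints 3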
It seems you're trying to do this the hard way.
Why not simply:
IP_ARRAY=( `nmap -sL 127.1/29 | grep "Nmap scan report" | awk '{print $NF}'` )

How do I execute a shell command, and assign the output to an AC_DEFINE variable in configure.ac

I need to identify the major version of a piece of software; to do this I essentially execute
command --version | head | awk -F "." '{print $1}'
Now I need to assign this output to an AC_DEFINE variable.
I've tried:
majorvar=$(command --version | head | awk -F "." '{print $1}')
AC_DEFINE_UNQUOTED([myVar],["$majorvar"],[description])
Here the literal string 'command' appeared in the resulting config.h file,
and
AC_DEFINE([myVar],0,[descr])
AC_SUBST([myVar],[$(command --version | head | awk -F "." '{print $1}')])
where the value set by the define (0) appears in the result.
Is it possible to obtain the result that I want, i.e.
#define myVar 5
or am I going about this in the wrong way, and if so, how should I go about it?
BR/Patrik
It took me several months to wrap my head around the concept of macros.
As the name already suggests, macros are 'bigger' than scripts. I hope this example can help.
AC_DEFUN([VERSION],
[$(test -f $srcdir/VERSION && cat $srcdir/VERSION || AS_ECHO(["$PACKAGE_VERSION"]); )])
am_stable_version_rx='[[1-9]\.[0-9]+(\.[0-9]+)?]'
am_beta_version_rx="[$am_stable_version_rx[bdfhjlnprtvxz]]"
am_release_type=`AS_ECHO(["$VERSION"]) | LC_ALL=C awk ["
/^$am_stable_version_rx$/ { print \"stable\"; exit(0); }
/^$am_beta_version_rx$/ { print \"beta version\"; exit(0); }
{ print \"development snapshot\"; }"]`
test "$am_release_type" = stable || cat <<EOF
WARNING: You are about to use a $am_release_type of AC_PACKAGE_NAME VERSION.
WARNING: It might easily suffer from new bugs or regressions.
WARNING: You are strongly advised not to use it in production code.
Please report bugs, problems and feedback to <AC_PACKAGE_BUGREPORT>.
EOF
The code snippet defines:
a macro named VERSION, defined with AC_DEFUN
script variables holding the regexes used for version matching
a script variable assignment containing the AS_ECHO macro, which is expanded while the configure script is generated
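Coming back to the concrete question: one plausible fix (a sketch, untested; 'command' stands in for the real tool) is the first attempt without the double quotes around the value, so config.h gets a bare number rather than a string:
majorvar=`command --version | head -n 1 | awk -F "." '{print $1}'`
AC_DEFINE_UNQUOTED([myVar], [$majorvar], [Major version of command])
AC_DEFINE_UNQUOTED expands shell variables in its value at configure time, so with majorvar=5 this should produce #define myVar 5.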

AWK - taking incorrect parameters when using $1, $2 etc

I am using a script which is launched with a number of parameters assigned to it. For example, you can imagine that this is what I am doing when launching the script from the command line:
script.sh "/tmp" "/apps" "/var".
This script runs through a system that I will not go into detail about. This system is the cause of my problem, but I am powerless to make any changes to it.
So, when I use awk '{print $2}', I receive the 2nd parameter I passed to my script ("/apps") rather than the field in the command I am running.
My question: is there any alternative notation that I can use in AWK besides $1, $2, etc. to signify field values?
UPDATE:
Here is an example of one command within the script:
df | grep -e /$ | awk '{print $3/1024}'
The problem is that $3 gets replaced by the 3rd script parameter, "/var", as mentioned above.
Try \$1, \$2 etc. (The "system" seems to drop at least one level of quoting from the command, so let's add one.)
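Applied to the example command inside the script, that would be:
df | grep -e /$ | awk '{print \$3/1024}'
so that the system's substitution pass turns \$3 back into $3 before awk sees it (assuming it strips exactly one level of quoting, as suggested above).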

Extracting words in quotes in shell script

I am making a shell script that will automate the install process of Arch Linux AUR packages. I need to list all dependencies of the package (to check whether they are installed); they appear like this in the install script:
depends=('sdl' 'libvorbis' 'openal')
The easiest way (or the only idea) that I could come up with is something like this:
grep "depends" PKGBUILD | awk -F"'" '{print $2 "\n" $4 "\n" $6;}'
But the dependency count varies from package to package. So how do I output the names in quotes when the word count varies?
Thanks in advance,
-skazhy
If the depends is just one line, one thing you may try is to evaluate the line in bash itself. This will give you an array called "depends" that holds all the values. It seems tricky, but it isn't with dynamic languages:
depend_line=`grep depends $PKGBUILD`
eval "${depend_line}"
echo ${depends[0]} # Will print sdl in your example
You can avoid the security issues of using eval or sourcing a temporary file by using declare:
declare -a "$(grep "depends" PKGBUILD)"
This will create an array called "depends" containing "sdl", "libvorbis" and "openal" based on the example data.
Try this on for size:
grep "depends" PKGBUILD > /tmp/depends
. /tmp/depends
echo ${depends[@]}
Hey look, is that an array? Yes it is.
for d in "${depends[@]}" ; do
printf '"%s"\n' "$d"
done
Note: In a real script you'd want to be more careful with the naming of the temporary file.
You could do something like:
grep "depends" PKGBUILD | perl -e "while(<>){print \"\$1\\n\" while m/'.{-}'/g;}"
awk -F"'" '{for(i=2;i<=NF;i+=2) print($i)}'
