I'm using the ls command to list files to be used as input. For each file found, I need to:
1. Run a system command (importdb) and write its output to a log file.
2. Write to an error log file if the first character of column 2, line 6 of the log file created in step 1 is not "0".
3. Rename the processed file so it won't get re-processed on the next run.
My script:
#!/bin/sh
ls APCVENMAST_[0-9][0-9][0-9][0-9]_[0-9][0-9] |
while read LINE
do
importdb -a test901 APCVENMAST ${LINE} > importdb${LINE}.log
awk "{if (NR==6 && substr($2,1,1) != "0")
print "ERROR processing ", ${LINE} > importdb${LINE}err.log
}" < importdb${LINE}.log
mv ${LINE} ${LINE}.PROCESSED
done
This is very preliminary code, and I'm new to this, but I can't get past parsing errors like the one below.
The error context is:
{if (NR==6 && >>> substr(, <<< awk The statement cannot be correctly parsed.
Issues:
Never double quote an awk script.
Always quote literal strings.
Pass in shell variables correctly, either by using -v if you need to access the value in the BEGIN block, i.e. awk -v awkvar="$shellvar" 'condition{code}' file, or by a trailing assignment, i.e. awk 'condition{code}' awkvar="$shellvar" file
Always quote shell variables.
Conditional should be outside block.
There is ambiguity with redirection and concatenation precedence so use parenthesis.
So the corrected (syntactical) script:
awk 'NR==6 && substr($2,1,1) != "0" {
    print "ERROR processing ", line > ("importdb" line "err.log")
}' line="${LINE}" "importdb${LINE}.log"
You have many more issues, but as I don't know what you are trying to achieve it's difficult to suggest the correct approach...
You shouldn't parse the output of ls.
Awk reads files itself, so you don't need to loop using shell constructs.
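Putting those points together, the whole thing could be written without parsing ls at all. A minimal sketch, reusing the importdb invocation from your script:
#!/bin/sh
# Loop over the glob directly instead of parsing ls output.
for f in APCVENMAST_[0-9][0-9][0-9][0-9]_[0-9][0-9]; do
    [ -e "$f" ] || continue    # skip the literal pattern when nothing matches
    importdb -a test901 APCVENMAST "$f" > "importdb${f}.log"
    awk 'NR==6 && substr($2,1,1) != "0" {
        print "ERROR processing ", line > ("importdb" line "err.log")
    }' line="$f" "importdb${f}.log"
    mv "$f" "${f}.PROCESSED"
done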
I'm trying to make an awk command which stores an entire config file as variables.
The config file is in the following form (keys never have spaces, but values may):
key=value
key2=value two
And my awk command is:
$(awk -F= '{printf "declare %s=\"%s\"\n", $1, $2}' $file)
Running this without the outer subshell $(...) results in the exact commands that I want being printed, so my question is less about awk, and more about how I can run the output of awk as commands.
The command evaluates to:
declare 'key="value"'
which is somewhat of a problem, since then the double quotes are stored with the value. Even worse is when a space is introduced, which results in:
declare 'key2="value' two"
Of course, I can't just remove the quotes, because then the multi-word values cause problems.
I've tried most every solution I could find, such as set -f, eval, and system().
You don't need to use Awk for this; you can do it with the shell built-ins available. Read the config file properly using input redirection:
#!/bin/bash
while IFS== read -r k v; do
declare "$k"="$v"
done < config_file
and source the file as
$ source script.sh
$ echo "$key"
value
$ echo "$key2"
value two
If source is not available explicitly, the POSIX way of doing it is just
. ./script.sh
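If your config file might also contain blank lines or # comments (an assumption; your sample shows neither), the same loop can skip them:
#!/bin/bash
while IFS== read -r k v; do
    case $k in
        ''|'#'*) continue ;;    # skip blank and comment lines
    esac
    declare "$k"="$v"
done < config_file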
I have a file with below commands
cat /some/dir/with/files/file1_name.tsv|awk -F "\\t" '{print $21$19$23$15}'
cat /some/dir/with/files/file2_name.tsv|awk -F "\\t" '{print $2$13$3$15}'
cat /some/dir/with/files/file3_name.tsv|awk -F "\\t" '{print $22$19$3$15}'
When I loop through the file to run the commands, I get the error below:
cat file | while read line; do $line; done
cat: invalid option -- 'F'
Try `cat --help' for more information.
You are not executing the command the way you intended. Since you are reading the file line by line (for an unknown reason), you could pass each line to the interpreter directly, as below:
#!/bin/bash
# ^^^^ for running under 'bash' shell
while IFS= read -r line
do
printf "%s" "$line" | bash
done <file
But this has the overhead of forking a new process for each line of the file. If the commands in the file are harmless and safe to run in one shot, you can just do
bash file
and be done with it.
Also, when using awk, just run it as below on each of the files, to avoid the useless cat:
awk -F "\\t" '{print $21$19$23$15}' file1_name.tsv
You are expecting the pipe (|) symbol to act as you are accustomed to, but it doesn't. To help you understand, try this :
A="ls / | grep e" # Loads a variable with a command with pipe
$A # Does not work
eval "$A" # Works
When expanding a variable without using eval, expansion and word splitting occur after the shell has already interpreted redirections and pipes, so your pipe symbol is seen just as a literal character.
Some options you have :
A) Avoid piping, by passing the file name as an argument
awk -F "\\t" '{print $21$19$23$15}' /some/dir/with/files/file1_name.tsv
B) Use eval as shown below, the potential security implications of which I would suggest you to research.
C) Put arguments in file and parse it, avoiding the use of eval, something like :
# Assumes arguments separated by spaces; reads one line of arguments
IFS=' ' read -r -a arguments
awk "${arguments[@]}"
D) Implement the parsing of your data files in Bash instead of awk, and use your configuration file to specify output without the need for expanding anything (e.g. by specifying fields to print separated by spaces).
The first three approaches involve some form of interpretation of outside data as code, and that comes with risks if the file used as input cannot be guaranteed safe. Approach C might be considered a bit better in that regard, but since the command you are calling is awk, an actual program is passed to awk, so whatever awk can do, an attacker (or careless user) with write access to your file can cause your script to do anything awk can do.
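For illustration, approach D could look like the sketch below. The configuration format here (a data file name followed by the 1-based field numbers to print) is my assumption, not something from your question:
#!/bin/bash
# fields.conf is a hypothetical config with lines like:
#   /some/dir/with/files/file1_name.tsv 21 19 23 15
while read -r file fields; do
    while IFS=$'\t' read -r -a cols; do
        out=""
        for n in $fields; do       # unquoted on purpose: split the field numbers
            out+="${cols[n-1]}"    # awk fields are 1-based, bash arrays 0-based
        done
        printf '%s\n' "$out"
    done < "$file"
done < fields.conf
Here nothing from the config file is handed to an interpreter as a program; it is used only as file names and field indices.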
Can someone please help with this because I can't seem to find a solution. I have the following script that works fine:
#!/bin/bash
#Checks the number of lines in the userdomains file
NUM=`awk 'END {print NR}' /etc/userdomains.hristian`;
echo $NUM
#Prints out a particular line from the file (should work with $NUM eventually)
USER=`sed -n 4p /etc/userdomains.hristian`
echo $USER
#Edits the output so that only the username is left
USER2=`echo $USER | awk '{print $NF}'`
echo $USER2
However, when I substitute the 4 on line 12 with the variable $NUM, like this, it doesn't work:
USER=`sed -n $NUMp /etc/userdomains.hristian`
I tried a number of different combinations of quotes and ${}, but none of them seemed to work, because I'm a bash newbie. Help please :)
I'm not sure exactly what you've already tried but this works for me:
$ cat out
line 1
line 2
line 3
line 4
line 5
$ num=4
$ a=`sed -n ${num}p out`
$ echo "$a"
line 4
To be clear, the issue here is that you need to separate the expansion of $num from the p in the sed command. That's what the curly braces do.
Note that I'm using lowercase variable names. Uppercase ones should be reserved for use by the shell. I would also recommend using the more modern $() syntax for command substitution:
a=$(sed -n "${num}p" out)
The double quotes around the sed command aren't necessary but they don't do any harm. In general, it's a good idea to use them around expansions.
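A quick illustration of what the quotes prevent:
$ f="a b"
$ printf '<%s>\n' $f       # unquoted: word splitting yields two arguments
<a>
<b>
$ printf '<%s>\n' "$f"     # quoted: one argument, whitespace intact
<a b>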
Presumably the script in your question is a learning exercise, which is why you have done all of the steps separately. For the record, you could do the whole thing in one go like this:
awk 'END { print $NF }' /etc/userdomains.hristian
In the END block, the values from the last line in the file can still be accessed, so you can print the last field directly.
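And if, as in your original script, you want that result in a variable, wrap it in a command substitution:
user2=$(awk 'END { print $NF }' /etc/userdomains.hristian)
echo "$user2"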
You're trying to evaluate the variable $NUMp rather than $NUM. Try this instead:
USER=`sed -n ${NUM}p /etc/userdomains.hristian`
I am trying to check the ends of lines of code for semicolons, as they are causing some issues for a server I have running. To do this I am using a bash script (as I am more familiar with bash) to read through the lines and return those that don't end with a semicolon. My bash script is as follows:
while read line
do
if[$line!=*;]
echo $line
fi
done < $1
When I run the script, it says there is an error near fi, but I cannot figure it out. I also realize this will return statements like if and while, but that will be fine for my needs.
Given the sample input
use CGI;
print "<html>"
print "<head>";
print "</head>";
print "<body><p> HELLO WORLD </p>";
print "</body>";
print "</html>"
this should be the output
print "<html>"
print "</html>"
I think the easiest way would be with grep. Given an input.txt file like this:
spam
foo;<Space><Space>
sausage
baked;<Tab>
beans
unladen;
You could do
grep -v ';\s*$' input.txt
and obtain
spam
sausage
beans
grep's -v flag means "return all lines not matching this regular expression", so it will skip all lines ending with semi-colons.
If your lines also have whitespace after the semicolons, the \s* means "any sequence of whitespace characters", so grep will skip those lines as well.
The reason you have a problem is that your if statement requires a then. You also need some more spaces, and to quote your variables. Even then it still won't work, though: your comparison is wrong, too - that's not how [ compares strings against patterns. You can use bash's [[ instead:
while read line
do
if [[ "$line" != *\; ]]
then
echo "$line"
fi
done < "$1"
But even with all that, what you really should be doing is:
grep -v ';$' "$1"
To find the lines with a semicolon, and the lines without a trailing semicolon, in a file on Linux:
$ grep ';' file_name
$ grep -v ';\s*$' file_name
Is it possible to have an awk command within a bash script return values to a bash variable? That is, if my awk script does some arithmetic operations, can I store the answers in variables so they can be accessed in the bash script? If so, how do I distinguish between multiple return variables? Thanks.
No. You can use exit to return an error code, but in general you can't modify the shell environment from a subprocess.
You can also, of course, print the desired content in awk and put it into variables in bash by using read:
read a b c <<< "$(echo "foo" | awk '{ print $1, $1, $1 }')"
Now $a, $b and $c are all 'foo'. Note that awk prints the three values on one line so read can split them, and that you have to use the <<< "$(...)" syntax to get read to work: if you use a pipeline of any sort, a subprocess is created too, and the environment in which read creates the variables is lost when the pipeline finishes executing.
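You can see the difference in an interactive bash session (with default settings, where each part of a pipeline runs in a subshell):
$ unset a
$ echo foo | read a            # read runs in a subshell; its $a is lost
$ echo "got: $a"
got:
$ read a <<< "$(echo foo)"     # no pipeline: read runs in the current shell
$ echo "got: $a"
got: foo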
var=$(awk '{ print $1 }' file)
This sets var to the output of awk (here reading from file; without a file argument awk reads standard input). Then you can use string functions or whatever from there to pick apart the value, or have awk print only the part you want.
I know this question is old, but there's another way to do this that's worked really well for me, and that's using an unused file descriptor. We all know stdin (&0), stdout (&1), and stderr (&2), but as long as you redirect it somewhere (i.e. actually use it), there's no reason you can't use fd 3 (&3).
The advantage to this method over other answers is that your awk script can still write to stdout like normal, but you also get the result variables in bash.
In your awk script, at the end, do something like this:
END {
    # print the state to fd3
    printf "SUM=%s;COUNT=%s\n", tot, cnt | "cat 1>&3"
}
Then, in your bash script, you can do something like this:
awk -f myscript.awk <mydata.txt 3>myresult.sh
source myresult.sh
echo "SUM=${SUM} COUNT=${COUNT}"