BASH - add prefix (file path) to each line in text file using awk - bash

I am trying to get the full path of each file within a directory. So far this is what I have in bash.
prefix="s3://${s3_bucket}/${s3_folder}/$(date --date="$i days ago" +"%Y/%m/%d")/"
#echo $prefix
aws s3 ls s3://${s3_bucket}/${s3_folder}/$(date --date="$i days ago" +"%Y/%m/%d")/ | sed -n 's/.*\([0-9][0-9]-h.*gz\)/\1/p' | awk '$0="${prefix}"$0' >> ${s3_files_1}
In my output, I am getting the following:
${prefix}file1.gz
${prefix}file2.gz
The output I am looking for is something like below.
s3://my_bucket/my_folder/file1.gz
s3://my_bucket/my_folder/file2.gz
My issue is with the way the awk command is interpreting the variable ${prefix}. Can anyone please help?

You can use -v to pass shell variable contents to awk:
prefix="s3://my_bucket/my_folder/"
echo "file1.gz" | awk -v myprefix="${prefix//\\/\\\\}" '{ print myprefix $0 }'
Sadly, awk -v is not data safe: awk interprets backslash escape sequences in the assigned value. This example uses parameter expansion to double each backslash so that they reach awk intact.
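As an alternative to escaping the value for -v, a POSIX awk can read it from the environment through ENVIRON, which does no escape processing. The following is a minimal sketch; the function name add_prefix and the sample prefix are made up for illustration.

```shell
# Hypothetical sample prefix; in the question it is built from bucket, folder and date.
prefix='s3://my_bucket/my_folder/'

# Prepend the first argument to every line read from stdin.
# The value is placed in the environment of this one awk process only,
# and awk reads it verbatim from ENVIRON (backslashes are left alone).
add_prefix() {
  myprefix="$1" awk '{ print ENVIRON["myprefix"] $0 }'
}

printf '%s\n' file1.gz file2.gz | add_prefix "$prefix"
```

In the original pipeline this would replace the final awk stage, for example `... | sed -n '...' | add_prefix "$prefix" >> "${s3_files_1}"`.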

Related

give a file without changing the name in script [duplicate]

At the beginning I have a file.txt, which contains several pieces of information that I extract using the grep command, as you can see in the script.
I want to give the script the file to process instead of file.txt, without editing the file name in the script each time. For example, if the file is named Me.txt, I don't want to go into the script and write Me.txt in each grep command, especially if I have dozens of commands.
Is there a way to do this?
#!/bin/bash
grep teste file.txt > testline.txt
awk '{print $2}' testline.txt > test.txt
echo '#'
echo '#'
grep remote file.txt > remoteline.txt
awk '{print $3}' remoteline.txt > remote.txt
echo '#'
echo '#'
grep adresse file.txt > adresseline.txt
awk '{print $2}' adresseline.txt > adresse.txt
Using a parameter, as many contributors here suggested, is of course the obvious approach, and the one usually taken in such cases, so I want to extend this idea:
If you do it naively as
filename=$1
you have to supply the name on every invocation. You can improve on this by providing a default value for the case the parameter is missing:
filename=${1:-file.txt}
But sometimes you are in a situation where, for some time (while working on a specific task), you need the same filename over and over, and the default value happens not to be the one you need. Another way to pass information to a program is via the environment. If you set the filename by
filename=${MOOFOO:-file.txt}
it means that - assuming your script is called myscript.sh - if you invoke your script by
MOOFOO=myfile.txt myscript.sh
it uses myfile.txt, while if you call it by
myscript.sh
it uses the default file.txt. You can also set MOOFOO in your shell, as
export MOOFOO=myfile.txt
and then, even a lone execution of
myscript.sh
will use myfile.txt instead of the default file.txt.
The most flexible approach is to combine both, and this is what I often do in such a situation. If your script contains
filename=${1:-${MOOFOO:-file.txt}}
it takes the name from the 1st parameter, but if there is no parameter, takes it from the variable MOOFOO, and if this variable is also undefined, uses file.txt as the last fallback.
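The fallback chain can be checked with a small sketch. The function name pick_filename is hypothetical; it only wraps the expansion so the three cases can be run side by side.

```shell
# Resolve the input file: first argument, else $MOOFOO, else file.txt.
pick_filename() {
  printf '%s\n' "${1:-${MOOFOO:-file.txt}}"
}

unset MOOFOO
pick_filename                                 # prints file.txt
( MOOFOO=myfile.txt; pick_filename )          # prints myfile.txt
( MOOFOO=myfile.txt; pick_filename Me.txt )   # prints Me.txt
```

The subshells keep the MOOFOO assignment from leaking into later commands.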
You should pass the filename as a command line parameter so that you can call your script like so:
script <filename>
Inside the script, you can access the command line parameters in the variables $1, $2,.... The variable $# contains the number of command line parameters passed to the script, and the variable $0 contains the path of the script itself.
As with all variables, you can put the variable name in curly brackets, which is sometimes useful (and is required for positional parameters beyond $9): ${1}, ${2}, ...
#!/bin/bash
# Require exactly one argument: the file to search.
if [ $# = 1 ]; then
    filename=${1}
else
    echo "USAGE: $(basename "${0}") <filename>" >&2
    exit 1
fi
grep teste "${filename}" > testline.txt
awk '{print $2}' testline.txt > test.txt
echo '#'
echo '#'
grep remote "${filename}" > remoteline.txt
awk '{print $3}' remoteline.txt > remote.txt
echo '#'
echo '#'
grep adresse "${filename}" > adresseline.txt
awk '{print $2}' adresseline.txt > adresse.txt
By the way, you don't need two different files to achieve what you want, you can just pipe the output of grep straight into awk, e.g.:
grep teste "${filename}" | awk '{print $2}' > test.txt
but then again, awk can do the regex match itself, reducing it all to just one command:
awk '/teste/ {print $2}' "${filename}" > test.txt
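Taking that one step further, a single awk pass can fill all three output files. This is only a sketch: the layout of file.txt is not shown in the question, so the sample input below is invented, and it runs in a temporary directory to avoid touching real files.

```shell
# Work in a scratch directory with hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > sample.txt <<'EOF'
teste alpha one
remote beta gamma
adresse delta two
EOF
filename=sample.txt

# One read of the input; each rule writes its field to its own file.
awk '
  /teste/   { print $2 > "test.txt" }
  /remote/  { print $3 > "remote.txt" }
  /adresse/ { print $2 > "adresse.txt" }
' "$filename"
```

One difference from the shell redirections: awk only creates an output file once a line matches, so a pattern with no matches leaves no (empty) file behind.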

How to remove the username/hostname line from an output on Korn Shell?

I run the command
df -gP /data1 /data2 | grep -v File | awk '{print $1}' |
awk -F/dev/ '$0=$2' | tr '\n' '
on the AIX shell (ksh) and it prints the output below:
lv_data01 lv_data02 root#testhost:/
However, I would like the output to be printed this way. Could someone help?
lv_data01 lv_data02
Using grep … | awk … | awk … is not necessary; a single awk could do the whole job. So could sed and it might even be easier. I'd be tempted to deal with the spacing by using:
x=$(df … | sed …); echo $x
The tr command, once corrected, replaces newlines with spaces, so the prompt follows without a newline before it. The ; echo suggestion adds the missing newline; the echo $x suggestion (note no double quotes) does too.
As for the sed command:
sed -n '/File/!{ s/[[:space:]].*//; s%^.*/dev/%%p; }'
Don't print anything by default (-n).
If the line doesn't match File (doing the work of grep -v):
  remove the first whitespace character (blank or tab) and everything after it (doing the work of awk '{print $1}');
  replace everything up to and including /dev/ with nothing and print (doing the work of awk -F/dev/ '$0=$2').
The command substitution and capture, followed by echo, deals with spaces and newlines.
So, my suggested solution is:
x=$(df -gP /data1 /data2 | sed -n '/File/!{ s/[[:space:]].*//; s%^.*/dev/%%p; }'); echo $x
You could add unset x after the echo if you are going to be using this directly in the shell and not in a shell script. If it'll be encapsulated in a shell script, you don't have to worry about it.
I'm blithely assuming the output from df -gP won't contain a path such as this, with two occurrences of /dev:
/who/knows/dev/lv_data01/dev/bin
If that's a real problem, you can fix the sed script, but I don't think it will be. It's one thing the second awk script in the question handles differently.
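For comparison, the single-awk variant mentioned at the start could look roughly like this. The function name lv_names is made up, and the here-document is an invented stand-in for `df -gP /data1 /data2` output, which is only available on AIX.

```shell
# Join the logical-volume names from df -gP style output on one line.
lv_names() {
  awk '
    !/File/ {
      sub(/^.*\/dev\//, "", $1)    # drop everything up to and including /dev/
      printf "%s%s", sep, $1       # space-separate, no trailing space
      sep = " "
    }
    END { print "" }               # end with the newline that tr swallowed
  '
}

# Hypothetical sample standing in for: df -gP /data1 /data2
lv_names <<'EOF'
Filesystem    GB blocks      Used Available Capacity Mounted on
/dev/lv_data01     10.00      1.00      9.00      10% /data1
/dev/lv_data02     10.00      1.00      9.00      10% /data2
EOF
```

On the real system the call would be `df -gP /data1 /data2 | lv_names`. Like the sed script, the greedy match keeps only what follows the last /dev/ in the first field.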

Extract specific string from line with standard grep,egrep or awk

I'm trying to extract a specific string from grep output.
uci show minidlna
produces a large list
.
.
.
minidlna.config.enabled='1'
minidlna.config.db_dir='/mnt/sda1/usb/db'
minidlna.config.enable_tivo='1'
minidlna.config.wide_links='1'
.
.
.
So I tried to narrow down what I wanted by running
uci show minidlna | grep -oE '\bdb_dir=\S+'
this narrows the output to
db_dir='/mnt/sda1/usb/db'
what i want is to output only
/mnt/sda1/usb/db
without the quotes and without the leading "db_dir=", so I can run rm /mnt/sda1/usb/db/file.db
I've used the answers found here
How to extract string following a pattern with grep, regex or perl
and that's as close as I got.
EDIT: After using Ed Morton's awk command I needed to pass the output to the rm command.
I used:
| ( read DB; rm "$DB/files.db" )
read DB stores the output in the variable DB.
( ... ) groups the commands in a subshell.
rm "$DB/files.db" deletes the file files.db.
Is this what you're trying to do?
$ awk -F"'" '/db_dir/{print $2}' file
/mnt/sda1/usb/db
That will work in any awk in any shell on every UNIX box.
If that's not what you want then edit your question to clarify your requirements and post more truly representative sample input/output.
Using sed with some effort to avoid single quotes:
sed -n 's/^minidlna.config.db_dir=\s*\S\(\S*\)\S\s*$/\1/p' input
Well, so you end up having a string like db_dir='/mnt/sda1/usb/db'.
I would first remove the quotes by piping this to
.... | tr -d "'"
Now you end up with a string like db_dir=/mnt/sda1/usb/db.
Say you have this string stored in a variable named confstr, then
${confstr##*=}
gives you just /mnt/sda1/usb/db, since the pattern *= matches everything from the start up to and including the equal sign, and ## removes the longest matching prefix.
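Put together, the two steps look like this. The value assigned to confstr is the sample line from the question; in practice it would come from the grep output.

```shell
# Sample line as produced by the grep in the question.
confstr="db_dir='/mnt/sda1/usb/db'"

# Step 1: delete the single quotes.
confstr=$(printf '%s\n' "$confstr" | tr -d "'")

# Step 2: strip the longest prefix ending in "=".
db_dir=${confstr##*=}

printf '%s\n' "$db_dir"    # prints /mnt/sda1/usb/db
```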
I would do this:
Once you have either extracted your line into file.txt or piped it into this command, split the fields on the quote character. Use printf to generate the rm command and pass it to bash to execute.
$ awk -F"'" '{printf "rm %s/file.db\n", $2}' file.txt | bash
rm: /mnt/sda1/usb/db/file.db: No such file or directory
With your original command:
$ uci show minidlna | grep -oE '\bdb_dir=\S+' | \
awk -F"'" '{printf "rm %s/file.db\n", $2}' | bash
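A variation that avoids generating a command string is to capture the directory in a variable and call rm directly. This is a sketch, not part of the answers above: the function name remove_db_file is hypothetical, and it assumes the `uci show minidlna` line format shown in the question.

```shell
# Read uci-style lines on stdin and delete file.db from the directory
# named by the db_dir setting. Returns non-zero if no db_dir was found.
remove_db_file() {
  db=$(awk -F"'" '/db_dir/ { print $2 }')
  [ -n "$db" ] || return 1
  rm -- "$db/file.db"
}

# Intended use on the router:
#   uci show minidlna | remove_db_file
```

Quoting "$db/file.db" keeps a path containing spaces as one argument, and the emptiness check stops rm from being run against /file.db when the setting is missing.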

Bash read filename and return version number with awk

I am trying to use one or two lines of Bash (that can be run in a command line) to read a folder-name and return the version inside of the name.
So if I have myfolder_v1.0.13 I know that I can use echo "myfolder_v1.0.13" | awk -F"v" '{ print $2 }' and it will return with 1.0.13.
But how do I get the shell to read the folder name and pipe with the awk command to give me the same result without using echo? I suppose I could always navigate to the directory and translate the output of pwd into a variable somehow?
Thanks in advance.
Edit: As soon as I asked I figured it out. I can use
result=${PWD##*/}; echo $result | awk -F"v" '{ print $2 }'
and it gives me what I want. I will leave this question up for others to reference unless someone wants me to take it down.
But you don't need awk at all here; just use bash parameter expansion.
string="myfolder_v1.0.13"
printf "%s\n" "${string##*v}"
1.0.13
You can use
basename "$(cd "foldername" ; pwd )" | awk -Fv '{print $2}'
to get the shell to give you the directory name, but if you really want to use the shell, you could also avoid the use of awk completely:
Assuming you have the path to the folder with the version number in the parameter "FOLDERNAME":
echo "${FOLDERNAME##*v}"
This removes the longest prefix matching the glob expression "*v" in the value of the parameter FOLDERNAME.
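Both expansions can be combined so that a full path works as input. A minimal sketch, with version_of as a made-up helper name:

```shell
# Print the version suffix of a folder name such as myfolder_v1.0.13.
version_of() {
  name=${1##*/}                 # keep only the last path component
  printf '%s\n' "${name##*v}"   # drop everything up to the last "v"
}

version_of "/some/path/myfolder_v1.0.13"   # prints 1.0.13

# For the current directory:
#   version_of "$PWD"
```

If the name contains no "v", the pattern does not match and the name is printed unchanged.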

Bash - nested variable expansion inside command assignment

I'm sure this is simple, but I'm new to bash scripts and the syntactical process here is beyond me. I can't seem to find the right search terms to find what I need. This script is really just a stepping stone to my final version.
Invocation: ./myscript.sh testFile
Script:
#!/bin/bash
file=$1
awk='{print $9}' # do not expand $9
awk="'/$file/$awk'" # DO expand file argument
echo "$awk" # prints '/graphic/{print $9}' (as expected)
echo "ls -l | awk $awk" # prints ls -l | awk '/graphic/{print $9}' (as expected)
test="$(ls -l | awk $awk)" # error
echo "$test"
Output:
'/testFile/{print $9}'
ls -l | awk '/testFile/{print $9}'
awk: syntax error at source line 1
context is
>>> ' <<<
missing }
awk: bailing out at source line 1
Even though I can copy and run the second echo'd line and it works successfully, the failure of the command leads me to believe this is not simple string concat but some crazier voodoo.
I've tried some other version as well like making a variable containing the whole command, but then I get even less expected output.
If I do test="$($awk)" I get
'/testFile/{print $9}'
ls -l | awk '/testFile/{print $9}'
ls: $9}': No such file or directory
ls: '/testFile/{print: No such file or directory
ls: awk: No such file or directory
ls: |: No such file or directory
If I do test=$(awk) I get
'/testFile/{print $9}'
ls -l | awk '/testFile/{print $9}'
usage: awk [-F fs] [-v var=value] [-f progfile | 'prog'] [file ...]
Since my Google queries basically only contain the words "bash command variable assignment", I can't get anything related to the nested variable expansion that I have here. I understand what it's doing based on the error, but I couldn't say why or how to fix it.
If someone could provide a fix as well as explain or point me to a resource explaining what's going on here, it would be greatly appreciated. Or maybe there's even another approach that would simplify the logic.
Thanks!
Change to:
test="$(ls -l | awk "$awk")"
awk requires the script to be a single argument. But when you expand a variable outside double quotes, the shell performs word splitting, so $awk is expanded into two arguments:
'/testFile/{print
$9}'
The quotes keep the expansion as a single argument.
Also, take the single quotes out of
awk="'/$file/$awk'"
Single quotes are not processed after expanding a variable, so they'll be passed literally to awk. Putting double quotes around $awk achieves the result you were trying to get with these quotes.
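Another approach that sidesteps the nested quoting is to keep the awk program as a fixed single-quoted string and pass the file name in with -v. The sketch below uses a hypothetical function name, names_matching, and a made-up ls -l line as sample input.

```shell
# Print field 9 (the name column of ls -l) for lines matching a pattern.
# The pattern reaches awk as a variable, so the program text never has
# to be assembled from shell strings.
names_matching() {
  awk -v pat="$1" '$0 ~ pat { print $9 }'
}

# Usage mirroring the script in the question:
#   test=$(ls -l | names_matching "$1")

# Hypothetical ls -l line used as sample input.
printf '%s\n' '-rw-r--r-- 1 user group 0 Jan  1 12:00 testFile' |
  names_matching testFile
```

Because -v processes backslash escapes, a pattern containing backslashes would need them doubled first.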
