How to do basename on second field in a file and replace it in line - bash

I'm trying to do something like basename on the second field in a file and replace it in the line:
$ myfile=/var/lib/jenkins/myjob/myfile
$ sha512sum "$myfile" | tee myfile-checksum
$ cat myfile-checksum
deb32b1c7122fc750a6742765e0e54a821 /var/lib/jenkins/myjob/myfile
Desired output:
deb32b1c7122fc750a6742765e0e54a821 myfile
So people can easily do sha512sum -c myfile-checksum with no manual edits.
With sed or awk, this is as far as I've gotten so far :)
awk -F/ '{print $NF}' myfile-checksum
sed -i "s|${value}|$(basename $value)|" myfile-checksum
Thanks.

You can set the field separators to both spaces and slashes and print the first and last fields:
awk -F" |/" '{print $1, $NF}'
With your input:
$ awk -F" |/" '{print $1, $NF}' <<< "deb32b1c7122fc750a6742765e0e54a821 /var/lib/jenkins/myjob/myfile"
deb32b1c7122fc750a6742765e0e54a821 myfile
In case your filename contains spaces, save the hash and then remove everything up to the last slash from the whole line, as indicated by Ed Morton:
$ awk '{hash=$1; gsub(/^.*\//,""); print hash, $0}' <<< "deb32b1c7122fc750a6742765e0e54a821 /var/lib/jenkins/myjob/myfile with spaces"
deb32b1c7122fc750a6742765e0e54a821 myfile with spaces
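To rewrite an existing checksum file in place with this approach, here is a minimal sketch (assuming the file is called myfile-checksum as in the question; it prints the hash and name with sha512sum's usual two-space separator so that sha512sum -c still accepts the line):
$ awk '{hash=$1; gsub(/^.*\//,""); print hash "  " $0}' myfile-checksum > myfile-checksum.tmp && mv myfile-checksum.tmp myfile-checksum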

$ awk 'sub(".*/",$1" ")' <<< "deb32b1c7122fc750a6742765e0e54a821 /var/lib/jenkins/myjob/myfile"
deb32b1c7122fc750a6742765e0e54a821 myfile
This will work for any file name except one that contains newlines. If you have that case, let us know.

sha512sum will simply use the file name you've passed to it - unchanged.
If you pass
sha512sum /path/to/file
it will give you:
123456.. /path/to/file
But if you:
pushd /path/to
sha512sum file
popd
it will give you
123456.. file
If the filename is a variable you can use parameter expansion like this:
pushd "${file%/*}"
sha512sum "${file##*/}"
popd
or even
# cd will not change the PWD of the current shell since
# the command runs in a sub shell
(cd "${file%/*}"; sha512sum "${file##*/}")
Given that $file contains the full path, ${file%/*} expands to the path without the filename and ${file##*/} expands to the filename without the path.
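Applied back to the question, a minimal sketch (assuming $myfile holds the full path, as in the original example) that produces a checksum file people can verify with sha512sum -c and no manual edits:
(cd "${myfile%/*}" && sha512sum "${myfile##*/}") | tee myfile-checksum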

Related

redirect output of loop to current reading file

I have a simple script that looks like this:
for file in `ls -rlt *.rules | awk '{print $9}'`
do
cat $file | awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) '!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" $file
done
How can I redirect the output of awk back to the same file it is reading, so the change is applied in place?
Before running the above script, the files contain data like this:
123|test||
After running the script, the files should contain:
123|test|2017_04_05|2017_04_05
You cannot replace your files on the fly like this, mostly because you increase their size. The way to do it is to write to a temporary file, then replace the original:
for file in `ls -1 *.rules `
do
TMP_FILE=/tmp/${file}_$$
awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) '!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" $file > ${TMP_FILE}
mv ${TMP_FILE} $file
done
I would modify Michael Vehrs' otherwise good answer as follows:
ls -rt *.rules | while read file
do
TMP_FILE="/tmp/${file}_$$"
awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) \
'!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" "$file" > "$TMP_FILE"
mv "$TMP_FILE" "$file"
done
Your question uses ls(1) to sort the files by time, oldest first. The above preserves that property. I removed the {} braces because they add nothing when the variable name isn't being interpolated into a longer word, and I added quotes to cope with filenames that include whitespace.
If time-order doesn't matter, I'd consider an inside-out solution: in awk, write to a temporary file instead of standard output, and then rename it with system in an END block. Then if something goes wrong your input is preserved.
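A minimal sketch of that inside-out idea for a single file (the .tmp suffix and the mv via system() are choices made here for illustration, not part of the answers above):
awk -F'|' -v OFS='|' -v DATE="$(date +%Y_%m_%d)" '
    !$3 { $3 = DATE }
    !$4 { $4 = DATE }
    { print > (FILENAME ".tmp") }       # write to file.rules.tmp instead of stdout
    END {
        close(FILENAME ".tmp")          # flush the temporary file
        system("mv " FILENAME ".tmp " FILENAME)
    }
' file.rules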
First of all, it is silly to use a combination of ls -rlt and awk when the only thing you need is the file name. You don't even need ls because the shell glob is expanded by the shell, not ls. Simply use for file in *.rules. Since the date would seem to be the same for every file (unless you run the command at midnight), it is sufficient to calculate it in advance:
date=$(date +%Y"_"%m"_"%d)
for file in *.rules
do
TMP_FILE=$(mktemp ${file}_XXXXXX)
awk -F"|" -v DATE=${date} '!$3{$3=DATE} !$4{$4=DATE} 1' OFS="|" $file > ${TMP_FILE}
mv ${TMP_FILE} $file
done
However, since awk also knows which file it is reading, you could do something like this:
awk -F"|" -v DATE=$(date +%Y"_"%m"_"%d) \
'!$3{$3=DATE} !$4{$4=DATE} { print > (FILENAME ".tmp") }' OFS="|" *.rules
rename .tmp "" *.rules.tmp

need to parse between second underscore and first hyphen of the text using sed

I have an rpm file, e.g. abc_defg_hijd-3.29.0-2_el6_11h.txt.
I need to parse the words between the 2nd underscore _ and first hyphen - of the above text,
so the required output will be hijd.
I was able to parse this with sed, but my approach only worked for this exact example; my filenames differ a little from one another, hence I would like to explicitly parse between the second underscore and the first hyphen.
Use this sed command (on Mac):
sed -E 's/^[^_]*_[^_]*_([^-]*)-.*$/\1/'
OR (on Linux):
sed -r 's/^[^_]*_[^_]*_([^-]*)-.*$/\1/'
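For example, applied to the filename from the question, this should print just the middle part:
$ echo 'abc_defg_hijd-3.29.0-2_el6_11h.txt' | sed -E 's/^[^_]*_[^_]*_([^-]*)-.*$/\1/'
hijd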
Using awk:
awk -F '_' '{sub(/-.*$/, "", $3); print $3}'
$ foo='abc_defg_hijd-3.29.0-2_el6_11h.txt'
$ bar=${foo%%-*} # remove everything after the first -
$ bar=${bar#*_}; bar=${bar#*_} # remove everything before the second _
$ echo "${bar}"
hijd
grep was born to extract:
grep -oP '[^_-]*_\K[^_-]*(?=-)'
example
kent$ echo 'abc_defg_hijd-3.29.0-2_el6_11h.txt'|grep -oP '[^_-]*_\K[^_-]*(?=-)'
hijd
awk is a nuclear bomb for text processing, but it can kill a fly for sure:
awk -F- 'split($1,a,"_")&&$0=a[3]'
or shorter (gawk):
awk -v FPAT="[^-_]*" '$0=$3'
example
kent$ echo 'abc_defg_hijd-3.29.0-2_el6_11h.txt'|awk -F- 'split($1,a,"_")&&$0=a[3]'
hijd
kent$ echo 'abc_defg_hijd-3.29.0-2_el6_11h.txt'|awk -v FPAT="[^-_]*" '$0=$3'
hijd
with GNU sed
echo 'abc_defg_hijd-3.29.0-2_el6_11h.txt' |
sed 's/\([^_]\+_\)\{2\}\([^-]\+\)-.*/\2/g'
hijd
windows batch:
for /f "tokens=3delims=_-" %%i in ("abc_defg_hijd-3.29.0-2_el6_11h.txt") do echo %%i
hijd

sed emulate "tr | grep"

Given the following file
$ cat a.txt
FOO='hhh';BAR='eee';BAZ='ooo'
I can easily parse out one item with tr and grep
$ tr ';' '\n' < a.txt | grep BAR
BAR='eee'
However if I try this using sed it just prints everything
$ sed 's/;/\n/g; /BAR/!d' a.txt
FOO='hhh'
BAR='eee'
BAZ='ooo'
With awk you could do this:
awk '/BAR/' RS=\; file
But in the case of BAZ this would produce an extra trailing newline, because there is no ; after the last word. If you want to remove that newline as well, you would need to do something like:
awk '/BAZ/{sub(/\n/,x); print}' RS=\; file
or with GNU awk or mawk you could use:
awk '/BAZ/' RS='[;\n]'
If your grep has the -o option then you could also try this:
grep -o '[^;]*BAZ[^;]*' file
sed can do it just as you want:
sed -n 's/.*\(BAR[^;]*\).*/\1/gp' <<< "FOO='hhh';BAR='eee';BAZ='ooo'"
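which prints only the matched assignment:
BAR='eee'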
The point here is that you must suppress sed's default output (the whole line) and print only the substitutions you want performed.
Noteworthy points:
sed -n suppresses the default output;
s/.../.../g operates on the entire line, even where it has already matched (greedy);
the p flag on s/.../.../p prints out the substituted result;
the tr job is handled by using ; as the delimiter in the expression \(BAR[^;]*\);
the grep job is represented by the matching of the line itself.
awk 'BEGIN {RS=";"} /BAR/' a.txt
The following grep solution might work for you:
grep -o 'BAR=[^;]*' a.txt
$ sed 's/;/\n/g;/^BAR/!D;P;d' a.txt
BAR='eee'
replace all ; with \n
delete until BAR line is at the top
print BAR line
delete pattern space

Bash/Shell - paths with spaces messing things up

I have a bash/shell function that is supposed to find files then awk/copy the first file it finds to another directory. Unfortunately if the directory that contains the file has spaces in the name the whole thing fails, since it truncates the path for some reason or another. How do I fix it?
If file.txt is in /path/to/search/spaces are bad/ it fails.
dir=/path/to/destination/ | find /path/to/search -name file.txt | head -n 1 | awk -v dir="$dir" '{printf "cp \"%s\" \"%s\"\n", $1, dir}' | sh
cp: /path/to/search/spaces: No such file or directory
If file.txt is in /path/to/search/spacesarebad/ it works, but notice there are no spaces. :-/
Awk's default field separator is whitespace, which is why the path gets cut at the first space. Simply change the separator to something that won't appear in the path, such as a tab, so the whole path stays in $1:
awk -F"\t" ...
Your script should look like:
dir=/path/to/destination/ | find /path/to/search -name file.txt | head -n 1 | awk -F"\t" -v dir="$dir" '{printf "cp \"%s\" \"%s\"\n", $1, dir}' | sh
As pointed out in the comments, you don't really need all those steps; you could simply do (one-liner):
dir=/path/to/destination/ && path="$(find /path/to/search -name file.txt | head -n 1)" && cp "$path" "$dir"
Formatted code (which may look better, in this case ^^):
dir=/path/to/destination/
path="$(find /path/to/search -name file.txt | head -n 1)"
cp "$path" "$dir"
The double quotes assign the entire string to the variable and keep it from being split on the IFS separators (whitespace by default).
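A quick illustration of the difference, using a made-up path (both the path and the destination directory are placeholders):
path="/path/to/search/spaces are bad/file.txt"
cp $path /path/to/destination/      # unquoted: the shell splits the path into several words and cp fails
cp "$path" /path/to/destination/    # quoted: the whole path is passed as a single argument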
If you think spaces are bad, wait till you get into trouble with newlines. Consider for example:
mkdir spaces\ are\ bad
touch spaces\ are\ bad/file.txt
mkdir newlines$'\n'are$'\n'even$'\n'worse
touch newlines$'\n'are$'\n'even$'\n'worse/file.txt
And:
find . -name file.txt
The head command assumes newline-delimited input. You can get around both the space and the newline issues with GNU find and GNU grep (maybe others) by using NUL (\0) delimiters:
find . -name file.txt -print0 | grep -zm1 . | xargs -0 cp -t "$dir"
You could try this.
awk '{print substr($0, index($0,$9))}'
For example, this is the output of the ls command:
-rw-r--r--. 1 root root 73834496 Dec 6 10:55 File with spaces 2
If you use a simple awk like this
# awk '{print $9}'
It returns only
# File
If used with the full command
# awk '{print substr($0, index($0,$9))}'
I get the whole output
File with spaces 2
Here
substr(s, a, b): returns b characters from string s, starting at position a. The parameter b is optional.
For example, if the match is addr:192.168.1.133 and you use substr as follows
# awk '{print substr($2,6)}'
You get the IP, i.e. 192.168.1.133. Note that position 6 is counted starting from the a in addr.
So in the full command, $0 (the whole line) is used instead of $2, and index($0,$9) finds where field 9 starts, so substr prints everything from field 9 onwards. You can change that to index($0,$8) and see that the output changes to
# 10:55 File with spaces 2
index(IN, FIND)
This searches the string IN for the first occurrence of the string FIND, and returns the position in characters where that occurrence begins in the string IN.
I hope that helps. Moreover, if you are assigning this value to a variable in a script, you need to enclose the variable in double quotes; otherwise you will get errors when you do some other operation with the extracted file name.
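A minimal sketch of that last point, assuming you want to copy the extracted file somewhere else (the destination directory is just a placeholder, and tail -n 1 is only used here to pick one ls entry for the demonstration):
file_name="$(ls -l | tail -n 1 | awk '{print substr($0, index($0,$9))}')"
cp "$file_name" /path/to/destination/   # without the quotes, "File with spaces 2" would split into four words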

Shell scripting and using backslashes with back-ticks?

I'm trying to do some manipulation with WordPress and I'm writing a script for it...
# cat /usr/local/uftwf/_wr.sh
#!/bin/sh
# $Id$
#
table_prefix=`grep ^\$table_prefix wp-config.php | awk -F\' '{print $2}'`
echo $table_prefix
#
Yet I'm getting the following output:
# /usr/local/uftwf/_wr.sh
ABSPATH ABSPATH wp-settings.php_KEY LOGGED_IN_KEY NONCE_KEY AUTH_SALT SECURE_AUTH_SALT LOGGED_IN_SALT NONCE_SALT wp_0zw2h5_ de_DE WPLANG WP_DEBUG s all, stop editing! Happy blogging. */
#
Running from command line, I get the correct output that I'm looking for:
# grep ^\$table_prefix wp-config.php | awk -F\' '{print $2}'
wp_0zw2h5_
#
What is going wrong in the script?
The problem is the grep command:
table_prefix=`grep ^\$table_prefix wp-config.php | awk -F\' '{print $2}'`
It needs either three backslashes, not just one, or single quotes (which is much simpler):
table_prefix=$(grep '^$table_prefix' wp-config.php | awk -F"'" '{print $2}')
It's also worth using the $( ... ) notation in general.
The trouble is that the backquotes remove the backslash, so the shell variable is evaluated, and what's passed to grep is, most likely, just ^, which matches every line, since every line has a beginning.
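You can see the effect with a throwaway example (run in a shell where table_prefix is unset; the inner echo merely stands in for the pattern argument that grep receives):
echo `echo ^\$table_prefix`      # backquotes strip the backslash and the empty variable expands: prints ^
echo $(echo ^\$table_prefix)     # inside $( ) the backslash survives: prints ^$table_prefix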
This makes it look as though grep is not omitting the non-matching lines. When you issue echo $table_prefix without quotes, all the whitespace is collapsed into a single output line; if you issue echo "$table_prefix" instead, you would see the match along with all the other whitespace that was output.
I'd recommend the following sed expression instead:
table_prefix=$(sed -n "s/^\$table_prefix.*'\([^']*\)'.*/\1/p" wp-config.php)
You should try
#!/bin/sh
table_prefix=$(awk -F"'" '/^\$table_prefix/{print $2}' wp-config.php)
echo $table_prefix
Does this one work for you?
awk -F\' '/^\$table_prefix/ {print $2}' wp-config.php
Update
If you are using shell scripting anyway, there is no need to call awk or grep at all:
#!/bin/sh
while read varName op varValue theRest
do
if [ "_$varName" = "_\$table_prefix" ]
then
table_prefix=${varValue//\'/} # Remove the single quotes
table_prefix=${table_prefix/;/} # Remove the semicolon
break
fi
done < wp-config.php
echo "Found: $table_prefix"
