How can I pipe the results of grep to a perl one-liner? - bash

I have a grep command that finds the files that need a value replaced. Then I have a perl one-liner that needs to be executed on each file to replace a variable found in that file.
How can I pipe the results of my grep command to the perl one-liner?
grep -Irc "/env/file1/" /env/scripts/ | cut -d':' -f1 | sort | uniq
/env/scripts/config/MainDocument.pl
/env/scripts/config/MainDocument.pl2
/env/scripts/config/MainDocument.pl2.bak
perl -p -i.bak -e 's{/env/file1/}{/env/file2/}g' /env/scripts/config/MainDocument.pl
Thanks for your help.

With the $(...) bash syntax.
perl -p -i.bak -e 's{/env/file1/}{/env/file2/}g' $(grep -Irc "/env/file1/" /env/scripts/ | cut -d':' -f1 | sort | uniq)

I'd skip the perl one-liner and use xargs and sed instead.
grep -Irc "/env/file1/" /env/scripts/ | cut -d':' -f1 | sort | uniq | xargs sed -i.bak 's:/env/file1/:/env/file2/:g'
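A variant of the same idea, assuming GNU grep and sed: grep -Irl already prints each matching file once, so the cut/sort/uniq stages can be dropped, and NUL separators (-Z with xargs -0) keep filenames containing spaces intact.
# -l lists matching files once each, -Z separates them with NULs
grep -IrlZ "/env/file1/" /env/scripts/ | xargs -0 sed -i.bak 's:/env/file1/:/env/file2/:g'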

Related

grep return the string in between words

I am trying to use grep to filter out the RDS snapshot identifier from the rds describe-db-snapshots command output below:
"arn:aws:rds:ap-southeast-1:123456789:snapshot:rds:apple-pie-2018-05-06-17-12",
"rds:apple-pie-2018-05-06-17-12",
how to return the exact output as in
rds:apple-pie-2018-05-06-17-12
I tried using
grep -Eo ",rds:"
but was not able to get it to work.
The following awk may also help here:
awk 'match($0,/^"rds[^"]*/){print substr($0,RSTART+1,RLENGTH-1)}' Input_file
Your grep -Eo ",rds:" is failing for several reasons:
You did not add a " to the string to match.
Between the comma and rds you need to match the " character.
You are trying to match a comma that may be on the previous line: your sample input is 2 lines (with a newline in between), but perhaps the real input has no newline in between.
You want to match until the next double quote.
You can support both input styles (with and without the newline) with
grep -Eo '(,|^)"rds:[^"]*' rdsfile |cut -d'"' -f2
You can do this in one command with
sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p' rdsfile
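A quick way to check the sed version against the sample input (using a here-document in place of rdsfile):
sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p' <<'EOF'
"arn:aws:rds:ap-southeast-1:123456789:snapshot:rds:apple-pie-2018-05-06-17-12",
"rds:apple-pie-2018-05-06-17-12",
EOF
# prints: rds:apple-pie-2018-05-06-17-12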
EDIT: Manipulating stdout instead of the file works with similar commands:
yourcommand | grep -Eo '(,|^)"rds:[^"]*' |cut -d'"' -f2
# or
yourcommand | sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p'
You can also test the original commands with yourcommand > rdsfile.
You might notice that rdsfile is missing data that you saw on the screen; in that case, add 2>&1 to capture stderr as well:
yourcommand 2>&1 | grep -Eo '(,|^)"rds:[^"]*' |cut -d'"' -f2
# or
yourcommand 2>&1 | sed -rn 's/.*(,|^)"(rds:[^"]*).*/\2/p'

How to write a shell script that reads all the file names in the directory and finds a particular string in file names?

I need a shell script to find a string in file names like the following one:
FileName_1.00_r0102.tar.gz
And then pick the highest value from multiple occurrences.
I am interested in the "1.00" part of the file name.
I am able to get this part separately in the UNIX shell using the commands:
find /directory/*.tar.gz | cut -f2 -d'_' | cut -f1 -d'.'
1
2
3
1
find /directory/*.tar.gz | cut -f2 -d'_' | cut -f2 -d'.'
00
02
05
00
The problem is there are multiple files with this string:
FileName_1.01_r0102.tar.gz
FileName_2.02_r0102.tar.gz
FileName_3.05_r0102.tar.gz
FileName_1.00_r0102.tar.gz
I need to pick the file with FileName_("highest value")_r0102.tar.gz
But since I am new to shell scripting, I am not able to figure out how to handle these multiple instances in a script.
The script which I came up with just for the integer part is as follows:
#!/bin/bash
for file in /directory/*
file_version = find /directory/*.tar.gz | cut -f2 -d'_' | cut -f1 -d'.'
done
OUTPUT: file_version:command not found
Kindly help.
Thanks!
If you just want the latest version number:
cd /path/to/files
printf '%s\n' *r0102.tar.gz | cut -d_ -f2 | sort -n -t. -k1,2 |tail -n1
If you want the file name:
cd /path/to/files
latest=$(printf '%s\n' *r0102.tar.gz | cut -d_ -f2 | sort -n -t. -k1,2 | tail -n1)
printf '%s\n' *${latest}_r0102.tar.gz
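A quick sanity check of that sort invocation on the sample version numbers from the question:
printf '%s\n' 1.01 2.02 3.05 1.00 | sort -n -t. -k1,2
# 1.00
# 1.01
# 2.02
# 3.05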
You could try the following, which finds all the matching files, sorts the filenames, takes the last in that list, and then extracts the version from the filename.
#!/bin/bash
file_version=$(find ./directory -name "FileName*r0102.tar.gz" | sort | tail -n1 | sed -r 's/.*_(.+)_.*/\1/g')
echo ${file_version}
I have tried the script line below and it works; it does what you need.
echo `ls ./*.tar.gz | sort | sed -n '/[0-9]\.[0-9][0-9]/p' | tail -n 1`;
It's unnecessary to parse the filename's version number prior to finding the actual filename. Use GNU ls's -v (natural sort of (version) numbers within text) option:
ls -v FileName_[0-9.]*_r0102.tar.gz | tail -1
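If your sort comes from GNU coreutils, its -V (version sort) option offers a similar route without ls: sort the bare filenames and take the last one, no parsing needed (a sketch under that assumption):
printf '%s\n' FileName_[0-9.]*_r0102.tar.gz | sort -V | tail -n1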

uniq -c without additional spaces

Is there an option in uniq -c (or an alternative) that doesn't add additional whitespaces around the count number? Currently I generally pipe it through sed, like so:
sort | uniq -c | sed 's/^ *\([0-9]*\) /\1 /'
But this seems kinda redundant, particularly given how frequently I have to do this.
You can try to make the sed command as short as possible with
sort | uniq -c | sed 's/^ *//'
If you have GNU grep, you can also use the -P flag:
sort | uniq -c | grep -Po '\d.*'
(Do not use awk '{$1=$1};1', it will trim more than you want)
When you need this often, you can make a function or script that calls
sort | uniq -c | sed 's/^ *//'
or only
uniq -c | sed 's/^ *//'
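A minimal sketch of such a function, for example in your ~/.bashrc (the name countuniq is made up here):
# count duplicate lines on stdin, without uniq -c's leading padding
countuniq() { sort | uniq -c | sed 's/^ *//'; }
# usage: countuniq < words.txt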

sort -R is not an option in my OS

I have a couple of OSes that do not have sort -R, which I would use to generate a random list from a txt file I have. For example, I am trying to use the following command:
sort -R file | head -20000 > newfile
I looked up the man pages on these OSes and, sure enough, the -R option is not listed.
What is an alternative that can generate a random list from a file and print to a new file?
CentOS 5
Try:
shuf file | head -n 20000 > newfile
or:
cat file | perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);'
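To mirror the original command, the perl shuffle can feed head the same way (note that it reads the whole file into memory before shuffling):
perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);' < file | head -n 20000 > newfile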
You can use the shuf command, if it is installed.
shuf can either take a file as its input
shuf file | head -n 20000 > newfile
or read from stdin
cat file | shuf | head -n 20000 > newfile
cat file | awk 'BEGIN{srand();}{print rand()"\t"$0}' | sort -k1 -n | cut -f2 | head -20000 > newfile
This works for me.
cat ALLEMAILS.txt | awk 'BEGIN{srand();}{print rand()"\t"$0}' | sort -k1 -n | cut -f2 | head -20000 | tee 20000random.txt
This version, with tee, lets you watch the output on screen while the file is written.

awk issue, summing lines in various files

I have a list of files starting with the word "output", and I want to sum up the total number of rows in all the files.
Here's my strategy:
for f in `find outpu*`;do wc -l $f | awk '{x+=$1}END{print $1}' ; done
If there were a way to collect each count, something like appending with >> to a temporary variable, and then run the awk command on that afterwards, I could accomplish this goal.
Any tips?
Use this to see the per-file counts and the sum:
wc -l output*
and this to see only the sum:
wc -l output* | tail -n1 | cut -d' ' -f1
Here is some stuff for fun, check it out:
grep -c . out* | cut -d':' -f2- | paste -sd+ | bc
all lines, including empty ones:
grep -c '' out* | cut -d':' -f2- | paste -sd+ | bc
You can play with grep's matching conditions to count only the lines you care about in each file.
Watch out: in find outpu* the shell expands the glob first, so it only picks up names from your current directory; if nothing there matches outpu*, find is handed the literal pattern and fails.
One way of doing it:
awk 'END{print NR}' $(find . -name 'outpu*')
Provided there are not so many matching filenames that you overflow your shell's maximum command-line length.
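If that limit is a concern, an alternative is to let find hand the files to cat in batches and count once at the end (a sketch assuming GNU find; -maxdepth 1 restricts it to the current directory, mirroring the glob):
find . -maxdepth 1 -name 'outpu*' -type f -exec cat {} + | wc -l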
