How do you pipe shell output to ruby -e? - ruby

Say I was typing something in my terminal like:
ls | grep phrase
and after doing so I realize I want to delete all these files.
I want to use Ruby to do so, but can't quite figure out what to pass into it.
ls | grep phrase | ruby -e "what do I put in here to go through each line by line?"

Use this as a starting point:
ls ~ | ruby -ne 'print $_ if $_[/^D/]'
Which returns:
Desktop
Documents
Downloads
Dropbox
The -n flag means "loop over all incoming lines" and stores them in the "default" variable $_. We don't see that variable used much, partly as a knee-jerk reaction to Perl's overuse of it, but it has its useful moments in Rubydom.
These are the commonly used flags:
-e 'command' one line of script. Several -e's allowed. Omit [programfile]
-n assume 'while gets(); ... end' loop around your script
-p assume loop like -n but print line also like sed

ARGF will save your bacon.
ls | grep phrase | ruby -e "ARGF.read.each_line { |file| puts file }"
=> phrase_file
file_phrase
stuff_in_front_of_phrase
phrase_stuff_behind
ARGF is an array that stores whatever you passed into your (in this case command-line) script.
You can read more about ARGF here:
http://www.ruby-doc.org/core-1.9.3/ARGF.html
For more uses check out this talk on Ruby Forum:
http://www.ruby-forum.com/topic/85528

Related

How do I trim whitespace, but not newlines, from subshell output in bash?

There are many tens, maybe a hundred or more previous questions that seem "identical" to this already here, but after extensive search, I found NOTHING that even came close to working - though I did learn quite a lot - and so I decided to just RTFM and figure this out on my own.
The Problem
I wanted to search the output of a ps auxwww command to find processes of interest, and the issue was that I can't just simply use cut to find the exact data from them that I wanted. ps, it turns out, tries to columnate the output, adding either extra spaces or tabs that get in the way of using cut to get the correct data.
So, since I'm not a master at bash, I did a search... The answers I found were all focused on either variables - a "backup strategy" from my point of view that itself didn't solve the whole problem - or they only trimmed leading or trailing space or all "whitespace" including newlines. NOPE, Won't Work For Cut! And, neither will removing trailing newlines and so forth.
So, restated, the question is, how do we efficiently end up with the white space defined as simply a single space between other characters without eliminating newlines?
Below, I will give my answer, but I welcome others to give theirs - who knows, maybe someone has a better answer?!
Answer:
At least MY answer - please leave your own, too! - was to do this:
ps auxwww | grep <program> | tr -s [:blank:] | cut -d ' ' -f <field_of_interest>
This worked great!
Obviously, there are many ways to adapt this to other needs.
As an alternative to all of the pipes and grep with cut, you could simply use awk. The benefit of using awkwith the default field-separator (FS) being set to break on whitespace is that it considers any number of whitespace between fields as a single separator.
So using awk will do away with needing to use tr -s to "squeeze" whitespace to define fields. Further, awk gives far greater control over field matching using regular expressions rather than having to rely on grep of a full line and cut to locate a pre-determined field numbers. (though to some extent you will still have to tell awk what field out of the ps command you are interested in)
Using bash, you can also eliminate the pipe | by using process substitution to send the output of ps auxwww to awk on stdin using redirection, e.g. awk ... < <(ps auxwww) for a single tidy command line.
To get your "program" and "file_of_interest" into awk you have two options. You can initialize awk variables using the -v var=value option (there can be multiple -v otions given), or you can use the BEGIN rule to initialize the variables. The only difference being with -v you can provide a shell variable for value and there is no whitespace allowed surrounding the = sign, while within BEGIN any whitespace is ignored.
So in your case a couple of examples to get the virtual memory size for firefox processes, you could use:
awk -v prog="firefox" -v fnum="5" '
$11 ~ prog {print $fnum}
' < <(ps auxwww)
(above if you had myprog=firefox as a shell variable, you could use -v prog="$myprog" to initialize the prog variable for awk)
or using the BEGIN rule, you could do:
awk 'BEGIN {prog = "firefox"; fnum = "5"}
$11 ~ prog {print $fnum }
' < <(ps auxwww)
In each command above, it locates the COMMAND field from ps (field 11) and checks whether it contains firefox and if so it outputs field no. 5 the virtual memory size used by each process.
Both work fine as one-liners as well, e.g.
awk -v prog="firefox" -v fnum="5" '$11 ~ prog {print $fnum}' < <(ps auxwww)
Don't get me wrong, the pipeline is perfectly fine, it will just be slow. For short commands with limited output there won't be much difference, but when the output is large, awk will provide orders of magnitude improvement over having to tr and grep and cut reading over the same records three times.
The reason being, the pipes and the process on each side requires separate processes be spawned by the shell. So minimizes their use, improves the efficiency of what your script is doing. Now if the data is small as are the processes, there isn't much of a difference. However if you are reading a 3G file 3 times over -- that's is the difference in orders of magnitude. Hours verses minutes or seconds.
I had to use single quotes on CentosOS Linux to get tr working like described above:
ps -o ppid= $$ | tr -d '[:space:]'
You can reduce the number of pipes using this Perl one-liner, which uses Perl regexes instead of a separate grep process. This combines grep, tr and cut in a single command, with an easy way to manipulate the output (#F is the array of fields, 0-indexed):
Examples:
# Start an example process to provide the input for `ps` in the next commands:
/Applications/Emacs.app/Contents/MacOS/Emacs-x86_64-10_14 --geometry 109x65 /tmp/foo &
# Print single space-delimited output of `ps` for all emacs processes:
ps auxwww | perl -lane 'print "#F" if $F[10] =~ /emacs/i'
# Prints:
# bar 72144 0.0 0.5 4610272 82320 s006 SN 11:15AM 0:01.31 /Applications/Emacs.app/Contents/MacOS/Emacs-x86_64-10_14 --geometry 109x65 /tmp/foo
# Print emacs PID and file name opened with emacs:
ps auxwww | perl -lane 'print join "\t", #F[1, -1] if $F[10] =~ /emacs/i'
# Prints:
# 72144 /tmp/foo
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array #F on whitespace or on the regex specified in -F option.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)

Defining a variable using head and cut

might be an easy question, I'm new in bash and haven't been able to find the solution to my question.
I'm writing the following script:
for file in `ls *.map`; do
ID=${file%.map}
convertf -p ${ID}_par #this is a program that I use, no problem
NAME=head -n 1 ${ID}.ind | cut -f1 -d":" #Now: This step is the problem: don't seem to be able to make a proper NAME function. I just want to take the first column of the first line of the file ${ID}.ind
It gives me the return
line 5: bad substitution
any help?
Thanks!
There are a couple of issues in your code:
for file in `ls *.map` does not do what you want. It will fail e.g. if any of the filenames contains a space or *, but there's more. See http://mywiki.wooledge.org/BashPitfalls#for_i_in_.24.28ls_.2A.mp3.29 for details.
You should just use for file in *.map instead.
ALL_UPPERCASE names are generally used for system variables and built-in shell variables. Use lowercase for your own names.
That said,
for file in *.map; do
id="${file%.map}"
convertf -p "${id}_par"
name="$(head -n 1 "${id}.ind" | cut -f1 -d":")"
...
looks like it would work. We just use $( cmd ) to capture the output of a command in a string.

using grep in a ruby script

I have a simple script that scrapes a webpage and puts the lines of content to the screen, then I simply pipe it to grep to output what I want then pipe that to less.
myscript.rb scrape-term | grep argument | less
I changed the script to use the following, instead of having the extra arguments on the command-line;
%x[ #{my-text-output} | grep argument | less ]
but now I get the error;
sh: 0: command not found
I've tried the other variants found here but nothing works!
What you need to do is open a handle to grep, not jam the output into there. The hack solution is to dump your output in a Tempfile and read from that through grep but really that's a mess.
Ideally what you do is only emit output that matches your pattern by filtering:
while (...)
# ...
if (output.match(/#{argument}/))
puts argument
end
end
Then this can be channelled through to less if required.
Remember there's also a grep method inside of Ruby for anything in an Array. For example, if you've created an array called output with the lines in it:
output.grep(/#{argument}/).each do |line|
puts line
end
Since you're going to run it in a shell anyway, because of a less command, then maybe try this:
%x[ echo #{my-text-output} | grep argument | less ]
You can also try to do echo -e which will display \n as newline etc.
(Ideally, your text should have a \n char at the end of each line already.)
Edit:
Forgot about "" for variable. No more Unknown Command errors, although less doesn't want to work with me. I have this script.rb now:
text = "foo
foobar
foobaraski
bar
barski"
a = %x[ echo "#{text}" | grep foo]
print a
And it gives me:
shell> ruby script.rb
foo
foobar
foobaraski
Edit 2:
And now it works:
text = "foo
foobar
foobaraski
bar
barski"
system "echo '#{text}' | grep foo | less"
Opens up less with three lines, just as bash command would.

Colorized grep -- viewing the entire file with highlighted matches

I find grep's --color=always flag to be tremendously useful. However, grep only prints lines with matches (unless you ask for context lines). Given that each line it prints has a match, the highlighting doesn't add as much capability as it could.
I'd really like to cat a file and see the entire file with the pattern matches highlighted.
Is there some way I can tell grep to print every line being read regardless of whether there's a match? I know I could write a script to run grep on every line of a file, but I was curious whether this was possible with standard grep.
Here are some ways to do it:
grep --color 'pattern\|$' file
grep --color -E 'pattern|$' file
egrep --color 'pattern|$' file
The | symbol is the OR operator. Either escape it using \ or tell grep that the search text has to be interpreted as regular expressions by adding -E or using the egrep command instead of grep.
The search text "pattern|$" is actually a trick, it will match lines that have pattern OR lines that have an end. Because all lines have an end, all lines are matched, but the end of a line isn't actually any characters, so it won't be colored.
To also pass the colored parts through pipes, e.g. towards less, provide the always parameter to --color:
grep --color=always 'pattern\|$' file | less -r
grep --color=always -E 'pattern|$' file | less -r
egrep --color=always 'pattern|$' file | less -r
Here's something along the same lines. Chances are, you'll be using less anyway, so try this:
less -p pattern file
It will highlight the pattern and jump to the first occurrence of it in the file.
You can jump to the next occurence with n and to the previous occurence with p. Quit with q.
I'd like to recommend ack -- better than grep, a power search tool for programmers.
$ ack --color --passthru --pager="${PAGER:-less -R}" pattern files
$ ack --color --passthru pattern files | less -R
$ export ACK_PAGER_COLOR="${PAGER:-less -R}"
$ ack --passthru pattern files
I love it because it defaults to recursive searching of directories (and does so much smarter than grep -r), supports full Perl regular expressions (rather than the POSIXish regex(3)), and has a much nicer context display when searching many files.
You can use my highlight script from https://github.com/kepkin/dev-shell-essentials
It's better than grep because you can highlight each match with its own color.
$ command_here | highlight green "input" | highlight red "output"
You can also create an alias. Add this function in your .bashrc (or .bash_profile on osx)
function grepe {
grep --color -E "$1|$" $2
}
You can now use the alias like this: "ifconfig | grepe inet" or "grepe css index.html".
(PS: don't forget to source ~/.bashrc to reload bashrc on current session)
Use colout program: http://nojhan.github.io/colout/
It is designed to add color highlights to a text stream. Given a regex and a color (e.g. "red"), it reproduces a text stream with matches highlighted. e.g:
# cat logfile but highlight instances of 'ERROR' in red
colout ERROR red <logfile
You can chain multiple invocations to add multiple different color highlights:
tail -f /var/log/nginx/access.log | \
colout ' 5\d\d ' red | \
colout ' 4\d\d ' yellow | \
colout ' 3\d\d ' cyan | \
colout ' 2\d\d ' green
Or you can achieve the same thing by using a regex with N groups (parenthesised parts of the regex), followed by a comma separated list of N colors.
vagrant status | \
colout \
'\''(^.+ running)|(^.+suspended)|(^.+not running)'\'' \
green,yellow,red
The -z option for grep is also pretty slick!
cat file1 | grep -z "pattern"
As grep -E '|pattern' has already been suggested, just wanted to clarify that it's possible to highlight a whole line too.
For example, tail -f somelog | grep --color -E '| \[2\].*' (specifically, the part -E '|):
I use rcg from "Linux Server Hacks", O'Reilly. It's perfect for what you want and can highlight multiple expressions each with different colours.
#!/usr/bin/perl -w
#
# regexp coloured glasses - from Linux Server Hacks from O'Reilly
#
# eg .rcg "fatal" "BOLD . YELLOW . ON_WHITE" /var/adm/messages
#
use strict;
use Term::ANSIColor qw(:constants);
my %target = ( );
while (my $arg = shift) {
my $clr = shift;
if (($arg =~ /^-/) | !$clr) {
print "Usage: rcg [regex] [color] [regex] [color] ...\n";
exit(2);
}
#
# Ugly, lazy, pathetic hack here. [Unquote]
#
$target{$arg} = eval($clr);
}
my $rst = RESET;
while(<>) {
foreach my $x (keys(%target)) {
s/($x)/$target{$x}$1$rst/g;
}
print
}
I added this to my .bash_aliases:
highlight() {
grep --color -E "$1|\$"
}
The sed way
As there is already a lot of different solution, but none show sed as solution,
and because sed is lighter and quicker than grep, I prefer to use sed for this kind of job:
sed 's/pattern/\o33[47;31;1m&\o033[0m/' file
This seems less intuitive.
\o33 is the sed syntax to generate the character octal 033 -> Escape.
(Some shells and editors also allow entering <Ctrl>-<V> followed by <Esc>, to type the character directly.)
Esc [ 47 ; 31 ; 1 m is an ANSI escape code: Background grey, foreground red and bold face.
& will re-print the pattern.
Esc [ 0 m returns the colors to default.
You could also highlight the entire line, but mark the pattern as red:
sed -E <file -e \
's/^(.*)(pattern)(.*)/\o33[30;47m\1\o33[31;1m\2\o33[0;30;47m\3\o33[0m/'
Dynamic tail -f, following logfiles
One of advantage of using sed: You could send a alarm beep on console, using bell ascii character 0x7. I often use sed like:
sudo tail -f /var/log/kern.log |
sed -ue 's/[lL]ink .*\([uU]p\|[dD]own\).*/\o33[47;31;1m&\o33[0m\o7/'
-u stand for unbuffered. This ensure that line will be treated immediately.
So I will hear some beep instantly, when I connect or disconnect my ethernet cable.
Of course, instead of link up pattern, you could watch for USB in same file, or even search for from=.*alice#bobserver.org in /var/log/mail.log (If
you're Charlie, anxiously awaiting an email from Alice;)...
To highlight patterns while viewing the whole file, h can do this.
Plus it uses different colors for different patterns.
cat FILE | h 'PAT1' 'PAT2' ...
You can also pipe the output of h to less -R for better reading.
To grep and use 1 color for each pattern, cxpgrep could be a good fit.
Use ripgrep, aka rg: https://github.com/BurntSushi/ripgrep
rg --passthru...
Color is the default:
rg -t tf -e 'key.*tfstate' -e dynamodb_table
--passthru
Print both matching and non-matching lines.
Another way to achieve a similar effect is by modifying your pattern to
match the empty string.
For example, if you are searching using rg foo then using
rg "^|foo" instead will emit every line in every file searched, but only
occurrences of foo will be highlighted.
This flag enables the same behavior without needing to modify the pattern.
Sacrilege, granted, but grep has gotten complacent.
brew/apt/rpm/whatever install ripgrep
You'll never go back.
another dirty way:
grep -A80 -B80 --color FIND_THIS IN_FILE
I did an
alias grepa='grep -A80 -B80 --color'
in bashrc.
Here is a shell script that uses Awk's gsub function to replace the text you're searching for with the proper escape sequence to display it in bright red:
#! /bin/bash
awk -vstr=$1 'BEGIN{repltext=sprintf("%c[1;31;40m&%c[0m", 0x1B,0x1B);}{gsub(str,repltext); print}' $2
Use it like so:
$ ./cgrep pattern [file]
Unfortunately, it doesn't have all the functionality of grep.
For more information , you can refer to an article "So You Like Color" in Linux Journal
One other answer mentioned grep's -Cn switch which includes n lines of Context. I sometimes do this with n=99 as a quick-and-dirty way of getting [at least] a screenfull of context when the egrep pattern seems too fiddly, or when I'm on a machine on which I've not installed rcg and/or ccze.
I recently discovered ccze which is a more powerful colorizer. My only complaint is that it is screen-oriented (like less, which I never use for that reason) unless you specify the -A switch for "raw ANSI" output.
+1 for the rcg mention above. It is still my favorite since it is so simple to customize in an alias. Something like this is usually in my ~/.bashrc:
alias tailc='tail -f /my/app/log/file | rcg send "BOLD GREEN" receive "CYAN" error "RED"'
Alternatively you can use The Silver Searcher and do
ag <search> --passthrough
I use following command for similar purpose:
grep -C 100 searchtext file
This will say grep to print 100 * 2 lines of context, before & after of the highlighted search text.
It might seem like a dirty hack.
grep "^\|highlight1\|highlight2\|highlight3" filename
Which means - match the beginning of the line(^) or highlight1 or highlight2 or highlight3. As a result, you will get highlighted all highlight* pattern matches, even in the same line.
Ok, this is one way,
wc -l filename
will give you the line count -- say NN, then you can do
grep -C NN --color=always filename
If you want highlight several patterns with different colors see this bash script.
Basic usage:
echo warn error debug info 10 nil | colog
You can change patterns and colors while running pressing one key and then enter key.
Here's my approach, inspired by #kepkin's solution:
# Adds ANSI colors to matched terms, similar to grep --color but without
# filtering unmatched lines. Example:
# noisy_command | highlight ERROR INFO
#
# Each argument is passed into sed as a matching pattern and matches are
# colored. Multiple arguments will use separate colors.
#
# Inspired by https://stackoverflow.com/a/25357856
highlight() {
# color cycles from 0-5, (shifted 31-36), i.e. r,g,y,b,m,c
local color=0 patterns=()
for term in "$#"; do
patterns+=("$(printf 's|%s|\e[%sm\\0\e[0m|g' "${term//|/\\|}" "$(( color+31 ))")")
color=$(( (color+1) % 6 ))
done
sed -f <(printf '%s\n' "${patterns[#]}")
}
This accepts multiple arguments (but doesn't let you customize the colors). Example:
$ noisy_command | highlight ERROR WARN
Is there some way I can tell grep to print every line being read
regardless of whether there's a match?
Option -C999 will do the trick in the absence of an option to display all context lines. Most other grep variants support this too. However: 1) no output is produced when no match is found and 2) this option has a negative impact on grep's efficiency: when the -C value is large this many lines may have to be temporarily stored in memory for grep to determine which lines of context to display when a match occurs. Note that grep implementations do not load input files but rather reads a few lines or use a sliding window over the input. The "before part" of the context has to be kept in a window (memory) to output the "before" context lines later when a match is found.
A pattern such as ^|PATTERN or PATTERN|$ or any empty-matching sub-pattern for that matter such as [^ -~]?|PATTERN is a nice trick. However, 1) these patterns don't show non-matching lines highlighted as context and 2) this can't be used in combination with some other grep options, such as -F and -w for example.
So none of these approaches are satisfying to me. I'm using ugrep, and enhanced grep with option -y to efficiently display all non-matching output as color-highlighted context lines. Other grep-like tools such as ag and ripgrep also offer a pass-through option. But ugrep is compatible with GNU/BSD grep and offers a superset of grep options like -y and -Q. For example, here is what option -y shows when combined with -Q (interactive query UI to enter patterns):
ugrep -Q -y FILE ...
Also try:
egrep 'pattern1|pattern2' FILE.txt | less -Sp 'pattern1|pattern2'
This will give you a tabular output with highlighted pattern/s.

What is your latest useful Perl one-liner (or a pipe involving Perl)? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
The one-liner should:
solve a real-world problem
not be extensively cryptic (should be easy to understand and reproduce)
be worth the time it takes to write it (should not be too clever)
I'm looking for practical tips and tricks (complementary examples for perldoc perlrun).
Please see my slides for "A Field Guide To The Perl Command Line Options."
Squid log files. They're great, aren't they? Except by default they have seconds-from-the-epoch as the time field. Here's a one-liner that reads from a squid log file and converts the time into a human readable date:
perl -pe's/([\d.]+)/localtime $1/e;' access.log
With a small tweak, you can make it only display lines with a keyword you're interested in. The following watches for stackoverflow.com accesses and prints only those lines, with a human readable date. To make it more useful, I'm giving it the output of tail -f, so I can see accesses in real time:
tail -f access.log | perl -ne's/([\d.]+)/localtime $1/e,print if /stackoverflow\.com/'
The problem: A media player does not automatically load subtitles due to their names differ from corresponding video files.
Solution: Rename all *.srt (files with subtitles) to match the *.avi (files with video).
perl -e'while(<*.avi>) { s/avi$/srt/; rename <*.srt>, $_ }'
CAVEAT: Sorting order of original video and subtitle filenames should be the same.
Here, a more verbose version of the above one-liner:
my #avi = glob('*.avi');
my #srt = glob('*.srt');
for my $i (0..$#avi)
{
my $video_filename = $avi[$i];
$video_filename =~ s/avi$/srt/; # 'movie1.avi' -> 'movie1.srt'
my $subtitle_filename = $srt[$i]; # 'film1.srt'
rename($subtitle_filename, $video_filename); # 'film1.srt' -> 'movie1.srt'
}
The common idiom of using find ... -exec rm {} \; to delete a set of files somewhere in a directory tree is not particularly efficient in that it executes the rm command once for each file found. One of my habits, born from the days when computers weren't quite as fast (dagnabbit!), is to replace many calls to rm with one call to perl:
find . -name '*.whatever' | perl -lne unlink
The perl part of the command line reads the list of files emitted* by find, one per line, trims the newline off, and deletes the file using perl's built-in unlink() function, which takes $_ as its argument if no explicit argument is supplied. ($_ is set to each line of input thanks to the -n flag.) (*These days, most find commands do -print by default, so I can leave that part out.)
I like this idiom not only because of the efficiency (possibly less important these days) but also because it has fewer chorded/awkward keys than typing the traditional -exec rm {} \; sequence. It also avoids quoting issues caused by file names with spaces, quotes, etc., of which I have many. (A more robust version might use find's -print0 option and then ask perl to read null-delimited records instead of lines, but I'm usually pretty confident that my file names do not contain embedded newlines.)
You may not think of this as Perl, but I use ack religiously (it's a smart grep replacement written in Perl) and that lets me edit, for example, all of my Perl tests which access a particular part of our API:
vim $(ack --perl -l 'api/v1/episode' t)
As a side note, if you use vim, you can run all of the tests in your editor's buffers.
For something with more obvious (if simple) Perl, I needed to know how many test programs used out test fixtures in the t/lib/TestPM directory (I've cut down the command for clarity).
ack $(ls t/lib/TestPM/|awk -F'.' '{print $1}'|xargs perl -e 'print join "|" => #ARGV') aggtests/ t -l
Note how the "join" turns the results into a regex to feed to ack.
All one-liners from the answers collected in one place:
perl -pe's/([\d.]+)/localtime $1/e;' access.log
ack $(ls t/lib/TestPM/|awk -F'.' '{print $1}'|xargs perl -e 'print join "|" => #ARGV')
aggtests/ t -l
perl -e'while(<*.avi>) { s/avi$/srt/; rename <*.srt>, $_ }'
find . -name '*.whatever' | perl -lne unlink
tail -F /var/log/squid/access.log | perl -ane 'BEGIN{$|++} $F[6] =~ m{\Qrad.live.com/ADSAdClient31.dll}
&& printf "%02d:%02d:%02d %15s %9d\n", sub{reverse #_[0..2]}->(localtime $F[0]), #F[2,4]'
export PATH=$(perl -F: -ane'print join q/:/, grep { !$c{$_}++ } #F'<<<$PATH)
alias e2d="perl -le \"print scalar(localtime($ARGV[0]));\""
perl -ple '$_=eval'
perl -00 -ne 'print sort split /^/'
perl -pe'1while+s/\t/" "x(8-pos()%8)/e'
tail -f log | perl -ne '$s=time() unless $s; $n=time(); $d=$n-$s; if ($d>=2) { print qq
($. lines in last $d secs, rate ),$./$d,qq(\n); $. =0; $s=$n; }'
perl -MFile::Spec -e 'print join(qq(\n),File::Spec->path).qq(\n)'
See corresponding answers for their descriptions.
The Perl one-liner I use the most is the Perl calculator
perl -ple '$_=eval'
One of the biggest bandwidth hogs at $work is download web advertising, so I'm looking at the low-hanging fruit waiting to be picked. I've got rid of Google ads, now I have Microsoft in my line of sights. So I run a tail on the log file, and pick out the lines of interest:
tail -F /var/log/squid/access.log | \
perl -ane 'BEGIN{$|++} $F[6] =~ m{\Qrad.live.com/ADSAdClient31.dll}
&& printf "%02d:%02d:%02d %15s %9d\n",
sub{reverse #_[0..2]}->(localtime $F[0]), #F[2,4]'
What the Perl pipe does is to begin by setting autoflush to true, so that any that is acted upon is printed out immediately. Otherwise the output it chunked up and one receives a batch of lines when the output buffer fills. The -a switch splits each input line on white space, and saves the results in the array #F (functionality inspired by awk's capacity to split input records into its $1, $2, $3... variables).
It checks whether the 7th field in the line contains the URI we seek (using \Q to save us the pain of escaping uninteresting metacharacters). If a match is found, it pretty-prints the time, the source IP and the number of bytes returned from the remote site.
The time is obtained by taking the epoch time in the first field and using 'localtime' to break it down into its components (hour, minute, second, day, month, year). It takes a slice of the first three elements returns, second, minute and hour, and reverses the order to get hour, minute and second. This is returned as a three element array, along with a slice of the third (IP address) and fifth (size) from the original #F array. These five arguments are passed to sprintf which formats the results.
#dr_pepper
Remove literal duplicates in $PATH:
$ export PATH=$(perl -F: -ane'print join q/:/, grep { !$c{$_}++ } #F'<<<$PATH)
Print unique clean paths from %PATH% environment variable (it doesn't touch ../ and alike, replace File::Spec->rel2abs by Cwd::realpath if it is desirable) It is not a one-liner to be more portable:
#!/usr/bin/perl -w
use File::Spec;
$, = "\n";
print grep { !$count{$_}++ }
map { File::Spec->rel2abs($_) }
File::Spec->path;
I use this quite frequently to quickly convert epoch times to a useful datestamp.
perl -l -e 'print scalar(localtime($ARGV[0]))'
Make an alias in your shell:
alias e2d="perl -le \"print scalar(localtime($ARGV[0]));\""
Then pipe an epoch number to the alias.
echo 1219174516 | e2d
Many programs and utilities on Unix/Linux use epoch values to represent time, so this has proved invaluable for me.
Remove duplicates in path variable:
set path=(`echo $path | perl -e 'foreach(split(/ /,<>)){print $_," " unless $s{$_}++;}'`)
Remove MS-DOS line-endings.
perl -p -i -e 's/\r\n$/\n/' htdocs/*.asp
Extracting Stack Overflow reputation without having to open a web page:
perl -nle "print ' Stack Overflow ' . $1 . ' (no change)' if /\s{20,99}([0-9,]{3,6})<\/div>/;" "SO.html" >> SOscores.txt
This assumes the user page has already been downloaded to file SO.html. I use wget for this purpose. The notation here is for Windows command line; it would be slightly different for Linux or Mac OS X. The output is appended to a text file.
I use it in a BAT script to automate sampling of reputation on the four sites in the family:
Stack Overflow, Server Fault, Super User and Meta Stack Overflow.
In response to Ovid's Vim/ack combination:
I too am often searching for something and then want to open the matching files in Vim, so I made myself a little shortcut some time ago (works in Z shell only, I think):
function vimify-eval; {
if [[ ! -z "$BUFFER" ]]; then
if [[ $BUFFER = 'ack'* ]]; then
BUFFER="$BUFFER -l"
fi
BUFFER="vim \$($BUFFER)"
zle accept-line
fi
}
zle -N vim-eval-widget vimify-eval
bindkey '^P' vim-eval-widget
It works like this: I search for something using ack, like ack some-pattern. I look at the results and if I like it, I press arrow-up to get the ack-line again and then press Ctrl + P. What happens then is that Z shell appends and "-l" for listing filenames only if the command starts with "ack". Then it puts "$(...)" around the command and "vim" in front of it. Then the whole thing is executed.
I often need to see a readable version of the PATH while shell scripting. The following one-liners print every path entry on its own line.
Over time this one-liner has evolved through several phases:
Unix (version 1):
perl -e 'print join("\n",split(":",$ENV{"PATH"}))."\n"'
Windows (version 2):
perl -e "print join(qq(\n),split(';',$ENV{'PATH'})).qq(\n)"
Both Unix/Windows (using q/qq tip from #j-f-sebastian) (version 3):
perl -MFile::Spec -e 'print join(qq(\n), File::Spec->path).qq(\n)' # Unix
perl -MFile::Spec -e "print join(qq(\n), File::Spec->path).qq(\n)" # Windows
One of the most recent one-liners that got a place in my ~/bin:
perl -ne '$s=time() unless $s; $n=time(); $d=$n-$s; if ($d>=2) { print "$. lines in last $d secs, rate ",$./$d,"\n"; $. =0; $s=$n; }'
You would use it against a tail of a log file and it will print the rate of lines being outputed.
Want to know how many hits per second you are getting on your webservers? tail -f log | this_script.
Get human-readable output from du, sorted by size:
perl -e '%h=map{/.\s/;7x(ord$&&10)+$`,$_}`du -h`;print#h{sort%h}'
Filters a stream of white-space separated stanzas (name/value pair lists),
sorting each stanza individually:
perl -00 -ne 'print sort split /^/'
Network administrators have the tendency to misconfigure "subnet address" as "host address" especially while using Cisco ASDM auto-suggest. This straightforward one-liner scans the configuration files for any such configuration errors.
incorrect usage: permit host 10.1.1.0
correct usage: permit 10.1.1.0 255.255.255.0
perl -ne "print if /host ([\w\-\.]+){3}\.0 /" *.conf
This was tested and used on Windows, please suggest if it should be modified in any way for correct usage.
Expand all tabs to spaces: perl -pe'1while+s/\t/" "x(8-pos()%8)/e'
Of course, this could be done with :set et, :ret in Vim.
I have a list of tags with which I identify portions of text. The master list is of the format:
text description {tag_label}
It's important that the {tag_label} are not duplicated. So there's this nice simple script:
perl -ne '($c) = $_ =~ /({.*?})/; print $c,"\n" ' $1 | sort | uniq -c | sort -d
I know that I could do the whole lot in shell or perl, but this was the first thing that came to mind.
Often I have had to convert tabular data in to configuration files. For e.g, Network cabling vendors provide the patching record in Excel format and we have to use that information to create configuration files. i.e,
Interface, Connect to, Vlan
Gi1/0/1, Desktop, 1286
Gi1/0/2, IP Phone, 1317
should become:
interface Gi1/0/1
description Desktop
switchport access vlan 1286
and so on. The same task re-appears in several forms in various administration tasks where a tabular data needs to be prepended with their field name and transposed to a flat structure. I have seen some DBA's waste a lot of times preparing their SQL statements from excel sheet. It can be achieved using this simple one-liner. Just save the tabular data in CSV format using your favourite spreadsheet tool and run this one-liner. The field names in header row gets prepended to individual cell values, so you may have to edit it to match your requirements.
perl -F, -lane "if ($.==1) {#keys = #F} else{print #keys[$_].$F[$_] foreach(0..$#F)} "
The caveat is that none of the field names or values should contain any commas. Perhaps this can be further elaborated to catch such exceptions in a one-line, please improve this if possible.
Here is one that I find handy when dealing with a collection compressed log files:
open STATFILE, "zcat $logFile|" or die "Can't open zcat of $logFile" ;
At some time I found that anything I would want to do with Perl that is short enough to be done on the command line with 'perl -e' can be done better, easier and faster with normal Z shell features without the hassle of quoting. E.g. the example above could be done like this:
srt=(*.srt); for foo in *.avi; mv $srt[1] ${foo:r}.srt && srt=($srt[2,-1])

Resources