I want to remove the first two characters of a column in a text file.
I am using the command below, but it is also truncating the header line.
sed -i 's/^..//' file1.txt
Below is my file:
FileName,Age
./Acct_Bal_Tgt.txt,7229
./IDQ_HB1.txt,5367
./IDQ_HB_LOGC.txt,5367
./IDQ_HB.txt,5367
./IGC_IDQ.txt,5448
./JobSchedule.txt,3851
I want the ./ removed from the file name on each line.
Transferring comments to an answer, as requested.
Modify your script to:
sed -e '2,$s/^..//' file1.txt
The 2,$ prefix limits the change to lines 2 to the end of the file, leaving line 1 unchanged.
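For illustration, run against the sample file above, the range-limited command should leave the header intact and strip the leading ./ from the data lines, producing output like this:
sed -e '2,$s/^..//' file1.txt
FileName,Age
Acct_Bal_Tgt.txt,7229
IDQ_HB1.txt,5367
IDQ_HB_LOGC.txt,5367
IDQ_HB.txt,5367
IGC_IDQ.txt,5448
JobSchedule.txt,3851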
An alternative is to remove . and / as the first two characters on a line:
sed -e 's%^[.]/%%' file1.txt
I tend to use -e to specify that the script option follows; it isn't necessary unless you split the script over several arguments (so it isn't necessary here where there's just one argument for the script). You could use \. instead of [.]; I'm allergic to backslashes (as you would be if you ever spent time working out whether you needed 8 or 16 consecutive backslashes to get the right result in a troff document).
Advice: Don't use the -i option until you've got your script working correctly. It overwrites your file with the incorrect output just as happily as it will with the correct output. Consequently, if you're asking about how to write a sed script on SO, it isn't safe to be using the -i option. Also note that the -i option is non-standard and behaves differently with different versions of sed (when it is supported at all). Specifically, on macOS, the BSD sed requires a suffix specified; if you don't want a backup, you have to use two arguments: -i ''.
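As a concrete sketch of that portability point: GNU sed can edit in place directly, BSD/macOS sed needs the empty suffix argument, and the attached-suffix form should work with both while keeping a backup (verify on your own platform first):
sed -i '2,$s/^..//' file1.txt        # GNU sed, no backup
sed -i '' '2,$s/^..//' file1.txt     # BSD/macOS sed, no backup
sed -i.bak '2,$s/^..//' file1.txt    # keeps file1.txt.bak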
Use this Perl one-liner:
perl -pe 's{^[.]/}{}' file1.txt > output.txt
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
s{^[.]/}{} : Replace a literal dot ([.]) followed by a slash ('/'), found at the beginning of the line (^), with nothing (delete them). This does not modify the header since it does not match the regex.
If you prefer to modify the file in-place, you can use this:
perl -i.bak -pe 's{^[.]/}{}' file1.txt
This creates the backup file file1.txt.bak.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlrequick: Perl regular expressions quick start
I have a binary file, file.f1, which has the string abc. I want to overwrite it with abcd.
perl -pi -e s/abc/abcd/ file.f1
works, but it inserts the extra character rather than overwriting, which causes an error for the program that uses the file.
I'm not sure how I will be able to do that without making things more complex.
I'd prefer to use tools like sed, grep, Python, or Perl one-liners which are available by default on a UNIX system.
I'm not a very experienced user and am very new to these tools.
Edit: I hope it's clear now.
The data inside the bin file is like
[abc def xyz]
when doing perl -pi -e s/abc/abcd/ file.f1
it becomes [abcd def xyz]
what I want is to overwrite it with an extra [space] so it becomes
[abcd ef xyz]
You are trying to patch a binary file. Perl regular expressions are not designed for this type of processing. While they will work MOST of the time, specific byte sequences may trick the RE engine, which assumes the file to be text. Use with care.
To get an overwrite rather than an insertion, make the source string match the length of the target string:
perl -pi -e 's/abc./abcd/' file.f1
Perl will replace the first 4-byte string that starts with abc with abcd. If you suspect that the 4th character may be special (e.g. a newline or similar), use single-line mode; it will allow '.' to match ANY character.
perl -pi -e 's/abc./abcd/s' file.f1
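Since the point is to keep the file the same size, it may be worth a quick sanity check on the byte count before and after the edit, along these lines (wc -c reports bytes; the size should be unchanged because 4 bytes are replaced by 4 bytes):
wc -c file.f1
perl -pi -e 's/abc./abcd/s' file.f1
wc -c file.f1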
perl -pi -e 's/blue/red/g' $file_name
The g at the end makes the substitution global, replacing every occurrence on each line. Another tool to use for these kinds of tasks would be sed.
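A roughly equivalent sed invocation, assuming GNU sed for the in-place edit (BSD/macOS sed would need -i '' instead), might be:
sed -i 's/blue/red/g' "$file_name"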
I would like to use a terminal/shell to truncate or otherwise limit a text file to a certain number of lines.
I have a whole directory of text files, for each of which only the first ~50k lines are useful.
How do I delete all lines over 50000?
In-place truncation
To truncate the file in-place with sed, you can do the following:
sed -i '50001,$ d' filename
-i means in place.
d means delete.
50001,$ means the lines from 50001 to the end.
You can make a backup of the file by adding an extension argument to -i, for example, .backup or .bak:
sed -i.backup '50001,$ d' filename
On OS X or FreeBSD you must provide an argument to -i, so to do this while avoiding making a backup:
sed -i '' '50001,$ d' filename
The long argument name version is as follows, with and without the backup argument:
sed --in-place '50001,$ d' filename
sed --in-place=.backup '50001,$ d' filename
New File
To create a new truncated file, just redirect from head to the new file:
head -n50000 oldfilename > newfilename
-n50000 specifies the number of lines; head otherwise defaults to 10.
> means to redirect into, overwriting anything else that might be there.
Substitute >> for > if you mean to append into the new file.
It is unfortunate that you cannot redirect into the same file, which is why sed is recommended for in-place truncation.
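If you do want to stay with head for an in-place-style edit, the usual workaround is to write to a temporary file and then move it over the original, for example:
head -n 50000 filename > filename.tmp && mv filename.tmp filename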
No sed? Try Python!
This is a bit more typing than sed. Sed is short for "Stream Editor" after all, and that's another reason to use it: it's what the tool is suited for.
This was tested on Linux and Windows with Python 3:
from collections import deque
from itertools import islice

def truncate(filename, lines):
    with open(filename, 'r+') as f:
        blackhole = deque((), 0).extend
        file_iterator = iter(f.readline, '')
        blackhole(islice(file_iterator, lines))
        f.truncate(f.tell())
To explain the Python:
The blackhole works like /dev/null. It's a bound extend method on a deque with maxlen=0, which is the fastest way to exhaust an iterator in Python (that I'm aware of).
We can't simply loop over the file object because the tell method would be blocked, so we need the iter(f.readline, '') trick.
This function demonstrates the context manager, but it's a bit superfluous since Python would close the file on exiting the function. Usage is simply:
>>> truncate('filename', 50000)
Very easy indeed using sed:
sed -n '1,50000 p' filename
This will only print lines 1 to 50000 in the file 'filename'.
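To keep the result, redirect it into a new file; you can also tell sed to quit after line 50000, which saves it reading the rest of a large file:
sed -n '1,50000 p' filename > newfilename
sed '50000q' filename > newfilename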
The one-liner should:
solve a real-world problem
not be extensively cryptic (should be easy to understand and reproduce)
be worth the time it takes to write it (should not be too clever)
I'm looking for practical tips and tricks (complementary examples for perldoc perlrun).
Please see my slides for "A Field Guide To The Perl Command Line Options."
Squid log files. They're great, aren't they? Except by default they have seconds-from-the-epoch as the time field. Here's a one-liner that reads from a squid log file and converts the time into a human readable date:
perl -pe's/([\d.]+)/localtime $1/e;' access.log
With a small tweak, you can make it only display lines with a keyword you're interested in. The following watches for stackoverflow.com accesses and prints only those lines, with a human readable date. To make it more useful, I'm giving it the output of tail -f, so I can see accesses in real time:
tail -f access.log | perl -ne's/([\d.]+)/localtime $1/e,print if /stackoverflow\.com/'
The problem: A media player does not automatically load subtitles because their names differ from those of the corresponding video files.
Solution: Rename all *.srt (files with subtitles) to match the *.avi (files with video).
perl -e'while(<*.avi>) { s/avi$/srt/; rename <*.srt>, $_ }'
CAVEAT: Sorting order of original video and subtitle filenames should be the same.
Here is a more verbose version of the above one-liner:
my @avi = glob('*.avi');
my @srt = glob('*.srt');

for my $i (0..$#avi)
{
    my $video_filename = $avi[$i];
    $video_filename =~ s/avi$/srt/;              # 'movie1.avi' -> 'movie1.srt'

    my $subtitle_filename = $srt[$i];            # 'film1.srt'
    rename($subtitle_filename, $video_filename); # 'film1.srt' -> 'movie1.srt'
}
The common idiom of using find ... -exec rm {} \; to delete a set of files somewhere in a directory tree is not particularly efficient in that it executes the rm command once for each file found. One of my habits, born from the days when computers weren't quite as fast (dagnabbit!), is to replace many calls to rm with one call to perl:
find . -name '*.whatever' | perl -lne unlink
The perl part of the command line reads the list of files emitted* by find, one per line, trims the newline off, and deletes the file using perl's built-in unlink() function, which takes $_ as its argument if no explicit argument is supplied. ($_ is set to each line of input thanks to the -n flag.) (*These days, most find commands do -print by default, so I can leave that part out.)
I like this idiom not only because of the efficiency (possibly less important these days) but also because it has fewer chorded/awkward keys than typing the traditional -exec rm {} \; sequence. It also avoids quoting issues caused by file names with spaces, quotes, etc., of which I have many. (A more robust version might use find's -print0 option and then ask perl to read null-delimited records instead of lines, but I'm usually pretty confident that my file names do not contain embedded newlines.)
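A sketch of that more robust variant, assuming your find supports -print0: emit null-delimited names and have perl read null-terminated records (the -0 switch) before unlinking:
find . -name '*.whatever' -print0 | perl -0ne 'chomp; unlink'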
You may not think of this as Perl, but I use ack religiously (it's a smart grep replacement written in Perl) and that lets me edit, for example, all of my Perl tests which access a particular part of our API:
vim $(ack --perl -l 'api/v1/episode' t)
As a side note, if you use vim, you can run all of the tests in your editor's buffers.
For something with more obvious (if simple) Perl, I needed to know how many test programs used our test fixtures in the t/lib/TestPM directory (I've cut down the command for clarity).
ack $(ls t/lib/TestPM/|awk -F'.' '{print $1}'|xargs perl -e 'print join "|" => @ARGV') aggtests/ t -l
Note how the "join" turns the results into a regex to feed to ack.
All one-liners from the answers collected in one place:
perl -pe's/([\d.]+)/localtime $1/e;' access.log
ack $(ls t/lib/TestPM/|awk -F'.' '{print $1}'|xargs perl -e 'print join "|" => @ARGV')
aggtests/ t -l
perl -e'while(<*.avi>) { s/avi$/srt/; rename <*.srt>, $_ }'
find . -name '*.whatever' | perl -lne unlink
tail -F /var/log/squid/access.log | perl -ane 'BEGIN{$|++} $F[6] =~ m{\Qrad.live.com/ADSAdClient31.dll}
&& printf "%02d:%02d:%02d %15s %9d\n", sub{reverse @_[0..2]}->(localtime $F[0]), @F[2,4]'
export PATH=$(perl -F: -ane'print join q/:/, grep { !$c{$_}++ } @F'<<<$PATH)
alias e2d="perl -le \"print scalar(localtime($ARGV[0]));\""
perl -ple '$_=eval'
perl -00 -ne 'print sort split /^/'
perl -pe'1while+s/\t/" "x(8-pos()%8)/e'
tail -f log | perl -ne '$s=time() unless $s; $n=time(); $d=$n-$s; if ($d>=2) { print qq($. lines in last $d secs, rate ),$./$d,qq(\n); $. =0; $s=$n; }'
perl -MFile::Spec -e 'print join(qq(\n),File::Spec->path).qq(\n)'
See corresponding answers for their descriptions.
The Perl one-liner I use the most is the Perl calculator
perl -ple '$_=eval'
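For example (interactive use works the same way: type an expression, get its value back):
$ echo '2**10' | perl -ple '$_=eval'
1024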
One of the biggest bandwidth hogs at $work is downloaded web advertising, so I'm looking at the low-hanging fruit waiting to be picked. I've got rid of Google ads; now I have Microsoft in my sights. So I run a tail on the log file, and pick out the lines of interest:
tail -F /var/log/squid/access.log | \
perl -ane 'BEGIN{$|++} $F[6] =~ m{\Qrad.live.com/ADSAdClient31.dll}
&& printf "%02d:%02d:%02d %15s %9d\n",
sub{reverse @_[0..2]}->(localtime $F[0]), @F[2,4]'
What the Perl pipe does is to begin by setting autoflush to true, so that any line that is acted upon is printed out immediately. Otherwise the output is chunked up and one receives a batch of lines when the output buffer fills. The -a switch splits each input line on white space and saves the results in the array @F (functionality inspired by awk's capacity to split input records into its $1, $2, $3... variables).
It checks whether the 7th field in the line contains the URI we seek (using \Q to save us the pain of escaping uninteresting metacharacters). If a match is found, it pretty-prints the time, the source IP and the number of bytes returned from the remote site.
The time is obtained by taking the epoch time in the first field and using localtime to break it down into its components (hour, minute, second, day, month, year). It takes a slice of the first three elements returned, second, minute and hour, and reverses the order to get hour, minute and second. This three-element list, along with a slice of the third (IP address) and fifth (size) fields from the original @F array, gives the five arguments passed to printf, which formats the result.
@dr_pepper
Remove literal duplicates in $PATH:
$ export PATH=$(perl -F: -ane'print join q/:/, grep { !$c{$_}++ } @F'<<<$PATH)
Print unique, clean paths from the %PATH% environment variable (it doesn't touch ../ and the like; replace File::Spec->rel2abs with Cwd::realpath if that is desirable). It is not a one-liner, in order to be more portable:
#!/usr/bin/perl -w
use File::Spec;
$, = "\n";
print grep { !$count{$_}++ }
      map { File::Spec->rel2abs($_) }
      File::Spec->path;
I use this quite frequently to quickly convert epoch times to a useful datestamp.
perl -l -e 'print scalar(localtime($ARGV[0]))'
Make an alias in your shell:
alias e2d="perl -le \"print scalar(localtime($ARGV[0]));\""
Then pipe an epoch number to the alias.
echo 1219174516 | e2d
Many programs and utilities on Unix/Linux use epoch values to represent time, so this has proved invaluable for me.
Remove duplicates in path variable:
set path=(`echo $path | perl -e 'foreach(split(/ /,<>)){print $_," " unless $s{$_}++;}'`)
Remove MS-DOS line-endings.
perl -p -i -e 's/\r\n$/\n/' htdocs/*.asp
Extracting Stack Overflow reputation without having to open a web page:
perl -nle "print ' Stack Overflow ' . $1 . ' (no change)' if /\s{20,99}([0-9,]{3,6})<\/div>/;" "SO.html" >> SOscores.txt
This assumes the user page has already been downloaded to file SO.html. I use wget for this purpose. The notation here is for Windows command line; it would be slightly different for Linux or Mac OS X. The output is appended to a text file.
I use it in a BAT script to automate sampling of reputation on the four sites in the family:
Stack Overflow, Server Fault, Super User and Meta Stack Overflow.
In response to Ovid's Vim/ack combination:
I too am often searching for something and then want to open the matching files in Vim, so I made myself a little shortcut some time ago (works in Z shell only, I think):
function vimify-eval; {
    if [[ ! -z "$BUFFER" ]]; then
        if [[ $BUFFER = 'ack'* ]]; then
            BUFFER="$BUFFER -l"
        fi
        BUFFER="vim \$($BUFFER)"
        zle accept-line
    fi
}
zle -N vim-eval-widget vimify-eval
bindkey '^P' vim-eval-widget
It works like this: I search for something using ack, like ack some-pattern. I look at the results and if I like them, I press arrow-up to get the ack line again and then press Ctrl + P. What happens then is that Z shell appends "-l" (list filenames only) if the command starts with "ack". Then it puts "$(...)" around the command and "vim" in front of it. Then the whole thing is executed.
I often need to see a readable version of the PATH while shell scripting. The following one-liners print every path entry on its own line.
Over time this one-liner has evolved through several phases:
Unix (version 1):
perl -e 'print join("\n",split(":",$ENV{"PATH"}))."\n"'
Windows (version 2):
perl -e "print join(qq(\n),split(';',$ENV{'PATH'})).qq(\n)"
Both Unix/Windows (using q/qq tip from @j-f-sebastian) (version 3):
perl -MFile::Spec -e 'print join(qq(\n), File::Spec->path).qq(\n)' # Unix
perl -MFile::Spec -e "print join(qq(\n), File::Spec->path).qq(\n)" # Windows
One of the most recent one-liners that got a place in my ~/bin:
perl -ne '$s=time() unless $s; $n=time(); $d=$n-$s; if ($d>=2) { print "$. lines in last $d secs, rate ",$./$d,"\n"; $. =0; $s=$n; }'
You would use it against a tail of a log file and it will print the rate at which lines are being output.
Want to know how many hits per second you are getting on your webservers? tail -f log | this_script.
Get human-readable output from du, sorted by size:
perl -e '%h=map{/.\s/;7x(ord$&&10)+$`,$_}`du -h`;print@h{sort%h}'
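A less cryptic alternative, if your sort comes from GNU coreutils (which understands human-readable sizes via -h), is simply:
du -h | sort -h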
Filters a stream of white-space separated stanzas (name/value pair lists),
sorting each stanza individually:
perl -00 -ne 'print sort split /^/'
Network administrators have the tendency to misconfigure "subnet address" as "host address" especially while using Cisco ASDM auto-suggest. This straightforward one-liner scans the configuration files for any such configuration errors.
incorrect usage: permit host 10.1.1.0
correct usage: permit 10.1.1.0 255.255.255.0
perl -ne "print if /host ([\w\-\.]+){3}\.0 /" *.conf
This was tested and used on Windows; please suggest if it should be modified in any way for correct usage.
Expand all tabs to spaces: perl -pe'1while+s/\t/" "x(8-pos()%8)/e'
Of course, this could be done with :set et, :ret in Vim.
I have a list of tags with which I identify portions of text. The master list is of the format:
text description {tag_label}
It's important that the {tag_label} are not duplicated. So there's this nice simple script:
perl -ne '($c) = $_ =~ /({.*?})/; print $c,"\n" ' $1 | sort | uniq -c | sort -d
I know that I could do the whole lot in shell or perl, but this was the first thing that came to mind.
Often I have had to convert tabular data into configuration files. For example, network cabling vendors provide the patching record in Excel format, and we have to use that information to create configuration files. That is,
Interface, Connect to, Vlan
Gi1/0/1, Desktop, 1286
Gi1/0/2, IP Phone, 1317
should become:
interface Gi1/0/1
description Desktop
switchport access vlan 1286
and so on. The same task re-appears in several forms in various administration tasks where tabular data needs to be prepended with its field names and transposed to a flat structure. I have seen some DBAs waste a lot of time preparing their SQL statements from Excel sheets. It can be achieved using this simple one-liner. Just save the tabular data in CSV format using your favourite spreadsheet tool and run this one-liner. The field names in the header row get prepended to the individual cell values, so you may have to edit the output to match your requirements.
perl -F, -lane "if ($.==1) {@keys = @F} else{print @keys[$_].$F[$_] foreach(0..$#F)} "
The caveat is that none of the field names or values should contain any commas. Perhaps this can be further elaborated to catch such exceptions in a one-liner; please improve this if possible.
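To illustrate, run against the three-line sample above, the one-liner should print something close to the following (the leading spaces come from the spaces after the commas in the CSV):
InterfaceGi1/0/1
 Connect to Desktop
 Vlan 1286
InterfaceGi1/0/2
 Connect to IP Phone
 Vlan 1317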
Here is one that I find handy when dealing with a collection of compressed log files:
open STATFILE, "zcat $logFile|" or die "Can't open zcat of $logFile" ;
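A minimal sketch of how that handle is then consumed; the loop is my own illustration, not part of the original snippet:
open STATFILE, "zcat $logFile|" or die "Can't open zcat of $logFile";
while (<STATFILE>) {
    # each $_ is one decompressed log line; process it here
}
close STATFILE;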
At some point I found that anything I would want to do with Perl that is short enough to be done on the command line with 'perl -e' can be done better, easier and faster with normal Z shell features, without the hassle of quoting. E.g. the example above could be done like this:
srt=(*.srt); for foo in *.avi; mv $srt[1] ${foo:r}.srt && srt=($srt[2,-1])