Search for specific lines from a file - windows

I have an array that contains the data from a text file.
I want to filter the array and copy some information to another array. grep seems to not work.
Here's what I have
$file = 'files.txt';
open (FH, "< $file") or die "Can't open $file for read: $!";
#lines = <FH>;
close FH or die "Cannot close $file: $!";
chomp(#lines);
foreach $y (#lines){
if ( $y =~ /(?:[^\\]*\\|^)[^\\]*$/g ) {
print $1, pos $y, "\n";
}
}
files.txt
public_html
Trainings and Events
General Office\Resources
General Office\Travel
General Office\Office Opperations\Contacts
General Office\Office Opperations\Coordinator Operations
public_html\Accordion\dependencies\.svn\tmp\prop-base
public_html\Accordion\dependencies\.svn\tmp\props
public_html\Accordion\dependencies\.svn\tmp\text-base
The regular expression should take the last one or two folders and put them into their own array for printing.

A regex can get very picky for this. It is far easier to split the path into components and then count off as many as you need. And there is a tool for this exact purpose, the core module File::Spec, as mentioned by xxfelixxx in a comment.
You can use its splitdir to break up the path, and catdir to compose one.
use warnings 'all';
use strict;
use feature 'say';
use File::Spec::Functions qw(splitdir catdir);
my $file = 'files.txt';
open my $fh, '<', $file or die "Can't open $file: $!";
my #dirs;
while (<$fh>) {
next if /^\s*$/; # skip empty lines
chomp;
my #path = splitdir $_;
push #dirs, (#path >= 2 ? catdir #path[-2,-1] : #path);
}
close $fh;
say for #dirs;
I use the module's functional interface while for heavier work you want its object oriented one. Reading the whole file into an array has its uses but in general process line by line. The list manipulations can be done more elegantly but I went for simplicity.
I'd like to add a few general comments
Always start your programs with use strict and use warnings
Use lexical filehandles, my $fh instead of FH
Being aware of (at least) a dozen-or-two of most used modules is really helpful. For example, in the above code we never had to even mention the separator \.

I can't write a full answer because I'm using my phone. In any case zdim has mostly answered you. But my solution would look like this
use strict;
use warnings 'all';
use feature 'say';
use File::Spec::Functions qw/ splitdir catdir /;
my $file = 'files.txt';
open my $fh, '<', $file or die qq{Unable to open "$file" for input: $!};
my #results;
while ( <$fh> ) {
next unless /\S/;
chomp;
my #path = splitdir($_);
shift #path while #path > 2;
push #results, catdir #path;
}
print "$_\n" for #results;

Related

How to Pass more than one file in perl MY function [Perl]

I am new to Perl and was wondering if you guys can help me in regards to passing more than one files in the below code;
my #files=<data/j*.*.txt>;
if (#ARGV) {
my $test=$ARGV[0];
$test=lc($test);
print "Using $test instead\n";
#files=</data/$test*.*.txt>;
print "Found #files instead\n";
}
my $outfile='/data/w_c.txt';
my $lotfile='/data/completed.txt';
if (-e $outfile) {
unlink $outfile;
}
In the above code (my #files=<data/j*.*.txt>;) is currently having all the files starting with j*.*, But I would like to pass all the below files only;
j*.1.txt
c*.3.1.txt
a*.a.b.txt
etc..
How could I pass the list of files in the program itself? I am trying to read all those files and extract information from them..!
Thank you in advance..
You can use something like this:
<data/j*.*.txt data/j*.1.txt data/a*.a.b.txt>
There comes a point where it might be best to use <data/*.txt> and use a regex to filter out all but those you want.
Rather than using globs this way I'd be tempted to switch to opendir and readdir and to use an array of patterns in a regex with alternation to select my files. That way you're not using two different text wildcard syntaxes (for glob and for regex) in the same short snippet of code, which I've seen confuse programmers new to Perl before.
# Set your data directory.
my $dir = '/data';
# Take the whole array of arguments on the command line as patterns to
# match in the regex, or default to a short list of patterns if there
# are none.
# (Consider using an options library later rather than messing
# with #ARGV directly if the program becomes more complex.)
my #filespecs = ( scalar #ARGV ? #ARGV : qw( j.*?\.1\.txt c.*?\.3\.1\.txt ) );
# Join the multiple patterns with the regex alternation character.
# This makes them multiple matching options in a single regex.
my $re = join '|', #filespecs;
# Open the directory for reading, or terminate with an error.
opendir my $d, $dir or die "Cannot open directory $dir : $!\n";
# Select into the #files array things read from the directory
# entry that are regular files (-f), do not start with '.',
# and which match the regex.
my #files = grep { (-f) && (!/^\./) && (/$re/) } readdir $d;
# Close the directory handle now that we're done using it.
closedir $d;
Without the overly verbose comments, that boils down to just this.
my $dir = '/data';
my #filespecs = ( scalar #ARGV ? #ARGV : qw( j.*?\.1\.txt c.*?\.3\.1\.txt ) );
my $re = join '|', #filespecs;
opendir my $d, $dir or die "Cannot open directory $dir : $!\n";
my #files = grep { (-f) && (!/^\./) && (/$re/) } readdir $d;
closedir $d;
I elided the last few lines of your original code because it doesn't seem directly related to your question.
Some sources for you to read that may help make sense of this solution.:
perldoc perlop for the Conditional Operator
https://perldoc.perl.org/perlop#Conditional-Operator , and for qw()
https://perldoc.perl.org/perlop#qw/STRING/
perldoc perlre to learn
about Perl regexes, especially in this case alternation
https://perldoc.perl.org/perlre#Metacharacters
perldoc perlfunc for the -f file test https://perldoc.perl.org/perlfunc#-X-FILEHANDLE , opendir https://perldoc.perl.org/perlfunc#opendir-DIRHANDLE,EXPR , readdir https://perldoc.perl.org/perlfunc#readdir-DIRHANDLE , closedir https://perldoc.perl.org/perlfunc#closedir-DIRHANDLE , and grep https://perldoc.perl.org/perlfunc#grep-BLOCK-LIST

How to move files from a server to the computer using perl

I want to move files from a server that my Windows computer is connected to, to the actual computer. I have tried the code on my mac and it works fine, so I suspect the problem has to do with the fact that the files I wish to move are on a server or perhaps with Windows (I am unfamiliar with this OS). It is important to me to be able to use File::Find::Rule because there are many subdirectories within subdirectories that need to be searched.
use strict;
use warnings;
use File::Find::Rule;
use File::Copy;
# directory where files live
# my $dir = "\\172.18\user\folder\folder2";
# directory where TextGrids will be moved to
my $outdir = "\users\lisa\desktop\test";
my #files;
#files = File::Find::Rule -> file()
-> name("*_clean.TextGrid")
-> maxdepth()
-> in($dir);
foreach my $file (#files) {
$file =~ /(.*\\)(.*)/;
my $name = $2;
copy("$file", "$outdir/$name") or die "Copy failed: $!";
}
Edit: Ok, I've made some changes to the script below. But the strange thing is, that when I ask it to print each file, it gives me something like \\172.18\user\folder\folder/255/file.txt. I changed the regex to be (.*\/)(.*) and now the script works perfectly, though I don't know why!
use strict;
use warnings;
use File::Find::Rule;
use File::Copy;
# directory where files live
my $dir = "\\\\172.18\\user\\folder\\folder2";
# directory where TextGrids will be moved to
my $outdir = "C:\\Users\\lisa\\desktop\\test";
my #files;
#files = File::Find::Rule -> file()
-> name("*_clean.TextGrid")
-> maxdepth()
-> in($dir);
foreach my $file (#files) {
print "$file\n";
$file =~ /(.*\\)(.*)/;
my $name = $2;
copy("$file", "$outdir\\$name") or die "Copy failed: $!";
}
After your edit, the script works because the last directory separator in the string happens to be /, which is matched by the \/ in the regular expression. Even though you had \ in the input, the library you used to find the files added /s.
I have some suggestions:
You can avoid the need to escape (most) backslashes by using single quoted strings, unless you need the interpolation of the double quoted ones.
Escaping backslashes is optional unless followed by a single quote or another backslash:
my $outdir = '\users\lisa\desktop\test';
but
my $outdir = '\users\lisa\desktop\test\\';
$outdir = '\users\lisa\desktop\test\\\'ere is a path';
my $not_a_path = 'three backslashes\\\\\in between, all but the last need escaping';
'ere is a path is the last element in that path.
If you're dealing with Windows, consider using [\\/] in place of directory separator in regular expressions. (Or [\\\/] if you absolutely must use / as regular expression delimiter.)
Even if you have control over user input to only use \ in paths, libraries you use will usually add /, so it's better to be prepared for a combination of both.
$file =~ /(.*[\\\/])(.*)/;
$file =~ m{(.*[\\/])(.*)};
$file =~ m¤(.*[\\/])(.*)¤;
I also removed the superfluous quotes from around $file in the copy() call. Final result:
use strict;
use warnings;
use File::Find::Rule;
use File::Copy;
# directory where files live
my $dir = '\\172.18\user\folder\folder2';
# directory where TextGrids will be moved to
my $outdir = 'C:\Users\lisa\desktop\test';
my #files;
#files = File::Find::Rule -> file()
-> name("*_clean.TextGrid")
-> maxdepth()
-> in($dir);
foreach my $file (#files) {
print "$file\n";
$file =~ /(.*[\\\/])(.*)/;
my $name = $2;
copy($file, "$outdir\\$name") or die "Copy failed: $!";
}
use strict;
use warnings;
my $dir = "\\172.18\user\folder\folder2";
print("$dir\n");
my $outdir = "\users\lisa\desktop\test";
print("$outdir\n");
outputs
Unrecognized escape \d passed through at a.pl line 7.
\172.18Ser?older?older2
Sersisadesktop est
You need to escape your backslashes!
use strict;
use warnings;
my $dir = "\\\\172.18\\user\\folder\\folder2";
print("$dir\n");
my $outdir = "\\users\\lisa\\desktop\\test";
print("$outdir\n");
Don't ignore warnings.

file organisation in windows using perl

I am working on a windows machine and I have a directory filled with ~200k of files which I need to organise. This is a job I will need to do regularly with different filename sets but with similar patterns so perl seemed a good tool to use.
Each filename is made up of {a string A}{2 or 3 digit number B}{single letter "r" or "x"}{3 digit number}.extension
I want to create a folder for each string A
Within each folder I want a sub-folder for each B
I then want to move each file into its relevant sub-folder
So it will end up looking something like
/CustomerA/1
/CustomerA/2
/CustomerA/3
/CustomerB/1
/CustomerB/2
/CustomerB/3
etc with the files in each sub-folder
so CustomerA888x123.xml is moved into /CustomerA/888/
I have the list of files in an array but I am struggling with splitting the file name out to its constituent parts and using the parts effectively.
Thanks for the answer. I ended up with this:
#!usr/bin/perl
use warnings;
use strict;
use File::Copy qw(move);
use File::Path qw(make_path);
opendir my $dir, ".";
my #files = readdir($dir);
closedir $dir;
foreach my $file (#files) {
my ($cust, $num) = $file =~ m/(\D+)(\d+)/;
my $dirname = "$cust/$num";
my #dirs_made = make_path($dirname, { verbose => 1 });
move($file, $dirname) or warn "cant move $file to $dirname: $!";
}
Given your description of file names, this regex should parse what you need
my ($cust, $num) = $filename =~ m/(\D+)(\d+)/;
Use a more precise pattern if you wish or need to be more specific about what precedes the number, for example [a-zA-Z] for letters only.
With that on hand, you can create directories using the core module File::Path, for example
use File::Path qw(make_path);
my $dirname = "$cust/$num";
my #dirs_made = make_path($dirname, { verbose => 1 });
This creates the path as needed, returning the names of created directories. It also prints the names with the verbose. If the directory exists it quietly skips it. If there are problems it raises a die so you may want to wrap it in eval
eval { make_path($dirname) };
if ($#) {
warn "Error with make_path($dirname): $#";
}
Also note the File::Path::Tiny module as an alternative, thanks to Sinan Ünür for bringing it up. Other than being far lighter, it also has the more common error-handling policy whereby a false is returned on failure so you don't need an eval but only the usual check
use File::Path::Tiny;
File::Path::Tiny::mk($path) or warn "Can't mk($path): $!";
The module behaves similarly to mkdir in many ways, see the linked documentation.
Move the files using the move function form the core module File::Copy, for example
use File::Copy qw(move);
move($file, $dirname) or warn "Can't move $file to $dirname: $!";
All this can be in a loop over the array with the file names.

Perl Reading from one file, writing contents to another file on Windows

I am very new to Perl and its syntax. I've done a bit of research about reading from one file and writing to another. I've written a short piece of code that doesnt seem to be giving me any error but it also doesn't write to the file. Some help would be greatly appreciated.
#!/usr/bin/perl
use strict;
use warnings;
my $defaultfile = 'C:\\Glenn Scott C\\AUTO IOX\\IOMETER FILES\\test.txt';
my $mainfile = 'C:\\Glenn Scott C\\AUTO IOX\\IOMETER FILES\\IOMETERFILECREATOR.txt';
open FILE, $defaultfile;
open FILE2, $mainfile;
while (my $line = <FILE>)
{
print FILE2($line);
}
close FILE;
close FILE2;
Close, but not quite.
open is best done with 3 arguments. open ( my $default_fh, '<', $defaultfile ) or die $!;
print to a file handle doesn't work like that. It's print {$main_fh} $line;
you should test open for success. An or die $! is sufficient.
So this would be what you need:
#!/usr/bin/perl
use strict;
use warnings;
my $defaultfile = 'C:\\Glenn Scott C\\AUTO IOX\\IOMETER FILES\\test.txt';
my $mainfile =
'C:\\Glenn Scott C\\AUTO IOX\\IOMETER FILES\\IOMETERFILECREATOR.txt';
open( my $default_fh, "<", $defaultfile ) or die $!;
open( my $main_fh, ">", $mainfile ) or die $!;
while ( my $line = <$default_fh> ) {
print {$main_fh} $line;
}
close $default_fh;
close $main_fh;

Need to split multiple files in a directory based on string, rename properly using powershell or fix my perl script

I have a directory full of files (text exports of Dynamics NAV objects that have been exported) in Windows. Each file contains multiple objects. I need to split each file into separate files based on lines that begin with OBJECT, and name each file appropriately.
The purpose of this is to get our Dynamics NAV system into git.
I wrote a nifty perl program to do this that works great on linux. But it hangs on the while(<>) loop in Windows (Server 2012 if that matters).
So, I need to either figure out how to do this in the PowerShell script that I wrote that generates all of the files, or fix my perl script that I'm calling from PowerShell. Does Windows perl handle filehandles differently than linux?
Here's my code:
#!/usr/bin/perl
use strict;
use warnings;
use File::Path qw(make_path remove_tree);
use POSIX qw(strftime);
my $username = getlogin || getpwuid($<);
my $datestamp = strftime("%Y%m%d-%H%M%S", localtime);
my $work_dir = "/temp/nav_export";
my $objects_dir = "$work_dir/$username/objects";
my $export_dir = "$work_dir/$username/$datestamp";
print "Objects being exported to $export_dir\n";
make_path("$export_dir/Page", "$export_dir/Codeunit", "$export_dir/MenuSuite", "$export_dir/Query", "$export_dir/Report", "$export_dir/Table", "$export_dir/XMLport");
chdir $objects_dir or die "Could not change to $objects_dir: $!";
# delete empty files
foreach(glob('*.*')) {
unlink if -f and !-s _;
}
my #files = <*>;
my $count = #files;
print "Processing $count files\n";
open (my $fh, ">-") or die "Could not open standard out: $!";
# OBJECT Codeunit 1 ApplicationManagement
while(<>)
{
if (m/^OBJECT ([A-Za-z]+) ([0-9]+) (.*)/o)
{
my $objectType = $1;
my $objectID = $2;
my $objectName = my $firstLine = $3;
$objectName =~ s/[\. \/\(\)\\]/_/g; # translate spaces, (, ), ., \ and / to underscores
$objectName =~ tr/\cM//d; # get rid of Ctrl-M
my $filename = $export_dir . "/" . $objectType . "/" . $objectType . "~" . $objectID . "~" . $objectName;
close $fh and open($fh, '>', $filename) or die "Could not open file '$filename' $!";
print $fh "OBJECT $objectType $objectID $firstLine\n";
next;
}
print $fh $_;
}
I've learned quite a bit of PowerShell in the past few days. There are some things that it really does quite well. And some (such as calling an executable with variables and command line options that have spaces) that are maddeningly difficult to figure out. To call curl, this is what I resorted to:
$curl = "C:\Program Files (x86)\cURL\bin\curl"
$arg10 = '-s'
$arg1 = '-X'
$arg11 = 'post'
$arg2 = '-H'
$arg22 = '"Accept-Encoding: gzip,deflate"'
$arg3 = '-H'
$arg33 = '"Content-Type: text/xml;charset=UTF-8"'
$arg4 = '-H'
$arg44 = '"SOAPAction:urn:microsoft-dynamics-schemas/page/permissionrange:ReadMultiple"'
$arg5 = '--ntlm'
$arg6 = '-u'
$arg66 = 'username:password'
$arg7 = '-d'
$arg77 = '"#soap_envelope.txt"'
$arg8 = "http://$servicetier.corp.company.net:7047/$database/WS/DBDOC/Page/PermissionRange"
$arg9 = "-o"
$arg99 = "c:\temp\nav_export\$env:username\raw_list.xml"
&"$curl" $arg10 $arg1 $arg11 $arg2 $arg22 $arg3 $arg33 $arg4 $arg44 $arg5 $arg6 $arg66 $arg7 $arg77 $arg8 $arg9 $arg99
I realize that part is a bit of a tangent. But I've been working really hard at trying to figure this out and not have to bother you nice folk here at stackoverflow!
I'm ambivalent about making it work in PowerShell or fixing the Perl code at this point. I just need to make it work. But I'm hoping it's just some little difference in filehandle handling between linux and Windows.
It's hard to believe that the Perl code that you show does anything on Linux either. It looks like your while loop is supposed to be reading through all of the files in the #files array, but to make it do that you have to copy the names to #ARGV.
Also note that #files will contain directories as well as files.
I suggest you change the lines starting with my #files = <*> to this. There's no reason why it shouldn't work on both Windows and Linux.
our #ARGV = grep -f, glob '*';
my $count = #ARGV;
print "Processing $count files\n";
my $fh;
while (<>) {
s/\s+\z//; # Remove trailing whitespace (including CR and LF)
my #fields = split ' ', $_, 4;
if ( #fields == 4 and $fields[0] eq 'OBJECT' ) {
my ($object_type, $object_id, $object_name) = #fields[1,2,3];
$object_name =~ tr{ ().\\/}{_}; # translate spaces, (, ), ., \ and / to underscores
my $filename = "$export_dir/$object_type/$object_type~$object_id~$object_name";
open $fh, '>', $filename or die "Could not open file '$filename': $!";
}
print $fh "$_\n" if $fh;
if (eof) {
close $fh;
$fh = undef;
}
}

Resources