Perl Image::OCR::Tesseract module on Windows - windows

Anyone out there know of a graceful way to install the "Image::OCR::Tesseract" module on Windows? The module fails to install on Windows via CPAN due to a *NIX only module dependency called "LEOCHARRE::CLI". This module does not seem to be required to run "Image::OCR::Tesseract" itself.
I've managed to get the module working by first manually installing the dependency modules listed in the makefile.pl (except for "LEOCHARRE::CLI") and then by moving the module file to the correct directory structure under "C:\Perl\site\lib\Image\OCR". The final part of getting it to work was to alter the section of code that calls the ImageMagick and Tesseract executables from the command line to put quotes around the program names when the executables are called by module.
This works, but I'd really feel better about doing a PPM or CPAN install on a production system from a repo that works on Windows.

Never mind, I got it, though I can't decide what is the better solution.
To get the installer to work on Windows via the traditional "perl makefile.pl, make, make test, make install" routine requires an edit to the Makefile.pl script, including the missing Windows install module (Devel::AssertOS::MSWin32), and patch to AssertEXE.pm to use "File::Which" rather than the built in shell "which" command that Windows lacks. All this still requires that The "Image::OCR::Tesseract" be patched to put quotes around program names when executing "convert" and "tesseract" from the command line.
Given the number of steps involved to make the installer work on Windows, and the fact the module does not create a binary component for the module to link to, I'd say the best option for installing and getting the Tesseract module working on windows would be to first install the following binary packages:
ImageMagick
Link
Tesseract
http://code.google.com/p/tesseract-ocr/downloads/list
Next, locate your Perl module directory - on my system it is "C:\Perl\site\lib". Create a folder "Image", if you don't have one. Next, open the Image folder and create a folder called "OCR". Open the OCR folder. At this point, your path should be something along the lines of "C:\Perl\site\lib\Image\OCR". Create a new text file called "Tesseract.pm", and copy in the following content...
package Image::OCR::Tesseract;
use strict;
use Carp;
use Cwd;
use String::ShellQuote 'shell_quote';
use Exporter;
use vars qw(#EXPORT_OK #ISA $VERSION $DEBUG $WHICH_TESSERACT $WHICH_CONVERT %EXPORT_TAGS #TRASH);
#ISA = qw(Exporter);
#EXPORT_OK = qw(get_ocr get_hocr _tesseract convert_8bpp_tif tesseract);
$VERSION = sprintf "%d.%02d", q$Revision: 1.24 $ =~ /(\d+)/g;
%EXPORT_TAGS = ( all => \#EXPORT_OK );
BEGIN {
use File::Which 'which';
$WHICH_TESSERACT = which('tesseract');
$WHICH_CONVERT = which('convert');
if($^O=~m/MSWin/) {
$WHICH_TESSERACT='"'.$WHICH_TESSERACT.'"';
$WHICH_CONVERT='"'.$WHICH_CONVERT.'"';
}
$WHICH_TESSERACT or die("Is tesseract installed? Cannot find bin path to tesseract.");
$WHICH_CONVERT or die("Is convert installed? Cannot find bin path to convert.");
}
END {
scalar #TRASH or return;
if ( $DEBUG ){
print STDERR "Debug on, these are trash files:\n".join("\n",#TRASH) ;
}
else {
unlink #TRASH;
}
}
sub DEBUG { Carp::cluck("Image::OCR::Tesseract::DEBUG() deprecated") }
sub get_hocr {
my ($abs_image,$abs_tmp_dir,$lang)= #_;
-f $abs_image or croak("$abs_image is not a file on disk");
my $hocr="hocr";
if(defined $abs_tmp_dir){
-d $abs_tmp_dir or die("tmp dir arg $abs_tmp_dir not a dir on disk.");
$abs_image=~/([^\/]+)$/ or die("cant match filename in path arg '$abs_image'");
my $abs_copy = "$abs_tmp_dir/$1";
# TODO, what if source and dest are same, i want it to die
require File::Copy;
File::Copy::copy($abs_image, $abs_copy)
or die("cant make copy of $abs_image to $abs_copy, $!");
# change the image to get ocr from to be the copy
$abs_image = $abs_copy;
# since it's a copy. erase that on exit
push #TRASH, $abs_image;
}
my $tmp_tif = convert_8bpp_tif($abs_image);
push #TRASH, $tmp_tif; # for later delete
_tesseract($tmp_tif,$lang,$hocr) || '';
}
sub get_ocr {
my ($abs_image,$abs_tmp_dir,$lang)= #_;
-f $abs_image or croak("$abs_image is not a file on disk");
if(defined $abs_tmp_dir){
-d $abs_tmp_dir or die("tmp dir arg $abs_tmp_dir not a dir on disk.");
$abs_image=~/([^\/]+)$/ or die("cant match filename in path arg '$abs_image'");
my $abs_copy = "$abs_tmp_dir/$1";
# TODO, what if source and dest are same, i want it to die
require File::Copy;
File::Copy::copy($abs_image, $abs_copy)
or die("cant make copy of $abs_image to $abs_copy, $!");
# change the image to get ocr from to be the copy
$abs_image = $abs_copy;
# since it's a copy. erase that on exit
push #TRASH, $abs_image;
}
my $tmp_tif = convert_8bpp_tif($abs_image);
push #TRASH, $tmp_tif; # for later delete
_tesseract($tmp_tif,$lang) || '';
}
sub convert_8bpp_tif {
my ($abs_img,$abs_out) = (shift,shift);
defined $abs_img or die('missing image arg');
$abs_out ||= $abs_img.'.tmp.'.time().(int rand(9000)).'.tif';
my #arg = ( $WHICH_CONVERT, $abs_img, '-compress','none','+matte', $abs_out );
#die (join(" ", #arg));
system(#arg) == 0 or die("convert $abs_img error.. $?");
$DEBUG and warn("made $abs_out 8bpp tiff.");
$abs_out;
}
# people expect tesseract to automatically convert
*tesseract = \&_tesseract;
sub _tesseract {
my ($abs_image,$lang,$hocr) = #_;
defined $abs_image or croak('missing image path arg');
$abs_image=~/\.tif+$/i or warn("Are you sure '$abs_image' is a tif image? This operation may fail.");
#my #arg = (
# $WHICH_TESSERACT, shell_quote($abs_image), shell_quote($abs_image),
# (defined $lang and ('-l', $lang) ), '2>/dev/null'
#);
my $cmd =
( sprintf '%s %s %s',
$WHICH_TESSERACT,
shell_quote($abs_image),
shell_quote($abs_image)
) .
( defined $lang ? " -l $lang" : '' ) .
( defined $hocr ? " hocr" : '' ) .
" 2>/dev/null";
$DEBUG and warn "command: $cmd";
system($cmd); # hard to check ==0
my $txt = $abs_image.($hocr?".html":".txt");
unless( -f $txt ){
Carp::cluck("no text output for image '$abs_image'. (No text file '$txt' found on disk)");
return;
}
$DEBUG and warn "Found text file '$txt'";
my $content = (_slurp($txt) || '');
$DEBUG and warn("content length of text in '$txt' from image '$abs_image' is ". length $content );
push #TRASH, $txt;
$content;
}
sub _slurp {
my $abs = shift;
open(FILE,'<', $abs) or die("can't open file for reading '$abs', $!");
local $/;
my $txt = <FILE>;
close FILE;
$txt;
}
1;
__END__
#sub _force_imgtype {
# my $img = shift;
# my $type = shift;
# my $delete_original = shift;
# $delete_original ||=0;
#
#
# if($img=~/\.$type$/i){
# return $img;
# }
#
# my $img_out= $img;
# $img_out=~s/\.\w{1,5}$/\.$type/ or die("cant get file ext for $img");
#
#
#
#}
Save and close. Close the command line session and open a new one if you've had one open from before you did the ImageMagick and Tesseract binary installs. Test the module with the following script:
use Image::OCR::Tesseract;
my $image = 'SomeImageFileThatContainsText.jpg';
my $text = Image::OCR::Tesseract::get_ocr($image);
print "Text...\n";
print $text."\n";
print "Normal Exit\n";
exit;
That's it. Messy, I know, but there's no good way around the fact that the module installer really needs to be updated to support Windows (and other) systems even though the actual module code almost runs without modification. Really, if Tesseract and ImageMagick were installed to paths without spaces then the "Image::OCR::Tesseract" module code would not need any changes, but this minor tweak lets the supporting executables be installed anywhere, including the default locations.

Related

In a setup with two Nix Flakes, where one provides a plugin for the other's application, "path <…> is not valid". How to fix that?

I have two Nix Flakes: One contains an application, and the other contains a plugin for that application. When I build the application with the plugin, I get the error
error: path '/nix/store/3b7djb5pr87zbscggsr7vnkriw3yp21x-mainapp-go-modules' is not valid
I have no idea what this error means and how to fix it, but I can reproduce it on both macOS and Linux. The path in question is the vendor directory generated by the first step of buildGoModule.
The minimal setup to reproduce the error requires a bunch of files, so I provide a commented bash script that you can execute in an empty folder to recreate my setup:
#!/bin/bash
# I have two flakes: the main application and a plugin.
# the mainapp needs to be inside the plugin directory
# so that nix doesn't complain about the path:mainapp
# reference being outside the parent's root.
mkdir -p plugin/mainapp
# each is a go module with minimal setup
tee plugin/mainapp/go.mod <<EOF >/dev/null
module example.com/mainapp
go 1.16
EOF
tee plugin/go.mod <<EOF >/dev/null
module example.com/plugin
go 1.16
EOF
# each contain minimal Go code
tee plugin/mainapp/main.go <<EOF >/dev/null
package main
import "fmt"
func main() {
fmt.Println("Hello, World!")
}
EOF
tee plugin/main.go <<EOF >/dev/null
package plugin
import log
func init() {
fmt.Println("initializing plugin")
}
EOF
# the mainapp is a flake that provides a function for building
# the app, as well as a default package that is the app
# without any plugins.
tee plugin/mainapp/flake.nix <<'EOF' >/dev/null
{
description = "main application";
inputs = {
nixpkgs.url = github:NixOS/nixpkgs/nixos-21.11;
flake-utils.url = github:numtide/flake-utils;
};
outputs = {self, nixpkgs, flake-utils}:
let
# buildApp builds the application from a list of plugins.
# plugins cause the vendorSha256 to change, hence it is
# given as additional parameter.
buildApp = { pkgs, vendorSha256, plugins ? [] }:
let
# this is appended to the mainapp's go.mod so that it
# knows about the plugin and where to find it.
requirePlugin = plugin: ''
require ${plugin.goPlugin.goModName} v0.0.0
replace ${plugin.goPlugin.goModName} => ${plugin.outPath}/src
'';
# since buildGoModule consumes the source two times –
# first for vendoring, and then for building –
# we do the necessary modifications to the sources in an
# own derivation and then hand that to buildGoModule.
sources = pkgs.stdenvNoCC.mkDerivation {
name = "mainapp-with-plugins-source";
src = self;
phases = [ "unpackPhase" "buildPhase" "installPhase" ];
# write a plugins.go file that references the plugin's package via
# _ = "<module path>"
PLUGINS_GO = ''
package main
// Code generated by Nix. DO NOT EDIT.
import (
${builtins.foldl' (a: b: a + "\n\t_ = \"${b.goPlugin.goModName}\"") "" plugins}
)
'';
GO_MOD_APPEND = builtins.foldl' (a: b: a + "${requirePlugin b}\n") "" plugins;
buildPhase = ''
printenv PLUGINS_GO >plugins.go
printenv GO_MOD_APPEND >>go.mod
'';
installPhase = ''
mkdir -p $out
cp -r -t $out *
'';
};
in pkgs.buildGoModule {
name = "mainapp";
src = builtins.trace "sources at ${sources}" sources;
inherit vendorSha256;
nativeBuildInputs = plugins;
};
in (flake-utils.lib.eachDefaultSystem (system:
let
pkgs = nixpkgs.legacyPackages.${system};
in rec {
defaultPackage = buildApp {
inherit pkgs;
# this may be different depending on your nixpkgs; if it is, just change it.
vendorSha256 = "sha256-pQpattmS9VmO3ZIQUFn66az8GSmB4IvYhTTCFn6SUmo=";
};
}
)) // {
lib = {
inherit buildApp;
# helper that parses a go.mod file for the module's name
pluginMetadata = goModFile: {
goModName = with builtins; head
(match "module ([^[:space:]]+).*" (readFile goModFile));
};
};
};
}
EOF
# the plugin is a flake depending on the mainapp that outputs a plugin package,
# and also a package that is the mainapp compiled with this plugin.
tee plugin/flake.nix <<'EOF' >/dev/null
{
description = "mainapp plugin";
inputs = {
nixpkgs.url = github:NixOS/nixpkgs/nixos-21.11;
flake-utils.url = github:numtide/flake-utils;
nix-filter.url = github:numtide/nix-filter;
mainapp.url = path:mainapp;
mainapp.inputs = {
nixpkgs.follows = "nixpkgs";
flake-utils.follows = "flake-utils";
};
};
outputs = {self, nixpkgs, flake-utils, nix-filter, mainapp}:
flake-utils.lib.eachDefaultSystem (system:
let
pkgs = nixpkgs.legacyPackages.${system};
in rec {
packages = rec {
plugin = pkgs.stdenvNoCC.mkDerivation {
pname = "mainapp-plugin";
version = "0.1.0";
src = nix-filter.lib.filter {
root = ./.;
exclude = [ ./mainapp ./flake.nix ./flake.lock ];
};
# needed for mainapp to recognize this as plugin
passthru.goPlugin = mainapp.lib.pluginMetadata ./go.mod;
phases = [ "unpackPhase" "installPhase" ];
installPhase = ''
mkdir -p $out/src
cp -r -t $out/src *
'';
};
app = mainapp.lib.buildApp {
inherit pkgs;
# this may be different depending on your nixpkgs; if it is, just change it.
vendorSha256 = "sha256-a6HFGFs1Bu9EkXwI+DxH5QY2KBcdPzgP7WX6byai4hw=";
plugins = [ plugin ];
};
};
defaultPackage = packages.app;
}
);
}
EOF
You need Nix with Flake support installed to reproduce the error.
In the plugin folder created by this script, execute
$ nix build
trace: sources at /nix/store/d5arinbiaspyjjc4ypk4h5dsjx22pcsf-mainapp-with-plugins-source
error: path '/nix/store/3b7djb5pr87zbscggsr7vnkriw3yp21x-mainapp-go-modules' is not valid
(If you get hash mismatches, just update the flakes with the correct hash; I am not quite sure whether hashing when spreading flakes outside of a repository is reproducible.)
The sources directory (shown by trace) does exist and looks okay. The path given in the error message also exists and contains modules.txt with expected content.
In the folder mainapp, nix build does run successfully, which builds the app without plugins. So what is it that I do with the plugin that makes the path invalid?
The reason is that the file modules.txt generated as part of vendoring will contain the nix store path in the replace directive in this scenario. The vendor directory is a fixed output derivation and thus must not depend on any other derivations. This is violated by the reference in modules.txt.
This can only be fixed by copying the plugin's sources into the sources derivation – that way, the replace path can be relative and thus references no other nix store path.

How can I use the Win32::LongPath module in Perl to manipulate long path names?

My Perl scripts need to work with pathnames that are longer than 260 characters, and I can not turn on the feature in the registry to enable Windows Long Path support.
I included a small Perl test, using the Win32::LongPath module to do this, and found that only a few functions from that module work. No luck with:
chdirL
getcwdL
Environment:
Windows 10 version 10.0.19041 build 19041
Strawberry Perl 5.30.3
I can't really find evidence that Win32::LongPath will not work in that environment, except for CPAN saying that the module has only been tested on XP and Windows 8...
⚠ However all of the help for Perl/Windows Long Paths in Windows 10 seems to recommend this module?
Am I using it wrong? I have included the output of the last iteration of the loop in the MRE (Minimal Reproducible Example):
The chdirL command never changes directories.
The getcwdL command only has 249 characters (513 expected).
package main 1.0;
use strict;
use warnings;
use Carp;
use Readonly;
use File::Spec::Functions;
use Cwd;
use Win32::LongPath;
my $dir = 'd123456789';
my $file = 'test.txt';
my $long_path = 'C:\\Temp';
my $long_file;
my $long_root = catdir $long_path, $dir;
my $fh;
# Maximum path length on linux : 4096
# Maximum path length on Windows : 260
Readonly::Scalar my $MAX_PATH => 512;
chdirL $long_path;
while ( length $long_path < $MAX_PATH ) {
$long_path = catdir $long_path, $dir;
$long_file = catfile $long_path, $file;
printf "%-5d: %s\n", length $long_path, "Making $long_path...";
mkdirL $long_path;
# === Does not change directories ==>
chdirL $long_path;
system 'CD';
# === Truncates path name ==>
my $curdir = getcwdL;
printf "%-20s: %s (%d)\n", 'getcwpdL', $curdir, length $curdir;
printf "%-5d: %s\n", length $long_file, "Making $long_file...";
openL \$fh, '>', $long_file or die "unable to create file\n";
print {$fh} "$long_path\n" or die "unable to print to file\n";
close $fh or die "unable to close file\n";
last if ( !( testL 'e', $long_path ) );
last if ( !( testL 'e', $long_file ) );
}
unlinkL $long_file or warn "unable to delete file\n";
1;
Last loop iteration:
513 : Making C:\Temp\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789...
W:\home\_PERL\long_path
getcwpdL : C:\Temp\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789 (249)
522 : Making C:\Temp\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\d123456789\test.txt...
chdirL is failing with The filename or extension is too long. chdirL, like the others, converts the path to a long path (\\?\...), and calls the appropriate system call. This is SetCurrentDirectoryW for chdirL and GetCurrentDirectoryW for getcwdL.
Using paths of the form \\?\... extends the use length limit for some calls, but not for SetCurrentDirectoryW and GetCurrentDirectoryW. It also doesn't extend the limit for CreateDirectoryW, CreateDirectoryExW and RemoveDirectoryW. These five retain the classical length limit even when using "long paths", at least according to Maximum Path Length Limitation, which provides a registry setting PLUS a manifest entry you can use to remove the limit on long paths for those calls.
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem]
"LongPathsEnabled"=dword:00000001
<application xmlns="urn:schemas-microsoft-com:asm.v3">
<windowsSettings xmlns:ws2="http://schemas.microsoft.com/SMI/2016/WindowsSettings">
<ws2:longPathAware>true</ws2:longPathAware>
</windowsSettings>
</application>
I don't remember if DLLs have their own manifest or not. If they do, the settings could be changed for just the module, and nothing would break. If they don't and perl's manifest needs to be changed, this would affect all uses of GetCurrentDirectoryW in the process, and that could cause problems. (GetCurrentDirectoryW could return an error because the buffer is too small, which could lead to a failure or crash depending on whether error checking is performed.)

Can't locate CGIBook/Error.pm

I am new to perl. I am running a perl script on macbook and i get following error:
Can't locate CGIBook/Error.pm in #INC (#INC contains: /Library/Perl/5.12/darwin-thread-
multi-2level /Library/Perl/5.12 /Network/Library/Perl/5.12/darwin-thread-multi-2level
/Network/Library/Perl/5.12 /Library/Perl/Updates/5.12.3 /System/Library/Perl/5.12/darwin-
thread-multi-2level /System/Library/Perl/5.12 /System/Library/Perl/Extras/5.12/darwin-
thread-multi-2level /System/Library/Perl/Extras/5.12) at HW1_3A.pl line 5.
It looks like I don't have CGIBook in my perl directory. Is that correct? Can anyone help me with this?
I didn't find CGIBook::Error on CPAN, so it may be a local module or something you got (or should get) from a vendor. Someone may have installed in a different location other than the default module search path.
In this case, it looks like you may be trying to use an example from the ancient book CGI Programming with Perl, which created a module with the same name for the examples.
A Google search of the error message quickly led to this code:
#!/usr/bin/perl -wT
package CGIBook::Error;
# Export the error subroutine
use Exporter;
#ISA = "Exporter";
#EXPORT = qw( error );
$VERSION = "0.01";
use strict;
use CGI;
use CGI::Carp qw( fatalsToBrowser );
BEGIN {
sub carp_error {
my $error_message = shift;
my $q = new CGI;
my $discard_this = $q->header( "text/html" );
error( $q, $error_message );
}
CGI::Carp::set_message( \&carp_error );
}
sub error {
my( $q, $error_message ) = #_;
print $q->header( "text/html" ),
$q->start_html( "Error" ),
$q->h1( "Error" ),
$q->p( "Sorry, the following error has occurred: " ),
$q->p( $q->i( $error_message ) ),
$q->end_html;
exit;
}
1;

Archive::Any gives IO error

#!/usr/bin/perl
use strict;
use warnings;
my $archive_files = "C:\\Temp\\FREMOTE\\test.zip";
sub extract_archive($$);
extract_archive($archive_files, "C:\\Temp\\FREMOTE\\TEST\\");
extract_archive("C:\\Temp\\FREMOTE\\TEST\\testb.zip",
"C:\\Temp\\FREMOTE\\TEST\\testb\\");
sub extract_archive($$) {
my $archive_file = shift;
my $extract_dir = shift;
if (! -d "$extract_dir") {
mkdir $extract_dir;
}
use Archive::Any;
my $archive = Archive::Any->new($archive_file);
if($archive->extract($extract_dir)) {
print "Extracted $archive_file into $extract_dir\n";
undef $archive;
} else {
print "Failed to extracted $archive_file into $extract_dir\n";
}
}
I got the following error. How do I resolve it?
IO error: write error during copy : Bad file descriptor
at C:/Perl/site/lib/Archive/Any.pm line 193.
IO error: write error during copy : Bad file descriptor
at C:/Perl/site/lib/Archive/Any.pm line 193.
IO error: write error during copy : Bad file descriptor
at C:/Perl/site/lib/Archive/Any.pm line 193.
IO error: write error during copy : Bad file descriptor
at C:/Perl/site/lib/Archive/Any.pm line 193.
I tested it with the following code. Using two known-good zip files, I added the second zip file into the first - to reproduce what I believe you are doing. With the original code I kept receiving an error during the extraction of the second file:
Extracted C:\Temp\colorbox-master.zip into C:\Temp\FREMOTE\TEST\<br>
Can't call method "extract" on an undefined value at Perl-1.pl line 19.
Different from your error, but fixed with the following code:
#!/usr/bin/perl
use strict;
use warnings;
my $archive_files = "C:\\Temp\\colorbox-master.zip";
extract_archive($archive_files, "C:\\Temp\\FREMOTE\\TEST\\");
extract_archive("C:\\Temp\\FREMOTE\\TEST\\easybox-v1.3.zip", "C:\\Temp\\FREMOTE\\TEST\\testb\\");
sub extract_archive {
my $archive_file = shift;
my $extract_dir = shift;
if (!-d "$extract_dir") {
mkdir $extract_dir;
}
use Archive::Any;
my $archive = Archive::Any->new($archive_file);
if($archive->extract($extract_dir)) {
print "Extracted $archive_file into $extract_dir\n";
undef $archive;
} else {
print "Failed to extracted $archive_file into $extract_dir\n";
}
}
Extracted C:\Temp\colorbox-master.zip into C:\Temp\FREMOTE\TEST\
Extracted C:\Temp\FREMOTE\TEST\easybox-v1.3.zip into C:\Temp\FREMOTE\TEST\testb\
Note that I had just installed the 'Archive::Any-0.0932' module (ActiveState Perl) so I may have a different (fixed) version. You may want to check that your module is at the most recent version. And that your zip files are not broken.

How can I open a file found by File::Find from inside the wanted function?

I have code like below. If I open the file $File::Find::name (in this case it is ./tmp/tmp.h) in my search function (called by File::Find::find), it says "cannot open the file ./tmp/tmp.h reason = No such file or directory at temp.pl line 36, line 98."
If I open the file directly in another function, I am able open the file.
Can somebody tell me the reason for this behavior? I am using activeperl on Windows and the version is 5.6.1.
use warnings;
use strict;
use File::Find;
sub search
{
return unless($File::Find::name =~ /\.h\s*$/);
open (FH,"<", "$File::Find::name") or die "cannot open the file $File::Find::name reason = $!";
print "open success $File::Find::name\n";
close FH;
}
sub fun
{
open (FH,"<", "./tmp/tmp.h") or die "cannot open the file ./tmp/tmp.h reason = $!";
print "open success ./tmp/tmp.h\n";
close FH;
}
find(\&search,".") ;
See perldoc File::Find: The wanted function (search in your case) will be called after File::Find::find changed to the directory that is currently searched. As you can see, $File::Find::name contains the path to the file relative to where the search started. That path that won't work after the current directory changes.
You have two options:
Tell File::Find to not change to the directories it searches: find( { wanted => \%search, no_chdir => 1 }, '.' );
Or don't use $File::Find::name, but $_ instead.
If ./tmp/ is a symbolic link then you will need to do following:
find( { wanted => \&search, follow => 1 }, '.' );
Does that help?
If you want to search the file in current working dir you can use Cwd.
use warnings;
use strict;
use File::Find;
use Cwd;
my $dir = getcwd;
sub search
{
return unless($File::Find::name =~ /\.h\s*$/);
open (FH,"<", "$File::Find::name") or die "cannot open the file $File::Find::name reason = $!";
print "open success $File::Find::name\n";
close FH;
}
find(\&search,"$dir") ;

Resources