I have a Perl script which writes messages to STDOUT and STDERR (via print/croak statements), and I also redirect STDOUT and STDERR to a logfile:
File::Tee::tee STDOUT, ">>", "$logFile" ;
File::Tee::tee STDERR, ">>", "$logFile" ;
Now the logfile has the messages from STDOUT and STDERR out of order, and the actual output on the terminal is out of order as well. I have tried flushing the buffers (as recommended here: https://perl.plover.com/FAQs/Buffering.html) but it doesn't help:
select(STDERR) ;
$| = 1 ;
select(STDOUT) ;
$| = 1 ;
Does anyone know what I must do to see the output in order (I also tried additionally flushing the filehandle corresponding to $logfile, but it's still the same)?
EDIT:
Thanks to all of you who have replied. A lot of the discussion over this ended up in comments, so I am going to list the few things which I tried based on feedback from all of you.
I was already flushing STDOUT and STDERR before I used File::Tee. As @jimtut suspected, File::Tee was indeed the culprit - removing it restored the ordering on the console. But I did want to redirect STDOUT and STDERR.
@mob suggested using IO::Tee instead, but I haven't fully understood how to make that work the way I want in my code.
@briandfoy pointed out that there isn't a reliable way to ensure two separate filehandles are seen in the correct order in realtime, and also suggested using a logging routine which is the only place that writes to STDOUT/STDERR. @zdim further pointed out that File::Tee uses fork, which is the heart of the issue, since two processes cannot guarantee any order on output.
Since File::Tee is to blame, I removed it from the code. I updated my logger function to print to STDOUT/STDERR and to additionally print to the $log filehandle. Further, to capture the warns in the log, I did the following:
sub warning_handler {
my $msg = $_[0] ;
print STDERR $msg ;
print $log $msg if defined $log ;
}
$SIG{__WARN__} = \&warning_handler ;
This worked great for all of the code under my control. Everything was printing in order now, both on the console and in the logfile. However, I realized I can't use this solution, since I am also calling someone else's Perl packages for some functionality, and clearly I can't intercept the print/croak etc. which write to STDOUT/STDERR within the 'off the shelf' package. So right now I don't have a good solution. However, I suspect that if I can find some way to intercept STDOUT/STDERR within Perl, I might be able to get what I need.
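For completeness, the same trick extends to die/croak via a __DIE__ hook. This is only a sketch I haven't battle-tested; note that __DIE__ also fires for exceptions that are later caught by eval, so it can over-log:

$SIG{__DIE__} = sub {
    my $msg = $_[0] ;
    print $log $msg if defined $log ;   # same $log filehandle as above
    die @_ ;                            # re-throw so the error still propagates
} ;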
EDIT2:
I added my own answer which is probably the closest I got to solving the problem by modifying mob's solution of using IO::Tee instead of File::Tee, but even this misses some messages (though it fixes ordering).
EDIT3:
Finally found the 'solution'
use IO::Tee ;
use Capture::Tiny qw(capture);
...
...
select(STDERR) ;
$| = 1 ;
select(STDOUT) ;
$| = 1 ;
open (my $log, ">", $logfilename) ;
*REALSTDOUT = *STDOUT ;
*REALSTDERR = *STDERR ;
*STDOUT = IO::Tee->new(\*REALSTDOUT, $log);
*STDERR = IO::Tee->new(\*REALSTDERR, $log);
# Regular Perl code here which sends output to STDOUT/STDERR
...
...
# system calls / calls to .so need to be captured
&log_streams(sub { &some_func_which_calls_shared_object() ; }) ;
sub log_streams {
my ($cr, @args) = @_; # code reference, with its arguments
my ($out, $err, $exit) = capture { $cr->(@args) };
if ($out) {
print STDOUT $out;
}
if ($err) {
print STDERR $err;
}
}
The use of IO::Tee ensures all Perl-generated output to the console also goes to the logfile, and this happens immediately, thereby updating the log and console in realtime. Since IO::Tee changes the meaning of the STDOUT/STDERR filehandles to refer to the teed handles, it can only intercept I/O from Perl statements; it misses system calls, since those bypass Perl's STDOUT/STDERR handles. So we capture the system-call output and then use the log_streams routine to forward it to the now-aliased STDOUT/STDERR streams. This creates a delay before system-call-generated output shows up in the log/terminal, but there is no delay for Perl-generated output - i.e. best of both worlds. Do note that the ordering of stderr and stdout generated by an invocation of some_func_which_calls_shared_object is not preserved, since in the log_streams routine we first print to STDOUT and then to STDERR - as long as the system call is atomic and doesn't interleave its stdout/stderr messages much, we should be OK.
Appreciate the solutions from briandfoy, mob and zdim, whose answers I combined to arrive at this solution! I never thought it would require going into this much detail for what seems a very simple problem.
With two separate file handles, there's no contract or guarantee that you'll see them in real time. Various settings and buffers affect that, which is why you see the auto flush stuff ($|). It's the same idea for files or the terminal.
Realize this is an architectural issue rather than a syntactic one. You have two things competing for the same resource. That usually ends in tears. I hesitate to suggest a solution when I don't know what the problem is, but consider having whatever is trying to write to STDOUT or STDERR write to some sort of message broker that collects all the messages and is the only thing that writes to the final (shared) destination. For example, things that want to add entries to the syslog don't write to the syslog; they send messages to the thing that writes to the syslog.
A more Perly example: in Log4perl, you don't write to the final destinations. You simply log a message, and the logger is the single thing that figures out how to handle it. When I want this sort of behavior without the module, I don't use output facilities directly:
debug( "Some debug message" );
sub debug {
my $message = shift;
output( "DEBUG: $message" );
}
sub output { # single thing that can output message
...
}
Then do whatever you need to do in output.
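For example, output might be as simple as this sketch (the log file name here is made up):

sub output { # single thing that can output message
    my $message = shift;
    state $log_fh = do {    # requires 'use feature qw(state)'
        open my $fh, '>>', 'app.log' or die "Could not open log: $!";
        $fh;
    };
    print STDOUT "$message\n";
    print { $log_fh } "$message\n";
}

Since every message funnels through one subroutine in one process, the terminal and the log see things in the same order.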
But you can't always control that in other things that are also trying to output things. Perl lets you get around this by redefining what warn and friends do by putting a coderef in $SIG{__WARN__}. You can capture warning messages and do whatever you like with them (such as sending them to standard output). Beyond that is black magic that reopens STDERR onto something you can control. It's not that bad, and it's isolated in one place.
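That black magic is essentially this sketch (the log file name is again made up):

# Save the real STDERR, then point STDERR somewhere you control.
open my $real_stderr, '>&', \*STDERR or die "Can't dup STDERR: $!";
open STDERR, '>>', 'app.log'         or die "Can't reopen STDERR: $!";
# ... later, to restore it:
open STDERR, '>&', $real_stderr or die "Can't restore STDERR: $!";

Because this swaps the real file descriptor, it also affects code and child processes that write to fd 2 directly.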
At some point, another person won't want merged output, and the intrusive solutions make it impossible to separate the streams. I'd much prefer flexibility to a hard-coded constraint. If I want just the errors, I want a way to get just the errors. There are many other sorts of workarounds, such as wrappers that collect both output streams (so, not at all intrusive) and various command-line redirections.
You will have two filehandles writing to $logfile. Unless File::Tee takes care to seek to the end of the filehandle before every write (which it doesn't appear to), you will get a race condition where one filehandle will overwrite the other.
A workaround would be to use the reopen option to the File::Tee::tee function -- that will close the file after each write and reopen it (at the proper end of the file) before the next write. That could hurt your performance though, depending on how often you write to those filehandles.
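If I recall the interface correctly, the option rides in a hash passed to tee (please verify the exact option name and placement against the File::Tee documentation; this is only a sketch):

use File::Tee qw(tee);
# 'reopen' makes the background process close and reopen the log
# around each write -- check the module's docs for the exact spelling.
tee STDOUT, { reopen => 1 }, '>>', $logFile;
tee STDERR, { reopen => 1 }, '>>', $logFile;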
You might also have better luck with IO::Tee, which is a more straightforward implementation (using tied filehandles) than what File::Tee uses (a background process for each File::Tee::tee call), so you may get fewer surprises. Here is how an IO::Tee solution might look:
use IO::Tee;
$| = 1;
open my $stdout, ">&=1"; # to write to original stdout
open my $stderr, ">&=2"; # to write to original stderr
open my $fh_log, ">>", $logfile;
*STDOUT = IO::Tee->new($stdout, $fh_log);
*STDERR = IO::Tee->new($stderr, $fh_log);
...
There are no background processes, extra threads, or anything else to cause a race condition. Both STDOUT and STDERR will write to the same log filehandle from the same process.
After taking the hint from @mob's answer to use IO::Tee instead of File::Tee (since the latter uses fork, causing out-of-order STDERR vs STDOUT), I modified mob's original solution a bit and it worked (almost - read on):
use IO::Tee ;
...
...
open (my $log, ">", $logfilename) ;
*MYSTDOUT = *STDOUT ;
*MYSTDERR = *STDERR ;
*STDOUT = IO::Tee->new(\*MYSTDOUT, $log);
*STDERR = IO::Tee->new(\*MYSTDERR, $log);
This resulted in the correct ordering both on the console and in the logfile (mob's original solution, using open to dup STDOUT/STDERR, didn't work - it resulted in the correct order in the logfile, but out of order on the console. Using a typeglob alias instead of a dup works, for some strange reason).
However, as good as this solution sounds, it missed printing some messages from a package I call into the logfile (though they do get printed on the console). My original code with File::Tee did show these same messages from the package in the logfile, so there is some voodoo going on somewhere. The specific package in question is a .so file, so I have no visibility into how exactly it prints its messages.
EDIT:
I guess that the .so file is as good as an external system command printing to stdout/stderr. Since it's not going through Perl's I/O, the handles pointed to by the STDOUT/STDERR typeglobs within Perl will not see the output of external programs called from Perl.
I guess the best solution would be to use a combination of this solution for messages coming from within the Perl code, and Capture::Tiny::capture, as pointed out by @zdim, for capturing and redirecting messages from system calls/calls going through the SWIG interface.
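A further idea I have not fully tested: give up on separate streams and redirect at the file-descriptor level, piping everything through tee(1). Because the .so code and child processes inherit the real descriptors, their output gets caught too, at the cost of merging stderr into stdout:

# Everything (Perl, .so code, child processes) now shares one pipe, so
# relative order is preserved, but STDERR is folded into STDOUT and
# reaches the terminal via tee's stdout.
open STDOUT, '|-', 'tee', '-a', $logfilename or die $! ;
open STDERR, '>&', \*STDOUT or die $! ;
$| = 1 ;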
Note: the first part is done via tie-d handles; the solution in the second part uses Capture::Tiny.
A bare-bones proof-of-concept for an approach using tie-d handles.
The package ties a handle so that everything printed to it goes both to a file and to (a copy of) the STDOUT stream:
package DupePrints;
use warnings;
use strict;
use feature 'say';
my $log = 't_tee_log.out';
open my $fh_out, '>', $log or die $!; # for logging
# An independent copy of STDOUT (via dup2), for prints to terminal
open my $stdout, '>&', STDOUT or die $!;
sub TIEHANDLE { bless {} }
sub PRINT {
my $self = shift;
print $fh_out @_;
print $stdout @_;
}
1;
A program that uses it
use warnings;
use strict;
use feature 'say';
use DupePrints;
$| = 1;
tie *STDERR, 'DupePrints';
tie *STDOUT, 'DupePrints';
say "hi";
warn "\t==> ohno";
my $y;
my $x = $y + 7;
say "done";
This prints the following text to both the terminal and t_tee_log.out:
hi
==> ohno at main_DupePrints.pl line 14.
Use of uninitialized value $y in addition (+) at main_DupePrints.pl line 17.
done
See perltie and Tie::Handle, and related posts with examples.
The logging of the STDOUT and STDERR streams to a file (along with the copied printout to the terminal) works across other modules that may be used in the main program, as well.
To also have "clean" prints that don't get logged, copy the STDOUT handle in the main program, like it's done in the module, and print to that. If you need to use this in a more selective and sophisticated manner, please modify as needed -- as it stands, it is meant to be only a basic demo.
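For instance, in the main program (a sketch):

# Dup STDOUT *before* tying it, then print un-logged output to the copy.
open my $clean_out, '>&', STDOUT or die $!;
tie *STDOUT, 'DupePrints';
print $clean_out "shown on the terminal only, not logged\n";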
With the clarification in the question's edit, here is a different approach: wrap calls in Capture::Tiny, which captures all output from any code, and then manage the captured prints as needed:
use warnings;
use strict;
use feature qw(say state);
use Capture::Tiny qw(capture);
sub log_streams {
my ($cr, @args) = @_; # code reference, with its arguments
# Initialize "state" variable, so it runs once and stays open over calls
state $fh_log = do {
open my $fh, '>', 'tee_log.txt' or die $!;
$fh;
};
my ($out, $err, $exit) = capture { $cr->(@args) };
if ($out) {
print $fh_log $out;
print $out;
}
if ($err) {
print $fh_log $err;
print $err;
}
}
log_streams( sub { say "hi" } );
log_streams( sub { warn "==> ohno" } );
log_streams( sub { my $y; my $x = $y + 7; } );
log_streams( sub { system('perl', '-wE', q(say "external perl one-liner")) } );
log_streams( sub { say "done" } );
The downside of all this is that everything needs to run via that sub. But then again, that's actually a good thing, even if sometimes inconvenient.
The state feature is used to "initialize" the filehandle because a variable declared as state is never re-initialized; so the file is opened only once, on the first call, and stays open.
This is also a demo in need of completion.
I'm using Perl 5.22.1 with Win32::Console. I have scripts that run make, and I thought it would be nice if I could output the errors in red. Win32::Console has handles for stdout and stderr, so I thought I could give colors to the individual handles, but if I modify stderr, it also modifies stdout. Is there a way to do this with Win32::Console or another module?
test.pl
use strict;
use warnings;
use Win32::Console;
my $stdout = new Win32::Console(STD_OUTPUT_HANDLE);
my $stderr = new Win32::Console(STD_ERROR_HANDLE);
sub testOutput
{
print("Testing output\n");
system("ls error");
system("ls *.pl");
print("\n");
}
testOutput();
$stdout->Attr($FG_BLUE);
testOutput();
$stderr->Attr($FG_RED);
testOutput();
$stdout->Write("stdout print\n");
$stderr->Write("stderr print\n");
The output (shown as a screenshot in the original post) demonstrates that setting the attribute on either handle changes the color of everything subsequently written to the console.
This happens because regardless of the apparent "stream" on which you invoke Attr, ultimately, both are connected to the console and all Win32::Console does is to output a given sequence of control characters to the console.
If I understand correctly, you want the stderr output of ls to show in one color and the stdout output of ls to show up in another color. In that case, you might want to use Capture::Tiny to capture those streams and output them with the settings you want.
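A sketch of that idea (the restore via Win32::Console's exported $ATTR_NORMAL at the end is an assumption about your desired default):

use Capture::Tiny qw(capture);
use Win32::Console;

my $console = Win32::Console->new(STD_OUTPUT_HANDLE);
my ($out, $err, $exit) = capture { system("ls *.pl") };

# One console, so serialize: set a color, write one captured stream,
# switch colors, write the other.
$console->Attr($FG_BLUE);
$console->Write($out);
$console->Attr($FG_RED);
$console->Write($err);
$console->Attr($ATTR_NORMAL);

The trade-off is that you no longer see the two streams interleaved live; you get each command's stdout, then its stderr.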
In my shell script, I have a function that prints a message on the console. It can be called from any other function.
function print_message
{
echo "message content"
}
The problem is that in shell, when a function is called with command substitution, commands like echo or printf that usually print to standard output instead have their output captured and handed to the caller as a return value.
return_value=$(print_message) # this line prints nothing.
echo $return_value # This line prints the message. I don't want to have to do it.
I would like to avoid this behavior and print it directly on standard - or error - output. Is there a way to do it?
Or am I just wrong to want to use functions in shell, and should I instead use one huge script to handle all the behavior?
The $(...) calling syntax captures standard output. That is its job. That's what it does.
If you want static messages that don't get caught by that, then you can use standard error (though please don't do this for things that aren't error messages, debugging messages, etc.).
You can't have a function which outputs to standard output but doesn't get caught by the $(...) context it is running in, because there's only one standard output stream. The best you could do is detect when you have a controlling terminal/etc. and write directly to that instead (but I'd advise against doing that most of the time as well).
To redirect to standard error for the function entirely you can do either of these.
print_message() {
echo "message content" >&2
}
or
print_message() {
echo "message content"
} >&2
The difference is immaterial when there is only one line of output but if there are multiple lines of output then the latter is likely to be slightly more optimized (especially when the output stream happens to be a file).
Also, avoid the function keyword, as it isn't in the POSIX spec and isn't as broadly portable.
You are explicitly saying "don't print the output directly! Put it in a variable so I can print it myself!".
You can simply stop doing that, and the message will be printed automatically:
$ cat yourscript
#!/bin/bash
function print_message
{
echo "message content"
}
print_message
$ ./yourscript
message content
Invoking print_message inside $(...) redirects the output. If you don't want the output redirected, then invoke the command without the $(...). E.g.
return_value=print_message # this line prints nothing.
echo $return_value # this line prints "print_message". I don't want to have to do it.
Note, the return value from the function you provided will now be the name of the function.
OK, so bear with me, as I am not a professional; this is a proof-of-concept project to learn more about my shell, programming, and just basic bash scripting.
So WHAT I WANT TO DO is: whenever anything is printed in my terminal, be it the result of a command or an error message from the shell, I want to apply some "filters" to what is being displayed. For example, if I input "ls -a" in the terminal, I would like to get the list of folders that the command returns, but with a TIME DELAY applied to the characters, so that it seems like the list is being typed in real time.
More SPECIFICALLY, I'd like the script to take every alphanumeric character in STDOUT and spend a specific amount of time (say 100 milliseconds) iterating through random characters (these can be picked randomly from a list) before finally stopping at the original value of the character.
WHAT I KNOW:
not much; I am new to programming in general, and to the bash language, but I can read some code, and browsing around I found this http://brettterpstra.com/2012/09/15/matrixish-a-bash-script-with-no-practical-application/ script that plays with tput. This shows me the visual effect I'd like can be accomplished... now to make it happen orderly and individually for each character printed to STDOUT... that is what I can't figure out.
WHAT I THINK:
in my mind, I know I could take STDOUT and pipe it to a file, where through any language (let's say Python!) I can do all kinds of string manipulation and then return the output to STDOUT. But I'd like the characters to be manipulated in realtime, so if for example the code was
cool_chars = "£ ア イ ウ エ オ カ キ ク ケ コ サ シ ス"
stdout = ...  # whatever module works to grab STDOUT from the shell, as a string
for word in stdout.split(" "):
    for letter in word:
        n = 0
        while n < 10:
            # print the following iteration in real time in the shell... but how????
            print(random.choice(cool_chars))
            n += 1
        # finally stop at the correct character
        print(letter)
Anyway, I've read a little about curses and ncurses and how you can create new windows with specified parameters. I wonder if it'd just be a matter of creating a terminal with the specified parameters via the curses libraries and then making a link so that each new terminal instance opens my modified curses shell, or if I can just do a bash shell script, or if it'd be easiest to use something like Python. I know all of the above can be options, but I'm looking for the simplest answer, not necessarily the most resource-efficient one.
Any help, comments, pointers etc is appreciated.
This does not answer your question fully, but it does print any input as if it were being typed in real time:
perl -MTime::HiRes -F -ane '$|=1;$old=""; foreach $char(@F){Time::HiRes::sleep(0.1); print "\r${old}${char}"; $old.=$char}' /etc/hosts
Instead of a file, STDIN can be used:
echo -e "abc\ndef\nghi" | perl -MTime::HiRes -F -ane '$|=1;$old=""; foreach $char(@F){Time::HiRes::sleep(0.1); print "\r${old}${char}"; $old.=$char}'
We can make it shorter by using the shell's sleep:
perl -F -ane '$|=1;$old=""; foreach $char(@F){`sleep 0.1`; print "\r${old}${char}"; $old.=$char}'
EDIT:
The script below should fully solve your problem:
#!/usr/bin/perl
use strict;
use utf8;
binmode(STDOUT, ":utf8");
our $cols=`tput cols`;
$|=1;
our $cursor="";
sub reset_line {
print "\r" . " "x$cols . "\r";
}
sub pick_cursor {
my @c = split (//,"£アイウエオカキクケコサシス");
$cursor=$c[int(rand(@c))];
}
while (<>) {
my $line="";
my @a=split //;
foreach my $char (@a) {
`sleep 0.1`;
reset_line;
pick_cursor;
if ( $char eq "\n" || $char =~ /\s/) {
print "${line}${char}";
}else {
print "${line}${char}${cursor}";
}
$line .= $char;
}
}
I am trying to use taint mode. I want to open a file based on user input and read data from it. Below is my code:
#!/usr/bin/perl -w
use strict;
use warnings;
my $name = $ARGV[0];
my $file = "/Desktop/data/$name";
open MYFILE, "$file" or die $!;
while (<MYFILE>) {
chomp;
print "$_\n";
}
close(MYFILE);
Case 1) When I run the file using
perl -w filename.pl input.txt
I am able to read data from the file.
Case 2) When I change the
#!/usr/bin/perl -w
to
#!/usr/bin/perl -T
and run the file using
perl -T filename.pl input.txt
I am still able to read the data.
Case 3) When I change the file to open in write mode and run in taint mode, I get the expected error:
Insecure dependency in open while running with -t switch at test1.pl line 8.
What might be the issue in the first two cases? Or is that the correct behavior?
Is it allowed to open a file for reading in taint mode?
This is correct behaviour for taint mode. The documentation specifies:
You may not use data derived from outside your program to affect something else outside your program--at least, not by accident.
[...]
$arg = shift; # $arg is tainted
[...]
If you try to do something insecure, you will get a fatal error saying something like "Insecure dependency" or "Insecure $ENV{PATH}".
(edit: missed some stuff):
Tainted data may not be used directly or indirectly in any command that invokes a sub-shell, nor in any command that modifies files, directories, or processes, with the following exceptions:
Arguments to print and syswrite are not checked for taintedness.
(This is why the read-mode example doesn't complain about the file data.)
Command-line arguments are potentially insecure, and so are tainted until specified otherwise.
To determine whether data is tainted:
To test whether a variable contains tainted data, and whose use would thus trigger an "Insecure dependency" message, you can use the tainted() function of the Scalar::Util module, available in your nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
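For example:

use Scalar::Util qw(tainted);

my $arg = shift;    # tainted, per the above
print "arg is tainted\n" if tainted($arg);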
To untaint data:
[...]the only way to bypass the tainting mechanism is by referencing subpatterns from a regular expression match. Perl presumes that if you reference a substring using $1, $2, etc., that you knew what you were doing when you wrote the pattern. That means using a bit of thought--don't just blindly untaint anything, or you defeat the entire mechanism. It's better to verify that the variable has only good characters (for certain values of "good") rather than checking whether it has any bad characters. That's because it's far too easy to miss bad characters that you never thought of.
(with a warning for use locale):
If you are writing a locale-aware program, and want to launder data with a regular expression containing \w, put no locale ahead of the expression in the same block. See SECURITY in perllocale for further discussion and examples.
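Applied to the code in the question, untainting might look like this (the allowed character class is only illustrative; permit exactly what your file names can legitimately contain):

# Extract a known-good name: word characters, dots, and hyphens only.
my ($name) = $ARGV[0] =~ /\A([\w.-]+)\z/
    or die "Unexpected characters in file name";
my $file = "/Desktop/data/$name";    # $name is untainted now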
Taint mode prevents the following from wiping out your hard drive:
perl script.pl '| rm -rf /'
Solution: use the three-argument form of open, which treats its final argument strictly as a file name.
open(my $fh, '<', $ARGV[0])
I'm executing a system command, and I want to (1) pre-load STDIN for the system command and (2) capture the STDOUT from the command.
Per here I see I can do this:
open(SPLAT, "stuff") || die "can't open stuff: $!";
open(STDIN, "<&SPLAT") || die "can't dupe SPLAT: $!";
print STDOUT `sort`;
This uses the currently defined STDIN as the STDIN for sort. That's great if I have the data in a file, but I have it in a variable. Is there a way to load the contents of the variable into STDIN before executing the system command? Something like:
open(STDIN, "<$myvariable"); # I know this syntax is not right, but you get the idea
print STDOUT `sort`;
Can this be done without using a temp file? Also, I'm on Windows, so open2 is not recommended, I hear.
Thanks.
There's no reason not to use open2 on Windows. That said, open2 and open3 are rather low-level interfaces, so they're usually not the best choice on any platform.
Better alternatives include IPC::Run and IPC::Run3. IPC::Run is a bit more powerful than IPC::Run3, but the latter is a bit simpler to use.
May I recommend
use IPC::Run3 qw( run3 );
my $stdin = ...;
run3([ 'sort' ], \$stdin, \my $stdout);
It even does error checking for you.
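For comparison, the equivalent with the more powerful IPC::Run is along these lines (a sketch):

use IPC::Run qw( run );

my $stdin = "b\na\nc\n";
# run returns true if all children exit with status 0.
run [ 'sort' ], \$stdin, \my $stdout, \my $stderr
    or die "sort failed: $?";
print $stdout;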
But since you mentioned open2,
use IPC::Open2 qw( open2 );
my $stdin =...;
my $pid = open2(\local *FROM_CHILD, \local *TO_CHILD, 'sort');
print TO_CHILD $stdin;
close TO_CHILD;
my $stdout = '';
$stdout .= $_ while <FROM_CHILD>;
waitpid($pid, 0);
die $? if $?;
Maybe IPC::Open2 didn't work so well on Windows 15 years ago, but I wouldn't expect you to have any trouble with it now.
use IPC::Open2;
my $pid = open2( \*SORT_OUT, \*SORT_IN, 'sort' );
print SORT_IN $sort_input; # or @sort_input
close SORT_IN;
print "The sorted output is: ", <SORT_OUT>;
close SORT_OUT;