How do I use raku -e and -n with multiple file glob - command-line-arguments

I'd like to do the following in raku on windows
raku -n -e ".say if /mydatabegin/;" *.file
Failed to open file C:\..\*.file: Invalid argument
The glob isn't interpreted as a glob. I assume that's because Windows requires programs to do the globbing themselves? So is there a pre-processing directive, a function, a switch I might have missed, a redirect, or something else that allows the glob to be expanded while keeping the simplicity of the -n (or -p) and -e switches?
Obviously, I can change it to a full program by removing the -n (or -p), just using -e to specify a main, and loop on the glob results. But I really like -n.
PS. I'm literally just learning raku, and was surprised this didn't work out of the box. So examples of full programs with easy syntax work also. But I really like -n..
Edit: re @chenyf
raku -e ".say for $*ARGFILES" *.file
Same error. Related:
raku -e ".say for $*ARGFILES.lines" *.file
Same error.
raku -e "use IO::Glob; .say for glob('*.file')"
Worked as expected! Expanding:
raku -e "use IO::Glob; .say for glob('*.file').lines"
No such method 'lines' for invocant of type 'IO::Glob'
Getting closer - perhaps expanding on this is a good enough workaround. But returning to one line glory attempts:
raku -e "use IO::Glob; .say for glob($*ARGFILES)" test.file
Cannot resolve caller glob(IO::ArgFiles:D); none of these signatures match.
Ok - let's retreat back to the safety of strings:
raku -e "use IO::Glob; .say for glob($*ARGFILES.Str)" test.file
Yes! SO..:
raku -e "use IO::Glob; .say for glob($*ARGFILES.Str).lines" test.file
No such method 'lines' for invocant of type 'IO::Glob'
I clearly need to read more of the manual. But let's retreat a little and see if my use case works:
raku -e "use IO::Glob; .say for glob($*ARGFILES.Str)" *.file
Failed to open file C:\..\*.file: Invalid argument
The same error I started off with. Could this just be a raku on windows error?
Edit:
raku -MIO::Glob -e "my @files = (map { glob($_).dir }, @*ARGS).flat; for @files -> $file { say $_ for $file.lines }" *file *file2 *5
I have three sets of files. I can almost live with this solution - except for some reason the lines are being printed with "s
Any ideas on shortening, and getting rid of the quotes?
EDIT Working around the auto-globbing of the $*ARGFILES variable:
raku -MIO::Glob -n -e "BEGIN { @*ARGS = (map { glob($_) }, @*ARGS).flat }; .say" *.file *.file2
This has the advantage of still looking like the original one liner; it uses -n! It just had to do the globbing that seems to be a bug when $*ARGFILES is created.
raku -MIO::Glob -e "BEGIN { @*ARGS = (map { glob($_) }, @*ARGS).flat }; .say for $*ARGFILES.lines" *.file *.file2
Converting to $*ARGFILES.lines above shows that $*ARGFILES gets its values from @*ARGS dynamically.
EDIT
Lastly, it turns out the glob function doesn't work with directories, at least on Windows (the documentation has an example that simply doesn't work).
#Example from https://github.com/zostay/raku-IO-Glob
for glob("src/core/*.pm") -> $file { say ~$file }
#mine that doesn't work
raku -MIO::Glob -e "for glob('..\*.file') -> $file { say ~$file }"
#mine that does work.
raku -MIO::Glob -e "for glob('*.file').dir('..') -> $file { say ~$file }"
#And therefore the final modification of the script above:
raku -MIO::Glob -e "BEGIN { @*ARGS = (map { glob(.IO.basename).dir(.IO.dirname) }, @*ARGS).flat }; .say for $*ARGFILES.lines" ..\*.file

My rudimentary understanding of file-globbing is that the shell handles that--and since it seems that you're on Windows, all bets may be off. The only exception may be if you're using WSL (Windows Subsystem for Linux), which should give you a more Unix/Linux-like experience:
https://learn.microsoft.com/en-us/windows/wsl/about
According to the Microsoft document below Windows has two built-in shells, CMD.exe and Powershell (I believe the WSL shell above is optional):
"Command Shell Overview"
"Windows has two command shells: The Command shell and PowerShell. Each shell is a software program that provides direct communication between you and the operating system or application, providing an environment to automate IT operations."
https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/windows-commands
Below is what works on a fairly old bash/MacOS system.
With the -ne (non-autoprinting) command line flag:
raku -ne 'print $_ ~ "\n";'
raku -ne 'say $_ ;'
raku -ne '.say;'
With the -pe (autoprinting) command line flag:
raku -pe 'print "";'
raku -pe ''
I literally just printed the contents of five (5) *.txt files using the commands above. The -e flag always comes last in a cluster of command-line flags, but it can be combined with others (see the -M flag below for an exception).
Now because you're on Windows, you may have to swap "" double-quotes with '' single-quotes (and vice versa!). But hopefully that gets you started. [Any further problems and you may have to specify the full path to your Rakudo executable].
If you'd like Raku to handle the directory-munging for you, there's Raku's dir() routine, which can be used in conjunction with its :test adverb (parameter). Compare the two commands below, run in the aforementioned directory with five (5) *.txt text files (remove the .elems calls to print the actual text):
raku -e '.lines.elems.say for dir(test => / \.txt $ /);'
14
16
34
1
16
VERSUS:
raku -e '.lines.elems.say for $*ARGFILES;' *.txt
81
https://docs.raku.org/routine/dir
ADDENDUM: My inspection of the Raku P5-to-P6 Glossary suggests that the designers deliberately left Perl5's glob function out of Perl6/Raku, opting instead for a more-powerful built-in dir() routine. Links below:
https://docs.raku.org/language/5to6-perlfunc#index-entry-glob_-_perlfunc
https://docs.raku.org/routine/dir
Raku module IO::Glob is non-core (and..I've never tried it). But if you're working at the command line and want to load a module, you use (for example) -MIO::Glob followed by -e, -ne, or -pe, etc. (Also, no need to incorporate a use IO::Glob; line within your Raku one-liner, it's loaded via -M already).
See Raku one-liner examples with the Text::CSV module and/or XML module at the links below:
https://unix.stackexchange.com/search?q=%5Bcsv%5D+Raku
https://unix.stackexchange.com/search?q=%5Bxml%5D+Raku
Follow additional Windows CMD.exe discussion/resolution below:
https://github.com/rakudo/rakudo/issues/4550

The general answer - It's not a bug. Windows programs have to deal with their own globbing if they want it. Making it work in the raku executable makes sense to me; it removes platform-specific surprises, and makes one-liners easier.
But others didn't see it that way, and there is an easy enough solution - create your own module so that the code can remain consistent and be called relatively simply.
Here's a module for starters. There is room to add things like
a switch for making a successful match mandatory
a switch to indicate that a failed glob should stay in the @*ARGS variable
a switch to only glob after it.
a routine to apply the globbing instead of automatically doing it. This would allow you to remove the non-file switches
gather each glob into its own list (perhaps with a switch).
The module:
unit module CLGlob:ver<1.0>:auth<pureabsolute>;
use IO::Glob;
@*ARGS = map { .Str }, (map { glob(.IO.basename).dir(.IO.dirname) }, @*ARGS ).flat;
Note: the expression in the second map can be simplified once the glob function works on Windows with directories.
Note: I convert each element to .Str to keep @*ARGS consistent. $*ARGFILES worked without that, so some processing can be saved if you'll never look at @*ARGS again.
Finally, the end result:
raku -MCLGlob -ne ".say" ..\*.file ..\*.file2
Yay.

Related

What do two at (@) signs surrounding a string mean in a shell script?

For example,
# Execute the pre-hook.
export SHELL=@shell@
param1=@param1@
param2=@param2@
param3=@param3@
param4=@param4@
param5=@param5@
if test -n "@preHook@"; then
. @preHook@
fi
For context, this is from a shell script in a commit from 2004 in the Nixpkgs repo; I tried to see whether this may be a reference feature, but the string "shell" only occurs once (in a case-sensitive search) in the entire file.
The answer by Chris Dodd is correct, insofar as there's no intrinsic meaning to the shell -- and @foo@ is thus commonly used as a sigil. Insofar as you encountered this in nixpkgs, it provides some stdenv tools specifically for implementing this pattern.
As documented at https://nixos.org/manual/nixpkgs/stable/#ssec-stdenv-functions, nixpkgs stdenv provides shell functions including substitute, substituteAll, substituteInPlace &c. which will replace @foo@ values with the content of corresponding variables.
In the context of the linked commit, substitutions of that form can be seen being performed in pkgs/build-wrapper/gcc-wrapper/builder.sh:
sed \
-e "s^@gcc@^$src^g" \
-e "s^@out@^$out^g" \
-e "s^@bash@^$SHELL^g" \
-e "s^@shell@^$shell^g" \
< $gccWrapper > $dst
...is replacing @out@ with the value of $out, @bash@ with the value of $SHELL, etc.
The @ symbol has no meaning to the shell -- it is a punctuation character that will pretty much never occur in any actual shell script.
This makes it a good choice to use for patterns in script templates -- the basic idea being that a simple search-and-replace process will be used (perhaps with a sed script as in the link you show) to rewrite the template into an actual shell script. Every string of the form @name@ in the template will be replaced by some other string related to the environment in which the script is being installed.
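As a rough illustration of that search-and-replace step, here is a minimal sketch. The template text, the placeholder names, and the replacement values are all made up for the example; this is not the actual Nixpkgs builder, only the same sed idea with at-sign-delimited placeholders.

```shell
# Hypothetical template; @shell@ and @preHook@ are the placeholders.
template='export SHELL=@shell@
. @preHook@'

# Illustrative values (assumptions, not taken from Nixpkgs).
shell=/bin/sh
preHook=/tmp/pre-hook.sh

# Use ^ as the sed delimiter so slashes in the values need no escaping.
rendered=$(printf '%s\n' "$template" |
  sed -e "s^@shell@^$shell^g" -e "s^@preHook@^$preHook^g")

printf '%s\n' "$rendered"
```

A value that itself contains ^ would break this sketch, so a real tool would need to pick its delimiter more carefully or escape the values.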

Is there any way of programmatically test-compiling Raku code?

You can save code to a file and compile it with raku -c. Is there any way to do the same programmatically, using some public API?
There is an undocumented named parameter :check to EVAL that will prevent the compiled code from actually running. It will return Nil on success. If there are any errors, it will throw like it otherwise also would.
$ raku -e 'q/say "foo"/.EVAL'
foo
$ raku -e 'q/zay "foo"/.EVAL'
===SORRY!=== Error while compiling ...
Undeclared routine:
zay used at line 1. Did you mean 'say'?
$ raku -e 'say q/say "foo"/.EVAL(:check)'
Nil
Perhaps that named parameter should be documented.
EDIT: Please note that this will not prevent BEGIN blocks from being run, as @ugexe pointed out.

Sed or Perl: One file with regex instructions, one instruction per line, executed on another file

I'm setting up a regex learning environment purely in bash/tmux with a pane for the file containing a regex, a pane for a text-file-for-processing, and a pane for the bash shell. I'm at the start of "The Bastards Book of Ruby"-regex chapter.
The 'Bastard's Book' shows an example of a 'negative-lookahead' regex (perfect, lets learn), where perl is recommended over sed. As I'm going for a CLI approach-> Bash command: $ perl -p file_with_regex.pl test.txt
(This prints the lines from test.txt with the intended substitutions)
Question: How would I add a second regex (on a new line) of the regex.pl file, and have perl execute both the first and (next) this second instruction for processing the text file?
# regex.pl
s/^(?!Mr)/Ms./g
s/Ms./Mrs./g
(Adding the second regex results in "Execution of regex.pl aborted due to compilation errors.")
The overall aim here is to progress in Ruby, while testing Regular Expressions as concisely as possible. Picking up a bare minimum of sed/perl while doing so would be a plus, as a proper dive into perl would take time from Ruby (and when it's time for the perl dive, I'll have had some time with the basics). The more I look at this the more it seems necessary to just do it in Ruby, if there isn't a perl switch that would enable a command-line-with-files approach.
The basic answer is that you need a semicolon after each line.
Paraphrased from perlrun, -p reads all lines of input, runs the commands you specified, and then prints out the value in $_ (the implicit variable you're running your substitute commands on in this script).
So, removing the magic, -p transformed your code into:
LINE:
while (<>) {
# regex.pl
s/^(?!Mr)/Ms./g
s/Ms./Mrs./g
} continue {
print or die "-p destination: $!\n";
}
Perl requires a semicolon between statements (but a terminal semicolon at the end of a block is optional) hence the error.
I personally would recommend writing the whole script above into the file instead of using -p because it is far less magical, but you're welcome to do it either way.
If you were going to write the whole script, I would recommend something more like the following:
use strict;
use warnings;
while ( my $line = <ARGV> ) {
$line =~ s/^(?!Mr)/Ms./g;
print "After first subst: $line";
$line =~ s/Ms./Mrs./g;
print "After second subst: $line";
}
use strict and use warnings are the boilerplate you want at the top of any perl script (to catch typos and other common mistakes) and explicitly calling the variable $line gives you a better understanding of how the script is working ($_ is very magical for beginners and the source of many errors IMO, but great when you know what's what).
If you're wondering about <> vs. <ARGV>, they are the same thing and mean "Read through all the lines of files provided as command-line arguments to this script, or standard input if no files are provided".

Substitute output from variable to give another output

I am trying to substitute the output from a variable to produce another output. The variable I have problems with is $apps. It gives me "syntax error: bad substitution".
$appletDir is a directory with desktop shortcuts. The problem is that some shortcuts do not have the same name as the icon (png). So I need to substitute the program name with the name of the png linking to it. I got it working with the commented-out if statement below. If this substitution could work then my script would look better, because I need to add a couple of these.
I want it to look for "general_call" instead of "rtcom-call-ui" when going through the icon folders, because the png is called "general_call". The icon folders are the variables $icoDir64 and $icoDirSca.
for applet in $appletDir*
do
app=`basename $applet | sed -e 's/.*://g' -e 's/.*osso-//g' -e 's/\.desktop.*//g'`
apps="${app/rtcom-call-ui/general_call}"
#if [ "${app}" = "rtcom-call-ui" ]; then
# app="general_call"
#fi
#echo $apps
#done
#exit 0
found=`find ${icoDir64} ${icoDirSca} -name "*.png"`
for file in $found
do
base="`basename ${file}`"
if [ "${base}" = "${app}.png" -o "${base}" = "tasklaunch_${app}.png" -o "${base}" = "general_${app}.png" ]; then
echo "WORKING!!!!!!!!!!!!!!!!!! $file"
fi
done
done
I think you may have a shell version problem (your shell isn't as modern as the notation you are using). A previous incarnation of this post suggested:
apps="${app}/rtcom-call-ui/general_call"
Or, for substituting rtcom-call-ui with general_call, you need to use echo and sed (at least in classic shells - it might be that bash has something built-in to do it):
apps=$(echo "${app}" | sed s/rtcom-call-ui/general_call/)
The notation ${var|continuation} (where | represents an arbitrary punctuation character) is used to modify the value substituted. For example:
apps="${app:-/something/suitable/as/the/default}"
would copy the value of $app, unless $app is not set at all (not relevant here; useful with environment variables) or is an empty string, in which case the default after the :- is used instead.
The error you are getting is because there is no valid substitution that starts with '/' in your version of the shell. This notation seems to be valid in some versions of Bash (including the one I have to play with); I don't know when it was added. But if the shell you are using is complaining about the notation, then clearly it is not correct for the version of the shell you are using.
Depending on the shebang line (#!/bin/sh vs #!/bin/bash), it might work differently. Failing that, the version of Bash on your machine may be too old.
You can check your shell(s) with:
for app in /some/location/rtcom-call-ui/where.png /another/location/nowhere/thing.png
do
apps=${app/rtcom-call-ui/general-call}
echo $app
echo $apps
done

What is the purpose of the : (colon) GNU Bash builtin?

What is the purpose of a command that does nothing, being little more than a comment leader, but is actually a shell builtin in and of itself?
It's slower than inserting a comment into your scripts by about 40% per call, which probably varies greatly depending on the size of the comment. The only possible reasons I can see for it are these:
# poor man's delay function
for ((x=0;x<100000;++x)) ; do : ; done
# inserting comments into string of commands
command ; command ; : we need a comment in here for some reason ; command
# an alias for `true'
while : ; do command ; done
I guess what I'm really looking for is what historical application it might have had.
Historically, Bourne shells didn't have true and false as built-in commands. true was instead simply aliased to :, and false to something like let 0.
: is slightly better than true for portability to ancient Bourne-derived shells. As a simple example, consider having neither the ! pipeline operator nor the || list operator (as was the case for some ancient Bourne shells). This leaves the else clause of the if statement as the only means for branching based on exit status:
if command; then :; else ...; fi
Since if requires a non-empty then clause and comments don't count as non-empty, : serves as a no-op.
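A minimal sketch of that pattern, assuming a made-up directory path and variable name: the branch that matters is the else clause, and : fills the then clause that the grammar requires.

```shell
outcome=unknown
# The path below is a placeholder that is assumed not to exist.
if test -d /nonexistent/example/dir; then
  :   # no-op: nothing to do when the command succeeds
else
  outcome="missing"
fi
echo "$outcome"
```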
Nowadays (that is: in a modern context) you can usually use either : or true. Both are specified by POSIX, and some find true easier to read. However there is one interesting difference: : is a so-called POSIX special built-in, whereas true is a regular built-in.
Special built-ins are required to be built into the shell; Regular built-ins are only "typically" built in, but it isn't strictly guaranteed. There usually shouldn't be a regular program named : with the function of true in PATH of most systems.
Probably the most crucial difference is that with special built-ins, any variable set by the built-in - even in the environment during simple command evaluation - persists after the command completes, as demonstrated here using ksh93:
$ unset x; ( x=hi :; echo "$x" )
hi
$ ( x=hi true; echo "$x" )
$
Note that Zsh ignores this requirement, as does GNU Bash except when operating in POSIX compatibility mode, but all other major "POSIX sh derived" shells observe this including dash, ksh93, and mksh.
Another difference is that regular built-ins must be compatible with exec - demonstrated here using Bash:
$ ( exec : )
-bash: exec: :: not found
$ ( exec true )
$
POSIX also explicitly notes that : may be faster than true, though this is of course an implementation-specific detail.
I use it to easily enable/disable variable commands:
#!/bin/bash
if [[ "$VERBOSE" == "" || "$VERBOSE" == "0" ]]; then
vecho=":" # no "verbose echo"
else
vecho=echo # enable "verbose echo"
fi
$vecho "Verbose echo is ON"
Thus
$ ./vecho
$ VERBOSE=1 ./vecho
Verbose echo is ON
This makes for a clean script. This cannot be done with '#'.
Also,
: >afile
is one of the simplest ways to guarantee that 'afile' exists but is 0 length.
A useful application for : is if you're only interested in using parameter expansions for their side-effects rather than actually passing their result to a command.
In that case, you use the parameter expansion as an argument to either : or false depending upon whether you want an exit status of 0 or 1. An example might be
: "${var:=$1}"
Since : is a builtin, it should be pretty fast.
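To make the side effect visible, here is a small sketch with a hypothetical variable named greeting. The := expansion assigns only when the variable is unset or empty, and : simply discards the expanded word.

```shell
unset greeting
: "${greeting:=hello}"   # assigns, because greeting is unset
greeting_before=$greeting

greeting=custom
: "${greeting:=hello}"   # no effect: greeting already has a value

echo "$greeting_before $greeting"
```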
: can also be used for block comments (similar to /* */ in C). For example, if you want to skip a block of code in your script, you can do this:
: << 'SKIP'
your code block here
SKIP
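A runnable sketch of the same idea, with a made-up counter variable to show that the skipped lines never execute. Quoting the delimiter ('SKIP') keeps the shell from expanding anything inside the here-document.

```shell
count=0
: <<'SKIP'
count=$((count + 1))
echo "this line is skipped"
SKIP
count=$((count + 10))
echo "$count"
```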
Two more uses not mentioned in other answers:
Logging
Take this example script:
set -x
: Logging message here
example_command
The first line, set -x, makes the shell print out the command before running it. It's quite a useful construct. The downside is that the usual echo Log message type of statement now prints the message twice. The colon method gets round that. Note that you'll still have to escape special characters just like you would for echo.
Cron job titles
I've seen it being used in cron jobs, like this:
45 10 * * * : Backup for database ; /opt/backup.sh
This is a cron job that runs the script /opt/backup.sh every day at 10:45. The advantage of this technique is that it makes for better looking email subjects when the /opt/backup.sh prints some output.
It's similar to pass in Python.
One use would be to stub out a function until it gets written:
future_function () { :; }
If you'd like to truncate a file to zero bytes, useful for clearing logs, try this:
:> file.log
You could use it in conjunction with backticks (``) to execute a command without displaying its output, like this:
: `some_command`
Of course you could just do some_command > /dev/null, but the :-version is somewhat shorter.
That being said I wouldn't recommend actually doing that as it would just confuse people. It just came to mind as a possible use-case.
It's also useful for polyglot programs:
#!/usr/bin/env sh
':' //; exec "$(command -v node)" "$0" "$@"
~function(){ ... }
This is now both an executable shell-script and a JavaScript program: meaning ./filename.js, sh filename.js, and node filename.js all work.
(Definitely a little bit of a strange usage, but effective nonetheless.)
Some explication, as requested:
Shell-scripts are evaluated line-by-line; and the exec command, when run, terminates the shell and replaces its process with the resultant command. This means that to the shell, the program looks like this:
#!/usr/bin/env sh
':' //; exec "$(command -v node)" "$0" "$@"
As long as no parameter expansion or aliasing is occurring in the word, any word in a shell-script can be wrapped in quotes without changing its meaning; this means that ':' is equivalent to : (we've only wrapped it in quotes here to achieve the JavaScript semantics described below)
... and as described above, the first command on the first line is a no-op (it translates to : //, or if you prefer to quote the words, ':' '//'. Notice that the // carries no special meaning here, as it does in JavaScript; it's just a meaningless word that's being thrown away.)
Finally, the second command on the first line (after the semicolon), is the real meat of the program: it's the exec call which replaces the shell-script being invoked, with a Node.js process invoked to evaluate the rest of the script.
Meanwhile, the first line, in JavaScript, parses as a string-literal (':'), and then a comment, which is deleted; thus, to JavaScript, the program looks like this:
':'
~function(){ ... }
Since the string-literal is on a line by itself, it is a no-op statement, and is thus stripped from the program; that means that the entire line is removed, leaving only your program-code (in this example, the function(){ ... } body.)
Self-documenting functions
You can also use : to embed documentation in a function.
Assume you have a library script mylib.sh, providing a variety of functions. You could either source the library (. mylib.sh) and call the functions directly after that (lib_function1 arg1 arg2), or avoid cluttering your namespace and invoke the library with a function argument (mylib.sh lib_function1 arg1 arg2).
Wouldn't it be nice if you could also type mylib.sh --help and get a list of available functions and their usage, without having to manually maintain the function list in the help text?
#!/bin/bash
# all "public" functions must start with this prefix
LIB_PREFIX='lib_'
# "public" library functions
lib_function1() {
: This function does something complicated with two arguments.
:
: Parameters:
: ' arg1 - first argument ($1)'
: ' arg2 - second argument'
:
: Result:
: " it's complicated"
# actual function code starts here
}
lib_function2() {
: Function documentation
# function code here
}
# help function
--help() {
echo MyLib v0.0.1
echo
echo Usage: mylib.sh [function_name [args]]
echo
echo Available functions:
declare -f | sed -n -e '/^'$LIB_PREFIX'/,/^}$/{/\(^'$LIB_PREFIX'\)\|\(^[ \t]*:\)/{
s/^\('$LIB_PREFIX'.*\) ()/\n=== \1 ===/;s/^[ \t]*: \?['\''"]\?/ /;s/['\''"]\?;\?$//;p}}'
}
# main code
if [ "${BASH_SOURCE[0]}" = "${0}" ]; then
# the script was executed instead of sourced
# invoke requested function or display help
if [ "$(type -t - "$1" 2>/dev/null)" = function ]; then
"$@"
else
--help
fi
fi
A few comments about the code:
All "public" functions have the same prefix. Only these are meant to be invoked by the user, and to be listed in the help text.
The self-documenting feature relies on the previous point, and uses declare -f to enumerate all available functions, then filters them through sed to only display functions with the appropriate prefix.
It is a good idea to enclose the documentation in single quotes, to prevent undesired expansion and whitespace removal. You'll also need to be careful when using apostrophes/quotes in the text.
You could write code to internalize the library prefix, i.e. the user only has to type mylib.sh function1 and it gets translated internally to lib_function1. This is an exercise left to the reader.
The help function is named "--help". This is a convenient (i.e. lazy) approach that uses the library invoke mechanism to display the help itself, without having to code an extra check for $1. At the same time, it will clutter your namespace if you source the library. If you don't like that, you can either change the name to something like lib_help or actually check the args for --help in the main code and invoke the help function manually.
I saw this usage in a script and thought it was a good substitute for invoking basename within a script.
oldIFS=$IFS
IFS=/
for basetool in $0 ; do : ; done
IFS=$oldIFS
...
this is a replacement for the code: basetool=$(basename $0)
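The same trick on a fixed, made-up path instead of $0, as a sketch: with IFS set to /, the unquoted expansion is split on slashes, the loop body does nothing, and the loop variable is left holding the last component.

```shell
path=/usr/local/bin/mytool   # hypothetical path standing in for $0
oldIFS=$IFS
IFS=/
for basetool in $path; do :; done
IFS=$oldIFS
echo "$basetool"
```

This relies on word splitting of an unquoted variable, so it assumes a Bourne-style shell and a path without glob characters; basename also handles edge cases (such as trailing slashes) that this loop does not.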
Another use, not yet mentioned here, is the initialisation of parameters in infinite while loops. The example below is not the cleanest, but it serves its purpose.
#!/usr/bin/env bash
[ "$1" ] && foo=0 && bar="baz"
while : "${foo=2}" "${bar:=qux}"; do
echo "$foo"
(( foo == 3 )) && echo "$bar" && break
(( foo=foo+1 ))
done

Resources