Is ./*/ portable? - shell

I often use ./*/ in a for loop like
for d in ./*/; do
: # do something with dirs
done
to match all non-hidden directories in the current working directory, but I'm not sure whether this is portable. I have bash, dash and ksh installed on my system and it works in all of them, but since the POSIX spec doesn't say anything about it (or says it only implicitly, and I missed it), I don't think I can rely on it. I also checked the POSIX bug reports, to no avail; there's no mention of it there either.
Is its behaviour implementation or filesystem dependent? Am I missing something here? How do I know if it's portable or not?

Short answer: yes.
Long answer:
The POSIX standard (from The Open Group) states that a / in the pattern matches only slashes in the expanded file name. Since Unix/Linux does not allow / within a file name, I believe this is a safe assumption on Unix/Linux systems.
From the text quoted below, it seems that even on systems that do allow / in a file name, the POSIX standard requires that the / in the pattern not match such a file.
On Windows, it looks like / is not allowed in a file name either, but I'm not an expert on Windows.
From Shell Programming Language § Patterns Used for Filename Expansion:
The slash character in a pathname shall be explicitly matched by using one or more slashes in the pattern; it shall neither be matched by the asterisk or question-mark special characters nor by a bracket expression. Slashes in the pattern shall be identified before bracket expressions; thus, a slash cannot be included in a pattern bracket expression used for filename expansion.
...
Additional Note - clarifying pathname:
The pathname is defined in 4.13, with explicit reference to pathname with trailing slash in General Concepts § Pathname Resolution.
A pathname that contains at least one non-<slash> character and that ends with one or more trailing <slash> characters shall not be resolved successfully unless the last pathname component before the trailing <slash> characters names an existing directory or a directory entry that is to be created for a directory immediately after the pathname is resolved. Interfaces using pathname resolution may specify additional constraints when a pathname that does not name an existing directory contains at least one non-<slash> character and contains one or more trailing <slash> characters.
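As a minimal sketch of the loop from the question, the following builds a scratch directory and collects what ./*/ expands to. The entry names (alpha, beta, .hidden, file.txt) are made up for the demo, and mktemp is assumed to be available although it is not specified by POSIX. The -d guard covers the case where nothing matches and the pattern is left unexpanded.

```shell
# Scratch directory with two visible directories, one hidden directory
# and one regular file (all names are hypothetical).
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/alpha" "$tmp/beta" "$tmp/.hidden"
: > "$tmp/file.txt"

dirs=$(
    cd "$tmp" || exit 1
    for d in ./*/; do
        # If no directory matches, the loop sees the literal string './*/',
        # so skip anything that is not actually a directory.
        [ -d "$d" ] || continue
        printf '%s ' "$d"
    done
)
rm -rf "$tmp"

# Only the non-hidden directories are expected to be listed.
printf '%s\n' "$dirs"
```

Note that a symbolic link pointing to a directory also satisfies the trailing slash, so it would be visited by this loop as well.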

Related

what does $(1:D=) mean?

I'm reading the Jamrule file of a project to understand how it builds, but there are some expressions I can't understand, such as $(1:D=), $(1:S=$(sample)) or $(1:G=$(sample)).
What do they mean?
I searched for the meaning of the colon and the equals sign in shell scripts, but I couldn't find a case where a letter sits between them.
Example: local _s = $(1:D=) ;
$(1) expands the first argument of a rule. $(1:D=foo) applies a modifier that replaces the directory portion of the expanded elements (dirname, if you think in shell terms) with the string foo. The special case $(1:D=) removes the directory portion. The modifier S refers to the suffix (aka extension) of the file name, G to the "grist" of a jam target name.
Please refer to the Variable Expansion section of the Perforce Jam documentation for a complete list. I can recommend reading the complete Jam documentation to understand the specific concepts (like grist).
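Since the answer compares :D to dirname, here is a rough shell analogy using parameter expansion. This is not Jam syntax, the path is a made-up example, and the mapping is only an approximation of the D and S modifiers described above; grist (G) has no shell counterpart.

```shell
# Hypothetical path standing in for one element of $(1).
path=src/lib/foo.c

no_dir=${path##*/}        # roughly $(1:D=):    drop the directory portion
new_dir=out/${path##*/}   # roughly $(1:D=out): replace the directory portion
new_suffix=${path%.*}.o   # roughly $(1:S=.o):  replace the suffix

printf '%s\n' "$no_dir" "$new_dir" "$new_suffix"
```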

What does the path "//" mean?

I just found the directory // on my machine and now I am wondering what it means.
user@dev:~$ cd /
user@dev:/$ pwd
/
user@dev:/$ cd //
user@dev://$ pwd
//
It is obviously the root directory, but when and why would I use the double slash instead of the single slash?
Is it related to the escaped path strings which I use while programming?
For example:
string path = "//home//user//foo.file"
I also tried it with zsh, but there it changes to the usual root directory /. So I think it's bash-specific.
This is part of the specification for Pathname Resolution:
A pathname consisting of a single <slash> shall resolve to the root directory of the process. A null pathname shall not be successfully resolved. If a pathname begins with two successive <slash> characters, the first component following the leading <slash> characters may be interpreted in an implementation-defined manner, although more than two leading <slash> characters shall be treated as a single <slash> character.
So your shell is just following the specification and leaving // alone, since an implementation is allowed to define it as something other than /.
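A small sketch to see the rule in practice: pwd -P asks for the physical directory rather than the shell's remembered spelling of the path. Three leading slashes have to behave like one. What two leading slashes resolve to is implementation-defined, so that result is printed without any assumption about it.

```shell
# More than two leading slashes shall be treated as a single slash.
three=$(cd /// && pwd -P)

# Exactly two leading slashes: implementation-defined. On many Unix-like
# systems this is also the root directory, but that is not guaranteed.
two=$(cd // && pwd -P)

printf 'cd ///  resolves to %s\n' "$three"
printf 'cd //   resolves to %s\n' "$two"
```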

Why would I not leave extglob enabled in bash?

I just found out about the bash extglob shell option here:
How can I use inverse or negative wildcards when pattern matching in a unix/linux shell?
All the answers that used shopt -s extglob also mentioned shopt -u extglob to turn it off.
Why would I want to turn something so useful off? Indeed why isn't it on by default?
Presumably it has the potential for giving some nasty surprises.
What are they?
No nasty surprises -- default-off behavior is only there for compatibility with traditional, standards-compliant pattern syntax.
Which is to say: It's possible (albeit unlikely) that someone writing fo+(o).* actually intended the + and the parenthesis to be treated as literal parts of the pattern matched by their code. For bash to interpret this expression in a different manner than what the POSIX sh specification calls for would be to break compatibility, which is right now done by default in very few cases (echo -e with xpg_echo unset being the only one that comes immediately to mind).
This is different from the usual case where bash extensions are extending behavior undefined by the POSIX standard -- cases where a baseline POSIX shell would typically throw an error, but bash instead offers some new and different explicitly documented behavior -- because the need to treat these characters as matching themselves is defined by POSIX.
To quote the relevant part of the specification, with emphasis added:
An ordinary character is a pattern that shall match itself. It can be any character in the supported character set except for NUL, those special shell characters in Quoting that require quoting, and the following three special pattern characters. Matching shall be based on the bit pattern used for encoding the character, not on the graphic representation of the character. If any character (ordinary, shell special, or pattern special) is quoted, that pattern shall match the character itself. The shell special characters always require quoting.
When unquoted and outside a bracket expression, the following three characters shall have special meaning in the specification of patterns:
? - A question-mark is a pattern that shall match any character.
* - An asterisk is a pattern that shall match multiple characters, as described in Patterns Matching Multiple Characters.
[ - The open bracket shall introduce a pattern bracket expression.
Thus, the standard explicitly requires any non-NUL character other than ?, * or [ or those listed elsewhere as requiring quoting to match themselves. Bash's behavior of having extglob off by default allows it to conform with this standard in its default configuration.
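To illustrate, here is a minimal sketch of how fo+(o).* is matched under the POSIX rules. The pattern is kept in a variable so that the parentheses need no quoting, and the helper name matches is made up for this example. In a POSIX sh, or in bash with extglob off, the + and the parentheses should match themselves.

```shell
pat='fo+(o).*'

# matches STRING PATTERN: prints yes if STRING matches PATTERN, else no.
matches() {
    case $1 in
        $2) echo yes ;;   # $2 is deliberately unquoted so it acts as a pattern
        *)  echo no ;;
    esac
}

literal_match=$(matches 'fo+(o).txt' "$pat")   # name containing a literal +(o)
repeat_match=$(matches 'fooo.txt' "$pat")      # name with repeated o characters

printf 'fo+(o).txt: %s\n' "$literal_match"
printf 'fooo.txt:   %s\n' "$repeat_match"
```

Running the same lines in bash after shopt -s extglob should flip both answers, because +(o) is then read as "one or more o".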
However, for your own scripts and your own interactive shell, unless you're making a habit of running code written for POSIX sh with unusual patterns included, enabling extglob is typically worth doing.
Being a Kornshell person, I have extglob on in my .bashrc by default because that's the way it is in Kornshell, and I use it a lot.
For example:
$ find !(target) -name "*.xml"
In Kornshell, this is no problem. In BASH, I need to set extglob. I also set lithist and set -o vi. This lets me use vi commands on my shell history, and when I hit v to edit a command, a multi-line command shows up as separate lines instead of being joined with semicolons.
Without lithist set:
for i in *; do echo "I see $i"; done
With lithist set:
for i in *
do
echo "I see $i"
done
Now, if only BASH had the print statement, I'd be all set.

why doesn't *.abc match a file named .abc?

I thought I understood wildcards, till this happened to me. Essentially, I'm looking for a wild card pattern that would return all files that are not named .gitignore. I came up with this, which seems to work for all cases I could conjure:
ls *[!{gitignore}]
To really validate if this works, I thought I'd negate the expression and see if it returns the file named .gitignore (actually any file that ended with gitignore; so 1.gitignore should also be returned). To that effect, I thought the negated expression would be:
ls *[{gitignore}]
However, this expression doesn't return a file named .gitignore (although it does return a file named 1.gitignore).
Essentially, my question, after simplification, boils down to:
Why doesn't *.abc match a file that is named .abc?
I think I can take it from there.
PS:
I am working on Mac OSX Lion (10.7.4)
I wanted to add a clause to .gitignore such that I would ignore every file, except .gitignore in a given folder. So I ended up adding * in the .gitignore file. Result was, git ended up ignoring .gitignore :)
From the numerous searches I've made on google - Use the asterisk character (*) to represent zero or more characters.
I assume you're using Bash. From the Bash manual:
When a pattern is used for filename expansion, the character ‘.’ at the start of a filename or immediately following a slash must be matched explicitly, unless the shell option dotglob is set.
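The rule quoted above can be checked with a short sketch. It creates .abc and 1.abc in a scratch directory (mktemp is assumed to be available) and compares a pattern that starts with a wildcard against one that matches the leading dot explicitly.

```shell
tmp=$(mktemp -d) || exit 1
: > "$tmp/.abc"
: > "$tmp/1.abc"

# The leading * does not match the dot at the start of .abc.
star=$(cd "$tmp" && printf '%s ' *.abc)

# Matching the leading dot explicitly finds the hidden file.
explicit=$(cd "$tmp" && printf '%s ' .*abc)

rm -rf "$tmp"

printf '*.abc  expands to: %s\n' "$star"
printf '.*abc  expands to: %s\n' "$explicit"
```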
.gitignore patterns, however, are treated differently:
Otherwise, git treats the pattern as a shell glob suitable for consumption by fnmatch(3) with the FNM_PATHNAME flag: wildcards in the pattern will not match a / in the pathname.
According to the fnmatch(3) docs, a leading dot has to be explicitly matched only if the FNM_PERIOD flag is set, so *gitignore as a gitignore pattern would match .gitignore.
There is an easier way to accomplish this, though. To have .gitignore ignore everything except .gitignore:
*
!.gitignore
If you want to ignore everything except the gitignore file, use this as the file:
*
!.gitignore
Lines starting with an exclamation point are interpreted as exceptions.

Incorrect @INC in ActiveState Perl in Windows

I am using ActiveState perl with Komodo Edit.
I am getting the following error.
Can't locate MyGengo.pm in @INC (@INC contains: C:/Perl/site/lib C:/Perl/lib .)
at D:\oDesk\MyGengo Integration\sample line 6.
Why is the interpreter looking in C:/Perl/lib instead of C:\Perl\lib?
Doesn’t it know that it is Windows and not Linux?
EDIT
I resolved the problem by copying the .pm file in C:\Perl\lib directory. I think, the issue happened since this module was manually downloaded. PPM install would copy the .pm file to the lib directory.
As far as Windows is concerned, C:/Perl/lib and C:\Perl\lib are the same directory.
The perlport documentation notes (emphasis added)
DOS and Derivatives
Perl has long been ported to Intel-style microcomputers running under systems like PC-DOS, MS-DOS, OS/2, and most Windows platforms you can bring yourself to mention (except for Windows CE, if you count that). Users familiar with COMMAND.COM or CMD.EXE style shells should be aware that each of these file specifications may have subtle differences:
my $filespec0 = "c:/foo/bar/file.txt";
my $filespec1 = "c:\\foo\\bar\\file.txt";
my $filespec2 = 'c:\foo\bar\file.txt';
my $filespec3 = 'c:\\foo\\bar\\file.txt';
System calls accept either / or \ as the path separator. However, many command-line utilities of DOS vintage treat / as the option prefix, so may get confused by filenames containing /. Aside from calling any external programs, / will work just fine, and probably better, as it is more consistent with popular usage, and avoids the problem of remembering what to backwhack and what not to.
Your comment shows that you’re using mygengo-perl-new but have it installed in C:\Perl\lib\MyGengo\mygengo-api\nheinric-mygengo-perl-new-ce194df\mygengo. This is an unusual location to install the module. The way the module is written, it expects mygengo.pm to be in one of the directories named in @INC. Client code then pulls it in with
use mygengo;
My suggestion is to move mygengo.pm from C:\Perl\lib\MyGengo\mygengo-api\nheinric-mygengo-perl-new-ce194df\mygengo to C:\Perl\site\lib.
As an alternative if you are using mygengo as part of another package that you’re developing, you could drop mygengo in your source tree, perhaps as a git submodule. Don’t forget to add use lib 'mygengo'; if you do it this way.
For full details, read about the @INC search process in the perlfunc documentation on require and the extra semantics for modules via use.
General advice on slashes versus backslashes
Even if your code will run on Windows only, prefer using forward-slash as the separator in hardcoded paths. Backslash is an escape character in the Perl language, so you have to think more carefully about it. In double-quoted strings, you have to remember to escape the escape character to get its ordinary meaning, e.g.,
# my $dir = "C:\Perl\lib"; # oops, $dir would be 'C:Perlib'
$dir = "C:\\Perl\\lib";
The situation can be a little nicer inside single-quoted strings. Setting $dir as in
$dir = 'C:\Perl\lib';
does what you expect, but say you want $dir to have a trailing slash.
$dir = 'C:\Perl\lib\';
Now you have a syntax error.
Can't find string terminator "'" anywhere before EOF at dirstuff line n.
You may want to interpolate another value into $dir.
$dir = 'C:\Perl\lib\$module'; # nope
Oh yeah, you need double-quotes for interpolation.
$dir = "C:\Perl\lib\$module"; # still not right
After headscratching and debugging
$dir = "C:\\Perl\\lib\\$module"; # finally
Backslash is therefore more mistake-prone and a maintenance irritant. Forward slash is an ordinary character inside both single- and double-quoted strings, so it almost always means what you expect.
As perlport notes, the Windows command shell treats forward slash as introducing options and backslash as path separators. If you cannot avoid the shell, then you may be forced to deal with backslashes.
