Unusual path joining in python's os library - windows

I am trying this in Windows 8.1, with python v.3.4. The os.path module has a join method which, according to the documentation, is a safe way to join fragments of files or folders without mixing up back and forward slashes. In the code snippet below I am trying to join a file and a folder:-
>>> import os
>>> photo = r"\camera\picnic.jpg"
>>> folder = os.getcwd()
>>> print(folder)
C:\Users\Renae
>>> path = os.path.join(folder, photo)
>>> print(path)
C:\camera\picnic.jpg
And boom goes the dynamite. I was expecting path to be C:\Users\Renae\camera\picnic.jpg. I've tried removing the r in front of photo, with no change. I've also tried forward slashes, even though Windows uses backslashes, which made it worse: the result was a mix of back and forward slashes. If I remember correctly, this was not a problem on Linux.

Try removing the initial slash.
I can't speak for Windows because it's been a long time since I worked on it, but on *nix systems, starting a path with a slash signifies the root of the file system. I'm guessing that the implementation in Python (and possibly other languages) uses this convention on Windows as well. I don't have a Windows box to verify this on, though.
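A minimal sketch of the behaviour discussed above. It uses ntpath, the Windows implementation behind os.path, so it can be run on any platform. The variable names rooted and relative are illustrative, and stripping the leading separator with lstrip is just one way to apply the suggestion.

```python
import ntpath  # Windows flavour of os.path; importable on any platform

folder = r"C:\Users\Renae"
photo = r"\camera\picnic.jpg"

# A component that starts with a separator is treated as rooted, so the
# directory part of `folder` is discarded and only its drive is kept.
rooted = ntpath.join(folder, photo)

# Stripping the leading separator makes the component relative,
# so it is appended to `folder` instead.
relative = ntpath.join(folder, photo.lstrip("\\/"))

print(rooted)
print(relative)
```

On Windows itself, os.path.join should behave the same way, since os.path is ntpath there.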

Related

Retrieving a long path from "\Device\HarddiskVolume1\Progra~1"

I'm writing a ProcessExplorer-like tool that shows all open files in the system.
Calling the NtQueryObject(,ObjectNameInformation,,,) gives me a path that looks like \Device\HarddiskVolume1\Users\ADMINI~1\AppData\Local\Temp\FXSAPIDebugLogFile.txt, so it is an NT device path that sometimes has shortened segments (ADMINI~1 in this case).
I don't really need to convert this path to a "standard" one (like C:\Users), but I need to expand it to a long form, so it should look like \Device\HarddiskVolume1\Users\Administrator\AppData\Local\Temp\FXSAPIDebugLogFile.txt. Besides, converting a device path to a standard one won't even be possible if the volume doesn't have a mountpoint.
I need a solution that works from Windows XP onwards.
Unfortunately, the most obvious method, calling GetLongPathNameW, doesn't work with a path like this: it returns ERROR_INVALID_NAME.
I tried different path prefixes: \\?\, \\.\, \??\, \\.\GLOBALROOT\. None of them helped.
FindFirstFileW also doesn't accept a path like this on Vista.
It's strange, because the CreateFile function works with a \Device\HarddiskVolume path just fine.
I also found that if I remove the Device word from the path, making it \\.\HarddiskVolume1\Users\ADMINI~1\AppData\Local\Temp\FXSAPIDebugLogFile.txt, GetLongPathNameW actually returns a correct long path. Unfortunately, this doesn't work on Windows XP/Vista.
I'd like to know if there is another method that will work on Windows XP too.

Porting MATLAB code from Windows to Mac

I have just switched from a PC to a Mac, and I am finding that lots of my MATLAB code previously written when I had a PC does not work on my Mac! I have been working on MATLAB for a while now, but I am not an expert yet.
After searching around for differences between PC and Mac, I noted that a few things do differ, but I'd love to hear whether I need to go through all my existing MATLAB code and update it manually to make it work on my Mac.
Please let me know what best to do here.
Example:
clear all
cd 'c:\users\sss\Desktop\MATLAB\project\DataFile\'
load data
cd ..
Why doesn't this work? Is it because of the backslash required for MATLAB on a Mac?
Of course, if you try to access a Windows-style path on a Mac, it will error.
MATLAB includes a set of functions that make it fairly easy to make your code cross-platform with respect to these sorts of issues. Take a look at, for example, the functions fullfile, fileparts, filesep, pathsep, ispc, and ismac.
I'm afraid that for the moment, you'll probably need to recode things to be either Mac-specific or to be cross-platform using the functions above.
One way is to set a path variable (or variables) that determines where your data is held. You can even use computer, or ismac and ispc, to automatically switch to the correct version:
if ispc
    dpath = 'c:\users\sss\Desktop\MATLAB\project\DataFile\';
elseif ismac
    dpath = '/Users/sss/MATLAB/project/DataFile/';
end
load(fullfile(dpath, 'data.mat'));
If you have multiple files in subdirectories of /MATLAB/project/, you can set a project directory (similarly to matlabroot but pointing at where your files for that project are kept), and then use fullfile to select the correct subdirectory.
e.g. given a directory in proot that points to wherever /MATLAB/project/ is on the appropriate computer, these produce filenames which are in /MATLAB/project/data and /MATLAB/project/output respectively:
datain = fullfile(proot, 'data','data.mat');
dataout = fullfile(proot,'output','output.mat');

Colon (:) appears as forward slash (/) when creating file name

I am using date and time to label a new file that I'm creating, but when I view the file, the colon is a forward slash. I am developing on a Mac using 10.7+
Here is the code I'm using:
File.open("#{time.hour} : 00, #{time.month}-#{time.day}-#{time.year}", "a") do |mFile|
  mFile.syswrite("#{pKey} - #{tKey}: \n")
  mFile.syswrite("Items closed: #{itemsClosed} | Total items: #{totalItems} | Percent closed: % #{pClosed} \n")
  mFile.syswrite("\n")
  mFile.close
end
Here is the output (assuming the time is 1pm):
13 / 00, 11-8-2012
Why is this happening and how can I fix it? I want the output to be:
13:00, 11-8-2012
Once upon a time, before Mac OS X, : was the directory separator instead of /. Apparently OS X 10.7 is still trying to fix up programs like that. I don't know how you can fix this, if you really need the : to be there. I'd omit it :-).
EDIT: After a bit more searching this USENIX paper describes what is going on. The rule they use apparently is this:
Another obvious problem is the different path separators between HFS+ (colon, ':') and UFS
(slash, '/'). This also means that HFS+ file names may contain the slash character and not
colons, while the opposite is true for UFS file names. This was easy to address, though it
involves transforming strings back and forth. The HFS+ implementation in the kernel's VFS
layer converts colon to slash and vice versa when reading from and writing to the on-disk
format. So on disk the separator is a colon, but at the VFS layer (and therefore anything
above it and the kernel, such as libc) it's a slash. However, the traditional Mac OS
toolkits expect colons, so above the BSD layer, the core Carbon toolkit does yet another
translation. The result is that Carbon applications see colons, and everyone else sees
slashes. This can create a user-visible schizophrenia in the rare cases of file names
containing colon characters, which appear to Carbon applications as slash characters, but
to BSD programs and Cocoa applications as colons.
While OS X "is" a unix operating system, it also derives quite a bit of its code, APIs, standards, etc. from Mac OS 9. In unix, file paths have "/" separating the elements and ":" is allowed in the names of individual files and directories. In Mac OS 9, it was the other way around: file paths had ":" between elements and "/" was allowed in individual filenames. When Apple developed OS X, they wound up having to support some APIs that used unix-style file paths, and some APIs that used OS 9-style paths, and they had to both be able to work on the same filesystem.
What they did is to swap delimiters and allowed characters depending on context. If you write (/run) a program that uses unix APIs to access the filesystem, you'll see files with colons in their names and slashes separating path elements. If you write (/run) a program that uses the old OS 9 APIs (or their derivatives), you'll see files with slashes in their names and colons separating path elements. See Apple's developer Q&A #1392 and notes on specifying paths in AppleScript for a bit more discussion.
(There are some other differences as well. A unix path is absolute if it starts with the delimiter ("/"), and absolute paths start at the top of the root volume. An OS 9 path is absolute if it doesn't start with a delimiter, and absolute OS 9 paths start with a volume name. Thus, the unix path "/tmp/foo:bar" is equivalent to the OS 9 path "Macintosh HD:tmp:foo/bar".)
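The translation described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Apple's implementation: the function name posix_to_os9 and its root_volume parameter are hypothetical, and the sketch only handles absolute unix paths on the root volume.

```python
def posix_to_os9(posix_path, root_volume="Macintosh HD"):
    """Translate an absolute unix-style path into an OS 9-style path.

    Hypothetical helper: it assumes the path lives on the root volume,
    whose name is passed in as `root_volume`.
    """
    if not posix_path.startswith("/"):
        raise ValueError("only absolute unix paths are handled in this sketch")
    # Split on the unix delimiter, dropping empty pieces from repeated slashes.
    elements = [e for e in posix_path.split("/") if e]
    # Inside each element, a colon on the unix side shows up as a slash
    # on the OS 9 side.
    swapped = [e.replace(":", "/") for e in elements]
    # OS 9 absolute paths start with the volume name and use ':' as delimiter.
    return ":".join([root_volume] + swapped)


converted = posix_to_os9("/tmp/foo:bar")
print(converted)
```

For the example from the text, converted is "Macintosh HD:tmp:foo/bar".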
So, which character is really in the filename, a slash or a colon? Well, a filename is a rather abstract thing, but if you're asking about the bytes that're actually stored on the disk... if it's on an HFS+ (aka Mac OS Extended) volume, it's being stored in a filesystem that was designed to work with the OS 9 (well, technically Mac OS 8.1) APIs, so it allows slashes but forbids colons, so on an HFS+ volume the file will "really" have a slash in the name. OTOH if you store the file on a unixish volume, it'll be stored using the unix convention, and "really" have a colon in the name. But the difference doesn't really matter unless you're reading raw bytes off the disk or writing a filesystem driver...
Finally, why does the Finder display the controversial filename character as slash rather than colon? I'm pretty sure it's mostly inertia. The Finder isn't even entirely consistent about this, since if you use its Go To Folder option (Command-Shift-G) and type in "/Users/Shared", it treats that as a unix path. If you type in "Macintosh HD:Users:Shared", it has no idea what you're talking about. Furthermore, if you run touch /tmp/foo:bar, then try to get to it with Go To Folder:
Entering "/tmp/foo:bar" works.
Entering "/tmp/fo" then pressing tab autocompletes it to "/tmp/foo/bar/", which works.
Entering "/tmp/foo/bar/" fails, even though it's exactly the same as the autocomplete.
Entering "/tmp/foo" then pressing tab autocompletes to "/tmp/foo/", which cannot be autocompleted any further and doesn't work at all.
Update: as Konrad Rudolph pointed out, the Go To Folder behavior has changed as of El Capitan, and there's no longer any way to use it to get to folders containing the controversial character.
To avoid as many problems as possible when dealing with File names, paths, and various OSes, you really should take advantage of the built-in File methods, like join, dirname, basename, extname, and split. They try to avoid system dependencies and try to give you a programmatic way to generate valid filenames cross-platform.
This problem was a lot worse back when Apple used the old Macintosh operating system. The move to Mac OS X helped, because they dropped : as a separator. However, people who were building filenames manually found their code breaking because it generated the wrong delimiters, whereas code that took advantage of the libraries handled the change.
Because this particular problem isn't a bug, and is in Apple's control rather than Ruby's, I'd say it's not a Ruby problem at all. It's a visualization issue, and if you want the filename to resemble what the Finder displays, code accordingly.
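As a sketch of the "omit the colon" suggestion, shown here in Python rather than Ruby: the function name report_name and the "h" placeholder are assumptions chosen for illustration, not anything from the question. Any character other than ":" and "/" would do.

```python
from datetime import datetime


def report_name(t):
    """Build a label like '13h00, 11-8-2012' with neither ':' nor '/' in it."""
    # 'h' stands in for the colon so the name looks the same in every tool.
    return f"{t.hour}h00, {t.month}-{t.day}-{t.year}"


name = report_name(datetime(2012, 11, 8, 13, 0))
print(name)
```

Because the name contains neither of the two swapped characters, the Finder and the shell should show it identically.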

Django with Windows: Double backslash or front slash?

So, I'm a beginner trying to learn Django right now with "Practical Django Projects", on Windows in Eclipse PyDev.
Anyway, the main issue is that I use Windows, and the comments in settings.py seem to suggest I should use forward slashes. But by default, the database name is already set to:
'NAME': 'C:\\Users\\dtc\\Documents\\Eclipse\\cms\\sqlite.db'
And while I was going through the book, it wants me to add this:
url(r'^tiny_mce/(?P<path>.*)$', 'django.views.static.serve',
{ 'document_root': '/path/to/tiny_mce/' }),
But that path didn't work until I changed it to double backslashes (\\path\\to\\...), so I'm thinking I should just not worry about using double backslashes.
It would be great if someone gave me a little insight on this because it's giving me a total headache while trying to learn django.
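Use Python to get the current directory and call join on it with whatever you need to add. This makes it cross-platform, because os.path.join inserts the correct separator for whichever platform the code runs on.
import os
# Absolute directory containing this settings file.
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
TEMPLATE_DIRS = (
    # The trailing comma keeps this a one-element tuple rather than a plain string.
    os.path.join(CURRENT_DIR, 'templates'),
)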
That will save you on typing and the path will be correct. If you look in settings.py that django generates it will tell you to always use / even on Windows.
# Put strings here, like "/home/html/django_templates" or "C:/www/django/templates".
# Always use forward slashes, even on Windows.
# Don't forget to use absolute paths, not relative paths.
One last thing, your urls should use forward slashes because that is how django is going to use them.
Hope that helps
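To illustrate why both spellings in the question can refer to the same file, here is a small sketch. It uses ntpath (the Windows implementation behind os.path) so it runs on any platform. The forward-slash version of the database path is an assumption made for comparison and does not appear in the question.

```python
import ntpath  # Windows path rules, importable on any platform

# A doubled backslash in a normal string and a single backslash in a raw
# string spell exactly the same characters.
escaped = 'C:\\Users\\dtc\\Documents\\Eclipse\\cms\\sqlite.db'
raw = r'C:\Users\dtc\Documents\Eclipse\cms\sqlite.db'
same_string = escaped == raw

# Forward slashes name the same location; under Windows rules,
# normpath rewrites them as backslashes.
forward = 'C:/Users/dtc/Documents/Eclipse/cms/sqlite.db'
normalized = ntpath.normpath(forward)

print(same_string)
print(normalized == raw)
```

The double backslash is only Python's string escaping, so the forward-slash form recommended by the settings.py comments avoids the escaping question altogether.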

Are there any invalid linux filenames?

If I wanted to create a string which is guaranteed not to represent a filename, I could put one of the following characters in it on Windows:
\ / : * ? | < >
e.g.
this-is-a-filename.png
?this-is-not.png
Is there any way to identify a string as 'not possibly a file' on Linux?
There are almost no restrictions - apart from '/' and '\0', you're allowed to use anything. However, some people think it's not a good idea to allow this much flexibility.
An empty string is the only truly invalid path name on Linux, which may work for you if you need only one invalid name. You could also use a string like "///foo", which would not be a canonical path name, although it could refer to a file ("/foo"). Another possibility would be something like "/dev/null/foo", since /dev/null has a POSIX-defined non-directory meaning. If you only need strings that could not refer to a regular file you could use "/" or ".", since those are always directories.
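The rules from the two answers above can be written as a small check. This is a sketch for a single filename (one path component), not a full path. The function name can_be_linux_filename is hypothetical, and it deliberately ignores length limits and filesystem-specific restrictions.

```python
def can_be_linux_filename(name: str) -> bool:
    """Return True if `name` could be a single path component on Linux.

    Only the rules mentioned above are checked: the name must be non-empty
    and must not contain '/' or the NUL character.
    """
    return name != "" and "/" not in name and "\0" not in name


print(can_be_linux_filename("this-is-a-filename.png"))
print(can_be_linux_filename("?this-is-not.png"))
print(can_be_linux_filename(""))
```

The string that is "not a filename" on Windows, "?this-is-not.png", passes this check, while the empty string and anything containing "/" or NUL do not.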
Technically they're not invalid, but files with a dash (-) at the beginning of their name will cause you a lot of trouble, because the name conflicts with command-line arguments.
I personally find that a lot of the time the problem is not Linux but the applications one is using on Linux.
Take Amarok, for example. Recently I noticed that certain artists I had copied from my Windows machine were not appearing in the library. I checked and confirmed that the files were there, and then I noticed that certain characters in the folder names (named for the artist) were represented with a weird-looking square rather than an actual character.
In a shell terminal the filenames look even stranger: /Music/Albums/Einst$'\374'rzende\ Neubauten is an example of how strange.
While these files were definitely there, Amarok could not see them for some reason. I was able to use some shell trickery to rename them to sane versions which I could then re-name with ASCII-only characters using Musicbrainz Picard. Unfortunately, Picard was also unable to open the files until I renamed them, hence the need for a shell script.
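The $'\374' in the shell output is octal 374, i.e. byte 0xFC, which is "ü" in Latin-1. A likely explanation, offered here as an assumption rather than a diagnosis, is that the folder names were written in a legacy single-byte encoding while the Linux tools expected UTF-8. A short Python sketch of that mismatch:

```python
# The folder name as raw bytes, with the byte the shell printed as \374.
raw_name = b"Einst\xfcrzende Neubauten"

# A lone 0xFC byte is not valid UTF-8, so a UTF-8 decode fails.
try:
    raw_name.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Interpreted as Latin-1, the same bytes give the intended name.
fixed = raw_name.decode("latin-1")

print(valid_utf8)
print(fixed)
```

Re-encoding fixed as UTF-8 would give a name that UTF-8-aware applications can display, which is one alternative to falling back to ASCII-only names.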
Overall this is a tricky area, and it seems to get very thorny if you are trying to synchronise a music collection between Windows and Linux where certain folder or file names contain funky characters.
The safest thing to do is stick to ASCII-only filenames.
