I think this is related to iCloud Drive in some way, but I have not investigated.
I'm trying to track down an issue with Perl's rename on macOS on the Apple File System (APFS). I've been able to replicate this with Perls back to at least 5.12.3, though all of mine are compiled with Apple LLVM version 9.1.0 (clang-902.0.39.1). Those same Perls do not have this problem on FAT or HFS+ filesystems, and I haven't noticed the problem anywhere else.
Run it the first time: I end up with Changes and Changes.bak. That's exactly what I expected.
Run it again: I end up with Changes and a file named Changes 3. There is no Changes.bak. This is odd.
Run it a third time: I end up with Changes, Changes.bak, and Changes 3.
Run it a fourth time: I end up with Changes, Changes 3, and Changes 4. Again, there's no Changes.bak.
If I remove the print line, I can't get this to happen ("Doctor, it hurts when I move my arm like this").
I re-ordered the file handle opens and closes, but that didn't seem to fix anything.
I figure there's something happening at the filesystem level. So I really have two questions:
Is this a bug, and at what level? Is rename not guaranteed to finish whatever it needs to do before I start messing with file handles?
I want to read the old file and create a new one that inserts some data in the middle: copy the header, insert the new lines, then output all the old lines into the new file. I could write to a temp file and move that into place later, but am I doing anything else stupid?
If you can reproduce this behavior but don't know why, leave a comment. Maybe there's something else odd about my system.
use strict;
use warnings;

my $changes = "Changes";
my $bak     = $changes . ".bak";

rename $changes, $bak or die "Could not backup $changes. $!\n";

open my $in,  '<', $bak     or die "Could not read old $changes file! $!\n";
open my $out, '>', $changes or die "Could not write new $changes file! $!\n";

# comment out this print line and there's no problem
print {$out} 'Hello';

close $out;
close $in;
I know this is an old question, but this might have been related to a bug in Apple's filesystem handling from about a year ago. We ran into problems with file metadata (mtime?) not getting set correctly in certain situations.
When you run into problems like this with Perl, Python, Node, etc., try doing the same operation in a different language to see if the same behavior obtains. If it does, it's likely an OS bug (these scripting languages are often thin wrappers around the C libraries anyway).
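For example, here is a rough Python equivalent of the script above (an untested sketch; the file names mirror the Perl version). If the stray "Changes 3" files still show up, the problem lives below Perl.

#!/usr/bin/env python
# Mirror the Perl repro: rename, reopen both files, write, close.
import os

changes = "Changes"
bak = changes + ".bak"

os.rename(changes, bak)
# infile is opened only to mirror the Perl script; it is never read
with open(bak) as infile, open(changes, "w") as outfile:
    outfile.write("Hello")  # the write that seems to trigger the problem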
Cheers.
I accidentally ran this command while trying to remove an errant directory named \\ from my project directory. Quite a mistake, I know. It pretty quickly began hitting permissioned files, at which point I realized my mistake, so I Ctrl-C'ed out of there. I have all my important projects backed up, but the command killed my development environment. Opening vim anywhere crashes and throws a segfault like so:
Vim: Caught deadly signal SEGV
Error detected while processing function <SNR>130_PollServerReady[7]..<SNR>130_Pyeval:Vim: Finished.
line 4:
Exception MemoryError: MemoryError() in <module 'threading' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.pyc'> ignored
[1] 6921 segmentation fault vim ~/dotfiles/.vimrc
My primary question, for myself and anyone who commits a similar gaffe, is:
What, precisely, does the double slash // point to? What would be deleted first? Is there a logical first place to begin replacing utils, configs, $PATH entries, etc.?
Hopefully, this is clear and specific enough for SO.
cd // will take you to the root directory /.
rm performs a depth-first search, walking the results of the xfts_open call. find also traverses filesystems in this manner.
find / will list the files that still exist. You can then use your knowledge of the expected structure to work out which files are missing.
Alternatively, you can use debugfs to help you get at the files.
This assumes that these commands will actually work. Realistically, your system is probably hosed; deleting things in / will break your computer, and restoring from backup is probably the easiest way to return to a functional system. You can also try various utilities to recover recently erased files from your hard drive. If you plan on doing this, stop using the computer now: the drive treats the areas that recently held files in / as free space (since you told it to), and it could start writing over them at any time.
In Linux, and I believe other *nix flavours, an extra slash in a path is simply ignored: a//b is the same as a/b, and // is the same as / (POSIX technically leaves a leading // implementation-defined, but Linux treats it as /). I hope you didn't run this as a superuser...
I'm having trouble monitoring a file for changes. I need to be able to know when a file changes, and when it does, I need the new line that was added. I intend to parse each line and find ones that match certain criteria, and act on information in those lines. I know the expected number of matching lines ahead of time, but I do not know how many lines in total will be added to the file, or where the matching lines will be.
I've tried two packages so far, to no avail.
fsnotify/fsnotify
As far as I can tell, fsnotify can only tell me when a file is modified, not what the modification was. Since I need to know exactly what was added to the file, this is no good for me.
(As a side-question, can this be run in a loop? The example that I tried exited after just one modification. I need to monitor for multiple modifications.)
hpcloud/tail
This package tries to mimic the Unix tail command, but it seems to have its own issues. The output that I get includes timestamps and other data - I just want the added line, nothing else. Also, it seems to think a file has been modified multiple times, even when it's just one edit. Further, the deal breaker here is that it does not output the last line if the line was not followed by a newline character.
Delegating to tail
I came across this answer, which suggests to delegate this work to the tail command itself, but I need this to work cross-platform (specifically, macOS, Linux and Windows). I don't believe that an equivalent command exists on Windows.
How do I go about tackling this?
@user2515526,
Diffs are usually outside the scope of a file watcher's functionality, because, you know, you could change an image, and the watcher would need to keep several MB of diff in memory; and what if there are thousands of files?
However, as bad as it sounds, this may be exactly the way you want to implement it (it depends on your app, etc.; it could be fine for text files): keep a map of diffs, one per file, since the last modification. I can't say I like it, but it sounds like fsnotify has no support for the changes/diffs you need. A minimal sketch of the idea is below.
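Since you only care about appended lines, you don't actually need full diffs: remember the byte offset you've read up to, and on each poll read only what's new. Here's a sketch in Python for brevity (the question is about Go, but the same idea translates directly, e.g. with os.File.Seek); the file name is a placeholder:

# Offset-tracking tail: reopen the file, seek to the last-read offset,
# and consume only complete lines. A trailing partial line is left for
# the next pass, which also handles the missing-final-newline case.
import time

def follow(path, interval=1.0):
    offset = 0
    while True:                          # runs forever; adapt as needed
        try:
            with open(path) as f:
                f.seek(offset)
                while True:
                    line = f.readline()
                    if line.endswith("\n"):
                        offset = f.tell()       # advance past complete lines only
                        yield line.rstrip("\n")
                    else:
                        break                   # partial or no line; wait for more
        except FileNotFoundError:
            pass                         # file may not exist yet
        time.sleep(interval)

for line in follow("app.log"):
    print("new line:", line)

Note this doesn't handle truncation or rotation; reset the offset to 0 if the file shrinks.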
Also, regarding your question about running in a loop, maybe you can get some hints here: https://github.com/kataras/iris/blob/8370d76910cdd8de043753ed81ae080eae8dc798/utils/file.go
It's a framework for building a server that watches for TypeScript file changes, so it sounds similar to your case/question.
Cheers,
-D
I have an external hard-drive that I suspect is on its way out. At the minute, I can transfer files from it, but only for a while. Unfortunately, I have one single file that's >50GB in size. My solution to this is to use rsync to transfer this one particular file a bit at a time, leave the drive to rest (switch it off), and resume a little while later.
I'm using rsync --partial --progress --inplace --append -a /Volumes/Backup\ Drive/chris/Desktop/Recording\ Sessions/S1/Session\ 1/untitled ~/Desktop/temp to transfer it. (The file is in the untitled folder, which I'm moving into the temp folder.) However, after having stopped it and resumed it, it seems to be overwriting the previous attempt at the file, meaning I don't really get any further.
Is there something I'm missing? :X
Thank you ^_^
EDIT: Still don't know :\
Well, since this is a programming site, here's a program to do it. I tested it on OS X, but you should definitely test it on some small files first to make sure it does what you want:
#!/usr/bin/env python
import os
import sys

source = sys.argv[1]
target = sys.argv[2]
begin = int(sys.argv[3])   # byte offset to start copying from
end = int(sys.argv[4])     # byte offset to copy up to (exclusive)

# Update the target in place if it exists, otherwise create it.
mode = 'r+b' if os.path.exists(target) else 'w+b'

with open(source, 'rb') as source_file, open(target, mode) as target_file:
    source_file.seek(begin)
    target_file.seek(begin)
    buffer = source_file.read(end - begin)  # note: reads the whole range into memory
    target_file.write(buffer)
You run this with four arguments: the source file, the destination, and two numbers. The first number is the byte offset to start copying from (so on the first run you'd use 0). The second number is the byte offset to copy until (not including). On subsequent runs you'd always use the previous fourth argument as the new third argument (new begin equals old end), and just go on like that until it's done, using whatever chunk sizes you like along the way.
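For example, say you saved the script as copy_chunk.py (the script and file names here are placeholders): you might copy the first gigabyte, let the drive rest, then copy the second gigabyte.

python copy_chunk.py source.rec copy.rec 0 1000000000
python copy_chunk.py source.rec copy.rec 1000000000 2000000000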
I know this is related to macOS, but the best way to get all the files off a dying drive is with GNU ddrescue. I have no idea if this runs nicely on macOS, but you can always use a Linux live-usb to do this. You'll want to open a terminal and be either root (preferred) or use sudo.
Firstly, find the disk that you want to backup. This can be done by running the following. Make note of the partition name or disk name that you want to back up. Hard drives/flash drives will typically use the format sdX, where X is the drive letter. Partitions will be listed under sdX1, sdX2... etc. NVMe drives/partitions follow a similar naming convention.
lsblk -o name,size,label,fstype,model
Mount and change directory (cd) to a writable location that is bigger than the drive/partition you want to back up.
Now we are going to do a first pass over the drive/partition, without stopping on problematic sections. This helps ensure that ddrescue does not cause any more damage by repeatedly trying to access a bad section. Think of it like a hole in a sweater: you wouldn't want to keep picking at the hole or it would get bigger. Run the following, with sdX replaced with the drive/partition name from earlier:
ddrescue -d /dev/sdX backup.img backup.logfile
The -d flag uses direct disk access and ignores the kernel cache. The logfile is important in case the drive gets disconnected or the process stops somehow; it lets ddrescue resume where it left off.
Run ddrescue again with the -r flag, which retries bad sections three times. Feel free to run this a few more times, but note that ddrescue cannot restore everything. In my experience it usually restores into the high 90 percents, and in many cases the files that can't be recovered are system files (i.e., not your personal files).
ddrescue -d -r3 /dev/sdX backup.img backup.logfile
Finally, you can use the image however you want. You can either mount it to copy the files off or use it in a virtual machine/burn it to a working drive with dd. Do note that the latter options will not always work if system critical files were damaged.
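For instance, to mount a single-partition image read-only (paths hypothetical):

mkdir /mnt/rescue
mount -o loop,ro backup.img /mnt/rescue

If backup.img is a whole-disk image rather than one partition, you'll need to map its partitions first (losetup -P can do this) and mount those instead.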
Good luck and remember to make backups!
Assume a file is copied or moved to a directory by some other program. I want to get the time that this file was copied/moved to this folder. That is, I want the time that the file first appears in this directory.
Note that this file might exist before it was moved/copied or it might not.
This is not any of the time information that can be obtained by File::stat. Thanks.
You may find File::ChangeNotify helpful, which tracks file and directory changes. I would also suggest looking at incron, which can watch for various filesystem events and changes to files.
My guess is you want the time the file was closed after being first written. This may or may not be available, and it will be OS-specific. Most OSes track file creation, last modification, and last read (or some subset of those). If none of those work for you, you're out of luck unless you control the creation and writing of the file in your application code, in which case you can use whatever you like.
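For illustration, these are the timestamps most OSes expose, shown here with Python's os.stat (the file name is a placeholder):

import os

st = os.stat("some_file")
print("modified:", st.st_mtime)   # last content modification
print("accessed:", st.st_atime)   # last read, if the filesystem tracks it
# On macOS/BSD, creation time is st.st_birthtime; on Linux, st_ctime is
# the inode-change time, not a creation time.

None of these records when a file appeared in a particular directory, which is the asker's point.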
While it may not be the best way to do it, for the copying case you can keep checking for the file's existence with Perl's -e file test on the file name. As soon as you find that the file exists, record that moment's time. You may find the rest of Perl's -X file test operators interesting as well (see perldoc -f -X).
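A sketch of that polling idea (in Python for brevity; the path and interval are placeholders):

# Poll until the file appears, then record the arrival time.
import os
import time

path = "incoming/data.txt"
while not os.path.exists(path):
    time.sleep(0.1)              # poll every 100 ms
arrived = time.time()            # the moment we first saw the file

Note the resolution is limited by the polling interval, and a file that is created and deleted between polls will be missed.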
If nothing else has happened in that directory, the time the file appeared will be the modification time of the directory itself.
I have been facing this problem for some time now, a laziness caused in part by the fact that Microsoft Office automatically saves files you are working on, with versioning and automatic recovery.
Many times when I am starting a new notebook in Mathematica to do some tests or whatever, I often forget to save what I am doing.
Every now and then, depending on the computer I am using, the computer crashes and all the beautiful work I was doing is lost forever...
Is there a way to get around this other than manically saving my files every five minutes? How about file versioning?
BTW: Using MMA V8
Regarding autosaving, you may want to check out the NotebookAutoSave option, which can be set to True through Format -> Option Inspector. You have to choose "Selected notebook", then go to Notebook Options -> File Options, and set NotebookAutoSave to True. Then your notebook will be saved after every evaluation. Whether or not this is a satisfactory solution depends, of course, on the situation.
But my experience is that the most reliable way is to develop a CTRL+S reflex - it never lets me down and works quite well.
As for versioning, it is much easier with packages, for which you can use Wolfram Workbench, which has integrated support for CVS and support for SVN via an Eclipse plugin. For notebooks, I refer you to this SO thread. You may also find this MathGroup discussion of some interest.
EDIT
For M8, for auto-saving purposes you can probably also run
RunScheduledTask[NotebookSave[EvaluationNotebook[]],{300}]
But I cannot test this code at the moment.
EDIT2
I just came across this post in the Toolbag repository - which may also be an alternative for the autosave part of the question (but please see also the discussion in comments on the relative advantages of scheduled tasks vs. Dynamic)
Since you have MMA version 8 you could use:
saveTask = CreateScheduledTask[FrontEndExecute[FrontEndToken["Save"]], 5*60];
StartScheduledTask[saveTask];
to save every 5 minutes (change the term 5*60 for other timings).
To remove the auto-save task use:
RemoveScheduledTask[saveTask];
To save only a fixed, specific notebook, store its handle in nb (finding it using Notebooks, SelectedNotebook, InputNotebook or EvaluationNotebook) and use FrontEndToken[nb, "Save"] instead of just FrontEndToken["Save"].
I have a Mathematica package that provides auto-backup functionality. When enabled, the current notebook--call it "blah.nb"--will be backed up to "blah.nb~" after a configurable amount of time has elapsed. I use it constantly and it has saved me from losing work many, many times. It's better than autosaving since it doesn't touch the actual notebook file: if you screw something up or something gets corrupted you don't want to overwrite your main file. :)
It's on GitHub here.
I've got an autosave routine that saves a copy of every open, modified notebook every 5 minutes (or whatever interval you prefer). It leaves your manually-saved copy alone, and saves a "swap file" in a separate directory that can be easily recovered if need be. The code (to be copied to init.m) is given in this answer: https://mathematica.stackexchange.com/questions/18380/automatic-recovery-after-crash/65852#65852, and copied below:
Motivated by the same concerns, I wrote the following code and added it to my init.m file. There are two main entries you'll want to change to use this. The global variable $SwapDirectory is where the swap files are saved (by swap file, I mean it in the Vim sense: an "extra" copy of your notebook, separate from your manually saved copy, that periodically saves any new work). The swap files are organized within the swap directory in a directory structure which "mirrors" their original file locations, and have ".swp" appended to their file names. The other variable you might want to change is the number of seconds between autosaves, indicated by the 300 (corresponding to 5 minutes) near the bottom of the code below. At the appropriate times, this code will (automatically, in the background) save swap files for ALL open notebooks, unless they are unmodified from their manually-saved versions (this exception makes the code more efficient and, more importantly, prevents the storage of swap files for documentation notebooks, for example).
In its current form, the code does not filter for only the input cells, but hopefully you can use the other answers to make that modification yourself.
Some things to note:
1) the Mathematica Put command seems to have trouble writing to network drives, even when offline access is enabled. Therefore, it is probably best to choose a SwapDirectory that is on your local machine.
2) Within SwapDirectory, you should create a sub-directory called "Recovery". This is where the AutoSaveSwap routine will make an initial save of any notebooks for which there is NO existing manual save location.
3) Simply evaluate
RecoverSwap["filePath"]
where "filePath" is a string representing the file path of the MANUALLY-SAVED copy of the file (i.e., not the file that was created by AutoSave). This will pop up a window containing the most recent auto-saved version of the file. The manually saved version is NEVER overwritten unless you explicitly choose to do so. Once the recovered version pops up, you can save it wherever you like, or discard it at your discretion.
4) You should probably add this code to the KERNEL version of init.m ($UserBaseDirectory/Kernel/init.m) rather than the frontend version... this way, if you quit and restart the kernel, the autosave feature will also restart. On the other hand, this means that you must evaluate at least one expression after each start or restart to begin auto-saving. Once this initial evaluation is done, you do NOT need to have evaluated a cell for it to be backed up (unlike the built-in autosave utility).
Hope this helps someone! Feel free to respond with any questions, suggestions, or requests for improvement you may have. And if you find this post useful, upvotes would be most appreciated! Take care.
$SwapDirectory = "C:\\Users\\pacoj\\Swap Files\\";

SaveSwap[nb_NotebookObject] := Module[
  {fileName, swapFileName, nbout, nbdir, nbdirout, recoveryDir},
  If[! SameQ[Quiet[NotebookFileName[nb]], $Failed],
   (* if the notebook is already saved to the file system *)
   fileName = Last[FileNameSplit[NotebookFileName[nb]]];
   swapFileName = fileName <> ".swp";
   nbdir = Rest[FileNameSplit@NotebookDirectory[nb]];
   nbdirout = FileNameJoin[FileNameSplit[$SwapDirectory]~Join~nbdir] <> "\\";
   If[! DirectoryQ[nbdirout], CreateDirectory[nbdirout]];
   nbout = NotebookGet[nb];
   Put[nbout, nbdirout <> swapFileName],
   (* else, if the file has never been saved, save as untitled *)
   recoveryDir = $SwapDirectory <> "Recovery\\";
   fileName = ("WindowTitle" /. NotebookInformation[nb]) <> ".nb";
   NotebookSave[nb, recoveryDir <> fileName]
   ]
  ];
RecoverSwap::noswp = "swap file `1` not found in expected location";

RecoverSwap[nbfilename_String] := Module[
  {fileName, swapFileName, nbin, nbdir, nbdirout},
  fileName = Last[FileNameSplit[nbfilename]];
  swapFileName = fileName <> ".swp";
  nbdir = Most[Rest[FileNameSplit@nbfilename]];
  nbdirout = FileNameJoin[FileNameSplit[$SwapDirectory]~Join~nbdir] <> "\\";
  If[FileNames[swapFileName, {nbdirout}] == {},
   Message[RecoverSwap::noswp, nbdirout <> swapFileName]; Return[],
   nbin = Get[nbdirout <> swapFileName]; NotebookPut[nbin]
   ]
  ];
AutoSaveSwaps = CreateScheduledTask[
  SaveSwap /@ Select[Notebooks[], "ModifiedInMemory" /. NotebookInformation[#] &],
  300
  ];
StartScheduledTask[AutoSaveSwaps]