Ruby: FileUtils.cp truncates file; FileUtils.mv does not?

This is weird… and I can't figure out for the life of me why it's doing it this way.
I've got a folder full of various CoffeeScript, SASS, HTML, and XML files.
I've got a Ruby script that's taking them all, compiling them, and minifying them into one master XML file (it's for iGoogle Gadget development).
This script takes command line args using trollop (I only state this to clarify my code below).
I want this script to copy this file from the current directory where it's created to a destination directory where it will be run.
So far, the building/compiling/minifying steps run like magic. It's the final copy step that's borked to Twilight Zone levels.
#!/usr/bin/ruby
require 'fileutils'
…
if opts[:deploy_local]
  FileUtils.cp 'build.xml', '/path/to/destination/'
  puts "Copied #{written_file_name} to #{output_destination}." if opts[:verbose]
end
When this copies the file, the destination file is truncated about 3/4 of the way through it. The source file is just fine. However, moving the file works like a charm, for some strange reason.
FileUtils.mv 'build.xml', '/path/to/destination/'
To add another level of weirdness, if I just do a system copy, it also gets truncated.
system("cp build.xml /path/to/destination")
FWIW, I'm running this script from zsh and not bash. In both instances (copying and moving) the source and destination files are not in use by any other process.
Can anybody explain this freaky behavior?

A few things:
Are you moving to the same disk volume? If so, then cam's comment about atomicity is definitely true; the OS is probably just updating the inode table during a move, as opposed to writing out the data. If you're moving the data between volumes, it wouldn't be so simple.
Have you tried passing
:verbose => true
to the FileUtils.cp command? That might give a diagnostic about the failure.
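For example, a minimal sketch using the placeholder paths from the question:
require 'fileutils'
# :verbose prints the equivalent shell command as FileUtils runs it.
FileUtils.cp 'build.xml', '/path/to/destination/', :verbose => true
# prints: cp build.xml /path/to/destination/
# Comparing sizes immediately after the copy makes any truncation obvious.
puts File.size('build.xml')
puts File.size('/path/to/destination/build.xml')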

Related

Writing to popen and reading back several files in Ruby

I need to run some shell commands on a number of files and sometimes I get back more than one file in response. The question is: How can I read back several files from IO.popen in Ruby?
For instance, imagine the following case:
file = grid.get(record['_id']) # fetch a file from database
IO.popen('tar -Oxmz', 'ab') {|pipe| pipe.write(file.read)} # pass to tar and extract
This necessitates rereading all the extracted files from the filesystem. I figured out that this is the speed bottleneck of my script, and I wonder if I can accomplish the same task in-memory. I tried the following:
file = grid.get(record['_id'])
IO.popen('tar -Oxmz', 'w+b') do |pipe|
  pipe.write(file.read)
  pipe.close_write
  output = pipe.read
end
It works, but I get the whole response, including several extracted files, in one piece (in the variable output). I need the files separate from each other, and ideally with their names. Is there any way to do this?
By the way, the resulting files are text most of the time, but sometimes binary. Running a separate pipe for each output file is not a solution, because the overhead of running the commands per file outweighs the benefit of doing the transformation in-memory.
P.S. The actual use case does not rely on tar only; I use software that does not have Ruby wrappers.
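For the tar case specifically, here is a minimal in-memory sketch. It assumes the whole archive fits in memory, and it uses Gem::Package::TarReader (which ships with RubyGems) in place of the tar subprocess, so each entry comes back separately with its name:
require 'zlib'
require 'stringio'
require 'rubygems/package'

file = grid.get(record['_id'])      # fetch a file from database, as above
gz = Zlib::GzipReader.new(StringIO.new(file.read))
Gem::Package::TarReader.new(gz) do |tar|
  tar.each do |entry|
    next unless entry.file?
    name = entry.full_name          # the entry's path inside the archive
    contents = entry.read           # a String; may be binary
    # process name/contents in memory here
  end
end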

No space left on device - write chef remote_file

I get a strange error when chef-client tries to execute a remote_file resource for a big local file.
From the stack trace, I guess Ruby copies the file itself. My disk has a lot of free space, and the /var and /tmp folders each have at least 2 GB free. If I do the job myself with the cp command, or replace the remote_file resource with an execute one, it works fine.
Chef complains about a lack of disk space: the resource below fails for a 4 GB file with the message "No space left on device".
remote_file "/tmp/copy.img" do
  source "file://tmp/origin.img"
  action :create
end
I made a workaround with an execute resource and it works:
execute "disk-image" do
  command "cp /tmp/origin.img /tmp/copy.img"
  not_if "cmp /tmp/origin.img /tmp/copy.img"
end
It's not going to work. remote_file downloads the remote file to somewhere within /var/chef IIRC, then copies it to its destination.
Since /var has only 2 GB of space and the file is 4 GB, it correctly throws the "No space left on device" error.
Thank you @lamont for the explanation. To cut to the chase a bit: the only solution that worked for me was to add the following to my Chef recipe, prior to any calls to remote_file:
ENV['TMP'] = '/mytmp'
ENV['TMPDIR'] = '/mytmp'
where /mytmp is a directory on a volume with enough space to hold my file.
The promising feature of adding:
file_staging_uses_destdir true
to /etc/chef/client.rb currently does not work, due to this bug: https://tickets.opscode.com/browse/CHEF-5311.
9/20/2016: Chef 12.0 shipped with file_staging_uses_destdir defaulting to true, so this should no longer be an issue (the remote_file bug where it streams to /tmp may still exist).
First, the really simple statement: if you've got a 4 GB file in /tmp and only 2 GB left in /tmp, then obviously copying the 4 GB will fail, and nothing can help you. I'm assuming you have at least 4 GB free in /tmp and only 2 GB left in /var, which is the only interesting case to address.
As of 11.6.0 (through at least 11.10.2), chef-client creates a tempfile using Ruby's Tempfile.new, copies the contents into that temp file, and then mvs it into place. The tempfile location is determined by ENV['TMPDIR'], which differs by OS distro (e.g. on a Mac it will be something like /var/folders/rs/wqwmj_1n59135dhhbvfdrlqh0001yj/T/, not just /tmp or even /var/tmp), so it may not be obvious where the intermediate tempfile is created. You may be running into that problem. You should be able to see from the chef-client -l debug output which tempfile location Chef is using; if you df -k that directory, you may see that it is 100% full.
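A minimal sketch of that mechanism: Ruby's tempfile machinery picks its directory via Dir.tmpdir, which consults ENV['TMPDIR'] first, so pointing it at a roomier volume (here the hypothetical /mytmp from the workaround above) moves the intermediate copy.
require 'tempfile'
puts Dir.tmpdir                     # default staging location on this system
ENV['TMPDIR'] = '/mytmp'            # /mytmp must already exist and be writable
puts Dir.tmpdir                     # now reports /mytmp
tmp = Tempfile.new('staging-demo')  # lands under /mytmp, as chef-client's copy would
puts tmp.path
tmp.close!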
Also, look at df -i to see whether you've run out of inodes, which will also throw a "no space left on device" error.
You can set chef-client globally to use the destination directory as the tmpdir for creating files by adding this to client.rb:
file_staging_uses_destdir true
Then if your destination dir is /tmp, the tempfile will be created there and then simply renamed within the directory to deploy it. That ensures that if there's enough space on the target device to hold the result, the resource will always succeed in writing the tempfile. It also avoids the problem that, when /tmp and the destdir are on different filesystems, the mv to rename and deploy the file gets translated into a copy-and-unlink-src operation, which can fail in several different ways.
The answer by @cassianoleal is not correct in stating that chef-client always uses /var/cache as a temp location. Changing file_cache_path will not have an effect either. That confuses a common pattern (downloading remote_files into the Chef file_cache_path directory) with how remote_file works internally; those are not the same thing. There is no file_cache_path in the question, so there should not be any file_cache_path in the answer.
The behavior of remote_file with file:// URLs is a bit roundabout, but that is because the tempfile step is necessary for all other URL types (as @cassianoleal correctly mentioned). The behavior with file_staging_uses_destdir is probably correct, however, since you do want to account for edge conditions where you run out of room and truncate the file, or where the server crashes in the middle of a copy operation and leaves a half-populated file behind. Writing to a tempfile, closing it, and then renaming it into place avoids a lot of those edge conditions.

Writing to file with Ruby in Compilr

I'm banging through Zed A. Shaw's Learn Ruby the Hard Way on Compilr and am stuck on exercise 16.
filename = ARGV.first
target = File.open(filename, 'w') # opening with 'w' already truncates the file to zero length
target.truncate(target.size)      # size is 0 at this point, so this line is effectively a no-op
target.close
In the console, I type
run sample.txt
This should wipe the sample.txt file, but it doesn't.
The file, sample.txt is in the same folder as the Start file.
Any clues?
OK, it's not a Ruby issue (as I suspected); it's to do with how Compilr works. By running the code without having first created a sample.txt file, Compilr created the file for me, by default in the content folder. So putting the writable files into the content folder enables Compilr to write to them; putting everything (the script and the files) in the same folder makes it not work.

NodeJS fs.watch on directory only fires when changed by editor, but not shell or fs module

When the code below is run, the watch is only triggered if I edit and save tmp.txt manually, using either my IDE, TextEditor.app, or vim.
It is not triggered by the write-stream method or by manual shell output redirection (typing echo "test" > /path/to/tmp.txt).
However, if I watch the file itself rather than its dirname, it works.
var fs, Path, file, watchPath, w;
fs = require('fs');
Path = require('path');
file = __dirname + '/tmp.txt';
watchPath = Path.dirname(file); // changing this to just `file` makes it trigger
w = fs.watch(watchPath, function (e, f) {
  console.log("will not get here by itself");
  w.close();
});
fs.writeFileSync(file, "test", "utf-8");
fs.createWriteStream(file, {
  flags: 'w',
  mode: 0777
}).end('the_date="' + new Date + '";'); // another method that fails as well
setTimeout(function () {
  fs.writeFileSync(file, "test", "utf-8");
}, 500); // as does this one
// child_process exec and spawn fail the same way, with or without the timeout
So the questions are: why? And how can I trigger this event programmatically from a Node script?
Thanks!
It doesn't trigger because a change to the contents of a file isn't a change to the directory.
Under the covers, at least as of 0.6, fs.watch on Mac uses kqueue, and it's a pretty thin wrapper around kqueue file system notifications. So, if you really want to understand the details, you have to understand kqueue, and inodes and things like that.
But if you want a short "lie-to-children" explanation: What a user thinks of as a "file" is really two separate things—the actual file, and the directory entry that points to the actual file. This is what allows you to have things like hard links, and files that can still be read and written even after you've deleted them, and so on.
In general, when you write to an existing file, this doesn't make any change to the directory entry, so anyone watching the directory won't see any change. That's why echo >tmp.txt doesn't trigger you.
However, if you, e.g., write a new temporary file and then move it over the old file, that does change the directory entry (making it a pointer to the new file instead of the old one), so you will be notified. That's why TextEditor.app does trigger you.
The thing is, you've asked to watch the directory and not the file.
The directory isn't updated when the file is modified, such as via shell redirection; in this case, the file is opened, modified, and closed. The directory isn't changed -- only the file is.
When you use a text editor to modify a file, the usual set of system calls behind the scenes looks something like this:
fd = open("foo.new")
write(fd, new foo contents)
unlink("foo")
rename("foo.new", "foo")
This way, the foo file is either entirely the old file or entirely the new file, and there's no way for there to be a "partial file" with the new contents. The renaming operations do modify the directory, thus triggering the directory watch.
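A minimal Ruby sketch of that write-then-rename pattern (the atomic_write helper and the .tmp suffix are illustrative, not from any of the code above):
def atomic_write(path, contents)
  tmp = "#{path}.tmp.#{Process.pid}" # write under a temporary name first
  File.write(tmp, contents)
  File.rename(tmp, path)             # one rename swaps the directory entry
end
# Replacing via rename changes the directory entry, so a watcher on the
# directory fires; writing the file in place does not.
atomic_write('tmp.txt', %(the_date="#{Time.now}";))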
Although the above answers seem reasonable, they are not fully accurate. Being able to listen to a directory for file changes, not just renames, is actually a very useful feature. It works as expected on Windows at least, and as of Node 0.9.2 it also works on Mac, since fs.watch was switched to the FSEvents API, which supports this:
Version 0.9.2 (Unstable)

Deleted files not being removed from HDD in Ruby

I'm experiencing a weird situation when deleting files in Ruby: the code reports success, but the physical file isn't removed from my hard drive. rm path/to/file works on the command line. I can even open the Rails console and File.safe_unlink the file, and that also works; it's only within my Rails app that it fails to delete the actual file:
def destroy
  Rails.logger.debug local_path              #=> /Users/ryan/.../public/system/.../file.jpg
  Rails.logger.debug File.exist?(local_path) #=> true
  File.safe_unlink(local_path)
  Rails.logger.debug File.exist?(local_path) #=> false
  # yet the physical file actually STILL exists!
end
The physical file is within a Git repo (the repo is stored within /public/system/). Any gotchas with that? I've tried using the ruby-git gem to delete the file with the rm command it provides, but that doesn't delete the file either.
I've opened up all the permissions on the files during testing and still nothing works. I've also confirmed this with File.writable?(local_path) which returned true.
Any thoughts why it could be preventing the file from being removed?
Have you checked the permissions on the directory? Deletion is a directory write operation, not a file write operation. (rm will also check the file's permissions and, as a courtesy, ask if you really want to delete a write-protected file; but if the directory isn't writable, it flat out refuses.)
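A quick way to check this from the Rails console, along the lines of the answer above (local_path is the variable from the question):
dir = File.dirname(local_path)         # deletion needs write permission here
puts format('%o', File.stat(dir).mode) # e.g. 40755
puts File.writable?(dir)               # must be true for File.unlink to succeed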
