I get a strange error when chef-client tries to execute a remote_file resource for a big local file.
From the stack trace, I guess Ruby copies the file itself. My disk has a lot of free space, and the var and tmp folders each have at least 2 GB free. If I do this job myself with the cp command, or replace the remote_file resource with an execute one, it works.
Chef complains about lack of disk space.
This resource fails for a 4 GB file with the message "No space left on device".
remote_file "/tmp/copy.img" do
  source "file:///tmp/origin.img"
  action :create
end
I made a workaround with an execute resource and it works.
execute "disk-image" do
  command "cp /tmp/origin.img /tmp/copy.img"
  not_if "cmp /tmp/origin.img /tmp/copy.img"
end
It's not going to work. remote_file downloads the remote file to somewhere within /var/chef IIRC, then copies to its destination.
Since /var has only 2 GB of space and the file is 4 GB, it correctly throws the No space left on device error.
Thank you @lamont for the explanation. To cut to the chase a bit, the only solution that worked for me was to add the following to my Chef recipe, prior to any calls to remote_file:
ENV['TMP'] = '/mytmp'
ENV['TMPDIR'] = '/mytmp'
where /mytmp is a directory on a volume with enough space to hold my file.
The promising feature of adding:
file_staging_uses_destdir true
to /etc/chef/client.rb currently does not work, due to this bug: https://tickets.opscode.com/browse/CHEF-5311.
9/20/2016: Chef 12.0 shipped with file_staging_uses_destdir defaulting to true, so this should no longer be an issue (the remote_file bug where it streams to /tmp may still exist).
First the real simple statement: If you've got a 4GB file in /tmp and you only have 2GB left in /tmp, then obviously copying the 4GB will fail, and nothing can help you. I'm assuming you have at least 4GB in /tmp and only 2GB left in /var which is the only interesting case to address.
As of 11.6.0 (to 11.10.2 at least) chef-client will create a tempfile using ruby's Tempfile.new and will copy the contents to that temp file and then will mv it into place. The tempfile location will be determined by ENV['TMPDIR'] and that differs based on your O/S distro (e.g. on a Mac that will be something like /var/folders/rs/wqwmj_1n59135dhhbvfdrlqh0001yj/T/ and not just /tmp or even /var/tmp), so it may not be obvious where the intermediate tempfile is created. You may be running into that problem. You should be able to see from the chef-client -l debug output what tempfile location chef is using and if you df -k that directory you might see that it is 100%.
Also, look at df -i to see if you've run out of inodes somehow which will also throw a no space left on device error.
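To see where Ruby itself would place such a tempfile, the following is a minimal sketch. The helper name staging_dir_for is hypothetical, and it only demonstrates Ruby's Tempfile and Dir.tmpdir behavior, not chef-client's own code path.

```ruby
require 'tmpdir'
require 'tempfile'

# Hypothetical helper: returns the directory Ruby stages a Tempfile in
# when TMPDIR is set to +dir+. The previous TMPDIR value is restored.
def staging_dir_for(dir)
  saved = ENV['TMPDIR']
  ENV['TMPDIR'] = dir
  # Tempfile.create with a block removes the file when the block returns.
  Tempfile.create('chef-staging') { |file| File.dirname(file.path) }
ensure
  ENV['TMPDIR'] = saved
end

if __FILE__ == $PROGRAM_NAME
  puts "default tmpdir: #{Dir.tmpdir}"
  Dir.mktmpdir do |scratch|
    puts "with TMPDIR=#{scratch}: #{staging_dir_for(scratch)}"
  end
end
```

Running df -k on the directory printed here shows whether the staging location, rather than the destination, is the one that is full.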
You can set chef-client globally to use the destination directory as the tmpdir for creating files via adding this to client.rb:
file_staging_uses_destdir true
Then, if your destination directory is /tmp, the tempfile is created there and simply renamed within that directory to deploy it. That ensures that if the target device has enough space to hold the result, writing the tempfile should always succeed. It also avoids a problem that arises when /tmp and the destination directory are on different filesystems: the mv used to rename and deploy the file turns into a copy-and-unlink-source operation, which can fail in several different ways.
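The stage-in-the-destination-directory idea can be sketched in plain Ruby. This is an illustration of the general write-to-tempfile-then-rename pattern, not Chef's actual implementation; staged_copy and the ".staging-" prefix are hypothetical names.

```ruby
require 'tmpdir'
require 'tempfile'

# Hypothetical helper: copy +src+ to +dest+ by staging the data in a
# tempfile inside dest's own directory and renaming it into place.
def staged_copy(src, dest)
  # Staging next to the destination keeps the final rename on one filesystem.
  staging = Tempfile.create(['.staging-', '.tmp'], File.dirname(dest))
  begin
    File.open(src, 'rb') { |input| IO.copy_stream(input, staging) }
    staging.close
    # The rename replaces dest in one step, so no half-written dest is visible.
    File.rename(staging.path, dest)
  ensure
    staging.close unless staging.closed?
    # If anything failed before the rename, remove the partial staging file.
    File.unlink(staging.path) if File.exist?(staging.path)
  end
end
```

Note that Tempfile.create makes the file with restrictive permissions (0600), so a real deployment step would also need to set the intended mode and ownership on the result.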
The answer by @cassianoleal is not correct in stating that chef-client always uses /var/cache as a temp location. Changing file_cache_path will also not have an effect. That is confusing a common pattern of downloading remote_files into the Chef file_cache_path directory with how remote_file works internally -- those are not the same thing. There is no file_cache_path in the question, so there should not be any file_cache_path in the answer.
The behavior of remote_file with file:// URLs is a bit roundabout, but that is because it is necessary for all other URLs (as @cassianoleal correctly mentioned). The behavior with file_staging_uses_destdir is probably correct, however, since you do want to take into account edge conditions where you run out of room and truncate the file, or the server crashes in the middle of a copy operation, and you don't want a half-populated file left over. Writing to a tempfile, closing it, and then renaming it avoids a lot of those edge conditions.
I'm trying to setup standalone Spark on Windows 10. I would like to set spark.local.dir to D:\spark-tmp\tmp, as currently it appears to be using C:\Users\<me>\AppData\Local\Temp, which in my case is on an SSD drive which might not have enough space given the size of some datasets.
So I changed the file %SPARK_HOME%\conf\spark-defaults.conf to the following, without success
spark.eventLog.enabled true
spark.eventLog.dir file:/D:/spark-tmp/log
spark.local.dir file:/D:/spark-tmp/tmp
I also tried to run %HADOOP_HOME%\bin\winutils.exe chmod -R 777 D:/spark-tmp, but it didn't change anything.
The error that I get is the following:
java.io.IOException: Failed to create a temp directory (under file:/D:/spark-tmp/tmp) after 10 attempts!
If I start the path with file://D:/... (note the double slash), nothing changes. If I remove the scheme altogether, a different exception says that the scheme D: is not recognized.
I also noticed this warning:
WARN SparkConf:66 - In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
So I tried to put the following line in %SPARK_HOME%\conf\spark-env.sh:
SPARK_LOCAL_DIRS=file:/D:/spark-tmp/tmp
If I add this line and comment out the spark.local.dir line in the .conf file, Spark works perfectly, but the temporary files are still saved in my AppData\Local\Temp folder. So the SPARK_LOCAL_DIRS line is not read.
What's strange is that, if I let it run, it actually puts logs in D:/spark-tmp/log, which means that it's not a problem of syntax or permissions.
On Windows you will have to set these as environment variables. Add the key-value pair
SPARK_LOCAL_DIRS -> d:\spark-tmp\tmp
to your system's environment variables.
I've got the following as part of a shell script to copy site files up to a S3 CDN:
for i in "${S3_ASSET_FOLDERS[@]}"; do
  s3cmd sync -c /path/to/.s3cfg --recursive --acl-public --no-check-md5 --guess-mime-type --verbose --exclude-from=sync_ignore.txt /path/to/local/${i} s3://my.cdn/path/to/remote/${i}
done
Say S3_ASSET_FOLDERS is:
("one/" "two/")
and say both of those folders contain a file called... "script.js"
and say I've made a change to two/script.js - but not touched one/script.js
running the above command will firstly copy the file from /one/ to the correct location, although I've no idea why it thinks it needs to:
INFO: Sending file '/path/to/local/one/script.js', please wait...
File '/path/to/local/one/script.js' stored as 's3://my.cdn/path/to/remote/one/script.js' (13551 bytes in 0.1 seconds, 168.22 kB/s) [1 of 0]
... and then a remote copy operation for the second folder:
remote copy: two/script.js -> script.js
What's it doing? Why?? Those files aren't even similar. Different modified times, different checksums. No relation.
And I end up with an S3 bucket with two incorrect files in it. The file in /two/ that should have been updated hasn't been. And the file in /one/ that shouldn't have changed is now overwritten with the contents of /two/script.js.
Clearly I'm doing something bizarrely stupid because I don't see anyone else having the same issue. But I've no idea what??
First of all, try running it without the --no-check-md5 option.
Second, I suggest you pay attention to directory names, specifically trailing slashes.
s3cmd documentation says:
With directories there is one thing to watch out for – you can either upload the directory and its contents or just the contents. It all depends on how you specify the source.
To upload a directory and keep its name on the remote side specify the source without the trailing slash
On the other hand, to upload just the contents, specify the directory with a trailing slash
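The trailing-slash rule quoted above can be modelled in a few lines of Ruby. This is only an illustration of the rule as the documentation states it, not s3cmd's own code; remote_key and its parameters are hypothetical names.

```ruby
# Hypothetical model of the documented rule: a source directory given
# WITHOUT a trailing slash keeps its own name on the remote side, while
# one given WITH a trailing slash contributes only its contents.
def remote_key(source_dir, relative_file, remote_prefix)
  prefix = remote_prefix.chomp('/')
  if source_dir.end_with?('/')
    "#{prefix}/#{relative_file}"
  else
    "#{prefix}/#{File.basename(source_dir)}/#{relative_file}"
  end
end

if __FILE__ == $PROGRAM_NAME
  # Contents only: the remote prefix must already name the folder.
  puts remote_key('/path/to/local/one/', 'script.js', 's3://my.cdn/path/to/remote/one/')
  # Directory and contents: the folder name is appended on the remote side.
  puts remote_key('/path/to/local/one', 'script.js', 's3://my.cdn/path/to/remote/')
end
```

Both calls in the usage section resolve to the same remote key, which is a quick way to check which form a given sync command should use.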
This is a slightly odd one where I'm sure I'm missing something perfectly straightforward.
I'm trying to cut some of the cruft off our build time. Part of that is rebuilding a set of .debs we use, which happens every time we change any aspect of the system because of the way an ant script has been configured. I was hoping to use Makefiles to monitor the folders that are used for the dpkg process, so that only the directories with recent changes are rebuilt, but:
build-printing:
	fakeroot dpkg -b printing printing.deb
Is constantly rerun, even though the files in that specific directory haven't changed. I'm sure I've missed something really simple, but I can't spot it in the man pages.
Your build-printing rule doesn't depend on anything - tell it which files it should watch the timestamps of, e.g.:
build-printing: directory/myfile.src
	....
will cause build-printing to only be run if the timestamp on directory/myfile.src is newer than the timestamp of build-printing. Since the rule doesn't look like it actually creates build-printing as a file, you probably want to rename it to match the output file, e.g.:
printing.deb: directory/myfile.src
	....
If you want to use a rule named build-printing you can either make that rule touch a file called build-printing, or make that rule depend upon printing.deb.
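The timestamp comparison described above can be sketched in Ruby to show why a target that never exists as a file is always rebuilt. This is a simplified model of the decision, not make's implementation; needs_rebuild? is a hypothetical name.

```ruby
# Hypothetical model of the out-of-date check: a target is rebuilt when
# it does not exist as a file, or when any prerequisite is newer than it.
def needs_rebuild?(target, prerequisites)
  # A rule such as build-printing that never creates a file named
  # "build-printing" always lands here, so its recipe runs every time.
  return true unless File.exist?(target)

  target_mtime = File.mtime(target)
  prerequisites.any? { |prereq| File.mtime(prereq) > target_mtime }
end

if __FILE__ == $PROGRAM_NAME
  # Usage: ruby needs_rebuild.rb printing.deb directory/myfile.src
  target, *prerequisites = ARGV
  puts needs_rebuild?(target, prerequisites) if target
end
```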
This is weird… and I can't figure out for the life of me why it's doing it this way.
1. I've got a folder full of various CoffeeScript, SASS, HTML, and XML files.
2. I've got a Ruby script that's taking them all, compiling them, and minifying them into one master XML file (it's for iGoogle Gadget development). This script takes command line args using trollop (I only state this to clarify my code below).
3. I want this script to copy this file from the current directory where it's created to a destination directory where it will be run.
So far, the building/compiling/minifying step runs like magic. It's #3 that's borked to Twilight Zone-level.
#!/usr/bin/ruby
…
if opts[:deploy_local]
  FileUtils.cp 'build.xml', '/path/to/destination/'
  puts "Copied #{written_file_name} to #{output_destination}." if opts[:verbose]
end
When this copies the file, the destination file is truncated about 3/4 of the way through it. The source file is just fine. However, moving the file works like a charm, for some strange reason.
FileUtils.mv 'build.xml', '/path/to/destination/'
To add another level of weirdness, if I just do a system copy, it also gets truncated.
system("cp build.xml /path/to/destination")
FWIW, I'm running this script from zsh and not bash. In both instances (copying and moving) the source and destination files are not in use by any other process.
Can anybody explain this freaky behavior?
A few things:
Are you moving to the same disk volume? If so, then, yeah, cam's comment about atomicity is definitely true; the OS is probably just updating the inode table during a move, as opposed to writing out the data. If you're moving the data between volumes, then it wouldn't be so simple.
Have you tried passing
:verbose => true
to the FileUtils.cp command? That might give a diagnostic about the failure.
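As a minimal sketch of that diagnostic, the copy can be run with the verbose option and the resulting sizes compared. The helper name copy_and_verify is hypothetical, and the size check is an added assumption about what is worth inspecting, not something the question's script does.

```ruby
require 'fileutils'

# Hypothetical helper: copy +src+ into +dest_dir+ with verbose output
# and report whether the copy has the same size as the source.
def copy_and_verify(src, dest_dir)
  # verbose: true makes FileUtils print the equivalent cp command.
  FileUtils.cp(src, dest_dir, verbose: true)
  copied = File.join(dest_dir, File.basename(src))
  File.size(copied) == File.size(src)
end

if __FILE__ == $PROGRAM_NAME
  # Usage: ruby copy_and_verify.rb build.xml /path/to/destination/
  src, dest_dir = ARGV
  puts(copy_and_verify(src, dest_dir) ? 'sizes match' : 'sizes differ') if src && dest_dir
end
```

If the sizes differ at the moment of the copy, the source file was not yet complete on disk when the copy ran, which narrows down where to look.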
I'm experiencing a weird situation with deleting files in Ruby: the code seems to report correctly, but the physical file isn't removed from my hard drive. I can do rm path/to/file on the command line and that works. I can even open up the Rails console and File.safe_unlink the file, and that also works. It's only within my Rails app that it fails to delete the actual file:
def destroy
  Rails.logger.debug local_path #=> /Users/ryan/.../public/system/.../file.jpg
  Rails.logger.debug File.exist?(local_path) #=> true
  File.safe_unlink(local_path)
  Rails.logger.debug File.exist?(local_path) #=> false
  # yet the physical file actually STILL exists!
end
The physical file is within a Git repo (the repo is stored within /public/system/) any gotchas with that? I've tried using the ruby-git gem to delete the file using the rm command it provides, but that doesn't delete the file either.
I've opened up all the permissions on the files during testing and still nothing works. I've also confirmed this with File.writable?(local_path) which returned true.
Any thoughts why it could be preventing the file from being removed?
Have you checked the permissions on the directory? Deletion is a directory write operation, not file write operation. (rm will also check file perms and ask if you really want to do it, as a courtesy, if the file is write-protected; but if the directory isn't writable, it flat out refuses.)
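A quick way to check this from Ruby is to test the containing directory rather than the file. This is a minimal sketch; directory_allows_delete? is a hypothetical name, and it ignores extra restrictions such as the sticky bit.

```ruby
# Hypothetical helper: removing an entry needs write and search (execute)
# permission on the containing directory, not write permission on the file.
def directory_allows_delete?(path)
  dir = File.dirname(path)
  File.writable?(dir) && File.executable?(dir)
end

if __FILE__ == $PROGRAM_NAME
  # Usage: ruby directory_allows_delete.rb path/to/file
  path = ARGV.first
  puts directory_allows_delete?(path) if path
end
```

Compare this result with File.writable?(local_path): the two can differ, and only the directory check governs whether unlink succeeds.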