Ruby zip a stream - ruby

I am trying to write a ruby fcgi script which compresses files in a directory on the fly and sends the output blockwise as an http response. It is very important that this compression is done as a stream operation, otherwise the client will get a timeout for huge directories.
I have the following code:
d="/tmp/delivery/"
# send zip header
header(MimeTypes::ZIP)
# pseudocode from here on
IO.open(d) { |fh|
  block=fh.readblock(1024)
  #send zipped block as http response
  print zip_it(block)
}
How do I achieve what I've written as pseudo-ruby in the above listing?

Tokland's idea of using the external zip command works pretty well. Here's a quick snippet that should work with Ruby 1.9 on Linux or similar environments. It uses an array parameter to popen() to avoid any shell quoting issues and sysread/syswrite to avoid buffering. You could display a status message in the empty rescue block if you like -- or you could use read and write, though I haven't tested those.
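#!/usr/bin/env ruby
d = '/tmp/delivery'
output = $stdout
# -r recurses into the directory; '-' makes zip write the archive to stdout
IO.popen(['/usr/bin/zip', '-r', '-', d]) do |zip_output|
  begin
    while buf = zip_output.sysread(1024)
      output.syswrite(buf)
    end
  rescue EOFError
    # zip closed its end of the pipe: the archive is complete
  end
end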
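The read/write variant mentioned in that answer might look like the sketch below. The helper name `pump` and the block size are assumptions for illustration, not part of the original answer; it copies any readable IO to any writable IO in fixed-size blocks, so the same loop could be pointed at the pipe returned by IO.popen.

```ruby
require 'stringio'

# Hypothetical helper: copy +input+ to +output+ in blocks of +block_size+ bytes.
# Returns the number of bytes copied.
def pump(input, output, block_size = 1024)
  total = 0
  # IO#read(n) returns nil at end of stream, which ends the loop
  while (buf = input.read(block_size))
    output.write(buf)
    total += buf.bytesize
  end
  output.flush
  total
end

# In-memory streams stand in for the zip pipe and $stdout here
source = StringIO.new('x' * 2500)
sink = StringIO.new
pump(source, sink)
```

Unlike sysread, read waits until a full block is available or the stream ends, so output arrives in whole blocks rather than as soon as any bytes are ready.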

AFAIK the zip format is awkward to stream: by default, once an entry has been compressed, the writer goes back and fills in its size and checksum in that entry's header.
gz or tar.gz is a better option.
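To illustrate the gzip suggestion: gzip never needs to seek back, so it can be produced block by block. Below is a minimal sketch using Ruby's standard zlib library. The helper name `gzip_stream` is made up for illustration, and for a whole directory you would still need tar (or similar) to bundle the files before compressing.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: gzip +input+ into +output+, reading +block_size+ bytes at a time.
def gzip_stream(input, output, block_size = 1024)
  gz = Zlib::GzipWriter.new(output)
  while (block = input.read(block_size))
    gz.write(block)
  end
ensure
  # finish writes the gzip trailer without closing +output+
  gz.finish if gz
end

# In-memory streams stand in for a file and the HTTP response
compressed = StringIO.new(String.new)
gzip_stream(StringIO.new('some file contents ' * 200), compressed)
```

GzipWriter buffers internally, so compressed bytes reach the output in bursts rather than once per input block. Calling gz.flush after each write would push data out sooner, at some cost in compression ratio.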

solved:
https://github.com/fringd/zipline.git

Related

Failing to read a named pipe being written to

I have a stream of data that I’m writing to a named pipe:
named_pipe = '/tmp/pipe' # Location of named pipe
File.mkfifo(named_pipe) # Create named pipe
File.open(named_pipe, 'w+') # Necessary to not get a broken pipe when ⌃C from another process later on
system('youtube-dl', '--newline', 'https://www.youtube.com/watch?v=aqz-KE-bpKQ', out: named_pipe) # Output download progress, one line at a time
Trouble is, while I can cat /tmp/pipe and get the information, I’m unable to read the file from another Ruby process. I’ve tried File.readlines, File.read with seeking, File.open then reading, and other stuff I no longer remember. Some of those hang, others error out.
How can I get the same result as with cat, in pure Ruby?
Note I don’t have to use system to send to the pipe (Open3 would be acceptable), but any solution requiring external dependencies is a no-go.
It looks like File.readlines/IO.readlines and File.read/IO.read have to read the whole pipe (until the writer closes it) before they return, so you don't see anything printed while data is still arriving.
Try File#each/IO.foreach, which process the file line by line and do not require the whole file to be loaded into memory:
File.foreach("/tmp/pipe") { |line| p line }
# or
File.open('/tmp/pipe','r').each { |line| p line }
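A self-contained way to try this out is sketched below, assuming a Unix-like system where File.mkfifo is available. The helper name `each_pipe_line`, the temporary pipe location, and the writer thread (standing in for the other process) are all hypothetical.

```ruby
require 'tmpdir'

# Hypothetical helper: yield each line of a named pipe as it arrives.
# Opening blocks until a writer opens the pipe; iteration ends when all writers close it.
def each_pipe_line(path, &block)
  File.foreach(path, &block)
end

Dir.mktmpdir do |dir|
  pipe = File.join(dir, 'pipe') # temporary location, for the demo only
  File.mkfifo(pipe)

  # A thread plays the role of the process writing progress lines
  writer = Thread.new do
    File.open(pipe, 'w') do |f|
      3.times do |i|
        f.puts "progress #{i}"
        f.flush
      end
    end
  end

  each_pipe_line(pipe) { |line| puts line }
  writer.join
end
```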

How to prevent capistrano replacing newlines?

I want to run some shell scripts remotely as part of my capistrano setup. To test that functionality, I use this code:
execute <<SHELL
cat <<TEST
something
TEST
SHELL
However, that is actually running /usr/bin/env cat <<TEST; something; TEST which is obviously not going to work. How do I tell capistrano to execute the heredoc as I have written it, without converting the newlines into semicolons?
I have Capistrano Version: 3.2.1 (Rake Version: 10.3.2) and do not know ruby particularly well, so there might be something obvious I missed.
I think it might work to just specify the arguments to cat as a second, er, argument to execute:
cat_args = <<SHELL
<<TEST
something
TEST
SHELL
execute "cat", cat_args
From the code @DavidGrayson posted, it looks like only the command (the first argument to execute) is sanitized.
I agree with David, though, that the simpler way might be to put the data in a file, which is what the SSHKit documentation suggests:
Upload a file from a stream
on hosts do |host|
  file = File.open('/config/database.yml')
  io = StringIO.new(....)
  upload! file, '/opt/my_project/shared/database.yml'
  upload! io, '/opt/my_project/shared/io.io.io'
end
The IO streaming is useful for uploading something rather than "cat"ing it, for example
on hosts do |host|
  contents = StringIO.new('ALL ALL = (ALL) NOPASSWD: ALL')
  upload! contents, '/etc/sudoers.d/yolo'
end
This spares one from having to figure out the correct escaping sequences for something like "echo(:cat, '...?...', '> /etc/sudoers.d/yolo')".
This seems like it would work perfectly for your use case.
The code responsible for this sanitization can be found in SSHKit::Command#sanitize_command!, which is called by that class's initialize method. You can see the source code here:
https://github.com/capistrano/sshkit/blob/9ac8298c6a62582455b1b55b5e742fd9e948cefe/lib/sshkit/command.rb#L216-226
You might consider monkeypatching it to do nothing by adding something like this to the top of your Rakefile:
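SSHKit::Command # force the class to load so we can patch it
# Prepend a module (its name is arbitrary) so that `super` reaches the original method;
# redefining the method in the reopened class would replace it and make `super` fail.
module SkipSanitize
  def sanitize_command!
    return if some_condition
    super
  end
end
SSHKit::Command.prepend(SkipSanitize)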
This is risky and could introduce problems in other places; for example there might be parts of Capistrano that assume that the command has no newlines.
You are probably better off making a shell script that contains the heredoc or putting the heredoc in a file somewhere.
Ok, so this is the solution I figured out myself, in case it's useful for someone else:
str = %x(
base64 <<TEST
some
thing
TEST
).delete("\n")
execute "echo #{str} | base64 -d | cat -"
As you can see, I'm base64 encoding my command, sending it through, then decoding it on the server side where it can be evaluated intact. This works, but it's a real ugly hack - I hope someone can come up with a better solution.
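The same encoding step can be done in Ruby itself rather than by shelling out locally. This is a sketch, assuming the remote host has a base64 that accepts -d and a sh to run the decoded script. The helper name `single_line_command` is hypothetical, and piping to sh instead of cat is a choice made here for illustration.

```ruby
# Hypothetical helper: wrap a multi-line shell script in a single-line command.
def single_line_command(script)
  # pack('m0') produces Base64 without line breaks
  encoded = [script].pack('m0')
  "echo #{encoded} | base64 -d | sh"
end

script = <<~SHELL
  cat <<TEST
  something
  TEST
SHELL

command = single_line_command(script)
puts command
```

In a Capistrano task the resulting string would be handed to execute. Since it contains no newlines, there is nothing left for the sanitizer to turn into semicolons.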

A ruby script to run tail on a log file?

I want to write a Ruby script that reads a list of filenames from a config file; when I run the script, it should take the tail of each file and output it to the console.
What's the best way to go about doing this?
Take a look at the File::Tail gem.
You can invoke the Linux command tail -number_of_lines file_name from your Ruby script and either let it print to the console, or capture the output and print it yourself (if you need to do something with the lines before printing them).
Suppose we have a configuration file that contains a list of the log files, for example:
---
- C:\fe\logs\front_end.log
- C:\mt\logs\middle_tier.log
- C:\be\logs\back_end.log
The configuration file is a simple YAML sequence; suppose we name it 'settings.yaml'.
A Ruby script that takes the tail of each file and prints it to the console could look like this:
require 'yaml'
require 'file-tail'
logs = YAML.load_file('settings.yaml')
threads = []
logs.each do |the_log|
  # one thread per log file, each following its own file
  threads << Thread.new(the_log) { |log_filename|
    File.open(log_filename) do |log|
      log.extend(File::Tail)
      log.interval = 10
      log.backward(10)
      log.tail { |line| p "#{File.basename(log_filename, ".log")} - #{line}" }
    end
  }
end
threads.each { |the_thread| the_thread.join }
Note: I wanted to prefix each displayed line with the name of the file it comes from. That works well for me, but you can edit the script as you like; the same goes for the tail parameters.
If file-tail is missing in your environment, follow the link that @Mark Thomas posted in his answer; i.e. you need to run:
> gem install file-tail
I found the file-tail gem to be a bit buggy. When I wrote to a file, it would read the entire file again instead of just the appended lines. This happened even though I had log.backward set to 0. I ended up writing my own and figured I would share it here in case anyone else is looking for a Ruby alternative to the file-tail gem. You can find the repo here. It uses non-blocking IO, so it will catch changes to the file immediately. There is one caveat, which is easy to fix if you know Ruby: log.backward is hard-coded to -1.
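If all you need is the last few lines of each file once, without following it, a gem-free version is possible. The sketch below is not taken from any of the answers: the helper name `tail_lines` and its parameters are hypothetical. It reads backwards from the end of the file in chunks, so a large log is not loaded in full.

```ruby
# Hypothetical helper: return the last +count+ lines of the file at +path+.
def tail_lines(path, count = 10, chunk_size = 4096)
  File.open(path, 'rb') do |f|
    pos = f.size
    buffer = String.new
    # Walk backwards until the buffer holds more than +count+ newlines
    # (so the first wanted line is complete) or the start of the file is reached.
    while pos > 0 && buffer.count("\n") <= count
      step = [chunk_size, pos].min
      pos -= step
      f.seek(pos)
      buffer = f.read(step) + buffer
    end
    buffer.lines.last(count)
  end
end
```

Combined with the settings file above, something like `YAML.load_file('settings.yaml').each { |log| puts tail_lines(log) }` would print the tail of every listed file.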

Setting input for system() calls in ruby

I'm trying to download a file using net/sftp and pass its contents as the stdin for a command-line app. I can do it by first writing the file to disk but I'd rather avoid that step.
Is there any way to control the input to a program invoked with system() in ruby?
Don't use system at all for this sort of thing; system is best for running an external command that you don't need to talk to.
Use Open3.popen3 or Open3.popen2 to open some pipes to your external process, then write to the stdin pipe just like writing to any other IO channel. If there is any output to deal with, you can read it straight from the stdout pipe just like reading from any other input IO channel.
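Something like this perhaps (opening a pipe to the command, as mu suggested)?
contents = "Hello, World!"
# cat is used because echo ignores its stdin; the block must write to the pipe, not to $stdout
IO.popen('cat', 'w') { |io| io.puts contents }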
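For the Open3 route, a minimal sketch could use Open3.capture2, which feeds a string to the child's stdin and collects its stdout in one call. The helper name `pipe_through` is hypothetical, and the child here is simply another Ruby process so that the example does not depend on any particular command-line app.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run +command+ with +data+ on its stdin and return its stdout.
def pipe_through(data, *command)
  stdout, status = Open3.capture2(*command, stdin_data: data)
  raise "command failed with status #{status.exitstatus}" unless status.success?
  stdout
end

# The child process upcases whatever it receives on stdin
puts pipe_through("Hello, World!\n", RbConfig.ruby, '-e', 'print STDIN.read.upcase')
```

With net/sftp, the downloaded contents would take the place of the literal string, so nothing has to be written to disk first.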
This can also be accomplished with IO#expect:
require 'pty'
require 'expect'
str = "RUBY_VERSION"
PTY.spawn("irb") do |reader, writer|
  reader.expect(/0> /)
  writer.puts(str)
  reader.expect(/=> /)
  answer = reader.gets
  puts "Ruby version from irb: #{answer}"
end
This waits for the spawned process to display "0> " (the end of an irb prompt) and, when it sees that, writes the given string to the process. It then waits for irb to display "=> " and grabs the value that follows.

How can I delete a file in Sinatra after it has been sent via send_file?

I have a simple sinatra application that needs to generate a file (via an external process), send that file to the browser, and finally, delete the file from the filesystem. Something along these lines:
class MyApp < Sinatra::Base
  get '/generate-file' do
    # calls out to an external process,
    # and returns the path to the generated file
    file_path = generate_the_file()
    # send the file to the browser
    send_file(file_path)
    # remove the generated file, so we don't
    # completely fill up the filesystem.
    File.delete(file_path)
    # File.delete is never called.
  end
end
It seems, however, that the send_file call completes the request, and any code after it does not get run.
Is there some way to ensure that the generated file is cleaned up after it has been successfully sent to the browser? Or will I need to resort to a cron job running a cleanup script on some interval?
Unfortunately, send_file does not provide any callbacks. A common solution is to use cron tasks to clean up temp files.
It could be a solution to temporarily store the contents of the file in a variable, like:
contents = File.read(file_path)
After this, delete the file:
File.delete(file_path)
Finally, return the contents:
contents
This has the same effect as your send_file().
send_file streams the file; it is not a synchronous call, so you may not be able to catch the end of it in order to clean up the file. I suggest using it for static files or really big files. For big files, you'll need a cron job or some other solution to clean up later. You can't do it in the same method, because send_file will not finish while execution is still inside the get block. If you don't really care about the streaming part, you can use the synchronous option:
begin
  file_path = generate_the_file()
  result = File.read(file_path)
  #...
  result # This is the return value
ensure
  File.delete(file_path) if file_path # This will be called
end
Of course, if you're not doing anything fancy with the file, you may stick with Jochem's answer, which eliminates begin-ensure-end altogether.
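A further option, not proposed in the answers above, relies on the Rack convention that a response body's close method, if it has one, is called once the body has been iterated. The sketch below works under that assumption. The class name `SelfDeletingFile` is made up, and whether your server and middleware stack call close reliably (for example when a client disconnects) is something you would need to verify.

```ruby
# Hypothetical response body: streams a file in chunks and deletes it on close.
class SelfDeletingFile
  def initialize(path, chunk_size = 8192)
    @path = path
    @chunk_size = chunk_size
  end

  # Called by the server to send the response, one chunk at a time
  def each
    File.open(@path, 'rb') do |f|
      while (chunk = f.read(@chunk_size))
        yield chunk
      end
    end
  end

  # Called after the body has been sent; removes the generated file
  def close
    File.delete(@path) if File.exist?(@path)
  end
end
```

In the route, the idea would be to return `SelfDeletingFile.new(file_path)` as the body in place of the send_file call and set the content type yourself, since Sinatra accepts any object that responds to each as a response body.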
