Writing a file to a bucket fails on an Elastic Beanstalk application - Ruby

I am writing an application in Ruby on Elastic Beanstalk in which I download a file from a remote server and write it to an object in a bucket.
require 'open-uri'
...
s3 = AWS::S3.new
bucket = s3.buckets['mybucket']
f = open(params[:url]) # using open-uri
obj = bucket.objects[params[:key]]
obj.write(f) # fails here
The last line, however, fails with the following exception in the log:
:data must be provided as a String, Pathname, File, or an object that responds to #read and #eof?
I know, however, from executing the same #open on my machine, that f is a StringIO object, which does have #read and #eof?.
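The asker's observation is easy to confirm in plain Ruby: a StringIO does respond to #read and #eof?, and reading it into a plain String is a reliable fallback, since a String always satisfies the SDK's :data check. A minimal sketch, using a hand-built StringIO in place of the open-uri result:

```ruby
require 'stringio'

f = StringIO.new("downloaded bytes")  # stands in for open(params[:url])
puts f.respond_to?(:read)             # => true
puts f.respond_to?(:eof?)             # => true

data = f.read  # a plain String, which the SDK accepts unconditionally
puts data.class
```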

I was getting the same error while uploading a zip file to S3, and this finally worked for me:
zip_data = File.read(zip_file_path)
That is, zip_data holds the contents of the zip file at that path (located in your tmp directory) as a String.
Hope this works for you too.
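A self-contained sketch of that fix, using a Tempfile to stand in for the zip under tmp (obj is assumed to be the same S3 object as in the question, so the final write is left commented out):

```ruby
require 'tempfile'

tmp = Tempfile.new(['upload', '.zip'])
tmp.write("fake zip bytes")
tmp.close

zip_data = File.read(tmp.path)  # a plain String
puts zip_data.class             # => String
# obj.write(zip_data)           # a String passes the SDK's :data check
```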

Related

How do I get the file metadata from an AWS S3 file with Ruby?

I'm trying to simply retrieve the metadata from a file uploaded to S3. Specifically, I need the content type.
I know the file has metadata, because I can see it in the S3 console. But I'm unable to get it programmatically. I must have some syntax error.
See the code below: file.key returns the file name correctly, but file.metadata doesn't seem to return a hash with data.
s3 = Aws::S3::Resource.new(region: ENV['REGION'])
file = s3.bucket(sourceS3Bucket).object(sourceS3Key)
puts file.key # this works!
puts file.metadata # this returns an empty hash {}
puts file.metadata['content-type'] # empty
As Aleksei Matiushkin suggested, file.data[:content_type] will give the content type.

How to read file from s3?

I'm trying to read a CSV file directly from s3.
I have the S3 URL, but I am not able to open it, as the file isn't on the local system. I don't want to download the file and then read it.
Is there any other way to achieve this?
There are a few ways, depending on the gems that you are using. For example, one of the approaches from the official documentation:
s3 = Aws::S3::Client.new
resp = s3.get_object(bucket:'bucket-name', key:'object-key')
resp.body
#=> #<StringIO ...>
resp.body.read
#=> '...'
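Since resp.body is a StringIO, it can be handed straight to the stdlib CSV parser without touching disk. A sketch with a hand-built StringIO standing in for the real response body:

```ruby
require 'csv'
require 'stringio'

body = StringIO.new("name,qty\napples,3\npears,5\n")  # stands in for resp.body
rows = CSV.parse(body.read, headers: true)
rows.each { |row| puts "#{row['name']}: #{row['qty']}" }
# apples: 3
# pears: 5
```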
Or if you are using CarrierWave/Fog:
obj = YourModel.first
content = obj.attachment.read
You can open the file from URL directly:
require 'open-uri'
csv = open('http://server.com/path-to-your-file.csv').read
I don't think S3 provides any way of reading the file without downloading it.
What you can do is save it in a tempfile:
require 'tempfile'
@temp_file = Tempfile.open("your_csv.csv")
@temp_file.close
`s3cmd get s3://#{@your_path} #{@temp_file.path}`
For further information: http://www.ruby-doc.org/stdlib-1.9.3/libdoc/tempfile/rdoc/Tempfile.html

How to parse an XML file remotely from FTP with the nokogiri gem, without downloading

require 'net/ftp'
require 'nokogiri'
server = "xxxxxx"
user = "xxxxx"
password = "xxxxx"
ftp = Net::FTP.new(server, user, password)
files = ftp.nlst('File*.xml')
files.each do |file|
  ftp.getbinaryfile(file)
  doc = Nokogiri::XML(open(file))
  # some operations with doc
end
With the code above I'm able to parse/read the XML file, because it first downloads the file.
But how can I parse a remote XML file without downloading it?
The code above is a part of rake task that loads rails environment when run.
UPDATE:
I'm not going to create any file. I will import info into the mongodb using mongoid.
If you simply want to avoid using a temporary local file, it is possible to fetch the file contents directly as a String and process them in memory, by supplying nil as the local file name:
files.each do |file|
  xml_string = ftp.getbinaryfile(file, nil)
  doc = Nokogiri::XML(xml_string)
  # some operations with doc
end
This still does an FTP fetch of the contents, and XML parsing happens at the client.
It is not really possible to avoid fetching the data in some form or other; if FTP is the only protocol you have available, that means copying the data over the network with an FTP get. It is possible, though far more complicated, to add capabilities to your FTP (or other network-based) server and return the data in some other form. That could include Nokogiri parsing done remotely on the server, but you would still need to serialise the end result, fetch it, and deserialise it.
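To see the in-memory half of this in isolation: any parser that accepts a String works on the fetched bytes, and Nokogiri::XML(xml_string) does exactly that. The sketch below uses stdlib REXML only so it runs without the nokogiri gem; the XML content is a made-up stand-in for what ftp.getbinaryfile(file, nil) would return:

```ruby
require 'rexml/document'

# Stands in for: xml_string = ftp.getbinaryfile(file, nil)
xml_string = "<catalog><item id='1'>Widget</item></catalog>"

doc = REXML::Document.new(xml_string)   # parse entirely in memory
puts doc.elements['catalog/item'].text  # => Widget
```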

How can I use fog to edit a file on s3?

I have a bunch of files on s3. I have fog set up with a .fog config file so I can fire up fog and get a prompt. Now how do I access and edit a file on s3, if I know its path?
The easiest thing to do is probably to use IRB or Pry to get a local copy of the file, or write a simple script to download, edit, and then re-upload it. Assume you have a file named data.txt.
You can use the following script to initialize a connection to S3.
require 'fog'
connection = Fog::Storage.new({
  :provider => 'AWS',
  :aws_secret_access_key => YOUR_SECRET_ACCESS_KEY,
  :aws_access_key_id => YOUR_ACCESS_KEY_ID
})
directory = connection.directories.get("all-my-data")
Then use the directory object to get a copy of your file on your local file-system.
local_file = File.open("/path/to/my/data.txt", "w")
file = directory.files.get('data.txt')
local_file.write(file.body)
local_file.close
Edit the file using your favorite editor and then upload it to S3 again.
file = directory.files.get('data.txt')
file.body = File.open("/path/to/my/data.txt")
file.save

How to FTP in Ruby without first saving the text file

Since Heroku does not allow saving dynamic files to disk, I've run into a dilemma that I am hoping you can help me overcome. I have a text file that I can create in RAM. The problem is that I cannot find a gem or function that would allow me to stream the file to another FTP server. The Net/FTP gem I am using requires that I save the file to disk first. Any suggestions?
ftp = Net::FTP.new(domain)
ftp.passive = true
ftp.login(username, password)
ftp.chdir(path_on_server)
ftp.puttextfile(path_to_web_file)
ftp.close
The ftp.puttextfile function is what requires a physical file to exist.
StringIO.new provides an object that acts like an opened file. It's easy to create a method like puttextfile by using a StringIO object instead of a file.
require 'net/ftp'
require 'stringio'
class Net::FTP
  def puttextcontent(content, remotefile, &block)
    f = StringIO.new(content)
    begin
      storlines("STOR " + remotefile, f, &block)
    ensure
      f.close
    end
  end
end
file_content = <<filecontent
<html>
<head><title>Hello!</title></head>
<body>Hello.</body>
</html>
filecontent
ftp = Net::FTP.new(domain)
ftp.passive = true
ftp.login(username, password)
ftp.chdir(path_on_server)
ftp.puttextcontent(file_content, path_to_web_file)
ftp.close
David at Heroku gave a prompt response to a support ticket I entered there.
You can use APP_ROOT/tmp for temporary file output. The existence of files created in this dir is not guaranteed outside the life of a single request, but it should work for your purposes.
Hope this helps,
David
