I have a server side API running on Heroku for one of my iOS apps, implemented as a Ruby Rack app (Sinatra). One of the main things the app does is upload images, which the API then processes for meta info like size and type and then stores in S3. What's the best way to handle this scenario on Heroku since these requests can be very slow as users can be on 3G (or worse)?
Your best option is to upload the images directly to Amazon S3 and then have it ping you with the details of what was uploaded.
https://devcenter.heroku.com/articles/s3#file-uploads
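To make the shape of that concrete, here is a minimal sketch of handing the client a presigned POST it can upload to directly. It uses boto3 (Python) for brevity rather than the asker's Ruby stack (the aws-sdk-s3 gem exposes the same presigned-post feature), and the bucket name is a placeholder:

```python
# Minimal sketch of the direct-to-S3 approach with a presigned POST, using
# boto3 rather than the asker's Ruby stack; the bucket name is hypothetical.
import boto3

s3 = boto3.client("s3")

def presigned_upload(key):
    # Returns a URL plus form fields the iOS client can POST the image to,
    # so the slow 3G transfer never ties up a Heroku dyno.
    return s3.generate_presigned_post(
        Bucket="my-app-uploads",                            # hypothetical bucket
        Key=key,
        Fields={"Content-Type": "image/jpeg"},
        Conditions=[
            {"Content-Type": "image/jpeg"},
            ["content-length-range", 1, 10 * 1024 * 1024],  # cap at ~10 MB
        ],
        ExpiresIn=3600,
    )
```

Once the client finishes the upload, it can call your Sinatra API with the object key, and the API can read the size and type from S3 (or react to an S3 event notification) in the background.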
I am building a Video service using Azure Media Services and Node.js
Everything went semi-fine until now; however, when I deployed the app to Azure Web Apps for hosting, any large upload fails with a 404.13 error.
Yes, I know about maxAllowedContentLength, and no, that is NOT a solution. It only goes up to 4GB, which is pathetic - even HDR environment maps can easily exceed that amount these days. I need to enable users to upload files up to 150GB in size. However, when Azure Web Apps receives a multipart request, it appears to buffer it into memory until a certain threshold of either bytes or just seconds (upon hitting which, it returns me a 404.13, or a 502 if my connection is slow) BEFORE running any of my server logic.
I tried setting the Transfer-Encoding: chunked header in the server code, but even if that would help, it doesn't actually matter, since Web Apps doesn't let my code run in the first place.
For the record: I am using Sails.js on the backend, and Skipper is handling the stream piping to Azure Blob storage. Localhost obviously works just fine regardless of file size. I made a duplicate of this question on the MSDN forums, but those are as slow as always. You can go there to see what I have found so far: enter link description here
Client-side, I am using Ajax FormData to serialize the fields (one text field and one file) and send them, using the progress event to track upload progress.
Is there ANY way to make this work? I just want it to let my server-side logic handle the data stream, without buffering the bloody thing.
Rather than running all this data through your web application, you would be better off having your clients upload directly to a container in your Azure blob storage account.
You will need to enable CORS on your Azure Storage account to support this. Then, in your web application, when a user needs to upload data, you instead generate a SAS token for the storage-account container you want the client to upload to and return that token to the client. The client then uses the SAS token to upload the file into your storage account.
On the back-end, you could fire off a web job to do whatever processing you need to do on the file after it's been uploaded.
Further details and sample ajax code to do this are available in this blog post from the Azure Storage team.
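As a rough sketch of issuing such a token (using the Python azure-storage-blob package rather than the asker's Node/Sails stack; the @azure/storage-blob SDK has an equivalent generateBlobSASQueryParameters call, and the account, key, and container names are placeholders):

```python
# Sketch: issue a write-only SAS for a single blob so the client can upload
# straight to Blob storage. Account name, key, and container are placeholders.
from datetime import datetime, timedelta
from azure.storage.blob import generate_blob_sas, BlobSasPermissions

ACCOUNT = "mystorageaccount"   # hypothetical storage account
ACCOUNT_KEY = "..."            # account key from your app settings
CONTAINER = "uploads"

def upload_url(blob_name):
    sas = generate_blob_sas(
        account_name=ACCOUNT,
        container_name=CONTAINER,
        blob_name=blob_name,
        account_key=ACCOUNT_KEY,
        permission=BlobSasPermissions(create=True, write=True),
        expiry=datetime.utcnow() + timedelta(hours=2),
    )
    # The client uploads the 150 GB file (in blocks) straight to this URL,
    # bypassing the Web App's request-size limits entirely.
    return f"https://{ACCOUNT}.blob.core.windows.net/{CONTAINER}/{blob_name}?{sas}"
```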
We are currently developing a service to share photos about people's interests, and we are using the technologies below. (Newbies, btw.)
For the backend:
Node.js
MongoDB
Amazon S3
For the frontend:
iOS
Android
Web (AngularJS)
Storing and serving images is a big deal for our service (it must be fast), so we are thinking about performance. We stored photos in MongoDB at first, but we then switched to AWS S3.
So,
1. Our clients can upload images from the app(s).
2. We handle these images in Node.js and send them to AWS S3 storage (a rough sketch of steps 2-4 follows this list).
3. S3 sends a URL back to us.
4. We save the URL in the user's related post.
5. So, when the user wants to see his photos, the app fetches them using their URLs.
6. Finally, the images go from S3 to the user directly.
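A rough sketch of steps 2-4 (shown with Python's boto3 and pymongo for brevity; the equivalent Node.js SDK calls are one-to-one, and the bucket and collection names are made up):

```python
# Sketch of steps 2-4: take the image bytes, push them to S3, and save the
# resulting URL on the user's post. Bucket and collection names are made up,
# and boto3/pymongo stand in for the equivalent Node.js SDK calls.
import boto3
from pymongo import MongoClient

s3 = boto3.client("s3")
posts = MongoClient().photoapp.posts   # hypothetical MongoDB collection

def store_photo(post_id, image_bytes, key):
    # Step 2: hand the bytes to S3 instead of keeping them in MongoDB.
    s3.put_object(Bucket="photo-service-images", Key=key,
                  Body=image_bytes, ContentType="image/jpeg")
    # Step 3: build the object's URL (a CloudFront domain could go here instead).
    url = f"https://photo-service-images.s3.amazonaws.com/{key}"
    # Step 4: store only the URL with the post.
    posts.update_one({"_id": post_id}, {"$set": {"photo_url": url}})
    return url
```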
Is this a good way to handle the situation, or is there a better way to do it?
Thanks
I am building an Android app which has a backend written in Ruby/Sinatra. The data from the Android app comes in as JSON.
The database being used is MongoDB.
I am able to catch the data on the backend. Now what I want to do is upload a video, sent from the Android app in the form of a byte array, to Amazon S3.
I also want to store the video as a string in the local database.
I have been using the carrierwave, fog, and carrierwave-mongoid gems but haven't had any luck.
These are some of the blogs I followed:
https://blog.engineyard.com/2011/a-gentle-introduction-to-carrierwave/
http://www.javahabit.com/2012/06/03/saving-files-in-amazon-s3-using-carrierwave-and-fog-gem/
It would help if someone could guide me on how to go about it, specifically with Sinatra and MongoDB, because that's where I am facing the main issue.
You might think about using the AWS SDK for Android to upload directly to S3, so that your app server thread doesn't get stuck while a user is uploading a file. If you are using a service like Heroku, you would be paying extra $$$ just because your user had a lousy connection.
However, in this scenario:
Uploading to S3 should be straightforward once you have your mount in place using CarrierWave.
You should never store your video in the database, as it will slow you down! DBs are not optimised for files; OSs are. Video is binary data and cannot be stored as text; you would need a blob type if you want to commit that crime.
IMO, uploading to S3 is good enough, as you can then use Amazon CloudFront CDN services to copy and distribute your content in a more optimised way.
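If it helps, here is a rough sketch of the server-side piece of that idea: hand the Android app a short-lived presigned PUT URL and store only the S3 key (never the video bytes) in MongoDB. It uses boto3 and pymongo for brevity instead of CarrierWave/Mongoid, and the bucket and collection names are made up:

```python
# Sketch: hand the Android client a short-lived PUT URL for the video and keep
# only the S3 key in MongoDB. boto3/pymongo stand in for CarrierWave/Mongoid;
# the bucket and collection names are hypothetical.
import boto3
from pymongo import MongoClient

s3 = boto3.client("s3")
videos = MongoClient().myapp.videos    # hypothetical collection

def create_upload(video_id):
    key = f"videos/{video_id}.mp4"
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": "my-video-bucket", "Key": key,
                "ContentType": "video/mp4"},
        ExpiresIn=900,
    )
    # Store the reference, never the bytes: the DB document stays tiny.
    videos.insert_one({"_id": video_id, "s3_key": key})
    return url   # the Android app PUTs its byte array to this URL
```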
The app I am currently hosting on Heroku allows users to submit photos. Initially, I was thinking about storing those photos on the filesystem, as storing them in the database is apparently bad practice.
However, it seems there is no permanent filesystem on Heroku, only an ephemeral one. Is this true and, if so, what are my options with regards to storing photos and other files?
It is true. Heroku allows you to create cloud apps, but those cloud apps are not "permanent" - they are instances (or "slugs") that can be replicated multiple times on Amazon's EC2 (that's why scaling is so easy with Heroku). If you were to push a new version of your app, the slug would be recompiled, and any files you had saved to the filesystem in the previous instance would be lost.
Your best bet (whether on Heroku or otherwise) is to save user-submitted photos to a CDN. Since you are on Heroku, and Heroku uses AWS, I'd recommend Amazon S3, optionally with CloudFront enabled.
This is beneficial not only because it gets around Heroku's ephemeral "limitation", but also because a CDN is much faster, and will provide a better service for your webapp and experience for your users.
Depending on the technology you're using, your best bet is likely to stream the uploads to S3 (Amazon's storage service). You can interact with S3 with a client library to make it simple to post and retrieve the files. Boto is an example client library for Python - they exist for all popular languages.
Another thing to keep in mind is that Heroku filesystems are not shared either. This means you'll have to put the file to S3 from the same application that handles the upload (instead of, say, a worker process). If you can, try to load the upload into memory, never write it to disk, and post it directly to S3. This will increase the speed of your uploads.
Because Heroku is hosted on AWS, the streams to S3 happen at a very high speed. Keep that in mind when you're developing locally.
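As a sketch of that approach, using boto3 (the successor to the Boto library mentioned above); the bucket name and handler signature are illustrative:

```python
# Sketch: take the uploaded file object from the request, keep it in memory,
# and stream it to S3 without ever writing to the dyno's ephemeral filesystem.
# The bucket name is a placeholder.
import boto3

s3 = boto3.client("s3")

def handle_upload(file_obj, key):
    # file_obj is the framework's file-like upload object (e.g. what Django or
    # Flask hand you); upload_fileobj streams it to S3 in parts.
    s3.upload_fileobj(file_obj, "my-app-uploads", key)
    return f"https://my-app-uploads.s3.amazonaws.com/{key}"
```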
I have a unique set-up that I am trying to determine whether Heroku can accommodate. There is so much marketing around polyglot applications, but only one example I can actually find!
My application consists of:
A website written in Django
A separate Java application, which takes files uploaded by users, parses them, and stores the data in a database
A shared database accessible by both applications
Because these user-uploaded files can be enormous, I want the uploaded file to go directly to the Java application. My preferred architecture is:
The Django-generated webpage displays the upload form.
The form does an AJAX submit to the Java application
The browser starts polling the database to see if the Java application has inserted the data (a sketch of a polling endpoint for this follows the list)
Meanwhile the Java application does its thing with the user-uploaded file and updates the database when it's done
The Django webpage AJAX-refreshes a div with the results of the user upload once the polling mechanism sees that the upload is complete
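The polling piece could be as small as a Django view that the browser hits every few seconds; a sketch, assuming a hypothetical Upload model backed by the shared database:

```python
# Sketch of the polling endpoint the browser could hit every few seconds: a
# Django view that checks whether the Java application has written its results
# yet. The Upload model and its fields are hypothetical.
from django.http import JsonResponse

from .models import Upload   # hypothetical model backed by the shared database

def upload_status(request, upload_id):
    upload = Upload.objects.filter(pk=upload_id).first()
    if upload is None or not upload.processed:
        return JsonResponse({"status": "pending"})
    # The Java application has finished parsing and inserted the data.
    return JsonResponse({"status": "done", "summary": upload.summary})
```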
The big issue I can't figure out here is whether I can get both the Django and the Java apps either running on the same set of dynos, or on different dynos but under the same domain, to avoid AJAX cross-domain issues. Does Heroku support URL-level routing? For example:
Django application available at http://www.myawesomewebsite.com
Java application available at http://www.myawesomewebsite.com/javaurl/
If this is not possible, does anyone have any ideas for work-arounds? I know I could have the user upload the file to Django and have Django send the request to Java from the server-side instead of the client side, but that's an awful lot of passing around of enormous files.
Thanks so much!
Heroku does not support routing based on the URL path. Polyglot components should exist as their own subdomains and operate in a cross-domain fashion.
As a side note: have you considered uploading directly to S3 instead of uploading to your app on Heroku, which will then (presumably) upload to S3? If you're dealing with cross-domain file uploads, this is worth considering for its high level of scalability.
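If you go that route, the main prerequisite is a CORS rule on the bucket so the browser at your domain can PUT/POST to it directly; for example, set with boto3 (the bucket name and origin are placeholders):

```python
# Sketch: allow the browser at the site's domain to PUT/POST directly to the
# S3 bucket. The bucket name and origin are placeholders.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_cors(
    Bucket="my-upload-bucket",
    CORSConfiguration={
        "CORSRules": [{
            "AllowedOrigins": ["https://www.myawesomewebsite.com"],
            "AllowedMethods": ["PUT", "POST"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,
        }]
    },
)
```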