Deployment fails because of obscure REXML setting? - ruby

I'm using EngineYard for my production system. My deployment has Ruby 1.9.3p392. I develop on Ruby 1.9.3p429.
I get notifications from a 3rd party server that contain large XML files (larger than 10K anyway).
After a new deployment, for some reason, all of my notifications from this party are FAILING because the XML is greater than the 10K limit.
So on my dev instance I added the following line to application.rb:
REXML.entity_expansion_text_limit=102400
But that makes my deployment fail. So I look around and try another iteration:
REXML::Document.entity_expansion_text_limit=102400
Nope, that particular version of Ruby has no idea what I'm talking about.
What can I do to overcome this 10K default?

For some reason, REXML::Document needs to be required explicitly on EngineYard. Here's what I did to fix my deployment.
In application.rb:
require 'rexml/document'
REXML::Document.entity_expansion_text_limit=102400
That appears to have done it.
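For anyone landing here later, here is a minimal sketch of the same fix that should also work on newer REXML versions. As far as I know, the setter lives on REXML::Security there and the REXML::Document one is kept only as a deprecated alias. The constant name REXML_TEXT_LIMIT is made up for this example.

```ruby
require 'rexml/document'

# 100 KB instead of the 10 KB default. The constant name is an assumption
# made for this sketch, not something REXML defines.
REXML_TEXT_LIMIT = 100 * 1024

if defined?(REXML::Security)
  # Newer REXML versions keep the limits on REXML::Security.
  REXML::Security.entity_expansion_text_limit = REXML_TEXT_LIMIT
else
  # Older versions (like the one in the question) only have the Document setter.
  REXML::Document.entity_expansion_text_limit = REXML_TEXT_LIMIT
end
```

Either way, the require has to come before the assignment, which is what was missing in the failing deployment.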

Related

How to deploy a Ruby app - not ROR?

I am very new to Ruby and I wonder: is it possible to have my Ruby script deployed on a server?
Or do I have to use Rails?
As I understand it, Rails is not part of the core Ruby language, and Ruby has server functionality even without Rails (as in Java, PHP, etc.).
EDIT:
I have a Ruby script that runs as a command-line program, and I want to deploy it to an external (or even internal) server the way CGI scripts/programs used to be deployed.
Yes, you can deploy any Ruby application, not just Rails apps obviously. Take a look at Capistrano.
Deployment and serving are two different things however. If you're looking for Ruby HTTP servers look at Unicorn, Thin, WEBrick, Puma.
If you want a fully-fledged solution try Heroku which handles both the deployment and web serving parts.
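To make the "serving" part concrete: the servers listed above all speak Rack, and a Rack application is just an object that responds to call(env) and returns a status, headers and body. Here is a minimal sketch without Rails. The class name ScriptApp and the response text are made up for this example.

```ruby
# A Rack application: any object with #call(env) that returns
# [status, headers, body], where body responds to #each.
# ScriptApp is a hypothetical name used only in this sketch.
class ScriptApp
  def call(env)
    query = env.fetch('QUERY_STRING', '')
    body  = "Hello from plain Ruby (query: #{query})\n"
    headers = {
      'content-type'   => 'text/plain',
      'content-length' => body.bytesize.to_s
    }
    [200, headers, [body]]
  end
end
```

In a config.ru file this would be wired up with `run ScriptApp.new` and started by whichever Rack server you pick.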
There are many tools to deploy Ruby projects, but you can do it pretty much manually.
I also found it very hard to find a ready-to-use solution and I think this is a very annoying gap in the RoR framework.
I've been working on a solution to deploy a project to a server using Git, like the Heroku toolbelt (google it, it is a really nice tool). The main concept is: you use Git to push your project and the server does everything else! Here you can see my project: https://github.com/sentient06/RDH/.
But please, don't focus on that. Instead, read how I arrived at all this information in the wiki: https://github.com/sentient06/RDH/wiki.
It is a bit outdated, but I can summarize it here for you:
First, set up your server. This is the most boring part: you must set up all the configuration, security measures, remote access, etc.
If you don't have a server, you can rent one specifically for RoR applications. There are a few good ones out there and each has a different deployment workflow. But supposing you decide to set it up yourself:
I suggest you use any Linux or Unix system, server version. Then install Ruby Version Manager, then Ruby and then Rails. Then install a server application. I suggest Thin, but lots of people use Unicorn or Apache or other servers. Dig a little bit on the internet and find an easy-to-use solution. If you do not use Apache, though, you will need a "reverse proxy" too, so you can redirect all requests on ports 80, 8080, etc. to your applications. I suggest Nginx (I don't like Apache, I think it is overkill).
Now, everything done, the deploy process can be done more or less like this:
1 - Commit everything and get the updated files onto the server;
2 - In the server, cd to the directory of your application and execute these commands:
$ bundle package
$ bundle install --deployment
$ RAILS_ENV=production rake db:migrate
$ rake assets:precompile
3 - Restart the server and, if necessary, the reverse proxy.
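If you get tired of typing the commands from step 2, they can be wrapped in a small Ruby script that stops at the first failure. This is only a sketch: the names DEPLOY_STEPS and run_deploy are made up, and the commands are copied from the list above as-is.

```ruby
# The commands from step 2, in order. DEPLOY_STEPS and run_deploy are
# hypothetical names used only for this sketch.
DEPLOY_STEPS = [
  'bundle package',
  'bundle install --deployment',
  'RAILS_ENV=production rake db:migrate',
  'rake assets:precompile'
].freeze

# Runs each step and raises as soon as one fails. The runner is
# injectable so the sequence can be exercised without touching a real app.
def run_deploy(steps = DEPLOY_STEPS, runner: ->(cmd) { system(cmd) })
  steps.each do |cmd|
    puts "$ #{cmd}"
    raise "deploy step failed: #{cmd}" unless runner.call(cmd)
  end
  true
end
```

Calling run_deploy from the application directory on the server runs the four commands in sequence. Restarting the server (step 3) is left out because it depends on which server you chose.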
Dig on the internet to understand each command. These will pretty much force your application into production mode, reduce the space used by your JavaScript and CSS, migrate your production database and install the bundles. Production RoR is not so different from development RoR, it is just more compact and faster.
I do hope this information is useful.
Good luck!
Update:
I forgot to mention: check ruby-toolbox, it has some really useful statistics and information on how often Rails technologies are being updated. They have many categories; this one is on deployment automation, so give it a look: https://www.ruby-toolbox.com/categories/deployment_automation.
Cheers!

Best practice for adding other language files in ruby project

I have a Ruby project written purely in Ruby. Now I want to include a Java archive (JAR) file which has some functionality my users want. Is it good to just place the file in one of the directories and bundle it as a gem? Are there any security issues related to this? Any advice would be greatly appreciated.
The answer is that it depends on the use case.
If this is a gem that users will be using purely for their own purposes, and it's not broadcasting over a network, then security issues are fairly minimal - they would relate more to system security.
If part of your program involves binding to a port and accepting TCP/UDP connections then you've got to really start thinking about network security. Another possible problem is if you're giving file system access to non-privileged users (e.g. if this is a rails gem, and the JAR gives functionality to manipulate the file system and for some reason you're passing this on to the site users - bit of a stupid example but I hope you see what I'm getting at).
However, as for running a java JAR file, there's nothing innately insecure about that unless there are known security flaws with that particular JAR.
In the end, it's up to the end-user of the gem. Make it clear what the gem does and they can make the decision about whether they want to use it.

Upload large files using Ruby

I'm wondering what is the best pattern to allow large files to be uploaded to a server using Ruby.
I've found Rails and Large, Large file Uploads: Looking at the alternative but it doesn't give any concrete solutions.
I don't want to use Rails since I'm working on a simple upload server that'll run in standalone mode. I'm guessing that Sinatra could be the key but I don't know which web server I should use to run it without raising a Timeout.
I also need this web server to allow simultaneous upload.
UPDATE: By "large files" I mean between 200MB and 5GB.
UPDATE2: Since those files are videos (in my case), I can deal with a max size of 2GB like YouTube.
OK, I am taking a bit of a stretch here, but:
If you use CouchDB as the target for your uploads, you get rid of the timeout problem.
Consider CouchDB as some "temp" storage in this example.
So when an upload finishes, you can take the file from CouchDB and do with it whatever you want.
I managed to upload files as big as 9GB over a DSL line into CouchDB without any drama.
It may take a bit of reading, but I think you could make it work.
CouchDB has many Rails gems, so it plays nice with others ;)
Let me know if you want to go down that rabbit hole so I can give you some more pointers.
Passenger recommends using a separate Apache/nginx module to handle uploads.
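Whichever front end receives the upload, the Ruby side should avoid reading a multi-gigabyte body into memory. Neither answer above shows this, so treat the following as a separate sketch: it copies an IO to disk in fixed-size chunks and enforces a size cap. The names save_upload and CHUNK_SIZE are made up for this example.

```ruby
# Size of each read; CHUNK_SIZE and save_upload are hypothetical names
# used only in this sketch.
CHUNK_SIZE = 1024 * 1024 # 1 MB

# Copies `input` (any IO-like object with #read) to dest_path chunk by
# chunk, so memory use stays near CHUNK_SIZE regardless of upload size.
# Raises ArgumentError and removes the partial file when max_bytes is exceeded.
def save_upload(input, dest_path, max_bytes:)
  written = 0
  File.open(dest_path, 'wb') do |out|
    while (chunk = input.read(CHUNK_SIZE))
      written += chunk.bytesize
      raise ArgumentError, "upload larger than #{max_bytes} bytes" if written > max_bytes
      out.write(chunk)
    end
  end
  written
rescue StandardError
  File.delete(dest_path) if File.exist?(dest_path)
  raise
end
```

In a Rack or Sinatra app the request body is available as env['rack.input'], so that is what would be passed as input, with max_bytes set to the 2GB cap from the question.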

Writing an entire application on top of Capistrano

I am working on a task that needs to check out source from a GitHub repository, and then modify some files from the checked-out repository based on some existing configuration data that's coming from a separate call to a different web service as JSON. The changes to the checked-out code are temporary and will not be pushed back to GitHub.
Once the checked out source is processed and modified based upon the configuration data, I will create a compressed archive of the resulting source.
I just discovered Capistrano and it seems great for this entire process, although it has nothing to do with deployment. On the other hand, I could simply use plain Ruby to do the same stuff. Currently, I am weighing more on the side of using Capistrano with custom tasks.
So you can say that it's an app based on Capistrano itself, with local deployment. Does it sound like a sane approach? Should I write it in plain Ruby instead? Or maybe write parts of the application in pure Ruby, and connect the pieces with Capistrano. Any suggestion is welcome.
I sincerely recommend Thor (see GitHub). It's a pure-Ruby task framework like Rake (whereas Capistrano has a lot of cruft for server cluster grouping and connection handling, and Rake has a lot to do with more classical "Make" or build tasks).
My recommendation is a set of Thor tasks, using raw Net::SSH (cap is based on Net::SSH) where appropriate.
For the checking out I recommend you watch the "Amp" project. They're coming up with a consistent cross-SCM way to do checkouts (but that's the least of your problems). You can take a look here, but it's early days for them yet: http://github.com/michaeledgar/amp
Source: as the Capistrano maintainer, I'm planning on throwing out our own DSL and replacing it with Thor, since it makes a lot more sense.
As for me, I write things like these in a Rakefile, and then use a rake command to call them.
You can find that Rakefiles are similar to Capfiles, so rake is usually used to perform some local tasks, and cap for remote.
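To give an idea of the Rakefile route, here is a rough sketch of the three stages from the question as chained Rake tasks. Everything named here is a placeholder: the repository URL, the directories and config.json are assumptions, and writing settings.json into the checkout merely stands in for whatever file modifications you actually need.

```ruby
require 'rake' # not needed inside a real Rakefile, rake loads itself there
require 'json'

# All of these values are hypothetical placeholders for this sketch.
REPO_URL    = ENV.fetch('REPO_URL', 'https://example.com/placeholder.git')
WORK_DIR    = 'build/source'
CONFIG_FILE = 'config.json' # JSON already fetched from the web service
ARCHIVE     = 'build/source.tar.gz'

desc 'Check out a fresh copy of the repository'
task :checkout do
  rm_rf WORK_DIR
  sh 'git', 'clone', '--depth', '1', REPO_URL, WORK_DIR
end

desc 'Apply the configuration data to the checked-out files'
task configure: :checkout do
  settings = JSON.parse(File.read(CONFIG_FILE))
  # Stand-in for the real modifications: drop the settings into the tree.
  File.write(File.join(WORK_DIR, 'settings.json'), JSON.pretty_generate(settings))
end

desc 'Create a compressed archive of the modified source'
task archive: :configure do
  sh 'tar', '-czf', ARCHIVE, '-C', WORK_DIR, '.'
end
```

Running `rake archive` would then execute checkout, configure and archive in that order, because each task lists the previous one as a prerequisite.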

Why is my sinatra website so slow?

After asking this question, I started using Sinatra as a way to serve web pages.
This evening, a friend of mine and I started to test the speed of the server.
The file to log in looks like:
require 'rubygems'
require 'sinatra'
require 'haml'
enable :sessions #for cookies!
get '/' do
  haml :index
end
And the index.haml looks like:
%title
  First Page
%header
  %h2 First Page
He's sitting on a recent laptop, as am I, with an Apple 802.11n router between the two of us. We're both running Windows 7. I've also tried these same files on a laptop running Ubuntu 9.10 x64 with Sinatra and all relevant files installed from apt-get.
Sinatra is taking 7 seconds to serve up a single page request, no matter the server OS, Windows or Linux. I see that here the author managed to get over 400 requests/second processed. What gives? (or should this be on SuperUser or the like?)
I'll set aside any opinions on when you should optimize your web application.
Set up different configurations in your Sinatra app for development and production, because you won't always want to use some of these suggestions. In fact, you should probably go ahead and set up an environment similar to how you would deploy in production. You would not deploy by simply running ruby app.rb. You'd want to put Apache or nginx in front of Mongrel. Mongrel will serve up your static files, but that's really only advisable for development mode. In deployment, a web server is going to do a much better job of that. In short, your deployed environment will be faster than your standalone development environment.
At this point, I wouldn't worry about Mongrel vs. Thin. If Thin is twice as fast - it isn't - then your 7 seconds becomes 3.5. Will that be good enough?
Some things to try ...
I know I just told you to set up a deployment environment, but maybe it's not the server side. Have you tried running YSlow or PageSpeed on your pages? I/O is going to take up more of those 7 seconds than the server (disclaimer: I'm assuming that there's nothing wrong with your network setup). YSlow (Firebug, actually) will tell you how long each part of your page takes to get to the browser.
One of the things that YSlow told me to do was to put a far forward Expires header on my static assets, which I knew but I was leaving optimization until the end. That's when I realized that there were at least 3 different places that I could specify that header. I'm convincing myself that doing it in nginx is the right place to put it.
If you're happy with those results, then you can look at the server. Off the top of my head, so not exhaustive:
Turn on gzip responses.
Combine your stylesheets so there's only one per page request. There may be some Rack Middleware for this, if you don't do it manually.
Cache. I'm trying Rack::Cache.
Use sprites to decrease the number of image downloads you use.
Minify your Javascript. Again, maybe via Rack Middleware.
Rack Middleware is neat, but it uses CPU. So, manually minifying your Javascript adds a new step to your workflow, but on the server, it's faster than Middleware. It's a tradeoff.
Sorry if this was rambly.
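To show what the "turn on gzip responses" and "Rack Middleware" suggestions above amount to, here is a toy middleware that compresses the response when the client accepts gzip. It is a sketch for illustration only and the class name TinyGzip is made up. In a real app you would more likely reach for the Rack::Deflater middleware that ships with Rack, or let nginx do the compression.

```ruby
require 'zlib'

# Toy Rack middleware: gzips the whole response body when the client
# sends "gzip" in Accept-Encoding. TinyGzip is a hypothetical name.
class TinyGzip
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    accepts = env['HTTP_ACCEPT_ENCODING'].to_s
    return [status, headers, body] unless accepts.include?('gzip')

    # Collect the body parts, then compress them in one go. This buffers
    # the whole response, which is fine for small pages only.
    parts = []
    body.each { |part| parts << part }
    body.close if body.respond_to?(:close)
    zipped = Zlib.gzip(parts.join)

    new_headers = headers.merge(
      'content-encoding' => 'gzip',
      'content-length'   => zipped.bytesize.to_s
    )
    [status, new_headers, [zipped]]
  end
end
```

This is also where the CPU tradeoff mentioned above comes from: the compression happens on every request unless something caches the result.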
I had this problem when running Sinatra with shotgun but not when running my app directly (i.e., ruby -rubygems app.rb). This is because shotgun forks and reloads the application for each request.
I found a thread in Sinatra's mailing list which discussed this issue and people there advised using rerun instead of shotgun. I'm happy to say it solved this issue for me.
Try using Thin as the server. I noticed an increase in performance compared with WEBrick and Mongrel.
gem install thin
When you run your app using ruby TestServer.rb you'll see the following:
Sinatra/0.10.1 has taken the stage on 4567 for development with backup from Thin
I'm running Sinatra inside VMWare Fusion with Vagrant. My app was running slowly (about ten seconds to service a request). Then I found this gem:
Webrick is very slow to respond. How to speed it up?
It seems that WEBrick was (by default) configured to reverse dns lookup on every request, and that was slowing it down.
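If the reverse DNS lookup really is the culprit, one process-wide way to switch it off is the standard socket setting below. This is a sketch under that assumption. Recent Ruby versions may already default to this, and WEBrick also has its own DoNotReverseLookup config option for the same purpose.

```ruby
require 'socket'

# Skip reverse DNS lookups for peer addresses on sockets created by this
# process. Put this before the server starts, e.g. at the top of the app file.
BasicSocket.do_not_reverse_lookup = true
```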
