I'm using the excellent Fog gem to access just the Rackspace Cloud Files service. My challenge is that I'm trying to keep the service that is accessing Cloud Files lightweight, and it seems that Fog through its flexibility has a lot of dependencies and code I'll never need.
Has anybody tried to build a slimmed-down copy of Fog that includes only a subset of providers, and therefore limits the dependencies? For example, for the Rackspace Cloud Files API exclusively, I'd expect to be able to handle everything without the net-ssh, net-scp, and nokogiri gems, and without all the unused code for Amazon and the 20 or so other providers I'm not using. I'm hoping to avoid upgrading the gem every time a bug is fixed in one of those unused providers, while keeping my memory footprint down.
I'd appreciate any experience anybody may have in doing this, or advice from anybody familiar with building Fog in what I can and can't rip out.
If I'm just using the wrong gem, then that's equally fine. I'll move to something more focused.
I'm the maintainer of fog so I'll chime in to help fill in some of the explanation/gaps. The short answer is, yeah it's a lot of stuff, but mostly it shouldn't impact you negatively.
First off, fog grew pretty organically over time, so it did get bigger than intended. One of the ways that we contend with this is that we rather aggressively avoid requiring/loading files until really needed. So although you have to download lots of provider files you won't use to install fog, they shouldn't actually end up in memory. That was the simplest thing we could do in order to keep things "just working" while also reducing the memory usage (and load time).
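As a rough illustration of the deferred-loading idea described above (this is not fog's actual code), Ruby's `autoload` registers a constant together with the file that defines it, and the file is only read the first time the constant is referenced. The `Providers` module, the `Rackspace` class, and the temporary file are made up for the demonstration.

```ruby
require "tmpdir"

# Hypothetical provider file, written to a temp dir so the sketch is self-contained.
PROVIDER_FILE = File.join(Dir.mktmpdir, "rackspace_provider.rb")
File.write(PROVIDER_FILE, <<~RUBY)
  module Providers
    class Rackspace
      def name
        "rackspace"
      end
    end
  end
RUBY

module Providers
  # Register the constant only; the file is not read at this point.
  autoload :Rackspace, PROVIDER_FILE
end

puts Providers.autoload?(:Rackspace).inspect # registered path: not loaded yet
puts Providers::Rackspace.new.name           # first reference triggers the require
puts Providers.autoload?(:Rackspace).inspect # nil once the file has been loaded
```

Code for providers that are never referenced stays on disk, which is the effect the answer describes: installed, but not in memory.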
The release schedule doesn't tend to be too crazy (on average about once a month) and tends to include a mix of stuff across most of the providers. So I'd expect you won't have too much churn here (outside of emergency/security type fixes which might warrant shortening the normal cycle).
So, that hopefully provides some insight into the state of the art. We have also discussed starting to split things out more in the longer term. I think if/when that happens we would end up with something like fog-rackspace for all the Rackspace-related things, which could then share things through fog-core or similar. We have a rough outline, but it is a pretty big undertaking without a huge upside, so it isn't something we have really actively begun on.
Hope that helps, certainly happy to discuss further if you have further questions or concerns.
I work for Rackspace on, among other things, our Ruby SDKs. You're using the right gem. Fog is our official Ruby API.
This is possibly something that could be done by introducing another gemspec into the project that builds from only fog core and the Rackspace-specific files. This would be unconventional, though, and would make the gem release process more complicated for geemus (the gem maintainer), especially should other providers start to do the same. Longer term, this would serve to divert the fog community away from acting as a unified API.
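To make the "another gemspec" idea concrete, here is a minimal sketch. The gem name, version, author, and file globs are all assumptions for illustration; they are not part of the fog project, and the real directory layout may differ.

```ruby
require "rubygems"

# Hypothetical slimmed-down gemspec: name, version and globs are invented.
SLIM_SPEC = Gem::Specification.new do |s|
  s.name    = "fog-rackspace-only"
  s.version = "0.0.1"
  s.summary = "Hypothetical build of fog with only the core and Rackspace files"
  s.authors = ["example"]

  # Package only the shared core and a single provider's directory.
  s.files = Dir["lib/fog/core/**/*.rb"] + Dir["lib/fog/rackspace/**/*.rb"]
end

puts "#{SLIM_SPEC.name} would package #{SLIM_SPEC.files.size} files"
```

Every provider that wanted the same treatment would need its own spec like this one, which is where the extra release overhead mentioned above comes from.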
I have installed Munin to provide some insight into server performance for a VPS with some small Rails and Sinatra applications. Is there a good resource for reading up on what to look for in the graphs Munin provides? Or a good resource for getting more details on specific measures (fork rate, swap in/out): what they are telling me, and what signals need to be looked into?
Mainly I am trying to learn about what measures I should pay attention to on the server side as I try to work with some small ruby application for fun.
When you install munin you get a bunch of default plugins which graph system metrics. To begin with, it would be a good idea to keep an eye on load-avg, cpu%, memory, swap in/out.
If you're not sure exactly how munin is calculating a specific metric, you can try reading the source code for the plugin script. Usually system metrics are obtained from the /proc filesystem. On a debian/ubuntu box, munin plugin scripts are installed (via a symlink) under /etc/munin/plugins. You can install your custom plugins by simply dropping them somewhere and symlinking to them from /etc/munin/plugins.
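A custom plugin can be quite small. The sketch below assumes the usual Munin plugin convention: when called with the argument `config` the script prints graph metadata, and when called without arguments it prints `<field>.value <number>` lines. The field name `load` and the choice of /proc/loadavg as the source are illustrative choices, not taken from any shipped plugin.

```ruby
#!/usr/bin/env ruby
# Minimal Munin-style plugin sketch: graphs the 1-minute load average.

# Graph metadata printed when Munin calls the plugin with "config".
def config_output
  <<~CONFIG
    graph_title Load average (1 min)
    graph_vlabel load
    graph_category system
    load.label load
  CONFIG
end

# /proc/loadavg looks like "0.42 0.30 0.25 1/123 4567"; the first
# field is the 1-minute average.
def parse_load(proc_loadavg)
  proc_loadavg.split.first.to_f
end

def values_output(proc_loadavg)
  "load.value #{parse_load(proc_loadavg)}\n"
end

def plugin_output(argv, source = "/proc/loadavg")
  argv.first == "config" ? config_output : values_output(File.read(source))
end

# Only touch /proc when it is actually available (Linux).
print plugin_output(ARGV) if File.readable?("/proc/loadavg")
```

Saved somewhere as an executable file and symlinked from /etc/munin/plugins, a script of this shape would be picked up like the default plugins.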
Ferret, the Ruby implementation of Lucene, is reasonably powerful; however, online discussions in 2008 seemed to indicate that Ferret had many stability issues and would segfault regularly. There have been 10 or so commits this year, so the project has pretty light activity.
Is Ferret stable enough to use in production?
It seems that the community has pulled back from Ferret and the two primary contenders are Sphinx and Apache Solr.
While I do not have any hard evidence of "the community pulling back" (yes, it's subjective), it just seems like there is not much momentum behind it, and I think there are more feature-rich and mature options (again, Sphinx and Solr, to name just two).
I used it for one project half a year ago (July 2009). It was a database for a festival, so it ran for only about 10 days (about 20 queries per minute with 50 updates per hour), and I had some problems. A few times I had problems with the indexes and had to rebuild them, and a few times the server crashed. I didn't have time then to switch to something else, so I just added a simple cron script that checked every minute whether the Ferret server was running and, if not, started it.
But I don't know how well it works now (I don't even know if there is a newer version).
Now I'm considering switching to something different, but I'll look into this later.
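The cron watchdog described above could look roughly like this. It is a sketch, not the original script: it assumes the server writes its process ID to a pid file, and the pid file path and start command are placeholders you would have to fill in for your own setup.

```ruby
# Signal 0 delivers nothing; it only checks whether the process exists.
def process_running?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

# Returns the pid stored in the file, or nil if the file is missing or garbled.
def read_pid(pid_file)
  Integer(File.read(pid_file).strip)
rescue Errno::ENOENT, ArgumentError
  nil
end

# Yields to the caller's restart block only when the server is not running.
def ensure_running(pid_file)
  pid = read_pid(pid_file)
  return :running if pid && process_running?(pid)

  yield
  :restarted
end

# Usage from a script run by cron every minute (both paths are placeholders):
#   ensure_running("/path/to/ferret.pid") { system("/path/to/start_command") }
puts process_running?(Process.pid) # the current process certainly exists
```

A watchdog like this only masks crashes; it does nothing about the index corruption mentioned in the same answer.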
It depends on your needs. I've been running Ferret for 3 years now, and for the past few months I have had a fairly complex Ferret deployment. I don't have crashes, ever, in production, but you have to be careful with your deployment. E.g. you have to make absolutely sure that you don't have multiple writers, but that's not difficult. If you want to customize with your own filters and analyzers, you can, but you have to test first to make sure you don't run into weird problems (I just ran into one and I think I fixed it). The point is, if you are careful, you can get a good deployment going, no problem.
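One generic way to enforce the single-writer rule is an exclusive file lock around every piece of code that writes to the index. This is a sketch of that technique using Ruby's `File#flock`, not a Ferret API; the `with_writer_lock` helper and the lock file name are hypothetical, and the lock is advisory, so it only helps if every writer goes through it.

```ruby
require "tmpdir"

# Runs the block while holding an exclusive lock on lock_path.
# Raises instead of waiting if another writer already holds the lock.
def with_writer_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    # LOCK_NB makes flock return false immediately instead of blocking.
    unless lock.flock(File::LOCK_EX | File::LOCK_NB)
      raise "another writer already holds #{lock_path}"
    end
    yield # the lock is released when the file is closed
  end
end

lock_path = File.join(Dir.mktmpdir, "ferret_index.lock")
with_writer_lock(lock_path) do
  # index updates would go here
  puts "writing with exclusive lock"
end
```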
Ferret allows you to be very flexible and customizable in managing documents in your index. You can incrementally delete and update documents and fields, which is harder to do in Sphinx. You can also very easily assign weights to different fields. You can easily control how words should be indexed and searched. I think if you want to be flexible in building your new app, and want to try different ways to index words and weight fields, Ferret's ease of use is a win.
I've never used Sphinx. I heard a lot of good things about it and it's actively developed (unlike Ferret). But my app requires very fine grained and frequent incremental updates, so I am stuck with Ferret.
So, I've got an idea for a website. I can start off using any platform and frameworks I want, but there are almost too many options.
OS Platform:
Windows, *nix
Web Framework:
Rails, ASP.NET, ASP.NET MVC, Django, Zend, Cake, others
Hosting:
EC2, Dedicated Server, Shared Hosting, VPS, App Engine, Azure, others
Persistence:
S3, MySQL, PostgreSQL, SQL Server, SimpleDB, CouchDB, others
How do you avoid decision paralysis and get started?
Firstly, your familiarity with a framework's language should dictate which framework you choose. Don't add the burden of learning another language on top of learning a framework.
Next, have a look at the remaining frameworks. Do they have good documentation? What about the community? (A good community can go a long way toward making up for any shortcomings of a given technology.) Does the framework solve the problems that you need solved?
Finally, just dive in and try something! Pick the one that makes the most sense to you and start writing code. Don't do too much hand-wringing over your decision. If it becomes obvious that you made the wrong choice, it should be obvious quite early. Learn from what you've accomplished so far and consider restarting with a different technology. (Just don't get several weeks down the road before you make this decision!)
I'm sure you don't like all of those technologies equally. Pick a framework that you like and get to work.
It depends on what your app is going to be doing. A handful of the technologies you listed are direct competitors (like Django vs. Rails), but some are completely different ways to do things (like MySQL vs. S3).
Questions to answer before you begin:
Will the app need to be horizontally partitioned in the near term? If so, using EC2, Google App Engine or Azure would be a good option.
Will your app fit into the constraints of Google App Engine? If so, it requires a lot less hassle on your part than running on bare metal (whether real or virtual).
What's your preferred web framework? If you want an MS framework, you'll need to run on a host that supports that.
What will your persistence and data access patterns look like? This will determine whether to use a database or something more exotic.
If you are running on EC2, the other AWS services are more appealing. Similarly, if you are using GAE, you have only one option for persistence. If you are using Rails, may as well start with MySQL.
In answer to your question of how to reduce the number of options, the answer is to realize that many of the options are related, so you don't have as many choices to make as it first appears.
Some advice that was once given to me is, pick what your friends (or colleagues) are using. Having people around you that you can share ideas and the learning experience with is invaluable.
If you want to learn something new: I'd just go with your gut and get started. If it sucks then switch to something more familiar.
If you don't have much time: Go with what you know and forget about the other options. Just start coding.
Optimize for happiness. Pick the one that you like the most. Or the one that intrigues you the most.
I've worked in Microsoft shops, in Ruby on Rails, and in homegrown shops having Apache, Jetty, even Mason.
All frameworks have their warts, their idiosyncrasies that will keep you up until 3 AM, and their "tribal knowledge" vagaries that will be completely unexportable to other frameworks. (The last point is sometimes by design: the whole "platform entrenchment" business strategy.)
Listen to what the supporters of the frameworks say about the problems with the other frameworks (Google: X framework vs Y framework). Pick the framework that has the loudest supporters. If they are equally loud, make the decision with a dice roll.
With me it's simple.
I only know the MS stack and see no point in "checking out" all of those you mentioned.
No, actually I once tried to use JSF before excluding it from my list permanently.
Use what you are experienced in and where you can be more productive. The objective is to get your site up and running. Go for it.
One of the biggest factors in determining which platform/framework to use is your budget. You have to factor in the cost of licensing, software required to develop/maintain your website and other miscellaneous costs.
I suggest you begin with a scorecard of your own construction. Perhaps you can find different ones on the web, but if you do, modify them to meet YOUR needs. There should be a scorecard for each level in the stack (as you've described). Each scorecard should share some aspects to grade with other scorecards but each will also have their unique aspects.
Once constructed, weight each aspect graded according to your needs.
Once you've chosen the weights, pick the scales for grades.
At this point, promise yourself you won't mess with the weights or the scale, and then start collecting data on your options for each level in the stack.
You may also want to put a time limit on the collection period.
Make your decision based on the outcome of the scorecard.
The beauty of this approach is that the effort is made in constructing the scorecard, not in circular arguments of options. The effort in making the scorecard is vendor agnostic and focuses on the desired result, not the options. Thus you can avoid paralysis.
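The arithmetic behind such a scorecard is just a weighted sum per option. A small sketch follows; the aspects, weights, grades, and option names are invented purely to show the mechanics.

```ruby
# Weights are fixed before any data is collected (higher = more important).
WEIGHTS = { documentation: 3, community: 2, cost: 1 }.freeze

# Grades on a 1-5 scale, collected per option. All values are made up.
OPTIONS = {
  "Framework A" => { documentation: 4, community: 5, cost: 3 },
  "Framework B" => { documentation: 5, community: 2, cost: 5 }
}.freeze

# Sum of weight * grade over every aspect on the scorecard.
def weighted_score(grades, weights = WEIGHTS)
  weights.sum { |aspect, weight| weight * grades.fetch(aspect) }
end

# Option names ordered from highest to lowest total score.
def rank(options, weights = WEIGHTS)
  options.sort_by { |_name, grades| -weighted_score(grades, weights) }.map(&:first)
end

OPTIONS.each { |name, grades| puts "#{name}: #{weighted_score(grades)}" }
puts rank(OPTIONS).inspect
```

With these numbers Framework A scores 3*4 + 2*5 + 1*3 = 25 and Framework B scores 3*5 + 2*2 + 1*5 = 24, so A ranks first even though B has the better documentation grade.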
One more thing, my best scorecards have included sections addressing the availability of resources and other human related things. Don't make the mistake of just looking at the technology.
Good luck.
Go for personal preferences.
One decision at a time:
First I would begin with the type of language:
Script: PHP, Python
Serious: Java, .NET
The language will restrict your OS and platform, and will give you hints for the database decision. The database load is also important. And do you want logic in the database? How much data will you have?
Last piece of advice: try well-tested combinations such as LAMP, WAMP, or Windows with SQL Server and .NET.
Evaluate each platform and technology for quality of tools for your needs. For example, if you are cost sensitive, you would value free operating systems and tools higher than costly ones. If you need performance, you would value tools which provide high performance higher than ones that don't.
It entirely depends on your situation. I spent several months evaluating stuff for a new commercial web site last year, and it was very easy to feel paralyzed. In the end, what helped was talking to several people who'd done similar things, and of course reading a lot of stuff online and from Amazon. I chose Java, since our team had a lot of experience in it, and it has good performance and extensive supporting technologies. Oracle is our database, but we used a persistence manager to make it easy to change later on. We used a half-dozen very good libraries to eliminate much of the boring and repetitive coding (Restlet, iBatis, Freemarker, XStream, jQuery, SLF4J). We used Glassfish as our web server.
Yours sounds like a small project with only you to work on it. In that case, pick a complete framework instead of a smorgasbord like we did. Pick something fun to work with, and something with good "return on resume". Look very hard at Ruby on Rails, Django (kind of a Python on Rails), and Groovy on Grails (a Rails-wannabe for the Java world). In your shoes I'd pick Ruby on Rails because there's a large and growing community and a good number of books and tutorials. Plus, Ruby looks like a worthwhile language to learn. For your database, just pick one. These frameworks make it easy to change your mind later. Pick MySQL unless you have another you like better.
And as other posters said, just do it! ;-)
Like others said, pick something you and your employees are familiar with. I highly doubt you are close to being industry ready with all those techs.
OS Platform: Windows, *nix
Shouldn't matter except for Windows licensing costs, and that is probably the least of your expenses.
Web Framework: Rails, ASP.NET, ASP.NET MVC, Django, Zend, Cake, others
Dependent on your favorite language
Hosting: EC2, Dedicated Server, Shared Hosting, VPS, App Engine, Azure, others
You should design your product to be movable, so you can scale among these. If you know for sure you are going big, then just start off with EC2. App Engine is extremely limiting; e.g. they don't let you form outbound connections.
Persistence: S3, MySQL, PostgreSQL, SQL Server, SimpleDB, CouchDB, others
You need to do the research yourself whether or not your product requires an RDBMS or a simple key/value store, and what features each of these have.
Just go for it! Your platform choice really is not all that important as long as you make a reasonable choice (Ruby + Rails, Python + Django, PHP + Cake/CodeIgniter). Any of these can be used to build successful sites. If your site really takes off, you'll be able to scale it fine.
I'm pretty keen to develop my first Ruby app, as my company has finally blessed its use internally.
In everything I've read about Ruby up to v1.8, there is never anything positive said about performance, but I've found nothing about version 1.9. The last figures I saw about 1.8 had it drastically slower than just about everything out there, so I'm hoping this was addressed in 1.9.
Has performance drastically improved? Are there some concrete things that can be done with Ruby apps (or things to avoid) to keep performance at the best possible level?
There are some benchmarks of 1.8 vs 1.9 at http://www.rubychan.de/share/yarv_speedups.html. Overall, it looks like 1.9 is a lot faster in most cases.
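Published benchmarks only go so far, so it can be worth timing your own hot spots on the interpreter you actually deploy. Below is a small sketch using the standard library's `Benchmark` module to compare two ways of building a string; the two helper methods and the iteration count are arbitrary examples, and the timings will differ between Ruby versions and machines.

```ruby
require "benchmark"

# Builds a string with +=, which creates a new String object on every step.
def build_with_plus(n)
  s = String.new
  n.times { s += "x" }
  s
end

# Builds a string with <<, which appends to the same object in place.
def build_with_append(n)
  s = String.new
  n.times { s << "x" }
  s
end

N = 20_000

# Prints user, system and real time for each labelled block.
Benchmark.bm(8) do |bm|
  bm.report("+=") { build_with_plus(N) }
  bm.report("<<") { build_with_append(N) }
end
```

Running the same script under 1.8, 1.9, and any other implementation you are considering gives a like-for-like comparison on your own workload.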
If scalability and performance are really important to you, you can also check out Ruby Enterprise Edition. It's a custom implementation of the Ruby interpreter that's supposed to be much better about memory allocation and garbage collection. I haven't seen any objective metrics comparing it directly to JRuby, but all of the anecdotal evidence I've heard has been very good.
This is from the same company that created Passenger (aka mod_rails), which you should definitely check out as a Rails deployment solution if you decide not to go the JRuby route.
Matz's Ruby 1.8.6 is much slower when it comes to performance, and 1.9 and JRuby do a lot to speed it up. But the performance isn't such that it will prevent you from doing anything you want in a web application. There are many large Ruby on Rails sites that do just fine with the "slower interpreted" language. When you get to scaling out web apps, there are many more pressing performance issues than the speed of the language you are writing them in.
I've actually heard really good things about performance with the JVM implementation, JRuby. Completely anecdotal, but perhaps worth looking into.
See also http://en.wikipedia.org/wiki/JRuby#Performance
Check out "Writing Efficient Ruby Code" from Addison Wesley Professional:
http://safari.oreilly.com/9780321540034
I found some very helpful and interesting insights in this short work. And if you sign up for the free 10-day trial you could read it for free. (It's 50 pages and the trial gets you (AFAIR) 100 page views.)
https://ssl.safaribooksonline.com/promo
I am not a Ruby programmer, but I have been pretty tightly involved in a JRuby deployment lately and can thus draw some conclusions. Do not expect too much from JRuby's performance. In interpreted mode, it seems to be somewhere in the range of C Ruby. JIT mode might be faster, but only in theory. In practice, we tried JIT mode on Glassfish for a decently sized Rails application on a medium-sized server (dual core, 8GB RAM). And the truth is, the JITting took so much time that the server needed 20-30 minutes before it answered the first request. Memory usage was astronomical, and profiling did not work because the whole system ground to a halt with a profiler attached.
Bottom line: JRuby has its merits (multithreading, solid platform, easy Java integration), but given that interpreted mode is the only mode that worked for us in practice, it may be expected to be no better performance-wise than C Ruby.
I'd second the recommendation to use Passenger: it makes deployment and management of Rails applications trivial.