One of my sites is static in the sense that the content is almost never updated. It was initially handcrafted with HTML+CSS+JavaScript and ran for more than 15 years with no need for maintenance! A few years ago I rebuilt the site with Rails and added some more features, but I struggle with the need to keep up with new versions of Rails: the initial build was with Rails 3, then I updated to Rails 4 and more recently to Rails 5 - updates necessary just to keep the site up and running. As much as I like Rails, I really don't want to spend time on updates for this particular site, so I am thinking of using a static site generator to rebuild it, but I'm not sure whether they can handle my data structures.
The application involves just 8-9 model classes (and MySQL tables), but with foreign keys to handle has_many and has_and_belongs_to_many relationships. I could export my database to JSON or YAML with id attributes for all records, but I wonder whether static site generators like Middleman or Jekyll are capable of handling these relationships. A quick look through tutorials only turned up very simple key-value data structures.
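For concreteness, this is roughly the kind of export I have in mind - a throwaway rake task (the model names Author/Book and the data/ directory are just examples), dumping each table to YAML while keeping the id and foreign-key columns so the relationships could be re-linked in templates:

    # lib/tasks/export.rake -- hypothetical export sketch; adjust model names and paths
    namespace :export do
      desc "Dump models to YAML data files for a static site generator"
      task yaml: :environment do
        require "yaml"
        require "fileutils"

        FileUtils.mkdir_p("data")

        # Keep id and foreign-key columns so has_many / HABTM relationships
        # can be re-linked by id lookups in the templates.
        [Author, Book].each do |model|
          rows = model.all.map(&:attributes)
          File.write("data/#{model.table_name}.yml", rows.to_yaml)
        end
      end
    end

From what I've read, Middleman exposes data/books.yml as data.books (Jekyll does the same with a _data/ folder and site.data.books), so a template could resolve an association with something like data.authors.find { |a| a["id"] == book["author_id"] } - but I don't know how well that approach holds up, which is really what I'm asking.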
I am new to programming in Ruby, not to mention databases, so I would like a few pointers on my question. I have a few websites where I gather information from users via a form which they fill out (my websites are in WordPress). The form is made with Contact Form 7 (https://wordpress.org/plugins/contact-form-7/) and the info is stored in the database with this plugin (https://wordpress.org/plugins/contact-form-7-to-database-extension/).
My question is: Is it possible to make a Ruby program that would fetch info from my sites (databases) and show me the information? Is there a better way of doing this (Ruby on Rails perhaps)?
It is possible to access your WordPress data from a Ruby app. You will need to familiarise yourself with the WordPress API. This gem might also help. You can build a small Sinatra-based app to keep it simple, or use Rails.
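For illustration, a rough sketch of the Sinatra route - the connection details and the table name are placeholders (the CF7-to-database plugin keeps submissions in its own table, so check your actual schema and prefix):

    # app.rb -- minimal sketch; gem install sinatra mysql2
    require "sinatra"
    require "mysql2"

    # Placeholder credentials -- point these at the WordPress database.
    DB = Mysql2::Client.new(
      host:     "localhost",
      username: "wp_user",
      password: "secret",
      database: "wordpress"
    )

    get "/submissions" do
      # Table name is an assumption based on the plugin's default prefix.
      rows = DB.query("SELECT * FROM wp_cf7dbplugin_submits ORDER BY submit_time DESC LIMIT 100")
      rows.map { |row| row.inspect }.join("<br>")
    end

Run it with ruby app.rb and open http://localhost:4567/submissions. Alternatively, if the data you need is exposed through the WordPress REST API, you could skip the database connection entirely and just fetch JSON over HTTP.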
We are a parents' initiative that runs a small kindergarten / child nursery, and we set out to build a basic web-based documentation and reporting application for our nursery.
After spending a week doing research on the various PHP MVC frameworks, I have a few questions that I could not answer myself - even after having a closer look (installed on a local machine) at CakePHP, Symfony, CodeIgniter, Yii and Joomla.
The goal is to build a web application where site managers, staff and parents (roles) can log in to do simple tasks, depending on their role (hence I need RBAC). Site managers will, for example, be able to add staff to the database, staff will be able to add children and do some documentation on children (select a child and enter details on special needs etc.). Parents should be able to update information about who is picking up the child, or inform the staff if a child will not come in on a certain date.
Now, since this is all basic CRUD with only a few tables underneath, connected via some foreign-key constraints, I would like the framework to generate CRUD skeletons for me.
And since I need to manage roles and limit access to certain pages, I would like a basic user management out of the box.
Plus, PHP, MySQL and MVC are set - otherwise I would have used Oracle APEX, as I have some experience with that.
So, I looked at a vast amount of PHP frameworks and found the following to be promising:
CodeIgniter (with Bonfire plugin)
Symfony2
Yii framework with Gii
CakePHP
Joomla
BUT none of the frameworks I looked at seemed to fit my need:
creating CRUD skeleton pages based on my ERD with foreign keys (MySQL InnoDB) / Scaffolding
offering basic user management out of the box with up-to-date security measures in place (passwords stored using phpass or salted hashes & MD5, user registration, defining roles and limiting tasks to roles/users).
If somebody could suggest a PHP framework that comes with those two requirements built in, I would be very happy to hear about it. THANK YOU very much in advance!!
(I really liked CodeIgniter because it was simple to set up and lightweight, and I liked the Bonfire user management plugin, but was disappointed when I found out they dropped scaffolding in the latest release.
And I liked Joomla for its user management, but found it quite hard to get started on component development, plus no scaffolding).
You should check out FuelPHP and its ORM, Oil and Auth packages. It's really great. If you liked CodeIgniter, you'll probably love Fuel.
http://www.fuelphp.com
VS2010 Pro + SQL Server Express.
Having been dropped into ASP.NET MVC 3 with no guidance but the web (2 books on order), I can't even get off the ground.
The MVC itself I get. Not a problem.
PHP, Ruby, and even ghastly WebForms are firmly tucked into my tool belt, with a long history of C++ Qt client-server development before that.
Tying ASP.NET MVC 3 to a database using EF4 ORM is killing me.
The goals:
Use the database modeled by the DBA. I can specify all naming conventions, but code-first is not an option!
Import it into an EDMX. This will be regularly updated from the DBA's DB using VS tools, never edited directly.
Generate partial classes from EDMX, for use as model. This will regularly be updated using VS tools, never edited directly.
Use 'buddy' classes to extend the above model classes with whatever code the Controllers/Views need.
Intuitively use the resulting model: pass it to the view, retrieve posted form data into it for insert/save, etc...
I've seen and read so many blogs, forum posts, walkthroughs, and stack overflow posts regarding this very use case.
I even tried riding the magic unicorn, followed by the latest 4.2beta1 with DbContext generators.
But can't get off the ground.
I follow instructions, but just not understanding how to do anything with it.
What conventions does the 'buddy' require (if any)? How do I use it? How do I get data with it? How do I write data?
Every example looks different. MVC guides are always focused on the UI side, and EF guides don't cover usage within MVC.
These are basic questions, and I'm feeling like the most incompetent idiot in the WWW right now.
Is anyone out there currently using MVC3 & EF4.x in the way I describe above?
This video is a good starting resource. It's a video of a guy creating an app from scratch that uses Entity Framework and a SQL database (though he creates the DB in the video, it's still good for seeing some basics in action). You can see how he pulls data from the database, displays it on the page, and saves changes back to the database.
The first question I would ask is: why are you stuck on using EF as an ORM, or even insisting on an ORM at all? I'd choose tools to suit the job here, especially given the constraints of the data layer.
Buddy classes were a concept invented in the days when the main .NET ORMs had no code-first option, as ORM-encumbered class instances really don't behave well under things like model binding. Never mind that you could not decorate them with the DataAnnotations used to indicate that fields are required. Typically, the technical requirement is to use [MetadataType] attributes to tie your buddies to your models, and perhaps something like AutoMapper to map data to and fro.
All that said, as a guy who has a few apps with lots of buddies and lots of automapping going on, I'd suggest you think twice -- it is a bit of a maintenance nightmare. I'm living it.
There are some really good getting-started videos and tutorials right on the ASP.NET MVC site. The "Model (Data)" section is Entity Framework focused and touches on hot/trending topics like Repositories and Units of Work.
I am looking at writing my own, but I am wondering if there are any good web crawlers out there which are written in Ruby.
Short of a full-blown web crawler, any gems that might be helpful in building a web crawler would be useful. I know this part of the question is touched upon in a couple of places, but a list of gems applicable to building a web crawler would be a great resource as well.
I used to write spiders, page scrapers and site analyzers for my job, and still write them periodically to scratch some itch I get.
Ruby has some excellent gems to make it easy:
Nokogiri is my #1 choice for the HTML parser. I used to use Hpricot, but found some sites that made it explode in flames. I switched to Nokogiri afterwards and have been very happy with it. I regularly use it for parsing HTML, RDF/RSS/Atom and XML. Ox looks interesting too, so that might be another candidate, though I find searching the DOM a lot easier than trying to walk through a big hash, such as what is returned by Ox.
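For example, a trivial Nokogiri snippet of the kind of parsing I mean (URL and selectors are just placeholders):

    require "nokogiri"
    require "open-uri"

    # Fetch a page and pull out its title and every link target.
    # URI.open needs Ruby 2.5+; on older Rubies use Kernel#open from open-uri.
    doc = Nokogiri::HTML(URI.open("https://example.com/"))

    puts doc.at_css("title").text
    doc.css("a[href]").each { |a| puts a["href"] }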
OpenURI is good as a simple HTTP client, but it can get in the way when you want to do more complex things or need to have multiple requests firing at once. I'd recommend looking at HTTPClient or Typhoeus with Hydra for modest to heavyweight jobs. Curb is good too, because it uses the cURL library, but the interface isn't as intuitive to me; it's still worth looking at, though I lean toward the previously mentioned ones.
Note: OpenURI has some flaws and vulnerabilities that can affect unsuspecting programmers so it's fallen out of favor somewhat. RestClient is a very worthy successor.
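If you go the Typhoeus route, the Hydra part looks something like this (the URLs are placeholders):

    require "typhoeus"

    urls = %w[https://example.com/a https://example.com/b https://example.com/c]

    hydra = Typhoeus::Hydra.new(max_concurrency: 10)

    # Queue every request, then run them concurrently.
    requests = urls.map { |url| [url, Typhoeus::Request.new(url, followlocation: true)] }
    requests.each { |_, request| hydra.queue(request) }

    hydra.run  # blocks until every queued request has finished

    requests.each do |url, request|
      puts "#{url} -> #{request.response.code}"
    end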
You'll need a backing database, and some way to talk to it. This isn't a task for Rails per se, but you could use ActiveRecord, detached from Rails, to talk to the database. I've done that a couple of times and it works all right. These days, though, I really prefer Sequel for my ORM. It's very flexible in how it lets you talk to the database, from using straight SQL, to using Sequel's ability to programmatically build a query, to modeling the database and using migrations. Once you have the database built, you could still use Rails as a front-end to the data.
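A small taste of Sequel standing alone, without Rails (the SQLite file and table are arbitrary):

    require "sequel"

    DB = Sequel.sqlite("crawler.db")  # or Sequel.connect("mysql2://user:pass@host/db")

    DB.create_table? :pages do
      primary_key :id
      String  :url, unique: true
      Integer :status
      Time    :fetched_at
    end

    pages = DB[:pages]
    pages.insert(url: "https://example.com/", status: 200, fetched_at: Time.now)

    # Queries compose programmatically and are emitted as a single SQL statement.
    pages.where(status: 200).order(:fetched_at).each { |row| puts row[:url] }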
If you are going to navigate sites in any way beyond simply grabbing pages and following links, you'll want to look at Mechanize. It makes it easy to fill out forms and submit pages. As an added bonus, you can grab the content of a page as a Nokogiri HTML document and parse away using Nokogiri's multitude of tricks.
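A short Mechanize sketch of that workflow (the URL and the "q" field name are placeholders):

    require "mechanize"

    agent = Mechanize.new
    agent.user_agent_alias = "Mac Safari"

    page = agent.get("https://example.com/search")

    # Fill out and submit the first form on the page.
    form = page.forms.first
    form["q"] = "ruby crawler"
    results = form.submit

    # Under the hood the page is already a Nokogiri document.
    results.search("a").each { |a| puts a["href"] }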
For massaging/mangling URLs I really like Addressable::URI. It's more full-featured than the built-in URI module. One nice thing URI does offer is the URI::extract method to scan a string for URLs. If that string happened to be the body of a web page, it would be an alternate way of locating links, but the downside is that you'll also get links to images, videos, ads, etc., and you'll have to filter those out, probably resulting in more work than if you use a parser and look for <a> tags exclusively. For that matter, Mechanize also has a links method which returns all the links in a page, but you'll still have to filter them to decide whether to follow or ignore them.
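For illustration, the two side by side (the URLs are made up):

    require "uri"
    require "addressable/uri"

    text = "See http://example.com/a and https://example.com/b?x=1 for details."

    # URI::extract scans arbitrary text for anything URL-shaped.
    puts URI.extract(text, %w[http https])

    # Addressable is handy for cleaning up and resolving what you find.
    puts Addressable::URI.parse("HTTP://Example.COM:80/path?x=1").normalize
    puts Addressable::URI.join("http://example.com/dir/page.html", "../other.html")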
If you think you'll need to deal with JavaScript-manipulated pages, or pages that get their content dynamically via AJAX, you should look into using one of the Watir variants. There are flavors for the different browsers on different OSes, such as FireWatir, SafariWatir and OperaWatir, so you'll have to figure out what works for you.
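A rough watir-webdriver sketch of the "drive a real browser, then hand the rendered DOM to Nokogiri" approach (the URL and CSS class are placeholders):

    require "watir-webdriver"  # on newer gem versions this is just: require "watir"
    require "nokogiri"

    browser = Watir::Browser.new :firefox
    browser.goto "https://example.com/ajax-heavy-page"

    # Wait until the JavaScript-rendered content shows up.
    browser.div(class: "results").wait_until_present

    doc = Nokogiri::HTML(browser.html)
    puts doc.css("div.results a").map { |a| a["href"] }

    browser.close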
You do NOT want to rely on keeping your list of URLs to visit, or visited URLs, in memory. Design a database schema and store that information there. Spend some time up front designing the schema, thinking about what things you'll want to know as you collect links on a site. SQLite3, MySQL and Postgres are all excellent choices, depending on how big you think your database needs will be. One of my site analyzers was custom designed to help us recommend SEO changes for a Fortune 50 company. It ran for over three weeks covering about twenty different sites before we had enough data and stopped it. Imagine what would have happened if we had a power outage and all that data went into the bit bucket.
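Continuing the Sequel example above, a hypothetical way to keep the frontier of URLs on disk rather than in memory (table and column names are arbitrary):

    require "sequel"

    DB = Sequel.sqlite("crawler.db")

    DB.create_table? :frontier do
      primary_key :id
      String    :url, unique: true
      TrueClass :visited, default: false
    end

    frontier = DB[:frontier]

    # Enqueue a link unless we already know about it.
    def enqueue(frontier, url)
      frontier.insert(url: url) if frontier.where(url: url).empty?
    end

    # Pop the next unvisited URL; state survives restarts and power outages.
    def next_url(frontier)
      row = frontier.where(visited: false).order(:id).first
      return nil unless row
      frontier.where(id: row[:id]).update(visited: true)
      row[:url]
    end

    enqueue(frontier, "https://example.com/")
    puts next_url(frontier)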
After all that you'll want to also make your code be aware of proper spidering etiquette: What are the key considerations when creating a web crawler?
I am building Wombat, a Ruby DSL to crawl web pages and extract content. Check it out on GitHub: https://github.com/felipecsl/wombat
It is still at an early stage, but the basic functionality already works. More features will be added really soon.
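For anyone curious, the DSL currently looks roughly like this (the selectors are placeholders; the README has the up-to-date syntax):

    require "wombat"

    # Declare where to go and which properties to extract;
    # the result comes back as a plain Ruby hash.
    data = Wombat.crawl do
      base_url "https://github.com"
      path "/"

      headline xpath: "//h1"
      tagline  css: "p.lead"
    end

    puts data.inspect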
So you want a good Ruby-based web crawler?
Try spider or anemone. Both have solid usage according to RubyGems download counts.
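An Anemone example of the typical crawl loop (the domain and options are placeholders):

    require "anemone"

    # Crawl one site politely and print each page's title.
    Anemone.crawl("https://example.com/", delay: 1, depth_limit: 2) do |anemone|
      anemone.skip_links_like(/\.(jpg|png|gif|pdf)$/i)

      anemone.on_every_page do |page|
        next unless page.doc
        title = page.doc.at("title")
        puts "#{page.url} #{title && title.text}"
      end
    end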
The other answers, so far, are detailed and helpful, but they don't have a laser-like focus on the question, which asks for Ruby libraries for web crawlers. It would seem that this distinction can get muddled: see my answer to "Crawling vs. Web-Scraping?"
Tin Man's comprehensive list is good but partly outdated for me.
Most websites my customers deal with are heavily AJAX/JavaScript dependent.
I've been using Watir / watir-webdriver / Selenium for a few years too, but the overhead of having to load up a hidden web browser on the backend just to render that DOM stuff isn't viable - not least because, after all this time, they still haven't implemented a usable "browser session reuse" that would let a new code execution reuse an old browser already in memory for this purpose, shooting down tickets that might have worked their way up the API layers eventually (referring to https://code.google.com/p/selenium/issues/detail?id=18). **
https://rubygems.org/gems/phantomjs
is what we're migrating new projects over to now, so that the necessary data gets rendered without any sort of invisible, Xvfb-hosted, memory- and CPU-heavy web browser.
** Alternative approaches also failed to pan out:
how to serialize an object using TCPServer inside?
Can a watir browser object be re-used in a later Ruby process?
If you don't want to write your own, then use any ordinary web crawler. There are dozens out there.
If you do want to write your own, then write your own. A web crawler isn't exactly a complicated piece of machinery; it consists of the following steps (a bare-bones sketch follows the list):
Downloading a website.
Locating URLs in that website, filtered however you dang well please.
For each URL in that website, repeat step 1.
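Here is that loop as a bare-bones sketch (open-uri + Nokogiri; the start URL and page cap are arbitrary, and it stays on one host):

    require "open-uri"
    require "nokogiri"
    require "set"

    start   = URI("https://example.com/")
    queue   = [start]
    visited = Set.new

    # 1. download, 2. extract links, 3. repeat -- capped so the sketch terminates.
    while (url = queue.shift) && visited.size < 50
      next if visited.include?(url)
      visited << url

      begin
        doc = Nokogiri::HTML(URI.open(url))
      rescue OpenURI::HTTPError, SocketError
        next
      end

      doc.css("a[href]").each do |a|
        begin
          link = URI.join(url, a["href"])
        rescue URI::InvalidURIError
          next
        end
        queue << link if link.host == start.host && link.scheme.start_with?("http")
      end

      puts "fetched #{url} (#{doc.css('a[href]').size} links)"
      sleep 1  # be polite
    end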
Oh, and this seems to be a duplicate of "Web crawler in ruby".
I am new to CodeIgniter and I am considering using this framework for my new project.
I am going to need these two extensions. Before digging in too deep, I wonder if anyone already has experience with them and can kindly give some insight on whether there are any compatibility issues when they are used together.
Modular Extensions - HMVC
http://bitbucket.org/wiredesignz/codeigniter-modular-extensions-hmvc/wiki/Home
Datamapper ORM
http://datamapper.exitecms.org/
I believe there are many others who are going to use these two extensions together because they are actually very popular ones. So, many people are going to benefit from this thread.
Many many thanks to you all.
Firstly, a caveat: I have used the HMVC extension but not the Datamapper ORM.
As far as I see it the two extensions have separate goals. In principle I cannot see a conflict.
The HMVC extension is useful where your view is composed of multiple sub-views. It allows you to modularise your application so that your views can be built from the output of multiple controller actions.
The Datamapper ORM allows you to map the data in your database directly onto PHP objects in your application. It saves you the cruft of writing SQL queries to pull rows from a database and hydrate objects in your application. You define what table your model is loaded from and how it is related to the other models in your application. The Datamapper generates the queries to perform the CRUD operations behind the scenes.
HMVC is concerned with how you structure your application. The Datamapper ORM is concerned with how you build your models. I don't see how the Datamapper would stop you using HMVC or vice-versa.
I'd also suggest taking a look at Doctrine ORM. It's a very powerful ORM framework that I've been using for the past year or so in all my CodeIgniter projects and works really well without any compatibility issues or such.
Tutorial for installing Doctrine with CodeIgniter.