API polling bot on heroku - ruby

I want to create a bot that makes a request to an API URL once per minute. It then needs to ping a particular user if the data entry against their name in the API feed has changed. I want a free solution on Heroku. Can this be achieved?

Yes. Heroku supports Thin as a web server, and Thin is EventMachine-enabled, so an easy way to do this is to write a quick Sinatra app and use EM.add_periodic_timer for your API calls. When you deploy this Sinatra app to Heroku, it'll use Thin by default, so there's no extra configuration needed. You can test locally via thin start -p 4567, assuming your config.ru is correct. Here's a pretty standard one, assuming your app is in app.rb:
require 'bundler/setup'
Bundler.require :default
require File.expand_path('app', File.dirname(__FILE__))
run Sinatra::Application

I currently check the status of some sites for free on Heroku. The secret? rufus-scheduler.
Install the gem
gem install rufus-scheduler
Make sure you include the gem in bundler or however you are including it.
Then create a file called task_scheduler.rb and put it in your initializers directory:
require 'net/http'
require 'rufus/scheduler'

# rufus-scheduler 3.x; on 2.x use Rufus::Scheduler.start_new instead
scheduler = Rufus::Scheduler.new

scheduler.every '1m' do
  url = "http://codeglot.com"
  response = Net::HTTP.get_response(URI.parse(url))
  # do stuff with response.body
end
If you have any trouble you can see this blog post:
http://intridea.com/2009/2/13/dead-simple-task-scheduling-in-rails?blog=company
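For the part of the question about pinging a user when their entry changes, the scheduler block has to remember the previous response and compare it with the new one. Here is a minimal sketch of that step, assuming the feed is a JSON object mapping user names to values. The FeedWatcher class and that feed shape are assumptions for illustration, not something the API in question is known to provide:

```ruby
require 'json'

# Hypothetical helper: remembers the last feed and reports what changed.
class FeedWatcher
  # fetcher is any object that responds to #call and returns a JSON string.
  def initialize(fetcher)
    @fetcher = fetcher
    @last_seen = nil # nil until the first poll has completed
  end

  # Polls once and returns {user => new_value} for new or changed entries.
  # The first poll only records a baseline, so it reports nothing.
  def poll
    current = JSON.parse(@fetcher.call)
    previous = @last_seen
    @last_seen = current
    return {} if previous.nil?

    current.reject { |user, value| previous.key?(user) && previous[user] == value }
  end
end
```

Inside the scheduler block you would call poll on a long-lived watcher (for example one built with a lambda that wraps Net::HTTP.get) and notify each user in the returned hash. The state lives in memory, so it is lost whenever the dyno restarts.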

Related

What is this file config.ru, and what is it for?

What is the config.ru file, and what is it for in Sinatra projects? In my project it contains this code:
require './app'
run Sinatra::Application
config.ru is a Rack configuration file (ru stands for "rackup"). Rack provides a minimal interface between web servers that support Ruby and Ruby frameworks. It is comparable to a Ruby counterpart of CGI, which offers a standard protocol for web servers to execute programs.
Rack's run command here tells the server to send incoming requests to Sinatra::Application, the class in whose context Sinatra's DSL is evaluated. All DSL methods called on main are then delegated to this class.
So in this config.ru file, you first require your app code, which uses Sinatra's DSL, and then run the Sinatra framework. For example, if your app.rb contained this:
get '/' do
  'Hello world!'
end
The get block registers a route: when someone requests (GET) the home URL, the app sends back 'Hello world!'.
Rack provides a minimal interface between webservers that support Ruby and Ruby frameworks.
The interface just assumes that you have an object that responds to a call method (like a proc) and returns an array with:
The HTTP response code
A Hash of headers
The response body, which must respond to each
You can run a basic Rack server with the rackup command which will search for a config.ru file in the current directory.
You can create a minimal hello world server with:
# config.ru
# Rack expects an Integer status; Rack 3 also requires lowercase header names
run Proc.new { |env| [200, {'content-type' => 'text/html'}, ['Hello World']] }
# run this with the `rackup` command
Since Sinatra, just like Rails, builds on Rack, it uses Rack to interface between the server and the framework. config.ru is thus the entry point to any Rack-based program.
What it does is bootstrap the application and pass the Sinatra::Application class, which has a call class method, to Rack.
Sinatra::Application is then responsible for taking the incoming request (the env) and passing it to the routes your application provides and then passing back the response code, headers, and response body.
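To make that dispatch step concrete, here is a toy sketch of an object that maps the env to a route and returns the Rack triplet. TinyApp is a hypothetical stand-in to show the idea. It is not how Sinatra is actually implemented, and it only handles exact-match GET paths:

```ruby
# Hypothetical miniature "framework": routes in, Rack triplet out.
class TinyApp
  def initialize
    @routes = {}
  end

  # Registers a block to run for GET requests to the given path.
  def get(path, &block)
    @routes[['GET', path]] = block
  end

  # The Rack entry point: takes the env hash, returns [status, headers, body].
  def call(env)
    handler = @routes[[env['REQUEST_METHOD'], env['PATH_INFO']]]
    return [404, { 'content-type' => 'text/plain' }, ['Not found']] unless handler

    [200, { 'content-type' => 'text/plain' }, [handler.call]]
  end
end

app = TinyApp.new
app.get('/') { 'Hello world!' }
```

In a config.ru you could pass app to run. Calling app.call yourself with a hash containing REQUEST_METHOD and PATH_INFO shows the same triplet a server would receive.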
config.ru is the default configuration file for the rackup command, containing a list of instructions for Rack.
Rack is an interface and architecture that provides a domain-specific language (DSL) and connects an application to the web. In short, it lets you build web applications and work with requests and responses (and many other web-related technologies) in a convenient way.
Sinatra as well as Rails are web frameworks, so they both use Rack:
http://recipes.sinatrarb.com/p/middleware
https://guides.rubyonrails.org/rails_on_rack.html

Making a Rack CLI

I'm trying to make a framework similar to Rails, but purely focused on GraphQL. One nice feature of Rails is that it provides both a CLI and a config.ru for Rack. Therefore, you can call rackup or you can call bin/rails server and the Rails app will run. I managed to mimic this functionality by putting the Rack app into a separate file (config/application.rb), which I import in config.ru and in the CLI, then instantiate and run.
However, I have an issue with Rack middleware. Since Rack middleware appears to just magically work when you run use MyMiddleware with an instantiated Rack app, I'm not really sure how I can do this in both config.ru and in my CLI. Right now it looks like I need to instantiate the app in a separate location, add the middleware, then hand it over to config.ru or the CLI. Which, I could do, but it feels like there has to be a way to attach middleware in a cleaner way. For instance, can I require config.ru in some way and then run it? Or can I attach middleware before I instantiate the app?
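config.ru is just a Ruby file; Rails loads it when it starts the server. Rack evaluates it inside a Rack::Builder, which is where use and run come from, so a plain require will not work (the file has no .rb extension, and run is not defined at the top level). If you'd like to load it yourself, Rack::Builder.parse_file('config.ru') does that.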
If you want to really figure out how Rails does it, the config loading is buried in this part of the Rails CLI:
https://github.com/rails/rails/blob/3cac5fe94f0f81b4263cfa03d4822c05a55eb49c/railties/lib/rails/application.rb
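On the "magic" of use: a middleware is just an object constructed around the next app in the chain. That means you can assemble the whole stack in one place and hand the result to both config.ru and the CLI. Here is a rough sketch in plain Ruby; Timing, HelloApp, and build_app are hypothetical names for illustration:

```ruby
# A middleware wraps the next app and may alter the request or response.
class Timing
  def initialize(app)
    @app = app
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    # Add the elapsed time (in seconds) as a response header.
    [status, headers.merge('x-elapsed' => format('%.6f', elapsed)), body]
  end
end

# The innermost app at the end of the chain.
class HelloApp
  def call(_env)
    [200, { 'content-type' => 'text/plain' }, ['hello']]
  end
end

# The single place where the stack is assembled. This is roughly what
# `use Timing` followed by `run HelloApp.new` amounts to in a config.ru.
def build_app
  Timing.new(HelloApp.new)
end
```

With something like this in config/application.rb, config.ru shrinks to a require plus run build_app, and the CLI can pass build_app to whichever server it starts.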

Heroku and Web scraping

I have a Nokogiri web scraper that writes to a database, and I'm trying to deploy it to Heroku. I have a Sinatra frontend that I want to pull from the database. I'm new to Heroku and web development, and don't know the best way to handle something like this.
Do I have to place the web scraper script that uploads to the database under a Sinatra route (like mywebsite.com/scraper) and just make it so obscure that no one visits it? In the end, I'd like the Sinatra part to be a REST API that pulls from the database.
Thanks for all input
There are two approaches you can take.
The first is to use one-off dynos, running the scraper through the console with heroku run YOURCMD. Just make sure the scraper doesn't write to disk but uses the database.
More information:
https://devcenter.heroku.com/articles/one-off-dynos
The second is to separate the scraper from the web process: a web process handles normal UI interaction, and a scraper process can be spawned by, or talk to, the web process. If you take this route, it's up to you how to protect it from the rest of the world (auth, URL obfuscation, etc.).
More information:
https://devcenter.heroku.com/articles/background-jobs-queueing
I did it by creating a rake task and using one-off dynos, as mentioned by XLII.
Here is my rake task file:
require 'bundler/setup'
Bundler.require

desc "Scrape Site"
# :environment is defined by Rails; drop the dependency if your project has no such task
task :scrape, [:companyname] => :environment do |t, args|
  puts "Company Name is: #{args[:companyname]}"
  agent = Mechanize.new
  agent.user_agent_alias = 'Mac Safari'
  puts "Agent (Mac Safari) created"
  # MORE SCRAPING CODE
end
You can run it by calling:
heroku run rake scrape[google]

What is the easiest servlet library in ruby?

What framework do you recommend for writing simple web applications in Ruby: WEBrick, Mongrel, or Sinatra?
I would like to answer in JSON to requests from a client. I would like to keep my own code decoupled from the HTTP framework as much as possible.
Do you know any other frameworks?
I wouldn't recommend using WEBrick, period. You would best be served by a Rack-compatible framework. You could write directly in Rack for speed, but it's really unnecessary since Sinatra is so much more pleasant and still very fast.
You may also want to check out Halcyon. I don't know if it's still maintained, but it's designed for writing APIs that respond in JSON.
WEBrick and Mongrel are servers, not frameworks for building web applications. As such, they have APIs that are lower level and tied to their own idiosyncrasies which makes them a bad place to start if you want to design your web application so that it can run on different servers.
I would look for a framework that builds on Rack, which is the standard base layer for building web apps and web frameworks in Ruby these days.
If you are making something really simple, learning Rack's interface by itself is a good place to start.
E.g., a Rack application that parses JSON out of a POST request's body and prints it back out prettified:
# in a file named config.ru
require 'json'

class JSONPrettyPrinterPrinter
  def call(env)
    request = Rack::Request.new(env)
    if request.post?
      # request.body is an IO-like object, so read it before parsing
      object = JSON.parse(request.body.read)
      [200, {'content-type' => 'application/json'}, [JSON.pretty_generate(object)]]
    else
      [200, {'content-type' => 'text/plain'}, ["nothing to see here"]]
    end
  end
end

# run needs an object that responds to call, so pass an instance
run JSONPrettyPrinterPrinter.new
You can run it by running rackup in the same directory as the file.
Or, if you want something a bit more high-level, you can use Sinatra, which looks like this:
require 'sinatra'
require 'json'

post '/' do
  # request.body is an IO-like object, so read it before parsing
  object = JSON.parse(request.body.read)
  JSON.pretty_generate(object)
end
Sinatra's README is a good introduction to its features.
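On keeping your own code decoupled from the HTTP layer: because a Rack app is only an object with a call method, you can write and exercise it without any server or framework loaded. Below is a sketch of the pretty-printer that relies only on two keys the Rack spec defines (REQUEST_METHOD and rack.input). The PrettyJson name and the 400 response for bad input are additions for illustration:

```ruby
require 'json'
require 'stringio'

# Hypothetical server-independent version of the pretty-printer.
class PrettyJson
  def call(env)
    return respond(200, 'nothing to see here') unless env['REQUEST_METHOD'] == 'POST'

    # rack.input is an IO-like object holding the request body.
    object = JSON.parse(env['rack.input'].read)
    respond(200, JSON.pretty_generate(object))
  rescue JSON::ParserError
    respond(400, 'invalid JSON')
  end

  private

  # Builds the [status, headers, body] triplet Rack expects.
  def respond(status, text)
    [status, { 'content-type' => 'text/plain' }, [text]]
  end
end

# Exercise the app directly with a hand-built env, no server involved.
env = { 'REQUEST_METHOD' => 'POST', 'rack.input' => StringIO.new('{"a":1}') }
status, _headers, body = PrettyJson.new.call(env)
puts status
puts body.first
```

The same class can be served with run PrettyJson.new in a config.ru, so the choice of server stays a deployment detail.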

Sinatra: three logs

I'm using a very simple Sinatra app that works well. However, every log message is repeated three times. I can bring that down to two by disabling the Sinatra logging with
disable :logging
but I still have two. The messages are slightly different, so I gather they are coming from Rack and somewhere else in the stack too.
How do I completely disable logging of successful web requests?
Rack adds its own logging as middleware.
Try running:
rackup -E none
This removes one log entry. The second one is Sinatra's native logging, which you've already disabled. And the third one is Rack::Lint logging, if I remember correctly.
The general approach is to restructure your app like this:
app.rb
require 'sinatra/base'

class App < Sinatra::Base
  get '/' do
    "hello"
  end
end
config.ru
# load app.rb from the current directory and run the class it defines
require './app'
run App
Or you can run the app outside rackup by adding this to app.rb:
if __FILE__ == $0
  App.run!
end

Resources