I've written a simple Jekyll plugin to pull in my tweets using the twitter gem (see below). I'd like to keep the Ruby script for the plugin on my open GitHub site, but following recent changes to the Twitter API, the gem now requires authentication credentials.
require 'twitter' # Twitter API
require 'redcarpet' # Formatting links

module Jekyll
  class TwitterFeed < Liquid::Tag
    def initialize(tag_name, text, tokens)
      super
      input = text.split(/, */ )
      @user = input[0]
      @count = input[1]
      if input[1] == nil
        @count = 3
      end
    end

    def render(context)
      # Initialize a redcarpet markdown renderer to autolink urls
      # Could use octokit instead to get GFM
      markdown = Redcarpet::Markdown.new(Redcarpet::Render::HTML,
                                         :autolink => true,
                                         :space_after_headers => true)
      ## Attempt to load credentials externally here:
      require '~/.twitter_auth.rb'
      out = "<ul>"
      tweets = @client.user_timeline(@user)
      for i in 0 ... @count.to_i
        out = out + "<li>" + markdown.render(tweets[i].text) +
          " <a href=\"http://twitter.com/" + @user + "/statuses/" +
          tweets[i].id.to_s + "\">" + tweets[i].created_at.strftime("%I:%M %Y/%m/%d") +
          "</a> " + "</li>"
      end
      out + "</ul>"
    end
  end
end

Liquid::Template.register_tag('twitter_feed', Jekyll::TwitterFeed)
The line
  require '~/.twitter_auth.rb'
loads a file, twitter_auth.rb, that contains something like:
  require 'twitter'
  @client = Twitter::Client.new(
    :consumer_key => "CEoYXXXXXXXXXXX",
    :consumer_secret => "apnHXXXXXXXXXXXXXXXXXXXXXXXX",
  )
If I place these contents directly into the script above, then my plugin script works just fine. But when I move them to an external file and try to read them in as shown, Jekyll fails to authenticate. The function seems to work just fine when I call it from irb, so I am not sure why it does not work during the Jekyll build.
I think that you may be confused about how require works. When you call require, Ruby first checks whether the file has already been required; if so, it returns immediately. If it hasn’t, the contents of the file are run, but not in the same scope as the require statement. In other words, using require isn’t the same as replacing the require statement with the contents of the file (which is how, for example, C’s #include works).
In your case, when you require your ~/.twitter_auth.rb file, the @client instance variable is being created, but as an instance variable of the top-level main object, not as an instance variable of the TwitterFeed instance where require is being called from.
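To see this scoping behaviour in isolation, here is a minimal sketch that does not need Jekyll or the twitter gem. The names FeedDemo, write_fake_auth_file and main_object are made up for illustration, and the temporary file is only a stand-in for ~/.twitter_auth.rb.

```ruby
require 'tmpdir'

# Hypothetical stand-in for ~/.twitter_auth.rb: the file sets an instance
# variable at its top level, just like the file in the question.
def write_fake_auth_file
  path = File.join(Dir.mktmpdir, 'fake_twitter_auth.rb')
  File.write(path, "@client = 'fake client'\n")
  path
end

class FeedDemo
  # Requires the file from inside an instance method and returns whatever
  # @client is on this instance afterwards.
  def client_after_require(path)
    require path # the file runs with self set to main, not to this instance
    @client
  end
end

# The top-level "main" object that required files are evaluated against.
def main_object
  TOPLEVEL_BINDING.receiver
end

auth_path = write_fake_auth_file
puts FeedDemo.new.client_after_require(auth_path).inspect  # prints nil
puts main_object.instance_variable_get(:@client).inspect   # prints "fake client"
```

The instance never sees @client, while the main object does, which matches the failure during the Jekyll build.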
You could do something like assign the Twitter::Client object to a constant that you could then reference from the render method:
MyClient = Twitter::Client.new{...
and then
require '~/twitter_auth.rb'
@client = MyClient
I only suggest this as an explanation of what’s happening with require, it’s not really a good technique.
A better option, I think, would be to keep your credentials in a simple data format in your home directory, then read them from your script and create the Twitter client with them. In this case YAML would probably do the job.
First replace your ~/twitter_auth.rb with a ~/twitter_auth.yaml that looks something like:
:consumer_key: "CEoYXXXXXXXXXXX"
Then where you have require "~/twitter_auth.rb" in your class, replace it with this (you’ll also need require 'yaml' at the top of the file; File.expand_path is there because YAML.load_file does not expand ~ by itself):
@client = Twitter::Client.new(YAML.load_file(File.expand_path("~/twitter_auth.yaml")))
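As a rough sketch of the loading step on its own, the following uses only the standard library. The helper name load_credentials and the placeholder values are assumptions for illustration; the temporary file stands in for the YAML file in your home directory.

```ruby
require 'yaml'
require 'tmpdir'

# Loads a credentials hash from a YAML file. File.expand_path resolves a
# leading "~" to the home directory and leaves absolute paths unchanged.
def load_credentials(path)
  YAML.load_file(File.expand_path(path))
end

# Demo with a throwaway file and placeholder values (not real credentials).
Dir.mktmpdir do |dir|
  path = File.join(dir, 'twitter_auth.yaml')
  File.write(path, ":consumer_key: \"KEY\"\n:consumer_secret: \"SECRET\"\n")
  creds = load_credentials(path)
  puts creds[:consumer_key] # prints KEY
end
```

Because the YAML keys start with a colon, they come back as Ruby symbols, so the hash can be handed to a constructor that expects symbol keys.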
How do I write a script that retrieves a listing by ID and saves it to a file as JSON: http://sparkplatform.com/docs/api_services/listings
How do I write a script that creates a new contact record, and then prints the new contact’s record (standard output is fine): http://sparkplatform.com/docs/api_services/contacts
The spark_api gem's GitHub page, which should help answer the questions:
https://github.com/sparkapi/spark_api (provides auto parser)
SparkApi.client.get "/listings/#{listing_id}", :_expand => "CustomFields"
SparkApi.client.post "/listings/#{listing_id}/contacts"
I'm new to Ruby. How would I use the GET/POST requests properly?
Do the following:
Install ruby
Run gem install spark_api
Then, you basically need to run a get and a post request.
This is the script from the documentation:
require 'spark_api'

SparkApi.configure do |config|
  config.endpoint = 'https://sparkapi.com'
  # Using Spark API Authentication, refer to the Authentication documentation for OAuth2
  config.api_key = 'SPARK_API_KEY'
  config.api_secret = 'SPARK_API_SECRET'
end

listing_id = 12345
filename = 'my_file.json'

def get_listing(listing_id, filename)
  response = SparkApi.client.get "/listings/#{listing_id}", :_expand => "CustomFields"
  save_to_file(response, filename)
end

def create_contact(listing_id)
  SparkApi.client.post "/listings/#{listing_id}/contacts"
end

def save_to_file(response, filename)
  File.open(filename, 'w') do |f|
    f << response.body
  end
end
You could use your own HTTP client, such as Faraday or HTTParty, but it makes more sense to use the gem, which wraps all the API logic.
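For the "save it to a file as JSON" part, a minimal sketch using only the standard library might look like the following. It assumes the listing data ends up in a plain Ruby Hash or Array; what SparkApi.client.get actually returns is not confirmed here, and the helper name save_as_json and the sample data are made up.

```ruby
require 'json'
require 'tmpdir'

# Writes any JSON-serializable Ruby object to filename, pretty-printed.
def save_as_json(data, filename)
  File.write(filename, JSON.pretty_generate(data))
end

# Hypothetical listing data standing in for an API response.
listing = { 'Id' => '12345', 'StandardFields' => { 'City' => 'Fargo' } }

Dir.mktmpdir do |dir|
  path = File.join(dir, 'my_file.json')
  save_as_json(listing, path)
  puts File.read(path)
end
```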
I'm trying to call but I keep getting an error. This is my code:
require 'rubygems'
require 'net/http'
require 'uri'
require 'json'

class AlchemyAPI
  #Setup the endpoints
  @@ENDPOINTS = {}
  @@ENDPOINTS['taxonomy'] = {}
  @@ENDPOINTS['taxonomy']['url'] = '/url/URLGetRankedTaxonomy'
  @@ENDPOINTS['taxonomy']['text'] = '/text/TextGetRankedTaxonomy'
  @@ENDPOINTS['taxonomy']['html'] = '/html/HTMLGetRankedTaxonomy'
  @@BASE_URL = 'http://access.alchemyapi.com/calls'

  def initialize()
    begin
      key = File.read('C:\Users\KVadher\Desktop\api_key.txt')
      if key.empty?
        #The key file shouldn't be blank
        puts 'The api_key.txt file appears to be blank, please copy/paste your API key in the file: api_key.txt'
        puts 'If you do not have an API Key from AlchemyAPI please register for one at: http://www.alchemyapi.com/api/register.html'
      end
      if key.length != 40
        #Keys should be exactly 40 characters long
        puts 'It appears that the key in api_key.txt is invalid. Please make sure the file only includes the API key, and it is the correct one.'
      end
      @apiKey = key
    rescue => err
      #The file doesn't exist, so show the message and create the file.
      puts 'API Key not found! Please copy/paste your API key into the file: api_key.txt'
      puts 'If you do not have an API Key from AlchemyAPI please register for one at: http://www.alchemyapi.com/api/register.html'
      #create a blank file to hold the key
      File.open("api_key.txt", "w") {}
    end
  end

  # Categorizes the text for a URL, text or HTML.
  # For an overview, please refer to: http://www.alchemyapi.com/products/features/text-categorization/
  # For the docs, please refer to: http://www.alchemyapi.com/api/taxonomy/
  # flavor -> which version of the call, i.e. url, text or html.
  # data -> the data to analyze, either the url, text or html code.
  # options -> various parameters that can be used to adjust how the API works, see below for more info on the available options.
  # Available Options:
  # showSourceText -> 0: disabled (default), 1: enabled.
  # The response, already converted from JSON to a Ruby object.
  def taxonomy(flavor, data, options = {})
    unless @@ENDPOINTS['taxonomy'].key?(flavor)
      return { 'status'=>'ERROR', 'statusInfo'=>'Taxonomy info for ' + flavor + ' not available' }
    end
    #Add the URL encoded data to the options and analyze
    options[flavor] = data
    return analyze(@@ENDPOINTS['taxonomy'][flavor], options)
  end
end
In ** ** I have entered my call. Am I doing something incorrect? The error I receive is:
C:/Users/KVadher/Desktop/testrub:139:in `<class:AlchemyAPI>': undefined local variable or method `text' for AlchemyAPI:Class (NameError)
from C:/Users/KVadher/Desktop/testrub:6:in `<main>'
I feel as though I'm calling as normal and that there is something wrong with the api code itself? Although I may be wrong.
Yes, as jon snow says, the function (method) call must be outside of the class body. The methods are defined along with the class.
Also, options should be a Hash, not a number, because the method calls options[flavor] = data; passing a number is going to cause you another problem.
I believe you meant to put text in quotes, as that is one of your flavors.
Furthermore, because you declared a class, this is called an instance method, and you must make an instance of the class to use this:
my_instance = AlchemyAPI.new
my_taxonomy = my_instance.taxonomy("text", "trees")
That's enough to get it to work, it seems like you have a ways to go to get this all working though. Good luck!
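Here is a self-contained sketch of the same calling pattern that runs without an API key. TaxonomyStub is a made-up stand-in that only mirrors the flavor check from the question; it does not talk to AlchemyAPI and its return values are invented for the demo.

```ruby
# Hypothetical stub: mirrors the flavor check, makes no network calls.
class TaxonomyStub
  FLAVORS = %w[url text html].freeze

  def taxonomy(flavor, data, options = {})
    unless FLAVORS.include?(flavor)
      return { 'status' => 'ERROR',
               'statusInfo' => "Taxonomy info for #{flavor} not available" }
    end
    options[flavor] = data # only works when options is a Hash
    { 'status' => 'OK', 'options' => options }
  end
end

# The call happens outside the class body, on an instance,
# with the flavor passed as a quoted string.
stub = TaxonomyStub.new
p stub.taxonomy('text', 'trees')
p stub.taxonomy('pdf', 'trees')
```

Passing a number as the third argument would raise NoMethodError at options[flavor] = data, which is the second problem mentioned above.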
I am trying to scrape a website and store data in XML using Mechanize and Nokogiri. I didn't set up a Rails project and I am only using Ruby and IRB.
I wrote this method:
def mechanize_club
  agent = Mechanize.new
  form = agent.page.forms.first
  form.field_with(:name => 'codeLigue').options[0].select
  page2 = agent.get('http://www.rechercheclub.applipub-fft.fr/rechercheclub/club.do?codeClub=01670001&millesime=2015')
  body = page2.body
  html_body = Nokogiri::HTML(body)
  codeclub = html_body.search('.form').children("tr:first").children("th:first").to_i
  @codeclubs << codeclub
  filepath = '/davidgeismar/Documents/codeclubs.xml'
  builder = Nokogiri::XML::Builder.new(encoding: 'UTF-8') do |xml|
    xml.root {
      xml.codeclubs {
        @codeclubss.each do |c|
          xml.codeclub {
            xml.code_ c.code
          }
        end
      }
    }
  end
  puts builder.to_xml
end
My first problem is that I don't know how to test my code.
I call ruby webscraper.rb in my console, the file is treated I think, but it doesn't create an XML file in the specified path.
Then, more specifically I am quite sure this code is wrong as I didn't get a chance to test it.
Basically what I am trying to do is to submit a form several times:
agent = Mechanize.new
form = agent.page.forms.first
form.field_with(:name => 'codeLigue').options[0].select
I think this code is OK, but I don't want it to only select options[0]. I want it to select an option, then scrape all the data I need, then go back to the page, then select options[1], and so on until there are no more options (an iteration, I guess).
the file is treated I think, but it doesnt create an xml file in the specified path.
There is nothing in your code that creates a file. You print some output, but don't do anything to open or write a file.
Perhaps you should read the IO and File documentation and review how you are using your filepath variable?
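As a minimal sketch of the missing step, this writes an XML string to a file using only the standard library. The helper name write_xml is made up, and the literal string stands in for builder.to_xml, which returns a String.

```ruby
require 'tmpdir'

# Writes an XML string to path and returns the number of bytes written.
def write_xml(xml_string, path)
  File.write(path, xml_string)
end

# Stand-in for the string that builder.to_xml would return.
xml = %(<?xml version="1.0" encoding="UTF-8"?>\n<root><codeclubs/></root>\n)

Dir.mktmpdir do |dir|
  path = File.join(dir, 'codeclubs.xml')
  write_xml(xml, path)
  puts File.exist?(path) # prints true
end
```

In the method from the question, the equivalent line would be something like File.write(filepath, builder.to_xml), assuming the target directory already exists.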
The second problem is that you don't call your method anywhere. Though it's defined and Ruby will see it and parse the method, it has no idea what you want to do with it unless you invoke the method:
def mechanize_club
  # ...
end

mechanize_club
I'd like to make Jekyll create an HTML file and a JSON file for each page and post. This is to offer a JSON API of my Jekyll blog - e.g. a post can be accessed either at /posts/2012/01/01/my-post.html or /posts/2012/01/01/my-post.json
Does anyone know if there's a Jekyll plugin, or how I would begin to write such a plugin, to generate two sets of files side-by-side?
I was looking for something like this too, so I learned a bit of ruby and made a script that generates JSON representations of Jekyll blog posts. I’m still working on it, but most of it is there.
I put this together with Gruntjs, Sass, Backbonejs, Requirejs and Coffeescript. If you like, you can take a look at my jekyll-backbone project on Github.
# encoding: utf-8
# Title:
# ======
# Jekyll to JSON Generator
# Description:
# ============
# A plugin for generating JSON representations of your
# site content for easy use with JS MVC frameworks like Backbone.
# Author:
# ======
# Jezen Thomas
# jezenthomas@gmail.com
# http://jezenthomas.com

module Jekyll
  require 'json'

  class JSONGenerator < Generator
    safe true
    priority :low

    def generate(site)
      # Converter for .md > .html
      converter = site.getConverterImpl(Jekyll::Converters::Markdown)
      # Iterate over all posts
      site.posts.each do |post|
        # Encode the HTML to JSON
        hash = { "content" => converter.convert(post.content)}
        title = post.title.downcase.tr(' ', '-').delete("’!")
        # Start building the path
        path = "_site/dist/"
        # Add categories to path if they exist
        if (post.data['categories'].class == String)
          path << post.data['categories'].tr(' ', '/')
        elsif (post.data['categories'].class == Array)
          path << post.data['categories'].join('/')
        end
        # Add the sanitized post title to complete the path
        path << "/#{title}"
        # Create the directories from the path
        FileUtils.mkpath(path) unless File.exist?(path)
        # Create the JSON file and inject the data
        f = File.new("#{path}/raw.json", "w+")
        f.puts JSON.generate(hash)
        f.close
      end
    end
  end
end
There are two ways you can accomplish this, depending on your needs. If you want to use a layout to accomplish the task, then you want to use a Generator. You would loop through each page of your site and generate a new .json version of the page. You could optionally make which pages get generated conditional upon the site.config or the presence of a variable in the YAML front matter of the pages. Jekyll uses a generator to handle slicing blog posts up into indices with a given number of posts per page.
The second way is to use a Converter (same link, scroll down). The converter will allow you to execute arbitrary code on your content to translate it to a different format. For an example of how this works, check out the markdown converter that comes with Jekyll.
I think this is a cool idea!
Take a look at JekyllBot and the following code.
require 'json'

module Jekyll
  class JSONPostGenerator < Generator
    safe true

    def generate(site)
      site.posts.each do |post|
        render_json(post, site)
      end
      site.pages.each do |page|
        render_json(page, site)
      end
    end

    def render_json(post, site)
      #add `json: false` to YAML to prevent JSONification
      if post.data.has_key? "json" and !post.data["json"]
        return
      end
      path = post.destination( site.source )
      #only act on post/pages index in /index.html
      return if /\/index\.html$/.match(path).nil?
      #change file path
      path['/index.html'] = '.json'
      #render post using no template(s)
      post.render( {}, site.site_payload)
      #prepare output for JSON
      post.data["related_posts"] = related_posts(post,site)
      output = post.to_liquid
      output["next"] = output["next"].id unless output["next"].nil?
      output["previous"] = output["previous"].id unless output["previous"].nil?
      #todo, figure out how to overwrite post.destination
      #so we can just use post.write
      File.open(path, 'w') do |f|
        f.write(output.to_json)
      end
    end

    def related_posts(post, site)
      related = []
      return related unless post.instance_of?(Post)
      post.related_posts(site.posts).each do |post|
        related.push :url => post.url, :id => post.id, :title => post.to_liquid["title"]
      end
      related
    end
  end
end
Both should do exactly what you want.
Say I have some HTML documents stored on S3 like this:
etc, etc
I'd like to serve these with a Rack (preferably Sinatra) application, mapping the following routes:
get "/posts/:id" do
  render "http://alan.aws-s3-bla-bla.com/posts/#{params[:id]}.html"
end

get "/posts/:posts_id/comments/:comments_id" do
  render "http://alan.aws-s3-bla-bla.com/posts/#{params[:posts_id]}/comments/#{params[:comments_id]}.html"
end
Is this a good idea? How would I do it?
There would obviously be a wait while you grabbed the file, so you could cache it or set etags etc to help with that. I suppose it depends on how long you want to wait and how often it is accessed, its size etc as to whether it's worth storing the HTML locally or remotely. Only you can work that bit out.
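If you go the caching route, one possible shape is a small in-memory cache with a time-to-live. This is only a sketch: HtmlCache and its clock parameter are made-up names, the fetch step is passed in as a block so the example runs without any network access, and a real app would likely want something sturdier.

```ruby
# A tiny in-memory cache with a time-to-live, keyed by URL.
class HtmlCache
  def initialize(ttl_seconds, clock: -> { Time.now })
    @ttl = ttl_seconds
    @clock = clock # injectable so expiry can be exercised without sleeping
    @entries = {}
  end

  # Returns the cached body for url if it is still fresh; otherwise calls
  # the block to fetch it, stores the result, and returns it.
  def fetch(url)
    entry = @entries[url]
    now = @clock.call
    return entry[:body] if entry && now - entry[:stored_at] < @ttl

    body = yield(url)
    @entries[url] = { body: body, stored_at: now }
    body
  end
end

cache = HtmlCache.new(60)
calls = 0
fake_fetch = ->(url) { calls += 1; "<p>#{url}</p>" } # stands in for an HTTP GET
2.times { cache.fetch('posts/1.html', &fake_fetch) }
puts calls # prints 1
```

The second lookup is served from memory, so the remote file is only fetched once per TTL window.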
If the last expression in the block is a string, it will automatically be rendered, so there's no need to call render as long as you've read the file into a string.
Here's how to grab an external file and put it into a tempfile:
require 'sinatra'
require 'tempfile'
require 'faraday'
require 'faraday_middleware'
#require 'faraday/adapter/typhoeus' # see https://github.com/typhoeus/typhoeus/issues/226#issuecomment-9919517 if you get a problem with the requiring
require 'typhoeus/adapters/faraday'

configure do
  Faraday.default_connection = Faraday::Connection.new(
    :headers => { :accept => 'text/plain', # maybe this is wrong
                  :user_agent => "Sinatra via Faraday"}
  ) do |conn|
    conn.use Faraday::Adapter::Typhoeus
  end
end

helpers do
  def grab_external_html( url )
    response = Faraday.get url # you'll need to supply this variable somehow, your choice
    tempfile = Tempfile.new('external_html') # Tempfile takes a basename, not a full URL
    tempfile.binmode
    tempfile.write(response.body)
    tempfile.rewind
    tempfile
  end
end

get "/posts/:whatever/" do
  tempfile = grab_external_html params[:whatever] # surely you'd do a bit more here…
end
This might work. You may also want to think about closing that tempfile, but the garbage collector and the OS should take care of it.