Generate file inside _site with Jekyll plugin - ruby

I have written a Jekyll plugin, "Tags", which generates a file and returns a string of links to that file.
Everything is fine, but if I write that file directly into the _site folder, it is removed. If I put that file outside the _site folder, it is not generated inside _site.
Where and how should I add my file so that it is available inside the _site folder?

You should use class Page for this and call methods render and write.
This is an example to generate the archive page at my blog:
module Jekyll
  class ArchiveIndex < Page
    def initialize(site, base, dir, periods)
      @site = site
      @base = base
      @dir = dir
      @name = 'archive.html'
      self.process(@name)
      self.read_yaml(File.join(base, '_layouts'), 'archive_index.html')
      self.data['periods'] = periods
    end
  end
  class ArchiveGenerator < Generator
    priority :low
    def generate(site)
      periods = site.posts.reverse.group_by{ |c| {"month" => Date::MONTHNAMES[c.date.month], "year" => c.date.year} }
      index = ArchiveIndex.new(site, site.source, '/', periods)
      index.render(site.layouts, site.site_payload)
      index.write(site.dest)
      site.pages << index
    end
  end
end
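The grouping step in generate can be tried outside Jekyll. This is a minimal sketch, assuming a hypothetical Post struct in place of real Jekyll posts; it only shows how group_by with a hash key buckets posts by month and year, newest first.

```ruby
require 'date'

# Hypothetical stand-in for a Jekyll post: only the fields the grouping needs.
Post = Struct.new(:title, :date)

posts = [
  Post.new('A', Date.new(2014, 3, 2)),
  Post.new('B', Date.new(2014, 3, 20)),
  Post.new('C', Date.new(2014, 1, 5))
]

# Newest first, then bucket by a {"month", "year"} hash. Hashes with equal
# contents are equal keys, so posts from the same month share one bucket.
periods = posts.sort_by(&:date).reverse.group_by do |post|
  { 'month' => Date::MONTHNAMES[post.date.month], 'year' => post.date.year }
end

periods.each do |period, group|
  puts "#{period['month']} #{period['year']}: #{group.map(&:title).join(', ')}"
end
```

The layout can then iterate over the periods data in the same order.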

Ruby TempFile behaviour among different classes

Our processing server works mainly with TempFiles because they make things easier on our side: there is no need to delete them, since they are removed when garbage collected, and no need to handle name collisions.
Lately, we have been having problems with TempFiles getting GCed too early in the process, especially in one of our services, which converts a Foo file from a URL to a Bar file and uploads it to our servers.
For the sake of clarity, I have added a sample scenario below to make the discussion easier.
This workflow does the following:
Get a url as parameter
Download the Foo file as a TempFile
Duplicate it to a new TempFile
Download the related assets to TempFiles
Link the related assets into the local dup TempFile
Convert the Foo to Bar format
Upload it to our server
At times the conversion fails, and everything points to our local Foo file referencing related assets that were created and GCed before the conversion.
My two questions:
Is it possible that my TempFiles get GCed too early? From what I have read, Ruby's GC is very conservative precisely to avoid those scenarios.
How can I prevent this from happening? I could collect all related assets from download_and_replace_uri(node) and return them, so that they stay alive for as long as the ConvertService instance exists. But I'm not sure whether this would solve it.
myfile.foo
{
"buffers": [
{ "uri": "http://example.com/any_file.jpg" },
{ "uri": "http://example.com/any_file.png" },
{ "uri": "http://example.com/any_file.jpmp3" }
]
}
main.rb
ConvertService.new('http://example.com/myfile.foo')
ConvertService
class ConvertService
  def initialize(url)
    @url = url
    @bar_file = Tempfile.new
  end
  def call
    import_foo
    convert_foo
    upload_bar
  end
  private
  def import_foo
    @foo_file = ImportService.new(@url).call.edited_file
  end
  def convert_foo
    `create-bar "#{@foo_file.path}" "#{@bar_file.path}"`
  end
  def upload_bar
    UploadBarService.new(@bar_file).call
  end
end
ImportService
class ImportService
  def initialize(url)
    @url = url
    @edited_file ||= Tempfile.new
  end
  def call
    download
    duplicate
    replace
  end
  private
  def download
    @original = DownloadFileService.new(@url).call.file
  end
  def duplicate
    FileUtils.cp(@original.path, @edited_file.path)
  end
  def replace
    file = File.read(@edited_file.path)
    json = JSON.parse(file, symbolize_names: true)
    json[:buffers]&.each do |node|
      node[:uri] = DownloadFileService.new(node[:uri]).call.file.path
    end
    write_to_disk(@edited_file.path, json.to_json)
  end
end
DownloadFileService
module Helper
  class DownloadFileService < ApplicationHelperService
    def initialize(url)
      @url = url
      @file = Tempfile.new
    end
    def call
      uri = URI.parse(@url)
      Net::HTTP.start(
        uri.host,
        uri.port,
        use_ssl: uri.scheme == 'https'
      ) do |http|
        response = http.request(Net::HTTP::Get.new(uri.path))
        @file.binmode
        @file.write(response.body)
        @file.flush
      end
    end
  end
end
UploadBarService
module Helper
  class UploadBarService < ApplicationHelperService
    def initialize(file)
      @file = file
    end
    def call
      HTTParty.post('http://example.com/upload', body: { file: @file })
      # NOTE: The endpoint returns the URL for the uploaded file
    end
  end
end
Because your code is complex and some parts may be missing or obfuscated, the simple answer is to ensure that your Tempfile objects remain referenced for as long as they are needed. Otherwise they become eligible for garbage collection, which removes the temporary file from the file system and leads to the missing-file state you've encountered.
The Ruby documentation for Tempfile states: "When a Tempfile object is garbage collected, or when the Ruby interpreter exits, its associated temporary file is automatically deleted."
As per the comments, others may find this conversation helpful when running into this problem.
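One way to keep the files alive is to hold every Tempfile in a collection owned by a long-lived object. The following is a minimal sketch, not the asker's code: AssetStore, create, and cleanup are hypothetical names used only for illustration.

```ruby
require 'tempfile'

# Hypothetical holder: every Tempfile it creates stays referenced in @files,
# so the garbage collector cannot reclaim it (and delete the file on disk)
# until cleanup is called.
class AssetStore
  attr_reader :files

  def initialize
    @files = []
  end

  # Writes content to a new Tempfile and keeps a reference to it.
  def create(content)
    file = Tempfile.new('asset')
    file.binmode
    file.write(content)
    file.flush
    @files << file
    file
  end

  # Closes and unlinks every file explicitly once the work is done.
  def cleanup
    @files.each(&:close!)
    @files.clear
  end
end

store = AssetStore.new
paths = %w[one two].map { |body| store.create(body).path }

GC.start # the files survive a GC run because the store still references them
all_present = paths.all? { |path| File.exist?(path) }

store.cleanup
all_removed = paths.none? { |path| File.exist?(path) }
```

Passing only file.path around, as the replace step does, leaves nothing holding the Tempfile object itself, which is the situation this sketch avoids.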

`initialize': No such file or directory @ rb_sysopen when using Nokogiri to open site

I created a CLI program that uses a Scraper class to scrape a site. I am using Nokogiri and Open-URI. The error in the title keeps popping up. I looked online and did not find any help.
I made sure the site URL doesn't have typos.
From the CLI class I create a new Scraper instance, passing the site as the argument:
class KefotoScraper::CLI
  attr_accessor :kefoto_scraper
  def initialize
    site = "https://www.kefotos.mx"
    @kefoto_scraper = Scraper.new(site)
  end
end
In Scraper I have the following code:
class Scraper
  attr_accessor :doc, :product_names, :site, :name, :link
  def initialize(site)
    @site = site
    @doc = doc
    @product_names = product_names
    @name = name
    @link = link
    @price_range = [].uniq
    scrape_product
  end
  def get_html
    @doc = Nokogiri::HTML(open(@site))
    @product_names = doc.css(".navbar-nav li")
    product_names
  end
  def scrape_product
    get_html.each {|product|
      @name = product.css("span").text
      plink = product.css("a").attr("href").text
      @link = "#{site}#{link}"
      link_doc = Nokogiri::HTML(open(@link))
      pr = link_doc.scan(/[\$£](\d{1,3}(,\d{3})*(\.\d*)?)/)
      prices = pr_link.text
      prices.each {|price|
        if @price_range.include?(price[0]) == false
          @price_range << price[0]
        end
      }
      new_product = Products.new(@name, @price_range)
      puts new_product
    }
  end
end
I get the following error:
scraper.rb:18:in `initialize': No such file or directory @ rb_sysopen - https://www.kefotos.mx (Errno::ENOENT)
open by default operates on local files, not URLs. That error means "I can't find a file on your hard drive named https://www.kefotos.mx".
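The failure can be reproduced without Nokogiri. This is a small sketch, assuming a Unix-like system and no local file with that name; File.open stands in for the bare open call, which treats a plain string as a file path when open-uri is not loaded.

```ruby
# The string is treated as a path on the local file system, not as a URL.
error = nil
begin
  File.open('https://www.kefotos.mx')
rescue SystemCallError => e
  error = e
end

puts "#{error.class}: #{error.message}"
```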
You can let it work on URIs by requiring the open-uri library:
require 'open-uri'
This will make your code work (on Ruby 3.0 and later you must call URI.open instead of the bare open), but it is much better practice to use a proper HTTP client to read HTTP resources, as an attacker could potentially use an overloaded open() to access files on your machine's hard drive.
For example, if you were to use just net/http:
# At the top of your scraper.rb:
require 'net/http'
# Then, in your class:
link_doc = Nokogiri::HTML(Net::HTTP.get(URI(@link)))
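A slightly more defensive variant is sketched below. fetch_html is a hypothetical helper, not part of Nokogiri or net/http; it assumes you want to reject anything that is not an HTTP(S) URL before making a request, and to fail loudly on non-success responses.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: returns the response body for an HTTP(S) URL only.
def fetch_html(url)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes and
  # rejects local paths and other schemes before any I/O happens.
  raise ArgumentError, "not an HTTP(S) URL: #{url}" unless uri.is_a?(URI::HTTP)

  response = Net::HTTP.get_response(uri)
  raise "request failed with status #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  response.body
end

# Usage inside the scraper (performs a real network request):
#   link_doc = Nokogiri::HTML(fetch_html(@link))
```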

How to render all Jekyll pages with a different layout?

I'm trying to create a Jekyll plugin, which should go through all posts and render them with a different layout. Can't figure out how to do that. That's what I have so far:
module Jekyll
  class MyGenerator < Generator
    priority :low
    def generate(site)
      site.posts.docs.each do |doc|
        page = Page.new(site, site.source, File.dirname(doc.relative_path), doc.basename)
        page.do_layout(
          site.site_payload,
          'post' => Layout.new(site, site.source, '_layouts/my.html')
        )
        page.write(?)
        site.pages << page
      end
    end
  end
end
This code doesn't work.
In my code below, I'm rendering all my pages a second time with a nil layout. The resulting files all have the suffix "_BARE". To use a different layout instead, set its name where the layout is assigned.
module Jekyll
  class BareHtml < Page
    def initialize(site, base, dest_dir, src_dir, page)
      @site = site
      @base = base
      @dir = dest_dir
      @dest_dir = dest_dir
      @dest_name = page.basename
      file_name = "#{page.basename}_BARE.html"
      self.process(file_name)
      self.read_yaml(base, page.path)
      self.data['layout'] = nil ### <-- set the layout name here
    end
  end
  class BareHtmlGenerator < Generator
    safe true
    priority :low
    def generate(site)
      # Converter for .md > .html
      converter = site.find_converter_instance(Jekyll::Converters::Markdown)
      dest = site.dest
      src = site.source
      # Create destination path (File.exists? was removed in Ruby 3.2)
      FileUtils.mkpath(dest) unless File.exist?(dest)
      site_pages = site.pages.dup
      site_pages.each do |page|
        bare = BareHtml.new(site, site.source, dest, src, page)
        bare.content = converter.convert(bare.content)
        bare.render(site.layouts, site.site_payload)
        bare.write(site.dest)
        site.pages << bare
      end
    end
  end
end

Including Service class in Controller

I want to include the following class from my services folder in my controller.
Here is the class, in ..services/product_service.rb:
class MyServices
  class << self
    def screen_print
      "These are the words in screen print"
    end
  end
end
And all I want to do is this in my controller:
class AmazonsController < ApplicationController
  def index
    @joe = MyServices.screen_print
  end
end
I thought I could just include it in the controller, but it's not a module, so include isn't working. I also tried updating my config/application.rb file, and that didn't work either.
Your class name needs to be the same as the name of your file, I believe. So since your file is named product_service.rb, your class should be:
class ProductService
  class << self
    def screen_print
      "These are the words in screen print"
    end
  end
end
and in your controller:
class AmazonsController < ApplicationController
  def index
    @joe = ProductService.screen_print
  end
end
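The naming rule can be illustrated in plain Ruby. This is a simplified sketch of the convention, not Rails' actual implementation: expected_constant is a hypothetical helper, and it ignores namespaces and custom inflections such as acronyms.

```ruby
# Hypothetical helper: approximates the constant name the autoloader expects
# to find in a file, by camel-casing the file's base name.
def expected_constant(path)
  File.basename(path, '.rb').split('_').map(&:capitalize).join
end

puts expected_constant('services/product_service.rb') # prints "ProductService"
```

A file named product_service.rb that defines MyServices therefore does not match what the autoloader looks for.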
In addition to the naming problems already pointed out, Rails won't automatically require arbitrary files from folders it doesn't know about.
If you want files in a new folder to be automatically required, you need to add it to Rails' autoload paths:
# config/application.rb
config.autoload_paths << Rails.root.join('services')
See Auto-loading lib files in Rails 4 for more details.
Rails does not load files from uncommon locations. You will need to tell Rails that the services folder exists and that it should load files from it.
Add the following to your config/application.rb:
# Custom directories with classes and modules you want to be autoloadable.
config.autoload_paths += [Rails.root.join('app', 'services')]

Can't get page data from Jekyll plugin

I'm trying to write a custom tag plugin for Jekyll that will output a hierarchical navigation tree of all the pages (not posts) on the site. Basically, I want a bunch of nested <ul>s with links to the pages (with the page title as the link text), with the current page marked by a certain CSS class.
I'm very inexperienced with Ruby. I'm a PHP guy.
I figured I'd start just by trying to iterate through all the pages and output a one-dimensional list just to make sure I could at least do that. Here's what I have so far:
module Jekyll
  class NavTree < Liquid::Tag
    def initialize(tag_name, text, tokens)
      super
    end
    def render(context)
      site = context.registers[:site]
      output = '<ul>'
      site.pages.each do |page|
        output += '<li>'+page.title+'</li>'
      end
      output += '<ul>'
      output
    end
  end
end
Liquid::Template.register_tag('nav_tree', Jekyll::NavTree)
And I'm inserting it into my liquid template via {% nav_tree %}.
The problem is that the page variable in the code above doesn't have all the data that you'd expect. page.title is undefined and page.url is just the basename with a forward slash in front of it (e.g. for /a/b/c.html, it's just giving me /c.html).
What am I doing wrong?
Side note: I already tried doing this with pure Liquid markup, and I eventually gave up. I can easily iterate through site.pages just fine with Liquid, but I couldn't figure out a way to appropriately nest the lists.
Try:
module Jekyll
  # Add accessor for directory
  class Page
    attr_reader :dir
  end
  class NavTree < Liquid::Tag
    def initialize(tag_name, text, tokens)
      super
    end
    def render(context)
      site = context.registers[:site]
      output = '<ul>'
      site.pages.each do |page|
        # Fall back to the URL when the page has no title
        output += '<li>'+(page.data['title'] || page.url) +'</li>'
      end
      output += '</ul>'
      output
    end
  end
end
Liquid::Template.register_tag('nav_tree', Jekyll::NavTree)
The page title is not always defined (for example, for atom.xml), so you have to check for it. If it is missing, you can fall back to page.name or skip the entry:
def render(context)
  site = context.registers[:site]
  output = '<ul>'
  site.pages.each do |page|
    # Use the title when present, otherwise the file name
    unless page.data['title'].nil?
      t = page.data['title']
    else
      t = page.name
    end
    output += '<li>'+t+'</li>'
  end
  output += '</ul>'
  output
end
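For the nesting the question asks about, one possible approach is to build a tree keyed by URL segments and render it recursively. This is a sketch in plain Ruby under stated assumptions: build_tree and render_tree are hypothetical helpers, the input is a list of [url, title] pairs standing in for Jekyll pages, the URLs are assumed to carry their full directory path, and directory names are assumed not to collide with file names.

```ruby
require 'cgi'

# Hypothetical helper: turns [url, title] pairs into a nested hash in which
# directories map to hashes and files map to their titles.
def build_tree(entries)
  tree = {}
  entries.each do |url, title|
    parts = url.split('/').reject(&:empty?)
    next if parts.empty?

    node = tree
    # Walk (or create) one hash per directory segment.
    parts[0..-2].each { |dir| node = (node[dir] ||= {}) }
    node[parts.last] = title
  end
  tree
end

# Hypothetical helper: renders the nested hash as nested <ul> lists.
def render_tree(tree)
  items = tree.map do |name, value|
    if value.is_a?(Hash)
      "<li>#{CGI.escapeHTML(name)}#{render_tree(value)}</li>"
    else
      "<li>#{CGI.escapeHTML(value)}</li>"
    end
  end
  "<ul>#{items.join}</ul>"
end

entries = [['/index.html', 'Home'], ['/a/b/c.html', 'C'], ['/a/d.html', 'D']]
html = render_tree(build_tree(entries))
puts html
```

Inside the tag's render method, the pairs could be collected from site.pages using the title fallback described in the answers here.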
Recently I faced a similar problem, where the error "cannot convert nil into string" was driving me crazy. My config.yml file has a line like "baseurl: /paradocs/jekyll/out/" for my local setup. For the server I needed to make baseurl empty, and that is when the error started to appear at build time. In the end I had to set "baseurl: /", and that did the job.
