Sorting a tree structure by folders first in Ruby

I have an array of paths, array = [
'a.txt',
'b/a.txt',
'a/a.txt',
'a/z/a.txt'
]
I need to create a tree structure (for the jTree plugin), but it has to be sorted by folders first (alphabetically) and then leaves (also alphabetically).
A sorted tree structure with the above example would look like this:
a
  z
    a.txt
  a.txt
b
  a.txt
a.txt
EDIT: I'm looking to build a tree of HTML lists and list items, where each node is an LI, and a folder's LI is immediately followed by a sibling UL holding its children. This is one of the formats the jTree plugin takes as input. Structure for the above example:
<ul>
<li class="folder">a</li>
<ul>
<li class="folder">z</li>
<ul>
<li class="leaf">a.txt</li>
</ul>
</ul>
<li class="folder">b</li>
<ul>
<li class="leaf">a.txt</li>
</ul>
<li class="leaf">a.txt</li>
</ul>
This will build the tree structure as a hash tree:
array = ["home", "about", "about/history", "about/company", "about/history/part1", "about/history/part2"]
auto_hash = Hash.new{ |h,k| h[k] = Hash.new &h.default_proc }
array.each{ |path|
sub = auto_hash
path.split( "/" ).each{ |dir| sub[dir]; sub = sub[dir] }
}
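To see how a folders-first ordering can be read back out of that nested hash, here is a minimal sketch in plain Ruby using the paths from the question. The helper name tree_lines is hypothetical, and it assumes (as the data here does) that any key with an empty hash underneath is a leaf.

```ruby
# Build the nested hash from the paths in the question.
paths = ['a.txt', 'b/a.txt', 'a/a.txt', 'a/z/a.txt']
tree = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
paths.each do |path|
  # Walking down with inject creates each missing level on access.
  path.split('/').inject(tree) { |sub, dir| sub[dir] }
end

# Hypothetical helper: returns indented lines, folders first, then leaves,
# each group sorted alphabetically. A key with an empty hash is a leaf.
def tree_lines(branch, depth = 0)
  folders, leaves = branch.keys.sort.partition { |k| !branch[k].empty? }
  lines = []
  folders.each do |name|
    lines << ('  ' * depth) + name
    lines.concat(tree_lines(branch[name], depth + 1))
  end
  leaves.each { |name| lines << ('  ' * depth) + name }
  lines
end

puts tree_lines(tree)
```

One consequence of the "empty hash means leaf" assumption: a folder that contains nothing would be listed among the leaves.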

require 'rubygems'
require 'builder'
paths = ["home", "about", "about/history", "about/company", "about/history/part1", "about/history/part2"]
auto_hash = Hash.new{ |h,k| h[k] = Hash.new &h.default_proc }
paths.each do |path|
sub = auto_hash
path.split( "/" ).each{ |dir| sub[dir]; sub = sub[dir] }
end
def build_branch(branch, xml)
  directories = branch.keys.reject{|k| branch[k].empty? }.sort
  leaves = branch.keys.select{|k| branch[k].empty? }.sort
  directories.each do |directory|
    xml.li(directory, :class => 'folder')
    xml.ul do
      build_branch(branch[directory], xml)
    end
  end
  leaves.each do |leaf|
    xml.li(leaf, :class => 'leaf')
  end
end
xml = Builder::XmlMarkup.new(:indent => 2)
xml.ul do
build_branch(auto_hash, xml)
end
puts xml.target!

Related

Print the first letter in the view but I don't need to repeat the letter

I'm making a list of categories but I need the header to only show the first letter without repeating
This is for a list of all the categories of a store
Controller:
def show
  @category = Category.friendly.find(params[:id])
  @category_articles = @category.articles.paginate(page: params[:page], per_page: 12)
end
view:
<div class="container" id="tag-container">
<% @categories.each do |category| %>
<section>
<h2><%= category.name.first %></h2>
<%= link_to "#{category.name}", category_path(category)%>
<span>(<%= pluralize(category.articles.count,"")%>)</span>
</section>
<% end %>
</div>
I'll really appreciate if you can help me with this.
Supposing categories to be sorted by name, this can be an option. I'm using plain Ruby, but you can do the same with Rails. Consider categories array as the collection of records.
categories = %W(bat bet bot cat cut dot git got gut)
grouped_categories = categories.group_by { |w| w[0] }
This groups by first letter ({ |w| w[0] }) using Enumerable#group_by. The method returns a Hash that you can iterate with a nested loop:
grouped_categories
#=> {"b"=>["bat", "bet", "bot"], "c"=>["cat", "cut"], "d"=>["dot"], "g"=>["git", "got", "gut"]}
grouped_categories.each do |initial, vals|
puts "-#{initial}"
vals.each do |val|
puts "----#{val}"
end
end
It prints:
-b
----bat
----bet
----bot
-c
----cat
----cut
-d
----dot
-g
----git
----got
----gut
If the categories are not already in alphabetical order, sort them first and then chunk by initial:
categories = ["gut", "git", "bot", "cut", "got", "cat", "dot", "bet", "bat"]
categories.sort.chunk { |w| w[0] }.each { |ltr,a| puts "#{ltr}: #{a.join(' ')}" }
b: bat bet bot
c: cat cut
d: dot
g: git got gut
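Applied to the view in the question, the same grouping can drive the template so that each initial is printed once as a heading. The following is a minimal sketch using the standard library's ERB outside Rails; the Category struct and the sample names are stand-ins for the real records (an assumption), and in an actual Rails view link_to would replace the plain name.

```ruby
require 'erb'

# Stand-in for the ActiveRecord model (assumption: only `name` matters here).
Category = Struct.new(:name)
categories = %w[bet bat cat dot].map { |n| Category.new(n) }

# One entry per initial, holding every category that starts with it.
grouped = categories.sort_by(&:name).group_by { |c| c.name[0] }

template = ERB.new(<<~HTML, trim_mode: '-')
  <% grouped.each do |initial, items| -%>
  <section>
    <h2><%= initial %></h2>
  <% items.each do |category| -%>
    <span><%= category.name %></span>
  <% end -%>
  </section>
  <% end -%>
HTML

html = template.result(binding)
puts html
```

The heading moves out of the inner loop, so it is emitted once per group instead of once per category.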

how to find all links at the same depth with a common closest ancestor with nokogiri

d=<<"EOM"
<ul>
<li><a id=t href="t">a</a></li>
<li><a id=b href="b">b</a></li>
<li>
<ul>
<li>don't want inner</li>
<li>don't want inner</li>
</ul>
</li>
<li><a id=c href="c">c</a></li>
</ul>
<ul>
<li>don't want</li>
</ul>
EOM
doc = Nokogiri.HTML(d)
t = doc.css("#t")[0]
How can I get all hrefs that have the same outer container as "t" and are at the same depth as "t"? In this case I'd want just the hrefs t, b, c.
These will not always be in ul's; I'm just using that as an example.
To get all a tags with the same 'grandparent' as t you could do:
doc.css('a').select{|a| a.parent.parent == t.parent.parent}
To get their hrefs:
doc.css('a').select{|a| a.parent.parent == t.parent.parent}.map{|a| a[:href]}
If you know the IDs will be consistent:
puts doc.search('#t, #b, #c').map{ |n| n['href'] }
If you don't know what they would be, then XPath can get you there:
doc.search('//*[@id="t"]/../../*/*[@id]').to_html
=> "<a id=\"t\" href=\"t\">a</a><a id=\"b\" href=\"b\">b</a><a id=\"c\" href=\"c\">c</a>"
doc.search('//*[@id="t"]/../../*/*[@id]').map{ |n| n['href'] }
=> ["t", "b", "c"]
That means "find the node with an ID of 't', then back up two levels and look down finding the nodes with populated id attributes".
Thanks @pguardiario
The parent node could be at any level, so I modified your code like so:
t = doc.css("#a")[0]
r = []
p = t.parent
x = 0
while true
break if p.node_name == "body" || p.node_name == "html"
x += 1
r = doc.css('a').select{|a|
m = a
x.times { m = m.parent }
m == p
}
break if r.length > 1
p = p.parent
end
pp r.length
I'm sure there's a better way than this brute force method.

Get link and href text from html doc with Nokogiri & Ruby?

I'm trying to use the nokogiri gem to extract all the urls on the page as well their link text and store the link text and url in a hash.
<html>
<body>
<a href=#foo>Foo</a>
<a href=#bar>Bar </a>
</body>
</html>
I would like to return
{"Foo" => "#foo", "Bar" => "#bar"}
Here's a one-liner:
Hash[doc.xpath('//a[@href]').map {|link| [link.text.strip, link["href"]]}]
#=> {"Foo"=>"#foo", "Bar"=>"#bar"}
Split up a bit to be arguably more readable:
h = {}
doc.xpath('//a[@href]').each do |link|
h[link.text.strip] = link['href']
end
puts h
#=> {"Foo"=>"#foo", "Bar"=>"#bar"}
Another way:
h = doc.css('a[href]').each_with_object({}) { |n, h| h[n.text.strip] = n['href'] }
# yields {"Foo"=>"#foo", "Bar"=>"#bar"}
And if you're worried that the same link text might point to different URLs, you can collect the hrefs in arrays:
h = doc.css('a[href]').each_with_object(Hash.new { |h,k| h[k] = [ ]}) { |n, h| h[n.text.strip] << n['href'] }
# yields {"Foo"=>["#foo"], "Bar"=>["#bar"]}

xpath to find all following sibling adjacent nodes up til another type [duplicate]

I have HTML code like this:
<div id="first">
<dt>Label1</dt>
<dd>Value1</dd>
<dt>Label2</dt>
<dd>Value2</dd>
...
</div>
My code does not work.
doc.css("first").each do |item|
label = item.css("dt")
value = item.css("dd")
end
It shows all the <dt> tags first and then all the <dd> tags, but I need "label: value" pairs.
First of all, your HTML should have the <dt> and <dd> elements inside a <dl>:
<div id="first">
<dl>
<dt>Label1</dt>
<dd>Value1</dd>
<dt>Label2</dt>
<dd>Value2</dd>
...
</dl>
</div>
but that won't change how you parse it. You want to find the <dt>s and iterate over them, then at each <dt> you can use next_element to get the <dd>; something like this:
doc = Nokogiri::HTML('<div id="first"><dl>...')
doc.css('#first').search('dt').each do |node|
puts "#{node.text}: #{node.next_element.text}"
end
That should work as long as the structure matches your example.
Under the assumption that some <dt> may have multiple <dd>, you want to find all <dt> and then (for each) find the following <dd> before the next <dt>. This is pretty easy to do in pure Ruby, but more fun to do in just XPath. ;)
Given this setup:
require 'nokogiri'
html = '<dl id="first">
<dt>Label1</dt><dd>Value1</dd>
<dt>Label2</dt><dd>Value2</dd>
<dt>Label3</dt><dd>Value3a</dd><dd>Value3b</dd>
<dt>Label4</dt><dd>Value4</dd>
</dl>'
doc = Nokogiri.HTML(html)
Using no XPath:
doc.css('dt').each do |dt|
dds = []
n = dt.next_element
begin
dds << n
n = n.next_element
end while n && n.name=='dd'
p [dt.text,dds.map(&:text)]
end
#=> ["Label1", ["Value1"]]
#=> ["Label2", ["Value2"]]
#=> ["Label3", ["Value3a", "Value3b"]]
#=> ["Label4", ["Value4"]]
Using a Little XPath:
doc.css('dt').each do |dt|
dds = dt.xpath('following-sibling::*').chunk{ |n| n.name }.first.last
p [dt.text,dds.map(&:text)]
end
#=> ["Label1", ["Value1"]]
#=> ["Label2", ["Value2"]]
#=> ["Label3", ["Value3a", "Value3b"]]
#=> ["Label4", ["Value4"]]
Using Lotsa XPath:
doc.css('dt').each do |dt|
ct = dt.xpath('count(following-sibling::dt)')
dds = dt.xpath("following-sibling::dd[count(following-sibling::dt)=#{ct}]")
p [dt.text,dds.map(&:text)]
end
#=> ["Label1", ["Value1"]]
#=> ["Label2", ["Value2"]]
#=> ["Label3", ["Value3a", "Value3b"]]
#=> ["Label4", ["Value4"]]
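The rule behind all three variants is "start a new group at every <dt>". Here is a minimal plain-Ruby sketch of that rule using Enumerable#slice_before. The pairs array is an assumption for illustration: it stands in for the element children of the <dl> as [tag name, text] pairs and is not parsed HTML.

```ruby
# Stand-in for the children of the <dl>, in document order.
pairs = [
  ['dt', 'Label1'], ['dd', 'Value1'],
  ['dt', 'Label3'], ['dd', 'Value3a'], ['dd', 'Value3b']
]

# Start a new slice whenever a <dt> appears; the rest of the slice is its <dd>s.
groups = pairs.slice_before { |tag, _text| tag == 'dt' }.map do |(_, label), *dds|
  [label, dds.map(&:last)]
end

groups.each { |label, values| puts "#{label}: #{values.join(', ')}" }
```

Since a Nokogiri NodeSet is Enumerable, the same call should carry over to real nodes by slicing on the node name, though that variant is not exercised here.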
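After looking at the other answer, here is a less efficient way of doing the same thing. It collects the labels and values separately and pairs them by index, so it assumes exactly one <dd> per <dt>.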
require 'nokogiri'
a = Nokogiri::HTML('<div id="first"><dt>Label1</dt><dd>Value1</dd><dt>Label2</dt><dd>Value2</dd></div>')
dt = []
dd = []
a.css("#first").each do |item|
item.css("dt").each {|t| dt << t.text}
item.css("dd").each {|t| dd << t.text}
end
dt.each_index do |i|
puts dt[i] + ': ' + dd[i]
end
In CSS, a selector for an ID needs the # symbol in front of it (#first). For a class it's the . symbol.

finding common ancestor from a group of xpath?

say i have
html/body/span/div/p/h1/i/font
html/body/span/div/div/div/div/table/tr/p/h1
html/body/span/p/h1/b
html/body/span/div
How can I get the common ancestor? In this case the common ancestor of "font, h1, b, div" would be "span".
To find common ancestry between two nodes:
(node1.ancestors & node2.ancestors).first
A more generalized function that works with multiple nodes:
# accepts node objects or selector strings
class Nokogiri::XML::Element
def common_ancestor(*nodes)
nodes = nodes.map do |node|
String === node ? self.document.at(node) : node
end
nodes.inject(self.ancestors) do |common, node|
common & node.ancestors
end.first
end
end
# usage:
node1.common_ancestor(node2, '//foo/bar')
# => <ancestor node>
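If all you have are the path strings from the question rather than node objects, the same idea reduces to finding the longest shared prefix of the ancestor chains. This is a minimal sketch in plain Ruby; the helper name common_ancestor_of_paths is hypothetical, and it assumes each path is a "/"-separated chain from the root down to the node itself.

```ruby
# Hypothetical helper: name of the deepest element shared by the ancestor
# chains of every path, or nil if they share nothing.
def common_ancestor_of_paths(paths)
  # Drop the last segment so only ancestors are compared, not the node itself.
  chains = paths.map { |path| path.split('/')[0...-1] }
  return nil if chains.empty?

  shortest = chains.map(&:length).min
  # Count the leading positions at which every chain has the same segment.
  depth = (0...shortest).take_while { |i| chains.map { |c| c[i] }.uniq.size == 1 }.size
  depth.zero? ? nil : chains.first[depth - 1]
end

paths = [
  'html/body/span/div/p/h1/i/font',
  'html/body/span/div/div/div/div/table/tr/p/h1',
  'html/body/span/p/h1/b',
  'html/body/span/div'
]
puts common_ancestor_of_paths(paths)
```

Because this compares element names only, two different sibling elements with the same name would be treated as the same ancestor; with real nodes, intersecting the ancestors lists avoids that.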
The function common_ancestor below does what you want.
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML(DATA)
def common_ancestor *elements
return nil if elements.empty?
elements.map! do |e| [ e, [e] ] end #prepare array
elements.map! do |e| # build array of ancestors for each given element
e[1].unshift e[0] while e[0].respond_to?(:parent) and e[0] = e[0].parent
e[1]
end
# merge corresponding ancestors and find the last where all ancestors are the same
elements[0].zip(*elements[1..-1]).select { |e| e.uniq.length == 1 }.flatten.last
end
i = doc.xpath('//*[@id="i"]').first
div = doc.xpath('//*[@id="div"]').first
h1 = doc.xpath('//*[@id="h1"]').first
p common_ancestor i, div, h1 # => gives the p element
__END__
<html>
<body>
<span>
<p id="common-ancestor">
<div>
<p><h1><i id="i"></i></h1></p>
<div id="div"></div>
</div>
<p>
<h1 id="h1"></h1>
</p>
<div></div>
</p>
</span>
</body>
</html>
