Word Count with Ruby

I am trying to figure out a way to count the words in a particular string that contains HTML.
Example String:
<p>Hello World</p>
Is there a way in Ruby to count the words in between the p tags? Or any tag for that matter?
Examples:
<p>Hello World</p>
<h2>Hello World</h2>
<li>Hello World</li>
Thanks in advance!
Edit (here is my working code)
Controller:
class DashboardController < ApplicationController
  def index
    @pages = Page.find(:all)
    @word_count = []
  end
end
View:
<% @pages.each do |page| %>
  <% page.current_state.elements.each do |el| %>
    <% @count = Hpricot(el.description).inner_text.split.uniq.size %>
    <% @word_count << @count %>
  <% end %>
  <li><strong>Page Name: <%= page.slug %> (Word Count: <%= @word_count.inject(0){|sum,n| sum+n } %>)</strong></li>
<% end %>

Here's how you can do it:
require 'hpricot'
content = "<p>Hello World...."
doc = Hpricot(content)
doc.inner_text.split.uniq
Will give you:
[
[0] "Hello",
[1] "World"
]
(sidenote: the output is formatted with awesome_print that I warmly recommend)

Sure
Use Nokogiri to parse the HTML/XML and XPath to find the element and its text value.
Split on whitespace to count the words

You'll want to use something like Hpricot to remove the HTML, then it's just a case of counting words in plain text.
Here is an example of stripping the HTML: http://underpantsgnome.com/2007/01/20/hpricot-scrub/

First, start with something able to parse HTML, like Hpricot, then use a simple regular expression to do what you want (for example, you can simply split on spaces and then count the pieces).
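
The two steps these answers describe (remove the markup, then split on whitespace) can be sketched in plain Ruby. This is only an illustration: it swaps the HTML parser for a naive regex that drops anything shaped like a tag, which is adequate for simple snippets but not for real pages, where Hpricot or Nokogiri is the safer choice. The method names drop_tags, word_count and unique_word_count are made up for this example.

```ruby
# Naive stand-in for an HTML parser (an assumption for this sketch):
# replace anything that looks like a tag with a space.
def drop_tags(html)
  html.gsub(/<[^>]*>/, " ")
end

# Total number of whitespace-separated words in the text content.
def word_count(html)
  drop_tags(html).split.size
end

# Number of distinct words, which is what split.uniq.size measures.
def unique_word_count(html)
  drop_tags(html).split.uniq.size
end

snippet = "<p>Hello World</p><h2>Hello World</h2><li>Hello World</li>"
puts word_count(snippet)        # total words
puts unique_word_count(snippet) # distinct words
```

Note that split.uniq counts each distinct word once, so leave out uniq if repeated words should count every time they appear.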

Related

Linking to search results with Ruby

I'm a complete novice in Ruby and Nanoc, but a project has been put in my lap. Basically, the code for the page brings back individual URLs for each item linking them to the manual. I'm trying to create a URL that will list all of the manuals in one search. Any help is appreciated.
Here's the code:
<div>
  <%
    manuals = @items.find_all('/manuals/autos/*')
                    .select {|item| item[:tag] == 'suv' }
                    .sort_by {|item| item[:search] }
    manuals.each_slice((manuals.size / 4.0).ceil).each do |manuals_column|
  %>
    <div>
      <% manuals_column.each do |manual| %>
        <div>
          <a href="<%= app_url "/SearchManual/\"#{manual[:search]}\"" %>">
            <%= manual[:search] %>
          </a>
        </div>
      <% end %>
    </div>
  <% end %>
</div>
As you didn't specify what items is returning, I wrote a general example:
require 'uri'
# let's suppose that your items query has the following output
manuals = ["Chevy", "GMC", "BMW"]
# build the base URL
url = "www.mycars.com/search/?list_of_cars="
# build the parameter that will be passed in the URL: escape each value
# (URI::encode was removed in Ruby 3.0) and join the values with commas
your_new_url = url + manuals.map { |car| URI.encode_www_form_component(car) }.join(",")
# www.mycars.com/search/?list_of_cars=Chevy,GMC,BMW
# In your controller, you will be able to get the parameter with
# params[:list_of_cars]; it will be a string, "Chevy,GMC,BMW"
# (the framework has already URL-decoded it), so call .split(',') to get each value.
Some considerations:
I don't know whether you are going to use this in a view or a controller. If it is in a view, then wrap the code in the <% %> syntax.
About the URL format, you can find more choices of how to build it in:
Passing array through URLs
When writing a question on SO, please put more work into it. That helps us find a quick answer to your question, and you will wait less for one.
If you need something more specific, just ask and I can see if I can answer.
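
As a rough sketch of the round trip described above, the comma-separated parameter can be built and read back with Ruby's standard uri library alone. The base URL and the list_of_cars parameter name come from the example above; the "http://" scheme and the "Land Rover" entry are additions for illustration, the latter to show that spaces and commas are escaped.

```ruby
require "uri"

manuals = ["Chevy", "GMC", "Land Rover"]

# Hypothetical base URL, taken from the example above with a scheme added.
base = "http://www.mycars.com/search/"

# encode_www_form escapes the value, including the commas and the space.
query = URI.encode_www_form(list_of_cars: manuals.join(","))
url = "#{base}?#{query}"
puts url

# Receiving side: decode the query string and split the list again.
# In Rails or Sinatra, params already holds the decoded string.
decoded = URI.decode_www_form(URI(url).query).to_h
cars = decoded["list_of_cars"].split(",")
puts cars.inspect
```

This only works if no individual value contains a comma; otherwise a repeated parameter such as list_of_cars[]=... is the usual alternative.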

Ruby - trying to make hashtags within string into links

I am trying to make the hashtags within a string into links.
e.g. I'd like a string that's currently: "I'm a string which contains a #hashtag" to transform into: "I'm a string which contains #hashtag", with #hashtag rendered as a link.
The code that I have at the moment is as follows:
<% @messages.each do |message| %>
<% string = message.content %>
<% hashtaglinks = string.scan(/#(\d*)/).flatten %>
<% hashtaglinks.each do |tag| %>
<li><%= string = string.gsub(/##{tag}\b/, link_to("google", "##{tag}") %><li>
<% end %>
<% end %>
I've been trying (in vain) for several hours to get this to work, reading through many similar Stack Overflow threads, but frustration has got the better of me. As a beginner Rubyist, I'd really appreciate it if someone could please help me out!
The code in my 'server.rb' is as follows:
get '/' do
@messages = Message.all
erb :index
end
post '/messages' do
content = params["content"]
hashtags = params["content"].scan(/#\w+/).flatten.map{|hashtag|
Hashtag.first_or_create(:text => hashtag)}
Message.create(:content => content, :hashtags => hashtags)
redirect to('/')
end
get '/hashtags/:text' do
hashtag = Hashtag.first(:text => params[:text])
@messages = hashtag ? hashtag.messages : []
erb :index
end
helpers do
def link_to(url,text=url,opts={})
attributes = ""
opts.each { |key,value| attributes << key.to_s << "=\"" << value << "\" "}
"<a href=\"#{url}\" #{attributes}>#{text}</a>"
end
end
Here is the code to get you started. This should replace (in-place) the hashtags in the string with the links:
<% string.gsub!(/#\w+/) do |tag| %>
<% link_to("##{tag}", url_you_want_to_replace_hashtag_with) %>
<% end %>
You may need to use html_safe on the string to display it afterwards.
The regex doesn't account for more complex cases, such as ##tag0 or #tag1#tag2: should tag0 and tag2 be considered hashtags? Also, \w matches underscores as well as letters and digits, so you may want to change it to something like [a-zA-Z0-9] if you want to limit tags to letters and digits only.
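
Outside of ERB, the block form of gsub can be tried on its own. The sketch below is a minimal illustration, not a drop-in fix: link_to is a simplified copy of the helper from the question's server.rb (URL first, then link text), link_hashtags is a made-up name, and the /hashtags/<name> URL without the leading # is an assumption about the routes.

```ruby
# Simplified copy of the question's helper: URL first, then the link text.
def link_to(url, text = url)
  "<a href=\"#{url}\">#{text}</a>"
end

# Hypothetical helper: replace every #word with a link to its hashtag page.
def link_hashtags(string)
  string.gsub(/#\w+/) do |tag|
    # tag still includes the leading "#"; drop it for the URL (assumed scheme).
    link_to("/hashtags/#{tag[1..-1]}", tag)
  end
end

message = "I'm a string which contains a #hashtag"
puts link_hashtags(message)
```

The block's return value replaces each match, so no inner scan loop is needed. The message text is not HTML-escaped here, so user-supplied content would need escaping before the links are inserted.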

How to convert a hash into a string in ruby

I'm trying to use the Wolfram API from Ruby. I found that you can create a hash from the text you would enter on the Wolfram page to find an answer. I managed to do something like this in my controller:
class CountController < ApplicationController
  def index
    @result = Wolfram.fetch('6*7')
    @hash = Wolfram::HashPresenter.new(@result).to_hash
    @pods = @hash[:pods]
  end
end
When I want to show this on my site I do something like this in my view:
<p>
  <b>Result:</b>
  <%= @result %>
  <br>
  <b>Hash:</b>
  <%= @hash %>
  <br>
  <b>Hash.pods</b>
  <%= @pods["Input"] %>
  <br>
</p>
And I have something like this on my page:
Result: #<Wolfram::Result:0x00000004758b78>
Hash: {:pods=>{"Input"=>["6×7"], "Result"=>["42"], "Number name"=>["forty-two"], "Number line"=>[""], "Illustration"=>["6 | \n | 7"]}, :assumptions=>{}}
Hash.pods ["6×7"]
I'd like to have just 6x7 instead of ["6x7"]. Is there a solution to change this hash into a string?
It is displayed as ["6×7"] because your hash stores the value inside an array. Displaying it any other way would be misleading. However, you can unwrap the single-element arrays in the pods hash with:
Hash[@hash[:pods].map { |key, value| [key, (value.kind_of?(Array) && value.size == 1) ? value.first : value] }]
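
For a single value, it may be simpler to take the first element of the array in the view. The sketch below uses a plain hash copied from the output shown in the question; the local variable pods stands in for the pods hash held by the controller.

```ruby
# Data copied from the question's output; pods stands in for the pods hash.
pods = {
  "Input"       => ["6×7"],
  "Result"      => ["42"],
  "Number name" => ["forty-two"]
}

# One value: take the first element, or join in case there are several.
puts pods["Input"].first
puts pods["Input"].join(", ")

# Whole hash: unwrap every single-element array, leave other values alone.
unwrapped = pods.transform_values do |value|
  value.is_a?(Array) && value.size == 1 ? value.first : value
end
puts unwrapped.inspect
```

In the view this would amount to something like <%= @pods["Input"].first %>.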

Parsing XML and how to handle nested Arrays and Hashes

I'm using this instance variable:
@response = HTTParty.get("http://www.bart.gov/dev/eta/bart_eta.xml")
I'm trying to parse the xml using rails 3.2:
<% @response.each do |r| %>
  <% r.each do |root| %>
    <%= root.class %>
  <% end %>
<% end %>
The output is
String Hash
I get "String Hash" for "root.class". I don't understand how it can be "String Hash," I would like to implement another "each" method to go deeper in the xml layers.
What does "String Hash" mean?
Your @response object is of the type HTTParty::Response.
It looks like it's wrapping an array with two values in it: the first value is a String, "root", and the second value is a Hash.
Since you have no line breaks in your ERB code, as you iterate through the array it is printing out String and Hash on the same line.
Try using root.inspect to dig deeper into what values you're actually iterating through.
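
The behaviour can be reproduced without HTTParty. The sketch below assumes the parsed response behaves like a Hash with a single "root" key; the keys inside it ("date", "station", "name") are invented for illustration and are not taken from the actual feed.

```ruby
# Hypothetical stand-in for the parsed XML: one "root" key holding a Hash.
parsed = {
  "root" => {
    "date"    => "01/01/2013",
    "station" => [{ "name" => "Station A" }, { "name" => "Station B" }]
  }
}

classes = []
parsed.each do |r|
  # With a single block argument, Hash#each yields a [key, value] array.
  r.each { |root| classes << root.class }
end
puts classes.inspect # the key is a String, the value is a Hash

# To go one layer deeper, iterate the inner Hash with two block arguments.
parsed["root"].each do |key, value|
  puts "#{key}: #{value.inspect}"
end
```

Using two block arguments (key, value) at each level avoids the intermediate two-element array and makes the nesting easier to follow.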

Rails 3 refactoring issue

The following view code generates a series of links with totals (as expected):
<% @jobs.group_by(&:employer_name).sort.each do |employer, jobs| %>
  <%= link_to employer, jobs_path() %> <%= "(#{jobs.length})" %>
<% end %>
However, when I refactor the view's code and move the logic to a helper, the code doesn't work as expected.
view:
<%= employer_filter(@jobs_clone) %>
helper:
def employer_filter(jobs)
jobs.group_by(&:employer_name).sort.each do |employer,jobs|
link_to employer, jobs_path()
end
end
The following output is generated:
<Job:0x10342e628>#<Job:0x10342e588>#<Job:0x10342e2e0>Employer A#<Job:0x10342e1c8>Employer B#<Job:0x10342e0d8>Employer C#<Job:0x10342ded0>Employer D#
What am I not understanding? At first blush, the code seems to be equivalent.
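In the first example, each link is output directly by ERB as the loop runs. In the second example, the view outputs the return value of the helper method, and each returns the collection it iterated over rather than the links built inside the block.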
Try this:
def employer_filter(jobs)
employer_filter = ""
jobs.group_by(&:employer_name).sort.each do |employer,jobs|
employer_filter += link_to(employer, jobs_path())
end
employer_filter
end
Then call it like this in the view:
raw(employer_filter(jobs))
Also note the use of "raw". Once you move generation of a string out of the template you need to tell rails that you don't want it html escaped.
For extra credit, you could use the "inject" command instead of explicitly building the string, but I am lazy and wanted to give you what I know would work w/o testing.
This syntax worked as I hoped it would:
def employer_filter(jobs_clone)
jobs_clone.group_by(&:employer_name).sort.collect { |group,items|
link_to( group, jobs_path() ) + " (#{items.length})"
}.join(' | ').html_safe
end
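
The difference between each and collect can be checked outside Rails. In the sketch below, Job is a plain Struct and link_to is a minimal stand-in for the Rails helper, both made up for this illustration; "/jobs" stands in for jobs_path().

```ruby
# Stand-ins for the Rails pieces (assumptions for this sketch).
Job = Struct.new(:employer_name)

def link_to(text, url)
  "<a href=\"#{url}\">#{text}</a>"
end

jobs = [Job.new("Employer B"), Job.new("Employer A"), Job.new("Employer A")]
grouped = jobs.group_by(&:employer_name).sort

# each discards the block's results and returns its receiver,
# which is why the helper printed the grouped Job objects.
each_result = grouped.each { |employer, _items| link_to(employer, "/jobs") }

# collect (alias map) keeps the block's results, so they can be joined.
links = grouped.collect do |employer, items|
  "#{link_to(employer, '/jobs')} (#{items.length})"
end.join(" | ")

puts each_result.equal?(grouped)
puts links
```

In a real helper the joined string still needs html_safe (or raw in the view), as in the answers above, because Rails escapes plain strings by default.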
