Gsub in-place not working - ruby

I have this code:
Firm.all.each do |firm|
url = firm.site
doc = Nokogiri::HTML(open(url))
data = doc.css("##{firm.menu_id} a")
data.each do |e|
e.text.strip!
e.text.gsub!(/[\n\t]*/,'')
puts e.text
end
end
The strings are displayed in the same format as the input (that is, the gsub! method is not affecting the string). I suspect that e.text may be immutable, but I'd like to confirm that.

The text method returns a new String each time, which can be seen using object_id:
e = Nokogiri::XML('<a>text</a>')
e.text.object_id == e.text.object_id # => false
If you want to modify the node's text, set the content:
e.at_css('a').content = "foo"
e.text # => "foo"
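To show why the in-place calls in the question appear to do nothing, here is a minimal sketch that does not need Nokogiri. FakeNode is a hypothetical stand-in, written only for this illustration, whose text method builds a fresh String on every call, just as the object_id check above shows for Nokogiri. The usual fix is to keep the cleaned result in a local variable.

```ruby
# Hypothetical stand-in for a Nokogiri node (an assumption for this sketch):
# its `text` method returns a brand-new String on every call.
class FakeNode
  def initialize(raw)
    @raw = raw
  end

  def text
    @raw.dup
  end
end

node = FakeNode.new("\n\t Home \t\n")

# These bang methods mutate a temporary String that is thrown away at once.
node.text.strip!
node.text.gsub!(/[\n\t]+/, '')
unchanged = node.text == "\n\t Home \t\n"

# Keep the cleaned result in a variable instead of mutating in place.
cleaned = node.text.gsub(/[\n\t]+/, '').strip
puts cleaned
```

With a real Nokogiri node the same pattern would apply: assign the result of e.text.strip.gsub(...) to a variable, or write it back through content= if the document itself should change.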

Related

Converting Ruby Hash into string with escapes

I have a Hash which needs to be converted in a String with escaped characters.
{name: "fakename"}
and should end up like this:
'name:\'fakename\''
I don't know what this type of string is called. Maybe there is an existing method that I simply don't know about.
At the end I would do something like this:
name = {name: "fakename"}
metadata = {}
metadata['foo'] = 'bar'
"#{name} AND #{metadata}"
which ends up in that:
'name:\'fakename\' AND metadata[\'foo\']:\'bar\''
Context: This query format is required to search the Stripe API: https://stripe.com/docs/api/customers/search
If possible I would use Stripe's gem.
In case you can't use it, this piece of code extracted from the gem should help you encode the query parameters.
require 'cgi'
# Copied from here: https://github.com/stripe/stripe-ruby/blob/a06b1477e7c28f299222de454fa387e53bfd2c66/lib/stripe/util.rb
class Util
def self.flatten_params(params, parent_key = nil)
result = []
# do not sort the final output because arrays (and arrays of hashes
# especially) can be order sensitive, but do sort incoming parameters
params.each do |key, value|
calculated_key = parent_key ? "#{parent_key}[#{key}]" : key.to_s
if value.is_a?(Hash)
result += flatten_params(value, calculated_key)
elsif value.is_a?(Array)
result += flatten_params_array(value, calculated_key)
else
result << [calculated_key, value]
end
end
result
end
def self.flatten_params_array(value, calculated_key)
result = []
value.each_with_index do |elem, i|
if elem.is_a?(Hash)
result += flatten_params(elem, "#{calculated_key}[#{i}]")
elsif elem.is_a?(Array)
result += flatten_params_array(elem, calculated_key)
else
result << ["#{calculated_key}[#{i}]", elem]
end
end
result
end
def self.url_encode(key)
CGI.escape(key.to_s).
# Don't use strict form encoding by changing the square bracket control
# characters back to their literals. This is fine by the server, and
# makes these parameter strings easier to read.
gsub("%5B", "[").gsub("%5D", "]")
end
end
params = { name: 'fakename', metadata: { foo: 'bar' } }
Util.flatten_params(params).map { |k, v| "#{Util.url_encode(k)}=#{Util.url_encode(v)}" }.join("&")
I now build the string like this, which works and is quite straightforward:
"email:\'#{email}\'"
email = "test@test.com"
key = "foo"
value = "bar"
["email:\'#{email}\'", "metadata[\'#{key}\']:\'#{value}\'"].join(" AND ")
=> "email:'test@test.com' AND metadata['foo']:'bar'"
which is accepted by Stripe API
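The string-building approach above can be wrapped in a small helper. This is only a sketch: build_search_query is a hypothetical name, not part of the Stripe gem, and it assumes that values contain no single quotes (such values would need escaping, which is not handled here).

```ruby
# Hypothetical helper (not from the Stripe gem): joins top-level fields and
# metadata entries into one search query string using the field:'value' form.
def build_search_query(fields = {}, metadata = {})
  clauses = fields.map { |key, value| "#{key}:'#{value}'" }
  clauses += metadata.map { |key, value| "metadata['#{key}']:'#{value}'" }
  clauses.join(' AND ')
end

query = build_search_query({ name: 'fakename' }, { 'foo' => 'bar' })
puts query
# prints: name:'fakename' AND metadata['foo']:'bar'
```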

ruby trie implementation reference issue

I am trying to implement a trie in Ruby but can't figure out what the problem is with my print + collect methods.
I just implemented the same thing in JS and it works fine. I guess the issue could be that Ruby passes objects by reference (unlike JS), or how variable assignment works in Ruby.
So if I run the code with string.clone as argument when I recursively call the collect function then I get:
["peter", "peter", "petera", "pdanny", "pdjane", "pdjanck"]
and if I pass string then:
["peterradannyjaneck", "peterradannyjaneck", "peterradannyjaneck", "peterradannyjaneck", "peterradannyjaneck", "peterradannyjaneck"]
Any ideas how to fix this?
the code:
class Node
attr_accessor :hash, :end_node, :data
def initialize
@hash = {}
@end_node = false
@data = data
end
def end_node?
end_node
end
end
class Trie
def initialize
@root = Node.new
@words = []
end
def add(input, data, node = @root)
if input.empty?
node.data = data
node.end_node = true
elsif node.hash.keys.include?(input[0])
add(input[1..-1], data, node.hash[input[0]])
else
node.hash[input[0]] = Node.new
add(input[1..-1], data, node.hash[input[0]])
end
end
def print(node = @root)
collect(node, '')
@words
end
private
def collect(node, string)
if node.hash.size > 0
for letter in node.hash.keys
string = string.concat(letter)
collect(node.hash[letter], string.clone)
end
@words << string if node.end_node?
else
string.length > 0 ? @words << string : nil
end
end
end
trie = Trie.new
trie.add('peter', date: '1988-02-26')
trie.add('petra', date: '1977-02-12')
trie.add('danny', date: '1998-04-21')
trie.add('jane', date: '1985-05-08')
trie.add('jack', date: '1994-11-04')
trie.add('pete', date: '1977-12-18')
print trie.print
Ruby's string concat mutates the string and doesn't return a new string. You may want the + operator instead. So basically change the 2 lines inside collect's for-loop as per below:
stringn = string + letter
collect(node.hash[letter], stringn)
Also, you probably want to either always reset @words to an empty array in print before calling collect, or make it a local variable in print and pass it to collect.
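Both suggestions can be combined in one compact sketch. It is not the asker's class: TrieNode, SimpleTrie and words are names chosen for this illustration, and the data payload is left out. Each recursive call receives a new string built with +, and the results go into an array passed down the recursion instead of an instance variable.

```ruby
# Names in this sketch (TrieNode, SimpleTrie, words) are illustrative only.
class TrieNode
  attr_accessor :children, :end_node

  def initialize
    @children = {}
    @end_node = false
  end
end

class SimpleTrie
  def initialize
    @root = TrieNode.new
  end

  def add(word)
    node = @root
    word.each_char { |ch| node = (node.children[ch] ||= TrieNode.new) }
    node.end_node = true
  end

  # Returns every stored word; a fresh result array is built on each call.
  def words
    collect(@root, '', [])
  end

  private

  def collect(node, prefix, found)
    found << prefix if node.end_node
    node.children.each do |letter, child|
      # `prefix + letter` creates a new String, so sibling branches
      # never see each other's letters.
      collect(child, prefix + letter, found)
    end
    found
  end
end

trie = SimpleTrie.new
%w[peter petra pete jane jack].each { |word| trie.add(word) }
p trie.words
```

Because the result array is local to each words call, calling it twice does not accumulate duplicates.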

How can I find acronym in a text?

My project reads many files (these files have title text and sections) and should find the title of the files that contain an acronym. This is my docs class:
class Doc
def initialize(id, secciones)
@id, @secciones = id, secciones
end
def to_s
result = "" + @id.to_s + "\n" + @secciones.to_s
return result
end
def tiene_acronimo(acr)
puts "a ver si tiene acronimos el docu.."
tiene_acronimo = false
secciones.each do |seccion|
if seccion.tiene_acronimo(acr)
tiene_acronimo = true
end
end
return tiene_acronimo
end
attr_accessor :id
attr_accessor :secciones
end
And this my sections class:
class Section
def initialize ()
@title = ""
@text = ""
end
def tiene_acronimo(acr)
return title.include?(acr) || text.include?(acr)
end
end
And this my method:
def test()
results = Array.new
puts "Dame el acronimo"
acr = gets
documentos_cientificos.each do |d|
if d.tiene_acronimo(acr)
results << d
end
end
The method gets an acronym and should find all documents that contain it. The include? method seems to ignore case and returns true if the docs contain any substring that looks like the acronym. For example:
Multiple sclerosis (**MS**), also known as # => `true`
Presenting signs and sympto**ms** # => `false` (but `include?` returns `true`)
How can I find an acronym more easily?
You could use some regex with the match function. The following regex will find a match if the content contains the FULL word provided. It will ignore substrings, and it will be case sensitive.
acr = "MS"
title = "Multiple sclerosis (MS), also known as"
text = "Presenting signs and symptoms"
title.match(/\b#{Regexp.escape(acr)}\b/) # => #<MatchData "MS">
text.match(/\b#{Regexp.escape(acr)}\b/) # => nil
or equivalently
title.match(/\b#{Regexp.escape(acr)}\b/).to_a.size > 0 # => true
text.match(/\b#{Regexp.escape(acr)}\b/).to_a.size > 0 # => false
...so you could redefine your function as:
def tiene_acronimo(acr)
regex_to_match = /\b#{Regexp.escape(acr)}\b/
has_acr = false
if (title.match(regex_to_match)) || (text.match(regex_to_match))
has_acr = true
end
return has_acr
end
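A shorter variant of the same idea, as a sketch: contains_acronym? is a hypothetical standalone helper, and it assumes Ruby 2.4 or later for String#match?. It also strips the acronym first, because a value read with gets keeps its trailing newline, which would otherwise become part of the pattern.

```ruby
# Hypothetical helper: true only when the acronym appears as a whole word.
# The match is case sensitive, so "ms" inside "symptoms" does not count.
def contains_acronym?(content, acronym)
  pattern = /\b#{Regexp.escape(acronym.strip)}\b/
  content.match?(pattern)
end

title = "Multiple sclerosis (MS), also known as"
text = "Presenting signs and symptoms"

p contains_acronym?(title, "MS\n") # => true
p contains_acronym?(text, "MS")    # => false
```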

Parse CSV Data with Ruby

I am trying to return a specific cell value based on two criteria.
The logic:
If ClientID = 1 and BranchID = 1, puts SurveyID
Using Ruby 1.9.3, I want to basically look through an excel file and for two specific values located within the ClientID and BranchID column, return the corresponding value in the SurveyID column.
This is what I have so far, which I found during my online searches. It seemed promising, but no luck:
require 'csv'
# Load file
csv_fname = 'FS_Email_Test.csv'
# Key is the column to check, value is what to match
search_criteria = { 'ClientID' => '1',
'BranchID' => '1' }
options = { :headers => :first_row,
:converters => [ :numeric ] }
# Save `matches` and a copy of the `headers`
matches = nil
headers = nil
# Iterate through the `csv` file and locate where
# data matches the options.
CSV.open( csv_fname, "r", options ) do |csv|
matches = csv.find_all do |row|
match = true
search_criteria.keys.each do |key|
match = match && ( row[key] == search_criteria[key] )
end
match
end
headers = csv.headers
end
# Once matches are found, we print the results
# for a specific row. The row `row[8]` is
# tied specifically to a notes field.
matches.each do |row|
row = row[1]
puts row
end
I know the last bit of code following matches.each do |row| is invalid, but I left it in in hopes that it will make sense to someone else.
How can I write puts surveyID if ClientID == 1 & BranchID == 1?
You were very close indeed. Your only error was setting the values of the search_criteria hash to strings '1' instead of numbers. Since you have converters: :numeric in there the find_all was comparing 1 to '1' and getting false. You could just change that and you're done.
Alternatively this should work for you.
The key is the line
Hash[row].select { |k,v| search_criteria[k] } == search_criteria
Hash[row] converts the row into a hash instead of an array of arrays. Select generates a new hash that has only those elements that appear in search_criteria. Then just compare the two hashes to see if they're the same.
require 'csv'
# Load file
csv_fname = 'FS_Email_Test.csv'
# Key is the column to check, value is what to match
search_criteria = {
'ClientID' => 1,
'BranchID' => 1,
}
options = {
headers: :first_row,
converters: :numeric,
}
# Save `matches` and a copy of the `headers`
matches = nil
headers = nil
# Iterate through the `csv` file and locate where
# data matches the options.
CSV.open(csv_fname, 'r', options) do |csv|
matches = csv.find_all do |row|
Hash[row].select { |k,v| search_criteria[k] } == search_criteria
end
headers = csv.headers
end
p headers
# Once matches are found, print the SurveyID of each matching row.
# The key must match the header text in the file exactly.
matches.each { |row| puts row['SurveyID'] }
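The same filtering can be tried without a file on disk. In this sketch the CSV text is made-up sample data standing in for FS_Email_Test.csv, and the options are passed as keyword arguments, which recent Ruby versions expect. With the numeric converter enabled, the criteria must be integers, as explained above.

```ruby
require 'csv'

# Made-up sample data (an assumption standing in for FS_Email_Test.csv).
data = <<~ROWS
  ClientID,BranchID,SurveyID
  1,1,101
  1,2,102
  2,1,103
  1,1,104
ROWS

# Integers, because converters: :numeric turns "1" into 1.
search_criteria = { 'ClientID' => 1, 'BranchID' => 1 }

table = CSV.parse(data, headers: true, converters: :numeric)

# Keep rows where every criterion matches, then pull out the SurveyID.
matching_rows = table.select do |row|
  search_criteria.all? { |column, wanted| row[column] == wanted }
end
survey_ids = matching_rows.map { |row| row['SurveyID'] }

p survey_ids # => [101, 104]
```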
Possibly...
require 'csv'
b_headers = false
client_id_col = 0
branch_id_col = 0
survey_id_col = 0
CSV.open('FS_Email_Test.csv') do |file|
file.find_all do |row|
if b_headers == false then
client_id_col = row.index("ClientID")
branch_id_col = row.index("BranchID")
survey_id_col = row.index("SurveyID")
b_headers = true
if branch_id_col.nil? || client_id_col.nil? || survey_id_col.nil? then
puts "Invalid csv file - Missing one of these columns (or no headers):\nClientID\nBranchID\nSurveyID"
break
end
else
puts row[survey_id_col] if row[branch_id_col] == "1" && row[client_id_col] == "1"
end
end
end

Can't convert symbol to integer from hash table

Edit: The issue is that I am unable to get the number of arrays within the hash, so that x = number of arrays and it can be used as function.each_index { |x| code }.
Trying to use the index of the amount of rows as a way of repeating an action X amount of times depending on how much data is pulled from a CSV file.
Terminal issued
=> Can't convert symbol to integer (TypeError)
Complete error:
=> ~/home/tests/Product.rb:30:in '[]' can't convert symbol into integer (TypeError) from ~home/tests/Product.rub:30:in 'getNumbRel'
from test.rb:36:in '<main>'
The function that is performing the action is:
def getNumRel
if defined? @releaseHashTable
return @releaseHashTable[:releasename].length
else
@releaseHashTable = readReleaseCSV()
return @releaseHashTable[:releasename].length
end
end
The csv data pull is just a hash of arrays, nothing snazzy.
def readReleaseCSV()
$log.info("Method "+"#{self.class.name}"+"."+"#{__method__}"+" has started")
$log.debug("reading product csv file")
# Create a Hash where the default is an empty Array
result = Array.new
csvPath = "#{File.dirname(__FILE__)}"+"/../../data/addingProdRelProjIterTestSuite/releaseCSVdata.csv"
CSV.foreach(csvPath, :headers => true, :header_converters => :symbol) do |row|
row.each do |column, value|
if "#{column}" == "prodid"
proHash = Hash.new { |h, k| h[k] = [ ] }
proHash['relid'] << row[:relid]
proHash['releasename'] << row[:releasename]
proHash['inheritcomponents'] << row[:inheritcomponents]
productId = Integer(value)
if result[productId] == nil
result[productId] = Array.new
end
result[productId][result[productId].length] = proHash
end
end
end
$log.info("Method "+"#{self.class.name}"+"."+"#{__method__}"+" has finished")
@productReleaseArr = result
end
Sorry, couldn't resist, cleaned up your method.
# empty brackets unnecessary, no uppercase in method names
def read_release_csv
# you don't need + here
$log.info("Method #{self.class.name}.#{__method__} has started")
$log.debug("reading product csv file")
# you're returning this array. It is not a hash. [] is preferred over Array.new
result = []
csvPath = "#{File.dirname(__FILE__)}/../../data/addingProdRelProjIterTestSuite/releaseCSVdata.csv"
CSV.foreach(csvPath, :headers => true, :header_converters => :symbol) do |row|
row.each do |column, value|
# to_s is preferred
if column.to_s == "prodid"
proHash = Hash.new { |h, k| h[k] = [ ] }
proHash['relid'] << row[:relid]
proHash['releasename'] << row[:releasename]
proHash['inheritcomponents'] << row[:inheritcomponents]
# to_i is preferred
productId = value.to_i
# this notation is preferred
result[productId] ||= []
# this is identical to what you did and more readable
result[productId] << proHash
end
end
end
$log.info("Method #{self.class.name}.#{__method__} has finished")
@productReleaseArr = result
end
You haven't given much to go on, but it appears that #releaseHashTable contains an Array, not a Hash.
Update: Based on the implementation you posted, you can see that productId is an integer and that the return value of readReleaseCSV() is an array.
In order to get the releasename you want, you have to do this:
@releaseHashTable[productId][n]['releasename']
where productId and n are integers, and the key is a string because the inner hashes are filled with string keys. Either you'll have to specify productId and n explicitly, or (if you don't know n) you'll have to introduce a loop to collect all the releasenames for all the releases of a particular productId.
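The loop mentioned above could look like the following sketch. The data is hypothetical, shaped like what the posted method builds (an Array indexed by product id, each slot holding an Array of Hashes with string keys), and release_names is a name invented for this illustration.

```ruby
# Hypothetical data in the same shape as the posted method's return value.
releases = []
releases[1] = [
  { 'relid' => ['10'], 'releasename' => ['alpha'] },
  { 'relid' => ['11'], 'releasename' => ['beta'] }
]
releases[3] = [{ 'relid' => ['12'], 'releasename' => ['gamma'] }]

# Hypothetical helper: collects all release names for one product id.
# Array(nil) is [], so a product id with no releases yields an empty list.
def release_names(releases, product_id)
  Array(releases[product_id]).flat_map { |release| release['releasename'] }
end

p release_names(releases, 1)        # => ["alpha", "beta"]
p release_names(releases, 2)        # => []
p release_names(releases, 1).length # => 2
```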
This is what Mark Thomas meant:
> a = [1,2,3] # => [1, 2, 3]
> a[:sym]
TypeError: can't convert Symbol into Integer
# the backtrace starts here
from (irb):2:in `[]'
from (irb):2
An Array is only accessible by an integer index, like so: a[1] fetches the second element of the array.
You return an array, and that's why your code fails:
#....
result = Array.new
#....
@productReleaseArr = result
# and then later on you call
@releaseHashTable = readReleaseCSV()
@releaseHashTable[:releasename] # which gives you TypeError: can't convert Symbol into Integer
