Can we combine these two loops so that they run in parallel? - ruby

Please see the code below:
all_links = driver.find_elements(:xpath, "//fieldset[contains(@class,'attachmentTable')]/table/tbody/tr/td/a")
all_attachment_names = driver.find_elements(:xpath, "//fieldset[contains(@class,'attachmentTable')]/legend")
all_links.each do |link|
  href = link.attribute("href").strip
  puts href
end
all_attachment_names.each do |name|
  text = name.attribute("text")
  puts text
end
Can these two loops be combined so that they run in parallel, given that both collections have the same number of elements?
I want to create a hash where the key is the text and the value is the href.

You can use zip to do this:
returned_hash = {}
all_links.zip(all_attachment_names) do |link, name|
  returned_hash[name.text] = link.attribute("href").strip
end
You can also do it in a functional programming style by extracting the href and text with map:
hrefs = all_links.map{|link| link.attribute("href").strip}
names = all_attachment_names.map{|name| name.text}
returned_hash = Hash[names.zip(hrefs)]
Doing so is (arguably) more aesthetically pleasing but somewhat less efficient because it requires twice as many iterations, and creates a couple of extra arrays, but unless you have an enormous number of links that's not going to be an issue.
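To try the zip approach without a browser session, here is a minimal sketch. FakeLink and FakeLegend are hypothetical stand-ins for the Selenium elements (they model only attribute and text), and the URLs are made up for illustration. Note that zip pads with nil when the receiver is longer than its argument, so the sketch relies on the stated assumption that both collections have the same length.

```ruby
# Hypothetical stand-ins for Selenium elements; only the methods used above are modelled.
FakeLink = Struct.new(:href) do
  def attribute(name)
    name == "href" ? href : nil
  end
end
FakeLegend = Struct.new(:text)

all_links = [
  FakeLink.new(" http://example.com/a.pdf "),
  FakeLink.new("http://example.com/b.pdf")
]
all_attachment_names = [FakeLegend.new("Report A"), FakeLegend.new("Report B")]

# Pair each name with its link, then turn the [key, value] pairs into a hash.
returned_hash = all_attachment_names.zip(all_links).map do |name, link|
  [name.text, link.attribute("href").strip]
end.to_h

puts returned_hash.inspect
```

Array#to_h on an array of pairs is available from Ruby 2.1; on older versions Hash[...] as shown above does the same job.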

map = {}
all_attachment_names.zip(all_links) do |a, l|
map[a] = l
end

Related

Ruby Nokogiri parsing omit duplicates

I'm parsing XML files and want to omit duplicate values from being added to my Array. As it stands, the XML looks like this:
<vulnerable-software-list>
<product>cpe:/a:octopus:octopus_deploy:3.0.0</product>
<product>cpe:/a:octopus:octopus_deploy:3.0.1</product>
<product>cpe:/a:octopus:octopus_deploy:3.0.2</product>
<product>cpe:/a:octopus:octopus_deploy:3.0.3</product>
<product>cpe:/a:octopus:octopus_deploy:3.0.4</product>
<product>cpe:/a:octopus:octopus_deploy:3.0.5</product>
<product>cpe:/a:octopus:octopus_deploy:3.0.6</product>
</vulnerable-software-list>
document.xpath("//entry[
number(substring(translate(last-modified-datetime,'-.T:',''), 1, 12)) > #{last_imported_at} and
cvss/base_metrics/access-vector = 'NETWORK'
]").each do |entry|
  product = entry.xpath('vulnerable-software-list/product').map { |product| product.content.split(':')[-2] }
  effected_versions = entry.xpath('vulnerable-software-list/product').map { |product| product.content.split(':').last }
  puts product
end
However, because of the XML input, this produces quite a few duplicates, so I end up with an array like ['Redhat','Redhat','Redhat','Fedora']
I already have the effected_versions taken care of, since those values don't duplicate.
Is there a way to make .map add only unique values?
If you need an array of unique values, just call the uniq method on the result:
product =
  entry.xpath('vulnerable-software-list/product').map do |product|
    product.content.split(':')[-2]
  end.uniq
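As a quick check that does not need Nokogiri, the same split-then-uniq idea can be run on plain strings. This is only a sketch: cpe_entries is a hypothetical array holding the text of a few of the product nodes from the question.

```ruby
# Hypothetical sample data: the text content of a few <product> nodes.
cpe_entries = [
  "cpe:/a:octopus:octopus_deploy:3.0.0",
  "cpe:/a:octopus:octopus_deploy:3.0.1",
  "cpe:/a:octopus:octopus_deploy:3.0.2"
]

# The second-to-last segment is the product name; uniq drops the repeats.
products = cpe_entries.map { |cpe| cpe.split(":")[-2] }.uniq

# The last segment is the version; these are already distinct.
versions = cpe_entries.map { |cpe| cpe.split(":").last }

puts products.inspect
puts versions.inspect
```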
There are many ways to do this:
input = ['Redhat','Redhat','Redhat','Fedora']
# approach 1
# self explanatory
result = input.uniq
# approach 2
# iterate through vals, and build a hash with the vals as keys
# since hashes cannot have duplicate keys, it provides a 'unique' check
result = input.each_with_object({}) { |val, memo| memo[val] = true }.keys
# approach 3
# Similar to the previous, we iterate through vals and add them to a Set.
# Adding a duplicate value to a set has no effect, and we can convert it to array
require 'set'
result = input.each_with_object(Set.new) { |val, memo| memo.add(val) }.to_a
If you're not familiar with each_with_object, it's very similar to reduce
Regarding performance, you can find some info if you search for it, for example What is the fastest way to make a uniq array?
From a quick test, I see these performing in order of increasing time. uniq is 5 times faster than each_with_object, which is 25% slower than the Set.new approach. That is probably because uniq is implemented in C. I only tested with one arbitrary input though, so it might not be true for all cases.
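If you want to repeat that comparison yourself, a rough sketch with the standard Benchmark module could look like the following. The input size and the number of distinct values are arbitrary assumptions, and the timings will differ between machines and Ruby versions, so treat the printed numbers as indicative only.

```ruby
require "benchmark"
require "set"

# Arbitrary test input: 100,000 strings with 1,000 distinct values.
input = Array.new(100_000) { |i| "value#{i % 1_000}" }

approaches = {
  "uniq"      => -> { input.uniq },
  "hash keys" => -> { input.each_with_object({}) { |val, memo| memo[val] = true }.keys },
  "set"       => -> { input.each_with_object(Set.new) { |val, memo| memo.add(val) }.to_a }
}

results = {}
approaches.each do |label, approach|
  # Benchmark.realtime returns the elapsed wall-clock time in seconds.
  seconds = Benchmark.realtime { results[label] = approach.call }
  puts format("%-10s %.4fs", label, seconds)
end
```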

Subtraction of two arrays with incremental indexes of the other array to a maximum limit

I have lots of math to do on lots of data, but it's all based on a few base templates. So instead of writing out the math between 2 arrays by hand, like this:
results = [a[0]-b[1],a[1]-b[2],a[2]-b[3]]
I want to just provide the base template, a[0]-b[1], and have it automatically fill, say, 50 places in the results array, so I don't always have to type it manually.
What would be the ways to do that? Would a good way be to create one method that does this automatically, so that I just tell it the math and it fills out an array?
I have no clue, I'm really new to programming.
a = [2,3,4]
b = [1,2,3,4]
results = a.zip(b.drop(1)).take(50).map { |v,w| v - w }
Custom
a = [2,3,4..............,1000]
b = [1,2,3,4,.............900]
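class Array
  # Returns [arr1[0] - arr2[1], arr1[1] - arr2[2], ...] with `limit` entries.
  def self.calculate_difference(arr1, arr2, limit)
    Array.new(limit) { |index| arr1[index] - arr2[index + 1] }
  rescue
    # Raised when `limit` runs past the end of either array (nil in the subtraction).
    raise "Index/Limit Error"
  end
end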
Call by:
Array.calculate_difference(a,b,50)
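To get closer to "I just tell it the math and it fills out an array", one option is a small helper that takes the template as a block. This is a sketch only; fill_from_template is a hypothetical name, and the sample arrays are made up.

```ruby
# Builds an array of `count` results by calling the block with each index 0...count.
def fill_from_template(count)
  Array.new(count) { |index| yield index }
end

a = [10, 20, 30]
b = [1, 2, 3, 4]

# The block is the "base template": a[i] - b[i + 1].
results = fill_from_template(3) { |i| a[i] - b[i + 1] }

puts results.inspect
```

The same helper works for any other template, for example fill_from_template(3) { |i| a[i] * b[i] }, as long as count does not run past the end of the arrays.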

How do I make multiple combinations with a string in ruby?

Input should be a string:
"abcd@gmail.com"
Output should be an Array of strings:
["abcd@gmail.com",
"a.bcd@gmail.com",
"ab.cd@gmail.com",
"abc.d@gmail.com",
"a.b.cd@gmail.com",
"a.bc.d@gmail.com",
"a.b.c.d@gmail.com"]
The idea: "Make every possible combination in the first string part ("abcd") with a dot. Consecutive dots are not allowed. There are no dots allowed in the beginning and in the end of the first string part ("abcd")"
This is what I've come up with so far:
text,s = "abcd".split""
i=0
def first_dot(text)
  text.insert 1,"."
end
def set_next_dot(text)
  i = text.rindex(".")
  text.delete_at i
  text.insert(i+1,".")
end
My approach was
write a function, that sets the first dot
write a function that sets the next dot
...(magic)
I do not know how to put the pieces together. Any Idea? Or perhaps a better way?
thanx in advance
edit:
I think I found the solution :)
I will post it in about one hour (it's brilliant -> truth tables, binary numbers, transposition)
...and here the solution
s = "abc"
states = s.length
possibilites = 2**states
def set_space_or_dot(value)
  value.gsub("0","").gsub("1",".")
end
def fill_with_leading_zeros(val, states)
  if val.length < states
    "0"*(states-val.length)+val
  else
    val
  end
end
a = Array.new(possibilites,s)
a = a.map{|x| x.split ""}
b = [*0...possibilites].map{|x| x.to_s(2).to_s}
b = b.map{|x| fill_with_leading_zeros x,states}
b = b.map{|x| x.split ""}
c = []
for i in 0 ... a.size
  c[i] = (set_space_or_dot (a[i].zip b[i]).join).strip
end
Changing pduersteler's answer a little bit:
possibilities = []
string = "abcd@example.com"
(string.split('@')[0].size-1).times do |pos|
  possibility = string.dup
  possibilities << possibility.insert(pos+1, '.')
end
How about this (probably needs a bit more fine-tuning to suit your needs):
s = "abcd"
(0..s.size-1).map do |i|
  start, rest = [s[0..i], s[(i+1)..-1]]
  (0..rest.size-1).map { |j| rest.dup.insert(j, '.') }.map { |s| "#{start}#{s}"}
end.flatten.compact
#=> ["a.bcd", "ab.cd", "abc.d", "ab.cd", "abc.d", "abc.d"]
An option would be to iterate n times through your string moving the dot, where n is the amount of chars minus 1. This is what you're doing right now, but without defining two methods.
Something like this:
possibilities = []
string = "abcd@example.com"
(string.split('@')[0].size-1).times do |pos|
  possibilities << string.dup.insert(pos+1, '.')
end
edit
Now tested. Thanks to the comments: you need to call .dup on the string before the insert. Otherwise, the dot gets inserted into the original string and stays there for each iteration, causing a mess. Calling .dup on the string copies it and works on the copy instead, leaving the original string untouched.
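The snippets above place only a single dot. For every combination of dots, the binary-number idea from the question's edit can be sketched compactly: treat each gap between two characters as one bit and let an integer run through all bit patterns. The method name dot_variants is hypothetical, and the sketch handles only the part before the domain; re-attaching the domain is left to the caller.

```ruby
# Returns every way of placing dots between the characters of `local`.
# Each gap between two characters is one bit of `mask`: 1 means "dot here".
def dot_variants(local)
  chars = local.chars
  gaps = chars.size - 1
  (0...(2**gaps)).map do |mask|
    chars.each_with_index.map do |char, index|
      # Integer#[] reads a single bit; the last character never gets a trailing dot.
      (index < gaps && mask[index] == 1) ? "#{char}." : char
    end.join
  end
end

variants = dot_variants("abcd")
puts variants.size
puts variants.inspect
```

Because a gap can hold at most one dot and there is no gap before the first or after the last character, consecutive, leading and trailing dots cannot occur. For a 4-character string this gives 2**3 = 8 variants, including the undotted original.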

How to rewrite this Ruby loop in a cleaner fashion

I'm implementing a loop in Ruby, but it looks ugly and I wonder if there's a neater, more Ruby-like way of writing it:
def get_all_items
  items = []; page = 1; page_items = nil
  while page_items != [] # Loop runs until no more items are received
    items += (page_items = get_page_items(page))
    page += 1
  end
  items
end
Note that the get_page_items method runs a HTTP request to get the items for the page, and there is no way of knowing the number of pages, or the total number of items, or the number of items for any page before actually executing the requests in order until one of them returns an empty item set.
Imagine leafing through a catalog and writing down all the products, without knowing in advance how many pages it has, or how many products there are.
I think that this particular problem is compounded because A) there's no API for getting the total number of items and B) the response from get_page_items is always truthy. Further, it doesn't make sense for you to iteratively call a method that is surely making individual requests to your DB with an arbitrary limit, only to concatenate them together. You should, at the risk of repeating yourself, implement this method to prompt a DB query (i.e. model.all).
Normally when you are defining an empty collection, iterating and transforming a set, and then returning a result, you should be using reduce (a.k.a inject):
array.reduce(0) { |result, item| result + item } # a quick sum
Your need to do a form of streaming in this same process makes this difficult without tapping into Enumerable. I find this to be a good compromise that is much more readable, even if a bit distasteful in fondling this items variable too much:
items = []
begin
  items << page_items = get_page_items(page ||= 1)
  page += 1
end until page_items.empty?
items.flatten
Here's how I'd have written it. You'll see it's actually more lines, but it's easier to read and more Rubyish.
def get_all_items
  items = []
  page = 1
  page_items = get_page_items page
  until page_items.empty? # Loop runs until no more items are received
    items += page_items
    page += 1
    page_items = get_page_items page
  end
  items
end
You could also implement get_page_items as an Enumerator which would eliminate the awkward page += 1 pattern but that might be overkill.
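A rough sketch of that Enumerator idea follows. Everything here is an assumption for illustration: FAKE_PAGES and the fetch_page lambda stand in for the real HTTP call, and each_item is a hypothetical name.

```ruby
# Hypothetical stand-in for the HTTP call: page number => items, empty when exhausted.
FAKE_PAGES = { 1 => %w[apple pear], 2 => %w[plum], 3 => [] }.freeze
fetch_page = ->(page) { FAKE_PAGES.fetch(page, []) }

# Wraps the paging logic in an Enumerator so callers never touch page numbers.
def each_item(fetch_page)
  Enumerator.new do |yielder|
    page = 1
    loop do
      page_items = fetch_page.call(page)
      break if page_items.empty?   # an empty page means there is nothing left
      page_items.each { |item| yielder << item }
      page += 1
    end
  end
end

all_items = each_item(fetch_page).to_a
puts all_items.inspect
```

The page counter still exists, but it is hidden inside the enumerator, and callers can use any Enumerable method (to_a, first, each_slice and so on) on the result.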
I don't know that this is any better, but it does have a couple of Ruby-isms in it:
def get_all_items
  items = []; n = 0; page = 1
  while items.push(*get_page_items(page)).length > n
    page += 1
    n = items.length
  end
  items # return the collected items; the while loop itself evaluates to nil
end
I would use this solution, which is a good compromise between readability and length:
def get_all_items
  [].tap do |items|
    page = 1
    until (page_items = get_page_items(page)).empty?
      items.concat(page_items) # concat keeps the result flat; << would nest each page
      page += 1
    end
  end
end
The short version, just for fun ;-)
i=[]; p=0; loop { i+=get_page_items(p+=1).tap { |r| return i if r.empty? } }
I wanted to write a functional solution which would closely resemble the task you want to achieve.
I'd say that your solution comes down to this:
For all page numbers from 1 on, you get the corresponding list of
items; Take lists while they are not empty, and join them into a
single array.
Sounds ok?
Now let's try to translate this, almost literally, to Ruby:
(1..Float::INFINITY). # For all page numbers from 1 on
map{|page| get_page_items page}. # get the corresponding list of items
take_while{|items| !items.empty?}. # Take lists while they are not empty
inject(&:+) # and join them into a single array.
Unfortunately, the above code won't work right away, as Ruby's map is not lazy, i.e. it would try to evaluate on all members of the infinite range first, before our take_while had the chance to peek at the values.
However, implementing a lazy map is not that hard at all, and it could be useful for other stuff. Here's one straightforward implementation, along with nice examples in the blog post.
module Enumerable
  def lazy_map
    Enumerator.new do |yielder|
      self.each do |value|
        yielder.yield(yield value)
      end
    end
  end
end
Along with a mockup of your actual HTTP call, which returns arrays of random length between 0 and 4:
# This simulates actual HTTP call, sometimes returning an empty array
def get_page_items page
  (1..rand(5)).to_a
end
Now we have all the needed parts to solve our problem easily:
(1..Float::INFINITY). # For all page numbers from 1 on
lazy_map{|page| get_page_items page}. # get the corresponding list of items
take_while{|items| !items.empty?}. # Take lists while they are not empty
inject(&:+) # and join them into a single array.
#=> [1, 1, 2, 3, 1]
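On Ruby 2.0 and later, the built-in Enumerable#lazy gives the same effect without patching Enumerable. The sketch below uses a deterministic stub instead of the random mockup so the result is predictable; STUB_PAGES and stub_page_items are hypothetical names standing in for the real HTTP call.

```ruby
# Hypothetical stand-in for the HTTP call: page number => items, empty when exhausted.
STUB_PAGES = { 1 => [1, 2], 2 => [3], 3 => [] }.freeze

def stub_page_items(page)
  STUB_PAGES.fetch(page, [])
end

all_items = (1..Float::INFINITY)
  .lazy                                   # evaluate one page at a time
  .map { |page| stub_page_items(page) }   # get the corresponding list of items
  .take_while { |items| !items.empty? }   # stop at the first empty page
  .to_a                                   # force evaluation: [[1, 2], [3]]
  .flatten(1)                             # join the pages into a single array

puts all_items.inspect
```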
It's a small (and almost entirely cosmetic) tweak, but one option would be to replace while page_items != [] with until page_items.empty?. It's a little more "Ruby-ish," in my opinion, which is what you're asking about.
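def get_all_items
  items = []; page = 0; page_items = nil
  # An empty array is truthy in Ruby, so the loop has to test empty? explicitly.
  items.concat(page_items) until (page_items = get_page_items(page += 1)).empty?
  items
end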

handling data structures (hashes etc) gracefully in ruby

I recently did a class assignment where I made a really hacky data structure. I ended up using nested hashes, which seems like a good idea, but is really hard to iterate through and manage.
I was doing general stuff, like one tag maps to a hash of items that map to prices and stuff like that. But some of them were getting more complicated.
I know that Rails uses a lot of more elegant-seeming stuff with symbols and such (which I never use, shameful face) and I was wondering how I could optimize this. For example, if I had my nested hashes something like this:
h["cool"][????][1.2]
is there a graceful way of pulling those values out? Maybe I'm just a total newbie in this regard but I wanted to set things straight before I started doing more things. Maybe I'm even looking for something different like a mix of array/hash or something. Please let me know!
It looks like you need to think about structuring your data more rigorously. Try creating a class for your items, which can contain prices among other things, and perhaps organising them in the way you need to access them. Think about what you want and place the information in structures in a way that makes sense to you. Anything else is a waste of time, both now and three months down the line when you need to extend the system and find you can't.
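As one possible illustration of that advice, here is a minimal sketch that replaces the innermost hash levels with a small value class. The names Item and items_by_tag and the sample data are hypothetical; a full class would work just as well as a Struct.

```ruby
# A lightweight value class instead of hash-in-hash-in-hash.
Item = Struct.new(:name, :price)

# One hash level remains: tag => list of items. The default block creates empty lists on demand.
items_by_tag = Hash.new { |hash, tag| hash[tag] = [] }
items_by_tag["cool"] << Item.new("sunglasses", 1.2)
items_by_tag["cool"] << Item.new("skateboard", 45.0)
items_by_tag["warm"] << Item.new("scarf", 9.5)

# Queries read as plain Enumerable calls on named attributes.
cheap_cool = items_by_tag["cool"].select { |item| item.price < 10 }.map(&:name)
puts cheap_cool.inspect
```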
Yes, it'll be quite a bit of work, and yes, it'll be worth it.
Edit: Revised to provide the rough path to the item. It can't know the name of the variable though.
Try this:
def iterate_nested(array_or_hash, depth = [], &block)
  case array_or_hash
  when Array # note: `when Array:` with a trailing colon is a syntax error since Ruby 1.9
    array_or_hash.each_with_index do |item, key|
      if item.class == Array || item.class == Hash
        iterate_nested(item, depth + [key], &block)
      else
        block.call(key, item, depth + [key])
      end
    end
  when Hash
    array_or_hash.each do |key, item|
      if item.class == Array || item.class == Hash
        iterate_nested(item, depth + [key], &block)
      else
        block.call(key, item, depth + [key])
      end
    end
  end
end
It should iterate to any depth necessary, limited by memory, etc, and return the key and item and depth of the returned item. Works with both hashes and arrays.
If you test with:
iterate_nested([[[1,2,3], [1,2,3]], [[1,2,3], [1,2,3]], [[1,2,3], [1,2,3]]]) do |key, item, depth|
puts "Element: <#{depth.join('/')}/#{key}> = #{item}"
end
It yields:
Element: <0/0/0/0> = 1
Element: <0/0/1/1> = 2
Element: <0/0/2/2> = 3
Element: <0/1/0/0> = 1
Element: <0/1/1/1> = 2
Element: <0/1/2/2> = 3
Element: <1/0/0/0> = 1
Element: <1/0/1/1> = 2
Element: <1/0/2/2> = 3
Element: <1/1/0/0> = 1
Element: <1/1/1/1> = 2
Element: <1/1/2/2> = 3
Element: <2/0/0/0> = 1
Element: <2/0/1/1> = 2
Element: <2/0/2/2> = 3
Element: <2/1/0/0> = 1
Element: <2/1/1/1> = 2
Element: <2/1/2/2> = 3
Cheerio!
h["cool"].keys
To then iterate the tree, you could use:
h["cool"].keys.each { |outer| h["cool"][outer].each { |inner| puts inner } }
It really depends on what you're trying to do (nowhere near enough information in the question), but if you need to dive in three or more levels into a Hash, you may very well want a recursive tree traversal algorithm:
def hash_traverse(hash)
  result = ""
  for key, value in hash
    result << key.to_s + ":\n"
    if !value.kind_of?(Hash)
      result << " " + value.to_s + "\n"
    else
      result << hash_traverse(value).gsub(/^/, " ")
    end
  end
  return result
end
Are you sure a Hash is the best data structure for what you're trying to do?
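If you do stay with nested hashes, Hash#dig (Ruby 2.3 and later) pulls a value out by its path and returns nil instead of raising when an intermediate key is missing. The data below is hypothetical and only mirrors the shape of the h["cool"][????][1.2] example from the question.

```ruby
# Hypothetical nested data in the shape tag => item => price => status.
h = { "cool" => { "sunglasses" => { 1.2 => "in stock" } } }

found = h.dig("cool", "sunglasses", 1.2)  # follows the whole path
missing = h.dig("cool", "hat", 1.2)       # nil, because "hat" is not a key

puts found.inspect
puts missing.inspect
```

With plain indexing, h["cool"]["hat"][1.2] would raise NoMethodError because h["cool"]["hat"] is nil.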
