Pushing to an array not working as expected - ruby

When I execute the code below, my array 'tasks' ends up with the same last row from the dbi call repeated for each row in the database.
require 'dbi'
require 'pp'
dbh = DBI.connect('DBI:SQLite3:test', 'test', '')
dbh.do("DROP TABLE IF EXISTS TASK;")
dbh.do("CREATE TABLE TASK(ID INT, NAME VARCHAR(20))")
# Insert two rows
1.upto(2) do |i|
sql = "INSERT INTO TASK (ID, NAME) VALUES (?, ?)"
dbh.do(sql, i, "Task #{i}")
end
sth = dbh.prepare('select * from TASK')
sth.execute
tasks = Array.new
while row=sth.fetch do
p row
p row.object_id
tasks.push(row)
end
pp(tasks)
sth.finish
So if I have two rows in my TASK table, then instead of getting this in the tasks array:
[[1, "Task 1"], [2, "Task 2"]]
I get this
[[2, "Task 2"], [2, "Task 2"]]
The full output looks like this:
[1, "Task 1"]
19877028
[2, "Task 2"]
19876728
[[2, "Task 2"], [2, "Task 2"]]
What am I doing wrong?

There seems to be some strange behavior in the row objects, which appear to be some kind of singleton, and that's why the dup method won't solve it.
Looking at the source code, it seems that the to_a method duplicates the inner row elements, which is why it works. So the answer is to call to_a on the row object; if you want, you can also transform it into a Hash to preserve the metadata.
while row=sth.fetch do
tasks.push(row.to_a)
end
But I recommend the more idiomatic Ruby way:
sth.fetch do |row|
tasks << row.to_a
end

Are you sure you have copied your code exactly as it is? AFAIK the code you have written shouldn't work at all... You mix two constructs that are not intended to be used that way.
Am I wrong to assume that you come from a C or Java background? Iteration in Ruby is very different, let me try to explain.
A while loop in Ruby has this structure:
while condition
# code to be executed as long as condition is true
end
A method with a block has this structure:
sth.fetch do |element|
# code to be executed once per element in the sth collection
end
Now there is something really important to understand: fetch, or any other method of this kind in Ruby, is not an iterator as you would encounter in C, for example. You do not have to call it again and again until the iterator hits the end of the collection.
You just call it once and give it a block as an argument, which is a kind of anonymous function (as in JavaScript). The fetch method will then pass ("yield") each element of the collection, one after another, to this block.
So the correct syntax in your case should be:
sth.fetch do |row|
p row
tasks.push row
end
which could otherwise be written like this, in a more "old school" fashion:
# define a function
# = this is your block
# (tasks is passed in explicitly, because a def cannot see outer local variables)
def process(row, tasks)
  p row
  tasks.push row
end
# pass each element of a collection to this function
# = this is done inside the fetch method
for row in sth
  process row, tasks
end
I would advise you to read more on blocks / procs / lambdas, because they are all over the place in Ruby, and IMHO are one of the reasons this language is so awesome. Iterators are just the beginning; you can do a LOT more with these. If you need good reference docs, the pickaxe is considered one of the best sources among Rubyists, and I can tell you more if you want.
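To make the yield mechanism described above concrete, here is a minimal sketch. TaskList and each_task are hypothetical names invented for this illustration; they are not part of DBI.

```ruby
# Hypothetical collection class, used only to illustrate yield.
class TaskList
  def initialize(*names)
    @names = names
  end

  # Called once by the user; it runs the given block once per stored name.
  def each_task
    @names.each { |name| yield name }
    self
  end
end

seen = []
TaskList.new("Task 1", "Task 2").each_task do |name|
  seen << name.upcase
end
p seen
```

The caller invokes each_task a single time, and the method itself decides how often the block runs.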

I don't know how your code works entirely, but I guess that if you change tasks.push(row) into tasks.push(row.dup), it should work. If that is the case, then sth.fetch keeps giving you the same array (same object id) each time, even though its content is renewed, and you are pushing the same array into tasks repeatedly.
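The aliasing effect described here can be reproduced without a database. The sketch below is a simplified model, not DBI's actual implementation: ReusingFetcher and collect_rows are hypothetical names, and the buffer is a plain Array, for which dup is enough. Per the other answer, the real row object apparently needs to_a instead.

```ruby
# Hypothetical stand-in for a statement handle that reuses one row buffer.
class ReusingFetcher
  def initialize(rows)
    @rows = rows
    @index = 0
    @buffer = []
  end

  # Returns the same Array object every time, refilled with the next row.
  def fetch
    return nil if @index >= @rows.size
    @buffer.replace(@rows[@index])
    @index += 1
    @buffer
  end
end

# Collects every fetched row, optionally copying it before storing it.
def collect_rows(copy)
  fetcher = ReusingFetcher.new([[1, "Task 1"], [2, "Task 2"]])
  tasks = []
  while (row = fetcher.fetch)
    tasks.push(copy ? row.dup : row)
  end
  tasks
end

aliased = collect_rows(false)
copied = collect_rows(true)
p aliased
p copied
```

Without the copy, every entry in tasks refers to the one shared buffer, so all entries show the last row.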

There are many things that could be happening, but try this.
First, make sure the assignment is passed to the while as its condition by using parens.
while (row=sth.fetch) do
p row
tasks.push(row)
end
Then the idiomatic ruby way
sth.fetch do |row|
p row
tasks << row # same as push
end

Related

How to set a variable equivalent to single array object while iterating through array?

Noob question. I need to pass 3,000+ URLs from a CSV sheet to Selenium. I need Selenium to navigate to each one of these links, scrape information and then put that information into a CSV.
The issue I am running into is when I push my CSV URLS into an array, I cannot pass one single object (url) into Selenium at a time.
I know I likely need some sort of loop. I have tried setting up loops and selecting from the array using .map, .select, and just a do loop.
urls.map do |url|
  @driver.navigate.to #{url}
  name = @driver.find_element(:css, '.sites-embed-footer>a').attribute('href')
  puts name
  kb_link = name
  kb_array.push(kb_link)
  puts 'urls is #{n}'
end
In the above example, Selenium returns an "invalid URL" error message. Debugging with Pry tells me that my 'url' object is not a single URL, but rather still the entire array.
How can I set Selenium to visit each URL from the array one by one?
EDIT: ----------------
So, after extensive debugging with Pry, I found a couple of issues. The first was that my CSV was feeding a nested array to my loop, which was causing the URL error. I had to flatten my array to un-nest it and get around that issue.
After that, I had to build a rescue into my loop so that my script didn't die when it encountered a page without the CSS element I was looking for.
Here's the finalized loop.
@urls1.each do |url|
  begin
    @driver.navigate.to(url)
    @driver.manage.timeouts.implicit_wait = 10
    name = @driver.find_element(:css, '.sites-embed-footer>a').attribute('href')
    puts name
    kb_link = name
    kb_array.push(kb_link)
    puts 'done'
  rescue Selenium::WebDriver::Error::NoSuchElementError
    # the page has no matching element: record a placeholder and move on
    puts 'no google doc'
    x = 'no google doc'
    kb_array.push(x)
    next
  end
end
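The two fixes from the edit above (flattening the nested rows, then rescuing inside each iteration) can be sketched without Selenium. Everything here is an assumption made for illustration: the rows literal mimics what a one-column CSV typically parses to, and MissingElement and scrape are hypothetical stand-ins for the Selenium error and the page lookup.

```ruby
# Assumed shape: a one-column CSV parses to one inner array per line.
rows = [["https://example.com/a"], ["https://example.com/b"]]
urls = rows.flatten.compact # un-nest, and drop any empty cells

# Hypothetical stand-in for Selenium::WebDriver::Error::NoSuchElementError.
class MissingElement < StandardError; end

# Hypothetical scraper: pretends the second page lacks the element.
scrape = lambda do |url|
  raise MissingElement if url.end_with?("/b")
  "#{url}/doc"
end

results = urls.map do |url|
  begin
    scrape.call(url)
  rescue MissingElement
    'no google doc' # one bad page no longer stops the whole run
  end
end
p results
```

Because the rescue sits inside the block, a failure affects only the current URL and the loop continues with the next one.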
What about using .each?
Example:
array = [1, 2, 3, 4, 5, 6]
array.each { |x| puts x }
In your code:
urls.each do |url|
  @driver.navigate.to(url)
  name = @driver.find_element(:css, '.sites-embed-footer>a').attribute('href')
  puts name
  kb_link = name
  kb_array.push(kb_link)
  puts "visited #{url}"
end
First of all, it doesn't make much sense to use map if you don't use the result of the block somewhere. map, applied to an Enumerable, returns a new Array, and you don't do anything with the returned array. In your case it would contain just the return values of puts, which is nil, so you would get back an array of nils, with the side effect that something is written to stdout.
If you are only interested in the side effects, each or each_with_index should be used to traverse an Enumerable. Given the problems you have with map and with each, I wonder what the actual content of your object urls is. Did you ever inspect it? You could do a
p urls
before entering the loop. With 3000 URLs, the output will be huge, but maybe you can run it on a simpler example with fewer URLs.
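As a small illustration of the difference between map and each (the variable names and values are made up for this sketch):

```ruby
urls = ["a", "bb"]

# map builds a new array from the block's return values.
lengths = urls.map { |u| u.length }

# each ignores the block's return values and returns the receiver itself.
returned = urls.each { |u| u.length }

# A block that ends in puts returns nil, so map collects only nils.
nils = urls.map { |u| puts u }

p lengths
p returned.equal?(urls)
p nils
```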

Ruby refactoring issue

Let's say I have an @array = [1, 2, 3]
I want to nest three blocks, each iterating over the array without the elements already used by the enclosing blocks, like this:
@array.each do |first|
  (@array - [first]).each do |second|
    (@array - [first] - [second]).each do |third|
      # do something
    end
  end
end
So each block iterates through the entire array except the elements being used by the preceding blocks. But this is kind of smelly and repetitive, because for each block I need to delete the preceding elements. What is the best way to refactor these blocks?
Thanks
The terminology for that is a permutation. Ruby has a built-in method for doing so:
@array.permutation(3) do |first, second, third|
# do something
end
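As a quick check, a sketch like the following compares the nested loops from the question with permutation. It assumes the array has no duplicate values, since array - [first] would remove every equal element. Both results are sorted before comparing, because the order in which permutation yields is not something to rely on.

```ruby
array = [1, 2, 3]

# The hand-written version from the question.
nested = []
array.each do |first|
  (array - [first]).each do |second|
    (array - [first, second]).each do |third|
      nested << [first, second, third]
    end
  end
end

# The built-in equivalent.
built_in = array.permutation(3).to_a

same = nested.sort == built_in.sort
p built_in.size
p same
```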

What's i in each_with_index block

Okay, so I'm reading a guide for Ruby and I can't make sense of this code. Where did i come from? I see that n is passed to iterate through the block, but I have no idea where i comes from. If I could get a full explanation and breakdown of how this code works, that would be great!
class Array
def iterate!
self.each_with_index do |n, i|
self[i] = yield(n)
end
end
end
array = [1, 2, 3, 4]
array.iterate! do |n|
n ** 2
end
i is the index of the element (hence the name, each_with_index).
Some methods that are called with code blocks will pass more than one value to the block, so you end up with multiple block arguments (in your case the block arguments are n and i, which will hold the current item in the array (n) and the index of it (i)).
You can find out how many arguments a block will be passed by looking at the documentation for a method (here's the docs for each_with_index). It does look like the extra values come from nowhere at first, and it takes a little while to memorize what a block will be passed when different methods are called.
i is commonly used as what's known as an "iteration variable". Basically, the loop block that you've copied here goes through each "iteration" of the loop: each_with_index passes the current element as n and its position as i, and the result of yield(n) is stored back at self[i], which happens on the line inside the block. In this case, n is simply the next number in array, and so there are four iterations of the loop.
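To see exactly what each_with_index hands to the block, a sketch like this records every pair it yields. The last line shows map, which gives the same squared values as the guide's iterate! but returns a new array instead of modifying the original.

```ruby
letters = %w[a b c]
pairs = []

# The first block argument is the element, the second is its index.
letters.each_with_index do |item, index|
  pairs << [item, index]
end
p pairs

# Same squared values as iterate!, but non-destructive.
squares = [1, 2, 3, 4].map { |n| n ** 2 }
p squares
```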

Can't convert String onto integer (TypeError)

The following code returns the error "Can't convert String into Integer". Please help.
subject = ['eng','Math','Sci']
grade = ['grade 1','grade 2','grade 3']
subject.each do |sub|
puts ("some string")
grade[sub] .each do |grd|
puts ("some string")
end
end
grade[sub] .each do |grd| is the problem.
Array elements are accessed by using an integer index or a range of integers.
You are trying to access an array element by using the value stored in sub. Since this is an ordinary .each loop, it will loop over all the elements in the array, in this case 'eng', 'Math', 'Sci'. If you want the position of, for example, 'eng', you could use .each_with_index.
It should probably just be
grade.each do |grd|
With each_with_index it would be
subject.each_with_index do |sub, index|
print sub
print grade[index]
end
If you want a subject -> grade collection it might be good to look into using a Hash like Dave Newton said.
{"eng" => "grade 1","Math" => "grade 2","Sci" => "grade 3"}.each do |subject, grade|
puts "#{subject} | #{grade}"
end
When you do
grade[sub] .each do |grd|
Ruby expects sub to be an integer that references a position in the list of grades.
subject and grade are both arrays. They can only be accessed by their position. In your each loop, the block gets the actual element (in sub), not the position. So in line 5, you are trying to access grade['eng'], which will not work and produces the error. Note that unlike in e.g. PHP, an array and a hash (an associative array) are different things.
Guessing from your code, you might want to use each_index instead of each which will pass the index number to the block instead of the element.
I'm not sure I understand what you're trying to achieve; however, if you'd like to print subjects and grades and you're sure about the relative order of elements in the arrays, you could do this:
subject = ['eng','Math','Sci']
grade = ['grade 1','grade 2','grade 3']
subject.each_with_index do |sub, idx|
puts "#{sub} - #{grade[idx]}"
end
Output:
eng - grade 1
Math - grade 2
Sci - grade 3
A hash is, however, probably more suitable for your needs.
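The following sketch reproduces the error and shows one way to pair the two arrays by position with zip, then turn the pairs into a hash. It assumes, as in the answers above, that the arrays are in matching order.

```ruby
subject = ['eng', 'Math', 'Sci']
grade = ['grade 1', 'grade 2', 'grade 3']

# Indexing an Array with a String raises TypeError.
error_class = begin
  grade['eng']
  nil
rescue TypeError => e
  e.class
end
p error_class

# zip pairs the elements by position; to_h turns the pairs into a Hash.
pairs = subject.zip(grade)
lookup = pairs.to_h
p lookup['Math']
```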

Accessing a passed block in Ruby

I have a method that accepts a block; let's call it outer. It in turn calls a method that accepts another block; call it inner.
What I would like to have happen is for outer to call inner, passing it a new block which calls the first block.
Here's a concrete example:
class Array
def delete_if_index
self.each_with_index { |element, i| ** A function that removes the element from the array if the block passed to delete_if_index is true }
end
end
['a','b','c','d'].delete_if_index { |i| i.even? }
=> ['b','d']
The block passed to delete_if_index is called by the block passed to each_with_index.
Is this possible in Ruby, and, more broadly, how much access do we have to the block within the function that receives it?
You can wrap a block in another block:
def outer(&block)
if some_condition_is_true
wrapper = lambda {
p 'Do something crazy in this wrapper'
block.call # original block
}
inner(&wrapper)
else
inner(&block)
end
end
def inner(&block)
p 'inner called'
yield
end
outer do
p 'inside block'
sleep 1
end
I'd say opening up an existing block and changing its contents is Doing it Wrong™; maybe continuation-passing would help here? I'd also be wary of passing around blocks with side effects; I try to keep lambdas deterministic and keep actions like deleting stuff in the method body. In a complex application this will likely make debugging a lot easier.
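A runnable variant of the wrapping idea might look like the sketch below. The log parameter is an assumption added only so the call order can be observed; it is not part of the original answer.

```ruby
# Receives a block implicitly and runs it.
def inner
  yield
end

# Wraps the caller's block in a lambda, then hands the wrapper to inner.
def outer(log, &block)
  wrapper = lambda do
    log << :before
    result = block.call # the original block
    log << :after
    result
  end
  inner(&wrapper)
end

log = []
value = outer(log) do
  log << :original
  42
end
p log
p value
```

The original block is never modified; the wrapper simply closes over it and decides when to call it.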
Maybe the example is poorly chosen, but your concrete example is the same as:
[1,2,3,4].reject &:even?
Opening up and modifying a block strikes me as code smell. It'd be difficult to write it in a way that makes the side effects obvious.
Given your example, I think a combination of higher order functions will do what you're looking to solve.
Update: It's not the same, as pointed out in the comments. [1,2,3,4].reject(&:even?) looks at the contents, not the index (and returns [1,3], not [2,4] as it would in the question). The one below is equivalent to the original example, but isn't very pretty.
[1,2,3,4].each_with_index.reject {|element, index| index.even? }.map(&:first)
So here's a solution to my own question. The passed in block is implicitly converted into a proc which can be received with the & parameter syntax. The proc then exists inside the closure of any nested block, as it is assigned to a local variable in scope, and can be called by it:
class Array
def delete_if_index(&proc)
ary = []
self.each_with_index { |a, i| ary << self[i] unless proc.call(i) }
ary
end
end
[0,1,2,3,4,5,6,7,8,9,10].delete_if_index {|index| index.even?}
=> [1, 3, 5, 7, 9]
Here the block is converted into a proc, and assigned to the variable proc, which is then available within the block passed to each_with_index.
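For comparison, the same index-based filtering can be sketched without reopening Array, by chaining with_index onto the enumerator that reject returns when called without a block. This is offered as an alternative, not as what the author used.

```ruby
letters = %w[a b c d]

# reject without a block returns an Enumerator; with_index adds the index
# as a second block argument, and elements with a truthy block result are dropped.
kept = letters.reject.with_index { |_element, index| index.even? }

p kept
p letters # the original array is left untouched
```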
