For an array, when I type:
puts array[0]
==> text
Yet when I type:
puts array[0].to_s
==> ["text"]
Why the brackets and quotes? What am I missing?
ADDENDUM: my code looks like this
page = open(url) {|f| f.read }
page_array = page.scan(/regex/) #pulls partial urls into an array
partial_url = page_array[0].to_s
full_url = base_url + partial_url #adds each partial url to a consistent base_url
puts full_url
What I'm getting looks like:
http://www.stackoverflow/["questions"]
This prints the array's contents as they are, without brackets:
array.join(", ")
On Ruby 1.9 and later, to_s is just an alias for inspect in the Array class.
So instead of returning the elements as plain text, array.to_s returns array.inspect, which, as the method name suggests, is a debugging representation and isn't really what you are looking for.
If you want just the "guts" try:
page_array.join
If there are multiple elements to the array:
page_array.join(" ")
This will make:
page_array = ["test","this","function"]
return:
"test this function"
What to_s returns on an Array depends on the version of Ruby you are using, as mentioned above. 1.9.X returns:
"[\"test\"]"
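A quick way to see the difference on a current interpreter is sketched below. The sample array is made up for illustration, and the to_s result assumes Ruby 1.9 or later.

```ruby
page_array = ["test", "this", "function"]  # made-up sample data

inspected = page_array.to_s       # same text as page_array.inspect on Ruby 1.9+
joined    = page_array.join(" ")  # elements only, separated by a space
first     = page_array[0]         # a single element is already a String

puts inspected  # shows the brackets and quotes
puts joined     # shows only the words
puts first      # shows only the first word
```

If the goal is to build a URL from one element, indexing the element directly avoids to_s altogether.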
You need to show us the regex to really fix this properly, but this will do it:
Replace this
partial_url = page_array[0].to_s
with this
partial_url = page_array[0][0]
This doesn't necessarily explain why you are getting a doubled-up array, but you can flatten it and then take the first element, like this:
page_array = page.scan(/regex/).flatten
Flattening removes nested arrays and leaves a single level, so if you had [1,2,[3,[4,5,6]]] and called flatten on it, you would get [1,2,3,4,5,6].
It is also more robust than doing array[0][0], because if the first element were nested more than two levels deep, you would run into the same issue.
Iain is correct, though: without seeing the regex, we can't suss out the root cause.
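One common way to end up with a doubled-up array is a capture group in the pattern: String#scan returns an array of arrays when the regex has groups and a flat array of strings when it has none. The sketch below assumes that is what happened here. The real regex was never shown, so the page text and both patterns are made up.

```ruby
page = 'href="/questions" href="/tags"'  # hypothetical stand-in for the fetched page

with_group    = page.scan(/href="\/(\w+)"/)    # pattern has a group: one inner array per match
without_group = page.scan(/(?<=href="\/)\w+/)  # no groups: plain strings

p with_group          # [["questions"], ["tags"]]
p without_group       # ["questions", "tags"]
p with_group.flatten  # ["questions", "tags"]

base_url = "http://www.stackoverflow/"  # base taken from the output in the question
full_url = base_url + with_group.flatten[0]
puts full_url
```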
Noob question. I need to pass 3,000+ URLs from a CSV sheet to Selenium. I need Selenium to navigate to each one of these links, scrape information and then put that information into a CSV.
The issue I am running into is that when I push my CSV URLs into an array, I cannot pass one single object (URL) into Selenium at a time.
I know I likely need some sort of loop. I have tried setting up loops and selecting from the array using .map, .select, and a plain do loop.
urls.map do |url|
  @driver.navigate.to "#{url}"
  name = @driver.find_element(:css, '.sites-embed-footer>a').attribute('href')
  puts name
  kb_link = name
  kb_array.push(kb_link)
  puts 'urls is #{n}'
end
In the above example, Selenium returns an "invalid URL" error message. Debugging with Pry tells me that my url object is not a single URL, but rather still the entire array.
How can I set Selenium to visit each URL from the array one by one?
EDIT: ----------------
So, after extensive debugging with Pry, I found a couple of issues. The first was that my CSV was feeding a nested array to my loop, which was causing the URL error. I had to flatten the array to get around that issue.
After that, I had to build a rescue into my loop so that my script didn't die when it encountered a page without the CSS element I was looking for.
Here's the finalized loop.
@urls1.each do |url|
  begin
    @driver.navigate.to(url)
    @driver.manage.timeouts.implicit_wait = 10
    name = @driver.find_element(:css, '.sites-embed-footer>a').attribute('href')
    puts name
    kb_link = name
    kb_array.push(kb_link)
    puts 'done'
  rescue Selenium::WebDriver::Error::NoSuchElementError
    # Page has no matching element: record a placeholder and move on
    puts 'no google doc'
    x = 'no google doc'
    kb_array.push(x)
    next
  end
end
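The nested array described in the edit is what Ruby's CSV library produces: every row comes back as its own array, even when the file has a single column. Here is a minimal sketch, with made-up URLs and an inline string standing in for the real CSV file:

```ruby
require 'csv'

csv_text = "http://example.com/a\nhttp://example.com/b\n"  # hypothetical data

rows = CSV.parse(csv_text)  # one array per row
urls = rows.flatten         # one string per URL

p rows  # [["http://example.com/a"], ["http://example.com/b"]]
p urls  # ["http://example.com/a", "http://example.com/b"]

urls.each do |url|
  puts "would visit #{url}"  # url is now a single String
end
```

Taking the first column of each row (rows.map(&:first)) would work as well if the file ever gains more columns.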
What about using .each?
Example:
array = [1, 2, 3, 4, 5, 6]
array.each { |x| puts x }
In your code:
urls.each do |url|
  @driver.navigate.to "#{url}"
  name = @driver.find_element(:css, '.sites-embed-footer>a').attribute('href')
  puts name
  kb_link = name
  kb_array.push(kb_link)
  puts 'urls is #{n}'
end
First of all, it doesn't make much sense to use map if you don't use the result of the block somewhere. map, applied to an Enumerable, returns a new Array, and you don't do anything with the returned array. In your case it would contain just the return values of puts, which are nil, so you would get back an array of nils, with the side effect that something is written to stdout.
If you are only interested in the side effects, each or each_with_index should be used to traverse an Enumerable. Given the problems you have with both map and each, I wonder what the actual content of your urls object is. Did you ever inspect it? You could do a
p urls
before entering the loop. With 3000 URLs, the output will be huge, but maybe you can run it on a simpler example with fewer URLs.
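The difference between the two iterators, and what inspecting the data reveals, can be seen in a small sketch. The sample values are made up, and the array is nested the way rows read from a CSV would be.

```ruby
urls = [["http://example.com/a"], ["http://example.com/b"]]  # made-up nested data

mapped = urls.map  { |url| url.first.length }  # map collects each block result
eached = urls.each { |url| url.first.length }  # each discards results and returns the receiver

p mapped            # a new array of block results
p eached            # the original array, unchanged
p urls.first.class  # Array, not String: a sign the data needs flattening
```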
I have output from an API that looks like this (it's a string):
[[2121212,212121,asd],[2323232,23232323,qasdasd]]
It's a string, not an array. I want to convert it to an array and then extract the first two elements of each inner array, giving:
[2121212,212121],[2323232,23232323]
What's the best way to do this in Ruby? I could use a regexp to extract the values, but the string already looks like an array; only its class is String.
I tried
array.push(response)
but that just put the string into the array as one element. I guess what would be nice is a to_array method.
Unless you use eval (shudder...), you will need a regular expression anyway. This is the shortest one:
str = "[[2121212,212121,asd],[2323232,23232323,qasdasd],[2424242,24242424,qasdasd]]"
p str.scan(/(\d+),(\d+)/)
=>[["2121212", "212121"], ["2323232", "23232323"], ["2424242", "24242424"]]
Assuming this is a JSON response (and if so, it is badly malformed and you should talk to the people that are responsible for this) you could write something like:
require 'json'
input= '[[2121212,212121,Asd],[2323232,23232323,qasdasd]]'
input.gsub!(/([A-Za-z ]+)/,'"\1"')
json = JSON.parse input
output = json.map{|x| x[0...2]}
p output
this prints
[[2121212, 212121], [2323232, 23232323]]
Using eval is very bad but I have no other easy option.
test_str = "[[2121212,212121,asd],[2323232,23232323,qasdasd]]"
test_str.gsub!(/([a-z]+)/) do
"'#{$1}'"
end
=> "[[2121212,212121,'asd'],[2323232,23232323,'qasdasd']]"
test_array = eval(test_str)
=> [[2121212, 212121, "asd"], [2323232, 23232323, "qasdasd"]]
test_array.each do |element|
  element.pop # drop the trailing string; delete(element.last) would also remove any equal earlier values
end
=> [[2121212, 212121], [2323232, 23232323]]
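For comparison, here is a sketch that avoids both eval and patching the string into JSON. It assumes the response always has this exact shape (one level of nesting, no brackets or commas inside the values) and that the first two fields of every inner list are integers. The variable names are made up.

```ruby
response = "[[2121212,212121,asd],[2323232,23232323,qasdasd]]"

# Grab the text between each innermost pair of brackets,
# then keep the first two comma-separated fields as integers.
pairs = response.scan(/\[([^\[\]]+)\]/).map do |(inner)|
  inner.split(",").first(2).map { |field| Integer(field) }
end

p pairs
```

Integer() raises on anything that is not a number, so a malformed response fails loudly instead of silently producing zeros as to_i would.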
I have an array in Ruby, and I am setting the index to the id of an object, as shown below.
My first question is:
This code works:
@array = Array.new(@objects.size)
for i in 0...@objects.size
  @array[i] = @objects[i].value
end
but when I do:
@array[@objects[i].id] = @objects[i].value
it says:
undefined method [] for nil:NilClass
I tried putting 100 or 1000 instead of i to make sure it's not about "index out of range", but those worked, I tried converting id to int by using to_i even though it should already be an int, but it still doesn't work. I don't get it.
My second question is:
If I get the ids to work, does saying Array.new(@objects.size) become useless?
I am not using indexes 0 to size but IDs, so what is happening? Is it initializing indexes 0...size to nil, or is it just creating space for up to x objects?
EDIT:
So I've been told it is better to use a Hash for this, and I agree, but I still seem to get the same error in the same situation (I just changed Array.new(@objects.size) to Hash.new).
That's not how Arrays work in Ruby. You can, however, use a hash to do this, and look the values up using the method you want:
@lookup_hash = Hash.new
for i in 0...@objects.size
  @lookup_hash[@objects[i].id] = @objects[i].value
end
Now you can do:
@lookup_hash[@some_object.id]
And it will return that object's value as you have stored it.
Additional Info
You could also rewrite your loop like this, since you don't need the index anymore:
@lookup_hash = Hash.new
@objects.each do |obj|
  @lookup_hash[obj.id] = obj.value
end
A little bit more readable in my opinion.
You're trying to use an array like a hash. Try this:
Hash[@objects.map{|o| [o.id, o.value] }]
Take a look at the Array and Hash documentation.
@array = @objects.map { |obj| obj.value }
You can specify the size when creating an array, but you don't need to. Anyway, try to use the functional capabilities of Ruby (map, select, inject) instead of C-like imperative loops.
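On the second question, a small sketch shows what Array.new(n) does and what happens when an index beyond the current size is assigned. The values are made up.

```ruby
sized = Array.new(3)  # three slots, all nil
sized[7] = "value"    # assigning past the end grows the array, padding with nil

grown = []            # no size given
grown[2] = "x"        # same padding behaviour

p sized       # [nil, nil, nil, nil, nil, nil, nil, "value"]
p sized.size  # 8
p grown       # [nil, nil, "x"]
```

With large or sparse ids this padding wastes space, which is one reason a Hash keyed by id is the better fit here.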
You could use map to do this in a Rubyish way:
@array = @objects.map { |o| o.value }
I'd like to understand how the following code works:
def url
  @url ||= {
    "basename" => self.basename,
    "output_ext" => self.output_ext,
  }.inject("/:basename/") { |result, token|
    result.gsub(/:#{token.first}/, token.last)
  }.gsub(/\/\//, "/")
end
I know what it does; somehow it returns the URL corresponding to a file located in a directory on a server. So it returns strings similar to this: /path/to/my/file.html
I understand that if @url already has a value, it will be returned and the right-hand side of ||= will be discarded. I also understand that this begins by creating a hash of two elements.
I also think I understand the last gsub; it replaces backslashes by slashes (to cope with windows servers, I guess).
What amazes me is the inject part. I'm not able to understand it. I have used inject before, but this one is too much for me. I don't see how this could be done with an each, since I don't understand what it does.
I modified the original function slightly for this question; the original comes from this Jekyll file.
Cheers!
foo.inject(bar) {|result, x| f(result,x) }
Can always be written as:
result = bar
foo.each {|x| result = f(result, x)}
result
So for your case, the version with each would look like this:
result = "/:basename/"
{
"basename" => self.basename,
"output_ext" => self.output_ext,
}.each {|token|
result = result.gsub(/:#{token.first}/, token.last)
}
result
Meaning: for every key-value pair in the hash, each occurrence of ":" followed by the key in "/:basename/" is replaced with the value.
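To watch the substitution happen, here is a standalone sketch with literal values in place of self.basename and self.output_ext. The values and the template are made up for illustration, and the template includes :output_ext so that both tokens get used.

```ruby
tokens = {
  "basename"   => "/docs/index",  # made-up stand-in for self.basename
  "output_ext" => ".html",        # made-up stand-in for self.output_ext
}
template = "/:basename:output_ext"  # hypothetical template

# Each token is a two-element array: [key, value]
substituted = tokens.inject(template) { |result, token|
  result.gsub(/:#{token.first}/, token.last)
}
url = substituted.gsub(/\/\//, "/")  # collapse any doubled forward slash

p substituted  # "//docs/index.html"
p url          # "/docs/index.html"
```

Note that the final gsub matches two consecutive forward slashes, not backslashes; the backslashes in the regex only escape the slashes.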
Perhaps splitting the code up and tweaking it a little helps:
options = { "basename" => self.basename, "output_ext" => self.output_ext }
options.inject("/:basename") do |result, key_and_value|
  # Iterating over the hash yields an array of two elements, which I called key_and_value
  result.gsub(":#{key_and_value[0]}", key_and_value[1])
end.gsub(/\/\//, '/')
Basically, the inject code iterates over all your options and substitutes the actual value wherever it sees a ":key".
Is it possible to define a block in an inline statement with Ruby? Something like this:
tasks.collect(&:title).to_block{|arr| "#{arr.slice(0, arr.length - 1).join(", ")} and #{arr.last}" }
Instead of this:
titles = tasks.collect(&:title)
"#{titles.slice(0, titles.length - 1).join(", ")} and #{titles.last}"
If you said tasks.collect(&:title).slice(0, this.length-1) how can you make 'this' refer to the full array that was passed to slice()?
Basically I'm just looking for a way to pass the object returned from one statement into another one, not necessarily iterating over it.
You're kind of confusing passing a return value to a method/function and calling a method on the returned value. The way to do what you described is this:
lambda {|arr| "#{arr.slice(0, arr.length - 1).join(", ")} and #{arr.last}"}.call(tasks.collect(&:title))
If you want to do it the way you were attempting, the closest match is instance_eval, which lets you run a block within the context of an object. So that would be:
tasks.collect(&:title).instance_eval {"#{slice(0, length - 1).join(", ")} and #{last}"}
However, I would not do either of those, as it's longer and less readable than the alternative.
I'm not sure exactly what you're trying to do, but:
If you said tasks.collect(&:title).slice(0, this.length-1) how can you make 'this' refer to the full array that was passed to slice()?
Use a negative number:
tasks.collect(&:title)[0..-2]
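A short sketch of how the negative index behaves, using a made-up array in place of tasks.collect(&:title):

```ruby
titles = %w[wash dry fold]  # hypothetical stand-in for tasks.collect(&:title)

all_but_last = titles[0..-2]                          # -2 is the second-to-last element
same_thing   = titles.slice(0, titles.length - 1)     # the longer spelling from the question

p all_but_last  # ["wash", "dry"]
p same_thing    # ["wash", "dry"]
p titles[-1]    # "fold"
```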
Also, in:
"#{titles.slice(0, titles.length - 1).join(", ")} and #{titles.last}"
you've got something weird going on with your quotes, I think.
I don't really understand why you would want to, but you could add a method to Ruby's Object class that takes a block and passes the receiver to it as a parameter:
class Object
  def to_block
    yield self
  end
end
At this point you would be able to call:
tasks.collect(&:title).to_block{|it| it.slice(0, it.length-1)}
Of course, modifying the Object class should not be done lightly, as it can have serious consequences when combined with other libraries.
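Ruby has since added this behaviour to the core library: yield_self arrived in Ruby 2.5 and its alias then in 2.6, so on those versions no monkeypatch is needed. Here is a minimal sketch, with a made-up titles array standing in for tasks.collect(&:title):

```ruby
titles = %w[wash dry fold]  # hypothetical stand-in for tasks.collect(&:title)

# then yields the receiver to the block and returns the block's result
sentence = titles.then { |arr| "#{arr[0..-2].join(', ')} and #{arr.last}" }

puts sentence  # prints: wash, dry and fold
```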
Although there are many good answers here, perhaps what you're really after is something like this:
class Array
  def andjoin(separator = ', ', word = ' and ')
    case (length)
    when 0
      ''
    when 1
      last.to_s
    when 2
      join(word)
    else
      slice(0, length - 1).join(separator) + word + last.to_s
    end
  end
end
puts %w[ think feel enjoy ].andjoin # => "think, feel and enjoy"
puts %w[ mitchell webb ].andjoin # => "mitchell and webb"
puts %w[ yes ].andjoin # => "yes"
puts %w[ happy fun monkeypatch ].andjoin(', ', ', and ') # => "happy, fun, and monkeypatch"