I've currently got a system that involves quite a lot of new class instances, so I've had to assign them using an array, as was suggested here: Create and initialize instances of a class with sequential names
However, I'll have to be constantly adding new instances whenever a new one appears, without overwriting existing ones. Might some validation and a modified version of my existing code be the best option?
This is my code. Currently, every time it runs, the existing data is overwritten. I want the status to be overwritten if it has changed, but I also want to be able to store one or two variables in there permanently.
E2A: Ignore the global variables, they're just there for testing.
$allids = []
$position = 0 ## Set position for each iteration
$ids.each do |x| ## For each ID, do
  $allids = ($ids.length).times.collect { MyClass.new(x) } ## For each ID, make a new class instance, as part of an array
  $browser.goto("http://www.foo.com/#{x}") ## Visit next details page
  thestatus = Nokogiri::HTML.parse($browser.html).at_xpath("html/body/div[2]/div[3]/div[2]/div[3]/b/text()").to_s ## Grab the ID's status
  theamount = Nokogiri::HTML.parse($browser.html).at_xpath("html/body/div[2]/div[3]/div[2]/p[1]/b[2]/text()").to_s ## Grab a number attached to the ID
  $allids[$position].getdetails(thestatus, theamount) ## Passes the status to getdetails
  $position += 1 ## increment position for next iteration
end
E2A2: Gonna paste this from my comment:
Hmm, I was just thinking: I started off by dumping the previous values into another variable, then having another variable grab the new values and iterating over them to see if any match the previous values. That's quite a messy way to do it, though. Would self.create with a ||= work?
If I understand you correctly, you need to store status and amount for each ID, right? If so, then something like this would help you:
# Store a nested hash with the class instance, status and amount for each id in $processed_ids
$processed_ids = {}
$ids.each do |id|
  $processed_ids[id] ||= {}
  $processed_ids[id][:instance] ||= MyClass.new(id)
  $processed_ids[id][:status] = get_status # Nokogiri method
  $processed_ids[id][:amount] = get_amount # Nokogiri method
end
What this code does: it creates an instance of your class only once for each id, but always updates that id's status and amount.
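The create-once, update-always behaviour can be tried out without a browser. This is only a sketch: the MyClass stand-in, the update_registry helper and the shape of the details hash are assumptions made for illustration, with hard-coded values in place of the Nokogiri lookups.

```ruby
# Hypothetical stand-in for the asker's MyClass; it only remembers its id.
class MyClass
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

# Updates the registry in place. `details` stands in for the values that
# would be scraped with Nokogiri (assumed shape: id => [status, amount]).
def update_registry(registry, details)
  details.each do |id, (status, amount)|
    entry = (registry[id] ||= {})         # create the slot only if it is missing
    entry[:instance] ||= MyClass.new(id)  # built once, kept on later runs
    entry[:status] = status               # refreshed on every run
    entry[:amount] = amount               # refreshed on every run
  end
  registry
end

registry = {}
update_registry(registry, { 1 => ['open', 10] })
first_instance = registry[1][:instance]
update_registry(registry, { 1 => ['closed', 10], 2 => ['open', 5] })

puts registry[1][:instance].equal?(first_instance) # is it still the same object?
puts registry[1][:status]
puts registry.keys.inspect
```

Because the registry is passed in rather than rebuilt, running the update again only adds ids that are new and refreshes the values that can change.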
I'm building a site with users in all 50 states. We need to display information for each user that is specific to their situation, e.g., the number of events they completed in that state. Each state's view (a partial) displays state-specific information and, therefore, relies upon state-specific calculations in a state-specific model. We'd like to do something similar to this:
@#{user.state} = #{user.state.capitalize}.new(current_user)
in the users_controller instead of
@illinois = Illinois.new(current_user) if (@user.state == 'illinois')
.... [and the remaining 49 states]
@wisconsin = Wisconsin.new(current_user) if (@user.state == 'wisconsin')
to trigger the Illinois.rb model and, in turn, drive the view defined in the users_controller by
def user_state_view
  @user = current_user
  @events = Event.all
  @illinois = Illinois.new(current_user) if (@user.state == 'illinois')
end
I'm struggling to find a better way to do this / refactor it. Thanks!
I would avoid dynamically defining instance variables if you can help it. It can be done with instance_variable_set, but it's unnecessary. There's no reason you need to define the variable as @illinois instead of just @user_state or something like that. Here is one way to do it.
First make a static list of states:
def states
  %w{wisconsin arkansas new_york etc}
end
then make a dictionary which maps those states to their classes:
def state_classes
  states.reduce({}) do |memo, state|
    memo[state] = state.camelize.constantize
    memo # reduce needs the block to return the accumulator
  end
end
# = { 'illinois' => Illinois, 'wisconsin' => Wisconsin, 'new_york' => NewYork, etc }
It's important that you hard-code a list of state identifiers somewhere, because it's not good practice to pass arbitrary values to constantize.
Then instantiating the correct class is a breeze:
@user_state = state_classes[@user.state].new(current_user)
There are definitely other ways to do this (for example, it could be added on the model layer instead).
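For anyone who wants to try the whitelist idea outside Rails, here is a rough plain-Ruby sketch. The Struct-based state classes, the STATES list, and the class_for and build_state helpers are all made up for illustration; Object.const_get plus a hand-rolled camelize stands in for ActiveSupport's camelize.constantize.

```ruby
# Hypothetical state classes standing in for the Rails models.
Illinois = Struct.new(:user)
NewYork = Struct.new(:user)

# Hard-coded whitelist: only these identifiers can ever be turned into classes.
STATES = %w[illinois new_york].freeze

# Plain-Ruby equivalent of camelize + constantize for snake_case names.
def class_for(state)
  class_name = state.split('_').map(&:capitalize).join
  Object.const_get(class_name)
end

# Built once from the whitelist, e.g. 'new_york' => NewYork.
STATE_CLASSES = STATES.each_with_object({}) do |state, memo|
  memo[state] = class_for(state)
end.freeze

# Returns a state object, or raises KeyError for anything outside the whitelist.
def build_state(state, user)
  STATE_CLASSES.fetch(state).new(user)
end

p build_state('new_york', 'alice').class
```

Using fetch instead of [] means an unexpected state name fails loudly with a KeyError rather than calling new on nil.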
In OpsWorks Stacks, I have set a layer attribute using the custom JSON field:
"layer_apps" : [
The app_ portion of the attribute is necessary for the workflow. At times, I need to temporarily remove the app_ portion within a cookbook. To do this, I use slice!:
node['layer_apps'].each do |app_name|
  install_certs_app_name = app_name
  install_certs_app_name.slice!('app_') # 'app_manager' => 'manager'
  # snip
end
However, once this is done, even though app_name isn't being directly modified, each node['layer_apps'] attribute gets sliced, which carries on to subsequent cookbooks and causes failures. The behaviour I expected was that slice! would modify app_name, and not the current node['layer_apps'] attribute. Thinking that app_name was a link to the attribute rather than being its own variable, I tried assigning its value to a separate variable (install_certs_app_name and similar in other cookbooks), but the behaviour persisted.
Is this expected behaviour in Ruby/Chef? Is there a better way to exclude the app_ prefix from the attribute?
app_name is being directly modified. That's the reason for the bang (!) after the method name: it makes you aware that the method mutates the object.
app_name and install_certs_app_name are referencing the same object.
Note that slice and slice! both return "app_", but the bang method also mutates the receiver by removing the sliced text.
If you did
result = install_certs_app_name.slice!('app_')
puts result
# => app_
puts install_certs_app_name
# => manager
Try (instead)
install_certs_app_name = app_name.dup
That way you have two separate objects. Alternatively, build the new string without mutating anything:
install_certs_app_name = app_name.sub('app_', '')
If you want a sliced copy of a variable, what you'll want is the non-destructive version:
str.slice and not str.slice!
Methods ending in ! are often referred to as bang methods, and they modify the object in place.
The same principle applies to other pairs such as .downcase and .downcase!.
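A minimal illustration of the difference, using .downcase and .downcase! and then the same idea with .slice. The helper names lowered_copy and lower_in_place are invented for this sketch.

```ruby
# Non-destructive: returns a lowercased copy and leaves `text` untouched.
def lowered_copy(text)
  text.downcase
end

# Destructive: lowercases `text` itself and returns that same object.
def lower_in_place(text)
  text.downcase!
  text
end

name = 'APP_Manager'.dup   # dup gives a mutable string even with frozen literals
puts lowered_copy(name)    # the lowercased copy
puts name                  # the original is unchanged
lower_in_place(name)
puts name                  # now the original itself has changed

attribute = 'app_manager'.dup
puts attribute.slice('app_')  # non-bang slice returns the matched piece
puts attribute                # and does not shorten the receiver
```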
However, since .slice returns the part that's been cut out, you could just remove the app_ part with .sub, like:
"app_manager".sub("app_",'') #=> "manager"
When you assign app_name to install_certs_app_name, you are still referencing the same object. To create a new object you can do:
install_certs_app_name = app_name.dup
A new object with the same value is created, so slicing install_certs_app_name does not affect app_name.
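The aliasing can be checked with equal?, which compares object identity. In this sketch a plain array stands in for node['layer_apps'], and stripped_names is a hypothetical helper, not anything provided by Chef.

```ruby
# Strips the 'app_' prefix from every name without touching the input strings.
# `layer_apps` is a plain array standing in for node['layer_apps'].
def stripped_names(layer_apps)
  layer_apps.map do |app_name|
    copy = app_name.dup    # new object with the same value
    copy.slice!('app_')    # mutates only the copy
    copy
  end
end

layer_apps = ['app_manager', 'app_worker']
alias_name = layer_apps.first
puts alias_name.equal?(layer_apps.first)     # plain assignment: same object?
puts alias_name.dup.equal?(layer_apps.first) # after dup: same object?
p stripped_names(layer_apps)
p layer_apps                                 # the source array afterwards
```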
I'm working on a web-scraping solution that grabs totally different webpages and lets the user define rules/scripts in order to extract information from the page.
I started scraping from a single domain and built a parser based on Nokogiri.
Basically everything works fine.
I could now add a ruby class each time somebody wants to add a webpage with a different layout/style.
Instead I thought about using an approach where the user specifies elements where content is stored using xpath and storing this as a sort of recipe for this webpage.
Example: The user wants to scrape a table-structure extracting the rows using a hash (column-name => cell-content)
I was thinking about writing a ruby function for extraction of this generic table information once:
# extracts a table's rows as an array of hashes (column_name => cell content)
# html - the HTML file as a string
# xpath_table - XPath of the HTML table that holds the data to be extracted
def basic_table(html, xpath_table)
  html_doc = Nokogiri::HTML(html)
  xpath_headers = "#{xpath_table}/thead/tr/th"
  # assumption: each header is reduced to its text content
  row_headers = html_doc.xpath(xpath_headers).map { |column| column.text.strip }
  row_contents = []
  # double quotes are needed here so that xpath_table is interpolated
  table_rows = html_doc.xpath("#{xpath_table}/tbody/tr")
  table_rows.each do |table_row|
    # assumption: each cell is reduced to its text content
    cells = table_row.xpath('td').map { |cell| cell.text.strip }
    row_content_hash = {}
    cells.each_with_index do |cell_string, column_index|
      row_content_hash[row_headers[column_index]] = cell_string
    end
    row_contents << row_content_hash
  end
  row_contents
end
The user could now specify a website-recipe-file like this:
<basic_table xpath='//div[@id="grid"]/table[@id="displayGrid"]' />
The function basic_table is referenced here, so that by parsing the website-recipe-file I would know that I can use the function basic_table to extract the content from the table referenced by the xPath.
This way the user can specify simple recipe-scripts and only has to dive into writing actual code if he needs a new way of extracting information.
The code would not change every time a new webpage needs to be parsed.
Whenever the structure of a webpage changes only the recipe-script would need to be changed.
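One way to picture the recipe idea is a small registry that maps the names a recipe may use to extractor functions. The sketch below is only that: EXTRACTORS, run_recipe and the recipe shape (an array of name/options pairs) are assumptions, and the basic_table entry is a placeholder lambda rather than the Nokogiri-based function.

```ruby
# Whitelist of extractors a recipe may reference. The lambda is a placeholder:
# a real basic_table extractor would parse `html` with Nokogiri instead.
EXTRACTORS = {
  'basic_table' => lambda do |html, options|
    { extractor: 'basic_table', xpath: options.fetch('xpath'), bytes: html.bytesize }
  end
}.freeze

# Runs every step of a recipe against the page and collects the results.
# A recipe is assumed to be an array of [extractor_name, options] pairs,
# e.g. as produced by parsing the recipe file.
def run_recipe(recipe, html)
  recipe.map do |name, options|
    extractor = EXTRACTORS.fetch(name) { raise ArgumentError, "unknown extractor: #{name}" }
    extractor.call(html, options)
  end
end

recipe = [['basic_table', { 'xpath' => '//div[@id="grid"]/table[@id="displayGrid"]' }]]
p run_recipe(recipe, '<html></html>')
```

Looking names up in a fixed hash, rather than calling send or eval on whatever the recipe contains, keeps user-written recipes from invoking arbitrary methods.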
I was thinking that someone might be able to tell me how he would approach this. Rules/rule engines pop into my mind, but I'm not sure if that really is the solution to my problem.
Somehow I have the feeling that I don't want to "invent" my own solution to handle this problem.
Does anybody have a suggestion?
I saw several variations of this question but did not really find a solid answer.
So I have an array of URLs. I want to loop through that array and, for each individual URL, create an instance of the class WebPage.
So if the array has 5 URLs in it, then I would create 5 WebPage objects. I tried to use eval() to do this, but quickly learned that the instances made by eval have a very local scope and I cannot use those WebPage objects afterwards.
string_to_eval = @urls.map{|x| "webpage#{@urls.index(x)} = WebPage.new('#{x}')"}.join(';')
puts string_to_eval
String_to_eval prints out:
webpage0 = WebPage.new('http://www.google.com');
webpage1 = WebPage.new('http://www.yahoo.com');
webpage2 = WebPage.new('http://www.amazon.com');
webpage3 = WebPage.new('http://www.ebay.com')
How else can I make an object with each iteration of the loop in Ruby? Is there a way around this?
Why not just this?
webpages = @urls.map { |url| WebPage.new(url) }
It is generally a bad idea to have webpage0, webpage1... when you can have webpages[0], webpages[1]... (Also, the array way does not require the Evil of eval.)
In this situation I would forgo unique variable names and instead simply leave the resulting objects in an array. In that case the code would look like this:
>> @urls.map{|url| WebPage.new(url)}
=> [WebPage('http://www.google.com'), WebPage('http://www.yahoo.com'), WebPage('http://www.amazon.com'), WebPage('http://www.ebay.com') ]
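If looking pages up by something more meaningful than a position is useful, a hash keyed by URL is one possible variation on the same idea. The Struct-based WebPage and the build_pages helper below are stand-ins invented for this sketch.

```ruby
# Hypothetical stand-in for the asker's WebPage class.
WebPage = Struct.new(:url)

# Builds one WebPage per URL, keyed by URL so pages can be looked up by name
# as well as by position.
def build_pages(urls)
  urls.each_with_object({}) { |url, pages| pages[url] = WebPage.new(url) }
end

urls = ['http://www.google.com', 'http://www.yahoo.com']
webpages = urls.map { |url| WebPage.new(url) }  # positional access: webpages[0]
by_url = build_pages(urls)                      # named access: by_url[url]

puts webpages[1].url
puts by_url['http://www.google.com'].url
```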
How do I create an object if one is not found? This is the query I was running:
@event_object = @event_entry.event_objects.find_all_by_plantype('dog')
and I was trying this:
@event_object = EventObject.new unless @event_entry.event_objects.find_all_by_plantype('dog')
but that does not seem to work. I know I'm missing something very simple like normal :( Thanks for any help!!! :)
find_all style methods return an array of matching records. That is an empty array if no matching records are found, and an empty array is truthy. Which means:
arr = []
if arr
  puts 'arr is considered truthy!' # this line will execute
end
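Since the array itself is always truthy, the check has to ask about its contents. A small sketch, with any_matches? as a made-up helper name:

```ruby
# True only when the collection actually contains something.
def any_matches?(records)
  !records.empty?
end

matches = []
puts 'an empty array is truthy' if matches
puts 'but it has no matches' unless any_matches?(matches)
puts matches.first.inspect   # first on an empty array gives nil, which is falsy
```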
Also, most of the dynamic finder methods (like find_all_by_whatever) were officially deprecated in Rails 4, so you shouldn't be using them.
You probably want something more like:
@event_object = @event_entry.event_objects.where(plantype: 'dog').first || EventObject.new
But you can also configure the event object better, since you obviously want it to belong to #event_entry.
@event_object = @event_entry.event_objects.where(plantype: 'dog').first
@event_object ||= @event_entry.event_objects.build(plantype: 'dog')
In this last example, we try to find an existing object by getting an array of matching records and asking for the first item. If there are no items, @event_object will be nil.
Then we use the ||= operator, which says "assign the value on the right if this is currently set to a falsy value". And nil is falsy. So if it's nil, we can build the object from the association it should belong to. And we can preset its attributes while we are at it.
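The find-then-||= pattern can be tried in plain Ruby as well. The sketch below is an in-memory imitation only: EventObject is a Struct, the collection is an ordinary array, and find_or_build is a hypothetical helper, so none of this is ActiveRecord behaviour.

```ruby
# Hypothetical in-memory stand-in for EventObject (not ActiveRecord).
EventObject = Struct.new(:plantype)

# Returns the first object with the given plantype, or appends and returns a new one.
def find_or_build(event_objects, plantype)
  found = event_objects.find { |obj| obj.plantype == plantype }  # nil when absent
  found ||= EventObject.new(plantype).tap { |obj| event_objects << obj }
  found
end

event_objects = [EventObject.new('cat')]
dog = find_or_build(event_objects, 'dog')
puts dog.plantype
puts find_or_build(event_objects, 'dog').equal?(dog)  # second call: same object?
puts event_objects.size
```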
Why not use built-in query methods like find_or_create_by or find_or_initialize_by?
@event_object = @event_entry.event_objects.find_or_create_by(plantype:'dog')
This will find an @event_entry.event_object with plantype = 'dog'. If one does not exist, it will create one instead.
find_or_initialize_by is probably more what you want, as it will leave @event_object in an unsaved state with just the association and plantype set:
@event_object = @event_entry.event_objects.find_or_initialize_by(plantype:'dog')
This assumes you are looking for a single event_object, as it will return the first one it finds with plantype = 'dog'. If more than one event_object can have plantype = 'dog' within the @event_entry scope, then this might not be the best solution, but it seems to fit with your description.