How to view error messages generated by dajaxice - ajax

I'm learning to use dajaxice, and I keep breaking things, but since only the ajax parts aren't working properly, the entire site doesn't break and generate error messages, so it's taking me forever to find my mistakes. Is there a way to track down where exactly the problem lies?

I use Firebug in Firefox. In the Net tab, the Django error messages appear under the response from the server.
If Django or dajaxice hits a bug in the code, an HTTP 500 (Internal Server Error) response is generated, which Firebug highlights in red. The responses tend to be fairly large, and the dajaxice responses are in HTML, but with some scrolling the error message can always be found. I think Google Chrome offers a similar feature in its developer tools.
The other handy trick is to use print statements in the dajaxice methods, for example:
@dajaxice_register
def updateText(request, objId, text):
    # Trace the incoming arguments in the runserver console.
    print "updateText:: objId: %s text %s" % (objId, text)
    t = TextItem.objects.get(id=objId)
    t.text = text
    print "updateText:: t.text: %s" % t.text
    t.save()
    json_return = simplejson.dumps({'text': text, 'objId': objId})
    print "updateText:: json_return: %s" % json_return
    return json_return
When running the app using ./manage.py runserver, the print statements show up immediately in the console, interspersed with the network requests to Django. If one of the print statements does not appear, then code execution did not get that far, or did not follow that branch of a conditional (if statement).
The print statements should be removed from your code once it is working correctly, as they will affect the speed of the app in a production environment. A longer-term solution is to use Python's logging module, which Django configures through its LOGGING setting.

Related

How to fix strings being frozen randomly

I'm running into an issue where my script is failing in random places with:
Error: can't modify frozen String: "Please use text available here - Test jira"
These lines caused the error:
description = "Please use text available here - #{@jira[:url]}"
unless previous_jira.nil?
  description << <<~PREVIOUSJIRACREATED
    Please close previous jira's:
    #{previous_jira}
  PREVIOUSJIRACREATED
end
I think it's a pretty simple line, and I am not freezing it on purpose or anything like that, but I can't understand why I am getting the error.
I have string interpolation all over my code, and the script started failing randomly at different places. The above code is just one example. I was able to identify a couple of high offenders and placed begin/rescue blocks around them, but I am afraid my code is getting ugly with the rescue blocks all over.
I tried searching, but I only got articles explaining what freeze is and how to dup the object.
I am on Ruby 2.7.0p0.

How to avoid getting blocked by websites when using Ruby Mechanize for web crawling

I can successfully scrape building data from a website (www.propertyshark.com) using a single address, but it looks like I get blocked once I use a loop to scrape multiple addresses. Is there a way around this? FYI, the information I'm trying to access is not prohibited according to their robots.txt.
The code for a single run is as follows:
require 'mechanize'
class PropShark
  def initialize(key,link_key)
    @@key = key
    @@link_key = link_key
  end
  def crawl_propshark_single
    agent = Mechanize.new{ |agent|
      agent.user_agent_alias = 'Mac Safari'
    }
    agent.ignore_bad_chunking = true
    agent.verify_mode = OpenSSL::SSL::VERIFY_NONE
    page = agent.get('https://www.google.com/')
    form = page.forms.first
    form['q'] = "#{@@key}"
    page = agent.submit(form)
    page = form.submit
    page.links.each do |link|
      if link.text.include?("#{@@link_key}")
        if link.text.include?("PropertyShark")
          property_page = link.click
        else
          next
        end
        if property_page
          data_value = property_page.css("div.cols").css("td.r_align")[4].text # <--- error points to these commands
          data_name = property_page.css("div.cols").css("th")[4].text
          @result_hash["#{data_name}"] = data_value
        else
          next
        end
      end
    end
    return @result_hash
  end
end #endof: class PropShark
# run
key = '41 coral St, Worcester, MA 01604 propertyshark'
key_link = '41 Coral Street'
spider = PropShark.new(key,key_link)
puts spider.crawl_propshark_single
I get the following errors but in an hour or two the error disappears:
undefined method `text' for nil:NilClass (NoMethodError)
When I loop over several addresses using the code above, I delay the process with sleep 80 between addresses.
The first thing you should do, before you do anything else, is to contact the website owner(s). Right now, your actions could be interpreted as anywhere between overly aggressive and illegal. As others have pointed out, the owners may not want you scraping the site. Alternatively, they may have an API or product feed available for this particular thing. Either way, if you are going to be depending on this website for your product, you may want to consider playing nice with them.
With that being said, you are moving through their website with all the grace of a bull in a china shop. Between the abnormal user agent, unusual usage patterns from a single IP, and a predictable delay between requests, you've completely blown your cover. Consider taking a more organic path through the site, with a more natural, human-like delay. Also, you should either disguise your user agent or make it super obvious (Josh's Big Bad Scraper). You may even consider using something like Selenium, which drives a real browser, instead of Mechanize, to give away fewer hints.
You may also consider adding more robust error handling. Perhaps the site is under excessive load (or something), and the page you are parsing is not the desired page, but some random error page. A simple retry may be all you need to get that data in question. When scraping, a poorly-functioning or inefficient site can be as much of an impediment as deliberate scraping protections.
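The retry idea above might look something like the following minimal sketch. The helper name with_retries, its parameters, and the sleeper hook are all hypothetical (nothing here comes from Mechanize); the block you pass in would wrap the page fetch and parse. The delay grows with each attempt and includes a random component, so requests are not evenly spaced.

```ruby
# Hypothetical helper, not part of Mechanize: runs the block and retries it
# when it raises, waiting a randomized, growing interval between attempts.
def with_retries(max_attempts: 3, base_delay: 2.0, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    # Out of attempts: re-raise the last error so the caller sees it.
    raise if attempts >= max_attempts
    # Jittered backoff before trying again.
    sleeper.call(base_delay * attempts + rand * base_delay)
    retry
  end
end

# Stand-in for a flaky fetch: fails twice, then succeeds.
flaky_calls = 0
result = with_retries(base_delay: 0.01) do
  flaky_calls += 1
  raise "temporary failure" if flaky_calls < 3
  "page body"
end
puts result
```

Because NoMethodError is a StandardError, a parse that hits nil.text on an unexpected error page would also be retried under this sketch. Whether that is desirable depends on how often the site really returns a different page on the next request.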
If none of that works, you could consider setting up elaborate arrays of proxies, but at that point you would be much better off using one of the many online web scraping, API creation, or data extraction services that currently exist. They are fairly inexpensive and already do everything discussed above, plus more.
It is very likely that nothing is "blocking" you. As you pointed out,
property_page.css("div.cols").css("td.r_align")[4].text
is the problem, so let's focus on that line of code for a second.
Say the first time round your columns are columns = [1,2,3,4,5]. Then columns[4] will return 5 (the element at index 4).
Now, for fun, let's assume that on the next go-around your columns are columns = ['a','b','c','d']. Then columns[4] will return nil, because there is nothing at index 4.
This appears to be your case: sometimes there are 5 columns and sometimes there are not, which leads to nil.text and the error you are receiving.
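One way to guard the index-4 lookup described above is sketched below. Cell and cell_text are hypothetical names used only for illustration; Cell stands in for a parsed table cell that responds to text. The sketch assumes Ruby 2.3 or newer, where the safe navigation operator (&.) is available.

```ruby
# Stand-in for a parsed table cell; a real Nokogiri node also responds to #text.
Cell = Struct.new(:text)

# Returns the text of the cell at `index`, or nil when the row is too short.
def cell_text(cells, index)
  cells[index]&.text
end

five_columns = %w[1 2 3 4 5].map { |value| Cell.new(value) }
four_columns = %w[a b c d].map { |value| Cell.new(value) }

p cell_text(five_columns, 4)  # "5"
p cell_text(four_columns, 4)  # nil, instead of raising NoMethodError
```

The caller can then skip the address, or log it, when the result is nil rather than letting the whole loop crash.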

What is the best way to get keyboard events (input without press 'enter') in a Ruby console application?

I've been looking for this answer on the internet for a while and have found other people asking the same thing, even here. So this post is a presentation of my case and a response to the "solutions" that I have found.
I am quite new to Ruby, but for learning purposes I decided to create a gem, here.
I am trying to add keyboard navigation to this program, which will let the user use shortcuts to select what kind of request they want to see and, in the future, arrow navigation, etc.
My problem: I can't find a consistent way to get keyboard events from the user's console with Ruby.
Solutions that I have tried:
Highline gem: It does not seem to support this feature anymore. Anyway, it uses STDIN, so keep reading.
STDIN.getch: I need to run it in a parallel loop, because while the user can use a shortcut, more data can be created and the program needs to show it. I also display formatted text in the console (the Rails log). When this loop is running, my text loses all its formatting.
Curses: Cool, but do I need to set a position (x,y) to display my text every time? It will get confusing.
Here is where I am trying to do it.
You may note that I am using "stty -raw echo" (turns raw off) before showing my text and "stty raw -echo" (turns raw on) after. That keeps my text formatted.
But my key listener loop is not working. I mean, it works sometimes but is not consistent. If I press a key twice it doesn't work anymore, and sometimes it stops on its own too.
Let me put one part of the code here:
def run
  # Two loops run in parallel using Threads.
  # stream_log loops like a normal stream over the file, but it also parses the text,
  # breaks it into requests and stores them in @requests_queue.
  # stream_parsed_log streams from @requests_queue and shows it on the screen.
  @requests_queue = Queue.new
  @all_requests = Array.new
  # It's not working yet.
  Thread.new { listen_keyboard }
  Thread.new { stream_log }
  stream_parsed_log
end
def listen_keyboard
  # not finished
  loop do
    char = STDIN.getch
    case char
    when 'q'
      puts "Exiting."
      exit
    when 'a'
      @types_to_show = ['GET', 'POST', 'PUT', 'DELETE', 'ASSET']
      requests_to_show = filter_to_show(@all_requests)
      command = true
    when 'p'
      @types_to_show = ['POST']
      requests_to_show = filter_to_show(@all_requests)
      command = true
    end
    clear_screen if command
    @requests_queue += requests_to_show if command
    command = false
  end
end
I need a light in my path, what should I do?
That one was my mistake.
It was just a logic error in another part of the code that was running in another thread, so Ruby doesn't show the error by default. I used ruby -d and realized what was wrong. This mistake was messing up my keyboard input.
So now it's fixed and I am using STDIN.getch with no problem.
I just turn raw mode off before showing any string, and everything is OK.
You can check here, or in the gem itself.
That's it.
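As an illustration of how an error in a background thread can be surfaced without ruby -d, here is a minimal sketch. The helper name error_from_thread and the error message are made up for the example; Thread#join and Thread#report_on_exception= are standard Ruby (the latter assumes Ruby 2.4 or newer).

```ruby
# Runs the block in a background thread and returns the exception it raised,
# or nil if it finished cleanly.
def error_from_thread(&block)
  worker = Thread.new do
    # Silence the automatic stderr report (Ruby 2.4+) for this demo thread.
    Thread.current.report_on_exception = false
    block.call
  end
  worker.join # join re-raises the thread's exception in the calling thread
  nil
rescue StandardError => e
  e
end

error = error_from_thread { raise ArgumentError, "bug in the log parser" }
puts "background thread failed: #{error.class}: #{error.message}"
```

For long-running threads that are never joined, setting Thread.abort_on_exception = true at startup is another option: an unhandled error in any thread then stops the whole program instead of only that thread.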

How to handle security alert in Firefox with Selenium

I'm using selenium-webdriver with ruby to write automated tests.
Chrome and the chromedriver binary work really well, but I have an issue with Firefox that is related to the configuration of the browser and that's making my tests fail, whereas they pass with Chrome.
When executing the tests in Firefox, sometimes I get an alert with this message:
Although this page is encrypted, the information you have entered is
to be sent over an unencrypted connection and could easily be read by
a third party
And it breaks the execution. Is there a way of disabling this warning in recent Firefox versions (10+) or handling this behavior with Selenium?
In the process of finding a solution for this, I think I might have found a bug in Capybara (v1.1.2).
I managed to get around this problem using the following approach: instead of using the click from Capybara (which would not allow me to capture an exception), I started using the click method from selenium-webdriver.
It seems that when Firefox triggers this alertbox, a string with the message
Although this page is encrypted, the information you have entered is
to be sent over an unencrypted connection and could easily be read by
a third party
is returned as a result of object.native.click, otherwise the string
ok
is returned.
# Define the click method in ruby and call it when you're executing a 'click'
# Public: Click the object.
#
# object - Capybara::Node::Element object to click.
def click object
  return_string = object.native.click
  check_firefox_alertbox if return_string == "ok"
end
def check_firefox_alertbox
  if @current_browser == :firefox
    @session.driver.browser.switch_to.alert.accept
  end
rescue Exception => e
  puts e
end
Here is what you can do. Type about:config in the Firefox address bar. You will be presented with a number of options (once you get past a warning message).
Look for security.warn_leaving_secure and security.warn_leaving_insecure. Set both of them to false, and you should be good to go.
Please note: this works only on the Firefox instance you have modified, so you will need to use a Firefox profile to launch this instance.
Hope this helps.
This was actually meant to be a comment, but I need more than 50 reputation to be able to comment. I suppose that by "breaking" the execution you mean that of the Ruby script, right? What happens to Firefox? Does it need a click to proceed? If that is the case, you can improvise by capturing the Ruby error: put the sensitive code (where it breaks) between a begin and a rescue clause. Something like this:
begin
  # ...
  # Code that is giving you a headache
  # ...
rescue
  # Capture the exception and give Ruby a chance to continue the script normally.
end
Alternatively, if you don't fancy the above solution, you can go to Firefox and type about:config in the address box. Filter by 'security.warn' and set all the boolean variables you see there to false. Good riddance, fingers crossed ;)

Issues with Sinatra and Heroku

So I've created and published a Sinatra app to Heroku without any issues. I've even tested it locally with rackup to make sure it functions fine. There are a series of API calls to various places after a zip code is consumed from the URL, but Heroku just wants to tell me there is a server error.
I've added an error page that tries to give me a better description. However, it tells me it can't perform a `count' for #, which I assume means hash. Here's the code that I think it's trying to execute...
if weather_doc.root.elements["weather"].children.count > 1
  curr_temp = weather_doc.root.elements["weather/current_conditions/temp_f"].attributes["data"]
else
  raise error(404, "Not A Valid Zip Code!")
end
If anyone wants to bang on it, it can be reached at, http://quiet-journey-14.heroku.com/ , but there's not much to be had.
On older Ruby versions (before 1.8.7), Hash doesn't have a count method. It has a length method. If # really does refer to a hash object, then the problem is that you're calling a method that doesn't exist.
That # doesn't refer to Hash, it's the first character of #<Array:0x2b2080a3e028>. The part between the < and > is not shown in browsers (hiding the tags themselves), but visible with View Source.
Your real problem is not related to Ruby though, but to your navigation in the HTML or XML document (via DOM). Your statement
weather_doc.root.elements["weather"].children.count > 1
navigates the HTML/XML document, selecting the 'weather' elements, and (tries to) count the children. The result of the children call does not have a method count. Use length instead.
BTW, are you sure that the document contains a <weather> tag? Because that's what you're trying to select.
If you want to see what's behind #, try
raise probably_hash.class.to_s
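Since the browser hides anything that looks like a tag, one option is to HTML-escape the description before it goes into the error page. The sketch below is only an illustration: describe_for_html is a hypothetical helper name, while ERB::Util.html_escape is part of Ruby's standard library.

```ruby
require 'erb'

# Hypothetical helper: describes an object in a form that is safe to embed in
# an HTML error page, so that "#<...>" is not swallowed as a tag.
def describe_for_html(object)
  ERB::Util.html_escape("#{object.class}: #{object.inspect}")
end

puts describe_for_html(Object.new)
```

The escaped string shows the class name and the full #<...> text in the rendered page, so View Source is no longer needed to read it.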
