What is the proper Ruby way to redo a conditional?

The task is to check whether a contact page exists and navigate to it. For websites that are not in English, the method looks for an English page and then restarts the check for a contact page.
My conditional works fine, but I figured there must be a better way to do this:
# First, I set the @url variable inside the Boolean checks.
# Each check tests either whether a link exists or whether a page exists
# (i.e. no 404 error).
#
# Here are two examples:
# Boolean, returns true if contact link is present.
def contact_link?
  @url = link_with_href('contact')
  !@url.nil?
end
# True if contact page '../contact' does NOT get a 404 error.
def contact_page?
  @url = page.uri.merge('../contact').to_s
  begin
    true if Mechanize.new.get(@url)
  rescue Mechanize::ResponseCodeError
    false
  end
end
#
# Now go to the correct page, based off of checks.
#
def go_to_contact_page
  1.times do
    case # No redo necessary.
    when contact_link? # True if hyperlink exists
      get(@url)
    when contact_page? # False if 404 error
      get(@url)
    else # Redo is now necessary.
      if english_link? # True if hyperlink exists
        get(@url)
        redo
      elsif en_page? # False if 404 error
        get(@url)
        redo
      elsif english_page? # False if 404 error
        redo
      end
    end
  end
end
There are a couple things to draw your attention to:
Is 1.times do the best way to do a single redo? Would begin be better?
Understanding that I set the @url variable in each of these checks, there seems to be redundancy in get(@url) in the conditional branch. Is there a more succinct way?
I am writing redo three times which also seems redundant. Is there a way to call it once and still set the @url variable?
Thanks for the help!

Something like this is more readable and DRY:
def english_contact_page
  ..
rescue
  nil
end
def contact_page
  ..
rescue
  nil
end
def get_page
  @url = link_with_href('contact')
  return nil if @url.nil?
  contact_page || english_contact_page # left side is evaluated first
rescue
  nil
end
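The same idea can be sketched without redo at all: try each check in order, and if none succeeds, switch to the English version once and try again. This is a minimal sketch, not the asker's Mechanize code. The names first_found and find_contact_url, the lambdas, and the example.com URLs are all hypothetical stand-ins for the link and 404 checks in the question.

```ruby
# Return the first non-nil URL produced by the given finders, or nil.
# Each finder is a callable standing in for a check such as contact_link?.
def first_found(finders)
  finders.each do |finder|
    url = finder.call
    return url if url
  end
  nil
end

# Look for a contact URL; if there is none, visit the English version of the
# site (at most once) and look again. `visit` stands in for Mechanize's get.
def find_contact_url(contact_finders, english_finders, visit)
  url = first_found(contact_finders)
  return url if url

  english_url = first_found(english_finders)
  return nil unless english_url

  visit.call(english_url)
  first_found(contact_finders)
end

# Usage with in-memory stand-ins (no network access).
current = 'https://example.com/fr/'
contact_finders = [-> { current.end_with?('/en/') ? "#{current}contact" : nil }]
english_finders = [-> { nil }, -> { 'https://example.com/en/' }]
visit = ->(url) { current = url }

puts find_contact_url(contact_finders, english_finders, visit)
```

Because each finder returns the URL instead of setting an instance variable as a side effect, the navigation call appears only once and no retry keyword is needed.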

Related

Increasing Ruby Resolv Speed

I'm trying to build a sub-domain brute forcer for use with my clients - I work in security/pen testing.
Currently, I am able to get Resolv to look up around 70 hosts in 10 seconds, give or take, and wanted to know if there was a way to get it to do more. I have seen alternative scripts out there, mainly Python based, that can achieve far greater speeds than this. I don't know how to increase the number of requests Resolv makes in parallel, or if I should split the list up. Please note I have put Google's DNS servers in the sample code, but will be using internal ones for live usage.
My rough code for debugging this issue is:
require 'resolv'
def subdomains
  puts "Subdomain enumeration beginning at #{Time.now.strftime("%H:%M:%S")}"
  subs = []
  domains = File.open("domains.txt", "r") #list of domain names line by line.
  Resolv.new(:nameserver => ['8.8.8.8', '8.8.4.4'])
  File.open("tiny.txt", "r").each_line do |subdomain|
    subdomain.chomp!
    domains.each do |d|
      puts "Checking #{subdomain}.#{d}"
      ip = Resolv.new.getaddress "#{subdomain}.#{d}" rescue ""
      if ip != nil
        subs << subdomain+"."+d << ip
      end
    end
  end
  test = subs.each_slice(4).to_a
  test.each do |z|
    if !z[1].nil? and !z[3].nil?
      puts z[0] + "\t" + z[1] + "\t\t" + z[2] + "\t" + z[3]
    end
  end
  puts "Finished at #{Time.now.strftime("%H:%M:%S")}"
end
subdomains
domains.txt is my list of client domain names, for example google.com, bbc.co.uk, apple.com and 'tiny.txt' is a list of potential subdomain names, for example ftp, www, dev, files, upload. Resolv will then lookup files.bbc.co.uk for example and let me know if it exists.
One thing is you are creating a new Resolv instance with the Google nameservers, but never using it; you create a brand new Resolv instance to do the getaddress call, so that instance is probably using some default nameservers and not the Google ones. You could change the code to something like this:
resolv = Resolv.new(:nameserver => ['8.8.8.8', '8.8.4.4'])
# ...
ip = resolv.getaddress "#{subdomain}.#{d}" rescue ""
In addition, I suggest using the File.readlines method to simplify your code:
domains = File.readlines("domains.txt").map(&:chomp)
subdomains = File.readlines("tiny.txt").map(&:chomp)
Also, you're rescuing the bad ip and setting it to the empty string, but then in the next line you test for not nil, so all results should pass, and I don't think that's what you want.
I've refactored your code, but not tested it. Here is what I came up with; it may be clearer:
def subdomains
  puts "Subdomain enumeration beginning at #{Time.now.strftime("%H:%M:%S")}"
  domains = File.readlines("domains.txt").map(&:chomp)
  subdomains = File.readlines("tiny.txt").map(&:chomp)
  resolv = Resolv.new(:nameserver => ['8.8.8.8', '8.8.4.4'])
  valid_subdomains = subdomains.each_with_object([]) do |subdomain, found|
    domains.each do |domain|
      combined_name = "#{subdomain}.#{domain}"
      puts "Checking #{combined_name}"
      ip = resolv.getaddress(combined_name) rescue nil
      # Store the name and its IP as two elements; the printing loop reads them in pairs.
      found << combined_name << ip if ip
    end
  end
  valid_subdomains.each_slice(4).each do |z|
    if z[1] && z[3]
      puts "#{z[0]}\t#{z[1]}\t\t#{z[2]}\t#{z[3]}"
    end
  end
  puts "Finished at #{Time.now.strftime("%H:%M:%S")}"
end
Also, you might want to check out the dnsruby gem (https://github.com/alexdalitz/dnsruby). It might do what you want to do better than Resolv.
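If you would rather stay with the standard library, one option is to spread the lookups over a small pool of threads, since each lookup spends most of its time waiting on the network. The following is only a sketch under some assumptions: resolve_all and its thread_count parameter are names made up for illustration, and the resolver can be any object that responds to getaddress. As far as I know, custom nameservers are configured on Resolv::DNS rather than on Resolv itself, so the commented usage builds a Resolv::DNS instance.

```ruby
require 'resolv'

# Resolve many names concurrently. Returns a hash of name => IP string,
# with nil for names that could not be resolved.
def resolve_all(names, resolver, thread_count: 20)
  queue = Queue.new
  names.each { |name| queue << name }
  results = {}
  lock = Mutex.new

  workers = Array.new(thread_count) do
    Thread.new do
      begin
        # Non-blocking pop raises ThreadError once the queue is drained.
        while (name = queue.pop(true))
          ip = begin
            resolver.getaddress(name).to_s
          rescue Resolv::ResolvError
            nil
          end
          lock.synchronize { results[name] = ip }
        end
      rescue ThreadError
        # No more work for this thread.
      end
    end
  end

  workers.each(&:join)
  results
end

# Possible usage (performs real DNS queries, so it is left commented out):
# dns = Resolv::DNS.new(nameserver: ['8.8.8.8', '8.8.4.4'])
# resolve_all(%w[www.example.com ftp.example.com], dns).each do |name, ip|
#   puts "#{name}\t#{ip}" if ip
# end
```

The thread count is something to tune against your internal DNS servers. If sharing one resolver between threads turns out to be a problem, creating one resolver per thread is a simple alternative.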
[Note: I've rewritten the code so that it fetches the IP addresses in chunks. Please see https://gist.github.com/keithrbennett/3cf0be2a1100a46314f662aea9b368ed. You can modify the RESOLVE_CHUNK_SIZE constant to balance performance with resource load.]
I've rewritten this code using the dnsruby gem (written mainly by Alex Dalitz in the UK, and contributed to by myself and others). This version uses asynchronous message processing so that all requests are being processed pretty much simultaneously. I've posted a gist at https://gist.github.com/keithrbennett/3cf0be2a1100a46314f662aea9b368ed but will also post the code here.
Note that since you are new to Ruby, there are lots of things in the code that might be instructive to you, such as method organization, use of Enumerable methods (e.g. the amazing 'partition' method), the Struct class, rescuing a specific Exception class, %w, and Benchmark.
NOTE: LOOKS LIKE STACK OVERFLOW ENFORCES A MAXIMUM MESSAGE SIZE, SO THIS CODE IS TRUNCATED. GO TO THE GIST IN THE LINK ABOVE FOR THE COMPLETE CODE.
#!/usr/bin/env ruby
# Takes a list of subdomain prefixes (e.g. %w(ftp xyz)) and a list of domains (e.g. %w(nytimes.com afp.com)),
# creates the subdomains combining them, fetches their IP addresses (or nil if not found).
require 'dnsruby'
require 'awesome_print'
RESOLVER = Dnsruby::Resolver.new(:nameserver => %w(8.8.8.8 8.8.4.4))
# Experiment with this to get fast throughput but not overload the dnsruby async mechanism:
RESOLVE_CHUNK_SIZE = 50
IpEntry = Struct.new(:name, :ip) do
  def to_s
    "#{name}: #{ip ? ip : '(nil)'}"
  end
end
def assemble_subdomains(subdomain_prefixes, domains)
  domains.each_with_object([]) do |domain, subdomains|
    subdomain_prefixes.each do |prefix|
      subdomains << "#{prefix}.#{domain}"
    end
  end
end
def create_query_message(name)
  Dnsruby::Message.new(name, 'A')
end
def parse_response_for_address(response)
  begin
    a_answer = response.answer.detect { |a| a.type == 'A' }
    a_answer ? a_answer.rdata.to_s : nil
  rescue Dnsruby::NXDomain
    return nil
  end
end
def get_ip_entries(names)
  queue = Queue.new
  names.each do |name|
    query_message = create_query_message(name)
    RESOLVER.send_async(query_message, queue, name)
  end
  # Note: although map is used here, the record in the output array will not necessarily correspond
  # to the record in the input array, since the order of the messages returned is not guaranteed.
  # This is indicated by the lack of block variable specified (normally w/map you would use the element).
  # That should not matter to us though.
  names.map do
    _id, result, error = queue.pop
    name = _id
    case error
    when Dnsruby::NXDomain
      IpEntry.new(name, nil)
    when NilClass
      ip = parse_response_for_address(result)
      IpEntry.new(name, ip)
    else
      raise error
    end
  end
end
def main
  # domains = File.readlines("domains.txt").map(&:chomp)
  domains = %w(nytimes.com afp.com cnn.com bbc.com)
  # subdomain_prefixes = File.readlines("subdomain_prefixes.txt").map(&:chomp)
  subdomain_prefixes = %w(www xyz)
  subdomains = assemble_subdomains(subdomain_prefixes, domains)
  start_time = Time.now
  ip_entries = subdomains.each_slice(RESOLVE_CHUNK_SIZE).each_with_object([]) do |ip_entries_chunk, results|
    results.concat get_ip_entries(ip_entries_chunk)
  end
  duration = Time.now - start_time
  found, not_found = ip_entries.partition { |entry| entry.ip }
  puts "\nFound:\n\n"; puts found.map(&:to_s); puts "\n\n"
  puts "Not Found:\n\n"; puts not_found.map(&:to_s); puts "\n\n"
  stats = {
    duration: duration,
    domain_count: ip_entries.size,
    found_count: found.size,
    not_found_count: not_found.size,
  }
  ap stats
end
main

RubyDNS otherwise not working

I am using RubyDNS.
When I use a match block together with an otherwise block, I want to skip some addresses in match so that they are caught by the otherwise block.
But the request never reaches the otherwise block.
RubyDNS.run_server(listen: INTERFACES, asynchronous: false) do
  upstream = RubyDNS::Resolver.new([[:udp, "8.8.8.8", 53], [:tcp, "8.8.8.8", 53]])
  match(/^([\d\.]+)\.in-addr\.arpa$/, IN::PTR) do |transaction, match_data|
    domain = nil # just for test
    if domain
      transaction.respond!(Name.create(domain))
    else
      # Pass the request to the otherwise handler
      # !!! this doesn't work
      false
    end
  end
  otherwise do |transaction|
    transaction.passthrough!(upstream)
  end
end
When I return false from the match block, the request does not fall through to the otherwise block.
How can I fix this?
I found out how to continue from a match block to the otherwise block: use 'next!'
match(/^([\d\.]+)\.in-addr\.arpa$/, IN::PTR) do |transaction, match_data|
  domain = nil # just for test
  if domain
    transaction.respond!(Name.create(domain))
  else
    # Pass the request to the otherwise handler
    next!
  end
end
otherwise do |transaction|
  transaction.passthrough!(upstream)
end
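To illustrate why an explicit call is needed rather than a return value, here is a toy rule chain in plain Ruby. It is not RubyDNS code: RuleChain and its methods are made up for this sketch, and it assumes that a fall-through call like next! is built on Ruby's throw/catch, which is my understanding of the pattern but not something confirmed here.

```ruby
# A toy rule chain showing the fall-through pattern with catch/throw.
class RuleChain
  def initialize
    @rules = []
    @otherwise = nil
  end

  # Register a handler for names matching the pattern.
  def match(pattern, &handler)
    @rules << [pattern, handler]
  end

  # Register the fallback handler.
  def otherwise(&handler)
    @otherwise = handler
  end

  # Called from inside a handler to give up and try the next rule.
  def next!
    throw :next
  end

  def process(name)
    @rules.each do |pattern, handler|
      next unless pattern.match?(name)
      # If the handler finishes normally, its value is returned from process,
      # whatever that value is (even false). Only throw :next continues the loop.
      catch(:next) { return instance_exec(name, &handler) }
    end
    @otherwise&.call(name)
  end
end

chain = RuleChain.new
chain.match(/\.in-addr\.arpa\z/) do |name|
  if name.start_with?('10.')
    "local:#{name}"
  else
    next!
  end
end
chain.otherwise { |name| "upstream:#{name}" }

puts chain.process('10.0.0.1.in-addr.arpa')
puts chain.process('8.8.8.8.in-addr.arpa')
```

In this sketch a handler that merely returns false counts as having handled the request, which mirrors the behavior described in the question.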

Structuring Nokogiri output without HTML tags

I got Ruby to travel to a web site, iterate through a list of campaigns and scrape the pages for specific data. The problem I have now is getting it from the structure Nokogiri gives me, and outputting it into a readable form.
require 'watir'
require 'nokogiri'
campaign_list = Array.new
campaign_list.push(1042360, 1042386, 1042365, 992307)
browser = Watir::Browser.new :chrome
browser.goto '<redacted>'
browser.text_field(:id => 'email').set '<redacted>'
browser.text_field(:id => 'password').set '<redacted>'
browser.send_keys :enter
file = File.new('hourlysales.csv', 'w')
data = {}
campaign_list.each do |campaign|
  browser.goto "<redacted>"
  if browser.text.include? "Application Error"
    puts "Error loading page, I recommend restarting script"
    # Possibly automatic restart of script
  else
    hourly_data = Nokogiri::HTML.parse(browser.html).text
    # file.write data
    puts hourly_data
  end
end
This is the output I get:
{"views":[[17,145],[18,165],[19,99],[20,71],[21,31],[22,26],[23,10],[0,15],[1,1], [2,18],[3,19],[4,35],[5,47],[6,44],[7,67],[8,179],[9,141],[10,112],[11,95],[12,46],[13,82],[14,79],[15,70],[16,103]],"orders":[[17,10],[18,9],[19,5],[20,1],[21,1],[22,0],[23,0],[0,1],[1,0],[2,1],[3,0],[4,1],[5,2],[6,1],[7,5],[8,11],[9,6],[10,5],[11,3],[12,1],[13,2],[14,4],[15,6],[16,7]],"conversion_rates":[0.06870229007633588,0.05442176870748299,0.050505050505050504,0.014084507042253521,0.03225806451612903,0.0,0.0,0.06666666666666667,0.0,0.05555555555555555,0.0,0.02857142857142857,0.0425531914893617,0.022727272727272728,0.07462686567164178,0.06134969325153374,0.0425531914893617,0.044642857142857144,0.031578947368421054,0.021739130434782608,0.024390243902439025,0.05063291139240506,0.08571428571428572,0.06741573033707865]}
The arrays stand for { views [[hour, # of views], [hour, # of views], etc. }. Same with orders. I don't need conversion rates.
I also need to add the values up for each key, so after doing this for 5 pages, I have one key for each hour of the day, and the total number of views for that hour. I tried a couple each loops, but couldn't make any progress.
I appreciate any help you guys can give me.
It looks like the output (which from your code I assume is the content of hourly_data) is JSON. In that case, it's easy to parse and add up the numbers. Something like this:
require "json" # at the top of your script
# ...
def sum_hours_values(data, hours_values=nil)
  # Start with an empty hash that automatically initializes missing keys to `0`
  hours_values ||= Hash.new {|hsh,hour| hsh[hour] = 0 }
  # Iterate through the [hour, value] arrays, adding `value` to the running
  # count for that `hour`, and return `hours_values`
  data.each_with_object(hours_values) do |(hour, value), hsh|
    hsh[hour] += value
  end
end
# ... Watir/Nokogiri stuff here...
# Initialize these so they persist outside the loop
hours_views, hours_orders = nil, nil
campaign_list.each do |campaign|
  browser.goto "<redacted>"
  if browser.text.include? "Application Error"
    # ...
  else
    # ...
    hourly_data_parsed = JSON.parse(hourly_data)
    # Pass the running totals back in so they accumulate across pages
    hours_views = sum_hours_values(hourly_data_parsed["views"], hours_views)
    hours_orders = sum_hours_values(hourly_data_parsed["orders"], hours_orders)
  end
end
puts "Views by hour:"
puts hours_views.sort.map {|hour_views| "%2i\t%4i" % hour_views }
puts "Orders by hour:"
puts hours_orders.sort.map {|hour_orders| "%2i\t%4i" % hour_orders }
P.S. There's a really nice recursive version of sum_hours_values I didn't include since the iterative version is clearer to most Ruby programmers. If you're into recursion I leave it as an exercise for you. ;)
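To see the accumulation on its own, without a browser, here is a small self-contained sketch. The two JSON strings are made-up sample pages in the same shape as the output in the question, and this variant of sum_hours_values uses Hash.new(0) as its default, which is just one way to write it.

```ruby
require 'json'

# Add each [hour, value] pair from `data` into the running `totals` hash.
def sum_hours_values(data, totals = Hash.new(0))
  data.each_with_object(totals) { |(hour, value), acc| acc[hour] += value }
end

# Hypothetical sample pages, shaped like the scraped output.
pages = [
  '{"views":[[17,145],[18,165]],"orders":[[17,10],[18,9]]}',
  '{"views":[[17,5],[19,99]],"orders":[[17,1],[19,5]]}'
]

views = Hash.new(0)
orders = Hash.new(0)
pages.each do |json|
  parsed = JSON.parse(json)
  sum_hours_values(parsed['views'], views)
  sum_hours_values(parsed['orders'], orders)
end

puts 'Views by hour:'
views.sort.each { |hour, total| puts format("%2d\t%4d", hour, total) }
puts 'Orders by hour:'
orders.sort.each { |hour, total| puts format("%2d\t%4d", hour, total) }
```

With these sample pages, hour 17 ends up with 150 views and 11 orders, since it appears on both pages.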

Ruby regex matching of selective tweets

I'm having difficulty using .match to allow and block selected tweets, so that only the tweets passing 'does_match?' are displayed.
def does_match?
  allow = "/orange|grape\sfruit|apple/"
  block = "/#fruits|coconut/"
  allowfruits = "/berry|mango/"
  @tweet.match(allow).nil?
  @tweet.match(block)
  @tweet.match(allowfruits) if @user =~ /\A(twitteruser|anotheraccount)\Z/
  @tweet.match(/#[A-Za-z0-9_:-]+/)
  return @tweet
end
def show
  return @tweet
end
First, you are defining your regexps as strings. Use regexp literals instead:
allow = /orange|grape\sfruit|apple/
Second, you are performing some matches but doing nothing with their return values. Do this:
if @tweet.match(allow)
  # rest of logic
  # checking blocked and allowed for user
  @tweet # or true
else
  nil # or false
end
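Putting both points together, a complete check might look like the sketch below. The precedence is an assumption on my part, since the question does not spell it out: blocked terms win, then the general allow list applies, and the extra fruit list only applies to the trusted accounts. The method name tweet_matches? and the constants are made up for this example.

```ruby
ALLOW = /orange|grape\sfruit|apple/
BLOCK = /#fruits|coconut/
ALLOW_FOR_TRUSTED = /berry|mango/
TRUSTED_USER = /\A(twitteruser|anotheraccount)\z/

# Returns true if the tweet should be shown, false otherwise.
def tweet_matches?(tweet, user)
  return false if tweet.match?(BLOCK)  # assumed: blocked terms always win
  return true if tweet.match?(ALLOW)
  user.match?(TRUSTED_USER) && tweet.match?(ALLOW_FOR_TRUSTED)
end

puts tweet_matches?('I love apple pie', 'someone')       # allowed term
puts tweet_matches?('apple and coconut', 'someone')      # contains a blocked term
puts tweet_matches?('mango season', 'twitteruser')       # trusted user, extra list
puts tweet_matches?('mango season', 'someone')           # untrusted user
```

Returning a boolean keeps the filter separate from the display logic, so show can decide what to do with the tweet.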

Toggling true/false: editing a file in ruby

I have some code that tries to change 'false' to 'true' in a Ruby file, but it only works once while the script is running.
toggleto = true
text = File.read(filename)
text.gsub!("#{!toggleto}", "#{toggleto}")
File.open(filename, 'w+') {|file| file.write(text); file.close}
As far as I know, as long as I close the file, I should be able to read back what I previously wrote and thus toggle the value back and forth any number of times.
Larger Context:
def toggleAutoAction
  require "#{@require_path}/options"
  filename = "#{@require_path}/options.rb"
  writeToggle(filename, !OPTIONS[:auto])
  0
end
def writeToggle(filename, toggleto)
  text = File.read(filename)
  text.gsub!(":auto => #{!toggleto}", ":auto => #{toggleto}")
  File.open(filename, 'w+') {|file| file.write(text); file.close}
end
def exitOrMenu
  puts "Are you done? (y/n)"
  prompt
  if gets.chomp == 'n'
    whichAction
  else
    exit
  end
end
def whichAction
  if action == 5
    toggleAutoAction
  else
    puts "Sorry, that isn't an option...returning"
    return 1
  end
  exitOrMenu
end
The problem lies within this method:
def toggleAutoAction
  require "#{@require_path}/options" # here
  filename = "#{@require_path}/options.rb"
  writeToggle(filename, !OPTIONS[:auto])
  0
end
Ruby will not load the options.rb a second time (i.e. with the exact same path name), hence your !OPTIONS[:auto] will only be evaluated once (otherwise you would get a constant-already-defined-warning, provided OPTIONS is defined in options.rb). See Kernel#require docs.
You could, of course, do crazy stuff like
eval File.read("#{@require_path}/options.rb")
but I would not recommend that (performance wise).
As noted above, reading/writing from/to YAML files is less painful ;-)
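For illustration, a YAML-based toggle could look like the following sketch. The file name options.yml, the 'auto' key, and the toggle_auto method are assumptions made for this example; the point is that the file is re-read on every call, so nothing depends on require caching.

```ruby
require 'yaml'
require 'tmpdir'

# Flip the boolean stored under 'auto' in the YAML file and return the new value.
def toggle_auto(path)
  options = YAML.load_file(path)
  options['auto'] = !options['auto']
  File.write(path, options.to_yaml)
  options['auto']
end

# Demonstration in a temporary directory so no real file is touched.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'options.yml')
  File.write(path, { 'auto' => false }.to_yaml)
  puts toggle_auto(path) # true
  puts toggle_auto(path) # false
end
```

Since the current value is always taken from disk, the toggle works any number of times within one run of the script.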

Resources