Preparing and executing SQLite Statements in Ruby

I have been trying to puts some executed statements after I prepare them. The purpose of this is to sanitize my data inputs, which I have never done before. I followed the steps here, but I am not getting the result I want.
Here's what I have:
require 'sqlite3'

$db = SQLite3::Database.open "congress_poll_results.db"

def rep_pull(state)
  pull = $db.prepare("SELECT name, location FROM congress_members WHERE location = ?")
  pull.bind_param 1, state
  puts pull.execute
end

rep_pull("MN")
=> #<SQLite3::ResultSet:0x2e69e00>
What I am expecting is a list of reps in MN, but instead I just get that "SQLite3::ResultSet:0x2e69e00" thing.
What am I missing here? Thanks very much.

execute returns a SQLite3::ResultSet object rather than the rows themselves, so puts just prints the object's default description - that "#<SQLite3::ResultSet...>" string you're seeing. Iterate over the result set to get the actual rows. Try this:
def rep_pull(state)
  pull = $db.prepare("SELECT name, location FROM congress_members WHERE location = ?")
  pull.bind_param 1, state
  # execute returns the ResultSet; each yields the rows one at a time
  pull.execute.each do |row|
    p row
  end
end
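If you would rather hand the rows back to the caller than print them, SQLite3::ResultSet is Enumerable, so to_a works too. A minimal sketch, untested against the asker's database:

def rep_pull(state)
  pull = $db.prepare("SELECT name, location FROM congress_members WHERE location = ?")
  pull.bind_param 1, state
  rows = pull.execute.to_a # => array of [name, location] rows
  pull.close               # prepared statements hold resources until closed
  rows
end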

Related

Why is this ruby code returning a blank page instead of filling it up with user names?

I want to collect the names of users in a particular group, called Nature, in the photo-sharing website Fotolog. This is my code:
require 'rubygems'
require 'mechanize'
require 'csv'

def getInitUser()
  agent1 = Mechanize.new
  number = 0
  while number <= 500
    address = 'http://http://www.fotolog.com/nature/participants/#{number}/'
    logfile2 = File.new("Fotolog/Users.csv", "a")
    tryConut = 0
    begin
      page = agent1.get(address)
    rescue
      tryConut = tryConut + 1
      if tryConut < 5
        retry
      end
      return
    end
    arrayUsers = []
    # search for the users
    page.search("a[class=img_border_radius").map do |opt|
      link = opt.attributes['href'].text
      link = link.gsub("http://www.fotolog.com/", "").gsub("/", "")
      arrayUsers << link
      logfile2.print("#{link}\n")
    end
    number = number + 100
  end
  return arrayUsers
end

arrayUsers = getInitUser()
arrayUsers.each do |user|
  getFriend(user)
end
But the Users.csv file I am getting is empty. What's wrong here? I suspect it might have something to do with the "class" tag I am using, but from inspect element it seems to be the correct class. I am just getting started with web crawling, so I apologise if this is a silly query.
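For what it's worth, two things in the posted code look like likely culprits independent of the class selector: the address string is single-quoted, so #{number} is never interpolated, and its scheme is doubled ("http://http://"); the attribute selector is also missing its closing bracket. A corrected fragment might look like this sketch, assuming the img_border_radius class is otherwise right:

# Double quotes enable #{number} interpolation; note the single "http://".
address = "http://www.fotolog.com/nature/participants/#{number}/"
# ...
# Use a class selector (or close the attribute selector: "a[class=img_border_radius]").
page.search("a.img_border_radius").each do |opt|
  link = opt['href'].gsub("http://www.fotolog.com/", "").gsub("/", "")
  arrayUsers << link
  logfile2.print("#{link}\n")
end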

Threading sqlite connections in Ruby

I've been trying to get my ruby script threaded since yesterday. I've since opted for SQLite to save data, with the parallel gem to manage concurrency.
I've built a quick script for testing, but I'm having trouble getting the threading working; the database is locked. I've added db.close to the end, which doesn't help, and I've tried adding sleep until db.closed?, but that just sleeps indefinitely. What am I doing wrong?
The error is "database is locked (SQLite3::BusyException)".
Here's my code:
require 'sqlite3'
require 'pry'
require 'parallel'

STDOUT.sync = true

db = SQLite3::Database.new "test.db"
arr = [1,2,3,4,5,6,7,8,9,10]

rows = db.execute <<-SQL
  create table test_table (
    original string,
    conversion string
  );
SQL

def test(num)
  db = SQLite3::Database.new "test.db"
  puts "the num: #{num}"
  sleep 4
  { num => num + 10 }.each do |pair|
    db.execute "insert into test_table values (?, ?)", pair
  end
  db.close
end

Parallel.each( -> { arr.pop || Parallel::Stop }, in_processes: 3) { |number| test(number) }
SQLite is threadsafe by default (it runs in its "serialized" mode), and the Ruby wrapper apparently supports this to whatever extent it needs to. However, it's not safe across processes, which makes a certain sense, since the adapter or engine probably has to negotiate some state within the process to prevent locks.
To fix your example, change in_processes to in_threads.
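That is a one-line change (a sketch, untested):

Parallel.each( -> { arr.pop || Parallel::Stop }, in_threads: 3) { |number| test(number) }

If you do need separate processes, giving each connection a busy timeout (db.busy_timeout = 1000, in milliseconds) tells SQLite to wait and retry for that long before raising SQLite3::BusyException, which is often enough.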

Increasing Ruby Resolv Speed

I'm trying to build a sub-domain brute forcer for use with my clients - I work in security/pen testing.
Currently, I am able to get Resolv to look up around 70 hosts in 10 seconds, give or take, and wanted to know if there is a way to get it to do more. I have seen alternative scripts out there, mainly Python based, that can achieve far greater speeds than this. I don't know how to increase the number of requests Resolv makes in parallel, or if I should split the list up. Please note I have put Google's DNS servers in the sample code, but will be using internal ones for live usage.
My rough code for debugging this issue is:
require 'resolv'

def subdomains
  puts "Subdomain enumeration beginning at #{Time.now.strftime("%H:%M:%S")}"
  subs = []
  domains = File.open("domains.txt", "r") # list of domain names line by line.
  Resolv.new(:nameserver => ['8.8.8.8', '8.8.4.4'])
  File.open("tiny.txt", "r").each_line do |subdomain|
    subdomain.chomp!
    domains.each do |d|
      puts "Checking #{subdomain}.#{d}"
      ip = Resolv.new.getaddress "#{subdomain}.#{d}" rescue ""
      if ip != nil
        subs << subdomain+"."+d << ip
      end
    end
  end
  test = subs.each_slice(4).to_a
  test.each do |z|
    if !z[1].nil? and !z[3].nil?
      puts z[0] + "\t" + z[1] + "\t\t" + z[2] + "\t" + z[3]
    end
  end
  puts "Finished at #{Time.now.strftime("%H:%M:%S")}"
end

subdomains
domains.txt is my list of client domain names, for example google.com, bbc.co.uk, apple.com, and tiny.txt is a list of potential subdomain names, for example ftp, www, dev, files, upload. Resolv will then look up files.bbc.co.uk, for example, and let me know if it exists.
One thing is you are creating a new Resolv instance with the Google nameservers, but never using it; you create a brand new Resolv instance to do the getaddress call, so that instance is probably using some default nameservers and not the Google ones. You could change the code to something like this:
resolv = Resolv.new(:nameserver => ['8.8.8.8', '8.8.4.4'])
# ...
ip = resolv.getaddress "#{subdomain}.#{d}" rescue ""
In addition, I suggest using the File.readlines method to simplify your code:
domains = File.readlines("domains.txt").map(&:chomp)
subdomains = File.readlines("tiny.txt").map(&:chomp)
Also, you're rescuing the bad ip and setting it to the empty string, but then in the next line you test for not nil, so all results should pass, and I don't think that's what you want.
I've refactored your code, but not tested it. Here is what I came up with, which may be clearer:
def subdomains
  puts "Subdomain enumeration beginning at #{Time.now.strftime("%H:%M:%S")}"
  domains = File.readlines("domains.txt").map(&:chomp)
  subdomains = File.readlines("tiny.txt").map(&:chomp)
  resolv = Resolv.new(:nameserver => ['8.8.8.8', '8.8.4.4'])
  valid_subdomains = subdomains.each_with_object([]) do |subdomain, valid_subdomains|
    domains.each do |domain|
      combined_name = "#{subdomain}.#{domain}"
      puts "Checking #{combined_name}"
      ip = resolv.getaddress(combined_name) rescue nil
      # Push the name and its IP as two separate elements, as the original did,
      # so the each_slice(4) grouping below still lines up.
      valid_subdomains << combined_name << ip if ip
    end
  end
  valid_subdomains.each_slice(4) do |z|
    if z[1] && z[3]
      puts "#{z[0]}\t#{z[1]}\t\t#{z[2]}\t#{z[3]}"
    end
  end
  puts "Finished at #{Time.now.strftime("%H:%M:%S")}"
end
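One structural note: Resolv's getaddress is a blocking call, so this refactor is tidier but still performs the lookups sequentially; the dnsruby approach suggested next is what actually parallelizes them. As with the original, you'd run it the same way (a trivial sketch):

require 'resolv'

subdomains # prints each check and the found name/IP pairs to stdout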
Also, you might want to check out the dnsruby gem (https://github.com/alexdalitz/dnsruby). It might do what you want to do better than Resolv.
[Note: I've rewritten the code so that it fetches the IP addresses in chunks. Please see https://gist.github.com/keithrbennett/3cf0be2a1100a46314f662aea9b368ed. You can modify the RESOLVE_CHUNK_SIZE constant to balance performance with resource load.]
I've rewritten this code using the dnsruby gem (written mainly by Alex Dalitz in the UK, and contributed to by myself and others). This version uses asynchronous message processing so that all requests are being processed pretty much simultaneously. I've posted a gist at https://gist.github.com/keithrbennett/3cf0be2a1100a46314f662aea9b368ed but will also post the code here.
Note that since you are new to Ruby, there are lots of things in the code that might be instructive to you, such as method organization, use of Enumerable methods (e.g. the amazing 'partition' method), the Struct class, rescuing a specific Exception class, %w, and Benchmark.
NOTE: LOOKS LIKE STACK OVERFLOW ENFORCES A MAXIMUM MESSAGE SIZE, SO THIS CODE IS TRUNCATED. GO TO THE GIST IN THE LINK ABOVE FOR THE COMPLETE CODE.
#!/usr/bin/env ruby

# Takes a list of subdomain prefixes (e.g. %w(ftp xyz)) and a list of domains (e.g. %w(nytimes.com afp.com)),
# creates the subdomains combining them, fetches their IP addresses (or nil if not found).

require 'dnsruby'
require 'awesome_print'

RESOLVER = Dnsruby::Resolver.new(:nameserver => %w(8.8.8.8 8.8.4.4))

# Experiment with this to get fast throughput but not overload the dnsruby async mechanism:
RESOLVE_CHUNK_SIZE = 50

IpEntry = Struct.new(:name, :ip) do
  def to_s
    "#{name}: #{ip ? ip : '(nil)'}"
  end
end

def assemble_subdomains(subdomain_prefixes, domains)
  domains.each_with_object([]) do |domain, subdomains|
    subdomain_prefixes.each do |prefix|
      subdomains << "#{prefix}.#{domain}"
    end
  end
end

def create_query_message(name)
  Dnsruby::Message.new(name, 'A')
end

def parse_response_for_address(response)
  begin
    a_answer = response.answer.detect { |a| a.type == 'A' }
    a_answer ? a_answer.rdata.to_s : nil
  rescue Dnsruby::NXDomain
    return nil
  end
end

def get_ip_entries(names)
  queue = Queue.new
  names.each do |name|
    query_message = create_query_message(name)
    RESOLVER.send_async(query_message, queue, name)
  end

  # Note: although map is used here, the record in the output array will not necessarily correspond
  # to the record in the input array, since the order of the messages returned is not guaranteed.
  # This is indicated by the lack of block variable specified (normally w/map you would use the element).
  # That should not matter to us though.
  names.map do
    _id, result, error = queue.pop
    name = _id
    case error
    when Dnsruby::NXDomain
      IpEntry.new(name, nil)
    when NilClass
      ip = parse_response_for_address(result)
      IpEntry.new(name, ip)
    else
      raise error
    end
  end
end

def main
  # domains = File.readlines("domains.txt").map(&:chomp)
  domains = %w(nytimes.com afp.com cnn.com bbc.com)

  # subdomain_prefixes = File.readlines("subdomain_prefixes.txt").map(&:chomp)
  subdomain_prefixes = %w(www xyz)

  subdomains = assemble_subdomains(subdomain_prefixes, domains)

  start_time = Time.now
  ip_entries = subdomains.each_slice(RESOLVE_CHUNK_SIZE).each_with_object([]) do |ip_entries_chunk, results|
    results.concat get_ip_entries(ip_entries_chunk)
  end
  duration = Time.now - start_time

  found, not_found = ip_entries.partition { |entry| entry.ip }

  puts "\nFound:\n\n"; puts found.map(&:to_s); puts "\n\n"
  puts "Not Found:\n\n"; puts not_found.map(&:to_s); puts "\n\n"

  stats = {
    duration: duration,
    domain_count: ip_entries.size,
    found_count: found.size,
    not_found_count: not_found.size,
  }
  ap stats
end

main

Cassandra Ruby: multiple values for a block parameter (2 for 1)

I am trying to follow a tutorial on big data that reads data from a keyspace defined with cqlsh.
I have compiled this piece of code successfully:
require 'rubygems'
require 'cassandra'

db = Cassandra.new('big_data', '127.0.0.1:9160')

# get a specific user's tags
row = db.get(:user_tags, "paul")

###

def tag_counts_from_row(row)
  tags = {}
  row.each_pair do |pair|
    column, tag_count = pair
    # tag_name = column.parts.first
    tag_name = column
    tags[tag_name] = tag_count
  end
  tags
end

###

# insert a new user
db.add(:user_tags, "todd", 3, "postgres")
db.add(:user_tags, "lili", 4, "win")

tags = tag_counts_from_row(row)
puts "paul - #{tags.inspect}"
But when I write this part to output everyone's tags, I get an error.
user_ids = []
db.get_range(:user_tags, :batch_size => 10000) do |id|
  # user_ids << id
end

rows_with_ids = db.multi_get(:user_tags, user_ids)
rows_with_ids.each do |row_with_id|
  name, row = row_with_id
  tags = tag_counts_from_row(row)
  puts "#{name} - #{tags.inspect}"
end
The error is:
line 33: warning: multiple values for a block parameter (2 for 1)
I think the error may have come from incompatible versions of Cassandra and Ruby. How do I fix it?
It's a little hard to tell which line is 33, but it looks like the problem is that get_range yields two values, while your block is only taking the first one. If you only care about the row keys and not the columns, then you should use get_range_keys.
It looks like you do in fact care about the column values because you fetch them out again using db.multi_get. This is an unnecessary additional query. You can update your code to something like:
db.get_range(:user_tags, :batch_size => 10000) do |id, columns|
  tags = tag_counts_from_row(columns)
  puts "#{id} - #{tags.inspect}"
end
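Conversely, if you ever do want just the keys, get_range_keys (mentioned above) would be something like this sketch, assuming it simply returns the matching keys:

user_ids = db.get_range_keys(:user_tags, :batch_size => 10000)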

Having problems with Ruby file from Dashing

I am having trouble with twitter_user.rb, which is supposed to get the number of tweets, followers, and following of a given Twitter username.
I assume that I am supposed to replace TWITTER_USERNAME in line 9 with the Twitter username that I am interested in. I did that and started dashing but I got:
scheduler caught exception:
undefined method '[]' for nil:NilClass
/.../jobs/twitter_user.rb:19:in 'block in <top (required)>'
It looks like the problem is with line 19 which is:
tweets = /profile["']>[\n\t\s]*<strong>([\d.,]+)/.match(response.body)[1].delete('.,').to_i
Can anybody tell me what is going on and how to fix it?
Your assumption is incorrect. The program is looking for an environment variable called TWITTER_USERNAME that is set to the relevant user name. If that variable doesn't exist then the code uses foobugs instead.
If you would rather modify the code than set up an environment variable, then change
twitter_username = ENV['TWITTER_USERNAME'] || 'foobugs'
to
twitter_username = 'myusername'
This is untested code, but it's a general idea of how it should have been written. If you clone the source from the original page you can adjust it for your own purposes (i.e. fix it):
require 'nokogiri'
doc = Nokogiri::XML(content)
tweets = doc.at('profile strong').text.delete('.,').to_i
following = doc.at('following strong').text.delete('.,').to_i
followers = doc.at('followers strong').text.delete('.,').to_i
The above three lines can be reduced to something like:
tweets, following, followers = %w[profile following followers].map { |tag|
  doc.at("#{ tag } strong").text.delete(',.').to_i
}
Again, without a usable sample of the XML/HTML I can't do much more, but as a general practice we (programmers) shouldn't use regular expressions to try to parse XML or HTML. It's much too easy to break a pattern with either of those types of files.
I managed to solve the same issue for myself by using the Twitter API to pull out the relevant information instead. It seems the web page had changed too much for the scraping to work, and, as various people have already said, it could stop working again without notice.
This is the solution I used.
#### Get your twitter keys & secrets:
#### https://dev.twitter.com/docs/auth/tokens-devtwittercom

require 'twitter'

Twitter.configure do |config|
  config.consumer_key = 'YOUR_CONSUMER_KEY'
  config.consumer_secret = 'YOUR_CONSUMER_SECRET'
  config.oauth_token = 'YOUR_OAUTH_TOKEN'
  config.oauth_token_secret = 'YOUR_OAUTH_SECRET'
end

twitter_username = 'foobugs'

MAX_USER_ATTEMPTS = 10
user_attempts = 0

SCHEDULER.every '10m', :first_in => 0 do |job|
  begin
    tw_user = Twitter.user(twitter_username)
    if tw_user
      tweets = tw_user.statuses_count
      followers = tw_user.followers_count
      following = tw_user.friends_count
      send_event('twitter_user_tweets', current: tweets)
      send_event('twitter_user_followers', current: followers)
      send_event('twitter_user_following', current: following)
    end
  rescue Twitter::Error => e
    user_attempts = user_attempts + 1
    puts "Twitter error #{e}"
    puts "\e[33mFor the twitter_user widget to work, you need to put in your twitter API keys in the jobs/twitter_user.rb file.\e[0m"
    sleep 5
    retry if user_attempts < MAX_USER_ATTEMPTS
  end
end
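One caveat: Twitter.configure is the twitter gem's older, global configuration API (the v4 era) and was removed in version 5 of the gem. On a newer gem the equivalent setup would be roughly this sketch, with the same keys passed to an instance-based client:

require 'twitter'

# v5+ style: configuration lives on a client instance, not the Twitter module.
client = Twitter::REST::Client.new do |config|
  config.consumer_key        = 'YOUR_CONSUMER_KEY'
  config.consumer_secret     = 'YOUR_CONSUMER_SECRET'
  config.access_token        = 'YOUR_OAUTH_TOKEN'
  config.access_token_secret = 'YOUR_OAUTH_SECRET'
end

tw_user = client.user('foobugs') # then statuses_count etc. as above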
I resolved this by substituting this line:
followers = /<strong>([\d.]+)<\/strong> Follower/.match(response.body)[0].delete('.,').to_i
with these two:
followers_count_metadata = /followers_count":[\d]+/.match(response.body)
followers = /[\d]+/.match(followers_count_metadata.to_s).to_s
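Those two steps can probably be collapsed into a single capture (a sketch, assuming the body still embeds followers_count":<digits>):

followers = response.body[/followers_count":(\d+)/, 1].to_i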
