Controlling Tor client with Ruby - ruby

I am writing a Ruby script which automatically crawls websites for data analysis, and now I have a requirement which is fairly complicated: I have to be able to simulate access from a variety of countries, about 20 different ones. The website will contain different information depending on the IP location, so the only way to get it done is to request it from a server which is actually in that country.
Since I don't want to buy servers in each of those 20 countries, I chose to give Tor a try - as many of you will know, by editing the torrc configuration file it is possible to specify the exit node and hence the country from which the actual request will originate.
When I do this manually, e.g. by editing the torrc file to use an Argentinian server, then disconnecting Tor using Vidalia, reconnecting Vidalia, and then rerunning the request, it works fine. However, I want to automate this process entirely, and do it as efficiently as possible. Tor is written in C, and I'd like to avoid taking apart its entire source code for this. Any idea of what's the easiest way to automate the whole process using only Ruby?
Also, if I'm missing something and there's a simpler alternative to this whole ordeal, let me know.
Thanks!

Please take a look at the Tor control protocol. You can control circuits using telnet.
http://thesprawl.org/memdump/?entry=8
To switch to a new circuit, which in turn switches to a new endpoint:
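require 'net/telnet' # a separate gem (net-telnet) on Ruby 2.3 and later

def switch_endpoint
  # Connect to Tor's control port.
  localhost = Net::Telnet::new("Host" => "localhost", "Port" => "9051", "Timeout" => 10, "Prompt" => /250 OK\n/)
  # An empty password only works if the control port has no authentication configured.
  localhost.cmd('AUTHENTICATE ""') { |c| print c; raise "Cannot authenticate to Tor" if c != "250 OK\n" }
  # NEWNYM asks Tor to use fresh circuits for new connections.
  localhost.cmd('signal NEWNYM') { |c| print c; raise "Cannot switch Tor to new route" if c != "250 OK\n" }
  localhost.close
end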
Be aware that building a new circuit may take a couple of seconds, so you should add a delay in the code, or check whether your address has changed by calling a remote IP detection site.
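The question also asks about choosing the exit country, which NEWNYM alone does not do. As a rough sketch, assuming the control port accepts SETCONF for the same ExitNodes and StrictNodes options that torrc uses, the commands to send could be built as below. The helper name exit_country_commands is made up for illustration, and the commands would be sent after AUTHENTICATE in the same way as in switch_endpoint.

```ruby
# Hypothetical helper (not part of Tor or any gem): builds the control-port
# commands that restrict exit nodes to one country and then ask for new circuits.
def exit_country_commands(country_code)
  code = country_code.to_s.downcase
  raise ArgumentError, "expected a two-letter country code" unless code =~ /\A[a-z]{2}\z/

  [
    "SETCONF ExitNodes={#{code}} StrictNodes=1", # same syntax as the torrc options
    "SIGNAL NEWNYM"                              # use fresh circuits from now on
  ]
end

exit_country_commands("AR").each { |command| puts command }
```

Whether a usable exit node exists in every one of the 20 countries is not something this sketch can guarantee, so checking the resulting IP address is still worthwhile.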

Related

Forwarding Raw Encrypted HTTPS Data in Ruby for MITM

I'm investigating man-in-the-middle attacks and trying to pipe raw HTTPS data (that is, before decryption) to and from a pair of sockets. For now, I just want to listen to the encrypted traffic, so I want any data going out to go from my web browser, through my script, and out to the intended recipient, and any data coming in to do the reverse. Ideally I'd just like to connect the incoming and outgoing sockets together and have them transfer data between each other automatically, but I haven't seen a way to do it in Ruby, so I have been using the following, which I took from "How can I create a two-way SSL socket in Ruby".
Here is my code:
def socketLoop(incoming, outgoing)
  loop do
    puts "selecting"
    ready = IO.select([outgoing, incoming])
    if ready[0].include?(incoming)
      data_to_send = incoming.read_nonblock(32768)
      outgoing.write(data_to_send)
      puts "sent out"
      puts data_to_send
    end
    if ready[0].include?(outgoing)
      data_received = outgoing.read_nonblock(32768)
      incoming.write(data_received)
      puts "read in"
      puts data_received
      break if outgoing.nil? || outgoing.closed? || outgoing.eof?
    end
  end
end

server = TCPServer.open(LISTENING_PORT)
loop {
  Thread.start(server.accept){ |incoming|
    outgoing = TCPSocket.new(TARGET_IP, TARGET_PORT)
    socketLoop(incoming, outgoing)
    outgoing.close # Disconnect from target
    incoming.close # Disconnect from the client
  }
}
It works beautifully for HTTP but for HTTPS, my browser keeps spinning, and the output seems to indicate that at least part of a certificate has been sent over, but not much more. I presume I was being naïve to think that it would work for SSL, but as far as I know it uses TCP as the transport layer so I'm not sure why it doesn't work. Is it possible to get the raw data in this way? Is it an issue with my Ruby or have I made some wrong assumptions? I'd prefer not to use a system-wide packet sniffer if possible. If it would not be easy in Ruby, I'd be very grateful for any pointers in another language too.
Thanks a lot for your help!
EDIT: It seems that I can do this easily with netcat -
sudo nc -l 443 0<backpipe | nc $TARGET_IP 443 >backpipe
so I am rather embarrassed that I didn't think of something so simple in the first place. However, I would still be interested to see what I was not doing right in Ruby.
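One plausible culprit, offered as a guess rather than a confirmed diagnosis: IO#eof? blocks on a socket until the other end sends more data or closes, so the break line can stall the loop in the middle of a TLS handshake, when the server is waiting for the client's next message. A minimal sketch of a relay that avoids eof? and treats the EOFError raised by read_nonblock as the signal to stop might look like this. The method name relay is made up for illustration.

```ruby
require 'socket'

# Copies bytes in both directions between two sockets until either side closes.
# read_nonblock raises EOFError once the peer has closed, so no eof? check is needed.
def relay(incoming, outgoing)
  peers = { incoming => outgoing, outgoing => incoming }
  loop do
    readable, = IO.select(peers.keys)
    readable.each do |source|
      begin
        data = source.read_nonblock(32_768)
      rescue IO::WaitReadable
        next   # nothing to read after all; go back to select
      rescue EOFError
        return # one side closed the connection
      end
      peers[source].write(data)
    end
  end
end
```

It would be called in place of socketLoop(incoming, outgoing). The blocking write is kept for simplicity, which is an assumption that neither side stops reading for long.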

What's the correct way to check if a host is alive and handle timeouts efficiently?

I'm trying to check if a given host is up, running, and listening to a specific port, and to handle any errors correctly.
I found a number of references on Ruby socket programming, but none of them seems able to handle socket timeouts efficiently. I tried IO.select, which takes four parameters, the last of which is the timeout value:
IO.select([TCPSocket.new('example.com', 22)], [nil], [nil], 4)
The problem is that it gets stuck, especially if the port number is wrong or the server is not listening on it. So I finally ended up with this, which I don't like much but which does the job:
require 'socket'
require 'timeout'

dns = "example.com"
begin
  Timeout::timeout(3) { TCPSocket.new(dns, 22) }
  puts "Responded!!"
  # do some stuff here...
rescue SocketError
  puts "No connection!!"
  # do some more stuff here...
rescue Timeout::Error
  puts "No connection, timed out!!"
  # do some other stuff here...
end
Is there a better way of doing this?
The best test for availability of any resource is to try to use it. Adding extra code to try to predict ahead of time whether the use will work is bound to fail:
1) You test the wrong thing and get a different answer.
2) You test the right thing but at the wrong time, and the answer changes between the test and the use, and your application performs double the work for nothing, and you write redundant code.
The code you have to write to handle the test failure is identical to the code you should write to handle the use-failure. Why write that twice?
We make extensive use of Net::SSH in one of our systems, and ran into timeout issues.
Probably the biggest fix was to use the select method to set a low-level timeout, rather than the Timeout class, which is thread based.
"How do I set the socket timeout in Ruby?" and "Set socket timeout in Ruby via SO_RCVTIMEO socket option" have code to investigate for that. Also, one of those links to "Socket Timeouts in Ruby", which has useful code; however, be aware that it was written for Ruby 1.8.6.
The version of Ruby can make a difference too. Pre-1.9, the threading wasn't capable of stopping a blocking IP session, so the code would hang until the socket timed out, and only then would the Timeout fire. Both of the above questions go over that.
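On Ruby 2.0 and later, one way to get a connect timeout without the Timeout class is the standard library's Socket.tcp, which accepts a connect_timeout option. The following is only a sketch; the method name port_open? is made up for illustration.

```ruby
require 'socket'

# Returns true if a TCP connection to host:port can be opened within
# `timeout` seconds, false on DNS failure, refusal, unreachability, or timeout.
def port_open?(host, port, timeout: 3)
  # With a block, Socket.tcp closes the socket afterwards and returns the block's value.
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SocketError, SystemCallError
  # SocketError covers name resolution; SystemCallError covers the Errno::* family
  # (ECONNREFUSED, EHOSTUNREACH, ETIMEDOUT, ...).
  false
end
```

For example, `port_open?("example.com", 22)` could replace the begin/rescue block in the question. As the first answer points out, a successful check still does not guarantee that the later real use will succeed.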

What is the best way to simulate no Internet connection within a Cucumber test?

Part of my command-line Ruby program involves checking if there is an internet connection before any commands are processed. The actual check in the program is trivial (using Socket::TCPSocket), but I'm trying to test this behaviour in Cucumber for an integration test.
The code:
def self.has_internet?(force = nil)
  return force unless force.nil?
  begin
    TCPSocket.new('www.yelp.co.uk', 80)
    return true
  rescue SocketError
    return false
  end
end

if has_internet? == false
  puts("Could not connect to the Internet!")
  exit 2
end
The feature:
Scenario: Failing to log in due to no Internet connection
Given the Internet is down
When I run `login <email_address> <password>`
Then the exit status should be 2
And the output should contain "Could not connect to the Internet!"
I obviously don't want to change the implementation to fit the test, and I require all my scenarios to pass. Clearly if there is actually no connection, the test passes as it is, but my other tests fail as they require a connection.
My question: How can I test for this in a valid way and have all my tests pass?
You can stub your has_internet? method and return false in the implementation of the Given the Internet is down step.
YourClass.stub!(:has_internet?).and_return(false)
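Outside of RSpec, the same idea can be sketched in plain Ruby by temporarily replacing the method and restoring it afterwards. NetCheck and with_internet_down are made-up names for illustration, and the replacement only affects the current Ruby process, so it will not reach a program that the scenario launches as a separate command.

```ruby
# Hypothetical stand-in for the class that owns has_internet?.
module NetCheck
  def self.has_internet?
    true # the real method would try to open a TCPSocket here
  end
end

# Makes target.has_internet? return false for the duration of the block,
# then puts the original behaviour back, even if the block raises.
def with_internet_down(target)
  original = target.method(:has_internet?)
  target.define_singleton_method(:has_internet?) { |*| false }
  yield
ensure
  target.define_singleton_method(:has_internet?) { |*args| original.call(*args) } if original
end

with_internet_down(NetCheck) { puts NetCheck.has_internet? } # prints false
puts NetCheck.has_internet?                                  # prints true
```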
There are three alternative solutions I can think of:
1) Have the test temporarily monkeypatch TCPSocket.initialize (or maybe Socket#connect, if that's where it ends up) to pretend the internet is down.
2) Write a (suid) script that adds/removes an iptables firewall rule to disable the internet, and have your test call the script.
3) Use LD_PRELOAD on a specially written .so shared library that overrides the connect C call. This is harder.
Myself, I would probably try option 1, give up after about 5 minutes and go with option 2.
Maybe a bit late for you :), but have a look at
https://github.com/mmolhoek/vcr-uri-catcher
I made this to test network failures, so this should do the trick for you.

Handling event-stream connections in a Sinatra app

There is a great example of a chat app using Server-Sent Events by Konstantin Haase. I am trying to run it and have a problem with callbacks (I use Sinatra 1.3.2 and browse with Chrome 16). They simply do not run (e.g. after page reload), and therefore the number of connections is growing.
Also, connection is closed in 30-60 sec unless one sets a periodic timer to send empty data, as suggested by Konstantin elsewhere.
Can you replicate it? If yes, is it possible to fix these issues somehow? WebSockets work seamlessly in this respect...
# ruby
get '/stream', provides: 'text/event-stream' do
  stream :keep_open do |out|
    EventMachine::PeriodicTimer.new(20) { out << "data: \n\n" } # added
    settings.connections << out
    puts settings.connections.count # added
    out.callback { puts 'closed'; settings.connections.delete(out) } # modified
  end
end
# javascript
var es = new EventSource('/stream');
es.onmessage = function(e) { if (e.data != '') $('#chat').append(e.data + "\n") }; // modified
This was a bug in Sinatra https://github.com/sinatra/sinatra/issues/446
Neat bit of code. But you're right, WebSockets would address these problems. I think there are two problems here:
1) Your browser, the Web server, or a proxy in-between may shut down your connection after a period of time, idle or not. Your suggestion of a periodic timer sending empty data will help, but is no guarantee.
2) As far as I know, there's no built-in way to tell if/when one of these connections is still working. To keep your list of connections from growing, you're going to have to keep track of when each connection was last "used" (maybe the client should ping occasionally, and you would store this datetime.) Then add a periodic timer to check for and kill "stale" connections.
An easier, though perhaps uglier option, is to store the creation time of each connection, and kill it off after n minutes. The client should be smart enough to reconnect.
I know that takes some of the simplicity out of the code. As neat as the example is, I think it's a better candidate for WebSockets.
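The "store the creation time and kill it off after n minutes" option can be sketched as a small registry. ConnectionRegistry is a made-up name, and the only assumption about the connection objects is that they respond to close.

```ruby
# Hypothetical registry that remembers when each connection was opened.
class ConnectionRegistry
  def initialize(max_age_seconds)
    @max_age = max_age_seconds
    # connection => Time it was registered; keyed by object identity
    @opened_at = {}.compare_by_identity
  end

  def add(connection, now = Time.now)
    @opened_at[connection] = now
  end

  def delete(connection)
    @opened_at.delete(connection)
  end

  def count
    @opened_at.size
  end

  # Closes and forgets every connection older than max_age; returns those connections.
  def prune(now = Time.now)
    stale = @opened_at.select { |_, opened| now - opened > @max_age }.keys
    stale.each do |connection|
      connection.close
      @opened_at.delete(connection)
    end
    stale
  end
end
```

In the Sinatra route, `add(out)` would take the place of `settings.connections << out`, and `prune` could be called from an EventMachine::PeriodicTimer like the one already used for the keep-alive data.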

What is most efficient way to read from TCPServer?

I'm creating a ruby server which is connecting to a TCP client. My server is using a TCPServer and I'm attempting to use TCPServer::recv(), but it doesn't wait for data, so just continues in a tight loop until data is received.
What is the most efficient way to process intermittent data? I'm unable to change the data being sent in, since I'm attempting to emulate another server. Which read-like method of TCPServer/TCPSocket would wait for data to be sent?
require "socket"

dts = TCPServer.new('localhost', 20000)
s = dts.accept
print(s, " is accepted\n")
loopCount = 0;
loop do
  Thread.start(s) do
    loopCount = loopCount + 1
    lineRcvd = s.recv(1024)
    if ( !lineRcvd.empty? )
      puts("#{loopCount} Received: #{lineRcvd}")
      s.write(Time.now)
    end
  end
end
s.close
print(s, " is gone\n")
Thanks for your time.
Are you sure recv isn't returning "", meaning the socket is closed?
If not, then perhaps your sockets are set to non-blocking somehow?
EventMachine is indeed far faster than using threads for socket programming :)
GL.
-r
Based on the questions you've been asking, I think you should try a framework like EventMachine and write a server that implements what you want instead of trying to fuss around with writing a server wrapper.
That being said, the most efficient way to read from a socket is to use a proper select call and poll all the open connections. While this is a fairly academic thing for anyone who's developed client-server applications before, it is a nuisance because there are a lot of things you can easily get wrong. For example, handling multiple connections can lead to all kinds of troublesome situations if you're not especially careful to avoid blocking calls.
The EventMachine framework makes it easy to develop query/response-type servers because you can always start with a template and work from there, for example, the built-in EventMachine::Protocols::LineAndTextProtocol one works as a great basis for most.
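For readers who want to stay with plain sockets, the select call mentioned above can be sketched with the standard library alone. read_when_ready is a made-up name, and the return values (:timeout and nil) are choices made for this illustration.

```ruby
require 'socket'

# Sleeps in IO.select until `sock` has data or `timeout` seconds pass,
# so the caller never spins in a tight loop.
# Returns the data read, :timeout if nothing arrived, or nil if the peer closed.
def read_when_ready(sock, timeout, max_bytes = 1024)
  readable, = IO.select([sock], nil, nil, timeout)
  return :timeout unless readable # IO.select returns nil on timeout

  sock.read_nonblock(max_bytes)
rescue IO::WaitReadable
  :timeout # select reported readiness but no data was available
rescue EOFError
  nil      # the other end closed the connection
end
```

In the question's server this could be called in a single loop on the accepted socket `s`, stopping when it returns nil, rather than starting a new thread on every iteration.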
