Grab all digits before a colon - ruby

I'm creating a program that reads lines from a file and tries to connect to a randomly generated "proxy." What I'm trying to do is read the lines and, if the connection errors out, either:
Save them to a file called proxies_to_check.txt
Or save them to a file called bad_proxies.txt
It works how it's supposed to, and it actually works pretty well; I'm kind of impressed with myself. However, when it saves to the file it saves the IP with the port, like so:
143.54.67.231:6543
143.23.567.23:3452
9.234.21.124:5432
What I want to do is just save the ip to the file like so:
143.54.67.231
143.23.567.23
9.234.21.124
I've tried a few things, like using a regex and stripping the line (I looked up strip and it doesn't do what I thought it did). How can I go about grabbing all the digits and periods before the colon?
Source:
def check_possibles
  puts "Testing possible proxies, this will take awhile..".green.bold
  IO.read("possible_proxies.txt").each_line do |proxy|
    begin
      Timeout::timeout(6) do
        begin
          open("http://#{proxy.chomp}")
        end
        File.open("true_proxies.txt", "a+") {|s| s.puts(proxy)}
      end
    rescue Errno::ENETUNREACH, Errno::EADDRNOTAVAIL
      File.open("bad_proxies.txt", "a+"){|s| s.puts("Bad IP => #{proxy}")}
    rescue Timeout::Error, Errno::ECONNREFUSED
      File.open("proxies_to_check.txt", "a+") {|s| s.puts(proxy)}
      next
    end
  end
end

"143.54.67.231:6543".split(":")[0..-2].join
OR
"143.54.67.231:6543".split(":").first
OR, if you need a regex, maybe this will help:
Is it a valid Regular expression for IP address
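To make the options above concrete, here is a minimal sketch. The method names ip_only and ip_only_regex are made up for illustration, and it assumes every line looks like "host:port" with a single colon before the port:

```ruby
# Returns everything before the first colon, e.g. the IP without the port.
# The limit of 2 makes split cut at the first colon only.
def ip_only(proxy)
  proxy.chomp.split(":", 2).first
end

# Regex variant: grabs the leading run of digits and periods.
def ip_only_regex(proxy)
  proxy[/\A[\d.]+/]
end

puts ip_only("143.54.67.231:6543\n")
puts ip_only_regex("9.234.21.124:5432")
```

In the method from the question you would then write s.puts(ip_only(proxy)) instead of s.puts(proxy).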

Related

Delete Duplicate Lines Ruby

I'm working on a JSON file, I think. Regardless, I'm working with a lot of different hashes and fetching different values from them. This is the data:
{"notification_rule"=>
  {"id"=>"0000000",
   "contact_method"=>
    {"id"=>"000000",
     "address"=>"cod.lew@gmail.com",}
{"notification_rule"=>
  {"id"=>"000000",
   "contact_method"=>
    {"id"=>"PO0JGV7",
     "address"=>"cod.lew@gmail.com",}
Essentially, this is the type of hash I'm currently working with.
With my code, I wanted to stop duplicates of the same thing in the text file, because whenever I run it, it writes the address from both of these hashes. I understand why (it's looping over them again), but I thought this code that I added would help resolve that issue:
Final UPDATE
if jdoc["notification_rule"]["contact_method"]["address"].to_s.include?(".com")
  numbers.print "Employee Name: "
  numbers.puts jdoc["notification_rule"]["contact_method"]["address"].gsub(/@target.com/, '').gsub(/\w+/, &:capitalize)
  file_names = ['Employee_Information.txt']
  file_names.each do |file_name|
    text = File.read(file_name)
    lines = text.split("\n")
    new_contents = lines.uniq.join("\n")
    File.open(file_name, "w") { |file| file.puts new_contents }
  end
else
  nil
end
This code looks really confused and lacking a specific purpose. Generally Ruby that's this tangled up is on the wrong track, as with Ruby there's usually a simple way of expressing something simple, and testing for duplicated addresses is one of those things that shouldn't be hard.
One of the biggest sources of confusion is the responsibility of a chunk of code. In that example you're not only trying to import data, loop over documents, clean up email addresses, and test for duplicates, but somehow facilitate printing out the results. That's a lot of things going on all at once, and they all have to work perfectly for that chunk of code to be fully operational. There's no way of getting it partially working, and no way of knowing if you're even on the right track.
Always try and break down complex problems into a few simple stages, then chain those stages together as necessary.
Here's how you can define a method to clean up your email addresses:
def address_scrub(address)
  address.gsub(/@target\.com/, '').gsub(/\w+/, &:capitalize)
end
That can be adjusted as necessary and tested to ensure it's working correctly, which you can now do independently of the other code.
As for the rest, it looks like this:
require 'set'
# Read in duplicated addresses from a file, clean up with chomp, using a Set
# for fast lookups. File.readlines closes the file when it is done.
duplicates = Set.new(
  File.readlines("Employee_Information.txt").map(&:chomp)
)
# Extract addresses from jdoc document array
filtered = jdocs.map do |jdoc|
  # Convert to jdoc/address pair
  [ jdoc, address_scrub(jdoc["notification_rule"]["contact_method"]["address"]) ]
end.reject do |jdoc, address|
  # Remove any that are already in the duplicates list
  duplicates.include?(address)
end.map do |jdoc, _|
  # Return only the document
  jdoc
end
Where that processes jdocs, an array of jdoc structures, and removes duplicates in a series of simple steps.
With the chaining approach you can see what's happening before you add on the next "link", so you can work incrementally towards a solution, adjusting as you go. Any mistakes are fairly easy to catch because you're able to, at any time, inspect the intermediate products of those stages.
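As a rough, self-contained illustration of the same chaining idea, the sketch below also drops repeats within the batch itself, which the code above does not do. The names scrub and new_names are hypothetical, and scrub strips any domain rather than just one, which is an assumption about what you want:

```ruby
require 'set'

# Hypothetical scrubber: drop everything from the "@" on, then capitalize each word.
def scrub(address)
  address.sub(/@.*\z/, '').gsub(/\w+/, &:capitalize)
end

# Returns scrubbed names from jdocs that are neither in `known` nor repeated.
def new_names(jdocs, known = [])
  seen = Set.new(known)
  jdocs
    .map { |jdoc| scrub(jdoc.dig("notification_rule", "contact_method", "address").to_s) }
    .reject(&:empty?)                  # skip documents without an address
    .select { |name| seen.add?(name) } # add? returns nil if the name was already seen
end
```

Each link in the chain can be run and inspected on its own, so you can check the scrubbed names before adding the duplicate filter.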

Asking user for information, and never having to ask again

I want to ask for user input, but I only want to do it once (possibly save the information within the program), meaning, something like this:
print "Enter your name (you will only need to do this once): "
name = gets.chomp
str = "Hello there #{name}" #<= As long as the user has put their name in the very first
# time the program was run, I want them to never have to put their name in again
How can I go about doing this within a Ruby program?
This program will be run by multiple users throughout the day on multiple systems. I've attempted to store it in memory, but obviously that failed because, from my understanding, that memory is wiped every time a Ruby program stops executing.
My attempts:
def capture_user
  print 'Enter your name: '
  name = gets.chomp
end
#<= works but user has to put in name multiple times
def capture_name
  if File.read('name.txt') == ''
    print "\e[36mEnter name to appear on email (you will only have to do this once):\e[0m "
    @esd_user = gets.chomp
    File.open('name.txt', 'w') { |s| s.puts(@esd_user) }
  else
    @esd_user = File.read('name.txt')
  end
end
#<= works but there has to be a better way to do this?
require 'tempfile'
def capture_name
  file = Tempfile.new('user')
  if File.read(file) == ''
    print "\e[36mEnter name to appear on email (you will only have to do this once):\e[0m "
    @esd_user = gets.chomp
    File.open(file, 'w') { |s| s.puts(@esd_user) }
  else
    @esd_user = File.read(file)
  end
end
#<= Also used a tempfile, this is a little bit overkill I think,
# and doesn't really help because the users can't access their Appdata
You will want to store the username in a file on the local file system. Ruby provides many ways to do this, and we'll explore one in this answer: YAML files.
YAML is a structured storage format that can hold all kinds of different data, and it is a good place to store config data. In fact, YAML configuration files are key parts of some of the largest Ruby projects in existence; Rails, for example, keeps its database settings in one. YAML gives you a good starting point for supporting future configuration needs beyond the current one, which is a great way to plan feature development.
So, how does it work? Let's take a look at your requirement using a YAML config:
require 'yaml'
config_filename = "config.yml"
config = {}
name = nil
if File.exist?(config_filename)
  begin
    # An empty file loads as nil or false, so fall back to an empty hash
    config = YAML.load_file(config_filename) || {}
    name = config["name"]
  rescue Psych::SyntaxError, ArgumentError => e
    puts "Unable to parse the YAML config file."
    puts "Would you like to proceed?"
    proceed = gets.chomp
    # Allow the user to type things like "N", "n", "No", "nay", "nyet", etc to abort
    if proceed.length > 0 && proceed[0].upcase == "N"
      abort "User chose not to proceed. Aborting!"
    end
  end
end
if name.nil? || (name.strip.length == 0)
  print "Enter your name (you will only need to do this once): "
  name = gets.chomp
  # Store the name in the config (in memory)
  config["name"] = name
  # Convert config hash to a YAML config string
  yaml_string = config.to_yaml
  # Save the YAML config string to the config file
  File.open(config_filename, "w") do |out|
    out.write(yaml_string)
  end
end
Rather than show you the bare minimum to meet your needs, this code includes a little error handling and some simple safety checks on the config file. It may well be robust enough for you to use immediately.
The very first bit simply requires the YAML standard library. This makes the YAML functions work in your program. If you have a loader file or some other common mechanism like that, simply place the require 'yaml' there.
After that, we initialize some variables that get used in this process. You should note that the config_filename has no path information in it, so it will be read from the current directory. You will likely want to store the config file in a common place, such as in ~/.my-program-name/config.yml or C:\Documents and Settings\MyUserName\Application Data\MyProgramName\. This can be done pretty easily, and there's plenty of guidance available, such as Location to Put User Config Files in Windows and Location of ini/config files in linux/unix.
Next, we check to see if the file actually exists, and if so, we attempt to read the YAML contents from it. The YAML.load_file() method handles all the heavy lifting here, so you just have to ask the config hash that's returned for the key that you're interested in, in this case, the "name" key.
If an error occurs while reading the YAML file, it indicates that the file might possibly be corrupted, so we try to deal with that. YAML files are easy to edit by hand, but when you do that, you can also easily introduce an error that will make loading the YAML file fail. The error handling code here will allow the user to abort the program and go back to fix the YAML file, so that it doesn't simply get overwritten.
After that, we check whether we got a valid name from the YAML config, and if not, we go ahead and ask the user for it. Once they've entered a name, we add it to the config hash, convert the hash to a YAML-formatted string, and then write that string to the config file.
And that's all it takes. Just about anything that you can store in a Ruby hash, you can store in a YAML file. That's a lot of power for storing config information, and if you later need to add more config options, you have a versatile container that you can use exactly for that purpose.
If you want to do any further reading on YAML, you can find some good information here:
YAML in Ruby Tutorial on Robot Has No Heart
Jamming with Ruby YAML on Juixe Techknow
YAML on Struggling with Ruby
While some of these articles are a bit older, they're still very relevant and will give you a jumping off point for further reading. Enjoy!
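If you do move the config into a per-user directory as suggested earlier, the load and save steps could be wrapped up roughly like this. It is only a sketch: the method names and the ~/.my-program-name location are placeholders, and the prompt is passed in as a block so the file handling can be tried out without typing anything:

```ruby
require 'yaml'
require 'fileutils'

# Placeholder location; adjust to wherever your program keeps per-user data.
def default_config_path
  File.join(Dir.home, ".my-program-name", "config.yml")
end

# Returns the stored config hash, or {} when the file is missing or empty.
def load_config(path)
  return {} unless File.exist?(path)
  data = YAML.load_file(path)
  data.is_a?(Hash) ? data : {}
end

# Writes the hash as YAML, creating the parent directory if needed.
def save_config(path, config)
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, config.to_yaml)
end

# Returns the saved name; runs the block to ask for it only when none is stored.
def fetch_name(path)
  config = load_config(path)
  name = config["name"].to_s.strip
  if name.empty?
    name = yield.to_s.strip
    save_config(path, config.merge("name" => name))
  end
  name
end

# Usage (not run here because it waits for input):
#   name = fetch_name(default_config_path) do
#     print "Enter your name (you will only need to do this once): "
#     gets
#   end
```

Because each user has their own home directory, this keeps one saved name per user on each system.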
If you need the name to persist across the user running the script several times, you're going to need to use some sort of data store. As much as I hate flat files, if all you're storing is the user's name, I think this is a valid option.
if File.exist?('username.txt')
  name = File.open( 'username.txt', 'r' ) do |file|
    # gets keeps the trailing newline, so strip it off
    file.gets.to_s.chomp
  end
else
  print "Enter your name (you will only need to do this once): "
  name = gets.chomp
  File.open( 'username.txt', 'w' ) do |file|
    file.puts name
  end
end
str = "Hello there #{name}"

Uploading and parsing text document in Rails

In my application, the user must upload a text document, the contents of which are then parsed by the receiving controller action. I've gotten the document to upload successfully, but I'm having trouble reading its contents.
There are several threads on this issue. I've tried more or less everything recommended on these threads, and I'm still unable to resolve the problem.
Here is my code:
file_data = params[:file]
contents = ""
if file_data.respond_to?(:read)
  contents = file_data.read
else
  if file_data.respond_to?(:path)
    File.open(file_data, 'r').each_line do |line|
      elts = line.split
      #
      #
    end
  end
end
So here are my problems:
file_data doesn't 'respond_to?' either :read or :path. According to some other threads on the topic, if the uploaded file is less than a certain size, it's interpreted as a string and will respond to :read. Otherwise, it should respond to :path. But in my code, it responds to neither.
If I try to take out the if statements and straight away attempt File.open(file_data, 'r'), I get an error saying that the file wasn't found.
Can someone please help me find out what's wrong?
PS, I'm really sorry that this is a redundant question, but I found the other threads unhelpful.
Are you actually storing the file? Because if you are not, of course it can't be found.
First, find out what you're actually getting for file_data by adding debug output of file_data.inspect. It may be something you don't expect, especially if the form isn't set up correctly (i.e. it is missing :multipart => true).
Rails should wrap the uploaded file in a special object that provides a uniform interface, so something as simple as this should work:
file_data.read.each_line do |line|
  elts = line.split
  #
  #
end
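To try the parsing step outside of a controller, something like the sketch below may help. The method name parse_upload is hypothetical, and a StringIO stands in for the uploaded file object, on the assumption that both respond to read:

```ruby
require 'stringio'

# Splits every non-blank line of an upload into whitespace-separated fields.
# `upload` is anything that responds to #read (a StringIO stands in for it here).
def parse_upload(upload)
  unless upload.respond_to?(:read)
    raise ArgumentError, "expected an uploaded file, got #{upload.class}"
  end
  upload.read.each_line.map { |line| line.split }.reject(&:empty?)
end

rows = parse_upload(StringIO.new("alpha beta\n\ngamma delta epsilon\n"))
rows.each { |fields| puts fields.inspect }
```

If the ArgumentError fires with String as the class, the form is most likely not being submitted as multipart, so only the file name arrives.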

Parse huge file (10+gb) and write content in another one

I'm trying to use Sphinx Search Server to index a really huge file (around 14gb).
The file is whitespace separated, one entry per line.
To be able to use it with Sphinx, I need to provide an XML file to the Sphinx server.
How can I do it without killing my computer?
What is the best strategy? Should I try to split the main file in several little files? What's the best way to do it?
Note: I'm doing it in Ruby, but I'm totally open to other hints.
Thanks for your time.
I think the main idea would be to parse the main file line by line, while generating a result XML. And every time it gets large enough, to feed it to Sphinx. Rinse and repeat.
What parsing do you need to do? If the transformations are restricted to just one line in the input at once and not too complicated, I would use awk instead of Ruby...
I hate it when people don't post the solution after asking a question, so I'll try not to be one of them; hopefully this will help somebody.
I added a simple reader method to the File class then used it to loop on the file based on a chunk size of my choice. Quite simple actually, working like a charm with Sphinx.
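class File
  # New static method
  # (a nil chunk_size would make read return the whole file at once,
  # so default to a real size here as well)
  def self.seq_read(file_path, chunk_size = 1.kilobyte)
    open(file_path, "rb") do |f|
      f.each_chunk(chunk_size) do |chunk|
        yield chunk
      end
    end
  end
  # New instance method
  # (1.kilobyte and 10.megabytes come from ActiveSupport's numeric extensions)
  def each_chunk(chunk_size = 1.kilobyte)
    yield read(chunk_size) until eof?
  end
end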
Then just use it like this:
source_path = "./my_very_big_file.txt"
CHUNK_SIZE = 10.megabytes
File.seq_read(source_path, CHUNK_SIZE) do |chunk|
  chunk.each_line do |line|
    ...
  end
end
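One caveat with fixed-size chunks is that a chunk can end in the middle of a line, so each_line may hand you two halves of one entry. If every entry sits on its own line, reading line by line avoids that and still keeps memory use flat. The sketch below is one possible shape for it: convert_to_xml is a made-up name, the element names follow my understanding of Sphinx's xmlpipe2 layout, and the "content" field is an assumption that would still need to be declared in a schema or in the source configuration:

```ruby
require 'cgi'

# Streams a whitespace-separated file into an XML docset one line at a time.
# Returns the number of documents written.
def convert_to_xml(source_path, target_path)
  count = 0
  File.open(target_path, "w") do |out|
    out.puts '<?xml version="1.0" encoding="utf-8"?>'
    out.puts '<sphinx:docset>'
    # File.foreach reads a single line per iteration instead of the whole file
    File.foreach(source_path) do |line|
      fields = line.split
      next if fields.empty?
      count += 1
      out.puts %(<sphinx:document id="#{count}">)
      # Escape <, > and & so the entry cannot break the XML
      out.puts "<content>#{CGI.escapeHTML(fields.join(' '))}</content>"
      out.puts '</sphinx:document>'
    end
    out.puts '</sphinx:docset>'
  end
  count
end
```

Writing to a file is only for illustration; the same loop could print to standard output if the indexer reads the XML from a command.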

Ruby open returning a string instead of a file?

When trying to open() remote images, some return as StringIO and others return as File...how do I force the File?
data = open("http://graph.facebook.com/61700024/picture?type=square")
=> #<StringIO:0x007fd09b013948>
data = open("http://28.media.tumblr.com/avatar_7ef57cb42cb0_64.png")
=> #<StringIO:0x007fd098bf9490>
data = open("http://25.media.tumblr.com/avatar_279ec8ee3427_64.png")
=> #<File:/var/folders/_z/bb18gdw52ns0x5r8z9f2ncj40000gn/T/open-uri20120229-9190-mn52fu>
I'm using Paperclip to save remote images (which are stored in S3), so basically wanting to do:
user = User.new
user.avatar = open(url)
user.save
Open-URI has a 10KB limit on StringIO objects; anything above that is stored as a temp file.
One way to get past this is to change the constant that Open-URI uses for the StringIO limit. You can do this by setting the constant to 0:
OpenURI::Buffer.send :remove_const, 'StringMax' if OpenURI::Buffer.const_defined?('StringMax')
OpenURI::Buffer.const_set 'StringMax', 0
Add that to your initialiser and you should be good to go.
While steiger's solution is a simple all-around solution, some of us might be put off by the "nasty hack" feeling of it and the way it changes behaviour globally, including for other gems that might benefit from or depend on this feature of OpenURI. Of course, you could also use the above approach and then reset the constant back to its original value when you're done; because of the GIL you might get away with that sort of nastiness as well (though be sure to stay away from JRuby and threads then!).
Alternatively you could do something like this, which basically ensures that if you get a stream it's piped to a temp file:
require 'tempfile'
require 'mime/types' # from the mime-types gem
def write_stream_to_a_temp_file(stream)
  ext = begin
    "."+MIME::Types[stream.meta["content-type"]].first.extensions.first
  rescue # In case meta data is not available
    # It seems sometimes the content-type is binary/octet-stream
    # In this case we should grab the original ext name.
    File.extname(stream.base_uri.path)
  end
  file = Tempfile.new ["temp", ext]
  begin
    file.binmode
    file.write stream.read
  ensure
    file.flush rescue nil
    file.close rescue nil
  end
  file
end
# and when you want to enforce that data must be a temp file then just...
data = write_stream_to_a_temp_file data unless data.is_a? Tempfile
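If you don't need the extension guessing, a smaller variant of the same idea can be tried without any network access. This is only a sketch: ensure_tempfile is a made-up name, and a StringIO stands in for a small open-uri response:

```ruby
require 'tempfile'
require 'stringio'

# Returns io unchanged if it is already a Tempfile; otherwise copies it into one.
# The returned file is left open and rewound so it can be read straight away.
def ensure_tempfile(io, ext = "")
  return io if io.is_a?(Tempfile)
  file = Tempfile.new(["temp", ext])
  file.binmode
  IO.copy_stream(io, file) # streams the data across without building one big string
  file.flush
  file.rewind
  file
end

small = StringIO.new("fake image bytes")
file = ensure_tempfile(small, ".png")
puts file.path
```

Unlike the method above, this hands back an open file, which may or may not be what Paperclip expects in your version, so treat it as a starting point.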
