Order email address file by last name in Ruby?

I have a file that lists addresses line by line, like this:
first.last@example.com
first.last@example.com
last@example.com...
Note that some of the addresses don't have a first name, in which case the part before the @ is just the last name.
How can I write a simple Ruby script to read in this file (call it email.txt)
and write it back to the file in sorted order by last name?

Put this in a file, e.g. sort_by_last.rb:
puts IO.readlines('email.txt').sort_by { |e| e.match(/[^\.]+(?=@)/)[0].downcase }
then run it:
ruby sort_by_last.rb > emails_sorted.txt
To read from any file instead of a hard-coded name, set the contents of sort_by_last.rb to
puts STDIN.readlines.sort_by { |e| e.match(/[^\.]+(?=@)/)[0].downcase }
then run:
ruby sort_by_last.rb < email.txt > emails_sorted.txt
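The question asks for the result to be written back to the same file, which the one-liners above leave to shell redirection. A minimal sketch of that, assuming the last name is whatever follows the final dot of the part before the domain; the method names are made up for illustration, blank lines are dropped, and both `@` and `#` are accepted as the separator before the domain:

```ruby
# Sketch: sort address lines by last name and write them back in place.
# last_name_key and sort_file_by_last_name! are hypothetical helper names.

# Returns the lower-cased last name used as the sort key.
def last_name_key(line)
  local = line.strip.split(/[@#]/, 2).first.to_s  # part before the domain
  local.split('.').last.to_s.downcase             # text after the final dot
end

# Reads the file, sorts its non-blank lines and overwrites the file.
def sort_file_by_last_name!(path)
  lines  = File.readlines(path, chomp: true).reject { |l| l.strip.empty? }
  sorted = lines.sort_by { |l| last_name_key(l) }
  File.write(path, sorted.join("\n") + "\n")
  sorted
end
```

Calling `sort_file_by_last_name!('email.txt')` would then rewrite email.txt itself, so it is worth keeping a backup copy while trying it out.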

Related

A JSON text must at least contain two octets! (JSON::ParserError)

I'm working with a Ruby script that reads a .json file.
Here is the JSON file:
{
"feed.xml": "93d5b140dd2b4779edef0347ac835fb1",
"index.html": "1cbe25936e392161bad6074d65acdd91",
"md5.json": "655d7c1dbf83a271f348a50a44ba4f6a",
"test.sh": "9be192b1b5a9978cb3623737156445fd",
"index.html": "c064e204040cde216d494776fdcfb68f",
"main.css": "21b13d87db2186d22720e8c881a78580",
"welcome-to-jekyll.html": "01d7c7d66bdeecd9cd69feb5b4b4184d"
}
It is completely valid, and is checked for its existence before trying to read from it. Example:
if File.file?("md5.json")
  puts "MD5s exists"
  mddigests = File.open("md5.json", "r")
  puts "MD5s" + mddigests.read
  items = JSON.parse(mddigests.read) # <--- Where it all goes wrong.
  puts items["feed.xml"]
end
Everything works up until that point:
MD5s exists
MD5s{
"feed.xml": "93d5b140dd2b4779edef0347ac835fb1",
"index.html": "1cbe25936e392161bad6074d65acdd91",
"md5.json": "655d7c1dbf83a271f348a50a44ba4f6a",
"test.sh": "9be192b1b5a9978cb3623737156445fd",
"index.html": "c064e204040cde216d494776fdcfb68f",
"main.css": "21b13d87db2186d22720e8c881a78580",
"welcome-to-jekyll.html": "01d7c7d66bdeecd9cd69feb5b4b4184d"
}
common.rb:156:in `initialize': A JSON text must at least contain two octets! (JSON::ParserError)
I've searched and tried a lot of different things, to no avail. I'm stumped. Thanks!
You call read() twice, and the second call is the point where it all goes wrong. Store the result of the first read() in a variable and pass that variable to JSON.parse instead of calling read() again, and all should be fine.
This code should work like you'd expect:
if File.file?("md5.json")
  puts "MD5s exists"
  mddigests = File.open("md5.json", "r")
  digests = mddigests.read
  puts "MD5s" + digests
  items = JSON.parse(digests) #<--- This should work now!
  puts items["feed.xml"]
end
The reason is that the first read() moves the file pointer to the end of the file, so the second read() returns an empty string, and JSON.parse then complains that it needs at least 2 octets.
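To see the file pointer behaviour in isolation, here is a small sketch. It uses a temporary file instead of the real md5.json, and read_twice and load_digests are hypothetical helper names:

```ruby
require 'json'
require 'tempfile'

# Sketch: shows why a second read on the same handle comes back empty.
def read_twice(path)
  File.open(path, 'r') do |f|
    first  = f.read   # consumes the whole file; the pointer is now at EOF
    second = f.read   # nothing left to read, so this is an empty string
    f.rewind          # move the pointer back to the start
    third  = f.read   # full contents again
    [first, second, third]
  end
end

# Reads the file once (File.read also closes it) and parses the result.
def load_digests(path)
  JSON.parse(File.read(path))
end

Tempfile.create(['md5', '.json']) do |tmp|
  tmp.write('{"feed.xml": "93d5b140dd2b4779edef0347ac835fb1"}')
  tmp.flush
  first, second, third = read_twice(tmp.path)
  p second           # => ""
  p first == third   # => true
  puts load_digests(tmp.path)['feed.xml']
end
```

File.read is usually the simplest choice here, since it avoids keeping an open handle around at all.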

How to sequentially create multiple CSV files in Ruby?

Silly question, but I want to do some processing on a dataset and put them into different CSVs, like UDID1.csv, UDID2.csv, ..., UDID1000.csv. So this is my code:
for i in 1..1000
  logfile = File.new('C:\Users\hp1\Desktop\Datasets\New File\UDID#{i}\.csv',"a")
  #I'll do some processing here
end
But the program throws an error when running because of the UDID#{i} part. So, how to overcome this issue? Thanks.
Edit: This is the error:
in `initialize': No such file or directory @ rb_sysopen - C:\Users\hp1\Desktop\Datasets\New File\udid#{1}\.csv (Errno::ENOENT)
from C:/Ruby21/bin/hashedUDID.rb:38:in `new'
from C:/Ruby21/bin/hashedUDID.rb:38:in `<main>'
The single quotes (') are one problem: inside them, #{i} is not interpolated. The path is another problem.
As posted, New File must already exist as a directory, and because of the backslash before .csv it must also contain directories such as UDID0001, each of which would then get a file named .csv.
A corrected version (I avoid the un-Ruby-like for loop):
1.upto(1000) do |i|
  logfile = File.new("C:\\Users\\hp1\\Desktop\\Datasets\\UDID#{i}.csv", "a")
  #I'll do some processing here
  logfile.close #Don't forget to close the file
end
Inside double quotes (") the backslash must be escaped (\\). Alternatively, you can use / as the path separator:
logfile = File.new("C:/Users/hp1/Desktop/Datasets/New File/UDID#{i}.csv", "a")
Another possibility is a format string with %i to insert the number:
logfile = File.new("C:/Users/hp1/Desktop/Datasets/New File/UDID%02i.csv" % i, "a")
I prefer File.open with a block, because the file is then closed automatically at the end of the block:
File.open("C:/Users/hp1/Desktop/Datasets/New File/UDID%04i.csv" % i, "a") do |logfile|
  #I'll do some processing here
end #closes the file
Warning:
I'm not sure whether you really want to create 1000 log files (the file is opened inside the loop, so each iteration creates a file).
If so, the %04i version has the advantage that all file names get the same number of digits (starting with 0001 and ending with 1000).
(1..10).each { |i| logfile = File.new("/base/path/UDID#{i}.csv", "a") }
You must use double quotes (") when you need string interpolation.
#{} is only interpolated in double-quoted strings ("), and inside them every backslash has to be doubled. So change your code to:
for i in 1..1000
  logfile = File.new("C:\\Users\\hp1\\Desktop\\Datasets\\New File\\UDID#{i}.csv","a")
  # other stuff
end
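To make the difference between the quoting styles concrete, here is a small sketch that can be run anywhere. It writes into a temporary directory rather than the Windows path from the question, and csv_path is a hypothetical helper name:

```ruby
require 'tmpdir'

# Sketch: build numbered CSV file names, optionally zero-padded to a width.
def csv_path(dir, i, width: nil)
  name = width ? format("UDID%0#{width}d.csv", i) : "UDID#{i}.csv"
  File.join(dir, name)   # File.join avoids hand-written separators
end

single = 'UDID#{1}.csv'  # single quotes: the text is kept literally
double = "UDID#{1}.csv"  # double quotes: #{...} is evaluated
puts single              # prints UDID#{1}.csv
puts double              # prints UDID1.csv

# The temporary directory stands in for the real dataset folder.
Dir.mktmpdir do |dir|
  1.upto(3) do |i|
    File.open(csv_path(dir, i, width: 4), 'a') do |logfile|
      logfile.puts "row for #{i}"   # placeholder for the real processing
    end
  end
  puts Dir.children(dir).sort       # UDID0001.csv, UDID0002.csv, UDID0003.csv
end
```

Building the path with File.join also sidesteps the backslash escaping question entirely, since Ruby accepts forward slashes on Windows.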

Ruby - CSV works while SmarterCSV doesn't

I want to open a csv file using SmarterCSV.process
market_csv = SmarterCSV.process(market)
p "just read #{market_csv}"
The problem is that the data is not read and this prints:
[]
However, if I attempt the same thing with the default CSV library, the content of the file is read (the following print statement prints the file).
CSV.foreach(market) do |row|
  p row
end
The content of the file I was reading is of the form:
Date,Close
03/06/15,0.1634
02/06/15,0.1637
01/06/15,0.1638
31/05/15,0.1638
The problem could come from the line separator, which differs between systems ("\r\n" on Windows, "\n" on Unix, "\r" on classic Mac OS). Try to identify the separator your file uses and specify it in SmarterCSV.process like this:
market_csv = SmarterCSV.process(market, row_sep: "\r")
p "just read #{market_csv}"
or like this:
market_csv = SmarterCSV.process(market, row_sep: :auto)
p "just read #{market_csv}"
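If it is unclear which separator the file actually uses, a quick check with plain Ruby can help before passing row_sep. This is only a counting heuristic under the assumption that one separator dominates the file; it is not how SmarterCSV's own :auto detection works, and the method names are made up:

```ruby
# Sketch: guess the row separator of some CSV text by counting candidates.
def guess_row_sep(text)
  crlf = text.scan("\r\n").size
  cr   = text.count("\r") - crlf   # carriage returns not part of "\r\n"
  lf   = text.count("\n") - crlf   # line feeds not part of "\r\n"
  counts = { "\r\n" => crlf, "\n" => lf, "\r" => cr }
  sep, n = counts.max_by { |_, c| c }
  n.zero? ? nil : sep              # nil when no separator was found at all
end

# Reads the file in binary mode so no newline translation happens.
def guess_row_sep_of_file(path)
  guess_row_sep(File.binread(path))
end

p guess_row_sep("Date,Close\r03/06/15,0.1634\r")   # => "\r"
```

The result of `guess_row_sep_of_file(market)` could then be passed as the row_sep option shown above.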

Ruby program which sorts images into different directories by their names?

I would like to make a Ruby program which sorts the images in the current directory into different subfolders, for example:
tree001.jpg, ... tree131.jpg -> to folder "tree"
apple01.jpg, ... apple20.jpg -> to folder "apple"
plum1.jpg, plum2.jpg, ... plum33.jpg -> to folder "plum"
and so on. The program should automagically recognize which files belong together by their names. I have no clue how to achieve this. So far I have written a small program that collects the files into an array with Dir and sorts it alphabetically, to help find the appropriate classes by file name. Does anybody have a good idea?
Check out Find:
http://www.ruby-doc.org/stdlib-2.0/libdoc/find/rdoc/Find.html
Or Dir.glob:
http://ruby-doc.org/core-2.0/Dir.html#method-c-glob
For instance:
Dir.glob("*.jpg")
will return an array that you can iterate with each.
I'd go about it something like this:
require 'fileutils'
files = %w[
  tree001.jpg tree03.jpg tree9.jpg
  apple1.jpg apple002.jpg
  plum3.jpg plum300.jpg
].shuffle
# => ["tree001.jpg", "apple1.jpg", "tree9.jpg", "plum300.jpg", "apple002.jpg", "plum3.jpg", "tree03.jpg"]
grouped_files = files.group_by{ |fn| fn[/^[a-z]+/i] }
# => {"tree"=>["tree001.jpg", "tree9.jpg", "tree03.jpg"], "apple"=>["apple1.jpg", "apple002.jpg"], "plum"=>["plum300.jpg", "plum3.jpg"]}
grouped_files.each do |grp, files|
  Dir.mkdir(grp) unless Dir.exist?(grp)
  files.each { |f| FileUtils.mv(f, "#{grp}/#{f}") }
end
I can't test that because I don't have all the files, nor am I willing to generate them.
The important thing is group_by. It makes it easy to group the similarly named files, making it easy to walk through them.
For your case, you'll want to replace the assignment to files with Dir.glob(...) or Dir.entries(...) to get your list of files.
If you want to separate the file path from the file name, look at File.split or File.dirname and File.basename:
File.split('/path/to/foo')
# => ["/path/to", "foo"]
File.dirname('/path/to/foo')
# => "/path/to"
File.basename('/path/to/foo')
# => "foo"
Assuming every file name starts with non-digit characters followed by at least one digit character, and the initial non-digit characters define the directory you want the file moved to:
require 'fileutils'
Dir.glob("*").select{|f| File.file? f}.each do |file| # For each regular file
  dir = file.match(/[^\d]*/).to_s # Determine destination directory
  FileUtils.mkdir_p(dir) # Make directory if necessary
  FileUtils.mv(file, dir) # Move file
end
The directories are created if necessary. You can run it again after adding files. For example, if you added the file tree1.txt later and re-ran this, it would be moved to tree/ where tree001.jpg through tree131.jpg already are.
Update: In the comments, you added the requirement that you only want to do this for files which form groups of at least 10. Here's one way to do that:
require 'fileutils'
MIN_GROUP_SIZE = 10
reg_files = Dir.glob("*").select{|f| File.file? f}
reg_files.group_by{|f| f.match(/[^\d]*/).to_s}.each do |dir, files|
  next if files.size < MIN_GROUP_SIZE
  FileUtils.mkdir_p(dir)
  files.each do |file|
    FileUtils.mv(file, dir)
  end
end
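Since moving files around is hard to undo, it may help to try the idea in a scratch directory first. The following is a sketch along the same lines as the answers above, with two assumptions of my own: sort_into_folders is a hypothetical name, and files without a non-digit prefix followed by a digit (such as notes.txt) are skipped rather than moved.

```ruby
require 'fileutils'
require 'tmpdir'

# Sketch: move files in root into folders named after their leading
# non-digit prefix. Returns a hash of prefix => moved file names.
def sort_into_folders(root, min_group_size: 1)
  names  = Dir.children(root).select { |f| File.file?(File.join(root, f)) }
  groups = names.group_by { |f| f[/\A\D+(?=\d)/] }  # nil if no prefix+digit
  moved  = {}
  groups.each do |prefix, files|
    next if prefix.nil? || files.size < min_group_size
    target = File.join(root, prefix)
    FileUtils.mkdir_p(target)                       # create folder if needed
    files.each { |f| FileUtils.mv(File.join(root, f), target) }
    moved[prefix] = files.sort
  end
  moved
end

# Try it on empty placeholder files in a temporary directory.
Dir.mktmpdir do |root|
  %w[tree001.jpg tree03.jpg apple1.jpg plum3.jpg notes.txt].each do |f|
    FileUtils.touch(File.join(root, f))
  end
  p sort_into_folders(root)
end
```

Passing `min_group_size: 10` gives the behaviour from the update, where only groups of at least 10 files are moved.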

How do I remove "hidden" characters when reading a line of text in Ruby?

I am using a custom Ruby function in Puppet to read a string of text from a file. I am then comparing whatever version is read against a list of known versions to determine which config file I should use for that particular server. The problem is that when I compare the read version to my list of known versions, none of them match.
I printed out the variable to the screen, and it looked fine. I then added a '-' to the beginning and the end and this time, the following was printed
-2.2#012-
Does anyone know what this is and how it could be removed?
Here is my process.
A script that handles the installation of an app
sudo echo "2.2" > /opt/version
My ruby function
if FileTest.exists?("/opt/version")
  Facter.add("app_version") do
    setcode do
      version = File.open('/opt/version', &:readline)
      version
    end
  end
end
My puppet manifest
if versioncmp( $app_version, '2.2') == 0 {
notice("===> Installing 2.2 Configs")
} elsif versioncmp ($app_version, '2.3') == 0 {
notice("===> Installing 2.3 Configs")
} else {
notice("===> No version match. Continuing on.")
}
}
readline returns the line including its line termination (in your case, "\n"). chomp will get rid of the line termination:
version = File.open('/opt/version', &:readline).chomp
When debugging and you want to see what's really in a variable, use p instead of puts. p will escape unprintable characters so you can see them:
puts "2.2\n" # => 2.2
#
p "2.2\n" # => "2.2\n"
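A short sketch of the same fix that runs without /opt/version: StringIO stands in for the file here, and read_version is a hypothetical helper name.

```ruby
require 'stringio'

# Sketch: read the first line of a version file and drop the terminator.
def read_version(io)
  io.readline.chomp   # chomp removes a trailing "\n", "\r\n" or "\r"
end

raw = StringIO.new("2.2\n").readline
p raw            # => "2.2\n"
p raw.chomp      # => "2.2"
p raw == '2.2'   # => false, which is why none of the versions matched
p read_version(StringIO.new("2.2\n")) == '2.2'   # => true
```

With a real file, the equivalent call would be `File.open('/opt/version') { |f| read_version(f) }`.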
