Write to a ruby file using ruby - ruby

Alright, so what I have is a Ruby file that takes an input and writes it to another Ruby file. I do not want to write it as a text file, because I am trying to insert this item into a Hash that can later be accessed in another run of the program, which can only be achieved by writing the info to a text file or another Ruby file. In this case I want to write it into another Ruby file. Here's the first file:
test_text = gets.chomp
to_write_to = File.open("rubylib.rb", "a") # "a" appends to the end of the file
test_text = "hobby => #{test_text},"
to_write_to.puts test_text
to_write_to.close
This inserts the given info at the BOTTOM of the file. The other file (rubylib.rb) is this:
user_info={
"name" => "bob",
"favorite_color" => "red"
}
I have a threefold question:
1) Is it possible to add test_text to the hash BEFORE the closing bracket?
2) Using this method, will the rubylib.rb file, when run, parse the added text as code, or as something else?
3) Is there a better way to do this?
What I am trying to do is actually physically write the new data to the Hash so that it is still there the next time the file is run, to store data about the user. Because if I add it the normal way, it will be lost the next time the file is run. Is there a way to store data between runs of a ruby file without writing to a text file?
I've done the best I can to give you the info you need and explain the situation as best I can. If you need clarification or more info, please leave a comment and I'll try and get back to you by commenting on that.
Thanks for the help

You should use YAML for this.
Here's how you could create a .yml file with the data you used in your example:
require "yaml"
user_info = { "name" => "bob", "favorite_color" => "red" }
File.write("user_info.yml", user_info.to_yaml)
This creates a file that looks like this:
---
name: bob
favorite_color: red
On a subsequent execution of your program, you can load the .yml file and you'll get back the same Hash that you started with:
user_info = YAML.load_file("user_info.yml")
# => { "name" => "bob", "favorite_color" => "red" }
And you can add new items to the Hash and save it again:
user_info["hobby"] = "fishing"
File.write("user_info.yml", user_info.to_yaml)
Now the file has these contents:
---
name: bob
favorite_color: red
hobby: fishing
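Putting the steps above together, a minimal sketch of a load-or-create helper might look like this. The file path and the helper names load_user_info and remember are made up for illustration; they are not part of the YAML library.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical path; a temp directory keeps the sketch from touching real files.
path = File.join(Dir.tmpdir, "user_info_sketch.yml")
File.delete(path) if File.exist?(path)

# Load the saved Hash, or start with an empty one on the very first run.
def load_user_info(path)
  File.exist?(path) ? YAML.load_file(path) : {}
end

# Merge new entries into the saved Hash and write the result back.
def remember(path, new_info)
  user_info = load_user_info(path).merge(new_info)
  File.write(path, user_info.to_yaml)
  user_info
end

remember(path, { "name" => "bob", "favorite_color" => "red" }) # first run
remember(path, { "hobby" => "fishing" })                       # a later run
user_info = load_user_info(path)
```

Because every run goes through the same two helpers, the program never needs to edit Ruby source code to remember something about the user.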

Use a database, even SQLite, and it'll let you store data for multiple sessions without any sort of encoding. Writing to a file as you are is really not scalable or practical. You'll slam into some real problems quickly with it.
I'd recommend looking at Sequel and its associated documentation for how to easily work with databases. That's a much more scalable approach and will save you a lot of headaches as you grow your code.


Using Kiba: Is it possible to define and run two pipelines in the same file? Using an intermediate destination & a second source

My processing has a "condense" step before needing further processing:
Source: Raw event/analytics logs of various users.
Transform: Insert each row into a hash according to UserID.
Destination / Output: An in-memory hash like:
{
"user1" => [event, event,...],
"user2" => [event, event,...]
}
Now, I've got no need to store these user groups anywhere, I'd just like to carry on processing them. Is there a common pattern with Kiba for using an intermediate destination? E.g.
# First pass
source EventSource # 10,000 rows of single events
transform {|row| insert_into_user_hash(row)}
@users = Hash.new
destination UserDestination, users: @users
# Second pass
source UserSource, users: @users # 100 rows of grouped events, created in the previous step
transform {|row| analyse_user(row)}
I'm digging around the code and it appears that all transforms in a file are applied to the source, so I was wondering how other people have approached this, if at all. I could save to an intermediate store and run another ETL script, but was hoping for a cleaner way - we're planning lots of these "condense" steps.
To directly answer your question: you cannot define 2 pipelines inside the same Kiba file. You can have multiple sources or destinations, but the rows will all go through each transform, and through each destination too.
That said, you have quite a few options before resorting to splitting into 2 pipelines, depending on your specific use case.
I'm going to email you to ask a few more detailed questions in private, in order to properly reply here later.
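For reference, the "condense" flow described in the question (single events grouped into a hash keyed by user ID, then one analysis per user) can be sketched in plain Ruby, outside of Kiba's DSL. This is only an illustration of the data shape, not a Kiba pattern; the sample rows and the body of analyse_user are assumptions.

```ruby
# Hypothetical rows standing in for the raw event logs.
events = [
  { user_id: "user1", action: "login" },
  { user_id: "user2", action: "login" },
  { user_id: "user1", action: "purchase" }
]

# First pass: condense single events into a hash keyed by user ID.
users = Hash.new { |hash, key| hash[key] = [] }
events.each { |row| users[row[:user_id]] << row }

# Second pass: treat each grouped user as one row and analyse it.
# analyse_user is a placeholder that only counts the user's events.
def analyse_user(user_id, user_events)
  { user_id: user_id, event_count: user_events.size }
end

results = users.map { |user_id, user_events| analyse_user(user_id, user_events) }
```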

Can we store multiple objects in file?

I am already familiar with "How can I save an object to a file?"
But what if we have to store multiple objects (say, hashes) in a file?
I tried appending YAML.dump(hash) to a file from various locations in my code. But the difficult part is reading it back. As a YAML dump can extend over many lines, do I have to parse the file myself? That would only complicate the code. Is there a better way to achieve this?
PS: The same issue will persist with Marshal.dump, so I prefer YAML as it's more human-readable.
YAML.dump creates a single YAML document. If you have several YAML documents together in a file then you have a YAML stream. So when you appended the results from several calls to YAML.dump together you would have had a stream.
If you try reading this back using YAML.load you will only get the first document. To get all the documents back you can use YAML.load_stream, which will give you an array with an entry for each of the documents.
An example:
require 'yaml'
f = File.open('data.yml', 'w')
YAML.dump({:foo => 'bar'}, f)
YAML.dump({:baz => 'qux'}, f)
f.close
After this data.yml will look like this, containing two separate documents:
---
:foo: bar
---
:baz: qux
You can now read it back like this:
all_docs = YAML.load_stream(File.open('data.yml'))
Which will give you an array like [{:foo=>"bar"}, {:baz=>"qux"}].
If you don’t want to load all the documents into an array in one go you can pass a block to load_stream and handle each document as it is parsed:
YAML.load_stream(File.open('data.yml')) do |doc|
# handle the doc here
end
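Since the question appends from several places in the code, here is a small sketch of that pattern: each call appends one document, and load_stream reads them all back. The file path and the helper name append_object are assumptions, and string keys are used only to keep the round trip simple.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical file; the temp directory keeps the sketch self-contained.
path = File.join(Dir.tmpdir, "objects_sketch.yml")
File.delete(path) if File.exist?(path)

# Append one YAML document per call; "a" mode keeps earlier documents.
def append_object(path, object)
  File.open(path, "a") { |file| YAML.dump(object, file) }
end

append_object(path, { "foo" => "bar" }) # e.g. from one place in the code
append_object(path, { "baz" => "qux" }) # e.g. from another place

# Read every document back; File.read closes the file when it is done.
all_docs = YAML.load_stream(File.read(path))
```

Opening the file in a block, as append_object does, also closes it automatically, so there is no f.close to forget.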
You could manage to save multiple objects by creating a delimiter (something to mark that one object is finished and that you go to the next one). You could then process the file in two steps:
read the file, splitting it around each delimiter
use YAML to restore the hashes from each chunk
Now, this would be a bit cumbersome, and there is a much simpler solution. Let's say you have three hashes to save:
student = { first_name: "John"}
restaurant = { location: "21 Jump Street" }
order = { main_dish: "Happy Meal" }
You can simply put them in an array and then dump them:
objects = [student, restaurant, order]
dump = YAML.dump(objects)
You can restore your objects easily:
saved_objects = YAML.load(dump)
saved_student = saved_objects[0]
Depending on how your objects relate to each other, you may prefer to use a Hash instead of an array to save them, so that you can refer to them by name instead of depending on their order.
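A minimal sketch of the Hash variant, reusing the three example objects. String keys are an assumption made here to keep the round trip simple; the names "student", "restaurant" and "order" are just labels chosen for this example.

```ruby
require "yaml"

student    = { "first_name" => "John" }
restaurant = { "location" => "21 Jump Street" }
order      = { "main_dish" => "Happy Meal" }

# Name each object instead of relying on its position in an array.
objects = { "student" => student, "restaurant" => restaurant, "order" => order }
dump = YAML.dump(objects)

# Restore everything, then look objects up by name.
saved_objects = YAML.load(dump)
saved_student = saved_objects["student"]
```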

Ruby and Excel Data Extraction

I am learning Ruby and trying to manipulate Excel data.
My goal:
To be able to extract email addresses from an Excel file and place them in a text file, one per line, with a comma added to the end of each.
My ideas:
I think my answer lies in the use of spreadsheet and File.new.
What I am looking for is direction. I would like to hear any tips or hints to accomplish my goal. Thanks.
Please do not post exact code; I am only looking for direction and would like to figure it out myself...
Thanks, Karen
UPDATE:
So, regex seems to be able to find all matching strings and store them in an array. I'm having some trouble setting that up but should be able to figure it out... For right now, to get started, I will extract only the column labeled "E Mail". The question I have now is:
`parse_csv = CSV.parse(read_csv, :headers => true)`
The default value for :skip_blanks is false. I need to set it to true, but nowhere can I find the correct syntax for doing so. I was assuming something like
`parse_csv = CSV.parse(read_csv, :headers => true :skip_blanks => true)`
But no.....
Save your Excel file as CSV (comma-separated values) and work with Ruby's libraries.
Besides spreadsheet (which can read and write), you can read Excel and other file types with RemoteTable.
gem install remote_table
and
require 'remote_table'
t = RemoteTable.new('/path/to/file.xlsx', headers: :first_row)
When you write the CSV, as @aug2uag says, you can use Ruby's standard library (no gem install required):
require 'csv'
puts [name, email].to_csv
Personally, I'd keep it as simple as possible and use a CSV.
Here is some pseudocode of how that would work:
read in your file line by line
extract your fields using regex, or cell count (depending on how consistent the email address location is), and insert them into an array
iterate through the array and write the values in the fashion you wish (to console, or file)
The code in the comment you had is a great start, however, puts will only write to console, not file. You will also need to figure out how you are going to know you are getting the email address.
Hope this helps.
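For anyone who does want to see the steps above spelled out, here is one possible sketch using the standard csv library. It also shows that the options to CSV.parse are separated by commas. The sample data and the output file name are made up for illustration.

```ruby
require "csv"
require "tmpdir"

# Hypothetical sample data standing in for the exported spreadsheet.
read_csv = <<~DATA
  Name,E Mail
  Karen,karen@example.com

  Bob,bob@example.com
DATA

# Options are separated by commas; :skip_blanks drops the empty line.
parse_csv = CSV.parse(read_csv, :headers => true, :skip_blanks => true)

# Pull the "E Mail" column, ignoring rows where it is missing.
emails = parse_csv["E Mail"].compact

# Write one address per line, each followed by a comma.
output_path = File.join(Dir.tmpdir, "emails_sketch.txt")
File.open(output_path, "w") do |file|
  emails.each { |email| file.puts "#{email}," }
end
```

Writing through file.puts rather than a bare puts is what sends the lines to the file instead of the console.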

Rails 3 - Export to Excel with gridlines

What can I add to this method to force full gridlines in Excel export?
def export_invoices
headers['Content-Type'] = "application/vnd.ms-excel"
headers['Content-Disposition'] = 'attachment; filename="Invoices.xls"'
headers['Cache-Control'] = ''
@invoices = Invoice.all
render :layout => nil
end
Thanks!
Hmm, lots of things going on here that I don't think make sense. The line
@invoices = Invoice.all
results in SQL like SELECT "invoices".* FROM "invoices" -- the * means you want all columns from the table, and the .all means you want all the invoices, not just one. Unless the contents of the table is a single column binary type, I cannot see this working, since Excel's file format is vendor-specific binary (I think!).
Are you using some gem like paperclip or other to handle saving files? Unless you are manipulating the actual excel data from within Rails (perhaps with a gem that knows how to do this), either the file was saved with gridlines on, or not.
This page describes how you can format your Excel file using XML.
If I understand your question correctly, you are looking to style the output in Excel. To do that you need to actually generate an Office Open XML document, not dump CSV with application headers.
Have a look at these two gems
http://rubygems.org/gems/axlsx
http://rubygems.org/gems/acts_as_xlsx
They should give you what you want.

GridFS in Ruby: How to upsert?

Does GridFS have an upsert?
For example, if I want to save an image with a specified _id, and one with that same _id already exists, I want it to overwrite (update) it. Otherwise, insert it.
The spec isn't really designed to support upserts, since you're technically modifying more than one document, and certainly tricky race conditions can arise. So we recommend what Matt has done, which is to delete first and then put.
I looked at the mongo ruby gem source code and found this:
# Store a file in the file store. This method is designed only for writing new files;
# if you need to update a given file, first delete it using #Grid#delete.
# ...
def put(data, opts={})
So, I did this in the code:
grid.delete(id) # if exists
grid.put(tmp_file.read, :_id => id, :content_type => file_type)
See the working sinatra script here:
http://github.com/acani/acani-sinatra/blob/master/acani.rb#L97
