How to read value in cell from database - ruby

I am still fairly new to Ruby and to databases in general, and am trying to better learn how to use the two together. I have browsed through several online tutorials but haven't been able to figure a few things out. I am working with PostgreSQL and am simply trying to read the data in my database and manipulate in some way the data contained in the actual cell. From a tutorial I have the following functions:
def queryUserTable
  @conn.exec( "SELECT * FROM users" ) do |result|
    result.each do |row|
      yield row if block_given?
    end
  end
end
and a simple way to print out the information in the rows would be something like
p.queryUserTable {|row| printf("%s %s\n", row['first_name'], row['last_name'])}
(with p being the connection). However, all this does is print out each value in the row and column specified as a whole, then continue to the next row. What I would like to know is how I can grab, for instance, the value in row 1 under column first_name and use it for something else. From what I understand, the rows are hashes, so I should be able to do something similar to {|row, value| @my_var = value }, but I get no results by doing so, so I am not understanding how this all works. I am hoping someone can better explain how this works. Hope that makes sense. Thanks!
EDIT:
Does it have anything to do with this line in my function?:
result.each do |row| #do I need to add |row,value| here as well?

Is there a reason you're not using an ORM like ActiveRecord? Although it certainly has some downsides, it may well be helpful for someone who is new to databases and ruby. If you want a tutorial on active record and rails, I highly recommend Michael Hartl's awesome free tutorial[1].
I'm not exactly sure what you're trying to do, but I can correct a couple of misconceptions. First of all, result is not a hash: it is a PG::Result, which behaves like an array of hashes, one hash per row. That is why result.each { |row, value| ... } doesn't set value; each yields only the row. Once you have an individual row, you can do row.each { |col_name, val| ... } to walk its columns.
Second, if you want to grab a value from a specific row, you should specify the row in the query. You must know something about the row you want information about. For getting the user with id = 1, for instance:
user = @conn.exec("SELECT first_name FROM users WHERE id = 1").first
unless user.nil?
  # do something with user["first_name"]
end
If you were to use ActiveRecord, you could just do
user = User.find_by(id: 1)
I would not want to set the value in the queryUserTable loop, because it will get set on each loop, and just retain the value of the last time it executes.
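To make the "array of hashes" point concrete, here is a minimal sketch that uses a plain array of hashes as a stand-in for the rows the query yields, so it runs without a database. The sample names are made up, and the values are strings because that is how pg returns them when no type mapping is configured.

```ruby
# Stand-in for the rows yielded by the query: one Hash per row,
# keyed by column name. The data is hypothetical.
rows = [
  { "id" => "1", "first_name" => "Ada",   "last_name" => "Lovelace" },
  { "id" => "2", "first_name" => "Grace", "last_name" => "Hopper" }
]

# Pick one row, then read a single cell by its column name.
first_row  = rows.first
first_name = first_row["first_name"]

# Iterating over a single row yields column-name/value pairs.
cells = first_row.map { |col_name, val| "#{col_name}=#{val}" }

puts first_name
puts cells.join(", ")
```

The two-argument block only makes sense on a single row; on the whole result, each block call receives one row.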
[1] https://www.railstutorial.org/book

Related

Having a CSV file and letting a user edit

In ruby, if I have a CSV like this:
make,model,color,doors,email
dodge,charger,black,4,practice1@whatever.com
ford,focus,blue,5,practice2@whatever.com
nissan,350z,black,2,practice3@whatever.com
mazda,miata,white,2,practice4@whatever.com
honda,civid,brown,4,practice5@whatever.com
corvette,stingray,red,2,practice6@whatever.com
ford,fiesta,blue,5,practice7@whatever.com
bmw,m4,black,2,practice8@whatever.com
audi,a5,blue,2,practice9@whatever.com
subaru,brz,black,2,practice10@whatever.com
lexus,rc,black,2,practice11@whatever.com
I want to let a user enter an email and then edit any one of the fields listed. For example, the user enters the email "practice11@whatever.com" and the program outputs "lexus,rc,black,2,practice11@whatever.com". From there the program prompts the user to choose a field to edit from "make,model,color,doors,email" and lets them change its value. Say they choose "color": they can then change the color from "black" to "blue" on the "practice11@whatever.com" line. I believe this can be done using a hash and key-value pairs, but I am not sure exactly how to make the editing part work.
This is my current code:
require "csv"
csv = CSV.read('cars.csv', headers: true)
demo = gets.chomp
print csv.find {|row| row['email'] == demo}
All it does is read the CSV file, let a user enter an email, and output the matching line.
So - your question is a bit vague and involves a number of implied questions, such as "how do I write code that can ask for different options and act accordingly" - so it might help if you clarify exactly what you are trying to ask.
From the looks of it, you seem most interested in understanding how to modify the CSV table and how to get info about the CSV fields, table, data, etc.
And for this, you have two friends: The ruby 'p' method and the docs.
The 'p' method allows you to inspect objects. "p someObject" is the same as calling 'puts someObject.inspect' - and it's very handy, as is "puts someObject.class" to find out what type of object you're dealing with.
In this case, you can change the last line of your code a bit to get some info:
puts csv.class
got = csv.find {|row| row['email'] == demo}
p got
And suddenly we learn we are dealing with a CSV::Table.
That is not surprising, so let's head over to the docs. I don't know what version of ruby you're using, but 2.6.1 is current enough to have the info we need and is plenty old at this point, so you probably have access to it:
https://ruby-doc.org/stdlib-2.6.1/libdoc/csv/rdoc/CSV.html
Tells us that if we do the CSV.read using headers:
"If headers specified, reading methods return an instance of CSV::Table, consisting of CSV::Row."
So now we know we have a CSV::Table, which is much like an array/list but with some convenience methods (such as the 'find' that you are using).
And a CSV::Row is basically a hash that maintains its order and is, as expected, keyed according to the headers.
So we can do:
p got.fields
p got['model']
got['model'] = 'edsel'
p got['model']
p got.fields
And not surprisingly, the CSV::Table has a 'to_s' method that lets us print out the CSV:
puts csv.to_s
You can probably take it from here.
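Putting those pieces together, here is a rough sketch of the find-then-edit flow. The CSV is inlined so it runs without cars.csv, and the email, field, and value variables are hypothetical stand-ins for what the program would read with gets.chomp.

```ruby
require "csv"

# Sample data inlined so the sketch runs without cars.csv.
data = <<~ROWS
  make,model,color,doors,email
  lexus,rc,black,2,practice11@whatever.com
  audi,a5,blue,2,practice9@whatever.com
ROWS

table = CSV.parse(data, headers: true)

# Hypothetical user input; a real program would read these with gets.chomp.
email = "practice11@whatever.com"
field = "color"
value = "blue"

# Find the row by email, then assign to the chosen column if it exists.
row = table.find { |r| r["email"] == email }
row[field] = value if row && table.headers.include?(field)

updated = table.to_s
puts updated

# To persist the change, the string could be written back out:
# File.write("cars.csv", updated)
```

Checking the field against table.headers keeps a mistyped column name from silently adding a new column to that row.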

Delete Duplicate Lines Ruby

I'm working on a JSON file, I think. Regardless, I'm working with a lot of different hashes and fetching different values from them. This is the data:
{"notification_rule"=>
  {"id"=>"0000000",
   "contact_method"=>
    {"id"=>"000000",
     "address"=>"cod.lew@gmail.com",}
{"notification_rule"=>
  {"id"=>"000000",
   "contact_method"=>
    {"id"=>"PO0JGV7",
     "address"=>"cod.lew@gmail.com",}
Essentially, this is the type of hash I'm currently working with.
I want to stop duplicates of the same thing from ending up in the text file. Whenever I run my code, it writes the address from both of these hashes. I understand why, because it's looping over them again, but I thought this code that I added would help resolve that issue:
Final UPDATE
if jdoc["notification_rule"]["contact_method"]["address"].to_s.include?(".com")
  numbers.print "Employee Name: "
  numbers.puts jdoc["notification_rule"]["contact_method"]["address"].gsub(/@target.com/, '').gsub(/\w+/, &:capitalize)
  file_names = ['Employee_Information.txt']
  file_names.each do |file_name|
    text = File.read(file_name)
    lines = text.split("\n")
    new_contents = lines.uniq.join("\n")
    File.open(file_name, "w") { |file| file.puts new_contents }
  end
else
  nil
end
This code looks really confused and lacking a specific purpose. Generally Ruby that's this tangled up is on the wrong track, as with Ruby there's usually a simple way of expressing something simple, and testing for duplicated addresses is one of those things that shouldn't be hard.
One of the biggest sources of confusion is the responsibility of a chunk of code. In that example you're not only trying to import data, loop over documents, clean up email addresses, and test for duplicates, but somehow facilitate printing out the results. That's a lot of things going on all at once, and they all have to work perfectly for that chunk of code to be fully operational. There's no way of getting it partially working, and no way of knowing if you're even on the right track.
Always try and break down complex problems into a few simple stages, then chain those stages together as necessary.
Here's how you can define a method to clean up your email addresses:
def address_scrub(address)
  # Strip the domain, then capitalize each word that remains
  address.gsub(/@target\.com/, '').gsub(/\w+/, &:capitalize)
end
Where that can be adjusted as necessary, and presumably tested to ensure it's working correctly, which you can now do independently of the other code.
As for the rest, it looks like this:
require 'set'
# Read in duplicated addresses from a file, one per line with the newline
# stripped, using a Set for fast lookups.
duplicates = Set.new(
  File.readlines("Employee_Information.txt", chomp: true)
)
# Extract addresses from jdoc document array
filtered = jdocs.map do |jdoc|
  # Convert to jdoc/address pair
  [ jdoc, address_scrub(jdoc["notification_rule"]["contact_method"]["address"]) ]
end.reject do |jdoc, address|
  # Remove any that are already in the duplicates list
  duplicates.include?(address)
end.map do |jdoc, _|
  # Return only the document
  jdoc
end
Where that processes jdocs, an array of jdoc structures, and removes duplicates in a series of simple steps.
With the chaining approach you can see what's happening before you add on the next "link", so you can work incrementally towards a solution, adjusting as you go. Any mistakes are fairly easy to catch because you're able to, at any time, inspect the intermediate products of those stages.
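As an illustration of working one "link" at a time, here is a small self-contained sketch with made-up documents. Each stage is kept in its own variable so it can be inspected with p before the next one is added. The sample addresses are hypothetical, and uniq stands in for the file-based duplicate check.

```ruby
def address_scrub(address)
  address.gsub(/@target\.com/, "").gsub(/\w+/, &:capitalize)
end

# Hypothetical sample documents with the same shape as the question's hashes.
jdocs = [
  { "notification_rule" => { "contact_method" => { "address" => "cod.lew@target.com" } } },
  { "notification_rule" => { "contact_method" => { "address" => "cod.lew@target.com" } } },
  { "notification_rule" => { "contact_method" => { "address" => "ann.ray@target.com" } } }
]

# Stage 1: pull out the raw addresses (dig returns nil if a key is missing).
addresses = jdocs.map { |jdoc| jdoc.dig("notification_rule", "contact_method", "address") }

# Stage 2: drop missing values and clean up what is left.
names = addresses.compact.map { |address| address_scrub(address) }

# Stage 3: remove duplicates, keeping the first occurrence of each name.
unique_names = names.uniq

p addresses
p names
p unique_names
```

If the output of one stage looks wrong, the bug is in that stage or an earlier one, which narrows the search considerably.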

unwrapping an object returned from twitter api

While reading some data from the Twitter api, I inserted the data into the file like this
results.each do |f|
running_count += 1
myfile.puts "#{f.user_mentions}"
...
The results (2 sample lines below) look like this in the file
[#<Twitter::Entity::UserMention:0x007fda754035803485 @attrs={:screen_name=>"mr_blah_blah", :name=>"mr blah blah", :id=>2142450461, :id_str=>"2141354324324", :indices=>[3, 15]}>]
[#<Twitter::Entity::UserMention:0x007f490580928 @attrs={:screen_name=>"andrew_jackson", :name=>"Andy Jackson", :id=>1607sdfds, :id_str=>"16345435", :indices=>[3, 14]}>]
Since the only information I'm actually interested in is the :screen_name, I was wondering if there's a way that I could only insert the screen names into the file. Since each line is in array brackets and then I'm looking for the screen name inside the @attrs, I did this
myfile.puts "#{f.user_mentions[0]@attrs{"screen_name"}}"
This didn't work, and I didn't expect it to, as I'm not really sure if that's technically array etc. Can you suggest how it would be done?
You need to access the @attrs instance variable in the Twitter UserMention object. If you want to puts the screen name from the first object, based on your current output, I would write
myfile.puts "#{f.user_mentions[0].attrs[:screen_name]}"
Also, putting the code on how results is returned would help get a definite answer quickly. Cheers!
Assuming that results is an array of Twitter::Entity::UserMention:
results.each do |r|
  myfile.puts r.screen_name
end
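Since a tweet can mention zero, one, or several users, indexing with [0] can miss names or hit nil. A possible sketch that collects every screen name is below. The two Structs are hypothetical stand-ins for the gem's objects and only model the user_mentions and screen_name readers used in this thread, so the example runs without the Twitter API.

```ruby
# Hypothetical stand-ins for the gem's tweet and mention objects.
mention = Struct.new(:screen_name)
tweet   = Struct.new(:user_mentions)

results = [
  tweet.new([mention.new("mr_blah_blah")]),
  tweet.new([mention.new("andrew_jackson"), mention.new("someone_else")]),
  tweet.new([]) # a tweet with no mentions
]

# flat_map flattens the per-tweet arrays into a single list of names.
screen_names = results.flat_map { |t| t.user_mentions.map(&:screen_name) }

puts screen_names
```

With the real objects, the same flat_map line could feed myfile.puts directly, which writes one name per line.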

Are there any Ruby ORMs which use cursors or smart fetch?

I'm looking for a Ruby ORM to replace ActiveRecord. I've been looking at Sequel and DataMapper. They look pretty good however none of them seems to do the basic: not loading everything in memory when you don't need it.
I mean I've tried the following (or equivalent) on ActiveRecord and Sequel on table with lots of rows:
posts.each { |p| puts p }
Both of them go crazy on memory. They seem to load everything in memory rather than fetching stuff when needed. I used the find_in_batches in ActiveRecord, but it's not an acceptable solution:
ActiveRecord is not an acceptable solution because we had too many problems with it.
Why should my code be aware of a paging mechanism? I'm happy to configure somewhere the size of the page but that's it. With find_in_batches you need to do something like:
post.find_in_batches { |batch| batch.each { |p| puts p } }
But that should be transparent.
So is there somewhere a reliable Ruby ORM which does the fetch properly?
Update:
As Sergio mentioned, in Rails 3 you can use find_each, which is exactly what I want. However, as ActiveRecord is not an option unless someone can really convince me to use it, the questions are:
Which ORMs support the equivalent of find_each?
How to do it?
Why do we need a find_each, while find should do it, shouldn't it?
Sequel's Dataset#each does yield individual rows at a time, but most database drivers will load the entire result in memory first.
If you are using Sequel's Postgres adapter, you can choose to use real cursors:
posts.use_cursor.each{|p| puts p}
This fetches 1000 rows at a time by default, but you can use an option to specify the amount of rows to grab per cursor fetch:
posts.use_cursor(:rows_per_fetch=>100).each{|p| puts p}
If you aren't using Sequel's Postgres adapter, you can use Sequel's pagination extension:
Sequel.extension :pagination
posts.order(:id).each_page(1000){|ds| ds.each{|p| puts p}}
However, like ActiveRecord's find_in_batches/find_each, this does separate queries, so you need to be careful if there are concurrent modifications to the dataset you are retrieving.
The reason this isn't the default in Sequel is probably the same reason it isn't the default in ActiveRecord, which is that it isn't a good default in the general case. Only queries with large result sets really need to worry about it, and most queries don't return large result sets.
At least with the Postgres adapter cursor support, it's fairly easy to make it the default for your model:
Post.dataset = Post.dataset.use_cursor
For the pagination extension, you can't really do that, but you can wrap it in a method that makes it mostly transparent.
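One way such a wrapper could look is sketched below. It is not part of Sequel: each_row_in_batches and fetch_page are hypothetical names, and an in-memory array stands in for the table so the example runs on its own. The caller just gets rows one at a time, and the paging (here keyed on the last seen id rather than an offset) stays inside the helper.

```ruby
# Hypothetical helper: yields rows one at a time while fetching them in pages.
# fetch_page is any callable taking (last_id, limit) and returning the next
# rows ordered by :id, or an empty array when nothing is left.
def each_row_in_batches(fetch_page, batch_size: 1000)
  last_id = nil
  loop do
    rows = fetch_page.call(last_id, batch_size)
    break if rows.empty?

    rows.each { |row| yield row }
    last_id = rows.last[:id]
  end
end

# In-memory stand-in for a table; a real fetch_page would run a query
# filtered on id, ordered by id, and limited to the batch size.
table = (1..7).map { |i| { id: i, title: "post #{i}" } }
calls = 0
fetch = lambda do |last_id, limit|
  calls += 1
  table.select { |r| last_id.nil? || r[:id] > last_id }.first(limit)
end

titles = []
each_row_in_batches(fetch, batch_size: 3) { |row| titles << row[:title] }

puts titles
puts "pages fetched: #{calls}"
```

As with find_each, every page is a separate query, so concurrent changes to the table can still affect what the loop sees.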
Sequel.extension :pagination
posts.order(:id).each_page(1000) do |ds|
ds.each { |p| puts p }
end
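This is very slow on large tables!
The reason becomes clear from the method body: each page is fetched with a LIMIT/OFFSET query, so for later pages the database has to skip over all the earlier rows: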
http://sequel.rubyforge.org/rdoc-plugins/classes/Sequel/Dataset.html#method-i-paginate
# File lib/sequel/extensions/pagination.rb, line 11
def paginate(page_no, page_size, record_count=nil)
  raise(Error, "You cannot paginate a dataset that already has a limit") if @opts[:limit]
  paginated = limit(page_size, (page_no - 1) * page_size)
  paginated.extend(Pagination)
  paginated.set_pagination_info(page_no, page_size, record_count || count)
end
ActiveRecord actually has an almost transparent batch mode:
User.find_each do |user|
NewsLetter.weekly_deliver(user)
end
This code works faster than find_in_batches in ActiveRecord:
id_max = table.max(:id)
id_min = table.min(:id)
n = 1000
# Walk the table in id ranges of n, so each query is a bounded range scan
(0..(id_max - id_min) / n).each do |i|
  table.where(id: (id_min + n * i)...(id_min + n * (i + 1))).each { |row| }
end
You could also consider Ohm, which is built on the Redis NoSQL store.

Ruby- Why can't I iterate over this data that I get back from Mongomapper map_reduce?

This is probably simple, but I have spent way too much time trying to figure it out, and am sure someone here will know, so here goes. Please be patient.
Bottom line is that I've got some data that I can't figure out how to loop over.
# Get the data from mongomapper map_reduce
@urls = DistinctUrls.build.find()
puts @urls.count
3
puts @urls.to_json
[{"_id":"http://msn.com","value":3.0},{"_id":"http://yahoo.com","value":12.0},{"_id":"http://google.com","value":2.0}]
@urls.each do |entry|
  puts "Here I am" # Never gets printed, not sure why.
  puts "url" + entry['_id']
end
What I don't understand is that if I have a count of 3, why it won't enter the loop?
I'm not sure if the mongomapper or map_reduce details matter. I'm putting them here just in case. If it makes sense, I can add the details of the map/reduce if needed.
Thanks for your help.
First you wrote @urls, then @url. I think only one of them is correct.
Update: As the documentation says, you can iterate over the cursor with each, but it is closed after a full iteration. That may be what happened here: you had already iterated over it once, probably via to_json.
You can check whether the cursor is closed with the following statement:
@urls.closed?
Check this before the iterating part.
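If the cursor really is single-pass, one common workaround is to materialize it into an array once and reuse that array. The sketch below mimics a one-shot cursor with a plain Enumerator that drains a queue. It is only a stand-in, and whether the real cursor behaves this way depends on the driver version in use.

```ruby
require "json"

docs = [
  { "_id" => "http://msn.com",   "value" => 3.0 },
  { "_id" => "http://yahoo.com", "value" => 12.0 }
]

# Stand-in for a one-shot cursor: each pass drains the underlying queue,
# so a second pass finds nothing left to yield.
one_shot = Enumerator.new { |y| y << docs.shift until docs.empty? }

# Materialize once, then use the array as often as needed.
rows = one_shot.to_a
json = rows.to_json
urls = rows.map { |entry| entry["_id"] }

# A second pass over the exhausted stand-in yields no entries.
second_pass = one_shot.to_a

puts json
puts urls
puts "second pass size: #{second_pass.size}"
```

Holding the rows in an array costs memory, so this only suits result sets that are small, like the three entries in the question.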

Resources