Stop ActiveRecord from loading blob columns - activerecord

How can I tell ActiveRecord not to load blob columns unless explicitly asked for? There are some pretty large blobs in my legacy DB that must be excluded for 'normal' objects.

I just ran into this using Rails 3.
Fortunately it wasn't that difficult to solve. I set a default_scope that removed the particular columns I didn't want from the result. For example, the model I had contained an XML text field that could be quite long and wasn't used in most views.
default_scope select((column_names - ['data']).map { |column_name| "`#{table_name}`.`#{column_name}`"})
You'll see from the solution that I had to map the columns to fully qualified versions so I could continue to use the model through relationships without ambiguities in attributes. Later, where you do want the field, just tack on another .select(:data) to have it included.

fd's answer is mostly right, but ActiveRecord doesn't currently accept an array as a :select argument, so you'll need to join the desired columns into a comma-delimited string, like so:
desired_columns = (MyModel.column_names - ['column_to_exclude']).join(', ')
MyModel.find(id, :select => desired_columns)

I believe you can ask AR to load specific columns in your invocation to find:
MyModel.find(id, :select => 'every, attribute, except, the, blobs')
However, this would need to be updated as you add columns, so it's not ideal. I don't think there is any way to specifically exclude one column in Rails (nor in a single SQL select).
I guess you could write it like this:
MyModel.find(id, :select => (MyModel.column_names - ['column_to_exclude']).join(', '))
Test these out before you take my word for it though. :)
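The column-subtraction idea in these answers can be tried without a database. Here is a minimal sketch in plain Ruby, assuming a hypothetical helper select_list_without and a hard-coded column list standing in for MyModel.column_names:

```ruby
# Hypothetical helper: builds a comma-delimited SELECT list that leaves out
# the unwanted columns. `column_names` stands in for MyModel.column_names.
def select_list_without(column_names, excluded, table_name = nil)
  kept = column_names - excluded
  # Qualify with the table name when given, to avoid ambiguity in joins.
  kept = kept.map { |c| "#{table_name}.#{c}" } if table_name
  kept.join(', ')
end

# Illustrative column names, not taken from a real schema.
columns = %w[id name data created_at]
puts select_list_without(columns, ['data'])
puts select_list_without(columns, ['data'], 'members')
```

The resulting string is the kind of value the :select option in the examples above expects.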

A clean approach requiring NO CHANGES to the way you code elsewhere in your app, i.e. no messing with :select options.
For whatever reason you need or choose to store blobs in databases.
Yet, you do not wish to mix blob columns in the same table as your
regular attributes. BinaryColumnTable helps you store ALL blobs in
a separate table, managed transparently by an ActiveRecord model.
Optionally, it helps you record the content-type of the blob.
http://github.com/choonkeat/binary_column_table
Usage is simple
Member.create(:name => "Michael", :photo => IO.read("avatar.png"))
#=> creates a record in "members" table, saving "Michael" into the "name" column
#=> creates a record in "binary_columns" table, saving "avatar.png" binary into "content" column
m = Member.last #=> only columns in "members" table are fetched (no blobs)
m.name #=> "Michael"
m.photo #=> binary content of the "avatar.png" file

Related

Renaming multiple items in a database

I have seeded a lot of data for my database in Rails 4. The data that I imported was entered by hand by a user of gigabot (using the gigabot API).
The problem is that I am trying to list "club nights", but I am getting lots of duplicates back because the names are similar but not identical. Is there any way I could group the items so that names containing a certain word are grouped together?
Currently these are my only validations
class Club < ActiveRecord::Base
has_many :events
validates :name, presence:true, uniqueness:true
validates :location, presence:true
validates :description, presence:true, uniqueness:true
end
Here is some of example data that the table currently displays
Name
DC10
Amnesia
Circo Loco # DC10
Sankeys
Sankeys Ibiza
Cocoon
Privilege Ibiza
Circoloco at Dc 10
Space
Space Ibiza
If you look at the above example you will see that some of the clubs are repeated. I would like to clean up the table so that it has "DC10" as one club, with all the clubs that have DC10 in their name grouped together.
So in the example above, instead of having 10 separate clubs there would be 6:
DC10,
Amnesia,
Space,
Sankeys,
Privilege,
Cocoon.
Have a look at the update_all method from ActiveRecord.
This will allow you to update all the values of fields in a collection. So now you just have to get a collection that you're certain fits together.
I suggest using something like SIMILAR TO in PostgreSQL. So you could do something like:
pattern = '%DC10%' # This can be as advanced as you need it
collection = Club.where('name SIMILAR TO ?', pattern)
collection.update_all(name: 'DC10')
This sounds like a very difficult task. Most likely you won't be able to come up with a regex that can capture your intention.
For example let's imagine you have a club Space and other entries
Void # Space
Outer Space
Inner space
Alien in Outer Space
They all end in Space, but which ones should be regrouped? My example was a bit exaggerated, but it sounds like you are dealing with a lot of data, and cases like this one may occur.
Do you not have any other field which could help you regroup records together, like GPS coordinates, city, etc.?
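To make the trade-off between the two answers concrete, here is a minimal plain-Ruby sketch of name-based grouping. It assumes a hand-picked list of canonical names and two hypothetical helpers, normalize and canonical_for; it is not an ActiveRecord feature.

```ruby
# Canonical names picked by hand; this list is an assumption of the sketch.
CANONICAL = ['DC10', 'Amnesia', 'Space', 'Sankeys', 'Privilege', 'Cocoon'].freeze

# Lowercase and drop everything except letters and digits,
# so "Dc 10" and "DC10" compare equal.
def normalize(name)
  name.downcase.gsub(/[^a-z0-9]/, '')
end

# First canonical name whose normalized form occurs in the given name, or nil.
def canonical_for(name)
  key = normalize(name)
  CANONICAL.find { |c| key.include?(normalize(c)) }
end

names = ['DC10', 'Amnesia', 'Circo Loco # DC10', 'Sankeys', 'Sankeys Ibiza',
         'Cocoon', 'Privilege Ibiza', 'Circoloco at Dc 10', 'Space', 'Space Ibiza']

groups = names.group_by { |n| canonical_for(n) }
groups.each { |canonical, members| puts "#{canonical.inspect}: #{members.inspect}" }
```

Note that canonical_for('Outer Space') also returns 'Space', which is exactly the kind of false match the second answer warns about, so the groups should be reviewed by hand before running any update_all.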

Sequel joining tables but I have overlapping column names. How do I alias these column names?

Here is my code for joining two tables:
DB.from(:sources).join(:payloads, :source_id => :id)
The table names are :sources, :payloads.
The problem is that there is an :id column in payloads which overwrites the :id column in :sources. I need to use an alias so that I just obtain a mega table with all of the column names. However, as currently written and as my tables are currently structured, the :id columns are getting combined and the second table takes precedence. Does this make sense?
How do I make an alias so that the :id column from :sources still shows up?
To alias sources.id to a different name, use the Identifier aliases.
.select_append(:sources__id___source_id).join...
# *, `sources`.`id` AS 'source_id'
I think this is a case where using Sequel's graph will help you.
From the documentation:
Similar to #join_table, but uses unambiguous aliases for selected columns and keeps metadata about the aliases for use in other methods.
The problem you're seeing is an identically named column in one table is colliding with the same column name in another. Sequel's use of graph should make sure that the table name and column are returned as the key, rather than just the column.
The various documentation files have a number of examples, which would make a really long answer, so I recommend going through the docs, searching for uses, and see how they work for you.
Also, the Sequel IRC channel can be a great asset for this sort of question.
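The collision itself can be reproduced in plain Ruby, since Sequel yields each row as a hash keyed by column name. The sketch below is not Sequel's graph implementation; the row values and the qualify helper are made up for illustration.

```ruby
# Stand-ins for one row from each table (values are invented).
source  = { id: 1, name: 'feed-a' }
payload = { id: 42, source_id: 1, body: 'hello' }

# Naive merge: the later :id overwrites the earlier one,
# which mirrors what the joined result looks like.
merged = source.merge(payload)

# Hypothetical helper: prefix each key with its table name so both ids survive.
def qualify(row, table)
  row.map { |col, val| [:"#{table}_#{col}", val] }.to_h
end

qualified = qualify(source, :sources).merge(qualify(payload, :payloads))
puts merged.inspect
puts qualified.inspect
```

Aliasing a single column or using graph both amount to giving the colliding columns distinct keys in this way.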

SQL column type from Arel::Attributes::Attribute object

tl;dr Given an Arel::Attributes::Attribute object, say Model.arel_table[:created_at], how do I get its SQL type?
Context: I'm bypassing the ActiveRecord infrastructure in favour of Arel to write some SQL reports that need to be generated really efficiently. Using Arel's to_sql method I'm generating the final SQL and executing it directly via ActiveRecord::Base.connection.execute. However, I need to apply SQL transformations to certain columns (eg. change timezone of timestamps stored in GMT). Since the number of columns is large (and varying, based on user input), I don't want to hard code these transformations. I'd like to look at the SQL type of the columns being selected and apply the transformation accordingly.
If you have the ActiveRecord class set up then you have access to its columns and columns_hash methods. Those will give you column objects (instances of ActiveRecord::ConnectionAdapters::Column) and there you should find type and sql_type methods. For example:
> Model.columns_hash['created_at'].type
=> :datetime
> Model.columns_hash['created_at'].sql_type
=> "timestamp without time zone"
The sql_type will be database-specific (that's PostgreSQL above), the type will match the type in your migrations so you probably want to use that instead of sql_type.
That said, you could probably get away with using the usual ActiveRecord relation methods (which should deal with conversions and time zones for you) and then calling to_sql at the end:
sql = Model.where('created_at > ?', some_time).select('pancakes').to_sql
and then feed that SQL into execute or select_rows. That will let you use most of the usual ActiveRecord stuff while avoiding the overhead of creating a bunch of ActiveRecord wrappers that you don't care about.
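Once the column types are known, choosing a transformation per type is a small lookup. Here is a minimal plain-Ruby sketch, assuming a hard-coded hash standing in for the types read from Model.columns_hash, PostgreSQL syntax, and timestamps stored as UTC in "timestamp without time zone" columns. The column names, the target zone, and both helper methods are hypothetical.

```ruby
# Stand-in for the types read from Model.columns_hash: column name => type.
COLUMN_TYPES = { 'id' => :integer, 'title' => :string, 'created_at' => :datetime }.freeze

# Wrap datetime columns in a PostgreSQL time zone conversion and leave
# other types untouched. `zone` must come from a trusted list, because it
# is interpolated into the SQL text.
def select_expression(name, type, zone)
  case type
  when :datetime
    "#{name} AT TIME ZONE 'UTC' AT TIME ZONE '#{zone}' AS #{name}"
  else
    name
  end
end

# Build the SELECT list; fetch raises KeyError for unknown column names,
# so user input cannot smuggle in arbitrary SQL as a column.
def select_list(requested, zone)
  requested.map { |name| select_expression(name, COLUMN_TYPES.fetch(name), zone) }.join(', ')
end

puts select_list(%w[id created_at], 'Europe/Madrid')
```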
Something that might be helpful specifically in arel is type_cast_for_database. This can be used on an arel table:
Model.arel_table.type_cast_for_database(:id, 'test')
=> 0
Model.arel_table.type_cast_for_database(:id, '47test')
=> 47
While you don't get the type specifically you can see if values like strings are going to be converted to a number or something.
EDIT
It's important to note that this only works if the Arel table has a type caster, i.e. able_to_type_cast? returns true. If you get the table from the model as above, it should have one.

Ruby w/ Postgres & Sinatra - Query won't order right with parameter?

So I set a variable in my main ruby file that's handling all my post and get requests and then use ERB templates to actually show the pages. I pass the database handler itself into the erb templates, and then run a query in the template to get all (for this example) grants.
In my main ruby file:
grants_main_order = "id_num"
get '/grants' do
erb :grants, :locals => {:db=>db, :order=>grants_main_order, :message=>params[:message]}
end
In the erb template:
db = locals[:db]
getGrants = db.exec("SELECT * FROM grants ORDER BY $1", [locals[:order]])
This produces some very random ordering, however if I replace the $1 with id_num, it works as it should.
Is this a typing issue? How can I fix this? Using string replacement with #{locals[:order]} also gives funky results.
Parameters are there to put constant values into the query. It's possible and legal, but not meaningful, to use them in an ORDER BY clause.
Say you want to issue this query:
SELECT first_name, last_name
FROM people
ORDER BY first_name
If you put "first_name" in a string and pass it in as a parameter, you effectively get:
SELECT first_name, last_name
FROM people
ORDER BY 'first_name'
The difference is huge. That last ORDER BY clause sorts by a constant string rather than by a column, so it tells the database not to care about the column values for each row and to sort as if all rows were identical. The resulting order is arbitrary.
I would recommend using DataMapper (http://datamapper.org/) for Sinatra. It's a very slick ORM and handles the parameterized queries you are trying to build quite well.
Have you inspected what locals[:order] is? Maybe something funky in there.
p locals[:order]
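Since a column name cannot be passed as a bind parameter, a common workaround is to check the requested name against a fixed whitelist and only then interpolate it into the SQL text. Here is a minimal plain-Ruby sketch; only id_num comes from the question, while the other column names and both helper methods are hypothetical.

```ruby
# Columns that may be used for sorting. Only id_num appears in the question;
# the other names are hypothetical.
SORTABLE_COLUMNS = %w[id_num title amount].freeze
DEFAULT_ORDER = 'id_num'

# Return the requested column only if it is whitelisted, so the value
# can be interpolated into the SQL text safely.
def safe_order_column(requested)
  SORTABLE_COLUMNS.include?(requested) ? requested : DEFAULT_ORDER
end

def grants_query(requested)
  "SELECT * FROM grants ORDER BY #{safe_order_column(requested)}"
end

puts grants_query('title')
puts grants_query('id_num; DROP TABLE grants') # falls back to the default
```

The string returned by grants_query can then be passed to db.exec without a parameter array.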

LINQ - NOT selecting certain fields?

I have a LINQ query mapped with the Entity Framework that looks something like this:
image = this.Context.ImageSet
.Where(n => n.ImageId == imageId)
.Where(n => n.Albums.IsPublic == true)
.Single();
This returns a single image object and works as intended.
However, this query returns all the properties of my Image table in the DB.
Under normal circumstances, this would be fine but these images contain a lot of binary data that takes a very long time to return.
Basically, in its current state my LINQ query is doing:
Select ImageId, Name, Data
From Images
...
But I need a query that does this instead:
Select ImageId, Name
From Images
...
Notice I want to load everything except the Data. (I can get this data on a second async pass.)
Unfortunately, if using LINQ to SQL, there is no optimal solution.
You have 3 options:
You return the Entity, with Context tracking and all, in this case Image, with all fields
You choose your fields and return an anonymous type
You choose your fields and return a strongly typed custom class, but you lose tracking, if that's what you want.
I love LINQ to SQL, but that's the way it is.
My only solution for you would be to restructure your database, and move all the large Data into a separate table, and link to it from the Image table.
This way when returning Image you'd only return a key in the new DataID field, and then you could access that heavier Data when and if you needed it.
cheers
This will create a new image with only those fields set. When you go back to get the Data for the images you select, I'd suggest going ahead and getting the full dataset instead of trying to merge it with the existing id/name data. The id/name fields are presumably small relative to the data and the code will be much simpler than trying to do the merge. Also, it may not be necessary to actually construct an Image object, using an anonymous type might suit your purposes just as well.
image = this.Context.ImageSet
.Where(n => n.ImageId == imageId)
.Where(n => n.Albums.IsPublic == true)
.Select(n => new Image { ImageId = n.ImageId, Name = n.Name })
.Single();
[If using Linq 2 SQL] Within the DBML designer, there is an option to make individual table columns delay-loaded. Set this to true for your large binary field. Then, that data is not loaded until it is actually used.
[Question for you all: Does anyone know if the entity frameworks support delayed loaded varbinary/varchar's in MSVS 2010? ]
Solution #2 (for entity framework or linq 2 sql):
Create a view of the table that includes only the primary key and the varchar(max)/varbinary(max). Map that into EF.
Within your Entity Framework designer, delete the varbinary(max)/varchar(max) property from the table definition (leaving it defined only in the view). This should exclude the field from read/write operations to that table, though you might verify that with the logger.
Generally you'll access the data through the table that excludes the data blob. When you need the blob, you load a row from the view. I'm not sure how you would do writes: you may be able to write to the view, or you may need to write a stored procedure, or you can bust out a DBML file for the one table.
You cannot do it with LINQ, at least for now...
The best approach I know is to create a view of the table you need without the large fields and use LINQ with that view.
Alternatively you could use the select new in the query expression...
var image =
(
from i in db.ImageSet
where i.ImageId == imageId && i.Albums.IsPublic
select new
{
ImageId = i.ImageId,
Name = i.Name
}
).Single();
LINQ query expressions actually get converted to lambda expressions at compile time, but I generally prefer the query expression because I find it more readable and understandable.
Thanks :)

Resources