Counting associations in ActiveRecord - ruby

I have two models, User and Group, where groups contain many users. If I want to count the number of users in each group using a single query, I can use the following SQL:
select id, (select count(1) from users where group_id = groups.id) from groups
Is it possible to do this efficiently with ActiveRecord?
To be clear, this query will list all group ids, along with the number of users in each group.

You can get the count in any of these ways.
Using the association:
group = Group.find(1) # find the group with id = 1
group.users.count # always runs a COUNT query against the database
or
group.users.size # uses the loaded collection if group.users is already loaded, otherwise runs a COUNT query
or directly:
User.where(:group_id => 1).count
The count helper runs a COUNT(*) query on the database with the specified conditions.
More options are documented at
http://apidock.com/rails/ActiveRecord/Calculations/count
I also recommend going through the Rails Guides.

I found an efficient solution using a join:
Group.all(
  :select => "groups.id, count(u.group_id) as users_count",
  :joins => "LEFT OUTER JOIN users u ON u.group_id = groups.id",
  :group => "groups.id"
)

The first solution is simply to translate your query to ActiveRecord and use the subquery:
subquery = User.where("users.group_id = groups.id").select('count(1)')
groups_with_count = Group.select(:id, "(#{subquery.to_sql}) as users_count")
Or use SQL grouping for a similar result (joins(:users) is an INNER JOIN, so groups without users are omitted):
groups_with_count = Group.joins(:users).select(:id, 'count(users.id) as users_count').group(:id)
In both cases you get the result in ONE query with MINIMAL raw SQL:
groups_with_count.each { |group| puts "#{group.id} => #{group.users_count}" }
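How the two variants treat empty groups can be sketched in plain Ruby, with no database involved. A correlated subquery is evaluated once per group, so a group without users still appears with a count of 0. An inner join followed by GROUP BY only yields rows for groups that have at least one user. The `GroupRow` and `UserRow` structs and both helper methods are made up for this illustration and are not part of ActiveRecord.

```ruby
# Hypothetical in-memory stand-ins for rows of the groups and users tables.
GroupRow = Struct.new(:id)
UserRow  = Struct.new(:id, :group_id)

# Mimics the correlated subquery: one count per group, empty groups included.
def counts_via_subquery(groups, users)
  groups.map { |group| [group.id, users.count { |user| user.group_id == group.id }] }.to_h
end

# Mimics INNER JOIN + GROUP BY: only groups matched by at least one user row.
def counts_via_inner_join(groups, users)
  group_ids = groups.map(&:id)
  users
    .select { |user| group_ids.include?(user.group_id) }
    .group_by(&:group_id)
    .transform_values(&:size)
end

groups = [GroupRow.new(1), GroupRow.new(2), GroupRow.new(3)]
users  = [UserRow.new(10, 1), UserRow.new(11, 1), UserRow.new(12, 2)]

p counts_via_subquery(groups, users)
p counts_via_inner_join(groups, users)
```

If empty groups matter, the subquery form or a LEFT OUTER JOIN is the safer choice.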
Additional note
You can write the first subquery as subquery = User.via(:group).select('count(1)'), which is simpler and more maintainable in my opinion, using the following helper.
I've used this code on several projects to write nicer subqueries:
class ApplicationRecord < ActiveRecord::Base
  # Transforms raw SQL that references an association, such as Shift.where('shifts.id = checkins.shift_id'),
  # into the simpler Shift.via(:checkin), provided Shift has the checkin relationship.
  # No support for polymorphic associations.
  # Basic support for "through" reflections (using a join).
  # Defined as a class method so that it can be called on the model, e.g. User.via(:group).
  def self.via(name)
    association = reflect_on_association(name)
    raise ArgumentError, "#{name} is not a valid association of #{self.name}" unless association
    raise NotImplementedError if association.polymorphic?
    join_keys = association.join_keys
    table_column = arel_table[join_keys.foreign_key]
    association_column = Arel::Table.new(association.table_name)[join_keys.key]
    if association.through_reflection?
      through_association = association.through_reflection
      table_column = Arel::Table.new(through_association.table_name)[join_keys.foreign_key]
      joins(through_association.name).where(table_column.eq(association_column))
    else
      where(table_column.eq(association_column))
    end
  end
end

Related

Identification of erroneous queries in ObjectGears

Is there a way to check all the queries in ObjectGears? I can now display them only for particular models not all of them at once.
Create this query in ObjectGears. It will provide what you need.
SELECT m.Name AS Model_name, cd.Name, cd.Code, cd.Created, cd.Creator, cd.Modified, cd.Modifier, './QueryDetail.aspx?Id=' + CONVERT(VARCHAR,cd.Id) AS URL
FROM ClassDef cd
INNER JOIN Model m ON m.Id = cd.ModelId
WHERE
cd.isDirty = 1 AND
cd.EntityType = 1

Reduce SQL Queries from Includes assoc

I am trying to reduce the number of SQL queries in my Rails application.
I have a controller like this:
class Rest::MyController < Rest::BaseController
  def show
    render xml: some_model, status: :ok
  end
  private
  def my_associations
    [
      :model2,
      :model3,
      :model4,
    ]
  end
  def some_model
    @some_model ||= SomeModel.includes(my_associations).where(id: test_params[:id])
  end
  def test_params
    params.permit(:id)
  end
end
To avoid N + 1 queries I use includes, so when I call the some_model method, AR makes a lot of calls like these (SELECT * FROM):
SomeModel Load (1.7ms) SELECT `model2`.* FROM `model2` WHERE `model2`.`type` IN ('SomeModel') AND `model2`.`is_something` = 0 AND `model2`.`id` = 1
SomeModel Load (1.7ms) SELECT `model3`.* FROM `model3` WHERE `model3`.`type` IN ('SomeModel') AND `model3`.`is_something` = 0 AND `model3`.`id` = 1
SomeModel Load (1.7ms) SELECT `model4`.* FROM `model4` WHERE `model4`.`type` IN ('SomeModel') AND `model4`.`is_something` = 0 AND `model4`.`id` = 1
This is only an example.
Now, through my serializers, I would like to fetch only selected columns for model2, model3 and model4.
Unfortunately, ActiveRecord issues a call like SELECT model2.* FROM
For example, for the model2 serializer I only need the (:id, :name) columns.
Is it possible to make a call like this?
SELECT some_model.*, model2.id, model2.name FROM `some_model`
instead of
SELECT `model2`.* FROM `model2` WHERE `model2`.`type` IN ('SomeModel')
If you want to use Rails's includes feature, then no, there isn't an especially good way to selectively control the columns from included models.
If you're looking for help to optimize the query, you'll need to provide more specifics about the data schema and the desired output.

Ransack And Column Alias

In our controller logic, we have to create a few column aliases for a special class of object (which is a compilation of many objects):
The trouble is, when trying to ransack, ransack will ignore the alias and try to go to the original table.column_name. SQL after ransack:
"SELECT CONCAT_WS('
',users.first_name,users.middle_name,users.last_name) AS conducted_by,
COUNT(NULLIF(is_complete = false,true)) as complete_count,
COUNT(staff_assessments.id) as employee_count,
staff_assessment_groups.name as name,
MIN(staff_assessments.created_at) as created_at,
staff_assessment_groups.id as staff_assessment_group_id,
staff_assessment_groups.effective_date as effective_date,
staff_assessment_groups.effective_date as review_date, 'N/A' as
store_names, true as is_complete, true as is_360_evaluation,
\"staff_assessments\".\"assigner_position_id\" FROM
\"staff_assessments\" INNER JOIN \"positions\" ON \"positions\".\"id\"
= \"staff_assessments\".\"assigner_position_id\" AND \"positions\".\"deleted_at\" IS NULL INNER JOIN \"employees\" ON
\"employees\".\"id\" = \"positions\".\"employee_id\" AND
\"employees\".\"deleted_at\" IS NULL INNER JOIN \"users\" ON
\"users\".\"id\" = \"employees\".\"user_id\" AND
\"users\".\"deleted_at\" IS NULL INNER JOIN
\"staff_assessment_groups\" ON \"staff_assessment_groups\".\"id\" =
\"staff_assessments\".\"staff_assessment_group_id\" INNER JOIN
\"survey_types\" ON \"survey_types\".\"id\" =
\"staff_assessments\".\"survey_type_id\" INNER JOIN
\"survey_type_categories_types\" ON \"survey_types\".\"id\" =
\"survey_type_categories_types\".\"survey_type_id\" WHERE
\"staff_assessments\".\"position_id\" IN (12024,) AND
\"staff_assessments\".\"is_360_evaluation\" = 't' AND
\"survey_type_categories_types\".\"survey_type_category_id\" = 3 AND
\"staff_assessments\".\"conducted_by\" IN ('Bart Simpson') GROUP BY
users.id, \"staff_assessments\".\"assigner_position_id\",
staff_assessment_groups.id ORDER BY
staff_assessment_groups.effective_date DESC LIMIT 20 OFFSET 0"
(Note the conducted_by condition in the WHERE clause, which was bold in the original post. That was the original ransack search, but it is using staff_assessments.conducted_by instead of the aliased conducted_by above.)
So here is my question: is there a way to tell ransack to use the aliased field? Or is there a way to simply create the records as an ActiveRecord::Relation?
(I tried putting the objects into an array, but then I could no longer ransack it.)

How to find records that have duplicate data using Active Record

What is the best way to find records with duplicate values in a column using Ruby and the new ActiveRecord?
Translating @TuteC's answer into ActiveRecord:
sql = 'SELECT id,
COUNT(id) as quantity
FROM types
GROUP BY name
HAVING quantity > 1'
#=>
Type.select("id, count(id) as quantity")
.group(:name)
.having("quantity > 1")
Here's how I solved it with the AREL helpers, and no custom SQL:
Person.select("COUNT(last_name) as total, last_name")
.group(:last_name)
.having("COUNT(last_name) > 1")
.order(:last_name)
.map{|p| {p.last_name => p.total} }
Really, it's just a nicer way to write the SQL. This finds all records that have duplicate last_name values, and tells you how many and what the last names are in a nice hash.
I was beating my head against this problem with a 2016 stack (Rails 4.2, Ruby 2.2), and got what I wanted with this:
> Model.select([:thing]).group(:thing).having("count(thing) > 1").all.size
=> {"name1"=>5, "name2"=>4, "name3"=>3, "name4"=>2, "name5"=>2}
With custom SQL, this finds types with same values for name:
sql = 'SELECT id, COUNT(id) as quantity FROM types
GROUP BY name HAVING quantity > 1'
repeated = ActiveRecord::Base.connection.execute(sql)
In Rails 2.x, select is a private method of AR class. Just use find():
klass.find(:all,
  :select => "id, count(the_col) as num",
  :conditions => ["extra conditions here"],
  :group => 'the_col',
  :having => "num > 1")
Here is a solution that extends the other answers to show how to find and iterate through the records grouped by the duplicate field:
duplicate_values = Model.group(:field).having(Model.arel_table[:field].count.gt(1)).count.keys
Model.where(field: duplicate_values).group_by(&:field).each do |value, records|
  puts "The records with ids #{records.map(&:id).to_sentence} have field set to #{value}"
end
It seems a shame this has to be done with two queries but this answer confirms this approach.
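The grouping step runs in Ruby rather than in the database, so its logic can be tried without Rails. This is a minimal sketch that assumes a made-up `Record` struct in place of a model and uses `Array#join` instead of ActiveSupport's `to_sentence`. The `duplicates_by_field` helper is hypothetical.

```ruby
# Hypothetical stand-in for a model row with an id and the column being checked.
Record = Struct.new(:id, :field)

# Returns { value => [records] } for every value shared by more than one record.
def duplicates_by_field(records)
  records.group_by(&:field).select { |_value, group| group.size > 1 }
end

records = [
  Record.new(1, "a"),
  Record.new(2, "b"),
  Record.new(3, "a"),
]

duplicates_by_field(records).each do |value, group|
  puts "The records with ids #{group.map(&:id).join(', ')} have field set to #{value}"
end
```

Grouping every row in memory like this is only reasonable for small tables. The two-query version keeps the duplicate detection in the database and loads only the affected rows.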

how to find number of tag matches in acts as taggable on

I have two entries in my database
Obj1 is tagged with "hello, world, planet"
Obj2 is tagged with "hello"
If I do ModelName.tagged_with(["hello", "world", "planet", "earth"], :any => true),
I want to sort the returned objects from the highest to the lowest number of tags matched,
so in this case I'd like the order to be Obj1, Obj2.
How can I do this? Is there a way to get the number of tags matched for each of the returned results?
You can call tag_list on the objects and use that to figure out how many tags there are:
tags = %w{hello world planet earth}
objs = ModelName.tagged_with(tags, :any => true)
objs = objs.sort_by { |o| -(tags & o.tag_list).length }
The expression tags & o.tag_list yields the intersection of the tags you're looking for and the tags found. sort_by sorts in ascending order, so negating the size of the intersection puts larger intersections at the front. Negating the key is an easy way to reverse the usual sort order.
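The sorting itself is plain Ruby, so it can be checked without a database. The sketch below uses a made-up `Tagged` struct that exposes `tag_list` the way a tagged model would. The `sort_by_tag_matches` helper is hypothetical.

```ruby
# Stand-in for a tagged model; real objects get tag_list from acts-as-taggable-on.
Tagged = Struct.new(:name, :tag_list)

# Sorts by how many of the wanted tags each object carries, most matches first.
def sort_by_tag_matches(objects, wanted)
  objects.sort_by { |o| -(wanted & o.tag_list).length }
end

wanted = %w[hello world planet earth]
objs = [
  Tagged.new("Obj2", %w[hello]),
  Tagged.new("Obj1", %w[hello world planet]),
]

sort_by_tag_matches(objs, wanted).each do |o|
  puts "#{o.name}: #{(wanted & o.tag_list).length} matching tags"
end
```

The same intersection also answers the second part of the question, since its length is the number of tags matched for each result.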
Posting this here if someone else is looking for a way to query a model by tags and order by the number of matches. This solution also allows for the usage of any "equality" operator like the % from pg_trgm.
query = <<-SQL
SELECT users.*, COUNT(DISTINCT taggings.id) AS ct
FROM users
INNER JOIN taggings ON taggings.taggable_type = 'User'
AND taggings.context = 'skills'
AND taggings.taggable_id = users.id
AND taggings.tag_id IN
(SELECT tags.id FROM tags
WHERE (LOWER(tags.name) % 'ruby'
OR LOWER(tags.name) % 'java'
OR LOWER(tags.name) % 'sa-c'
OR LOWER(tags.name) % 'c--'
OR LOWER(tags.name) % 'gnu e'
OR LOWER(tags.name) % 'lite-c'
))
GROUP BY users.id
ORDER BY ct DESC;
SQL
User.find_by_sql(query)
Note that the code above will only work if you have pg_trgm enabled. You can also simply replace % with ILIKE.
EDIT: With ActiveRecord and eager loading:
This could be in a scope or class method and can be chained with other ActiveRecord methods.
ActiveRecord::Base.connection
.execute('SET pg_trgm.similarity_threshold = 0.5')
matches = skills.map do
'LOWER(tags.name) % ?'
end.join(' OR ')
select('users.*, COUNT(DISTINCT taggings.id) AS ct')
.joins(sanitize_sql_array(["INNER JOIN taggings
ON taggings.taggable_type = 'User'
AND taggings.context = 'skills'
AND taggings.taggable_id = users.id
AND taggings.tag_id IN
(SELECT tags.id FROM tags WHERE (#{matches}))", *skills]))
.group('users.id')
.order('ct DESC')
.includes(:skills)
Override skill_list from acts-as-taggable-on in the model:
def skill_list
  skills.collect(&:name)
end
and proceed normally.
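The `matches` string in the eager-loading version is built with ordinary Ruby, using one `?` placeholder per skill, so the skill values are never interpolated into the SQL text. The following sketch covers only that step and shows the swap from `%` to `ILIKE` mentioned earlier. The `tag_match_clause` helper, its `operator` parameter and the `ALLOWED_OPERATORS` list are hypothetical names added for illustration.

```ruby
# Operators are interpolated into the SQL fragment, so only known ones are accepted.
ALLOWED_OPERATORS = ["%", "ILIKE", "="].freeze

# Builds "LOWER(tags.name) % ? OR LOWER(tags.name) % ? ..." with one
# placeholder per skill; the values are bound later by sanitize_sql_array.
def tag_match_clause(skills, operator = "%")
  raise ArgumentError, "unsupported operator: #{operator}" unless ALLOWED_OPERATORS.include?(operator)
  raise ArgumentError, "skills must not be empty" if skills.empty?

  skills.map { "LOWER(tags.name) #{operator} ?" }.join(" OR ")
end

puts tag_match_clause(%w[ruby java])
puts tag_match_clause(%w[ruby java], "ILIKE")
```

Rejecting an empty skills list avoids generating an invalid `WHERE ()` clause.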
