ThinkingSphinx Order via URL Params - ruby

I am using ThinkingSphinx in an application, and right now I am not applying any ordering to my results. However, I would like to make ordering an option via a link on the page: clicking it passes the sort column through the URL and 'refreshes' the page with the results ordered.
In the .search parameters I tried :order => params[:o] and then passed o=columnname in the URL, but that does not seem to work.
Just to note, when I hard-code the ordering it works fine, so I'm not having trouble with indexing or making a DB column sortable. I would just like the results to be displayed in order based on a URL argument.

According to the Sphinx documentation, the fields you want to use for sorting must be flagged as sortable. Attributes defined with has do not have to be flagged, because all attributes are sortable:
class Article
  ..
  define_index do
    indexes title, :sortable => true
    indexes author(:name), :as => :author, :sortable => true
    ..
  end
end
Then one can use the :order and :sort_mode parameters to define the sort order:
sort_order = params[:o]
Article.search "pancakes", :order => sort_order, :sort_mode => :desc
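Because params[:o] comes straight from the URL, it may be safer to map it onto a known list of sortable columns before handing it to :order. The following is a minimal sketch, assuming the sortable fields are title and author as in the index above. SORTABLE_COLUMNS, sort_options and the :d direction parameter are hypothetical names for illustration, not part of Thinking Sphinx.

```ruby
# Hypothetical whitelist of the columns flagged :sortable in the index.
SORTABLE_COLUMNS = %w[title author].freeze

# Builds the :order / :sort_mode options from raw URL params.
# Unknown or missing columns yield an empty hash, so the search
# simply runs without an explicit ordering.
def sort_options(params)
  column = params[:o].to_s
  return {} unless SORTABLE_COLUMNS.include?(column)

  # :d is an assumed direction parameter; anything but 'asc' sorts descending.
  mode = params[:d].to_s == 'asc' ? :asc : :desc
  { :order => column.to_sym, :sort_mode => mode }
end

accepted = sort_options(:o => 'title', :d => 'asc')  # known column, ascending
rejected = sort_options(:o => 'password')            # unknown column, no ordering
```

The result could then be passed along with the query, e.g. Article.search "pancakes", sort_options(params).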

Related

How do I remove sphinx_deleted from a Sphinx query?

I am new to Ruby and ThinkingSphinx.
I have the following Sphinx query: SELECT * FROM user_core, user_delta WHERE sphinx_deleted = 0.
I do not want the condition "WHERE sphinx_deleted = 0" in the query. How do I remove it? I have removed sql_attr_uint = sphinx_deleted from my sphinx.conf file, yet sphinx_deleted is still being passed in the query.
Here is the index file definition:
ThinkingSphinx::Index.define :user, :with => :active_record, :delta => true do
  indexes [first_name, last_name, display_name], :as => :name, :sortable => true
  indexes first_name, :sortable => true
  indexes last_name, :sortable => true
  indexes display_name, :sortable => true
  indexes email, :sortable => true
  indexes phone, :sortable => true
  indexes title, :sortable => true
  has id, :as => :user_id
  has roles(:id), :as => :role_ids
  has jurisdictions(:id), :as => :jurisdiction_ids
  set_property :delta => true
end
I do not have a sphinx_scope or default_sphinx_scope defined.
We are using thinking-sphinx-3.1.0 and ruby-2.1.0
The sphinx_deleted attribute is created by Thinking Sphinx, and is used in the following cases (using your scenario of a User model with core and delta indices in the examples):
When a User is deleted, sphinx_deleted is set to 1 for that record in both the core and delta indices - there's no point returning Sphinx records if the underlying ActiveRecord object no longer exists.
When a User is updated, the delta index is processed with the latest field and attribute details, and the core index's document has sphinx_deleted set to 1, so only the latest (accurate) information will match. e.g. if a user has their name changed from Fred to Georgina, a search for 'Fred' will not return Georgina, because the core index document (which does match) is filtered out.
That is why the attribute exists. You cannot tell Thinking Sphinx to not add it, nor can you remove that filter, short of mucking around in the internals of Thinking Sphinx.
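To make the effect of the filter concrete, here is a plain-Ruby simulation of the Fred/Georgina case. It does not touch Sphinx or Thinking Sphinx at all: the Doc struct and the search method are hypothetical stand-ins for index documents and for the query Thinking Sphinx builds.

```ruby
# Stand-in for a document in a Sphinx index (not a Thinking Sphinx class).
Doc = Struct.new(:index, :user_id, :name, :sphinx_deleted)

# State after user 1 is renamed from Fred to Georgina:
# the core index still holds the stale document, now flagged as deleted,
# while the delta index holds the fresh one.
DOCS = [
  Doc.new('user_core',  1, 'Fred',     1),
  Doc.new('user_delta', 1, 'Georgina', 0)
]

# Mimics searching both indices for a term, optionally with the
# "sphinx_deleted = 0" filter applied.
def search(term, filter_deleted = true)
  DOCS.select do |doc|
    doc.name == term && (!filter_deleted || doc.sphinx_deleted == 0)
  end
end

with_filter    = search('Fred')         # stale core document is filtered out
without_filter = search('Fred', false)  # without the filter, user 1 still matches 'Fred'
current        = search('Georgina')     # the delta document matches as expected
```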
If there is a specific reason for wanting to remove the attribute and filter, feel free to comment here, or you can open an issue on the GitHub repo, or post to the TS Google Group.
Update
Okay, further to this, there are three ways around it.
Option One:
The first way is to make the query to Sphinx yourself, using a Thinking Sphinx connection:
results = ThinkingSphinx::Connection.take do |connection|
  connection.execute "SELECT * FROM user_core, user_delta"
end
Keep in mind that this returns raw Sphinx values, not ActiveRecord instances.
Option Two:
A more complicated alternative, though, is to have your own search middleware stack. First, you'll want to create a custom subclass of ThinkingSphinx::Middlewares::SphinxQL that removes the :sphinx_deleted filter:
class SphinxQLWithoutFilter < ThinkingSphinx::Middlewares::SphinxQL
  def call(contexts)
    contexts.each do |context|
      Inner.new(context).call
    end
    app.call contexts
  end

  private

  class Inner < ThinkingSphinx::Middlewares::SphinxQL::Inner
    def inclusive_filters
      super.except :sphinx_deleted
    end
  end
end
Then, create a new middleware stack which uses this new SphinxQL query middleware:
WithoutFilterMiddleware = ::Middleware::Builder.new do
  use ThinkingSphinx::Middlewares::StaleIdFilter
  use SphinxQLWithoutFilter
  use ThinkingSphinx::Middlewares::Geographer
  use ThinkingSphinx::Middlewares::Inquirer
  use ThinkingSphinx::Middlewares::ActiveRecordTranslator
  use ThinkingSphinx::Middlewares::StaleIdChecker
  use ThinkingSphinx::Middlewares::Glazier
end
And then you can use that middleware stack in specific search queries:
User.search 'foo', :middleware => WithoutFilterMiddleware
It's worth noting the two middleware present in that stack for stale ids. They work together to catch any Sphinx results that do not have a matching ActiveRecord object, and re-run the Sphinx query up to three times filtering out those unmatched records. They're probably useful, but if you don't want to use them, you can remove them from your custom stack. However, without them, any Sphinx records that don't have matching ActiveRecord objects will be transformed into nils.
Option Three:
This is the more hackish version of the previous solution, but will apply to all searches, so probably isn't worthwhile: re-open the class that adds the filter with class_eval and change the method definition:
ThinkingSphinx::Middlewares::SphinxQL::Inner.class_eval do
  def inclusive_filters
    # normally:
    # (options[:with] || {}).merge({:sphinx_deleted => false})
    # but without the sphinx_deleted filter:
    options[:with] || {}
  end
end
Now, all that said: I presume you're not actually deleting users, but somehow the deletion callbacks are being fired anyway? Hence, users do exist but are currently being filtered out by Sphinx? If so, I highly recommend not using ActiveRecord's destroy method, and instead having a custom method to mark users as inactive. This avoids the callbacks, and thus avoids the need for any of the above 'solutions'.

Mongoid push with upsert

I've got a User model:
class User
  include Mongoid::Document
  field :username, type: String
  embeds_many :products
end
class Product
  include Mongoid::Document
  field :name, type: String
  embedded_in :user
end
I would like to have a single operation that would:
insert the user
update the user in case the user exists already (this I can easily do with upsert)
push the products
This works for upserting:
User.new(username: 'Hello').upsert
The problem is that this will delete the embedded products (the products attribute is not specified).
Can I ask mongoid to skip setting array to empty?
Can I ask mongoid to push new products at the end of products array?
Something like this:
User.new(username: 'Hello').push(products: [Product.new(name: 'Screen')]).upsert
In the end I wrote the following query manually:
User.mongo_client[:users].update_one(
  {username: 'Hello'},
  {
    "$set" => {first_name: 'Jim', last_name: 'Jones'},
    "$pushAll" => {products: [{name: 'Screen'}, {name: 'Keyboard'}]}
  },
  upsert: true
)
Where:
$set - are the params that we want to set for a given document
$pushAll - when you use $push you can specify only one element, $pushAll allows you to append multiple elements (when you specify only one it will behave like $push)
upsert - will do the insert/update magic in the mongodb
In the second hash you can also specify other update operators such as $inc, $pop, $set, etc., which is quite useful.
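Note that $pushAll was deprecated in MongoDB 2.4 and removed in 3.6. On newer servers the same effect is available through $push combined with the $each modifier. Below is a sketch of the equivalent update document, built as a plain Ruby hash; the products_update helper name is hypothetical.

```ruby
# Builds an update document that sets the given attributes and appends
# the given products. Uses $push with $each, which takes the place of
# the removed $pushAll operator.
def products_update(attributes, products)
  {
    '$set'  => attributes,
    '$push' => { 'products' => { '$each' => products } }
  }
end

update = products_update(
  { 'first_name' => 'Jim', 'last_name' => 'Jones' },
  [{ 'name' => 'Screen' }, { 'name' => 'Keyboard' }]
)
```

The resulting hash would take the place of the second argument, e.g. User.mongo_client[:users].update_one({username: 'Hello'}, update, upsert: true).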

How can I avoid duplication in a join query using Sequel with Postgres on Sinatra?

I want to do a simple join. I have two tables: "candidates" and "notes".
Not all candidates have notes written about them, some candidates have more than one note written about them. The linking fields are id in the candidates table and candidate_id in the notes table. The query is:
people = candidates.where(:industry => industry).where("country = ?", country).left_outer_join(:notes, :candidate_id => :id).order(Sequel.desc(:id)).map do |row|
  {
    :id => row[:id],
    :first => row[:first],
    :last => row[:last],
    :designation => row[:designation],
    :company => row[:company],
    :email => row[:email],
    :remarks => row[:remarks],
    :note => row[:note]
  }
end
It works kind of fine and gets all the specified candidates from the candidates table and the notes from the notes table but where there is more than one note it repeats the name of the candidate. In the resulting list, person "abc" appears twice or three times depending on the number of notes associated with that person.
I am not actually printing the notes in the HTML result just a "tick" if that person has notes and "--" if no notes.
I want the person's name to appear only once. I have tried adding distinct in every conceivable place in the query but it made no difference.
Any ideas?
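In order for distinct to work, you need to make sure you select only the columns that you want to be distinct on. You could try adding this to the query, where the last expression yields a boolean note column (true when a note exists) instead of the note text, so all rows for one candidate become identical:
.select(:candidates__id, :first, :last, :designation, :company, :email, :remarks, Sequel.~(:notes__candidate_id=>nil).as(:note)).distinct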
But you may be better off using a subselect instead of a join to check for the existence of notes (assuming you are using a decent database):
candidates.where(:industry => industry, :country=>country).select_append(Sequel.as({:id=>DB[:notes].select(:candidate_id)}, :note)).order(Sequel.desc(:id)).map do |row|
{ :id => row[:id], :first => row[:first], :last => row[:last], :designation => row[:designation], :company => row[:company], :email => row[:email], :remarks => row[:remarks], :note => row[:note] }
end
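If changing the query is not an option, the duplicate rows produced by the join can also be collapsed in Ruby after fetching. This is a minimal sketch on plain hashes shaped like the joined rows; the sample data and the collapse_notes name are made up for illustration.

```ruby
# Collapses joined candidate/note rows into one row per candidate,
# replacing the note text with a boolean "has at least one note" flag.
def collapse_notes(rows)
  rows.group_by { |row| row[:id] }.map do |_id, group|
    candidate = group.first.reject { |key, _value| key == :note }
    candidate.merge(:note => group.any? { |row| !row[:note].nil? })
  end
end

# Sample rows as a left outer join would return them: one row per note,
# and a single row with a nil note for candidates without notes.
rows = [
  { :id => 2, :first => 'Ann', :note => 'called' },
  { :id => 2, :first => 'Ann', :note => 'emailed' },
  { :id => 1, :first => 'Bob', :note => nil }
]

people = collapse_notes(rows)  # one entry per candidate, original order kept
```

The view can then print a "tick" or "--" based on the :note flag of each entry.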

Can I retrieve objects with Sequel from a complex query that limits results to fields from a single table?

I have a model whose rows I always want to sort based on the values in another associated model and I was thinking that the way to implement this would be to use set_dataset in the model. This is causing query results to be returned as hashes rather than objects, though, so none of the methods from the class can be used when iterating over the dataset.
I basically have two classes
class SortFields < Sequel::Model(:sort_fields)
  set_primary_key :objectid
end
class Items < Sequel::Model(:items)
  set_primary_key :objectid
  one_to_one :sort_fields, :class => SortFields, :key => :objectid
end
Some backstory: the data is imported from a legacy system into mysql. The values in sort_fields are calculated from multiple other associated tables (some one-to-many, some many-to-many) according to some complicated rules. The likely solution will be to just add the values in sort_fields to items (I want to keep the imported data separate from the calculated data, but I don't have to). First, though, I just want to understand how far you can go with a dataset and still get objects rather than hashes.
If I set the dataset to sort on a field in items like so
class Items < Sequel::Model(:items)
  set_primary_key :objectid
  one_to_one :sort_fields, :class => SortFields, :key => :objectid
  set_dataset(order(:sortnumber))
end
then the expected clause is added to the generated SQL, e.g.:
>> Items.limit(1).sql
=> "SELECT * FROM `items` ORDER BY `sortnumber` LIMIT 1"
and queries still return objects:
>> Items.limit(1).first.class
=> Items
If I order it by the associated fields though...
class Items < Sequel::Model(:items)
  set_primary_key :objectid
  one_to_one :sort_fields, :class => SortFields, :key => :objectid
  set_dataset(
    eager_graph(:sort_fields).
      order(:sort1, :sort2, :sort3)
  )
end
...I get hashes
?> Items.limit(1).first.class
=> Hash
My first thought was that this happens because all fields from sort_fields are included in the results, and that if I selected only the fields from items I would get Items objects again:
class Items < Sequel::Model(:items)
  set_primary_key :objectid
  one_to_one :sort_fields, :class => SortFields, :key => :objectid
  set_dataset(
    eager_graph(:sort_fields).
      select(:items.*).
      order(:sort1, :sort2, :sort3)
  )
end
The generated SQL is what I would expect:
>> Items.limit(1).sql
=> "SELECT `items`.* FROM `items` LEFT OUTER JOIN `sort_fields` ON (`sort_fields`.`objectid` = `items`.`objectid`) ORDER BY `sort1`, `sort2`, `sort3` LIMIT 1"
It returns the same rows as the set_dataset(order(:sortnumber)) version but it still doesn't work:
>> Items.limit(1).first.class
=> Hash
Before I add the sort fields to the items table so that they can all live happily in the same model, is there a way to tell Sequel to return an object when it wants to return a hash?
If you use #eager_graph, you must use #all instead of #each to retrieve the results in order for the graph to be processed (since you cannot eagerly load without having all instances up front), or use the eager_each plugin (which makes #each call #all internally).

Is it possible to specify what index a query should use in Mongoid?

MongoDB seems like it is using an inefficient query pattern when one index is a subset of another index.
class Model
  include Mongoid::Document
  field :status, :type => Integer
  field :title, :type => String
  field :subtitle, :type => String
  field :rating, :type => Float
  index([
    [:status, Mongo::ASCENDING],
    [:title, Mongo::ASCENDING],
    [:subtitle, Mongo::ASCENDING],
    [:rating, Mongo::DESCENDING]
  ])
  index([
    [:status, Mongo::ASCENDING],
    [:title, Mongo::ASCENDING],
    [:rating, Mongo::DESCENDING]
  ])
end
The first index is used both when querying on status, title and subtitle and sorting on rating, and when querying on just status and title and sorting on rating. This happens even though explain() combined with hint() in the JavaScript console shows that the second index is 4 times faster for the latter query.
How can I tell Mongoid to tell MongoDB to use the second index?
You can pass options such as hint to Mongo::Collection using Mongoid::Criterion::Optional.extras
An example:
criteria = Model.where(:status => true, :title => 'hello world').desc(:rating)
criteria.extras(:hint => {:status => 1, :title => 1, :rating => -1})
extras accepts anything that Mongo::Collection can handle
http://www.mongodb.org/display/DOCS/Optimization#Optimization-Hint
While the mongo query optimizer often performs very well, explicit "hints" can be used to force mongo to use a specified index, potentially improving performance in some situations.
db.collection.find({user:u, foo:d}).hint({user:1});
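A hint identifies an index by its key pattern, so the hash passed to extras has to list the same keys, in the same order and direction, as the index definition. The sketch below derives the hint from the second index spec in the question. The ASCENDING and DESCENDING constants are defined locally as stand-ins for Mongo::ASCENDING (1) and Mongo::DESCENDING (-1), and hint_for is a hypothetical helper.

```ruby
ASCENDING  = 1    # stand-in for Mongo::ASCENDING
DESCENDING = -1   # stand-in for Mongo::DESCENDING

# The second index from the model, as an array of [field, direction] pairs.
SECOND_INDEX = [
  [:status, ASCENDING],
  [:title,  ASCENDING],
  [:rating, DESCENDING]
]

# Turns an index spec into a hint hash, preserving the key order.
def hint_for(index_spec)
  index_spec.each_with_object({}) do |(field, direction), hint|
    hint[field] = direction
  end
end

hint = hint_for(SECOND_INDEX)
```

It could then be used as criteria.extras(:hint => hint_for(SECOND_INDEX)), which keeps the hint in sync with the index definition.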
You will need to work from http://www.rdoc.info/github/mongoid/mongoid/master/Mongoid/Cursor here, as I do not know Ruby well enough. It mentions hint.
