Remove orphan child documents in a Mongo database - ruby

It was my understanding that when you destroy a parent document in Mongo that you also destroy its children and it will cascade down the chain until all referenced documents have been removed.
I have a collection structure like the following:
class A
  include Mongoid::Document
  field :name, :type => String
  has_many :bs
end
class B
  include Mongoid::Document
  field :name, :type => String
  has_many :cs
end
class C
  include Mongoid::Document
  field :name, :type => String
end
I came across a situation in my code where I needed to delete one instance of class A and all of its related documents. Since each of these models is based on Mongoid, I used the destroy_all method like so:
a = A.find("123456789")
a.bs.destroy_all
=> 'however many b's I had'
From reading the documentation, I thought that each of the referenced documents would be removed as well.
Unfortunately, what happened is that all my class b's are gone and I have a bunch of orphaned class c's in my database.
So:
A) Assuming destroy_all doesn't do what I thought it would, is there anything that can be used to actually delete a parent and all of its referenced documents in Mongoid?
B) Although I performed this operation on a local machine, I would still like to know: is there any way to remove orphaned documents from an altered collection?

You'll need to add:
:dependent => :destroy
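to your associations, at every level the cascade should pass through: has_many :bs in A and has_many :cs in B. destroy_all runs the destroy callbacks on each matching document, so it is the option on B that removes the c's.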
See "DEPENDENT BEHAVIOR": http://two.mongoid.org/docs/relations/referenced/1-n.html

It was my understanding that when you destroy a parent document in Mongo that you also destroy its children
Only when it is a single document. You are showing a structure of many documents.
I am not a Ruby programmer and I have never used Mongoid; however, it seems that destroy_all is essentially a remove that matches more than one document, as is supported by the docs: http://two.mongoid.org/docs/persistence/standard.html#destroy_all
Deletes all matching documents in the database given the supplied
conditions. See the criteria section on deletion for preferred ways to
perform these actions. This runs destroy callbacks on all matching
documents.
I am guessing that if you wish to remove the children as well you will be required to manually specify them since MongoDB has no relational behaviour and so has no ability to cascade your "relations" on its own.
The only real way, I would say, to remove orphaned documents is most likely the hard way: go through all distinct parent references in the child collection and query the parent collection to see whether each referenced parent exists. If it does not, remove the children that reference it.
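The sweep described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: the arrays of hashes stand in for the parent and child collections, the foreign key name "b_id" is a guess at what the child documents store, and the helper names missing_parent_ids and remove_orphans! are hypothetical. Against a real database the same three steps would be a distinct query on the child collection, a lookup in the parent collection, and a delete on the children whose parent is missing.

```ruby
require "set"

# Returns the parent ids that children reference but no parent document has.
# Children without a foreign key value are ignored, not treated as orphans.
def missing_parent_ids(parents, children, foreign_key)
  referenced = children.map { |child| child[foreign_key] }.compact.uniq
  existing   = parents.map { |parent| parent["_id"] }.to_set
  referenced.reject { |id| existing.include?(id) }
end

# Removes every child whose referenced parent is missing.
# Mutates `children` in place and returns the removed documents.
def remove_orphans!(parents, children, foreign_key)
  missing = missing_parent_ids(parents, children, foreign_key).to_set
  orphans = children.select { |child| missing.include?(child[foreign_key]) }
  children.reject! { |child| missing.include?(child[foreign_key]) }
  orphans
end

# In-memory stand-ins for the B (parent) and C (child) collections.
parents  = [{ "_id" => 1 }, { "_id" => 2 }]
children = [
  { "_id" => 10, "b_id" => 1 },
  { "_id" => 11, "b_id" => 3 }, # parent 3 no longer exists
  { "_id" => 12, "b_id" => 3 },
  { "_id" => 13, "b_id" => 2 }
]

removed = remove_orphans!(parents, children, "b_id")
puts "removed #{removed.size} orphaned children"
```

For a chain such as A, B, C the sweep has to run once per level, starting nearest the top, because removing orphaned b's can itself orphan more c's.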

Related

What is a good way to do an atomic insert or prevent a race condition on create with Mongoid?

class User
  include Mongoid::Document
  field :email, type: String
  validates_uniqueness_of :email
end
Although Mongoid supports atomic operations, I do not see one for insert.
Since User.create is not atomic, it seems that 2 Users could be created with the same email address simultaneously.
So, what is a good way to ensure that 2 users do not register the same email address simultaneously?
I can see one solution is to use a unique DB index, but are there any other good ways of doing this?
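To make the race and the unique-index remedy concrete, here is a toy sketch in plain Ruby with no database involved. Everything in it is hypothetical: UniqueEmailStore and try_register are illustration-only names, and the Mutex merely stands in for the guarantee a unique index gives, namely that the check and the write happen as one step inside the store. With a real unique index on email, the application would likewise attempt the insert and handle the duplicate-key error, rather than relying on a separate "does it exist?" query.

```ruby
# Toy store whose insert checks and writes under a single lock, so two
# concurrent callers can never both succeed for the same email.
class UniqueEmailStore
  class DuplicateEmail < StandardError; end

  def initialize
    @emails = {}
    @lock = Mutex.new
  end

  # Raises DuplicateEmail if the email is already present.
  def insert_unique(email)
    @lock.synchronize do
      raise DuplicateEmail, email if @emails.key?(email)
      @emails[email] = true
    end
  end

  def count
    @lock.synchronize { @emails.size }
  end
end

# Attempts the insert and reports the outcome instead of pre-checking.
def try_register(store, email)
  store.insert_unique(email)
  :created
rescue UniqueEmailStore::DuplicateEmail
  :duplicate
end

store = UniqueEmailStore.new
threads = Array.new(8) { Thread.new { try_register(store, "a@example.com") } }
results = threads.map(&:value)
puts "created: #{results.count(:created)}, duplicates: #{results.count(:duplicate)}"
```

The design point is that uniqueness is enforced where the write happens. A validation that queries first and inserts afterwards leaves a window between the two steps, however small.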

Possible to use `one_to_many_through` associations in Sequel ORM?

I have a case where one model is related to two other ones. I am trying to correctly set up the relationships between these three models.
A simplified example... The first 2 tables are clients and invoices:
db.create_table(:clients) do
  primary_key :id
  String :name
end
db.create_table(:invoices) do
  primary_key :id
  String :description
  Integer :balance
end
A third table, called files, contains records for files which can be related to either clients or invoices:
db.create_table(:files) do
  primary_key :id
  String :name
  String :path
  String :type # [image, pdf, word, excel]
end
There are 2 joiner tables to connect files to clients and invoices:
db.create_table(:clients_files) do
  Integer :client_id
  Integer :file_id
end
db.create_table(:files_invoices) do
  Integer :invoice_id
  Integer :file_id
end
The question is, how to correctly set up the relationships in the models, such that each client and invoice can have one or more related files?
I can accomplish this using many_to_many and has_many :through associations, however, this doesn't seem to be the right approach, because a given file can belong to only one customer or invoice, not to many.
I can also do this using polymorphism, but the documentation discourages this approach:
Sequel discourages the use of polymorphic associations, which is the
reason they are not supported by default. All polymorphic associations
can be made non-polymorphic by using additional tables and/or columns
instead of having a column containing the associated class name as a
string.
Polymorphic associations break referential integrity and are
significantly more complex than non-polymorphic associations, so their
use is not recommended unless you are stuck with an existing design
that uses them.
The more correct association would be one_to_many_through or many_to_one_through, but I can't find the right way to do this. Is there a vanilla Sequel way to achieve this, or is there a model plugin that provides this functionality?

With your current schema, you just want to use a many_to_many association to files:
Client.many_to_many :files
Invoice.many_to_many :files
To make sure each file can only have a single client/invoice, you can make file_id the primary key of clients_files and files_invoices (a plain unique constraint/index would also work). Then you can use one_through_one:
File.one_through_one :client
File.one_through_one :invoice
Note that this still allows a File to be associated to both a client and an invoice. If you want to prevent that, you need to change your schema. You could move the client_id and invoice_id foreign keys to the files table (or use a single join table with both keys), and have a check constraint that checks that only one of them is set.
Note that the main reason to avoid polymorphic keys (in addition to complexity) is that doing so allows the database to enforce referential integrity. With your current join tables, you aren't creating foreign keys, just integer fields, so you aren't enforcing referential integrity.

How to create Cassandra tables with cequel (without using rails)

I am using Cequel as ORM for Cassandra without rails.
I have a problem when trying to create a simple list of projects.
First I defined the model with three columns which should belong to a compound key.
class Project
  include Cequel::Record
  key :client, :text, { partition: true }
  key :type, :text, { partition: true }
  key :dep, :text, { partition: true }
end
Later when I try to create the Project via
project = Project.create!({client: "test", type: "test", dep: "test"})
I get an error that the table does not exist.
The tables are not created automatically by Project.create!; I have to create the table manually first:
connection.schema.create_table(:projects) do
  partition_key :client, :text
  partition_key :type, :text
  partition_key :dep, :text
end
However, this syntax differs from the documented Record definition, and I only found it by sifting through the source code. It also creates two problems:
Code overhead
I don't know the syntax for has_many and belongs_to so I cannot create the table correctly if the Record includes this
Am I overlooking a method to create the table automatically from the Project class definition?

The tables can be created by calling the synchronize_schema method on the class. So, in your case, you should run Project.synchronize_schema before attempting to read from or write to the table.
Given that you are building a broader project, you can consider using Rake tasks for it. You also need to migrate so that the tables are actually created in Cassandra. You can use rake cequel:migrate for that. There are more tasks which you can see via rake --tasks.
If you are creating your custom project with custom places for models, you probably need to hack the rake migration task a little. This is an implementation that I did for my project https://github.com/octoai/gems/blob/master/octo-core/Rakefile#L75. Also take a look at how models are defined https://github.com/octoai/gems/tree/master/octo-core/lib/octocore/models
Hope this helps.

Datamapper - create unique index over belongs_to attribute

I'm using DataMapper connected to an SQLite backend. I need to create a unique index across my four belongs_to columns. Here is the table.
class Score
  include DataMapper::Resource
  property :id, Serial
  property :score, Integer
  belongs_to :pageant
  belongs_to :candidate
  belongs_to :category
  belongs_to :judge
  # What we want is a UNIQUE INDEX on the four properties!
end
Things I've done:
A unique index on the four via something like :unique_index => :single_score. This works only if you have a property already included.
validates_uniqueness_of. I think the scope only works for a 2-column unique index.
My current solution, which is to just create a dummy field "dont_mind_me" so that I can put :unique_index => :single_score in it, and everything works. Is this something that's okay to do?
Create an index using raw SQL; SQLite supports a unique index across the four fields.
Basically there are two parts to this question:
Is my solution okay, or should I find another one? I'm at wit's end dealing with what seems to be something trivial, even with raw SQL
How do I create an "after_create_table" hook in DataMapper? The hooks in the documentation only tell about post-CRUD data.

Sinatra with existing database (that doesn't abide by naming conventions)

I have an existing legacy Firebird database with nonstandard table and field names.
I would like to write a Sinatra app that can access it and display information. I've seen tools like dm-is-reflective that appear to work when a database follows proper naming conventions, but how do I use DataMapper (or ActiveRecord, whichever is easier) to access those tables?
For example, assuming I had these two tables:
Bookshelfs
  shelf_id: integer
  level: integer
  created: timestamp
Book
  id: integer
  id_of_shelf: integer
  title: string
  pages: integer
Something like that, with odd naming conventions that don't follow any set pattern, and where one table's record might "own" multiple entries in another table even though there is no foreign key assigned.
How would you set up DataMapper (or ActiveRecord) to communicate with it?

Look into this gem to get set up with ActiveRecord on Sinatra:
https://github.com/bmizerany/sinatra-activerecord
As for how to define the relations, ActiveRecord can do this easily.
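class Book < ActiveRecord::Base
  # Assumption: point each model at the legacy table name from the question.
  # (self.table_name= needs ActiveRecord 3.2+; older versions use set_table_name.)
  self.table_name = 'Book'
  belongs_to :bookshelf, :class_name => 'Bookshelf', :foreign_key => 'id_of_shelf'
end
class Bookshelf < ActiveRecord::Base
  self.table_name = 'Bookshelfs'
  self.primary_key = 'shelf_id' # the legacy key column is not named "id"
  has_many :books, :class_name => 'Book', :foreign_key => 'id_of_shelf'
end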

Assuming you have figured out how to connect to your legacy database using ActiveRecord's Firebird adapter, the next thing I would do is define a view on top of each table, e.g.
CREATE VIEW books AS SELECT * FROM Book;
CREATE VIEW bookshelves AS SELECT * FROM Bookshelfs;
This way you can simply define models Book and Bookshelf in ActiveRecord as per usual and it will find everything in the right place inside the database.

Resources