Case insensitive uniqueness validation in a Ruby Sequel.migration

I'm trying to figure out a good validation to use in my migration that will require case-insensitive uniqueness for user email addresses. In short, I want something like validates :email, :uniqueness => {:case_sensitive => false} without having to convert everything to use Rails or ActiveRecord. I could run emails through regexes, but I don't like that solution.
I found a comment[1] saying you could use
validates_unique(:email){ |ds| ds.opts[:where].args.map! { |x| Sequel.function(:lower, x)}; ds}
but I don't understand what that code is doing, and I don't want to use code when I have no idea what that ds object is or what is going on (why map!? does PostgreSQL have a :lower function for Sequel.function? ... probably, but I just don't know).
[1] http://comments.gmane.org/gmane.comp.lang.ruby.sequel/6447
So I need one of two things answered:
1) How do I perform a case-insensitive uniqueness validation in a pure Sequel.migration (no ActiveRecord, no Rails)?
- OR -
2) If that code snippet I found online is actually what I want, what does it do & how does it work? (What is the ds object and what does this validation do with my database?)

As the Tin Man mentioned, you are confusing validations and constraints. You say you are trying to add a constraint and talk about Sequel.migration, but those have nothing to do with validations.
If you want to add a database constraint, you need to do something like this in a migration:
alter_table(:table){add_unique_constraint Sequel.function(:lower, :email)}
This makes the database itself reject emails that differ only in case.
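For context, a complete migration wrapping that constraint might look like the sketch below. The :users table and the constraint name are assumptions for illustration; on databases that don't support function-based unique constraints, a unique index on the same expression (add_index Sequel.function(:lower, :email), :unique => true) is the usual equivalent.
Sequel.migration do
  up do
    alter_table(:users) do
      # Reject emails that differ only in case at the database level.
      add_unique_constraint Sequel.function(:lower, :email), :name => :users_email_lower_uk
    end
  end
  down do
    alter_table(:users) do
      drop_constraint(:users_email_lower_uk)
    end
  end
end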
Validations are just for presenting nice error messages to the user. They run before saving, so that instead of the database raising an exception (which is difficult to deal with), you get a nice error message.
As that comment mentions, you can't use validates_unique for case-insensitive lookups on case-sensitive databases without a hack. It would require validates_unique to accept an additional option (which may be added in the future).
If you don't want to use a hack like that, you'll have to do the validation manually:
ds = model.where { |o| { o.lower(:email) => o.lower(email) } }
ds = ds.exclude(pk_hash) unless new?
errors.add(:email, 'is already taken') unless ds.count == 0
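For context, a sketch of where that check would live, inside the model's validate method (the User class name is an assumption):
class User < Sequel::Model
  def validate
    super  # keep any validations defined upstream
    ds = model.where { |o| { o.lower(:email) => o.lower(email) } }
    ds = ds.exclude(pk_hash) unless new?
    errors.add(:email, 'is already taken') unless ds.count == 0
  end
end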
In terms of what that hack does, ds is a Sequel::Dataset instance that validates_unique uses to check for uniqueness. If you do validates_unique :email, it'll be something like:
model.where(:email=>email)
# WHERE email = 'some email'
ds.opts[:where] extracts the where clause from that dataset, and map! wraps each of its arguments in an SQL lower function call, so that the where clause becomes similar to:
model.where{|o| {o.lower(:email)=>o.lower(email)}}
# WHERE lower(email) = lower('some email')
It's a hack as it only works if the model's dataset is not already filtered.

Related

DataMapper use only certain columns

I have a code section like the following:
users = User.all(:fname => "Paul")
This of course results in getting all users called "Paul". Now I only need some of the columns for each user, which leads me to replace the above line with something like this:
users = User.all(:fname => "Paul", :fields => [:id, :fname, :lname, :email])
Until now everything works as expected. Unfortunately, as soon as I use something like users.to_json, the other columns get lazy-loaded as well, despite the fact that I don't need them. What's the correct, or at least a good, way to end up with users only containing the attributes I need for each user?
An intermediate object as suggested in How to stop DataMapper from double query when limiting columns/fields? is not a very good option, as I have a lot of places where I would need to define the fields I need at least twice, and I would also lose the speed improvement gained by loading only the needed data from the DB. In addition, such an intermediate object seems quite ugly to build when multiple rows are selected from the DB (multiple objects in a collection) instead of just one.
If you usually work with the collection as JSON, I suggest overriding the as_json method in your model:
def as_json(options = nil)
  # default to serializing only :fname; caller-supplied options still take
  # precedence because they are merged in last
  super({ :only => [:fname] }.merge(options || {}))
end
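A quick usage sketch under that override, assuming ActiveSupport-style to_json, which serializes each element of the collection via as_json:
users = User.all(:fname => "Paul", :fields => [:id, :fname, :lname, :email])
users.to_json                              # => [{"fname":"Paul"}, ...] only :fname by default
users.to_json(:only => [:fname, :email])   # caller options override the default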
You can find a more detailed explanation here: http://robots.thoughtbot.com/better-serialization-less-as-json

Validating a blank field Rails Model

This may be somewhat basic, but I cannot find a definitive answer anywhere. I have set up a contact form within my app and have put in a hidden field that, when completed, disables the submit button with some jQuery. My attempt at stopping automated spam.
Can I also add some validations in my model?
validates :ghost, :presence => false
Looking at the docs, this seems to be invalid. I want the form to fail if this field is filled in, but I'm not sure how to go about this one.
EDIT
So I have now read that I could possibly use
validates_exclusion_of :ghost, :on => :create
Though this is still failing, as I don't think I am passing the correct arguments.
:presence => false means that you disable the presence validator.
You need to write your own absence validation (though in Rails 4.0 such a validator exists: absence: true).
validate :ghost_is_absent

def ghost_is_absent
  errors.add :ghost if ghost.present?
end
I am sorry to say it, but why are you trying to do things so differently? Doing it this way will make things more confusing for any future developer working on this piece of validation.
1) You can do the reverse of it: mark it as spam when the field is empty and vice versa, and then simply check with validates_presence_of :ghost.
2) Or, if you want to protect against spam, use a CAPTCHA (the recaptcha gem does that).
3) Or, if you want to do it your way only, then just add a custom validation.
Try creating a custom validation.
validate :check_for_spam

def check_for_spam
  # errors.add_to_base is the old Rails 2 API; Rails 3 uses errors.add(:base, ...)
  errors.add(:base, "ghost is present, this is spam") if ghost.present?
end
If you want to check if :ghost is blank:
validates :ghost, inclusion: {in: ['']}
If you want to check if :ghost is nil, you have to rely on a custom validator.
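A minimal sketch of such a custom validator (the method name and message are illustrative):
validate :ghost_is_nil

def ghost_is_nil
  errors.add(:ghost, 'must be nil') unless ghost.nil?
end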

Can you data-bind a composite id in Grails such that it (or parts of it) becomes updateable?

I am trying to read through the bindData documentation, but it's not all that clear:
http://grails.org/doc/2.1.0/ref/Controllers/bindData.html
I have a composite id composed of 4 columns, and I need to update one of those. It refuses to .save() and doesn't even throw an error. Is there some configuration that will allow me to change these values and save the model?
If I delete it and create a new record, it will bump the rowid, which I was using on the browser side with datatables/jeditable, and it's not really an option. However, even if I include all the parameters with an empty list:
def a = WaiverExemption.find("from WaiverExemption as e where e.exemptionRowId = ?", [params.rowid])
a.properties = params
bindData(a, params, [include: []])
a.save(flush: true, failOnError: true)
This does not seem to work. I've also tried naming the columns/properties explicitly both by themselves and also with "id".
I was confused on what bindData() actually does. Still confused on that.
If you have a composite id in Grails and wish to change one or more of its column values, save() will never execute, as noted in the question. Instead, you'll want to use .executeUpdate(). You can pass it HQL that updates the table in question (though most of the examples on the web are for delete), with syntax that is nearly identical to proper SQL. Something along the lines of "update domain d set d.propertyName = ?" should work.
I do not know if this is a wise thing to do, or if it violates some philosophical rule of how a Grails app should work, but it will actually do the update. I advise caution and plenty of testing. This crap's all voodoo to me.
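For illustration only, a hypothetical sketch of that approach (the property being updated and the parameter names are assumptions, not from the question):
// Bulk HQL update; this bypasses GORM's save() machinery entirely.
WaiverExemption.executeUpdate(
    "update WaiverExemption e set e.propertyName = :newValue " +
    "where e.exemptionRowId = :rowId",
    [newValue: params.newValue, rowId: params.rowid])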

Ignore 'read-only' column in creates and updates in Ruby ActiveRecord

I'm looking for a solution to the following problem: I have an ActiveRecord entity that is backed by an updatable database view (in DB2 via the activerecord-jdbc-adapter gem). This view contains one column that is calculated from other columns and is 'read-only': you cannot set that column in any valid way. When a new record is created for this entity, that field should not be set. However, by default, ActiveRecord does set it with the 'default' (NULL), which is rejected by the database.
attr_readonly isn't a solution, because that only excludes a column from updates and not from creates.
attr_ignore, such as implemented by the 'lincoln' gem, is not a solution either, because then the field is ignored entirely. However, the column still needs to be read and be accessible. It's actually even used as part of a relation.
There are ways to prevent you from setting a certain attribute of an ActiveRecord entity, but that doesn't usually prevent the attribute from being included in create or update statements.
Does anyone know if there is a way in ActiveRecord to specify a column as 'never set this field'?
Update, in response to Arsen7:
I've attempted to use the after_initialize hook to remove the attribute from a newly created entity, so it isn't included in the SQL that is built. The trouble with this is that the attribute is then completely removed and not available at all anymore, pretty much identical to the 'attr_ignore' situation described above. Due to caching, that's not trivial to get around and would require additional logic to force a reload of entities of these specific tables. That could probably be achieved by overriding create to add a 'reload', in addition to using after_initialize.
(As pointed out by Arsen7, I forgot to mention I'm at ActiveRecord 3.0.9)
My solution
Since my entities already inherit from a subclass of ActiveRecord::Base, I've opted to add before_create and after_create hooks. In the before_create hook, I remove the 'calculated' columns from the instance's @attributes. In the after_create hook, I add them back and read their values from the database, so they hold the values the database assigned.
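A sketch of those hooks under ActiveRecord 3.0.x, where @attributes is a plain hash (class and column names are illustrative, not from the question):
class ViewBackedEntity < ActiveRecord::Base
  CALCULATED_COLUMNS = %w[calculated_col]

  before_create :strip_calculated_columns
  after_create  :restore_calculated_columns

  private

  # Drop the read-only columns from @attributes so they are not
  # included in the generated INSERT.
  def strip_calculated_columns
    @stripped = {}
    CALCULATED_COLUMNS.each { |c| @stripped[c] = @attributes.delete(c) }
  end

  # Put the keys back, then reload to pick up the values the
  # database calculated for them.
  def restore_calculated_columns
    @attributes.merge!(@stripped)
    reload
  end
end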
Adding such hooks is almost identical to overriding create, so I consider Arsen7's answer to be correct.
I'm afraid ActiveRecord is not prepared for the use case you need. (By the way: which version of AR are you using?)
But I believe you may apply two possible workarounds.
The first is to override the 'create' method of your model, executing some other SQL, prepared manually in the worst case. I suppose the function that really needs to be overridden will not be 'create' itself, but looking at the sources you could find the right one.
The other solution, and I believe, a more elegant one, would be to create a trigger in the database. I am more in the PostgreSQL world, where I would use a 'CREATE RULE', but looking at the DB2 documentation I see that in DB2 there are 'INSTEAD OF' triggers. I hope this may be helpful.
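A hypothetical sketch of such a DB2 trigger (the view, base table, and column names are all assumptions):
-- Let inserts through the view omit the calculated column entirely.
CREATE TRIGGER entity_view_insert
INSTEAD OF INSERT ON entity_view
REFERENCING NEW AS n
FOR EACH ROW
  INSERT INTO entity_table (plain_col_a, plain_col_b)
  VALUES (n.plain_col_a, n.plain_col_b)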
I have achieved the same result by overriding ActiveRecord::Base#arel_attributes_values in my model:
class Model < ActiveRecord::Base
  @@skip_attrs = [:attr1, :attr2]

  def arel_attributes_values(include_primary_key = true, include_readonly_attributes = true, attribute_names = @attributes.keys)
    skip_attrs = @@skip_attrs.map { |attr| self.class.arel_table[attr] }
    attrs = super(include_primary_key, include_readonly_attributes, attribute_names)
    attrs.delete_if { |key, value| skip_attrs.include?(key) }
  end
end
The attributes in the @@skip_attrs array will be ignored by ActiveRecord in both insert and update statements, as they both rely on arel_attributes_values to return the list of the model's attributes.
A better solution would be a patch on ActiveRecord::Base#arel_attributes_values along with an 'attr_ignore' macro similar to 'attr_readonly'.
cheers
I know this is very old, but I have been struggling with this very same issue. I have a database with a trigger that calculates an index value based on the max value within a key. I, too, want to prevent any ability to set the value in AR as it could throw off the index applied as rows are inserted.
CREATE TRIGGER incr_col_idx
AFTER INSERT ON fl_format_columns
FOR EACH ROW
BEGIN
  UPDATE fl_format_columns
  SET idx = (SELECT coalesce(max(idx), -1) + 1
             FROM fl_format_columns
             WHERE fl_file_format_id = new.fl_file_format_id)
  WHERE fl_file_format_id = new.fl_file_format_id AND name = new.name;
END;
I've tried a variety of things, but it always came back to overriding the setter directly.
# Raise ArgumentError when an attempt is made to set a value that is calculated in the db.
def idx=(o)
  raise ArgumentError, 'the value of idx is set by the db; attempting to set it is not allowed' unless o.nil?
end
This requires catching the exception rather than interrogating the errors array, but that is what I ended up with. It passes the following specs:
context 'column index' do
  it 'should prevent idx from being set' do
    expect { FL_Format_Column.create(fl_file_format_id: -1, name: 'test idx', idx: 0) }.to raise_error(ArgumentError)
  end

  it 'should calculate idx relative to zero' do
    x = FL_Format_Column.create(fl_file_format_id: -1, name: 'test_idx_nil')
    expect(x.errors[:idx].any?).to be false
    expect(FL_Format_Column.last.idx).to be > -1
  end
end

blacklisting vs whitelisting in form's input filtering and validation

Which is the preferred approach for sanitizing input coming from the user?
Thank you!
I think whitelisting is the desired approach, but I have never met real whitelist HTML form validation. For example, here is a symfony 1.x form with validation, from the documentation:
class ContactForm extends sfForm
{
  protected static $subjects = array('Subject A', 'Subject B', 'Subject C');

  public function configure()
  {
    $this->setWidgets(array(
      'name'    => new sfWidgetFormInput(),
      'email'   => new sfWidgetFormInput(),
      'subject' => new sfWidgetFormSelect(array('choices' => self::$subjects)),
      'message' => new sfWidgetFormTextarea(),
    ));
    $this->widgetSchema->setNameFormat('contact[%s]');
    $this->setValidators(array(
      'name'    => new sfValidatorString(array('required' => false)),
      'email'   => new sfValidatorEmail(),
      'subject' => new sfValidatorChoice(array('choices' => array_keys(self::$subjects))),
      'message' => new sfValidatorString(array('min_length' => 4)),
    ));
  }
}
What you cannot see is that it accepts new inputs without validation settings, and it does not check the presence of inputs that are not registered in the form. So this is blacklist input validation. With a whitelist you would define an input validator first, and only after that bind an input field to that validator. With a blacklist approach like this, it is easy to forget to add a validator to an input; everything works perfectly without it, so you would not notice the vulnerability until it is too late...
A hypothetical whitelist approach would look something like this:
class ContactController {
  /**
   * @input("name", type = "string", singleLine = true, required = false)
   * @input("email", type = "email")
   * @input("subject", type = "string", alternatives = ['Subject A', 'Subject B', 'Subject C'])
   * @input("message", type = "string", range = [4,])
   */
  public function post(Inputs $inputs) {
    //automatically validates inputs
    //throws error when an input is not on the list
    //throws error when an input has invalid value
  }
}

/**
 * @controller(ContactController)
 * @method(post)
 */
class ContactForm extends sfFormX {
  public function configure(InputsMeta $inputs)
  {
    //automatically binds the form to the input list of the @controller.@method
    //throws error when the @controller.@method.@input is not defined for a widget
    $this->addWidgets(
      new sfWidgetFormInput($inputs->name),
      new sfWidgetFormInput($inputs->email),
      new sfWidgetFormSelect($inputs->subject),
      new sfWidgetFormTextarea($inputs->message)
    );
    $this->widgetSchema->setNameFormat('contact[%s]');
  }
}
The best approach is to use stored procedures or parameterized queries. Whitelisting is an additional technique that is fine for stopping injections before they reach the server, but it should not be your primary defense. Blacklisting is usually a bad idea, because it's usually impossible to filter out all malicious inputs.
BTW, this answer assumes you mean sanitizing as in preventing SQL injection.
Whitelisting is best practice over blacklisting whenever it is practicable.
The reason is simple: you can't be reasonably safe by enumerating what is not permitted; an attacker can always find a way you did not think about. If you can say for sure what is allowed, it is simpler and much, much safer!
Let me address your question with a few more questions and answers.
Blacklist vs. whitelist restriction
i. Blacklist XSS and SQL injection handling verifies a desired input against a list of negative inputs: you compile a list of all the negative or bad conditions, then verify that the input received is not one of them.
ii. Whitelist XSS and SQL injection handling verifies a desired input against a list of possible correct inputs: you compile a list of all the good/positive input values/conditions, then verify that the input received is one of them.
Which one is better to have?
i. An attacker will use any possible means to gain access to your application, including trying all sorts of negative or bad conditions, various encoding methods, and appending malicious input data to valid data. Do you think you can think of every possible bad permutation that could occur?
ii. A whitelist is the best way to validate input. You will know exactly what is desired, and no bad types will be accepted. Typically the best way to create a whitelist is with regular expressions; they are a great way to abstract the whitelisting instead of manually listing every possible correct value.
Build a good regular expression. Just because you are using a regular expression does not mean bad input will not be accepted. Test your regular expression and make sure invalid input cannot get past it.
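For instance, a minimal whitelist check (a sketch in Ruby; the pattern and field are illustrative):
# Accept only digits, spaces, and dashes for a phone-number field.
PHONE_PATTERN = /\A[0-9][0-9 \-]{5,19}\z/

def valid_phone?(input)
  # \A and \z anchor the whole string, so nothing outside the whitelist slips through
  !!(input =~ PHONE_PATTERN)
end

valid_phone?("0171-555 1234")          # => true
valid_phone?("1; DROP TABLE users--")  # => false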
Personally, I gauge the number of allowed or disallowed characters and go from there. If there are more allowed chars than disallowed, then blacklist; else, whitelist. I don't believe there is any 'standard' that says you should do it one way or the other.
BTW, this answer assumes you want to limit input into form fields such as phone numbers or names :) @posterBelow
As a general rule it's best to use whitelist validation, since it's easier to accept only the characters you know should go there. For example, if you have a field where the user inputs his/her phone number, you could just run a regex and check that the values received are only numbers; drop everything else and store only the numbers. Note that you should proceed to validate the resulting numbers as well. Blacklist validation is weaker, because a skilled attacker could evade your validation functions or send values your function did not expect. From OWASP, "Sanitize with Blacklist":
Eliminate or translate characters (such as to HTML entities or to remove quotes) in an effort to make the input "safe". Like blacklists, this approach requires maintenance and is usually incomplete. As most fields have a particular grammar, it is simpler, faster, and more secure to simply validate a single correct positive test than to try to include complex and slow sanitization routines for all current and future attacks.
Realize that this validation is just a first line of defense against attacks. For XSS, you should always escape your output: any needed character can still be printed, but it is converted to its HTML entity, so the browser knows it is data and not something the parser should interpret, effectively shutting down all XSS attacks. For SQL injection, escape all data before storing it and try to never use dynamic queries, as they are the easiest type of query to exploit; try to use parameterized stored procedures instead. Also remember to use connections appropriate to what the connection has to do: if it only needs to read data, create a DB account with only "read" privileges (this depends mostly on the roles of the users). For more information, please check the links from which this information was extracted:
Data Validation OWASP
Guide to SQL Injection OWASP
The answer generally is, it depends.
For inputs with clearly defined parameters (say the equivalent of a dropdown menu), I would whitelist the options and ignore anything that wasn't one of those.
For free-text inputs, it's significantly more difficult. I subscribe to the school of thought that you should just filter it as best you can so it's as safe as possible (escape HTML, etc). Some other suggestions would be to specifically disallow any invalid input - however, while this might protect against attacks, it might also affect usability for genuine users.
I think it's just a case of finding the blend that works for you. I can't think of any one solution that would work for all possibilities. Mostly it depends on your userbase.
