Read null as empty set in spring-data-cassandra

I use spring-data-cassandra, and have entity like this:
@Table("users")
public class User {
    @Column("permissions")
    @CassandraType(type = DataType.Name.SET, typeArguments = {DataType.Name.TEXT})
    public Set<String> permissions = new HashSet<>();
}
In Cassandra I have a table users with a field permissions of type set<text>. It works fine when I store some values in the set, but when I store an empty set, it comes back as null when I read the entity from the repository.
Is there a way to force spring-data-cassandra to change null to empty HashSet? Or can I somehow add custom reader for this specific property of the entity?

TL;DR
That's Cassandra's default behavior to return null for empty Collection and Map-typed columns.
Further reading
Cassandra returns null values for lists, sets, and maps, which do not contain any items. This is especially unfortunate when using classes with pre-initialized fields as seen in your question. There's an open ticket (DATACASS-266 - Loading empty collection-typed properties overwrites pre-initialized fields) in the issue tracker - as of now, without comments or votes.
We're not exactly sure whether it's a good idea to skip setting properties or apply some sort of defaulting when dealing with empty (null) collections, as this raises follow-up questions about what to do in these cases:
- Creating an instance through a constructor: a value is required in that case. For property access we could skip setting the property, but for constructor creation we must provide a value.
- The pre-initialized collection contains items but the one received from Cassandra is null.
- Assuming the change were applied, what happens to existing code that assumes empty collections default to null?
A possibility to address this behavior could be configuration on MappingCassandraConverter or an extension point to override so users can apply their own empty collection behavior.

I've been trying to eliminate null collections in my model objects as well. While it may not be possible to do that at the Spring Data level currently (version 2.1.x), there are some options you can consider:
- Use property access for the field in question (i.e. use the annotation @AccessType(PROPERTY)), and in the setter method, set the field to an empty collection when the argument is null.
- Define a compatible (see below) constructor that sets the field to an empty collection when a null is provided (and if the model is mutable, you may still want to provide the setter as above).
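A minimal sketch of the setter-based option, kept free of Spring dependencies so it runs on its own. The class name UserWithSetter is made up for illustration, and the mapping annotations a real entity would carry appear only as comments:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative stand-in for the entity. In a real mapping it would carry
// @Table("users") and @AccessType(PROPERTY); both are omitted here
// (assumption) so the sketch compiles without Spring Data on the classpath.
class UserWithSetter {

    private Set<String> permissions = new HashSet<>();

    public Set<String> getPermissions() {
        return permissions;
    }

    // With property access the converter goes through this setter, so a
    // null read from Cassandra is replaced by an empty set.
    public void setPermissions(Set<String> permissions) {
        this.permissions = (permissions == null) ? new HashSet<>() : permissions;
    }
}
```

Callers then never see a null: after setPermissions(null), getPermissions() returns an empty, mutable set.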
There are some caveats to ensure Spring Data Cassandra uses the desired constructor (e.g. don't provide a no argument constructor), so it's critical to review the "Object Mapping Fundamentals" section of the reference guide (https://docs.spring.io/spring-data/cassandra/docs/current/reference/html/#mapping.fundamentals).
Among the recommendations in that reference guide (as of version 2.1 at least) is to use an all-argument constructor and make model objects immutable, which would work well with the constructor-based approach to handling nulls. It does mean writing and maintaining the constructor to handle the nulls, though, rather than relying on Lombok's @AllArgsConstructor.
I have used the property access approach in one case, but not the constructor approach. However, I do intend to go the constructor route when adding new model classes (I'm a fan of immutable objects, and will explore that route even without any collection fields).
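The constructor-based option might look roughly like the following. This is a sketch under assumptions: ImmutableUser and its name field are hypothetical, and the Spring Data annotations are left out so the class stands alone.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative immutable model class (hypothetical name). No no-argument
// constructor is declared, so the all-argument one is the only candidate.
final class ImmutableUser {

    private final String name;
    private final Set<String> permissions;

    ImmutableUser(String name, Set<String> permissions) {
        this.name = name;
        // Normalize null to an empty set; otherwise take a defensive,
        // read-only copy so the instance stays immutable.
        this.permissions = (permissions == null)
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new HashSet<>(permissions));
    }

    String getName() {
        return name;
    }

    Set<String> getPermissions() {
        return permissions;
    }
}
```

The null handling lives in one place, at the cost of hand-writing the constructor.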
I believe Spring Data Cassandra 2.0 also added persistence lifecycle callbacks which is another possible option I suppose, but I ruled that out, mainly because the logic would not reside in the model class itself (as well as going against the recommendations from the creators of the framework)


Tapestry: When should a Session State Object be preferred to a Session Attribute?

From the Tapestry docs I understand that a field annotated with @SessionAttribute and a field annotated with @SessionState work in the same way, with one difference. @SessionAttribute stores the value by name (and the name can be specified), which means that different instances of the same class can be stored. @SessionState stores the value by type, so storing different instances of the same class will not work: the new instance always overwrites the old one (even if the two are different fields with different names in different classes).
So it seems that @SessionState doesn't offer any advantage over @SessionAttribute, only limitations, but I'm probably missing something. I can't think of any case where using @SessionState would be more advisable than @SessionAttribute.
Are there such cases?
@SessionAttribute is largely intended for some interop cases, where some other, non-Tapestry code (another servlet) is expecting the data to be stored using an explicitly specified name.
@SessionState's advantage is that the name is automatically determined from the type ... one less thing to care about, and more amenable to refactoring.
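The by-name versus by-type distinction can be modeled in plain Java. The sketch below is not Tapestry code: SessionModel and the "state:" key prefix are hypothetical, chosen only to show why one scheme keeps one slot per type while the other keeps one slot per name.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of a session attribute map; not the Tapestry API.
class SessionModel {

    private final Map<String, Object> attributes = new HashMap<>();

    // By name, in the spirit of @SessionAttribute: the caller picks the key,
    // so several values of the same class can coexist.
    void putByName(String name, Object value) {
        attributes.put(name, value);
    }

    Object getByName(String name) {
        return attributes.get(name);
    }

    // By type, in the spirit of @SessionState: the key is derived from the
    // class (hypothetical key scheme), so there is exactly one slot per type.
    <T> void putByType(Class<T> type, T value) {
        attributes.put("state:" + type.getName(), value);
    }

    <T> T getByType(Class<T> type) {
        return type.cast(attributes.get("state:" + type.getName()));
    }
}
```

Renaming a field has no effect on the by-type key, which is the refactoring benefit mentioned above; the by-name key has to be kept in sync by hand wherever it is used.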

Parameter validation vs property validation

Most (almost all?) validation frameworks are based on reading object's property value and checking if it obeys validation rules.
Do we really need it?
If we pass valid parameters into object's constructor, property setters and other methods, object seems to be perfectly valid, and property value checks are not needed!
Isn't it better to validate parameters instead of properties?
What validation frameworks can be used to validate parameters before passing them into an object?
Update
I'm considering situation where client invokes service method and passes some data. Service method must check data, create / load domain objects, do business logic and persist changes.
It seems that most of the time data is passed by means of data transfer objects. And property validation is used because DTO can be validated only after it has been created by network infrastructure.
This question can spread out into wider topic. First, let's see what Martin Fowler has said:
One copy of data lies in the database itself. This copy is the lasting record of the data, so I call it the record state.
A further copy lies inside in-memory Record Sets within the application. This data was only relevant for one particular session between the application and the database, so I call it session state.
The final copy lies inside the GUI components themselves. This, strictly, is the data they see on the screen, hence I call it the screen state.
Now I assume you are talking about validation at the session-state level, and whether it is better to validate the object's properties or the parameters. It depends. First, it depends on whether you use an anemic or a rich domain model. If you use an anemic domain model, it is clear that the validation logic will reside in another class.
Second, it depends on what type of object you are building. A framework/operation/utility object needs validation of its own properties. For example, C#'s FileStream class needs valid properties such as the file path, memory pointer, file access mode, etc. You wouldn't want every developer who uses the utility to have to validate the input beforehand, or else have it crash partway through an operation with a misleading error message instead of failing fast.
Third, consider that parameters can come from many sources and in many forms, while a class/object property has only one definition. Parameter validation has to be placed at every input source, while object property validation only needs to be defined once. But you also need to understand the object's state: an object can be valid in one state (draft mode) and not in another (submission mode).
Of course you can also add validation at other state levels, such as the database (record state) or the UI (screen state), but each has its own pros and cons.
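To make the draft/submission distinction concrete, here is a small Java sketch (the thread itself is framework-agnostic; DraftOrder and its fields are hypothetical). The constructor validates its parameter and fails fast, while the state-dependent rule is defined once on the object and checked at the transition:

```java
import java.util.Objects;

// Hypothetical domain object used only for illustration.
class DraftOrder {

    private final String customerId;  // always required, checked as a parameter
    private String shippingAddress;   // may be missing while the order is a draft
    private boolean submitted;

    DraftOrder(String customerId) {
        // Parameter validation: reject bad input at the point of entry.
        this.customerId = Objects.requireNonNull(customerId, "customerId");
        if (customerId.isEmpty()) {
            throw new IllegalArgumentException("customerId must not be empty");
        }
    }

    void setShippingAddress(String shippingAddress) {
        // No check here: an incomplete draft is a valid draft.
        this.shippingAddress = shippingAddress;
    }

    // Property (state) validation: defined once, enforced when the object
    // moves from draft mode to submission mode.
    void submit() {
        if (shippingAddress == null || shippingAddress.isEmpty()) {
            throw new IllegalStateException("shipping address required before submission");
        }
        submitted = true;
    }

    boolean isSubmitted() {
        return submitted;
    }

    String getCustomerId() {
        return customerId;
    }
}
```

A draft without an address can be created and edited freely; only submit() rejects it.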
What validation frameworks can be used to validate parameters before passing them into an object?
ASP.NET MVC (C#) can do one kind of parameter validation (for data type) at the controller level, before the values are bound into an object.
Conclusion
It depends entirely on what architecture and kind of object you want to make.
In my experience such validation was done when dealing with complex validation rules and a Parameter Object. Since we need to keep separation of concerns, the validation logic is not in the object itself. That's why, yes, we really need it.
What is more interesting: why construct expensive objects only to validate them afterwards?

Custom model metadata provider caching issue

In order to allow us dynamic control over labels and error messages, we created a custom DataAnnotationsModelMetadataProvider. In a Display attribute we store the key in the Name property and using the custom DataAnnotationsModelMetadataProvider we substitute the key for a string value from our custom CMS. The problem is that we now have two sets of values. One for Web views and one for mobile views. At runtime we check if the client is on a mobile device and substitute the values accordingly.
After test-running this setup I came across a strange issue. When the AppDomain is first created and the Name properties of the different data annotations are replaced with the string values, everything works fine. In debug, when I enter the custom DataAnnotationsModelMetadataProvider for a second time, I see the Name properties already populated with the values I substituted on the previous run. This was strange to me, since it was my understanding that data annotation properties could not be changed at runtime. It now seems like model metadata is being cached somewhere. Since I based my custom solution on replacing the values each time the DataAnnotationsModelMetadataProvider is called, I would like to disable this caching, if possible.
For now I started using the ShortName property as my key storing property and I replace the Name property, and this way I can repopulate the strings on each run. But this was not the initial design and I don't have such a key store property for ValidationAttributes.
So is there a way to disable this cache? I don't need the cache for the sake of caching, since all CMS data is cached in memory in another layer anyway.

Is there a reason why the default modelbinder doesn't bind to fields?

I'm using ASP.NET MVC3 and I'm wondering why the default model binder binds to public properties but not to public fields.
Normally I just define the model classes with properties, but sometimes I use some predefined classes which contain some fields. And every time I have to debug and remember that the model binder just doesn't like fields.
The question: what's the reason behind it?
but sometimes i use some predefined classes which contains some fields
While I cannot answer your question about the exact reason why the default model binder works only with properties (my guess is that it respects better encapsulation this way and avoids modifying internal state of the object which is what fields represent) I can say that what you call predefined classes should normally be view models. You should always use view models to and from your controller actions. Those view models are classes that are specifically defined to meet the requirements of the given view.
So back to the main point: fields are supposed to be modified only from within the given class. They should not be accessed directly from the outside. They represent and hold the internal state of the class. Properties, on the other hand, are what should be exposed to the outside world. Imagine that you had some custom logic in the property getter/setter. Modifying the field directly would bypass this logic and could bring the object into an inconsistent state.
Maybe the reason for ignoring fields is to improve the binder's performance: instead of searching all fields and properties, the model binder searches properties only.
Though I think the model binder uses a cache to improve performance.
DefaultModelBinder exposes a public method, DefaultModelBinder.BindModel, and a number of protected methods available for overriding. All of them are listed here.
Besides the model, these methods refer to properties only, not fields, like
GetModelProperties,
GetFilteredModelProperties,
GetPropertyValue,
OnXYZValidating,
OnXYZValidated,
OnXYZUpdating,
OnXYZUpdated,
GetXYZValue,
where XYZ stands for either Model, or Property/ies, or both, and so on.
As you can see, fields are not mentioned in these names at all. As Darin explained, the binder does not tolerate direct changes to the model's state. Hence there is no field in its methods.
And also, you may wish to take a look at another important class: ModelBindingContext. An instance of this class gets passed to the BindModel, and subsequently to BindSimpleModel, and BindComplexModel, depending on model type (string, int,... are considered simple, everything else is complex).
So, this context has the following properties:
ModelXYZ, and
PropertyXYZ.
In other words, you have no means to reference the fields in your ViewModel unless you override these classes and take special steps to do so.
But again, beware of fighting the framework; it's always easier to follow it instead.
EDIT: The ModelMetadata class holds all the data needed to bind the model. Its code, however, shows no sign of fields, field names, etc. Only properties are referenced and accessed. So even if you inherit from and override DefaultModelBinder and ModelBindingContext, you still won't be able to access fields, no matter what their access modifier is: public, private, etc.
Hope this explains most of it.

Fetch annotation in SDG 2.0, fetching strategy questions

Hi all patient developers using Spring Data Graph. Since there is so little documentation and fairly poor test coverage, it is sometimes very difficult to understand how the framework is supposed to behave. I currently have some questions about the new fetching approach introduced in SDG 2.0. As opposed to the write/read-through of SDG 1.1, in 2.0 only relations and related objects annotated with @Fetch are fetched eagerly; the others are supposed to be fetched lazily. Now my first question:
Is it possible to configure SDG so that, if loading an entity and invoking the getter on a lazy relation take place in the same transaction, the requested collection is fetched automatically? A kind of persistence context with transaction scope. Or maybe this is planned for future releases?
How can I fetch a lazy collection at once for the @RelatedTo annotation? The fetch() method on Neo4jOperations allows fetching only one entity. Do I have to iterate through the whole list and fetch each object? What would be the best way to check whether a given object is already fetched/initialized?
As a suggestion, I think it would be more intuitive to throw some kind of lazy-loading exception instead of an NPE when working with uninitialized objects. The current behavior is also misleading: when an object is not initialized and all member properties apart from the id are null, equals can return true for different uninitialized objects, which is a serious issue when, for example, they are put into sets.
Another issue I noticed when working with SDG 2.0.0.RC1: when I add a new object to a collection that has not been fetched, sometimes it is properly added and persisted and sometimes it is not. I wrote a test for this case and it behaves non-deterministically, sometimes failing and sometimes succeeding. Here is the use case:
Group groupFromDb = neoTemplate.findOne(group.getId(), Group.class);
assertNotNull(groupFromDb);
assertEquals("Number of members must be equals to 1", 1, groupFromDb.getMembers().size());
User secondMember = UserMappingTest.createUser("secondMember");
groupFromDb.addMember(secondMember);
neoTemplate.save(groupFromDb);
Group groupAfterChange = neoTemplate.findOne(groupFromDb.getId(), Group.class);
assertNotNull(groupAfterChange);
assertEquals("Number of members must be equals to saved entity", groupFromDb.getMembers().size(), groupAfterChange.getMembers().size());
assertEquals("Number of members must be equals to 2", 2, groupAfterChange.getMembers().size());
This test sometimes fails on the last assert, which would mean that the member is sometimes added to the set and sometimes not. I guess the problem lies somewhere in ManagedFieldAccessorSet, but it is difficult to say since it is non-deterministic. I ran the test with mvn2 and mvn3 on Java 1.6_22 and 1.6_27 and always got the same result: sometimes the test passes, sometimes it fails. The implementation of User.equals looks as follows:
@Override
public boolean equals(final Object other) {
    if ( !(other instanceof User) ) {
        return false;
    }
    User castOther = (User) other;
    if(castOther.getId() == this.getId()) {
        return true;
    }
    return new EqualsBuilder().append(username, castOther.username).isEquals();
}
I also find it a bit problematic that fields annotated with @Fetch use a Java HashSet, which is serializable, while lazily loaded fields use ManagedFieldAccessorSet, which is not serializable and causes a not-serializable exception.
Any help or advice are welcome. Thanks in advance!
I put together a quick code sample showing how to use the fetch() technique Michael describes:
http://springinpractice.com/2011/12/28/initializing-lazy-loaded-collections-with-spring-data-neo4j/
The simple mapping approach was only added to Spring Data Neo4j 2.0, so it is not as mature as the advanced AspectJ mapping. We're currently working on documenting it more extensively.
The lazy loading option was also added lately. So your feedback is very welcome.
Right now SDN doesn't employ a proxy approach for the lazily loaded objects. So the automatic "fetch on access" is not (yet) supported. That's why also no exception is thrown when accessing non-loaded fields and there is no means of "discovering" if an entity was not fully loaded.
In the current snapshot there is the template.fetch() operation to fully load lazy loaded objects and collections.
We'll look into the HashSet vs. ManagedSet issue, it is correct that this is not a good solution.
As for the test case: does getId() return a Long object or a long primitive? It might be sensible to use getId().equals(castOther.getId()) here, as reference equality is not guaranteed for Number objects.
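A sketch of an equals that compares the id by value rather than by reference. UserEntity is a simplified, hypothetical stand-in for the User class in the question, and it deliberately uses id-only equality instead of the original username fallback so that equals and hashCode stay consistent:

```java
import java.util.Objects;

// Simplified stand-in for the User entity (hypothetical name and shape).
class UserEntity {

    private final Long id;
    private final String username;

    UserEntity(Long id, String username) {
        this.id = id;
        this.username = username;
    }

    Long getId() {
        return id;
    }

    String getUsername() {
        return username;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof UserEntity)) {
            return false;
        }
        UserEntity that = (UserEntity) other;
        // Value comparison: two distinct Long objects holding the same
        // number are equal here, which == on boxed values does not guarantee.
        // Entities without an id are only equal to themselves.
        return id != null && id.equals(that.id);
    }

    @Override
    public int hashCode() {
        // Derived from the same field as equals, so equal objects share a hash.
        return Objects.hashCode(id);
    }
}
```

With this version, two instances loaded separately with the same id land in the same HashSet bucket and compare equal, regardless of whether the Long values happen to be the same object.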
