Is it problematic that Spring Data REST exposes entities via REST resources without using DTOs?

In my limited experience, I've been told repeatedly that you should not pass entities around to the front end or via REST, but should use a DTO instead.
Doesn't Spring Data REST do exactly this? I've looked briefly into projections, but those seem to just limit the data that is being returned, and still expect an entity as a parameter to a POST method to save to the database. Am I missing something here, or am I (and my coworkers) incorrect in that you should never pass around an entity?

tl;dr
No. DTOs are just one means to decouple the server side domain model from the representation exposed in HTTP resources. You can also use other means of decoupling, which is what Spring Data REST does.
Details
Yes, Spring Data REST inspects the domain model you have on the server side to reason about what the representations for the resources it exposes will look like. However, it applies a couple of crucial concepts that mitigate the problems a naive exposure of domain objects would bring.
Spring Data REST looks for aggregates and by default shapes the representations accordingly.
The fundamental problem with the naive "I throw my domain objects in front of Jackson" is that from the plain entity model, it's very hard to reason about reasonable representation boundaries. Especially entity models derived from database tables have the habit to connect virtually everything to everything. This stems from the fact that important domain concepts like aggregates are simply not present in most persistence technologies (read: especially in relational databases).
However, I'd argue that in this case the "Don't expose your domain model" is more acting on the symptoms of that than the core of the problem. If you design your domain model properly there's a huge overlap between what's beneficial in the domain model and what a good representation looks like to effectively drive that model through state changes. A couple of simple rules:
For every relationship to another entity, ask yourself: couldn't this rather be an id reference? By using an object reference you pull a lot of semantics of the other side of the relationship into your entity. Getting this wrong usually leads to entities referring to entities referring to entities, which is a problem on a deeper level. On the representation level this allows you to cut off data, cater for consistency scopes etc.
Avoid bi-directional relationships as they're notoriously hard to get right on the update side of things.
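As a rough illustration of the first rule, here is a minimal plain-Java sketch. The Order, CustomerId and LineItem types are hypothetical and not taken from Spring Data REST; the only point is that the order aggregate holds an identifier for the customer rather than a Customer object, so its boundary stays small.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical identifier type: the order points at a customer by id only.
record CustomerId(long value) {}

// Line items live inside the Order aggregate, so they are held by reference.
record LineItem(String product, int quantity) {}

// Aggregate root. It knows which customer it belongs to, but none of the
// customer's fields, so rendering an Order never drags a Customer along.
final class Order {
    private final long id;
    private final CustomerId customerId;
    private final List<LineItem> items = new ArrayList<>();

    Order(long id, CustomerId customerId) {
        this.id = id;
        this.customerId = customerId;
    }

    void add(LineItem item) {
        items.add(item);
    }

    long id() { return id; }

    CustomerId customerId() { return customerId; }

    // Read-only view: changes have to go through add(...)
    List<LineItem> items() { return Collections.unmodifiableList(items); }
}
```

Because the relationship is one-directional and id-based, updating an Order cannot accidentally change customer state.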
Spring Data REST does quite a few things to actually transfer those entity relationships into the proper mechanisms on the HTTP level: links in general and more importantly links to dedicated resources managing those relationships. It does so by inspecting the repositories declared for entities and basically replaces an otherwise necessary inlining of the related entity with a link to an association resource that allows you to manage that relationship explicitly.
That approach usually plays nicely with the consistency guarantees described by DDD aggregates on the HTTP level. PUT requests don't span multiple aggregates by default, which is a good thing as it implies a scope of consistency of the resource matching the concepts of your domain.
There's no point in forcing users into DTOs if that DTO just duplicates the fields of the domain object.
You can introduce as many DTOs for your domain objects as you like. In most of the cases, the fields captured in the domain object will reflect into the representation in some way. I have yet to see the entity Customer containing a firstname, lastname and emailAddress property, and those being completely irrelevant in the representation.
The introduction of DTOs by no means guarantees decoupling. I've seen way too many projects where they were introduced for cargo-culting reasons, simply duplicated all fields of the entity backing them and by that just caused additional effort because every new field had to be added to the DTOs as well. But hey, decoupling! Not. ¯\_(ツ)_/¯
That said, there are of course situations where you'd want to slightly tweak the representation of those properties, especially if you use strongly typed value objects for e.g. an EmailAddress (good!) but still want to render this as a plain String in JSON. But by no means is that a problem: Spring Data REST uses Jackson under the covers which offers you a wide variety of means to tweak the representation — annotations, mixins to keep the annotations outside your domain types, custom serializers etc. So there is a mapping layer in between.
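To make the value-object case concrete, here is a small library-free sketch. It does not use Jackson: the hand-written toJson method merely stands in for the mapping layer, and its name is hypothetical. With Jackson itself the same flattening would normally be configured through annotations, mixins or a custom serializer rather than written by hand.

```java
// Strongly typed value object: validated once, on construction.
record EmailAddress(String value) {
    EmailAddress {
        if (value == null || !value.contains("@")) {
            throw new IllegalArgumentException("Not an email address: " + value);
        }
    }
}

record Customer(String firstname, String lastname, EmailAddress emailAddress) {

    // Stand-in for the mapping layer: the value object is rendered as a
    // plain JSON string instead of a nested {"value": "..."} object.
    // Simplification: the values are not JSON-escaped here.
    String toJson() {
        return "{\"firstname\":\"" + firstname + "\","
             + "\"lastname\":\"" + lastname + "\","
             + "\"emailAddress\":\"" + emailAddress.value() + "\"}";
    }
}
```

The domain type keeps its strong typing, while the representation stays a flat string property.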
Not using DTOs by default is not a bad thing per se. Just imagine the outcry by users about the amount of boilerplate necessary if we required DTOs to be written for everything! A DTO is just one means to an end. If that end can be achieved in a different way (and it usually can), why insist on DTOs?
Just don't use Spring Data REST where it doesn't fit your requirements.
Continuing on the customization efforts, it's worth noting that Spring Data REST exists to cover exactly the parts of the API that just follow the basic REST API implementation patterns it implements. And that functionality is in place to give you more time to think about
How to shape your domain model
Which parts of your API are better expressed through hypermedia driven interactions.
Here's a slide from the talk I gave at SpringOne Platform 2016 that summarizes the situation.
The complete slide deck can be found here. There's also a recording of the talk available on InfoQ.
Spring Data REST exists for you to be able to focus on the underlined circles. By no means do we think you can build a really great API solely by switching Spring Data REST on. We just want to reduce the amount of boilerplate for you to have more time to think about the interesting bits.
Just like Spring Data in general reduces the amount of boilerplate code to be written for standard persistence operations. Nobody would argue you can actually build a real world app from only CRUD operations. But taking the effort out of the boring bits, we allow you to think more intensively about the real domain challenges (and you should actually do that :)).
You can be very selective in overriding certain resources to completely take control of their behavior, including manually mapping the domain types to DTOs if you want. You can also place custom functionality next to what Spring Data REST provides and just hook the two together. Be selective about what you use.
A sample
You can find a slightly advanced example of what I described in Spring RESTBucks, a Spring (Data REST) based implementation of the RESTBucks example in the RESTful Web Services book. It uses Spring Data REST to manage Order instances but tweaks its handling to introduce custom requirements and completely implement the payment part of the story manually.

Spring Data REST enables a very fast way to prototype and create a REST API based on a database structure. We're talking about minutes vs days, when comparing with other programming technologies.
The price you pay for that, is that your REST API is tightly coupled to your database structure. Sometimes, that's a big problem. Sometimes it's not. It depends basically on the quality of your database design and your ability to change it to suit the API user needs.
In short, I consider Spring Data REST as a tool that can save you a lot of time under certain special circumstances. Not as a silver bullet that can be applied to any problem.

We used to use DTOs, including the fully traditional layering (Database, DTO, Repository, Service, Controllers, ...), for every entity in our projects, hoping the DTOs would some day save our lives :)
So for a simple City entity which has id,name,country,state we did as below:
City table with id,name,country,.... columns
CityDTO with id,name,country,.... properties (exactly the same as the database)
CityRepository with a findCity(id),....
CityService with findCity(id) { CityRepository.findCity(id) }
CityController with findCity(id) { ConvertToJson( CityService.findCity(id)) }
That is too much boilerplate code just to expose city information to the client. As this is a simple entity, no business logic runs anywhere along these layers; the objects are just passed through.
A change in the City entity started at the database and rippled through all layers. (For example, adding a location property, because in the end the location property should be exposed to the user as JSON.) Adding a findByNameAndCountryAllIgnoringCase method requires all layers to be changed (each layer needs a new method).
Considering Spring Data Rest ( of course with Spring Data) this is beyond simple!
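import org.springframework.data.repository.CrudRepository;
public interface CityRepository extends CrudRepository<City, Long> {
    // Derived query: matches on name and country, ignoring case for both
    City findByNameAndCountryAllIgnoringCase(String name, String country);
}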
The city entity is exposed to the client with minimal code, and you still have control over how the city is exposed. Validation, security, object mapping ... it is all there, so you can tweak everything.
For example, if I want to keep the client unaware of a property name change in the city entity (layer separation), I can use a custom object mapper as described at https://docs.spring.io/spring-data/rest/docs/3.0.2.RELEASE/reference/html/#customizing-sdr.custom-jackson-deserialization
To summarize
We use the Spring Data Rest as much as possible, in complicated use cases we still can go for traditional layering and let the Service and Controller do some business.

A client/server release is going to publish at least two artifacts. This already decouples client from server. When the server's API is changed, applications do not immediately change. Even if the applications are consuming the JSON directly, they continue to consume the legacy API.
So, the decoupling is already there. The important thing is to think about the various ways a server's API is likely to evolve after it is released.
I primarily work with projects which use DTOs and numerous rigid layers of boilerplate between the server's SQL and the consuming application. Rigid coupling is just as likely in these applications. Often, changing anything in the DB schema requires us to implement a new set of endpoints. Then, we support both sets of endpoints along with the accompanying boilerplate in each layer (Client, DTO, POJO, DTO <-> POJO conversions, Controller, Service, Repository, DAO, JDBC <-> POJO conversion, and SQL).
I'll admit that there is a cost to dynamic code (like spring-data-rest) when doing anything not supported by the framework. For example, our servers need to support a lot of batch insert/update operations. If we only need that custom behavior in a single case, it's certainly easier to implement it without spring-data-rest. In fact, it may be too easy. Those single cases tend to multiply. As the number of DTOs and accompanying code grows, the inconsistencies eventually become extremely burdensome to maintain. In some non-dynamic server implementations, we have hundreds of DTOs and POJOs that are likely no longer used by anything. But, we are forced to continue supporting them as their number grows each month.
With spring-data-rest, we pay the cost of customization early. With our multi-layer hard-coded implementations, we pay it later. Which one is preferred depends on a lot of factors (including the team's knowledge and the expected lifetime of the project). Both types of project can collapse under their own weight. But, over time, I've become more comfortable with implementations (like spring-data-rest without DTOs) that are more dynamic. This is especially true when the project lacks good specifications. Over time, such a project can easily drown in the inconsistencies buried within its sea of boilerplate.

From the Spring documentation I don't see that Spring Data REST exposes entities; you are the one doing it.
Spring Data projects intend to ease the process of accessing different data sources, but you are the one deciding which layer to expose on Spring Data Rest.
Reorganizing your project will help to solve your issue.
Every @Repository that you create with Spring Data represents more a DAO in the sense of design than a Repository. Each one is tightly coupled with a particular data source you want to reach out to. Say JPA, Mongo, Redis, Cassandra,...
Those layers are meant to return entity representations or projections.
However, if you check out the Repository pattern from a design perspective, you should have a higher layer of abstraction above those specific DAOs, where your app uses those DAOs to get info from as many different sources as it needs and builds business-specific objects for your app (those might look more like your DTOs).
That is probably the layer you want to expose on your Spring Data Rest.
NOTE: I see an answer recommending to return Entity instances only because they have the same properties as the DTO. This is normally a bad practice and in particular is a bad idea in Spring and many other frameworks because they do not return your actual classes, they return proxy wrappers so that they can work some magic like lazy loading of values and the likes.

Related

Shall I use a DTO or not?

I'm building a web application with Spring, and I'm at the point where I have an Entity, a Repository, a RestController, and I can access endpoints in my browser.
I'm now trying to return JSON data to the browser, and I'm seeing all of this stuff about DTOs in various guides.
Do I really need a DTO? Can't I just put the serialization logic on the entity itself?

I think this is a somewhat debatable question, where the short answer would be:
It depends.
Little longer answer
There are plenty of people who, in plenty of cases, would prefer one approach (using DTOs) over another (using bare entities), and vice versa; however, there is no single source of truth on which is better to use.
It very much depends on the requirements, architectural approach you decide to stick with, (even on) personal preference and other (project-related) specific details.
Some even claim that DTO is an anti-pattern; some love using them; some think, that data refinement/adjustment should happen on the consumer/client side (for various reasons, out of which, one can be No Policy for API changes).
That being said, YES, you can simply return the @Entity instance (or list of entities) right from your controller, and there is no problem with this approach. I would even say that this does not necessarily violate anything in the SOLID or Clean Code principles. Again, it depends on what you use the response for, what representation of the data you need, what the capacity and purpose of the object in question should be, etc.
DTO is generally a good practice in the following scenarios:
When you want to aggregate the data for your object from different resources, i.e. you want to put some object transformation logic between the Persistence Layer and the Business(or Web) Layer:
Imagine you fetch from your database a List<Employee>; however, from another 3rd party web-service, you also receive some complementary-to-employee data for each Employee object, which you have to aggregate in the Employee objects (aggregate, or do some calculation, or etc. point is that you want to combine the data from different resources). This is a good case when you might want to use DTO pattern. It is reusable, it conforms to Single-Responsibility Principle, and it is well segregated from other layers;
When you don't necessarily combine data received from different sources, but you want to modify the entity which you will be returning:
Imagine you have a very big Entity (with a lot of fields), and the client, which calls the corresponding endpoint (Front-End application, Mobile, or any client), has no need of receiving this huge entity (or list of entities). If you, despite the client's requirement, will still be sending the original/unchanged entity, you will end up consuming network bandwidth/load inefficiently (more than enough), performance will be weaker, and generally, you will be just wasting computing resources for no good reason. In this case, you might want to transform your original Entity to the DTO object, which the client needs (only with required fields). Here, you might even want to implement different DTO classes, for one entity, for different consumers/clients.
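A minimal sketch of the second scenario, assuming a hypothetical Employee entity and an EmployeeSummary DTO (neither name nor any field comes from a real code base). It only shows the shape of the entity-to-DTO translation for a client that needs two fields.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical "big" entity; a real one might carry dozens of fields.
record Employee(long id, String name, String email, String homeAddress,
                String bankAccount, double salary) {}

// What this particular client actually needs, and nothing more.
record EmployeeSummary(long id, String name) {

    // Maps one entity to its slim representation.
    static EmployeeSummary from(Employee employee) {
        return new EmployeeSummary(employee.id(), employee.name());
    }

    // Maps a whole result list, e.g. before returning it from a controller.
    static List<EmployeeSummary> fromAll(List<Employee> employees) {
        return employees.stream()
                .map(EmployeeSummary::from)
                .collect(Collectors.toList());
    }
}
```

A second consumer with different needs could get its own DTO next to this one without the entity changing.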
However, if you are sure that your table/relation representations (instances of @Entity classes) are exactly what the client needs, I see no necessity of introducing DTOs.
Further supporting the idea that an @Entity can be returned to the presentation layer without a DTO
Java Persistence with Hibernate, Second Edition, in §3.3.2, even motivates it explicitly, that:
You can reuse persistent classes outside the context of persistence, in unit tests or in the presentation layer, for example. You can create instances in any runtime environment with the regular Java new operator, preserving testability and reusability;
Hibernate entities do not need to be explicitly Serializable;
You might also want to have a look at this question.

In general, it’s up to you to decide. If your application is relatively simple, you don’t expose any sensitive information, and the response is unambiguous for the client, there is nothing criminal in returning the whole entity. If your client expects only a small slice of the entity, e.g. only 2-3 fields of a 30-field entity, then it makes sense to do the translation or to consider a different protocol such as GraphQL.

Ideally, a design should not expose the entity.
It is good design to convert your entity to a DTO before you pass it to the web layer.
These days RestJpacontrollers are also available.
But again, which one to use varies from application to application.
If your application needs only read-only operations, then it makes sense to use RestJpacontrollers and to use the entity at the web layer.
If the application modifies data frequently, the better option is to use a DTO at the UI layer.
Another case is when multiple requests are required to bring the data for a particular task. The data can then be combined in a DTO so that a single request brings all the required data.
We can combine the data of multiple entities into one DTO.
This DTO can be used for the front end or in the REST API.
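The "one request instead of several" case might look like the following sketch. The Customer, Order and OrderDetailsDto types are invented for illustration; the DTO pulls what one screen needs out of two entities.

```java
import java.util.List;

// Two hypothetical entities that would otherwise need two requests.
record Customer(long id, String name) {}
record OrderLine(String product, int quantity) {}
record Order(long id, long customerId, List<OrderLine> lines) {}

// One response object for an "order details" screen.
record OrderDetailsDto(long orderId, String customerName, int totalItems) {

    // Combines data from both entities into a single flat representation.
    static OrderDetailsDto of(Order order, Customer customer) {
        int total = order.lines().stream()
                .mapToInt(OrderLine::quantity)
                .sum();
        return new OrderDetailsDto(order.id(), customer.name(), total);
    }
}
```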

Do I really need a DTO? Can't I just put the serialization logic on the entity itself?
I'd say you don't, but it is better to use them, according to the SOLID principles, namely the single responsibility principle. Entities belong to the ORM and should be used to interact with the database, not be serialized and passed to other layers.

Does GraphQL obviate Data Transfer Objects?

To my understanding, Data Transfer Objects (DTOs) are typically smallish, flattish, behavior-less, serializable objects whose main advantage is ease of transport across networks.
GraphQL has the following facets:
encourages serving rich object graphs, which (in my head anyway) contradicts the "flattish" portion of DTOs,
lets clients choose exactly the data they want, which addresses the "smallish" portion,
returns JSON-esque objects, which addresses the "behavior-less" and "serializable" portions
Do GraphQL and the DTO pattern mutually exclude one another?
Here's what led to this question: We envision a microservices architecture with a gateway. I'm designing one API to fit into that architecture that will serve (among other things) geometries. In many (likely most) cases the geometries will not be useful to client applications, but they'll be critical in others so they must be served. However they're serialized, geometries can be big so giving clients the option to decline them can save lots of bandwidth. RESTful APIs that I've seen handling geometries do that by providing a "returnGeometry" parameter in the query string. I never felt entirely comfortable with that approach, and I initially envisioned serving a reasonably deep set of related/nested return objects many of which clients will elect to decline. All of that led me to consider a GraphQL interface. As the design has progressed, I've started considering flattening the output (either entirely or partially), which led me to consider the DTO pattern. So now I'm wondering if it would be best to flatten everything into DTOs and skip GraphQL (in favor of REST, I suppose?). I've considered a middle ground with DTOs served using GraphQL to let clients pick and choose the attributes they want on them, but I'm wondering if that's mixing patterns & technologies inappropriately.

I think it's worthwhile differentiating between 2 typical use cases for GraphQL, and a hidden 3rd use case which combines the first two.
In all 3 however, the very nature of a GraphType is to selectively decide which fields you want to expose from your domain entity. Sounds familiar? It should, that's what a DTO is. GraphQL or not, you do not want to expose the 'password' field on your Users table for example, hence you need to hide it from your clients one way or another.
This is enabled by the fact that GraphQL doesn't make any assumptions about your persistence layer and gives you the tools to treat your input types / queries as you see fit.
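The idea that the exposed type acts as a whitelist can be sketched without any GraphQL library. Everything here is hypothetical (the UserFields class, its field names, the select method); it only imitates the effect of a client choosing fields from a type that never declared 'password' in the first place.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UserFields {

    // Fields the API is willing to expose; "password" is deliberately absent.
    static final Set<String> EXPOSED = Set.of("id", "username", "email");

    // Returns only the requested fields that are also exposed,
    // in the order the client asked for them.
    static Map<String, Object> select(Map<String, Object> row, List<String> requested) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String field : requested) {
            if (EXPOSED.contains(field) && row.containsKey(field)) {
                result.put(field, row.get(field));
            }
        }
        return result;
    }
}
```

Asking for a field that is not exposed is silently ignored in this sketch; a real GraphQL server would typically reject such a query as invalid against the schema.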
1. GraphQL endpoint exposed directly to clients (e.g. web, mobile):
In this use case you'd use any GraphQL client to talk to your graphql endpoint directly. The DTOs here are the actual GraphType objects, and are structured depending on the Fields you added to your exposed GraphTypes.
Internally, you would use field resolvers to transform your DTO to your domain entity and then use your repository to persist it.
DTO transformation occurs inside the GraphType's Field resolver.
GraphQL --> DTO --> Domain Entity --> Data Store
2. REST endpoint exposed to clients, which internally consumes a GraphQL endpoint:
In this use case, your web and mobile clients are working with traditional DTOs via REST. The controllers however are connecting to an internally-exposed GraphQL endpoint - as opposed to use case #1 - whose GraphTypes are an exact mapping of your domain entities, password field included!
DTO transformation occurs in the controller before calling the endpoint.
DTO --> Domain Entity --> GraphQL --> Data Store
3. Combining 1 and 2
This is a use case for when you're shifting your architecture from one to the other and you don't want to break things for client consumers, so you leave both options open and eventually decommission one of them.

Domain driven design is confusing

1) What are the BLL-services? What's the difference between them and Service Layer services? What goes to domain services and what goes to service layer?
2) How do I refactor the BLL model to give it behavior? The Post entity holds a collection of feedbacks, which already makes it possible to add another Feedback through feedbacks.Add(feedback). Obviously there are no calculations in a plain blog application. Should I define a method to add a Feedback inside the Post entity? Or should that behavior be maintained by a corresponding service?
3) Should I use Unit-Of-Work (and UnitOfWork-Repositories) pattern like it's described in http://www.amazon.com/Professional-ASP-NET-Design-Patterns-Millett/dp/0470292784 or it would be enough to use NHibernate ISession?

1) Business Layer and Service Layer are actually synonyms. The 'official' DDD term is an Application Layer.
The role of an Application Layer is to coordinate work between Domain Services and the Domain Model. This could mean, for example, that an Application function first loads an entity through a Repository and then calls a method on the entity that will do the actual work.
2) Sometimes when your application is mostly data-driven, building a full featured Domain Model can seem like overkill. However, in my opinion, when you get used to a Domain Model it's the only way you want to go.
In the Post and Feedback case, you want an AddFeedback(Feedback) method from the beginning because it leads to less coupling (you don't have to know whether the Feedback items are stored in a List or in a Hashtable, for example) and it offers you a nice extension point. What if you ever want to add a check that no more than 10 Feedback items are allowed? If you have an AddFeedback method, you can easily add the check in one single place.
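Here is that extension point sketched in Java rather than C#, to stay consistent with the other examples. The limit of 10 comes from the text; the class and method names are otherwise hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

record Feedback(String author, String text) {}

final class Post {

    // Business rule from the example: at most 10 feedback items per post.
    static final int MAX_FEEDBACKS = 10;

    // The storage type is an internal detail that callers never see.
    private final List<Feedback> feedbacks = new ArrayList<>();

    // The single place where the rule is enforced.
    void addFeedback(Feedback feedback) {
        if (feedbacks.size() >= MAX_FEEDBACKS) {
            throw new IllegalStateException(
                    "A post accepts at most " + MAX_FEEDBACKS + " feedback items");
        }
        feedbacks.add(feedback);
    }

    // Read-only view, so nobody can bypass addFeedback(...)
    List<Feedback> feedbacks() {
        return Collections.unmodifiableList(feedbacks);
    }
}
```

Exposing the collection as read-only is what makes the check reliable; with a mutable collection, callers could still add items directly and skip the rule.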
3) The UnitOfWork and Repository pattern are a fundamental part of DDD. I'm no NHibernate expert but it's always a good idea to hide infrastructure specific details behind an interface. This will reduce coupling and improves testability.

I suggest you first read the DDD book or its short version to get a basic comprehension of the building blocks of DDD. There's no such thing as a BLL-Service or a Service layer Service. In DDD you've got
the Domain layer (the heart of your software where the domain objects reside)
the Application layer (orchestrates your application)
the Infrastructure layer (for persistence, message sending...)
the Presentation layer.
There can be Services in all these layers. A Service is just there to provide behaviour to a number of other objects; it has no state. For instance, a Domain layer Service is where you'd put cohesive business behaviour that does not belong in any particular domain entity and/or is required by many other objects. The inputs and outputs of the operations it provides would typically be domain objects.
Anyway, whenever an operation seems to fit perfectly into an entity from a domain perspective (such as adding feedback to a post, which translates into Post.AddFeedback() or Post.Feedbacks.Add()), I always go for that rather than adding a Service that would only scatter the behaviour in different places and gradually lead to an anemic domain model. There can be exceptions, like when adding feedback to a post requires making connections between many different objects, but that is obviously not the case here.

You don't need a unit-of-work pattern on top of the NHibernate session:
Why would I use the Unit of Work pattern on top of an NHibernate session?
Using Unit of Work design pattern / NHibernate Sessions in an MVVM WPF

Is it possible to inject too many repositories into a controller?

I have the first large solution that I am working on using MVC3. I am using ViewModels, AutoMapper, and DI.
To create my ViewModels for some of the more complex edit/creates I am injecting 10 or so repositories. All but one of the repositories are only there to get the data to populate a select list on the ViewModel, as I am simply getting associated FK entities etc.
I've seen it mentioned that injecting large numbers of repositories is bad practice and that I should refactor. How many is too many? Is this too many? How should I refactor? Should I create a dedicated service that returns select lists etc?
Just to give an example, here is the constructor for my RequirementsAndOffer Controller
public RequirementsAndOfferController(
    IdefaultnoteRepository defaultnoteRepository,
    IcontractformsRepository contractformsRepository,
    IperiodRepository periodRepository,
    IworkscopeRepository workscopeRepository,
    IcontactRepository contactRepository,
    IlocationRepository locationRepository,
    IrequirementRepository requirementRepository,
    IContractorRepository contractorRepository,
    IcompanyRepository companyRepository,
    IcontractRepository contractRepository,
    IrequirementcontracttypeRepository requirementcontracttypeRepository,
    IoffercontractRepository offercontractRepository)
All of the above populate the selects apart from the requirementRepository and offercontractRepository which I use to get the requirements and offers.
Update
General thoughts and updates. I was encouraged to consider this issue by Mark Seemann's blog article on over-injection. I was specifically interested in the repositories and why I was having to inject this number of them. Having considered my design, I think I am clearly not using one repository for each aggregate root (as per DDD).
I have for example cars, and cars have hire contracts, and hire contracts have hire periods.
I was creating a repository each for cars, hire contracts, and hire periods. So that was creating 3 repositories when I think there should only be one. Hire contracts and periods can't exist without cars. Therefore I have reduced some repositories that way.
I am still left with some complex forms (the customer is demanding these large forms) that are requiring a number of repositories in the controller. This maybe is because I haven't refactored enough. As far as I can see though I am going to need separate repositories to get the select lists.
I'm considering options for creating some sort of service that will provide all the select lists I need. Is that good practice/bad practice? Should my services only be orientated around aggregate roots? If so having one service providing selects would be wrong. However the selects do seem to be the same type of thing and grouping them together is attractive in some ways.
Would seem my question is similar to how-to-deal-with-constructor-over-injection-in-net
I guess I am now more looking for specific advice on whether a Select List service is good or bad.
Any advice appreciated.

You have the right idea starting with a repository pattern. Depending on how you use your repositories, I completely understand how you might end up with a lot (maybe even 1 per database table.)
You'll have to be the judge of your own specifications and business requirements, but perhaps you can consider a business layer or service layer.
Business Layer
This layer might be composed of business objects that encapsulate one or more intents of your view models (and inherently views). You didn't describe any of your domain, but maybe a business object for Users could include some CRUD methods. Then, your view models would rely on these business objects instead of directly calling the repository methods. You may have already guessed that the refactoring would move calls to repository methods into the business objects.
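A rough sketch of that refactoring, written in Java instead of C# and using invented names (SelectListService, SelectLists, findAllNames). It assumes the lookup repositories only feed drop-downs, as in the question, and shows the controller depending on one collaborator instead of one per select list.

```java
import java.util.List;

// Hypothetical lookup repositories, each feeding one select list.
interface PeriodRepository { List<String> findAllNames(); }
interface LocationRepository { List<String> findAllNames(); }
interface CompanyRepository { List<String> findAllNames(); }

// Everything the edit form needs for its drop-downs, in one object.
record SelectLists(List<String> periods, List<String> locations, List<String> companies) {}

// Business object that hides the lookup repositories from the controller.
final class SelectListService {
    private final PeriodRepository periods;
    private final LocationRepository locations;
    private final CompanyRepository companies;

    SelectListService(PeriodRepository periods, LocationRepository locations,
                      CompanyRepository companies) {
        this.periods = periods;
        this.locations = locations;
        this.companies = companies;
    }

    SelectLists forRequirementForm() {
        return new SelectLists(periods.findAllNames(),
                               locations.findAllNames(),
                               companies.findAllNames());
    }
}

// The controller no longer knows how many repositories are involved.
final class RequirementsAndOfferController {
    private final SelectListService selectLists;

    RequirementsAndOfferController(SelectListService selectLists) {
        this.selectLists = selectLists;
    }

    SelectLists createForm() {
        return selectLists.forRequirementForm();
    }
}
```

Whether such a grouping is cohesive enough to deserve its own service, or merely moves the long constructor one level down, remains a design judgment.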
Service Layer
A service layer might even use some of the business objects described above, but perhaps you can design some type of messaging protocol/system/standard to communicate between your web app and maybe a WCF service running on the server to control state of some kind.
This isn't the most descriptive of examples, but I hope it helps with a very high level view of refactoring options.

Architecture : layer responsibility and communication with modularity

I'm currently trying to design an architecture for my new webapp project that has this kind of concept:
It consists of several big modules that are independent from one another, but can still communicate with and affect one another.
For example, I could enable the purchasing module along with the production module in my webapp, and let's assume the modules could communicate with one another.
But then I could activate only the purchasing module and disable the production module in the webapp, just by configuring it, without changing any of the code, and the purchasing module will still work fine (independently of the production module).
Here's what i've been thinking about for the architectural layers to support this kind of application :
The UI Layer
JSF 2.0 + Primefaces widgets
Requestscoped ManagedBean + Flash object to transfer data between pages
The ManagedBean will deal with the UI states, UI validations, but not with the business logic operations
The ManagedBean also has access to the service layer, injected by Spring
ManagedBean could have simple fields (like string, integer, etc), or view models (to encapsulate some related fields), or even the Entity models, which would start out as transient objects and become detached objects once they have been persisted inside a transaction and returned from it.
These field combinations could be used depending on the situation, and the validations, for example @Required, will be placed in the ManagedBean's setter method. The Entity model could have @NotNull or @Size on its fields.
The entities, in my thinking, are only JPA POJOs with the JPA annotations defining the relationships between the entities, without any behaviours except the validations defined by the annotations.
The Service Layer
This layer will handle the business logic validations and operations
Modularity : Could also call the service layer of other modules, where the other modules could be non-existent if disabled via configuration. Perhaps this can be achieved via another layer for the communication between modules, or perhaps I could use Spring to inject empty implementations for the disabled modules ?
Input : It can accept Entity models, or plain variables, or view models
Output : The return value could vary from void, Entity, a list of Entities (to be displayed later in a datatable in JSF), and could be plain variables like boolean, string, integer, etc.
In the future, this layer will also provide web services for mobile devices or other kinds of languages that support web services (I still don't know how, but I think this is possible, even if the methods accept objects or entities as parameters)
Each service object will have DAO instance injected by Spring, and will call the DAO for data operations, like CRUD operations, querying, etc
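The "empty implementations for disabled modules" idea from the Modularity point can be sketched in plain Java, without Spring, as a Null Object behind a shared interface. All names here (ProductionService, DisabledProductionService, ModuleWiring, the module.production.enabled key) are hypothetical and only illustrate the shape; this is a sketch, not a recommended final design.

```java
import java.util.Map;

// Hypothetical contract the purchasing module uses to talk to production.
interface ProductionService {
    // Returns how many units were actually scheduled for production.
    int scheduleProduction(String itemCode, int quantity);
}

// Real implementation, wired in when the production module is enabled.
class DefaultProductionService implements ProductionService {
    public int scheduleProduction(String itemCode, int quantity) {
        // A real module would create production orders here.
        return quantity;
    }
}

// Empty ("null object") implementation used when the module is disabled.
class DisabledProductionService implements ProductionService {
    public int scheduleProduction(String itemCode, int quantity) {
        return 0; // nothing is scheduled, but callers need no special casing
    }
}

// The purchasing module only ever sees the interface.
class PurchasingService {
    private final ProductionService production;

    PurchasingService(ProductionService production) {
        this.production = production;
    }

    // Returns the number of units scheduled as a result of the purchase.
    int purchase(String itemCode, int quantity) {
        // ... purchasing logic would go here ...
        return production.scheduleProduction(itemCode, quantity);
    }
}

// Chooses the implementation from configuration (assumed key name).
class ModuleWiring {
    static ProductionService productionFor(Map<String, String> config) {
        boolean enabled = Boolean.parseBoolean(
                config.getOrDefault("module.production.enabled", "false"));
        return enabled ? new DefaultProductionService()
                       : new DisabledProductionService();
    }
}
```

In a Spring application the same choice would probably be made through conditional bean registration (for example profiles) rather than a hand-written factory, but the purchasing code would look the same either way.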
The DAO Layer
Will have the data operations like CRUD operations, querying (jpql, named query, criteria query, native sql query, stored procedure calls) etc
Input : It can accept Entity models, or plain variables, or view models
Output : The return value could vary from void, Entity, a list of Entities (to be displayed later in a datatable in JSF), and could be plain variables like boolean, string, integer, etc.
Having one DAO for each entity is the norm, but when dealing with multiple tables in a single data operation, I'd have to introduce new DAOs.
Will have the EntityManager injected by Spring
These are the things I have in mind, and with this, I tried doing some googling around these topics, and found out about many other things, like :
Domain Driven Design (DDD), where the entities could have persisting logic in them ? I think this is also the active record pattern ? Spring Roo seems to be generating this kind of model also. It seems to be the opposite of the Anemic Domain Model.
The data transfer object (DTO), encapsulating the communication data between layers, avoiding the lazy initialization problems with unloaded lazy-fetch hierarchies when using JPA ? Open Session in View seems to have its own pros and cons also in solving the lazy exception.
And some would say you don't need the DAO anymore, as described in the Spring Roo documentation
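To make the DTO option above concrete, here is a minimal plain-Java sketch of the mapping. PurchaseOrder and OrderLine are stand-ins for JPA entities (annotations left out so the snippet runs on its own), and PurchaseOrderDto and PurchaseOrderMapper are hypothetical names. The assumption is that the mapper would be called inside the service transaction, so any lazily loaded lines are read while the persistence context is still open and the UI only ever sees the flat DTO.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a JPA entity; a real one would carry JPA annotations.
class OrderLine {
    final String itemCode;
    final int quantity;

    OrderLine(String itemCode, int quantity) {
        this.itemCode = itemCode;
        this.quantity = quantity;
    }
}

// Stand-in for a JPA entity with a (normally lazy) one-to-many collection.
class PurchaseOrder {
    final long id;
    final String supplierName;
    final List<OrderLine> lines = new ArrayList<>();

    PurchaseOrder(long id, String supplierName) {
        this.id = id;
        this.supplierName = supplierName;
    }
}

// Flat, immutable object handed to the UI or a web service client.
final class PurchaseOrderDto {
    final long id;
    final String supplierName;
    final int totalQuantity;

    PurchaseOrderDto(long id, String supplierName, int totalQuantity) {
        this.id = id;
        this.supplierName = supplierName;
        this.totalQuantity = totalQuantity;
    }
}

final class PurchaseOrderMapper {
    // Reads everything it needs from the entity graph in one place, so the
    // caller never touches an uninitialized lazy collection afterwards.
    static PurchaseOrderDto toDto(PurchaseOrder order) {
        int total = 0;
        for (OrderLine line : order.lines) {
            total += line.quantity;
        }
        return new PurchaseOrderDto(order.id, order.supplierName, total);
    }
}
```

The cost is exactly the boilerplate mentioned in the question: one extra class and one mapping method per representation.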
And with all these matters, please share your thoughts on my current design when it comes to these :
Speed of development, with me thinking about having less boilerplate from being able to make use of the Entities directly, instead of converting to and from DTOs
Ease of maintenance, with me thinking about having clear separation between UI state/logic, business process logic, and the data operations layer
Support for the modularization, perhaps using maven with each module as one artifact, depending on one another as needed ? <-- this is where it's all very foggy for me
Web services in the future. I have never tried web services before, but I assume public methods in the service layer could be exported as web services, so they could be called from mobile devices, or any other platform that supports web service calls ?
Could you please share your experience in this matter ?

Find an OR Mapper you like and don't devote any more attention to the data layer. That is mostly a solved problem, and most of the attention you devote to that will be reinventing the wheel. Very few people write applications whose CRUD needs are so unique that they obviate ORM use these days.
Some of the same advice for the UI - find tools and frameworks rather than spending too much time on all of that, there's a lot of good development wealth in place there.
So, concentrate on the service layer, where the unique nature of your application is really expressed. But we can't really validate or critique your service layer because we don't know anything about the problem you're trying to solve. All of the things you've listed are certainly good approaches for certain problems, certain sets of trade-offs, etc. Without knowing more about what matters (performance / development time / configurability / robustness / clarity), nobody can tell you what the right set of choices is.
On your "output" item - other devices can support communication with your app as long as everything serializes down to a common format, usually XML. Then you just send it over the wire, and rehydrate it on the other end.
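As a rough illustration of that round trip, here is a sketch using only the XML APIs that ship with the JDK. PurchaseResult and PurchaseResultXml are hypothetical names, and the element names are made up for the example; a real project would more likely let a web service or binding framework generate this.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical value returned by a service method.
final class PurchaseResult {
    final String itemCode;
    final int quantity;

    PurchaseResult(String itemCode, int quantity) {
        this.itemCode = itemCode;
        this.quantity = quantity;
    }
}

final class PurchaseResultXml {
    // Serialize the object into a small XML document.
    static String toXml(PurchaseResult result) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("purchaseResult");
        doc.appendChild(root);

        Element item = doc.createElement("itemCode");
        item.setTextContent(result.itemCode);
        root.appendChild(item);

        Element qty = doc.createElement("quantity");
        qty.setTextContent(Integer.toString(result.quantity));
        root.appendChild(qty);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Rebuild ("rehydrate") the object on the receiving side.
    static PurchaseResult fromXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        String itemCode = root.getElementsByTagName("itemCode")
                .item(0).getTextContent();
        int quantity = Integer.parseInt(
                root.getElementsByTagName("quantity").item(0).getTextContent());
        return new PurchaseResult(itemCode, quantity);
    }
}
```

Any client that can parse XML can consume this, whatever language it is written in; only the element names have to be agreed on.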
Software development, when it is non-trivial, is a Wicked Problem. It is likely that much of the advice you get will need to be thrown out halfway through your project. I don't generally believe in grand architectures - focus on solving particular problems as well as you can, and if you're lucky, a pattern will emerge that you can take advantage of. Anything more is generally hubris.
