Related
I'm building a web application with Spring, and I'm at the point where I have an Entity, a Repository, a RestController, and I can access endpoints in my browser.
I'm now trying to return JSON data to the browser, and I'm seeing all of this stuff about DTOs in various guides.
Do I really need a DTO? Can't I just put the serialization logic on the entity itself?
I think, this is a little bit debatable question, where the short answer would be:
It depends.
Little longer answer
There are plenty of people, who, in plenty of cases, would prefer one approach (using DTOs) over another (using bare entities), and vice versa; however, there is no the single source of truth on which is better to use.
It very much depends on the requirements, architectural approach you decide to stick with, (even on) personal preference and other (project-related) specific details.
Some even claim that DTO is an anti-pattern; some love using them; some think, that data refinement/adjustment should happen on the consumer/client side (for various reasons, out of which, one can be No Policy for API changes).
That being said, YES, you can simply return the #Entity instance (or list of entities) right from your controller and there is no problem with this approach. I would even say, that this does not necessarily violate something from SOLID or Clean Code principles.again, it depends on what do you use a response for, what representation of data do you need, what should be the capacity and purpose of the object in question, and etc..
DTO is generally a good practice in the following scenarios:
When you want to aggregate the data for your object from different resources, i.e. you want to put some object transformation logic between the Persistence Layer and the Business(or Web) Layer:
Imagine you fetch from your database a List<Employee>; however, from another 3rd party web-service, you also receive some complementary-to-employee data for each Employee object, which you have to aggregate in the Employee objects (aggregate, or do some calculation, or etc. point is that you want to combine the data from different resources). This is a good case when you might want to use DTO pattern. It is reusable, it conforms to Single-Responsibility Principle, and it is well segregated from other layers;
When you don't necessarily combine data received from different sources, but you want to modify the entity which you will be returning:
Imagine you have a very big Entity (with a lot of fields), and the client, which calls the corresponding endpoint (Front-End application, Mobile, or any client), has no need of receiving this huge entity (or list of entities). If you, despite the client's requirement, will still be sending the original/unchanged entity, you will end up consuming network bandwidth/load inefficiently (more than enough), performance will be weaker, and generally, you will be just wasting computing resources for no good reason. In this case, you might want to transform your original Entity to the DTO object, which the client needs (only with required fields). Here, you might even want to implement different DTO classes, for one entity, for different consumers/clients.
However, if you are sure, that your table/relation representations (instances of #Entity classes) are exactly what the client needs, I see no necessity of introducing DTOs.
Supporting further the idea, that #Entity can be returned to the presentation layer without DTO
Java Persistence with Hibernate, Second Edition, in §3.3.2, even motivates it explicitly, that:
You can reuse persistent classes outside the context of persistence, in unit tests or in the presentation layer, for example. You can create instances in any runtime environment with the regular Java new operator, preserving testability and reusability;
Hibernate entities do not need to be explicitly Serializable;
You might also want to have a look at this question.
In general, it’s up to you to decide. If your application is relatively simple and you don’t expose any sensitive information, an response is y ambiguous for the client, there is nothing criminal in returning back the whole entity. If your client expect a small slice of entity, eg only 2-3 fields from 30 fields entity, then it make sense to do the translation or consider different protocol such as GraphQL.
It is ideal design where you should not expose the entity.
It is a good design to convert your entity to DTO before you pass the same to web layer.
These days RestJpacontrollers are also available.
But again it all varies from application to application which one to use.
If your application does a need only read only operation then make sense to use RestJpacontrollers and can use entity at web layer.
In other case where application modifies data frequently then in that case better option to opt DTO and use it at the UI layer.
Another case is of multiple requests are required to bring data for a particular task. In the same case data to be brought can be combined in a DTO so that only one request can bring all the required data.
We can use data of multiple entities data into one DTO.
This DTO can be used for the front end or in the rest API.
Do I really need a DTO? Can't I just put the serialization logic on the entity itself?
I'd say you don't, but it is better to use them, according to SOLID principles, namely single responsibility one. Entities are ORM should be used to interact with database, not being serialized and passed to the other layers.
What exactly are those controllers? We were asked to build an ATM in Java for school project, and part of our design is:
We have accounts that stores most of informations
We have users that can make operations for their own accounts and stores some minor informations.
(Also we have atm class to store users and make some toplevel changes)
We have userInterface to catch inputs and use controllers.
Am I right that our accounts are models, interfaces are views and users are controllers?
Thanks a lot to address my problem,!
You say: "accounts are models". Actually no, they are not.
A domain model (also called model layer, or model) is not a single component, but a layer, composed of multiple components. It abstracts a real-life process and the needed resources. In other words, it models a business logic (represented by data and, especially, behavior).
The model layer can be part of an application, or can be shared by multiple applications.
Each of the model components has a certain role. It can be an entity (e.g. a domain object), a value object, a service, a data mapper abstraction, a repository abstraction, an abstraction of an external service (like an email or paying service), etc. By abstractions, I mean interfaces or abstract classes. The concrete implementations should not be part of the domain model, but of a different, external space vis-à-vis model, serving as infrastructure constructs.
So, your User, Account and Atm classes are just components of the model. More exactly, they are entities.
On the other hand, the controllers and the views are components of the UI layer.
The controllers (should) take the responsibility of only deferring (e.g. dispatching) the execution of the user request to the model layer. More precisely, to the service layer - which is defined as a boundary layer of the domain model and is represented by its service components. So, a certain service instance is injected into the controller as dependency. The controller passes it the current user input (data) and calls a certain method of it, in which the required user request processing steps are defined. The service instance, in conjunction with other model layer objects, performs all these steps to update the model with the user input. Having the dispatching task in mind, a controller method should therefore be slim, containing 1-3 lines of code.
The views (should) obtain (the updated) data from the domain model - by querying it (e.g. by pulling data from it) itself - and display it. Analog to the controllers, the views communicate with the model through certain service component(s).
I used the verb "should" to emphasize the fact, that there are different implementation models of the UI layer. An implementation could use the controllers and the views as described above - the original MVC design. Another implementation would use the controllers not only to update the model (through services), but also to query it (through services), in order to pass the received data to the view for displaying to the user. Or an implementation could even not use services at all, forcing the steps of processing the user request to be defined in the controllers and/or the views. And so on. So, it's up to you, how you choose to implement the UI layer.
Note that you can also have controllers and/or views named like the model components (User, Account, Atm, etc). But then you must use namespaces to differentiate between all of them - the recommended way to go. In Java, namespaces are managed by packages.
Some resources with more details (mainly web MVC related, with examples in PHP):
How should a model be structured in MVC
PHP - Structuring a Slim3 web application using MVC and understanding the role of the model
MVC, controller - use cases
GeeCON 2014: Sandro Mancuso - Crafted Design
MVC, Delivery Mechanism and Domain Model
Catalog of Patterns of Enterprise Application Architecture
Controllers in this context will receive user's request via interfaces and call service to perform any action, call Database layer to fetch data and populate into models, integrate model with view to create desired view and return back the combined view to the user. The User and Account will be different but related entities which have their representation in the database.
You didn't get it quite right, but don't worry -- it's actually not complicated to figure out what the parts are when you know why they exist in the first place.
Here it is:
MODEL: MVC is an architectural pattern for separating concerns in applications with user interfaces. The model is simply the part of the application that is not concerned with the user interface. It has no code in it at all that depends on the UI, the specific operations that the UI provides. Together the view and controller define the user interface, and the model is just everything else, although when we talk about it in the context of a MVC application, we usually just refer to everything else the UI might see or affect. Sometimes this leads us to make an object in the application that is called the model, but that is not really required and often not useful.
It's important to understand that the arrows in those MVC diagrams that go in a circle show data flow. When you draw a diagram that shows dependency, the view and controller depend on the model. The model does not depend on the view.
Now that we've defined the model as "the model is not the user interface", obviously the user interface consists of the view and the controller.
We separate these, and put the user between them:
VIEW: This is the part of the user interface that presents information to the user. It determines what the user will see, hear, etc. It takes information from the model and provides information to the user.
CONTROLLER: This is the part of the user interface that gets information from the user and implements the actions that he wants to perform.
The canonical example is that the view creates the button, and the controller handles the click. The separation between the view and controller, with the user (and time) in between, is the whole purpose of MVC. They do not connect in software except via the model.
The fundamental assumption of MVC is just this: We can separate the view code from the controller code, because they run at different times. The view code runs after the model changes, before the user can act on those changes, and the controller code runs after the user decides to act.
It's also important to understand that this is never entirely accurate. MVC is an over-simplification and there are actually always connections between the controller code and the view code that don't go through the model. But we try. Most of the design work in MVC frameworks or applications is an attempt to manage and properly design the ways in which they cheat in this.
In my limited experience, I've been told repeatedly that you should not pass around entities to the front end or via rest, but instead to use a DTO.
Doesn't Spring Data Rest do exactly this? I've looked briefly into projections, but those seem to just limit the data that is being returned, and still expecting an entity as a parameter to a post method to save to the database. Am I missing something here, or am I (and my coworkers) incorrect in that you should never pass around and entity?
tl;dr
No. DTOs are just one means to decouple the server side domain model from the representation exposed in HTTP resources. You can also use other means of decoupling, which is what Spring Data REST does.
Details
Yes, Spring Data REST inspects the domain model you have on the server side to reason about the way the representations for the resources it exposes will look like. However it applies a couple of crucial concepts that mitigate the problems a naive exposure of domain objects would bring.
Spring Data REST looks for aggregates and by default shapes the representations accordingly.
The fundamental problem with the naive "I throw my domain objects in front of Jackson" is that from the plain entity model, it's very hard to reason about reasonable representation boundaries. Especially entity models derived from database tables have the habit to connect virtually everything to everything. This stems from the fact that important domain concepts like aggregates are simply not present in most persistence technologies (read: especially in relational databases).
However, I'd argue that in this case the "Don't expose your domain model" is more acting on the symptoms of that than the core of the problem. If you design your domain model properly there's a huge overlap between what's beneficial in the domain model and what a good representation looks like to effectively drive that model through state changes. A couple of simple rules:
For every relationship to another entity, ask yourself: couldn't this rather be an id reference. By using an object reference you pull a lot of semantics of the other side of the relationship into your entity. Getting this wrong usually leads entities referring to entities referring to entities, which is a problem on a deeper level. On the representation level this allows you to cut off data, cater consistency scopes etc.
Avoid bi-directional relationships as they're notoriously hard to get right on the update side of things.
Spring Data REST does quite a few things to actually transfer those entity relationships into the proper mechanisms on the HTTP level: links in general and more importantly links to dedicated resources managing those relationships. It does so by inspecting the repositories declared for entities and basically replaces an otherwise necessary inlining of the related entity with a link to an association resource that allows you to manage that relationship explicitly.
That approach usually plays nicely with the consistency guarantees described by DDD aggregates on the HTTP level. PUT requests don't span multiple aggregates by default, which is a good thing as it implies a scope of consistency of the resource matching the concepts of your domain.
There's no point in forcing users into DTOs if that DTO just duplicates the fields of the domain object.
You can introduce as many DTOs for your domain objects as you like. In most of the cases, the fields captured in the domain object will reflect into the representation in some way. I have yet to see the entity Customer containing a firstname, lastname and emailAddress property, and those being completely irrelevant in the representation.
The introduction of DTOs doesn't guarantee a decoupling by no means. I've seen way too many projects where they where introduced for cargo-culting reasons, simply duplicated all fields of the entity backing them and by that just caused additional effort because every new field had to be added to the DTOs as well. But hey, decoupling! Not. ¯\_(ツ)_/¯
That said, there are of course situations where you'd want to slightly tweak the representation of those properties, especially if you use strongly typed value objects for e.g. an EmailAddress (good!) but still want to render this as a plain String in JSON. But by no means is that a problem: Spring Data REST uses Jackson under the covers which offers you a wide variety of means to tweak the representation — annotations, mixins to keep the annotations outside your domain types, custom serializers etc. So there is a mapping layer in between.
Not using DTOs by default is not a bad thing per se. Just imagine the outcry by users about the amount of boilerplate necessary if we required DTOs to be written for everything! A DTO is just one means to an end. If that end can be achieved in a different way (and it usually can), why insist on DTOs?
Just don't use Spring Data REST where it doesn't fit your requirements.
Continuing on the customization efforts it's worth noticing that Spring Data REST exists to cover exactly the parts of the API, that just follow the basic REST API implementation patterns it implements. And that functionality is in place to give you more time to think about
How to shape your domain model
Which parts of your API are better expressed through hypermedia driven interactions.
Here's a slide from the talk I gave at SpringOne Platform 2016 that summarizes the situation.
The complete slide deck can be found here. There's also a recording of the talk available on InfoQ.
Spring Data REST exists for you to be able to focus on the underlined circles. By no means we think you can build a great really API solely by switching Spring Data REST on. We just want to reduce the amount of boilerplate for you to have more time to think about the interesting bits.
Just like Spring Data in general reduces the amount of boilerplate code to be written for standard persistence operations. Nobody would argue you can actually build a real world app from only CRUD operations. But taking the effort out of the boring bits, we allow you to think more intensively about the real domain challenges (and you should actually do that :)).
You can be very selective in overriding certain resources to completely take control of their behavior, including manually mapping the domain types to DTOs if you want. You can also place custom functionality next to what Spring Data REST provides and just hook the two together. Be selective about what you use.
A sample
You can find a slightly advanced example of what I described in Spring RESTBucks, a Spring (Data REST) based implementation of the RESTBucks example in the RESTful Web Services book. It uses Spring Data REST to manage Order instances but tweaks its handling to introduce custom requirements and completely implement the payment part of the story manually.
Spring Data REST enables a very fast way to prototype and create a REST API based on a database structure. We're talking about minutes vs days, when comparing with other programming technologies.
The price you pay for that, is that your REST API is tightly coupled to your database structure. Sometimes, that's a big problem. Sometimes it's not. It depends basically on the quality of your database design and your ability to change it to suit the API user needs.
In short, I consider Spring Data REST as a tool that can save you a lot of time under certain special circumstances. Not as a silver bullet that can be applied to any problem.
We used to use DTOs including the fully traditional layering ( Database, DTO, Repository, Service, Controllers,...) for every entity in our projects. Hopping the DTOs will some day save our life :)
So for a simple City entity which has id,name,country,state we did as below:
City table with id,name,county,.... columns
CityDTO with id,name,county,.... properties ( exactly same as database)
CityRepository with a findCity(id),....
CityService with findCity(id) { CityRepository.findCity(id) }
CityController with findCity(id) { ConvertToJson( CityService.findCity(id)) }
Too many boilerplate codes just to expose a city information to client. As this is a simple entity no business is done at all along these layers, just the objects is passing by.
A change in City entity was starting from database and changed all layers. (For example adding a location property, well because at the end the location property should be exposed to user as json). Adding a findByNameAndCountryAllIgnoringCase method needs all layers be changed changed ( Each layer needs to have new method).
Considering Spring Data Rest ( of course with Spring Data) this is beyond simple!
public interface CityRepository extends CRUDRepository<City, Long> {
City findByNameAndCountryAllIgnoringCase(String name, String country);
}
The city entity is exposed to client with minimum code and still you have control on how the city is exposed. Validation, Security, Object Mapping ... is all there. So you can tweak every thing.
For example, if I want to keep client unaware on city entity property name change (layer separation), well I can use custom Object mapper mentioned https://docs.spring.io/spring-data/rest/docs/3.0.2.RELEASE/reference/html/#customizing-sdr.custom-jackson-deserialization
To summarize
We use the Spring Data Rest as much as possible, in complicated use cases we still can go for traditional layering and let the Service and Controller do some business.
A client/server release is going to publish at least two artifacts. This already decouples client from server. When the server's API is changed, applications do not immediately change. Even if the applications are consuming the JSON directly, they continue to consume the legacy API.
So, the decoupling is already there. The important thing is to think about the various ways a server's API is likely to evolve after it is released.
I primarily work with projects which use DTOs and numerous rigid layers of boilerplate between the server's SQL and the consuming application. Rigid coupling is just as likely in these applications. Often, changing anything in the DB schema requires us to implement a new set of endpoints. Then, we support both sets of endpoints along with the accompanying boilerplate in each layer (Client, DTO, POJO, DTO <-> POJO conversions, Controller, Service, Repository, DAO, JDBC <-> POJO conversion, and SQL).
I'll admit that there is a cost to dynamic code (like spring-data-rest) when doing anything not supported by the framework. For example, our servers need to support a lot of batch insert/update operations. If we only need that custom behavior in a single case, it's certainly easier to implement it without spring-data-rest. In fact, it may be too easy. Those single cases tend to multiply. As the number of DTOs and accompanying code grows, the inconsistencies eventually become extremely burdensome to maintain. In some non-dynamic server implementations, we have hundreds of DTOs and POJOs that are likely no longer used by anything. But, we are forced to continue supporting them as their number grows each month.
With spring-data-rest, we pay the cost of customization early. With our multi-layer hard-coded implementations, we pay it later. Which one is preferred depends on a lot of factors (including the team's knowledge and the expected lifetime of the project). Both types of project can collapse under their own weight. But, over time, I've become more comfortable with implementations (like spring-data-rest without DTOs) that are more dynamic. This is especially true when the project lacks good specifications. Over time, such a project can easily drown in the inconsistencies buried within its sea of boilerplate.
From the Spring documentation I don't see Spring data REST exposes entities, you are the one doing it.
Spring Data projects intend to ease the process of accessing different data sources, but you are the one deciding which layer to expose on Spring Data Rest.
Reorganizing your project will help to solve your issue.
Every #Repository that you create with Spring data represents more a DAO in the sense of design than a Repository. Each one is tightly coupled with a particular Data source you want to reach out to. Say JPA, Mongo, Redis, Cassandra,...
Those layers are meant to return entity representations or projections.
However if you check out the Repository pattern from a design perspective you should have a higher layer of abstraction from those specific DAOs where your app use those DAOs to get info from as many different sources it needs, and builds business specific objects for your app (Those might looks more like your DTOs).
That is probably the layer you want to expose on your Spring Data Rest.
NOTE: I see an answer recommending to return Entity instances only because they have the same properties as the DTO. This is normally a bad practice and in particular is a bad idea in Spring and many other frameworks because they do not return your actual classes, they return proxy wrappers so that they can work some magic like lazy loading of values and the likes.
Recently I was learning about ORM (Object Relational Mapping) and the 3 tier architecture style (presentation,business and data persistence).
If I understand correctly, I can separate the data persistence layer into DTO and DAO layer.
I would like to understand, how the following parts works together in a data persistence layer.
DAL (Data Access Layer)
DTO (Data Transfer Object)
DAO (Data Access Object)
In a top of that I learnt that
In larger applications MVC is the presentation tier only of an N-tier
architecture.
I got really confused, how it can be even possible for example in a 3 tier architecture style where the MVC is the just a presentation tier, and the DTO, DAO, DAL is just a part of data persistence tier. I'm totally lost.
I'd be glad if someone tell me the truth about how does it works together.
Please don't close this question because the many different expressions, I saw it everywhere these things are related to each other basically in big applications and I can't imagine how does it works.
I appreciate any answer!
Lets start with purpose of each: -
DTO
Data Transfer Objects. These are generally used to transfer data from controller to client (JS). Term is also used for POCOs/POJOs by few which actually holds the data retrieved from Database.
DAO
Data Access Object is one of the design patterns used to implement DAL. This builds and executes queries on database and maps the result to POCO/POJO using various other patterns including 'Query Object', 'Data Mapper' etc. DAO layer could be further extended using 'Repository' pattern.
DAL
Data Access Layer abstracts your database activities using DAO/Repository/POCO etc. ORMs help you to build your DAL but it could be implemented without using them also.
MVC
Model View Control is a pattern which is used to separate view (presentation) from business logic. For MVC, it does not matter if DAL is implemented or not. If DAL is not implemented, database logic simply go into your model which is not a good approach.
In larger applications MVC is the presentation tier only of an N-tier
architecture.
Models consume most of your business logic as stated above. In N-tier application, if business logic is entirely separated for the purpose of re-usability across applications/platforms, then Models in MVC are called anemic models. If BI need not to be re-used at that scale in your application, you can use Model to hold it. No confusion, right?
I'd be glad if someone tell me the truth about how does it works
together.
All MV* patterns define the idea/concept only; they do not define implementation. MV* patterns mainly focus on separating view from BI. Just concentrate on this.
Refer this answer for details about different objects holding data.
You might wanna first distinguish between the MVC pattern and the 3-tier architecture. To sum up:
3-tier architecture:
data: persisted data;
service: logical part of the application;
presentation: hmi, webservice...
Now, for the above 3-tier architecture, the MVC pattern takes place in the presentation tier of it (for a webapp):
data: ...;
service: ...;
presentation:
controller: intercepts the HTTP request and returns the HTTP response;
model: stores data to be displayed/treated;
view: organises output/display.
Life Cycle of a typical HTTP request:
The user sends the HTTP request;
The controller intercepts it;
The controller calls the appropriate service;
The service calls the appropriate dao, which returns some persisted data (for example);
The service treats the data, and returns data to the controller;
The controller stores the data in the appropriate model and calls the appropriate view;
The view get instantiated with the model's data, and get returned as the HTTP response.
I need to learn the difference between the type of methods (in term of business logic) that should be inside the Domain, DAO and Service layers objects.
For example, if I am building a small web application to create, edit and delete customers data, as far as I understand inside Domain layer object I should add methods that Get/Set Customers object properties, for example (getName, getDOB, setAddress, setPhone...etc).
Now what I am trying to learn is what methods shall I put in DAO and Service layers objects.
Thanks in advance for your time and efforts.
Speaking generally (not Hibernate or Spring specific):
The DAO layer contains queries and updates to save your domain layer into your datastore (usually a relational DB but doesn't have to be). Use interfaces to abstract your DAO away from the actual datastore. It doesn't happen often, but sometimes you want to change datastores (or use mocks to test your logic), and interfaces make that easier. This would have methods like "save", "getById", etc.
The Service layer typically contains your business logic and orchestrates the interaction between the domain layer and the DAOs. It would have whatever methods make sense for your particular domain, like "verifyBalance", or "calculateTotalMileage".
DAO: "wrapper" methods for "wrapping" JPA or JDBC or SQL or noSQL calls or whatever for accessing DB systems.
Domain: Business logic calls correlated to a single type of entities (domain objects).
Service: Business logic calls correlated to a group of type of entities or to a group of several entities of the same type.
(I'm not sure about English, sorry.......)
It means:
Service layer is "bigger" than Domain layer, is often close to front-end, often calls or uses several domain objects.
Domain objects encapsulate most stuff for one part of the domain (that's why they are called D.O.)
DAO is just sth technical, sometimes needed, sometimes not.
When real domain objects are used, then often "repositories" are used to hide access to database systems, or adding special db functionality or whatever.
front-end --> service method 1 --> d.o. A of type X, d.o. B of type X, List