Spring Data JPA DistinctBy projections - spring-boot

Good day fellow hibernators!
I have a question about how the DistinctBy clause works in conjunction with Spring Data's projections.
Assume I have 3 classes:
public class Task {
    Long id;
    @ManyToOne(fetch = LAZY)
    @JoinColumn(name = "project_id")
    private Project project;
    @OneToOne
    @JoinColumn(name = "contact_id")
    private Contact assigned;
    Boolean deleted;
    // ...
}
public class Contact {
    Long id;
    // ...
}
public class Project {
    Long id;
    @OneToMany(fetch = LAZY, mappedBy = "project")
    private Set<Task> tasks;
    // ...
}
These would be my domain classes. Notice, Project does have a "One2Many" to Tasks, Contact does not. Now, I have 2 interfaces for my projections and the basic TaskRepo with 2 methods:
public interface JustProject {
    Project getProject();
}
public interface JustAssignee {
    Contact getAssigned();
}
public interface TaskRepo extends CrudRepository<Task, Long>, JpaSpecificationExecutor<Task> {
    List<JustAssignee> findDistinctByDeletedFalse();
    List<JustProject> findDistinctByDeletedFalseAndDeletedFalse();
}
The way it works for me right now is that, findDistinctByDeletedFalse returns as many instances as there are distinct contacts for tasks (e.g. if there are 10 tasks but only 3 contacts, the method will return just 3 objects containing all the 3 distinct contacts). Same for findDistinctByDeletedFalseAndDeletedFalse but on project level.
Now I have a few questions here and would love to get some help in understanding how this works exactly.
Is the distinct clause applied after the search is done?
My initial assumption was that this would not work the way it does now. I assumed that the distinct clause is applied before the result is fetched, meaning that it would be DISTINCT based on the underlying Task model, not the returned JustAssignee or JustProject model.
Is there any way to avoid abusing the redundant ...AndDeletedFalse suffix? I need both methods in the repo, but I feel like I had to cheat just to get that result...
... am I doing something wrong? I wanted to get "all distinct contacts/projects assigned to all tasks" in as elegant a way as possible. I ended up trying DistinctBy precisely because I was unsure how it works and wanted to try my luck. I really didn't think it would work this way, but now that it does I would really like to understand why!
Many thanks <3

The DISTINCT keyword is applied to the query, and therefore its effect depends on the select list, which in turn is controlled by the projection. So if you have only project or only contact in your projection, the DISTINCT gets applied to those values only. Note, though, that this relies somewhat on the boundaries of the JPA specification, and I wouldn't be surprised if you see different behaviour with different implementations. See https://github.com/eclipse-ee4j/jpa-api/issues/189 and https://github.com/eclipse-ee4j/jpa-api/issues/124 for somewhat related issues raised against the specification.
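To make the effect of the select list concrete, here is a small in-memory analogy in plain Java. No JPA is involved: `TaskRow`, `distinctContacts`, `distinctTasks` and `sample` are hypothetical names made up for this sketch, and the streams only mimic what DISTINCT does to projected values versus whole task rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class DistinctProjectionDemo {

    // Hypothetical stand-in for one row of the task table.
    static final class TaskRow {
        final long id;
        final long contactId;
        final long projectId;

        TaskRow(long id, long contactId, long projectId) {
            this.id = id;
            this.contactId = contactId;
            this.projectId = projectId;
        }
    }

    // Mimics SELECT DISTINCT contact_id: duplicates among the projected values collapse.
    static List<Long> distinctContacts(List<TaskRow> tasks) {
        return tasks.stream().map(t -> t.contactId).distinct().collect(Collectors.toList());
    }

    // Mimics DISTINCT over the task identity: ids are unique, so nothing collapses.
    static List<Long> distinctTasks(List<TaskRow> tasks) {
        return tasks.stream().map(t -> t.id).distinct().collect(Collectors.toList());
    }

    // Ten tasks spread over three contacts and two projects.
    static List<TaskRow> sample() {
        List<TaskRow> rows = new ArrayList<>();
        for (long id = 1; id <= 10; id++) {
            rows.add(new TaskRow(id, id % 3 + 1, id % 2 + 1));
        }
        return rows;
    }
}
```

With the sample data, projecting onto the contact before deduplicating leaves three values, while deduplicating by task keeps all ten. That matches the "10 tasks but only 3 contacts" observation in the question.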
In order to differentiate methods that otherwise differ only in the return value, you can add any additional string between find and By in the method name. For example, you might rename your methods to findDistinctContactsByDeletedFalse and findDistinctProjectsByDeletedFalse.
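The naming trick can be pictured with a deliberately simplified regular expression. This is only a sketch of the idea and not Spring Data's actual parser: the pattern, the class `MethodNameSketch` and both helper methods are assumptions made for illustration. It shows that everything between the prefix and By is subject text, while only the part after By describes the predicate.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MethodNameSketch {

    // Simplified, assumed shape: prefix, free-form subject, "By", then the predicate.
    private static final Pattern NAME =
            Pattern.compile("^(find|read|get|query)(.*?)By(\\p{Lu}.*)$");

    // Returns the part after "By", which is what drives the WHERE clause.
    static String predicateOf(String methodName) {
        Matcher m = NAME.matcher(methodName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a derived query name: " + methodName);
        }
        return m.group(3);
    }

    // The subject may carry keywords such as Distinct next to arbitrary filler text.
    static boolean isDistinct(String methodName) {
        Matcher m = NAME.matcher(methodName);
        return m.matches() && m.group(2).contains("Distinct");
    }
}
```

Under this toy pattern, findDistinctContactsByDeletedFalse and findDistinctProjectsByDeletedFalse both yield the predicate DeletedFalse, so the two methods can coexist without the redundant AndDeletedFalse suffix.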

I guess this is the best that you can get with Spring Data JPA. You might be able to use just a single method by using the dynamic projections approach, but I think this is a perfect use case for Blaze-Persistence Entity Views.
I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure(domain model) the way you like and map attributes(getters) via JPQL expressions to the entity model.
A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:
@EntityView(Task.class)
public interface TaskAggregateDto {
    // A synthetic "id" to get a grouping context on object level
    @IdMapping("1")
    int getGroupKey();
    Set<ProjectDto> getProjects();
    Set<ContactDto> getContacts();
    @EntityView(Project.class)
    interface ProjectDto {
        @IdMapping
        Long getId();
        String getName();
    }
    @EntityView(Contact.class)
    interface ContactDto {
        @IdMapping
        Long getId();
        String getName();
    }
}
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
public interface TaskRepo extends CrudRepository<Task, Long>, JpaSpecificationExecutor<Task> {
    TaskAggregateDto findOneByDeletedFalse();
}

Related

Same entity for two different aggregate

My schema will be something similar to the above picture.
I am planning to use Spring data JDBC and found that
If multiple aggregates reference the same entity, that entity can’t be part of those aggregates referencing it since it only can be part of exactly one aggregate.
Following are my questions:
How to create two different aggregates for the above without changing the DB design?
How to retrieve the Order / Vendor list alone? i.e. I don't want to traverse through the aggregate root.
How to create two different aggregates for the above without changing the DB design?
I think you simply have three Aggregates here: Order, Vendor and ProductType. A mental test that I always use is:
If A has a reference to B and I delete an A, should I automatically and without exception delete all Bs referenced by that A? If so B is part of the A Aggregate.
This doesn't seem to be true for any of the relationships in your diagram, so let's go with separate Aggregates for each entity.
This in turn makes each reference in the diagram one between different Aggregates.
As described in "Spring Data JDBC, References, and Aggregates" these must be modelled as ids in your Java code, not as Java references.
class Order {
    @Id
    Long orderid;
    String name;
    String description;
    Instant created;
    Long productTypeId;
}
class Vendor {
    @Id
    Long vid;
    String name;
    String description;
    Instant created;
    Long productTypeId;
}
class ProductType {
    @Id
    Long pid;
    String name;
    String description;
    Instant created;
}
Since they are separate Aggregates, each gets its own Repository.
interface Orders extends CrudRepository<Order, Long>{
}
interface Vendors extends CrudRepository<Vendor, Long>{}
interface ProductTypes extends CrudRepository<ProductType, Long>{}
At this point I think we fulfilled your requirements. You might have to add some @Column and @Table annotations to get the exact names you want, or provide a NamingStrategy.
You probably also want some kind of caching for the product types since I'd expect they see lots of reads with only few writes.
And of course you can add additional methods to the repositories, for example:
interface Orders extends CrudRepository<Order, Long>{
    List<Order> findByProductTypeId(Long productTypeId);
}
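To illustrate what "references as ids" means in practice, here is a plain-Java sketch without Spring. The classes `InMemoryProductTypes` and `InMemoryOrders` are hypothetical in-memory stand-ins for the repositories above, and `describe` is a made-up helper. The point is only that crossing an aggregate boundary becomes an explicit lookup by id rather than an object traversal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class AggregateIdReferenceSketch {

    static final class ProductType {
        final long pid;
        final String name;

        ProductType(long pid, String name) {
            this.pid = pid;
            this.name = name;
        }
    }

    // Order refers to its ProductType by id only, never by object reference.
    static final class Order {
        final long orderId;
        final String name;
        final long productTypeId;

        Order(long orderId, String name, long productTypeId) {
            this.orderId = orderId;
            this.name = name;
            this.productTypeId = productTypeId;
        }
    }

    // In-memory stand-in for the ProductTypes repository.
    static final class InMemoryProductTypes {
        private final Map<Long, ProductType> byId = new HashMap<>();

        void save(ProductType type) {
            byId.put(type.pid, type);
        }

        Optional<ProductType> findById(long pid) {
            return Optional.ofNullable(byId.get(pid));
        }
    }

    // In-memory stand-in for the Orders repository.
    static final class InMemoryOrders {
        private final List<Order> all = new ArrayList<>();

        void save(Order order) {
            all.add(order);
        }

        List<Order> findByProductTypeId(long productTypeId) {
            List<Order> result = new ArrayList<>();
            for (Order order : all) {
                if (order.productTypeId == productTypeId) {
                    result.add(order);
                }
            }
            return result;
        }
    }

    // Crossing the aggregate boundary is an explicit lookup by id.
    static String describe(Order order, InMemoryProductTypes types) {
        String typeName = types.findById(order.productTypeId)
                .map(type -> type.name)
                .orElse("unknown type");
        return order.name + " [" + typeName + "]";
    }
}
```

Deleting an Order in this model never touches a ProductType, which is the behaviour the "delete test" above asks for.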

Dynamic JPA query

I have two entities Questions and UserAnswers. I need to make an api in spring boot which returns all the columns from both the entities based on some conditions.
Conditions are:
I will be given a comparator, e.g.: >, <, =, >=, <=
A column name eg: last_answered_at, last_seen_at
A value of the above column eg: 28-09-2020 06:00:18
I will need to return an inner join of the two entities and filter based on the above conditions.
Sample sql query based on above conditions will be like:
SELECT q, ua FROM questions q INNER JOIN
user_answers ua ON q.id = ua.question_id
WHERE ua.last_answered_at > '28-09-2020 06:00:18'
The problem I am facing is that the column name and the comparator for the query needs to be dynamic.
Is there an efficient way to do this using spring boot and JPA as I do not want to make jpa query methods for all possible combinations of columns and operators as it can be a very large number and there will be extensive use of if else?
I have developed a library called spring-dynamic-jpa to make it easier to implement dynamic queries with JPA.
You can use it to write the query templates. The query template will be built into different query strings before execution depending on your parameters when you invoke the method.
This sounds like a clear custom implementation of a repository method. Firstly, I will make some assumptions about the implementation of your entities. Afterwards, I will present an idea on how to solve your challenge.
I assume that the entities look basically like this (getters, setters, equals, hashCode... ignored).
@Entity
@Table(name = "questions")
public class Question {
    @Id
    @GeneratedValue
    private Long id;
    private LocalDateTime lastAnsweredAt;
    private LocalDateTime lastSeenAt;
    // other attributes you mentioned...
    @OneToMany(mappedBy = "question", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<UserAnswer> userAnswers = new ArrayList<>();
    // Add and remove methods added to keep bidirectional relationship synchronised
    public void addUserAnswer(UserAnswer userAnswer) {
        userAnswers.add(userAnswer);
        userAnswer.setQuestion(this);
    }
    public void removeUserAnswer(UserAnswer userAnswer) {
        userAnswers.remove(userAnswer);
        userAnswer.setQuestion(null);
    }
}
@Entity
@Table(name = "user_answers")
public class UserAnswer {
    @Id
    @GeneratedValue
    private Long id;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "question_id")
    private Question question;
}
I will write the code based on Hibernate as the JPA implementation. For other JPA implementations, it might work similarly or the same.
Hibernate often needs the name of attributes as a String. To circumvent the issue of undetected mistakes (especially when refactoring), I suggest the module hibernate-jpamodelgen (see the class names suffixed with an underscore). You can also use it to pass the names of the attributes as arguments to your repository method.
Repository methods try to communicate with the database. In JPA, there are different ways of implementing database requests: JPQL as a query language and the Criteria API (easier to refactor, less error prone). As I am a fan of the Criteria API, I will use the Criteria API together with the modelgen to tell the ORM Hibernate to talk to the database to retrieve the relevant objects.
public class QuestionRepositoryCustomImpl implements QuestionRepository {
    // Format of the incoming value as shown in the question, e.g. 28-09-2020 06:00:18
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");
    @PersistenceContext
    private EntityManager entityManager;
    @Override
    public List<Question> dynamicFind(String comparator, String attribute, String value) {
        CriteriaBuilder cb = entityManager.getCriteriaBuilder();
        CriteriaQuery<Question> cq = cb.createQuery(Question.class);
        // Root gets constructed for first, main class in the request (see return of method)
        Root<Question> root = cq.from(Question.class);
        // Join happens based on respective attribute within root
        root.join(Question_.USER_ANSWERS);
        // The compared attributes are LocalDateTime, so the String value is parsed first
        LocalDateTime date = LocalDateTime.parse(value, FORMAT);
        Path<LocalDateTime> path = root.get(attribute);
        // The following ifs are not the nicest solution.
        // The ifs check what comparator String contains and adds respective where clause to query
        // This .where() is like WHERE in SQL
        if ("=".equals(comparator) || "==".equals(comparator)) {
            cq.where(cb.equal(path, date));
        }
        if (">".equals(comparator)) {
            cq.where(cb.greaterThan(path, date));
        }
        if (">=".equals(comparator)) {
            cq.where(cb.greaterThanOrEqualTo(path, date));
        }
        if ("<".equals(comparator)) {
            cq.where(cb.lessThan(path, date));
        }
        if ("<=".equals(comparator)) {
            cq.where(cb.lessThanOrEqualTo(path, date));
        }
        // Finally, query gets created and result collected and returned as List
        // Hint for READ_ONLY is added as lists are often just for read and performance is better.
        return entityManager.createQuery(cq).setHint(QueryHints.READ_ONLY, true).getResultList();
    }
}
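One way to get rid of the chain of ifs is a lookup table keyed by the comparator string. The sketch below is plain Java without JPA, so it filters on in-memory values instead of building a Criteria predicate. The class `ComparatorTableSketch`, its attribute whitelist and the date pattern are assumptions for illustration. The same table idea could map each comparator to a function producing a Criteria predicate instead.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.Set;
import java.util.function.IntPredicate;

class ComparatorTableSketch {

    // Each comparator is expressed as a test on the result of compareTo.
    private static final Map<String, IntPredicate> OPS = Map.of(
            "=", c -> c == 0,
            ">", c -> c > 0,
            ">=", c -> c >= 0,
            "<", c -> c < 0,
            "<=", c -> c <= 0);

    // Assumed whitelist, so that arbitrary client input never reaches the query as an attribute name.
    private static final Set<String> ALLOWED_ATTRIBUTES = Set.of("lastAnsweredAt", "lastSeenAt");

    // Assumed input format, taken from the example value in the question.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss");

    static void requireKnownAttribute(String attribute) {
        if (!ALLOWED_ATTRIBUTES.contains(attribute)) {
            throw new IllegalArgumentException("Unsupported attribute: " + attribute);
        }
    }

    // True if columnValue <comparator> rawValue holds.
    static boolean matches(LocalDateTime columnValue, String comparator, String rawValue) {
        IntPredicate op = OPS.get(comparator);
        if (op == null) {
            throw new IllegalArgumentException("Unsupported comparator: " + comparator);
        }
        LocalDateTime bound = LocalDateTime.parse(rawValue, FORMAT);
        return op.test(columnValue.compareTo(bound));
    }
}
```

Unknown comparators and attribute names fail fast with an exception here, whereas the if chain would silently run the query without any WHERE clause.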

Is there a way to create one JPA entity based on many database tables and do I really have to do this or is it a bad practice?

I'm quite new to Spring Data JPA technology and currently facing one task I can't deal with. I am seeking best practice for such cases.
In my Postgres database I have two tables connected by a one-to-many relation. Table 'account' has a field 'type_id', which is a foreign key referencing the field 'id' of table 'account_type':
So the 'account_type' table only plays the role of a dictionary. Accordingly, I've created two JPA entities (Kotlin code):
@Entity
class Account(
    @Id @GeneratedValue var id: Long? = null,
    var amount: Int,
    @ManyToOne var accountType: AccountType
)
@Entity
class AccountType(
    @Id @GeneratedValue var id: Long? = null,
    var type: String
)
In my Spring Boot application I'd like to have a RestController which will be responsible for returning all accounts in JSON format. To do that I made the entity classes serializable and wrote a simple RestController:
@GetMapping("/getAllAccounts", produces = [APPLICATION_JSON_VALUE])
fun getAccountsData(): String {
    val accountsList = accountRepository.findAll().toMutableList()
    return json.stringify(Account.serializer().list, accountsList)
}
where accountRepository is just an interface which extends CrudRepository<Account, Long>.
And now if I go to :8080/getAllAccounts, I'll get the Json of the following format (sorry for formatting):
[
{"id":1,
"amount":0,
"accountType":{
"id":1,
"type":"DBT"
}
},
{"id":2,
"amount":0,
"accountType":{
"id":2,
"type":"CRD"
}
}
]
But what I really want from that controller is just
[
{"id":1,
"amount":0,
"type":"DBT"
},
{"id":2,
"amount":0,
"type":"CRD"
}
]
Of course I can create new serializable class for accounts which will have String field instead of AccountType field and can map JPA Account class to that class extracting account type string from AccountType field. But for me it looks like unnecessary overhead and I believe that there could be a better pattern for such cases.
For example what I have in my head is that probably somehow I can create one JPA entity class (with String field representing account type) which will be based on two database tables and unnecessary complexity of having inner object will be reduced automagically each time I call repository methods :) Moreover I will be able to use this entity class in my business logic without any additional 'wrappers'.
P.S. I read about the @SecondaryTable annotation, but it looks like it only works in cases where there is a one-to-one relation between the two tables, which is not my case.
There are a couple of options which allow clean separation without a DTO.
Firstly, you could look at using a projection which is kind of like a DTO mentioned in other answers but without many of the drawbacks:
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#projections
@Projection(
    name = "accountSummary",
    types = { Account.class })
public interface AccountSummaryProjection {
    Long getId();
    Integer getAmount();
    @Value("#{target.accountType.type}")
    String getType();
}
You then simply need to update your controller to call either a query method with a List return type or a method which takes the projection class as an argument.
https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#projection.dynamic
@GetMapping("/getAllAccounts", produces = [APPLICATION_JSON_VALUE])
@ResponseBody
fun getAccountsData(): List<AccountSummaryProjection>{
    return accountRepository.findAllAsSummary();
}
An alternative approach is to use the Jackson annotations. I note that in your question you are manually transforming the result to a JSON String and returning a String from your controller. You don't need to do that if the Jackson JSON library is on the classpath. See my controller above.
So if you leave the serialization to Jackson you can separate the view from the entity using a couple of annotations. Note that I would apply these using a Jackson mixin rather than having to pollute the Entity model with Json processing instructions however you can look that up:
@Entity
class Account(
    @Id @GeneratedValue var id: Long? = null,
    var amount: Int,
    //in real life I would apply these using a Jackson mixin
    //to prevent polluting the domain model with view concerns.
    @JsonDeserialize(converter = StringToAccountTypeConverter::class)
    @JsonSerialize(converter = AccountTypeToStringConverter::class)
    @ManyToOne var accountType: AccountType
)
You then simply create the necessary converters:
public class StringToAccountTypeConverter extends StdConverter<String, AccountType>
        implements org.springframework.core.convert.converter.Converter<String, AccountType> {
    @Autowired
    private AccountTypeRepository repo;
    @Override
    public AccountType convert(String value) {
        //look up in repo and return
    }
}
and vice versa:
public class AccountTypeToStringConverter extends StdConverter<AccountType, String>
        implements org.springframework.core.convert.converter.Converter<AccountType, String> {
    @Override
    public String convert(AccountType value) {
        return value.getType();
    }
}
One of the least complicated ways to achieve what you are aiming for (from the external clients' point of view, at least) has to do with custom serialisation, which you seem to be aware of and which @YoManTaMero has expanded upon.
Obtaining the desired class structure might not be possible. The closest I've managed to find is related to the @SecondaryTable annotation, but the caveat is that this only works for @OneToOne relationships.
In general, I'd pinpoint your problem to the issue of DTOs and Entities. The idea behind JPA is to map the schema and content of your database to code in an accessible but accurate way. It takes away the heavy-lifting of managing SQL queries, but it is designed mostly to reflect your DB's structure, not to map it to a different set of domains.
If the organisation of your DB schema does not exactly match the needs of your system's I/O communication, this might be a sign that:
Your DB has not been designed correctly;
Your DB is fine, but the manageable entities (tables) in it simply do not match directly to the business entities (models) in your external communication.
Should the second be the case, Entities should be mapped to DTOs, which can then be passed around. A single Entity may map to a few different DTOs. A single DTO might take more than one (related!) entity to be created. This is good practice for medium-to-large systems in the first place: handing out references to the object that is the direct access point to your database is a risk.
Mind that simply because the id of the accountType is not taking part in your external communication does not mean it will never be a part of your business logic.
To sum up: JPA is designed with ease of database access in mind, not for smoothing out external communication. For that, other tools - such as e.g. Jackson serializer - are used, or certain design patterns - like DTO - are being employed.
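As a rough illustration of the DTO route for the account example, the following plain-Java sketch flattens the nested account type into a single `type` field. The classes are simplified stand-ins without JPA annotations, and `AccountDto`, `toDto` and `toDtos` are hypothetical names chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class AccountDtoSketch {

    // Plain stand-ins for the two JPA entities from the question.
    static final class AccountType {
        final long id;
        final String type;

        AccountType(long id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    static final class Account {
        final long id;
        final int amount;
        final AccountType accountType;

        Account(long id, int amount, AccountType accountType) {
            this.id = id;
            this.amount = amount;
            this.accountType = accountType;
        }
    }

    // Flat shape that the REST client should see.
    static final class AccountDto {
        final long id;
        final int amount;
        final String type;

        AccountDto(long id, int amount, String type) {
            this.id = id;
            this.amount = amount;
            this.type = type;
        }
    }

    // The nested AccountType collapses into its type string.
    static AccountDto toDto(Account account) {
        return new AccountDto(account.id, account.amount, account.accountType.type);
    }

    static List<AccountDto> toDtos(List<Account> accounts) {
        List<AccountDto> result = new ArrayList<>(accounts.size());
        for (Account account : accounts) {
            result.add(toDto(account));
        }
        return result;
    }
}
```

The mapping is a few lines per DTO, and the entity keeps the accountType id available for business logic even though it never leaves the service boundary.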
One approach to solve this is to @JsonIgnore accountType and create a getType method like
@JsonProperty("type")
fun getType(): String {
    return accountType.type
}

Replacing entire contents of spring-data Page, while maintaining paging info

Using spring-data-jpa and working on getting data out of table where there are about a dozen columns which are used in queries to find particular rows, and then a payload column of clob type which contains the actual data that is marshalled into java objects to be returned.
Entity object very roughly would be something like
@Entity
@Table(name = "Person")
public class Person {
    @Column(name="PERSON_ID", length=45) @Id private String personId;
    @Column(name="NAME", length=45) private String name;
    @Column(name="ADDRESS", length=45) private String address;
    @Column(name="PAYLOAD") @Lob private String payload;
    //Bunch of other stuff
}
(Whether this approach is sensible or not is a topic for a different discussion)
The clob column causes performance to suffer on large queries ...
In an attempt to improve things a bit, I've created a separate entity object ... sans payload ...
@Entity
@Table(name = "Person")
public class NotQuiteAWholePerson {
    @Column(name="PERSON_ID", length=45) @Id private String personId;
    @Column(name="NAME", length=45) private String name;
    @Column(name="ADDRESS", length=45) private String address;
    //Bunch of other stuff
}
This gets me a page of NotQuiteAWholePerson ... I then query for the page of full person objects via the personIds.
The hope is that by not using the payload in the original query, which could be filtering data over a good bit of the backing table, I only concern myself with the payload when I'm retrieving the current page of objects to be viewed ... a much smaller chunk.
So I'm at the point where I want to map the contents of the original returned Page of NotQuiteAWholePerson to my List of Person, while keeping all the Paging info intact, the map method however only takes a Converter which will iterate over the NotQuiteAWholePerson objects ... which doesn't quite fit what I'm trying to do.
Is there a sensible way to achieve this ?
Additional clarification for @itsallas as to why existing map() will not suffice..
PageImpl::map has
@Override
public <S> Page<S> map(Converter<? super T, ? extends S> converter) {
    return new PageImpl<S>(getConvertedContent(converter), pageable, total);
}
Chunk::getConvertedContent has
protected <S> List<S> getConvertedContent(Converter<? super T, ? extends S> converter) {
    Assert.notNull(converter, "Converter must not be null!");
    List<S> result = new ArrayList<S>(content.size());
    for (T element : this) {
        result.add(converter.convert(element));
    }
    return result;
}
So the original List of contents is iterated through ... and a supplied convert method applied, to build a new list of contents to be inserted into the existing Pageable.
However I cannot convert a NotQuiteAWholePerson to a Person individually, as I cannot simply construct the payload... well I could, if I called out to the DB for each Person by Id in the convert... but calling out individually is not ideal from a performance perspective ...
After getting my Page of NotQuiteAWholePerson I am querying for the entire List of Person ... by Id ... in one call ... and now I am looking for a way to substitute the entire content list ... not iteratively, as the existing map() does, but in a simple replacement.
This particular use case would also assist where the payload, which is json, is more appropriately persisted in a NoSql datastore like Mongo ... as opposed to the sql datastore clob ...
Hope that clarifies it a bit better.
You can avoid the problem entirely with Spring Data JPA features.
The most sensible way would be to use Spring Data JPA projections, which have good extensive documentation.
For example, you would first need to ensure lazy fetching for your attribute, which you can achieve with an annotation on the attribute itself.
i.e. :
@Basic(fetch = FetchType.LAZY) @Column(name="PAYLOAD") @Lob private String payload;
or through Fetch/Load Graphs, which are neatly supported at repository-level.
You need to define this one way or another, because, as taken verbatim from the docs :
The query execution engine creates proxy instances of that interface at runtime for each element returned and forwards calls to the exposed methods to the target object.
You can then define a projection like so :
interface NotQuiteAWholePerson {
    String getPersonId();
    String getName();
    String getAddress();
    //Bunch of other stuff
}
And add a query method to your repository :
interface PersonRepository extends Repository<Person, String> {
    Page<NotQuiteAWholePerson> findAll(Pageable pageable);
    // or its dynamic equivalent
    <T> Page<T> findAll(Pageable pageable, Class<T> type);
}
Given the same pageable, a page of projections would refer back to the same entities in the same session.
If you cannot use projections for whatever reason (namely if you're using JPA < 2.1 or a version of Spring Data JPA before projections), you could define an explicit JPQL query with the columns and relationships you want, or keep the 2-entity setup. You could then map Persons and NotQuiteAWholePersons to a PersonDTO class, either manually or (preferably) using your object mapping framework of choice.
NB. : There are a variety of ways to use and setup lazy/eager relations. This covers more in detail.
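If the two-entity setup is kept, one detail worth handling is that a batch lookup by ids does not necessarily return rows in page order. The plain-Java sketch below restores the page order before the content is swapped in. `Person` is reduced to two fields and `inPageOrder` is a hypothetical helper, neither being Spring Data API. The reordered list could then be wrapped with the `PageImpl(content, pageable, total)` constructor quoted in the question, reusing the original page's pageable and total.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PageContentSwapSketch {

    // Reduced stand-in for the full Person entity.
    static final class Person {
        final String personId;
        final String payload;

        Person(String personId, String payload) {
            this.personId = personId;
            this.payload = payload;
        }
    }

    // Arranges the batch-fetched persons in the order of the ids on the current page.
    static List<Person> inPageOrder(List<String> pageIds, Collection<Person> fetched) {
        Map<String, Person> byId = new HashMap<>();
        for (Person person : fetched) {
            byId.put(person.personId, person);
        }
        List<Person> ordered = new ArrayList<>(pageIds.size());
        for (String id : pageIds) {
            Person person = byId.get(id);
            if (person == null) {
                throw new IllegalStateException("No Person fetched for id " + id);
            }
            ordered.add(person);
        }
        return ordered;
    }
}
```

This keeps the single batch query from the question while avoiding a per-element database call inside a converter.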

Avoid N+1 with DTO mapping on Hibernate entities

In our Restful application we decided to use DTO's to shield the Hibernate domain model for several reasons.
We map Hibernate entities to DTO and vice versa manually using DTOMappers in the Service Layer.
Example in Service Layer:
@Transactional(readOnly=true)
public PersonDTO findPersonWithInvoicesById(Long id) {
    Person person = personRepository.findById(id);
    return PersonMapperDTOFactory.getInstance().toDTO(person);
}
The main concept could be explained like this:
JSON (Jackson parser) <-> Controller <-> Service Layer (uses Mapping Layer) <-> Repository
We agreed that we retrieve associations by performing a HQL (or Criteria) using a left join.
This is mostly a performant way to retrieve relations and avoids the N+1 select issue.
However, it's still possible to have the N+1 select issue when a developer mistakenly forgets to do a left join. The relations will still be fetched because the PersonDTOMapper will iterate over the Invoices of a Person for converting to InvoiceDTOs. So the data is still fetched because the DTOMapper is executed where a Hibernate Session is active (managed by Spring)
Is there some way to make the Hibernate Session 'not active' in our DTOMappers? We would then get a LazyInitializationException, which should alert the developer that some data was not fetched as it should have been.
I've read about @Transactional(propagation = Propagation.NOT_SUPPORTED), which suspends the transaction. However, I don't know whether it was intended for such purposes.
What is a clean solution to achieve this? Alternatives are also very welcome!
Usually I use the mapper in the controller layer. From my perspective, the service layer manages the application's business logic; DTOs are very useful if you want to represent data to the external world in a different way. This way you may get the LazyInitializationException you are looking for.
I have one more reason to prefer this solution: just imagine you need to invoke a public method inside another public method in the service class. In this case you might need to call the mapper several times.
If you are using Hibernate, then there are specific ways that you can determine if an associated object has been lazy-loaded.
For example, let's say you have an entity class Foo that contains a @ManyToOne 'foreign' association to entity class Bar, which is represented by a field in Foo called bar.
In your DTO mapping code you can check if the associated bar has been lazy-loaded using the following code:
if (!(bar instanceof HibernateProxy) ||
        !((HibernateProxy)bar).getHibernateLazyInitializer().isUninitialized()) {
    // bar has already been lazy-loaded, so we can
    // recursively load a BarDTO for the associated Bar object
}
The simplest solution to achieve what you desire is to clear the entity manager after querying and before invoking the DTO mapper. That way, the object will be detached, and access to uninitialized associations will trigger a LazyInitializationException instead.
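The fail-fast effect of detaching can be pictured with a toy model in plain Java. Nothing here is Hibernate API: `Session`, `LazyRef` and the exception type are stand-ins invented for this sketch, which only shows why an association that was fetched inside the query keeps working after a clear while a forgotten one fails loudly.

```java
import java.util.function.Supplier;

class DetachedAccessSketch {

    // Toy stand-in for a persistence context that can be open or cleared.
    static final class Session {
        private boolean open = true;

        void clear() {
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    // Toy stand-in for a lazy association: loads on first access while the session is open.
    static final class LazyRef<T> {
        private final Session session;
        private final Supplier<T> loader;
        private T value;
        private boolean initialized;

        LazyRef(Session session, Supplier<T> loader) {
            this.session = session;
            this.loader = loader;
        }

        // Mirrors an explicit fetch (for example a left join) done as part of the query.
        void initialize() {
            get();
        }

        T get() {
            if (!initialized) {
                if (!session.isOpen()) {
                    // Plays the role of LazyInitializationException in this toy model.
                    throw new IllegalStateException("association was not fetched before detaching");
                }
                value = loader.get();
                initialized = true;
            }
            return value;
        }
    }
}
```

In the toy model, a mapper that runs after `clear()` can read every association that was initialized up front, and any other access throws immediately instead of silently issuing another query.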
I felt your pain as well which drove me to developing Blaze-Persistence Entity Views which allows you to define DTOs as interfaces and map to the entity model, using the attribute name as default mapping, which allows very simple looking mappings.
Here a little example
@Entity
class Person {
    @Id Long id;
    String name;
    String lastName;
    String address;
    String city;
    String zipCode;
}
@EntityView(Person.class)
interface PersonDTO {
    @IdMapping Long getId();
    String getName();
}
Querying would be as simple as
@Transactional(readOnly=true)
public PersonDTO findPersonWithInvoicesById(Long id) {
    return personRepository.findById(id);
}
interface PersonRepository extends EntityViewRepository<PersonDTO, Long> {
    PersonDTO findById(Long id);
}
Since you seem to be using Spring data, you will enjoy the spring data integration.
