JPA EntityManager remove operation is not performant

When I call entityManager.remove(instance), the underlying JPA provider issues a separate delete for each GroupUser entity. This seems wrong from a performance perspective: if a Group has 1000 users, 1001 statements are issued to delete the group and its GroupUser entities.
Would it make more sense to write a named query that removes all entries in the group_user table (e.g. delete from group_user where group_id=?), so that deleting the group takes just 2 calls?
@Entity
@Table(name = "tbl_group")
public class Group {
    @OneToMany(mappedBy = "group", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
    @Cascade(value = DELETE_ORPHAN)
    private Set<GroupUser> groupUsers = new HashSet<GroupUser>(0);

The simple answer is yes.
If you want to delete a Group and you know there are many records in the GroupUser table, it is much better to issue a delete query that removes them all in one statement instead of one by one.
Whether or not the underlying database cascades deletes, it's good practice to delete in the correct order.
So delete the GroupUser rows first.
Assuming you have the Group object you want to delete:
int numberDeleted = entityManager.createQuery("DELETE FROM GroupUser gu WHERE gu.group.id = :id").setParameter("id", group.getId()).executeUpdate();
The returned int shows how many records were deleted.
Now you can finally delete the Group:
entityManager.remove(group);
entityManager.flush();
UPDATE
It seems that @OnDelete on the @OneToMany does the trick.

Since GroupUser may have cascades of its own, I don't think there is a way to tell Hibernate to batch-delete them via configuration.
But if you are certain there is no cascade=DELETE on GroupUser, feel free to issue an HQL/JPA-QL query:
DELETE FROM GroupUser WHERE group=:group
If there are cascades, handle them with a query as well.
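The two-statement approach from the answers above can also be pictured at the JDBC level, which makes the order of operations explicit. This is only a sketch under assumptions: group_user and group_id come from the question, while the id primary-key column on tbl_group and the GroupDeletion class name are made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class GroupDeletion {
    // group_user / group_id are taken from the question;
    // tbl_group.id is an assumed primary-key column name.
    static final String DELETE_MEMBERS = "DELETE FROM group_user WHERE group_id = ?";
    static final String DELETE_GROUP = "DELETE FROM tbl_group WHERE id = ?";

    /** Deletes the members, then the group, in one transaction; returns the rows removed. */
    static int deleteGroup(Connection connection, long groupId) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try (PreparedStatement members = connection.prepareStatement(DELETE_MEMBERS);
             PreparedStatement group = connection.prepareStatement(DELETE_GROUP)) {
            members.setLong(1, groupId);
            int removed = members.executeUpdate(); // children first, one statement
            group.setLong(1, groupId);
            removed += group.executeUpdate();      // then the parent row
            connection.commit();
            return removed;
        } catch (SQLException e) {
            connection.rollback();                 // leave nothing half-deleted
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

However many users the group has, the database sees two DELETE statements instead of one per row.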

Related

spring data jpa + unwanted unique key generation

I am using Spring Data JPA to create my tables. My requirement:
I have two tables.
Basket table - it has a one-to-many relationship with the Item table. A basket can have many items.
Item table - an item can be associated with many baskets.
I am using IGNORE_ROW_ON_DUPKEY_INDEX to make sure that the same combination of basketId and itemId does not get persisted twice.
So there is a mapping table in between that holds the mapping of baskets and items. In this table, I want the combination of basketId and itemId to be unique. Below is my entity structure.
@Entity
class Basket {
    @Id
    private long basketId;
    ...
    @OneToMany(cascade = {CascadeType.MERGE, CascadeType.DETACH, CascadeType.REFRESH})
    @JoinTable(
        name = "mapping_table",
        joinColumns = @JoinColumn(name = "basketId"),
        inverseJoinColumns = @JoinColumn(name = "itemId"),
        indexes = {@Index(name = "my_index", columnList = "basketId, itemId", unique = true)})
    @SQLInsert(sql = "INSERT /*+ IGNORE_ROW_ON_DUPKEY_INDEX (mapping_table, my_index) */ INTO mapping_table(basketId, itemId) values (?,?)")
    private List<Item> itemList;
    ...
}
@Entity
class Item {
    @Id
    private long itemId;
}
my_index, combining both keys, is created in mapping_table as expected.
Problem 1:
In mapping_table, for some weird reason, a second unique constraint on itemId alone is created. Because of this constraint, I cannot persist rows where an item is associated with multiple baskets. As I said, I want the combination of both keys to be unique, and I am achieving this with the index (my_index).
Problem 2:
Why is basketId (which is also an identifier, in the Basket table) not marked as unique in the mapping table? This is not a problem but more of a question, because itemId, which is the identifier in the Item table, does have a unique key constraint in the mapping table.
Solution:
I create the tables using Spring Data JPA, log in to the DB manually, drop the unique key constraint, and then persist. But this is not possible in production.
Question
I want to run an ALTER TABLE to drop the constraint before the persist happens. How do I do that?
P.S. As you can see, I have made up these tables and have not used the actual table names. Notwithstanding the made-up names, the problem statement is accurate. I am using Oracle as the target DB.
Try using a Set instead of a List; JPA should then generate the schema with the correct constraints.

How to specify ignore case in a JPA @JoinColumn when joining two entity tables

I have two entities, Account and B_Account.
In the Account entity I join the table like this:
@ManyToOne(fetch = FetchType.EAGER)
@JoinColumn(name = "a_type")
private B_Account b_Account;
The issue is that the a_type value in Account is "ganesh" while in B_Account it is "GANESH".
That is why I get no data when I call findBy(B_Account) on the repository.
How can I join the tables above so that the comparison ignores case?
You can try @JoinFormula. It is a Hibernate-specific annotation that does not exist in JPA.
@ManyToOne(fetch = FetchType.EAGER)
@JoinFormula(value = "lower(b_account) = a_type")
private B_Account b_Account;
I cannot test it, so please try it out.
Documentation: https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#associations-JoinFormula

Spring Boot JPA: how to query a @OneToMany relationship given one object on the many side

I've seen a few related questions, but I can't find the right answer for what I'm trying to do.
I have two tables, Jobs and Workers. A job can have many workers. Simplified entity:
@Entity
@Table(name = "jobs")
data class Job(
    @Id
    @Type(type = "pg-uuid")
    val id: UUID = UUID.randomUUID()
) {
    @ManyToOne
    var office: Office? = null
    @OneToMany(targetEntity = Worker::class)
    var requests: MutableList<Worker> = mutableListOf()
}
I want to be able to fetch the list of jobs for a specific worker.
I've tried a few queries, native and not, and am now trying to do it with derived query methods; whatever works, to be honest. Here is what seems like it should work in my jobs repo:
@Repository
interface JobsRepo : CrudRepository<Job, UUID> {
    @Query("SELECT j FROM Job j WHERE id = ?1")
    fun findJobById(id: UUID): Job?
    @Query("SELECT j FROM Job j WHERE office_id = ?1")
    fun findJobsByOffice(id: UUID): List<Job>?
    @Modifying
    @Transactional
    @Query("UPDATE jobs SET job_status = 4 WHERE job_status = 1 AND start_time < ?1", nativeQuery = true)
    fun expireJobs(date: Date)
    fun findByRequests_Worker(worker: Worker): List<Job>?
}
I'm not really sure how to query the list property requests with one worker as input. I also tried querying by the worker's UUID, since that is what is in the join table.
JPA creates a join table with both foreign keys. The table is jobs_requests, with columns:
job_id UUID
requests_id UUID
You mentioned that a Job can have many workers, and you also mentioned that you want to get a list of jobs for a specific worker. So this sounds like a ManyToMany relationship rather than OneToMany.
For a ManyToMany relationship, a join table is unavoidable. You need to specify the @ManyToMany annotation on both entities. Then you can use WorkerRepository to query for the worker, and to get the job list for that worker you just access worker.getJobs().
Following is the setup for the ManyToMany relationship, hope it can help:
@Entity
@Table(name = "jobs")
data class Job(
    @get:ManyToMany
    @get:JoinTable(
        name = "worker_job",
        joinColumns = [JoinColumn(name = "job_id")],
        inverseJoinColumns = [JoinColumn(name = "worker_id")]
    )
    val worker: Set<Worker> = HashSet()
)
@Entity
@Table(name = "worker")
data class Worker(
    // Inverse side: the join table is declared once, on Job.worker
    @get:ManyToMany(mappedBy = "worker")
    val jobs: Set<Job> = HashSet()
)
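With a bidirectional association, application code is responsible for keeping both collections in step, since only what the entity objects contain can be persisted. A minimal sketch in plain Java, with no JPA annotations; ManyToManySketch and the addWorker helper are hypothetical names used only for illustration:

```java
import java.util.HashSet;
import java.util.Set;

final class ManyToManySketch {
    static final class Job {
        final String title;
        final Set<Worker> workers = new HashSet<>();

        Job(String title) {
            this.title = title;
        }

        // Update both sides together so job.workers and worker.jobs never disagree.
        void addWorker(Worker worker) {
            workers.add(worker);
            worker.jobs.add(this);
        }
    }

    static final class Worker {
        final String name;
        final Set<Job> jobs = new HashSet<>();

        Worker(String name) {
            this.name = name;
        }
    }
}
```

After dev.addWorker(alice) and ops.addWorker(alice), alice.jobs holds both jobs, which is the "jobs for a specific worker" lookup the question asks about.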
It's also worth noting that, for most practical uses, you should try to avoid an explicit ManyToMany annotation. Instead, create a new entity/table that references both Job and Worker through ManyToOne associations.
The reason to avoid an explicit ManyToMany: suppose there is a specific job, for example JavaDev, with 1000 workers on it. At some point you have a JavaDev entity and want all of its workers whose ages are between 20 and 25. To do this, you would iterate over the workers list that is part of the JavaDev entity and filter it. When that happens there are two scenarios:
The workers list is a lazy association, and when you access it Hibernate can send on the order of 1000 queries to the database (the N+1 problem).
The workers list is an eager association, and Hibernate sends the same number of queries, but as soon as you fetch the JavaDev entity.
Hence your execution times increase because of the unnecessary load on the database. On larger projects this gets even more complicated.
All of this can be avoided if you create your own many-to-many table (job_workers, for example) with an accompanying repository and service, from which you can query your data more efficiently.

Spring Data / Hibernate save entity with Postgres using Insert on Conflict Update Some fields

I have a domain object in Spring that I save with the JpaRepository.save method, using a sequence generator in Postgres to generate the id automatically.
@SequenceGenerator(initialValue = 1, name = "device_metric_gen", sequenceName = "device_metric_seq")
public class DeviceMetric extends BaseTimeModel {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "device_metric_gen")
    @Column(nullable = false, updatable = false)
    private Long id;
    // ... extra fields
My use case requires an upsert instead of a normal save (which, I am aware, updates the row if the id is present). I want to update an existing row if a combination of three columns (assume a composite unique key) is present, or else create a new row.
This is something similar to this:
INSERT INTO customers (name, email)
VALUES ('Microsoft', 'hotline@microsoft.com')
ON CONFLICT (name)
DO UPDATE SET email = EXCLUDED.email || ';' || customers.email;
One way of achieving the same in Spring Data that I can think of is:
Write a custom save operation in the service layer that
does a get on the three columns and, if a row is present,
sets the same id on the current object and calls repository.save;
if no row is present, does a normal repository.save.
The problem with this approach is that every insert now does a select and then a save, which makes two database calls, whereas the same can be achieved with the Postgres INSERT ... ON CONFLICT feature in just one DB call.
Any pointers on how to implement this in Spring Data?
One way is to write a native query (insert into ... values with all fields here). The object in question has around 25 fields, so I am looking for a better way to achieve the same.
As @JBNizet mentioned, you answered your own question by suggesting reading the data first, then updating if found and inserting otherwise. Here's how you could do it using Spring Data and Optional.
Define a findByField1AndField2AndField3 method on your DeviceMetricRepository.
public interface DeviceMetricRepository extends JpaRepository<DeviceMetric, Long> {
    Optional<DeviceMetric> findByField1AndField2AndField3(String field1, String field2, String field3);
}
Use the repository in a service method.
@RequiredArgsConstructor
public class DeviceMetricService {
    private final DeviceMetricRepository repo;
    DeviceMetric save(String email, String phoneNumber) {
        DeviceMetric deviceMetric = repo.findByField1AndField2AndField3("field1", "field2", "field3")
                .orElseGet(DeviceMetric::new); // create the new object in a way that makes sense for you
        deviceMetric.setEmail(email);
        deviceMetric.setPhoneNumber(phoneNumber);
        return repo.save(deviceMetric);
    }
}
A word of advice on observability:
You mentioned that this is a high-throughput use case in your system. Regardless of the approach taken, consider instrumenting timers around this save. This way you can objectively measure the initial performance against any tuning you do. Look at this as an experiment and be prepared to pivot to other solutions as needed. If you are always reading these three columns together, ensure they are indexed. With these things in place, you may find that reading to determine update/insert is acceptable.
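As a rough illustration of timing the save, here is a minimal sketch using only the JDK. SaveTimer and timed are hypothetical names, and a real service would more likely report to its metrics library than print to standard output.

```java
import java.util.function.Supplier;

final class SaveTimer {
    /** Runs the action, prints how long it took, and returns the action's result. */
    static <T> T timed(String label, Supplier<T> action) {
        long start = System.nanoTime();
        try {
            return action.get();
        } finally {
            // Reported even when the action throws, so slow failures are visible too.
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(label + " took " + micros + " us");
        }
    }
}
```

The service call can then be wrapped, for example SaveTimer.timed("deviceMetric.save", () -> service.save(email, phoneNumber)), and the numbers compared before and after adding the index.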
I would recommend using a named query to fetch a row based on your candidate keys. If a row is present, update it; otherwise create a new row. Both of these operations can be done with the save method.
@NamedQuery(name = "getCustomerByNameAndEmail", query = "select a from Customers a where a.name = :name and a.email = :email")
You can also declare a @UniqueConstraint in the entity's @Table annotation to make sure that these columns always stay unique as a group.
Optional<Customers> customer = customerRepo.getCustomerByNameAndEmail(name, email);
Implement the above method in your repository. All it does is call the query and pass the name and email as parameters. Make sure to return Optional.empty() if there is no row present.
Customers c;
if (customer.isPresent()) {
    c = customer.get();
    c.setEmail("newemail@gmail.com");
    c.setPhone("9420420420");
    customerRepo.save(c);
} else {
    c = new Customers(0, "name", "email", "5451515478");
    customerRepo.save(c);
}
Pass the ID as 0 and JPA will insert a new row with the ID generated according to the sequence generator.
That said, I do not recommend using a number as an ID. If possible, use a randomly generated UUID for the primary key; it will guarantee uniqueness and avoid any unexpected behaviour that may come with sequence generators.
With Spring Data JPA it's pretty simple to implement this with clean Java code.
Using Spring Data JPA's method T getOne(ID id), you're not querying the DB itself but using a reference to the DB object (a proxy). Therefore, when updating/saving the entity, you are performing a one-time operation.
To be able to modify the object, Spring provides the @Transactional annotation, a method-level annotation declaring that the method starts a transaction and closes it only when the method itself returns.
You'd have to:
Start a JPA transaction
Get the DB reference through getOne
Modify the DB reference
Save it to the database
Close the transaction
Not having much visibility into your actual code, I'm going to abstract it as much as possible:
@Transactional
public void saveOrUpdate(DeviceMetric metric) {
    DeviceMetric deviceMetric = metricRepository.getOne(metric.getId());
    // modify it
    deviceMetric.setName("Hello World!");
    metricRepository.save(deviceMetric);
}
The tricky part is not to think of getOne as a SELECT against the DB: the getOne call itself only returns a reference and does not query the database.
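For the single-statement route raised in the question, the objection was that an entity with around 25 fields makes the native INSERT ... ON CONFLICT tedious to write by hand. One option is to generate the statement text from the column lists. The sketch below is an assumption-laden illustration: UpsertSql is a made-up helper, the table and column names in the usage note are invented, identifiers are not quoted or validated (so they must come from trusted code, never from user input), and Postgres needs a unique index or constraint on the conflict columns for the clause to apply.

```java
import java.util.List;
import java.util.stream.Collectors;

final class UpsertSql {
    /** Builds a parameterized Postgres upsert that overwrites every non-key column. */
    static String build(String table, List<String> columns, List<String> conflictColumns) {
        String placeholders = columns.stream()
                .map(c -> "?")
                .collect(Collectors.joining(", "));
        // Columns that are part of the conflict target are left untouched on update.
        String updates = columns.stream()
                .filter(c -> !conflictColumns.contains(c))
                .map(c -> c + " = EXCLUDED." + c)
                .collect(Collectors.joining(", "));
        String action = updates.isEmpty() ? "DO NOTHING" : "DO UPDATE SET " + updates;
        return "INSERT INTO " + table + " (" + String.join(", ", columns) + ")"
                + " VALUES (" + placeholders + ")"
                + " ON CONFLICT (" + String.join(", ", conflictColumns) + ") " + action;
    }
}
```

Called with a hypothetical table device_metric, columns device_id, metric, ts, value and conflict columns device_id, metric, ts, it yields an INSERT with four placeholders followed by ON CONFLICT (device_id, metric, ts) DO UPDATE SET value = EXCLUDED.value. The resulting string can be handed to whatever native-query mechanism you already use, with the values bound as parameters in column order.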

JDBC: select entities with Many to one relation

I have two entity classes with a bi-directional many-to-one relation.
class A {
    @Column(name="ID")
    long Id;
}
class B {
    @ManyToOne
    @JoinColumn(name="A_ID")
    A a;
}
The entities are complete, with additional data fields and getters and setters. Now I want to construct a query string to fetch data from table B where B's "A_ID" column is equal to A's "ID".
I tried something like this:
"select b.data1, b.data2 from B b, A a WHERE b.a.Id=a.Id"
But it does not work. What is the correct way to construct such a query? And if A and B were in a uni-directional relation, would there be any difference?
Thanks in advance.
You don't need to join the tables; the whole idea behind @ManyToOne and @OneToMany is to do away with the need for most joins.
I refer you to a tutorial on JPA, like http://en.wikibooks.org/wiki/Java_Persistence/ManyToOne and http://en.wikibooks.org/wiki/Java_Persistence/OneToMany.
Without seeing your actual DB definitions it's a bit difficult to guess the structure of your program and database, but it should be something like this:
class A {
    @Id
    @Column(name="ID")
    long Id;
    @OneToMany(mappedBy="a")
    List<B> bees;
}
class B {
    @ManyToOne
    @JoinColumn(name="A_ID") // Note that A_ID is a column in the B table!
    A a;
}
With the example above you can just select whatever list of B's you need, and JPA will automatically fetch the associated A for each B found. You don't need to do anything to be able to access it; b.a.Id will just work.
As we also have the OneToMany relationship, every A can have multiple B's associated with it. So, for any select that fetches a set of A's, each returned A's bees field gives access to the proper list of B objects, without the need to pull the B table into the query.
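The navigation described above can be pictured with plain objects, leaving the database out entirely. This is a sketch rather than the poster's code: the data1 field name comes from the question, NavigationSketch and data1For are hypothetical, and the filter is an in-memory analogue of restricting B rows by the id of their associated A (which in a query would presumably use the same path, b.a.Id).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

final class NavigationSketch {
    static final class A {
        final long id;
        final List<B> bees = new ArrayList<>();

        A(long id) {
            this.id = id;
        }
    }

    static final class B {
        final A a;
        final String data1;

        B(A a, String data1) {
            this.a = a;
            this.data1 = data1;
            a.bees.add(this); // keep the inverse side in step
        }
    }

    // Reaches the owning A through the reference; no second collection is consulted.
    static List<String> data1For(List<B> all, long aId) {
        return all.stream()
                .filter(b -> b.a.id == aId)
                .map(b -> b.data1)
                .collect(Collectors.toList());
    }
}
```

Going the other way, a.bees already holds every B that points at that A, which mirrors what the @OneToMany(mappedBy="a") side exposes.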

Resources