Hibernate complex sort via computed field - spring

Having some performance issues when attempting to sort a list of parent/child entities - currently I have mapped an entity "view" and I sort by this sub-select, but have some scaling issues.
I'll try and simplify the scenario somewhat;
Both parent and child extend a base class - which contains an ID and name. When showing a parent in the client we display its name, and when showing a child we show a concatenation of its parent name and its own. (E.g. parent1, parent1.child1, parent1.child2 etc)
The requirement is that when showing a list, parents and children are grouped together, as the natural sort based on the name shown would be as such.
(E.g. parent1, parent1.child1, parent1.child2, parent2, parent3.child1 etc)
We are using @Inheritance(strategy = InheritanceType.SINGLE_TABLE), and the child does not store the name of its parent; the parent is referenced by its primary key (ID). De-normalising this data is not an option: renames are common, and with potentially hundreds of thousands of child objects per parent, propagating a rename would be too costly.
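To make the required ordering concrete, here is a minimal in-memory sketch in plain Java. It is not the Hibernate mapping; the Row and GroupedSort classes are hypothetical stand-ins, and the two sort keys simply mirror the primaryId/secondaryId idea: the primary key is the parent's name (or the row's own name for a parent), and the secondary key is the child's name, with null sorting first so a parent precedes its children.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-in for a parent/child row (not the real entity).
class Row {
    final String parentName; // null when the row is itself a parent
    final String name;

    Row(String parentName, String name) {
        this.parentName = parentName;
        this.name = name;
    }

    // Primary sort key: the parent's name for a child, the row's own name for a parent.
    String primaryId() {
        return parentName == null ? name : parentName;
    }

    // Secondary sort key: the child's own name, or null for a parent.
    String secondaryId() {
        return parentName == null ? null : name;
    }

    // Name as shown in the client, e.g. "parent1" or "parent1.child1".
    String displayName() {
        return parentName == null ? name : parentName + "." + name;
    }
}

class GroupedSort {
    // Sort by primary key, then by secondary key with nulls (parents) first.
    static final Comparator<Row> ORDER =
        Comparator.comparing(Row::primaryId)
                  .thenComparing(Row::secondaryId,
                                 Comparator.nullsFirst(Comparator.naturalOrder()));

    static List<String> sortedDisplayNames(List<Row> rows) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort(ORDER);
        List<String> names = new ArrayList<>();
        for (Row row : copy) {
            names.add(row.displayName());
        }
        return names;
    }
}
```

This only illustrates the ordering; the actual problem is getting the database to produce the same order cheaply for a single page.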
Here is a brief overview of the classes;
public class BaseClass {
    @Id
    @GeneratedValue(strategy=GenerationType.IDENTITY)
    private Long id;
    private String name;
    @ManyToOne
    Parent parent;
    @OneToMany(mappedBy="parent", fetch = FetchType.LAZY)
    private Set<BaseId> baseId;
}
@DiscriminatorValue("PARENT")
public class Parent extends BaseClass {
}
@DiscriminatorValue("CHILD")
public class Child extends BaseClass {
}
And we have also mapped the Subselect to the BaseClass;
@Entity
@Immutable
@Subselect(
    "SELECT 0 as id, N'' as primaryId, N'' as secondaryId WHERE 0=1 \n" +
    "UNION ALL\n" +
    "SELECT base.id, base.name, null \n"+
    "FROM sec.BaseClass base \n" +
    "WHERE base.parent_sid IS NULL \n" +
    "UNION ALL \n" +
    "SELECT child.id, parent.name, child.name \n"+
    "FROM sec.BaseClass child \n" +
    "INNER JOIN sec.BaseClass parent ON child.parent_sid=parent.sid \n"
)
@Synchronize({ "BaseClass" })
public class BaseId {
    // Fields etc
}
With the above in place, we can select a page via REST and sort as required, e.g. /api/base?sort=baseId.primaryId,asc&sort=baseId.secondaryId,asc
Note: built around this we have a lot of specifications, which make the query relatively dynamic based on the REST filters supplied; e.g. you can also search by name while sorting in this way. So switching to native queries is unfortunately not an option for us either.
As mentioned, this doesn't scale: the dataset we are currently working towards has 5 million parent/child rows (roughly a 1:10000 parent-to-child ratio).
EDIT: This is a paged response from the database.
Are there any suggestions for how to sort in this way without the performance cost?

Related

@SecondaryTable with where condition

I am creating an entity for a table that was created outside of my system. I want to bind data from another table to an entity field using @SecondaryTable (or possibly a better solution), but only when a condition is met. I.e. for each row of my table I want to bind data from the other table (a one-to-many relation), keeping only the row where a certain condition holds; exactly one row matches, so this effectively turns it into a one-to-one. Can I use the @Where annotation, and how? If not, is there an alternative?
Edit: here is the entity and additional info on the related table
@Entity
@Table(name = "RE_STORAGE_INSTANCE")
public class Movie {
    @Id
    @Column(name="ID_")
    private Long id;
    ...
    //Column I want to fetch
    private Date dueDate;
}
Table RE_VARIABLES is many-to-one to table RE_STORAGE_INSTANCE and contains the fields re_key and re_value. I want to fetch re_value only if re_key equals dueDate. Even though the relation is many-to-one, only one row of RE_VARIABLES contains the due date for each RE_STORAGE_INSTANCE row.

Spring Data / Hibernate save entity with Postgres using Insert on Conflict Update Some fields

I have a domain object in Spring which I am saving using JpaRepository.save method and using Sequence generator from Postgres to generate id automatically.
@SequenceGenerator(initialValue = 1, name = "device_metric_gen", sequenceName = "device_metric_seq")
public class DeviceMetric extends BaseTimeModel {
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "device_metric_gen")
    @Column(nullable = false, updatable = false)
    private Long id;
    ///// extra fields
My use case requires an upsert instead of a normal save operation (which I am aware will update if the id is present). I want to update an existing row if a combination of three columns (assume a composite unique constraint) is present, or else create a new row.
This is something similar to this:
INSERT INTO customers (name, email)
VALUES
(
'Microsoft',
'hotline@microsoft.com'
)
ON CONFLICT (name)
DO
UPDATE
SET email = EXCLUDED.email || ';' || customers.email;
One way of achieving the same in Spring Data that I can think of is to write a custom save operation in the service layer that:
1. Does a get for the three columns.
2. If a row is present, sets the same id on the current object and does a repository.save.
3. If no row is present, does a normal repository.save.
The problem with the above approach is that every insert now does a select and then a save, which makes two database calls, whereas the same can be achieved with the Postgres INSERT ... ON CONFLICT feature in just one call.
Any pointers on how to implement this in Spring Data?
One way is to write a native query (insert into ... values (all fields here)). The object in question has around 25 fields, so I am looking for a better way to achieve the same.
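Since the concern is hand-writing a statement for around 25 columns, one possible way to keep the single-statement upsert is to generate the SQL text from a column list. The following is a minimal sketch in plain Java, not a Spring Data feature: the UpsertSql class and its build method are hypothetical, it assumes the table and column names come from trusted code (never from user input), and it assumes that whatever executes the statement understands :name style parameters. Postgres also needs a unique index or constraint on the conflict columns for ON CONFLICT to apply.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds a Postgres "INSERT ... ON CONFLICT" statement
// with one named parameter per column.
class UpsertSql {
    static String build(String table, List<String> columns, List<String> conflictColumns) {
        String cols = String.join(", ", columns);
        String params = columns.stream()
            .map(c -> ":" + c)
            .collect(Collectors.joining(", "));
        // Every column that is not part of the conflict target is overwritten
        // with the value from the row that failed to insert (EXCLUDED).
        String updates = columns.stream()
            .filter(c -> !conflictColumns.contains(c))
            .map(c -> c + " = EXCLUDED." + c)
            .collect(Collectors.joining(", "));
        String action = updates.isEmpty() ? " DO NOTHING" : " DO UPDATE SET " + updates;
        return "INSERT INTO " + table + " (" + cols + ") VALUES (" + params + ")"
            + " ON CONFLICT (" + String.join(", ", conflictColumns) + ")"
            + action;
    }
}
```

For example, with the made-up columns device_id, metric_name, recorded_at and value, and the first three as the conflict target, build returns: INSERT INTO device_metric (device_id, metric_name, recorded_at, value) VALUES (:device_id, :metric_name, :recorded_at, :value) ON CONFLICT (device_id, metric_name, recorded_at) DO UPDATE SET value = EXCLUDED.value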
As @JBNizet mentioned, you answered your own question by suggesting reading the data and then updating if found and inserting otherwise. Here's how you could do it using Spring Data and Optional.
Define a findByField1AndField2AndField3 method on your DeviceMetricRepository.
// The ID type is Long to match the id field on DeviceMetric.
public interface DeviceMetricRepository extends JpaRepository<DeviceMetric, Long> {
    Optional<DeviceMetric> findByField1AndField2AndField3(String field1, String field2, String field3);
}
Use the repository in a service method.
@RequiredArgsConstructor
public class DeviceMetricService {
    private final DeviceMetricRepository repo;
    DeviceMetric save(String email, String phoneNumber) {
        DeviceMetric deviceMetric = repo.findByField1AndField2AndField3("field1", "field2", "field3")
            .orElse(new DeviceMetric()); // create new object in a way that makes sense for you
        deviceMetric.setEmail(email);
        deviceMetric.setPhoneNumber(phoneNumber);
        return repo.save(deviceMetric);
    }
}
A word of advice on observability:
You mentioned that this is a high-throughput use case in your system. Regardless of the approach taken, consider instrumenting timers around this save, so you can objectively measure the initial performance against any tuning you make. Look at this as an experiment and be prepared to pivot to other solutions as needed. If you are always reading these three columns together, ensure they are indexed. With these things in place, you may find that reading to determine update/insert is acceptable.
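As a rough illustration of timing the save, here is a minimal sketch in plain Java. The Timed class and the recorder callback are hypothetical; in a real service you would more likely hand the measurement to whatever metrics library you already use.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

// Hypothetical helper: runs an action and reports how long it took.
class Timed {
    static <T> T time(Supplier<T> action, LongConsumer recorder) {
        long start = System.nanoTime();
        try {
            return action.get();
        } finally {
            // Report elapsed nanoseconds even if the action throws.
            recorder.accept(System.nanoTime() - start);
        }
    }
}
```

Usage could look like Timed.time(() -> service.save(email, phoneNumber), nanos -> log(nanos)), where service and log are placeholders for your own code.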
I would recommend using a named query to fetch a row based on your candidate keys. If a row is present, update it; otherwise create a new row. Both of these operations can be done using the save method.
@NamedQuery(name="getCustomerByNameAndEmail", query="select a from Customers a where a.name = :name and a.email = :email")
You can also declare a unique constraint on the entity, e.g. @Table(uniqueConstraints = @UniqueConstraint(columnNames = {...})), to make sure that these columns always stay unique as a group.
Optional<Customers> customer = customerRepo.getCustomersByNameAndEmail(name, email);
Implement the above method in your repository. All it will do is call the query and pass the name and email as parameters. Make sure to return Optional.empty() if there is no row present.
Customers c;
if (customer.isPresent()) {
    c = customer.get();
    c.setEmail("newemail@gmail.com");
    c.setPhone("9420420420");
    customerRepo.save(c);
} else {
    c = new Customers(0, "name", "email", "5451515478");
    customerRepo.save(c);
}
Pass the ID as 0 and JPA will insert a new row with the ID generated according to the sequence generator.
I would not recommend using a number as an ID, though. If possible, use a randomly generated UUID for the primary key; it practically guarantees uniqueness and avoids any unexpected behaviour that may come with sequence generators.
With Spring Data JPA it's pretty simple to implement this with clean Java code.
Using Spring Data JPA's method T getOne(ID id), you're not querying the DB itself right away; you get a reference to the DB object (a proxy). Therefore, when updating/saving the entity you are performing a one-time operation.
To be able to modify the object, Spring provides the @Transactional annotation, a method-level annotation declaring that the method starts a transaction and closes it only when the method itself returns.
You'd have to:
1. Start a JPA transaction
2. Get the DB reference through getOne
3. Modify the DB reference
4. Save it to the database
5. Close the transaction
Not having much visibility of your actual code, I'm going to abstract it as much as possible:
@Transactional
public void saveOrUpdate(DeviceMetric metric) {
    DeviceMetric deviceMetric = metricRepository.getOne(metric.getId());
    //modify it
    deviceMetric.setName("Hello World!");
    metricRepository.save(deviceMetric);
}
The tricky part is to not think of getOne as an immediate SELECT from the DB: it returns a lazy proxy, so the getOne call itself does not query the database. Note, though, that the proxy is typically initialized (with a SELECT) the first time anything other than its id is accessed, so this does not by itself guarantee a single round trip.

How to get spring neo4j cypher custom query to populate an array of child relationships

Built-in queries to Spring Data Neo4j (SDN) return objects populated with depth 1 by default. This means that "children" (related nodes) of an object returned by a query are populated. That's good - there are actual objects on the end of references from objects returned by these queries.
Custom queries are depth 0 by default. This is a hassle.
In this answer, it is described how to get springboot neo4j to populate a related element to the target of a custom query - to achieve an extra one level of depth of results from the query.
I am having trouble with this method when the related elements are in a list:
@NodeEntity
public class BoardPosition {
    @Relationship(type="PARENT", direction = Relationship.INCOMING)
    public List<BoardPosition> children;
I have a query returning a target BoardPosition and I need its children to be populated.
@Query("MATCH (target:BoardPosition) <-[c:PARENT]- (child:BoardPosition) "
    + "WHERE target.play={Play} "
    + "RETURN target, c, child")
BoardPosition findActiveByPlay(@Param("Play") String play);
The problem is that the query appears to return one separate result for each child, and those results aren't being used to populate the array of children in the target.
Instead of Spring Neo collating the children into the array on the target, I get "only 1 result expected" error - as if the query is returning multiple results each with one child, rather than one result with the children in it.
org.springframework.dao.IncorrectResultSizeDataAccessException:
Incorrect result size: expected at most 1
How can I have a custom query to populate that target's children list?
(Note that the built-in findByPlay(play) does what I want - the built-in queries have a depth of 1 rather than 0, and it returns a target with populated children - but of course I need to make the query a bit more sophisticated than just "by Play"... that's why I need to solve this)
Versions:
org.springframework.data:spring-data-neo4j:5.1.3.RELEASE
neo4j 3.5.0
=== Edit ======
Your problem arises because you have a self-relationship (a relationship between nodes of the same label).
This is how Spring treats your query for a single node:
org.springframework.data.neo4j.repository.query.GraphQueryExecution
@Override
public Object execute(Query query, Class<?> type) {
    Iterable<?> result;
    ....
    Object ret = iterator.next();
    if (iterator.hasNext()) {
        throw new IncorrectResultSizeDataAccessException("Incorrect result size: expected at most 1", 1);
    }
    return ret;
}
Spring passes your node class (Class<?> type) to neo4j-ogm and reads your data back.
The Neo4j server returns multiple rows for your query, one for each matching path:
A <- PARENT - B
A <- PARENT - C
A <- PARENT - D
If your nodes have different labels, i.e. different class types, the OGM returns only the single node that corresponds to your query's return type, so there is no problem.
But your nodes have the same label, i.e. the same class type, so Neo4j OGM cannot distinguish which node is the one to return: all of A, B, C and D are returned, hence the exception.
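To picture the difference between "one row per matching path" and "one target with a list of children", here is a small plain-Java sketch. It is only an illustration of the row shape, not how Neo4j OGM works internally; PathRow and RowCollator are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical representation of one result row: a target and one of its children.
class PathRow {
    final String target;
    final String child;

    PathRow(String target, String child) {
        this.target = target;
        this.child = child;
    }
}

class RowCollator {
    // Collapses one-row-per-path results into target -> list of children.
    static Map<String, List<String>> childrenByTarget(List<PathRow> rows) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (PathRow row : rows) {
            result.computeIfAbsent(row.target, k -> new ArrayList<>()).add(row.child);
        }
        return result;
    }
}
```

Three rows for A (with children B, C and D) collapse into a single entry for A, which is the shape the question expects from the repository method.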
I think you should file a bug report for this issue.
As a workaround, you can change the query to return only the distinct target.your_identity_property (the identity property is the 'primary key' of the node, which uniquely identifies it).
Then have your application load the node by that identity property:
public interface BoardRepository extends CrudRepository<BoardPos, Long> {
    @Query("MATCH (target:B) <-[c:PARENT]- (child:B) WHERE target.play={Play} RETURN DISTINCT target.your_identity_property")
    Long findActiveByPlay(@Param("Play") String play);
    BoardPos findByYourIdentityProperty(xxxx);
}
=== OLD ======
The Spring docs say (emphasis mine):
Custom queries do not support a custom depth. Additionally, @Query does not support mapping a path to domain entities, as such, a path should not be returned from a Cypher query. Instead, return nodes and relationships to have them mapped to domain entities.
So clearly your use case (populating child nodes via a custom query) is supported. The Spring framework already maps the results into a single node. (Indeed, my local setup shows that the operation works properly.)
So your exception may be caused by one of several issues:
- You have more than one target:BoardPosition with target.play={Play}, so the exception refers to more than one target:BoardPosition rather than to one BoardPosition with multiple child results.
- You have an incorrect entity mapping. Is your mapping field annotated with @Relationship with the correct direction attribute? You might post your entity here.
Here is my local setup:
@NodeEntity(label = "C")
@Data
public class Child {
    @Id
    @GeneratedValue
    private long id;
    private String name;
    @Relationship(type = "PARENT", direction = "INCOMING")
    private List<Parent> parents;
}
public interface ChildRepository extends CrudRepository<Child, Long> {
    @Query("MATCH (target:C) <-[p:PARENT]- (child:P) "
        + "WHERE target.name={name} "
        + "RETURN target, p, child")
    Child findByName(@Param("name") String name);
}
(:C) <-[:PARENT] - (:P)
Consider the alternative query
MATCH (target:BoardPosition {play:{Play}})
RETURN target, [ (target)<-[c:PARENT]-(child:BoardPosition) | [c, child] ]
This uses a pattern comprehension to return not only the target but also its relationships and related nodes of label BoardPosition within one result row. This ensures that the result is a single row (as long as your play attribute is unique).
I didn't try it with your example, but in my application this approach works fine: Neo4j OGM hydrates the objects as expected. It is important to include the related nodes as well as the relationships pointing to them.
If you enable Neo4j OGM logging, you can see that the built-in queries with depth 1 use the same approach.

JDBC: select entities with Many to one relation

I have two entity classes with a bi-directional many-to-one relation.
class A {
    @Column(name="ID")
    long Id;
}
class B {
    @ManyToOne
    @JoinColumn(name="A_ID")
    A a;
}
The entities are well-coded with additional data fields and getters and setters. And now I want to construct a query string to fetch data from table B, where B's "A_ID" column is equal to A's "ID".
I tried something like this:
"select b.data1, b.data2 from B b, A a WHERE b.a.Id=a.Id"
But it does not work. What is the correct way to construct such a query? And if A and B are in a uni directional relation, would there be any difference?
Thanks in advance.
You don't need to join the tables, the whole idea behind #ManyToOne and #OneToMany is to do away with the need for most joins.
I refer you to a tutorial on JPA, like http://en.wikibooks.org/wiki/Java_Persistence/ManyToOne and http://en.wikibooks.org/wiki/Java_Persistence/OneToMany.
Now, without seeing your actual db definitions it's a bit difficult to guess the actual structure of your program and database, but it should be something like this:
class A {
    @Id
    @Column(name="ID")
    long Id;
    @OneToMany(mappedBy="a")
    List<B> bees;
}
class B {
    @ManyToOne
    @JoinColumn(name="A_ID") // Note that A_ID is a column in the B table!!!
    A a;
}
With the example above you can just select any list of B's you need, and JPA will automatically fetch the associated A for each B found. You don't need to do anything to be able to access it; b.a.Id will just work.
As we also have the OneToMany relationship, every A can have multiple B's associated with it. So, for any select that fetches a set of A's, each returned A's bees field gives access to the proper list of B objects, without the need to pull the B table into the query.

Sum with one to many Spring Data JPA

I am trying to get a sum with a one to many relationship, illustrated by the following relationship (only parent shown):
@Entity
@Table(name = "Parent")
public class Parent implements Serializable {
    private static final long serialVersionUID = -7348332185233715983L;
    @Id
    @Basic(optional = false)
    @Column(name = "PARENT_ID")
    private Long parentId;
    @OneToMany
    @JoinColumn(name="CHILDREN", referencedColumnName="PARENT_ID")
    private List<Child> children;
    @Formula("(select sum(select height from children))")
    private BigDecimal totalHeight;
}
It is pretty straight forward with no restrictions and even with static restrictions. I am having trouble when the children list is restricted dynamically though.
In my case, I am using spring data and jpa. I am using specifications to restrict the children and am getting the appropriate list of children, but obviously the sum is still for unrestricted children because there is no where clause in the #Formula tag.
I do not want to iterate over the list in java for performance reasons and because the results are paginated. Also, the sum is not of the paginated results, but of all results.
I am new to Spring Data/JPA. Historically, I could build this query dynamically or use Hibernate criteria. I am OK with running a completely separate query to make this calculation; it is not required that I use the @Formula annotation, as there is only one aggregation per call. In a Hibernate framework, I could just state the select clause as "sum(field)" and build the criteria. In the Spring Data/JPA framework, I can build the specifications fine, which covers the criteria, but I have no idea how to manipulate the select part of the query since it seems tied so tightly to the entity.
Using the #Query annotation on the repository works as its own query if I know which fields I need to restrict on, but often the fields are null and need to be ignored for the query. There are 8 possible fields, leaving me with 256 possible combinations (2^8). That is too many methods for this in the repository.
Any ideas outside of switching frameworks?
Posting an answer to this old question since I had a somewhat similar problem recently.
I decided to go with a custom repository with a method that does the aggregation based on any Specification passed into it. Specifications can be combined to compose dynamic criteria (see org.springframework.data.jpa.domain.Specifications).
So my repository for the Child height problem above would look like this:
package something;
import org.springframework.data.jpa.domain.Specification;
import org.springframework.stereotype.Repository;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Root;
@Repository
public class ChildHeightRepository {
    @PersistenceContext
    EntityManager entityManager;
    public Long getTotalHeight(Specification<Child> spec) {
        CriteriaBuilder cb = entityManager.getCriteriaBuilder();
        CriteriaQuery<Long> query = cb.createQuery(Long.class);
        Root<Child> root = query.from(Child.class);
        // Use the caller's Specification as the WHERE clause and sum the height attribute.
        query.select(cb.sum(root.<Long>get("height")))
             .where(spec.toPredicate(root, query, cb));
        return entityManager.createQuery(query).getSingleResult();
    }
}
Have you tried JPQL?
select sum(d.height) from Parent a join a.children d
If you don't want to ignore nulls (parents without children), use a left join:
select sum(d.height) from Parent a left join a.children d
I think the other question you have is how to filter depending on the properties, i.e. when you need a where clause with several combinations.
Why not use a List and add to it all the predicates you want to apply, depending on the combination you need? Example:
Create the query
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Double> cq = cb.createQuery(Double.class);
Root<Parent> root = cq.from(Parent.class);
Join<Parent, Child> join = root.join("children");
cq.select(cb.<Double>sum(join.get("height")));
Create the list of predicates
List<Predicate> listOfPredicates = new ArrayList<Predicate>();
if (property1 != null && !"".equals(property1)) {
    ParameterExpression<String> p = cb.parameter(String.class, "valueOfproperty1");
    listOfPredicates.add(cb.equal(join.get("property1"), p));
}
Then add them to the CriteriaQuery
if (listOfPredicates.size() == 1)
    cq.where(listOfPredicates.get(0));
else
    cq.where(cb.and(listOfPredicates.toArray(new Predicate[0])));
Finally execute the query.
TypedQuery<Double> q = em.createQuery(cq);
// Bind a value for each parameter that was actually added above, e.g.
// q.setParameter("valueOfproperty1", property1);
q.getResultList();
This will dynamically create your query with any combination.
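The same "collect only the filters that were supplied" idea can be tried out without a database. Below is a minimal in-memory analogue in plain Java using java.util.function.Predicate (not the JPA Predicate); the ChildRow class and its colour/shape properties are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory row with two optional filter properties.
class ChildRow {
    final String colour;
    final String shape;
    final double height;

    ChildRow(String colour, String shape, double height) {
        this.colour = colour;
        this.shape = shape;
        this.height = height;
    }
}

class DynamicSum {
    // Sums height over the rows matching every filter that is non-null and non-empty.
    static double totalHeight(List<ChildRow> rows, String colour, String shape) {
        List<Predicate<ChildRow>> predicates = new ArrayList<>();
        if (colour != null && !colour.isEmpty()) {
            predicates.add(r -> colour.equals(r.colour));
        }
        if (shape != null && !shape.isEmpty()) {
            predicates.add(r -> shape.equals(r.shape));
        }
        // AND all collected predicates together; an empty list matches everything.
        Predicate<ChildRow> all = predicates.stream().reduce(r -> true, Predicate::and);
        return rows.stream().filter(all).mapToDouble(r -> r.height).sum();
    }
}
```

One method handles every combination of supplied and missing filters, so adding an eighth optional property means one more if block rather than doubling the number of repository methods.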
6 years late, but it still took me a while to get this to work. Here is how I would do it for your mapping:
@Formula("(select sum(children.height) from children_table children inner join Parent p on children.parent_id=p.id where children.parent_id=parent_id)")
private BigDecimal totalHeight;
Things you need to take care of:
- Wrap the formula in () at the beginning and end, otherwise the SQL won't be translated correctly.
- To my understanding the formula is a native SQL fragment, not JPQL (someone might want to correct me on this).
- Names in the formula are those of the database tables and columns, not the property names you use in Java.
