I'm using Spring Boot 2.3.1 and Spring Data to access PostgreSQL. I have the following simple controller:
@RestController
public class OrgsApiImpl implements OrgsApi {

    @Autowired
    Orgs repository;

    @Override
    public ResponseEntity<List<OrgEntity>> listOrgs(@Valid Optional<Integer> pageLimit,
            @Valid Optional<String> pageCursor, @Valid Optional<List<String>> domainId,
            @Valid Optional<List<String>> userId) {
        List<OrgEntity> orgs;
        if (domainId.isPresent() && userId.isPresent()) {
            orgs = repository.findAllByDomainIdInAndUserIdIn(domainId.get(), userId.get());
        } else if (domainId.isPresent()) {
            orgs = repository.findAllByDomainIdIn(domainId.get());
        } else if (userId.isPresent()) {
            orgs = repository.findAllByUserIdIn(userId.get());
        } else {
            orgs = repository.findAll();
        }
        return ResponseEntity.ok(orgs);
    }
}
And a simple JPA repository:
public interface Orgs extends JpaRepository<OrgEntity, String> {
    List<OrgEntity> findAllByDomainIdIn(List<String> domainIds);
    List<OrgEntity> findAllByUserIdIn(List<String> userIds);
    List<OrgEntity> findAllByDomainIdInAndUserIdIn(List<String> domainIds, List<String> userIds);
}
The code above has several obvious issues:
If the number of query parameters grows, this if/else chain grows very quickly and becomes hard to maintain. Question: is there any way to build a query with a dynamic number of parameters?
This code doesn't contain a mechanism to support a cursor. Question: is there any tool in Spring Data that supports cursor-based queries?
The second question can easily be resolved once the first one is answered.
Thank you in advance!
tl;dr
It's all in the reference documentation.
Details
Spring Data modules pretty broadly support Querydsl to build dynamic queries as documented in the reference documentation. For Spring Data JPA in particular, there's also support for Specifications on top of the JPA Criteria API. For simple permutations, query by example might be an option, too.
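For the controller in the original question, a minimal sketch using Specifications could look like this (assuming Orgs is extended to also implement JpaSpecificationExecutor<OrgEntity>, and that OrgEntity has domainId and userId attributes):

import java.util.List;
import org.springframework.data.jpa.domain.Specification;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.JpaSpecificationExecutor;

public interface Orgs extends JpaRepository<OrgEntity, String>, JpaSpecificationExecutor<OrgEntity> {
}

In the controller, each optional parameter then contributes one predicate, so the if/else chain no longer grows with the number of parameters:

Specification<OrgEntity> spec = Specification.where(null);
if (domainId.isPresent()) {
    spec = spec.and((root, query, cb) -> root.get("domainId").in(domainId.get()));
}
if (userId.isPresent()) {
    spec = spec.and((root, query, cb) -> root.get("userId").in(userId.get()));
}
List<OrgEntity> orgs = repository.findAll(spec);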
As for the second question, Spring Data repositories support streaming over results. That said, assuming you'd like to do this for performance reasons, JPA might not be the best fit in the first place, as it'll still keep processed items around due to its entity lifecycle model. If it's just about accessing subsets of the results page by page or slice by slice, that's supported, too.
For even more efficient streaming over large data sets, it's advisable to resort to plain SQL either via jOOQ (which can be used with any Spring Data module supporting relational databases), Spring Data JDBC or even Spring Data R2DBC if reactive programming is an option.
You can use the spring-dynamic-jpa library to write a query template.
The query template will be built into different query strings before execution, depending on your parameters when you invoke the method.
Related
I'm wondering if I can use JPA specification predicates in custom queries?
I've tried but with no success.
Let's say I have an Entity Customer and a repository:
@Repository
public interface CustomerRepository
        extends JpaRepository<Customer, Long>,
                JpaSpecificationExecutor<Customer> {
}
Querying like this is OK
#Query("select c from Customer c")
Stream<Customer> streamAllCustomers();
This is Not OK
Stream<Customer> streamAllCustomersWithFilter(Specification<Customer> filter);
Is there a way to achieve this?
NB: I know I can put params in the @Query, but I would like to stay within the design of the current app and use Specifications all the way.
Here is the way to stream data from Spring Data JPA that I use.
This approach is useful for processing huge amounts of data while avoiding high memory consumption, because the whole query result is not loaded into memory.
Create a custom repository fragment with the following implementation:
import java.util.stream.Stream;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Root;
import org.springframework.data.jpa.domain.Specification;

public class YourCustomRepositoryImpl implements YourCustomRepository {

    @PersistenceContext(unitName = "yourEntityManagerFactory")
    private EntityManager em;

    @Override
    public Stream<SomeEntity> streamAll(Specification<SomeEntity> spec) {
        CriteriaBuilder cb = em.getCriteriaBuilder();
        CriteriaQuery<SomeEntity> query = cb.createQuery(SomeEntity.class);
        Root<SomeEntity> root = query.from(SomeEntity.class);
        query.where(spec.toPredicate(root, query, cb));
        return em.createQuery(query).getResultStream();
    }
}
Of course, this requires an ORM framework that supports JPA 2.2 (I used Hibernate 5.3).
Also, you must take care to keep the connection alive while the stream is being processed.
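A hypothetical caller could look like this (class and method names are illustrative); the stream has to be consumed inside a transaction, so the connection stays open, and should be closed via try-with-resources:

import java.util.stream.Stream;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.jpa.domain.Specification;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class SomeEntityService {

    @Autowired
    private YourCustomRepository repository;

    @Transactional(readOnly = true)
    public long countMatching(Specification<SomeEntity> spec) {
        // try-with-resources closes the stream and releases the underlying resources
        try (Stream<SomeEntity> stream = repository.streamAll(spec)) {
            return stream.count();
        }
    }
}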
TL;DR
No, and No, but manually Yes
I think issue DATAJPA-906 answers both of your questions.
Question (from the title): How to declare Stream as the return type when dealing with JPA Specifications and spring-data-jpa?
You don't, at least not in a directly supported way:
Support Java 8 Streams on JpaSpecificationExecutor
[..]
This unfortunately will have to wait for a 2.0 revamp as a Stream in the method signature would render the interface unloadable on versions of Java < 8.
Of course you can always add your custom methods including implementation.
Question: Can I use JPA specification predicates in custom queries? (custom queries being queries defined using the @Query annotation)
How would you even combine a CriteriaQuery defined through a Specification and a manually defined JPQL query?
In case the problem is not clear: if your custom query contains an inner select, where should the Criteria from the specification go?
What you can do
Implement a custom method that returns a Stream and takes a Specification as an argument; combine it with prepared specifications, call an existing method of the JpaSpecificationExecutor interface, and convert the result to a Stream, as sketched below.
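A sketch of what that call site could look like, reusing the streamAll(Specification) fragment from the answer above (the specification factory methods isActive() and createdAfter() are hypothetical):

// Prepared specifications are combined with and()/or() before being passed in
Specification<SomeEntity> filter = Specification
        .where(SomeEntitySpecs.isActive())
        .and(SomeEntitySpecs.createdAfter(cutoffDate));

try (Stream<SomeEntity> result = repository.streamAll(filter)) {
    result.forEach(this::process);
}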
Yesterday I got access to a new project in my company and I found this:
public List<User> findActiveUsers() {
    return this.userRepository.findAll().stream()
            .filter(User::isActive)
            .collect(Collectors.toList());
}
Is this a good way to find all the active users? Or should it be done in a repository like this?
public interface UserRepository extends JpaRepository<User, Long> {
    @Query("SELECT user FROM User user WHERE user.active IS TRUE")
    List<User> findActiveUsers();
}
And if the first solution is correct, what about its performance?
Firstly, both options fulfill the requirement.
However, option 2 makes more sense: it filters the data at the query level rather than at the Java level. I believe the performance of the second option would be better, though I don't have data to back up this statement; my comments on performance are based on experience.
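As a side note, for a filter this simple a derived query method achieves the same query-level filtering without writing any JPQL (a minimal sketch, assuming User has a boolean active property):

public interface UserRepository extends JpaRepository<User, Long> {

    // Spring Data derives "WHERE active = true" from the method name
    List<User> findByActiveTrue();
}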
You can also consider whether a cache (@Cacheable) can be used. It purely depends on the use case, i.e. how frequently the User entity changes and how frequently you would like to refresh the cache.
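A minimal sketch of the caching option (assuming @EnableCaching is configured and a cache named "activeUsers" exists; the service class is illustrative):

import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    // The result is cached; subsequent calls skip the database until the cache is evicted
    @Cacheable("activeUsers")
    public List<User> getActiveUsers() {
        return userRepository.findActiveUsers();
    }
}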
One disadvantage of using a native query is that Spring Data JPA currently doesn't support dynamic sorting for native queries.
Please refer to the similar question discussed in the link below, though it is largely about Hibernate. Clearly, option 3 there (the @Query approach) is preferred.
Spring Data Repository with ORM, EntityManager, #Query, what is the most elegant way to deal with custom SQL queries?
I am considering to use Spring-Data + QueryDSL + JDBC to replace (or enhance) the currently used MyBatis.
My reasons are:
Compile-time check of column names
Compile-time check of SQL statements and auto-completion from the IDE
Ability to write unit tests on Java Collections against the same code that will work against the actual DB, which is much simpler and faster than pre-populating a DB
It is more concise than MyBatis – no need for a separate XxxMapper.java and XxxMapper.xml under the DAO layer
Yet I see the following problems:
There is no infrastructure for mapping the query results to domain objects. QueryDSL's QBean and MappingProjection, Spring's BeanPropertyRowMapper and Spring-Data's OneToManyResultSetExtractor seem to be too low level, see below.
No out of the box session/transaction-level cache which comes for free in MyBatis
No out of the box SQL statement and result logging which comes for free in MyBatis
Since I am asking a single question, let's concentrate on the mapping, which I consider the most important problem.
So my question is:
Is there any convenient solution to map the QueryDSL's SQLQuery results to the domain object, similar to what MyBatis or JPA offer? That is, the mapping based on some simple configuration, be it XML, annotations, or simple Java?
In particular, I am interested in the following:
Mapping a column to a custom type, e.g. a String column to an EmailAddress Java object
Mapping a group of columns to an embedded object, e.g. grouping {first_name, last_name} into a FullName Java object
Supporting one-to-many relationship, such as being able to extract a Customer object(s) containing a list of Addresses.
To summarize, I need an easy way to obtain one or many objects of the following 'Customer' class from the following SQL query:
class Customer {
    EmailAddress emailAddress;
    FullName fullName;
    Set<Address> addresses;
    Set<Comment> selfDescription;
}

class EmailAddress {
    private String address;
    EmailAddress(String address) { this.address = address; }
}

class FullName {
    String firstName, lastName;
}

class Address {
    String street, town, country;
}

class Comment {
    LocalDateTime timeStamp;
    String content;
}
Query:
query.from(qCustomer)
     .leftJoin(qCustomer._addressCustomerRef, qAddress)
     .leftJoin(qCustomer._commentCustomerRef, qComment)
     .getResults(
         qCustomer.email_address, qCustomer.first_name, qCustomer.last_name,
         qAddress.street, qAddress.town, qAddress.country,
         qComment.creation_time_stamp, qComment.content);
The ideal solution for me would be to reuse the MyBatis mapping infrastructure.
Another mapping solution or a custom one is also acceptable.
Note:
I could also accept "negative" answers if you show an alternative that:
Possesses an ease of use and transparency comparable to that of MyBatis - you always know which SQL is executed by simply inspecting the code
Allows full control over the executed SQL code, in particular, allows to easily write three DAO methods for retrieving 'Customer': without 'addresses' and 'selfDescription' information, just with 'addresses', and with all the fields
Allows compile-time check of your SQL code
Does not require hand-coding of mapping of every single domain class from SQL.
The alternative should work well on the example above.
Solutions already considered:
MyBatis 'Builder' class (http://mybatis.github.io/mybatis-3/statement-builders.html): not enough, since the column and table names are still Strings, so it violates requirement (3)
Spring-data + JPA + QueryDSL: might be an option if you show how the requirements (1) and (2) can be satisfied and if no simpler solution will be provided
Lukas Eder gave an excellent answer to a similar question here: Is it possible to combine MyBatis and QueryDSL/jOOQ?
His answer to the mapping question is to use either Java 8 functional style capabilities or a dedicated solution such as modelmapper.
He also mentioned Spring's JCache support as a caching solution and this solution for the logging.
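To illustrate the Java 8 functional approach on the Customer example above, a rough sketch could look like this (the Q-class attribute names, the join condition, and the constructors on Customer, FullName and Address are assumptions; Tuple is com.querydsl.core.Tuple):

// Fetch flat rows, then group them into Customer aggregates in plain Java.
// EmailAddress would need equals()/hashCode() to serve as a map key.
List<Tuple> rows = query.select(qCustomer.emailAddress, qCustomer.firstName, qCustomer.lastName,
                qAddress.street, qAddress.town, qAddress.country)
        .from(qCustomer)
        .leftJoin(qAddress).on(qAddress.customerId.eq(qCustomer.id))
        .fetch();

Map<EmailAddress, Customer> byEmail = new LinkedHashMap<>();
for (Tuple row : rows) {
    EmailAddress email = new EmailAddress(row.get(qCustomer.emailAddress));
    Customer customer = byEmail.computeIfAbsent(email,
            e -> new Customer(e, new FullName(row.get(qCustomer.firstName),
                    row.get(qCustomer.lastName))));
    if (row.get(qAddress.street) != null) {
        customer.addresses.add(new Address(row.get(qAddress.street),
                row.get(qAddress.town), row.get(qAddress.country)));
    }
}
Collection<Customer> customers = byEmail.values();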
I have a large table that I'd like to access via a Spring Data Repository.
Currently, I'm trying to extend the PagingAndSortingRepository interface, but it seems I can only define methods that return lists, e.g.:
public interface MyRepository extends
        PagingAndSortingRepository<MyEntity, Integer> {

    @Query(value = "SELECT * ...")
    List<MyEntity> myQuery(Pageable p);
}
On the other hand, the findAll() method that comes with PagingAndSortingRepository returns an Iterable (and I suppose that the data is not loaded into memory).
Is it possible to define custom queries that also return Iterable and/or don't load all the data into memory at once?
Are there any alternatives for handling large tables?
We have the classical consulting answer here: it depends. As the implementation of the method is store-specific, we depend on the underlying store API. In the case of JPA there's no chance to provide streaming access, as …getResultList() returns a List. Hence we also expose the List to the client, as JPA developers especially might be used to working with lists. So for JPA the only option is using the pagination API.
For a store like Neo4j we support the streaming access as the repositories return Iterable on CRUD methods as well as on the execution of finder methods.
The implementation of findAll() simply loads the entire list of all entities into memory. Its Iterable return type doesn't imply that it implements some sort of database level cursor handling.
On the other hand your custom myQuery(Pageable) method will only load one page worth of entities, because the generated implementation honours its Pageable parameter. You can declare its return type either as Page or List. In the latter case you still receive the same (restricted) number of entities, but not the metadata that a Page would additionally carry.
So you basically did the right thing to avoid loading all entities into memory in your custom query.
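For completeness, a sketch of the Page-returning variant (the JPQL string here is illustrative):

@Query("SELECT m FROM MyEntity m")
Page<MyEntity> myQuery(Pageable p);

// Usage: fetch the second page of 50 entities, plus the paging metadata
Page<MyEntity> page = myRepository.myQuery(PageRequest.of(1, 50));
List<MyEntity> content = page.getContent();
long totalElements = page.getTotalElements();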
Please review the related documentation here.
I think what you are looking for is a Spring Data JPA Stream. It brings a significant performance boost to data fetching, particularly in databases with millions of records. In your case you have several options to consider:
Pull all data once in memory
Use pagination and read pages each time
Use something like Apache Spark
Streaming data using Spring Data JPA
In order to make the Spring Data JPA stream work, we need to modify MyRepository to return Stream<MyEntity>, like this:
import static org.hibernate.jpa.QueryHints.HINT_CACHEABLE;
import static org.hibernate.jpa.QueryHints.HINT_READONLY;

public interface MyRepository extends PagingAndSortingRepository<MyEntity, Integer> {

    @QueryHints(value = {
            @QueryHint(name = HINT_CACHEABLE, value = "false"),
            @QueryHint(name = HINT_READONLY, value = "true")
    })
    @Query(value = "SELECT * ...")
    Stream<MyEntity> myQuery();
}
In this example, we disable second-level caching and hint to Hibernate that the entities will be read-only. If your requirements are different, make sure to change those settings accordingly.
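One caveat worth a sketch: the stream must be consumed inside a transaction and closed when done, so that the underlying ResultSet and connection are released (the service class here is illustrative):

import java.util.stream.Stream;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class MyEntityService {

    @Autowired
    private MyRepository myRepository;

    @Transactional(readOnly = true)
    public void processAll() {
        try (Stream<MyEntity> stream = myRepository.myQuery()) {
            // process one entity at a time without materializing the whole result
            stream.forEach(this::process);
        }
    }

    private void process(MyEntity entity) {
        // ...
    }
}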
Starting to work on a new project... a RESTful layer providing services for a social network platform.
Neo4j was my obvious choice for the main data store; I had the chance to work with Neo4j before, but without exploiting Spring Data's ability to map POJOs to nodes, which seems very convenient.
Goals:
The layer should provide support resembling the Facebook Graph API, which defines for each entity/object its related properties & connections, which can be referenced from the URL. FB Graph API
If possible I want to avoid transfer objects that are serialized to/from domain entities, and instead use my domain POJOs as the JSON transferred to/from the client.
Examples:
HTTP GET /profile/{id}/?fields=...&connections=... the response will be a Profile object containing what was requested in the URL.
HTTP GET /profile/{id}/stories/?fields=..&connections=...&page=..&sort=... the response will be a list of Story objects according to the request.
Relevant Versions:
Spring Framework 3.1.2
Spring Data Neo4j 2.1.0.RC3
Spring Data Mongodb 1.1.0.RC1
AspectJ 1.6.12
Jackson 1.8.5
To make it simple, we have Profile and Story nodes and a Role relationship between them.
public abstract class GraphEntity {

    @GraphId
    protected Long id;
}
Profile Node
@NodeEntity
@Configurable
public class Profile extends GraphEntity {

    // Profile fields
    private String firstName;
    private String lastName;

    // Profile connections
    @RelatedTo(type = "FOLLOW", direction = Direction.OUTGOING)
    private Set<Profile> followThem;

    @RelatedTo(type = "BOOKMARK", direction = Direction.OUTGOING)
    private Set<Story> bookmarks;

    @Query("START profile=node({self}) MATCH profile-[r:ROLE]->story WHERE r.role = 'FOUNDER' AND story.status = 'PUBLIC' RETURN story")
    private Iterable<Story> published;
}
Story Node
@NodeEntity
@Configurable
public class Story extends GraphEntity {

    // Story fields
    private String title;
    private StoryStatusEnum status = StoryStatusEnum.PRIVATE;

    // Story connections
    @RelatedToVia(type = "ROLE", elementClass = Role.class, direction = Direction.INCOMING)
    private Set<Role> roles;
}
Role Relationship
@RelationshipEntity(type = "ROLE")
public class Role extends GraphEntity {

    @StartNode
    private Profile profile;

    @EndNode
    private Story story;

    private StoryRoleEnum role;
}
At first I didn't use the AspectJ support, but I find it very useful for my use case because it generates a separation between the POJO and the actual node; therefore I can easily request properties/connections according to the request, and the Domain-Driven Design approach seems very nice.
Question 1 - AspectJ:
Let's say I want to define default fields for an object; these fields will be returned to the client whether requested in the URL or not. So I tried the @Fetch annotation on these fields, but it seems it is not working when using AspectJ.
At the moment I do it this way:
public Profile(Node n) {
    setPersistentState(n);
    this.id = getId();
    this.firstName = getFirstName();
    this.lastName = getLastName();
}
Is this the right approach to achieve that? Should the @Fetch annotation be supported even when using AspectJ? I would be happy to get examples/blogs about AspectJ + Neo4j; I found almost nothing.
Question 2 - Pagination:
I would like to support pagination when requesting a specific connection, for example:
/profile/{id}/stories/, if stories are related as below:
// inside profile node
@RelatedTo(type = "BOOKMARK", direction = Direction.OUTGOING)
private Set<Story> bookmarks;
/profile/{id}/stories/, if stories are related as below:
// inside profile node
@Query("START profile=node({self}) MATCH profile-[r:ROLE]->story WHERE r.role = 'FOUNDER' AND story.status = 'PUBLIC' RETURN story")
private Iterable<Story> published;
Is pagination supported out of the box with either @Query, @RelatedTo, or @RelatedToVia, using the Pageable interface to retrieve a Page instead of a Set/List/Iterable? The limit and the sorting should be dynamic, depending on the request from the client. I can achieve that using the Cypher Query DSL but would prefer to use the basics; other approaches will be accepted happily.
Question 3 - #Query with {self}:
Kind of a silly question, but I can't help it :). It seems that when using @Query inside the node entity (using the {self} parameter) the return type must be Iterable, which makes sense.
Let's take the example of:
// inside profile node
@Query("START profile=node({self}) MATCH profile-[r:ROLE]->story WHERE r.role = 'FOUNDER' AND story.status = 'PUBLIC' RETURN story")
private Iterable<Story> published;
When published connection is requested:
// retrieving the context profile
Profile profile = profileRepo.findOne(id);
// getting the published stories using AspectJ - will be redirected to the backing node
Iterable<Story> published = profile.getPublished();
// setting the result into the domain object - will throw a read-only exception because the type is Iterable
profile.setPublished(published);
Is there a workaround for that which does not involve creating another property marked @Transient inside Profile?
Question 4 - Recursive relations:
I am having some problems with transitive/recursive relations: when assigning a Profile a new Role in a Story, the relationship entity Role contains the @EndNode story, which contains a roles connection, one of which is the context role above, and it never ends :).
Is there a way to configure the Spring Data engine not to create these never-ending relations?
Question 5 - Transactions:
Maybe I should have mentioned it before, but I am using the REST server for the Neo4j DB. From previous reading I understand that, unlike with the embedded server, there is no out-of-the-box support for transactions?
I have the following code...
Profile newProfile = new Profile();
newProfile.getFollowThem().add(otherProfile);
newProfile.getBookmarks().add(otherStory);
newProfile.persist(); // or profileRepo.save(newProfile)
Will this run in a transaction when using the REST server? There are a few operations here; if one fails, do all fail?
Question 6 - Mongo + Neo4j:
I need to store data which doesn't have a relational nature, like feeds, comments, and messages. I thought about an integration with MongoDB to store these. Can I split domain POJO fields/connections across both Mongo and Neo4j with cross-store support? Will it support AspectJ?
That is it for now. Any comments regarding any of the approaches I presented above are welcome. Thank you.
Starting to answer, by no means complete:
Perhaps upgrade to the .RELEASE versions?
Question 1
If you want to serialize AspectJ entities to JSON you have to exclude the internal fields generated by the advanced mapping (see this forum discussion).
When you use the advanced mapping, @Fetch is not necessary, as the data is read through from the database anyway.
Question 2
For pagination of fields, you can try to use a Cypher query with @Query and SKIP 10 LIMIT 100 as fixed parameters, as sketched below. Otherwise you could employ a repository/template to fill a Collection field of your entity with the paged information.
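A sketch of the fixed-parameter variant inside the entity (SDN 2.x style; the SKIP/LIMIT values are hardcoded, which is exactly the limitation mentioned above):

// inside profile node
@Query("START profile=node({self}) MATCH profile-[r:ROLE]->story " +
       "WHERE r.role = 'FOUNDER' AND story.status = 'PUBLIC' " +
       "RETURN story SKIP 10 LIMIT 100")
private Iterable<Story> publishedPage;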
Question 3
I don't think that the return type of an @Query has to be Iterable; it should also work with other types (Collections or concrete types). What is the issue you run into?
Question 4
For creating recursive relationships, try to store the relationship objects themselves first and only then the node entities. Or use template.createRelationshipBetween(start, end, type, allowDuplicates) to create the relationships, as sketched below.
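A sketch of the template-based approach (treat the exact createRelationshipBetween overload as an assumption, as it varies between SDN versions, and the setter on Role is hypothetical):

// Save the nodes first, then create the ROLE relationship explicitly,
// instead of letting cascading saves walk the recursive object graph
Profile profile = template.save(newProfile);
Story story = template.save(newStory);
Role role = template.createRelationshipBetween(profile, story, Role.class, "ROLE", false);
role.setRole(StoryRoleEnum.FOUNDER);
template.save(role);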
Question 5
As you are using SDN over REST, it might not perform very well: right now the underlying implementation uses the RestGraphDatabase for fine-grained operations, and the advanced mapping uses very fine-grained calls. Is there any reason why you don't want to use the embedded mode? Against a REST server I would most certainly use the simple mapping and try to handle read operations mostly with Cypher.
With the REST API there is only one transaction per HTTP call; the only option for having larger transactions is to use the REST batch API.
There is pseudo-transaction support in the underlying rest-graph-database which batches calls issued within a "transaction" into one batch REST request. But those calls must not rely on read results during the tx; those will only be populated after the tx has finished. There were also some issues using this approach with SDN, so I disabled it there (it is a config option/system property for the rest-graphdb).
Question 6
Right now cross-store support for both MongoDB and Neo4j is just used against a JPA / relational store. We discussed having cross-store references between the spring-data projects once but didn't follow up on this.