Hibernate Batch processing - spring

I am trying to configure Hibernate for batch processing. I have built a sample application and configured it according to Hibernate's docs.
But after configuring Hibernate to log the SQL, it looks like it is not performing a batch insert at all, just individual inserts. Am I reading this log wrong?
I have the following properties configured in my Spring Boot app:
spring.jpa.properties.hibernate.jdbc.batch_size=10
spring.jpa.properties.hibernate.show_sql=true
spring.jpa.properties.hibernate.format_sql=true
Here is my very basic batch writer:
@Transactional
public class BatchWriter {

    private final EntityManager entityManager;
    private final int batchSize;

    public BatchWriter(EntityManager entityManager, int batchSize) {
        this.entityManager = entityManager;
        this.batchSize = batchSize;
    }

    public void writeBatchOfCustomers(int numOf) {
        for (long i = 0; i < numOf; i++) {
            Customer customer = new Customer(i);
            entityManager.persist(customer);
            // batchSize should match hibernate.jdbc.batch_size (10 above)
            if ((i + 1) % batchSize == 0) {
                // flush a batch of inserts and release memory:
                entityManager.flush();
                entityManager.clear();
            }
        }
    }
}
Now I am running this to insert 20 Customers for example and in the hibernate log I am seeing the following 20 times:
Hibernate:
insert
into
customer
(first_name, last_name, id)
values
(?, ?, ?)
What am I missing here?
It is currently using Spring Boot auto configuration with H2 database. I will however be looking to use it with Spring Batch and an Oracle db eventually, which will be inserting around 30k objects with about 35 attributes.
Any help appreciated.
Thanks,

So it appears that the Hibernate SQL logging is rather misleading (in my opinion).
My configuration was in fact batching the inserts.
I added a logger with level DEBUG for this class:
org.hibernate.engine.jdbc.batch.internal.BatchingBatch
That is its name in Hibernate version 5.0.12 (I think it was named something else in earlier versions).
The output of this logger shows that the inserts are in fact being executed in batches.
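To reproduce the check described above, a logging entry along these lines should be enough. This is a minimal sketch that assumes Spring Boot's standard logging.level.<logger-name> property in application.properties; the logger name is the class mentioned in the answer and may differ in other Hibernate versions.

```properties
# Log Hibernate's JDBC batch executions (class name as of Hibernate 5.0.12)
logging.level.org.hibernate.engine.jdbc.batch.internal.BatchingBatch=DEBUG
```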

Related

Spring boot manually commit transaction

In my Spring Boot app I'm deleting and inserting a large amount of data into my MySQL db in a single transaction. Ideally, I want to commit the results to my database only at the end, so it is all or nothing. I'm running into issues where my deletions will be committed before my insertions, so during that period any calls to the db will return no data (not good). Is there a way to commit the transaction manually?
My main logic is:
@Transactional
public void saveParents(List<Parent> parents) {
    parentRepo.deleteAllInBatch();
    parentRepo.resetAutoIncrement();
    // I'm setting the id manually beforehand
    String sql = "INSERT INTO parent " +
            "(id, name, address, number) " +
            "VALUES ( ?, ?, ?, ?)";
    jdbcTemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {
        @Override
        public void setValues(PreparedStatement ps, int i) throws SQLException {
            Parent parent = parents.get(i);
            ps.setInt(1, parent.getId());
            ps.setString(2, parent.getName());
            ps.setString(3, parent.getAddress());
            ps.setString(4, parent.getNumber());
        }
        @Override
        public int getBatchSize() {
            return parents.size();
        }
    });
}
ParentRepo
@Repository
@Transactional
public interface ParentRepo extends JpaRepository<Parent, Integer> {
    @Modifying
    @Query(
        value = "alter table parent auto_increment = 1",
        nativeQuery = true
    )
    void resetAutoIncrement();
}
EDIT:
So I changed
parentRepo.deleteAllInBatch();
parentRepo.resetAutoIncrement();
to
jdbcTemplate.update("DELETE FROM output_stream");
jdbcTemplate.update("alter table output_stream auto_increment = 1");
to try to avoid JPA's transaction, but each operation seems to commit separately no matter what I try. I have tried TransactionTemplate and implementing PlatformTransactionManager (seen here) but I can't seem to get these operations to commit together.
EDIT: It seems the issue I was having was with the alter table statement, as it will always commit (in MySQL, DDL statements such as ALTER TABLE cause an implicit commit).
"I'm running into issues where my deletions will be committed before my insertions, so during that period any calls to the db will return no data"
Did you configure JPA and JDBC to share transactions?
If not, then you're basically using two different mechanisms to access the data (EntityManager and JdbcTemplate), each of them maintaining a separate connection to the database. What likely happens is that only EntityManager joins the transaction created by @Transactional; the JdbcTemplate operation executes either without a transaction context (read: in AUTOCOMMIT mode) or in a separate transaction altogether.
See this question. It is a little old, but then again, using JPA and JDBC together is not exactly a common use case. Also, have a look at the Javadoc for JpaTransactionManager.

Hibernate - Table Locked after update

I'm performing an update via a method using Hibernate and the EntityManager.
This update method is called multiple times (within a loop).
It seems like when I execute it the first time, it locks the table and does not free it.
When trying to update the table via SQL Developer after having closed the application, I see the table is still locked because the update is hanging.
What do you see as a solution to this problem? If you need more information, let me know.
Class
@Repository
@Transactional(propagation = REQUIRES_NEW)
public class YirInfoRepository {
    @Autowired
    EntityManager entityManager;

    @Transactional(propagation = REQUIRES_NEW)
    public void setSent(String id) {
        String query = "UPDATE_QUERY";
        Query nativeQuery = entityManager.createNativeQuery(String.format(query, id));
        nativeQuery.executeUpdate();
    }
}
UPDATE
After having waited more than one hour, I launched the application again and it worked fine once but now again, it hangs.
UPDATE 2 -- I'll give a maximum bounty to whoever helps me solve this
In another place I use an application-managed entity manager and it still gives me the same type of errors.
public void fillYirInfo() {
    File inputFile = new File("path");
    try (InputStream inputStream = new FileInputStream(inputFile);
         BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream))) {
        bufferedReader.lines().skip(1).limit(20).forEach(line -> {
            String[] data = line.split(",");
            String rnr = data[0];
            String linked = data[1];
            String email = data.length > 2 ? data[2] : "";
            String insuredId = insuredPeopleRepository.getInsuredIdFromNationalId(rnr);
            int modifiedCounter = 0;
            if (!isNullOrEmpty(insuredId)) {
                EntityManager entityManager = emf.createEntityManager();
                EntityTransaction transaction = entityManager.getTransaction();
                Query nativeQuery = entityManager.createNativeQuery(
                        "QUERY"
                );
                transaction.begin();
                // executeUpdate() returns the number of affected rows
                modifiedCounter = nativeQuery.executeUpdate();
                entityManager.flush();
                transaction.commit();
                entityManager.close();
            }
            System.out.println(modifiedCounter + " rows modified");
        });
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Try without an update-query:
@Repository
@Transactional(propagation = REQUIRES_NEW)
public class YirInfoRepository {
    @Autowired
    EntityManager entityManager;

    @Transactional(propagation = REQUIRES_NEW)
    public void setSent(String id) {
        // guessing your class name and method..
        final YirInfo yirInfo = entityManager.find(YirInfo.class, id);
        yirInfo.setSent();
    }
}
Might not be as fast as a single update query, but it's possible to get it reasonably fast, unless the amount of data is huge. This is the preferred way of using Hibernate/JPA, instead of thinking in terms of single values and SQL queries, you work with entities/objects and (sometimes) HQL/JPQL queries.
You are using the @Transactional annotation, which means you are using Spring's transaction management. Then, in your UPDATE 2, you are managing the transaction yourself instead of letting Spring manage it (I guess it's another project, or a class not managed by Spring).
In any case, what I would do is update your records in a single Spring transaction, and I would put @Transactional on the service layer rather than on the DAO layer. Something like this:
Service layer:
@Service
public class YirInfoService {
    @Autowired
    YirInfoRepository dao;

    @Transactional(propagation = REQUIRES_NEW)
    public void setSent(List<String> ids) {
        dao.setSents(ids);
    }
}
DAO layer:
@Repository
public class YirInfoRepository {
    @Autowired
    EntityManager entityManager;

    // Here you can update by using an IN clause or by looping over the ids
    // Let's suppose a bulk operation
    public void setSents(List<String> ids) {
        String query = "UPDATE_QUERY";
        for (int i = 0; i < ids.size(); i++) {
            String id = ids.get(i);
            Query nativeQuery = entityManager.createNativeQuery(String.format(query, id));
            nativeQuery.executeUpdate();
            if (i % 20 == 0) {
                entityManager.flush();
                entityManager.clear();
            }
        }
    }
}
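The IN-clause variant mentioned in the DAO comment could look roughly like the following. This is a minimal sketch in plain Java, not the poster's actual query: the class InClauseChunks, the table yir_info and the columns sent and id are all hypothetical names chosen for illustration. It splits the ids into chunks and builds one UPDATE per chunk with a "?" placeholder per id, so the values can be bound as parameters rather than formatted into the SQL string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class InClauseChunks {

    // Splits ids into consecutive sublists of at most chunkSize elements.
    public static List<List<String>> chunk(List<String> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, ids.size());
            chunks.add(new ArrayList<>(ids.subList(from, to)));
        }
        return chunks;
    }

    // Builds an UPDATE with one "?" placeholder per id.
    // Table and column names (yir_info, sent, id) are hypothetical.
    public static String updateSql(int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("idCount must be positive");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "UPDATE yir_info SET sent = 1 WHERE id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("a", "b", "c", "d", "e");
        for (List<String> part : chunk(ids, 2)) {
            // In the DAO, each id in 'part' would be bound to one placeholder.
            System.out.println(updateSql(part.size()) + " <- " + part);
        }
    }
}
```

With a chunk size in the tens or hundreds, this issues far fewer statements than one update per id, while keeping each statement at a bounded size.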
The first thing you have to understand is that in the first example you are using a native query to update rows in the DB. In this case you are bypassing Hibernate's entity management completely.
In your second example you have the same thing: you are updating via an update query. You don't need to flush the entity manager, as that is only necessary for transferring the pending changes made to entity objects managed by that entity manager.
Plus I don't know how your example works, as you are autowiring the entity manager and not using the @PersistenceContext annotation. Make sure you use this one properly, because you might have misconfigured the application. Also, there is no need to manually create the entity manager when using Spring, as the second example appears to do. Just use @PersistenceContext to get an entity manager in your app.
You are also mixing up transaction management. In the first example, it's enough to put the @Transactional annotation on either the method or the class.
For the other example, you are doing manual transaction management, which makes no sense in this case. If you are using Spring, you can simply rely on declarative transaction management.
The first thing I'd check here is to integrate datasource-proxy into your connection management and log how your statements are executed. With this info, you can tell whether the query reaches the DB and the DB is executing it very slowly, or whether you have a network issue between your app and the DB.
If you find out that the query is sent properly to the DB, you want to analyze your query, because most probably it's just executed very slowly and needs some optimization. For this, you can use the Explain plan feature to find out what your execution plan looks like and then make it faster.

Spring Boot - Change connection dynamically

I have a Spring Boot project with multiple databases for different years. These databases have the same tables, so the only difference is the year (..., DB2016, DB2017). In the controller of the application I need to return data that belong to "different" years. Moreover, in future years other databases will be created (e.g. in 2018 there's going to be a db named "DB2018"). So my problem is how to switch the connection among databases without creating a new datasource and a new repository every new year.
In another question I posted (Spring Boot - Same repository and same entity for different databases) the answer was to create different datasources and different repositories for every existing database, but in this case I want to return data from existing databases on the basis of the current year. More specifically:
SomeEntity.java
@Entity(name = "SOMETABLE")
public class SomeEntity implements Serializable {
    @Id
    @Column(name="ID", nullable=false)
    private Integer id;
    @Column(name="NAME")
    private String name;
}
SomeRepository.java
public interface SomeRepository extends PagingAndSortingRepository<SomeEntity, Integer> {
    @Query(nativeQuery= true, value = "SELECT * FROM SOMETABLE WHERE NAME = ?1")
    List<SomeEntity> findByName(String name);
}
SomeController.java
@RequestMapping(value="/foo/{name}", method=RequestMethod.GET)
public ResponseEntity<List<SomeEntity>> findByName(@PathVariable("name") String name) {
    List<SomeEntity> list = autowiredRepo.findByName(name);
    return new ResponseEntity<List<SomeEntity>>(list,HttpStatus.OK);
}
application.properties
spring.datasource.url=jdbc:postgresql://localhost:5432/DB
spring.datasource.username=xxx
spring.datasource.password=xxx
So if the current year is 2017 I want something like this:
int currentYear = Calendar.getInstance().get(Calendar.YEAR);
int oldestDbYear = 2014;
List<SomeEntity> listToReturn = new LinkedList<SomeEntity>();
// getProperties is a custom method to get properties from a file
String url = getProperties("application.properties", "spring.datasource.url");
Properties props = new Properties();
props.setProperty("user", getProperties("application.properties","spring.datasource.username"));
props.setProperty("password", getProperties("application.properties","spring.datasource.password"));
for (int year = currentYear; year > oldestDbYear; year--) {
    // this is the connection that must be used by the autowiredRepo repository, but I don't know how to do this,
    // so that the repository uses a different connection for every year.
    Connection conn = getConnection(url + year, props);
    List<SomeEntity> list_of_specific_year = autowiredRepo.findByName(name);
    conn.close();
    listToReturn.addAll(list_of_specific_year);
}
return listToReturn;
Hope everything is clear.
The thing that is probably most suitable to your needs here is Spring's AbstractRoutingDataSource. You do need to define multiple DataSources but you will only need a single repository. Multiple data sources are not an issue here, as there is always a way to create the DataSource beans programmatically at run time and register them with the application context.
How it works is you basically register a Map<Object, DataSource> inside your @Configuration class when creating your AbstractRoutingDataSource @Bean, and in this case the lookup key would be the year.
Then you need to create a class that extends AbstractRoutingDataSource and implements the determineCurrentLookupKey() method. Any time a database call is made, this method is called in the current context to look up which DataSource should be used. In your case it sounds like you simply want to have the year as a @PathVariable in the URL and then, in the implementation of determineCurrentLookupKey(), grab that @PathVariable out of the URL (e.g. in your controller you have mappings like @GetMapping("/{year}/foo/bar/baz")).
HttpServletRequest request = ((ServletRequestAttributes) RequestContextHolder
        .getRequestAttributes()).getRequest();
HashMap templateVariables =
        (HashMap) request.getAttribute(HandlerMapping.URI_TEMPLATE_VARIABLES_ATTRIBUTE);
return templateVariables.get("year");
I used this approach when writing a testing tool for a product where there were many instances running on multiple different servers and I wanted a unified programming model from my @Controllers but still wanted it to be hitting the right database for the server/deployment combination in the url. Worked like a charm.
The drawback if you are using Hibernate is that all connections will go through a single SessionFactory which will mean you can't take advantage of Hibernate's 2nd level caching as I understand it, but I guess that depends on your needs.
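The routing idea described in this answer can be illustrated without Spring. The following is a minimal sketch in plain Java, not the AbstractRoutingDataSource API itself: YearRouting is a hypothetical class, the Map<Integer, String> of JDBC URLs stands in for the Map<Object, DataSource>, and a ThreadLocal stands in for wherever the current year comes from (for example the request). Throwing on an unknown year is a choice made for this sketch only.

```java
import java.util.HashMap;
import java.util.Map;

class YearRouting {

    // Holds the lookup key (the year) for the current thread, e.g. set once per request.
    private static final ThreadLocal<Integer> CURRENT_YEAR = new ThreadLocal<>();

    // Stand-in for Map<Object, DataSource>: year -> JDBC URL (hypothetical values).
    private final Map<Integer, String> targets = new HashMap<>();
    private final int defaultYear;

    public YearRouting(int oldestYear, int newestYear, String baseUrl) {
        for (int year = oldestYear; year <= newestYear; year++) {
            targets.put(year, baseUrl + year);
        }
        this.defaultYear = newestYear;
    }

    public static void setYear(Integer year) {
        CURRENT_YEAR.set(year);
    }

    public static void clear() {
        CURRENT_YEAR.remove();
    }

    // Plays the role of determineCurrentLookupKey(): which key applies right now?
    public Integer determineCurrentLookupKey() {
        Integer year = CURRENT_YEAR.get();
        return year != null ? year : defaultYear;
    }

    // Plays the role of the routing step: resolve the current key to a target.
    public String determineTarget() {
        Integer key = determineCurrentLookupKey();
        String target = targets.get(key);
        if (target == null) {
            throw new IllegalStateException("No database registered for year " + key);
        }
        return target;
    }

    public static void main(String[] args) {
        YearRouting routing = new YearRouting(2014, 2017, "jdbc:postgresql://localhost:5432/DB");
        YearRouting.setYear(2016);
        try {
            System.out.println(routing.determineTarget());
        } finally {
            YearRouting.clear(); // do not leak the key to the next task on this thread
        }
        System.out.println(routing.determineTarget());
    }
}
```

The single repository never needs to know about the year: it always talks to the one routing object, and only the lookup key changes from call to call.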

Too many connections spring boot jdbc

I know this may look like a duplicate question, but I feel my case is a bit different.
I have JdbcDAO classes like
@Component
public class JdbcUserDAO implements UserDAO {
    @Autowired
    MyJdbc myJdbc;
}
I have defined the MyJdbc class as follows :
@Component
public class MyJdbc {
    @Autowired
    protected JdbcTemplate jdbc;
}
In the MyJdbc class I define the insert and batch update methods and call them through the jdbc variable.
Could this cause a "Too many connections" exception?
I have defined the jdbc parameters in application.properties file :
spring.datasource.url=#databaseurl
spring.datasource.username=#username
spring.datasource.password=#password
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.test-on-borrow=true
spring.datasource.max-active=100
spring.datasource.max-wait=10000
spring.datasource.min-idle=10
spring.datasource.validation-query=SELECT 1
spring.datasource.time-between-eviction-runs-millis= 5000
spring.datasource.min-evictable-idle-time-millis=30000
spring.datasource.test-while-idle=true
spring.datasource.test-on-borrow=true
spring.datasource.test-on-return=false
I am getting the exception :
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Too many connections
I have tried many different values for the settings in application.properties, but nothing worked. My db is hosted on AWS RDS.
But for updating the blob image values I do :
blob= myJdbc.jdbc.getDataSource().getConnection().createBlob();
blob.setBytes(1, str.getBytes());
pstmt = myJdbc.jdbc.getDataSource().getConnection().prepareStatement("update user_profile set profileImage=? where user_profile.id in ( select id from user_login where email=?)");
The problem is with your code. That code opens 2 additional connections to the database without closing them. If you open connections yourself, you should also close them. However, it is better to use a ConnectionCallback in those cases.
myJdbc.jdbc.execute(new ConnectionCallback<Object>() {
    @Override
    public Object doInConnection(Connection con) throws SQLException, DataAccessException {
        // con is obtained and released by JdbcTemplate; do not close it here
        Blob blob = con.createBlob();
        blob.setBytes(1, str.getBytes());
        PreparedStatement pstmt = con.prepareStatement("update user_profile set profileImage=? where user_profile.id in ( select id from user_login where email=?)");
        // bind the parameters and execute pstmt here
        return null;
    }
});
However, it is even easier to use Spring JDBC's BLOB support (see the reference guide). That way you don't need to mess around with connections and blobs yourself.
final String query = "update user_profile set profileImage=? where user_profile.id in ( select id from user_login where email=?)";
// lobHandler is a LobHandler instance, for example a DefaultLobHandler
myJdbc.jdbc.execute(query, new AbstractLobCreatingPreparedStatementCallback(lobHandler) {
    @Override
    protected void setValues(PreparedStatement ps, LobCreator lobCreator) throws SQLException {
        byte[] bytes = str.getBytes();
        lobCreator.setBlobAsBytes(ps, 1, bytes);
        ps.setString(2, email);
    }
});
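To see why the snippet in the question exhausts the connection limit, here is a minimal sketch in plain Java. It uses no real database: FakeConnection and getConnection() are hypothetical stand-ins for java.sql.Connection and DataSource.getConnection(), and the counter only tracks how many connections were handed out and never closed.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionLeakDemo {

    // Number of connections handed out and not yet closed.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for java.sql.Connection; it only tracks open/close.
    static class FakeConnection implements AutoCloseable {
        FakeConnection() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Hypothetical stand-in for DataSource.getConnection(): a new connection per call.
    public static FakeConnection getConnection() {
        return new FakeConnection();
    }

    // Mirrors the question: two getConnection() calls, neither result is closed.
    // Returns how many connections this call left open.
    public static int leaky() {
        int before = OPEN.get();
        getConnection(); // used for createBlob() in the question
        getConnection(); // used for prepareStatement() in the question
        return OPEN.get() - before;
    }

    // One connection reused for both steps and closed by try-with-resources.
    // Returns how many connections this call left open.
    public static int safe() {
        int before = OPEN.get();
        try (FakeConnection con = getConnection()) {
            // createBlob() and prepareStatement() would both use con here
        }
        return OPEN.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("left open by leaky(): " + leaky());
        System.out.println("left open by safe(): " + safe());
    }
}
```

Every call to leaky() leaves two more connections open, so repeated profile image updates eventually hit the server's connection limit, whereas safe() always returns its connection. The ConnectionCallback and LobCreator approaches above achieve the same effect by letting JdbcTemplate own the connection.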

Native SQL from Spring / Hibernate without entity mapping?

I need to write some temporary code in my existing Spring Boot 1.2.5 application that will run some complex SQL queries. By complex, I mean a single query spans about 4 different tables, and I have a number of these. We decided to reuse existing SQL to reduce the risk of getting new queries wrong, which in this case is a good way to go.
My application uses JPA / Hibernate and maps some entities to tables. From my research it seems like I would have to do a lot of entity mapping.
I tried writing a class that would just get the Hibernate session object and execute a native query but when it tried to configure the session factory it threw an exception complaining it could not find the config file.
Could I perhaps do this from one of my existing entities, or at least find a way to get the Hibernate session that already exists?
UPDATE:
Here is the exception, which makes perfect sense since there is no config file to find. The app is configured in the properties file.
org.hibernate.HibernateException: /hibernate.cfg.xml not found
at org.hibernate.internal.util.ConfigHelper.getResourceAsStream(ConfigHelper.java:173)
For what it's worth, the code:
@NamedNativeQuery(name = "verifyEa", query = "select account_nm from per_person where account_nm = :accountName")
public class VerifyEaResult
{
    private SessionFactory sessionFact = null;
    String accountName;

    private void initSessionFactory()
    {
        Configuration config = new Configuration().configure();
        ServiceRegistry serviceRegistry = new ServiceRegistryBuilder().applySettings(config.getProperties()).getBootstrapServiceRegistry();
        sessionFact = config.buildSessionFactory(serviceRegistry);
    }

    public String getAccountName()
    {
        // Quick simple test query
        String sql = "SELECT * FROM PER_ACCOUNT WHERE ACCOUNT_NM = 'lynnedelete'";
        initSessionFactory();
        Session session = sessionFact.getCurrentSession();
        SQLQuery q = session.createSQLQuery(sql);
        List<Object> result = q.list();
        return accountName;
    }
}
You can use Spring's JDBC data access support (JdbcTemplate), for example:
public class Client {
    private final JdbcTemplate jdbcTemplate;
    // Quick simple test query
    final static String SQL = "SELECT * FROM PER_ACCOUNT WHERE ACCOUNT_NM = ?";

    @Autowired
    public Client(DataSource dataSource) {
        jdbcTemplate = new JdbcTemplate(dataSource);
    }

    public List<Map<String, Object>> getData(String name) {
        return jdbcTemplate.queryForList(SQL, name);
    }
}
The short way, for a query without parameters, is:
jdbcTemplate.queryForList("SELECT 1");
