Why does Ehcache double the size of Strings it stores in memory? - ehcache

While trying to cache some objects of a few MB, I observed that Ehcache doubles their size while keeping them cached.
Why does this happen? Is it an optimization? Can it be disabled?
The following code:
public class Main {
public static void main(String[] args) {
CacheManager manager = CacheManager.newInstance();
Cache oneCache = manager.getCache("OneCache");
String oneMbString = generateDummyString(1024 * 1024);
Element bigElement = new Element("key", oneMbString);
oneCache.put(bigElement);
System.out.println("size: "+ oneCache.getSize());
System.out.println("inMemorySize: " + oneCache.calculateInMemorySize());
System.out.println("size of string: " + oneMbString.getBytes().length);
}
/**
* Generate a dummy string
*
* @param size the size of the string in bytes.
* @return
*/
private static String generateDummyString(int size) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < size; i++) {
sb.append("a");
}
return sb.toString();
}
}
Will output:
size: 1
inMemorySize: 2097384
size of string: 1048576
PS: The ehcache.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<ehcache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="ehcache.xsd"
updateCheck="false" monitoring="autodetect" maxBytesLocalHeap="512M">
<cache name="OneCache"
eternal="false"
overflowToDisk="false"
diskPersistent="false"
memoryStoreEvictionPolicy="LFU">
<sizeOfPolicy maxDepth="10000" maxDepthExceededBehavior="abort"/>
</cache>
</ehcache>

Strings in Java use 2-byte (UTF-16) characters internally, so Ehcache is not doubling the size: the roughly 2 MB it reports is the actual heap footprint of the character data (plus some object overhead). When you call getBytes(), you get the encoded bytes (in this case the platform default encoding, where 'a' takes a single byte). This is why you see the difference.
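To make the arithmetic concrete, here is a minimal standalone sketch (no Ehcache involved, so treat it as an illustration of the sizing only) comparing the two measurements; the ~2,097,384 bytes reported by calculateInMemorySize() is roughly length() * 2 for the character data plus object and Element overhead:
import java.nio.charset.StandardCharsets;
public class StringSizeDemo {
    public static void main(String[] args) {
        // 1 MiB of 'a' characters, built the same way as in the question
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1024 * 1024; i++) {
            sb.append('a');
        }
        String oneMbString = sb.toString();
        // Encoded size: 'a' is a single byte in UTF-8
        int encodedBytes = oneMbString.getBytes(StandardCharsets.UTF_8).length;
        // Approximate heap size of the character data: UTF-16 uses 2 bytes per char
        // (on Java 8 and earlier; Java 9+ compact strings may store Latin-1 text with 1 byte per char)
        long inHeapCharBytes = (long) oneMbString.length() * 2;
        System.out.println("encoded bytes:      " + encodedBytes);    // 1048576
        System.out.println("in-heap char bytes: " + inHeapCharBytes); // 2097152
    }
}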

Related

How to increase file size upload limit in spring boot using embedded tomcat

I am trying to upload a file using my Spring Boot API. The function works fine when I use a small file (less than 1 MB), but when I upload a large file it gives me an exception. I am using the embedded Tomcat server.
Maximum upload size exceeded;
nested exception is java.lang.IllegalStateException: org.apache.tomcat.util.http.fileupload.impl.FileSizeLimitExceededException: The field file exceeds its maximum permitted size of 1048576 bytes.
I have tried the following settings in my files, but every time I am getting the error:
1. application.properties
server.tomcat.max-swallow-size=100MB
server.tomcat.max-http-post-size=100MB
spring.servlet.multipart.enabled=true
spring.servlet.multipart.fileSizeThreshold=100MB
spring.servlet.multipart.max-file-size=100MB
spring.servlet.multipart.max-request-size=100MB
I have also tried
spring.servlet.multipart.maxFileSize=100MB
spring.servlet.multipart.maxRequestSize=100MB
2. Below is my file upload code:
public RestDTO uploadFile(MultipartFile file, String subPath) {
if (file.isEmpty()) {
return new RestFailure("Failed to store empty file");
}
try {
String fileName = new Date().getTime() + "_" + file.getOriginalFilename();
String filePath = uploadPath + subPath + fileName;
if (Objects.equals(file.getOriginalFilename(), "blob")) {
filePath += ".png";
fileName += ".png";
}
File uploadDir = new File(uploadPath + subPath);
if (!uploadDir.exists()) {
uploadDir.mkdirs();
}
FileOutputStream output = new FileOutputStream(filePath);
output.write(file.getBytes());
LOGGER.info("File path : " + filePath);
MediaInfoDTO mediaInfoDTO = getThumbnailFromVideo(subPath, fileName);
String convertedFileName = convertVideoToMP4(subPath, fileName);
System.out.println("---------------->" + convertedFileName);
return new RestData<>(new MediaDetailDTO(mediaInfoDTO.getMediaPath(), convertedFileName,
mediaInfoDTO.getMediaType(), mediaInfoDTO.getMediaCodec(), mediaInfoDTO.getWidth(),
mediaInfoDTO.getHeight(), mediaInfoDTO.getDuration()));
} catch (IOException e) {
LOGGER.info("Can't upload file: " + e.getMessage());
return new RestFailure("Failed to store empty file");
}
}
but every time I got the same exception.
Apart from the comments, might I suggest creating a @Bean for a MultipartConfigElement factory.
This should basically override any other restrictions you might have on the Tomcat side.
@Bean
public MultipartConfigElement multipartConfigElement() {
MultipartConfigFactory factory = new MultipartConfigFactory();
factory.setMaxFileSize(DataSize.ofBytes(100000000L));
factory.setMaxRequestSize(DataSize.ofBytes(100000000L));
return factory.createMultipartConfig();
}
Here, DataSize is of type org.springframework.util.unit.DataSize.
Reference: https://github.com/spring-projects/spring-boot/issues/11284
Another issue I suspect could be Tomcat's maxSwallowSize; see point #5 in the Baeldung article below if the above does not work.
https://www.baeldung.com/spring-maxuploadsizeexceeded
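If the properties do not take effect, the swallow size can also be set programmatically; the following is only a sketch (the -1 value, meaning unlimited, and the class name are my own choices), not code from the article:
import org.apache.coyote.http11.AbstractHttp11Protocol;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class TomcatSwallowSizeConfig {
    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> tomcatCustomizer() {
        return factory -> factory.addConnectorCustomizers(connector -> {
            // -1 disables the swallow limit so large rejected uploads do not abort the connection
            if (connector.getProtocolHandler() instanceof AbstractHttp11Protocol) {
                ((AbstractHttp11Protocol<?>) connector.getProtocolHandler()).setMaxSwallowSize(-1);
            }
        });
    }
}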
After reviewing many examples and running several tests with no results, I managed to solve the problem with the following configuration:
Add the following dependencies to the pom:
<dependency>
<groupId>commons-fileupload</groupId>
<artifactId>commons-fileupload</artifactId>
<version>1.4</version>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.6</version>
</dependency>
Remove from yml:
spring:
servlet:
multipart:
enabled: true
file-size-threshold: 2KB
max-file-size: 10MB
max-request-size: 10MB
Add to yml:
server:
tomcat:
max-swallow-size: -1
max-http-form-post-size: -1
And last but not least:
@Bean
public MultipartResolver multipartResolver() {
CommonsMultipartResolver resolver
= new CommonsMultipartResolver();
resolver.setDefaultEncoding(StandardCharsets.UTF_8.displayName());
resolver.setMaxUploadSize(52428800L); //50MB
resolver.setMaxUploadSizePerFile(52428800L); //50MB
return resolver;
}
@ExceptionHandler(MaxUploadSizeExceededException.class)
public ResponseEntity<Object> handleFileUploadError(MaxUploadSizeExceededException ex) {
return ResponseEntity.status(EXPECTATION_FAILED).body(
CustomResponse.builder()
.status(Status.ERROR)
.message(ex.getMessage())
.build());
}
// Where CustomResponse class is in my case:
/**
* The UploadResponse class
* <p>
* Contain the response body
*/
@Getter
@Builder(toBuilder = true)
@AllArgsConstructor
@JsonInclude(JsonInclude.Include.NON_NULL)
public class CustomResponse {
/**
* The status
*/
private final Status status;
/**
* The message
*/
private final String message;
/**
* The errors
*/
private final Set<String> errors;
}

Ehcache jsr107:defaults not applying to programmatically created caches

Based on my findings in my previous SO question, I'm trying to set up JCaches with a mix of declarative and imperative configuration, to limit the max size of the caches declaratively.
I keep a list of the caches and the duration (TTL) for their entries in my application.yaml, which I get with a property loader. I then create my caches with the code below:
@Bean
public List<javax.cache.Cache<Object, Object>> getCaches() {
javax.cache.CacheManager cacheManager = this.getCacheManager();
List<Cache<Object, Object>> caches = new ArrayList();
Map<String, String> cacheconfigs = //I populate this with a list of cache names and durations;
Set<String> keySet = cacheconfigs.keySet();
Iterator i$ = keySet.iterator();
while(i$.hasNext()) {
String key = (String)i$.next();
String durationMinutes = (String)cacheconfigs.get(key);
caches.add((new GenericDefaultCacheConfigurator.GenericDefaultCacheConfig(key, new Duration(TimeUnit.MINUTES, Long.valueOf(durationMinutes)))).getCache(cacheManager));
}
return caches;
}
@Bean
public CacheManager getCacheManager() {
return Caching.getCachingProvider().getCacheManager();
}
private class GenericDefaultCacheConfig {
private final String CACHE_ID;
private final Duration DURATION;
private final Factory EXPIRY_POLICY;
public GenericDefaultCacheConfig(String cacheName, Duration duration) {
// assumed reconstruction: delegate to the full constructor with a TTL-based expiry policy
this(cacheName, duration, TouchedExpiryPolicy.factoryOf(duration));
}
public GenericDefaultCacheConfig(String id, Duration duration, Factory expiryPolicyFactory) {
CACHE_ID = id;
DURATION = duration;
EXPIRY_POLICY = expiryPolicyFactory;
}
private MutableConfiguration<Object, Object> getCacheConfiguration() {
return new MutableConfiguration<Object, Object>()
.setTypes(Object.class, Object.class)
.setStoreByValue(true)
.setExpiryPolicyFactory(EXPIRY_POLICY);
}
public Cache<Object, Object> getCache(CacheManager cacheManager) {
CacheManager cm = cacheManager;
Cache<Object, Object> cache = cm.getCache(CACHE_ID, Object.class, Object.class);
if (cache == null)
cache = cm.createCache(CACHE_ID, getCacheConfiguration());
return cache;
}
}
I try limiting the cache size with the following ehcache.xml:
<config
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xmlns='http://www.ehcache.org/v3'
xmlns:jsr107='http://www.ehcache.org/v3/jsr107'
xsi:schemaLocation="
http://www.ehcache.org/v3 http://www.ehcache.org/schema/ehcache-core-3.0.xsd
http://www.ehcache.org/v3/jsr107 http://www.ehcache.org/schema/ehcache-107-ext-3.0.xsd">
<service>
<jsr107:defaults default-template="heap-cache" enable-management="true" enable-statistics="true">
</jsr107:defaults>
</service>
<cache-template name="heap-cache">
<resources>
<heap unit="entries">20</heap>
</resources>
</cache-template> </config>
I set the following declaration in my application.yaml:
spring:
cache:
jcache:
config: classpath:ehcache.xml
However, my caches don't honor the imposed limit. I validate with the following test:
@Test
public void testGetCacheMaxSize() {
Cache<Object, Object> cache = getCache(MY_CACHE); //I get a cache of type Eh107Cache[myCache]
CacheRuntimeConfiguration<Object, Object> ehcacheConfig = (CacheRuntimeConfiguration<Object, Object>)cache.getConfiguration(
Eh107Configuration.class).unwrap(CacheRuntimeConfiguration.class);
long size = ehcacheConfig.getResourcePools().getPoolForResource(ResourceType.Core.HEAP).getSize(); //Returns 9223372036854775807 instead of the expected 20
for(int i=0; i<30; i++)
commonDataService.getAllStates("ENTRY_"+i);
Map<Object, Object> cachedElements = cacheManagerService.getCachedElements(MY_CACHE);
assertTrue(cachedElements.size() == 20); //size() returns 30
}
Can somebody point out what I am doing wrong? Thanks in advance.
The issue comes from getting the cache manager as:
Caching.getCachingProvider().getCacheManager();
By setting the config file URI when initializing the cache manager, I got it to work:
cachingProvider = Caching.getCachingProvider();
configFileURI = resourceLoader.getResource(configFilePath).getURI();
cacheManager = cachingProvider.getCacheManager(configFileURI, cachingProvider.getDefaultClassLoader());
I was under the impression that Spring Boot would automatically create the cache manager based on the configuration file given in the spring.cache.jcache.config property,
but that was not the case, because I was getting the cache manager as described above instead of simply autowiring it and letting Spring create it.
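For reference, a minimal sketch of wiring it up that way as a Spring bean; the classpath:ehcache.xml location matches the spring.cache.jcache.config value above, everything else is illustrative:
import java.io.IOException;
import java.net.URI;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.spi.CachingProvider;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ResourceLoader;
@Configuration
public class JCacheManagerConfig {
    @Bean
    public CacheManager jCacheManager(ResourceLoader resourceLoader) throws IOException {
        CachingProvider provider = Caching.getCachingProvider();
        // Point the provider at the same ehcache.xml referenced by spring.cache.jcache.config
        // so that the jsr107:defaults template (heap-cache) is applied to created caches.
        URI configUri = resourceLoader.getResource("classpath:ehcache.xml").getURI();
        return provider.getCacheManager(configUri, provider.getDefaultClassLoader());
    }
}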

ClassNotFound in HazelcastMembers for ReplicatedMaps, but ok for Maps

Our Problem might be similar to that one:
Hazelcast ClassNotFoundException for replicated maps
Since the description of the environment is not given in detail there, I describe our problematic environment here:
We have a dedicated Hazelcast server (member), out of the box with some config. No additional classes added (i.e. none of the ones from our project).
Then we have two Hazelcast clients using this member, with several of our own classes.
The clients intend to use replicated maps, so at some point in our software they call hazelcastInstance.getReplicatedMap("MyName") and then do some put operations.
Doing this, the dedicated Hazelcast server throws a ClassNotFoundException for the classes we want to put into the replicated map. I understand this; how should it know about those classes?
Then I change to a Map instead of a ReplicatedMap:
"hazelcastInstance.getMap("MyName")"
With no other change, it works. And this is what makes me wonder: how can that be?
Does this have to do with a different in-memory storage format? Does ReplicatedMap behave differently here?
Hazelcast Version is: 3.9.2
One piece of information might be important: the client configures a near cache for all the maps used:
EvictionConfig evictionConfig = new EvictionConfig()
.setMaximumSizePolicy(EvictionConfig.MaxSizePolicy.ENTRY_COUNT)
.setSize(eapCacheId.getMaxAmountOfValues());
NearCacheConfig nearCacheConfig = new NearCacheConfig()
.setName(eapCacheId.buildFullName())
.setInMemoryFormat(InMemoryFormat.OBJECT)
.setInvalidateOnChange(true)
.setEvictionConfig(evictionConfig);
I changed the InMemoryFormat to BINARY. Still the same ClassNotFoundException.
The start of the stack trace is:
at com.hazelcast.internal.serialization.impl.JavaDefaultSerializers$JavaSerializer.read(JavaDefaultSerializers.java:224)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:48)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.toObject(AbstractSerializationService.java:185)
at com.hazelcast.spi.impl.NodeEngineImpl.toObject(NodeEngineImpl.java:339)
EDIT: I wrote a little test to demonstrate my problem:
package de.empic.hazelclient.client;
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.config.EvictionConfig;
import com.hazelcast.config.InMemoryFormat;
import com.hazelcast.config.NearCacheConfig;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;
import java.util.Random;
public class HazelClient {
private static final String[] MAP_KEYS = {"Mike", "Ben", "Luis", "Adria", "Lena"};
private static final String MAP_NAME = "Regular Map" ;
private static final String REPLICATED_MAP_NAME = "Replicated Map" ;
private static final String CACHE_MEMBERS = "192.168.56.101:5701" ;
private static final String MNGT_CENTER = "192.168.56.101:5701" ;
HazelcastInstance hazelClientInstance = null ;
private static Random rand = new Random(System.currentTimeMillis());
public static void main(String[] args) {
new HazelClient(true).loop();
}
private HazelClient(boolean useNearCache)
{
ClientConfig cfg = prepareClientConfig(useNearCache) ;
hazelClientInstance = HazelcastClient.newHazelcastClient(cfg);
}
private void loop()
{
Map<String, SampleSerializable> testMap = hazelClientInstance.getMap(MAP_NAME);
Map<String, SampleSerializable> testReplicatedMap = hazelClientInstance.getReplicatedMap(REPLICATED_MAP_NAME);
int count = 0 ;
while ( true )
{
// do a random write to map
testMap.put(MAP_KEYS[rand.nextInt(MAP_KEYS.length)], new SampleSerializable());
// do a random write to replicated map
testReplicatedMap.put(MAP_KEYS[rand.nextInt(MAP_KEYS.length)], new SampleSerializable());
if ( ++count == 10)
{
// after a while we print the map contents
System.out.println("MAP Content -------------------------");
printMapContent(testMap) ;
System.out.println("REPLIACTED MAP Content --------------");
printMapContent(testReplicatedMap) ;
count = 0 ;
}
// we do not want to drown in system outs....
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
private void printMapContent(Map<String, SampleSerializable> map)
{
for ( String currKey : map.keySet())
{
System.out.println(String.format(" - %s -> %s", currKey, map.get(currKey)));
}
}
private ClientConfig prepareClientConfig(boolean useNearCache)
{
ClientConfig cfg = new ClientConfig();
cfg.setInstanceName("SampleInstance");
cfg.getProperties().put("hazelcast.client.statistics.enabled", "true");
cfg.getProperties().put("hazelcast.client.statistics.period.seconds", "5");
if ( useNearCache )
{
cfg.addNearCacheConfig(defineNearCache(MAP_NAME));
cfg.addNearCacheConfig(defineNearCache(REPLICATED_MAP_NAME));
}
// we use a single member for demo
String[] members = {CACHE_MEMBERS} ;
cfg.getNetworkConfig().addAddress(members);
return cfg ;
}
private NearCacheConfig defineNearCache(String name)
{
EvictionConfig evictionConfig = new EvictionConfig()
.setMaximumSizePolicy(EvictionConfig.MaxSizePolicy.ENTRY_COUNT)
.setSize(Integer.MAX_VALUE);
NearCacheConfig nearCacheConfig = new NearCacheConfig()
.setName(name)
.setInMemoryFormat(InMemoryFormat.OBJECT)
.setInvalidateOnChange(true)
.setEvictionConfig(evictionConfig) ;
return nearCacheConfig;
}
}
To have the full info, the Hazelcast member is started with the following xml:
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config https://hazelcast.com/schema/config/hazelcast-config-3.8.xsd"
xmlns="http://www.hazelcast.com/schema/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<instance-name>server-cache</instance-name>
<network>
<port>5701</port>
<join>
<multicast enabled="false"/>
<tcp-ip enabled="true">
<members>192.168.56.101:5701</members>
</tcp-ip>
</join>
<public-address>192.168.56.101:5701</public-address>
</network>
<management-center enabled="true">http://192.168.56.101:6679/mancenter</management-center>
</hazelcast>
I don't think it matters that the Hazelcast member is running in Docker while the clients are not.
Can you post your configurations for both mapConfig and replicatedMapConfig? I will try to reproduce this.
I'm thinking this has to do with where the serialization happens. A couple of things to keep in mind: there are two different configurations, one for map and one for replicatedmap. When you changed your getReplicatedMap("MyName") to getMap("MyName"), if you don't have a map config for "MyName", then it will use the default config.
By default, replicated maps store entries in OBJECT in-memory format for performance.
I found my mistake. I had configured the near cache to be of memory type "BINARY",
but the server itself I did not configure. After defining the replicated maps as "BINARY" on the server, it works.
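For anyone hitting the same thing, one way to pin the format on the member is a replicatedmap entry in the hazelcast.xml shown above; this is a sketch against the 3.x schema, using the map name from the test client:
<replicatedmap name="Replicated Map">
    <!-- keep entries in serialized form so the member never needs the clients' classes -->
    <in-memory-format>BINARY</in-memory-format>
</replicatedmap>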

How to increase the performance of inserting data into the database?

I use PostgreSQL 9.5 (with the newest JDBC driver, 9.4.1209), JPA 2.1 (Hibernate), EJB 3.2, CDI, JSF 2.2 and WildFly 10. I have to insert a lot of data into the database (about 1 million to 170 million entities). The number of entities depends on the file the user adds to the form on the page.
What is the problem?
The problem is the execution time of inserting data into the database, which is very slow. The execution time grows with every call to the flush() method. I added a println(...) call to see how fast flush executes. For the first ~4 calls (400,000 entities), I get the println(...) output every ~20 s. After that, the execution time of the flush method becomes incredibly slow and keeps growing.
Of course, if I remove the flush() and clear() calls, I get the println(...) output every 1 s, BUT when I approach 3 million entities I also get the exception:
java.lang.OutOfMemoryError: GC overhead limit exceeded
What have I done so far?
I've tried both Container-Managed Transactions and Bean-Managed Transactions (see the code below).
I don't use the auto_increment feature for the PK ID; I assign the IDs manually in the bean code.
I've also tried changing the number of entities per flush (at the moment 100,000).
I've tried setting that number equal to the hibernate.jdbc.batch_size property. It didn't help; the execution time was much slower.
I've experimented with the properties in the persistence.xml file. For example, I added the reWriteBatchedInserts property, though I don't know whether it helps.
PostgreSQL is running on an SSD, but the data is stored on an HDD because it could get too big in the future. I've also tried moving my PostgreSQL data to the SSD, and the result is the same; nothing changed.
The question is: how can I increase the performance of inserting data into the database?
Here's the structure of my table:
column_name | udt_name | length | is_nullable | key
---------------+-------------+--------+-------------+--------
id | int8 | | NO | PK
id_user_table | int4 | | NO | FK
starttime | timestamptz | | NO |
time | float8 | | NO |
sip | varchar | 100 | NO |
dip | varchar | 100 | NO |
sport | int4 | | YES |
dport | int4 | | YES |
proto | varchar | 50 | NO |
totbytes | int8 | | YES |
info | text | | YES |
label | varchar | 10 | NO |
Here's part of the EJB bean (first version) where I insert the data into the database:
@Stateless
public class DataDaoImpl extends GenericDaoImpl<Data> implements DataDao {
/**
* This is the first method that is executed.
* The CDI bean (controller) calls this method.
* @param list - data from the file.
* @param idFK - foreign key.
*/
public void send(List<String> list, int idFK) {
if(handleCSV(list,idFK)){
//...
}
else{
//...
}
}
/**
* The method inserts data into the database.
*/
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
private boolean handleCSV(List<String> list, int idFK){
try{
long start=0;
Pattern patternRow=Pattern.compile(",");
for (String s : list) {
if(start!=0){
String[] data=patternRow.split(s);
//Preparing data...
DataStoreAll dataStore=new DataStoreAll();
DataStoreAllId dataId=new DataStoreAllId(start++, idFK);
dataStore.setId(dataId);
//Setting the other object fields...
entityManager.persist(dataStore);
if(start%100000==0){
System.out.println("Number of entities: "+start);
entityManager.flush();
entityManager.clear();
}
}
else start++;
}
} catch(Throwable t){
CustomExceptionHandler exception=new CustomExceptionHandler(t);
return exception.persist("DDI", "handleCSV");
}
return true;
}
@Inject
private EntityManager entityManager;
}
Instead of Container-Managed Transactions, I've also tried Bean-Managed Transactions (second version):
@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
public class DataDaoImpl extends GenericDaoImpl<Data> {
/**
* This is the first method that is executed.
* The CDI bean (controller) calls this method.
* @param list - data from the file.
* @param idFK - foreign key.
*/
public void send(List<String> list, int idFK) {
if(handleCSV(list,idFK)){
//...
}
else{
//...
}
}
/**
* The method inserts data into the linkedList collection.
*/
private boolean handleCSV(List<String> list, int idFK){
try{
long start=0;
Pattern patternRow=Pattern.compile(",");
List<DataStoreAll> entitiesAll=new LinkedList<>();
for (String s : list) {
if(start!=0){
String[] data=patternRow.split(s);
//Preparing data...
DataStoreAll dataStore=new DataStoreAll();
DataStoreAllId dataId=new DataStoreAllId(start++, idFK);
dataStore.setId(dataId);
//Setting the other object fields...
entitiesAll.add(dataStore);
if(start%100000==0){
System.out.println("Number of entities: "+start);
saveDataStoreAll(entitiesAll);
}
}
else start++;
}
} catch(Throwable t){
CustomExceptionHandler exception=new CustomExceptionHandler(t);
return exception.persist("DDI", "handleCSV");
}
return true;
}
/**
* The method commits the transaction.
*/
private void saveDataStoreAll(List<DataStoreAll> entities) throws EntityExistsException,IllegalArgumentException,TransactionRequiredException,PersistenceException,Throwable {
Iterator<DataStoreAll> iter=entities.iterator();
ut.begin();
while(iter.hasNext()){
entityManager.persist(iter.next());
iter.remove();
entityManager.flush();
entityManager.clear();
}
ut.commit();
}
@Inject
private EntityManager entityManager;
@Inject
private UserTransaction ut;
}
Here's my persistence.xml:
<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1"
xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://xmlns.jcp.org/xml/ns/persistence
http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
<persistence-unit name="primary">
<jta-data-source>java:/PostgresDS</jta-data-source>
<properties>
<property name="hibernate.show_sql" value="false" />
<property name="hibernate.jdbc.batch_size" value="50" />
<property name="hibernate.order_inserts" value="true" />
<property name="hibernate.order_updates" value="true" />
<property name="hibernate.jdbc.batch_versioned_data" value="true"/>
<property name="reWriteBatchedInserts" value="true"/>
</properties>
</persistence-unit>
</persistence>
If I forgot to add something, tell me about it and I'll update my post.
Update
Here's the controller which calls DataDaoImpl#send(...):
@Named
@ViewScoped
public class DataController implements Serializable {
@PostConstruct
private void init(){
//...
}
/**
* Handle of the uploaded file.
*/
public void handleFileUpload(FileUploadEvent event){
uploadFile=event.getFile();
try(InputStream input = uploadFile.getInputstream()){
Path folder=Paths.get(System.getProperty("jboss.server.data.dir"),"upload");
if(!folder.toFile().exists()){
if(!folder.toFile().mkdirs()){
folder=Paths.get(System.getProperty("jboss.server.data.dir"));
}
}
String filename = FilenameUtils.getBaseName(uploadFile.getFileName());
String extension = FilenameUtils.getExtension(uploadFile.getFileName());
filePath = Files.createTempFile(folder, filename + "-", "." + extension);
//Save the file on the server.
Files.copy(input, filePath, StandardCopyOption.REPLACE_EXISTING);
//Add reference to the unconfirmed uploaded files list.
userFileManager.addUnconfirmedUploadedFile(filePath.toFile());
FacesContext.getCurrentInstance().addMessage(null, new FacesMessage(FacesMessage.SEVERITY_INFO, "Success", uploadFile.getFileName() + " was uploaded."));
} catch (IOException e) {
//...
}
}
/**
* Sending data from file to the database.
*/
public void send(){
//int idFK=...
//The model includes the data from the file and other things which I transfer to the EJB bean.
AddDataModel addDataModel=new AddDataModel();
//Setting the addDataModel fields...
try{
if(uploadFile!=null){
//Each row of the file == 1 entity.
List<String> list=new ArrayList<String>();
Stream<String> stream=Files.lines(filePath);
list=stream.collect(Collectors.toList());
addDataModel.setList(list);
}
} catch (IOException e) {
//...
}
//Sending data to the DataDaoImpl EJB bean.
if(dataDao.send(addDataModel,idFK)){
userFileManager.confirmUploadedFile(filePath.toFile());
FacesContext.getCurrentInstance().addMessage(null, new FacesMessage(FacesMessage.SEVERITY_INFO, "The data was saved in the database.", ""));
}
}
private static final long serialVersionUID = -7202741739427929050L;
@Inject
private DataDao dataDao;
private UserFileManager userFileManager;
private UploadedFile uploadFile;
private Path filePath;
}
Update 2
Here's the updated EJB bean where I insert the data into the database:
@Stateless
@TransactionManagement(TransactionManagementType.BEAN)
public class DataDaoImpl extends GenericDaoImpl<Data> {
/**
* This is the first method that is executed.
* The CDI bean (controller) calls this method.
* @param addDataModel - object which includes the path to the uploaded file and other things which are needed.
*/
public void send(AddDataModel addDataModel){
if(handleCSV(addDataModel)){
//...
}
else{
//...
}
}
/**
* The method inserts data into the database.
*/
private boolean handleCSV(AddDataModel addDataModel){
PreparedStatement ps=null;
Connection con=null;
FileInputStream fileInputStream=null;
Scanner scanner=null;
try{
con=ds.getConnection();
con.setAutoCommit(false);
ps=con.prepareStatement("insert into data_store_all "
+ "(id,id_user_table,startTime,time,sIP,dIP,sPort,dPort,proto,totBytes,info) "
+ "values(?,?,?,?,?,?,?,?,?,?,?)");
long start=0;
fileInputStream=new FileInputStream(addDataModel.getPath().toFile());
scanner=new Scanner(fileInputStream, "UTF-8");
Pattern patternRow=Pattern.compile(",");
Pattern patternPort=Pattern.compile("\\d+");
while(scanner.hasNextLine()) {
if(start!=0){
//Loading a row from the file into table.
String[] data=patternRow.split(scanner.nextLine().replaceAll("[\"]",""));
//Preparing datetime.
SimpleDateFormat simpleDateFormat=new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
GregorianCalendar calendar=new GregorianCalendar();
calendar.setTime(simpleDateFormat.parse(data[1]));
calendar.set(Calendar.MILLISECOND, Integer.parseInt(Pattern.compile("\\.").split(data[1])[1])/1000);
//Preparing an entity
ps.setLong(1, start++); //id PK
ps.setInt(2, addDataModel.getIdFk()); //id FK
ps.setTimestamp(3, new Timestamp(calendar.getTime().getTime())); //datetime
ps.setDouble(4, Double.parseDouble(data[2])); //time
ps.setString(5, data[3]); //sip
ps.setString(6, data[4]); //dip
if(!data[5].equals("") && patternPort.matcher(data[5]).matches()) ps.setInt(7, Integer.parseInt(data[5])); //sport
else ps.setNull(7, java.sql.Types.INTEGER);
if(!data[6].equals("") && patternPort.matcher(data[6]).matches()) ps.setInt(8, Integer.parseInt(data[6])); //dport
else ps.setNull(8, java.sql.Types.INTEGER);
ps.setString(9, data[7]); //proto
if(!data[8].trim().equals("")) ps.setLong(10, Long.parseLong(data[8])); //len
else ps.setObject(10, null);
if(data.length==10 && !data[9].trim().equals("")) ps.setString(11, data[9]); //info
else ps.setString(11, null);
ps.addBatch();
if(start%100000==0){
System.out.println("Number of entity: "+start);
ps.executeBatch();
ps.clearParameters();
ps.clearBatch();
con.commit();
}
}
else{
start++;
scanner.nextLine();
}
}
if (scanner.ioException() != null) throw scanner.ioException();
} catch(Throwable t){
CustomExceptionHandler exception=new CustomExceptionHandler(t);
return exception.persist("DDI", "handleCSV");
} finally{
if (fileInputStream!=null)
try {
fileInputStream.close();
} catch (Throwable t2) {
CustomExceptionHandler exception=new CustomExceptionHandler(t2);
return exception.persist("DDI", "handleCSV.Finally");
}
if (scanner != null) scanner.close();
}
return true;
}
@Inject
private EntityManager entityManager;
@Resource(mappedName="java:/PostgresDS")
private DataSource ds;
}
Your problem is not necessarily the database or even Hibernate, but that you are loading way too much data into memory at once. That's why you get the out-of-memory error and why you see the JVM struggling on the way there.
You read the file from a stream, but then push it all into memory when you create the list of strings. Then you map that list of strings into a linked list of some sort of entity!
Instead, use the stream to process your file in small chunks and insert the chunks into your database. A Scanner-based approach would look something like this:
FileInputStream inputStream = null;
Scanner sc = null;
try {
inputStream = new FileInputStream(path);
sc = new Scanner(inputStream, "UTF-8");
while (sc.hasNextLine()) {
String line = sc.nextLine();
// Talk to your database here!
}
// note that Scanner suppresses exceptions
if (sc.ioException() != null) {
throw sc.ioException();
}
} finally {
if (inputStream != null) {
inputStream.close();
}
if (sc != null) {
sc.close();
}
}
You'll probably find the Hibernate/EJB approach works well enough after you make this change. But I think you'll find plain JDBC to be significantly faster; people say you can expect a 3x to 4x speed bump, depending. That would make a big difference with a lot of data.
If you are talking about truly huge amounts of data, then you should look into CopyManager, which lets you load streams directly into the database. You can use the streaming APIs to transform the data as it goes by.
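A minimal sketch of that approach with the PostgreSQL JDBC driver's CopyManager could look like the following; the column list matches the insert in the question, while the connection details and the COPY options (CSV format, header line) are assumptions to adjust:
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;
public class CopyManagerImport {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "password");
             Reader reader = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
            CopyManager copyManager = con.unwrap(PGConnection.class).getCopyAPI();
            // COPY streams the file straight into the table, bypassing per-row INSERTs.
            long rows = copyManager.copyIn(
                    "COPY data_store_all (id, id_user_table, starttime, time, sip, dip, "
                            + "sport, dport, proto, totbytes, info) FROM STDIN WITH (FORMAT csv, HEADER true)",
                    reader);
            System.out.println("Imported rows: " + rows);
        }
    }
}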
As you are using WildFly 10, you are in a Java EE 7 environment.
Therefore you should consider using JSR-352 Batch Processing for performing your file import.
Have a look at An Overview of Batch Processing in Java EE 7.0.
This should resolve all your memory consumption and transaction issues.
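As a rough sketch of the chunk-oriented shape such a job takes, an item writer could look like this; the class name is made up, the reader, processor and job XML are omitted, and the writer assumes the (omitted) processor emits one Object[] of column values per CSV row:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;
import javax.annotation.Resource;
import javax.batch.api.chunk.AbstractItemWriter;
import javax.inject.Named;
import javax.sql.DataSource;
@Named("dataStoreItemWriter")
public class DataStoreItemWriter extends AbstractItemWriter {
    @Resource(mappedName = "java:/PostgresDS")
    private DataSource ds;
    // The batch runtime calls this once per chunk and commits the transaction afterwards,
    // so memory use stays bounded regardless of the file size.
    @Override
    public void writeItems(List<Object> items) throws Exception {
        try (Connection con = ds.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "insert into data_store_all (id, id_user_table, starttime, time, sip, dip, "
                             + "sport, dport, proto, totbytes, info) values (?,?,?,?,?,?,?,?,?,?,?)")) {
            for (Object item : items) {
                Object[] columns = (Object[]) item; // assumption: one parsed row per item from the processor
                for (int i = 0; i < columns.length; i++) {
                    ps.setObject(i + 1, columns[i]);
                }
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}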

Spring TypeConverter fails on Camel RouteBuilder subclass

In an attempt to install my component on Karaf, I get the following error:
Caused by: org.apache.camel.CamelException: Cannot find any routes with this RouteBuilder reference: RouteBuilderRef[logparserRouteBean]
I've narrowed it down to a conversion error in AbstractBeanFactory using the SimpleTypeConverter returned by getTypeConverter().
Given that PerformanceLogRoute extends org.apache.camel.builder.RouteBuilder, how can the conversion fail?
Suggestions and ideas for any solution are greatly appreciated.
UPDATE
package no.osl.cdms.profile.routes;
import no.osl.cdms.profile.api.TimeMeasurement;
import no.osl.cdms.profile.factories.EntityFactory;
import no.osl.cdms.profile.log.TimeMeasurementEntity;
import no.osl.cdms.profile.parser.LogLineRegexParser;
import org.apache.camel.builder.RouteBuilder;
import java.util.Map;
public class PerformanceLogRoute extends RouteBuilder {
public static final String PERFORMANCE_LOG_ROUTE_ID = "PerformanceLogRoute";
private static final String LOG_DIRECTORY = "C:/data";
private static final String LOG_FILE = "performance.log";
private static final int DELAY = 0;
private LogLineRegexParser logLineRegexParser = new LogLineRegexParser();
private EntityFactory entityFactory = EntityFactory.getInstance();
private static final String LOG_FILE_ENDPOINT = "stream:file? fileName="+LOG_DIRECTORY +"/"+LOG_FILE+"&scanStream=true&scanStreamDelay=" + DELAY;
private static final String DATABASE_ENDPOINT = "jpa:";
@Override
public void configure() throws Exception{
fromF(LOG_FILE_ENDPOINT, LOG_DIRECTORY, LOG_FILE, DELAY)
.convertBodyTo(String.class) // Converts input to String
.choice().when(body().isGreaterThan("")) // Ignores empty lines
.bean(logLineRegexParser, "parse") // Parses log entry into String map
.bean(entityFactory, "createTimemeasurement") // Parses log entry into database format
.split(body())
.choice().when(body().isNotNull())
.toF(DATABASE_ENDPOINT, body().getClass().toString())
.routeId(PERFORMANCE_LOG_ROUTE_ID);
}
public String toString() {
return PERFORMANCE_LOG_ROUTE_ID;
}
}
The xml:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:camel="http://camel.apache.org/schema/spring"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-2.0.xsd
http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">
<bean id="logparserRouteBean" class="no.osl.cdms.profile.routes.PerformanceLogRoute" />
<camelContext id="cdms-core-camel-context" xmlns="http://camel.apache.org/schema/spring">
<routeBuilder ref="logparserRouteBean" />
</camelContext>
</beans>
This is what I have found at the moment. From what I remember it is identical to what caused the error, but I'll double-check in the morning.
