Jasypt - Poor performance on Solaris

I'm using a PooledPBEStringEncryptor to decrypt some strings, and I run the method below from multiple threads:
public static long decryptJasypt (String str)
{
long time = System.currentTimeMillis();
encryptor.decrypt(str);
return System.currentTimeMillis()-time;
}
where
encryptor = new PooledPBEStringEncryptor();
encryptor.setPoolSize(4);
I'm running the same test on Ubuntu and on Solaris. On Solaris the test takes about 10 times longer to complete.
On Solaris, I tried either of these security providers:
sun.security.pkcs11.SunPKCS11 ${java.home}/lib/security/sunpkcs11-solaris.cfg
or
com.oracle.security.ucrypto.UcryptoProvider ${java.home}/lib/security/ucrypto-solaris.cfg
With the first provider, my threads spend their time here:
"Thread-86" #104 prio=5 os_prio=64 tid=0x00000000014ab800 nid=0x78 runnable [0xffff80ffb3971000]
java.lang.Thread.State: RUNNABLE
at sun.security.pkcs11.wrapper.PKCS11.C_CloseSession(Native Method)
at sun.security.pkcs11.SessionRef.dispose(Session.java:171)
at sun.security.pkcs11.Session.close(Session.java:120)
at sun.security.pkcs11.SessionManager.closeSession(SessionManager.java:232)
at sun.security.pkcs11.SessionManager.killSession(SessionManager.java:174)
at sun.security.pkcs11.Token.killSession(Token.java:311)
at sun.security.pkcs11.P11Digest.engineReset(P11Digest.java:144)
at sun.security.pkcs11.P11Digest.engineDigest(P11Digest.java:194)
at sun.security.pkcs11.P11Digest.engineDigest(P11Digest.java:157)
at java.security.MessageDigest$Delegate.engineDigest(MessageDigest.java:592)
at java.security.MessageDigest.digest(MessageDigest.java:365)
at com.sun.crypto.provider.PBES1Core.deriveCipherKey(PBES1Core.java:279)
at com.sun.crypto.provider.PBES1Core.init(PBES1Core.java:250)
at com.sun.crypto.provider.PBEWithMD5AndDESCipher.engineInit(PBEWithMD5AndDESCipher.java:221)
With the second provider, they spend it here:
"Thread-73" #91 prio=5 os_prio=64 tid=0x0000000001733800 nid=0x6b runnable [0xffff80ffb467e000]
java.lang.Thread.State: RUNNABLE
at com.oracle.security.ucrypto.NativeDigest.nativeInit(Native Method)
at com.oracle.security.ucrypto.NativeDigest.engineUpdate(NativeDigest.java:167)
- locked <0x00000003402108d0> (a com.oracle.security.ucrypto.NativeDigest$MD5)
at java.security.MessageDigest$Delegate.engineUpdate(MessageDigest.java:584)
at java.security.MessageDigest.update(MessageDigest.java:335)
at com.sun.crypto.provider.PBES1Core.deriveCipherKey(PBES1Core.java:278)
at com.sun.crypto.provider.PBES1Core.init(PBES1Core.java:250)
at com.sun.crypto.provider.PBEWithMD5AndDESCipher.engineInit(PBEWithMD5AndDESCipher.java:221)
at javax.crypto.Cipher.init(Cipher.java:1393)
at javax.crypto.Cipher.init(Cipher.java:1326)
Could you please advise on how I should troubleshoot this? Are there any known issues with Jasypt running on Solaris?
My test creates an array of 30 simple strings encrypted with Jasypt and then starts 100 threads. Each thread decrypts every item in the array 10 times.
Results on Solaris: a thread takes 39 to 79 seconds to complete this task.
Results on Linux: a thread takes 2 to 7 seconds to complete this task.
The Linux machine has 1 physical processor with 8 cores.
On Solaris, I have a virtual machine that resides on a machine with 2 physical processors, each with 4 cores and 8 virtual processors.
Note: we also experienced slow decryption on a Solaris SPARC machine with 256 cores, though I wasn't able to run my tests in that environment.
public class CryptoUtil {
public static ArrayList<String> ar = new ArrayList<>();
public static PooledPBEStringEncryptor encryptor;
static
{
Provider[] providers = Security.getProviders();
encryptor = new PooledPBEStringEncryptor();
encryptor.setPoolSize(4);
encryptor.setPassword("xxxxxx");
for (int i=0;i<10;i++){
ar.add(encryptJasyptValue("false"));
}
for (int i=0;i<10;i++){
ar.add(encryptJasyptValue(""+i));
}
for (int i=0;i<10;i++){
ar.add(encryptJasyptValue("true"));
}
}
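// Helper used by the static initializer above; it was missing from the original listing.
public static String encryptJasyptValue (String str){
return encryptor.encrypt(str);
}
public static long encryptJasypt (String str){
long time = System.currentTimeMillis();
encryptor.encrypt(str);
return System.currentTimeMillis()-time;
}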
public static long decryptJasypt (String str){
long time = System.currentTimeMillis();
encryptor.decrypt(str);
return System.currentTimeMillis()-time;
}
public static void main(String[] args) throws Exception
{
int noThreads = 100;
if(args.length>0 && args[0]!=null) {
noThreads = Integer.parseInt(args[0]);
}
for (int i=0;i<noThreads;i++)
{
new MyThread().start();
}
}
}
class MyThread extends Thread {
public void run() {
long time=0;
for (int k=0;k<10;k++) {
for (int i = 0; i < CryptoUtil.ar.size(); i++) {
time += CryptoUtil.decryptJasypt(CryptoUtil.ar.get(i));
}
}
System.out.println("took d="+time);
}
}
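Both thread dumps end inside the MD5 digest that PBES1Core calls during key derivation, so one way to narrow this down is to time that digest on its own, per provider, outside Jasypt. The following is only a sketch under that assumption: the class name, method name and input string are hypothetical, and it measures single-threaded cost only, not contention between threads.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import java.util.LinkedHashMap;
import java.util.Map;

public class DigestProviderTiming {

    /** Times `rounds` MD5 digests on each installed provider that offers MD5. */
    public static Map<String, Long> timeMd5(int rounds) {
        Map<String, Long> nanosByProvider = new LinkedHashMap<>();
        Provider[] providers = Security.getProviders("MessageDigest.MD5");
        if (providers == null) {
            return nanosByProvider; // no installed provider offers MD5
        }
        byte[] input = "password-and-salt".getBytes(StandardCharsets.UTF_8);
        for (Provider provider : providers) {
            try {
                MessageDigest md = MessageDigest.getInstance("MD5", provider);
                long start = System.nanoTime();
                for (int i = 0; i < rounds; i++) {
                    md.update(input);
                    md.digest(); // digest() also resets the instance
                }
                nanosByProvider.put(provider.getName(), System.nanoTime() - start);
            } catch (NoSuchAlgorithmException e) {
                // Provider was listed but could not supply MD5; skip it.
            }
        }
        return nanosByProvider;
    }

    public static void main(String[] args) {
        timeMd5(100_000).forEach((name, nanos) ->
                System.out.println(name + ": " + nanos / 1_000_000 + " ms"));
    }
}
```

If one provider turns out to be far slower than the others on Solaris, moving it below the default software provider in java.security would be the next thing to try.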

Related

How does ForkJoinPool#awaitQuiescence actually work?

I have the following implementation of RecursiveAction. Its single purpose is to print the numbers 0 to 9, from different threads if possible:
public class MyRecursiveAction extends RecursiveAction {
private final int num;
public MyRecursiveAction(int num) {
this.num = num;
}
@Override
protected void compute() {
if (num < 10) {
System.out.println(num);
new MyRecursiveAction(num + 1).fork();
}
}
}
I thought that invoking awaitQuiescence would make the current thread wait until all tasks (submitted and forked) have completed:
public class Main {
public static void main(String[] args) {
ForkJoinPool forkJoinPool = new ForkJoinPool();
forkJoinPool.execute(new MyRecursiveAction(0));
System.out.println(forkJoinPool.awaitQuiescence(5, TimeUnit.SECONDS) ? "tasks" : "time");
}
}
But I don't always get the correct result: instead of printing 10 numbers, it prints anywhere from 0 to 10 of them.
However, if I add helpQuiesce to my implementation of RecursiveAction:
public class MyRecursiveAction extends RecursiveAction {
private final int num;
public MyRecursiveAction(int num) {
this.num = num;
}
@Override
protected void compute() {
if (num < 10) {
System.out.println(num);
new MyRecursiveAction(num + 1).fork();
}
RecursiveAction.helpQuiesce();//here
}
}
Everything works fine.
I want to know what awaitQuiescence actually waits for.
You get an idea of what happens when you change the System.out.println(num); to System.out.println(num + " " + Thread.currentThread());
This may print something like:
0 Thread[ForkJoinPool-1-worker-3,5,main]
1 Thread[main,5,main]
tasks
2 Thread[ForkJoinPool.commonPool-worker-3,5,main]
When awaitQuiescence detects that there are pending tasks, it helps out by stealing one and executing it directly. Its documentation says:
If called by a ForkJoinTask operating in this pool, equivalent in effect to ForkJoinTask.helpQuiesce(). Otherwise, waits and/or attempts to assist performing tasks until this pool isQuiescent() or the indicated timeout elapses.
Emphasis added by me
This is what happens here: as we can see, a task prints “main” as its executing thread. Then, the behavior of fork() is specified as:
Arranges to asynchronously execute this task in the pool the current task is running in, if applicable, or using the ForkJoinPool.commonPool() if not inForkJoinPool().
Since the main thread is not a worker thread of a ForkJoinPool, the fork() will submit the new task to the commonPool(). From that point on, the fork() invoked from a common pool’s worker thread will submit the next task to the common pool too. But awaitQuiescence invoked on the custom pool doesn’t wait for the completion of the common pool’s tasks and the JVM terminates too early.
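The effect described above can be observed with a small sketch like the following. The class and method names are hypothetical. Each task simply reports ForkJoinTask.getPool(), which returns the pool hosting the executing thread, or null outside any pool. A task forked from a plain thread is never run by the custom pool; depending on timing it may report the common pool, or null if the joining thread ends up running it itself.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

public class ForkTargetDemo {

    /** Returns the pool hosting the thread that runs this task, or null. */
    static class WhereAmI extends RecursiveTask<ForkJoinPool> {
        @Override
        protected ForkJoinPool compute() {
            return ForkJoinTask.getPool();
        }
    }

    /** Forks a child from inside a worker thread and reports where the child ran. */
    static class ForkFromWorker extends RecursiveTask<ForkJoinPool> {
        @Override
        protected ForkJoinPool compute() {
            WhereAmI child = new WhereAmI();
            child.fork();
            return child.join();
        }
    }

    /** The child is forked by a worker of `pool`, so it stays in `pool`. */
    public static ForkJoinPool forkedFromWorker(ForkJoinPool pool) {
        return pool.invoke(new ForkFromWorker());
    }

    /** The task is forked by the calling (non-worker) thread. */
    public static ForkJoinPool forkedFromPlainThread() {
        WhereAmI task = new WhereAmI();
        task.fork();
        return task.join();
    }

    public static void main(String[] args) {
        ForkJoinPool custom = new ForkJoinPool(2);
        System.out.println("forked from worker, ran in custom pool: "
                + (forkedFromWorker(custom) == custom));
        System.out.println("forked from main, ran in custom pool: "
                + (forkedFromPlainThread() == custom));
        custom.shutdown();
    }
}
```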
If you’re going to say that this is a flawed API design, I wouldn’t object.
The solution is not to use awaitQuiescence for anything but the common pool¹. Normally, a RecursiveAction that splits off sub tasks should wait for their completion. Then, you can wait for the root task’s completion to wait for the completion of all associated tasks.
The second half of this answer contains an example of such a RecursiveAction implementation.
¹ awaitQuiescence is useful when you don’t have hands on the actual futures, like with a parallel stream that submits to the common pool.
Everything works fine.
No, it does not. You got lucky that it worked when you inserted:
RecursiveAction.helpQuiesce();
To explain this, let's change your example slightly:
static class MyRecursiveAction extends RecursiveAction {
private final int num;
public MyRecursiveAction(int num) {
this.num = num;
}
@Override
protected void compute() {
if (num < 10) {
System.out.println(num);
new MyRecursiveAction(num + 1).fork();
}
}
}
public static void main(String[] args) {
ForkJoinPool forkJoinPool = new ForkJoinPool();
forkJoinPool.execute(new MyRecursiveAction(0));
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(2));
}
If you run this, you will notice that you get the result you expect. There are two main reasons for this. First, the fork method will execute the task in the common pool, as the other answer already explained. Second, threads in the common pool are daemon threads: the JVM does not wait for them to finish before exiting, so it exits early. If that is the case, you might ask why it works. It does because of this line:
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(2));
which makes the main thread (which is a non daemon thread) sleep for two seconds, giving enough time for the ForkJoinPool to execute your task.
Now let's change the code closer to your example:
public static void main(String[] args) {
ForkJoinPool forkJoinPool = new ForkJoinPool();
forkJoinPool.execute(new MyRecursiveAction(0));
System.out.println(forkJoinPool.awaitQuiescence(5, TimeUnit.SECONDS) ? "tasks" : "time");
}
Specifically, you use forkJoinPool.awaitQuiescence(...), which is documented as:
Otherwise, waits and/or attempts to assist performing tasks...
It does not say that it will necessarily wait. It says it will "wait and/or attempt ...", and in this case it is more "or" than "and". As such, it will attempt to help, but it still will not wait for all the tasks to finish. Is this weird or even stupid?
When you insert RecursiveAction.helpQuiesce(); you are eventually calling the same awaitQuiescence (with different arguments) under the hood - so essentially nothing changes; the fundamental problem is still there:
static ForkJoinPool forkJoinPool = new ForkJoinPool();
static AtomicInteger res = new AtomicInteger(0);
public static void main(String[] args) {
forkJoinPool.execute(new MyRecursiveAction(0));
System.out.println(forkJoinPool.awaitQuiescence(5, TimeUnit.SECONDS) ? "tasks" : "time");
System.out.println(res.get());
}
static class MyRecursiveAction extends RecursiveAction {
private final int num;
public MyRecursiveAction(int num) {
this.num = num;
}
@Override
protected void compute() {
if (num < 10_000) {
res.incrementAndGet();
System.out.println(num + " thread : " + Thread.currentThread().getName());
new MyRecursiveAction(num + 1).fork();
}
RecursiveAction.helpQuiesce();
}
}
When I ran this, it never printed 10000, which shows that inserting that line changes nothing.
The usual way to handle this is to fork and then join, plus one more join in the caller on the ForkJoinTask that you get back from submit. Something like:
public static void main(String[] args) {
ForkJoinPool forkJoinPool = new ForkJoinPool(2);
ForkJoinTask<Void> task = forkJoinPool.submit(new MyRecursiveAction(0));
task.join();
}
static class MyRecursiveAction extends RecursiveAction {
private final int num;
public MyRecursiveAction(int num) {
this.num = num;
}
@Override
protected void compute() {
if (num < 10) {
System.out.println(num);
MyRecursiveAction ac = new MyRecursiveAction(num + 1);
ac.fork();
ac.join();
}
}
}

Spring kafka idlebetweenpolls is always triggering partition rebalance

I'm trying to use the idle between polls setting mentioned here to slow down the consumption rate. I also set max.poll.interval.ms to double the idle between polls, but it always triggers a partition rebalance. Any idea what the problem is?
[Edit]
I have 5 hosts and I'm setting the concurrency level to 1.
[Edit 2]
I was setting the idle between polls to 5 min and max.poll.interval.ms to 10 min. I also noticed this log: "About to close the idle connection from 105 due to being idle for 540012 millis".
I decreased the idle between polls to 10 sec and the issue disappeared. Any idea why?
private ConsumerFactory<String, GenericRecord> dlqConsumerFactory() {
Map<String, Object> configurationProperties = commonConfigs();
DlqConfiguration dlqConfiguration = kafkaProperties.getConsumer().getDlq();
final Integer idleBetweenPollInterval = dlqConfiguration.getIdleBetweenPollInterval()
.orElse(DLQ_POLL_INTERVAL);
final Integer maxPollInterval = idleBetweenPollInterval * 2; // two times the idleBetweenPoll, to prevent re-balancing
logger.info("Setting max poll interval to {} for DLQ", maxPollInterval);
overrideIfRequired(DQL_CONSUMER_CONFIGURATION, configurationProperties, ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, maxPollInterval);
dlqConfiguration.getMaxPollRecords().ifPresent(maxPollRecords ->
overrideIfRequired(DQL_CONSUMER_CONFIGURATION, configurationProperties, ConsumerConfig.MAX_POLL_RECORDS_CONFIG, maxPollRecords)
);
return new DefaultKafkaConsumerFactory<>(configurationProperties);
}
<time to process last polled records> + <idle between polls> must be less than max.poll.interval.ms.
EDIT
There is logic in the container to make sure we never exceed the max poll interval:
idleBetweenPolls = Math.min(idleBetweenPolls,
this.maxPollInterval - (System.currentTimeMillis() - this.lastPoll)
- 5000); // NOSONAR - less by five seconds to avoid race condition with rebalance
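To make the arithmetic of that snippet concrete, here is a minimal standalone sketch that applies the same formula to the numbers from the question (5 minutes idle, 10 minutes max poll interval). The class and method names are hypothetical, and the time since the last poll is passed in as a parameter instead of being read from the clock.

```java
public class IdleClampDemo {

    /**
     * Mirrors the quoted container logic: never idle longer than what is left
     * of max.poll.interval.ms, minus a five-second safety margin.
     */
    public static long effectiveIdle(long idleBetweenPolls, long maxPollInterval,
                                     long millisSinceLastPoll) {
        return Math.min(idleBetweenPolls,
                maxPollInterval - millisSinceLastPoll - 5000);
    }

    public static void main(String[] args) {
        long idle = 300_000L;    // 5 minutes, as in the question
        long maxPoll = 600_000L; // 10 minutes, as in the question

        // Records processed in 1 second: the full idle time fits.
        System.out.println(effectiveIdle(idle, maxPoll, 1_000L));   // 300000

        // Records took 400 seconds to process: the idle time is shortened.
        System.out.println(effectiveIdle(idle, maxPoll, 400_000L)); // 195000
    }
}
```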
I can't reproduce the issue with this...
@SpringBootApplication
public class So63411124Application {
public static void main(String[] args) {
SpringApplication.run(So63411124Application.class, args);
}
@KafkaListener(id = "so63411124", topics = "so63411124")
public void listen(String in) {
System.out.println(in);
}
@Bean
public ApplicationRunner runner(ConcurrentKafkaListenerContainerFactory<?, ?> factory,
KafkaTemplate<String, String> template) {
factory.getContainerProperties().setIdleBetweenPolls(300000L);
return args -> {
while (true) {
template.send("so63411124", "foo");
Thread.sleep(295000);
}
};
}
@Bean
public NewTopic topic() {
return TopicBuilder.name("so63411124").partitions(1).replicas(1).build();
}
}
logging.level.org.springframework.kafka=debug
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.properties.max.poll.interval.ms=600000
If you can provide a small example like this that exhibits the behavior you describe, I will take a look to see what's wrong.

Stop threads in spring batch while using taskExecutor(asyncTaskExecutor())

I have an issue here. When I use the code below in my Spring Batch configuration, the job ends successfully.
@Bean(name = "myStep")
public Step myStep() {
int cores = Runtime.getRuntime().availableProcessors();
int maxAndQueueSize = cores * 2;
return stepBuilderFactory.get("myStep").<A, B> chunk(CHUNKS)
.reader(myItemReader(entityManagerFactory)).processor(myProcessor())
.writer(myWriter()).listener(myListener()).throttleLimit(maxAndQueueSize).allowStartIfComplete(true).build();
}
But when I modify this code (for async writing) by adding taskExecutor(asyncTaskExecutor()), the step does what I want, but the program keeps running in Eclipse. It seems to be an issue with threads not being closed. How can I shut the application down gracefully?
@Bean(name = "myStep")
public Step myStep() {
int cores = Runtime.getRuntime().availableProcessors();
int maxAndQueueSize = cores * 2;
return stepBuilderFactory.get("myStep").<A, B> chunk(CHUNKS)
.reader(myItemReader(entityManagerFactory)).processor(myProcessor())
.writer(myWriter()).listener(myListener()).taskExecutor(asyncTaskExecutor()).throttleLimit(maxAndQueueSize).allowStartIfComplete(true).build();
}
This is the asyncTaskExecutor():
@Bean(name = "asyncTaskExecutor")
public AsyncTaskExecutor asyncTaskExecutor() {
int cores = Runtime.getRuntime().availableProcessors();
int maxAndQueueSize = cores * 2;
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(cores);
executor.setMaxPoolSize(maxAndQueueSize);
executor.setQueueCapacity(maxAndQueueSize);
executor.setThreadNamePrefix("asyncExecutor-");
executor.initialize();
return executor;
}
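A possible explanation, offered as an assumption and not a confirmed diagnosis, is that the executor's worker threads are non-daemon threads and keep the JVM alive after the job has finished. The plain-JDK sketch below stands in for the Spring executor (class and method names are hypothetical) and shows the general pattern of shutting a pool down once the work is done. In the Spring setup, the equivalent would presumably be calling shutdown() on the ThreadPoolTaskExecutor, closing the application context, or marking the threads as daemon with setDaemon(true).

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorShutdownDemo {

    /** Runs `tasks` small jobs, then shuts the pool down; true if all ran and the pool terminated. */
    public static boolean runThenShutDown(int tasks) {
        int cores = Runtime.getRuntime().availableProcessors();
        int maxAndQueueSize = cores * 2;
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                cores, maxAndQueueSize, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(maxAndQueueSize));
        AtomicInteger done = new AtomicInteger();
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(done::incrementAndGet);
            }
        } finally {
            // Without this call the non-daemon worker threads stay alive
            // and the JVM does not exit when main() returns.
            executor.shutdown();
        }
        try {
            return executor.awaitTermination(10, TimeUnit.SECONDS) && done.get() == tasks;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("terminated cleanly: " + runThenShutDown(2));
    }
}
```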

FTP server connection simulation

How can I write a script, or otherwise simulate, about 100 users connecting to my own FTP server?
You can write some simple Java code.
First, you have to decide how these requests arrive at your server: completely at random, one per minute, following a normal distribution, or more likely an exponential distribution.
Then, you need a thread that has:
A method to make an FTP connection (e.g. ftpCall())
A method to get the x milliseconds until the next FTP call (e.g. getTimeToNext())
After an FTP call, the thread has to sleep for x milliseconds before making the next call. Here is an outline of the code in Java:
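public class FTPTest {
    // static, so that main() can create it without an enclosing instance
    static class MyFTPThread extends Thread {
        private final int numberOfCalls = 100;

        private void ftpCall() {
            // DO CONNECTION
        }

        private long getTimeToNext() {
            // RETURN A RANDOM TIME OR A FIXED VALUE
            return 1000L; // placeholder: one second between calls
        }

        @Override
        public void run() {
            int counter = 0;
            while (++counter <= numberOfCalls) {
                ftpCall();
                try {
                    Thread.sleep(getTimeToNext());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // stop if interrupted
                    return;
                }
            }
        }
    }

    public static void main(String[] args) {
        MyFTPThread t = new MyFTPThread();
        t.start();
    }
}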
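For the exponential arrival pattern mentioned above, getTimeToNext() could draw its delay by inverse-transform sampling. The following is a minimal sketch; the class name, method name and the one-second mean are hypothetical choices, and a fixed seed is used only to make runs repeatable.

```java
import java.util.Random;

public class ArrivalTimes {

    /**
     * Draws an exponentially distributed delay with the given mean,
     * using inverse-transform sampling: -mean * ln(1 - U), U uniform in [0, 1).
     */
    public static long nextExponentialMillis(Random rnd, double meanMillis) {
        return Math.round(-meanMillis * Math.log(1.0 - rnd.nextDouble()));
    }

    /** Average of `samples` draws, useful as a sanity check on the mean. */
    public static double averageMillis(long seed, double meanMillis, int samples) {
        Random rnd = new Random(seed);
        long total = 0;
        for (int i = 0; i < samples; i++) {
            total += nextExponentialMillis(rnd, meanMillis);
        }
        return (double) total / samples;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int i = 0; i < 5; i++) {
            System.out.println("next call in " + nextExponentialMillis(rnd, 1000.0) + " ms");
        }
        System.out.println("average over 100000 draws: " + averageMillis(42, 1000.0, 100_000));
    }
}
```

To simulate about 100 users, one such thread per user could be started, each with its own Random instance.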

locking on a cache key

I've read several questions similar to this, but none of the answers provide ideas of how to clean up memory while still maintaining lock integrity. I'm estimating the number of key-value pairs at a given time to be in the tens of thousands, but the number of key-value pairs over the lifespan of the data structure is virtually infinite (realistically it probably wouldn't be more than a billion, but I'm coding to the worst case).
I have an interface:
public interface KeyLock<K extends Comparable<? super K>> {
public void lock(K key);
public void unock(K key);
}
with a default implementation:
public class DefaultKeyLock<K extends Comparable<? super K>> implements KeyLock<K> {
private final ConcurrentMap<K, Mutex> lockMap;
public DefaultKeyLock() {
lockMap = new ConcurrentSkipListMap<K, Mutex>();
}
@Override
public void lock(K key) {
Mutex mutex = new Mutex();
Mutex existingMutex = lockMap.putIfAbsent(key, mutex);
if (existingMutex != null) {
mutex = existingMutex;
}
mutex.lock();
}
@Override
public void unock(K key) {
Mutex mutex = lockMap.get(key);
mutex.unlock();
}
}
This works nicely, but the map never gets cleaned up. What I have so far for a clean implementation is:
public class CleanKeyLock<K extends Comparable<? super K>> implements KeyLock<K> {
private final ConcurrentMap<K, LockWrapper> lockMap;
public CleanKeyLock() {
lockMap = new ConcurrentSkipListMap<K, LockWrapper>();
}
@Override
public void lock(K key) {
LockWrapper wrapper = new LockWrapper(key);
wrapper.addReference();
LockWrapper existingWrapper = lockMap.putIfAbsent(key, wrapper);
if (existingWrapper != null) {
wrapper = existingWrapper;
wrapper.addReference();
}
wrapper.addReference();
wrapper.lock();
}
@Override
public void unock(K key) {
LockWrapper wrapper = lockMap.get(key);
if (wrapper != null) {
wrapper.unlock();
wrapper.removeReference();
}
}
private class LockWrapper {
private final K key;
private final ReentrantLock lock;
private int referenceCount;
public LockWrapper(K key) {
this.key = key;
lock = new ReentrantLock();
referenceCount = 0;
}
public synchronized void addReference() {
lockMap.put(key, this);
referenceCount++;
}
public synchronized void removeReference() {
referenceCount--;
if (referenceCount == 0) {
lockMap.remove(key);
}
}
public void lock() {
lock.lock();
}
public void unlock() {
lock.unlock();
}
}
}
This works for two threads accessing a single key lock, but once a third thread is introduced the lock integrity is no longer guaranteed. Any ideas?
I don't buy that this works for two threads. Consider this:
(Thread A) calls lock(x), now holds lock x
thread switch
(Thread B) calls lock(x), putIfAbsent() returns the current wrapper for x
thread switch
(Thread A) calls unlock(x), the wrapper reference count hits 0 and it gets removed from the map
(Thread A) calls lock(x), putIfAbsent() inserts a new wrapper for x
(Thread A) locks on the new wrapper
thread switch
(Thread B) locks on the old wrapper
How about:
LockWrapper starts with a reference count of 1
addReference() returns false if the reference count is 0
in lock(), if existingWrapper != null, we call addReference() on it. If this returns false, it has already been removed from the map, so we loop back and try again from the putIfAbsent()
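The three points above can be sketched roughly as follows. This is one reading of the proposal, not a tested production implementation: the class name RefCountedKeyLock, the size() accessor and the demo method are hypothetical, a ConcurrentHashMap is used in place of the skip list map, and unlock() assumes the caller currently holds the lock for that key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

public class RefCountedKeyLock<K> {
    private final ConcurrentMap<K, Wrapper> lockMap = new ConcurrentHashMap<>();

    public void lock(K key) {
        while (true) {
            Wrapper fresh = new Wrapper(); // starts with one reference (ours)
            Wrapper existing = lockMap.putIfAbsent(key, fresh);
            if (existing == null) {
                fresh.lock.lock();
                return;
            }
            if (existing.addReference()) {
                existing.lock.lock();
                return;
            }
            // existing hit zero references and is being removed: retry.
        }
    }

    public void unlock(K key) {
        // The caller still holds a reference, so the wrapper cannot have been removed.
        Wrapper wrapper = lockMap.get(key);
        wrapper.lock.unlock();
        if (wrapper.removeReference()) {
            lockMap.remove(key, wrapper); // only removes this exact wrapper
        }
    }

    public int size() {
        return lockMap.size();
    }

    private static final class Wrapper {
        final ReentrantLock lock = new ReentrantLock();
        private int referenceCount = 1;

        /** Returns false if the wrapper has already been retired. */
        synchronized boolean addReference() {
            if (referenceCount == 0) {
                return false;
            }
            referenceCount++;
            return true;
        }

        /** Returns true if this was the last reference. */
        synchronized boolean removeReference() {
            return --referenceCount == 0;
        }
    }

    /** Increments a plain int under the key lock; returns {final count, map size}. */
    public static int[] demo(int threads, int perThread) {
        RefCountedKeyLock<String> locks = new RefCountedKeyLock<>();
        int[] counter = {0}; // deliberately not atomic; guarded by the key lock
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    locks.lock("x");
                    try {
                        counter[0]++;
                    } finally {
                        locks.unlock("x");
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] {counter[0], locks.size()};
    }

    public static void main(String[] args) {
        int[] result = demo(8, 10_000);
        System.out.println("count=" + result[0] + ", entries left in map=" + result[1]);
    }
}
```

The conditional remove(key, wrapper) matters here: it prevents a late unlock from evicting a newer wrapper that another thread has inserted for the same key in the meantime.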
I would use a fixed array by default for a striped lock, since you can size it to the concurrency level that you expect. While there may be hash collisions, a good spreader will resolve that. If the locks are used for short critical sections, then you may be creating contention in the ConcurrentHashMap that defeats the optimization.
You're welcome to adapt my implementation, though I only implemented the dynamic version for fun. It didn't seem useful in practice so only the fixed was used in production. You can use the hash() function from ConcurrentHashMap to provide a good spreading.
See ReentrantStripedLock in http://code.google.com/p/concurrentlinkedhashmap/wiki/IndexableCache
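As a rough illustration of the fixed-array approach, and not the linked implementation, a striped lock might look like the sketch below. All names are hypothetical, and the spreader is the simple XOR-shift used by java.util.HashMap, swapped in here as a stand-in for the ConcurrentHashMap hash function mentioned above.

```java
import java.util.concurrent.locks.ReentrantLock;

public class StripedLock {
    private final ReentrantLock[] stripes;

    /** Creates at least `expectedConcurrency` stripes, rounded up to a power of two. */
    public StripedLock(int expectedConcurrency) {
        int n = 1;
        while (n < expectedConcurrency) {
            n <<= 1; // power-of-two size lets us select a stripe with a bit mask
        }
        stripes = new ReentrantLock[n];
        for (int i = 0; i < n; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** Mixes the high bits into the low bits so poor hashCodes spread better. */
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    /** Equal keys always map to the same stripe; unequal keys may collide. */
    public ReentrantLock stripeFor(Object key) {
        return stripes[spread(key.hashCode()) & (stripes.length - 1)];
    }

    public void lock(Object key) {
        stripeFor(key).lock();
    }

    public void unlock(Object key) {
        stripeFor(key).unlock();
    }

    public int stripeCount() {
        return stripes.length;
    }

    public static void main(String[] args) {
        StripedLock locks = new StripedLock(10);
        locks.lock("user-42");
        try {
            System.out.println("stripes: " + locks.stripeCount());
        } finally {
            locks.unlock("user-42");
        }
    }
}
```

Memory use is fixed by the stripe count, so nothing ever needs to be cleaned up. The trade-off is that unrelated keys can share a stripe and occasionally block each other.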

Resources