How to add multiple queue names to @RabbitListener - Spring

Here's my code; this is how I declare the queue names:
//RabbitmqConfig.java
@Getter
public List<String> queueNameList = new ArrayList<>();
@Bean
public DirectExchange exchange(RabbitAdmin rabbitAdmin) {
DirectExchange directExchange = new DirectExchange(exchange);
for (int num = 1; num <= 3; num++) {
String newQueueName = String.format(queueName + "-%s", num);
String newRoutingKey = String.format(routingKey + "-%s", num);
Queue queue = new Queue(newQueueName, false);
rabbitAdmin.declareQueue(queue);
rabbitAdmin.declareBinding(BindingBuilder.bind(queue).to(directExchange).with(newRoutingKey));
queueNameList.add(newQueueName);
}
return new DirectExchange(exchange);
}
Then, my question is: how do I reference these queue names in the @RabbitListener?
I found an answer that uses split:
@RabbitListener(queues = {"#{'${spring.rabbitmq.test}'.split(',')}"},
But I want to use RabbitmqConfig.queueNameList, because the number of queues can increase and I want to control that through the variable num.
Maybe SpEL? Or anything else?
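One option (a minimal sketch, assuming the config class is registered as a bean named rabbitmqConfig and that queueNameList is fully populated before the listener container starts) is to point the listener at the list with a SpEL bean reference; depending on the Spring AMQP version, the expression may need to resolve to a String[] or a comma-delimited String instead of a List:
// Sketch only: "rabbitmqConfig" is the assumed bean name of RabbitmqConfig,
// and queueNameList must already contain the declared queue names.
@RabbitListener(queues = "#{rabbitmqConfig.queueNameList}")
public void receive(String message) {
    System.out.println("Received: " + message);
}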

Related

Spring Batch: read csv file into Map

I have data in a CSV file that I want to read into a Map using Spring Batch. The format of the data is like this:
1, "data1", 2, "data2", 3, "data3"
This format lends itself easily to a Map, but I can't seem to do it. I am currently using a PassThroughLineMapper and then tokenizing the String in the processor. However, since I have a couple of processors, I am having to do this in all of them, which seems very inefficient. Here is my current FlatFileItemReader code:
@Bean
public FlatFileItemReader<String> reader() {
return new FlatFileItemReaderBuilder<String>()
.name("fileLineReader").linesToSkip(1)
.resource(new FileSystemResource(inputCsv))
.lineMapper(new PassThroughLineMapper())
.build();
}
I would like it to return Map<Integer, String>
It turned out to be a simple task in the end. I wrote a custom LineMapper. I'm not deleting the question because it might help somebody else.
@Override
public Map<Integer, String> mapLine(String s, int i) throws Exception {
Map<Integer, String> map = new HashMap<>();
String[] tokens = s.split(",");
String key = "";
for (int j = 0; j < tokens.length; j++) {
if (tokens[j].equals("9999")) {
break;
} else {
if (j % 2 == 0)
key = tokens[j];
else
map.putIfAbsent(Integer.valueOf(key), tokens[j]);
}
}
return map;
}
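For completeness, here is a hedged sketch of how that mapLine method could be plugged into the reader; CsvToMapLineMapper is a hypothetical class name for a LineMapper<Map<Integer, String>> whose mapLine is the method above, and inputCsv is the same property used in the question:
@Bean
public FlatFileItemReader<Map<Integer, String>> reader() {
    return new FlatFileItemReaderBuilder<Map<Integer, String>>()
            .name("fileLineReader")
            .linesToSkip(1)
            .resource(new FileSystemResource(inputCsv))
            // custom LineMapper<Map<Integer, String>> implementing the mapLine above
            .lineMapper(new CsvToMapLineMapper())
            .build();
}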

AggregatingReplyingKafkaTemplate releaseStrategy Question

There seems to be an issue when I use AggregatingReplyingKafkaTemplate with template.setReturnPartialOnTimeout(true): it returns a timeout exception even though partial results are available from the consumers.
In the example below, I have 3 consumers replying to the request topic and I've set the reply timeout to 10 seconds. I've explicitly delayed Consumer 3's response to 11 seconds, so I expect the responses back from Consumers 1 and 2 and can return partial results. However, I am getting a KafkaReplyTimeoutException. I appreciate your inputs. Thanks.
I followed the code based on the unit test below:
[ReplyingKafkaTemplateTests][1]
I've provided the actual code below:
@RestController
public class SumController {
@Value("${kafka.bootstrap-servers}")
private String bootstrapServers;
public static final String D_REPLY = "dReply";
public static final String D_REQUEST = "dRequest";
@ResponseBody
@PostMapping(value="/sum")
public String sum(@RequestParam("message") String message) throws InterruptedException, ExecutionException {
AggregatingReplyingKafkaTemplate<Integer, String, String> template = aggregatingTemplate(
new TopicPartitionOffset(D_REPLY, 0), 3, new AtomicInteger());
String resultValue ="";
String currentValue ="";
try {
template.setDefaultReplyTimeout(Duration.ofSeconds(10));
template.setReturnPartialOnTimeout(true);
ProducerRecord<Integer, String> record = new ProducerRecord<>(D_REQUEST, null, null, null, message);
RequestReplyFuture<Integer, String, Collection<ConsumerRecord<Integer, String>>> future =
template.sendAndReceive(record);
future.getSendFuture().get(5, TimeUnit.SECONDS); // send ok
System.out.println("Send Completed Successfully");
ConsumerRecord<Integer, Collection<ConsumerRecord<Integer, String>>> consumerRecord = future.get(10, TimeUnit.SECONDS);
System.out.println("Consumer record size "+consumerRecord.value().size());
Iterator<ConsumerRecord<Integer, String>> iterator = consumerRecord.value().iterator();
while (iterator.hasNext()) {
currentValue = iterator.next().value();
System.out.println("response " + currentValue);
System.out.println("Record header " + consumerRecord.headers().toString());
resultValue = resultValue + currentValue + "\r\n";
}
} catch (Exception e) {
System.out.println("Error Message is "+e.getMessage());
}
return resultValue;
}
public AggregatingReplyingKafkaTemplate<Integer, String, String> aggregatingTemplate(
TopicPartitionOffset topic, int releaseSize, AtomicInteger releaseCount) {
//Create Container Properties
ContainerProperties containerProperties = new ContainerProperties(topic);
containerProperties.setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
//Set the consumer Config
//Create Consumer Factory with Consumer Config
DefaultKafkaConsumerFactory<Integer, Collection<ConsumerRecord<Integer, String>>> cf =
new DefaultKafkaConsumerFactory<>(consumerConfigs());
//Create Listener Container with Consumer Factory and Container Property
KafkaMessageListenerContainer<Integer, Collection<ConsumerRecord<Integer, String>>> container =
new KafkaMessageListenerContainer<>(cf, containerProperties);
// container.setBeanName(this.testName);
AggregatingReplyingKafkaTemplate<Integer, String, String> template =
new AggregatingReplyingKafkaTemplate<>(new DefaultKafkaProducerFactory<>(producerConfigs()), container,
(list, timeout) -> {
releaseCount.incrementAndGet();
return list.size() == releaseSize;
});
template.setSharedReplyTopic(true);
template.start();
return template;
}
public Map<String, Object> consumerConfigs() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,bootstrapServers);
props.put(ConsumerConfig.GROUP_ID_CONFIG, "test_id");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class);
return props;
}
public Map<String, Object> producerConfigs() {
Map<String, Object> props = new HashMap<>();
// list of host:port pairs used for establishing the initial connections to the Kafka cluster
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
bootstrapServers);
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
org.apache.kafka.common.serialization.StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.class);
return props;
}
public ProducerFactory<Integer,String> producerFactory() {
return new DefaultKafkaProducerFactory<>(producerConfigs());
}
@KafkaListener(id = "def1", topics = { D_REQUEST}, groupId = "D_REQUEST1")
@SendTo // default REPLY_TOPIC header
public String dListener1(String in) throws InterruptedException {
return "First Consumer : "+ in.toUpperCase();
}
@KafkaListener(id = "def2", topics = { D_REQUEST}, groupId = "D_REQUEST2")
@SendTo // default REPLY_TOPIC header
public String dListener2(String in) throws InterruptedException {
return "Second Consumer : "+ in.toLowerCase();
}
@KafkaListener(id = "def3", topics = { D_REQUEST}, groupId = "D_REQUEST3")
@SendTo // default REPLY_TOPIC header
public String dListener3(String in) throws InterruptedException {
Thread.sleep(11000);
return "Third Consumer : "+ in;
}
}
[1]: https://github.com/spring-projects/spring-kafka/blob/master/spring-kafka/src/test/java/org/springframework/kafka/requestreply/ReplyingKafkaTemplateTests.java
template.setReturnPartialOnTimeout(true) simply means the template will consult the release strategy on timeout (with the timeout argument = true, to tell the strategy it's a timeout rather than a delivery call).
It must return true to release the partial result.
This is to allow you to look at (and possibly modify) the list to decide whether you want to release or discard.
Your strategy ignores the timeout parameter:
(list, timeout) -> {
releaseCount.incrementAndGet();
return list.size() == releaseSize;
});
You need something like return timeout ? true : { ... }, so the strategy releases whatever has arrived when the timeout flag is set.
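A minimal sketch of the corrected strategy, based directly on that hint (release the partial result on timeout, otherwise keep the original size check):
(list, timeout) -> {
    releaseCount.incrementAndGet();
    // on timeout, release the partial result; otherwise wait for all releaseSize replies
    return timeout || list.size() == releaseSize;
});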

why does my return statement return a null result?

I am trying to connect to a remote machine and display some files from a specific directory.
The problem is that when I test my function it returns a null result, and what I want to do is display the file names.
Here is my code:
@Service
public class SftpClientImpl implements SftpClient {
private LsEntry entry;
@Override
public LsEntry connectToServer() {
String SFTPHOST = "xxxxx";
int SFTPPORT = 22;
String SFTPUSER = "xxx";
String SFTPPASS = "xxxxx";
String SFTPWORKINGDIR = "/dir/dir2/dir3";
Session session = null;
Channel channel = null;
ChannelSftp channelSftp = null;
try{
JSch jsch = new JSch();
session = jsch.getSession(SFTPUSER,SFTPHOST,SFTPPORT);
session.setPassword(SFTPPASS);
java.util.Properties config = new java.util.Properties();
config.put("StrictHostKeyChecking", "no");
session.setConfig(config);
session.connect();
channel = session.openChannel("sftp");
channel.connect();
System.out.println("Starting the session ..");
channelSftp = (ChannelSftp)channel;
channelSftp.cd(SFTPWORKINGDIR);
Vector filelist = channelSftp.ls(SFTPWORKINGDIR);
for(int i=0; i<filelist.size();i++){
LsEntry entry = (LsEntry) filelist.get(i);
System.out.println(entry.getFilename());
}
while(session != null){
System.out.println("Killing the session");
session.disconnect();
System.exit(0);
}
}catch(Exception ex){
ex.printStackTrace();
}
return entry;
}
}
and:
@GetMapping(produces = MediaType.APPLICATION_JSON_VALUE)
public ResponseEntity<LsEntry> getDirectories() {
LsEntry entry = sftpClient.connectToServer();
return new ResponseEntity<>(entry, HttpStatus.OK);
}
Any idea why this is not working?
entry is null because its value is only assigned within the for loop, and it is actually declared twice (once with private class scope, once with local scope inside the for loop).
What I suggest is to correct your variable declaration and test the connection and filename printing. If it still doesn't work, try it within a known working Spring endpoint. If it prints your directory as expected, then move it to its own endpoint and try again. Doing this will help narrow down the scope of your issue.
I've used the code below to connect and print file names for the past few years; it is heavily based on the example code provided by JSch back then:
JSch jsch = new JSch();
Session session;
session = jsch.getSession(username, hostname, port);
session.setConfig("StrictHostKeyChecking", "no");
session.setPassword(password);
session.connect();
Channel channel = session.openChannel("sftp");
channel.connect();
ChannelSftp sftpChannel = (ChannelSftp) channel;
//List our files within the directory
Vector vv = sftpChannel.ls(srcDir);
if (vv != null) {
LOGGER.debug("We have a file listing!");
for (int ii = 0; ii < vv.size(); ii++) {
Object obj = vv.elementAt(ii);
if (obj instanceof ChannelSftp.LsEntry) {
LOGGER.debug("[" + ((ChannelSftp.LsEntry) obj).getFilename() + "]");
if (ii < 1) { // empty directory contains entries for . and ..
continue;
}
String filename = ((ChannelSftp.LsEntry) obj).getFilename();
filenames.add(filename);
LOGGER.debug("filename is: {}", filename);
....
This is how I solved my problem :)
I corrected my variable declaration and it worked nicely, like you told me to :) Thanks!
@Override
public LsEntry connectToServer() {
String HOST = "xxxxx";
int PORT = 22;
String USER = "xxx";
String PASS = "xxxxx";
String DIR = "/dir/dir2/dir3";
Session session = null;
Channel channel = null;
ChannelSftp channelSftp = null;
// LsEntry declaration
LsEntry entry = null;
try {
JSch jsch = new JSch();
session = jsch.getSession(USER, HOST, PORT);
session.setPassword(PASS);
// the rest of the code
//....
//...
for (int i = 0; i < filelist.size(); i++) {
//cast
entry = (LsEntry) filelist.get(i);
System.out.println(((LsEntry) entry).getFilename());
}
while (session != null) {
System.out.println("Killing the session");
session.disconnect();
System.exit(0);
}
} catch (Exception ex) {
ex.printStackTrace();
}
return (LsEntry) entry;
}

SftpInboundFileSynchronizer not synchronizing

I have the following SFTP file synchronizer:
@Bean
public SftpInboundFileSynchronizer sftpInboundFileSynchronizer() {
SftpInboundFileSynchronizer fileSynchronizer = new SftpInboundFileSynchronizer(sftpSessionFactory());
fileSynchronizer.setDeleteRemoteFiles(false);
fileSynchronizer.setRemoteDirectory(applicationProperties.getSftpDirectory());
CompositeFileListFilter<ChannelSftp.LsEntry> compositeFileListFilter = new CompositeFileListFilter<ChannelSftp.LsEntry>();
compositeFileListFilter.addFilter(new SftpPersistentAcceptOnceFileListFilter(store, "sftp"));
compositeFileListFilter.addFilter(new SftpSimplePatternFileListFilter(applicationProperties.getLoadFileNamePattern()));
fileSynchronizer.setFilter(compositeFileListFilter);
fileSynchronizer.setPreserveTimestamp(true);
return fileSynchronizer;
}
When the application first runs, it synchronizes the local directory with the remote SFTP site directory. However, it fails to pick up any subsequent changes to the files in the remote SFTP directory.
It is scheduled to poll as follows:
@Bean
@InboundChannelAdapter(autoStartup="true", channel = "sftpChannel", poller = @Poller("pollerMetadata"))
public SftpInboundFileSynchronizingMessageSource sftpMessageSource() {
SftpInboundFileSynchronizingMessageSource source =
new SftpInboundFileSynchronizingMessageSource(sftpInboundFileSynchronizer());
source.setLocalDirectory(applicationProperties.getScheduledLoadDirectory());
source.setAutoCreateLocalDirectory(true);
ChainFileListFilter<File> chainFileFilter = new ChainFileListFilter<File>();
chainFileFilter.addFilter(new LastModifiedFileListFilter());
FileSystemPersistentAcceptOnceFileListFilter fs = new FileSystemPersistentAcceptOnceFileListFilter(store, "dailyfilesystem");
fs.setFlushOnUpdate(true);
chainFileFilter.addFilter(fs);
source.setLocalFilter(chainFileFilter);
source.setCountsEnabled(true);
return source;
}
@Bean
public PollerMetadata pollerMetadata(RetryCompoundTriggerAdvice retryCompoundTriggerAdvice) {
PollerMetadata pollerMetadata = new PollerMetadata();
List<Advice> adviceChain = new ArrayList<Advice>();
adviceChain.add(retryCompoundTriggerAdvice);
pollerMetadata.setAdviceChain(adviceChain);
pollerMetadata.setTrigger(compoundTrigger());
pollerMetadata.setMaxMessagesPerPoll(1);
return pollerMetadata;
}
@Bean
public CompoundTrigger compoundTrigger() {
CompoundTrigger compoundTrigger = new CompoundTrigger(primaryTrigger());
return compoundTrigger;
}
@Bean
public CronTrigger primaryTrigger() {
return new CronTrigger(applicationProperties.getSchedule());
}
@Bean
public PeriodicTrigger secondaryTrigger() {
return new PeriodicTrigger(applicationProperties.getRetryInterval());
}
In the afterReceive method of RetryCompoundTriggerAdvice, which extends AbstractMessageSourceAdvice, I get a null result after the first run.
How can I configure the synchronizer such that it synchronizes periodically (rather than just once at app startup)?
Update
I have found that when the SFTP site has no file in its directory at application startup, the SftpInboundFileSynchronizer syncs at every polling interval, so I can see com.jcraft.jsch log statements at every poll. But as soon as a file is found on the SFTP site, it syncs to get that file locally and then never syncs again.
Update 2
My apologies... here's the custom code:
@Component
public class RetryCompoundTriggerAdvice extends AbstractMessageSourceAdvice {
private final static Logger logger = LoggerFactory.getLogger(RetryCompoundTriggerAdvice.class);
private final CompoundTrigger compoundTrigger;
private final Trigger override;
private final ApplicationProperties applicationProperties;
private final Mail mail;
private int attempts = 0;
private boolean expectedMessage;
private boolean inProcess;
public RetryCompoundTriggerAdvice(CompoundTrigger compoundTrigger,
@Qualifier("secondaryTrigger") Trigger override,
ApplicationProperties applicationProperties,
Mail mail) {
this.compoundTrigger = compoundTrigger;
this.override = override;
this.applicationProperties = applicationProperties;
this.mail = mail;
}
@Override
public boolean beforeReceive(MessageSource<?> source) {
logger.debug("!inProcess is " + !inProcess);
return !inProcess;
}
@Override
public Message<?> afterReceive(Message<?> result, MessageSource<?> source) {
if (expectedMessage) {
logger.info("Received expected load file. Setting cron trigger.");
this.compoundTrigger.setOverride(null);
expectedMessage = false;
return result;
}
final int maxOverrideAttempts = applicationProperties.getMaxFileRetry();
attempts++;
if (result == null && attempts < maxOverrideAttempts) {
logger.info("Unable to find file after " + attempts + " attempt(s). Will reattempt");
this.compoundTrigger.setOverride(this.override);
} else if (result == null && attempts >= maxOverrideAttempts) {
String message = "Unable to find daily file" +
" after " + attempts +
" attempt(s). Will not reattempt since max number of attempts is set at " +
maxOverrideAttempts + ".";
logger.warn(message);
mail.sendAdminsEmail("Missing Load File", message);
attempts = 0;
this.compoundTrigger.setOverride(null);
} else {
attempts = 0;
// keep periodically checking until we are certain
// that this message is the expected message
this.compoundTrigger.setOverride(this.override);
inProcess = true;
logger.info("Found load file");
}
return result;
}
public void foundExpectedMessage(boolean found) {
logger.debug("Expected message was found? " + found);
this.expectedMessage = found;
inProcess = false;
}
}
You have the logic:
@Override
public boolean beforeReceive(MessageSource<?> source) {
logger.debug("!inProcess is " + !inProcess);
return !inProcess;
}
Let's study its JavaDoc:
/**
* Subclasses can decide whether to proceed with this poll.
* @param source the message source.
* @return true to proceed.
*/
public abstract boolean beforeReceive(MessageSource<?> source);
And the logic around this method:
Message<?> result = null;
if (beforeReceive((MessageSource<?>) target)) {
result = (Message<?>) invocation.proceed();
}
return afterReceive(result, (MessageSource<?>) target);
So, invocation.proceed() (the SFTP synchronization) is called only if beforeReceive() returns true; in your case that is only when inProcess is false.
In your afterReceive() implementation you set inProcess = true whenever you have a result - that is, on the first successful attempt. And it looks like you reset it back to false only when someone calls foundExpectedMessage().
So, what do you expect from us as an answer to your problem? It is really in your custom code and not related to the Framework. Sorry...
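Purely as an illustration of the mechanism above (not a fix prescribed by this answer): as long as beforeReceive() keeps returning false, invocation.proceed() - and with it the SFTP synchronization - is never invoked again, so one possible sketch, under the assumption that skipping polls is not actually needed here, is to let every poll proceed:
@Override
public boolean beforeReceive(MessageSource<?> source) {
    // always allow the poll, so the synchronizer runs on every trigger
    return true;
}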

Java 8 Stream: convert List<File> to Map<Integer, List<File>>

I have the code below as a traditional Java loop and would like to use a Java 8 Stream instead.
I have a list of files sorted by file size. I group these files together so that the total size of each group does not exceed the given max size, and put the groups in a Map with the keys 1, 2, 3, and so on. Here is the code.
List<File> allFilesSortedBySize = getListOfFiles();
Map<Integer, List<File>> filesGroupedByMaxSizeMap = new HashMap<Integer, List<File>>();
double totalLength = 0L;
int count = 0;
List<File> filesWithSizeTotalMaxSize = Lists.newArrayList();
//group the files to be zipped together as per maximum allowable size in a map
for (File file : allFilesSortedBySize) {
long sizeInBytes = file.length();
double sizeInMb = (double)sizeInBytes / (1024 * 1024);
totalLength = totalLength + sizeInMb;
if(totalLength <= maxSize) {
filesWithSizeTotalMaxSize.add(file);
} else {
count = count + 1;
filesGroupedByMaxSizeMap.put(count, filesWithSizeTotalMaxSize);
filesWithSizeTotalMaxSize = Lists.newArrayList();
filesWithSizeTotalMaxSize.add(file);
totalLength = sizeInMb;
}
}
filesGroupedByMaxSizeMap.put(count+1, filesWithSizeTotalMaxSize);
return filesGroupedByMaxSizeMap;
After reading, I found a solution using Collectors.groupingBy instead.
Code using a Java 8 lambda expression:
private final long MB = 1024 * 1024;
private Map<Integer, List<File>> grouping(List<File> files, long maxSize) {
AtomicInteger group = new AtomicInteger(0);
AtomicLong groupSize = new AtomicLong();
return files.stream().collect(groupingBy((file) -> {
if (groupSize.addAndGet(file.length()) <= maxSize * MB) {
return group.get() == 0 ? group.incrementAndGet() : group.get();
}
groupSize.set(file.length());
return group.incrementAndGet();
}));
}
Code provided by @Holger, which frees you from checking whether group equals 0:
private static final long MB = 1024 * 1024;
private Map<Integer, List<File>> grouping(List<File> files, long maxSize) {
AtomicInteger group = new AtomicInteger(0);
// force the group numbering to start at 1 even if the first file is empty
AtomicLong groupSize = new AtomicLong(maxSize * MB + 1);
return files.stream().collect(groupingBy((file) -> {
if (groupSize.addAndGet(file.length()) <= maxSize * MB) {
return group.get();
}
groupSize.set(file.length());
return group.incrementAndGet();
}));
}
Code using an anonymous class:
Inspired by @Holger: all "solutions" using a grouping function that modifies external state are hacks abusing the API, so you can instead use an anonymous class to keep the grouping state inside the class.
private static final long MB = 1024 * 1024;
private Map<Integer, List<File>> grouping(List<File> files, long maxSize) {
return files.stream().collect(groupingBy(groupSize(maxSize)));
}
private Function<File, Integer> groupSize(final long maxSize) {
long maxBytesSize = maxSize * MB;
return new Function<File, Integer>() {
private int group;
private long groupSize = maxBytesSize + 1;
@Override
public Integer apply(File file) {
return hasRemainingFor(file) ? current(file) : next(file);
}
private boolean hasRemainingFor(File file) {
return (groupSize += file.length()) <= maxBytesSize;
}
private int next(File file) {
groupSize = file.length();
return ++group;
}
private int current(File file) {
return group;
}
};
}
Test
import org.junit.jupiter.api.Test;
import java.io.File;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;
import static java.util.Arrays.asList;
import static java.util.Collections.singletonList;
import static java.util.stream.Collectors.groupingBy;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.equalTo;
/**
* Created by holi on 3/24/17.
*/
public class StreamGroupingTest {
private final File FILE_1MB = file(1);
private final File FILE_2MB = file(2);
private final File FILE_3MB = file(3);
@Test
void eachFileInIndividualGroupIfEachFileSizeGreaterThanMaxSize() {
Map<Integer, List<File>> groups = grouping(asList(FILE_2MB, FILE_3MB), 1);
assertThat(groups.size(), equalTo(2));
assertThat(groups.get(1), equalTo(singletonList(FILE_2MB)));
assertThat(groups.get(2), equalTo(singletonList(FILE_3MB)));
}
@Test
void allFilesInAGroupIfTotalSizeOfFilesLessThanOrEqualMaxSize() {
Map<Integer, List<File>> groups = grouping(asList(FILE_2MB, FILE_3MB), 5);
assertThat(groups.size(), equalTo(1));
assertThat(groups.get(1), equalTo(asList(FILE_2MB, FILE_3MB)));
}
@Test
void allNeighboringFilesInAGroupThatTotalOfTheirSizeLessThanOrEqualMaxSize() {
Map<Integer, List<File>> groups = grouping(asList(FILE_1MB, FILE_2MB, FILE_3MB), 3);
assertThat(groups.size(), equalTo(2));
assertThat(groups.get(1), equalTo(asList(FILE_1MB, FILE_2MB)));
assertThat(groups.get(2), equalTo(singletonList(FILE_3MB)));
}
@Test
void eachFileInIndividualGroupIfTheFirstFileAndTotalOfEachNeighboringFilesSizeGreaterThanMaxSize() {
Map<Integer, List<File>> groups = grouping(asList(FILE_2MB, FILE_1MB, FILE_3MB), 2);
assertThat(groups.size(), equalTo(3));
assertThat(groups.get(1), equalTo(singletonList(FILE_2MB)));
assertThat(groups.get(2), equalTo(singletonList(FILE_1MB)));
assertThat(groups.get(3), equalTo(singletonList(FILE_3MB)));
}
@Test
void theFirstEmptyFileInGroup1() throws Throwable {
File emptyFile = file(0);
Map<Integer, List<File>> groups = grouping(singletonList(emptyFile), 2);
assertThat(groups.get(1), equalTo(singletonList(emptyFile)));
}
private static final long MB = 1024 * 1024;
private Map<Integer, List<File>> grouping(List<File> files, long maxSize) {
AtomicInteger group = new AtomicInteger(0);
AtomicLong groupSize = new AtomicLong(maxSize * MB + 1);
return files.stream().collect(groupingBy((file) -> {
if (groupSize.addAndGet(file.length()) <= maxSize * MB) {
return group.get();
}
groupSize.set(file.length());
return group.incrementAndGet();
}));
}
private Function<File, Integer> groupSize(final long maxSize) {
long maxBytesSize = maxSize * MB;
return new Function<File, Integer>() {
private int group;
private long groupSize = maxBytesSize + 1;
@Override
public Integer apply(File file) {
return hasRemainingFor(file) ? current(file) : next(file);
}
private boolean hasRemainingFor(File file) {
return (groupSize += file.length()) <= maxBytesSize;
}
private int next(File file) {
groupSize = file.length();
return ++group;
}
private int current(File file) {
return group;
}
};
}
private File file(int sizeOfMB) {
return new File(String.format("%dMB file", sizeOfMB)) {
@Override
public long length() {
return sizeOfMB * MB;
}
@Override
public boolean equals(Object obj) {
File that = (File) obj;
return length() == that.length();
}
};
}
}
Since the processing of each element highly depends on the processing of the previous element, this task is not suitable for streams. You still can achieve it using a custom collector, but the implementation would be much more complicated than the loop solution.
In other words, there is no improvement when you rewrite this as a stream operation. Stay with the loop.
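Purely for illustration, a rough sketch of what such a custom collector could look like (sequential streams only; the mutable accumulation state carries the group map and running size, maxSizeBytes is assumed to already be converted to bytes, and the usual java.util and java.util.stream.Collector imports are assumed):
private Map<Integer, List<File>> groupBySize(List<File> files, long maxSizeBytes) {
    // mutable accumulation state; only valid for sequential use
    class Acc {
        final Map<Integer, List<File>> groups = new HashMap<>();
        long currentSize;
    }
    return files.stream().collect(Collector.of(
            Acc::new,
            (Acc acc, File file) -> {
                long length = file.length();
                // open a new group for the first file or when the current group would overflow
                if (acc.groups.isEmpty() || acc.currentSize + length > maxSizeBytes) {
                    acc.groups.put(acc.groups.size() + 1, new ArrayList<>());
                    acc.currentSize = 0;
                }
                acc.groups.get(acc.groups.size()).add(file);
                acc.currentSize += length;
            },
            (a, b) -> { throw new UnsupportedOperationException("sequential only"); },
            acc -> acc.groups));
}
Even so, the plain loop remains easier to follow.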
However, there are still some things you can improve.
List<File> allFilesSortedBySize = getListOfFiles();
// get maxSize in bytes ONCE, instead of converting EACH size to MiB
long maxSizeBytes = (long)(maxSize * 1024 * 1024);
// use "diamond operator"
Map<Integer, List<File>> filesGroupedByMaxSizeMap = new HashMap<>();
// start with "create new list" condition to avoid code duplication
long totalLength = maxSizeBytes;
// count is obsolete, the map maintains a size
// the initial "totalLength = maxSizeBytes" forces creating a new list within the loop
List<File> filesWithSizeTotalMaxSize = null;
for(File file: allFilesSortedBySize) {
long length = file.length();
if(maxSizeBytes-totalLength <= length) {
filesWithSizeTotalMaxSize = new ArrayList<>(); // no utility method needed
// store each list immediately, so no action after the loop needed
filesGroupedByMaxSizeMap.put(filesGroupedByMaxSizeMap.size()+1,
filesWithSizeTotalMaxSize);
totalLength = 0;
}
totalLength += length;
filesWithSizeTotalMaxSize.add(file);
}
return filesGroupedByMaxSizeMap;
You may further replace
filesWithSizeTotalMaxSize = new ArrayList<>();
filesGroupedByMaxSizeMap.put(filesGroupedByMaxSizeMap.size()+1,
filesWithSizeTotalMaxSize);
with
filesWithSizeTotalMaxSize = filesGroupedByMaxSizeMap.computeIfAbsent(
filesGroupedByMaxSizeMap.size()+1, x -> new ArrayList<>());
but there might be different opinions whether this is an improvement.
The simplest solution to the problem I could think of is to use an AtomicLong wrapper for the running size and an AtomicInteger wrapper for the group index. These have some useful methods for performing basic arithmetic operations, which are very useful in this particular case.
List<File> files = getListOfFiles();
AtomicLong length = new AtomicLong();
AtomicInteger index = new AtomicInteger(1);
long maxLength = SOME_ARBITRARY_NUMBER;
Map<Integer, List<File>> collect = files.stream().collect(Collectors.groupingBy(
file -> {
if (length.addAndGet(file.length()) <= maxLength) {
return index.get();
}
length.set(file.length());
return index.incrementAndGet();
}
));
return collect;
Basically, Collectors.groupingBy does the work which you intended.
