CharacterStreamReadingMessageSource.stdin() and EOF - spring-boot

I am using CharacterStreamReadingMessageSource in a spring integration flow:
IntegrationFlows.from(CharacterStreamReadingMessageSource.stdin())
It works. The problem is that if I pipe a file to the process:
cat file | java -jar app.jar
or
java -jar app.jar < file
once the file has been read, the EOF is not propagated, stdin stays active, and the process does not end. Is there anything I can do to make it terminate? Manually entering Ctrl-Z on the command line works as expected and closes the application (a Spring Boot app, no web).

Unfortunately, it won't work in that scenario; it's designed for console input.
The CharacterStreamReadingMessageSource wraps System.in in a BufferedReader and uses readLine(). Since readLine() blocks and we don't want to tie up a thread for long periods, we check reader.ready(), which returns false if there is no data or the stream is closed.
It should probably provide an option for blocking for this use case, but when used with a real console, it would block forever.
In the meantime, you could create a copy of the class and change receive() ...
@Override
public Message<String> receive() {
    try {
        synchronized (this.monitor) {
            // if (!this.reader.ready()) { // remove this
            //     return null;
            // }
            String line = this.reader.readLine();
            if (line == null) { // add this
                ((ConfigurableApplicationContext) getApplicationContext()).close();
            }
            return (line != null) ? new GenericMessage<String>(line) : null;
        }
    }
    catch (IOException e) {
        throw new MessagingException("IO failure occurred in adapter", e);
    }
}
(removing the ready check, and shutting down the context at EOF).
I opened a JIRA Issue.
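For reference, such a copy would be wired into the flow the same way as the stock factory method. A minimal sketch, assuming the copied class is named MyStdinMessageSource (both the name and the poller interval are made up):

// A sketch only: MyStdinMessageSource is the hypothetical copy of
// CharacterStreamReadingMessageSource with the receive() change shown above.
@Bean
public MyStdinMessageSource stdinSource() {
    return new MyStdinMessageSource(new InputStreamReader(System.in));
}

@Bean
public IntegrationFlow stdinFlow(MyStdinMessageSource stdinSource) {
    return IntegrationFlows
            .from(stdinSource, e -> e.poller(Pollers.fixedDelay(100)))
            .handle(message -> System.out.println(message.getPayload()))
            .get();
}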

Implementing Jump Hosts with SSHJ

Somebody asked for this before, and there is a pull-request whose code was rewritten before it got merged; someone else managed to build a solution based on that pull-request. However, there is no example for the final version that ended up in the library.
So that doesn't really help me, given my limited understanding of SSH. Basically, there are two scenarios I want to solve:
a common SSH session via some jump hosts:
user1@jump1.com
user2@jump2.com
user3@jump3.com
admin@server.com
ending in an SSH session where the connecting user is free to work in that shell on server.com, i.e. what a plain ssh admin@server.com command would do in the shell on jump3.com.
like the above, but ending in a port forwarding to server.com:80
That is possible with ssh's ProxyCommand, but I want to do it in code with SSHJ, and that's where I fail to figure out how. What I have so far is:
SSHClient hop1 = new SSHClient();
try {
    Path knownHosts = rootConfig.getKnownHosts();
    if (knownHosts != null) {
        hop1.loadKnownHosts(knownHosts.toFile());
    } else {
        hop1.loadKnownHosts();
    }
    Path authenticationFile = hop1Config.getAuthenticationFile();
    if (authenticationFile != null) {
        KeyProvider keyProvider = hop1.loadKeys(authenticationFile.toString(), (String) null);
        hop1.authPublickey(hop1Config.getUser(), keyProvider);
    } else {
        hop1.authPassword(hop1Config.getUser(), hop1Config.getPassword());
    }
    // I found these methods:
    hop1.getConnection();
    hop1.getSocket();
    // and now what?
} catch (IOException e) {
    logger.error("Failed to open ssh-connection to {}", hop1Config, e);
}
I noticed the class LocalPortForwarder.DirectTCPIPChannel, but I don't know what values to instantiate it with, or how to use it with the rest afterwards.
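For what it's worth, here is a minimal sketch of the chaining approach, assuming an SSHJ version that contains the newDirectConnection()/connectVia() methods from the merged pull-request mentioned above. Host names, users, the default-key authentication and the shortened two-hop chain are all placeholder assumptions, and exception handling is omitted:

// A sketch, not a verified solution: each hop is tunnelled through the previous
// client via a direct-tcpip channel, similar to what ssh's ProxyJump/ProxyCommand does.
SSHClient jump1 = new SSHClient();
jump1.loadKnownHosts();
jump1.connect("jump1.com");
jump1.authPublickey("user1");

SSHClient jump2 = new SSHClient();
jump2.loadKnownHosts();
jump2.connectVia(jump1.newDirectConnection("jump2.com", 22));
jump2.authPublickey("user2");

// Scenario 1: an interactive shell on server.com, reached through the chain
SSHClient target = new SSHClient();
target.loadKnownHosts();
target.connectVia(jump2.newDirectConnection("server.com", 22));
target.authPublickey("admin");
Session session = target.startSession();
session.allocateDefaultPTY();
Session.Shell shell = session.startShell();
// wire shell.getInputStream()/getOutputStream() to System.in/System.out as needed

// Scenario 2: forward local port 8080 to server.com:80 via the last jump host
ServerSocket serverSocket = new ServerSocket(8080);
LocalPortForwarder.Parameters params =
        new LocalPortForwarder.Parameters("localhost", 8080, "server.com", 80);
jump2.newLocalPortForwarder(params, serverSocket).listen();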

How to create a permanent directory for my files with spring boot

I am working with Spring Boot and Spring Content. I want to store all my pictures and videos in one directory, but my code tries to create the directory again every time I rerun the application.
I have the bean below, and when I run the app again I get a NullPointerException because the directory already exists. I want the directory to be created just once, with every file stored there:
@Bean
File filesystemRoot() {
    try {
        return Files.createDirectory(Paths.get("/tmp/photo_video_myram")).toFile();
    } catch (IOException io) {}
    return null;
}
@Bean
FileSystemResourceLoader fileSystemResourceLoader() {
    return new FileSystemResourceLoader(filesystemRoot().getAbsolutePath());
}
One solution would be to check whether the directory already exists:
@Bean
File filesystemRoot() {
    File tmpDir = new File("/tmp/photo_video_myram");
    if (!tmpDir.isDirectory()) {
        try {
            return Files.createDirectory(tmpDir.toPath()).toFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return tmpDir;
}
The isDirectory() check ensures the directory is only created when it does not already exist; otherwise the existing one is returned.
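Alternatively (a sketch, not part of the original answer), Files.createDirectories() is idempotent - it simply returns if the directory already exists - so the explicit check can be dropped:

@Bean
File filesystemRoot() throws IOException {
    // createDirectories does not fail when the directory is already there
    return Files.createDirectories(Paths.get("/tmp/photo_video_myram")).toFile();
}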
Meanwhile, there is another way to achieve this when you use Spring Boot together with spring-content-fs-boot-starter.
According to the documentation at https://paulcwarren.github.io/spring-content/refs/release/fs-index.html#_spring_boot_configuration it should be sufficient to add
spring.content.fs.filesystemRoot=/tmp/photo_video_myram
to your application.properties file.

How to use IntegrationFlows for new files?

I was following a tutorial on how to listen to a folder with Spring Integration and SseEmitter. I now have this code:
@Bean
IntegrationFlow inboundFlow(@Value("${input-dir:file:C:\\Users\\kader\\Desktop\\Scaned\\}") File in) {
    return IntegrationFlows
            .from(Files.inboundAdapter(in).autoCreateDirectory(true),
                    poller -> poller.poller(spec -> spec.fixedRate(1000L)))
            .transform(File.class, File::getAbsolutePath)
            .handle(String.class, (path, map) -> {
                sses.forEach((sse) -> {
                    try {
                        sse.send(SseEmitter.event().name("spring").data(path));
                    }
                    catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                });
                return null;
            })
            .get();
}
It works, but it sends all the files in the specified directory, including files that already exist. Is there any way to make it ignore those and send only new files?
Well, actually, since you don't configure any filters on the Files.inboundAdapter(), the following logic applies:
// no filters are provided
else if (Boolean.FALSE.equals(this.preventDuplicates)) {
    filtersNeeded.add(new AcceptAllFileListFilter<File>());
}
else { // preventDuplicates is either TRUE or NULL
    filtersNeeded.add(new AcceptOnceFileListFilter<File>());
}
Therefore an AcceptOnceFileListFilter is applied, so files that have already been polled are not picked up again on subsequent poll tasks.
However, you probably mean what happens after an application restart; in that case, yes, all the files are pulled again, because the AcceptOnceFileListFilter keeps its state only in memory.
I believe you need to study what the FileListFilter is and use one appropriate for your use case: https://docs.spring.io/spring-integration/docs/current/reference/html/files.html#file-reading
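For example (a sketch, not from the original answer), a FileSystemPersistentAcceptOnceFileListFilter backed by a metadata store keeps the "already seen" state across restarts. The store bean, the "scanned-" key prefix and the simplified terminal handler below are assumptions:

@Bean
ConcurrentMetadataStore metadataStore() {
    // persists the names of already-processed files to a local properties file
    return new PropertiesPersistingMetadataStore();
}

@Bean
IntegrationFlow inboundFlow(@Value("${input-dir:file:C:\\Users\\kader\\Desktop\\Scaned\\}") File in,
        ConcurrentMetadataStore metadataStore) {
    return IntegrationFlows
            .from(Files.inboundAdapter(in)
                            .autoCreateDirectory(true)
                            .filter(new FileSystemPersistentAcceptOnceFileListFilter(metadataStore, "scanned-")),
                    poller -> poller.poller(spec -> spec.fixedRate(1000L)))
            .transform(File.class, File::getAbsolutePath)
            .handle(message -> System.out.println(message.getPayload()))
            .get();
}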

Download and save a file from ClientRequest using ExchangeFunction in Project Reactor

I have a problem with correctly saving a file after its download completes in Project Reactor.
class HttpImageClientDownloader implements ImageClientDownloader {

    private final ExchangeFunction exchangeFunction;

    HttpImageClientDownloader() {
        this.exchangeFunction = ExchangeFunctions.create(new ReactorClientHttpConnector());
    }

    @Override
    public Mono<File> downloadImage(String url, Path destination) {
        ClientRequest clientRequest = ClientRequest.create(HttpMethod.GET, URI.create(url)).build();
        return exchangeFunction.exchange(clientRequest)
                .map(clientResponse -> clientResponse.body(BodyExtractors.toDataBuffers()))
                //.flatMapMany(clientResponse -> clientResponse.body(BodyExtractors.toDataBuffers()))
                .flatMap(dataBuffer -> {
                    AsynchronousFileChannel fileChannel = createFile(destination);
                    return DataBufferUtils
                            .write(dataBuffer, fileChannel, 0)
                            .publishOn(Schedulers.elastic())
                            .doOnNext(DataBufferUtils::release)
                            .then(Mono.just(destination.toFile()));
                });
    }

    private AsynchronousFileChannel createFile(Path path) {
        try {
            return AsynchronousFileChannel.open(path, StandardOpenOption.CREATE);
        } catch (Exception e) {
            throw new ImageDownloadException("Error while creating file: " + path, e);
        }
    }
}
So my questions are:
Is DataBufferUtils.write(dataBuffer, fileChannel, 0) blocking? What about when the disk is slow?
Second, what happens when an ImageDownloadException occurs? I want to release the given data buffer in doOnNext - is that a good place for this kind of operation?
I also think this line:
.map(clientResponse -> clientResponse.body(BodyExtractors.toDataBuffers()))
could be blocking...
Here's another (shorter) way to achieve that:
Flux<DataBuffer> data = this.webClient.get()
        .uri("/greeting")
        .retrieve()
        .bodyToFlux(DataBuffer.class);

Path file = Files.createTempFile("spring", null);
WritableByteChannel channel = Files.newByteChannel(file, StandardOpenOption.WRITE);

Mono<File> result = DataBufferUtils.write(data, channel)
        .map(DataBufferUtils::release)
        .then(Mono.just(file));
Now DataBufferUtils::write operations are not blocking because they use non-blocking IO with channels. Writing to such channels means it'll write whatever it can to the output buffer (i.e. may write all the DataBuffer or just part of it).
Using Flux::map or Flux::doOnNext is the right place to do that. But you're right, if an error occurs, you're still responsible for releasing the current buffer (and all the remaining ones). There might be something we can improve here in Spring Framework, please keep an eye on SPR-16782.
I don't see how your last sample shows anything blocking: all methods return reactive types and none are doing blocking I/O.
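As a side note (a sketch, assuming a Spring Framework version that provides the Path-based overload of DataBufferUtils.write; not part of the original answer), the channel handling and buffer release can be delegated to the framework entirely:

Flux<DataBuffer> data = this.webClient.get()
        .uri("/greeting")
        .retrieve()
        .bodyToFlux(DataBuffer.class);

Path file = Files.createTempFile("spring", null);

// the Path overload opens the channel, writes the buffers and releases them itself
Mono<File> result = DataBufferUtils.write(data, file, StandardOpenOption.WRITE)
        .then(Mono.just(file));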

How to prevent a Hadoop job from failing on a corrupted input file

I'm running a Hadoop job over many input files.
But if one of the files is corrupted, the whole job fails.
How can I make the job ignore the corrupted file? Ideally it would write a counter or an error log entry for me, but not fail the whole job.
It depends on where your job is failing. If a line is corrupt and an Exception is thrown somewhere in your map method, then you should just be able to wrap the body of your map method in a try / catch and log the error:
protected void map(LongWritable key, Text value, Context context) {
    try {
        // parse value to an int
        int val = Integer.parseInt(value.toString());
        // do something with key and val..
    } catch (NumberFormatException nfe) {
        // log error and continue
    }
}
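Since the question also asks for a counter or error log, the catch block can additionally record each skipped record in the job counters (a sketch; the counter group and name are made up):

protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    try {
        int val = Integer.parseInt(value.toString());
        // do something with key and val..
    } catch (NumberFormatException nfe) {
        // counters show up in the job UI and in the client output when the job finishes
        context.getCounter("ParseErrors", "SkippedRecords").increment(1);
    }
}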
But if the error is thrown by your InputFormat's RecordReader, then you'll need to amend the mapper's run(..) method, whose default implementation is as follows:
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
        map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
}
So you could amend this to try and catch the exception around the context.nextKeyValue() call, but you have to be careful about simply ignoring any errors thrown by the reader - an IOException, for example, may not be 'skippable' by just ignoring it.
If you have written your own InputFormat / RecordReader, and you have a specific exception which denotes record failure but will allow you to skip over and continue parsing, then something like this will probably work:
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (true) {
        try {
            if (!context.nextKeyValue()) {
                break;
            } else {
                map(context.getCurrentKey(), context.getCurrentValue(), context);
            }
        } catch (SkippableRecordException sre) {
            // log error
        }
    }
    cleanup(context);
}
But just to reiterate - your RecordReader must be able to recover after an error, otherwise the above code could send you into an infinite loop.
For your specific case - if you just want to ignore a file upon the first failure then you can update the run method to something much simpler:
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    try {
        while (context.nextKeyValue()) {
            map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
        cleanup(context);
    } catch (Exception e) {
        // log error
    }
}
Some final words of warning:
You need to make sure that it isn't your mapper code which is causing the exception to be thrown, otherwise you'll be ignoring files for the wrong reason.
Files that claim to be GZip compressed but aren't valid GZip will actually fail during initialization of the record reader, so the above will not catch this type of error (you'll need to write your own record reader implementation). This is true for any file error that is thrown during record reader creation.
This is what failure traps are used for in Cascading:
Whenever an operation fails and throws an exception, if there is an associated trap, the offending Tuple is saved to the resource specified by the trap Tap. This allows the job to continue processing without any data loss.
This will essentially let your job continue and let you check your corrupt files later.
If you are somewhat familiar with Cascading, in your flow definition statement:
new FlowDef().addTrap( String branchName, Tap trap );
Failure Traps
There is also another possible way: you could use the mapred.max.map.failures.percent configuration option. Of course, solving the problem this way could also hide other problems occurring during the map phase.
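For example (a sketch; the 5% threshold is an arbitrary choice), the option is set on the job configuration so that the job succeeds as long as no more than that percentage of map tasks fail:

Configuration conf = new Configuration();
// tolerate up to 5% failed map tasks without failing the whole job (old mapred API property name)
conf.setInt("mapred.max.map.failures.percent", 5);
Job job = Job.getInstance(conf, "my-job");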
