Will a Spring Integration listener pick up a file while it is still being transferred?

Good day,
I am using Spring Integration 2.2 to build a listener on a folder. How can I check whether my listener will pick up a file while it is still being transferred, or whether it will wait until the complete file has been transferred?
Say there is a 10 MB txt file that, due to a slow connection, takes 10 seconds to transfer to the folder. At the 5th second, only 5 MB has been transferred, with 5 MB still to go; is it possible the listener picks up the file and processes it?
Here is the relevant part of my listener configuration in XML:
<int-file:inbound-channel-adapter id="hostFilesIn" directory="${hostfile.dir.in}"
        prevent-duplicates="true" filename-regex="${hostfile.in.filename-regex}">
    <int:poller id="poller" fixed-rate="${poller.fixrate:15000}" max-messages-per-poll="1" />
</int-file:inbound-channel-adapter>
Let me know if you need more information.
Here is the problem that happens. From the log, we can see a file being added to the queue:
2019-02-01 11:13:33.011 [task-scheduler-9] DEBUG org.springframework.integration.file.FileReadingMessageSource - Added to queue: [/apps/NAS/ftp/in/incompleteL041.TXT]
After that, it hits the following error:
2019-02-01 11:13:33.096 [task-scheduler-9] DEBUG c.c.c.c.g.a.auditservice.SimpleErrorIdResolver - ERROR MESSAGE7 :
java.io.FileNotFoundException: /apps/NAS/ftp/in/incompleteL041.TXT (A file or directory in the path name does not exist.)
This is because, when the file finishes transferring, the sender renames it to L041.TXT.
Kindly advise.

Your understanding is correct; that is how the file system behaves. If the file is already there, even while it is still being written, it is visible to everyone, including the mentioned FileReadingMessageSource.
To prevent reading incomplete files, you need a convention where an unfinished file has a different name, and the filters on the file inbound channel adapter must be configured so that they do not see such a file.
Since filename-regex accepts only files whose names match it, a pattern with a negative lookahead such as ^(?!incomplete).*\.TXT$ should work for your current naming scheme.
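For example, applied to the adapter from the question (the channel name below is a placeholder, and this sketch assumes incomplete files keep the "incomplete" prefix until the rename):
<!-- Sketch: accept only files that do NOT start with the "incomplete" prefix. -->
<int-file:inbound-channel-adapter id="hostFilesIn"
        directory="${hostfile.dir.in}"
        channel="hostFilesChannel"
        prevent-duplicates="true"
        filename-regex="^(?!incomplete).*\.TXT$">
    <int:poller id="poller" fixed-rate="${poller.fixrate:15000}" max-messages-per-poll="1" />
</int-file:inbound-channel-adapter>
With this pattern the adapter never lists incompleteL041.TXT and only sees the file once the sender has renamed it to L041.TXT.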

Related

Spring batch integration file lock access

I have a Spring Batch integration where multiple servers are polling a single file directory. This causes a problem where a file can be picked up and processed by more than one server. I have attempted to add an nio-lock on the file once a server has got it, but that locks the file for processing, so the server can't read the contents of the file.
Is there a Spring Batch/Integration solution to this problem, or is there a way to rename the file as soon as it is picked up by a node?
Consider using FileSystemPersistentAcceptOnceFileListFilter with a shared MetadataStore: http://docs.spring.io/spring-integration/reference/html/system-management-chapter.html#metadata-store
That way, only one instance of your application will be able to pick up a given file.
Even if we found a solution for the nio-lock, you should understand that a lock means "do not touch until freed". So when one instance has done its work, another one would be ready to pick up the file; I guess that isn't your goal.
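A minimal sketch of that setup, assuming a Redis-backed store shared by all instances (the bean names, key prefix, directory, and channel below are illustrative, not part of the original answer):
<!-- Shared metadata store; Redis is one possible backing store. -->
<bean id="redisConnectionFactory"
      class="org.springframework.data.redis.connection.jedis.JedisConnectionFactory"/>

<bean id="metadataStore"
      class="org.springframework.integration.redis.metadata.RedisMetadataStore">
    <constructor-arg ref="redisConnectionFactory"/>
</bean>

<bean id="acceptOnceFilter"
      class="org.springframework.integration.file.filters.FileSystemPersistentAcceptOnceFileListFilter">
    <constructor-arg ref="metadataStore"/>
    <constructor-arg value="files-"/>
</bean>

<!-- Each server runs this same adapter; the shared store ensures
     a given file is accepted by only one of them. -->
<int-file:inbound-channel-adapter id="sharedDirIn"
        directory="${shared.dir.in}"
        channel="filesChannel"
        filter="acceptOnceFilter">
    <int:poller fixed-rate="5000"/>
</int-file:inbound-channel-adapter>
Because the filter records each accepted file in the shared metadata store under the "files-" prefix, a second server polling the same directory sees the entry and skips the file.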

Monitoring files entering and leaving a folder (file queue)

I have a folder which is used as a file queue; every second, many files enter and leave it. I want to write a script that maintains 2 log files: one logs the name of each file and the time at which it entered the file queue, the other logs the name of each file and the time at which it left the queue. Could you please help?
I don't think you can reach that goal with a batch file.
I would prefer a C# console application or a PowerShell script. In both cases, FileSystemWatcher would be the class you need.
C# documentation and example
PowerShell example

Does int-file:outbound-gateway ignore duplicate file names in its in-memory state?

Thanks for your attention. I am using Spring Integration in my project. I want to retrieve files from servers into a tmp folder with int-ftp:inbound-channel-adapter and move the files to their original folder with int-file:outbound-gateway for later batch processing. But when a file name is a duplicate, int-file:outbound-gateway does not transmit the file and seems to ignore it. How can I solve this problem?
<int-file:outbound-gateway id="tmp-mover"
        request-channel="ready-to-process-inbound-tmp-mover"
        reply-channel="ready-to-process-inbound"
        directory="${backupRootPath}/ali/in//"
        mode="REPLACE" delete-source-files="true"/>
Set the local-filter in the ftp inbound channel adapter to an AcceptAllFileListFilter. By default, it's an AcceptOnceFileListFilter.
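In XML that change would look something like the sketch below; everything apart from the local-filter reference (session factory, directories, poller) is a placeholder:
<!-- Sketch: the local-filter reference is the point here;
     the other attributes are placeholders. -->
<bean id="acceptAllFilter"
      class="org.springframework.integration.file.filters.AcceptAllFileListFilter"/>

<int-ftp:inbound-channel-adapter id="ftpIn"
        session-factory="ftpSessionFactory"
        remote-directory="/remote/in"
        local-directory="${java.io.tmpdir}/ftp-in"
        local-filter="acceptAllFilter"
        channel="ready-to-process-inbound-tmp-mover">
    <int:poller fixed-rate="5000"/>
</int-ftp:inbound-channel-adapter>
With an AcceptAllFileListFilter as the local-filter, a re-fetched file with the same name is re-emitted locally, so the outbound gateway gets a chance to move it again.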

Logstash close file descriptors?

BACKGROUND:
We have rsyslog creating log file directories like: /var/log/rsyslog/SERVER-NAME/LOG-DATE/LOG-FILE-NAME
So multiple servers are spilling out their logs of different dates to a central location.
Now, to read these logs and store them in Elasticsearch for analysis, my Logstash config file looks something like this:
file {
    path => "/var/log/rsyslog/**/*.log"
}
ISSUE:
Now, as the number of log files in the directory increases, Logstash opens file descriptors (FDs) for new files and does not release the FDs of log files that have already been read.
Since the log files are generated per date, once a file has been read it is of no further use, because it will not be updated after that date.
I have increased the open-file limit to 65K in /etc/security/limits.conf.
Can we make Logstash close a handle after some time, so that the number of open file handles does not grow too large?
I think you may have hit this bug: http://github.com/elastic/logstash/issues/1604. Do you have the same symptoms? Exceptions in logs after some time? If you run sudo lsof | grep java | wc -l do you see the descriptors steadily increasing over time? (some of them might close, but some will stay open and their number will increase)
I've been tracking this issue for some time, and I don't know that it's properly solved.
We were in a similar boat, perhaps a bigger one: Logstash couldn't open handles for hundreds of thousands of log files on a box, even though very few of them were being actively written to. LOGSTASH-271 captured this issue, and there were some attempts to patch Logstash, including PR #1260.
It seems a fix may have made its way into Logstash 1.5 with PR #1545, but I've never tested it personally. We ended up forking the underlying library Logstash uses to implement the file input, called FileWatch, into FFileWatch, which adds an "eviction mechanism".
The basic idea behind this approach is to only keep files open while they're being written. Normally, Logstash will open a handle on the file and keep it open forever, but FFileWatch adds an option to close the handle if the file has not changed recently (eviction_interval). I then created a custom build of Logstash using the forked gem.
Obviously this is less than ideal, but it worked for us. Eventually we dropped Logstash entirely for picking up log files, although we still use it further down the log processing pipeline. We implemented our own lightweight log shipper (Franz), which does not suffer from this issue.

logstash forwarder doesn't release file handle

I am running logstash forwarder to ship logs.
The forwarder, Logstash, and Elasticsearch are all on localhost.
I have one UI application whose log files are read by the shipper. While the forwarder is running, archiving of the log file doesn't work: logs keep being appended to the same file. I have configured the log file to archive every minute, so I can see the change. As soon as I stop the forwarder, log file archiving starts working.
I guess the forwarder keeps holding the file handle, and that's why the file does not get archived.
Please help me on this.
regards,
Sunil
Running on Windows? There are known unfixed bugs.
See https://github.com/elasticsearch/logstash/issues/1944
for a possible workaround.
