Broken pipe, closing control connection. while piping small file through funzip using wget - ftp

I'm trying to download a small zip file (1159 bytes) and pipe it through funzip. This works great with larger files fro that server. However three small files give me an error:
Broken pipe, closing control connection.
I use the following code:
wget -O - --ftp-user=username --ftp-password=secret ftp://server/small-file.zip | funzip
Also downloading the file directly works good, only the piping to funzip doesn't work. I suspect the file is too small.
Anyone knows how to fix this?
Edit: Size doesn't seem to matter (don't let the girls tell you otherwise :)), even files of 400 bytes are not giving errors

Ok, if nobody can answer it, I'll answer it myself
I found there are two solutions, one is limiting the download rate for wget
--limit-rate=1000
This works for the files of around 1kb but now sometimes larger files seem to suffer from the same error. It also slows down the whole process.
Now I just pipe the download through a script that sleeps 1 second at the end. This seems to solve it.

Related

How can i CSTORE pixel data of dicom from CMOVE properly?

I have 1 running server for handle C-Move, 2 running server for handle C-Store and remote pacs server(GEPACS)
When i tried to C-Move command from remote pacs to C-Store handler, 1 server(py-netdicom) is build and save the file properly and 1 server(go-netdicom) is not.
So there was couple of problems in go-netdicom.
I fixed the code can handle hexadecimals. It originally not supported on go-netdicom.
This was fix almost every problems in my case but still cannot store pixel data properly.
For example, I got 9117252 bytes from original signal from remote pacs and I saved the data itself, but actually it needs to be 18000000 bytes(got an error). even CT images are short for 3 times(got approximately 180000, but need 524288)
I think the problem caused by might be the encapsulation of pixel-data but not sure.
Is there any tip or some help?
Thank you.
EDIT 4: I've got a clue.link here
Somehow C-STORE command have a kind of transfer syntax.
This offer to scp type(compressed or not) of data scu get.
But still I don't have a idea which part of go-netdicom has to be changed.
I'll delete "python" tag because this is not related with python anymore.
I found the solution.
Somehow, GEPACS send the certain transfer syntax for JPEG compression.
if go-netdicom doesn't have the TransferSyntaxUID then pick the GEPACS's first transfer syntax and that was for JPEG compression.
i just put bigendian and explicitvr (GEPACS default) when transfersyntax is empty.
which placed in contextmanager.go line 101 on AssociateRequest, line 127
Hope this result help someone.
Thank you
The problem here is that go-netdicom uses the first PresentationContext sent in the A_ASSOCIATE_RQ (As you can see in the last image). So it accepts "2.16.840.1.113709.1.2.2" which is a privative Transfer Syntax and it is not in the DICOM standart, so no one can manage the C-STORE at the end.
If you are reading this.. maybe you do not use go-netdicom but the problem could be the same if the error involves the transfer Syntax "2.16.840.1.113709.1.2.2", in the Centricity PACs documentation says: "It is expected that other vendors' applications will ignore all Presentation Context proposed with the GE Private Compress Express Transfer Syntax"
And that is what we are suppose to do. I see a list of PRs in go-netdicom so I suppose it is not mantained, so I will post the change for go-netdicom here. I made this changes in contextmanager.go and works like a charm:

Monitor A File For Additions And Get Last Added Line

I'm having trouble monitoring a file for changes. I need to be able to know when a file changes, and when it does, I need the new line that was added. I intend to parse each line and find ones that match certain criteria, and act on information in those lines. I know the expected number of matching lines ahead of time, but I do not know how many lines in total will be added to the file, or where the matching lines will be.
I've tried 2 packages so far, with no avail.
fsnotify/fsnotify
As fas as I can tell, fsnotify can only tell me when a file is modified, not what the details of the modification was. Since I need to know what exactly was added to the file, this is no good for me.
(As a side-question, can this be run in a loop? The example that I tried exited after just one modification. I need to monitor for multiple modifications.)
hpcloud/tail
This package tries to mimic the Unix tail command, but it seems to have its own issues. The output that I get includes timestamps and other data - I just want the added line, nothing else. Also, it seems to think a file has been modified multiple times, even when it's just one edit. Further, the deal breaker here is that it does not output the last line if the line was not followed by a newline character.
Delegating to tail
I came across this answer, which suggests to delegate this work to the tail command itself, but I need this to work cross-platform (specifically, macOS, Linux and Windows). I don't believe that an equivalent command exists on Windows.
How do I go about tackling this?
#user2515526,
Usually changed diff is out of scope of file watchers' functionality, because, you know, you could change an image, and a watcher would need to keep a track several Mb of a diff in memory, and what if we have thousands of files?
However, as bad as it sounds, this may be exactly the way you want to implement this (sure, depends on your app, etc. - could be fine for text files), i.e. - keeping a map of diffs (1 diff per file) since last modification. Cannot say I like it, but sounds like fsnotify has no support for changes/diffs that you need.
Also, regarding your question about running in a loop, maybe you can get some hints here: https://github.com/kataras/iris/blob/8370d76910cdd8de043753ed81ae080eae8dc798/utils/file.go
Its a framework that allows to build a server that watches for TypeScript file changes. So sounds similar to your case/question.
Cheers,
-D

zmodem upload ends up with strange error

I'm currently trying to upload some files via zmodem to a small system with an embedded linux with busybox. While most files takes a long time through the 9600 BAUD connection, there is one file that always fails (cramfs_cmc-pu2_v2.45.img). With about 4MB it is also the largest one. For the upload I use Le Putty, a Putty fork that supports zmodem. Unfortunately there is no other method to upload files as the ftp server on that machine does not work properly.
The problem is that the upload always ends up with this strange stuff (after some hours of no feedback at all):
# /usr/bin/rz
Sending: cramfs_cmc-pu2_v2.45.img23be50
Bytes Sent: 0/4132864 BPS:0 ETA 00:00
®B#id##íÁ##htCJÁ®B#killíÁ##htCJ®B#killall#íÁ##htCJÁ®B#ln##íÁ##htCJ®B
#logger##íÁ##<H#Jº!#login###íÁ##htCJÁ®B#ls##íÁ##htCJ®B#md5sum##íÁ##¿
##JCø##mgfestart###íÁ##htCJ®B#mkdir###íÁ##htCJ®B#mknod###íÁ##htCJkH>
F¾#
I guessed that it runs out of flash memory but df gives me just
df: /proc/mounts: No such file or directory
Calculation of free space is difficult in that case anyway as the filesystem is jffs2.
Maybe there is anyone with an idea how to solve this problem with that ancient protocol. Thanks in advance.
Edit: Meanwhile I've splitted the file in many smaller ones and tried to upload them. It always fails after two files. This supports the suspicion that there is not enough free space.
Quite simple approach to check how much space there is left, even if you have no "df":
I just copied an existing file several times and the result was: "No space left on the device". So I'm pretty sure that the strange behaviour described above happened because of this.

How do you fix wget from comingling download data when running multiple concurrent instances?

I am running a script that in turn calls another script multiple times in the background with different sets of parameters.
The secondary script first does a wget on an ftp url to get a listing of files at that url. It outputs that to a unique filename.
Simplified example:
Each of these is being called by a separate instance of the secondary script running in the background.
wget --no-verbose 'ftp://foo.com/' -O '/downloads/foo/foo_listing.html' >foo.log
wget --no-verbose 'ftp://bar.com/' -O '/downloads/bar/bar_listing.html' >bar.log
When I run the secondary script once at a time, everything behaves as expected. I get an html file with a list of files, links to them, and information about the files the same way I would when viewing an ftp url through a browser.
Continued simplified one at a time (and expected) example results:
foo_listing.html:
...
foo1.xml ...
foo2.xml ...
...
bar_listing.html:
...
bar3.xml ...
bar4.xml ...
...
When I run the secondary script many times in the background, some of the resulting files, although they have the base urls correct (the one that was passed in) the files listed are from a different run of wget.
Continued simplified multiprocessing (and actual) example results:
foo_listing.html:
...
bar3.xml ...
bar4.xml ...
...
bar_listing.html
correct, as above
Oddly enough, all other files I download seem to work just fine. It's only these listing files that get jumbled up.
The current workaround is to put in a 5 second delay between backgrounded processes. With only that one change everything works perfectly.
Does anybody know how to fix this?
Please don't recommend using some other method of getting the listing files or not running concurrently. I'd like to actually know how to fix this when using wget in many backgrounded processes if possible.
EDIT:
Note:
I am not referring to the status output that wget spews to the screen. I don't care at all about that (that is actually also being stored in separate log files and is working correctly). I'm referring to the data wget is downloading from the web.
Also, I cannot show the exact code that I am using as it is proprietary for my company. There is nothing "wrong" with my code as it works perfectly when putting in a 5 second delay between backgrounded instances.
Log bug with Gnu, use something else for now whenever possible, put in time delays between concurrent runs. Possibly create a wrapper for getting ftp directory listings that only allows one at a time to be retrieved.
:-/

Verify whether ftp is complete or not?

I got an application which is polling on a folder continuously. Once any file is ftp to the folder, the application has to move this file to some other folder for processing.
Here, we don't have any option to verify whether ftp is complete or not.
One command "lsof" is suggested in the technical forums. It got a file description column which gives the file status.
Since, this is a free bsd command and not present in old versions of linux, I want to clarify the usage of this command.
Can you guys tell us your experience in file verification and is there any other alternative solution available?
Also, is there any risk in using this utility?
Appreciate your help in advance.
Thanks,
Mathew Liju
We've done this before in a number of different ways.
Method one:
If you can control the process sending the files, have it send the file itself followed by a sentinel file. For example, send the real file "contracts.doc" followed by a one-byte "contracts.doc.sentinel".
Then have your listener process watch out for the sentinel files. When one of them is created, you should process the equivalent data file, then delete both.
Any data file that's more than a day old and doesn't have a corresponding sentinel file, get rid of it - it was a failed transmission.
Method two:
Keep an eye on the files themselves (specifically the last modification date/time). Only process files whose modification time is more than N minutes in the past. That increases the latency of processing the files but you can usually be certain that, if a file hasn't been written to in five minutes (for example), it's done.
Conclusion:
Both those methods have been used by us successfully in the past. I prefer the first but we had to use the second one once when we were not allowed to change the process sending the files.
The advantage of the first one is that you know the file is ready when the sentinel file appears. With both lsof (I'm assuming you're treating files that aren't open by any process as ready for processing) and the timestamps, it's possible that the FTP crashed in the middle and you may be processing half a file.
There are normally three approaches to this sort of problem.
providing a signal file so that when your file is transferred, an additional file is sent to mark that transfer is complete
add an entry to a log file within that directory to indicate a transfer is complete (this really only works if you have a single peer updating the directory, to avoid concurrency issues)
parsing the file to determine completeness. e.g. does the file start with a length field, or is it obviously incomplete ? e.g. parsing an incomplete XML file will result in a parse error due to the lack of an end element. Depending on your file's size and format, this can be trivial, or can be very time-consuming.
lsof would possibly be an option, although you've identified your Linux portability issue. If you use this, note the -F option, which formats the output suitable for processing by other programs, rather than being human-readable.
EDIT: Pax identified a fourth (!) method I'd forgotten - using the fact that the timestamp of the file hasn't updated in some time.
There is a fifth method. You can also check if the FTP Session is still active. This will work if every peer has it's own ftp user account. As long as the user is not logged off from FTP, assume the files are not complete.

Resources