How to only read a few lines from a remote file? - ruby

Before downloading a file, I need to set up how it (typically a .csv, but not always) will be parsed.
I don't want to download the whole file, especially if the "headers" do not match what is expected.
Is there a way to download only up to a certain number of bytes and then gracefully kill the connection?

There's no explicit support for this in the FTP protocol.
There's an expired draft for a RANG command that would allow this:
https://datatracker.ietf.org/doc/html/draft-bryan-ftp-range-08
But that's obviously supported only by the newest FTP servers.
Though there's nothing that prevents you from initiating a normal (full) download and forcefully breaking it as soon as you get the amount of data you need.
All you need to do is close the data transfer connection. This is basically what all FTP clients do when an end user decides to abort a transfer.
This approach might result in a few error messages in the FTP server's log.
If you can use the SFTP protocol instead, it's easy: SFTP supports partial reads natively.
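The question is tagged Ruby, but the technique is protocol-level; here it is sketched in Go using the third-party github.com/jlaffaye/ftp package (its Response is a plain io.ReadCloser over the data connection; the host, credentials, file name, and byte limit below are all placeholders, and the API should be verified against the package's current documentation):

```go
package main

import (
	"fmt"
	"io"
	"log"

	"github.com/jlaffaye/ftp" // third-party FTP client (assumed API)
)

func main() {
	// Placeholders: substitute your own server and credentials.
	c, err := ftp.Dial("ftp.example.com:21")
	if err != nil {
		log.Fatal(err)
	}
	defer c.Quit()

	if err := c.Login("user", "password"); err != nil {
		log.Fatal(err)
	}

	// Start a normal (full) RETR transfer...
	r, err := c.Retr("data.csv")
	if err != nil {
		log.Fatal(err)
	}

	// ...but read only the first 4 KiB, enough to inspect the headers...
	head := make([]byte, 4096)
	n, err := io.ReadFull(r, head)
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		log.Fatal(err) // short files are fine; other errors are not
	}

	// ...then close the data connection, abandoning the rest of the file.
	r.Close()

	fmt.Printf("first %d bytes: %q\n", n, head[:n])
}
```

As noted above, the server may log the aborted transfer as an error, but that is exactly what an interactive client's "abort" button does too.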

Related

SMB2 opens for Query Directory requests

Is there a way to definitively identify whether an open request sent by an SMB2 client is for reading a file or for enumerating a directory (an SMB2 Query Directory request)? Sometimes the requests are compounded and in some cases they are not. Does any flag in the open request indicate that the client is attempting to open the directory to read its contents rather than a file?
I know that the create-options flag cannot be relied upon for this, as some clients may not set it.

How to refresh Panic Transmit file list?

I like the Panic Transmit client for FTP and SFTP, but I have lost work a couple of times because the file list is cached and can't be completely refreshed easily.
The Refresh option in the View menu only refreshes the current directory, not the subdirectories.
I've contacted Panic about this and got a response that it's just the way it works; they would like to change it, but not in this release. I've tried a couple of other FTP clients and found them lacking, e.g. Fetch only shows the remote side and uses the Finder for the local side, which gets confusing quite quickly.
Does anyone know where Transmit keeps the cache of the file list, so I can delete it and get a full refresh?
If not, it's back to SCP, rsync, and command-line FTP.
I found a crude workaround for this. Transmit keeps the cache in memory, so quitting the application clears it. I just make a habit of always quitting from the Dock before any usage that requires up-to-date timestamps.

Transfer a big file in golang

A client sends a file, whose size may be more than 5 GB, to a slave server, and the slave then sends it on to a master server.
Will the slave save a temporary file to itself? I don't want that to happen, because it would slow the upload and waste the slave's memory.
Is there any way to avoid this? And what is the best way to transfer a big file in Go?
Yes, there's a standard way to avoid the store-and-forward approach: as soon as a client connects to the slave server, the latter should open a connection to the master server and simply stream the data from the client there. Typically this is done using the io.Copy() function. Thanks to Go's interface-based duck typing, this works for TCP connections and HTTP requests/responses alike.
(To get a better explanation, you would have to narrow your question down.)
Part of the solution even appears in the similar questions suggested by Stack Overflow.
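A minimal sketch of that streaming relay on the slave, assuming HTTP on both hops (the master URL and listen port are placeholders): the client's request body is handed straight to the outgoing request, so the slave never spools the file to disk and holds only a small copy buffer in memory.

```go
package main

import (
	"io"
	"log"
	"net/http"
)

// relay accepts an upload from the client and streams it to the master.
// http.Post reads from r.Body on the fly (chunked transfer encoding when
// the length is unknown), so the whole file is never buffered.
func relay(w http.ResponseWriter, r *http.Request) {
	resp, err := http.Post("http://master.example.com/upload", // placeholder URL
		r.Header.Get("Content-Type"), r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// Pass the master's status and response back to the client.
	w.WriteHeader(resp.StatusCode)
	io.Copy(w, resp.Body)
}

func main() {
	http.HandleFunc("/upload", relay)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

The same shape works for raw TCP: accept the client connection, dial the master, and io.Copy() from one net.Conn to the other.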

Client-server synchronization

I found a nice answer from S. Lott about what I've been searching for:
Client-server synchronization pattern / algorithm?
But my question now is: what if the client has the wrong time?
Here's my problem:
Let's say the client's clock is 1 hour behind the server's. The client changes a file, so its last-write time is now 1 hour behind the server's. When the user starts the program that synchronizes the file, the server says of the changed file: "Oh, that file you have there is 1 hour older than mine, so let's replace it." But that's wrong, because the user's file is actually newer, so it should be uploaded to the server.
I need a system that checks whether the file is newer on the server or on the client, and that works even if the clocks are wrong or differ.
Any ideas?
By the way, I am trying to write a cloud program.
If you're resolving conflicts manually (which makes sense for most applications), this can probably be done better with versioning rather than timestamps. When a client modifies a file, set a flag. When synchronizing, check the flag and the versions:
If the client flag is set and the client and server versions are the same, send the client file to the server.
If the client flag is not set and the server version is newer, send the server file to the client.
If the client flag is set and the server version is newer, a conflict has occurred and must be resolved.
The versions are per-file and should be sent along with the files.
Reset all client flags after synchronization.
This 'flag' can simply be a check of whether the file's last-modified time differs from the time that file was received from the server (which we can store separately right after getting the file from the server). The resulting decision table is sketched below.
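A compact sketch of that decision table in Go (the type, field, and function names are illustrative, not from the original answer):

```go
package filesync

// FileState is a hypothetical per-file record; names are made up.
type FileState struct {
	ClientDirty   bool // set when the client modified the file locally
	ClientVersion int  // version of the copy the client last received
	ServerVersion int  // current version on the server
}

// Action is what the synchronizer should do for one file.
type Action int

const (
	NoOp Action = iota
	Upload
	Download
	Conflict
)

// Decide applies the three rules above.
func Decide(s FileState) Action {
	switch {
	case s.ClientDirty && s.ClientVersion == s.ServerVersion:
		return Upload // only the client changed the file
	case !s.ClientDirty && s.ServerVersion > s.ClientVersion:
		return Download // only the server has a newer version
	case s.ClientDirty && s.ServerVersion > s.ClientVersion:
		return Conflict // both sides changed; resolve manually
	default:
		return NoOp // nothing changed on either side
	}
}
```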
Alternatively, you could sync the clocks.
Here's one possible solution:
When receiving files from the server, first get the current time from the server, then offset the timestamp of each received file on the client side by the difference between the server and client clocks. When sending files to the server, offset in the other direction.
But this seems more complex than necessary.
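The offset arithmetic itself is tiny; as a sketch (in Go, with illustrative names), assuming you sample both clocks at roughly the same instant:

```go
package clocksync

import "time"

// ToServerTime re-expresses a client-side timestamp on the server's clock,
// given a near-simultaneous sample of both clocks. Names are illustrative.
func ToServerTime(localMtime, clientNow, serverNow time.Time) time.Time {
	offset := serverNow.Sub(clientNow) // positive if the server clock is ahead
	return localMtime.Add(offset)
}
```

The complexity the answer warns about is not this arithmetic but keeping the clock sample fresh and handling network latency between the two "now" readings.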

Scripting a major multi-file multi-server FTP upload: is smart interrupted transfer resuming possible?

I'm trying to upload several hundred files to 10+ different servers. I previously accomplished this using FileZilla, but I'm trying to make it work using just common command-line tools and shell scripts, so that it isn't dependent on working from a particular host.
Right now I have a shell script that takes a list of servers (in ftp://user:pass@host.com format) and spawns a new background instance of 'ftp ftp://user:pass@host.com < batch.file' for each server.
This works in principle, but as soon as the connection to a given server times out, resets, or gets interrupted, it breaks. While all the other transfers keep going, I have no way of resuming whichever transfer(s) have been interrupted. The only way to know this has happened is to check each receiving server by hand. This sucks!
Right now I'm looking at wput and lftp, but both would require installation on whichever host I run the upload from. Any suggestions on how to accomplish this in a simpler way?
I would recommend using rsync. It's really good at transferring only the data that has changed, which makes it much more efficient than FTP, and it can resume interrupted transfers; a sketch follows. Hope that helps!
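For example, assuming rsync is available on both ends (the local path, user, host, and remote path below are placeholders), a re-runnable upload could look like:

```sh
# --partial keeps half-transferred files on the remote side, so
# re-running the same command resumes instead of starting over.
rsync -avz --partial ./localdir/ user@host.example.com:/remote/dir/
```

Because the command is idempotent, the script can simply loop and retry each server until rsync exits cleanly.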
