I have a bash script that downloads some files from an FTP server. The problem is that sometimes curl randomly returns error 6 (couldn't resolve host)! I can open the FTP site in a web browser without any problem. I also noticed that most of the errors occur on the first downloads. Any idea?
I would also like to know how I can make curl retry the download when these errors occur.
Code I used:
curl -m 60 --retry 10 --retry-delay 10 --ftp-method multicwd -C - ftp://some_address/some_file --output ./some_file
Note: I also tried the command without --ftp-method multicwd
OS: CentOS 6.5 64bit
while [ "$ret" != "0" ]; do curl [your options]; ret=$?; sleep 5; done
Assuming those are transitional problems with the server and/or DNS, looping might be of some help. This is a particularly good case for the rarely used (?) until loop:
until curl [your options]; do sleep 5; done
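For example, wrapping the exact command from the question (just a sketch; adjust the sleep to taste):
# Keep retrying the asker's curl command until it exits 0, pausing 5 seconds between attempts.
until curl -m 60 --retry 10 --retry-delay 10 --ftp-method multicwd -C - \
    ftp://some_address/some_file --output ./some_file
do
  sleep 5
done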
In addition, if using curl is not mandatory, wget might be better suited for "unreliable" network connections. From the man page:
GNU Wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.
[...]
Wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports regetting, it will instruct the server to continue the download from where it left off.
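For instance, a rough wget equivalent of the command in the question might be (a sketch; tweak the timeouts and tries to taste):
# -c resumes a partial download, --tries/--waitretry control the retry behaviour,
# --timeout caps each network operation (same placeholder FTP URL as in the question).
wget -c --tries=10 --waitretry=10 --timeout=60 \
    ftp://some_address/some_file -O ./some_file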
Related
As I work remotely, I often have to run scripts that make sense only when I am on the intranet.
But I am not always connected to the intranet, and I would prefer to define a more generic way of testing the connectivity so I can bypass those scripts when I am not in "work-mode" ;)
I want to implement this as a simple bash command or script so I can do something like:
#!/bin/bash
is-intranet-on || echo "Yeah, time to do something!"
If I do this I could even include it in crontab, so I can have scheduled tasks that run only when connected to the intranet.
I need to make this work on both macOS and Linux. Currently I use OpenVPN, but I think that testing for network interfaces would be the wrong approach, because I could configure the VPN on my router, or I could be in the office.
My impression is that the final solution will have to involve some kind of DNS check, but I need to make it reasonably safe, as I don't want surprises from captive portals that may return a fake IP for a DNS entry.
If, for example, you know that you have a server named example.intra whose name is only resolvable within the intranet or over the VPN, and let's say the name resolves to 10.1.1.3 and the machine is pingable, the code would simply be something like:
is_intranet_on() {
[[ $(dig example.intra +short) == "10.1.1.3" ]] && ping -c 1 example.intra &> /dev/null
}
This checks that the DNS name resolves to the expected IP and then pings the host to make sure there is at least some kind of network connectivity.
The function returns code 0 when there is connectivity and code 1 when there is none. You can put this function in your script, or source it.
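For example, it could be used like this (the file name and cron schedule are hypothetical):
# In a script, assuming the function lives in a hypothetical ~/bin/intranet.sh:
. ~/bin/intranet.sh
is_intranet_on && echo "On the intranet, running the job."

# In crontab, forcing bash because the function uses [[ ]]:
# SHELL=/bin/bash
# */15 * * * * . ~/bin/intranet.sh && is_intranet_on && /path/to/intranet-job.sh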
You could modify this to use curl instead to check an HTTPS site, by first obtaining the certificate chain from your server with something like:
openssl s_client -showcerts -servername example.intra -connect example.intra:443 </dev/null > cacert.pem
This command saves the certs into a file named cacert.pem
And then using curl to check that the server is ok using the certificates:
[[ $(curl -s -I -L -m 4 --cacert cacert.pem https://example.intra | head -n 1 | tr -d '\r') == "HTTP/1.1 200 OK" ]]
Change the string HTTP/1.1 200 OK to whatever your server responds with, if needed (for example a 204 status or whatever). The tr -d '\r' strips the carriage return that curl keeps at the end of each header line, so the comparison matches.
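Putting that together, a curl-based variant of the function could look roughly like this (a sketch only, reusing the hypothetical example.intra host and the cacert.pem produced above):
is_intranet_on() {
    # Ask the intranet host for its headers over HTTPS, validated against the saved
    # certificate chain; a non-matching status line (or no answer) returns non-zero.
    [[ $(curl -s -I -L -m 4 --cacert cacert.pem https://example.intra \
        | head -n 1 | tr -d '\r') == "HTTP/1.1 200 OK" ]]
}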
A site I'm working on requires a large number of files to be downloaded from an external system via FTP, daily. This is not of my design; it is the only solution offered up by the external provider (I cannot use SSH/SFTP/SCP).
I've solved this by using wget, run inside a cron task:
wget -m -v -o log.txt --no-parent -nH -P /exampledirectory/ --user username --password password ftp://www.example.com/
Unfortunately, wget does not seem to see the timestamp differences, so when a file is modified, it still returns:
Remote file no newer than local file
`/xxx/data/data.file'
-- not retrieving.
When I manually connect via FTP, I can see differences in the timestamps, so it should be getting the updated file. I'm not able to access or control the target server via any other means.
Is there anything I can do to get around this? Can I force wget to mirror while ignoring timestamps? (I understand this defeats the point of mirroring)...
I'm developing an installer for my YAMon script for *WRT routers (see http://www.dd-wrt.com/phpBB2/viewtopic.php?t=289324).
I'm currently testing on a TP-Link TL-WR1043ND with DD-WRT v3.0-r28647 std (01/02/16). Like many others, this firmware variant does not include curl so I (gracefully) fall back to a wget call. But, it appears that DD-WRT includes a cut-down version of wget so the -C and --no-cache options are not recognized.
Long & short, my wget calls insist on downloading cached versions of the requested files.
BTW - I'm using: wget "$src" -qO "$dst"
where src is the source file on my remote server and dst is the destination on the local router
So far I've unsuccessfully tried to:
1. append a timestamp to the request URL (see the sketch just after this list)
2. reboot the router
3. run stopservice dnsmasq & startservice dnsmasq
None have changed the fact that I'm still getting a cached version of the file.
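For illustration, the timestamp idea in attempt 1 was roughly this (hypothetical URL and paths; the exact form of my attempt isn't important):
# Cache-busting attempt: append the current epoch time to the URL as a query string.
src="http://my.server.example/yamon/installer.sh"   # hypothetical source URL
dst="/tmp/installer.sh"
wget "${src}?ts=$(date +%s)" -qO "$dst"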
I'm beating my head against the wall... any suggestions? Thx!
Al
Not really an answer but a seemingly viable workaround...
After a lot of experimentation, I found that wget seems to always return the latest version of the file from the remote server if the extension on the requested file is '.html'; but if it is something else (e.g., '.txt' or '.sh'), it does not.
I have no clue why this happens or where they are cached.
But now that I do, all of the files required by my installer have an .html extension on the remote server, and the script saves them with the proper extension locally. (Sigh... several days of my life that I won't get back)
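A minimal sketch of the idea (the URL and paths are hypothetical):
# The file lives on the remote server under an .html name and is saved
# locally under its real name.
src="http://my.server.example/yamon/setup.sh.html"   # hypothetical remote path
dst="/tmp/setup.sh"                                   # real local name
wget "$src" -qO "$dst"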
Al
I had the same problem. While getting images from a camera, the HTTP server on the camera always sent the same image.
wget --no-http-keep-alive ..
solved my problem
and my full line is
wget --no-check-certificate --no-cache --no-cookies --no-http-keep-alive $URL -O img.jpg -o wget_last.log
I've been trying to use socat to respond to each connection on a socket it's listening on with a fake HTTP reply. I cannot get it working. It might be because I'm using the cygwin version of socat? I don't know.
Part of the problem is that I want the second argument <some_file_response> not to be written to. In other words, because the address is bidirectional, socat will read what's in response.txt and then write the client's data back into that same file, and I don't want that. Even if I use open:response.txt,rdonly it doesn't work repeatedly. system: doesn't seem to do anything. exec seems like it works; for example I can do exec:'cat response.txt', but that never gets sent to the client connecting to port 1234.
socat -vv tcp-listen:1234,reuseaddr,fork <some_file_response>
I want it to send a file to the client that connects and then close the connection, and do that over and over again (that's why I used fork).
I am putting a bounty on this question. Please only give me solutions that work with the cygwin version from the Windows command prompt.
Tested with cygwin:
socat -vv TCP-LISTEN:1234,crlf,reuseaddr,fork SYSTEM:"echo HTTP/1.0 200; echo Content-Type\: text/plain; echo; cat <some_file_response>"
If you do not want a complete HTTP response, leave out the echos:
socat -vv TCP-LISTEN:1234,crlf,reuseaddr,fork SYSTEM:"cat <some_file_response>"
Taken from socat examples
socat -vv TCP-LISTEN:8000,crlf,reuseaddr,fork SYSTEM:"echo HTTP/1.0 200; echo Content-Type\: text/plain; echo; cat"
This one works:
socat -v -v -d -d TCP-LISTEN:8080,reuseaddr,fork exec:"cat http.response",pipes
Two things to be aware of:
1. Should you add crlf, as in the other answers? I recommend not: crlf caused problems when sending an image. Just use \r\n explicitly in the HTTP response headers instead.
2. Without pipes, no data seems to be sent to the client, and the browser complains:
127.0.0.1 didn’t send any data.
ERR_EMPTY_RESPONSE
Tested in cygwin.
== EDIT ==
If you want to use it inside cmd.exe, make sure PATH is set correctly so that socat and cat can be found.
Say both socat.exe and cat.exe are located under E:\cygwin64\bin:
set PATH=%PATH%;E:\cygwin64\bin
Works in cmd.exe, with socat & cat from cygwin.
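For completeness, a hypothetical http.response with the explicit \r\n line endings mentioned above could be generated like this:
# printf writes the CR/LF pairs literally; Content-Length matches the 13-byte body
# ("Hello, world" plus a trailing newline).
printf 'HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 13\r\n\r\nHello, world\n' > http.response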
I want to setup a simple ssh tunnel from a local machine to a machine on the internet.
I'm using
ssh -D 8080 -f -C -q -N -p 12122 <username>@<hostname>
Setup works fine (I think) because ssh returns asking for the credentials, which I provide.
Then I do
export http_proxy=http://localhost:8080
and
wget http://www.google.com
Wget returns that the request has been sent to the proxy, but no data is received back.
What I need is a way to look at how ssh is processing the request...
To get more information out of your SSH connection for debugging, leave out the -q and -f options, and include -vvv:
ssh -D 8080 -vvv -N -p 12122 <username>@<hostname>
To address your actual problem, by using ssh -D you're essentially setting up a SOCKS proxy which I believe is not supported by default in wget.
You might have better luck with curl, which provides SOCKS support via its --socks5 option.
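For example (a sketch; --socks5-hostname also pushes the DNS lookup through the tunnel, which avoids local resolution issues):
# Route the request through the SSH dynamic forward listening on localhost:8080.
curl --socks5-hostname localhost:8080 http://www.google.com/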
If you really really need to use wget, you'll have to recompile your own version to include socks support. There should be an option for ./configure somewhere along the lines of --with-socks.
Alternatively, look into tsocks, which can intercept outgoing network connections and redirect them through a SOCKS server.
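A rough sketch of the tsocks route (check tsocks.conf(5) for the exact syntax; the values below are only illustrative):
# /etc/tsocks.conf would point at the SSH tunnel, e.g.
#   server = 127.0.0.1
#   server_port = 8080
#   server_type = 5
# then wrap the command so its outgoing TCP connections go through the SOCKS proxy:
tsocks wget http://www.google.com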