Bash: call a second curl command only when the status is not 200

I am looking for a way to combine two curl requests in bash so that the second curl runs only when the first does not return status 200.
I tried:
curl -s "https://example.com/first" || curl -s "https://example.com/second"
but this does not work as intended, because the first curl counts as successful even when the server returns, for example, status 404.
How can I call the second one only when the first does not return status 200?
Thanks for your help.

curl -s -o /dev/null -w "%{http_code}" https://example.com | grep -q "^200$" || curl -s https://example.com/2.html
Edit: added an improvement by @tripleee to avoid polluting the output with grep output.
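If the one-liner gets hard to read, the same idea can be written as a small function. This is only a sketch: the names http_status and fetch_with_fallback are made up for illustration, and it assumes that anything other than 200 (including 000, which curl reports when it receives no HTTP response at all) should trigger the second request.

```shell
#!/bin/sh
# Hypothetical helper: print only the HTTP status code of a URL.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Hypothetical helper: request the fallback URL only when the
# primary URL answers with anything other than 200.
fetch_with_fallback() {
    status=$(http_status "$1")
    if [ "$status" != "200" ]; then
        curl -s "$2"
    fi
}

# Example call (URLs taken from the question):
# fetch_with_fallback "https://example.com/first" "https://example.com/second"
```

Comparing the captured status as a string avoids the extra grep process and keeps the condition visible in an ordinary if statement.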

Related

Check if url returns 200 using bash

I need to check if the remote file exists based on the url response by doing:
curl -u myself:XXXXXX -Is https://mylink/path/to/file | head -1
This can return something like:
'HTTP/1.1 200 OK
'
or
'HTTP/1.1 404 Not Found
'
Now I want to extract the HTTP status code, such as 200, from the resulting string above and assign the number to a variable. How can I do that?
Use the -o option to send the headers to /dev/null, and use the -w option to output only the status.
$ curl -o /dev/null -u myself:XXXXXX -Isw '%{http_code}\n' https://mylink/path/to/file
200
$
If you intended to capture the status to a variable, you can omit the newline from the format.
$ status=$(curl ... -o /dev/null -Isw '%{http_code}' ...)
Use grep:
curl -u myself:XXXXXX -Is https://mylink/path/to/file | head -1 | grep -o '[0-9][0-9][0-9]'
Nice and simple:
curl --output /dev/null --silent --head --fail http://google.com
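Because --fail turns HTTP errors into a non-zero exit code, the command above can be used directly as a condition. A minimal sketch, assuming the server answers HEAD requests; url_exists is a hypothetical name, not something from the answers:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a HEAD request to the URL
# does not end in an HTTP error (curl --fail exits non-zero for those).
url_exists() {
    curl --output /dev/null --silent --head --fail "$1"
}

# Example call:
# if url_exists "http://google.com"; then echo "exists"; else echo "missing"; fi
```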

If proxy is down, get a new one

I'm writing my first bash script
LANG="en_US.UTF8" ; export LANG
PROXY=$(shuf -n 1 proxy.txt)
export https_proxy=$PROXY
RUID=$(php -f randuid.php)
curl --data "mydata${RUID}" --user-agent "myuseragent" https://myurl.com/url -o "ticket.txt"
This script also uses curl, but if the proxy is down it gives me this error:
failed to connect PROXY:PORT
How can I make the bash script run again so that it picks another proxy address from proxy.txt?
Thanks in advance.
Run it in a loop until the curl succeeds, for example:
export LANG="en_US.UTF8"
while true; do
PROXY=$(shuf -n 1 proxy.txt)
export https_proxy=$PROXY
RUID=$(php -f randuid.php)
curl --data "mydata${RUID}" --user-agent "myuseragent" https://myurl.com/url -o "ticket.txt" && break
done
Notice the && break at the end of the curl command.
That is, if the curl succeeds, break out of the infinite loop.
If you have multiple curl commands and you need all of them to succeed,
then chain them all together with &&, and add the break after the last one:
curl url1 && \
curl url2 && \
break
Also, as @Inian pointed out,
you could use the --proxy flag to pass a proxy URL to curl without the extra step of setting https_proxy, for example:
curl --proxy "$(shuf -n 1 proxy.txt)" --data "mydata${RUID}" --user-agent "myuseragent" https://myurl.com/url -o "ticket.txt"
Lastly, note that because the selection is random, the same proxy may come up more than once before you find one that works.
To avoid that, you could iterate over the shuffled proxies instead of using an infinite loop:
export LANG="en_US.UTF8"
shuf proxy.txt | while read -r proxy; do
ruid=$(php -f randuid.php)
curl --proxy "$proxy" --data "mydata${ruid}" --user-agent "myuseragent" https://myurl.com/url -o "ticket.txt" && break
done
I also lowercased your user-defined variables,
as capitalization is not recommended for those.
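Wrapping the loop over shuffled proxies in a function makes it easy to report the case where no proxy works at all. The following is a sketch under assumptions: try_proxies is a hypothetical name, the --data and --user-agent options from the question are left out for brevity, and the output file defaults to ticket.txt.

```shell
#!/bin/sh
# Hypothetical helper: read proxy addresses from standard input, one
# per line, and stop at the first one through which curl succeeds.
# Prints the working proxy; returns 1 if every proxy failed.
try_proxies() {
    url=$1
    out=${2:-ticket.txt}
    while read -r proxy; do
        if curl --silent --proxy "$proxy" --output "$out" "$url"; then
            printf '%s\n' "$proxy"
            return 0
        fi
    done
    return 1
}

# Example call:
# shuf proxy.txt | try_proxies "https://myurl.com/url" || echo "no working proxy" >&2
```

Each proxy is tried at most once, and the caller can act on the exit status instead of looping forever.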
I know I accepted @janos's answer, but since I can't edit it I'm going to add this:
response=$(curl --proxy "$proxy" --silent --write-out "\n%{http_code}\n" https://myurl.com/url)
status_code=$(echo "$response" | sed -n '$p')
html=$(echo "$response" | sed '$d')
case "$status_code" in
200) echo 'Working!'
;;
*)
echo 'Not working, trying again!';
exec "$0" "$@"
esac
This will run my script again if it returns a 503 status code (or any status other than 200), which is what I wanted :)
And with @janos's code it will run again if the proxy is not working.
Thank you everyone, I achieved what I wanted.
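The two sed calls can also be replaced by shell parameter expansion. This is a sketch rather than part of the original answer: split_response is a hypothetical name, and it assumes the response was captured with the status code appended on its own final line, as the --write-out format above does.

```shell
#!/bin/sh
# Hypothetical helper: split "body + newline + status code" into the
# variables body and status_code without spawning sed.
split_response() {
    nl='
'
    status_code=${1##*"$nl"}   # everything after the last newline
    body=${1%"$nl"*}           # everything before the last newline
}

# Example call:
# response=$(curl --silent --write-out '\n%{http_code}' "https://myurl.com/url")
# split_response "$response"
# echo "$status_code"
```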

How to get response time from curl request (via command-line)

How can I capture the response time from a CURL request?
curl https://www.google.com
I found this approach in an article that returns a value in seconds
curl -o /dev/null -s -w %{time_total}\\n https://www.google.com
It outputs something like:
0.059
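Since the value is a decimal number, comparing it against a limit needs something other than the shell's integer tests. A sketch, assuming awk is available; within_time_limit is a hypothetical name and the limit is given in seconds:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the total request time for the URL
# is at most the given number of seconds.
within_time_limit() {
    elapsed=$(curl -o /dev/null -s -w '%{time_total}' "$1") || return 2
    # awk compares the two values numerically; exit 0 means "within limit".
    awk -v t="$elapsed" -v max="$2" 'BEGIN { exit !(t <= max) }'
}

# Example call:
# within_time_limit "https://www.google.com" 0.5 || echo "too slow" >&2
```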

I need to execute a curl script on Jenkins to check the status of a URL

The script should check the HTTP status code for the URL and show an error when the status code doesn't match the expected value, e.g. 200.
In Jenkins, if this script fails, the build should fail and a mail should be triggered through the post-build procedure.
Another interesting feature of curl is its -f/--fail option. If set, it will tell curl to fail on any HTTP error, i.e. curl will have an exit code different from 0, if the server response status code was not 1xx/2xx/3xx, i.e. if it was 4xx or above, so
curl --silent --fail "http://www.example.org/" >/dev/null
or (equivalently):
curl -sf "http://www.example.org/" >/dev/null
would have an exit code of 22 rather than 0, if the URL could not be found or if some other HTTP error occurred. See man curl for a description of curl's various exit codes.
You can use a simple shell command, as described in this answer:
curl -s -o /dev/null -w "%{http_code}" http://www.example.org/
To make the build fail on an unexpected status, add the following shell script:
response=$(curl -s -o /dev/null -w "%{http_code}\n" http://www.example.org/)
if [ "$response" != "200" ]
then
exit 1
fi
exit 1 will mark build as failed
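For a build log it helps to print which URL failed and what it returned. The following sketch generalizes the script above; check_status is a hypothetical name, and the expected code defaults to 200.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the URL answers with the
# expected status code; otherwise explain the mismatch on stderr.
check_status() {
    url=$1
    expected=${2:-200}
    actual=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    if [ "$actual" != "$expected" ]; then
        echo "ERROR: $url returned $actual, expected $expected" >&2
        return 1
    fi
}

# Example call in a Jenkins shell step:
# check_status "http://www.example.org/" 200 || exit 1
```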
Jenkins also has HTTP Request Plugin that can trigger HTTP requests.
For example this is how you can check response status and content:
def response = httpRequest "http://httpbin.org/response-headers?param1=${param1}"
println('Status: '+response.status)
println('Response: '+response.content)
You could try:
response=`curl -k -s -X GET --url "<url_of_the_request>"`
echo "${response}"
How about passing the URL at run time to curl in a bash script?
URL=www.google.com
"curl --location --request GET URL"
How can we pass the URL at run time?
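The attempt above passes the literal word URL to curl, and the quotes around the whole line make the shell look for a single command with that entire name. One possible way to take the URL at run time is to read it from an argument and expand it with $. This is a sketch only; fetch_url and the script name fetch.sh are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: fetch whatever URL is passed as the first argument.
fetch_url() {
    if [ -z "$1" ]; then
        echo "usage: fetch_url URL" >&2
        return 2
    fi
    curl --location --silent --request GET "$1"
}

# In a script file, forward the script's own first argument:
# fetch_url "$1"      # run as: ./fetch.sh www.google.com
```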

Script to get the HTTP status code of a list of urls?

I have a list of URLS that I need to check, to see if they still work or not. I would like to write a bash script that does that for me.
I only need the returned HTTP status code, i.e. 200, 404, 500 and so forth. Nothing more.
EDIT Note that there is an issue if the page says "404 not found" but returns a 200 OK message. It's a misconfigured web server, but you may have to consider this case.
For more on this, see Check if a URL goes to a page containing the text "404"
Curl has a specific option, --write-out, for this:
$ curl -o /dev/null --silent --head --write-out '%{http_code}\n' <url>
200
-o /dev/null throws away the usual output
--silent throws away the progress meter
--head makes a HEAD HTTP request, instead of GET
--write-out '%{http_code}\n' prints the required status code
To wrap this up in a complete Bash script:
#!/bin/bash
while read -r LINE; do
curl -o /dev/null --silent --head --write-out "%{http_code} $LINE\n" "$LINE"
done < url-list.txt
(Eagle-eyed readers will notice that this uses one curl process per URL, which imposes fork and TCP connection penalties. It would be faster if multiple URLs were combined in a single curl, but there isn't space to write out the monstrous repetition of options that curl requires to do this.)
wget --spider -S "http://url/to/be/checked" 2>&1 | grep "HTTP/" | awk '{print $2}'
prints only the status code for you
Extending the answer already provided by Phil: adding parallelism to it is a no-brainer in bash if you use xargs for the call.
Here is the code:
xargs -n1 -P 10 curl -o /dev/null --silent --head --write-out '%{url_effective}: %{http_code}\n' < url.lst
-n1: use just one value (from the list) as argument to the curl call
-P10: Keep 10 curl processes alive at any time (i.e. 10 parallel connections)
Check the --write-out parameter in the curl manual for more data you can extract with it (times, etc.).
In case it helps someone, this is the call I'm currently using:
xargs -n1 -P 10 curl -o /dev/null --silent --head --write-out '%{url_effective};%{http_code};%{time_total};%{time_namelookup};%{time_connect};%{size_download};%{speed_download}\n' < url.lst | tee results.csv
It just outputs a bunch of data into a csv file that can be imported into any office tool.
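The resulting file can also be summarized directly in the shell. A sketch, assuming the semicolon-separated layout written by the command above (status code in the second field) and URLs that contain no semicolons; summarize_statuses is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: count how many URLs returned each status code.
# Expects lines of the form url;http_code;... as produced above.
summarize_statuses() {
    awk -F';' '{ count[$2]++ } END { for (code in count) print code, count[code] }' "$1" | sort
}

# Example call:
# summarize_statuses results.csv
```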
This relies on widely available wget, present almost everywhere, even on Alpine Linux.
wget --server-response --spider --quiet "${url}" 2>&1 | awk 'NR==1{print $2}'
The explanations are as follows:
--quiet
Turn off Wget's output.
Source - wget man pages
--spider
[ ... ] it will not download the pages, just check that they are there. [ ... ]
Source - wget man pages
--server-response
Print the headers sent by HTTP servers and responses sent by FTP servers.
Source - wget man pages
What they don't say about --server-response is that those headers are printed to standard error (stderr), hence the need to redirect it to standard output with 2>&1.
With the output sent to standard output, we can pipe it to awk to extract the HTTP status code. That code is:
the second ($2) non-blank group of characters: {$2}
on the very first line of the header: NR==1
And because we want to print it... {print $2}.
wget --server-response --spider --quiet "${url}" 2>&1 | awk 'NR==1{print $2}'
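Note that NR==1 picks the first response, so for a URL that redirects it may report the redirect status (for example 301) rather than the final one. A sketch of a variant that keeps the last status line instead; last_status is a hypothetical name, and it assumes every response in the output starts with an HTTP/ status line.

```shell
#!/bin/sh
# Hypothetical helper: read response headers on standard input and
# print the status code of the last HTTP/ status line seen.
last_status() {
    awk '$1 ~ /^HTTP\// { code = $2 } END { print code }'
}

# Example call:
# wget --server-response --spider --quiet "${url}" 2>&1 | last_status
```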
Use curl to fetch the HTTP-header only (not the whole file) and parse it:
$ curl -I --stderr /dev/null http://www.google.co.uk/index.html | head -1 | cut -d' ' -f2
200
wget -S -i *file* will get you the headers from each url in a file.
Filter through grep for the status code specifically.
I found a tool, "webchk", written in Python. It returns a status code for each URL in a list.
https://pypi.org/project/webchk/
Output looks like this:
▶ webchk -i ./dxieu.txt | grep '200'
http://salesforce-case-status.dxi.eu/login ... 200 OK (0.108)
https://support.dxi.eu/hc/en-gb ... 200 OK (0.389)
https://support.dxi.eu/hc/en-gb ... 200 OK (0.401)
Hope that helps!
Keeping in mind that curl is not always available (particularly in containers), there are issues with this solution:
wget --server-response --spider --quiet "${url}" 2>&1 | awk 'NR==1{print $2}'
which will return exit status of 0 even if the URL doesn't exist.
Alternatively, here is a reasonable container health check using wget:
wget -S --spider -q -t 1 "${url}" 2>&1 | grep "200 OK" > /dev/null
While it may not give you the exact status, it will at least give you a valid exit code based on the health response (even with redirects on the endpoint).
Due to https://mywiki.wooledge.org/BashPitfalls#Non-atomic_writes_with_xargs_-P (output from parallel jobs in xargs risks being mixed), I would use GNU Parallel instead of xargs to parallelize:
cat url.lst |
parallel -P0 -q curl -o /dev/null --silent --head --write-out '%{url_effective}: %{http_code}\n' > outfile
In this particular case it may be safe to use xargs because the output is so short, so the problem with using xargs is rather that if someone later changes the code to do something bigger, it will no longer be safe. Or if someone reads this question and thinks he can replace curl with something else, then that may also not be safe.

Resources