curl in bash script vs curl one liner - bash

This code outputs an HTTP status of 000, which seems to indicate something didn't connect properly, but when I run this curl outside of the bash script it works fine and produces a 200, so something with this code is off... any guidance?
#!/bin/bash
URLs=$(< test.txt | grep Url | awk -F\ ' { print $2 } ')
# printf "Preparing to check $URLs \n"
for line in $URLs
do curl -L -s -w "%{http_code} %{url_effective}\\n" $line
done
http://beerpla.net/2010/06/10/how-to-display-just-the-http-response-code-in-cli-curl/

Your script works on my VT.
I added a couple of debugging lines; this may help you see where any metacharacters are getting in, as I would have to agree with the posted comments.
I've written the lines in the for loop out to a file, which is then printed with od.
I have amended the curl line to grab just the last line, to get only the response code.
#!/bin/bash
echo -n > $HOME/Desktop/urltstfile # truncate urltstfile
URLs=$(cat testurl.txt | grep Url | awk -F\ ' { print $2 } ')
# printf "Preparing to check $URLs \n"
for line in $URLs
do
    echo $line >> $HOME/Desktop/urltstfile
    echo line:$line:
    curl -IL -s -w "%{http_code}\n" $line | tail -1
done
od -c $HOME/Desktop/urltstfile
#do curl -L -s -w "%{http_code} %{url_effective}\\n" "$line\n"
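If the 000 really does come from stray metacharacters (for example DOS carriage returns) in test.txt, one possible variant of the loop, a sketch that strips them and quotes the URL, would be (file name and Url field layout assumed from the question):
#!/bin/bash
# Sketch: same idea as the script above, but strip CRs and quote the URL.
# Assumes test.txt contains lines like "Url http://example.com" as in the question.
while read -r _ url; do
    url=${url//$'\r'/}    # drop any DOS carriage returns
    curl -L -s -o /dev/null -w "%{http_code} %{url_effective}\n" "$url"
done < <(grep Url test.txt)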

Related

How do I check the HTTP status code and also parse the payload

Imagine I have the following code in a bash script:
curl -s https://cat-fact.herokuapp.com/facts/random?animal=cat | jq .
Notice that I wish to display the payload of the response by passing it to jq.
Now suppose those curls sometimes return a 404; in such cases my script currently still succeeds, so what I need to do is check the status code and exit 1 as appropriate (e.g. for a 404 or 503). I've googled around and found https://superuser.com/a/442395/722402 which suggests --write-out "%{http_code}" might be useful, but that simply prints the http_code after printing the payload:
$ curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | jq .
{
  "_id": "591f98783b90f7150a19c1ab",
  "__v": 0,
  "text": "Cats and kittens should be acquired in pairs whenever possible as cat families interact best in pairs.",
  "updatedAt": "2018-12-05T05:56:30.384Z",
  "createdAt": "2018-01-04T01:10:54.673Z",
  "deleted": false,
  "type": "cat",
  "source": "api",
  "used": false
}
200
What I actually want is to still output the payload, but also be able to check the HTTP status code and fail accordingly. I'm a bash noob so I'm having trouble figuring this out. Help please?
I'm using a Mac by the way, not sure if that matters or not (I'm vaguely aware that some commands work differently on Mac)
Update: I've pieced this together, which sort of works, I think. It's not very elegant though; I'm looking for something better.
func() {
    echo "${@:1:$#-1}"
}
response=$(curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | jq .)
http_code=$(echo $response | awk '{print $NF}')
func $response | jq .
if [ $http_code == "503" ]; then
    echo "Exiting with error due to 503"
    exit 1
elif [ $http_code == "404" ]; then
    echo "Exiting with error due to 404"
    exit 1
fi
What about this? It uses a temporary file. It seems a bit complicated to me, but it separates your content.
# copy/paste doesn't work with the following
curl -s --write-out \
"%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | \
tee test.txt | \     # split output to the file and stdout
sed -e 's-.*\}--' | \     # remove everything before the last '}'
grep 200 && \     # look for the string 200; the next step only runs on success
echo && \     # a newline to juice up the output
cat test.txt | \     # re-read the saved output
sed 's-}.*$-}-' | \     # remove the trailing line with the status
jq     # format the json
Here's a copy/paste version:
curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | tee test.txt | sed -e 's-.*\}--' | grep 200 && echo && cat test.txt | sed 's-}.*$-}-' | jq
This is my attempt. Hope it works for you too.
#!/bin/bash
result=$( curl -i -s 'https://cat-fact.herokuapp.com/facts/random?animal=cat' )
status=$( echo "$result" | grep -E '^HTTPS?/[1-9][.][1-9] [1-9][0-9][0-9]' | grep -o ' [1-9][0-9][0-9] ')
payload=$( echo "$result" | sed -n '/^\s*$/,//{/^\s*$/ !p}' )
echo "STATUS : $status"
echo "PAYLOAD : $payload"
Output
STATUS : 200
PAYLOAD : {"_id":"591f98803b90f7150a19c23f","__v":0,"text":"Cats can't taste sweets.","updatedAt":"2018-12-05T05:56:30.384Z","createdAt":"2018-01-04T01:10:54.673Z","deleted":false,"type":"cat","source":"api","used":false}
AWK version
payload=$( echo "$result" | awk '{ if( $0 ~ /^\s*$/ ){ c_p = 1 ; next; } if (c_p) { print $0} }' )
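Note for macOS users (the asker is on a Mac): BSD awk doesn't support GNU awk's \s escape, so a more portable spelling of the same extraction should be:
# portable version: [[:space:]] works in both BSD and GNU awk
payload=$( echo "$result" | awk '/^[[:space:]]*$/ { c_p = 1; next } c_p { print }' )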
Regards!
EDIT : I have simplified this even more by using the -i flag
EDIT II : Removed empty line from payload
EDIT III : Included an awk method to extract the payload in case sed is problematic
Borrowing from here you can do:
#!/bin/bash
result=$(curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat)
http_code="${result: -3}"
response="${result:0:${#result}-3}"
echo "Response code: " $http_code
echo "Response: "
echo "$response" | jq .
Where
${result: -3} takes the last three characters of the string (offset -3 through the end). ${result: -3:3} would also work: offset -3 with length 3.
${#result} gives us the length of the string.
${result:0:${#result}-3} takes everything from the beginning of result up to, but not including, the last three characters, i.e. the payload without the http_code. A quick demo follows below.
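As an illustration of those expansions, with a made-up string rather than real API output:
result='{"some":"payload"}200'
echo "${result: -3}"              # -> 200
echo "${#result}"                 # -> 21 (length of the whole string)
echo "${result:0:${#result}-3}"   # -> {"some":"payload"}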
The site cat-fact.herokuapp.com isn't working now, so I had to test it with another site.

Bash, loop unexpected stop

I'm having problems with the last part of my bash script. It receives input of 500 web addresses and is supposed to fetch the server information from each. It works for a bit but then just stops, at around the 45th element. Any thoughts on my loop at the end?
#initializing variables
timeout=5
headerFile="lab06.output"
dataFile="fortune500.tsv"
dataURL="http://www.tech.mtu.edu/~toarney/sat3310/lab09/"
dataPath="/home/pjvaglic/Documents/labs/lab06/data/"
curlOptions="--fail --connect-timeout $timeout"
#creating the array
declare -a myWebsitesarray
#obtaining the data file
wget $dataURL$dataFile -O $dataPath$dataFile
#getting rid of the crap from dos
sed -n "s/^m//" $dataPath$dataFile
readarray -t myWebsitesarray < <(cut -f3 -d$'\t' $dataPath$dataFile)
myWebsitesarray=("${myWebsitesarray[@]:1}")
websitesCount=${#myWebsitesarray[*]}
echo "There are $websitesCount websites in $dataPath$dataFile"
#echo -e ${myWebsitesarray[200]}
#printing each line in the array
for line in ${myWebsitesarray[*]}
do
    echo "$line"
done
#run each website URL and gather header information
for line in "${myWebsitearray[#]}"
do
((count++))
echo -e "\\rPlease wait... $count of $websitesCount"
curl --head "$curlOptions" "$line" | awk '/Server: / {print $2 }' >> $dataPath$headerFile
done
#display results
echo "Results: "
sort $dataPath$headerFile | uniq -c | sort -n
It would certainly help if you actually passed the --connect-timeout option to curl. As written, you are currently passing the single argument --fail --connect-timeout $timeout rather than 3 distinct arguments --fail, --connect-timeout, and $timeout. This is one instance where you should not quote the variable. IOW, use:
curl --head $curlOptions "$line"
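An even more robust alternative (a sketch, not part of the original answer) is to keep the options in a bash array, which preserves the word boundaries no matter how they are quoted:
# each array element stays one distinct argument to curl
curlOptions=(--fail --connect-timeout "$timeout")
curl --head "${curlOptions[@]}" "$line" | awk '/Server: / { print $2 }' >> "$dataPath$headerFile"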

Check return code in bash while capturing text

When running an ldapsearch we get a return code indicating success or failure, so we can use an if statement to check for success.
On failure, when using debug, it prints whether the cert validation failed. How can I capture the output of the command while checking the success or failure of ldapsearch?
ldapIP=`nslookup corpadssl.glb.intel.com | awk '/^Address: / { print $2 }' | cut -d' ' -f2`
server=`nslookup $ldapIP | awk -F"= " '/name/{print $2}'`
ldap='ldapsearch -x -d8 -H "ldaps://$ldapIP" -b "dc=corp,dc=xxxxx,dc=com" -D "name@am.corp.com" -w "366676" "(mailNickname=sdent)"'
while true; do
if [[ $ldap ]] <-- capture text output here ??
then
:
else
echo $server $ldapIP `date` >> fail.txt
fi
sleep 5
done
As @codeforester suggested, you can use $? to check the return code of the last command.
ldapIP=`nslookup corpadssl.glb.intel.com | awk '/^Address: / { print $2 }' | cut -d' ' -f2`
server=`nslookup $ldapIP | awk -F"= " '/name/{print $2}'`
while true; do
    captured=$(ldapsearch -x -d8 -H "ldaps://$ldapIP" -b "dc=corp,dc=xxxxx,dc=com" -D "name@am.corp.com" -w "366676" "(mailNickname=sdent)")
    if [ $? -eq 0 ]
    then
        echo "${captured}"
    else
        echo "$server $ldapIP `date`" >> fail.txt
    fi
    sleep 5
done
EDIT: at @rici's suggestion (and because I forgot to do it)... ldapsearch needs to be run before the if.
EDIT2: at @Charles Duffy's suggestion (we will get there), we don't need to store the command in a variable.
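Presumably that last suggestion means something like the sketch below, where the assignment runs inside the if so its exit status is tested directly (same hypothetical connection details as above):
while true; do
    # the if tests ldapsearch's exit status directly; no $? needed
    if captured=$(ldapsearch -x -d8 -H "ldaps://$ldapIP" -b "dc=corp,dc=xxxxx,dc=com" -D "name@am.corp.com" -w "366676" "(mailNickname=sdent)"); then
        echo "${captured}"
    else
        echo "$server $ldapIP $(date)" >> fail.txt
    fi
    sleep 5
done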

BASH shell script echo to output on same line

I have a simple BASH shell script which checks the HTTP response code of a curl command.
The logic is fine, but I am stuck on "simply" printing out the "output".
I am using GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
I would like to output the URL with a tab - then the 404|200|501|502 response. For example:
http://www.google.co.uk<tab>200
I am also getting a strange error where the "http" part of a URL is being overwritten with the 200|404|501|502. Is there a basic BASH shell scripting (feature) which I am not using?
thanks
Miles.
#!/bin/bash
NAMES=`cat $1`
for i in $NAMES
do
    URL=$i
    statuscode=`curl -s -I -L $i | grep 'HTTP' | awk '{print $2}'`
    case $statuscode in
    200)
        echo -ne $URL\t$statuscode;;
    301)
        echo -ne "\t $statuscode";;
    302)
        echo -ne "\t $statuscode";;
    404)
        echo -ne "\t $statuscode";;
    esac
done
From this answer you can use the code
response=$(curl --write-out %{http_code} --silent --output /dev/null servername)
Substituted into your loop this would be
#!/bin/bash
NAMES=`cat $1`
for i in $NAMES
do
    URL=$i
    statuscode=$(curl --write-out %{http_code} --silent --output /dev/null $i)
    case $statuscode in
    200)
        echo -e "$URL\t$statuscode" ;;
    301)
        echo -e "$URL\t$statuscode" ;;
    302)
        echo -e "$URL\t$statuscode" ;;
    404)
        echo -e "$URL\t$statuscode" ;;
    *)
        ;;
    esac
done
I've cleaned up the echo statements too, so for each URL there is a new line.
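Since every branch prints the same thing, the case could be collapsed even further (a minor tidy-up, not in the original answer):
case $statuscode in
200|301|302|404)
    echo -e "$URL\t$statuscode" ;;
esac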
try
200)
echo -ne "$URL\t$statuscode" ;;
I'm taking a stab here, but I think what's confusing you is the fact that curl sometimes returns more than one set of headers (hence more than one status code) when the initial request gets redirected.
For example:
[me@home]$ curl -sIL www.google.com | awk '/HTTP/{print $2}'
302
200
When you're printing that in a loop, it would appear that the second status code has become part of the next URL.
If this is indeed your problem, then there are several ways to solve this depending on what you're trying to achieve.
If you don't want to follow redirects, simply leave out the -L option in curl:
statuscode=$(curl -sI $i | awk '/HTTP/{print $2}')
To take only the last status code, pipe the whole command to tail -n1:
statuscode=$(curl -sI $i | awk '/HTTP/{print $2}' | tail -n1)
To show all codes in order, replace the linebreaks with spaces:
statuscode=$(curl -sI $i | awk '/HTTP/{print $2}' | tr "\n" " ")
For example, using the 3rd scenario:
[me@home]$ cat script.sh
#!/bin/bash
for URL in www.stackoverflow.com stackoverflow.com stackoverflow.com/xxx
do
statuscode=$(curl -siL "$URL" | awk '/^HTTP/{print $2}' | tr '\n' ' ')
echo -e "${URL}\t${statuscode}"
done
[me@home]$ ./script.sh
www.stackoverflow.com 301 200
stackoverflow.com 200
stackoverflow.com/xxx 404

AWK: execute CURL on each line and parse result

Given an input stream with the following lines:
123
456
789
098
...
I would like to call
curl -s http://foo.bar/some.php?id=xxx
with xxx being the number from each line, and every time let an awk script fetch some information from the curl output, which is written to the output stream. I am wondering if this is possible without using awk's system() call in the following way:
cat lines | grep "^[0-9]*$" | awk '
{
system("curl -s " $0 \
" | awk \'{ #parsing; print }\'")
}'
You can use bash and avoid the awk system call:
grep "^[0-9]*$" lines | while read line; do
curl -s "http://foo.bar/some.php?id=$line" | awk 'do your parsing ...'
done
A shell loop would achieve a similar result, as follows:
#!/bin/bash
for f in $(cat lines | grep "^[0-9]*$"); do
    curl -s "http://foo.bar/some.php?id=$f" | awk '{....}'
done
Alternative methods for doing similar tasks include using Perl or Python with an HTTP client.
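Staying in the shell, another loop-free option is xargs (a sketch, using the same hypothetical URL):
# run one curl per id; xargs substitutes each filtered line for {}
grep '^[0-9]*$' lines | xargs -I{} curl -s 'http://foo.bar/some.php?id={}'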
If the ids get appended to your file dynamically, you can daemonize a small while loop to keep checking the file for more data, like this:
while IFS= read -d $'\n' -r a || sleep 1; do [[ -n "$a" ]] && curl -s "http://foo.bar/some.php?id=${a}"; done < lines.txt
Otherwise, if the file is static, you can change the sleep 1 to a break and it will read the file and quit when there is no data left; pretty useful to know how to do.
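For a static file that collapses to the usual read loop (same hypothetical URL as above):
# the loop simply ends at EOF, no polling needed
while IFS= read -r a; do
    [[ -n "$a" ]] && curl -s "http://foo.bar/some.php?id=${a}"
done < lines.txt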
