How to retrieve the real redirect Location header with curl, without using %{redirect_url}? - bash

I noticed that curl's %{redirect_url} does not always show the URL given in the Location header. For example, if the header is Location: https:/\example.com, this will redirect to https:/\example.com, but curl's %{redirect_url} shows redirect_url: https://host-domain.com/https:/\example.com and does not display the real Location header of the response. (I would like to see the real Location: value.)
This is the Bash script I'm working with:
#!/bin/bash
# Usage: urls-checker.sh domains.txt
FILE="$1"
while read -r LINE; do
# read the response to a variable
response=$(curl -H 'Cache-Control: no-cache' -s -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$LINE")
# get the title
title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'<<<"$response")
# read the write-out from the last line
read -r http_code size_header redirect_url < <(tail -n 1 <<<"$response")
printf "***Url: %s\n\n" "$LINE"
printf "Status: %s\n\n" "$http_code"
printf "Size: %s\n\n" "$size_header"
printf "Redirect-url: %s\n\n" "$redirect_url"
printf "Title: %s\n\n" "$title"
# -c 100 only shows the first 100 bytes of the response
printf "Body: %s\n\n" "$(head -c 100 <<<"$response")"
done < "${FILE}"
How can I print the original Location: header in the "Redirect-url:" line without having to use redirect_url?

To read the exact Location header field value, as returned by the server, you can use the -i/--include option, in combination with grep.
For example:
$ curl 'http://httpbin.org/redirect-to?url=http:/\example.com' -si | grep -oP 'Location: \K.*'
http:/\example.com
Or, if you want to read all headers, content and the --write-out variables line (according to your script):
response=$(curl -H 'Cache-Control: no-cache' -s -i -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$url")
# break the response in parts
headers=$(sed -n '1,/^\r$/p' <<<"$response")
content=$(sed -e '1,/^\r$/d' -e '$d' <<<"$response")
read -r http_code size_header redirect_url < <(tail -n1 <<<"$response")
# get the real Location
location=$(grep -oP 'Location: \K.*' <<<"$headers")
Fully integrated in your script, this looks like:
#!/bin/bash
# Usage: urls-checker.sh domains.txt
file="$1"
while read -r url; do
# read the response to a variable
response=$(curl -H 'Cache-Control: no-cache' -s -i -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$url")
# break the response in parts
headers=$(sed -n '1,/^\r$/p' <<<"$response")
content=$(sed -e '1,/^\r$/d' -e '$d' <<<"$response")
read -r http_code size_header redirect_url < <(tail -n1 <<<"$response")
# get the real Location
location=$(grep -oP 'Location: \K.*' <<<"$headers")
# get the title
title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'<<<"$content")
printf "***Url: %s\n\n" "$url"
printf "Status: %s\n\n" "$http_code"
printf "Size: %s\n\n" "$size_header"
printf "Redirect-url: %s\n\n" "$location"
printf "Title: %s\n\n" "$title"
printf "Body: %s\n\n" "$(head -c 100 <<<"$content")"
done < "$file"
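One caveat with grep -oP 'Location: \K.*': header names are case-insensitive (HTTP/2 servers send them in lowercase), and the matched value keeps the carriage return that ends each header line. A minimal sketch of a more tolerant extraction follows. The helper name extract_location and the canned header block are assumptions made for illustration, so the example runs without a network request.

```shell
#!/bin/bash
# extract_location (hypothetical helper): print the value of the first
# Location header found in a raw header block read from stdin.
# The header name is matched case-insensitively and the CR is stripped.
extract_location() {
  tr -d '\r' |
    awk 'tolower($0) ~ /^location:/ { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Canned headers standing in for the output of: curl -si "$url"
sample_headers=$'HTTP/1.1 302 Found\r\nlocation: http:/\\example.com\r\nContent-Length: 0\r\n\r\n'

location=$(extract_location <<<"$sample_headers")
printf 'Redirect-url: %s\n' "$location"
```

In the script above, the same helper could replace the grep call, for example location=$(extract_location <<<"$headers").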

Following @randomir's answer, and since I only needed the raw redirect URL, I use this command in my batch script:
curl -w "%{redirect_url}" -o /dev/null -s "https://stackoverflow.com/q/46507336/3019002"

https:/\example.com is not a legal URL(*). The fact that this works in browsers is an abomination (that I've fought against), and it does not work in curl. %{redirect_url} shows exactly the URL curl would redirect to...
A URL should use two forward slashes, so the above should look like http://example.com.
(*) = I refuse to accept the WHATWG "definition".

Related

How to insert name field in mailx

My sample data File is
$ cat /fullpath/myfile.csv
a@gmail.com, A Singh
k@gmail.com, K Singh
I am using script.sh
#!/bin/bash
while IFS= read -r line
do
email=$(echo $line | awk -F, '{print $1 }')
name=$(echo $line | awk -F, '{print $2 }')
echo | mailx -v -s "Helo $name" -S smtp-use-starttls -S ssl-verify=ignore -S smtp-auth=login -S smtp=smtp://smtp.gmail.com:587 -S from="xxxx@gmail.com(John Smith)" -S smtp-auth-user=xxxx@gmail.com -S smtp-auth-password=xxxxpassword -S ssl-verify=ignore -S nss-config-dir=~/.certs "$name<$email>"
done < /fullpath/myfile.csv
What is the correct syntax for adding the receiver's name?
I have not been able to find it.
I tried the following:
"$name<$email>"
$name<$email>
-S to:"$name<$email>"
-S To:"$name<$email>"
-S To: "$name <$email>"
-S To: $name <$email>
It picks up the name (A Singh) as the email address and reports an invalid address. If I use To, it picks up To as the address. In other words, whatever comes first after the certs option is treated as the email address.
According to the standard documentation, mailx does not seem to support the -S option, but some systems may add this option.
I recommend you use GNU Mailutils.
To specify a "FROM" name and address, you can use the "-a" option.
-a header:value
--append=header:value
Append the given header to the composed message.
To specify the receiver's name and address, do as you already did: adding the name and email address at the end of the command is enough.
#!/bin/bash
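# split each "email, name" line on the comma
while IFS=, read -r email name
do
# drop the space that follows the comma
name="${name# }"
# "echo |" gives mail an empty body, so it does not read the rest of the CSV from stdin
echo | mail -s "Hello $name" -a "From: John Smith <xxxx@gmail.com>" "$name <$email>"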
#echo | mailx -v -s "Helo $name" -S smtp-use-starttls -S ssl-verify=ignore -S smtp-auth=login -S smtp=smtp://smtp.gmail.com:587 -S from="xxxx#gmail.com(John Smith)" -S smtp-auth-user=xxxx#gmail.com -S smtp-auth-password=xxxxpassword## -S ssl-verify=ignore -S nss-config-dir=~/.certs "$name<$email>"
done < ./myfile.csv
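To check the recipient string without sending any mail, the parsing step can be tried on its own. This is only a sketch: format_recipient is a hypothetical helper name and the address is a made-up example value. It splits the line on the comma, trims the surrounding whitespace, and prints the "Name <address>" form passed to mail above.

```shell
#!/bin/bash
# format_recipient (hypothetical helper): turn one "address, name" CSV line
# into the "name <address>" form used as the recipient argument.
format_recipient() {
  local email name
  # Split on the first comma; the name keeps everything after it.
  IFS=, read -r email name <<<"$1"
  # A plain read with the default IFS trims leading and trailing whitespace.
  read -r email <<<"$email"
  read -r name <<<"$name"
  printf '%s <%s>\n' "$name" "$email"
}

# Example value, not taken from the real data file.
format_recipient 'alice@example.com, Alice Singh'
```

Inside the loop this could be used as mail -s "Hello $name" "$(format_recipient "$line")".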

Exiting while loop bash script from tail

I have a script that tails a log file, and then uploads the line. I would like to have it exit as soon as the first line is read:
#!/bin/bash
tail -n0 -F "$1" | while read LINE; do
(echo "$LINE" | grep -e "$3") && curl -X POST --silent --data-urlencode \
"payload={\"text\": \"$(echo $LINE | sed "s/\"/'/g")\"}" "$2";
done
If you want to exit as soon as the first line is uploaded you can just add a break:
#!/bin/bash
tail -n0 -F "$1" | while read LINE; do
(echo "$LINE" | grep -e "$3") && curl -X POST --silent --data-urlencode \
"payload={\"text\": \"$(echo $LINE | sed "s/\"/'/g")\"}" "$2" && break;
done
The issue was that the tail command wasn't getting killed. Here is a slightly modified version of my script (I didn't end up needing the echo to stdout):
#!/bin/bash
tail -n0 -F "$1" | while read LINE; do
curl -X POST --data-urlencode "payload={\"text\": \"$(echo $LINE | sed "s/\"/'/g")\"}" "$2" && pkill -P $$ tail
done
This answer helped as well: https://superuser.com/questions/270529/monitoring-a-file-until-a-string-is-found
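The underlying problem is that break only ends the while loop; tail keeps running until its next write to the closed pipe fails. One way to avoid matching on the process name is to start tail in the background so its PID is known, then kill exactly that process. The sketch below assumes a named pipe in a temporary directory; wait_for_line is a hypothetical helper name, and the curl upload is left out so the example stays self-contained.

```shell
#!/bin/bash
# wait_for_line FILE PATTERN (hypothetical helper): print the first newly
# appended line of FILE that matches PATTERN, then stop tail explicitly.
wait_for_line() {
  local file=$1 pattern=$2 dir fifo tail_pid line
  dir=$(mktemp -d) || return 1
  fifo=$dir/lines
  mkfifo "$fifo" || return 1
  # Run tail in the background so that $! holds its PID.
  tail -n0 -F "$file" >"$fifo" 2>/dev/null &
  tail_pid=$!
  while IFS= read -r line; do
    if grep -q -e "$pattern" <<<"$line"; then
      printf '%s\n' "$line"
      break
    fi
  done <"$fifo"
  # Kill only the tail started above, then clean up.
  kill "$tail_pid" 2>/dev/null
  wait "$tail_pid" 2>/dev/null
  rm -rf "$dir"
}
```

A call such as line=$(wait_for_line "$1" "$3") would then return as soon as a matching line arrives, and the upload could follow it.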

How to use Curl to also output first 20 characters from the response?

I'm using curl to retrieve the http_code, size_header, redirect_url and the website title with:
#!/bin/bash
FILE="$1"
while read LINE; do
curl -H 'Cache-Control: no-cache' -i -s -k -o >(perl -l -0777 -ne 'print $1 if /<title.*?>\s*(.*?)\s*<\/title/si') --silent --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$LINE"
echo " $LINE"
done < ${FILE}
but I would also like to retrieve the first 20 characters of the response to have more information.
The idea is to get this output:
%{http_code} %{size_header} %{redirect_url} $website_title $website_first_20_bytes
I only need to add the $website_first_20_bytes to the output. How can I achieve this?
PS: Not the first 20 characters of the response headers, only of the page source.
So you probably mean something like this then (I added a bunch of newlines etc. to the output which you can trim as you please):
#!/bin/bash
FILE="$1"
while read -r LINE; do
# read the response to a variable
response=$(curl -H 'Cache-Control: no-cache' -s -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$LINE")
# get the title
title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'<<<"$response")
# read the write-out from the last line
read -r http_code size_header redirect_url < <(tail -n 1 <<<"$response")
printf "Status: %s\n" "$http_code"
printf "Size: %s\n" "$size_header"
printf "Redirect-url: %s\n" "$redirect_url"
printf "Url: %s\n" "$LINE"
printf "Title: %s\n" "$title"
# -c 20 only shows the first 20 bytes of the response
printf "Body: %s\n" "$(head -c 20 <<<"$response")"
done < "${FILE}"
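Note that tail -n 1 only isolates the write-out values cleanly when the body ends with a newline; otherwise the last line of the body and the write-out text share one line. A possible variation, sketched here under the assumption that curl is called with --write-out '\n%{http_code} %{size_header} %{redirect_url}', puts the values on a line of their own and splits the response with parameter expansion. The response string below is canned sample data, not real curl output.

```shell
#!/bin/bash
# Canned sample standing in for:
#   response=$(curl -s -k --write-out '\n%{http_code} %{size_header} %{redirect_url}' "$LINE")
response=$'<html><head><title>Example Domain</title></head>\n<body>Hello</body></html>\n200 339 '

meta=${response##*$'\n'}   # everything after the last newline: the write-out values
body=${response%$'\n'*}    # everything before the last newline: the page source

read -r http_code size_header redirect_url <<<"$meta"

printf 'Status: %s\n' "$http_code"
printf 'Size: %s\n' "$size_header"
printf 'Redirect-url: %s\n' "${redirect_url:-(none)}"
# ${body:0:20} takes the first 20 characters of the source only
printf 'Body: %s\n' "${body:0:20}"
```

Because the body is held in its own variable, the title extraction can also run on "$body" instead of the whole response.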

Curl echo Server response Bash

I'm trying to create a bash script that checks the status code of each URL in a list and echoes the server name from the response headers. I'm new to this.
#!/bin/bash
while read LINE; do
curl -o /dev/null --silent --head --write-out '%{http_code}' "$LINE"
echo " $LINE" &
curl -I /dev/null --silent --head | grep -Fi Server "$SERVER"
echo " $SERVER"
done < dominios-https
I get the following output
301 http://example.com
grep: : No such file or directory
1) while read LINE skips the last line if the text file does not end with a newline.
2) You don't set "$SERVER" anywhere, so grep receives an empty file name and complains about it.
3) Not all servers return "Server:" in their headers.
Try this:
scriptDir=$( dirname -- "$0" )
for siteUrl in $( < "$scriptDir/myUrl.txt" )
do
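if [[ -z "$siteUrl" ]]; then break; fi # stop if the entry is empty
# --head sends a HEAD request; only the status code is printed
httpCode=$( curl -o /dev/null --silent --head --write-out '%{http_code}' "$siteUrl" )
echo "HTTP_CODE = $httpCode"
# match the header name case-insensitively and strip the trailing carriage return
headServer=$( curl --silent --head "$siteUrl" | grep -i '^Server:' | tr -d '\r' | awk '{print $2}' )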
echo "Server header = $headServer"
done
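If you prefer to keep the while read loop, points 1 and 3 can be handled roughly as follows. This is a sketch, not a drop-in replacement: read_urls and server_header are hypothetical helper names, and the input is canned so that no request is made.

```shell
#!/bin/bash
# read_urls (hypothetical helper): echo each non-empty line from stdin.
# The "|| [[ -n $url ]]" part keeps a final line that has no trailing newline.
read_urls() {
  local url
  while IFS= read -r url || [[ -n "$url" ]]; do
    [[ -z "$url" ]] && continue   # skip blank lines
    printf '%s\n' "$url"
  done
}

# server_header (hypothetical helper): print the Server header value from a
# header block on stdin; prints nothing when the header is absent.
server_header() {
  tr -d '\r' | grep -i -m 1 '^server:' | cut -d: -f2- | sed 's/^[[:space:]]*//'
}

# The second URL is deliberately not followed by a newline.
printf 'http://example.com\nhttp://example.org' | read_urls

# Fall back to a placeholder when the server sends no Server header.
server=$(printf 'HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n' | server_header)
echo "Server header = ${server:-unknown}"
```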

BASH output column formatting

First time posting. HELLO WORLD. I'm working on my first script, which simply checks whether a list of my websites is online and then writes the HTTP code and the time each request took to another file on my desktop.
-- THIS SCRIPT WILL BE RUNNING ON MAC OSX --
I would like to amend my script so that it formats its output into 3 neat columns.
currently
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
printf "" > /Users/USER12/Desktop/url-results.txt
while read line
do
printf "$line" >> /Users/USER12/Desktop/url-results.txt
printf "\t\t\t\t" >> /Users/USER12/Desktop/url-results.txt
curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line" >> /Users/USER12/Desktop/url-results.txt
printf "\n" >> /Users/USER12/Desktop/url-results.txt
done <"$file"
which outputs in the following format
google.com 200 0.389
facebook.com 200 0.511
abnormallyLongDomain.com 200 0.786
but I would like to format it into neatly aligned columns for easy reading:
DOMAIN_NAME              HTTP_CODE   RESPONSE_TIME
google.com               200         0.389
facebook.com             200         0.511
abnormallyLongDomain.com 200         0.486
Thanks for the help everyone!!
column is very nice. You are, however, already using printf which gives you fine control over the output format. Using printf's features also allows the code to be somewhat simplified:
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
log="/Users/USER12/Desktop/url-results.txt"
fmt="%-25s%-12s%-12s\n"
printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME > "$log"
while read line
do
read code time < <(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
printf "$fmt" "$line" "$code" "$time" >> "$log"
done <"$file"
With the above defined format, the output looks like:
DOMAIN_NAME              HTTP_CODE   RESPONSE_TIME
google.com               301         0.305
facebook.com             301         0.415
abnormallyLongDomain.com 000         0.000
You can fine-tune the output format, such as spacing or alignment, by changing the fmt variable in the script.
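As an illustration of such tuning, the sketch below keeps the domain left-aligned but right-aligns the two numeric columns, with a space between fields. The widths are arbitrary choices, and the rows are canned sample values rather than live curl results.

```shell
#!/bin/bash
# %-25s pads on the right (left-aligned); %9s and %13s pad on the left
# (right-aligned). The widths here are illustrative, not required values.
fmt='%-25s %9s %13s\n'

printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME
# Canned rows standing in for the values returned by curl.
while read -r domain code time; do
  printf "$fmt" "$domain" "$code" "$time"
done <<'EOF'
google.com 301 0.305
facebook.com 301 0.415
abnormallyLongDomain.com 000 0.000
EOF
```

A field longer than its width is not truncated by default, so pick a first-column width at least as long as the longest domain you expect.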
Further Refinements
The above code opens and closes the log file with each loop. This can be avoided as Charles Duffy suggests, simply by using exec to redirect stdout to the log file before the first printf statement:
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
exec >"/Users/USER12/Desktop/url-results.txt"
fmt="%-25s%-12s%-12s\n"
printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME
while read line
do
read code time < <(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
printf "$fmt" "$line" "$code" "$time"
done <"$file"
Alternatively, as Chepner suggests, the print statements can be grouped:
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
fmt="%-25s%-12s%-12s\n"
{
printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME
while read line
do
read code time < <(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
printf "$fmt" "$line" "$code" "$time"
done <"$file"
} >"/Users/USER12/Desktop/url-results.txt"
An advantage of grouping is that, after the group, stdout is automatically restored to its normal value.
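The difference between the two approaches can be seen in a small experiment. This is a sketch with made-up rows and a temporary file standing in for url-results.txt: after the group, echo writes to the original stdout again, whereas after exec the original stdout has to be saved and restored by hand.

```shell
#!/bin/bash
log=$(mktemp)   # temporary file standing in for url-results.txt

# Variant 1: grouping. Only the commands inside the braces write to the log.
{
  echo "DOMAIN_NAME HTTP_CODE RESPONSE_TIME"
  echo "example.com 200 0.100"
} >"$log"
echo "grouping: this line goes to the original stdout"

# Variant 2: exec. stdout stays redirected until it is restored explicitly.
exec 3>&1        # save the original stdout on descriptor 3
exec >>"$log"    # from here on, stdout appends to the log
echo "example.org 301 0.200"
exec 1>&3 3>&-   # restore stdout and close the spare descriptor
echo "exec: stdout restored by hand"
```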
Shortened a bit
#!/bin/bash
file="./url.txt"
fmt="%s\t%s\t%s\n"
( printf "$fmt" "DOMAIN_NAME" "HTTP_CODE" "RESPONSE_TIME"
while read -r line
do
printf "$fmt" "$line" $(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
done <"$file" ) | column -t > ./out.txt
You don't need to redirect every printf; instead, you can enclose that part of your script in (...), run it in a subshell, and redirect its output. Print the fields separated by a single tab and use the column command to format them nicely.
In any case, it is usually better not to put file names (or headers) into the script, and to reduce it to
#!/bin/bash
while read -r line
do
printf "%s\t%s\t%s\n" "$line" $(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
done | column -t
and use it like:
myscript.sh < url-list.txt >result.txt
This also lets you use your script in pipelines, like:
something_produces_urls | myscript.sh | grep 200 > somewhere.txt
