BASH output column formatting - bash

First time posting. HELLO WORLD. I'm working on my first script, which checks whether each website in a list is online and then writes the HTTP code and the response time to another file on my desktop.
-- THIS SCRIPT WILL BE RUNNING ON MAC OSX --
I would like to amend my script so that it formats its output into 3 neat columns.
Currently it looks like this:
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
printf "" > /Users/USER12/Desktop/url-results.txt
while read line
do
printf "$line" >> /Users/USER12/Desktop/url-results.txt
printf "\t\t\t\t" >> /Users/USER12/Desktop/url-results.txt
curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line" >> /Users/USER12/Desktop/url-results.txt
printf "\n" >> /Users/USER12/Desktop/url-results.txt
done <"$file"
which outputs in the following format
google.com 200 0.389
facebook.com 200 0.511
abnormallyLongDomain.com 200 0.786
but I would like to format it into neatly aligned columns for easy reading:
DOMAIN_NAME              HTTP_CODE   RESPONSE_TIME
google.com               200         0.389
facebook.com             200         0.511
abnormallyLongDomain.com 200         0.486
Thanks for the help everyone!!

column is very nice. You are, however, already using printf which gives you fine control over the output format. Using printf's features also allows the code to be somewhat simplified:
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
log="/Users/USER12/Desktop/url-results.txt"
fmt="%-25s%-12s%-12s\n"
printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME > "$log"
while read line
do
read code time < <(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
printf "$fmt" "$line" "$code" "$time" >> "$log"
done <"$file"
With the above defined format, the output looks like:
DOMAIN_NAME              HTTP_CODE   RESPONSE_TIME
google.com               301         0.305
facebook.com             301         0.415
abnormallyLongDomain.com 000         0.000
You can fine-tune the output format, such as spacing or alignment, by changing the fmt variable in the script.
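As an illustration of that kind of tuning, here is a minimal sketch that sizes the first column to the longest domain and right-aligns the response time. The sample domains and the print_row helper are made up for the example; the `*` in `%-*s` tells printf to take the column width from the argument list.

```shell
#!/bin/bash
# Hypothetical sample data standing in for the lines of url-list.txt.
domains="google.com facebook.com abnormallyLongDomain.com"

# Start from the header's length and widen to the longest domain.
width=11                      # length of "DOMAIN_NAME"
for d in $domains; do
    if [ "${#d}" -gt "$width" ]; then width=${#d}; fi
done

# print_row DOMAIN CODE TIME (hypothetical helper, not part of the original script).
# %-*s left-aligns in a field of $width characters; %13s right-aligns the time.
print_row() {
    printf '%-*s  %-9s  %13s\n' "$width" "$1" "$2" "$3"
}

print_row DOMAIN_NAME HTTP_CODE RESPONSE_TIME
print_row google.com 301 0.305
print_row abnormallyLongDomain.com 000 0.000
```

The trade-off is that the list has to be scanned once before anything is printed, which is fine for a short URL file.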
Further Refinements
The above code opens and closes the log file with each loop. This can be avoided as Charles Duffy suggests, simply by using exec to redirect stdout to the log file before the first printf statement:
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
exec >"/Users/USER12/Desktop/url-results.txt"
fmt="%-25s%-12s%-12s\n"
printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME
while read line
do
read code time < <(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
printf "$fmt" "$line" "$code" "$time"
done <"$file"
Alternatively, as Chepner suggests, the printf statements can be grouped and redirected once:
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
fmt="%-25s%-12s%-12s\n"
{
printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME
while read line
do
read code time < <(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
printf "$fmt" "$line" "$code" "$time"
done <"$file"
} >"/Users/USER12/Desktop/url-results.txt"
An advantage of grouping is that, after the group, stdout is automatically restored to its normal value.
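For comparison, if you prefer the exec approach but still want stdout back afterwards, you can save it on a spare file descriptor first. This is only a sketch: the temporary log file and the sample rows are placeholders, and fd 3 is an arbitrary choice.

```shell
#!/bin/bash
log=$(mktemp)            # hypothetical temp file standing in for url-results.txt

exec 3>&1 >"$log"        # save the original stdout as fd 3, then point stdout at the log
printf '%s\n' "DOMAIN_NAME HTTP_CODE RESPONSE_TIME"
printf '%s\n' "google.com 301 0.305"
exec 1>&3 3>&-           # put stdout back and close the spare descriptor

# This line goes to the terminal (or pipe) again, not to the log.
echo "done, $(wc -l <"$log" | tr -d ' ') lines logged"
```

With the { ...; } >file form none of this bookkeeping is needed, which is the advantage mentioned above.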

Shortened a bit
#!/bin/bash
file="./url.txt"
fmt="%s\t%s\t%s\n"
( printf "$fmt" "DOMAIN_NAME" "HTTP_CODE" "RESPONSE_TIME"
while read -r line
do
printf "$fmt" "$line" $(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
done <"$file" ) | column -t > ./out.txt
There is no need to redirect every printf: you can enclose that part of the script in (...), run it in a subshell, and redirect its output once. Print the fields separated by a single tab and use the column command to format them nicely.
In any case, it is usually better not to put filenames (or headers) into the script and to reduce it to
#!/bin/bash
while read -r line
do
printf "%s\t%s\t%s\n" "$line" $(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
done | column -t
and use it like:
myscript.sh < url-list.txt >result.txt
This allows you to use your script in pipes, like:
something_produces_urls | myscript.sh | grep 200 > somewhere.txt

Related

Get curl response and parameter in the same line

I have a file containing values:
file.txt
value1
value2
value3
And I have this script:
#!/bin/bash
input="test.txt"
while IFS= read -r line ; do
curl -o /dev/null -s -w "%{http_code}\n" -u username:token "https://website.com/api/user"$line
done < "$input" > result.txt
At the moment this is my output:
result.txt
400
200
200
What I really want to achieve is to get this kind of output:
output.txt
value1,400
value2,200
value3,200
What is my code missing to achieve this?
Run and capture the output of the curl command using the command substitution syntax $(...).
Then, use echo or printf to show the captured output, along with the original input, in the form you want.
input="test.txt"
while IFS= read -r line ; do
res=$(curl -o /dev/null -s -w "%{http_code}\n" -u username:token "https://website.com/api/user$line")
printf "%s\n" "$line,$res"
done < "$input" > result.txt
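One detail that makes this work cleanly: command substitution strips trailing newlines, so the \n in the -w format does not end up in the middle of the output line. A small offline sketch, where fake_status is a made-up stand-in for the curl call:

```shell
#!/bin/bash
# fake_status is a hypothetical stand-in for the curl call, so the sketch runs offline.
fake_status() { printf '200\n'; }

line="value1"
res=$(fake_status)                      # $(...) drops the trailing newline
out=$(printf '%s,%s' "$line" "$res")    # join input value and status with a comma
printf '%s\n' "$out"
```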

How to retrieve the real redirect location header with Curl? without using {redirect_url}

I realized that curl's %{redirect_url} does not always show the redirect URL exactly as the server sent it. For example, if the response header is Location: https:/\example.com, this will redirect to https:/\example.com, but curl's %{redirect_url} shows redirect_url: https://host-domain.com/https:/\example.com and does not display the real Location header from the response. (I would like to see the real Location: value.)
This is the Bash script I'm working with:
#!/bin/bash
# Usage: urls-checker.sh domains.txt
FILE="$1"
while read -r LINE; do
# read the response to a variable
response=$(curl -H 'Cache-Control: no-cache' -s -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$LINE")
# get the title
title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'<<<"$response")
# read the write-out from the last line
read -r http_code size_header redirect_url < <(tail -n 1 <<<"$response")
printf "***Url: %s\n\n" "$LINE"
printf "Status: %s\n\n" "$http_code"
printf "Size: %s\n\n" "$size_header"
printf "Redirect-url: %s\n\n" "$redirect_url"
printf "Title: %s\n\n" "$title"
# -c 100 only shows the first 100 chars from response
printf "Body: %s\n\n" "$(head -c 100 <<<"$response")"
done < "${FILE}"
How can I printf the original Location: header as the "Redirect-url:" value without having to use redirect_url?
To read the exact Location header field value, as returned by the server, you can use the -i/--include option, in combination with grep.
For example:
$ curl 'http://httpbin.org/redirect-to?url=http:/\example.com' -si | grep -oP 'Location: \K.*'
http:/\example.com
Or, if you want to read all headers, content and the --write-out variables line (according to your script):
response=$(curl -H 'Cache-Control: no-cache' -s -i -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$url")
# break the response in parts
headers=$(sed -n '1,/^\r$/p' <<<"$response")
content=$(sed -e '1,/^\r$/d' -e '$d' <<<"$response")
read -r http_code size_header redirect_url < <(tail -n1 <<<"$response")
# get the real Location
location=$(grep -oP 'Location: \K.*' <<<"$headers")
Fully integrated in your script, this looks like:
#!/bin/bash
# Usage: urls-checker.sh domains.txt
file="$1"
while read -r url; do
# read the response to a variable
response=$(curl -H 'Cache-Control: no-cache' -s -i -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$url")
# break the response in parts
headers=$(sed -n '1,/^\r$/p' <<<"$response")
content=$(sed -e '1,/^\r$/d' -e '$d' <<<"$response")
read -r http_code size_header redirect_url < <(tail -n1 <<<"$response")
# get the real Location
location=$(grep -oP 'Location: \K.*' <<<"$headers")
# get the title
title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'<<<"$content")
printf "***Url: %s\n\n" "$url"
printf "Status: %s\n\n" "$http_code"
printf "Size: %s\n\n" "$size_header"
printf "Redirect-url: %s\n\n" "$location"
printf "Title: %s\n\n" "$title"
printf "Body: %s\n\n" "$(head -c 100 <<<"$content")"
done < "$file"
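A possible variation on the Location extraction, in case it helps: as far as I know the grep that ships with macOS has no -P option, header names are case-insensitive (some servers send a lowercase location:), and each header line ends in a carriage return that grep -o would keep. The sketch below uses awk instead; extract_location is a hypothetical helper name and the sample response is canned.

```shell
#!/bin/bash
# extract_location: print the value of the first Location header read from stdin.
# (Hypothetical helper; it stops at the blank line that ends the header block.)
extract_location() {
    awk '
        { sub(/\r$/, "") }                 # strip the CR that ends each header line
        /^$/ { exit }                      # blank line: headers are over
        tolower($0) ~ /^location:/ {
            sub(/^[^:]*:[ \t]*/, "")       # drop the header name and leading blanks
            print
            exit
        }'
}

# Canned response standing in for the output of: curl -s -i "$url"
printf 'HTTP/1.1 302 Found\r\nlocation: http:/\\example.com\r\n\r\nbody' | extract_location
```

In the script above this would be used as location=$(extract_location <<<"$response").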
Following @randomir's answer, and since I only needed the raw redirect URL, I use this command in my batch script:
curl -w "%{redirect_url}" -o /dev/null -s "https://stackoverflow.com/q/46507336/3019002"
https:/\example.com is not a legal URL(*). The fact that this works in browsers is an abomination (that I've fought against), and it doesn't work in curl. %{redirect_url} shows exactly the URL curl would redirect to...
A URL should use two forward slashes, so the above should look like http://example.com.
(*) = I refuse to accept the WHATWG "definition".

How to use Curl to also output first 20 characters from the response?

I'm using curl to retrieve the http_code size_header redirect_url and Website Title with:
#!/bin/bash
FILE="$1"
while read LINE; do
curl -H 'Cache-Control: no-cache' -i -s -k -o >(perl -l -0777 -ne 'print $1 if /<title.*?>\s*(.*?)\s*<\/title/si') --silent --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$LINE"
echo " $LINE"
done < ${FILE}
but I would also like to retrieve the first 20 characters of the response to have more information.
The idea is to get this output
%{http_code} %{size_header} %{redirect_url} $website_title $website_first_20_bytes
I only need to add the $website_first_20_bytes to the output. How can I achieve this?
PS: Not the first 20 characters of the response headers, only of the page source.
So you probably mean something like this then (I added a bunch of newlines etc. to the output which you can trim as you please):
#!/bin/bash
FILE="$1"
while read -r LINE; do
# read the response to a variable
response=$(curl -H 'Cache-Control: no-cache' -s -k --max-time 2 --write-out '%{http_code} %{size_header} %{redirect_url} ' "$LINE")
# get the title
title=$(sed -n 's/.*<title>\(.*\)<\/title>.*/\1/ip;T;q'<<<"$response")
# read the write-out from the last line
read -r http_code size_header redirect_url < <(tail -n 1 <<<"$response")
printf "Status: %s\n" "$http_code"
printf "Size: %s\n" "$size_header"
printf "Redirect-url: %s\n" "$redirect_url"
printf "Url: %s\n" "$LINE"
printf "Title: %s\n" "$title"
# -c 20 only shows the 20 first chars from response
printf "Body: %s\n" "$(head -c 20 <<<"$response")"
done < "${FILE}"
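One caveat with reading the write-out from the last line: if the body does not end in a newline, the status fields share a line with the tail of the HTML. A sketch of one way around that, assuming you put a \n at the start of the --write-out format so the fields always get a line of their own. The response below is canned so the example runs offline.

```shell
#!/bin/bash
# Canned response standing in for:
#   curl -s -k --max-time 2 --write-out '\n%{http_code} %{size_header} %{redirect_url} ' "$LINE"
response=$(printf '<html><title>Example</title></html>\n200 345 ')

meta=${response##*$'\n'}      # everything after the last newline: the write-out fields
body=${response%$'\n'*}       # everything before it: the page source
read -r http_code size_header redirect_url <<<"$meta"

printf 'Status: %s\n' "$http_code"
printf 'Size: %s\n' "$size_header"
printf 'Body: %s\n' "${body:0:20}"    # first 20 characters of the source
```

Note that ${body:0:20} counts characters rather than bytes, which only matters for multi-byte encodings.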

Curl echo Server response Bash

I'm trying to create a bash script that checks the status code of each URL in a list and echoes the server name from the response headers. I'm new to this.
#!/bin/bash
while read LINE; do
curl -o /dev/null --silent --head --write-out '%{http_code}' "$LINE"
echo " $LINE" &
curl -I /dev/null --silent --head | grep -Fi Server "$SERVER"
echo " $SERVER"
done < dominios-https
I get the following output
301 http://example.com
grep: : No such file or directory
1) while read LINE skips the last line if the text file does not end with a newline.
2) You don't set "$SERVER" anywhere, so grep is given an empty file name and complains about it.
3) Not all servers return a "Server:" header.
Try this:
scriptDir=$( dirname -- "$0" )
for siteUrl in $( < "$scriptDir/myUrl.txt" )
do
if [[ -z "$siteUrl" ]]; then break; fi # stop if the entry is empty
httpCode=$( curl -I -o /dev/null --silent --head --write-out '%{http_code}' "$siteUrl" )
echo "HTTP_CODE = $httpCode"
headServer=$( curl -I --silent --head "$siteUrl" | grep "Server" | awk '{print $2}' )
echo "Server header = $headServer"
done
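If you would rather keep the while read loop, point 1 can also be handled with a common idiom: read returns a non-zero status on an unterminated last line but still fills the variable, so an extra test on the variable lets the loop body run one more time. A minimal sketch with made-up URLs:

```shell
#!/bin/bash
count=0
# The sample input deliberately has no newline after the last entry.
while IFS= read -r url || [[ -n $url ]]; do
    count=$((count + 1))
    printf 'checking %s\n' "$url"
done < <(printf 'http://example.com\nhttp://example.org')

printf 'processed %s entries\n' "$count"
```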

Bash Script - Not returning column/variable

I am having an issue that I can't seem to figure out after a couple of hours of tinkering. I can't get this script to return anything in the final column.
#!/bin/bash
file="/Users/USER12/Desktop/url-list.txt"
log="/Users/USER12/Desktop/url-results.txt"
fmt="%-25s%-12s%-16s%-20s\n"
printf "$fmt" DOMAIN_NAME HTTP_CODE RESPONSE_TIME CONTENT_CHECK > "$log"
while read line
do
read code time < <(curl -o /dev/null --silent --head --write-out '%{http_code} %{time_total}' "$line")
curl "$line" 2>/dev/null > /Users/USER12/Desktop/domainQueryString_output.txt
ifStatementConditional=`grep "THE CONTENT I'M LOOKING TO VERIFY" /Users/USER12/Desktop/domainQueryString_output.txt | wc -l`
if [ $ifStatementConditional -eq 1 ] ; then
second_check="online"
else
second_check="DOMAIN IS OFFLINE"
fi
printf "$fmt" "$line" "$code" "$time" "$second_chance" >> "$log"
done <"$file"
It returns the following, but nothing in the final column:
DOMAIN_NAME              HTTP_CODE   RESPONSE_TIME   CONTENT_CHECK
google.com               301         1.177
Thanks for the help guys. The Help is Much Appreciated.
You have "$second_chance" where you should have "$second_check".
Other than that, the following is a better way to do your if check:
if grep -q "THE CONTENT I'M LOOKING TO VERIFY" "$yourfile"
then
...
else
...
fi
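Filled in, that check might look like the sketch below. The temporary file stands in for domainQueryString_output.txt, and -F is my addition so the search text is treated as a fixed string rather than a regular expression. Unlike the wc -l comparison with 1, this also reports "online" when the text appears on more than one line.

```shell
#!/bin/bash
page=$(mktemp)   # hypothetical stand-in for domainQueryString_output.txt
printf '<p>THE CONTENT I'\''M LOOKING TO VERIFY</p>\n' >"$page"

# grep -q prints nothing and only sets the exit status, which if tests directly.
if grep -qF "THE CONTENT I'M LOOKING TO VERIFY" "$page"; then
    second_check="online"
else
    second_check="DOMAIN IS OFFLINE"
fi

printf '%s\n' "$second_check"
rm -f "$page"
```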

Resources