Using both stdout and stderr of curl in a subsequent jq command - shell

I want to do the following,
           command2 (stdout)
          /                 \
command1                     command4
          \                 /
           command3 (stderr)
As it is covered in How can I split and re-join STDOUT from multiple processes?
Except that command1 writes different text to stdout and to stderr, so it is a combination of the above question and Pipe stderr to another command.
To give some context, here is what I am trying to achieve:
1. Execute curl.
2. Capture the raw output (stdout), base64-encode it, and embed it into JSON: curl https://someaddress.tld | base64 | jq --raw-input '{"curl_ret" : .}'
3. Output curl's own JSON (return code etc.) and send it to stderr: curl --write-out '%{stderr}%{json}' https://someaddress.tld
4. Given that #2 and #3 are the same curl call, merge the output of #2 and #3 and pass the merged result to jq: jq --slurp ...
All of this in one piped command.
The stdout/stderr separation is done to avoid the pitfalls of parsing merged text, given that the curl output can be anything. Curl has a --silent switch, so there is no unexpected text in either output stream.

In practice, for the use case at hand
You don't need to do this at all. --write-out '%{json}' will always be written after the body content, so it's always the last line of stdout. It's safe to have it in the same stream.
getExampleAndMetadata() {
  curl --write-out '%{json}' https://example.com |
    jq -RSs '
      split("\n")
      | {"body": .[:-1] | join("\n"),
         "logs": .[-1] | fromjson}'
}
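A quick way to sanity-check the result (the exact keys under .logs depend on which fields your curl version emits for %{json}, so treat the field list as version-specific):
getExampleAndMetadata | jq '.logs | keys'      # see what metadata %{json} produced
getExampleAndMetadata | jq -r '.body' | head   # the raw response body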
As an exercise
It's ugly, and I don't recommend it -- it'd be better to just parse the two sets of data out of stdout and leave stderr free so you have a way to log actual errors -- but, doing what you asked for:
getExampleAndMetadata() {
  local tmpdir curl_pid retval=0
  # mkfifo is different between operating systems, adjust to fit
  tmpdir=$(mktemp -d sa_query.XXXXXX) || return
  mkfifo "$tmpdir/stdout" "$tmpdir/stderr" || { retval=$?; rm -rf "$tmpdir"; return "$retval"; }
  curl --silent --write-out '%{stderr}%{json}' https://example.com \
    >"$tmpdir/stdout" 2>"$tmpdir/stderr" & curl_pid=$!
  # stdout must be read _before_ stderr to avoid a deadlock here;
  # to stop caring what order of operations jq uses, process substitutions
  # buffer both stdout and stderr in memory
  jq -Rn \
    --rawfile content <(out=$(cat "$tmpdir/stdout"); printf '%s\n' "$out") \
    --slurpfile logs <(err=$(cat "$tmpdir/stderr"); printf '%s\n' "$err") \
    '{"body": $content, "logs": $logs}'; (( retval |= $? ))
  rm -rf -- "$tmpdir"; (( retval |= $? ))
  wait "$curl_pid"; (( retval |= $? ))
  return "$retval"
}
...gives you a simple command, getExampleAndMetadata. And of course, if you eliminate the comments and line continuations you can collapse the whole thing to one line by adding ;s appropriately.
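Usage is the same as the earlier version, except that --slurpfile wraps the metadata in an array, so it sits at .logs[0] (again, the available %{json} fields vary with your curl version):
getExampleAndMetadata | jq '.logs[0] | keys'   # metadata captured from stderr
getExampleAndMetadata | jq -r '.body'          # body captured from stdout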

Related

Calling bash script from bash script

I have made two programs and I'm trying to call one from the other, but this is appearing on my screen:
cp: cannot stat ‘PerShip/.csv’: No such file or directory
cp: target ‘tmpship.csv’ is not a directory
I don't know what to do. Here are the programs. Could somebody help me please?
#!/bin/bash
shipname=$1
imo=$(grep "$shipname" shipsNAME-IMO.txt | cut -d "," -f 2)
cp PerShip/$imo'.csv' tmpship.csv
dist=$(octave -q ShipDistance.m 2>/dev/null)
grep "$shipname" shipsNAME-IMO.txt | cut -d "," -f 2 > IMO.txt
idnumber=$(cut -b 4-10 IMO.txt)
echo $idnumber,$dist
#!/bin/bash
rm -f shipsdist.csv
for ship in $(cat shipsNAME-IMO.txt | cut -d "," -f 1)
do
./FindShipDistance "$ship" >> shipsdist.csv
done
cat shipsdist.csv | sort | head -n 1
The code and error messages presented suggest that the second script is calling the first with an empty command-line argument. That would certainly happen if input file shipsNAME-IMO.txt contained any empty lines or otherwise any lines with an empty first field. An empty line at the beginning or end would do it.
I suggest
using the read command to read the data, and manipulating IFS to parse out comma-delimited fields
validating your inputs and other data early and often
making your scripts behave more pleasantly in the event of predictable failures
and, more generally, using internal Bash features instead of external programs where the former are reasonably natural.
For example:
#!/bin/bash
# Validate one command-line argument
[[ -n "$1" ]] || { echo empty ship name 1>&2; exit 1; }
# Read and validate an IMO corresponding to the argument
IFS=, read -r dummy imo tail < <(grep -F -- "$1" shipsNAME-IMO.txt)
[[ -f PerShip/"${imo}.csv" ]] || { echo no data for "'$imo'" 1>&2; exit 1; }
# Perform the distance calculation and output the result
cp PerShip/"${imo}.csv" tmpship.csv
dist=$(octave -q ShipDistance.m 2>/dev/null) ||
  { echo "failed to compute ship distance for '${imo}'" 1>&2; exit 1; }
echo "${imo:3:7},${dist}"
and
#!/bin/bash
# Note: the original shipsdist.csv will be clobbered
while IFS=, read -r ship tail; do
  # Ignore any empty ship name, however it might arise
  [[ -n "$ship" ]] && ./FindShipDistance "$ship"
done < shipsNAME-IMO.txt |
  tee shipsdist.csv |
  sort |
  head -n 1
Note that making the while loop in the second script part of a pipeline will cause it to run in a subshell. That is sometimes a gotcha, but it won't cause any problem in this case.
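To illustrate the gotcha being referred to (a generic example, unrelated to the ship scripts): variables assigned inside a piped while loop vanish when the loop ends, because the loop body runs in a subshell:
count=0
printf '%s\n' a b c | while read -r line; do (( count++ )); done
echo "$count"   # prints 0 in bash: the loop's count lived in a subshell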

How do I check the HTTP status code and also parse the payload

Imagine I have the following code in a bash script:
curl -s https://cat-fact.herokuapp.com/facts/random?animal=cat | jq .
Notice that I wish to display the payload of the response by passing it to jq.
Now suppose those curls sometimes return a 404; in such cases my script currently still succeeds, so what I need to do is check the status code and exit 1 as appropriate (e.g. for a 404 or 503). I've googled around and found https://superuser.com/a/442395/722402, which suggests --write-out "%{http_code}" might be useful; however, that simply prints the http_code after printing the payload:
$ curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | jq .
{
"_id": "591f98783b90f7150a19c1ab",
"__v": 0,
"text": "Cats and kittens should be acquired in pairs whenever possible as cat families interact best in pairs.",
"updatedAt": "2018-12-05T05:56:30.384Z",
"createdAt": "2018-01-04T01:10:54.673Z",
"deleted": false,
"type": "cat",
"source": "api",
"used": false
}
200
What I actually want is to still output the payload, but also be able to check the HTTP status code and fail accordingly. I'm a bash noob so I'm having trouble figuring this out. Help please?
I'm using a Mac by the way, not sure if that matters or not (I'm vaguely aware that some commands work differently on Mac)
Update: I've pieced this together, which sorta works, I think. It's not very elegant though; I'm looking for something better.
func() {
  echo "${@:1:$#-1}";
}
response=$(curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | jq .)
http_code=$(echo $response | awk '{print $NF}')
func $response | jq .
if [ $http_code == "503" ]; then
  echo "Exiting with error due to 503"
  exit 1
elif [ $http_code == "404" ]; then
  echo "Exiting with error due to 404"
  exit 1
fi
What about this? It uses a temporary file. It seems a bit complicated to me, but it separates your content.
# copy/paste doesn't work with the following
curl -s --write-out \
"%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | \
tee test.txt | \ # split output to file and stdout
sed -e 's-.*\}--' | \ # remove everything before last '}'
grep 200 && \ # try to find string 200, only in success next step is done
echo && \ # a new-line to juice-up the output
cat test.txt | \ #
sed 's-}.*$-}-' | \ # removes the last line with status
jq # format json
Here a copy/paste version
curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat | tee test.txt | sed -e 's-.*\}--' | grep 200 && echo && cat test.txt | sed 's-}.*$-}-' | jq
This is my attempt. Hope it works for you too.
#!/bin/bash
result=$( curl -i -s 'https://cat-fact.herokuapp.com/facts/random?animal=cat' )
status=$( echo "$result" | grep -E '^HTTPS?/[1-9][.][1-9] [1-9][0-9][0-9]' | grep -o ' [1-9][0-9][0-9] ')
payload=$( echo "$result" | sed -n '/^\s*$/,//{/^\s*$/ !p}' )
echo "STATUS : $status"
echo "PAYLOAD : $payload"
Output
STATUS : 200
PAYLOAD : {"_id":"591f98803b90f7150a19c23f","__v":0,"text":"Cats can't taste sweets.","updatedAt":"2018-12-05T05:56:30.384Z","createdAt":"2018-01-04T01:10:54.673Z","deleted":false,"type":"cat","source":"api","used":false}
AWK version
payload=$( echo "$result" | awk '{ if( $0 ~ /^\s*$/ ){ c_p = 1 ; next; } if (c_p) { print $0} }' )
Regards!
EDIT : I have simplified this even more by using the -i flag
EDIT II : Removed empty line from payload
EDIT III : Included an awk method to extract the payload in case sed is problematic
Borrowing from here you can do:
#!/bin/bash
result=$(curl -s --write-out "%{http_code}" https://cat-fact.herokuapp.com/facts/random?animal=cat)
http_code="${result: -3}"
response="${result:0:${#result}-3}"
echo "Response code: " $http_code
echo "Response: "
echo $response | jq
Where
${result: -3} expands to the last 3 characters of the string (the status code); note the space before -3. ${result: -3:3} (offset -3, length 3) would also work.
${#result} gives the length of the string.
${result:0:${#result}-3} takes everything from the start of result except the final 3 characters, i.e. the payload without the status code.
(A short standalone demonstration of these expansions follows the note below.)
The site cat-fact.herokuapp.com isn't working now so I had to test it with another site
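A minimal demonstration of those expansions using a made-up string instead of a real curl response:
result='{"text":"Cats cannot taste sweets."}200'   # payload with the 3-digit status glued on
http_code="${result: -3}"                          # last 3 characters -> 200
response="${result:0:${#result}-3}"                # everything except the last 3 characters
echo "$http_code"    # 200
echo "$response"     # {"text":"Cats cannot taste sweets."}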

Print a word after a piped command output to the same line

I have a command pipeline using pv (pipe viewer). I would like to print 'done' after the pipeline output, on the same line. pv has a parameter, -c/--cursor, which perhaps could provide this, but I cannot figure out how to use it. The documentation is very concise and I can't find examples; -c/--cursor does not accept a parameter.
Example
for a in {1,2,3,4}; do echo $a; sleep 2; done | tee -a log.txt | pv -name "Test" -w 80 -cls 4 > /dev/null; echo done
Whether I use -c or not, 'done' is printed on the next row; the only difference is that while the pipeline runs with -c, the cursor sits at the end of the progress bar.

How to loop through large file and output lines to curl through stdin?

Say somefile contains the content
a
b
c
I want each line to be turned into an HTTP POST curl command, so these 3 lines become 3 curl commands; line 3, for example, would POST "c" to some URL.
I can loop through the file with bash and dump to curl like this
cat somefile | while read line; \
do curl -XPOST 'www.example.com' -d "$line"; \
done
However, each line is a giant JSON document, and sometimes passing it through the command line does weird things. I'd rather have something like this:
cat somefile | parallel curl -XPOST example.com -d @-
where '@-' means each line of the file is passed to curl through stdin. GNU parallel can accept {} as an argument, which is similar to "$line" above, but I'd like something that turns a file into a stream of lines before passing it to the next command.
cat somefile | parallel --pipe -N1 curl -XPOST example.com -d @-
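To see what --pipe -N1 actually delivers to each invocation, you can swap curl for a harmless placeholder command (a quick check of my own; -k just keeps the output in input order):
printf '%s\n' a b c | parallel -k --pipe -N1 'printf "record: "; cat'
# record: a
# record: b
# record: c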
ShellCheck says:
Line 1:
cat somefile | while read line; \
^-- SC2162: read without -r will mangle backslashes.
This would explain it doing weird things to JSON, which frequently uses backslashes:
$ echo '{ "key": "some value with \"nested quotes\" here" }' | \
while read line; do echo "$line"; done
{ "key": "some value with "nested quotes" here" }
Adding the -r will instead leave them alone:
$ echo '{ "key": "some value with \"nested quotes\" here" }' | \
while read -r line; do echo "$line"; done
{ "key": "some value with \"nested quotes\" here" }
To be entirely correct, it should be while IFS= read -r line to also preserve leading spaces.
Relevant POSIX docs for read:
By default, unless the -r option is specified, <backslash> shall act as an escape character. An unescaped <backslash> shall preserve the literal value of the following character, with the exception of a <newline>. If a <newline> follows the <backslash>, the read utility shall interpret this as line continuation. The <backslash> and <newline> shall be removed before splitting the input into fields. All other unescaped <backslash> characters shall be removed after splitting the input into fields.
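And a quick illustration of the leading-space point (my addition, not part of the quoted docs):
$ printf '   indented value\n' | while read -r line; do echo "[$line]"; done
[indented value]
$ printf '   indented value\n' | while IFS= read -r line; do echo "[$line]"; done
[   indented value]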
cat somefile | parallel 'echo {} | curl -XPOST example.com -d @-'

Can you set multiple cURL --write-out variables to bash variables in a single call

I need to set or access multiple cURL variables so I can access them later in a script. For example:
curl -s --write-out "%{http_code} | %{local_ip} | %{time_total}" "http://endpoint.com/payload"
Now how can I access http_code or local_ip to do things like add them to a bash array, etc.? Is the only option to grep them out of the response?
You can pipe your curl command to a read command:
curl -s --write-out "write-out: %{http_code} | %{local_ip} | %{time_total}\n" "http://yahoo.com" | \
sed -n '/^write-out:/ s///p' | \
while IFS='|' read http_code local_ip time_total;
do
printf "http_code: %s\nlocal_ip: %s\ntotal_time: %s\n" $http_code $local_ip $time_total;
# or in an array
curlvars=($http_code $local_ip $time_total)
for data in "${curlvars[#]}"
do
printf "%s | " $data
done
done
I added a \n to the write-out string so that it can be processed as a line.
The sed command extracts the write-out line from the curl output.
In the read command you can define a separator and assign each parsed field to its own variable.
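One caveat worth adding (my note, not part of the answer above): the while loop runs in a pipeline, so the variables disappear once it ends. If you need them later in the script, a common pattern is to feed read from a process substitution instead, for example:
# sketch: -o /dev/null discards the body so only the write-out line is read
IFS='|' read -r http_code local_ip time_total < <(
  curl -s -o /dev/null --write-out "%{http_code} | %{local_ip} | %{time_total}\n" "http://yahoo.com"
)
echo "http_code=${http_code// /}"   # fields keep the spaces around '|'; strip them if needed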
