get file as text from github v3 api using bash

I'm trying to get a config file from our GitHub using the get contents API. This returns a JSON response containing the file content encoded as a base64 string. I'd like to get it as text.
Steps I've taken
get initial api response:
curl -H 'Authorization: token MY_TOKEN' \
https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE
this returns a JSON response with a field "content": "encoded content ..."
get the encoded string:
<prev command> | grep -F "content\":"
this gets the content, but there's still the "content": string, the " chars and a comma at the end
cut the extras:
<prev command> | cut -d ":" -f 2 | cut -d "\"" -f 2
decode:
<prev command> | base64 --decode
final command:
curl -H 'Authorization: token MY_TOKEN' \
https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE | \
grep -F "content\":" | cut -d ":" -f 2 | cut -d "\"" -f 2 | base64 --decode
Issues:
the resulting string (before the base64 --decode) decodes in an online decoder (though not well; see the next item), but fails to decode in bash, the error being
"Invalid character in input stream."
When decoding the string in an online decoder, some (not all) of the file comes out as gibberish rather than the original text. I've tried all the available charsets.
Notes:
I've tried removing the last 2 (newline) chars with sed 's/..$//', but this has no effect.
If I select the output with the mouse and copy-paste it into an echo MY_ENCODED_STRING_PASTED_HERE | base64 --decode command, it has the same effect as the online tool, that is, it decodes as gibberish.

Add the header Accept: application/vnd.github.VERSION.raw to the GET.
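With this header the API returns the raw file body directly, so there is nothing to grep, cut, or decode. A minimal sketch using the question's placeholders (the exact media type, e.g. application/vnd.github.v3.raw, depends on your API version):

curl -H 'Authorization: token MY_TOKEN' \
     -H 'Accept: application/vnd.github.VERSION.raw' \
     https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE

As for why the original pipeline fails: the content field is a JSON string, so the newlines GitHub inserts into the base64 text arrive as literal two-character \n escape sequences. grep and cut pass these through, and the backslash is not a valid base64 character, hence "Invalid character in input stream"; a decoder that silently drops the backslash still leaves a stray n behind that shifts the rest of the stream, which would explain why only part of the file decoded cleanly.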

Following tripleee's advice, I've switched the extraction method to jq:
file=randomFileName74894031264.txt
curl -H 'Authorization: token MY_TOKEN' https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE > "$file"
encoded_str=($(jq -r '.content' "$file"))
echo "$encoded_str" | base64 -D
rm -f "$file"
This works when running from the command line, but when running as a script the stdout doesn't flush, and we only get the first few lines of the file.
I will update this answer when I've formalized a generic script.
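In the meantime, the truncation is most likely not a flushing problem but the array assignment: the unquoted $(...) in encoded_str=($(jq -r '.content' "$file")) word-splits the base64 text on its embedded newlines, and in bash echo "$encoded_str" then expands only the first element (interactive shells such as zsh join the elements instead, which would explain why it appeared to work on the command line). A sketch of the fix, with the same placeholders (-D is the macOS spelling of --decode):

file=randomFileName74894031264.txt
curl -H 'Authorization: token MY_TOKEN' https://github.com/api/v3/repos/MY_OWNER/MY_REPO/contents/MY_FILE > "$file"
# plain string assignment; keeps all lines of the base64 text
encoded_str=$(jq -r '.content' "$file")
echo "$encoded_str" | base64 -D
rm -f "$file"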

Related

illegal character found in url

I'm trying to extract data from an API with different ids stored in a text file, but I keep getting the message "curl: (3) illegal characters found in URL".
the text file contains:
362ae-235sa-3h26g-136gr
652ae-290sa-3h26g-132gr
394ae-275sa-k726g-106gr
362ae-257sa-3le0g-136gr
My script:
for j in $(cat ids.json)
do
    curl -u "$workspace_username":"$workspace_password" \
        "https://gateway.watsonplatform.net/assistant/api/v1/workspaces/$j/logs?version=2018-07-10" \
        | jq '.' | jq -r '.logs[]' >> test.json
    sleep 3
done
I'm new to this. Can anyone please help me with the script?
I could reproduce your problem with a CR attached to a line in the file ids.json. I can only assume that this is also your problem; I propose fixing your file.
You can do that automatically by removing all characters which are not part of your ids which are supposed to be in this file:
sed -i 's/[^0-9a-z-]//g' ids.json
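If you would rather leave the file untouched, the CR can also be stripped as each id is read; a sketch along the lines of your loop (credentials and URL as in the question, with the two jq calls collapsed into one, since jq '.' only pretty-prints):

while IFS= read -r j; do
    j=${j%$'\r'}    # drop a trailing carriage return, if any
    curl -u "$workspace_username":"$workspace_password" \
        "https://gateway.watsonplatform.net/assistant/api/v1/workspaces/$j/logs?version=2018-07-10" \
        | jq -r '.logs[]' >> test.json
    sleep 3
done < ids.json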

Parse JQ output through external bash function?

I want to parse data out of a log file which consists of JSON strings, and I wonder if there's a way for me to use a bash function to perform any custom parsing instead of overloading the jq command.
Command:
tail errors.log --follow | jq --raw-output '. | [.server_name, .server_port, .request_file] | @tsv'
Outputs:
8.8.8.8 80 /var/www/domain.com/www/public
I want to parse the 3rd column to cut the string to exclude the /var/www/domain.com part, where /var/www/domain.com is the document root and /var/www/domain.com/subdomain/public is the public html section of the site. Therefore I would like to leave my output as /subdomain/public (or, from the example, /www/public).
I wonder if I can somehow inject a bash function to parse .request_file column? Or how would I do that using jq?
I'm having issues piping out the output of any part of this command that would allow me to do any sort of string manipulation.
Use a BashFAQ #1 while read loop to iterate over the lines, and a BashFAQ #100 parameter expansion to perform the desired modifications:
tail -f -- errors.log \
  | jq --raw-output --unbuffered \
      '[.server_name, .server_port, .request_file] | @tsv' \
  | while IFS=$'\t' read -r server_name server_port request_file; do
      printf '%s\t%s\t%s\n' "$server_name" "$server_port" "/${request_file#/var/www/*/}"
    done
Note the use of --unbuffered, to force jq to flush its output lines immediately rather than buffering them. This has a performance penalty (which is why it's not the default), but it ensures that you get output immediately when reading from a potentially slow input source.
That said, it's also easy to remove a prefix in jq, so there's no particular reason to do the above:
tail -f -- errors.log | jq -r '
  def withoutPrefix: sub("^([/][^/]+){3}"; "");
  [.server_name, .server_port, (.request_file | withoutPrefix)] | @tsv'
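A quick way to sanity-check the filter is to feed it a hand-written log line (the field values here are made up to match the example output):

echo '{"server_name":"8.8.8.8","server_port":80,"request_file":"/var/www/domain.com/www/public"}' \
  | jq -r 'def withoutPrefix: sub("^([/][^/]+){3}"; ""); [.server_name, .server_port, (.request_file | withoutPrefix)] | @tsv'
# prints the tab-separated line: 8.8.8.8  80  /www/public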

Curl Command response output has new line and Extra characters

I am using the below curl command to download a file from a URL, but the output file has newlines and extra characters, due to which the Tiff file is getting corrupted.
curl -k -u Username:Password URL >/Test.Tiff
Sample Test.Tiff has the below data:
1.
2.
3.IDCFILE87918
4.II*ÿûÞ©¥zKÿJÛï_]ÿÿÿ÷ÿÞï¹×ëÿ¤ÿO]
5¿ûÕÿÿ¯zê¿ß£0•¿þÛ¯kÚÿ¹5Éöûé_u_éwÕzkJï·_¯¯ßþýuw]í~þžmúºßÿzÈfçúîC7½õëÿÛ¯ô¿Z[6.ý®Úö·4ýý ~«v×ÿº^Ÿ¿í¾Ýÿzuýëÿ÷×]}ûÿõé‰ÿ¿m/KûÿµÛ_ý¾×Oín½}+wýzíýö¿õÿî—7.ékñN¿û­Sߦ=ºì%±N—í¯i_Û¶¬:×·m{
8.ÿ­¶ÿím¿í/ívÒ®ÒP­¯Õ¥¶¿}SÛúì%Ú_kûim­ú«i·V½»
9..Âýt•¿ßoÛ]¦Òý´»KßØaPaa…å87M…VÂúý?ÿa„˜ei
The first three lines, where line nos 1 and 2 are newlines (which show up as ^M in the vi editor), are extra and should not be there. When I delete the first three lines and save the file, I am able to open it.
Let me know how the first three lines are getting appended.
Update: try grepping the curl output to remove blank lines, like this:
curl -k -u <username>:<password> <url> | grep -v '^$' > /Test.Tiff
Curl also has the --output <name> option to redirect output to a file. You may first output the response to a file and then use it as grep input:
curl -k -u <username>:<password> --output curl_out.txt <url>
grep -v '^$' curl_out.txt > Test.Tiff
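Note that grep is line-oriented and a Tiff is binary, so deleting blank lines can in principle also remove 0x0A bytes that belong to the image data. A more surgical sketch, assuming GNU grep, is to locate the Tiff magic marker (II*, as visible on line 4 of your sample) and cut everything before it:

# byte offset (0-based) of the first occurrence of the "II*" marker
offset=$(grep -abo -m1 'II\*' curl_out.txt | head -n1 | cut -d: -f1)
# tail -c is 1-based, so skipping $offset bytes means starting at offset+1
tail -c +$((offset + 1)) curl_out.txt > Test.Tiff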

Argument List Too Long - Curl - GeoJSON

I am trying to feed a GeoJSON to this data service using the following bash code.
curl -X POST -F "shape=$(cat myfile.geojson)" \
-F 'age=69' -o reconstructed_myfile.geojson \
https://dev.macrostrat.org/reconstruct
However, I am getting an "Argument list too long" error. I see a lot of related questions open on Stack Overflow, but I do not understand how to apply the answers given in those threads to this specific case.
The error occurs because $(cat myfile.geojson) expands the entire file into a single command-line argument, which can exceed the operating system's argument-length limit. You should instead use <filename or @filename so that curl reads the file itself:
curl -X POST \
-F 'shape=<myfile.geojson' \
-F 'age=69' \
-o 'reconstructed_myfile.geojson' \
-- 'https://dev.macrostrat.org/reconstruct'
See man curl for details:
$ man curl | awk '$1 ~ /-F/' RS=
-F, --form <name=content>
    (HTTP) This lets curl emulate a filled-in form in which a user has
    pressed the submit button. This causes curl to POST data using the
    Content-Type multipart/form-data according to RFC 2388. This
    enables uploading of binary files etc. To force the 'content' part to
    be a file, prefix the file name with an @ sign. To just get the
    content part from a file, prefix the file name with the symbol <. The
    difference between @ and < is then that @ makes a file get
    attached in the post as a file upload, while the < makes a text field
    and just get the contents for that text field from a file.

How to use `jq` in a shell pipeline?

I can't seem to get jq to behave "normally" in a shell pipeline. For example:
$ curl -s https://api.github.com/users/octocat/repos | jq | cat
results in jq simply printing out its help text*. The same thing happens if I try to redirect jq's output to a file:
$ curl -s https://api.github.com/users/octocat/repos | jq > /tmp/stuff.json
Is jq deliberately bailing out if it determines that it's not being run from a tty? How can I prevent this behavior so that I can use jq in a pipeline?
Edit: it looks like this is no longer an issue in recent versions of jq. I have jq-1.6 now and the examples above work as expected.
* (I realize this example contains a useless use of cat; it's for illustration purposes only)
You need to supply a filter as an argument. To pass the JSON through unmodified other than the pretty printing jq provides by default, use the identity filter .:
curl -s https://api.github.com/users/octocat/repos | jq '.' | cat
One use case I have found myself doing frequently as well is "How do I construct JSON data to supply into other shell commands, for example curl?" The way I do this is by using the --null-input/-n option:
Don’t read any input at all! Instead, the filter is run once using null as the input. This is useful when using jq as a simple calculator or to construct JSON data from scratch.
And an example passing it into curl:
jq -n '{key: "value"}' | curl -d @- \
--url 'https://some.url.com' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json'
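A related trick, in case part of the payload lives in a shell variable: jq's --arg option binds the value safely inside the filter, avoiding any quoting problems (same hypothetical URL as above):

value='anything, including "quotes"'
jq -n --arg key "$value" '{key: $key}' | curl -d @- \
    --url 'https://some.url.com' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json'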
