Getting the last 100 commits in repositories of a GitHub user/organisation in Bash?

Context
I wrote the following code to get the last n commits of a repository of a GitHub user/organisation:
# Get the most recent commits of one repository.
commits_json=$(curl -H "Accept: application/vnd.github.v3+json" \
  "https://api.github.com/repos/$github_username/$github_repo_name/commits?per_page=1&page=1")
echo "commits_json=$commits_json"
echo ""
# Read the commit SHAs into an array.
readarray -t branch_commits_arr < <(echo "$commits_json" | jq -r ".[].sha")
echo "branch_commits_arr=${branch_commits_arr[*]}"
Issue
I noticed that I hit the documented rate limit of 60 unauthenticated API calls per hour when I try to do this for every repository of a GitHub user/organisation.
Attempt I
I tried the more general format to get the commit lists in a single API call:
curl -H "Accept: application/vnd.github.v3+json" https://api.github.com/repos/$some_user/commits?per_page=10&page=1
Which returned:
{ "message": "Not Found",
"documentation_url": "https://docs.github.com/rest/reference/repos#get-a-repository"
}
Attempt II
Another approach that would avoid the API rate limit is to parse the Atom feed of each repository; however, that seems like an undesirable hack and more boilerplate code than needed.
Question
Hence, I was wondering: how can one get a JSON list of the most recent 100 (or n) commits across all repositories of a GitHub user/organisation, using the GitHub API in Bash?
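For what it's worth, here is a minimal sketch of one way I imagine this could work, assuming you authenticate with a personal access token (a hypothetical GITHUB_TOKEN variable) to raise the rate limit, list the repositories first, and then fetch the latest commits of each one (one call per repository, merged and sorted with jq):
#!/bin/bash
# Sketch only: GITHUB_TOKEN and github_org are placeholders. Authenticated
# requests get a much higher rate limit than the 60/hour unauthenticated one.
github_org="some-org"
n=100

# List repository full names (first 100; paginate if there are more).
readarray -t repos < <(curl -s -H "Authorization: token $GITHUB_TOKEN" \
  -H "Accept: application/vnd.github.v3+json" \
  "https://api.github.com/users/$github_org/repos?per_page=100" | jq -r '.[].full_name')

# Fetch the latest commits of each repository, merge them into one JSON array,
# sort by commit date and keep the most recent $n overall. Note that empty
# repositories return an error object instead of an array and would need filtering.
for repo in "${repos[@]}"; do
  curl -s -H "Authorization: token $GITHUB_TOKEN" \
    -H "Accept: application/vnd.github.v3+json" \
    "https://api.github.com/repos/$repo/commits?per_page=$n"
done | jq -s "add | sort_by(.commit.author.date) | reverse | .[:$n]"
As far as I know there is no single endpoint that returns commits across all repositories of a user, so this still makes one call per repository; authenticating is what keeps it under the rate limit.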

Related

Github API - Get private repositories of user

I have created a script that automatically backs up my GitHub repositories on a hard drive.
I use my GitHub username in combination with a personal access token to authenticate to GitHub. Now I've been reading a bit in their documentation about how to get ALL my repositories from the API (public & private), but I can only seem to get the public ones...
My script: https://github.com/TomTruyen/GitHub-Backup-Script/blob/main/github_backup_script.sh
From what I can understand, the URL on line 78 should return all my 'owned' repositories (which should include my private ones):
repositories=$(curl -XGET -s https://"${GITHUB_USERNAME}":"${GITHUB_TOKEN}"@api.github.com/users/"${GITHUB_USERNAME}"/repos?per_page="${repository_count}" | jq -c --raw-output ".[] | {name, ssh_url}")
I have already enabled ALL repository scopes which should give me 'Full control of private repositories (and public)'
I'm out of ideas right now... am I doing something wrong?
NOTE: I'm trying to get my private repositories as a USER, not as an organization
NOTE: ${GITHUB_USERNAME} & ${GITHUB_TOKEN} are variables that I have of course filled in, in my script
You're calling the /users endpoint, but looking at List repositories for the authenticated user it looks like you should be calling /user/repos.
By default this will return all repositories, both public and private, for the currently authenticated user. You'll also need to correctly handle pagination (unless you know for sure you have fewer than 100 repositories).
I was able to fetch a list of all my repositories using the following script:
#!/bin/bash
#
# You must set GH_API_USER and GH_API_TOKEN in your environment.
tmpfile=$(mktemp curlXXXXXX)
trap "rm -f $tmpfile" EXIT

page=0
while :; do
    let page++
    curl -sf -o "$tmpfile" \
        -u "$GH_API_USER:$GH_API_TOKEN" \
        "https://api.github.com/user/repos?per_page=100&page=$page&visibility=all"

    count=$(jq length "$tmpfile")
    if [[ $count = 0 ]]; then
        break
    fi

    jq '.[]|.full_name' "$tmpfile"
done
visibility=all is the key thing here, guys. I spent hours but ended up not getting what I wanted. It's been mentioned in the doc as well -
https://docs.github.com/en/rest/repos/repos#list-repositories-for-the-authenticated-user
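In case it helps, a minimal usage sketch (the script name and the placeholder values are my assumptions, not part of the answer):
# Hypothetical usage: export your credentials, then run the script above
# (saved here as list-repos.sh) and capture the repository names.
export GH_API_USER=your-username        # placeholder
export GH_API_TOKEN=ghp_xxxxxxxx        # placeholder: a token that can read your private repos
./list-repos.sh > repos.txt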

How to download a big file from google drive via curl in Bash?

I want to make a very simple Bash script for downloading files from Google Drive via the Drive API. In this case there is a big file on Google Drive, and I used the OAuth 2.0 Playground with my Google Drive account: in the Select the Scope box I chose Drive API v3 and https://www.googleapis.com/auth/drive.readonly to make a token and link.
After clicking Authorize APIs and then Exchange authorization code for tokens, I copied the Access token and used it like below.
#! /bin/bash
read -p 'Enter your id : ' id
read -p 'Enter your new token : ' token
read -p 'Enter your file name : ' file
curl -H "Authorization: Bearer $token" "https://www.googleapis.com/drive/v3/files/$id?alt=media" -o "$file"
But it won't work. Any idea?
For example, the size of my file is 12 GB. When I run the code I get the output below, and after a second it returns to the prompt again. I checked it on two computers with two different IP addresses. (I also added alt=media to the URL.)
-bash-3.2# bash mycode.sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 166 100 166 0 0 80 0 0:00:02 0:00:02 --:--:-- 80
-bash-3.2#
The content of the file that it created is like this:
{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "downloadQuotaExceeded",
        "message": "The download quota for this file has been exceeded."
      }
    ],
    "code": 403,
    "message": "The download quota for this file has been exceeded."
  }
}
You want to download a file from Google Drive using the curl command with the access token.
If my understanding is correct, how about this modification?
Modified curl command:
Please add the query parameter of alt=media.
curl -H "Authorization: Bearer $token" "https://www.googleapis.com/drive/v3/files/$id?alt=media" -o "$file"
Note:
This modified curl command supposes that your access token can be used for downloading the file.
With this modification, files other than Google Docs can be downloaded. If you want to download Google Docs files, please use the Files: export method of the Drive API. Ref
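A sample curl command for the export method might look like the following (the application/pdf MIME type is only an example; please choose one that is supported for your document type):
# Sketch: export a Google Docs file via the Files: export method.
# $id, $token and $file are the same variables as above.
curl -H "Authorization: Bearer $token" \
  "https://www.googleapis.com/drive/v3/files/$id/export?mimeType=application/pdf" \
  -o "$file"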
Reference:
Download files
If I misunderstood your question and this was not the direction you want, I apologize.
UPDATE AS OF MARCH 2021
Simply follow this guide here. It worked for me.
In summary:
For small files, run
wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O FILENAME
While if you are trying to download quite a large file, you should try to run
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=FILEID" -O FILENAME && rm -rf /tmp/cookies.txt
Simply substitute FILEID and FILENAME with your custom values.
FILEID can be found in your file share link (after the /d/, as illustrated in the article mentioned above).
FILENAME is simply the name you want to save the download as. Remember to include the right extension. For example, FILENAME = my_file.pdf if the file is a PDF.
This is a known bug
It has been reported in this Issue Tracker post. This is caused because, as you can read in the documentation:
(about download url)
Short lived download URL for the file. This field is only populated
for files with content stored in Google Drive; it is not populated for
Google Docs or shortcut files.
So you should use another field.
You can follow the report by clicking on the star next to the issue
number to give more priority to the bug and to receive updates.
As you can read in the comments of the report, the current workaround is:
Use webContentLink instead
or
Change www.googleapis.com to content.googleapis.com
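As a rough sketch of that second workaround (untested, same variables as in the question):
# Same request as in the question, but pointed at content.googleapis.com
# as suggested in the issue report above.
curl -H "Authorization: Bearer $token" \
  "https://content.googleapis.com/drive/v3/files/$id?alt=media" \
  -o "$file"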

export github commits/names to CSV with bash & jq

For a project I need to extract data from a lot of different blockchain GitHub profiles to a csv.
After browsing through the GitHub API I was able to get some of the necessary data shown as txt/csv files using Bash commands and jq.
Now doing all of this manually would probably take 7 days. I have a list of profiles I need to loop through, saved as a CSV.
The list looks like this --> https://docs.google.com/spreadsheets/d/1lFsewAYI7F8zSw7WPhI9E9WwR8f4G1clw1yjxY3wz_4/edit#gid=0
My approach so far to get all the repo names looks like this:
sample='[{"name":"0chain"},{"name":"0stateapp"},{"name":"0xcert"}]'
The CSV belongs in here; I didn't know how to read it into that variable yet, but for testing purposes this was enough. If somebody knows how to, feel free to give a hint.
for row in $(echo "${sample}" | jq -r '.[] | @base64'); do
    _jq() {
        echo ${row} | base64 --decode | jq -r ${1}
    }
    for GHUSER in $(echo $(_jq '.name')); do
        curl -s https://api.github.com/users/$GHUSER/repos?per_page=100 | jq -r '.[]|.full_name'
    done
done
The output looks like this:
0chain/0chain-token
0chain/client-sdk
0chain/docs
0chain/gorocksdb
0chain/hostadmin
0chain/rocksdb
0stateapp/ZSCoin
0xcert/0xcert
0xcert/conventions
0xcert/docs
0xcert/erc721-validator
0xcert/erc721-validator-api
0xcert/erc721-validator-ui
0xcert/erc721-website
0xcert/ethereum
0xcert/ethereum-crowdsale
0xcert/ethereum-dex
0xcert/ethereum-erc20
0xcert/ethereum-erc721
0xcert/ethereum-minter
0xcert/ethereum-utils
0xcert/ethereum-xcert
0xcert/ethereum-xcert-builder
0xcert/ethereum-zxc
0xcert/framework
0xcert/framework-cert-test
0xcert/nonfungiblealliance-www
0xcert/solidity-style-guide
0xcert/techpaper
0xcert/truffle
0xcert/web3.js
What I need to do is use all of the above values and generate a file that contains:
Github Profile (already stored in the attached sheet)
The Date when accessing this information
All the repositories belonging to that profile (code above, but filtered)
Now the Interesting part:
The commit history
number of commits (ID)
Date of commit
Description of commit
person who committed
checks passed
checks failed
Almost the same needs to be done for closed and open pull requests, although I think that once the "problem" above is solved, the pull requests follow the same strategy.
For the commits I'd do something like this:
for repo in "${repoarray[@]}"; do curl -s "https://api.github.com/repos/$repo/commits" | jq -r '.[].author.login'; done   # plus whatever else is needed
Basically, this chart here needs to be filled:
https://docs.google.com/spreadsheets/d/1mFXiohiWNXNP8CVztFA1PFF41jn3J9sRUhYALZShsPY/edit?usp=sharing
What I need help with (see the rough sketch at the end of this question):
storing my output from the first loop in an array
loop through that array to get the number of commits
loop through that array to get the data to closed pull requests
loop through that array to get the data to open pull requests
Excuse my "noobish" question.
I'm using bash/jq and the GitHub API for the time.
I'd appreciate any kind of help.
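To make the structure clearer, here is a rough sketch of the loops I have in mind (field names and CSV columns are simplified examples, not the full layout from the sheet):
#!/bin/bash
# Sketch: collect repository full names into an array, then reuse that array
# for commits and pull requests. $GHUSER comes from the profile list as above.
repoarray=()
while IFS= read -r repo; do
    repoarray+=("$repo")
done < <(curl -s "https://api.github.com/users/$GHUSER/repos?per_page=100" | jq -r '.[].full_name')

today=$(date +%Y-%m-%d)

for repo in "${repoarray[@]}"; do
    # Commits: repo, access date, sha, commit date, message, author login as CSV rows.
    curl -s "https://api.github.com/repos/$repo/commits?per_page=100" \
        | jq -r --arg repo "$repo" --arg date "$today" \
            '.[] | [$repo, $date, .sha, .commit.author.date, .commit.message, .author.login // "unknown"] | @csv'

    # Closed and open pull requests follow the same pattern.
    curl -s "https://api.github.com/repos/$repo/pulls?state=closed&per_page=100" \
        | jq -r --arg repo "$repo" '.[] | [$repo, .number, .title, .state] | @csv'
    curl -s "https://api.github.com/repos/$repo/pulls?state=open&per_page=100" \
        | jq -r --arg repo "$repo" '.[] | [$repo, .number, .title, .state] | @csv'
done > output.csv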

How to use WebAPI in bash for sonarqube?

I want to write a shell script to log in and get bugs for a project. I want the dashboard values like bugs, vulnerabilities, code smells and coverage.
The url of dashboard is: http://www.example.com/dashboard?id=example_project_name.
Here is what I tried:
curl -X GET -u username:password 'http://www.example.com/api/issues/search?project=example_project_name&types=BUG'
So, this prints all the data. I just need the total values shown on the dashboard.
Basically, what I want to achieve: I'm using a SonarQube plugin in Jenkins, and I use the extended email plugin to send an email on job execution; in that email I want to give details like the number of bugs in the repository after the build.
Is there any other way?
Finally, after reading the documentation carefully, I got the values. Here is the script that I created.
#!/bin/bash
vul=$(curl -sX GET -u username:password 'http://www.example.com/api/issues/search?projectKeys=example_project_name&types=VULNERABILITY');
bug=$(curl -sX GET -u username:password 'http://www.example.com/api/issues/search?projectKeys=example_project_name&types=BUG');
no_vul=$(echo $vul | jq -r .total);
no_bug=$(echo $bug | jq -r .total);
echo "Total number of VULNERABILITIES are $no_vul"
echo "Total number of BUGS are $no_bug"
Here is the API documentation URL.
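To also cover the remaining dashboard values from the question, a hedged extension of the same script: code smells can be queried with the same issues endpoint, while coverage is a measure rather than an issue, so it should come from the measures API (parameter names as documented in the SonarQube Web API; adjust to your version):
# Sketch: extend the script above for code smells and coverage.
smell=$(curl -sX GET -u username:password 'http://www.example.com/api/issues/search?projectKeys=example_project_name&types=CODE_SMELL');
cov=$(curl -sX GET -u username:password 'http://www.example.com/api/measures/component?component=example_project_name&metricKeys=coverage');
no_smell=$(echo "$smell" | jq -r .total);
coverage=$(echo "$cov" | jq -r '.component.measures[0].value');
echo "Total number of CODE SMELLS are $no_smell"
echo "Coverage is $coverage%"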

curl error 18 transfer closed with outstanding read data remaining

Setup
I'm using curl in the following Bash script to push a JSON file to a REST API running in Tomcat, sitting behind nginx.
while IFS= read -d '' -r file; do
    base=$(basename "$file")
    datetime=$(find "$file" -maxdepth 0 -printf "%TY/%Tm/%Td %TH:%TM:%.2TS")
    curl -vX POST -H "Content-Type: application/json" -H "Cache-Control: no-cache" \
        -d @"$file" -u vangeeij:eian12 \
        "http://192.168.105.10/homeaccess/services/aCStats/uploadData?username=vangeeij&filename=$base&datetime=$datetime"
    #sudo mv "$file" /home/vangeeij/acserver/resultsOld
done < <(sudo find . -type f -print0)
Problem
When running this script I get an HTTP 400 response with the curl error:
curl: (18) transfer closed with outstanding read data remaining
What I have tried
I have found two things. First, running the same URL and body through Postman yields a successful POST.
Second, I found that this error goes away when the last parameter, &datetime=$datetime, is removed from the URL.
I have also found a few connections between this error and setting a curl option, something like:
curl_setopt($curl, CURLOPT_HTTPHEADER, array('Expect:'));
But I'm not sure where/how to set this exactly when using curl in a simple Bash script.
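(For reference, I believe the command-line equivalent would simply be passing an empty Expect header, though I haven't confirmed that it applies here:)
# Presumably the CLI equivalent of CURLOPT_HTTPHEADER => array('Expect:'):
# giving a header name with an empty value tells curl to drop that header.
curl -vX POST -H "Content-Type: application/json" -H "Expect:" \
    -d @"$file" -u vangeeij:eian12 \
    "http://192.168.105.10/homeaccess/services/aCStats/uploadData?username=vangeeij&filename=$base&datetime=$datetime"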
Question
What do I need to change in my curl command to get rid of the error and still be able to use all parameters?
UPDATE
Starting a new question, as further investigation has led me to a better understanding of the problem.
New Question Link
The error has to do with the fact that the datetime= parameter ends up with text in it that needs to be URL-encoded.
This was confirmed by replacing the variable with 2017%2F03%2F01%2008%3A50%3A56, and it worked.
So now the problem is that I can't get --data-urlencode datetime=$datetime to work. It seems this just gets appended to the JSON data or something.
This error is being generated by the fact that the datetime= parameter is being passed in with non-encoded, non-URL-friendly characters (e.g. a space).
The fix is to find a way to convert $datetime to a URL-encoded string,
e.g. convert:
2017/03/01 08:50:56
to:
2017%2F03%2F01%2008%3A50%3A56
See the following discussion for one method to accomplish this.
Post JSON data to Rest with URLEncoded query paramaters
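One minimal way to do the encoding, sketched here with jq (already a dependency elsewhere in these scripts) rather than --data-urlencode, which conflicts with the JSON body sent via -d:
# Sketch: URL-encode only the datetime value with jq's @uri filter, then put
# the encoded string in the query string; the JSON body stays in -d @"$file".
enc_datetime=$(jq -rn --arg v "$datetime" '$v|@uri')
curl -vX POST -H "Content-Type: application/json" -H "Cache-Control: no-cache" \
    -d @"$file" -u vangeeij:eian12 \
    "http://192.168.105.10/homeaccess/services/aCStats/uploadData?username=vangeeij&filename=$base&datetime=$enc_datetime"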
