I have created a script that automatically backs up my GitHub repositories on a hard drive.
I use my GitHub username in combination with a personal access token to authenticate against GitHub. I've been reading their documentation about how to get ALL my repositories from the API (public & private), but I only seem to get the public ones...
My script: https://github.com/TomTruyen/GitHub-Backup-Script/blob/main/github_backup_script.sh
From what I can understand, on line 78 the URL should return all my 'owned' repositories (which should include my private ones):
repositories=$(curl -XGET -s https://"${GITHUB_USERNAME}":"${GITHUB_TOKEN}"@api.github.com/users/"${GITHUB_USERNAME}"/repos?per_page="${repository_count}" | jq -c --raw-output ".[] | {name, ssh_url}")
I have already enabled ALL repository scopes on the token, which should give me 'Full control of private repositories (and public)'.
I'm out of ideas right now... am I doing something wrong?
NOTE: I'm trying to get my private repositories as a USER, not as an organization
NOTE: ${GITHUB_USERNAME} & ${GITHUB_TOKEN} are variables that I have, of course, filled in in my script
You're calling the /users endpoint, but looking at List repositories for the authenticated user it looks like you should be calling /user/repos.
By default this will return all repositories, both public and private, for the currently authenticated user. You'll also need to correctly handle pagination (unless you know for sure you have fewer than 100 repositories).
I was able to fetch a list of all my repositories using the following script:
#!/bin/sh
#
# You must set GH_API_USER and GH_API_TOKEN in your environment.
tmpfile=$(mktemp curlXXXXXX)
trap 'rm -f "$tmpfile"' EXIT

page=0
while :; do
    page=$((page + 1))
    # visibility=all is what makes private repositories show up as well.
    curl -sf -o "$tmpfile" \
        -u "$GH_API_USER:$GH_API_TOKEN" \
        "https://api.github.com/user/repos?per_page=100&page=$page&visibility=all"
    # An empty page means there is nothing left to fetch.
    count=$(jq length "$tmpfile")
    if [ "$count" -eq 0 ]; then
        break
    fi
    jq '.[]|.full_name' "$tmpfile"
done
visibility=all is the key parameter here. I spent hours before figuring that out. It's mentioned in the docs as well:
https://docs.github.com/en/rest/repos/repos#list-repositories-for-the-authenticated-user
Context
I wrote the following code to get the last n commits of a repository of a GitHub user/organisation:
# Get commits
commits_json=$(curl -H "Accept: application/vnd.github.v3+json" "https://api.github.com/repos/$github_username/$github_repo_name/commits?per_page=1&page=1")
echo "commits_json=$commits_json"
echo ""
# Get the first commit.
readarray -t branch_commits_arr < <(echo "$commits_json" | jq ".[].sha")
echo "branch_commits_arr=$branch_commits_arr"
Issue
I noticed that I run into the documented rate limit of 60 API calls per hour when I try to do this for all repositories of a GitHub user/organisation.
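For reference, you can check how much of the quota is left via the /rate_limit endpoint (which, as far as I can tell, does not itself count against the limit):
# Show the remaining core API quota for unauthenticated calls from this IP.
curl -s https://api.github.com/rate_limit | jq '.resources.core'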
Attempt I
I tried the more general format to get the commit lists in a single API call:
curl -H "Accept: application/vnd.github.v3+json" https://api.github.com/repos/$some_user/commits?per_page=10&page=1
Which returned:
{ "message": "Not Found",
"documentation_url": "https://docs.github.com/rest/reference/repos#get-a-repository"
}
Attempt II
Another approach to get the data without triggering the API rate limit would be to parse the atom format of each repository, however, it seems like this is an undesirable hack/more boilerplate code than needed.
Question
Hence, I was wondering: how can one get a list/JSON containing the most recent n (e.g. 100) commits across all the repositories of a GitHub user/organisation, using the GitHub API in Bash?
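A minimal sketch of one direction, assuming a personal access token in GH_TOKEN (authenticated requests are limited to roughly 5,000 calls per hour instead of 60) and that one page of commits per repository is enough; the user name and token values are placeholders:
#!/bin/bash
github_username="some_user"   # user or organisation to scan (placeholder)
GH_TOKEN="ghp_xxx"            # personal access token (placeholder)

# List the repositories, then fetch the latest commit of each one.
curl -s -H "Authorization: token $GH_TOKEN" \
    "https://api.github.com/users/$github_username/repos?per_page=100" |
    jq -r '.[].full_name' |
    while read -r repo; do
        curl -s -H "Authorization: token $GH_TOKEN" \
            "https://api.github.com/repos/$repo/commits?per_page=1" |
            jq -r --arg repo "$repo" \
                '.[] | [$repo, .sha, .commit.author.date, (.commit.message | gsub("\n"; " "))] | @tsv'
    done
This still makes one call per repository, so pagination of the repository list (per_page maxes out at 100) and of the commits would still need handling for larger accounts.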
I am using the below script to generate a list of repositories in one of my GitHub Enterprise orgs and it works fine; however, by default it only fetches 100 repos at a time.
How can I modify it to generate the entire list? I have some 2000 repos in my GitHub org.
curl --silent --user "myusername:mypassword" "https://github.***.com/api/v3/orgs/myorg/repos?page=1&per_page=2000" | npx jq '.[].clone_url' | while read repo
do
    repo="${repo%\"}"
    repo="${repo#\"}"
    echo "$repo"
done > repolist.txt
Tweaking page=*&per_page=* doesn't help: no matter what number combinations I use, the script still produces a repolist.txt containing only the first 100 repos in the GitHub org.
From the docs:
You can specify how many items to receive (up to a maximum of 100).
You can go to /orgs/{org} and read public_repos.
With this value you can compute total_pages=$(($public_repos / 100 + 1)) and iterate up to total_pages, incrementing your page parameter.
Below is a small code, just add your credentials and Org Name:
#!/bin/bash

user=""
password=""
org=""

public_repos=$(curl -s -u "${user}:${password}" "https://api.github.com/orgs/${org}" | jq .public_repos)
per_page=100
total_pages=$(( public_repos / per_page + 1 ))

for page in $(seq 1 $total_pages); do
    curl -s -u "${user}:${password}" \
        "https://api.github.com/orgs/${org}/repos?page=${page}&per_page=${per_page}" | \
        jq -r '.[].clone_url'
done
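One caveat: public_repos only counts public repositories. If the org also has private repos (likely in a GitHub Enterprise org) and your credentials are allowed to see the total_private_repos field, you could fold that in as well, e.g.:
# Count both public and private repos; the // 0 keeps the sum valid if the field is missing.
repo_count=$(curl -s -u "${user}:${password}" "https://api.github.com/orgs/${org}" \
    | jq '.public_repos + (.total_private_repos // 0)')
total_pages=$(( repo_count / per_page + 1 ))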
For a project I need to extract data from a lot of different blockchain GitHub profiles to a csv.
After browsing through the GitHub API I was able to get some of the necessary data output as txt/CSV files using bash commands and jq.
Doing all of this manually would probably take 7 days. I have a list of profiles I need to loop through, saved as a CSV.
The list looks like this --> https://docs.google.com/spreadsheets/d/1lFsewAYI7F8zSw7WPhI9E9WwR8f4G1clw1yjxY3wz_4/edit#gid=0
My approach so far to get all the repo names looks like this:
sample='[{"name":"0chain"},{"name":"0stateapp"},{"name":"0xcert"}]'
The CSV belongs in here; I don't know yet how to read it into that variable, but for testing purposes this was enough. If somebody knows how, feel free to give a hint (see also the sketch after the code below).
for row in $(echo "${sample}" | jq -r '.[] | @base64'); do
    _jq() {
        echo "${row}" | base64 --decode | jq -r "${1}"
    }

    for GHUSER in $(_jq '.name'); do
        curl -s "https://api.github.com/users/$GHUSER/repos?per_page=100" | jq -r '.[]|.full_name'
    done
done
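Regarding reading the CSV instead of the hard-coded sample: a minimal sketch, assuming a file named profiles.csv (hypothetical) with the GitHub profile name in the first column, could build the same JSON with jq:
# Turn the first CSV column into the same [{"name": ...}, ...] structure.
sample=$(cut -d',' -f1 profiles.csv | jq -R '{name: .}' | jq -s '.')
If the CSV has a header row, pipe it through tail -n +2 before cut.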
The output looks like this:
0chain/0chain-token
0chain/client-sdk
0chain/docs
0chain/gorocksdb
0chain/hostadmin
0chain/rocksdb
0stateapp/ZSCoin
0xcert/0xcert
0xcert/conventions
0xcert/docs
0xcert/erc721-validator
0xcert/erc721-validator-api
0xcert/erc721-validator-ui
0xcert/erc721-website
0xcert/ethereum
0xcert/ethereum-crowdsale
0xcert/ethereum-dex
0xcert/ethereum-erc20
0xcert/ethereum-erc721
0xcert/ethereum-minter
0xcert/ethereum-utils
0xcert/ethereum-xcert
0xcert/ethereum-xcert-builder
0xcert/ethereum-zxc
0xcert/framework
0xcert/framework-cert-test
0xcert/nonfungiblealliance-www
0xcert/solidity-style-guide
0xcert/techpaper
0xcert/truffle
0xcert/web3.js
What I need to do is use all of the above values and generate a file that contains:
GitHub profile (already stored in the attached sheet)
the date when accessing this information
all the repositories belonging to that profile (the code above, but filtered)
Now the interesting part, the commit history:
number of commits (ID)
date of commit
description of commit
person who committed
checks passed
checks failed
Almost the same needs to be done for closed and open pull requests, although I think that once the "problem" above is solved, the pull requests can be handled with the same strategy.
For the commits I'd do something like this:
for repo in "${repoarray[@]}"; do curl -s "https://api.github.com/repos/$repo/commits" | jq -r '.[].author.login'; done   # plus whatever else is needed
Basically, this chart here needs to be filled:
https://docs.google.com/spreadsheets/d/1mFXiohiWNXNP8CVztFA1PFF41jn3J9sRUhYALZShsPY/edit?usp=sharing
What I need help with (a rough sketch follows below):
storing the output from the first loop in an array
looping through that array to get the number of commits
looping through that array to get the data for closed pull requests
looping through that array to get the data for open pull requests
Excuse my "noobish" question.
I'm using bash/jq and the GitHub API for the first time.
I'd appreciate any kind of help.
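A minimal sketch of the array-and-loop part, assuming it runs inside the GHUSER loop above; the commit and pull-request fields picked here are only examples, and checks passed/failed would need an extra call per commit (the check-runs/statuses endpoints), which is omitted:
# Collect the repo full_names for this profile into a bash array.
mapfile -t repoarray < <(
    curl -s "https://api.github.com/users/$GHUSER/repos?per_page=100" | jq -r '.[].full_name'
)

for repo in "${repoarray[@]}"; do
    # Commits: repo, sha, date, message, author login.
    curl -s "https://api.github.com/repos/$repo/commits?per_page=100" |
        jq -r --arg repo "$repo" \
            '.[] | [$repo, .sha, .commit.author.date,
                    (.commit.message | gsub("\n"; " ")),
                    (.author.login // "unknown")] | @csv'

    # Open and closed pull requests: repo, number, state, created date, title, author.
    curl -s "https://api.github.com/repos/$repo/pulls?state=all&per_page=100" |
        jq -r --arg repo "$repo" \
            '.[] | [$repo, .number, .state, .created_at, .title, .user.login] | @csv'
done
Redirecting the loop's output to a file and prefixing each row with the current date would get close to the sheet layout.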
I want to write a shell script to log in and get the bugs for a project. I want the dashboard values: bugs, vulnerabilities, code smells, and coverage.
The url of dashboard is: http://www.example.com/dashboard?id=example_project_name.
Here is what I tried:
curl -X GET -u username:password "http://www.example.com/api/issues/search?project=example_project_name&types=BUG"
So, this prints all the data. I just need the total counts shown on the dashboard.
Basically, what I want to achieve: I'm using the SonarQube plugin in Jenkins, and I use the Extended E-mail plugin to send an email on job execution; in that email I want to include details like the number of bugs in the repository after the build.
Is there any other way?
Finally, after reading the documentation carefully, I got the values. Here is the script that I created:
#!/bin/bash

vul=$(curl -sX GET -u username:password 'http://www.example.com/api/issues/search?projectKeys=example_project_name&types=VULNERABILITY')
bug=$(curl -sX GET -u username:password 'http://www.example.com/api/issues/search?projectKeys=example_project_name&types=BUG')

no_vul=$(echo "$vul" | jq -r .total)
no_bug=$(echo "$bug" | jq -r .total)

echo "Total number of VULNERABILITIES is $no_vul"
echo "Total number of BUGS is $no_bug"
Here is the API documentation URL.
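I also wanted code smells and coverage; here's a sketch along the same lines (the measures endpoint and the coverage metric key are taken from the SonarQube Web API, so double-check them against your instance's /web_api documentation):
# Code smells come from the same issues endpoint, just with another type filter.
smell=$(curl -sX GET -u username:password 'http://www.example.com/api/issues/search?projectKeys=example_project_name&types=CODE_SMELL')
no_smell=$(echo "$smell" | jq -r .total)

# Coverage is a measure, not an issue, so it comes from the measures API.
cov=$(curl -sX GET -u username:password 'http://www.example.com/api/measures/component?component=example_project_name&metricKeys=coverage')
coverage=$(echo "$cov" | jq -r '.component.measures[0].value')

echo "Total number of CODE SMELLS is $no_smell"
echo "Coverage is $coverage%"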
I'm attempting to use the new incremental authorization for an installed app in order to add scopes to an existing authorization while keeping the existing scopes. This is done using the new include_granted_scopes=true parameter. However, no matter what I've tried, the re-authorization always overwrites the scopes completely. Here's a minimal Bash PoC script I've written to demo my issue:
client_id='716905662885.apps.googleusercontent.com' # throw away client_id (non-prod)
client_secret='CMVqIy_iQqBEMlzjYffdYM8A' # not really a secret
redirect_uri='urn:ietf:wg:oauth:2.0:oob'
while :
do
    echo "Please enter a list of scopes (space separated) or CTRL+C to quit:"
    read scope

    # Form the request URL
    # http://goo.gl/U0uKEb
    auth_url="https://accounts.google.com/o/oauth2/auth?scope=$scope&redirect_uri=$redirect_uri&response_type=code&client_id=$client_id&approval_prompt=force&include_granted_scopes=true"

    echo "Please go to:"
    echo
    echo "$auth_url"
    echo
    echo "after accepting, enter the code you are given:"
    read auth_code

    # swap authorization code for access token
    # http://goo.gl/Mu9E5J
    auth_result=$(curl -s https://accounts.google.com/o/oauth2/token \
        -H "Content-Type: application/x-www-form-urlencoded" \
        -d code=$auth_code \
        -d client_id=$client_id \
        -d client_secret=$client_secret \
        -d redirect_uri=$redirect_uri \
        -d grant_type=authorization_code)
    access_token=$(echo -e "$auth_result" | \
        grep -Po '"access_token" *: *.*?[^\\]",' | \
        awk -F'"' '{ print $4 }')

    echo
    echo "Got an access token of:"
    echo $access_token
    echo

    # Show information about our access token
    info_result=$(curl -s --get https://www.googleapis.com/oauth2/v2/tokeninfo \
        -H "Content-Type: application/json" \
        -d access_token=$access_token)
    current_scopes=$(echo -e "$info_result" | \
        grep -Po '"scope" *: *.*?[^\\]",' | \
        awk -F'"' '{ print $4 }')

    echo "Our access token now allows the following scopes:"
    echo $current_scopes | tr " " "\n"
    echo
    echo "Let's add some more!"
    echo
done
The script simply performs an OAuth authorization and then prints out the scopes the token is currently authorized to use. In theory it should continue to add scopes on each pass, but in practice the list of scopes gets overwritten each time. The idea would be that on the first run you'd use a minimal scope such as email, and on the next run tack on something more, like read-only calendar (https://www.googleapis.com/auth/calendar.readonly). Each time, the user should only be prompted to authorize the currently requested scopes, but the resulting token should be good for all scopes, including those authorized on previous runs.
I've tried with a fresh client_id/secret and the results are the same. I know I could just include the already-authorized scopes again, but that prompts the user for all of the scopes, even those already granted, and we all know that the longer the list of scopes, the less likely the user is to accept.
UPDATE: during further testing, I noticed that the permissions for my app do show the combined scopes of each incremental authorization. I tried waiting 30 seconds or so after the incremental auth, then grabbing a new access token with the refresh token but that access token is still limited to the scopes of the last authorization, not the combined scope list.
UPDATE 2: I've also toyed around with keeping the original refresh token. The refresh token is only getting new access tokens that allow the original scopes, the incrementally added scopes are not included. So it seems effectively that include_granted_scopes=true is having no effect on the tokens, the old and new refresh tokens continue to work but only for their specified scopes. I cannot get a "combined scope" refresh or access token.
Google's OAuth 2.0 service does not support incremental auth for installed/native apps; it only works for the web server case. Their documentation is broken.
Try adding a complete list of scopes to the second request, where you exchange the authorization code for an access token. Strangely enough, the scope parameter doesn't seem to be documented, but it is present in requests generated by google-api-java-client. For example:
code=foo&grant_type=authorization_code
&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2Fmyapp%2FoauthCallback
&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.profile+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fplus.me+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fplus.stream.write
In the web server scenario, a complete list of granted scopes is returned together with authorization code when include_granted_scopes is set to true. This is another bit of information that seems to be missing from linked documentation.
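Applied to the script in the question, that would mean adding a scope field to the code-exchange request; a sketch (the scope list here is only an example, and --data-urlencode takes care of the encoding):
auth_result=$(curl -s https://accounts.google.com/o/oauth2/token \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d code=$auth_code \
    -d client_id=$client_id \
    -d client_secret=$client_secret \
    -d redirect_uri=$redirect_uri \
    -d grant_type=authorization_code \
    --data-urlencode "scope=https://www.googleapis.com/auth/userinfo.email https://www.googleapis.com/auth/userinfo.profile")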
Edit 1: Including a complete list of scopes in the code-exchange request works for us in our Java app, but I have just tried your original script with no modification (except for the client id/secret) and it works just fine (only the ids and tokens below are edited):
$ bash tokens.sh
Please enter a list of scopes (space separated) or CTRL+C to quit:
https://www.googleapis.com/auth/userinfo.profile
Please go to:
https://accounts.google.com/o/oauth2/auth?scope=https://www.googleapis.com/auth/userinfo.profile&redirect_uri=urn:ietf:wg:oauth:2.0:oob&response_type=code&client_id=189044568151-4bs2mcotfi2i3k6qp7vq8c6kbmkp2rf8.apps.googleusercontent.com&approval_prompt=force&include_granted_scopes=true
after accepting, enter the code you are given:
4/4qXGQ6Pt5QNYqdEuOudzY5G0ogru.kv_pt5Hlwq8UYKs_1NgQtlUFsAJ_iQI
Got an access token of:
ya29.1.AADtN_XIt8uUZ_zGZEZk7l9KuNQl9omr2FRXYAqf67QF92KqfvXliYQ54ffg_3E
Our access token now allows the following scopes:
https://www.googleapis.com/auth/userinfo.profile
https://www.googleapis.com/auth/userinfo.email
https://www.googleapis.com/auth/plus.me
https://www.googleapis.com/auth/plus.circles.read
You can see that the previously granted scopes are included...