What is the correct placement of caching in a GitHub Actions workflow? Specifically, is it correct to place it before or after running setup of tools with another action?
For example, if I'm using something like haskell/actions/setup, should my use of actions/cache precede or follow it? Put another way: if setup subsequently installs updated components on a future run of my workflow, will the corresponding parts of the cache be invalidated?
The cache action should be placed before any step that consumes or creates that cache. This step is responsible for:
defining the cache parameters (path and key).
restoring the cache, if a matching key was saved in a previous run.
GitHub Actions will then run a "Post *" step after all the other steps, which stores the cache for future runs.
See the example workflow from the documentation.
For example, consider this sample workflow:
name: Caching Test
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Enable Cache
        id: cache-action
        uses: actions/cache@v2
        with:
          path: cache-folder
          key: ${{ runner.os }}-cache-key
      - name: Use or generate the cache
        if: steps.cache-action.outputs.cache-hit != 'true'
        run: mkdir cache-folder && touch cache-folder/hello
      - name: Verify we have our cached file
        run: ls cache-folder
On the first run, the cache step reports a miss, the "Use or generate the cache" step creates the folder, and the "Post Enable Cache" step saves it.
On the second run, the cache step restores cache-folder, so the generation step is skipped and the verification step still finds the file.
GitHub will not invalidate the cache. Instead, it is the developer's responsibility to ensure that the cache key is unique to the content it represents.
One common way to do this is to include in the cache key a hash of a file that lives in the repository, so that changes to that file yield a different cache key. A good example is a lock file that lists all of your repository's dependencies (requirements.txt for Python, Gemfile.lock for Ruby, etc.).
This is achieved by a syntax similar to this:
key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}
as described in the Creating a cache key section of the documentation.
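Putting these pieces together, a cache step keyed on a lock-file hash might look like the sketch below. The path and lock-file name are placeholders you would adapt to your toolchain; for haskell/actions/setup, for example, you would typically cache the directory that setup populates (such as ~/.stack) and hash your project's lock file:

```yaml
- uses: actions/cache@v2
  with:
    # placeholder path: cache whatever directory your setup step populates
    path: ~/.stack
    key: ${{ runner.os }}-stack-${{ hashFiles('**/stack.yaml.lock') }}
    # fall back to the most recent partial match if the exact key misses
    restore-keys: |
      ${{ runner.os }}-stack-
```

With restore-keys, a changed lock file still starts from the newest previous cache instead of an empty one, and the post step saves a fresh cache under the new key.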
This is a question about the GitHub Action dawidd6/action-download-artifact.
There is no discussion board in https://github.com/dawidd6/action-download-artifact, so I am asking the question in this forum.
This is how I wish to use this workflow in my GitHub repo:
A pull request is created.
This triggers a workflow – let's call it the "build workflow" – that builds the entire repo and uploads the build artifacts.
Then another workflow – let's call it the "test workflow" – should start, download the build artifact using action-download-artifact, and run some other actions.
Now if I use pull_request as the trigger for the "test workflow", how can I make it wait for the corresponding "build workflow" to complete? Do I specify the run_id?
For now I am using workflow_run as the trigger for the test workflow. But then, when a PR is created, the "test workflow" does not show up as one of the checks on the PR. Can you help me figure out the correct way of using the download-artifact action for my purpose?
You could write two workflows where the first builds when the pull request is opened or edited, and the second executes the test when the pull request is closed and merged. The HEAD commit SHA could be used to identify the artifact name between the two workflows.
I'm going to reword your requirements slightly.
Build everything and upload the artifacts when a pull request is opened or edited (e.g. new commits added).
Download the artifact and test it when a pull request is closed and merged.
Here are two sample workflows that would accomplish that. The automatically provided secrets.GITHUB_TOKEN is used to let the second workflow download artifacts produced by the first (see below).
Build.yml
name: Build
on:
  pull_request:

jobs:
  Build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Environment Variables
        shell: bash
        run: |
          ARTIFACTS_SHA=$(git rev-parse HEAD)
          BUILD_ARTIFACTS=BuildArtifacts_${ARTIFACTS_SHA}
          echo "BUILD_ARTIFACTS=$BUILD_ARTIFACTS" >> $GITHUB_ENV
      - name: Build
        run: make
      - name: Capture Artifacts
        uses: actions/upload-artifact@v2
        with:
          name: ${{ env.BUILD_ARTIFACTS }}
          path: path/to/artifact/
Test.yml
name: Test
on:
  pull_request:
    types: [closed]

jobs:
  Test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Environment Variables
        shell: bash
        run: |
          ARTIFACTS_SHA=$(git rev-parse HEAD)
          BUILD_ARTIFACTS=BuildArtifacts_${ARTIFACTS_SHA}
          echo "BUILD_ARTIFACTS=$BUILD_ARTIFACTS" >> $GITHUB_ENV
      - name: Download Artifacts
        uses: dawidd6/action-download-artifact@v2
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          workflow: Build.yml
          name: ${{ env.BUILD_ARTIFACTS }}
I have the following GitHub Actions workflow that is intended to read our lines-of-code coverage from a coverage.txt file and post the coverage as a comment.
name: Report Code Coverage

on:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Use Node.js 12.x
        uses: actions/setup-node@v1
        with:
          node-version: 12.x
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm run test:coverage
      ## I need help around here, as test:coverage generates a file and I need to get a value from a file
      - uses: mshick/add-pr-comment@v1
        with:
          message: |
            **{{Where I want the code coverage value to go}}**
            🌏
            !
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          repo-token-user-login: 'github-actions[bot]' # The user.login for temporary GitHub tokens
          allow-repeats: false # This is the default
Where I am struggling is in taking the output of the file, which I can obtain in bash with awk '/^Lines/ { print $2 }' < coverage.txt (not sure if there is a better alternative), and inserting it into the message, which currently just says hello.
I found an article on YAML variables, but when I added some of that code to my own file, it simply was no longer recognized by GitHub Actions (which is the only way I know to test YAML). Normally it would fail somewhere and give me an error, but after multiple attempts, the only thing that worked was to remove it.
It is quite possible that I am missing something obvious as I do not know yaml very well nor even what certain key words might be.
Alternatively, is it easier to just dump the contents of the file into the message? That could be acceptable as well.
You can create a step that writes the coverage to an output variable, which can then be accessed by the next step.
Notice that to use this method you need to give the step an id and the output variable a name, so that you can access it in follow-up steps within the same job.
Sample with your workflow below, but if you want to see a running demo here is the repo https://github.com/meroware/demo-coverage-set-output
- name: Run tests
  run: npm run test:coverage
- name: Get coverage output
  id: get_coverage
  # "::set-output" worked at the time of writing; on current runners the
  # equivalent is: echo "coverage=..." >> "$GITHUB_OUTPUT"
  run: echo "::set-output name=coverage::$(awk '/^Lines/ { print $2 }' < coverage.txt)"
- uses: mshick/add-pr-comment@v1
  with:
    message: |
      Coverage found ${{ steps.get_coverage.outputs.coverage }}
      🌏
      !
    repo-token: ${{ secrets.GITHUB_TOKEN }}
    repo-token-user-login: 'github-actions[bot]' # The user.login for temporary GitHub tokens
    allow-repeats: false # This is the default
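The awk extraction can be sanity-checked locally before wiring it into the workflow. The coverage.txt contents below are an assumption (a summary where the percentage is the second whitespace-separated field on the "Lines" row); adjust the awk program if your reporter prints something like `Lines : 87.50%` instead:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Write a sample coverage summary (format is an assumption; see above).
cat > coverage.txt <<'EOF'
Statements 90.00% (18/20)
Branches 75.00% (6/8)
Lines 87.50% (14/16)
EOF

# Same extraction as in the workflow step: second field of the "Lines" row.
coverage=$(awk '/^Lines/ { print $2 }' < coverage.txt)
echo "$coverage"   # -> 87.50%
```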
I make a lot of github actions tutorials here if you're looking to grasp more on this topic. Cheers!!
I would like to avoid repeating a strategy matrix across jobs:
jobs:
  build-sdk:
    runs-on: macOS-latest
    strategy:
      fail-fast: false
      matrix:
        qt-version: ['5.15.1']
        ios-deployment-architecture: ['arm64', 'x86_64']
        ios-deployment-target: ['12.0']
    steps:
      …
  create-release:
    needs: build-sdk
    runs-on: macOS-latest
    steps:
      …
  publish-sdk:
    needs: [build-sdk, create-release]
    runs-on: macOS-latest
    strategy:
      fail-fast: false
      matrix: ?????
    steps:
      …
Is this possible (without creating a job to create the matrix as JSON itself)?
There's an action that allows uploading multiple assets to the same release from a matrix build that's triggered on a push to a tag. Someone filed an issue about this specific use case, and the action's author responded with:
Assets are uploaded for the GitHub release associated with the same tag, so as long as this action is run in a workflow run for the same tag, all assets should get added to the same GitHub release.
This suggests that a workflow like this would probably meet your needs:
on:
  push:
    tags:
      - 'v*' # Push events to matching v*, i.e. v1.0, v20.15.10

jobs:
  release:
    runs-on: macOS-latest
    strategy:
      fail-fast: false
      matrix:
        qt-version: ['5.15.1']
        ios-deployment-architecture: ['arm64', 'x86_64']
        ios-deployment-target: ['12.0']
    steps:
      - name: build SDK
        run: ...
      - name: Create Release
        uses: softprops/action-gh-release@v1
        with:
          # SDK_file1 and SDK_file2 were created in the previous build step;
          # the files list is newline-delimited, one path or glob per line
          files: |
            SDK_file1
            SDK_file2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_REPOSITORY: username/reponame
      - name: publish SDK
        run: ...
I've simplified what you would need to do, but I'm guessing you might be wanting to upload assets with names reflecting their applicable matrix options. For that detail, I recommend adding an explicit step in your job to create the asset's filename and add it to the job environment, somewhat similar to what I've done here:
- name: Name asset
  run: |
    BINARY_NAME=sdk-qt${{matrix.qt-version}}-iOS${{matrix.ios-deployment-target}}-${{matrix.ios-deployment-architecture}}
    echo "BINARY_NAME=$BINARY_NAME" >> $GITHUB_ENV
Then, when your build step generates your assets, you can name them with the filename in ${{env.BINARY_NAME}}, and pass that same name to the release creation step like I've done in my asset release step here.
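Combining the naming step with the release step, each matrix entry would then upload its own distinctly named asset. This is a sketch only; the .tar.gz packaging is an assumption about what your build step produces:

```yaml
- name: Create Release
  uses: softprops/action-gh-release@v1
  with:
    # BINARY_NAME was exported to the job environment by the "Name asset" step
    files: ${{ env.BINARY_NAME }}.tar.gz
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Because all matrix entries run for the same tag, each one's release step attaches its asset to the same GitHub release.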
I have a GitLab job that downloads a bunch of dependencies and stuffs them in a cache (if necessary); then I have a bunch of jobs that use that cache. I notice that at the end of the consuming jobs, they spend a lot of time creating a new cache, even though they made no changes to it.
Is it possible to have them act only as consumers? Read-only?
cache:
  paths:
    - assets/

configure:
  stage: .pre
  script:
    - conda env update --prefix ./assets/env/base -f ./environment.yml;
    - source activate ./assets/env/base
    - bash ./download.sh

parse1:
  stage: build
  script:
    - source activate ./assets/env/base;
    - ./build.sh -b test -s 2
  artifacts:
    paths:
      - build

parse2:
  stage: build
  script:
    - source activate ./assets/env/base;
    - ./build.sh -b test -s 2
  artifacts:
    paths:
      - build
The very detailed .gitlab-ci.yml documentation describes a cache setting called policy. GitLab caches have the concept of push (i.e. write) and pull (i.e. read). By default the policy is pull-push (read at the beginning of the job and write at the end).
If you know the job does not alter the cached files, you can skip the upload step by setting policy: pull in the job specification. Typically, this would be twinned with an ordinary cache job at an earlier stage to ensure the cache is updated from time to time:
.gitlab-ci.yml > cache:policy
Which pretty much describes this situation: the configure job updates the cache, and the parse jobs do not alter it.
In the consuming jobs, add:
cache:
  paths:
    - assets/
  policy: pull
For clarity, it probably wouldn't hurt to make that explicit in the global setting:
cache:
  paths:
    - assets/
  policy: pull-push
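Applied to the original pipeline, one of the consuming jobs would then look something like this sketch (same script as before, with the job-level cache override added):

```yaml
parse1:
  stage: build
  cache:
    paths:
      - assets/
    policy: pull        # restore the cache at job start, skip the upload at job end
  script:
    - source activate ./assets/env/base
    - ./build.sh -b test -s 2
  artifacts:
    paths:
      - build
```

The job-level cache block replaces the global one for that job, so the paths must be repeated alongside the policy.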
TL;DR: override the cache configuration in the consuming jobs with no paths element.
You probably also have to add a key element to your global cache configuration; I have actually never used a cache without a key element.
See the cache documentation here
What I considered:
GitHub offers GitHub Pages to host documentation from either a folder on my master branch or a dedicated gh-pages branch, but that would mean committing build artifacts
I can also let readthedocs build and host the docs for me through webhooks, but that means learning how to configure Yet Another Tool at a point in time where I am trying to consolidate everything related to my project in github-actions
I already have a docs-building process that works for me (using sphinx as the builder) and that I can also test locally, so I'd rather just leverage that. It has all the rules set up and drops some static html in an artifact - it just doesn't get served anywhere. Handling it in the workflow, where all the other deployment configuration of my project lives, feels better than scattering it over different tools or GitHub-specific options.
Is there already an action in the marketplace that allows me to do something like this?
name: CI
on: [push]

jobs:
  ... # do stuff like building my-project-v1.2.3.whl, testing, etc.

  release_docs:
    steps:
      - uses: actions/sphinx-to-pages@v1 # I wish this existed
        with:
          dependencies:
            - some-sphinx-extension
            - dist/my-project*.whl
          apidoc_args:
            - "--no-toc"
            - "--module-first"
            - "-o docs/autodoc"
            - "src/my-project"
          build-args:
            - "docs"
            - "public" # the content of this folder will then be served at
                       # https://my_gh_name.github.io/my_project/
In other words, I'd like to still have control over how the build happens and where artifacts are dropped, but do not want to need to handle the interaction with readthedocs or github-pages.
### Actions that I tried
❌ deploy-to-github-pages: runs the docs build in an npm container - will be inconvenient to make it work with python and sphinx
❌ gh-pages-for-github-action: no documentation
❌ gh-pages-deploy: seems to target host envs like jekyll instead of static content, and correct usage with yml syntax not yet documented - I tried a little and couldn't get it to work
❌ github-pages-deploy: looks good, but correct usage with yml syntax not yet documented
✅ github-pages: needs a custom PAT in order to trigger rebuilds (which is inconvenient) and uploads broken html (which is bad, but might be my fault)
✅ deploy-action-for-github-pages: also works, and looks a little cleaner in the logs. Same limitations as the upper solution though, it needs a PAT and the served html is still broken.
The eleven other results when searching for github+pages on the action marketplace all look like they want to use their own builder, which sadly never happens to be sphinx.
In the case of managing sphinx with pip (requirements.txt), pipenv, or poetry, we can deploy our documentation to GitHub Pages as follows. The workflow also works the same for other Python-based static site generators like Pelican and MkDocs. Here is a simple example for MkDocs. We just add the workflow as .github/workflows/gh-pages.yml
For more options, see the latest README: peaceiris/actions-gh-pages: GitHub Actions for GitHub Pages 🚀 Deploy static files and publish your site easily. Static-Site-Generators-friendly.
name: github pages

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.8'
      - name: Upgrade pip
        run: |
          # install pip=>20.1 to use "pip cache dir"
          python3 -m pip install --upgrade pip
      - name: Get pip cache dir
        id: pip-cache
        run: echo "::set-output name=dir::$(pip cache dir)"
      - name: Cache dependencies
        uses: actions/cache@v2
        with:
          path: ${{ steps.pip-cache.outputs.dir }}
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - name: Install dependencies
        run: python3 -m pip install -r ./requirements.txt
      - run: mkdocs build
      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./site
I got it to work, but there is no dedicated action to build and host sphinx docs on either github pages or readthedocs as of yet, so as far as I am concerned there is quite a bit left to be desired here.
This is my current release_sphinx job that uses the deploy-action-for-github-pages action and uploads to github-pages:
release_sphinx:
  needs: [build]
  runs-on: ubuntu-latest
  container:
    image: python:3.6
    volumes:
      - dist:dist
      - public:public
  steps:
    # check out sources that will be used for autodocs, plus readme
    - uses: actions/checkout@v1
    # download wheel that was built and uploaded in the build step
    - uses: actions/download-artifact@v1
      with:
        name: distributions
        path: dist
    # didn't need to change anything here, but had to add sphinx.ext.githubpages
    # to my conf.py extensions list. that fixes the broken uploads
    - name: Building documentation
      run: |
        pip install dist/*.whl
        pip install sphinx Pallets-Sphinx-Themes
        sphinx-apidoc --no-toc --module-first -o docs/autodoc src/stenotype
        sphinx-build docs public -b dirhtml
    # still need to build and set the PAT to get a rebuild on the pages job,
    # apart from that quite clean and nice
    - name: github pages deploy
      uses: peaceiris/actions-gh-pages@v2.3.1
      env:
        PERSONAL_TOKEN: ${{ secrets.PAT }}
        PUBLISH_BRANCH: gh-pages
        PUBLISH_DIR: public
    # since gh-pages has a history, this step might no longer be necessary.
    - uses: actions/upload-artifact@v1
      with:
        name: documentation
        path: public
Shoutout to the deploy action's maintainer, who resolved the upload problem within 8 minutes of me posting it as an issue.