How to debug GitHub Action failure - ffmpeg

Just yesterday a stable GitHub Action (CI) started failing rather cryptically and I've run out of tools to debug it.
All I can think of is that our BUNDLE_ACCESS_TOKEN went bad somehow, but I didn't set it up. It's an Actions secret under Repository secrets, so its value isn't visible in the GitHub UI. How can I test whether it's still valid?
Or maybe it's something else?!? "Bad credentials" is vague...
Here's the meat of the action we're trying to run:
# my_tests.yml
jobs:
  my-test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:13.4
        env:
          POSTGRES_USERNAME: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: myapp_test
        ports:
          - 5432:5432
        options: --health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5
    env:
      RAILS_ENV: test
      POSTGRES_HOST: localhost
      POSTGRES_USERNAME: pg
      POSTGRES_PASSWORD: pg
      GITHUB_TOKEN: ${{ secrets.BUNDLE_ACCESS_TOKEN }}
      BUNDLE_GITHUB__COM: x-access-token:${{ secrets.BUNDLE_ACCESS_TOKEN }}
      CUCUMBER_FORMAT: progress
    steps:
      - uses: actions/checkout@v2
      - uses: FedericoCarboni/setup-ffmpeg@v1
      ...
And with debug turned on, here's the failure (line 20) from GitHub Actions:
Run FedericoCarboni/setup-ffmpeg@v1
1 ##[debug]Evaluating condition for step: 'Run FedericoCarboni/setup-ffmpeg@v1'
2 ##[debug]Evaluating: success()
3 ##[debug]Evaluating success:
4 ##[debug]=> true
5 ##[debug]Result: true
6 ##[debug]Starting: Run FedericoCarboni/setup-ffmpeg@v1
7 ##[debug]Loading inputs
8 ##[debug]Loading env
9 Run FedericoCarboni/setup-ffmpeg@v1
10 with:
11 env:
12 RAILS_ENV: test
13 POSTGRES_HOST: localhost
14 POSTGRES_USERNAME: pg
15 POSTGRES_PASSWORD: pg
16 GITHUB_TOKEN: ***
17 BUNDLE_GITHUB__COM: x-access-token:***
19 CUCUMBER_FORMAT: progress
20 Error: Bad credentials
21 ##[debug]Node Action run completed with exit code 1
22 ##[debug]Finishing: Run FedericoCarboni/setup-ffmpeg@v1
Thanks for any help.

For your particular case, try scoping GITHUB_TOKEN and BUNDLE_GITHUB__COM to only the steps that actually use them, instead of the whole job.
Also consider switching to FedericoCarboni/setup-ffmpeg@v2, which has built-in support for github.token.
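As for testing the token itself: you can call the GitHub API with it, either from a scratch step in the workflow or from any machine. A valid token returns HTTP 200 from /user; a revoked or expired one returns 401 with the same "Bad credentials" message seen in the failure. A sketch (the step name is made up):

```yaml
- name: Check BUNDLE_ACCESS_TOKEN
  run: |
    # Prints 200 if the token is valid, 401 ("Bad credentials") if not.
    curl -s -o /dev/null -w "%{http_code}\n" \
      -H "Authorization: token ${{ secrets.BUNDLE_ACCESS_TOKEN }}" \
      https://api.github.com/user
```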
Generic GH Action Debugging
https://github.com/nektos/act
Run actions locally. Mostly gives you faster feedback for experiments.
https://github.com/mxschmitt/action-tmate
Lets you create an interactive remote session where you can poke around.
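For example, dropping a tmate step after the failing one opens an SSH session you can attach to from the job log (a sketch; the step name is made up, and v3 is the major version I'd reach for):

```yaml
- name: Debug session on failure
  if: ${{ failure() }}          # only start tmate when a previous step failed
  uses: mxschmitt/action-tmate@v3
  timeout-minutes: 15           # don't hold the runner forever
```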

Related

sh: 1: nest: Permission denied in GitHub Action

For some reason, the build step for my NestJS project has been failing in my GitHub Action for a few days now. I use Turborepo with pnpm in a monorepo and run the build with turbo run build. This works flawlessly on my local machine, but in GitHub it fails with sh: 1: nest: Permission denied. ELIFECYCLE Command failed with exit code 126. I'm not sure how this is possible, since I couldn't find any meaningful change I made to the code in the meantime; it just stopped working unexpectedly. I suspect an issue with GH Actions itself, since the build also works in my local Docker build.
Has anyone else encountered this issue with NestJS in GH Actions?
This is my action yml:
name: Test, lint and build
on:
  push:
jobs:
  test-lint-build:
    runs-on: ubuntu-latest
    services:
      postgres:
        # Docker Hub image
        image: postgres
        # Provide the password for postgres
        env:
          POSTGRES_HOST: localhost
          POSTGRES_USER: test
          POSTGRES_PASSWORD: docker
          POSTGRES_DB: financing-database
        ports:
          # Maps tcp port 5432 on service container to the host
          - 2345:5432
        # Set health checks to wait until postgres has started
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Install pnpm
        uses: pnpm/action-setup@v2.2.2
        with:
          version: latest
      - name: Install
        run: pnpm i
      - name: Lint
        run: pnpm run lint
      - name: Test
        run: pnpm run test
      - name: Build
        run: pnpm run build
        env:
          VITE_SERVER_ENDPOINT: http://localhost:8000/api
      - name: Test financing-server (e2e)
        run: pnpm --filter @project/financing-server run test:e2e
I found out what was causing the problem: I was using node-linker = hoisted to mitigate some issues pnpm's way of linking modules was causing with my Jest tests. Removing it from my project made the action work again.
I still don't know why the build only broke recently, since I've had this option enabled for some time.
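For anyone else hitting this: the setting in question lives in the repository's .npmrc, so the fix amounted to deleting this line (shown for reference):

```
# .npmrc
# Removing the following line restored the GitHub Actions build:
node-linker=hoisted
```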

How can I reuse "services" code in multiple GitHub CI jobs

I am trying to DRY up my GitHub ci.yml file somewhat. I have two jobs—one runs RSpec tests, the other runs Cucumber tests. There were a number of steps they shared, which I’ve extracted to an external action.
They both depend on a postgres and a chrome Docker image, however, and on some environment variables, so currently both jobs include the code below. Is there any way I can put this code in one place for both of them to use? Note I'm not attempting to share the images themselves; I just don't want the repeated code.
services:
  postgres:
    image: postgres:13
    env:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: postgres
    ports:
      - 5432:5432
    # Set health checks to wait until postgres has started
    # tmpfs for faster DB in RAM
    options: >-
      --mount type=tmpfs,destination=/var/lib/postgresql/data
      --health-cmd pg_isready
      --health-interval 10s
      --health-timeout 5s
      --health-retries 5
  chrome:
    image: seleniarm/standalone-chromium:4.1.2-20220227
    ports:
      - 4444:4444
env:
  DB_HOST: localhost
  CHROMEDRIVER_HOST: localhost
  RAILS_ENV: test
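One option (not from the original thread, just a sketch): GitHub Actions has no YAML-anchor support, but the whole shared job body, including the services block, can move into a reusable workflow invoked with workflow_call, parameterized by the test command. The file name test-suite.yml and the test-command input below are made up:

```yaml
# .github/workflows/test-suite.yml -- hypothetical shared workflow
name: Test suite
on:
  workflow_call:
    inputs:
      test-command:
        required: true
        type: string
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:13
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: postgres
        ports:
          - 5432:5432
      # ...chrome service, options and env block, written once, as above...
    steps:
      - uses: actions/checkout@v3
      - run: ${{ inputs.test-command }}
```

Each job in ci.yml then shrinks to a `uses: ./.github/workflows/test-suite.yml` plus a `with:` block supplying its own test-command (e.g. bundle exec rspec in one job, bundle exec cucumber in the other).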

Github Actions | Port conflict on parallel runs

I have a React application that is built with Vite. If I don't specify a port for the preview server, a port is chosen randomly, and then Cypress doesn't know which host/port to go to. If I do specify a port, I get an error that the port is already in use. I don't know how to get around this.
My action config:
name: Node.js CI
# Controls when the workflow will run
on:
  # Triggers the workflow on push or pull request events but only for the master branch
  pull_request:
    branches: [ master ]
# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        node-version: [16.x, 17.x]
        containers: [1, 2, 3, 4]
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      - uses: actions/checkout@v3
      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - name: Cypress run
        uses: cypress-io/github-action@v2
        with:
          record: true
          parallel: true
          group: 'Actions example'
          build: npm run cypress:build
          start: npm run serve:ci
        env:
          CYPRESS_host: http://localhost
          CYPRESS_port: 41732
          CYPRESS_RECORD_KEY: ${{ secrets.CYPRESS_RECORD_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
package.json
...
"scripts": {
"serve:ci": "vite preview --port=41732",
}
...
Error message
start server "npm run serve:ci" command "npm run serve:ci"
current working directory "/home/runner/work/web-client-vite/web-client-vite"
/opt/hostedtoolcache/node/17.8.0/x64/bin/npm run serve:ci
> web-client-vite@0.1.2 serve:ci
> vite preview --port=41732
> Local: http://localhost:41732/
> Network: use `--host` to expose
[1850:0402/162628.909537:ERROR:bus.cc(392)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[1850:0402/162628.909613:ERROR:bus.cc(392)] Failed to connect to the bus: Could not parse server address: Unknown address type (examples of valid types are "tcp" and on UNIX "unix")
[2031:0402/162628.932637:ERROR:gpu_init.cc(453)] Passthrough is not supported, GL is swiftshader, ANGLE is
[2031:0402/162628.943271:ERROR:sandbox_linux.cc(374)] InitializeSandbox() called with multiple threads in process gpu-process.
Port 41732 is already in use.
Test run failed, code 1
More information might be available above
Cypress module has returned the following error message:
Could not find Cypress test run results
Error: Could not find Cypress test run results
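Each matrix job normally runs on its own runner VM, so the conflict is most likely within a single job (e.g. a leftover preview server from the build step). One generic way around "port already in use" is to stop hard-coding the port: ask the OS for a free one at the start of the job, then feed the same value to both vite preview and CYPRESS_port. A sketch (the FREE_PORT name is made up):

```shell
# Ask the OS for an unused TCP port by binding to port 0, then release it.
FREE_PORT=$(python3 - <<'EOF'
import socket
s = socket.socket()
s.bind(("", 0))  # port 0: the OS assigns a free ephemeral port
print(s.getsockname()[1])
s.close()
EOF
)
# Inside a workflow this goes to $GITHUB_ENV so later steps can read it;
# outside Actions the redirect falls back to /dev/null.
echo "FREE_PORT=$FREE_PORT" >> "${GITHUB_ENV:-/dev/null}"
echo "Picked free port: $FREE_PORT"
```

Later steps can then run `vite preview --port "$FREE_PORT"` and set `CYPRESS_port: ${{ env.FREE_PORT }}`. Caveat: there is a small race window between releasing the port and the server binding it, but in practice it is rarely a problem on a fresh CI VM.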

Build times out, can't increase time out

I'm deploying to Kubernetes via Cloud Build. Every now and then the build times out because it exceeds the built-in timeout of ten minutes. I can't figure out how to increase this timeout. I'm using an inline build config in my trigger. It looks like this:
steps:
  - name: gcr.io/cloud-builders/docker
    args:
      - build
      - '-t'
      - '$_IMAGE_NAME:$COMMIT_SHA'
      - .
      - '-f'
      - $_DOCKERFILE_NAME
    dir: $_DOCKERFILE_DIR
    id: Build
  - name: gcr.io/cloud-builders/docker
    args:
      - push
      - '$_IMAGE_NAME:$COMMIT_SHA'
    id: Push
  - name: gcr.io/cloud-builders/gke-deploy
    args:
      - prepare
      - '--filename=$_K8S_YAML_PATH'
      - '--image=$_IMAGE_NAME:$COMMIT_SHA'
      - '--app=$_K8S_APP_NAME'
      - '--version=$COMMIT_SHA'
      - '--namespace=$_K8S_NAMESPACE'
      - '--label=$_K8S_LABELS'
      - '--annotation=$_K8S_ANNOTATIONS,gcb-build-id=$BUILD_ID'
      - '--create-application-cr'
      - >-
        --links="Build
        details=https://console.cloud.google.com/cloud-build/builds/$BUILD_ID?project=$PROJECT_ID"
      - '--output=output'
    id: Prepare deploy
  - name: gcr.io/cloud-builders/gsutil
    args:
      - '-c'
      - |-
        if [ "$_OUTPUT_BUCKET_PATH" != "" ]
        then
          gsutil cp -r output/suggested gs://$_OUTPUT_BUCKET_PATH/config/$_K8S_APP_NAME/$BUILD_ID/suggested
          gsutil cp -r output/expanded gs://$_OUTPUT_BUCKET_PATH/config/$_K8S_APP_NAME/$BUILD_ID/expanded
        fi
    id: Save configs
    entrypoint: sh
  - name: gcr.io/cloud-builders/gke-deploy
    args:
      - apply
      - '--filename=output/expanded'
      - '--cluster=$_GKE_CLUSTER'
      - '--location=$_GKE_LOCATION'
      - '--namespace=$_K8S_NAMESPACE'
    id: Apply deploy
    timeout: 900s
images:
  - '$_IMAGE_NAME:$COMMIT_SHA'
options:
  substitutionOption: ALLOW_LOOSE
substitutions:
  _K8S_NAMESPACE: default
  _OUTPUT_BUCKET_PATH: xxxxx-xxxxx-xxxxx_cloudbuild/deploy
  _K8S_YAML_PATH: kubernetes/
  _DOCKERFILE_DIR: ''
  _IMAGE_NAME: xxxxxxxxxxx
  _K8S_ANNOTATIONS: gcb-trigger-id=xxxxxxxx-xxxxxxx
  _GKE_CLUSTER: xxxxx
  _K8S_APP_NAME: xxxxx
  _DOCKERFILE_NAME: Dockerfile
  _K8S_LABELS: ''
  _GKE_LOCATION: xxxxxxxx
tags:
  - gcp-cloud-build-deploy
  - $_K8S_APP_NAME
I've tried sticking the timeout: 900s option in various places with no luck.
The 10-minute timeout is the default for the whole build. If you add the timeout: 900s option to an individual step, it applies only to that step. A step may have a larger timeout than the overall build timeout, but the build will still fail once the overall timeout is exceeded. This example shows the behavior:
steps:
  - name: 'ubuntu'
    args: ['sleep', '600']
    timeout: 800s # Step timeout: the step may run up to 800s, but the overall 600s timeout wins, so the effective limit is 600s.
timeout: 600s # Overall build timeout
That said, the solution is to set the overall build timeout outside of any step; it can be raised to as much as 24 hours.
Something like the following example should work for you:
steps:
  - name: 'ubuntu'
    args: ['sleep', '600']
timeout: 3600s
Another way to attack this problem is to use a higher-end machine type, so the build itself takes less time. You can specify it like this:
options:
  machineType: N1_HIGHCPU_8
Note: this performance benefit comes at a cost. Check the pricing section and pick the optimal machine type for your requirements and budget.
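Mapped onto the inline config from the question, the overall timeout goes at the top level, next to steps, images and options (abridged; only the relevant keys shown):

```yaml
steps:
  # ... Build / Push / Prepare deploy / Save configs / Apply deploy ...
images:
  - '$_IMAGE_NAME:$COMMIT_SHA'
options:
  substitutionOption: ALLOW_LOOSE
  machineType: N1_HIGHCPU_8   # optional: faster machine, higher cost
timeout: 3600s                # overall build timeout (max 86400s = 24h)
```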

Service elasticsearch is not visible when running tests

name: Rspec
on: [push]
jobs:
  build:
    runs-on: [self-hosted, linux]
    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
        env:
          discovery.type: single-node
        options: >-
          --health-cmd "curl http://localhost:9200/_cluster/health"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 10
      redis:
        image: redis
        options: --entrypoint redis-server
    steps:
      - uses: actions/checkout@v2
      - name: running tests
        run: |
          sleep 60
          curl -X GET http://elasticsearch:9200/
I am running the tests on a self-hosted runner. With docker ps on the host I can see the containers (redis and elasticsearch) come up for the test.
If I enter the redis container, install curl, and run curl -X GET http://elasticsearch:9200/, I see an OK response well within the 60-second wait for the service to come up.
But in the "running tests" step I get the error "Could not resolve host: elasticsearch".
So inside the redis service container the elasticsearch hostname resolves, but in the step it doesn't. What can I do?
You have to map the ports of your service containers and use localhost:<host-port> as the address in steps that run directly on the GitHub Actions runner.
If you configure the job to run directly on the runner machine and your step doesn't use a container action, you must map any required Docker service container ports to the Docker host (the runner machine). You can access the service container using localhost and the mapped port.
https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-syntax-for-github-actions#jobsjob_idservices
name: Rspec
on: [push]
jobs:
  build:
    runs-on: [self-hosted, linux]
    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
        env:
          discovery.type: single-node
        options: >-
          --health-cmd "curl http://localhost:9200/_cluster/health"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 10
        ports:
          # <port on host>:<port on container>
          - 9200:9200
      redis:
        image: redis
        options: --entrypoint redis-server
    steps:
      - uses: actions/checkout@v2
      - name: running tests
        run: |
          sleep 60
          curl -X GET http://localhost:9200/
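Side note: the fixed sleep 60 can be replaced with a short poll against the mapped port, so the step continues as soon as Elasticsearch responds (a sketch of the same step):

```yaml
- name: running tests
  run: |
    # Poll for up to ~60s instead of always sleeping the full 60s.
    for i in $(seq 1 30); do
      curl -fsS http://localhost:9200/ && break
      sleep 2
    done
    curl -X GET http://localhost:9200/
```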
Alternative:
Run the job itself in a container. The job can then reach the service containers by hostname.
name: Rspec
on: [push]
jobs:
  build:
    services:
      elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
        env:
          discovery.type: single-node
        options: >-
          --health-cmd "curl http://localhost:9200/_cluster/health"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 10
      redis:
        image: redis
        options: --entrypoint redis-server
    # Containers must run in Linux based operating systems
    runs-on: [self-hosted, linux]
    # Docker Hub image that this job executes in, pick any image that works for you
    container: node:10.18-jessie
    steps:
      - uses: actions/checkout@v2
      - name: running tests
        run: |
          sleep 60
          curl -X GET http://elasticsearch:9200/
