GitHub Actions Interview — Custom Actions: JavaScript and Docker

You want to create a GitHub Action that: downloads a file from AWS S3, processes it with a custom Python script, and uploads results back. This can't be done with shell commands (too complex). You're deciding between a JavaScript action and a Docker action. Which should you use?

Consider the tradeoffs: (1) JavaScript action: runs on the Node.js runtime bundled with the runner. Pros: fast startup (no container), works on Linux, macOS, and Windows runners, direct access to the runner's filesystem and tools. Cons: must be written in JavaScript or TypeScript, and dependencies have to be committed or bundled because the runner does not run `npm install` for you. (2) Docker action: runs your code in a container. Pros: can use any language (Python, Go, Rust), dependencies are baked into the image (no installation at run time), isolated from the runner. Cons: slower startup (the image is pulled or built before the job's steps run), larger artifacts, and it only runs on Linux runners. (3) For your scenario (Python script, S3 download, processing): use Docker, provided your users run on Linux. The logic is already Python, and you can bundle all dependencies (boto3, pandas, etc.) in the container image. (4) If the action is just a few shell commands or calls to other actions, use a composite action rather than either of these. (5) If the action runs very frequently and startup time matters, or it must support Windows or macOS runners, use JavaScript. (6) Maintenance: a Docker action can be tested locally with `docker run`, and its dependencies are pinned in the image. A JavaScript action requires Node.js dependency management and a bundling step. (7) Performance: JavaScript actions start faster because nothing is pulled or built. For your S3 + Python scenario, the processing time dwarfs container startup, so Docker's overhead is negligible.

Follow-up: When would you use a composite action vs. JavaScript vs. Docker?

You created a JavaScript GitHub Action. You used `const os = require('os')` to get the OS name. The action works on macOS and Linux runners, but fails on Windows runners with "cannot find module 'os'". Why?

This is unusual: `os` is a built-in Node.js module and is available on every platform. Likely causes: (1) It is probably not the Node.js installed on the runner. JavaScript actions run on the Node.js runtime that ships with the runner and is selected by `runs.using` in `action.yml`, so the version on `PATH` does not matter. To confirm the built-in module loads, add a debug step: `node -e "console.log(require('os').platform())"`. (2) Your `action.yml` specifies an outdated runtime: `runs: using: 'node16'` means Node 16, which is deprecated. Use `node20` or a newer supported runtime. (3) Bundling problem: if you ship a bundled `dist/index.js`, a bundler configured for a browser target instead of Node may try to resolve `os` as an npm package. Rebuild with a Node target and commit the fresh bundle. (4) Typo or path mix-up: double-check that the code says `require('os')`, not `require('./os')` or a differently cased name. Built-in module names are case-sensitive. (5) For cross-platform actions: test on all three OSes (Windows, macOS, Linux) to catch platform-specific issues such as path separators and line endings. (6) A Docker action is not a workaround here: Docker container actions only run on Linux runners, so they cannot run on Windows at all. If Windows support matters, stay with a JavaScript (or composite) action and watch for platform differences, since Node.js runs natively on each OS.
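As a small illustration of handling platform differences in a JavaScript action, the sketch below normalizes the OS name. It assumes the runner's `RUNNER_OS` default variable is present inside a workflow and falls back to `os.platform()` elsewhere; the function name `detectRunnerOs` is made up for this example.

```javascript
const os = require('os');

// Map Node's os.platform() values to the names GitHub uses in RUNNER_OS.
const PLATFORM_NAMES = { linux: 'Linux', darwin: 'macOS', win32: 'Windows' };

// Prefer RUNNER_OS (set by the Actions runner); fall back to os.platform()
// so the function also works when run locally outside a workflow.
// Both parameters are injectable to make the function easy to test.
function detectRunnerOs(env = process.env, platform = os.platform()) {
  if (env.RUNNER_OS) return env.RUNNER_OS;
  return PLATFORM_NAMES[platform] || platform;
}

console.log(`Running on ${detectRunnerOs()}`);
```

Because `os` is resolved by Node itself, this file needs no `npm install` on any platform.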

Follow-up: How would you test a JavaScript action on Windows, macOS, and Linux runners?

You created a Docker GitHub Action. The action's image is 2 GB (includes Python, numpy, pandas, ML libraries). When users call the action, the first run takes 5 minutes (pulling and extracting the image). Subsequent runs are faster (image is cached). But the 5-minute initial delay bothers users. How do you optimize?

Optimize Docker image size and distribution: (1) Reduce image size with a smaller base image. For example, `python:3.11-alpine` is roughly 50 MB, while `python:3.11` is close to 1 GB. Caveat: Alpine uses musl libc, so any package without a musl-compatible wheel has to compile from source, which can erase the savings for numpy, pandas, and ML libraries. `python:3.11-slim` is often the safer middle ground. (2) Use multi-stage builds: build dependencies in an intermediate stage, then copy only the final artifacts to a minimal runtime image. This often cuts the size substantially because compilers and build caches stay out of the final image. (3) Remove unnecessary files: some libraries ship docs, tests, and examples. Remove them. (4) Avoid optional extras and caches: with apt, use `apt-get install --no-install-recommends`; with pip, use `pip install --no-cache-dir`. (5) Pre-build and publish the image: if action.yml points at a `Dockerfile`, the runner builds the image at the start of every job. Instead, build it in CI, push it to Docker Hub or GitHub Container Registry, and reference it with `docker://` so users only pull. (6) Distribute via a registry such as GitHub Container Registry: pulling a prebuilt image is typically much faster than rebuilding it. (7) Split into smaller images: if the action has multiple use cases, provide small images for common cases (e.g., basic Python) and large images for advanced cases (with ML libraries). Users choose. (8) Document the first-run cost and where caching applies: GitHub-hosted runners start from a fresh VM for every job, so the image is fetched again each time. Only self-hosted runners keep it cached between runs. (9) For critical paths on self-hosted runners: pre-warm runners with the image. Keep a runner with the image cached permanently (or run a nightly cron to pull and cache it).

Follow-up: Design a Docker action distribution strategy that minimizes first-run overhead.

You created a JavaScript action that reads from `process.env.GITHUB_TOKEN` to authenticate to GitHub API. The action works locally (you manually set GITHUB_TOKEN). But in GitHub Actions workflows, the action fails: "Invalid credentials." Why?

GitHub creates a GITHUB_TOKEN for every workflow job, but it does not export it as an environment variable automatically: (1) The token is exposed through the `secrets.GITHUB_TOKEN` and `github.token` contexts. For `process.env.GITHUB_TOKEN` to be set, the workflow step that calls your action must pass it explicitly: `env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}` (or set it at the job or workflow level so the step inherits it). (2) A common alternative is to declare an input in action.yml, for example `token` with `default: ${{ github.token }}`, and read that input in Node.js. Users then don't have to wire anything up. (3) If your action is a Docker action, the same rule applies. With `runs: using: 'docker' image: 'docker://myaction'`, the container receives the environment set on the step plus the action's inputs as `INPUT_*` variables, but not the token unless it is passed. (4) Permissions: GITHUB_TOKEN has limited permissions by default. If your action needs more (e.g., `contents: write`, `actions: read`), the workflow must explicitly grant them under `permissions:`. (5) For pull requests from forks, GITHUB_TOKEN is restricted to read-only access for security. (6) For testing: set a token locally before running, e.g. `export GITHUB_TOKEN=ghp_xxxx` with a personal access token. (7) Fail fast: if the token isn't available, the action should stop with a clear message: `if (!process.env.GITHUB_TOKEN) { throw new Error('GITHUB_TOKEN not set'); }`.
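The token lookup and fail-fast check can be sketched as below. This is a minimal example, not a definitive implementation: it assumes the action declares a hypothetical `token` input (which the runner exposes to the process as `INPUT_TOKEN`), and the `User-Agent` value is a placeholder.

```javascript
// Resolve an API token for the action, preferring an explicit input.
// INPUT_TOKEN is how the runner exposes a (hypothetical) `token` input;
// GITHUB_TOKEN is only present if the workflow set it via `env:`.
function resolveToken(env = process.env) {
  const token = (env.INPUT_TOKEN || env.GITHUB_TOKEN || '').trim();
  if (!token) {
    throw new Error(
      'No token found. Pass the `token` input or set GITHUB_TOKEN in the step env.'
    );
  }
  return token;
}

// Headers for a GitHub REST API request authenticated with the token.
function authHeaders(token) {
  return {
    Authorization: `Bearer ${token}`,
    Accept: 'application/vnd.github+json',
    'User-Agent': 'my-custom-action', // placeholder; the API expects some User-Agent
  };
}
```

Taking `env` as a parameter keeps the functions testable locally without touching the real environment.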

Follow-up: How would you securely handle credentials in a custom GitHub Action?

You created a Docker action. The action's Dockerfile runs: `RUN npm install` and `RUN pip install`. These take 5 minutes. Users call your action 10 times in a workflow. Does Docker cache the layers, or does each layer rebuild every time the action is called?

Docker caching for actions: (1) `RUN` layers execute at image build time, not each time the action's container starts. What matters is when and where the image is built. (2) If action.yml uses `image: 'Dockerfile'`, the runner builds the image during job setup, in every job that uses the action. On GitHub-hosted runners each job starts on a fresh VM with no layer cache, so the 5-minute build repeats per job. (3) If action.yml uses `image: 'docker://...'`, the runner pulls a prebuilt image from the registry (Docker Hub, GHCR, etc.). The image is built once at release time and reused by all workflows. (4) If you call the same action 10 times in one job, the image is built or pulled once during setup, and each call just starts a new container from it. (5) To optimize action performance: (a) Pre-build the Docker image and push it to a registry. Don't build at job time; build during CI and tag releases. (b) Order the Dockerfile so rarely changing steps (dependency installation) come before frequently changing ones (copying source), so release builds reuse cached layers. (c) For large `RUN npm install` or `RUN pip install` steps: start from a base image that already contains the runtime and heavy dependencies. (d) In the release pipeline, use `docker/build-push-action@v5` with `cache-from: type=gha` and `cache-to: type=gha` to cache layers across workflow runs. (6) Cost for 10 calls in one job: the image is fetched once, so you pay the pull (or build) time once, not 10 times. (7) For distributed workflows (multiple runners or jobs), each runner fetches the image independently. Running on 10 parallel runners means 10 independent pulls.

Follow-up: How would you optimize Docker layer caching for frequently-called actions?

You created a custom GitHub Action that's useful for your organization. You want to publish it so other teams can use it. Currently, it's in a private repo. What's the process to publish and maintain it?

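Publishing a GitHub Action: (1) Decide on visibility. To share only inside your organization, you can keep the repo private or internal and allow other repositories to access it in the repo's Actions settings. To list it on GitHub Marketplace, move it to a public repo (or create a new one). (2) Create an action.yml at the root with metadata: `name: 'My Action' description: 'Does X' inputs: ... outputs: ... runs: using: 'docker' image: 'docker://myorg/myaction:1.0.0'`. Pin a version tag rather than `latest` so a given action release always runs the same image. (3) For Docker actions: publish the image to Docker Hub or GitHub Container Registry. Update action.yml to reference it. (4) Create a README documenting the action: inputs, outputs, examples, troubleshooting. (5) Release on GitHub Marketplace: draft a release in the repository and select the option to publish the action to the Marketplace. You must accept the Marketplace developer agreement, and the action needs a unique name and a single metadata file at the repo root. Listings that meet these requirements are published immediately, without a manual review. (6) Version and tag: use semantic versioning (v1.0.0, v1.1.0, v2.0.0). Users can pin to specific versions, to a major tag such as `v1`, or to a commit SHA. (7) For maintenance: (a) update dependencies regularly (npm packages, base Docker image). (b) respond to issues/PRs. (c) test on every OS the action supports (Docker actions are Linux-only; JavaScript actions should be tested on Windows, macOS, and Linux). (d) maintain backward compatibility within v1; breaking changes warrant v2. (8) Documentation: keep README and examples up-to-date. Link to your source repository. (9) Security: scan dependencies for CVEs. Users trust your action; don't let vulnerabilities slip through. (10) Usage examples: provide real-world examples in the README. Show how to use it in a workflow.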
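One common convention for the versioning step, offered here as an assumption rather than a requirement, is to keep a moving major tag (such as `v1`) pointing at the latest compatible release. The helper names below are invented for this sketch, which derives the major tag from a release tag and flags breaking bumps.

```javascript
// Given a release tag like "v1.4.2", return the moving major tag ("v1").
// Only plain vMAJOR.MINOR.PATCH tags are accepted in this sketch.
function majorTag(releaseTag) {
  const match = /^v(\d+)\.(\d+)\.(\d+)$/.exec(releaseTag);
  if (!match) throw new Error(`Not a vMAJOR.MINOR.PATCH tag: ${releaseTag}`);
  return `v${match[1]}`;
}

// A release moves users off their pinned major tag if the major number changes.
function isBreaking(previousTag, nextTag) {
  return majorTag(previousTag) !== majorTag(nextTag);
}
```

After tagging `v1.4.2`, a release script could move the major tag with `git tag -f v1 v1.4.2` followed by `git push -f origin v1`.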

Follow-up: Design a maintenance and versioning strategy for a published GitHub Action.

Your JavaScript action uses external npm dependencies (lodash, axios, etc.). You ran `npm install` during development, and `node_modules/` grew to 500 MB. When you pushed the action to GitHub, it broke—GitHub's UI warns "The Action repository is too large (>100 MB)." How do you fix this?

GitHub has file size limits, and there are best practices for action dependencies: (1) Don't commit your development `node_modules/` to git. The catch: the runner executes a JavaScript action straight from the repository at the referenced tag or commit and does not run `npm install`, so everything the action needs at run time must be in the repo. In practice: (a) Commit package.json and package-lock.json for reproducible builds. (b) Commit a bundled `dist/` that contains your code and its dependencies. (c) For Docker actions, node_modules is installed inside the image at build time (not committed to git). (2) For JavaScript actions: use a tool like `@vercel/ncc` to bundle dependencies. `ncc` compiles your action and all its dependencies into a single file (dist/index.js), typically a few MB instead of 500 MB. (3) To use ncc: `npm install --save-dev @vercel/ncc`, then `npx ncc build index.js -o dist`. Point `runs.main` in action.yml at `dist/index.js` and commit the dist/ folder. (4) Alternative: use esbuild or webpack to bundle dependencies. (5) Minimize dependencies: do you really need lodash? Maybe a smaller alternative exists. Each dependency adds size. (6) For Docker actions: commit only the Dockerfile, not the image layers. The image is built in CI and pushed to a registry, which keeps the repo small. (7) Limits: GitHub warns about files larger than 50 MB and blocks files larger than 100 MB. The runner downloads the action's repository at the start of every job, so keep it as small as possible. (8) Use .gitignore to exclude node_modules and build artifacts, but not dist/, which must be committed: `node_modules/ *.o *.so`.
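To check a repository before pushing, a short script can list files above a chosen size. The sketch below is illustrative only: the function name, the default ignore list, and any threshold you pass in are assumptions, not GitHub tooling.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively list files under `dir` larger than `limitBytes`,
// skipping directory or file names listed in `ignore`.
function findLargeFiles(dir, limitBytes, ignore = ['.git', 'node_modules']) {
  const results = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (ignore.includes(entry.name)) continue;
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      results.push(...findLargeFiles(fullPath, limitBytes, ignore));
    } else if (entry.isFile()) {
      const { size } = fs.statSync(fullPath);
      if (size > limitBytes) results.push({ file: fullPath, size });
    }
  }
  return results;
}

// Example call (not executed here): files over 5 MB in the current repo.
// findLargeFiles(process.cwd(), 5 * 1024 * 1024);
```

Running it against the repo root after `ncc build` gives a quick view of whether `dist/` stayed small.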

Follow-up: How would you build and bundle a JavaScript action to minimize repository size?

You have a Docker action that calls GitHub API. Inside the container, you run: `curl -H "Authorization: Bearer $GITHUB_TOKEN" https://api.github.com/repos/...`. The action is public, and you publish it to GitHub Marketplace. A user runs your action: it works for them locally but fails on GitHub Actions with "Unauthorized." Why?

Environment variables in Docker actions: (1) Your container's entrypoint reads `$GITHUB_TOKEN`, but GitHub does not set that variable automatically. The container only sees variables set on the step, job, or workflow, plus the action's inputs (as `INPUT_*` variables) and GitHub's default variables. It worked locally because the user had exported the token themselves. (2) The `secrets` context is not available inside action.yml, so the action cannot pull `secrets.GITHUB_TOKEN` on its own. Instead, declare an input, for example `token` with `default: ${{ github.token }}`, and have the entrypoint read it from `$INPUT_TOKEN`. (3) Or, have users pass it via the workflow step: `uses: myorg/myaction@v1 env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}`. (4) GitHub provides GITHUB_TOKEN automatically for workflows, but it's scoped by the workflow's `permissions:`. A token that is present but lacks the needed permission also produces authorization errors. (5) For debugging: add a step that reports whether the variable is set without printing it: `- run: echo "GITHUB_TOKEN is set: ${{ env.GITHUB_TOKEN != '' }}"` (output: true/false, doesn't leak the value). (6) For users of your action: document the requirement, e.g. "This action requires GITHUB_TOKEN. Add to your workflow: `env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}`". (7) Alternatively, make the token optional: if it is absent, skip the authenticated calls or fall back to unauthenticated requests. (8) For public actions: consider whether you really need GITHUB_TOKEN. If the action only reads public data, unauthenticated requests may be enough, at the cost of lower rate limits.
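For debugging inside the container, it helps to report which variables arrived without printing their values. The sketch below assumes the image includes Node.js (the scenario above uses `curl`, so this is a substitution for illustration) and that the action declares a hypothetical `token` input, visible as `INPUT_TOKEN`.

```javascript
// Report which expected variables the container actually received,
// without printing any of their values.
function envReport(names, env = process.env) {
  return names.map((name) => {
    const value = env[name];
    const present = typeof value === 'string' && value.length > 0;
    return `${name}: ${present ? 'set' : 'missing'}`;
  });
}

// INPUT_TOKEN assumes a hypothetical `token` input declared in action.yml;
// GITHUB_REPOSITORY is one of GitHub's default variables.
console.log(envReport(['GITHUB_TOKEN', 'INPUT_TOKEN', 'GITHUB_REPOSITORY']).join('\n'));
```

An empty string counts as missing here, which matches what happens when a workflow passes an expression that evaluates to nothing.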

Follow-up: How would you design a public action that properly handles secrets and environment variables?