Your CI/CD pipeline pulls dependencies from npm, PyPI, and Docker Hub every build. A malicious actor compromises the npm package `lodash@4.17.20` (popular, widely used). Your build pulls the compromised version. Weeks later, you discover the malicious code was running in production, stealing customer data. How do you prevent this?
Implement supply chain security: (1) Lock dependency versions: use lock files (package-lock.json for npm, poetry.lock or a fully pinned requirements.txt for Python). Don't rely on loose version ranges like `lodash: ^4.17.0` alone; commit a lock file that pins exact versions, including transitive dependencies. (2) Verify package integrity: npm, PyPI, and Docker record checksums, and lock files store them, so a tampered package fails to install. Run `npm audit` to check installed packages against known vulnerabilities. (3) Use Software Composition Analysis (SCA): tools like Snyk, Dependabot, or Trivy scan your dependencies for known CVEs and malicious packages. (4) Generate an SBOM (Software Bill of Materials): a list of all dependencies with versions. Audit it for suspicious packages. (5) For critical dependencies, review the source: check whether the package maintainer is trusted, when the package was last updated, and whether recent updates introduced suspicious code. (6) Use a package mirror/proxy: instead of pulling directly from npm, use a private mirror (Artifactory, Sonatype Nexus) that you control, and review packages before allowing them through. This adds delay but increases security. (7) Containerize builds: use a Docker image with pre-built dependencies instead of downloading during CI. The image is scanned once and reused for all builds. (8) Monitor the supply chain: GitHub's Dependency Graph shows all dependencies; set up alerts for new vulnerabilities or malicious packages discovered post-deployment.
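A minimal CI job combining steps (1) through (3) might look like the sketch below: a lockfile-exact install, an audit, and an SCA scan. The Trivy action inputs follow its documented usage; in a real setup you would also pin both actions to commit SHAs.

```yaml
# .github/workflows/deps-check.yml (sketch)
name: dependency-checks
on: [pull_request]

jobs:
  verify-deps:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 18
      # npm ci installs exactly what package-lock.json records and
      # fails the build if package.json and the lockfile disagree.
      - run: npm ci
      # Fail on high/critical advisories in installed dependencies.
      - run: npm audit --audit-level=high
      # SCA scan of the repo's dependency manifests with Trivy.
      - uses: aquasecurity/trivy-action@master   # pin to a SHA in real use
        with:
          scan-type: fs
          severity: CRITICAL,HIGH
          exit-code: '1'
```

Because the job uses `npm ci` rather than `npm install`, a drifted or hand-edited lockfile stops the build instead of silently resolving new versions.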
Follow-up: How would you implement a staged supply chain vetting process where dependencies are reviewed before use?
Your GitHub Actions workflow runs a third-party action: `third-party/deploy@v1.0`. The action's repository is compromised. An attacker updates the action to include a backdoor. Your workflow runs the compromised action, and the attacker now has access to your deployment secrets.
Always pin third-party actions to specific commit SHAs, not tags or branches: (1) Instead of `uses: third-party/deploy@v1.0` (a mutable tag an attacker can move), use the full 40-character commit SHA, e.g. `uses: third-party/deploy@8f4b7f84864484a7bf31766abe9204da3cbe65b3`. This prevents tag rewrites from affecting you. (2) Periodically update the SHA by reviewing changes: run `git log --oneline` in the action's repository to see what changed, then manually update your workflow. (3) Use Dependabot to automate SHA updates: it creates PRs to update actions; you review them and merge. This gives visibility into changes. (4) Audit critical actions: before using a third-party action, review its source code. Check: Is it from a trusted vendor? Are there known security issues? How frequently is it updated? (5) Use GitHub's official actions when possible: `actions/checkout@v4` and `actions/setup-node@v4` are maintained by GitHub. (6) For actions with write access to secrets or the repo, use environment protection rules: require approval before running. (7) Scope the workflow token: don't give workflows blanket `contents: write`. Use the `permissions:` key to grant only the scopes each workflow or job needs, and set the organization default token permission to read-only. (8) Monitor action source: set up alerts when the action's repository is updated, and review changes before your workflows run them.
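Points (1) and (7) together look like this in a workflow file; the SHA below is illustrative, and `third-party/deploy` is the hypothetical action from the scenario.

```yaml
name: deploy
on:
  push:
    branches: [main]

# Default token permissions for every job: read-only.
permissions:
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Pinned to an immutable commit; the tag is kept as a trailing
      # comment so reviewers and Dependabot can see which release it is.
      - uses: third-party/deploy@8f4b7f84864484a7bf31766abe9204da3cbe65b3  # v1.0
```

Even if the attacker force-moves the `v1.0` tag, the workflow keeps running the audited commit until someone deliberately changes the SHA in a reviewed PR.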
Follow-up: Design a system that automatically reviews and approves action updates before deploying.
Your organization uses GitHub Actions self-hosted runners on-premises. An engineer with access to the runner machine modifies the runner agent code to steal secrets from workflows. They commit this back to the runner repo, and now all workflows are compromised. How do you prevent insider threats?
Implement defense-in-depth for runners: (1) Never store long-lived credentials on runners. Use OIDC/federated auth: credentials are fetched fresh per job and are short-lived (e.g., a 15-minute cloud session). (2) Run workflows in isolated containers: use `container:` in the workflow to isolate each job. Even if the runner host is compromised, the job's environment is isolated. (3) Audit runner access: only senior DevOps engineers should have SSH access to runner machines. Use bastion/jump hosts, and log every session (e.g., via auditd or your cloud provider's session manager). (4) Version control everything: keep the runner agent code and provisioning scripts in git. Require code review before updates. If someone modifies the agent, the diff is visible. (5) Use ephemeral runners: spawn a new runner per job (via Kubernetes or VMs), then destroy it. This prevents persistent compromise. (6) Monitor runner health: if a runner's CPU or network activity spikes unexpectedly, it might be compromised. Alert on anomalies. (7) For on-premises runners, implement kernel-level security: AppArmor or SELinux profiles restrict what runner processes can do even if the agent itself is compromised. (8) Rotate runner agent versions frequently: require all runners to upgrade within 7 days, since old agents might have known vulnerabilities.
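Points (1) and (2) can be sketched as follows, assuming AWS as the cloud provider; the role ARN is a placeholder, and the OIDC exchange uses the documented `aws-actions/configure-aws-credentials` action.

```yaml
name: build
on: [push]

permissions:
  id-token: write   # allow the job to request an OIDC token
  contents: read

jobs:
  build:
    runs-on: [self-hosted]
    container: node:18-alpine   # isolate the job from the runner host
    steps:
      - uses: actions/checkout@v4
      # Exchange the job's OIDC token for short-lived AWS credentials.
      # Nothing long-lived is ever written to the runner's disk.
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy  # placeholder
          aws-region: us-east-1
      - run: npm ci && npm run build
```

A runner compromised after the job ends holds only expired session credentials, which is the property that makes insider access to the machine far less valuable.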
Follow-up: Design a system that detects compromised self-hosted runners automatically.
Your CI workflow builds a Docker image and pushes it to your container registry. The build output claims the image is based on `node:18-alpine`. However, you have no way to verify the image actually came from that base. An attacker could have replaced the base image. How do you ensure build transparency?
Implement build attestation and verification: (1) Use SLSA (Supply-chain Levels for Software Artifacts): generate a provenance attestation for every artifact recording who built it, which workflow, which inputs, and when. GitHub Actions supports this via its artifact attestations and the official SLSA generator workflows; consumers verify the provenance before use. (2) Sign build artifacts: use a tool like Cosign to sign Docker images. Image consumers verify the signature before using them. (3) Use Sigstore: an open-source project that provides free code signing (keyless, backed by a public transparency log); GitHub Actions integrates with it. (4) Implement policy enforcement: in your Kubernetes cluster, require signed images via admission controllers. Unsigned images are rejected. (5) Use image scanning: Trivy or Grype scans images for known vulnerabilities and tampered layers. Include this in your CI. (6) For Docker images, use digest-based references: instead of `node:18-alpine` (a mutable tag), use `node:18-alpine@sha256:…` with the full 64-character digest (immutable). This pins the exact image. (7) Document the supply chain: generate an SBOM for each image listing all dependencies, versions, and known CVEs. Consumers review this before deploying. (8) For high-security environments, require multiple independent builds and attestations: build the image twice on different runners, have both sign the result, and require both signatures for deployment.
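Steps (2) and (3) can be sketched as a release job that signs the pushed image with Cosign's keyless mode; the `ghcr.io/acme/app` image name is a placeholder, and the installer action follows Sigstore's documented usage.

```yaml
name: release
on:
  push:
    tags: ['v*']

permissions:
  id-token: write   # keyless signing binds to this workflow's OIDC identity
  contents: read
  packages: write   # push to GitHub Container Registry

jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: sigstore/cosign-installer@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - run: docker build -t ghcr.io/acme/app:${{ github.ref_name }} .
      - run: docker push ghcr.io/acme/app:${{ github.ref_name }}
      # Keyless signing: no private key to manage; the signature is
      # recorded in the public Rekor transparency log.
      - run: cosign sign --yes ghcr.io/acme/app:${{ github.ref_name }}
```

On the consuming side, an admission controller (step 4) or a deploy script running `cosign verify` against the expected workflow identity rejects any image that was not produced by this pipeline.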
Follow-up: How would you implement Sigstore integration for automatic code signing in GitHub Actions?
Your team has 5 production environments. Each requires different security policies: prod requires MFA approval, staging requires basic review, dev has no restrictions. You want to enforce these policies consistently across all 50 services' CI workflows. How do you scale security policies across a monorepo?
Centralize security policies: (1) Create a reusable workflow that enforces security gates: `_deploy-gate.yml` checks whether a deployment meets policy. Each service's deployment workflow calls it first: `jobs: gate: uses: ./.github/workflows/_deploy-gate.yml with: environment: prod`. (2) The gate workflow enforces: (a) branch restrictions (prod: only main), (b) approval requirements (prod: MFA, staging: basic review), (c) secret rotation checks (rotated within 30 days), (d) vulnerability scanning (no critical CVEs). (3) Use GitHub Environments with per-environment protection rules (required reviewers, wait timers, deployment branches). Environments are configured per repository; to enforce consistency org-wide, use organization-level rulesets or required workflows (Enterprise plans), or a centralized config file that all services read. (4) Implement policy-as-code: store policies in a Git repo as YAML. Services read the policy at build time and enforce it. Example: `policy: prod: { require-approval: true, require-mfa: true, min-review-days: 1 }`. (5) Use Open Policy Agent (OPA) and its Rego policy language to define complex policies: "deployments to prod require 2 approvals, one from the on-call engineer, and must pass security scan." The policy is code-reviewed and versioned. (6) Audit: log all deployment decisions (approved, denied, policy version used). Review monthly to ensure policies are working.
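The reusable gate from step (1) could be sketched like this; the branch check is one example of the policy checks listed in (2), and the file name `_deploy-gate.yml` comes from the text above.

```yaml
# .github/workflows/_deploy-gate.yml (sketch)
name: deploy-gate
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string

jobs:
  gate:
    runs-on: ubuntu-latest
    # GitHub Environments supply the per-environment approval rules
    # (required reviewers, wait timers) configured for prod/staging/dev.
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      # Policy check (a): prod deploys may only come from main.
      - if: inputs.environment == 'prod' && github.ref != 'refs/heads/main'
        run: |
          echo "Policy violation: prod deploys must come from main"
          exit 1
```

Because every service calls the same file, tightening a policy is a single reviewed change in one repository rather than 50 separate workflow edits.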
Follow-up: Design a policy-as-code system using OPA/Rego for deployment gates.
Your organization develops a popular open-source library. A contributor's pull request adds a GitHub Actions workflow that accidentally prints a secret to the logs. The PR is merged, and the secret is now visible in the public repo's Actions logs. How do you mitigate?
Immediate: (1) Rotate the exposed secret (API key, token, password); rotation is the only step that truly revokes access. GitHub Actions masks registered secrets in logs, but values it doesn't recognize slip through. (2) Delete the affected workflow run logs (GitHub lets you delete a run's logs via the UI or REST API). If the secret was also committed to the repo, rewrite git history to remove the file: `git filter-repo --invert-paths --path .github/workflows/bad-file.yml`, then force-push; still assume forks and clones retain it. (3) Contact users if the secret grants access to sensitive systems. (4) Document the incident in a security advisory. (5) For the future: (a) require code review for all workflow changes. (b) Use GitHub's secret scanning (with push protection) to detect secrets in PRs before merge. (c) Implement a pre-commit hook that scans for secrets locally before pushing. (d) Educate contributors: document "Never commit secrets. Use GitHub Secrets." (e) For open-source projects, rely on branch protection and fork restrictions: workflows triggered by PRs from untrusted forks run without access to secrets. (f) Use ephemeral secrets: instead of storing long-lived API keys, use temporary credentials that auto-expire. (g) Run a secrets-scanning GitHub Action on every PR: scan for common secret patterns (AWS keys, GitHub PATs, etc.) and fail the build if any are found. (h) For sensitive secrets, don't store them in GitHub at all: use an external provider and OIDC to fetch them at runtime.
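Points (b) and (g) can be sketched as a PR-triggered scanning workflow. Gitleaks is one commonly used open-source scanner; the action name and its `GITHUB_TOKEN` requirement follow its documentation (organizations may additionally need a license key).

```yaml
# .github/workflows/secret-scan.yml (sketch)
name: secret-scan
on: [pull_request]

permissions:
  contents: read

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # fetch full history so past commits are scanned too
      # Fails the job if any commit in the PR matches a known secret
      # pattern (AWS keys, GitHub PATs, private keys, etc.).
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

Making this job a required status check on protected branches means a leaked credential blocks the merge instead of landing in permanent public history.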
Follow-up: How would you implement a GitHub Action that detects and prevents secrets from being committed to open-source repos?
Your company deploys software to critical infrastructure (hospitals, financial systems). Regulators require you to prove that code in production hasn't been tampered with, and it came from an authorized build process. Your current CI/CD has no audit trail or verification mechanism. How do you implement compliance?
Implement an end-to-end verifiable build process: (1) Enable GitHub Actions audit logs: every workflow run, approval, and deployment is logged and retained. Export these logs to a secure SIEM (Security Information and Event Management) system. (2) Use SLSA (Supply-chain Levels for Software Artifacts): implement the highest available level (SLSA v0.1 defined levels up to 4; v1.0 tops out at Build Level 3). This requires: (a) version-controlled workflows (all changes reviewed), (b) provenance attestation (who built, when, from which commit), (c) hermetic, reproducible builds (no uncontrolled inputs), (d) signature verification (artifacts signed and verified before deployment). (3) For each build artifact (Docker image, executable, library), generate cryptographic proof: commit SHA, build timestamp, builder identity, inputs. Store this in a tamper-evident log (an append-only transparency log such as Sigstore's Rekor; a blockchain is unnecessary). (4) Implement deployment verification: before deploying, re-verify the artifact's provenance. Check: (a) the artifact is signed by your key, (b) the commit is from the main branch, (c) the commit has required approvals, (d) no security vulnerabilities were detected. (5) Use a Hardware Security Module (HSM) to store signing keys. Keys are protected; even admins can't export them. (6) For extreme compliance needs, require multi-party approval: 3 authorized signers must approve each deployment, and any one can veto. (7) Audit retention: keep all logs and attestations for 7+ years. Implement immutable, append-only log storage. (8) Regular audits: have an independent auditor periodically verify the supply chain is intact and hasn't been bypassed.
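Step (2b) can be sketched with GitHub's build-provenance attestation action; the build command and artifact name are placeholders, and the permission scopes follow the action's documented requirements.

```yaml
name: build-and-attest
on:
  push:
    branches: [main]

permissions:
  id-token: write      # sign the attestation with the workflow identity
  attestations: write  # store the attestation alongside the repo
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./build.sh --output app.tar.gz   # placeholder build step
      # Generates signed provenance recording the commit, workflow,
      # and builder identity for the named artifact.
      - uses: actions/attest-build-provenance@v1
        with:
          subject-path: app.tar.gz
```

At deploy time, a command along the lines of `gh attestation verify app.tar.gz --repo <org>/<repo>` can be made a mandatory gate, giving auditors a cryptographic chain from the production artifact back to a reviewed commit.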
Follow-up: How would you implement SLSA Level 4 for a critical-infrastructure deployment pipeline?