Ansible Interview — Molecule Testing Framework

Your Ansible role is used by 200 teams. A small change to a task breaks 30 dependent playbooks in production because the role's interface changed. You need testing that prevents this. How do you implement comprehensive role testing using Molecule?

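Implement Molecule with scenario-based testing. Create scenarios covering basic functionality, edge cases, and backwards compatibility. Molecule's default test sequence runs, among other steps, create (start Docker or VM instances), prepare (set up the test environment), converge (run the role), idempotence, verify (validate state), and destroy; the sequence can be customized per scenario in molecule.yml. Rely on the built-in idempotence step, which runs the converge playbook a second time and fails if any task reports a change. Test the role's interface in the verify step (verify.yml for the Ansible verifier, or Python test files for Testinfra): assert that every documented variable is defined and has the expected default. Create scenarios for role dependencies, including one with optional dependencies disabled. Add inventory tests that run the role against different host groups and vars. Test the role with minimal vars (only the required ones) and with full vars (all options set). Maintain a changelog that documents which variables and handlers form the public interface, so that any change to them is treated as a breaking change. Run yamllint and ansible-lint before the Molecule run to catch YAML and style issues early; older Molecule releases exposed this as `molecule lint`, but newer releases have removed that command. Declare version constraints (for example `min_ansible_version` and dependency versions) in the role's meta/main.yml to prevent incompatible combinations. Test on multiple OS versions by listing several entries under `platforms` in molecule.yml. Publish test results in CI/CD so consumers can see which role functionality is covered.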
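
One way to make the "public interface" check concrete is to compare the role's default variables between two releases. The sketch below is only an illustration: it assumes the defaults have already been loaded into plain dictionaries (for example from defaults/main.yml), and the variable names are hypothetical.

```python
def interface_changes(old_defaults, new_defaults):
    """Compare two versions of a role's default variables."""
    old_keys, new_keys = set(old_defaults), set(new_defaults)
    return {
        # Variables consumers may rely on that no longer exist.
        "removed": sorted(old_keys - new_keys),
        # New variables are normally safe for existing playbooks.
        "added": sorted(new_keys - old_keys),
        # Same name, different default value.
        "changed": sorted(
            key for key in old_keys & new_keys
            if old_defaults[key] != new_defaults[key]
        ),
    }


def is_breaking(changes):
    """Treat removed variables and changed defaults as interface breaks."""
    return bool(changes["removed"] or changes["changed"])


# Hypothetical defaults from two releases of an nginx role.
previous = {"nginx_port": 80, "nginx_user": "www-data", "nginx_tls": False}
current = {"nginx_listen_port": 80, "nginx_user": "nginx", "nginx_tls": False}

report = interface_changes(previous, current)
print(report, "breaking:", is_breaking(report))
```

A CI job could fail the build when `is_breaking` returns true unless the change is accompanied by a major version bump and a changelog entry.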

Follow-up: How would you test role idempotency to ensure that running it twice produces the same result?

Your Ansible roles are tested with Molecule using Docker, but production runs on VMs with different kernel versions and a different network stack. Docker tests pass, but failures occur in production. How do you close the testing-production gap?

Use more than one Molecule driver: Docker for fast feedback, Vagrant for full VMs, and a cloud provider driver for production-like infrastructure. A scenario uses a single driver, so give each driver its own scenario (for example `molecule/default/` with Docker and `molecule/vagrant/` with Vagrant) rather than mixing entries such as `docker-alpine` and `vagrant-ubuntu` in one `platforms` list. For network-specific testing, prefer the Vagrant driver: VMs run their own kernel and network stack, whereas containers share the host kernel. Restrict driver-specific tests accordingly, for example by testing a networking role only under Vagrant. Build a staging test environment that mirrors production: same OS versions, kernels, and network configuration. Use the `prepare` playbook to bring each test instance up to the production baseline (packages, network settings, security groups, and so on). Use a cloud provider driver (AWS, Azure) when the role depends on cloud-specific behavior. Add an integration test playbook that exercises the role together with other infrastructure components. If some tasks are legitimately non-idempotent, isolate them in a scenario whose test sequence omits the idempotence step instead of weakening the check everywhere. When a production incident occurs, add a Molecule scenario that reproduces it so it cannot regress.
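
The split between fast and production-like scenarios can be driven from CI with a small selector. This is a sketch under assumptions: the scenario names (`default`, `vagrant`, `ec2`), the tier labels, and the trigger names are hypothetical; `molecule test --scenario-name` is the real CLI option used to pick a scenario.

```python
# Hypothetical scenario catalogue: one scenario per driver.
SCENARIOS = {
    "default": {"driver": "docker", "tier": "fast"},
    "vagrant": {"driver": "vagrant", "tier": "slow"},
    "ec2": {"driver": "ec2", "tier": "slow"},
}


def scenarios_for(trigger):
    """Pull requests get fast scenarios only; other triggers get all."""
    wanted = {"fast"} if trigger == "pull_request" else {"fast", "slow"}
    return [name for name, meta in SCENARIOS.items() if meta["tier"] in wanted]


def molecule_commands(trigger):
    """Build one `molecule test` command line per selected scenario."""
    return [
        ["molecule", "test", "--scenario-name", name]
        for name in scenarios_for(trigger)
    ]


for command in molecule_commands("nightly"):
    print(" ".join(command))
```

The returned lists can be passed to `subprocess.run` from a CI wrapper script, which keeps the Docker feedback loop short while VM and cloud scenarios run on a schedule.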

Follow-up: How would you implement cross-platform role testing for Windows and Linux?

Your Molecule tests run for 30 minutes per role, and with 50 roles to test in CI/CD, a full run takes 25 hours. This creates a deployment bottleneck. How do you optimize Molecule test performance?

Parallelize: run Molecule for different roles on separate CI agents at the same time. At 30 minutes per role, 50 roles take 25 hours serially; a CI/CD matrix that runs 5 roles at once cuts that to about 5 hours, and more workers reduce it further. Optimize Docker images: use small base images where the role supports them (note that Alpine differs from most production distributions in libc, package manager, and init system), or prebuild images with the role's prerequisites installed. Cache Docker layers to avoid redundant downloads. Trim scenarios: keep core functionality tests in the per-commit pipeline and drop or defer non-critical ones. Choose the driver per scenario: Docker (fast) for most tests, Vagrant (slower) only for critical tests that need a VM. Consolidate verification into a single verify playbook instead of many separate plays. Use `molecule test --parallel` (Molecule 3.0+), which isolates each run's state so that several Molecule processes can execute concurrently. Reuse instances where safe: `molecule converge` leaves instances running, and `molecule test --destroy=never` skips the final destroy; Molecule has no `idempotent: true` setting for this. Cache role dependencies to avoid re-downloading them. Run Molecule on CI runners with better hardware (SSD, multiple cores). Implement fast-path testing: run syntax and lint checks first, and run full Molecule only for roles touched by the change. Profile Molecule execution to find the slowest steps.
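
The arithmetic behind the parallelization estimate, together with a simple way to limit a run to the roles a change touches, can be sketched as follows. The `roles/<name>/...` repository layout and the sample file list are assumptions made for illustration.

```python
import math


def wall_clock_hours(roles, minutes_per_role, workers):
    """Estimate total run time when roles are tested in parallel batches."""
    batches = math.ceil(roles / workers)
    return batches * minutes_per_role / 60


def roles_to_test(changed_files, roles_dir="roles"):
    """Map changed file paths to role names (assumes roles/<name>/...)."""
    touched = set()
    for path in changed_files:
        parts = path.split("/")
        if len(parts) >= 3 and parts[0] == roles_dir:
            touched.add(parts[1])
    return sorted(touched)


print(wall_clock_hours(50, 30, 1))  # serial run
print(wall_clock_hours(50, 30, 5))  # five roles at a time

# Hypothetical output of `git diff --name-only` for one change.
changed = ["roles/nginx/tasks/main.yml", "README.md", "roles/php/defaults/main.yml"]
print(roles_to_test(changed))
```

The estimate ignores queueing and instance start-up time, so treat it as a lower bound when sizing the CI matrix.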

Follow-up: How would you implement incremental testing where only changed roles are tested?

Your Molecule test passes in CI/CD but fails when the role runs in Tower with different variables, network connectivity, and external service integrations. The local Molecule test doesn't cover real-world complexity. How do you implement realistic integration testing?

Extend Molecule with integration scenarios. Create a `molecule/integration/` scenario that runs the role with realistic data: production-like variables and stand-ins for external services. Use the `prepare` step to start those stand-ins: stub API servers, message queues, databases. Write a converge playbook with realistic structure, where roles interact instead of running in isolation. In `verify`, test integration outcomes: service availability, data written to the stub systems, inter-service communication. With the Testinfra verifier, assert infrastructure state in the Python test files, for example `assert host.service('nginx').is_running`. Libraries such as `responses` (HTTP) and `mongomock` (MongoDB) mock calls inside a Python process, so they suit unit tests of custom modules and plugins; for the role itself, run the stub services as separate instances or containers. Test with a dynamic inventory similar to production. For network-dependent roles, simulate network conditions in the test environment. Add a Tower-specific test that launches the role through the Tower API, which replicates Tower's execution model. Run the scenario against a staging environment before production. When production issues occur, add them to the Molecule scenarios so they are caught early next time.
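
A stub for an external HTTP API can be as small as the sketch below, which uses only the Python standard library. It is an illustration, not a drop-in implementation: the class and function names, the canned response, and the endpoint paths are all hypothetical. The stub records every request so that a verify step can later confirm what the role actually sent.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class StubAPI(BaseHTTPRequestHandler):
    """Records each request and answers with a canned JSON body."""

    received = []  # (method, path, body) tuples, shared across requests

    def _reply(self):
        body = json.dumps({"status": "ok"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        StubAPI.received.append(("GET", self.path, b""))
        self._reply()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        StubAPI.received.append(("POST", self.path, self.rfile.read(length)))
        self._reply()

    def log_message(self, *args):
        pass  # keep test output quiet


def start_stub(port=0):
    """Start the stub in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StubAPI)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_json(url, data=None):
    """Client helper standing in for the role's HTTP call (POST if data)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, data=data, timeout=5) as response:
        return json.loads(response.read())
```

In a Molecule scenario the stub would typically run on its own instance or container started during `prepare`, and the verify step would inspect `StubAPI.received` (or an equivalent log) after converge.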

Follow-up: How would you test role behavior under failure conditions (service down, network latency)?

Your team extends community roles (e.g., nginx) with custom extensions, but tests don't verify extended functionality works with role updates. When community role updates, extended functionality breaks. How do you test role extensions with upstream changes?

Write Molecule tests for the extended role that verify both base functionality and the custom extensions. Test against multiple base role versions. A single requirements.yml cannot install two versions of the same role side by side, so pin one version per run (for example 1.0.0 in one CI job and 1.1.0 in another) and let a CI/CD matrix cover the list. Install the pinned base role from Galaxy before converge, typically through Molecule's `dependency` step. Have the converge playbook apply your extension on top of the base role. Verify both layers: nginx runs, the custom configuration is applied, and the custom variables take effect. Track community role releases and run the extension tests against each new version automatically, on a schedule or when a release is detected. Maintain a compatibility matrix that documents which extension versions work with which base role versions. Include backwards compatibility tests against the N-1 base role version. For breaking upstream changes, maintain separate extension branches. If tests fail against a new release, fix the extension, document the version constraint, or fork the community role with the necessary patches.
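
Choosing which base role versions to test, and pinning exactly one of them per run, can be sketched like this. The role name `example.nginx` and the version list are hypothetical, and the version parser assumes plain numeric `MAJOR.MINOR.PATCH` strings.

```python
def version_key(version):
    """Turn '1.10.0' into (1, 10, 0) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def versions_to_test(released, keep=2):
    """Return the newest `keep` releases (latest and N-1 by default)."""
    return sorted(released, key=version_key, reverse=True)[:keep]


def requirements_yaml(role, version):
    """Render a requirements.yml that pins one base role version."""
    return f'roles:\n  - name: {role}\n    version: "{version}"\n'


released = ["1.0.0", "1.10.0", "1.2.0"]  # hypothetical upstream releases
for version in versions_to_test(released):
    print(requirements_yaml("example.nginx", version))
```

Each rendered file corresponds to one CI matrix entry; sorting on the parsed tuple avoids the string-sort trap where 1.2.0 would rank above 1.10.0.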

Follow-up: How would you implement automated community role updates with Molecule verification?

Your Molecule tests check that tasks run successfully, but don't verify the actual application behavior: is nginx actually serving traffic? Do the generated config files have correct permissions? How do you implement application-level testing?

Go beyond task success in Molecule's `verify` step by combining Testinfra with custom assertions. Use Testinfra (the Python counterpart to Ruby's Serverspec) to verify infrastructure state: files exist with the correct permissions, services are running, ports are listening. Add functional checks to the verify playbook: HTTP requests to nginx, database connectivity checks, authentication validation. Verify generated config files by parsing them and checking that structure and values match expectations, and run the application's own validator where one exists (for example `nginx -t`). Add application-specific checks: query the application API, validate response times, scan error logs. Simulate real-world usage: create files through the application, verify they persist, test concurrent connections. Reuse production health checks as smoke tests. For complex applications, write Python integration scripts that exercise the full application workflow. Check that application metrics are being collected correctly. Test with samples of production data to catch real-world issues.
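
The file-permission and config-content checks can be illustrated with standard-library Python. This is a local sketch with hypothetical helper names; in a Testinfra test the same assertions would normally be written against the `host` fixture, for example `host.file(path).mode == 0o640`.

```python
import os
import stat


def file_mode_ok(path, expected_mode):
    """True if the file's permission bits equal expected_mode (e.g. 0o640)."""
    return stat.S_IMODE(os.stat(path).st_mode) == expected_mode


def missing_directives(config_text, directives):
    """Return the required directives absent from the config.

    Commented-out lines are ignored, so '# server_tokens off;' does not
    count as present. Matching is per whole line after stripping.
    """
    active = {
        line.strip()
        for line in config_text.splitlines()
        if not line.strip().startswith("#")
    }
    return [directive for directive in directives if directive not in active]


# Hypothetical rendered config from the role's template.
rendered = """server {
    listen 80;
    # server_tokens off;
}"""
print(missing_directives(rendered, ["listen 80;", "server_tokens off;"]))
```

Whole-line matching is deliberately strict; for real nginx configs a dedicated parser or `nginx -t` on the instance gives stronger guarantees.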

Follow-up: How would you implement performance testing in Molecule to ensure the role doesn't degrade performance?

Your Molecule tests depend on external resources: pulling container images from Docker Hub, querying GitHub for role versions, downloading Galaxy collections. Network failures break CI/CD. How do you implement resilient testing with external dependencies?

Make tests work offline by caching external dependencies. Pre-pull container images into the CI/CD cache or mirror them in an internal registry. Molecule has no image cache of its own, so use the caching features of GitHub Actions or your CI/CD platform, including Docker layer caching. Pre-download Galaxy collections and serve them from a private Galaxy mirror or artifact repository. Override `molecule.yml` settings in CI/CD so that images and dependencies resolve to those local sources. Add a fallback: if an external resource is unavailable, use the cached version. For tests that need external services, mock them: return a canned response instead of querying GitHub. Add a circuit breaker: if a download needed only by optional tests fails, skip those tests instead of failing the entire pipeline. Install collections from pre-downloaded tarballs with `ansible-galaxy collection install --offline` (available in recent ansible-core releases). Retry transient failures with exponential backoff. Monitor external resource availability and alert on outages. Split the test matrix: critical tests run against cached resources only, while optional tests may use external services.
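
Retry with exponential backoff plus a cache fallback can be sketched as below. The function names are hypothetical, and `fetch` and `cached` stand for whatever callable pulls an image or downloads a collection in your pipeline; the sketch assumes network failures surface as `OSError` subclasses such as `ConnectionError` or `TimeoutError`.

```python
import time


def fetch_with_retry(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on OSError wait base_delay * 2**attempt and retry."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller decide
            sleep(base_delay * 2 ** attempt)


def fetch_or_cached(fetch, cached, **retry_options):
    """Fall back to the cached copy when every retry has failed."""
    try:
        return fetch_with_retry(fetch, **retry_options)
    except OSError:
        return cached()
```

Passing `sleep` as a parameter keeps the helper testable without real delays. In production use you would usually add jitter and an upper bound on the delay.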

Follow-up: How would you debug a Molecule test by opening a shell in the failed test environment?

Your Ansible role relies on callback plugins for custom output. Standard Molecule verification doesn't test callback functionality, so callbacks are essential for observability but untested in CI/CD. How do you test callbacks in Molecule?

Test callbacks by capturing and verifying their output. Callback plugins are enabled in the Ansible configuration, not in the role, so enable them for the scenario (for example through the provisioner's Ansible config options in molecule.yml) before converge runs. In the verify step, which runs after role convergence, check that output was produced: log files written, metrics sent, webhooks called. Verify the format: JSON callbacks emit valid JSON, and webhook callbacks send the expected payload. For webhook callbacks, start a mock HTTP server that records incoming requests, then query it in verify to confirm the callbacks arrived with the right data. For logging callbacks, verify that logs land in the expected location with the expected format. Test error handling: a callback whose target is unavailable must not break playbook execution. Test with multiple callback plugins active at once. Measure overhead: compare run time with and without callbacks to confirm they add no significant delay.
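
Format verification for a logging callback might look like the sketch below. It assumes a hypothetical custom callback that writes one JSON object per line with `event`, `host`, and `task` keys; that schema is invented for illustration and should be replaced with whatever your plugin emits.

```python
import json

# Hypothetical schema for a custom JSON Lines callback log.
REQUIRED_KEYS = {"event", "host", "task"}


def validate_callback_log(text, required=REQUIRED_KEYS):
    """Return a list of problems found in a JSON Lines callback log."""
    errors = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {number}: expected a JSON object")
            continue
        missing = required - set(record)
        if missing:
            errors.append(f"line {number}: missing keys {sorted(missing)}")
    return errors


sample = "\n".join([
    '{"event": "task_ok", "host": "instance", "task": "Install nginx"}',
    '{"event": "task_ok", "host": "instance"}',
    "not json at all",
])
for problem in validate_callback_log(sample):
    print(problem)
```

A verify step could read the log file from the instance, pass its contents to `validate_callback_log`, and fail when the returned list is not empty.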

Follow-up: How would you debug a Molecule test so that you can examine internal playbook state at the point of failure?