Ansible Interview — Migration from Puppet or Chef

Your organization currently uses Puppet for infrastructure management and is evaluating a migration to Ansible. Puppet manages 500 production servers through 200 roles, many with complex manifests. The two tools have fundamentally different architectures: Puppet uses a master-agent pull model, while Ansible is agentless and pushes changes. How do you plan the migration?

Use a phased migration. Phase 1 (Planning): inventory all Puppet roles and catalog their dependencies. Phase 2 (Translation): write Ansible roles that produce the same outcomes as the Puppet roles, starting with simple ones (basic packages) and progressing to complex ones (database clusters). Phase 3 (Pilot): pick about 50 non-critical servers, apply the Ansible roles alongside Puppet for 2 weeks, and verify that the two do not diverge. Phase 4 (Validation): compare the Ansible-managed servers with Puppet-managed peers and confirm that their state is identical. Phase 5 (Staged rollout): migrate 5% of the fleet, monitor, and increase the share gradually. Phase 6 (Cutover): migrate the remaining servers and decommission Puppet.

Three decisions need to be made up front: how Puppet agents are decommissioned (gradually, not all at once), how state is validated (Ansible state must equal Puppet state), and how to roll back (if Ansible causes problems, fall back to Puppet). For validation, snapshot each server's state under Puppet before migration and compare it with the state after Ansible has run. During the dual-run grace period both tools must produce identical results; running the Ansible playbooks with `--check --diff` against a Puppet-managed host is a cheap way to see whether Ansible would change anything. Instrument the migration heavily so problems surface early, write runbooks covering the migration procedure and troubleshooting steps, and rehearse the whole process on non-production infrastructure first.
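
The state comparison step can be sketched in a few lines of Python. This is a minimal illustration, assuming each snapshot has already been collected into a flat dictionary keyed by resource name; the key format (`package:nginx`) and the function names `diff_state` and `is_identical` are hypothetical, not part of Puppet or Ansible.

```python
from typing import Any, Dict


def diff_state(puppet: Dict[str, Any], ansible: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Compare two state snapshots keyed by resource name.

    Both arguments are hypothetical snapshots such as
    {"package:nginx": "1.24.0", "service:nginx": "running"}.
    """
    # Present under Puppet but absent after the Ansible run.
    missing = {k: puppet[k] for k in puppet.keys() - ansible.keys()}
    # Present after the Ansible run but never managed by Puppet.
    extra = {k: ansible[k] for k in ansible.keys() - puppet.keys()}
    # Present in both snapshots with different values.
    changed = {
        k: {"puppet": puppet[k], "ansible": ansible[k]}
        for k in puppet.keys() & ansible.keys()
        if puppet[k] != ansible[k]
    }
    return {"missing": missing, "extra": extra, "changed": changed}


def is_identical(diff: Dict[str, Dict[str, Any]]) -> bool:
    """True when the two snapshots describe the same state."""
    return not any(diff.values())


# Example data (made up for illustration).
before = {"package:nginx": "1.24.0", "service:nginx": "running", "file:/etc/motd": "abc123"}
after = {"package:nginx": "1.24.0", "service:nginx": "stopped"}
report = diff_state(before, after)
print(report)
print("identical:", is_identical(report))
```

A pilot host would pass validation only when `is_identical` returns `True`; anything in `missing`, `extra`, or `changed` is a divergence to investigate before widening the rollout.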

Follow-up: How would you convert Puppet's Hiera data to Ansible variables?

Your Puppet environment has 200 roles with complex interdependencies. Not all Puppet features map directly to Ansible: Puppet's resource model differs from Ansible's module model, and some Puppet logic is deeply embedded in custom types and providers. How do you translate Puppet roles to Ansible?

Translate roles systematically. For each Puppet role: 1) understand what it manages (packages, services, config files), and 2) create an Ansible role that achieves the same outcome. The basic mapping is: Puppet classes → Ansible roles, class parameters → role variables with defaults, Puppet resources → Ansible tasks using the matching module, Puppet relationships (`require`, `notify`) → task ordering and handlers, and Puppet custom types → custom Ansible modules where no existing module fits. For complex resources, find the equivalent Ansible module and fall back to a shell command only when none exists. Example: a Puppet `augeas` resource that edits a config file becomes an Ansible `lineinfile` task or a custom module. Record these mappings in a translation matrix (Puppet concept → Ansible equivalent).

For the Hiera hierarchy, build a similar hierarchy from Ansible variable files: `group_vars/all` for common data, `group_vars/<group>` for role- or environment-level data, and `host_vars/<host>` for node-level overrides. Hiera keys such as `ntp::servers` have to be renamed (for example `ntp_servers`), because `::` is not valid in an Ansible variable name. For classes with complex logic, split the work across tasks, handlers, and helper task files. Where exact behavior matters, write a wrapper role that replicates the Puppet role as closely as possible.

Test by comparing outcomes: apply the Puppet role to one test server and the Ansible role to another, then compare the resulting state. Migrate high-value, widely reused roles first and the most complex roles later. Document each translation decision (why a Puppet concept was mapped to a particular Ansible approach) and collect the recurring patterns in a Puppet → Ansible reference guide. For very complex roles, consider leaving them under Puppet until the rest of the transition is done.
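
A translation matrix can start life as a plain lookup table. The sketch below is one possible shape, not a complete mapping: the Ansible module names are real, but which module fits a given Puppet resource depends on how it is used (a Puppet `file` resource may need `copy` or `template` instead of `file`), and `plan_translation`, `hiera_key_to_var`, and the `myorg_vip` type are hypothetical names chosen for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Puppet resource type -> Ansible module. Illustrative starting point only;
# extend it with the types found in your own manifests.
TRANSLATION_MATRIX: Dict[str, str] = {
    "package": "ansible.builtin.package",
    "service": "ansible.builtin.service",
    "file": "ansible.builtin.file",  # or copy/template, depending on usage
    "user": "ansible.builtin.user",
    "group": "ansible.builtin.group",
    "cron": "ansible.builtin.cron",
    "exec": "ansible.builtin.command",
    "augeas": "ansible.builtin.lineinfile",
}


def plan_translation(resource_types: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split resource types into (mapped, unmapped).

    Unmapped types are candidates for a custom Ansible module.
    """
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for rtype in sorted(set(resource_types)):
        module = TRANSLATION_MATRIX.get(rtype.lower())
        if module is None:
            unmapped.append(rtype)
        else:
            mapped[rtype] = module
    return mapped, unmapped


def hiera_key_to_var(key: str) -> str:
    """Turn a Hiera key such as 'ntp::servers' into a valid Ansible variable name."""
    return key.replace("::", "_")


# Example: resource types found while inventorying one Puppet role (made up).
mapped, unmapped = plan_translation(["package", "service", "augeas", "myorg_vip"])
print(mapped)
print("needs a custom module:", unmapped)
print(hiera_key_to_var("ntp::servers"))
```

Running this over the resource types of every role gives an early estimate of how much custom module work the migration needs.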

Follow-up: How would you migrate Puppet data types and custom functions to Ansible?

Your Puppet deployment has 200 servers with the agent installed. Migrating to Ansible means removing the Puppet agents, which is risky: if Ansible fails, the servers have no remediation mechanism, whereas a Puppet agent would detect and fix drift. How do you safely decommission Puppet agents?

Decommission Puppet in stages. Run Puppet and Ansible in parallel for a grace period (4 weeks), both managing the same infrastructure: Ansible deploys through Tower while the Puppet agent keeps running. This gives you a fallback, because the Puppet agent can still remediate if an Ansible deployment causes problems. Then remove the agent in two steps. Step 1: disable the Puppet agent service but leave the package installed, so it can be re-enabled quickly. Step 2: after 4 stable weeks, uninstall the agent.

The Ansible-only phase needs its own safety net. Alert when server state drifts. Replace the agent's auto-remediation with an Ansible playbook that detects and fixes drift, and run it pull-style: `ansible-pull` from cron every 30 minutes. Snapshot infrastructure state before removing agents so you can restore from it if something goes wrong, and write a runbook with the escalation procedure for a migrated server that breaks.

Before touching production, fully remove Puppet from a test server and verify that everything still works. Audit what depends on the Puppet agent (certificates, generated configuration) and make sure Ansible takes over each dependency. Test failure scenarios too: if the Ansible controller becomes unavailable, servers should keep functioning, which means keeping them as independent of the controller as possible. Document the decommissioning procedure in full.
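
The two-step removal can be expressed as a small per-host decision function. This is a sketch under assumptions: the function name `next_step`, the action labels, and the rule that any drift alert sends a host back to Puppet are illustrative policy choices, not something Puppet or Ansible prescribes.

```python
from datetime import date, timedelta
from typing import Optional

GRACE_PERIOD = timedelta(weeks=4)  # the 4-week stable period described above


def next_step(agent_disabled_on: Optional[date], drift_alerts: int, today: date) -> str:
    """Return the next decommissioning action for one host.

    agent_disabled_on: date the Puppet agent service was disabled, or None.
    drift_alerts: number of drift alerts raised since the agent was disabled.
    """
    if agent_disabled_on is None:
        return "disable-agent"  # step 1: stop the service, keep the package
    if drift_alerts > 0:
        return "re-enable-agent"  # assumed policy: fall back to Puppet and investigate
    if today - agent_disabled_on >= GRACE_PERIOD:
        return "remove-agent"  # step 2: uninstall after a stable grace period
    return "wait"


print(next_step(None, 0, date(2025, 1, 1)))
print(next_step(date(2025, 1, 1), 0, date(2025, 1, 15)))
print(next_step(date(2025, 1, 1), 0, date(2025, 1, 29)))
```

Keeping the rule in one place makes it easy to audit which hosts are eligible for agent removal and why.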

Follow-up: How would you migrate Puppet facts to Ansible facts?

Your organization has Puppet manifests that use extensive templating and metaprogramming, with complex logic that generates different configs per server type dynamically. Ansible's YAML is less expressive than Puppet's DSL. How do you handle complex Puppet logic in Ansible?

Translate patterns rather than code. Parameterized classes → role variables with defaults. Metaprogramming (dynamic resource generation) → loops and `include_tasks` with per-item variables. Defined types → a reusable task file included in a loop, or a custom module (Ansible has no macro construct outside Jinja2 templates). Puppet conditionals → `when:` conditions and `set_fact:`. Puppet templates → Jinja2 templates.

For complex logic, start from intent: what is the logic trying to achieve? Then write the simplest Ansible equivalent with the same outcome. Example: complex resource generation in Puppet usually becomes a loop over a data structure. Put non-trivial transformations into custom filter plugins (Python) instead of trying to express them in YAML. For the hardest cases there are three options: 1) write a custom Ansible module in Python, 2) pre-process the data in tasks before it reaches a template, or 3) simplify the logic itself if it has grown too complex.

Translate gradually: keep the complex Puppet manifests running at first and refactor them into simpler Ansible patterns over time. Validate equivalence with comparison tests, since the Ansible version must produce the same output as the Puppet version. Document each complex pattern with the Puppet original next to its Ansible translation. For features with no Ansible counterpart, `raw` or `shell` tasks are a last-resort fallback; they work, but they are not ideal.
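
As an illustration of the filter-plugin route, the sketch below flattens a hash of titles to parameters (the shape Puppet typically uses to declare many instances of a defined type) into a list that an Ansible loop can iterate over. The `FilterModule` class with a `filters()` method is how Ansible discovers filter plugins; the filter name `hash_to_items` and the vhost data are hypothetical.

```python
from typing import Any, Dict, List, Optional


def hash_to_items(
    resources: Dict[str, Optional[Dict[str, Any]]],
    defaults: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Flatten {"title": {params}} into [{"name": "title", ...defaults, ...params}].

    Per-item parameters override defaults. Output is sorted by name so the
    resulting task order is stable between runs.
    """
    items = []
    for name in sorted(resources):
        item: Dict[str, Any] = {"name": name}
        item.update(defaults or {})
        item.update(resources[name] or {})
        items.append(item)
    return items


class FilterModule:
    """Ansible loads filters from a class with this name and a filters() method."""

    def filters(self):
        return {"hash_to_items": hash_to_items}


# Example data (made up): two virtual hosts, one relying on the default port.
vhosts = {"b.example.com": {"port": 8080}, "a.example.com": None}
print(hash_to_items(vhosts, {"port": 80}))
```

Saved in a `filter_plugins/` directory next to the playbook, the filter could then be used as `loop: "{{ vhosts | hash_to_items({'port': 80}) }}"`, keeping the transformation testable in plain Python and the YAML simple.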

Follow-up: How would you implement Ansible equivalents of Puppet report processors?

During the Puppet → Ansible migration, you discover that some Puppet-managed servers carry manually applied configuration that is not in the Puppet manifests and was never tracked. When migrating to Ansible, which is the source of truth: the Puppet manifests or the actual server state?

Run a state reconciliation process. Before migration, audit every server to capture its actual state, manual changes included, compare that with the Puppet manifests, and record the result as a configuration baseline. Neither side wins automatically; each discrepancy is decided case by case: 1) manual changes that improve performance or security are incorporated into the Ansible roles (the actual state becomes the source of truth), 2) manual changes that violate policy are removed (Ansible enforces the policy), and 3) manual changes that are tracked separately are documented in the runbook and preserved through the migration if needed.

Ansible itself can do the audit: run the playbooks in check mode to compare actual and desired state and list the servers that have drifted from the manifests. For each drift, decide whether it is intentional or accidental. Intentional changes go into the Ansible roles; accidental ones are remediated by Ansible. Servers with drift get a human review before they are migrated. To confirm that the migration did not change anything unintentionally, compute a SHA256 hash of each managed configuration file before migration and verify that it is unchanged after Ansible has run. Keep a rollback path in case Ansible produces a different state than Puppet did, and record all discrepancies and decisions in change control.

After migration, Ansible enforces the agreed configuration consistently and prevents new drift. Schedule a reconciliation playbook that periodically compares actual with desired state and reports differences.
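
The hashing check is straightforward with Python's standard library. The following is a minimal sketch, assuming the managed configuration lives under one directory tree; `hash_tree` and `changed_files` are hypothetical helper names, and the hash covers file content only (not ownership or permissions).

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def hash_tree(root: str) -> Dict[str, str]:
    """Map each file under root (path relative to root) to its SHA256 hex digest."""
    base = Path(root)
    digests: Dict[str, str] = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            rel = path.relative_to(base).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Paths that were added, removed, or modified between two snapshots."""
    return sorted(
        path for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Typical use would be to call `hash_tree("/etc/myapp")` (a made-up path) before migration, store the result, call it again after the first Ansible run, and treat any non-empty `changed_files` result as a divergence that needs review.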

Follow-up: How would you implement compliance validation across old and new configuration management systems?

Your Puppet infrastructure includes custom Puppet types, providers, and plugins developed in-house. These provide functionality unavailable in standard Puppet. Ansible doesn't have equivalent modules. These custom extensions are critical to infrastructure. How do you handle custom Puppet extensions in Ansible?

Replace custom Puppet types with custom Ansible modules. First, understand what each extension does and write a functional requirement specification for it. Then check whether an existing Ansible module already covers the requirement. For the remaining gaps, write custom modules in Python: the Puppet type maps to the module's interface (its arguments, following Ansible conventions), and the Puppet provider maps to the module's implementation logic. Custom modules typically need less boilerplate than Puppet providers, can use Python libraries, and are easier to test.

Package the modules in a collection, for example `collections/ansible_collections/myorg/custom/plugins/modules/`. Write unit tests for each module, add integration tests (for example with Molecule), and document each module with examples. Break complex extensions into several single-purpose modules. Puppet plugins map as follows: custom functions become filter or lookup plugins, and custom facts become custom fact gathering (for example local facts under `/etc/ansible/facts.d`).

Test the modules thoroughly before production migration and make sure they work with every Ansible version you target. Version the collection so compatibility can be tracked through the transition. Write a migration guide that shows teams the Ansible equivalent of each custom type and provider they use. For critical extensions, consider keeping Puppet running in parallel until the replacement has proven itself.
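
The part of a custom module that replaces a Puppet provider is usually a small idempotent function. The sketch below shows only that core, for a hypothetical type that manages a `key = value` setting in a config file; the `AnsibleModule` wrapper (argument spec, check mode, `exit_json`) is deliberately left out so the example runs without Ansible installed. The function name `ensure_setting` is an assumption for illustration.

```python
from typing import Tuple


def ensure_setting(text: str, key: str, value: str) -> Tuple[bool, str]:
    """Ensure 'key = value' appears exactly once; return (changed, new_text).

    Pure function: it never touches the filesystem, so a module wrapper can
    call it in check mode and report 'changed' without writing anything.
    """
    wanted = "{} = {}".format(key, value)
    out = []
    found = False
    for line in text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            if not found:
                out.append(wanted)  # rewrite the first definition in place
                found = True
            # later duplicates of the same key are dropped
        else:
            out.append(line)
    if not found:
        out.append(wanted)  # key was absent: append it
    new_text = "\n".join(out) + "\n"
    return new_text != text, new_text


changed, content = ensure_setting("a = 1\nb = 2\n", "b", "3")
print(changed)
print(content, end="")
```

Because the logic is separated from the Ansible plumbing, it can be unit-tested directly, and the wrapper only has to read the file, call the function, write the result when not in check mode, and pass `changed` to `exit_json`.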

Follow-up: How would you implement automated testing for custom Ansible modules?

Your Puppet setup runs 24/7: the Puppet agent runs every 30 minutes and continuously remediates drift. Ansible is push-based, so deployments happen on demand. During the transition from Puppet to Ansible, how do you maintain continuous remediation without Puppet agents?

Use pull-based remediation. `ansible-pull` replicates the Puppet agent's behavior: each server runs it from cron every 30 minutes, it fetches the playbook repository from Git, executes the playbook locally, and remediates drift. A cron entry could look like `*/30 * * * * ansible-pull -d /var/lib/ansible-pull -U git://repo/playbooks.git`. With no playbook argument, `ansible-pull` looks for a playbook named after the host and falls back to `local.yml`. The Git repository becomes the source of truth, as the Puppet manifests were.

Drift detection follows from idempotence: a task reports `changed` only when actual state differed from desired state, so a callback plugin or the run log shows exactly what was remediated. If a service is down, the playbook restarts it, which matches Puppet's auto-remediation. Log remediation events and alert on an unusual number of changes, since that points to a problem. The playbooks must be idempotent (a second run changes nothing); test this extensively.

Tune the interval to the environment, since 30 minutes might not be optimal for Ansible. For critical systems that need remediation in under 5 minutes, use Tower with event-driven triggers instead. Plan for the Git repository being unavailable: by default `ansible-pull` stops when it cannot update its checkout, so pass `--force` if it should continue with the last copy it fetched. During the transition, compare the remediation patterns of Puppet and `ansible-pull`; they should look similar. Document the setup so teams understand how pull-based remediation works.
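
Alerting on excessive changes can be built on the PLAY RECAP line that every run prints at the end. The sketch below parses that line from a captured log; it assumes the default recap format of recent Ansible releases, and the `max_changed` threshold of 3 is an arbitrary placeholder to be tuned per environment. The function names are hypothetical.

```python
import re
from typing import Dict, Optional

# Matches lines such as:
# localhost : ok=5 changed=2 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<stats>(?:\w+=\d+\s*)+)$")


def parse_recap(line: str) -> Optional[Dict[str, int]]:
    """Parse one PLAY RECAP line into {"ok": 5, "changed": 2, ...}, else None."""
    match = RECAP_LINE.match(line.strip())
    if match is None:
        return None
    pairs = re.findall(r"(\w+)=(\d+)", match.group("stats"))
    return {name: int(count) for name, count in pairs}


def needs_alert(stats: Dict[str, int], max_changed: int = 3) -> bool:
    """Alert on failures, unreachable hosts, or more changes than expected."""
    return (
        stats.get("failed", 0) > 0
        or stats.get("unreachable", 0) > 0
        or stats.get("changed", 0) > max_changed
    )


sample = "localhost                  : ok=5    changed=2    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0"
stats = parse_recap(sample)
print(stats)
print("alert:", needs_alert(stats))
```

A steady state should show `changed=0` on most runs; a host that reports changes on every run either has a non-idempotent task or something else on the system fighting the playbook.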

Follow-up: How would you implement cluster awareness for distributed systems during migration?

Your organization must maintain backward compatibility: Puppet infrastructure stays operational during transition. Some teams aren't ready to migrate to Ansible. You need Puppet and Ansible coexisting for 1+ years. How do you manage hybrid Puppet/Ansible infrastructure?

Manage the hybrid deliberately. In the Ansible inventory, keep separate groups for Puppet-managed and Ansible-managed servers and update them as servers migrate. The two tools can coexist on one host (Puppet agent plus Ansible), but conflicts must be prevented. The primary approach is to partition the infrastructure, with Puppet managing group A and Ansible managing group B, and to keep the overlap minimal. On servers that are managed by both, make sure Puppet and Ansible never manage the same resource; otherwise each run can undo the other's changes.

Monitor dual-managed servers to verify that both tools produce consistent results. Teams that change a Puppet role during the transition should update the matching Ansible role at the same time. Build a test environment with both tools installed to verify there are no conflicts. Tower and the Puppet master act as two separate front ends, each updating its inventory independently, so track centrally which servers use which tool and audit this periodically.

Write a runbook that says when a server moves from Puppet to Ansible and how to verify its state afterwards, and publish which teams migrate when and in what order. Finally, plan a phased cutoff: set a deadline (no Puppet-managed servers after 2026-12-31), decommission the Puppet infrastructure once the remaining servers have moved, and give teams a reason to migrate, for example by ending Puppet support for those that do not.
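
The periodic audit can be reduced to a set comparison. The sketch below assumes you can export, per host, the set of resources each tool manages; the data shape, host names, and the function names `find_conflicts` and `managers` are hypothetical.

```python
from typing import Dict, List, Set


def find_conflicts(puppet: Dict[str, Set[str]], ansible: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return {host: [resources claimed by both tools]} for dual-managed hosts."""
    conflicts: Dict[str, List[str]] = {}
    for host in sorted(puppet.keys() & ansible.keys()):
        shared = sorted(puppet[host] & ansible[host])
        if shared:
            conflicts[host] = shared
    return conflicts


def managers(puppet: Dict[str, Set[str]], ansible: Dict[str, Set[str]]) -> Dict[str, str]:
    """Classify each host as 'puppet', 'ansible', or 'both'."""
    result: Dict[str, str] = {}
    for host in sorted(puppet.keys() | ansible.keys()):
        if host in puppet and host in ansible:
            result[host] = "both"
        elif host in puppet:
            result[host] = "puppet"
        else:
            result[host] = "ansible"
    return result


# Example claims (made up): web1 is dual-managed and has one overlapping resource.
puppet_claims = {"web1": {"package:nginx", "file:/etc/ntp.conf"}, "db1": {"package:postgresql"}}
ansible_claims = {"web1": {"file:/etc/ntp.conf", "service:app"}, "app1": {"service:app"}}
print(managers(puppet_claims, ansible_claims))
print(find_conflicts(puppet_claims, ansible_claims))
```

An empty result from `find_conflicts` is the target state for every dual-managed host; the `managers` output doubles as the migration progress report.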

Follow-up: How would you implement gradual feature parity between Puppet and Ansible roles?