VMware Level 3 Interview Q&A

sathyahraj
Oct 21, 2025

1. vSphere / ESXi / vCenter

Q1. What happens internally during a vMotion?

Answer:

- vCenter coordinates source and destination hosts.
- Memory pre-copy begins while the VM continues running.
- Dirty pages are tracked and copied in iterative passes.
- VM is stunned briefly for final sync, then resumed on the destination host.
- Inventory metadata and networking details are updated.
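
The iterative pre-copy described above can be modeled with a few lines of arithmetic. This is a simplified sketch, not how ESXi implements it: the function name, the page counts, and the rates are all hypothetical, and it assumes constant dirty and copy rates.

```python
def precopy_passes(total_pages, dirty_rate, copy_rate, stun_threshold, max_passes=20):
    """Simulate iterative pre-copy; return (passes, pages_left_for_final_sync).

    dirty_rate and copy_rate are in pages per second (hypothetical inputs).
    """
    remaining = total_pages  # the first pass has to copy all of memory
    passes = 0
    while remaining > stun_threshold and passes < max_passes:
        # Pages dirtied while this pass is being sent:
        # time = remaining / copy_rate, dirtied = dirty_rate * time.
        remaining = min(total_pages, remaining * dirty_rate // copy_rate)
        passes += 1
    return passes, remaining


# A VM that dirties memory 10x slower than the link can copy it converges fast.
print(precopy_passes(1_000_000, 10_000, 100_000, 1_000))  # (3, 1000)
```

When the dirty rate is at or above the copy rate, the loop never converges and stops at `max_passes`, which is the situation where a real migration needs other measures to finish.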

Q2. What is EVC and how does it work?

EVC (Enhanced vMotion Compatibility) masks CPU features to expose a consistent baseline instruction set across cluster hosts by intercepting CPUID calls.

Command:

vim-cmd hostsvc/evc/info
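
To make the "consistent baseline" idea concrete, here is a minimal sketch that treats each host's CPU features as a set and takes the intersection. The host names and the helper functions are hypothetical, and this is a simplification: real EVC applies predefined baselines per CPU generation rather than computing an arbitrary intersection.

```python
def evc_baseline(host_features):
    """Return the CPU features that every host in the cluster exposes."""
    feature_sets = [set(features) for features in host_features.values()]
    return set.intersection(*feature_sets) if feature_sets else set()


def masked_features(host_features):
    """Return, per host, the features that would be hidden from VMs."""
    baseline = evc_baseline(host_features)
    return {host: set(f) - baseline for host, f in host_features.items()}


cluster = {
    "esx01": {"sse4_2", "avx", "avx2"},  # newer CPU (hypothetical host)
    "esx02": {"sse4_2", "avx"},          # older CPU (hypothetical host)
}
print(sorted(evc_baseline(cluster)))      # ['avx', 'sse4_2']
print(masked_features(cluster)["esx01"])  # {'avx2'}
```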

Q3. Why does an ESXi host go into ‘Not Responding’ state?

Root Causes:

- Hostd or vpxa daemon crash
- Network failure on management vmk
- vCenter trust certificate expired

Fix:

/etc/init.d/hostd restart

/etc/init.d/vpxa restart

/etc/init.d/netlogond restart

Q4. Explain vCenter HA architecture.

- Active, Passive, and Witness nodes.
- Passive node syncs database and config.
- Failover <30 seconds.

Q5. How do you troubleshoot vCenter login failure after upgrade?

- Verify STS token validity: /var/log/vmware/sso/
- Restart identity management:

service-control --restart vmware-stsd vmware-vmdird

Q6. What happens when a vCenter database is full?

- Services fail to start (vpxd, inventory, or performance metrics).

Resolution: Extend /storage/db partition and vacuum Postgres DB.

Q7. DRS vs vMotion difference?

- vMotion = one-time migration.
- DRS = automated decision engine that uses vMotion for balancing.

2. NSX-T Data Center

Q8. NSX Manager vs Control Plane?

- Manager: Configuration, REST API, and UI layer.
- Control Plane: Routing, BGP, topology calculation.

Q9. Troubleshoot NSX-T Edge tunnel failure.

get tunnel-interfaces

get logical-routers

ping <TEP-IP> size 8972 df-bit enable

Root cause often MTU mismatch or VLAN trunking error.
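
The 8972 in the ping above is simply a 9000-byte MTU minus the IPv4 and ICMP headers. A small sketch of that arithmetic follows; the function name is hypothetical, and it assumes IPv4 without header options.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(max_ping_payload(9000))  # 8972, the size used in the ping above
print(max_ping_payload(1500))  # 1472, for a standard Ethernet MTU
```

If the TEP network uses a different MTU, the ping size should be adjusted the same way; a ping that only succeeds at smaller sizes with the don't-fragment bit set points to an MTU mismatch somewhere along the path.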

Q10. NSX-T upgrade stuck at 47%.

- Collect logs from /var/log/nsxapi.log and /var/log/upgrade-coordinator.log.
- Restart service:

systemctl restart upgrade-coordinator

Q11. Explain T0-T1 communication.

- Tier-1 DR performs local routing within hypervisor kernel.
- Traffic requiring NAT or North-South flows to Tier-0 SR.

Q12. Logical port missing from one host.

nsxcli -c get logical-ports

nsxcli -c get interface

Root cause: Transport node profile not applied.

Q13. Edge node sync degraded.

get edge-cluster status

Reboot affected Edge node.

Q14. DFW (Distributed Firewall) rules not applying.

Force-publish via API:

POST /policy/api/v1/infra/

Q15. NSX Manager cluster degraded.

get cluster status

Resolution: Restart management service and ensure Cassandra partition healed.

3. vSAN

Q16. How is data replicated in vSAN?

Each object is split into components according to its Failures to Tolerate (FTT) setting. Example: with RAID-1 mirroring, FTT=1 => 2 data replicas + 1 witness.
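
For RAID-1 mirroring, the replica count and the minimum host count follow directly from FTT. The sketch below shows that arithmetic; the function name and the returned keys are hypothetical, and it deliberately leaves out witness counts for higher FTT values and erasure coding (RAID-5/6), which follow different rules.

```python
def raid1_layout(ftt, object_size_gb):
    """Replica count, minimum hosts, and raw capacity for a RAID-1 vSAN object."""
    if ftt < 0:
        raise ValueError("ftt must be zero or greater")
    replicas = ftt + 1
    return {
        "replicas": replicas,
        "min_hosts": 2 * ftt + 1,  # FTT=1: two replica hosts plus one for the witness
        "raw_capacity_gb": object_size_gb * replicas,
    }


print(raid1_layout(1, 100))
# {'replicas': 2, 'min_hosts': 3, 'raw_capacity_gb': 200}
```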

Q17. Object inaccessible errors?

Check via:

cmmds-tool find -t DOM_OBJECT

esxcli vsan debug object list

Cause: disk group failure or host isolation.

Q18. vSAN resync slow.

Monitor: vsan.resync_dashboard

Throttle rebuilds:

vsan.resync_throttle --iops-limit 1000

Q19. Cluster partitioned?

esxcli vsan cluster get

esxcli vsan cluster unicastagent list

MTU mismatch is common.

Q20. Disk group degraded?

vdq -q

Replace failed SSD or disk and recreate disk group.

4. VMware Cloud Foundation (VCF)

Q21. LCM (Lifecycle Manager) role?

Responsible for orchestrating upgrades of NSX, ESXi, vCenter, and vSAN.

API: /v1/lcm/upgrade-bundles

Q22. VCF upgrade failure during ESXi patch?

- Host fails to enter maintenance mode (VIF still attached).
- Detach via NSX CLI and retry.

Q23. Password rotation failure.

Expired credentials in locker.

Update manually using:

vcfcli password update --component nsx

Q24. Inventory out of sync.

/opt/vmware/vcf/lcm/lcm-cli refresh inventory

Q25. Domain deletion stuck.

Detach workloads first:

DELETE /v1/domains/<id>

5. Automation (PowerCLI / Terraform / Ansible)

Q26. PowerCLI connection fails (SSL).

Set-PowerCLIConfiguration -InvalidCertificateAction Ignore

Q27. Terraform state drift issue.

terraform refresh

terraform plan

Q28. Ansible NSX authentication fails.

Refresh token or update credentials in ansible.cfg.

Q29. PowerCLI to find VMs with old snapshots:

Get-VM | Get-Snapshot | Where-Object {$_.Created -lt (Get-Date).AddDays(-7)}

Q30. NSX-T API task stuck.

Cancel via:

DELETE /api/v1/task/<id>

6. Scenario-Based Questions

Q31. VM not reachable after migration.

- Check VLAN trunking and dvSwitch config.
- Validate vmkping.

Q32. vSAN stretched cluster resync not completing.

Use Observer:

vsan.observer --run-webserver --force

Q33. NSX Edge rebooted and lost routing.

Reapply T0 configuration using policy API.

Q34. vCenter services failing intermittently.

Review /var/log/vmware/vmon/vmon-svc.log.

Q35. VM performance issue on vSAN.

Check congestion via:

vsan.vm_perf_stats

7. Log Locations and Commands

Component	Log Path
vCenter	/var/log/vmware/vpxd/
ESXi	/var/log/hostd.log, /var/log/vmkernel.log
NSX-T	/var/log/nsxapi.log, /var/log/controller.log
vSAN	/var/log/vsanhealth.log
VCF	/var/log/vmware/vcf/lcm/lcm.log

Key Commands

· esxcli network ip interface list

· nsxcli -c get managers

· vsan.health.cluster.get

· Get-VMHost | Get-VM | Measure-Object

Section 1: vSphere / ESXi / vCenter

1. What happens internally when you vMotion a VM?

Answer:

· vCenter coordinates source and destination hosts.

· Memory pages are copied using iterative pre-copy.

· Dirty pages are tracked via shadow page tables.

· VM is briefly stunned, final pages are copied, then resumed on destination.

· vCenter updates inventory and DRS/HA metadata.

2. Explain how vCenter HA works.

Answer:

· Consists of three nodes: Active, Passive, and Witness.

· Passive node maintains a synchronous DB replication.

· Witness ensures quorum (prevents split-brain).

· vCenter HA heartbeat traffic uses a dedicated network.

· Failover usually occurs within ~10–30 seconds.
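
The quorum rule that prevents split-brain can be sketched as a simple majority check across the three nodes. This is a simplified model under stated assumptions, not the vCenter HA implementation: the function names are hypothetical, and it only captures the idea that a node must see at least one other node to serve.

```python
NODES = ("active", "passive", "witness")


def has_quorum(visible_nodes):
    """True when the visible nodes form a majority of the three-node cluster."""
    return len(set(visible_nodes) & set(NODES)) > len(NODES) // 2


def can_serve(node, reachable_peers):
    """A data node may run vCenter only if it (plus its peers) holds quorum.

    The witness never serves; it only contributes a vote.
    """
    return node != "witness" and has_quorum(set(reachable_peers) | {node})


print(can_serve("passive", {"witness"}))  # True: passive + witness is a majority
print(can_serve("active", set()))         # False: an isolated node must not serve
```

With three votes, any two nodes that can still reach each other form a majority, so an isolated Active node and a surviving Passive node cannot both serve at the same time.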

3. Why would an ESXi host go into “Not Responding” state?

Root Causes:

· Hostd or vpxa daemon crash

· Management network failure

· vCenter trust or certificate issue

· CPU/Memory exhaustion causing watchdog trigger

Fix: SSH → restart management agents:

/etc/init.d/hostd restart

/etc/init.d/vpxa restart

/etc/init.d/netlogond restart

4. What’s the difference between DRS and vMotion?

Answer:

· vMotion = Live migration of a single VM.

· DRS = Intelligent cluster-level placement (uses vMotion).

· DRS evaluates CPU/memory utilization every 5 minutes.

· Uses cost-benefit analysis before triggering migration.
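
The cost-benefit idea can be illustrated with a toy model: measure cluster imbalance as the standard deviation of host loads, and migrate only when the reduction outweighs a cost term. This is an illustrative sketch, not the actual DRS algorithm; the function names, the load values, and the `migration_cost` parameter are all hypothetical.

```python
from statistics import pstdev


def imbalance(host_loads):
    """Cluster imbalance as the standard deviation of per-host load (0.0 to 1.0)."""
    return pstdev(host_loads.values())


def should_migrate(host_loads, src, dst, vm_load, migration_cost):
    """Recommend a move only if it reduces imbalance by more than it costs."""
    after = dict(host_loads)
    after[src] -= vm_load
    after[dst] += vm_load
    benefit = imbalance(host_loads) - imbalance(after)
    return benefit > migration_cost


loads = {"esx01": 0.9, "esx02": 0.3}  # hypothetical utilization per host
print(should_migrate(loads, "esx01", "esx02", vm_load=0.3, migration_cost=0.05))  # True
print(should_migrate(loads, "esx02", "esx01", vm_load=0.1, migration_cost=0.05))  # False
```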

5. How does EVC mode work internally?

Answer: EVC masks CPU feature sets at the hypervisor level by intercepting CPUID instructions. It ensures all hosts in a cluster expose a common baseline CPU instruction set.

Section 2: NSX-T Data Center

6. Difference between NSX Manager and NSX Controller in NSX-T?

Answer:

· Manager Plane: Policy/config management, UI, API.

· Control Plane: Routing, MAC learning, topology computation.

· NSX-T integrates both in a cluster (unlike NSX-V which had separate controllers).

7. What happens when an NSX-T Edge node loses connectivity to TEP network?

Answer:

· Overlay tunnels (GENEVE) fail.

· East-West traffic drops; North-South may continue if VLAN-based uplink.

· Verify via:

get tunnel-interfaces

get logical-routers

· Root cause: MTU mismatch, VLAN config, or TEP pool conflict.

8. How do you troubleshoot NSX-T upgrade failure at 47%?

Answer:

· Usually stuck during Edge service reconfiguration.

· Collect logs: /var/log/nsxapi.log, /var/log/upgrade-coordinator.log

· Restart the upgrade service:

systemctl restart upgrade-coordinator

· Retry via API: POST /api/v1/upgrade/retry.

9. Explain how NSX-T handles routing between Tier-0 and Tier-1 gateways.

Answer:

· Tier-1 → Tier-0 communication uses SR (Service Router) and DR (Distributed Router).

· DR performs local routing in kernel module (hypervisor).

· SR handles centralized services like NAT, DHCP, LB, etc.

· Traffic uses an internal Geneve tunnel between DR and SR.
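
The split between distributed and centralized handling can be sketched as a small decision function. It is a deliberately simplified model under stated assumptions: the function name, the segment list, and the `needs_service` flag are hypothetical, and real gateways decide per route and per configured service.

```python
import ipaddress


def handling_component(dst_ip, connected_segments, needs_service=False):
    """Return "DR" for traffic routed locally in the hypervisor, else "SR".

    connected_segments: CIDR strings for segments attached to the gateway.
    needs_service: True when the flow needs a centralized service such as NAT.
    """
    dst = ipaddress.ip_address(dst_ip)
    local = any(dst in ipaddress.ip_network(seg) for seg in connected_segments)
    if local and not needs_service:
        return "DR"
    return "SR"


segments = ["10.0.1.0/24", "10.0.2.0/24"]  # hypothetical overlay segments
print(handling_component("10.0.2.15", segments))                      # DR
print(handling_component("8.8.8.8", segments))                        # SR
print(handling_component("10.0.2.15", segments, needs_service=True))  # SR
```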

10. How to check logical switch connectivity at ESXi level?

nsxcli -c get logical-ports

nsxcli -c get interface

esxcli network vswitch dvs vmware vxlan list

Section 3: vSAN

11. Explain vSAN object components.

Answer: Each vSAN object = multiple components (based on failures to tolerate).

· Example: FTT=1 → 2 replicas + witness = 3 components.

· Stored across disk groups for redundancy.

12. Why does “vSAN object inaccessible” occur?

Root Causes:

· Host partition

· Disk group failure

· Stale CMMDS entry

Fix:

cmmds-tool find -t DOM_OBJECT

esxcli vsan debug object list

13. How to handle vSAN resync taking long time?

Answer:

· Check resync queue: vsan.resync_dashboard

· Verify congestion levels

· Optionally throttle resync IOPS via:

vsan.resync_throttle --iops-limit 1000

14. How to troubleshoot vSAN cluster partition?

Answer:

· Check cluster UUID and member view:

esxcli vsan cluster get

esxcli vsan cluster unicastagent list

· MTU mismatch or physical network issue often root cause.
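
Once the member view has been collected from each host, spotting a partition is a matter of grouping hosts that report the same member list. The sketch below assumes the Sub-Cluster Member UUIDs from `esxcli vsan cluster get` have already been gathered into a dictionary; the function name, host names, and UUID values are hypothetical.

```python
def find_partitions(member_views):
    """Group hosts by the set of member UUIDs each one reports.

    member_views: host name -> iterable of member UUIDs seen by that host.
    More than one group in the result indicates a partitioned cluster.
    """
    groups = {}
    for host, members in member_views.items():
        groups.setdefault(frozenset(members), []).append(host)
    return sorted(sorted(hosts) for hosts in groups.values())


views = {
    "esx01": {"uuid-1", "uuid-2"},
    "esx02": {"uuid-1", "uuid-2"},
    "esx03": {"uuid-3"},  # sees only itself: isolated from the others
}
partitions = find_partitions(views)
print(partitions)           # [['esx01', 'esx02'], ['esx03']]
print(len(partitions) > 1)  # True: the cluster is partitioned
```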

Section 4: VMware Cloud Foundation (SDDC Manager)

15. What is the role of the LCM (Lifecycle Manager) service in VCF?

Answer: LCM handles version orchestration of vCenter, NSX, ESXi, and vSAN.

· It validates bundles, dependencies, and applies updates domain by domain.

· API path: /v1/lcm/upgrade-bundles.

16. Why does VCF upgrade fail during “Apply Solution” step?

Answer: ESXi host cannot enter maintenance mode (MM) → VIF or VM pinned.

Check:

nsxcli -c get logical-ports | grep <hostname>

Manually place host into maintenance mode and resume upgrade.

Section 5: Automation (PowerCLI, Terraform, Ansible)

17. PowerCLI script to find orphaned VMs:

Get-VM | Where-Object {$_.ExtensionData.Runtime.ConnectionState -eq 'orphaned'} | Select-Object Name

18. How to force Terraform to recreate an NSX object?

terraform taint nsxt_logical_switch.edge_switch

terraform apply

19. Ansible NSX playbook fails with 401 error — cause?

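Answer: Expired API token or role missing in NSX Manager. Update the credentials in ansible.cfg and rerun the playbook. If the NSX modules themselves are missing or outdated, reinstall the collection with ansible-galaxy collection install vmware.nsxt; installing the collection does not re-authenticate.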

20. PowerCLI to list all VM snapshots older than 7 days:

Get-VM | Get-Snapshot | Where-Object {$_.Created -lt (Get-Date).AddDays(-7)} | Select VM, Name, Created
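
The same age filter can be expressed outside PowerCLI, for example when post-processing an exported snapshot report. This is a minimal sketch: the function name and the record layout (dictionaries with `vm`, `name`, and `created` keys) are hypothetical and only mirror the properties selected above.

```python
from datetime import datetime, timedelta


def old_snapshots(snapshots, now, max_age_days=7):
    """Return snapshot records created more than max_age_days before now."""
    cutoff = now - timedelta(days=max_age_days)
    return [snap for snap in snapshots if snap["created"] < cutoff]


report = [
    {"vm": "app01", "name": "pre-upgrade", "created": datetime(2025, 10, 1)},
    {"vm": "db01", "name": "nightly", "created": datetime(2025, 10, 20)},
]
for snap in old_snapshots(report, now=datetime(2025, 10, 21)):
    print(snap["vm"], snap["name"], snap["created"].date())
# app01 pre-upgrade 2025-10-01
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to check against a known report.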

Section 6: Scenario-based Questions

21. During VCF upgrade, one NSX Edge fails to come up post-reboot. Steps?

1. Check console via DCUI or vSphere Client

2. Verify /var/log/nsxapi.log

3. If boot corruption → redeploy Edge using API /api/v1/edge-nodes.

22. VM is not pingable after migrating to another host.

Answer:

· Likely missing VLAN trunk on uplink.

· Check dvPortgroup VLAN tag.

· Validate physical switch port config.

23. How to restore vCenter when the appliance is corrupted?

Answer:

· Use VCSA file-based backup.

· Restore via installer → “Restore from Backup.”

· Alternatively, restore DB from /storage/db/vpostgres.

24. NSX-T logical segment missing on one host only.

Answer: Transport node profile not applied.

Reapply via API:

POST /api/v1/transport-node-collections/<id>/apply-profile

25. vSAN stretched cluster resync not completing.

Answer: Check site latency and witness connectivity, then run vsan.observer --run-webserver for analysis.
