Storage Configuration: PVs, PVCs, StorageClasses & StatefulSets
Version: 2.0.0
Purpose: Canonical lesson structure for Platform Engineering & AI Infrastructure Curriculum.
Required Inputs: Module definition, lesson objectives, project standards.
Outputs: Standards-compliant lesson markdown.
Lesson Metadata
- Lesson ID:
MOD-K8S-04 - Module: Kubernetes Engineering (
MOD-K8S) - Difficulty: Advanced
- Estimated Duration: 60 minutes
- Learning Track: 🟢 Core
- Version: 2.0.0
- Last Updated: 2026-06-28
Lesson Overview
This lesson explores the master persistent storage and stateful workload engines of Kubernetes, decrypting how Platform Engineers manage stateful databases and persistent volumes across dynamic, ephemeral Pods. By mastering Ephemeral Storage (emptyDir), Persistent Volumes (PVs), Persistent Volume Claims (PVCs), StorageClasses (Dynamic Provisioning), Container Storage Interface (CSI), and StatefulSets (volumeClaimTemplates), you will firmly establish the elite storage capabilities supporting our module capability: “I can deploy, scale, operate, and troubleshoot production-grade Kubernetes cluster environments.”
Learning Objectives
- Contrast ephemeral container filesystems (
emptyDir) with decoupled, persistent storage abstractions (PVs/PVCs), detailing the danger of container termination. - Deconstruct the architectural relationship between Persistent Volumes (PVs - physical storage) and Persistent Volume Claims (PVCs - developer storage requests).
- Explain the internal mechanics of Dynamic Volume Provisioning using StorageClasses and Container Storage Interface (CSI) drivers (e.g., AWS EBS CSI).
- Deconstruct the three core access modes:
ReadWriteOnce(RWO),ReadWriteMany(RWX), andReadOnlyMany(ROX). - Architect production stateful database clusters using StatefulSets, detailing the architectural necessity of stable network identities (
pod-0) andvolumeClaimTemplates.
Prerequisites
- Completion of
MOD-K8S-01,MOD-K8S-02, andMOD-K8S-03. - Foundational understanding of Block vs Object Storage (
MOD-CLOUD-03), YAML manifests, and database file locking.
Why This Exists
In Lesson 02, we established that Pods are ephemeral, disposable units managed by Deployments. When junior engineers attempt to deploy a stateful enterprise database (e.g., PostgreSQL or MongoDB) into Kubernetes, they frequently utilize a standard Deployment manifest and configure the database storage utilizing an emptyDir volume or a naked container filesystem (/var/lib/postgresql/data).
Deploying stateful databases using emptyDir or naked container filesystems is a catastrophic data loss disaster!
Imagine you are hired as a Lead Platform Engineer at a fast-growing financial enterprise. The previous engineers deployed the company’s master PostgreSQL banking database using a standard Deployment manifest with an emptyDir volume attachment.
For three months, the database runs flawlessly, processing millions of customer banking transactions.
One evening, the physical worker node hosting the database Pod experiences a minor memory pressure event. kubelet forcefully evicts the database Pod and reschedules it onto a brand-new healthy worker node.
However, an emptyDir volume is strictly bound to the lifespan of the Pod on that specific worker node!
The exact millisecond kubelet evicts the Pod from the old worker node, the emptyDir volume is permanently wiped from the physical hard drive! When the replacement database Pod spins up on the new worker node, it mounts a brand-new, completely empty directory!
Your company has just suffered a fatal, irreversible deletion of its entire banking database!
To solve the monumental challenge of Ephemeral Filesystems, Data Loss upon Eviction, Cloud Storage Integration, and Stateful Clustering, Kubernetes leaders established PVs, PVCs, StorageClasses, CSI Drivers, and StatefulSets. By decoupling physical storage disks entirely from the Pod lifecycle using Persistent Volumes, dynamically provisioning cloud block storage (AWS EBS) via StorageClasses, and wrapping stateful databases in higher-level StatefulSet controllers that guarantee stable network identities (database-0) and persistent disk re-attachments, Platform Engineers guarantee that your databases survive Pod evictions and node crashes with absolute zero data loss!
Core Concepts
1. Ephemeral Storage (emptyDir) vs. Decoupled Persistent Volumes
To operate production Kubernetes storage, Platform Engineers enforce a strict boundary between storage lifecycles:
- Ephemeral Storage (
emptyDir): A temporary scratch directory created on the physical worker node’s root hard drive when a Pod is assigned to the node. Excellent for temporary cache files or sharing scratch data between two containers in the same Pod. However, if the Pod is evicted or deleted, theemptyDiris permanently wiped! - Decoupled Persistent Volumes: A Kubernetes PersistentVolume (PV) is a physical external storage asset (e.g., an AWS EBS volume or GCP Persistent Disk) managed entirely outside the Pod lifecycle! If a Pod crashes, is evicted, or is deleted, the physical PV remains completely intact in the cloud, waiting to be seamlessly reattached to the replacement Pod!
[ Ephemeral Storage: emptyDir ] [ Decoupled Persistent Volume: PV/PVC ]
┌────────────────────────────────────────┐ ┌────────────────────────────────────────┐
│ Pod Evicted -> emptyDir Wiped! │ │ Pod Evicted -> PV Remains Intact! │
│ (Catastrophic data loss for databases!)│ │ (Reattaches seamlessly to new Pod!) │
└────────────────────────────────────────┘ └────────────────────────────────────────┘2. PVs vs. PVCs (The Provider vs. Consumer Handshake)
In production Kubernetes, developers never configure physical storage disks directly. Storage is managed through a strict two-part handshake:
PersistentVolume (PV): The physical storage asset provisioned by Platform Engineers (or dynamically via StorageClasses). Represents a physical 100GB AWS EBS volume or NFS share.PersistentVolumeClaim (PVC): The storage request authored by Application Developers! A developer writes a PVC YAML manifest requesting10Giof storage withReadWriteOnceaccess. Kubernetes intercepts this claim, finds a matching PV, and Binds them together! The developer then mounts the PVC into their Pod!
[ Platform Engineer: kind: PersistentVolume ] (Represents Physical 100GB AWS EBS Disk)
▲
└──( Binds )──► [ Application Developer: kind: PersistentVolumeClaim ] (Requests 10Gi Storage)3. Dynamic Provisioning & StorageClasses (CSI Drivers)
In large enterprise clusters, manually creating hundreds of physical PV manifests every time a developer submits a PVC is an unmanageable administrative bottleneck!
- Container Storage Interface (CSI): The modern CNCF standard for storage plugins (e.g.,
ebs.csi.aws.com). Allows third-party cloud providers to build storage drivers that plug cleanly into Kubernetes. - StorageClass (
kind: StorageClass): The master dynamic provisioning engine! Platform Engineers create a StorageClass defining the physical cloud storage parameters (provisioner: ebs.csi.aws.com,volumeType: gp3). When a developer submits a PVC referencing this StorageClass, Kubernetes instantly makes an API call to AWS, dynamically provisions a physical EBS hard drive, creates the PV object inetcd, and binds it to the PVC automatically! Zero manual storage administration required!
[ Dynamic Volume Provisioning Mechanics ]
(Developer Submits PVC) ──► (StorageClass Intercepts) ──► (AWS API Calls: Creates gp3 Disk) ──► (PV Auto-Created & Bound!)4. Storage Access Modes (RWO vs RWX vs ROX)
When requesting storage, developers must declare the exact physical access locking mechanics required by the storage volume:
ReadWriteOnce (RWO): The volume can be mounted as read-write by exactly ONE single worker node at a time! If Pod A on Node 1 mounts an RWO volume, Pod B on Node 2 is physically blocked from mounting it! Use Case: Block storage (AWS EBS), databases!ReadWriteMany (RWX): The volume can be mounted as read-write by hundreds of worker nodes simultaneously! Use Case: Shared Network File Systems (NFS, AWS EFS), legacy CMS uploads (/var/www/uploads)!ReadOnlyMany (ROX): The volume can be mounted as read-only by hundreds of worker nodes simultaneously. Use Case: Massive read-only AI training datasets!
[ ReadWriteOnce (RWO): Single Node Attachment ] ──► (AWS EBS Block Storage! Databases!)
[ ReadWriteMany (RWX): Multi-Node Attachment ] ──► (AWS EFS Network File Systems! Shared Uploads!)5. StatefulSets & volumeClaimTemplates
Standard Deployments are completely inadequate for stateful database clusters (e.g., a 3-node MongoDB replica set). Deployments create anonymous, interchangeable Pods (pod-abc, pod-xyz) that share the exact same PVC, causing database file locking collisions!
- StatefulSet (
kind: StatefulSet): The master stateful workload controller! Designed specifically for databases and distributed consensus clusters. It provides two mandatory stateful pillars:- Stable Network Identities: Pods receive strict, predictable, sequential numbering (
database-0,database-1,database-2). Ifdatabase-0crashes, the replacement Pod spins up with the exact same namedatabase-0and stable DNS hostname! volumeClaimTemplates: Instead of sharing a single PVC, the StatefulSet dynamically generates a brand-new, completely isolated PVC and physical PV hard drive for every single individual Pod in the set!database-0getspvc-database-0;database-1getspvc-database-1. Ifdatabase-0is evicted to a new worker node, Kubernetes instantly unmountspvc-database-0from the old node and reattaches it todatabase-0on the new node! Absolute database perfection!
- Stable Network Identities: Pods receive strict, predictable, sequential numbering (
Architecture
Real-World Example
Imagine you are a managing a large logistics enterprise. The platform operates a mission-critical database inside Kubernetes using a strict top-to-bottom layered storage model.
Originally, the team deployed the database using standard configurations and attached a single shared storage drive.
When the system attempted to spin up three database instances across three separate desks, a catastrophic storage collision occurred! The first instance successfully grabbed the drive. However, because the drive can only be used by one person at a time, the other two instances were completely blocked! Their instances remained stuck indefinitely!
Because you maintain elite standards, you take command of the storage re-architecture. You transition the database to an Assigned Seating Manager (StatefulSet) utilizing Layer 1: Filing Cabinet Makers (StorageClasses).
First, you deploy a highly optimized Layer 1 (StorageClass) configured to dynamically create high-performance solid-state drives on demand.
Second, you configure the StatefulSet to request a specific amount of storage per employee via a Layer 3 (PVC).
Now, when you apply the changes, the system executes a highly governed sequential spin-up. It provisions a brand-new dedicated Layer 2 (PV), and binds it to the Layer 3 (PVC). Then, it seats Layer 4 (The Stable Employee), and attaches it cleanly to Layer 5 (The Desk). Once the first employee is ready, it repeats the process for the next one. Your database achieves absolute high availability, flawless dynamic disk provisioning, and eliminates collisions permanently!
Hands-on Demonstration
Let’s look at how an engineer inspects a StorageClass manifest using cat, inspects active PVC bindings using kubectl get pvc, and inspects production StatefulSet manifests.
Input 1: Inspecting StorageClasses and Active PVC Bindings (storageclass.yaml)
We use cat to inspect a pristine, highly governed Kubernetes StorageClass manifest defining dynamic CSI provisioning, and simulate executing kubectl get pvc to view physical PV bindings.
Code 1
# Inspect the declarative production Kubernetes StorageClass manifest.
# (We simulate inspecting a compliant Kubernetes StorageClass configuration file)
cat << 'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: production-ebs-gp3
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
type: gp3
iops: "3000"
throughput: "125"
encrypted: "true"
EOF
# Inspect active PersistentVolumeClaims currently bound to physical PV storage disks.
# (We simulate the clean plain-text output of kubectl get pvc)
kubectl get pvc 2>/dev/null || cat << 'EOF'
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-mongodb-cluster-0 Bound pvc-01234567-89ab-cdef-0123-456789abcdef 100Gi RWO production-ebs-gp3 10d
data-mongodb-cluster-1 Bound pvc-fedcba98-7654-3210-fedc-ba9876543210 100Gi RWO production-ebs-gp3 10d
EOFExpected Output 1
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: production-ebs-gp3
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
type: gp3
iops: "3000"
throughput: "125"
encrypted: "true"
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
data-mongodb-cluster-0 Bound pvc-01234567-89ab-cdef-0123-456789abcdef 100Gi RWO production-ebs-gp3 10d
data-mongodb-cluster-1 Bound pvc-fedcba98-7654-3210-fedc-ba9876543210 100Gi RWO production-ebs-gp3 10dExplanation 1
Look at how beautifully architected this StorageClass configuration is! Let’s deconstruct the elite storage elements:
provisioner: ebs.csi.aws.com: The master CSI driver hook! Tells Kubernetes to communicate directly with AWS!volumeBindingMode: WaitForFirstConsumer: Advanced scheduling perfection! Prevents Kubernetes from dynamically provisioning the physical EBS volume until the Scheduler determines exactly which physical worker node (and Availability Zone) the Pod will run in, preventing AZ mismatch errors!STATUS: Bound: Absolute storage success! Proves that Kubernetes successfully dynamically provisioned two physical 100GBgp3encrypted hard drives and bound them to our database PVCs!
Input 2: Inspecting Production StatefulSet Manifests (statefulset.yaml)
We use cat to inspect a pristine, highly governed StatefulSet manifest defining stable network identities, headless service linking, and volumeClaimTemplates.
Code 2
# Inspect the declarative production Kubernetes StatefulSet manifest.
cat << 'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mongodb-cluster
namespace: default
labels:
app: mongodb
spec:
serviceName: "mongodb-headless-svc"
replicas: 2
selector:
matchLabels:
app: mongodb
template:
metadata:
labels:
app: mongodb
spec:
containers:
- name: mongodb-container
image: mongo:7.0
ports:
- containerPort: 27017
name: mongodb-port
volumeMounts:
- name: mongodb-data-storage
mountPath: /data/db
volumeClaimTemplates:
- metadata:
name: mongodb-data-storage
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "production-ebs-gp3"
resources:
requests:
storage: 100Gi
EOFExpected Output 2
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mongodb-cluster
namespace: default
labels:
app: mongodb
spec:
serviceName: "mongodb-headless-svc"
replicas: 2
selector:
matchLabels:
app: mongodb
template:
metadata:
labels:
app: mongodb
spec:
containers:
- name: mongodb-container
image: mongo:7.0
ports:
- containerPort: 27017
name: mongodb-port
volumeMounts:
- name: mongodb-data-storage
mountPath: /data/db
volumeClaimTemplates:
- metadata:
name: mongodb-data-storage
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "production-ebs-gp3"
resources:
requests:
storage: 100GiExplanation 2
Notice how perfectly governed our stateful database configuration is! Let’s deconstruct the elite elements:
kind: StatefulSet: Guarantees our Pods spin up with stable, sequential names (mongodb-cluster-0,mongodb-cluster-1).serviceName: "mongodb-headless-svc": Attaches a Headless Service (clusterIP: None) that allows CoreDNS to generate individual stable DNSArecords for every single Pod in the set (mongodb-cluster-0.mongodb-headless-svc.default.svc.cluster.local)!volumeClaimTemplates: The master dynamic storage generator! Automatically stamps out a brand-new PVC requesting100GiofReadWriteOncestorage from ourproduction-ebs-gp3StorageClass for every single Pod in the set!
Hands-on Lab
- Objective: Author a declarative StorageClass manifest defining dynamic CSI provisioning, author a StatefulSet manifest containing
volumeClaimTemplates, simulate executingkubectl get pvc, simulate verifying stable network identities, and verify storage governance. - Estimated Time: 20 minutes
- Difficulty: Advanced
- Environment: Interactive Browser Terminal / Local Sandbox (with kubectl installed)
Step-by-step Instructions
- Open your terminal sandbox and create a brand-new directory named
storage-lab:mkdir ~/storage-lab && cd ~/storage-lab. - Create a declarative YAML manifest defining a production Kubernetes StorageClass by typing:
cat << 'EOF' > storageclass-spec.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-storage
provisioner: k8s.io/minikube-hostpath
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
EOF- Type
cat storageclass-spec.yamlto inspect your pristine Kubernetes StorageClass declaration! NoticevolumeBindingMode: WaitForFirstConsumer. - Create a declarative YAML manifest defining a StatefulSet with
volumeClaimTemplatesby typing:
cat << 'EOF' > statefulset-spec.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: database-cluster
namespace: default
spec:
serviceName: "database-svc"
replicas: 2
selector:
matchLabels:
app: database
template:
metadata:
labels:
app: database
spec:
containers:
- name: db-container
image: redis:7.2-alpine
volumeMounts:
- name: db-storage
mountPath: /data
volumeClaimTemplates:
- metadata:
name: db-storage
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "fast-storage"
resources:
requests:
storage: 10Gi
EOF- Type
cat statefulset-spec.yamlto inspect your pristine StatefulSet declaration! - Simulate applying your StorageClass and StatefulSet declarations to the cluster using
kubectl apply -f .by typing:
# (We simulate the exact kubectl apply execution)
echo "storageclass.storage.k8s.io/fast-storage created"
echo "statefulset.apps/database-cluster created"- Simulate verifying active PVC bindings dynamically generated by the StatefulSet by typing:
# (We simulate the exact kubectl get pvc execution)
echo -e "NAME\t\t\t\tSTATUS\tVOLUME\t\tCAPACITY\tACCESS MODES\tSTORAGECLASS\tAGE\ndb-storage-database-cluster-0\tBound\tpvc-abc-123\t10Gi\t\tRWO\t\tfast-storage\t25s\ndb-storage-database-cluster-1\tBound\tpvc-xyz-987\t10Gi\t\tRWO\t\tfast-storage\t15s"- Simulate verifying stable network identities and sequential spin-up using
kubectl get podsby typing:
# (We simulate the exact kubectl get pods execution for a StatefulSet)
echo -e "NAME\t\t\tREADY\tSTATUS\tRESTARTS\tAGE\ndatabase-cluster-0\t1/1\tRunning\t0\t\t35s\ndatabase-cluster-1\t1/1\tRunning\t0\t\t15s"
echo "SUCCESS: Pods spun up sequentially with stable network identities and isolated PVC attachments!"Verification
cat statefulset-spec.yaml | grep -E "volumeClaimTemplates" || echo "volumeClaimTemplates Verified"If your terminal successfully outputs your volumeClaimTemplates string, you have mastered foundational Kubernetes StatefulSet mechanics and dynamic storage provisioning!
Troubleshooting
- Issue:
kubectl get pvcshowsSTATUS: Pendingindefinitely, and your Pod remains stuck inPendingstate! - Solution: The
storageClassNamedeclared in your PVC (fast-storage) completely does not exist in the cluster (kubectl get sc), OR the physical CSI storage driver daemon (ebs.csi.aws.com) is completely crashed or missing from your worker nodes! Verify your StorageClass name and ensure your CSI driver is fully operational!
Cleanup
# Safely remove the demonstration storage lab directory
rm -rf ~/storage-labProduction Notes
In enterprise Kubernetes architecture, what happens when your database volume fills up to 99% capacity, and you need to expand the disk size from 100Gi to 500Gi without taking the database offline? Platform Engineers utilize Online Volume Expansion! If your StorageClass possesses allowVolumeExpansion: true, you simply edit the PVC manifest (kubectl edit pvc data-mongodb-cluster-0), update requests.storage: 500Gi, and save the file! Kubernetes communicates directly with AWS, expands the underlying EBS volume dynamically in the background, and resizes the physical Linux filesystem (resize2fs) inside the running container with zero seconds of database downtime!
Common Mistakes
- Sharing a Single PVC across Multiple Deployments: Beginners frequently create a single standalone PVC manifest (
my-shared-pvc) requestingReadWriteOncestorage, and mount that exact same PVC name into two separate Deployments. Because RWO allows single node attachment only, whichever Pod spins up first locks the disk; the second Pod fails instantly with a fatal multi-attach error! Never share RWO PVCs across multiple Pods! - Deleting StatefulSets without Deleting PVCs: Junior developers frequently execute
kubectl delete statefulset mongodb-clusterto tear down a database, and assume the storage hard drives were deleted as well. To prevent accidental data loss, StatefulSets NEVER delete their dynamically generated PVCs! The PVCs and physical PV hard drives remain inBoundstate in the cluster, actively billing your company! You must explicitly executekubectl delete pvcto eliminate the disks!
Failure-Driven Learning
Imagine a junior engineer attempts to deploy a database into a Kubernetes cluster, but when they inspect kubectl get pods, the Pod remains stuck in ContainerCreating state indefinitely with a frustrating multi-attach volume error.
Simulated Failure
# Simulating a Pod stuck in ContainerCreating state due to a multi-attach volume error
# (We simulate the exact kubectl get pods / kubectl describe pod error during volume attach collisions)
echo -e "NAME\t\t\tREADY\tSTATUS\t\tRESTARTS\tAGE\nproduction-db-pod-2\t0/1\tContainerCreating\t0\t\t15m\n\n--- KUBECTL DESCRIBE POD EVENTS ---\nWarning FailedAttachVolume 15m (x12 over 15m) attachdetach-controller Multi-Attach error for volume \"pvc-01234567\" Volume is already exclusively attached to one node and can't be attached to another\n# FATAL: Pod stuck in ContainerCreating. Volume multi-attach collision detected."Output
NAME READY STATUS RESTARTS AGE
production-db-pod-2 0/1 ContainerCreating 0 15m
--- KUBECTL DESCRIBE POD EVENTS ---
Warning FailedAttachVolume 15m (x12 over 35m) attachdetach-controller Multi-Attach error for volume "pvc-01234567" Volume is already exclusively attached to one node and can't be attached to another
# FATAL: Pod stuck in ContainerCreating. Volume multi-attach collision detected.Diagnosis & Recovery
Why did this fail? Look at this classic storage failure: Multi-Attach error for volume ... Volume is already exclusively attached to one node! When you utilize block storage (AWS EBS) in Kubernetes, the access mode is strictly ReadWriteOnce (RWO - single node attachment). The junior engineer configured a standard Deployment manifest with replicas: 2 sharing a single PVC. Pod #1 spun up on Worker Node 1 and successfully attached the EBS disk. When Pod #2 attempted to spin up on Worker Node 2 and mount the exact same disk, the AWS API forcefully rejected the attachment because an EBS volume cannot be attached to two physical virtual machines simultaneously! To recover correctly, the engineer must delete the Deployment and transition to a StatefulSet with volumeClaimTemplates. This dynamically provisions a brand-new, isolated PVC and EBS disk for Pod #2, eliminating the multi-attach collision permanently!
Engineering Decisions
Database Deployment: Kubernetes StatefulSets vs. Managed Cloud Databases (RDS)
When architecting an enterprise database strategy, engineering leaders must choose the master database hosting model.
- Kubernetes StatefulSets (
kind: StatefulSet): Runs the database directly inside container Pods on worker nodes. Provides absolute multi-cloud portability, eliminates cloud provider markups, and unifies application and database deployments under a single GitOps pipeline. However, requires extreme operational maturity to manage automated backups, point-in-time recovery, and storage snapshotting. - Managed Cloud Databases (Amazon RDS / DynamoDB): Fully managed database services provided directly by AWS. Provides automated push-button backups, multi-AZ failover, automated patching, and point-in-time recovery out of the box. However, creates massive vendor lock-in and incurs heavy financial markups (up to 80% over raw compute/storage costs).
- The Platform Decision: Platform Engineers strictly mandate Managed Cloud Databases (Amazon RDS) as the master database engine for all Tier-1 mission-critical relational databases to ensure absolute disaster recovery SLAs, while strictly deploying Kubernetes StatefulSets for all caching layers (Redis), event streaming platforms (Kafka), and vector databases (Qdrant).
Best Practices
- Master
kubectl get pv,pvc: Whenever you debug failing database deployments, executekubectl get pv,pvc -o wide. It instantly proves whether your developer storage claims successfully achievedBoundstate with physical cloud hard drives! - Enforce Reclaim Policy (
persistentVolumeReclaimPolicy: Retain): When creating StorageClasses for critical databases, setreclaimPolicy: Retain(instead ofDelete). If a developer accidentally executeskubectl delete pvc my-database-pvc, Kubernetes will NOT delete the physical underlying AWS EBS hard drive! The physical disk remains safely in your AWS account inReleasedstate, allowing you to recover the database perfectly!
Troubleshooting Guide
Issue 1: “STATUS: Pending (WaitForFirstConsumer)” vs. “Warning: FailedAttachVolume”
- Cause: You attempt to deploy stateful workloads, but encounter volume binding pauses or multi-attach disk lockouts.
- Diagnosis & Solution:
Pending (WaitForFirstConsumer): Your PVC successfully reached the API Server, but the StorageClass enforcesWaitForFirstConsumer. Kubernetes is intentionally pausing the volume binding until you deploy a physical Pod that actively references the PVC! To fix, simply deploy your Pod manifest (kubectl apply -f pod.yaml), and the PVC transitions toBoundinstantly!FailedAttachVolume: The underlying cloud storage volume is actively locked by a Pod running on a dead or unresponsive worker node, preventingattachdetach-controllerfrom unmounting it! To fix, forcefully delete the stuck Pod on the dead node (kubectl delete pod [name] --force --grace-period=0), allowing Kubernetes to cleanly reattach the volume to the new worker node!
Summary
- Ephemeral Storage (
emptyDir) vanishes permanently when Pods are evicted; Persistent Volumes (PVs) decouple physical disks entirely from the Pod lifecycle. - Persistent Volume Claims (PVCs) represent developer storage requests that bind dynamically to physical PVs.
- StorageClasses utilize CSI drivers (
ebs.csi.aws.com) to dynamically provision physical cloud hard drives upon PVC submission. - ReadWriteOnce (RWO) allows single node attachment (AWS EBS); ReadWriteMany (RWX) allows multi-node attachment (AWS EFS).
- StatefulSets provide stable network identities (
database-0) and utilizevolumeClaimTemplatesto generate isolated PVCs for every Pod.
Cheat Sheet
# Retrieve all active PersistentVolumeClaims (PVCs) and display their binding statuses
kubectl get pvc -o wide
# Retrieve all physical PersistentVolumes (PVs) currently registered in your cluster
kubectl get pv -o wide
# Retrieve all active StorageClasses and display their associated CSI driver provisioners
kubectl get sc -o wide
# Retrieve all active StatefulSets in your cluster
kubectl get statefulsets -o wide
# Describe detailed binding events, storage classes, and error states for a PVC
kubectl describe pvc [pvc_name]Knowledge Check
Multiple Choice Questions
- A developer deploys a MySQL database into Kubernetes using a standard Deployment manifest with
replicas: 1and attaches anemptyDirvolume to/var/lib/mysql. The database runs for a month. A physical worker node memory pressure event causeskubeletto evict the Pod and reschedule it onto a new worker node. What is the correct evaluation of this database setup?- A) The database will survive because Kubernetes automatically copies
emptyDirfolders across worker nodes over the network. - B) The database setup is a catastrophic data loss disaster!
emptyDirvolumes are temporary scratch directories bound to the lifespan of the Pod on a specific worker node. Whenkubeletevicts the Pod, theemptyDiris permanently wiped from the physical hard drive. The replacement Pod will mount a brand-new empty directory, resulting in total data loss. The developer must use a StatefulSet with a PersistentVolumeClaim (PVC) bound to a physical cloud volume (PV). - C) The developer forgot to use
docker compose. - D) The setup requires
chmod 777.
- A) The database will survive because Kubernetes automatically copies
Scenario Questions
You are attempting to deploy a legacy Content Management System (CMS) that runs across 10 stateless web Pods. All 10 Pods need to simultaneously mount the exact same storage directory (/var/www/uploads) to read and write user uploaded images. You create a PVC requesting accessModes: [ "ReadWriteOnce" ] and reference an AWS EBS StorageClass. You observe that 9 out of 10 Pods remain stuck in ContainerCreating with Multi-Attach error for volume. Based on what you learned in this lesson, why did this fail, and what exact access mode and storage engine must you switch to?
Short Answer Questions
Explain why a StatefulSet requires a Headless Service (clusterIP: None), specifically addressing how CoreDNS generates DNS records for individual Pods in the set.
Interview Preparation
Beginner Questions
- What is the difference between an
emptyDirvolume and a Persistent Volume (PV)? - What is a Persistent Volume Claim (PVC)?
- What is a Kubernetes StatefulSet?
Intermediate Questions
- Explain the difference between
ReadWriteOnce(RWO) andReadWriteMany(RWX) access modes. - How does
volumeClaimTemplatesoperate inside a StatefulSet manifest?
Advanced Questions
- Explain how the Container Storage Interface (CSI) sidecar containers (
external-provisioner,external-attacher,csi-node-driver-registrar) interact with the Kubernetes API Server and third-party cloud storage APIs to dynamically provision and attach physical hard drives without modifying in-tree Kubernetes core code, and describe the internal mechanics of CSI Volume Snapshotting (kind: VolumeSnapshot).
Scenario-Based Discussions
- Discuss the architectural trade-offs of establishing an enterprise storage strategy that relies on deploying a highly available distributed cloud native storage engine directly inside Kubernetes (e.g., Rook/Ceph or Portworx) versus utilizing cloud provider managed CSI drivers (e.g., AWS EBS CSI or Azure Disk CSI), specifically addressing storage latency, operational complexity, storage replication overhead across worker nodes, and absolute multi-cloud manifest portability.
View Answers
Beginner
- emptyDir vs PV:
emptyDiris an ephemeral scratch directory bound to the Pod’s lifecycle on a specific node; it gets permanently deleted when the Pod is removed or evicted. A Persistent Volume (PV) is an external, decoupled physical storage asset that survives Pod restarts and evictions, maintaining data durability. - Persistent Volume Claim (PVC): A PVC is a request made by an application developer for a specific amount of persistent storage with specific access modes (like ReadWriteOnce). It binds to a matching Persistent Volume (PV).
- StatefulSet: A Kubernetes controller specifically designed for stateful applications (like databases). It guarantees stable, predictable network identities (e.g.,
db-0,db-1) and dedicated, persistent disk attachments per Pod usingvolumeClaimTemplates.
Intermediate
- RWO vs RWX:
ReadWriteOnce(RWO) means the volume can be mounted as read-write by only a single worker node at a time (e.g., AWS EBS block storage).ReadWriteMany(RWX) allows the volume to be mounted read-write by multiple nodes simultaneously (e.g., AWS EFS network file systems). - volumeClaimTemplates: Instead of all Pods sharing a single PVC (which causes locking issues), a StatefulSet uses
volumeClaimTemplatesto dynamically stamp out a brand new, dedicated PVC (and physical PV) for every individual Pod it creates, ensuring each replica gets isolated persistent storage.
Advanced
- CSI sidecars and VolumeSnapshots: CSI plugins run as distinct daemon sets and sidecars rather than being compiled into core Kubernetes code (out-of-tree). The
external-provisionerwatches for PVCs and calls the cloud API to create physical disks; theexternal-attacherattaches the disk to a specific node instance; thecsi-node-drivermounts it into the Pod.VolumeSnapshotallows triggering a block-level snapshot on the physical cloud provider storage (like an EBS snapshot) directly via Kubernetes API calls without manually interacting with the AWS console.
Scenario-Based Discussions
- In-Cluster Storage (Ceph) vs Managed CSI (EBS): Running distributed storage in-cluster (Rook/Ceph) provides absolute multi-cloud portability (the exact same manifests work on AWS, Azure, or bare metal) and avoids vendor lock-in. However, it introduces immense operational complexity to manage the storage plane, consumes massive worker node CPU/RAM for storage replication (lowering compute efficiency), and requires deep expertise to recover from split-brain scenarios. Managed CSI drivers offload all physical disk management, replication, and disaster recovery to the cloud provider, offering superior stability and lower operational overhead at the cost of cloud lock-in.