Kubernetes 1.35 marks a pivotal milestone in cluster resource management: the graduation of In-Place Pod Vertical Scaling (in-place Pod resize) from beta to General Availability (GA). After more than six years of development, first proposed in 2019 as KEP-1287, introduced as an alpha feature in v1.27, and refined through beta in v1.33, this capability is now production-ready. In-place resize lets operators adjust the CPU and memory allocations of running pods without disruptive restarts, enabling finer-grained resource optimization, faster response to workload demands, and higher overall cluster efficiency. In this post, we explore the technical underpinnings, evolution, and practical application of this feature.
How In-Place Pod Vertical Scaling Works Under the Hood
At its core, in-place Pod vertical scaling relies on close cooperation between the kubelet and the container runtime (such as containerd or CRI-O). When a user updates a Pod's .spec.containers[].resources.requests or .spec.containers[].resources.limits for CPU or memory through the Pod's resize subresource, the Kubernetes API server accepts the modification. The kubelet, watching for such changes, then performs a live resize by updating the control group (cgroup) associated with the container. This adjustment happens without terminating the container process, so the application keeps running while its resource limits are changed underneath it.
The process can be summarized:
- Update – A user patches the pod's resource requirements through the resize subresource (e.g., with kubectl patch --subresource resize) or via a controller.
- Detection – The kubelet detects the spec change and validates the new values against node capacity and any applicable constraints (e.g., container limits, QoS class).
- Runtime call – The kubelet calls the container runtime through the CRI's UpdateContainerResources method to resize the container's cgroup.
- Apply – The runtime updates the cgroup's CPU quota, memory limit, or both, immediately changing the container's resource envelope.
- Status – The kubelet updates the pod's status: the PodResizePending and PodResizeInProgress conditions report progress, and status.containerStatuses[].resources reflects the resources actually in effect.
The feature respects Kubernetes resource semantics: the pod's QoS class is fixed at creation time and is preserved across resizes, and admission controllers that enforce resource quotas or limits still apply. Importantly, the container's filesystem and network namespace remain untouched, and processes inside the container keep running (unless the container's resizePolicy requests a restart for that resource).
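Per-resource restart behavior is controlled by the container's resizePolicy field. Here is a minimal sketch (the pod name and image are placeholders, not from the original post): CPU is resized live, while memory changes restart the container, which suits workloads that size themselves once at startup.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo                    # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9   # any long-running image works here
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired       # apply CPU changes live
    - resourceName: memory
      restartPolicy: RestartContainer  # restart the container on memory changes
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
```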
Feature Evolution: Alpha to GA
The journey of In‑Place Pod Vertical Scaling highlights Kubernetes’ commitment to stability and user feedback. Below is a concise comparison of its maturity stages:
| Version/Stage | Introduction | Key Capabilities | Known Limitations |
|---|---|---|---|
| Alpha (v1.27) | 2023-04 | Basic in‑place resize for CPU & memory; opt‑in via feature gate InPlacePodVerticalScaling; per‑container resizePolicy for restart behavior | Resize progress exposed only through the since‑replaced status.resize field; limited runtime support; Windows pods not supported |
| Beta (v1.33) | 2025-04 | Enabled by default; increases and decreases supported; resizes moved to the dedicated resize subresource; PodResizePending and PodResizeInProgress conditions added; metrics and events | Memory limit decreases best‑effort; ephemeral storage and hugepages not resizable; runtime coverage still expanding |
| GA (v1.35) | 2025-12 | Enabled by default; broad container runtime support (containerd, CRI‑O, etc.); production‑ready; no feature gate | Supports only CPU and memory; ephemeral storage resize not available |
This progression demonstrates the maturation of both the Kubernetes control plane and the ecosystem of container runtimes to safely handle live resource modifications.
Practical Implementation: How to Use In‑Place Scaling
Once your cluster runs Kubernetes 1.35 (or later), no special configuration is required: in-place resize is enabled by default. You can resize any running pod's CPU and memory, provided the node's container runtime supports it (all major runtimes do as of 1.35).
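A quick way to confirm the prerequisites, assuming kubectl is pointed at the target cluster (the CONTAINER-RUNTIME column shows each node's runtime and version):

```bash
# Server version must be v1.35 or later for GA behavior
kubectl version

# Check each node's container runtime and version
kubectl get nodes -o wide
```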
Example: Resize a Deployment’s Running Pods
Suppose you have a Deployment named app-deploy:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:1.0
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
```
To increase a pod’s memory limit to 2Gi and CPU limit to 2 without restarting it, patch the pod itself through the resize subresource. Note that patching the Deployment’s pod template instead would trigger an ordinary rolling update and replace the pods:

```bash
# Find the pods managed by the Deployment
kubectl get pods -l app=myapp

# Resize one pod in place via the resize subresource
kubectl patch pod <pod-name> --subresource resize \
  -p '{"spec":{"containers":[{"name":"myapp","resources":{"limits":{"cpu":"2","memory":"2Gi"}}}]}}'
```

Repeat for each replica you want to resize. Changes made directly to pods do not update the Deployment’s template, so any later rollout recreates pods with the template’s original resources; editing the template itself (kubectl edit deployment app-deploy, under spec.template.spec.containers[0].resources.limits) is the durable route, at the cost of a rolling restart. The kubelet applies the new limits to the patched pod while it keeps running. You can verify the resize status:
```bash
# Inspect resize progress via the pod's conditions and events
kubectl describe pod <pod-name>

# Check the resources actually applied to the running container
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].resources}'
```

While a resize is being applied, the pod reports a PodResizeInProgress condition; a resize that cannot be satisfied yet (or at all) is surfaced as PodResizePending with reason Deferred or Infeasible. Once the resize completes, the conditions clear and status.containerStatuses[].resources matches the new spec.
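To see the pending states in action, you can deliberately request more CPU than your nodes offer. A hedged example, assuming no node has 64 cores (the pod and container names come from the Deployment above):

```bash
# Request an amount of CPU no node can satisfy
kubectl patch pod <pod-name> --subresource resize \
  -p '{"spec":{"containers":[{"name":"myapp","resources":{"requests":{"cpu":"64"},"limits":{"cpu":"64"}}}]}}'

# The PodResizePending condition should report reason Infeasible
kubectl get pod <pod-name> -o jsonpath='{.status.conditions[?(@.type=="PodResizePending")].reason}'
```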
Considerations and Best Practices
- Runtime support – Ensure your nodes run a CRI implementation that supports the UpdateContainerResources CRI method (containerd ≥ 1.6, CRI‑O ≥ 1.25).
- Resource types – Only CPU and memory can be resized in place. Ephemeral storage and hugepages are not yet supported.
- Application behavior – Some applications do not automatically pick up new limits (e.g., a JVM heap sized once at startup via flags); for those, set the container's resizePolicy for memory to RestartContainer. Be aware that decreasing a memory limit can cause OOM kills if the container's resident set exceeds the new cap; check usage first, as shown below.
- Monitoring – Use kubelet metrics (e.g., via metrics-server and kubectl top) to observe post‑resize utilization and adjust further if needed.
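Before lowering a memory limit, confirm that actual usage leaves headroom. A minimal sketch, assuming metrics-server is installed (kubectl top depends on it) and the pod from the earlier example:

```bash
# Current memory usage per container
kubectl top pod <pod-name> --containers

# Lower the limit only if usage sits comfortably below the new cap
kubectl patch pod <pod-name> --subresource resize \
  -p '{"spec":{"containers":[{"name":"myapp","resources":{"limits":{"memory":"768Mi"}}}]}}'
```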
Conclusion
In‑place Pod vertical scaling in Kubernetes 1.35 represents a significant operational advancement, allowing clusters to adapt to workload fluctuations without restarting pods. By removing the restart penalty, it improves service availability, reduces resource fragmentation, and simplifies capacity planning. GA status signals a stable API and broad ecosystem support. As clusters grow more dynamic, in‑place resize will become an indispensable part of the Kubernetes toolkit.
Author: James P Samuelkutty
Contact: LinkedIn | Email