Introduction
Kubernetes v1.36 brings a powerful new capability: in-place vertical scaling for pod-level resources has graduated to Beta, meaning it's enabled by default via the InPlacePodLevelResourcesVerticalScaling feature gate. This feature allows you to dynamically adjust the aggregate resource budget (.spec.resources) of a running Pod without necessarily restarting your containers. In this how-to guide, you'll learn exactly how to leverage this feature to simplify resource management for complex Pods, such as those with sidecars, and scale their shared pool of CPU and memory on the fly.
What You Need
- A Kubernetes cluster running version 1.36 or later (with the feature gate enabled by default).
- kubectl command-line tool installed and configured to access your cluster.
- Understanding of basic Kubernetes Pod and resource concepts.
- Permission to patch Pods (edit or update resources).
- A test environment (e.g., minikube, kind, or a cloud cluster) where you can safely experiment.
Step 1: Define a Pod with Pod-Level Resources and No Container-Level Limits
To take advantage of in-place vertical scaling at the Pod level, your Pod specification must include both a spec.resources block (which defines the aggregate budget) and containers that do not have individual resource limits. This way, the containers inherit the pod-level budget.
Create a file named shared-pool-pod.yaml with the following content:
apiVersion: v1
kind: Pod
metadata:
  name: shared-pool-app
spec:
  resources: # Pod-level limits
    limits:
      cpu: "2"
      memory: "4Gi"
  containers:
  - name: main-app
    image: nginx:latest
    # No container-level limits or requests – they inherit from the pod level
    resizePolicy:
    - resourceName: "cpu"
      restartPolicy: "NotRequired"
    - resourceName: "memory"
      restartPolicy: "NotRequired"
  - name: sidecar
    image: busybox:latest
    command: ["sleep", "3600"]
    resizePolicy:
    - resourceName: "cpu"
      restartPolicy: "NotRequired"
Important: The resizePolicy is set at the container level. As of v1.36, pod-level resizePolicy is not supported, so the Kubelet evaluates each container separately. Use NotRequired to avoid container restarts during non-disruptive updates.
Apply the Pod:
kubectl apply -f shared-pool-pod.yaml
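Before settling on a pod-level limit, it can help to sanity-check that the expected peak usage of all containers fits the shared budget. A minimal sketch, assuming you have per-container estimates in millicores (the fits_budget helper below is hypothetical, not part of kubectl):

```shell
# Hypothetical helper: check whether the expected peak CPU of each container,
# in millicores, fits within the pod-level limit (also in millicores).
fits_budget() {
  limit_m=$1; shift
  total=0
  for c in "$@"; do total=$((total + c)); done
  if [ "$total" -le "$limit_m" ]; then
    echo "fits (${total}m of ${limit_m}m)"
  else
    echo "exceeds (${total}m of ${limit_m}m)"
  fi
}

# 2 CPUs = 2000m shared between main-app (est. 1200m) and sidecar (est. 600m)
fits_budget 2000 1200 600
```

The estimates here are illustrative; substitute figures from your own monitoring.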
Step 2: Perform an In-Place Resize via the Resize Subresource
Now that your Pod is running, you can increase the shared CPU pool from 2 to 4 CPUs without restarting the containers. Use the resize subresource to send a patch:
kubectl patch pod shared-pool-app --subresource resize --patch \
'{"spec":{"resources":{"limits":{"cpu":"4"}}}}'
The patch only updates the pod-level limits.cpu. Memory can be updated similarly. The Kubelet will receive the change and attempt to apply it to the cgroups of each container that inherits from the pod-level budget.
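If you want to adjust CPU and memory in one operation, you can build the patch once and reuse it. A small sketch (make_resize_patch is a hypothetical helper, not a kubectl feature):

```shell
# Hypothetical helper: build a pod-level resize patch for the given
# CPU and memory limit quantities.
make_resize_patch() {
  printf '{"spec":{"resources":{"limits":{"cpu":"%s","memory":"%s"}}}}' "$1" "$2"
}

patch=$(make_resize_patch 4 6Gi)
echo "$patch"
# Against a running cluster, you would then apply it with:
#   kubectl patch pod shared-pool-app --subresource resize --patch "$patch"
```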
Step 3: Verify the Resize
Check the updated resource status of the Pod:
kubectl describe pod shared-pool-app
Look for the Resources section under Spec – it should now show cpu: 4. You can also inspect the cgroup from inside a container to confirm the change (the path below is for cgroup v2; on cgroup v1 hosts, check /sys/fs/cgroup/cpu/cpu.cfs_quota_us instead):
kubectl exec shared-pool-app -c main-app -- cat /sys/fs/cgroup/cpu.max
Note that because the limit is pod-level, it is enforced on the Pod's shared parent cgroup; a container with no limits of its own may report max here, with the actual quota set one level up. If the resize succeeded without a restart, the new CPU limit takes effect immediately.
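To interpret the cpu.max contents: cgroup v2 expresses the limit as "<quota> <period>" in microseconds, or the literal word max for unlimited. A hedged helper for decoding that line locally:

```shell
# Decode a cgroup v2 cpu.max line ("<quota> <period>" or "max") into CPU cores.
cpu_max_to_cores() {
  echo "$1" | awk '{ if ($1 == "max") print "unlimited"; else printf "%g\n", $1 / $2 }'
}

cpu_max_to_cores "400000 100000"   # quota 400000us / period 100000us = 4 CPUs
cpu_max_to_cores "max"             # no limit set at this cgroup level
```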
Step 4: Understand When a Restart Is Required
Not all resource types or scenarios allow non-disruptive changes. The resizePolicy per container determines the behavior:
- Non-disruptive (NotRequired): The Kubelet updates cgroup limits dynamically via the Container Runtime Interface (CRI). This works for CPU and memory, but may not be supported for other resources such as hugepages or ephemeral storage.
- Disruptive (RestartContainer): If you set restartPolicy: RestartContainer for a specific resource, the Kubelet restarts that container to apply the new pod-level budget safely. Use this when the runtime doesn't support dynamic updates or when you want to ensure a clean slate.
Note: When you change the pod-level resources, every container that inherits from that budget will see a resize event. The Kubelet consults each container's resizePolicy individually. If any container requires a restart, it will be restarted independently of the others.
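The per-container decision boils down to a lookup on the container's restartPolicy for the resized resource. This sketch is a simplified model of that decision, not the Kubelet's actual code:

```shell
# Simplified model of the Kubelet's per-container decision: given a container's
# restartPolicy for a resized resource, is the change applied in place or via restart?
resize_action() {
  case "$1" in
    NotRequired)      echo "apply in place" ;;
    RestartContainer) echo "restart container" ;;
    *)                echo "unknown policy: $1" ;;
  esac
}

resize_action NotRequired        # both containers in this guide use NotRequired for CPU
resize_action RestartContainer
```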
Step 5: Safely Reduce Resources (Quick Guide)
You can also reduce the pod-level resource budget. However, be careful: reducing CPU or memory below the current usage of any container may cause throttling or OOM kills. The Kubelet will apply the reduction, but it's best to monitor usage before shrinking.
Example patch to reduce CPU to 1.5 cores:
kubectl patch pod shared-pool-app --subresource resize --patch \
'{"spec":{"resources":{"limits":{"cpu":"1500m"}}}}'
Always test reductions in a non-production environment first.
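kubectl top pod reports usage in millicores, while limits may be written as plain cores ("2") or in millicore notation ("1500m"). A small conversion helper (hypothetical, whole cores only) makes the comparison before shrinking easier:

```shell
# Hypothetical helper: normalize a Kubernetes CPU quantity ("2", "1500m")
# to millicores. Handles whole-core and millicore notation only.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo "$(( $1 * 1000 ))" ;;
  esac
}

to_millicores "1500m"   # 1500
to_millicores "2"       # 2000
```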
Tips and Best Practices
- Start with NotRequired for known supported resources. This avoids unnecessary restarts and keeps your services running during scaling events.
- Use pod-level resources when containers share a common resource pool. This is ideal for sidecar patterns where you don't want to calculate per-container limits manually.
- Monitor container-level usage with tools like kubectl top pod or observability platforms to make informed scaling decisions. Pod-level limits are an aggregate; ensure the total is not exceeded by any single container's demand.
- Test scaling operations against a stress container that generates CPU load to verify that in-place changes take effect without disruption.
- Remember that resizePolicy is per container, not per pod. If you want to enforce a restart for all containers, you must set RestartContainer in each container's policy.
- Keep an eye on the Kubelet logs for messages about resize attempts, especially if a change fails to apply in place and falls back to a restart.
- Combine with the Horizontal Pod Autoscaler (HPA) for dynamic scaling: trigger vertical adjustments based on metrics while HPA handles replica count changes.
With the steps above, you can confidently use in-place vertical scaling for pod-level resources in Kubernetes 1.36. This feature simplifies operations for multi-container Pods, reduces downtime, and gives you finer control over resource co‑scheduling.