Skip to content

StatefulSet - can't rollback from a broken state #67250

Open
@MrTrustor

Description

@MrTrustor

/kind bug

What happened:

I updated a StatefulSet with a non-existent Docker image. As expected, a pod of the statefulset is destroyed and can't be recreated (ErrImagePull). However, when I change back the StatefulSet with an existing image, the StatefulSet doesn't try to remove the broken pod to replace it by a good one. It keeps trying to pull the non-existing image.
You have to delete the broken pod manually to unblock the situation.

Related Stackoverflow question

What you expected to happen:

When rolling back the bad config, I expected the StatefulSet to remove the broken pod and replace it by a good one.

How to reproduce it (as minimally and precisely as possible):

  1. Deploy the following StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "standard"
      resources:
        requests:
          storage: 10Gi
  1. Once the 3 pods are running, update the StatefulSet spec and change the image to k8s.gcr.io/nginx-slim:foobar
  2. Observe the new pod failing to pull the image.
  3. Roll back the change.
  4. Observe the broken pod not being deleted.

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.7", GitCommit:"dd5e1a2978fd0b97d9b78e1564398aeea7e7fe92", GitTreeState:"clean", BuildDate:"2018-04-19T00:05:56Z", GoVersion:"go1.9
.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.5-gke.3", GitCommit:"6265b9797fc8680c8395abeab12c1e3bad14069a", GitTreeState:"clean", BuildDate:"2018-07-19T23:02:51Z", GoVersi
on:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: Google Kubernetes Engine
  • OS (e.g. from /etc/os-release): COS

cc @joe-boyce

Metadata

Metadata

Assignees

Labels

kind/bugCategorizes issue or PR as related to a bug.lifecycle/frozenIndicates that an issue or PR should not be auto-closed due to staleness.sig/appsCategorizes an issue or PR as relevant to SIG Apps.sig/architectureCategorizes an issue or PR as relevant to SIG Architecture.sig/schedulingCategorizes an issue or PR as relevant to SIG Scheduling.

Type

No type

Projects

Status

Needs Triage

Status

Needs Triage

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions