Environment Information
- Which image of the operator are you using? master branch
- Where do you run it - cloud or metal? Kubernetes or OpenShift? Bare Metal K8s
- Are you running Postgres Operator in production? No (test environment)
- Type of issue? Bug report
File Path
pkg/cluster/pod.go, in the recreatePod function
Description of the Issue
Sometimes, when a pod has already been deleted, the waitForPodDeletion function in the recreatePod method fails to catch the pod deletion event. This causes the function to hang or time out, even though the pod has actually been deleted.
Steps to Reproduce
- Trigger pod recreation via the recreatePod function.
- In some cases, the pod gets deleted successfully.
- However, the waitForPodDeletion function doesn't detect this event.
- The process gets stuck waiting for an event that won't come.
Expected Behavior
The waitForPodDeletion function should reliably detect when a pod has been deleted, regardless of timing or race conditions.
Actual Behavior
The function sometimes misses the deletion event, causing the process to hang.
Additional Information
This issue appears to be related to event handling in the operator. It might be a race condition where the pod deletion event occurs before the event listener is properly set up, or the event is somehow missed by the subscriber mechanism.