Description
Hello ExaBGP Team,
Firstly, I'd like to express my appreciation for your exceptional product.
We utilize Ansible for installing and configuring ExaBGP in our setups. Below is our Ansible 'exabgp' role:
---
- name: Install packages
  ansible.builtin.apt:
    name:
      - exabgp
    state: present

- name: Configure ExaBGP
  ansible.builtin.template:
    src: exabgp.conf.j2
    dest: /etc/exabgp/exabgp.conf
    mode: 0644
  notify: Reload ExaBGP

- name: Enable and start ExaBGP
  ansible.builtin.systemd:
    name: exabgp
    enabled: true
    state: started
And here's the handler 'Reload ExaBGP':
---
- name: Reload ExaBGP
  ansible.builtin.systemd:
    name: exabgp
    state: reloaded
    enabled: true
Unfortunately, we've noticed an issue. Our monitoring system detected that 'exabgp.service' was restarted: "Systemd's exabgp.service restarted 1 times on node1."
This issue can be reproduced using the 'systemctl restart exabgp && systemctl reload exabgp' command. On my Ubuntu 22.04, the result is as follows:
# systemctl restart exabgp && systemctl reload exabgp
Job for exabgp.service failed because a fatal signal was delivered to the control process.
See "systemctl status exabgp.service" and "journalctl -xeu exabgp.service" for details.
The journal log provides this information:
Aug 07 13:57:38 node1 systemd[1]: Starting ExaBGP...
Aug 07 13:57:38 node1 systemd[1]: Started ExaBGP.
Aug 07 13:57:38 node1 systemd[1]: Reloading ExaBGP...
Aug 07 13:57:38 node1 systemd[1]: Reloaded ExaBGP.
Aug 07 13:57:38 node1 systemd[1]: exabgp.service: Main process exited, code=killed, status=10/USR1
Aug 07 13:57:38 node1 systemd[1]: exabgp.service: Failed with result 'signal'.
Aug 07 13:57:38 node1 systemd[1]: exabgp.service: Scheduled restart job, restart counter is at 1.
Aug 07 13:57:38 node1 systemd[1]: Stopped ExaBGP.
Aug 07 13:57:39 node1 systemd[1]: Starting ExaBGP...
Aug 07 13:57:39 node1 systemd[1]: Started ExaBGP.
...
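From the logs, it appears that ExaBGP does not have enough time to finish starting before systemd sends 'USR1' for the reload. Presumably the process has not yet installed its USR1 handler at that point, so the signal kills the main process (status=10/USR1) instead of triggering a reload.
To work around this, we override the default exabgp unit with a drop-in that adds a two-second delay after start: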
# cat /etc/systemd/system/exabgp.service.d/override.conf
[Service]
ExecStartPost=/bin/sleep 2
We'd appreciate you letting us know if there's a better solution.
To Reproduce
Steps to reproduce the behavior:
systemctl restart exabgp && systemctl reload exabgp
Expected behavior
A reload sent immediately after restarting the service should not cause the service to fail and be restarted a second time.
Environment:
- OS: Ubuntu 22.04.3 LTS
- Version: 4.2.17