SIGILL during checkpoint #2641

Dear CRIU team,

When using CRIU 3.19 on Debian bookworm to checkpoint a container, approximately 5% of the attempts fail with the following error messages in the log:

```
(00.246054) Parasite syscall_ip at 0x40002c0be000
(00.278562) Set up parasite blob using memfd
(00.278627) Putting parasite blob into 0xffff81c9f000->0x400137d16000
(00.278760) Dumping GP/FPU registers for 15222
(00.278795) Putting tsock into pid 15222
(00.279009) Wait for parasite being daemonized...
(00.279028) Wait for ack 2 on daemon socket
(00.279173) Error (criu/parasite-syscall.c:88): si_code=4 si_pid=15222 si_status=4
(00.279198) Error (criu/parasite-syscall.c:95): 15222 was stopped by 4 unexpectedly
```

According to the comments in `parasite-syscall.c` and the log output, this indicates that a `SIGILL` occurred while the parasite code was executing.
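For reference, the two numbers in the error line can be decoded as sketched below. This assumes Linux numbering for the `CLD_*` codes and that the logged fields come from the `siginfo` of a `SIGCHLD` for the traced task; `decode_wait_info` is a hypothetical helper name, not part of CRIU.

```python
import signal

# Linux values of the CLD_* si_code constants delivered with SIGCHLD.
# Hard-coded here; assumed to match the kernel headers on the target.
CLD_CODES = {
    1: "CLD_EXITED",
    2: "CLD_KILLED",
    3: "CLD_DUMPED",
    4: "CLD_TRAPPED",
    5: "CLD_STOPPED",
    6: "CLD_CONTINUED",
}


def decode_wait_info(si_code, si_status):
    """Return readable names for the si_code / si_status pair from the log."""
    code_name = CLD_CODES.get(si_code, f"unknown({si_code})")
    if si_code == 1:
        # For CLD_EXITED, si_status is an exit status, not a signal number.
        status_name = f"exit status {si_status}"
    else:
        try:
            status_name = signal.Signals(si_status).name
        except ValueError:
            status_name = f"unknown signal {si_status}"
    return code_name, status_name


if __name__ == "__main__":
    # Values taken from the log line: si_code=4 si_status=4
    print(decode_wait_info(4, 4))
```

Under these assumptions, `si_code=4` reads as a traced child that trapped and `si_status=4` as signal number 4, which is `SIGILL` on Linux.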

The process running in the container installs a custom signal handler for nearly all signals, including `SIGILL`. After CRIU injects the parasite blob into the process, will the original signal handler still be triggered? I am asking because, if so, I could set a breakpoint there with a hardware debugger (I am working with an ARM SoC) to look at the stack trace and determine what caused the `SIGILL`.
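To confirm which signals the target process actually catches before a dump, the `SigCgt` mask in `/proc/<pid>/status` can be inspected. The following is a minimal sketch, assuming a Linux procfs; `caught_signals` and `read_sigcgt` are hypothetical helper names used only for illustration.

```python
import signal


def caught_signals(mask_hex):
    """Translate a SigCgt hex mask into a list of signal names.

    Bit (signo - 1) of the mask is set when a handler is installed
    for signal number signo.
    """
    mask = int(mask_hex, 16)
    names = []
    for signo in range(1, 65):
        if mask & (1 << (signo - 1)):
            try:
                names.append(signal.Signals(signo).name)
            except ValueError:
                # Real-time or otherwise unnamed signal number.
                names.append(f"SIG{signo}")
    return names


def read_sigcgt(pid):
    """Return the raw SigCgt hex string for the given pid (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("SigCgt:"):
                return line.split()[1]
    raise RuntimeError("SigCgt line not found")


if __name__ == "__main__":
    import os

    # Demonstration on the current process; replace with the container PID.
    print(caught_signals(read_sigcgt(os.getpid())))
```

If `SIGILL` appears in the output for the container process, the handler is registered at that moment; whether it is reached while the parasite is running is a separate question.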

Do you have any other suggestions or insights on how I can proceed to debug and resolve this problem?

Thanks in advance and regards
gspecht478
