Skip to content

src/update_handler: ignore mount error before running slot hook #1591

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gportay
Copy link
Contributor

@gportay gportay commented Jan 11, 2025

The handler mounts the other-slot filesystem, and it returns in error if it fails to mount it.

The filesystem may be in bad health, or may be not yet made, and thus, it causes the handler to fail.

As a consequence, it results in a slot that is unupdatable until the filesystem is remade manually (i.e. externally).

This ignores the mount error before the slot hook is run, and leaves the hook to decide whether or not to fail.


Hello,

This PR intends to fix an issue if installing a bundle with a pre-install slot hook, on the device slot A system, with the slot B partitions created, but without the filesystems made yet.

The handler function img_to_raw_handler() runs the pre_install hook and fails because the filesystem is not made yet. I will be made copy right after by the function write_image_to_dev().

(Well, I supposed it comes from that handler, I have not checked it).

With this PR, I am suggesting to ignore mount failure in function mount_and_run_slot_hook(). This fixes the issue I am facing; however, I guess it may have some side-effects and I am not convinced yet.

What are you opinions?

The issue can be reproduced easily:

  • generate a bundle with a pre-hook,
  • corrupt the other filesystem slot (dd if=/dev/zero of=/dev/other-slot), and
  • install the bundle (rauc install bundle.raucb).

Regards,
Gaël


Here are the failing logs on 1.12:

# journalctl -u rauc | cat
Oct 08 15:42:35 buildroot systemd[1]: Starting RAUC Update Service...
Oct 08 15:42:35 buildroot rauc[305]: Using central status file /boot/rauc/central.raucs
Oct 08 15:42:35 buildroot rauc[305]: Failed to load system status: No such file or directory
Oct 08 15:42:35 buildroot rauc[305]: Getting Systeminfo: /usr/lib/raspberrypi-firmware-rauc-bootloader-backend/system-info
Oct 08 15:42:35 buildroot rauc[305]: Booted into rootfs.0 (ROOTFS-A)
Oct 08 15:42:35 buildroot systemd[1]: Started RAUC Update Service.
Oct 08 15:42:36 buildroot rauc[305]: Marked slot rootfs.0 as good
Oct 08 15:42:36 buildroot rauc[305]: rauc mark: marked slot rootfs.0 as good
Jan 10 16:54:53 buildroot rauc[305]: input bundle: /tmp/buildroot.raucb
Jan 10 16:54:53 buildroot rauc[305]: Active slot bootname: /dev/mmcblk0p5
Jan 10 16:54:53 buildroot rauc[305]: installing /tmp/buildroot.raucb: started
Jan 10 16:54:53 buildroot rauc[305]: Installation 0f629fe9 started
Jan 10 16:54:53 buildroot rauc[305]: installing /tmp/buildroot.raucb: Checking and mounting bundle...
Jan 10 16:54:53 buildroot rauc[305]: Reading bundle: /tmp/buildroot.raucb
Jan 10 16:54:53 buildroot rauc[305]: Detected CRL but CRL checking is disabled!
Jan 10 16:54:53 buildroot rauc[305]: Verifying bundle signature... 
Jan 10 16:54:54 buildroot rauc[305]: Verified detached signature by 'O = Test Org, CN = Test Org Release-1'
Jan 10 16:54:54 buildroot rauc[305]: Mounting bundle '/tmp/buildroot.raucb' to '/run/rauc/bundle'
Jan 10 16:54:54 buildroot rauc[305]: Configured loop device '/dev/loop0' for 51494912 bytes
Jan 10 16:54:54 buildroot rauc[305]: Checking image type for slot type: ext4
Jan 10 16:54:54 buildroot rauc[305]: Image detected as type: *.ext4
Jan 10 16:54:54 buildroot rauc[305]: Checking image type for slot type: vfat
Jan 10 16:54:54 buildroot rauc[305]: Image detected as type: *.vfat
Jan 10 16:54:54 buildroot rauc[305]: Marking target slot rootfs.1 as non-bootable...
Jan 10 16:54:54 buildroot rauc[305]: Marked slot rootfs.1 as bad
Jan 10 16:54:54 buildroot rauc[305]: installing /tmp/buildroot.raucb: Updating slots...
Jan 10 16:54:54 buildroot rauc[305]: installing /tmp/buildroot.raucb: Checking slot rootfs.1
Jan 10 16:54:54 buildroot rauc[305]: Updating slot rootfs.1
Jan 10 16:54:54 buildroot rauc[305]: installing /tmp/buildroot.raucb: Updating slot rootfs.1
Jan 10 16:54:54 buildroot rauc[305]: Updating /dev/mmcblk0p6 with /run/rauc/bundle/rootfs.ext4
Jan 10 16:54:54 buildroot rauc[305]: Mounting slot /dev/mmcblk0p6
Jan 10 16:54:54 buildroot rauc[366]: mount: /run/rauc/rootfs.1: wrong fs type, bad option, bad superblock on /dev/mmcblk0p6, missing codepage or helper program, or other error.
Jan 10 16:54:54 buildroot rauc[366]:        dmesg(1) may have more information after failed mount system call.
Jan 10 16:54:54 buildroot rauc[305]: Updating slot rootfs.1 status
Jan 10 16:54:54 buildroot rauc[305]: Installation 0f629fe9 failed: Installation error: Failed updating slot rootfs.1: failed to mount slot: failed to run mount: Child process exited with code 32
Jan 10 16:54:54 buildroot rauc[305]: Installation error: Failed updating slot rootfs.1: failed to mount slot: failed to run mount: Child process exited with code 32
Jan 10 16:54:54 buildroot rauc[305]: installing /tmp/buildroot.raucb: Installation error: Failed updating slot rootfs.1: failed to mount slot: failed to run mount: Child process exited with code 32
Jan 10 16:54:54 buildroot rauc[305]: installing /tmp/buildroot.raucb: finished
Jan 10 16:54:54 buildroot rauc[305]: installing `/tmp/buildroot.raucb` failed: 1

Here are the logs on 1.12 + PR:

# journalctl -u rauc | cat
Oct 08 15:42:35 buildroot systemd[1]: Starting RAUC Update Service...
Oct 08 15:42:35 buildroot rauc[303]: Using central status file /boot/rauc/central.raucs
Oct 08 15:42:35 buildroot rauc[303]: Failed to load system status: No such file or directory
Oct 08 15:42:35 buildroot rauc[303]: Getting Systeminfo: /usr/lib/raspberrypi-firmware-rauc-bootloader-backend/system-info
Oct 08 15:42:35 buildroot rauc[303]: Booted into rootfs.0 (ROOTFS-A)
Oct 08 15:42:35 buildroot systemd[1]: Started RAUC Update Service.
Oct 08 15:42:36 buildroot rauc[303]: Marked slot rootfs.0 as good
Oct 08 15:42:36 buildroot rauc[303]: rauc mark: marked slot rootfs.0 as good
Jan 10 16:58:04 buildroot rauc[303]: input bundle: /tmp/buildroot.raucb
Jan 10 16:58:04 buildroot rauc[303]: Active slot bootname: /dev/mmcblk0p5
Jan 10 16:58:04 buildroot rauc[303]: installing /tmp/buildroot.raucb: started
Jan 10 16:58:04 buildroot rauc[303]: Installation 8c72cc7c started
Jan 10 16:58:04 buildroot rauc[303]: installing /tmp/buildroot.raucb: Checking and mounting bundle...
Jan 10 16:58:04 buildroot rauc[303]: Reading bundle: /tmp/buildroot.raucb
Jan 10 16:58:04 buildroot rauc[303]: Detected CRL but CRL checking is disabled!
Jan 10 16:58:04 buildroot rauc[303]: Verifying bundle signature... 
Jan 10 16:58:05 buildroot rauc[303]: Verified detached signature by 'O = Test Org, CN = Test Org Release-1'
Jan 10 16:58:05 buildroot rauc[303]: Mounting bundle '/tmp/buildroot.raucb' to '/run/rauc/bundle'
Jan 10 16:58:05 buildroot rauc[303]: Configured loop device '/dev/loop0' for 51494912 bytes
Jan 10 16:58:05 buildroot rauc[303]: Checking image type for slot type: ext4
Jan 10 16:58:05 buildroot rauc[303]: Image detected as type: *.ext4
Jan 10 16:58:05 buildroot rauc[303]: Checking image type for slot type: vfat
Jan 10 16:58:05 buildroot rauc[303]: Image detected as type: *.vfat
Jan 10 16:58:05 buildroot rauc[303]: Marking target slot rootfs.1 as non-bootable...
Jan 10 16:58:05 buildroot rauc[303]: Marked slot rootfs.1 as bad
Jan 10 16:58:05 buildroot rauc[303]: installing /tmp/buildroot.raucb: Updating slots...
Jan 10 16:58:05 buildroot rauc[303]: installing /tmp/buildroot.raucb: Checking slot rootfs.1
Jan 10 16:58:05 buildroot rauc[303]: Updating slot rootfs.1
Jan 10 16:58:05 buildroot rauc[303]: Updating /dev/mmcblk0p6 with /run/rauc/bundle/rootfs.ext4
Jan 10 16:58:05 buildroot rauc[303]: Mounting slot /dev/mmcblk0p6
Jan 10 16:58:05 buildroot rauc[303]: installing /tmp/buildroot.raucb: Updating slot rootfs.1
Jan 10 16:58:05 buildroot rauc[363]: mount: /run/rauc/rootfs.1: wrong fs type, bad option, bad superblock on /dev/mmcblk0p6, missing codepage or helper program, or other error.
Jan 10 16:58:05 buildroot rauc[363]:        dmesg(1) may have more information after failed mount system call.
Jan 10 16:58:05 buildroot rauc[303]: Ignoring mount error before slot hook error: failed to mount slot: failed to run mount: Child process exited with code 32
Jan 10 16:58:05 buildroot rauc[303]: Running slot 'slot-pre-install' hook for rootfs.1
Jan 10 16:58:05 buildroot rauc[303]: Running slot hook 'slot-pre-install' for rootfs.1
Jan 10 16:58:05 buildroot rauc[303]: opening slot device /dev/mmcblk0p6
Jan 10 16:58:05 buildroot rauc[303]: writing data to device /dev/mmcblk0p6
Jan 10 16:58:23 buildroot rauc[303]: Updating slot rootfs.1 status
Jan 10 16:58:23 buildroot rauc[303]: installing /tmp/buildroot.raucb: Updating slot rootfs.1 done
Jan 10 16:58:23 buildroot rauc[303]: installing /tmp/buildroot.raucb: Checking slot firmware.1
Jan 10 16:58:23 buildroot rauc[303]: Updating slot firmware.1
Jan 10 16:58:23 buildroot rauc[303]: Updating /dev/mmcblk0p3 with /run/rauc/bundle/firmwarefs.vfat
Jan 10 16:58:23 buildroot rauc[303]: opening slot device /dev/mmcblk0p3
Jan 10 16:58:23 buildroot rauc[303]: writing data to device /dev/mmcblk0p3
Jan 10 16:58:23 buildroot rauc[303]: installing /tmp/buildroot.raucb: Updating slot firmware.1
Jan 10 16:58:27 buildroot rauc[303]: Mounting slot /dev/mmcblk0p3
Jan 10 16:58:27 buildroot rauc[303]: Running slot 'slot-post-install' hook for firmware.1
Jan 10 16:58:27 buildroot rauc[303]: Running slot hook 'slot-post-install' for firmware.1
Jan 10 16:58:27 buildroot rauc[303]: Unmounting slot /dev/mmcblk0p3
Jan 10 16:58:27 buildroot rauc[303]: Updating slot firmware.1 status
Jan 10 16:58:27 buildroot rauc[303]: installing /tmp/buildroot.raucb: Updating slot firmware.1 done
Jan 10 16:58:27 buildroot rauc[303]: Marking target slot rootfs.1 as bootable...
Jan 10 16:58:27 buildroot rauc[303]: installing /tmp/buildroot.raucb: All slots updated
Jan 10 16:58:27 buildroot rauc[377]: 0x0000001c 0x80000000 0x00038064 0x00000004 0x80000004 0x00000000 0x00000000
Jan 10 16:58:27 buildroot rauc[303]: Marked slot rootfs.1 as active
Jan 10 16:58:27 buildroot rauc[303]: Installation 8c72cc7c succeeded
Jan 10 16:58:27 buildroot rauc[303]: installing /tmp/buildroot.raucb: finished
Jan 10 16:58:27 buildroot rauc[303]: installing `/tmp/buildroot.raucb` succeeded

Note: I have tested it with that custom bootloader backend, and that pre-install slot hook. The bundle is generated by that genimage config file.

Copy link

codecov bot commented Jan 11, 2025

Codecov Report

Attention: Patch coverage is 25.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 84.39%. Comparing base (52a8d00) to head (642cc35).
Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
src/update_handler.c 25.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1591      +/-   ##
==========================================
- Coverage   84.42%   84.39%   -0.03%     
==========================================
  Files          69       69              
  Lines       21666    21663       -3     
==========================================
- Hits        18291    18282       -9     
- Misses       3375     3381       +6     
Flag Coverage Δ
service=false 81.10% <25.00%> (-0.04%) ⬇️
service=true 84.31% <25.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

The handler mounts the other-slot filesystem, and it returns in error if
it fails to mount it.

The filesystem may be in bad health, or may be not yet made, and thus,
it causes the handler to fail.

As a consequence, it results in a slot that is unupdatable until the
filesystem is remade manually (i.e. externally).

This ignores the mount error before the slot hook is run, and leaves the
hook to decide whether or not to fail.

Signed-off-by: Gaël PORTAY <gael.portay@rtone.fr>
@gportay gportay force-pushed the ignore-mount-error-before-running-slot-hook branch from 5fdf833 to 642cc35 Compare January 17, 2025 04:03
@ejoerns
Copy link
Member

ejoerns commented Jan 17, 2025

Fixes: #276

@gportay Thank you for addressing this! The discussion on the original issue halted back in 2018, but re-reading my comment I would stick with what I said back then (and your solution is one of those proposed there).

I would also clarify in the docs that a pre-install hook (on a mountable fs slot type) must not assume the slot is mounted/readable but must explicitly check and handle this itself.

@ejoerns ejoerns requested a review from jluebbe January 17, 2025 07:12
@gportay
Copy link
Contributor Author

gportay commented Jan 17, 2025

I would also clarify in the docs that a pre-install hook (on a mountable fs slot type) must not assume the slot is mounted/readable but must explicitly check and handle this itself.

I am thinking about adding a flag in parameter to the function mount_and_run_slot_hook(), so it returns in error or not if the filesystem is mounted or not. This way, we can behave differently if running a pre-install or post-install slot hook.

But, my opinion is to defer the error handling to the script for now and not adding any flags... this introduces a "behavior" change for bundles installing a blob that is not a filesystem (let's say dm-verity hashes) using a post-install slot hook (RAUC would no fail anymore). I wonder if it makes sense to control such things from the system.conf or the manifest.raucm.

@jluebbe jluebbe requested a review from ejoerns January 20, 2025 15:11
@ejoerns ejoerns added this to the Release v1.13 milestone Jan 20, 2025
@ejoerns
Copy link
Member

ejoerns commented Jan 20, 2025

I would also clarify in the docs that a pre-install hook (on a mountable fs slot type) must not assume the slot is mounted/readable but must explicitly check and handle this itself.

I am thinking about adding a flag in parameter to the function mount_and_run_slot_hook(), so it returns in error or not if the filesystem is mounted or not. This way, we can behave differently if running a pre-install or post-install slot hook.

You could just check for the hook_cmd being R_SLOT_HOOK_PRE_INSTALL and ignore the warning for this case only.

But, my opinion is to defer the error handling to the script for now and not adding any flags... this introduces a "behavior" change for bundles installing a blob that is not a filesystem (let's say dm-verity hashes) using a post-install slot hook (RAUC would no fail anymore). I wonder if it makes sense to control such things from the system.conf or the manifest.raucm.

When using a file system type, we must assume that the installed blob is a file system. Otherwise, a different slot type, like raw moust be used.

g_message("Mounting slot %s", slot->device);
res = r_mount_slot(slot, &ierror);
if (!res) {
g_propagate_error(error, ierror);
goto out;
g_warning("Ignoring mount error before slot hook error: %s", ierror->message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message doesn't make sense here. I guess it is copied and modified from below?

Please also ensure that this applies to pre-install handlers only as mentioned in the other comment.

@jluebbe
Copy link
Member

jluebbe commented Jan 21, 2025

In general, we need to allow mount failures before the installation, as the slot can be un-mountable for valid reasons.

The way this PR implements it would lead to cases where RAUC_SLOT_MOUNT_POINT is unset in the hook script. That could cause it to do something else by mistake (e.g. rm -rf $RAUC_SLOT_MOUNT_POINT/bin would remove /bin).

I'm not sure how widespread this issue is or what the best way to handle it would be. We could:

  • Pass $RAUC_MOUNT_PREFIX/mount-error-$SLOT_NAME in RAUC_SLOT_MOUNT_POINT, but create an immutable file instead. That would (probably?) lead to errors in the script if it tries to access the mount point as a directory.
  • Use a different hook name for scripts that claim to handle mount errors correctly.
  • Add a flag in the manifest to declare that mount errors are handled correctly. It could be required only if a pre-install hook was used. Older RAUC versions would reject those bundles, though, making it hard to deploy.
  • Add a flag to the system conf, which allows executing hooks without mounted filesystems. It would be the responsibility of the user to set an appropriate min-bundle-version so that only bundles with correctly implemented hooks can be installed.

@gportay @ejoerns Do you see other options? Do you have preferences?

@gportay
Copy link
Contributor Author

gportay commented Jan 21, 2025

The way this PR implements it would lead to cases where RAUC_SLOT_MOUNT_POINT is unset in the hook script. That could cause it to do something else by mistake (e.g. rm -rf $RAUC_SLOT_MOUNT_POINT/bin would remove /bin).

Oh that is a good catch!

I'm not sure how widespread this issue is or what the best way to handle it would be. We could:

* Pass `$RAUC_MOUNT_PREFIX/mount-error-$SLOT_NAME` in RAUC_SLOT_MOUNT_POINT, but create an immutable file instead. That would (probably?) lead to errors in the script if it tries to access the mount point as a directory.

* Use a different hook name for scripts that claim to handle mount errors correctly.

* Add a flag in the manifest to declare that mount errors are handled correctly. It could be required only if a pre-install hook was used. Older RAUC versions would reject those bundles, though, making it hard to deploy.

* Add a flag to the system conf, which allows executing hooks without mounted filesystems. It would be the responsibility of the user to set an appropriate `min-bundle-version` so that only bundles with correctly implemented hooks can be installed.

@gportay @ejoerns Do you see other options? Do you have preferences?

Well, I have no preference for now.

EDIT; I would hesitate between 2 and 4.

@ejoerns
Copy link
Member

ejoerns commented Jan 21, 2025

The way this PR implements it would lead to cases where RAUC_SLOT_MOUNT_POINT is unset in the hook script. That could cause it to do something else by mistake (e.g. rm -rf $RAUC_SLOT_MOUNT_POINT/bin would remove /bin).

Oh that is a good catch!

The documentation for RAUC_SLOT_MOUNT_POINT says:

For mountable slots, i.e. those with a file system type, RAUC will attempt to automatically mount the slot if a pre-install or post-install hook is given and provide the slot’s current mount point under this env variable.

Thus one could also argue that this variable always needs to be checked and that the change now finally reflects the documented behavior (because "attempt to automatically mount" does not mean it will ensure it is mounted).

The hook documentation itself is a bit more specific here:

For target slot types that represent a mountable file system, the hook will be executed with the target slots’ file system mounted. Note that a broken or unformatted target slot will currently cause the installation to be aborted with an error.

But it does not say something about the variable, though.

I also would not know of good use cases where a pre-install hook needs access to the target partition. The way more common use case would be to use this in a post-install hook, of course.

Anyway, if we can prevent potential issues with this, we should probably do so.

I'm not sure how widespread this issue is or what the best way to handle it would be. We could:

* Pass `$RAUC_MOUNT_PREFIX/mount-error-$SLOT_NAME` in RAUC_SLOT_MOUNT_POINT, but create an immutable file instead. That would (probably?) lead to errors in the script if it tries to access the mount point as a directory.

* Use a different hook name for scripts that claim to handle mount errors correctly.

* Add a flag in the manifest to declare that mount errors are handled correctly. It could be required only if a pre-install hook was used. Older RAUC versions would reject those bundles, though, making it hard to deploy.

* Add a flag to the system conf, which allows executing hooks without mounted filesystems. It would be the responsibility of the user to set an appropriate `min-bundle-version` so that only bundles with correctly implemented hooks can be installed.

@gportay @ejoerns Do you see other options? Do you have preferences?

Well, I have no preference for now.

EDIT; I would hesitate between 2 and 4.

I would like to introduce a new flag only if necessary.

Based on the documentation, we could maybe implement the 'immutable file' approach and update the documentation or use a distinct hook name. Both add complexity, though..

@jluebbe
Copy link
Member

jluebbe commented Jan 22, 2025

I also would not know of good use cases where a pre-install hook needs access to the target partition. The way more common use case would be to use this in a post-install hook, of course.

A pre-install hook could be used to copy some data that needs to be restored later. I wouldn't count that as a good use case, but it's certainly currently possible, so there's probably someone doing that.

@ejoerns ejoerns modified the milestones: Release v1.13, Release v1.14 Jan 22, 2025
@ejoerns ejoerns modified the milestones: Release v1.14, Release v1.15 Apr 2, 2025
@gportay
Copy link
Contributor Author

gportay commented Apr 15, 2025

@jluebbe, @ejoerns, do you think this can be merged? Or should we implement a flag somewhere?

I do not mind implementing a flag, but I would prefer not to.

For now, if I am correct, the update fails because the filesystem is not mountable.

With this patch, the update would succeed but, the pre-install hook may use the variable $RAUC_MOUNT_PREFIX incorrectly because that variable is empty.

I tend to think the maintainer of the pre-install hook should check for the $RAUC_MOUNT_PREFIX before using it.

However, as I said, I do not mind implementing a flag to avoid people doing unexpected and unrecoverable things such as rm -Rf "$RAUC_MOUNT_PREFIX/".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants