Closed
Description
Motivation
- We need this to ensure that accelerator allocations stay atomic from the hardware's perspective.
Required Features
- Add a new field under the `scaling_group.scheduler_opts` JSON column indicating whether to allow fragmentation of accelerators that support fractional shares
  - This should be an opt-in feature
- Check the new field during the agent-side accelerator scheduling process, and cancel kernel creation if the request cannot be satisfied without fragmenting resources
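
The agent-side check described above could look roughly like the following. This is a minimal sketch, not Backend.AI's actual allocator: the function `allocate_fractional`, both exception classes, and the `allow_fragmentation` parameter are hypothetical names introduced for illustration, and "fragmentation" is assumed to mean that a single request has to be split across more than one device. How the flag is read from `scaling_group.scheduler_opts` is left out on purpose.

```python
from decimal import Decimal
from typing import Mapping


class InsufficientResourceError(Exception):
    """Hypothetical: the devices do not have enough free share in total."""


class FragmentedAllocationError(Exception):
    """Hypothetical: the request fits only if split across devices."""


def allocate_fractional(
    free_shares: Mapping[str, Decimal],
    requested: Decimal,
    allow_fragmentation: bool,
) -> dict[str, Decimal]:
    """Return a {device_id: share} allocation for one fractional request."""
    # 1. Prefer a single device: pick the one with the least free share
    #    that can still hold the whole request (best fit).
    candidates = [
        (free, dev) for dev, free in free_shares.items() if free >= requested
    ]
    if candidates:
        _, dev = min(candidates)
        return {dev: requested}

    # 2. No single device fits. Fail early if the total is too small.
    if sum(free_shares.values(), Decimal(0)) < requested:
        raise InsufficientResourceError("not enough free accelerator share")

    # 3. The request fits only when split. Cancel unless the scaling
    #    group allows fragmentation.
    if not allow_fragmentation:
        raise FragmentedAllocationError(
            "request would be split across devices; fragmentation is not allowed"
        )

    # 4. Fragmentation allowed: fill greedily from the emptiest devices.
    allocation: dict[str, Decimal] = {}
    remaining = requested
    for dev, free in sorted(
        free_shares.items(), key=lambda kv: kv[1], reverse=True
    ):
        if remaining <= 0:
            break
        take = min(free, remaining)
        if take > 0:
            allocation[dev] = take
            remaining -= take
    return allocation
```

In this sketch the caller would catch `FragmentedAllocationError` and cancel kernel creation, while `InsufficientResourceError` stays a separate failure so the two cases can be reported differently.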
Impact
- Administrators will be able to choose, per scaling group, whether to allow fragmentation
Testing Scenarios
- Create a Backend.AI cluster consisting of multiple mock accelerators
- Try to create a session whose accelerator allocation would have to be fragmented
- Verify that kernel creation is blocked when the new option is enabled