Qwen3Moe(ForCausalLM) does not respect UNSLOTH_RETURN_HIDDEN_STATES when loss and labels are given.

Newest versions of everything, Python 3.11, CUDA 12.6.

The reason can be found in the patched code:

```python
...
elif self.loss_function.__name__.endswith("ForCausalLMLoss") and labels is not None:
    ...
    # ========= OLD non fused =========
    # logits = self.lm_head(hidden_states[:, slice_indices, :].to(lm_head_weight.device))
else:
    logits = self.lm_head(hidden_states[:, slice_indices, :])
...
```
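
For anyone who wants to confirm this on their side, here is a rough shape check. It is only a sketch and assumes that, when the flag is honored, the hidden states come back in the `logits` field, so the last dimension would equal `hidden_size` instead of `vocab_size`. `classify_output` is a hypothetical helper written for this report, not part of Unsloth or transformers.

```python
from typing import Sequence


def classify_output(shape: Sequence[int], hidden_size: int, vocab_size: int) -> str:
    """Guess what a returned tensor holds from its last dimension.

    Hypothetical helper: assumes hidden states end in `hidden_size`
    and real logits end in `vocab_size`.
    """
    if len(shape) == 0:
        # Scalar or empty placeholder, nothing to compare against.
        return "unknown"
    if hidden_size == vocab_size:
        # The two cases cannot be told apart by shape alone.
        return "ambiguous"
    last_dim = shape[-1]
    if last_dim == hidden_size:
        return "hidden_states"
    if last_dim == vocab_size:
        return "logits"
    return "unknown"


if __name__ == "__main__":
    # Illustrative sizes only, not taken from any particular checkpoint.
    print(classify_output((1, 16, 2048), hidden_size=2048, vocab_size=150000))
    print(classify_output((1, 16, 150000), hidden_size=2048, vocab_size=150000))
```

With a loaded model this could be called as `classify_output(tuple(out.logits.shape), model.config.hidden_size, model.config.vocab_size)`, once with `labels` and once without, to compare the two paths.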