Description
- Did you update? pip install --upgrade unsloth unsloth_zoo ✅
- Colab or Kaggle or local / cloud -> Local
- Which notebook? Please link! -> .py file (local)
- Which Unsloth version, TRL version, transformers version, PyTorch version?
- unsloth 2025.7.4
- trl 0.19.1
- transformers 4.53.2
- torch 2.7.1+cu126
- torchaudio 2.7.1+cu126
- torchvision 0.22.1
- Which trainer? SFTTrainer, GRPOTrainer, etc. -> Not trained at all!
- Number GPUs used, use nvidia-smi -> NVIDIA GeForce RTX 4060 8GB
GPU Info:
╰─ nvidia-smi
Fri Jul 18 00:08:01 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 576.88 Driver Version: 576.88 CUDA Version: 12.9 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Driver-Model | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 4060 ... WDDM | 00000000:01:00.0 Off | N/A |
| N/A 50C P8 1W / 80W | 0MiB / 8188MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Code:
from unsloth import FastLanguageModel

max_seq_length = 2048
dtype = None
load_in_4bit = True

# MODEL_PATH is the path to the local model (defined earlier in main.py).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=MODEL_PATH,
    max_seq_length=max_seq_length,
    load_in_4bit=load_in_4bit,
    dtype=dtype,
    device_map="cuda:0",
)

model.save_pretrained("lora_model")
tokenizer.save_pretrained("lora_model")

model.save_pretrained_gguf(
    "model",
    tokenizer,
    quantization_method="q4_k_m"
)

print("Done.")
I can't build llama.cpp, so I downloaded the prebuilt version. (I had to do this; otherwise there would be another error.) As you can see, the error occurs simply by loading the local model and re-saving it as a .gguf file.
By the way, I performed fine-tuning before discovering this issue. The fine-tuning process went smoothly, and the LoRA model was saved properly. However, saving the model as a GGUF file fails.
!! IMPORTANT !!: I slightly modified the unsloth source code to print the command being executed when the error occurs. The command is:
python llama.cpp/convert_hf_to_gguf.py model --outfile F:\Projects\Python\my_venv\fine-tuning\model\unsloth.BF16.gguf --outtype bf16
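
For reference, here is a minimal sketch (assuming the same working directory as main.py, with llama.cpp/ and model/ next to it) that runs this exact command outside of Unsloth, to check whether convert_hf_to_gguf.py itself completes or whether only Unsloth's output parsing rejects it:

import subprocess

# The exact command printed by the debug line above.
command = (
    r"python llama.cpp/convert_hf_to_gguf.py model "
    r"--outfile F:\Projects\Python\my_venv\fine-tuning\model\unsloth.BF16.gguf "
    r"--outtype bf16"
)

result = subprocess.run(command, shell=True, capture_output=True, text=True)
print("return code:", result.returncode)
print(result.stdout)
print(result.stderr)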
Here is the modified source code (in F:\miniconda3\envs\my_venv\Lib\site-packages\unsloth\save.py):
...
def try_execute(commands, force_complete = False):
    for command in commands:
        print("!!!DEBUG!!!: ", command)  # <-------------------------------- HERE!!!
        with subprocess.Popen(command, shell = True, stdout = subprocess.PIPE, stderr = subprocess.STDOUT, bufsize = 1) as sp:
            for line in sp.stdout:
                line = line.decode("utf-8", errors = "replace")
                if "undefined reference" in line:
                    raise RuntimeError(f"*** Unsloth: Failed compiling llama.cpp with {line}. Please report this ASAP!")
                elif "deprecated" in line:
                    return "CMAKE"
                elif "Unknown argument" in line:
                    raise RuntimeError(f"*** Unsloth: Failed compiling llama.cpp with {line}. Please report this ASAP!")
                elif "***" in line:
                    raise RuntimeError(f"*** Unsloth: Failed compiling llama.cpp with {line}. Please report this ASAP!")
                print(line, flush = True, end = "")
            pass
            if force_complete and sp.returncode is not None and sp.returncode != 0:
                raise subprocess.CalledProcessError(sp.returncode, sp.args)
        pass
    pass
    return None
pass
...
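
Judging by the traceback in the error details below, the branch that raises is the elif "***" in line check: convert_hf_to_gguf.py prints a WARNING banner made of asterisks (WARNING:hf-to-gguf:***...), and that banner matches the "***" pattern even though it is not a compile error. Purely as an illustration of this suspicion (the narrower condition is my own assumption, not Unsloth's code), a minimal sketch of a check that would not trip on the converter's warning banner:

def is_fatal_line(line: str) -> bool:
    # Hypothetical replacement for the checks inside try_execute: treat "***"
    # as fatal only when the line is not a logger WARNING from the GGUF
    # converter, whose asterisk banner otherwise matches the "***" pattern.
    if line.startswith("WARNING:hf-to-gguf:"):
        return False
    return (
        "undefined reference" in line
        or "Unknown argument" in line
        or "***" in line
    )

print(is_fatal_line("WARNING:hf-to-gguf:" + "*" * 90))   # False
print(is_fatal_line("*** some genuine build failure"))   # True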
Error details:
╰─ python main.py
🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
F:\miniconda3\envs\my_venv\Lib\site-packages\unsloth_zoo\gradient_checkpointing.py:339: UserWarning: expandable_segments not supported on this platform (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\pytorch\c10/cuda/CUDAAllocatorConfig.h:28.)
GPU_BUFFERS = tuple([torch.empty(2*256*2048, dtype = dtype, device = f"{DEVICE_TYPE}:{i}") for i in range(n_gpus)])
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'attn_factor'}
==((====))== Unsloth 2025.7.4: Fast Qwen3 patching. Transformers: 4.53.2.
\\ /| NVIDIA GeForce RTX 4060 Laptop GPU. Num GPUs = 1. Max memory: 7.996 GB. Platform: Windows.
O^O/ \_/ \ Torch: 2.7.1+cu126. CUDA: 8.9. CUDA Toolkit: 12.6. Triton: 3.3.1
\ / Bfloat16 = TRUE. FA [Xformers = 0.0.31.post1. FA2 = False]
"-____-" Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'attn_factor'}
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'attn_factor'}
Loading checkpoint shards: 100%|████████████████████████| 2/2 [00:12<00:00, 6.13s/it]
Unsloth: ##### The current model auto adds a BOS token.
Unsloth: ##### Your chat template has a BOS token. We shall remove it temporarily.
Unsloth: Merging 4bit and LoRA weights to 16bit...
Unsloth: Will use up to 32.89 out of 63.75 RAM for saving.
Unsloth: Saving model... This might take 5 minutes ...
0%| | 0/36 [00:00<?, ?it/s]
We will save to Disk and not RAM now.
100%|█████████████████████████████████████████████████| 36/36 [00:15<00:00, 2.36it/s]
Unsloth: Saving tokenizer... Done.
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'attn_factor'}
Done.
Unsloth: Converting qwen3 model. Can use fast conversion = False.
==((====))== Unsloth: Conversion from QLoRA to GGUF information
\\ /| [0] Installing llama.cpp might take 3 minutes.
O^O/ \_/ \ [1] Converting HF to GGUF 16bits might take 3 minutes.
\ / [2] Converting GGUF 16bits to ['q4_k_m'] might take 10 minutes each.
"-____-" In total, you will have to wait at least 16 minutes.
Unsloth: Installing llama.cpp. This might take 3 minutes...
Unsloth: [1] Converting model at model into bf16 GGUF format.
The output location will be F:\Projects\Python\my_venv\fine-tuning\model\unsloth.BF16.gguf
This might take 3 minutes...
!!!DEBUG!!!: python llama.cpp/convert_hf_to_gguf.py model --outfile F:\Projects\Python\my_venv\fine-tuning\model\unsloth.BF16.gguf --outtype bf16
INFO:hf-to-gguf:Loading model: model
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'attn_factor'}
INFO:hf-to-gguf:Model architecture: Qwen3ForCausalLM
Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'attn_factor'}
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00004.safetensors'
INFO:hf-to-gguf:token_embd.weight, torch.bfloat16 --> BF16, shape = {4096, 151936}
INFO:hf-to-gguf:blk.0.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.0.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.0.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.0.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.0.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.0.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.1.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.1.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.1.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.1.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.1.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.2.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.2.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.2.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.2.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.2.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.3.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.3.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.3.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.3.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.3.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.4.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.4.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.4.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.4.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.4.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.5.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.5.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.5.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.5.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.5.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.6.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.6.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.6.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.6.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.6.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.7.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.7.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.7.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.7.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.7.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.8.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.8.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.8.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.8.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.8.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.9.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00002-of-00004.safetensors'
INFO:hf-to-gguf:blk.10.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.10.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.10.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.10.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.10.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.10.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.10.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.10.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.10.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.11.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.11.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.11.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.11.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.11.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.11.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.11.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.12.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.12.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.12.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.12.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.12.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.13.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.13.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.13.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.13.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.13.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.14.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.14.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.14.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.14.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.14.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.15.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.15.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.15.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.15.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.15.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.16.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.16.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.16.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.16.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.16.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.17.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.17.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.17.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.17.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.17.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.18.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.18.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.18.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.18.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.18.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.19.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.19.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.19.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.19.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.19.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.20.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.20.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.20.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.20.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.20.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.20.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.20.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.21.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.21.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.21.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.21.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.21.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.21.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.21.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.22.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.22.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.9.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.9.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.9.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:gguf: loading model part 'model-00003-of-00004.safetensors'
INFO:hf-to-gguf:blk.22.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.22.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.22.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.22.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.22.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.23.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.23.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.23.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.23.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.23.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.23.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.23.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.24.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.24.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.24.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.24.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.24.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.25.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.25.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.25.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.25.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.25.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.26.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.26.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.26.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.26.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.26.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.27.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.27.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.27.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.27.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.27.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.28.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.28.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.28.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.28.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.28.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.29.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.29.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.29.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.29.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.29.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.30.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.30.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.30.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.30.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.30.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.31.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.31.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.31.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.31.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.31.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.31.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.31.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.32.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.32.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.32.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.32.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.32.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.32.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.32.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.32.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.32.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.32.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.32.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.33.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.33.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.33.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.33.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.33.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.33.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.33.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.33.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.33.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.33.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.33.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.34.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.34.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.34.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.34.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.34.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.34.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.34.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.34.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.34.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.34.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.34.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.35.attn_k.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.35.attn_q.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.35.attn_v.weight, torch.bfloat16 --> BF16, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00004-of-00004.safetensors'
INFO:hf-to-gguf:output.weight, torch.bfloat16 --> BF16, shape = {4096, 151936}
INFO:hf-to-gguf:blk.35.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.35.ffn_down.weight, torch.bfloat16 --> BF16, shape = {12288, 4096}
INFO:hf-to-gguf:blk.35.ffn_gate.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.35.ffn_up.weight, torch.bfloat16 --> BF16, shape = {4096, 12288}
INFO:hf-to-gguf:blk.35.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.35.attn_k_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:blk.35.attn_output.weight, torch.bfloat16 --> BF16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.35.attn_q_norm.weight, torch.bfloat16 --> F32, shape = {128}
INFO:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 131072
INFO:hf-to-gguf:gguf: embedding length = 4096
INFO:hf-to-gguf:gguf: feed forward length = 12288
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 1000000
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-06
INFO:hf-to-gguf:gguf: file type = 32
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
WARNING:hf-to-gguf:
Traceback (most recent call last):
File "F:\Projects\Python\my_venv\fine-tuning\main.py", line 134, in <module>
model.save_pretrained_gguf(
File "F:\miniconda3\envs\my_venv\Lib\site-packages\unsloth\save.py", line 1866, in unsloth_save_pretrained_gguf
all_file_locations, want_full_precision = save_to_gguf(
^^^^^^^^^^^^^
File "F:\miniconda3\envs\my_venv\Lib\site-packages\unsloth\save.py", line 1204, in save_to_gguf
try_execute([command,], force_complete = True)
File "F:\miniconda3\envs\my_venv\Lib\site-packages\unsloth\save.py", line 826, in try_execute
raise RuntimeError(f"*** Unsloth: Failed compiling llama.cpp with {line}. Please report this ASAP!")
RuntimeError: *** Unsloth: Failed compiling llama.cpp with WARNING:hf-to-gguf:**************************************************************************************
. Please report this ASAP!