Cannot continue training from last checkpoint #188

@chunping-xt

After a first finetune from your checkpoint on ~30 hours of data (epoch=6), I tried inference, but the output didn't sound anything like speech. I was going to train for a few more epochs to see if that improved things, but I got an error when resuming from my last checkpoint.

Inference with the last checkpoint (ckpt_0035000.pt):

from IPython.display import Audio, display
from fam.llm.fast_inference import TTS

tts = TTS(first_stage_path = '/mnt/f/ckpt_0035000.pt')
wav_file = tts.synthesise( text, spk_ref_path="/mnt/f/sample.mp3" )
display(Audio(wav_file, autoplay=True)) # bad result

Continuing training from the last checkpoint:

!python fam/llm/finetune.py \
--train '/mnt/f/train.csv' --val '/mnt/f/eval.csv' \
--ckpt '/mnt/f/ckpt_0035000.pt' \
--spk-emb-ckpt '/mnt/f/metavoice-1B-v0.1/speaker_encoder.pt'

... error...:
/usr/local/envs/env_metavoice/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
/usr/local/envs/env_metavoice/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Training: 90%|███████████████████████████████▎ | 35000/39060 [00:00<?, ?it/s]Before layer freezing trainable_count(model)=1243191296...
After freezing excl. last 1 transformer blocks: trainable_count(model)=51386368...
Traceback (most recent call last):
File "/mnt/f/repo_metavoice-src/fam/llm/finetune.py", line 387, in <module>
main()
File "/usr/local/envs/env_metavoice/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/usr/local/envs/env_metavoice/lib/python3.10/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/usr/local/envs/env_metavoice/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/envs/env_metavoice/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/mnt/f/repo_metavoice-src/fam/llm/finetune.py", line 263, in main
finetune_jobid = hash_dictionary(properties)
File "/mnt/f/repo_metavoice-src/fam/llm/utils.py", line 97, in hash_dictionary
serialized = json.dumps(d, sort_keys=True)
File "/usr/local/envs/env_metavoice/lib/python3.10/json/__init__.py", line 238, in dumps
**kw).encode(obj)
File "/usr/local/envs/env_metavoice/lib/python3.10/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/local/envs/env_metavoice/lib/python3.10/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/usr/local/envs/env_metavoice/lib/python3.10/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type PosixPath is not JSON serializable
...
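
For reference, the traceback shows `hash_dictionary` calling `json.dumps(d, sort_keys=True)` on a dictionary that contains a `PosixPath`, which the standard `json` encoder cannot serialize. Below is a minimal sketch of a possible workaround, not the repository's actual code: `hash_dictionary_safe`, the contents of `properties`, and the use of SHA-256 are all assumptions made for illustration. It passes `default=str` so that path objects are serialized as plain strings.

```python
import hashlib
import json
from pathlib import PurePosixPath


def hash_dictionary_safe(d: dict) -> str:
    """Hypothetical stand-in for hash_dictionary; not the repository's code."""
    # default=str is called only for objects json cannot encode natively,
    # so path values are serialized as their string form.
    serialized = json.dumps(d, sort_keys=True, default=str)
    # SHA-256 is an assumption; the traceback does not show which hash is used.
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Illustrative dictionary only; the real `properties` built in finetune.py is not shown.
# PurePosixPath is used so the sketch behaves the same on any operating system.
properties = {"ckpt": PurePosixPath("/mnt/f/ckpt_0035000.pt"), "epochs": 6}

# Without default=str, json.dumps raises TypeError for the path value.
try:
    json.dumps(properties, sort_keys=True)
    raised = False
except TypeError:
    raised = True

job_id = hash_dictionary_safe(properties)
print(raised, job_id)
```

An alternative that would avoid touching the helper is to convert the checkpoint path to `str` before it is placed in the dictionary; which of the two fits the codebase better is a judgment call for the maintainers.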
