Hey Malik,
I am trying to use your model in my work and was wondering which hyperparameters you used in your experiments. In the paper you say that you use 1000 iterations and, for the Adam optimizer, the suggested settings, so I assumed the learning rate was the PyTorch default of 1e-3. However, in your tim.py code, under config(), you use lr = 1e-4. Is that the correct value?
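
To make the discrepancy concrete, here is a minimal sketch of the two settings as I understand them. The parameter list is a hypothetical placeholder rather than your actual model, and I am assuming the optimizer is built with torch.optim.Adam:

```python
import torch

# Hypothetical stand-in for the model's trainable parameters (assumption).
params = [torch.nn.Parameter(torch.zeros(3))]

# Adam with no explicit learning rate falls back to the PyTorch default.
default_opt = torch.optim.Adam(params)

# Adam with the learning rate I found under config() in tim.py.
config_opt = torch.optim.Adam(params, lr=1e-4)

print("PyTorch default lr:", default_opt.defaults["lr"])
print("lr from config():  ", config_opt.defaults["lr"])
```

The two runs would differ by a factor of 10 in step size, so I would like to know which one matches the reported results.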