backward(): Trying to backward through the graph a second time (...). #604
-
Hi! My name is Alex and I'm learning how to do backprop through FGO. My current task is to train a NN model to predict odometry measurements in an end-to-end manner. I have already implemented a simple example with 1D synthetic data. First I was getting an error at the mse_loss.backward() call: "Trying to backward through the graph a second time (...)".
So, I set retain_graph=True for my mse_loss.backward() call, and training started and converged.
But when I started adding batches, a new error popped up during the second epoch:
one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [30, 1]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
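For context, here is a tiny standalone snippet (nothing to do with my notebook, just an illustration of what PyTorch is complaining about): the error fires when a tensor that autograd saved for the backward pass is later modified in place, so its version counter no longer matches the saved one.

import torch

x = torch.ones(3, requires_grad=True)
y = x * 2
loss = (y * y).sum()  # autograd saves y (version 0) to compute the gradient of y * y
y.add_(1)             # in-place update bumps y's version counter to 1
loss.backward()       # RuntimeError: ... modified by an inplace operation; expected version 0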
When I was writing the code I was looking at examples/state_estimation_2d.py as a reference. The main difference I see is that instead of predicting factor weights I'm trying to predict the measurements.
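Concretely, the difference is roughly this (the names below are made up for illustration, not the actual ones from the example or my notebook); in both cases the NN output just goes into the theseus input dictionary under the corresponding variable name:

theseus_inputs["cost_weight"] = weight_model(features)           # example: NN predicts a cost weight
theseus_inputs[f"predicted_odometry_{i}"] = odometry_model(imu)  # my case: NN predicts the measurement itself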
I think I'm missing something in how to prepare the graph for autograd, or in how to call/update the inner/outer loops. I tried to visualize the graph with torchviz, but it didn't help.
I have already looked through the discussions and issues and haven't spotted anything related to the errors mentioned above.
It would be great if somebody could have a look at my code and maybe give a hint on how to fix training with batching.
Here is the version of the Jupyter notebook for the second error:
https://github.com/nosmokingsurfer/fgraph_diff/blob/master/theseus_tests/theseus_tum_vi/linear_motion_test.ipynb
Cheers,
Alex
-
Hi @nosmokingsurfer, sorry it took me so long to respond to this, I've been really busy with other deadlines. Do you have a smaller example that reproduces this error? Your notebook is a bit large and it will take me a long time to figure out what's going on.
-
Hi @nosmokingsurfer, the cause of the error that required you to add retain_graph=True is that you were not setting initial values for the optimization variables, which means that after the first loop the values from the previous optimization were used as initial values (thus retaining graph info). You can replace your initialization for loop with the following:
theseus_inputs = {}
for i in range(N):
    if i < N - 1:
        tmp = torch.zeros(B, 4)
        tmp[:, 2] = 1.0
        tmp[:, 0] = 0.5 * predicted_acc[:, i] ** 2 + predicted_acc[:, i]
        theseus_inputs[f"predicted_odometry_{i}"] = tmp
    # Using SE2(...).tensor converts the (x, y, theta) input to (x, y, cos, sin)
    theseus_inputs[f"pose_{i}"] = th.SE2(torch.zeros(B, 3)).tensor