-
Notifications
You must be signed in to change notification settings - Fork 401
Fix multiplication order for transforms #649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@Manishearth @toji @NellWaliczek @billorr-google: can you please sanity check this? I'm fairly sure the proposed change is correct, but we should make sure we're all on the same page for this. Let me know if you want to discuss this more. |
index.bs
Outdated
@@ -851,7 +851,7 @@ The <dfn method for="XRReferenceSpace">getOffsetReferenceSpace(|originOffset|)</ | |||
1. If |base| is an instance of {{XRBoundedReferenceSpace}}, let |offsetSpace| be a new {{XRBoundedReferenceSpace}} and set |offsetSpace|'s {{boundsGeometry}} to |base|'s {{boundsGeometry}}, with each point multiplied by the {{XRRigidTransform/inverse}} of |originOffset|. | |||
1. Else let |offsetSpace| be a new {{XRReferenceSpace}}. | |||
1. Set |offsetSpace|'s [=native origin=] to |base|'s [=native origin=]. | |||
1. Set |offsetSpace|'s [=origin offset=] to the result of [=multiply transforms|multiplying=] |originOffset| by |base|'s [=origin offset=]. | |||
1. Set |offsetSpace|'s [=origin offset=] to the result of [=multiply transforms|multiplying=]|base|'s [=origin offset=] by |originOffset|. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We lost a space between [=multiply transforms|multiplying=]
and |base|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, fixed.
The multiplication order in the spec needs to match the matrix semantics we've agreed on, and the usage appeared to be backwards. Specific justifications following. getViewerPose: if |offset| is the view offset of an eye view, the expected semantics would be that it's a pose-like transform where the transform's position is the eye view's coordinates in XRViewerPose space. The corresponding matrix is viewerFromEye. XRViewerPose contains the viewer pose in the base reference space, so its transform's position is the viewer position in base space coordinates, and the transform is baseFromViewer. Chaining these transforms works as follows: ```js offset.matrix = viewerFromEye; viewerPose.transform.matrix = baseFromViewer; view.transform.matrix = baseFromEye = baseFromViewer * viewerFromEye = viewerPose.transform.matrix * offset.matrix ``` For originOffset, the logic is similar: ```js let offsetASpace = baseSpace.getOffsetReferenceSpace(offsetAInBase); // offsetAInBase.matrix = baseFromA let offsetBSpace = offsetASpace.getOffsetReferenceSpace(offsetBInA); // offsetBInA.matrix = AFromB // equivalent to: let offsetBSpace = baseSpace.getOffsetReferenceSpace(offsetBInBase); // offsetBInBase.matrix = baseFromA * AFromB = offsetAInBase.matrix * offsetBInA.matrix ``` Lastly, for XRRay, the convention is to transform points or vectors by premultiplying the transform matrix from the left, I've rephrased those usages to make that explicit. The current text was basically saying "multiply vector by matrix" which would be backwards.
Thanks for this! The description and spec change sound correct to me, though I'd definitely want @Manishearth to also chime in since I believe at least the |
IMO this is incomplete: The choice can be made either way here until we clarify what XRRigidTransform's
can mean one of two things: either a transform that takes an object and offsets it by the position/orientation, or a transform that takes an object and transforms it into the coordinates of the space described by the offset. The latter is the inverse of the former. I'm worried of bouncing back and forth on this (which is what we've been doing so far): as long as this isn't explicitly specced, there will always be some confusion The definition we've implicitly settled (I think) on is "a transform that takes an object and offsets it by the position/orientation, orientation applied first", but we should be clear about that, and also include some math (IMO). I've been planning to make a PR for this (especially since this is bikeshed time), doing the following:
Thoughts? This can still land as-is, though, the math is correct, if we continue with the implicit assumptions we've been making about |
@Manishearth I agree that the current XRRigidTransform description is underspecified, but can we address this in a followup if we agree that this PR matches the intended interpretation? I've previously attempted to make the transforms unambiguous by adding additional language about source and destination of transforms in XRRigidTransform and getPose, along with adding coordinate-based clarifications, but there seemed to be strong resistance against doing this. See for example this part of the closed PR #569:
|
(To be clear, I'm not proposing using the old phrasing I've quoted above, I agree that it's clunky, and the source/destination space approach is still confusing. But I do think that it's helpful to explain positions and matrix values in terms of coordinates in a given space.) |
Yeah, like I said this can land as-is, I just wanted to point out that this only shifts the ambiguity to the matrix definition. But that is okay, since we're reasonably clear as to what the semantics of the matrix are, we just haven't specced it out. |
Thanks Klaus and Manish! Merging. |
Should I open a dedicated issue to discuss remaining matrix clarifications? |
The multiplication order in the spec needs to match the matrix
semantics we've agreed on, and the usage appeared to be backwards.
Specific justifications following.
getViewerPose: if |offset| is the view offset of an eye view, the expected
semantics would be that it's a pose-like transform where the transform's
position is the eye view's coordinates in XRViewerPose space. The
corresponding matrix is viewerFromEye. XRViewerPose contains the viewer
pose in the base reference space, so its transform's position is the viewer
position in base space coordinates, and the transform is baseFromViewer.
Chaining these transforms works as follows:
For originOffset, the logic is similar:
Lastly, for XRRay, the convention is to transform points or vectors by
premultiplying the transform matrix from the left, I've rephrased those usages
to make that explicit. The current text was basically saying "multiply
vector by matrix" which would be backwards.