Skip to content

Fix multiplication order for transforms #649

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 16, 2019

Conversation

klausw
Copy link
Contributor

@klausw klausw commented May 15, 2019

The multiplication order in the spec needs to match the matrix
semantics we've agreed on, and the usage appeared to be backwards.

Specific justifications following.

getViewerPose: if |offset| is the view offset of an eye view, the expected
semantics would be that it's a pose-like transform where the transform's
position is the eye view's coordinates in XRViewerPose space. The
corresponding matrix is viewerFromEye. XRViewerPose contains the viewer
pose in the base reference space, so its transform's position is the viewer
position in base space coordinates, and the transform is baseFromViewer.

Chaining these transforms works as follows:

offset.matrix = viewerFromEye;
viewerPose.transform.matrix = baseFromViewer;

view.transform.matrix = baseFromEye
                      = baseFromViewer * viewerFromEye
                      = viewerPose.transform.matrix * offset.matrix

For originOffset, the logic is similar:

let offsetASpace = baseSpace.getOffsetReferenceSpace(offsetAInBase);
// offsetAInBase.matrix = baseFromA
let offsetBSpace = offsetASpace.getOffsetReferenceSpace(offsetBInA);
// offsetBInA.matrix = AFromB

// equivalent to:
let offsetBSpace = baseSpace.getOffsetReferenceSpace(offsetBInBase);
// offsetBInBase.matrix = baseFromA * AFromB = offsetAInBase.matrix * offsetBInA.matrix

Lastly, for XRRay, the convention is to transform points or vectors by
premultiplying the transform matrix from the left, I've rephrased those usages
to make that explicit. The current text was basically saying "multiply
vector by matrix" which would be backwards.

@klausw
Copy link
Contributor Author

klausw commented May 15, 2019

@Manishearth @toji @NellWaliczek @billorr-google: can you please sanity check this? I'm fairly sure the proposed change is correct, but we should make sure we're all on the same page for this. Let me know if you want to discuss this more.

index.bs Outdated
@@ -851,7 +851,7 @@ The <dfn method for="XRReferenceSpace">getOffsetReferenceSpace(|originOffset|)</
1. If |base| is an instance of {{XRBoundedReferenceSpace}}, let |offsetSpace| be a new {{XRBoundedReferenceSpace}} and set |offsetSpace|'s {{boundsGeometry}} to |base|'s {{boundsGeometry}}, with each point multiplied by the {{XRRigidTransform/inverse}} of |originOffset|.
1. Else let |offsetSpace| be a new {{XRReferenceSpace}}.
1. Set |offsetSpace|'s [=native origin=] to |base|'s [=native origin=].
1. Set |offsetSpace|'s [=origin offset=] to the result of [=multiply transforms|multiplying=] |originOffset| by |base|'s [=origin offset=].
1. Set |offsetSpace|'s [=origin offset=] to the result of [=multiply transforms|multiplying=]|base|'s [=origin offset=] by |originOffset|.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We lost a space between [=multiply transforms|multiplying=] and |base|

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, fixed.

The multiplication order in the spec needs to match the matrix
semantics we've agreed on, and the usage appeared to be backwards.

Specific justifications following.

getViewerPose: if |offset| is the view offset of an eye view, the expected
semantics would be that it's a pose-like transform where the transform's
position is the eye view's coordinates in XRViewerPose space. The
corresponding matrix is viewerFromEye. XRViewerPose contains the viewer
pose in the base reference space, so its transform's position is the viewer
position in base space coordinates, and the transform is baseFromViewer.

Chaining these transforms works as follows:

```js
offset.matrix = viewerFromEye;
viewerPose.transform.matrix = baseFromViewer;

view.transform.matrix = baseFromEye
                      = baseFromViewer * viewerFromEye
                      = viewerPose.transform.matrix * offset.matrix
```

For originOffset, the logic is similar:

```js
let offsetASpace = baseSpace.getOffsetReferenceSpace(offsetAInBase);
// offsetAInBase.matrix = baseFromA
let offsetBSpace = offsetASpace.getOffsetReferenceSpace(offsetBInA);
// offsetBInA.matrix = AFromB

// equivalent to:
let offsetBSpace = baseSpace.getOffsetReferenceSpace(offsetBInBase);
// offsetBInBase.matrix = baseFromA * AFromB = offsetAInBase.matrix * offsetBInA.matrix
```

Lastly, for XRRay, the convention is to transform points or vectors by
premultiplying the transform matrix from the left, I've rephrased those usages
to make that explicit. The current text was basically saying "multiply
vector by matrix" which would be backwards.
@klausw klausw force-pushed the r-offsetchaining branch from ee6a0ae to 58cc76e Compare May 15, 2019 18:19
@toji
Copy link
Member

toji commented May 15, 2019

Thanks for this! The description and spec change sound correct to me, though I'd definitely want @Manishearth to also chime in since I believe at least the XRView logic was something he added recently.

@Manishearth
Copy link
Contributor

IMO this is incomplete: The choice can be made either way here until we clarify what XRRigidTransform's matrix is. As mentioned before, the following text:

The matrix attribute returns the transform described by the position and orientation attributes as a matrix. This attribute SHOULD be lazily evaluated.

can mean one of two things: either a transform that takes an object and offsets it by the position/orientation, or a transform that takes an object and transforms it into the coordinates of the space described by the offset. The latter is the inverse of the former.

I'm worried of bouncing back and forth on this (which is what we've been doing so far): as long as this isn't explicitly specced, there will always be some confusion

The definition we've implicitly settled (I think) on is "a transform that takes an object and offsets it by the position/orientation, orientation applied first", but we should be clear about that, and also include some math (IMO). I've been planning to make a PR for this (especially since this is bikeshed time), doing the following:

  • Spec out what matrix does, both conceptually and mathematically
  • Maybe remove matrix decomposition from the transform multiplication algorithms, operate on rotations and translations directly (with a note mentioning the matrix method too)
  • Go through the spec ensuring that all the multiplication orders make sense

Thoughts? This can still land as-is, though, the math is correct, if we continue with the implicit assumptions we've been making about matrix.

@klausw
Copy link
Contributor Author

klausw commented May 16, 2019

@Manishearth I agree that the current XRRigidTransform description is underspecified, but can we address this in a followup if we agree that this PR matches the intended interpretation?

I've previously attempted to make the transforms unambiguous by adding additional language about source and destination of transforms in XRRigidTransform and getPose, along with adding coordinate-based clarifications, but there seemed to be strong resistance against doing this.

See for example this part of the closed PR #569:

An XRRigidTransform is a transform from a source XRSpace to a destination XRSpace described by a position and orientation. When applying an XRRigidTransform the orientation is always applied prior to the position. Effectively, the source space is rotated around its origin using the orientation, and then the source space's origin is translated to the XRRigidTransform/position in destination space.

The position attribute is a 3-dimensional point, given in meters, for the source space's origin location in the destination space's coordinate system.

The matrix attribute returns the transform described by the position and orientation attributes as a matrix. This attribute SHOULD be lazily evaluated. The top left 3x3 sub-matrix is the rotation matrix corresponding to the orientation, its column values are the source space's axis directions as unit vector coordinates in destination space. The fourth column contains the position. Premultiplying this matrix onto a column vector of source space coordinates produces a column vector of destination space coordinates.

@klausw
Copy link
Contributor Author

klausw commented May 16, 2019

(To be clear, I'm not proposing using the old phrasing I've quoted above, I agree that it's clunky, and the source/destination space approach is still confusing. But I do think that it's helpful to explain positions and matrix values in terms of coordinates in a given space.)

@Manishearth
Copy link
Contributor

Yeah, like I said this can land as-is, I just wanted to point out that this only shifts the ambiguity to the matrix definition.

But that is okay, since we're reasonably clear as to what the semantics of the matrix are, we just haven't specced it out.

@toji
Copy link
Member

toji commented May 16, 2019

Thanks Klaus and Manish! Merging.

@toji toji merged commit 6868682 into immersive-web:master May 16, 2019
@Manishearth
Copy link
Contributor

Should I open a dedicated issue to discuss remaining matrix clarifications?

@AdaRoseCannon AdaRoseCannon added the ed:spec Include in newsletter, spec change label Jun 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ed:spec Include in newsletter, spec change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants