Skip to content

marked hand as sameobject + added a clarifying note #93

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 16, 2021

Conversation

cabanier
Copy link
Member

@cabanier cabanier commented Jul 14, 2021

Closes #91


Preview | Diff

@cabanier cabanier marked this pull request as ready for review July 14, 2021 23:05
@alcooper91
Copy link

IMO, rather than the NOTE we should modify the algorithm, adding a new step 8/10 along the lines of:

foreach |joint| in |jointspaces| : 
  if the user agent cannot determine the |pose| of |joint| return false and stop processing these steps.

The last step becomes simply "return true", and the current 8.3/10.3 step can be removed.

@cabanier
Copy link
Member Author

IMO, rather than the NOTE we should modify the algorithm, adding a new step 8/10 along the lines of:

foreach |joint| in |jointspaces| : 
  if the user agent cannot determine the |pose| of |joint| return false and stop processing these steps.

The last step becomes simply "return true", and the current 8.3/10.3 step can be removed.

Unfortunately, the joints don't all have to belong to the same hand for fillJointRadii
For fillPoses they don't even have to be joint spaces.

@alcooper91
Copy link

I see, so the joints in fillJointRadii/fillJointPoses aren't necessarily all of the joints for a hand or all from the same hand; I think it's still better to put an explicit check in the algorithm to call out something along the lines of validating that all other poses from the hand that |joint| belongs to are valid before providing the data (e.g. as an addition to determining if a pose can be provided for the joint); especially since I believe NOTEs like what you've added are non-normative. (Unless that is your intention).

@cabanier
Copy link
Member Author

I see, so the joints in fillJointRadii/fillJointPoses aren't necessarily all of the joints for a hand or all from the same hand; I think it's still better to put an explicit check in the algorithm to call out something along the lines of validating that all other poses from the hand that |joint| belongs to are valid before providing the data (e.g. as an addition to determining if a pose can be provided for the joint); especially since I believe NOTEs like what you've added are non-normative. (Unless that is your intention).

A normative section higher up defines this:

When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.

So the spec already defines this behavior; it just wasn't obvious.

@alcooper91
Copy link

That's fair, my feedback is mainly based on how much back/forth I had to do through the spec to understand what was happening here when I was attempting to review it this morning to understand the behavior. (e.g. We have one line about it, but then don't reference this requirement in any algorithms about querying or updating the data). Even then, I didn't think that fillJointPoses/fillJointRadii was either incomplete or referring to joints from different hands.

Copy link

@jmajnert jmajnert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Notes should not introduce normative requirements. We should update the algorithms to do exactly what we want.
  2. For fillJointRadii - if user can request radii for a subset of joints, should we consider all other (not requested) joints from that hand when deciding whether return NaN or not? This should be clarified in the algorithm. The same for fillPoses - should we consider all spaces of a hand (even those that are not requested by caller), when deciding whether to return 16 NaNs or valid transforms?

@jmajnert
Copy link

IMO, the getJointPose algoritms should be redefined to clearly state, that null should be returned if pose of any joint on the same hand cannot be determined.
Then, the two other algorithms should be redefined in terms of getJointPose

@cabanier
Copy link
Member Author

  1. Notes should not introduce normative requirements. We should update the algorithms to do exactly what we want.

There's normative text that states that if one joint has no pose, none of the joints of that hand have a pose.
The note is just to clarify that this is implicitly used by the algorithm.

  1. For fillJointRadii - if user can request radii for a subset of joints, should we consider all other (not requested) joints from that hand when deciding whether return NaN or not?

Yes. The normative text defines this:

When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.

This should be clarified in the algorithm. The same for fillPoses - should we consider all spaces of a hand (even those that are not requested by caller), when deciding whether to return 16 NaNs or valid transforms?

Yes but again, implied by the normative text.
If we add sorting by hand to the algorithm, it will become much more complicated.

@jmajnert
Copy link

Having read the Editor's Draft version I see it is clearer than the FPWD. Still, I would propose that instead of the Notes we add a specific algorithm step.
Something like this perhaps:
50f6306

@alcooper91
Copy link

I wanted to comment on these two bits:

A normative section higher up defines this:

When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.

So the spec already defines this behavior; it just wasn't obvious.

If we add sorting by hand to the algorithm, it will become much more complicated.

I don't think the adding the note here makes it obvious, and I actually strongly disagree that it makes the algorithm more complicated, to me the fact that this was only implicit and not an explicit step added to a lot of confusion when trying to parse the text, as someone not intimately familiar with this spec. I don't think we necessarily need to sort by hand, but joints should have a reference back to their hand that we can use to determine if all other joints on that hand can be located. Essentially, I think we need some explicit algorithm for:

There's normative text that states that if one joint has no pose, none of the joints of that hand have a pose.

Perhaps as a compromise we could add/update a hand update algorithm that sets all poses to null if any pose on the hand is null, that way even if it was theoretically able to locate the pose the UA should have "nulled out" that pose for the frame.

We may even want to strengthen the text to be a bit more explicit and change:

When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.

To:

When a hand is partially obscured the user agent MUST either emulate the obscured joints, or report null poses for all of the joints.

@cabanier
Copy link
Member Author

So the spec already defines this behavior; it just wasn't obvious.
If we add sorting by hand to the algorithm, it will become much more complicated.

I don't think the adding the note here makes it obvious, and I actually strongly disagree that it makes the algorithm more complicated, to me the fact that this was only implicit and not an explicit step added to a lot of confusion when trying to parse the text, as someone not intimately familiar with this spec. I don't think we necessarily need to sort by hand, but joints should have a reference back to their hand that we can use to determine if all other joints on that hand can be located.

They have a reference back to their hand:
Every XRJointSpace has an associated hand, which is the XRHand that created it.

Essentially, I think we need some explicit algorithm for:

There's normative text that states that if one joint has no pose, none of the joints of that hand have a pose.

hmm, I think we need to patch populate the pose to do this. I'll make an attempt.

We may even want to strengthen the text to be a bit more explicit and change:

When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.

To:

When a hand is partially obscured the user agent MUST either emulate the obscured joints, or report null poses for all of the joints.

That sounds good to me. The 2 "may"'s are indeed confusing

@jmajnert
Copy link

to me the fact that this was only implicit and not an explicit step added to a lot of confusion when trying to parse the text, as someone not intimately familiar with this spec

+1!
I must agree. The algorithms in spec should be painfully explicit. That's what they are for. Changing their behaviour with some context in the prose of the spec that is not immediately available adds confusion and will cause interoperability problems.

@cabanier
Copy link
Member Author

@alcooper91 @jmajnert I hoisted the populate the pose algorithm into the spec and patched it.
I'm not 100% happy with it but I can't see any other way since the regular WebXR methods also need to account for this joint behavior.

Copy link

@alcooper91 alcooper91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment about the prose you added, but otherwise I think that prose looks good.

I definitely agree that it's not ideal. Would it be possible to do something like:

Algorithm: Populate Pose for XRHand
1. Let |pose| be the result of running |Populate the pose| for space in base space
2. If |space| is a {{XRJointSpace}}....

Or to modify the WebXR spec to accept extensions? Essentially all you've done is added a line in the middle of the algorithm, so perhaps the main algorithm could be patched with something like:

...
if |space| is not {{allowed to report poses}} return null
...

And then some additional text to provide a default implementation for {{allowed to report poses}} that can be overriden?

@cabanier
Copy link
Member Author

Or to modify the WebXR spec to accept extensions? Essentially all you've done is added a line in the middle of the algorithm, so perhaps the main algorithm could be patched with something like:

I think that would be ideal. We did something similar for Layers.
Maybe we can submit the current work but file an issue against WebXR at the same time. I believe @toji and @Manishearth are working on getting to CR so we can't make changes right now.
Maybe we should discuss if we should fork the spec to level 2 so we can still make changes.

@alcooper91
Copy link

That's fair, and I'll defer to @toji and @Manishearth on that.

Would something like this be possible as a workaround to not have to copy the full text? Or I guess that technically leaves a hole for getting the poses in the reference space...

I definitely agree that it's not ideal. Would it be possible to do something like:

Algorithm: Populate Pose for XRHand
1. Let |pose| be the result of running |Populate the pose| for space in base space
2. If |space| is a {{XRJointSpace}}....

@cabanier
Copy link
Member Author

Would something like this be possible as a workaround to not have to copy the full text? Or I guess that technically leaves a hole for getting the poses in the reference space...

You mean, only patch it for the methods in the hands spec?
If so, then it wouldn't apply for the getPose function the WebXR spec.

@alcooper91
Copy link

I see, then from my perspective this is fine, and helps clarify it; but I think there's no explicit modification made to fillJointRadii, which doesn't look to reference the "Populate the Pose" algorithm?

@jmajnert
Copy link

Sorry to be cutting in, but have you seen my proposition here: 50f6306 ?

I think it cheaply (without changes to other specs) solves our problems.

@cabanier
Copy link
Member Author

Sorry to be cutting in, but have you seen my proposition here: 50f6306 ?

I think it cheaply (without changes to other specs) solves our problems.

getPose also needs to be patched

@cabanier
Copy link
Member Author

I see, then from my perspective this is fine, and helps clarify it; but I think there's no explicit modification made to fillJointRadii, which doesn't look to reference the "Populate the Pose" algorithm?

I will make it call that algorithm.

@cabanier
Copy link
Member Author

I see, then from my perspective this is fine, and helps clarify it; but I think there's no explicit modification made to fillJointRadii, which doesn't look to reference the "Populate the Pose" algorithm?

I will make it call that algorithm.

Unfortunately, I couldn't use that so I had to patch it in place. @jmajnert, I reformatted your proposal a bit. Let me know if you prefer your approach.

index.bs Outdated
1. Return |allValid|.
1. Initialize |radii| as follows:
<dl class="switch">
<dt> If the user agent can determine the poses of all the joints belonging to the |joint|'s [=XRJointSpace/hand=]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit backward. |joint| is not yet defined. I'd prefer my version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the fact that |joint| is used before it's definition here is a problem, though the rest of the steps seem OK. The version in 50f6306 does read a bit better while accomplishing the same thing, IMO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I'll use @jmajnert 's definition

@Manishearth
Copy link
Contributor

I don't understand why we need to patch the algorithm here; this hook already exists in the form of the definition for native origin: each space defines how its native origin behaves. That's what currently exists in the spec, with the wording of:

The native origin of the XRJointSpace may only be reported when native origins of all other XRJointSpaces on the same hand are being reported. When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.

except that MAY should probably be a "MUST either" as you've fixed in this PR. We could tighten up the wording to stop saying "reported" here and define the native origin as this only-when-everyone-else-exists quantity.

Copy link
Member

@toji toji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with @Manishearth that defining the behavior of the native origin is the right hook here rather than importing the entire populating the pose algorithm and adding a single line. Admittedly that does create a bit of indirection, but I think the Note that you have here clarifies that well enough.

This is a good change overall, thanks for working on it! Sorry this bit of algorithm interaction is so tricky to resolve.

index.bs Outdated
1. Return |allValid|.
1. Initialize |radii| as follows:
<dl class="switch">
<dt> If the user agent can determine the poses of all the joints belonging to the |joint|'s [=XRJointSpace/hand=]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the fact that |joint| is used before it's definition here is a problem, though the rest of the steps seem OK. The version in 50f6306 does read a bit better while accomplishing the same thing, IMO.

index.bs Outdated
@@ -405,6 +424,8 @@ When this method is invoked on an {{XRFrame}} |frame|, the user agent MUST run t

</div>

NOTE: if any of the spaces belonging to the same {{XRHand}} return <code>null</null> when [=hands/populate the pose|populating the pose=], all the spaces of that {{XRHand}} must also return <code>null</null> when [=hands/populate the pose|populating the pose=]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It' s not clear to me how this note significantly differs from the one on line 392? They're both saying the same thing, just using pointing at different mechanisms to say it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that to clarify each algorithm.

@Manishearth
Copy link
Contributor

I think a way of clarifying might be to add a note on the populate the pose algorithm in the core spec that native origins may have conditions in which they are or aren't shown and to check their definition.

@cabanier
Copy link
Member Author

I don't understand why we need to patch the algorithm here; this hook already exists in the form of the definition for native origin: each space defines how its native origin behaves. That's what currently exists in the spec, with the wording of:

The native origin of the XRJointSpace may only be reported when native origins of all other XRJointSpaces on the same hand are being reported. When a hand is partially obscured the user agent MAY emulate the obscured joints, or it MAY report null poses for all of the joints.

I agree as well. Patching the whole algorithm is ugly and IMO just as confusing. I will take it out of the PR.
The notes should clarify and we should probably also add some example code.

@jmajnert
Copy link

The reason why we have this PR is because reviewers of a change on Chromium gerrit found the specification to be self contradicting: https://chromium-review.googlesource.com/c/chromium/src/+/3025151/comments/bfb1865f_07c0e6db
Let me quote:

  1. https://www.w3.org/TR/webxr-hand-input-1/#xrjointspace-interface States that if not all XRJointPoses can be identified/emulated, then all need to be reported as null.
  2. https://www.w3.org/TR/webxr-hand-input-1/#dom-xrframe-filljointradii and https://immersive-web.github.io/webxr-hand-input/#dom-xrframe-fillposes both contradict this, by points 8.3/10.3 respectively, which both should probably read something like "If the pose can't be found, zero out any pre-filled data and abort" (Or adding a pre-cursor step to check for that data and abort).

Again: they read the requirement that "all or nothing" is reported and found it to contradict the algorithms. They did not take the requirement as being an amendment to the algorithms. And that's how people will read it IMHO.

I too had a hard time understanding what the specification really wanted. I thought that the whole reason of introducing algorithms in W3C specifications was to clearly and unambiguously define the expected behavior, without the need to process all of the prose of the various parts of the document and glue them together.

That being said, I'm not an expert here. If you think that the PR as it stands clarifies the expected behavior, then so be it.

@cabanier
Copy link
Member Author

I too had a hard time understanding what the specification really wanted. I thought that the whole reason of introducing algorithms in W3C specifications was to clearly and unambiguously define the expected behavior, without the need to process all of the prose of the various parts of the document and glue them together.

No, you need to take all the prose into account when implementing a specification.
For instance, in this case the decision boils to the prose in WebXR that states "Query the XR device's tracking system". The normative text in WebXR Hands defines that this should fail if any of the joints can't be tracked or emulated.

It's true that going three steps deep isn't always obvious which is why we add informative notes. Examples will also help.

Copy link
Member

@toji toji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest changes LGTM.

Also, I sympathize with anyone who has trouble following the spec, but needing to step through multiple algorithms and documents to get the full picture of the required behavior is (sadly, but necessarily) par for the course in Web specs.

In this case we have a clearly defined behavior for the hands native origin, which the existing algorithms already allow and expect to be defined per-space-type, and we have a new method which explicitly states how to deal with this edge case. I don't see too many ways to simplify it without repeating algorithms, which gets problematic in the case that another spec needs to do the same thing (in which case which override of the algorithm takes precedence?)

Copy link

@alcooper91 alcooper91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest changes LGTM.

I'll admit that I only quickly skimmed fillPoses when trying to understand what was going on. Following the "populate the poses" logic sounds like it would have adequately satisfied the requirements in that case. However, before this change, the fillJointsRadii made no reference to any such similar requirement (instead referring simply to if the pose was locatable), which was ultimately the bulk of my issue/confusion.

@Manishearth
Copy link
Contributor

For instance, in this case the decision boils to the prose in WebXR that states "Query the XR device's tracking system". The normative text in WebXR Hands defines that this should fail if any of the joints can't be tracked or emulated.

This is correct.

@Manishearth Manishearth merged commit 2e10665 into immersive-web:main Jul 16, 2021
@cabanier cabanier deleted the cleanup branch July 19, 2021 14:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Require all joints to be valid
5 participants