Single-canvas inline, drop XRPresentationContext #656

Merged
merged 10 commits into from
Jun 13, 2019
Fixed up explainer
toji committed Jun 13, 2019
commit 41c03e60362af4ddba6f154d7c677127e987d784
7 changes: 4 additions & 3 deletions explainer.md
@@ -377,9 +377,9 @@ These scenarios can make use of inline sessions to render tracked content to the

The [`RelativeOrientationSensor`](https://w3c.github.io/orientation-sensor/#relativeorientationsensor) and [`AbsoluteOrientationSensor`](https://w3c.github.io/orientation-sensor/#absoluteorientationsensor) interfaces (see [Motion Sensors Explainer](https://w3c.github.io/motion-sensors/)) can be used to polyfill the first case.
Member

Not this PR, but boy does this sentence seem strange now that we're committed to inline.


To make use of this mode a `XRWebGLLayer` must be created with the `useDefaultFramebuffer` option set to `true`. This instructs the layer to not allocate a new WebGL framebuffer but instead set the `framebuffer` attribute to `null`. That way when `framebuffer` is bound all WebGL commands will naturally execute against the WebGL context's default framebuffer and display on the page like any other WebGL content. When that layer is set as the `XRRenderState`'s `baseLayer` the inline session is able to render it's output to the page.
To make use of this mode a `XRWebGLLayer` must be created with the `compositionDisabled` option set to `true`. This instructs the layer to not allocate a new WebGL framebuffer but instead set the `framebuffer` attribute to `null`. That way when `framebuffer` is bound all WebGL commands will naturally execute against the WebGL context's default framebuffer and display on the page like any other WebGL content. When that layer is set as the `XRRenderState`'s `baseLayer` the inline session is able to render it's output to the page.

In the future, I presume we're going to want inline sessions to participate in new XRLayer types that require composition. How is that going to work in light of the fact that inline sessions are required to use useDefaultFramebuffer=true? Are any layers besides XRWebGLLayer going to be ignored in that case?

I worry about the compatibility aspects of these types of flags. The goal of WebXR is to be a write once, run anywhere deal. When it comes to drawing stuff, the spec should ideally say "You render your stuff to this framebuffer, we'll take care of composing your layers together with things the browser wants to draw and everything will just work." The more special canvas contexts, useXXXOnlyWorksInInline flags, etc, the more complexity we add for web developers and the more time we'll spend debugging sites that are broken on some hardware but not others.

Member Author

In the future, I presume we're going to want inline sessions to participate in new XRLayer types that require composition.

Yes, absolutely!

How is that going to work in light of the fact that inline sessions are required to use useDefaultFramebuffer=true? Are any layers besides XRWebGLLayer going to be ignored in that case?

Once we have new layer types and/or multi-layer the restriction that inline session layers must use that flag would be lifted (and I expect most layers wouldn't have the flag at all.) At that point if a layer IS constructed with the flag it would be an error to use it with any other layers.
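
A minimal sketch of that future rule, assuming a multi-layer API eventually exists. `validateLayers` and the plain-object layer shape are hypothetical stand-ins used only for illustration; nothing like this is in the spec today.

```javascript
// Hypothetical check: a layer created with compositionDisabled renders straight
// into the canvas's default framebuffer, so it cannot be combined with others.
function validateLayers(layers) {
  const directLayers = layers.filter((layer) => layer.compositionDisabled);
  if (directLayers.length > 0 && layers.length > 1) {
    throw new Error('A compositionDisabled layer cannot be combined with other layers');
  }
  return true;
}

// Plain objects stand in for real layer instances in this sketch.
validateLayers([{ compositionDisabled: true }]);                                  // allowed
validateLayers([{ compositionDisabled: false }, { compositionDisabled: false }]); // allowed
```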

I worry about the compatibility aspects of these types of flags... The more special canvas contexts, useXXXOnlyWorksInInline flags, etc, the more complexity we add for web developers and the more time we'll spend debugging sites that are broken on some hardware but not others.

Completely agree, but in that regard I feel like this is a lateral move rather than a regression. Previously we were asking developers to use a special secondary canvas for inline sessions that was necessary in order to support features that didn't actually exist yet, so we were already making users jump through hoops without any clear benefits to doing so. With this approach we are scaling back the hoop jumping to setting a single boolean option, at which point everything works the way they already expect WebGL to work for inline content. Then when those eventual advanced features ARE ready we can introduce the more complex route with a much more tractable explanation of why they're necessary.

To address a comment that I remember from the call, it looks like we could say "turn this flag on automatically for inline" and be done with it. But then we're in a weird position down the road where when layer compositing IS a thing and we expect most content will eventually want to use it, now inline has to flip the same damn flag in reverse for the majority of content just to get the same default behavior as immersive. I'd rather have the default be compositing is used everywhere and for this first release we have this weird little required opt-out wart.

Member

@RafaelCintron Does that address your concern? If so, I think we'd like to try to get this merged today or tomorrow...

Since we're removing XRPresentationContext (which I'm fully supportive of), now is an opportune time to eliminate any lingering immersive vs. inline differences as much as we can.

The way the spec is written today, setting the flag to true means content will not work in immersive sessions. This creates compatibility problems for developers who only have access to inline session hardware. I understand that the API cannot hide all differences between hardware types but having this flag, in my view, does not meet the bar for introducing compatibility problems.

Why can't we eliminate the flag and have the returned framebuffer always be non-null?

Member Author

I see what you're saying now, but I'm not sure how to achieve that in a way that actually makes developers' lives simpler (and doesn't turn into a really sticky spec issue).

If the returned framebuffer is always non-null then that means that content rendered to it is going somewhere other than the default backbuffer for the WebGL canvas, and as such needs an explicit mechanism to display it on the page. There are a few ways that could happen:

  • We have it bound to some sort of explicit output surface. This is what xrpresent did for us, and what we're trying to move away from.
  • We declare that we are taking ownership of the WebGL canvas when a layer is created like this and start displaying our content in favor of the default framebuffer. This seems pretty messy to me, would probably require coordination with the WebGL WG to achieve, and at the very least Ken has expressed that he's less than thrilled with the idea when I've floated it with him in the past. There are also ownership lifetime issues to worry about in this model, and it would likely force something like spectator mode to use an OffscreenCanvas or second context altogether.
  • We can require the developer to manually blit the rendered content from our framebuffer (probably surfaced as a texture in that case) into the WebGL default framebuffer. This now requires WAY more work on the developer's part to support inline content, and increases the divergence of the two rendering paths, which I think is what we're both trying to reduce.
  • A variant of the previous option, we could provide a way to surface the framebuffer content as an ImageBitmap, which could then be shown directly via a bitmaprenderer context or easily converted into a texture for WebGL use. Again, though, this is significantly more work in order to get inline to display properly. And we wouldn't want to expose the ImageBitmap producer to immersive sessions because it would either hold references to swap chain surfaces in an environment where we want to control their flow or necessitate a copy. And we still run into the core issue that the code paths between immersive and inline are now more divergent than ever.

In light of the above, it seemed to me that flipping a single boolean flag and then having everything else "just work" was an attractive option. I've coded up a couple of tests that use this route already and the code ends up looking like this:

let layer = new XRWebGLLayer(session, gl, { compositionDisabled: mode == 'inline' });
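
For a page that supports both modes, the same idea could be factored into a small helper. This is only a sketch: `layerInitForMode` is a hypothetical name, and it assumes the `compositionDisabled` option as proposed in this PR.

```javascript
// Hypothetical helper: build the XRWebGLLayerInit dictionary for a session mode.
// Only inline sessions opt out of composition; other modes keep the default.
function layerInitForMode(mode) {
  return { compositionDisabled: mode === 'inline' };
}

// In a browser with WebXR the result would be passed to the layer constructor:
//   const layer = new XRWebGLLayer(session, gl, layerInitForMode(mode));
//   session.updateRenderState({ baseLayer: layer });
```

Everything after layer creation stays shared between the two modes, which is the point of the single flag.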

We have it bound to some sort of explicit output surface. This is what xrpresent did for us, and what we're trying to move away from.

This is along the lines of what I had in mind, but without the extra IDL of XRPresentationContext.

What is your plan for introducing composition to inline sessions in the future? Won't that require "taking ownership of the WebGL canvas" like you described? If so, seems like we should have a discussion of the implications sooner rather than later, especially if the flag is going to end up in the final, ratified spec. Or will inline session never be able to take advantage of composition?

My high order bit is minimizing the number of differences between inline and immersive for core scenarios. Without this PR, developers who create xrpresent contexts will at least have the 3D portion of their content work correctly in immersive. From that standpoint, this change is more of a regression than a lateral move.

Member Author

This is along the lines of what I had in mind, but without the extra IDL of XRPresentationContext.

I'm really curious what you have in mind for the output surface in that case, without introducing additional IDL?

What is your plan for introducing composition to inline sessions in the future?

Essentially to re-introduce XRPresentationContext or something like it, though I think there's still reasonable design discussions to be had around the shape of that API (hence my curiosity about what you had in mind above). I do strongly feel there's value to having composited output from inline sessions, but it's very hard to justify to developers the need to juggle additional canvases and new interfaces to do so when we don't yet support any scenarios that compositing would actually benefit from (multilayer, etc.)

Also, I missed emphasizing previously that one of the explicit goals of this change is to make the polyfill more performant, and I don't see a way to do that if a secondary output surface is involved, whereas the proposal in this PR makes it trivial. (To be clear: I'm also OK with the polyfill staying limited to the core API)

Essentially to re-introduce XRPresentationContext or something like it, though I think there's still reasonable design discussions to be had around the shape of that API (hence my curiosity about what you had in mind above).

I see. I didn't realize that you were already counting on inline developers having to change their code in other ways in order to get composition in the future.

When we add XRPresentationContext in the future, could we detect "use composition" intent for inline sessions by virtue of developers using XRPresentationContext in the WebGL layer's init parameters? This way, we can remove the flag now instead of having awkward "we may remove this flag in the future" verbiage or get stuck having to support XRPresentationContext use without composition down the road.

Member Author

We don't want the XRPresentationContext to be added to the WebGL layer directly, because the entire point is that it accepts the composited results, not the results from a single layer. That's why it was on the session's XRRenderState previously, which is where I would be inclined to put it again in the future.

OK. Why can't we use the presence of XRPresentationContext on XRRenderState to determine whether the developer has opted into composition for inline sessions?
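
Roughly, the suggestion is to infer the behavior instead of exposing a flag. A sketch under the assumption that a future `XRRenderState` carries the presentation context in an attribute named `outputContext`; that name and `inlineUsesComposition` are illustrative only, not an agreed API.

```javascript
// Hypothetical inference: an inline session is treated as composited only when
// the developer has supplied an output context on the render state.
function inlineUsesComposition(renderState) {
  return renderState.outputContext != null;
}

// Plain objects stand in for XRRenderState in this sketch.
inlineUsesComposition({ baseLayer: {} });                    // no context: draw to default framebuffer
inlineUsesComposition({ baseLayer: {}, outputContext: {} }); // context present: composite
```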

Immersive and inline sessions can use the same render loop, but there are some differences in behavior to be aware of. Most importantly, inline sessions will not pump their render loop if they do not have a `baseLayer` with `useDefaultFramebuffer` set. (This restriction may be lifted in the future to enable more advanced effects.) Instead the session acts as though it has been [suspended](#handling-suspended-sessions) until a valid `baseLayer` has been assigned.
Immersive and inline sessions can use the same render loop, but there are some differences in behavior to be aware of. Most importantly, inline sessions will not pump their render loop if they do not have a `baseLayer` with `compositionDisabled` set. (This restriction may be lifted in the future to enable more advanced effects.) Instead the session acts as though it has been [suspended](#handling-suspended-sessions) until a valid `baseLayer` has been assigned.

Immersive and inline sessions may run their render loops at at different rates. During immersive sessions the UA runs the rendering loop at the XR device's native refresh rate. During inline sessions the UA runs the rendering loop at the refresh rate of page (aligned with `window.requestAnimationFrame`.) The method of computation of `XRView` projection and view matrices also differs between immersive and inline sessions, with inline sessions taking into account the output canvas dimensions and possibly the position of the users head in relation to the canvas if that can be determined.
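
The `baseLayer` requirement for inline sessions described above can be modelled as a simple predicate. This is a sketch of the behavior only, not part of the API: `inlineSessionCanPump` is a hypothetical name, and plain objects stand in for `XRRenderState` and `XRWebGLLayer`.

```javascript
// Hypothetical model of the check a UA might perform before delivering frames
// to an inline session. Without a suitable baseLayer the session behaves as
// though it were suspended.
function inlineSessionCanPump(renderState) {
  const layer = renderState.baseLayer;
  return Boolean(layer && layer.compositionDisabled === true);
}

inlineSessionCanPump({ baseLayer: null });                           // false: no layer yet
inlineSessionCanPump({ baseLayer: { compositionDisabled: false } }); // false: wrong kind of layer
inlineSessionCanPump({ baseLayer: { compositionDisabled: true } });  // true: frames are delivered
```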

@@ -393,7 +393,7 @@ function beginInlineXRSession() {
// Inline sessions must have an appropriately constructed WebGL layer
// set as the baseLayer prior to rendering. (This code assumes the WebGL
// context has already been made XR compatible.)
let glLayer = new XRWebGLLayer(session, gl, { useDefaultFramebuffer: true });
let glLayer = new XRWebGLLayer(session, gl, { compositionDisabled: true });
session.updateRenderState({ baseLayer: glLayer });
onSessionStarted(session);
})
@@ -654,6 +654,7 @@ enum XREye {
//

dictionary XRWebGLLayerInit {
boolean compositionDisabled = false;
boolean antialias = true;
boolean depth = true;
boolean stencil = false;