
Single-canvas inline, drop XRPresentationContext #656


Merged
merged 10 commits into from Jun 13, 2019
Updated the explainer
toji committed Jun 13, 2019
commit 771b48a5fe312eba79bda66fac0179595f64f06d
69 changes: 10 additions & 59 deletions explainer.md
@@ -365,41 +365,7 @@ function checkARSupport() {

The UA may choose to present the immersive AR session's content via any type of display, including dedicated XR hardware (for devices like HoloLens or Magic Leap) or 2D screens (for APIs like [ARKit](https://developer.apple.com/arkit/) and [ARCore](https://developers.google.com/ar/)). In all cases the session takes exclusive control of the display, hiding the rest of the page if necessary. On a phone screen, for example, this would mean that the session's content should be displayed in a mode that is distinct from standard page viewing, similar to the transition that happens when invoking the `requestFullscreen` API. The UA must also provide a way of exiting that mode and returning to the normal view of the page, at which point the immersive AR session must end.

## Rendering to the Page

There are a couple of scenarios in which developers may want to present content rendered with the WebXR Device API on the page instead of (or in addition to) a headset: mirroring and inline rendering. Both methods display WebXR content on the page via a Canvas element with an `XRPresentationContext`. As with a `WebGLRenderingContext`, developers acquire an `XRPresentationContext` by calling the `HTMLCanvasElement` or `OffscreenCanvas` `getContext()` method with the context id of "xrpresent". The returned `XRPresentationContext` is permanently bound to the canvas.

An `XRPresentationContext` can only be supplied imagery by an `XRSession`, though the exact behavior depends on the scenario in which it's being used. The context is associated with a session by setting the `XRRenderState`'s `outputContext` to the desired `XRPresentationContext` object. An `XRPresentationContext` cannot be used with multiple `XRSession`s simultaneously, so when an `XRPresentationContext` is set as the `outputContext` for a session's `XRRenderState`, any session it was previously associated with will have its `renderState.outputContext` set to `null`.

### Mirroring

On desktop devices, or any device which has an external display connected to it, it's frequently desirable to show what the user in the headset is seeing on the external display. This is usually referred to as mirroring.

In order to mirror WebXR content to the page, the session's `renderState.outputContext` must be set to an `XRPresentationContext`. Once a valid `outputContext` has been set, any content displayed on the headset will then be mirrored into the canvas associated with the `outputContext`.

When mirroring, only one eye's content will be shown, and it should be shown without any distortion to correct for headset optics. The UA may choose to crop the image shown or display it at a lower resolution than originally rendered, and the mirror may be multiple frames behind the image shown in the headset. The mirror may include or exclude elements added by the underlying XR system (such as visualizations of room boundaries) at the UA's discretion. Pages should not rely on a particular timing or presentation of mirrored content; it's really just for the benefit of bystanders or demo operators.

The UA may also choose to ignore the `outputContext` on systems where mirroring is inappropriate, such as devices without an external display like mobile or all-in-one systems.

```js
function beginXRSession() {
  let mirrorCanvas = document.createElement('canvas');
  let mirrorCtx = mirrorCanvas.getContext('xrpresent');
  document.body.appendChild(mirrorCanvas);

  navigator.xr.requestSession('immersive-vr')
      .then((session) => {
        // A mirror context isn't required to render, so it's not necessary to
        // wait for the updateRenderState promise to resolve before continuing.
        // It may mean that a frame is rendered which is not mirrored.
        session.updateRenderState({ outputContext: mirrorCtx });
        onSessionStarted(session);
      })
      .catch((reason) => { console.log("requestSession failed: " + reason); });
}
```

### Inline sessions
## Inline sessions

There are several scenarios where it's beneficial to render a scene whose view is controlled by device tracking within a 2D page. For example:

@@ -411,31 +377,25 @@ These scenarios can make use of inline sessions to render tracked content to the

The [`RelativeOrientationSensor`](https://w3c.github.io/orientation-sensor/#relativeorientationsensor) and [`AbsoluteOrientationSensor`](https://w3c.github.io/orientation-sensor/#absoluteorientationsensor) interfaces (see [Motion Sensors Explainer](https://w3c.github.io/motion-sensors/)) can be used to polyfill the first case.

Member

Not this PR, but boy does this sentence seem strange now that we're committed to inline.


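As a rough sketch of that polyfill approach (not from the explainer; `drawSceneWithViewMatrix` is a hypothetical application helper, and a real renderer would still invert this rotation and supply its own projection matrix):

```js
// Drive a "magic window" style view from device orientation alone.
const sensor = new RelativeOrientationSensor({ frequency: 60 });
const rotationMatrix = new Float32Array(16);

sensor.addEventListener('reading', () => {
  // Writes a column-major rotation matrix for the latest orientation reading.
  sensor.populateMatrix(rotationMatrix);
});
sensor.addEventListener('error', (event) => {
  console.log('Orientation sensor unavailable: ' + event.error.name);
});
sensor.start();

function onFrame(time) {
  window.requestAnimationFrame(onFrame);
  drawSceneWithViewMatrix(rotationMatrix);  // hypothetical app-side helper
}
window.requestAnimationFrame(onFrame);
```
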
Similar to mirroring, to make use of this mode the `XRRenderState`'s `outputContext` must be set. At that point content rendered to the `XRRenderState`'s `baseLayer` will be rendered to the canvas associated with the `outputContext`. The UA is also allowed to composite in additional content if desired. (In the future, if multiple layers are used their composited result will be what is displayed in the `outputContext`.)
To make use of this mode an `XRWebGLLayer` must be created with the `useDefaultFramebuffer` option set to `true`. This instructs the layer not to allocate a new WebGL framebuffer but instead set the `framebuffer` attribute to `null`. That way, when `framebuffer` is bound, all WebGL commands will naturally execute against the WebGL context's default framebuffer and display on the page like any other WebGL content. When that layer is set as the `XRRenderState`'s `baseLayer` the inline session is able to render its output to the page.

Member

In the future, I presume we're going to want inline sessions to participate in new XRLayer types that require composition. How is that going to work in light of the fact that inline sessions are required to use useDefaultFramebuffer=true? Are any layers besides XRWebGLLayer going to be ignored in that case?

I worry about the compatibility aspects of these types of flags. The goal of WebXR is to be a write once, run anywhere deal. When it comes to drawing stuff, the spec should ideally say "You render your stuff to this framebuffer, we'll take care of composing your layers together with things the browser wants to draw and everything will just work." The more special canvas contexts, useXXXOnlyWorksInInline flags, etc., the more complexity we add for web developers and the more time we'll spend debugging sites that are broken on some hardware but not others.

Member Author

In the future, I presume we're going to want inline sessions to participate in new XRLayer types that require composition.

Yes, absolutely!

How is that going to work in light of the fact that inline sessions are required to use useDefaultFramebuffer=true? Are any layers besides XRWebGLLayer going to be ignored in that case?

Once we have new layer types and/or multi-layer support, the restriction that inline session layers must use that flag would be lifted (and I expect most layers wouldn't have the flag at all). At that point, if a layer IS constructed with the flag, it would be an error to use it with any other layers.

I worry about the compatibility aspects of these types of flags... The more special canvas contexts, useXXXOnlyWorksInInline flags, etc., the more complexity we add for web developers and the more time we'll spend debugging sites that are broken on some hardware but not others.

Completely agree, but in that regard I feel like this is a lateral move rather than a regression. Previously we were asking developers to use a special secondary canvas for inline sessions that was necessary in order to support features that didn't actually exist yet, so we were already making users jump through hoops without any clear benefit to doing so. With this approach we scale the hoop jumping back to setting a single boolean option, at which point everything works the way developers already expect WebGL to work for inline content. Then, when those eventual advanced features ARE ready, we can introduce the more complex route with a much more tractable explanation of why it's necessary.

To address a comment that I remember from the call, it looks like we could say "turn this flag on automatically for inline" and be done with it. But then we're in a weird position down the road: when layer compositing IS a thing and we expect most content will eventually want to use it, inline has to flip the same damn flag in reverse for the majority of content just to get the same default behavior as immersive. I'd rather have the default be that compositing is used everywhere and, for this first release, live with this weird little required opt-out wart.

Member

@RafaelCintron Does that address your concern? If so, I think we'd like to try to get this merged today or tomorrow...

Member

Since we're removing XRPresentationContext (which I'm fully supportive of), now is an opportune time to eliminate any lingering immersive vs. inline differences as much as we can.

The way the spec is written today, setting the flag to true means content will not work in immersive sessions. This creates compatibility problems for developers who only have access to inline session hardware. I understand that the API cannot hide all differences between hardware types but having this flag, in my view, does not meet the bar for introducing compatibility problems.

Why can't we eliminate the flag and have the returned framebuffer always be non-null?

Member Author

I see what you're saying now, but I'm not sure how to achieve that in a way that actually makes developers' lives simpler (and doesn't turn into a really sticky spec issue).

If the returned framebuffer is always non-null, then content rendered to it is going somewhere other than the default backbuffer for the WebGL canvas, and as such needs an explicit mechanism to display it on the page. There are a couple of ways that could happen:

  • We have it bound to some sort of explicit output surface. This is what xrpresent did for us, and what we're trying to move away from.
  • We declare that we are taking ownership of the WebGL canvas when a layer is created like this and start displaying our content in favor of the default framebuffer. This seems pretty messy to me, would probably require coordination with the WebGL WG to achieve, and at the very least Ken has expressed that he's less than thrilled with the idea when I've floated it with him in the past. There are also ownership lifetime issues to worry about in this model, and it would likely force something like spectator mode to use an OffscreenCanvas or a second context altogether.
  • We can require the developer to manually blit the rendered content from our framebuffer (probably surfaced as a texture in that case) into the WebGL default framebuffer. This now requires WAY more work on the developer's part to support inline content, and increases the divergence of the two rendering paths, which I think is what we're both trying to reduce.
  • A variant of the previous option: we could provide a way to surface the framebuffer content as an ImageBitmap, which could then be shown directly via a bitmaprenderer context or easily converted into a texture for WebGL use. Again, though, this is significantly more work in order to get inline to display properly. And we wouldn't want to expose the ImageBitmap producer to immersive sessions because it would either hold references to swap chain surfaces in an environment where we want to control their flow or necessitate a copy. And we still run into the core issue that the code paths between immersive and inline are now more divergent than ever.

In light of the above, it seemed to me that flipping a single boolean flag and then having everything else "just work" was an attractive option. I've coded up a couple of tests that use this route already and the code ends up looking like this:

`let layer = new WebGLLayer(session, gl, { compositionDisabled: mode == 'inline' });`

Member

We have it bound to some sort of explicit output surface. This is what xrpresent did for us, and what we're trying to move away from.

This is along the lines of what I had in mind, but without the extra IDL of XRPresentationContext.

What is your plan for introducing composition to inline sessions in the future? Won't that require "taking ownership of the WebGL canvas" like you described? If so, it seems like we should have a discussion of the implications sooner rather than later, especially if the flag is going to end up in the final, ratified spec. Or will inline sessions never be able to take advantage of composition?

My high order bit is minimizing the number of differences between inline and immersive for core scenarios. Without this PR, developers who create xrpresent contexts will at least have the 3D portion of their content work correctly in immersive. From that standpoint, this change is more of a regression than a lateral move.

Member Author

This is along the lines of what I had in mind, but without the extra IDL of XRPresentationContext.

I'm really curious what you have in mind for the output surface in that case, without introducing additional IDL?

What is your plan for introducing composition to inline sessions in the future?

Essentially to re-introduce XRPresentationContext or something like it, though I think there are still reasonable design discussions to be had around the shape of that API (hence my curiosity about what you had in mind above). I do strongly feel there's value in having composited output from inline sessions, but it's very hard to justify to developers the need to juggle additional canvases and new interfaces when we don't yet support any scenarios that would actually benefit from compositing (multilayer, etc.).

Also, I missed emphasizing previously that one of the explicit goals of this change is to make the polyfill more performant, and I don't see a way to do that if a secondary output surface is involved, whereas the proposal in this PR makes it trivial. (To be clear: I'm also OK with the polyfill staying limited to the core API)

Member

Essentially to re-introduce XRPresentationContext or something like it, though I think there's still reasonable design discussions to be had around the shape of that API (hence my curiosity about what you had in mind above).

I see. I didn't realize that you were already counting on inline developers having to change their code in other ways in order to get composition in the future.

When we add XRPresentationContext in the future, could we detect "use composition" intent for inline sessions by virtue of developers using XRPresentationContext in the WebGL layer's init parameters? This way, we can remove the flag now instead of having awkward "we may remove this flag in the future" verbiage or get stuck having to support XRPresentationContext use without composition down the road.

Member Author

We don't want the XRPresentationContext to be added to the WebGL layer directly, because the entire point is that it accepts the composited results, not the results from a single layer. That's why it was on the session's XRRenderState previously, which is where I would be inclined to put it again in the future.

Member

OK. Why can't we use the presence of XRPresentationContext on XRRenderState to determine whether the developer has opted into composition for inline sessions?

Immersive and inline sessions can use the same render loop, but there are some differences in behavior to be aware of. Most importantly, inline sessions will not pump their render loop if they do not have a valid `outputContext`. Instead the session acts as though it has been [suspended](#handling-suspended-sessions) until a valid `outputContext` has been assigned.
Immersive and inline sessions can use the same render loop, but there are some differences in behavior to be aware of. Most importantly, inline sessions will not pump their render loop if they do not have a `baseLayer` with `useDefaultFramebuffer` set. (This restriction may be lifted in the future to enable more advanced effects.) Instead the session acts as though it has been [suspended](#handling-suspended-sessions) until a valid `baseLayer` has been assigned.

Immersive and inline sessions may run their render loops at different rates. During immersive sessions the UA runs the rendering loop at the XR device's native refresh rate. During inline sessions the UA runs the rendering loop at the refresh rate of the page (aligned with `window.requestAnimationFrame`). The method of computing `XRView` projection and view matrices also differs between immersive and inline sessions, with inline sessions taking into account the output canvas dimensions and possibly the position of the user's head in relation to the canvas if that can be determined.

Most instances of inline sessions will only provide a single `XRView` to be rendered, but the UA may request multiple views be rendered if, for example, it's detected that the output medium of the page supports stereo rendering. As a result pages should always draw every `XRView` provided by the `XRFrame` regardless of what type of session has been requested.
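
A frame callback that honors that requirement might look roughly like the following sketch (not taken from the explainer; `xrRefSpace`, `gl`, and `drawView` are assumed to be application-side state and helpers):

```js
function onXRFrame(time, frame) {
  let session = frame.session;
  session.requestAnimationFrame(onXRFrame);

  let pose = frame.getViewerPose(xrRefSpace);
  if (pose) {
    let glLayer = session.renderState.baseLayer;
    // For inline layers created with useDefaultFramebuffer the framebuffer
    // attribute is null, so this binds the WebGL context's default framebuffer.
    gl.bindFramebuffer(gl.FRAMEBUFFER, glLayer.framebuffer);

    // Draw every view the UA provides, whether that's one (typical inline)
    // or several (immersive or stereo-capable output).
    for (let view of pose.views) {
      let viewport = glLayer.getViewport(view);
      gl.viewport(viewport.x, viewport.y, viewport.width, viewport.height);
      drawView(view);  // assumed to consume view.projectionMatrix and view.transform
    }
  }
}
```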

UAs may have different restrictions on inline sessions that don't apply to immersive sessions. For instance, the UA does not have to guarantee the availability of tracking data to inline sessions, and even when it does a different set of `XRReferenceSpace` types may be available to inline sessions versus immersive sessions.
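
UAs differ here, so a page that prefers a tracked reference space but can live without one might fall back along these lines (sketch only; `'viewer'` is the most broadly available type):

```js
function getPreferredReferenceSpace(session) {
  // Prefer a tracked space, but fall back to 'viewer' if this session
  // (often an inline one) can't provide it.
  return session.requestReferenceSpace('local')
      .catch(() => session.requestReferenceSpace('viewer'));
}
```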

```js
let inlineCanvas = document.createElement('canvas');
let inlineCtx = inlineCanvas.getContext('xrpresent');
document.body.appendChild(inlineCanvas);

function beginInlineXRSession() {
  // Request an inline session in order to render to the page.
  navigator.xr.requestSession('inline')
      .then((session) => {
        // Inline sessions must have an output context prior to rendering, so
        // it's a good idea to wait until the outputContext is confirmed to have
        // taken effect before rendering.
        session.updateRenderState({ outputContext: inlineCtx }).then(() => {
          onSessionStarted(session);
        });
        // Inline sessions must have an appropriately constructed WebGL layer
        // set as the baseLayer prior to rendering. (This code assumes the WebGL
        // context has already been made XR compatible.)
        let glLayer = new XRWebGLLayer(session, gl, { useDefaultFramebuffer: true });
        session.updateRenderState({ baseLayer: glLayer });
        onSessionStarted(session);
      })
      .catch((reason) => { console.log("requestSession failed: " + reason); });
}
```
@@ -543,7 +503,7 @@ function drawScene()

Whenever possible the matrices given by `XRView`'s `projectionMatrix` attribute should make use of physical properties, such as the headset optics or camera lens, to determine the field of view to use. Most inline content, however, won't have any physically based values from which to infer a field of view. In order to provide a unified render pipeline for inline content an arbitrary field of view must be selected.

By default a vertical field of view of 0.5π radians (90 degrees) is used for inline sessions. The horizontal field of view can be computed from the vertical field of view based on the width/height ratio of the `outputContext`'s canvas.
By default a vertical field of view of 0.5π radians (90 degrees) is used for inline sessions. The horizontal field of view can be computed from the vertical field of view based on the width/height ratio of the `XRWebGLLayer`'s associated canvas.

If a different default field of view is desired, it can be specified by passing a new `inlineVerticalFieldOfView` value, in radians, to the `updateRenderState` method:
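
The snippet itself is collapsed in this diff view, but the call being described is roughly of this shape (the value here is illustrative, not from the explainer):

```js
// Request a wider-than-default vertical field of view for this inline session.
// Immersive sessions don't use this value.
session.updateRenderState({ inlineVerticalFieldOfView: 0.6 * Math.PI });
```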

@@ -647,15 +607,13 @@ dictionary XRRenderStateInit {
```
double depthFar;
double inlineVerticalFieldOfView;
XRWebGLLayer? baseLayer;
XRPresentationContext? outputContext
};

[SecureContext, Exposed=Window] interface XRRenderState {
readonly attribute double depthNear;
readonly attribute double depthFar;
readonly attribute double? inlineVerticalFieldOfView;
readonly attribute XRWebGLLayer? baseLayer;
readonly attribute XRPresentationContext? outputContext;
};

//
@@ -748,11 +706,4 @@ partial dictionary WebGLContextAttributes {
partial interface WebGLRenderingContextBase {
Promise<void> makeXRCompatible();
};

//
// RenderingContext
//
[SecureContext, Exposed=Window] interface XRPresentationContext {
readonly attribute HTMLCanvasElement canvas;
};
```
2 changes: 1 addition & 1 deletion input-explainer.md
@@ -16,7 +16,7 @@ Tracked pointers are input sources able to be tracked separately from the viewer
#### Screen
Screen based input is driven by mouse and touch interactions on a 2D screen that are then translated into a 3D targeting ray. The targeting ray originates at the interacted point on the screen as mapped into the input `XRSpace` and extends out into the scene along a line from the screen's viewer pose position through that point. The specific mapped depth of the origin point depends on the user agent. It SHOULD correspond to the actual 3D position of the point on the screen where available, but MAY also be projected onto the closest clipping plane (defined by the smaller of the `depthNear` and `depthFar` attributes of the `XRSession`) if the actual screen placement is not known.

To accomplish this, pointer events over the relevant screen regions are monitored and temporary input sources are generated in response to allow unified input handling. For inline sessions with an `outputContext`, the monitored region is the `outputContext`'s canvas. For immersive sessions (e.g. hand-held AR), the entire screen is monitored.
To accomplish this, pointer events over the relevant screen regions are monitored and temporary input sources are generated in response to allow unified input handling. For inline sessions the monitored region is the canvas associated with the `baseLayer`. For immersive sessions (e.g. hand-held AR), the entire screen is monitored.
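
One consequence is that pages can consume screen-driven input through the same code path as any other input source. A minimal sketch (not from the explainer; `xrRefSpace` and `handleSelection` are assumed application-side state and helpers):

```js
session.addEventListener('select', (event) => {
  // The same handler works for tracked controllers and for the transient
  // input sources synthesized from screen taps or clicks.
  let rayPose = event.frame.getPose(event.inputSource.targetRaySpace, xrRefSpace);
  if (rayPose) {
    handleSelection(rayPose.transform);  // hypothetical app-side helper
  }
});
```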

### Selection styles
In addition to a targeting ray, all input sources provide a mechanism for the user to perform a "select" action. This user intent is communicated to developers through events, which are discussed in detail in the [Input events](#input-events) section. The physical action which triggers this selection will differ based on the input type. For example (though this is hardly an exhaustive list):