W3C

– DRAFT –
TPAC Browser Testing and Tools WG - Day 2

27 September 2024

Attendees

Present
alice, AutomatedTester, Christian, gsnedders, hdv, JaeunJemmaKu, Jamie, jdescottes, jgraham, jimevans, orkon, sadym, sasha, simonstewart, simonstewart8, valerie_young, Westin, whimboo, Yoav, ZoeBijl
Regrets
-
Chair
David Burns
Scribe
AutomatedTester, David Burns, jgraham, simonstewart, simonstewart8, davehunt, simons

Meeting minutes

Autofill testing

github: w3c/webdriver-bidi#706

martin: [Introduces himself, Yoav, and Christian]
… why is autofill important? 30% of guest checkouts use autofill, and we see a 41% increase in completion rate
… (describes use case on slide 4 of the slides)

martin: complex case is where we change fields, e.g. for different countries with different address formats

martin: Different browsers handle this case differently at the moment. In some cases we get prefilled fields that are different from what users would expect

martin: this is a huge problem for our buyers. We need to implement workarounds to avoid autofill doing something unexpected.

martin: In the spec a form can define scopes e.g. shipping address vs billing address.

martin: we expect to only prefill one scope at a time, e.g. not auto-fill the shipping address when only trying to auto-fill the billing address

martin: We'd like to be able to write wpt to get interop here

martin: Simple tests where we can define a form, call a testdriver function, then fill a form given some autofill data.

martin: This is the proposal

martin: We have a webdriver bidi PR open. It's been approved by Chrome. We have requests for positions from Mozilla and WebKit.

martin: We also have proposed draft tests. We've changed the proposal recently, and the tests haven't been updated yet.

martin: [shows the filesystem layout of the proposed tests and what they cover]

martin: [shows the proposed specification from the PR]

martin: Pass in a form element, and the data that's assumed to be stored in autofill.
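
A minimal sketch of the shape described here, written as a pure-Python model rather than real testdriver or WebDriver BiDi code. The function name `trigger_autofill`, the dictionary layout, and the field names are assumptions made for illustration and are not taken from the PR. It also models the expectation that only one scope (shipping or billing) is filled at a time.

```python
# Hypothetical model of "pass in a form and the data assumed to be in autofill".
# Nothing here is a real testdriver or WebDriver BiDi API.

def trigger_autofill(form, autofill_data, scope):
    """Return a copy of `form` with fields in `scope` filled from `autofill_data`.

    `form` maps field name -> (autocomplete attribute, current value).
    `autofill_data` maps an autocomplete field type (e.g. "postal-code") -> value.
    """
    filled = {}
    for name, (autocomplete, value) in form.items():
        tokens = autocomplete.split()
        # A two-token attribute such as "billing postal-code" carries a scope.
        field_scope = tokens[0] if len(tokens) > 1 else None
        field_type = tokens[-1] if tokens else None
        if field_scope == scope and field_type in autofill_data:
            value = autofill_data[field_type]
        filled[name] = (autocomplete, value)
    return filled


form = {
    "ship_zip": ("shipping postal-code", ""),
    "bill_zip": ("billing postal-code", ""),
}
result = trigger_autofill(form, {"postal-code": "10115"}, scope="billing")
```

In this model the billing field receives the stored value and the shipping field is left untouched, which is the behaviour a wpt test along these lines would assert.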

martin: Autofill may not be fully specified, how to deal with that? We think this would help to make it more specified. In terms of browser vendor flexibility e.g. passwords, there's definitely some possibility to add flexibility e.g. making tests optional. We should at least get the majority of the cases tested somehow.

AutomatedTester: Can we get a link to the presentation?

[this will be forthcoming]

AutomatedTester: Is this proposal to simplify things rather than people using send keys to fill the form? I don't quite understand the testing use case.

simonstewart: WebDriver has a series of mechanisms for interacting. For a user, you start typing and it pops up an autocomplete with the fill options. Why is this different?

martin: How would you define what the state of autofill is?

<Yoav> https://drive.google.com/file/d/18HpZ8uNWV4AwzzODkoSERKAL5kUp6AJE/view?usp=sharing

gsnedders: You can't interact with the actual autofill UI with sendKeys. You need a way to interact with the prompt.

AutomatedTester: I was imagining it to be like file upload.

sadym: The proposed API covers not just address, but any kind of field. You can provide any set of fields including ones not in the form. What's expected?

martin: The idea is that you're actually passing in a form where the autofill elements are specified, and then you're setting the state of the autofill memory. The trigger should cause the browser to activate the provided autofill.

sadym: So it's performing two different actions?

martin: Yes. We thought about having a set autofill state command, but this requires state and that could leak between tests.

Yoav: Yes, the PR does two actions; filling the autofill store with data and then triggering. We could pivot to using sendkeys for the trigger part, but speccing the actually filled values requires jumping through some hoops. Figuring out if the autofill was triggered at all is hard. We need to inspect the data in a way that isn't UI related.

orkon: I read the proposal. I think it looks good and similar to something we have in CDP. This command doesn't actually modify the autofill state. It acts as a data provider for autofill. It avoids the problem of whether the user can actually access the autofill data; you don't need to worry about how the data gets into the autofill provider.

orkon: You can test that given the data is available to the browser that it's applied correctly to the form. I think I'm in favour.

martin: We don't want to test where we get the data from but what happens when we have it.

gsnedders: Feedback from WebKit. One concern is that there are a reasonable number of websites who view autofill as adversarial. Adding a testing API might make it easy for them to block autofill, which doesn't follow the priority of constituencies. That gives us reason to be hesitant in case authors try to defeat autofill. Browsers are already applying additional heuristics to help users.

gsnedders: When I looked at the earlier PRs, the trigger was on an input element rather than a form. If you're typing you get the prompt on a specific input element, but other fields will be filled. I'm not confident that autofill is form-scoped. We may fill into other forms.

gsnedders: That's needed for some websites.

gsnedders: There's the HTML notion of form associated elements distinct from the DOM tree.

gsnedders: Given the conflict between users and authors here, we don't want to over-specify this, especially in cases where there aren't auto-complete attributes.

martin: Can you elaborate on blocking autofill?

gsnedders: e.g. banks might try to avoid users being able to save or autofill passwords because they consider that a security feature.

martin: Could they just not specify the autocomplete attribute?

gsnedders: Browsers may still offer autocomplete

martin: What about just addresses rather than also considering passwords?

gsnedders: It's less of a problem, but where there are limited autocomplete attributes we can end up doing things where it's not completely apparent. But passwords are the worst case.

mmocny: You could write a wpt that tries to craft a form that tries to break the autocomplete field.

mmocny: Do we need a form target or should it be at the page level? Is that necessary?

martin: It makes the test easy to read.

mmocny: It seems like the user might want to test a page with multiple forms

simonstewart: You should pass in a specific input element you want it to trigger on

martin: Yes, and assume that the user has focused that field.

simonstewart: You locate the node and don't use focus.

sadym: I think it misses an important part of saving the state. There's no guarantee that the field would have been saved in the state. Would it be enough to just have the part where you trigger the autofill assuming the data is already saved?

martin: How would you make the browser aware that the autofill memory has specific data?

sadym: With a specific save method. That was the original proposal, why did it change?

Yoav: We want to test what happens if we have a given state. Another test could be to save a fake state, e.g. put data in a form and check that the data saved is the correct data.

Christian: One reason we changed it is that this should be transient information. Prefilling the autofill storage makes it more complicated if you have to navigate between many possible entries.

<Zakim> simonstewart, you wanted to react to mmocny

simonstewart: You save autofill data, then get a handle back to that data. You then use that handle to clear the state later, and also select the set of autofill data you want to use for this specific form element.

simonstewart: That would give you the ability to clean up, but also allow checking that the right data is saved
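
A minimal sketch of the handle-based pattern simonstewart outlines, assuming an in-memory stand-in for the browser's autofill storage. The class `AutofillStore`, its method names, and the handle format are hypothetical and do not correspond to any proposed command.

```python
import itertools


class AutofillStore:
    """Hypothetical in-memory stand-in for a browser's autofill storage."""

    def __init__(self):
        self._entries = {}
        self._ids = itertools.count(1)

    def save(self, data):
        """Store a set of autofill data and return an opaque handle."""
        handle = f"autofill-{next(self._ids)}"
        self._entries[handle] = dict(data)
        return handle

    def get(self, handle):
        """Look up saved data, e.g. to check that the right data was saved."""
        return self._entries[handle]

    def clear(self, handle):
        """Remove one saved entry so state does not leak between tests."""
        self._entries.pop(handle, None)

    def __len__(self):
        return len(self._entries)


store = AutofillStore()
handle = store.save({"postal-code": "10115"})
saved = store.get(handle)   # select this data set for a specific form element
store.clear(handle)         # clean up afterwards
```

Because every saved data set has its own handle, a test can both select which data to apply and remove exactly what it created.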

martin: Would we have to test both storage and autofill together, or could we still separate out those tests? We'd like to separate those things

simonstewart: We could also have an event when things are saved to the autofill database.

mmocny: The use case in the video where the form fields changed seemed to require two autocompletes to be triggered. Does the proposal allow that? Presumably the expectation is that the browser triggers autofill itself again, but I'm not sure that a single trigger command is enough to achieve what you want. Setting up state and sending keys like a user would be a better pattern. The current proposal bypasses some of the actual user flow and so might pass more often than a real browser would.

Yoav: To the point about the adversarial nature of some autofill fields, I think we could carve out something to allow browsers to do whatever they need to do to protect their users. Maybe one way to achieve that is to make this just address related. Would that address the concerns?

gsnedders: I'd need to ask others.

jimevans: It sounds like you want to validate that data was autofilled in a specific way when triggered by a user's action. In the bidi case you might want to consider an event that the browser sends back that says that autofill happened or is about to happen.

mmocny: For testing or as a web api?

jimevans: For testing. You want the test to react to what the browser has done. Otherwise there could be a race condition.

martin: We react to the event that says it's happened.

jimevans: You could maybe have the event tell you which fields were filled in, but that may not be possible. Just a notification would give the tests the ability to validate the autofill operation.

Yoav: Right now people can poll the CSS, but an event would be better.

sadym: First question is how well the browser behaviour is specified for autofill? Otherwise it's just do what you want and hope they match. I'm not sure we need an event if there's a command that can return once the autofill is completed. I don't see a scenario where the autofill is initiated by anything other than the command. Also, do you consider scenarios where we want to prevent someone from breaking the user experience, for example the examples that gsnedders gave earlier?

martin: Not sure if it's sufficiently well standardised, there's room for interpretation. Tests would make it clearer.

Yoav: For the testdriver part I don't think we need to tighten the spec. But for wpt we would.

sadym: In the proposal it currently says "do any other UA specific autofill steps"

Yoav: If we need to define what it means to trigger autofill.

sadym: I think it's a good first step.

orkon: I also think an event is not necessary for this API. The command should wait for the operation to complete. The event might be useful if we need to detect autofill done by the user manually. If we want to go back to the save operation, we need to think about the browser state and that would need to be defined. I think you could argue that other BiDi features could also be used in a user-hostile way. I think autofill is generally a positive thing. Would be good to know how many websites are trying to disable it. I think it's maybe not so many, especially for addresses.

<Zakim> gsnedders, you wanted to ask about latency with events

gsnedders: On a desktop OS you're typically typing something, and in the middle of those actions the event will happen. I think we're assuming that there's a single value store for each possible key. You can imagine typing something and you get an event indicating that something could be autofilled, but you've typed something that means that the autofill options have changed, so the data is stale. I think events are closer to how this actually works in the browser, but it's not ideal. On mobile you might have a separate autofill prompt. I don't know which option is actually better here.

mmocny: The proposal on the screen was hoping to be a nice clean wrapper. Breaking all those pieces up to give more control would be a different proposal. You wouldn't have a single state per input.

Yoav: If you want more fine-grained control you'd need both low-level and high-level commands

mmocny: The browser itself would still need to clear the state.

Yoav: This was useful. It would be useful to get responses to the open standards positions

Accessibility Platform API testing in WPT

Slideset: https://notes.igalia.com/p/UCAZ3KYIo#/

spectranaut_: <talks through presentation>

spectranaut_: I hope that everyone has seen the RFC, and we wanted to outline the "few" "small" technical puzzles we need to solve to run the tests in WPT

<alice> RFC: web-platform-tests/rfcs#204

spectranaut_: we've been calling the project "acacia" but the name doesn't appear in the code at the moment

alice: In the ARIA group we define mappings for assistive APIs to query.

alice: Each OS has different ways of communicating with the assistive technologies

alice: in webdriver there are already ways to query the computed accessibility tree

alice: we then transform the computed accessibility tree, which is done differently per platform

alice: we want to be able to query the transformed tree like the assistive technologies do

alice: that transformation is subject to a standard maintained by the ARIA WG. One for attributes, one for html, one for svg, mathml, and maybe css

<jcraig> Data viz graphic slide direct link: https://notes.igalia.com/p/UCAZ3KYIo#/2

alice: we want to test those standards the way we test other standards

alice: <shows slide with different platform APIs>

spectranaut_: it's a little complicated because there isn't a 1:1 mapping between platforms and platform APIs

alice: the demo of this tests the Linux API role of element with aria role of "blockquote". Ultimately tests the platform API

spectranaut_: basically, all the functionality is there to query more accessibility APIs on each platform, but it's not all been implemented

alice: each of the APIs has Python bindings available

alice: we can use those bindings directly in the WPT code

alice: we would like help taking from a proof of concept to something everyone is happy with running in the WPT

alice: I've linked to the RFC, but there's a lot of things in there

spectranaut_: we have specific questions

spectranaut_: we need to get the browser process ID in order to find the browser through the accessibility APIs.

spectranaut_: most drivers don't expose the PID

spectranaut_: there are two ways to get the PID from webdriver. 1: update the webdriver spec to say that the PID should be returned by New Session, or 2: go to each browser and ask for the PID to come back in each extension.

AutomatedTester: when you say "browser process", do you mean the parent process?

spectranaut_: yes, just the parent process

jgraham: I think for this use case a capability is fine because the PID should be stable.

jgraham: there are lots of cases where the PID is meaningless (eg. browser on a different machine)

jgraham: it's tricky to come up with a general solution, since when we start the session, we don't know where the browser is

simonstewart: <gives examples of how we don't know where sessions are>

spectranaut_: not sure what firefox does, but in the chrome code I'm only returning the PID of the desktop browser

spectranaut_: is accidentally returning the PID a problem?

alice: we would need to restrict running those tests to being certain we're on the same machine

spectranaut_: we can only test the accessibility API if it's on the same machine

spectranaut_: we can check the name of the browser against the PID to see if it's what we expect
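
A minimal sketch of the sanity check mentioned here, assuming a POSIX system where `ps -p <pid> -o comm=` is available. The function names are hypothetical, and this is not what the proof of concept does. The lookup is injectable so the check can be exercised without a running browser.

```python
import subprocess


def process_name(pid):
    """Return the command name for `pid` using POSIX `ps`, or None if not found."""
    result = subprocess.run(
        ["ps", "-p", str(pid), "-o", "comm="],
        capture_output=True, text=True, check=False,
    )
    name = result.stdout.strip()
    return name or None


def pid_matches_browser(pid, expected_name, lookup=process_name):
    """True if the process behind `pid` looks like the expected browser."""
    name = lookup(pid)
    return name is not None and expected_name.lower() in name.lower()
```

If the PID reported by the driver belongs to a process on another machine, the local lookup will either find nothing or find an unrelated process, and the accessibility tests can be skipped.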

jgraham: I can't remember what geckodriver does. But if you launch the browser, you likely know you're on the same machine

jgraham: you could just ask the browser what the PID is, and in that case you may not know where the browser is

jgraham: because your use case depends on everything being on the same machine, then worrying about the location of the browser seems like an unnecessary rathole

AutomatedTester: adding to what jgraham is saying, most people hit these situations with remote endpoints (e.g. through BrowserStack or Docker), where the PID that is returned is one on a different machine. If you're always on the same machine, I'd just carry on

<Zakim> Jamie, you wanted to ask whether we could return an opaque accessibility specific connection address or something like that

Jamie: I was just wondering if there's any use in having an "accessibility connection address", which would be opaque, but not necessarily be a PID

Jamie: I don't have specific thoughts about how this would work, but Acacia is the only thing that's interested in it

Jamie: would allow for different use-cases we can't currently support

(I note that the Firefox implementation just uses the PID on whatever machine it's running on)

alice: I like that idea

Jamie: I was thinking it might be something returned by webdriver.

alice: you'd just poke that ID through to the accessibility API?

spectranaut_: the next question is about python dependencies

spectranaut_: I don't know how to include those for WPT users. One way would be to add them to the top-level requirements. Maybe that's fine, even if it's just for a couple of tests

spectranaut_: I did notice the concept of conditional requirements, and you can run wpt with a flag, but that flag likely won't be used anywhere else in the code

spectranaut_: I'd like to know who manages the data collection for wpt.fyi

jgraham: <itches to give answers>

jgraham: yes

jgraham: are these binary or pure python deps?

spectranaut_: I'm not sure

alice: at least the macOS ones are wrapping C

jgraham: it's likely that they're binary deps, which makes things more complicated

<Jamie> On Windows, comtypes only needs ctypes, which is part of Python stdlib

jgraham: what we normally do with WPT is we vendor the dependencies into the tree since that makes it easier to integrate with vendor CI systems

jgraham: because dealing with C in vendor things is complicated, we have a flag to ensure that the deps are available

jgraham: it's not a great solution, but that's likely the right path to follow here

jgraham: you'll end up with a specific requirements file to support that

alice: what's the best place to follow up?

jgraham: the Matrix channel, #wpt:matrix.org

jgraham: wpt.fyi is updated by CI

<Jem> so which option we are leaning to among three?

jgraham: in the repo we have the configuration of tests runs we execute, and we can enable those.

jgraham: just want to avoid landing a patch that suddenly causes binary deps to be required

jgraham: <discusses how to optionally run tests>

gsnedders: Everyone loves to hear me talk

gsnedders: one of the problems has historically been that the dependencies required to build a C extension were not always the right version for things to compile

gsnedders: in cases where building the extension becomes complicated things become more fraught.

gsnedders: it may be okay, since this is likely on the platform already, but there are vendors in the room who may want to run the tests on pre-release versions of OSs

(/me notes that python now has the ability to build wheels with a stable ABI)

gsnedders: we'd need different deps per OS

gsnedders: that can be described in the requirements files. It can be done
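
A minimal sketch of per-OS dependencies expressed with the `sys_platform` environment marker that requirements files support. Only `comtypes` is mentioned in the discussion; the other package names are placeholders, and the helper names are hypothetical.

```python
import sys

# Package names other than comtypes are placeholders, not recommendations.
ACCESSIBILITY_DEPS = {
    "win32": ["comtypes"],
    "linux": ["example-atspi-bindings"],
    "darwin": ["example-macos-ax-bindings"],
}


def requirement_lines(deps=ACCESSIBILITY_DEPS):
    """Render one requirements-file line per package, guarded by a platform marker."""
    return [
        f'{package}; sys_platform == "{platform}"'
        for platform, packages in sorted(deps.items())
        for package in packages
    ]


def deps_for(platform=sys.platform, deps=ACCESSIBILITY_DEPS):
    """Return the packages that would be installed on `platform`."""
    return deps.get(platform, [])
```

Keeping these lines in a separate requirements file, as jgraham suggests, means the binary dependencies are only pulled in when the accessibility tests are explicitly enabled.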

spectranaut_: I guess I would say that the complexity can be kicked down the road if we don't expect vendors to run these yet

<gsnedders> (/me notes that nobody uses the stable ABI)

jgraham: maybe take this to a different venue later

20 minute break

spectranaut_: Ideally, in the long term, looking at the test results will show the correct number of assertions per platform / column

spectranaut_: We would like some kind of "non applicable" concept

spectranaut_: Non-applicable or skipped concept in the test harness

jcraig: Maybe there's a way we can mark this in wpt-metadata per subtest?

jgraham: This is all possible. We can just make it run the subtests that are relevant to the configuration.

jgraham: This feels very solvable.

spectranaut_: We send the test assertions back for each accessibility API. If there's no results then we know the test is not relevant.
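
A minimal sketch of how empty result sets could be mapped to a "not applicable" status, assuming results arrive as a list of booleans per accessibility API. The status strings and the function name are hypothetical, and the API names are only illustrative keys.

```python
def summarise(results_by_api):
    """Map each accessibility API to PASS, FAIL, or NOT_APPLICABLE."""
    summary = {}
    for api, assertions in results_by_api.items():
        if not assertions:
            # No assertions came back: the test does not apply to this API.
            summary[api] = "NOT_APPLICABLE"
        elif all(assertions):
            summary[api] = "PASS"
        else:
            summary[api] = "FAIL"
    return summary


summary = summarise({"AT-SPI": [True, True], "UIA": [], "AXAPI": [True, False]})
```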

jgraham: When you start the run you should have some way of knowing what you want/expect to run.

alice: We talked about needing to test accessibility code that's typically not run. It needs to be turned on, and it can be tricky to turn it back off. We would like tests in a directory to have their own browser instance.

jgraham: This is a solved problem. We can help.

Is someone willing to be a contact person to help with WPT accessibility tests?

jgraham: Yes. Asking questions in the Matrix channel is good for async questions.

jgraham: There are a couple of people in Panos' team, also maxim(sp?) that could help

spectranaut_: The PR is a proof of concept, but we'd appreciate someone looking over it. Right now it's on the Igalia fork

<alice> POC PR: Igalia/wpt#2

jgraham: Suggest moving PR to wpt repo as a non-draft

spectranaut_: that's all of our questions, thank you

jcraig: would it be helpful to have kyle and/or sam sneddon to review?

Accessibility module for WebDriver BiDi

github: w3c/webdriver-bidi#443

orkon: Since last year we have the ability to locate DOM elements by accessibility attributes

orkon: Still not supported is getting snapshots of the accessibility tree

whimboo: I added this as it was a request from Playwright

jimevans: We've already done the location of nodes. Being able to get the snapshot would be useful to me

orkon: We can't implement because there's no agreement yet on what the accessibility tree looks like

<Zakim> jcraig, you wanted to ask the need for the tree, since the implementations will be different across WPT tests.

<whimboo> orkon: you were right. the interop 2024 initiative was the reason

jcraig: Lot of accessibility automation stacks being worked on. WPT tests currently exist entirely in the browser.

jcraig: We have an investigation to implement a tree dump. There's going to be different trees. I'd like to understand more about what the needs are for getting the tree.

jimevans: Cross browser tree comparisons is something that we would be interested in.

jcraig: It's achievable in the near future to do comparisons by walking the tree if we have limited known test cases

Jamie: The thinking is that if we get accessible node working well then it's possible to write a tree dump as a layer on top of that. Important to get accessible node right first. I think we're heading in a direction that allows this.

<Jamie> web-platform-tests/interop-accessibility#51 (comment)

<jcraig> TreeDump Investigation (abandoned) web-platform-tests/interop-accessibility#51

jgraham: Puppeteer exposes an API for accessibility trees, Playwright API redirects to alternate tools. Agree that we don't expect the accessibility tree to be comparable between browsers in the near future.

<jcraig> Some explanation of the reasoning for moving to a treewalker plan web-platform-tests/interop-accessibility#90

Judging priorities for new requested features

whimboo: last time at TPAC we created a sheet for prioritising our roadmap, I have added new entries based on recent feedback on client priorities

AutomatedTester: you met with cypress folks, is there a column for this?

whimboo: not yet, but plan to add it

AutomatedTester: there are only three items for Playwright, are more going to be added later?

whimboo: Playwright have a lot more P1s, but want to update based on usage

whimboo: the current P1s are the "must have" items from Playwright

AutomatedTester: do we have a priority list for cells without a priority?

whimboo: not yet, it will be filled in

ACTION: David to reach out to Christian Bromann for priorities for wdio

ACTION: David to look at Selenium items and work with them on Priorities

ACTION: whimboo to add priorities for Playwright and Cypress.io to spreadsheet

AutomatedTester: this is a good list, and very useful to have

ACTION: orkon to work with puppeteer team to get their priorities added to the spreadsheet

Support retrieval of content quads for elements with CSS transitions

github: w3c/webdriver-bidi#787

https://test.csswg.org/harness/test/geometry-1_dev/

<whimboo> https://github.com/web-platform-tests/wpt/tree/fd25aeb946a8df94314cf6fb4fe63b46eee2cd8c/css/geometry

<gsnedders> https://wpt.fyi/results/css/geometry?run_id=5201124441456640&run_id=5179680340836352&run_id=5184785144348672 would be a better link for test results, rather than the old CSS WG Test Harness (which is the same tests)

<gsnedders> https://drafts.csswg.org/cssom-view/#dom-geometryutils-getboxquads seems like it should give a DOMQuad with what we want… if it were specified. Or implemented.

<gsnedders> c.f. https://wpt.fyi/results/css/cssom-view/idlharness.html?run_id=5201124441456640&run_id=5179680340836352&run_id=5184785144348672 for evidence of lack of implementation of getBoxQuads

jgraham: So the understanding I have is that getBoundingClientRect returns a DOMRect, which is the bounds of a rectangle containing the element. For transformed elements that's different to the actual bounds of the element if those are non-rectangular. DOMQuad in https://drafts.fxtf.org/geometry/#DOMQuad represents the latter. This is what CDP exposes via its API. There is a draft API for getting these values from content in CSSOM-view https://drafts.csswg.org/cssom-view/#dom-geometryutils-getboxquads however this is basically all TODO.

This is actually implemented in Firefox.

So the problem we have here is that in terms of a standard we don't have anything to point at to explain the behaviour of the requested API. We could implement it in Gecko and Chromium, but it would be nonstandard or implementation-defined. If we actually wrote a standard here we might as well finish CSSOM-view and then expose it to content (assuming there's no compat issue), and then clients could use that via script. Having a BiDi only API here seems rather odd.
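
A small worked example of the difference between the two shapes, assuming a 100x100 box rotated 45 degrees about its centre. This is plain geometry in Python with hypothetical function names, not a model of any browser API. The four corner points play the role of a DOMQuad and the enclosing axis-aligned box plays the role of the DOMRect.

```python
import math


def rotated_quad(x, y, width, height, degrees):
    """Corner points of a rectangle rotated about its centre."""
    cx, cy = x + width / 2, y + height / 2
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = [(x, y), (x + width, y), (x + width, y + height), (x, y + height)]
    return [
        (cx + (px - cx) * cos_a - (py - cy) * sin_a,
         cy + (px - cx) * sin_a + (py - cy) * cos_a)
        for px, py in corners
    ]


def bounding_rect(quad):
    """Axis-aligned rectangle (x, y, width, height) enclosing the quad."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


quad = rotated_quad(0, 0, 100, 100, 45)
rect = bounding_rect(quad)
```

Here the bounding rectangle is about 141.4 units wide and tall, twice the area of the element itself, so a click computed from the rectangle's corner regions would miss the element while one computed from the quad would not.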

jgraham: we need someone from layout or from the CSS WG to help us out here. We should talk to Emilio

ACTION: jgraham to speak to Emilio and get advice on what we can do here

Emitting Events During "browsingContext.create" when "about:blank" is loaded

github: w3c/webdriver-bidi#766

whimboo: this issue was originally raised by sadym
… this is about getting an event when the context is being created, and then when we can navigate
… so the question is what do we want to do with the creation of new tabs/documents?
… do we want to handle navigationStarted for about:blank?

orkon: I checked the html spec and I couldn't see if there is a navigationStarted with about:blank
… perhaps we need to reverse the test so it doesn't have the event on about:blank

whimboo: we are happy to change this
… and this way we can improve the clients so they have a better idea when to work with the tab
… and I am not sure about the implementation status of these things in chrome and if/when it will be ready

jdescottes: when I was testing this in Firefox we weren't running events for navigation on about:blank
… are there any scenarios where we have extra events in the browser?

<orkon> https://github.com/web-platform-tests/wpt/pull/47958/files#diff-c773a239b93f48290e8f834f6fd9c337720fb92b206b65e03ce1c64633789032

whimboo: we fire the created event early... this is a combined question of should we send anything out before the creation of the tab

orkon: there are wpt tests that I have linked
… [describes test]. This test wasn't passing in Chrome but was in Firefox

jdescottes: thanks for the link. The item about the creation event coming out too early... I think we should talk about this.

<sadym> This is the mentioned test: web-platform-tests/wpt#47958

jdescottes: in firefox we have changed the way that we emit events so that we do it after the creation of the context

gsnedders: Are you saying the load is async in firefox? I thought you could always access the document in the iframe in a sync way

whimboo: this is for the new top level browser context

gsnedders: ok, that wasn't clear in the issue

orkon: looks like we're talking about 2 different issues. This is about the browserContext creation with about:blank

<sadym> https://github.com/web-platform-tests/wpt/pull/47958/files#diff-c773a239b93f48290e8f834f6fd9c337720fb92b206b65e03ce1c64633789032R14-R29

orkon: in firefox there are navigationStarted events being fired for about:blank

sadym: I've linked to the test

jgraham: for the case of initial about:blank we should treat it like it is sync
… e.g. create a new tab with about:blank you can start interacting with it straight away

jgraham: and not emit a navigation started event
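
A minimal sketch of how a test could check this expectation, assuming the events received after creating a context have been collected as dictionaries with `method` and `params` keys. The helper name and the sample data are hypothetical.

```python
def initial_blank_navigations(events):
    """Return navigationStarted events whose URL is the initial about:blank."""
    return [
        event for event in events
        if event["method"] == "browsingContext.navigationStarted"
        and event["params"].get("url") == "about:blank"
    ]


# Hypothetical event log captured after creating a new tab.
events = [
    {"method": "browsingContext.contextCreated",
     "params": {"context": "ctx-1", "url": "about:blank"}},
]
unexpected = initial_blank_navigations(events)
```

Under the behaviour discussed here, `unexpected` would be empty for a freshly created tab.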

gsnedders: is there a use case where that is not about:blank?

[discussion about checking we are all started from about:blank and navigationStarted events]

whimboo: firefox has 2 events emitted for about:blank: a sync and an async event
… and we should treat new window creation special if there is a URL in there

whimboo: how would window creation handle this?

orkon: the html spec says there is some special treatment for about:blank, e.g. if it has params we need to update the history

whimboo: this should be possible for us to do this and not emit the event

ACTION: sadym to update test to show that events are not emitted on about:blank

Implement input.FileDialogOpened

github: w3c/webdriver-bidi#568

sadym: I need to go review this and get back to the WG

Browsing Context Replacements and Resilient Command Execution

github: w3c/webdriver-bidi#540

whimboo: this was raised a while ago
… when we have browsing context and we have changed it to navigables
… when we architected this we worked with browsing contexts
… which was fine when there was creation since we just sent an about event
… we can do some basic things like get the event
… but we don't know what to do with actions when the page has been replaced partway through the execution
… we can repeat the command if the context gets replaced and error if things are missing that we are expecting.
… if we agree that we want to rerun the commands which is what is in firefox we can close this issue and raise another
… we noticed this with the latest change to navigables
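
A minimal sketch of the "repeat the command if the context gets replaced" idea, assuming the replacement surfaces as an exception. The exception class and the function name are hypothetical, and this does not describe the Firefox implementation.

```python
class ContextReplacedError(Exception):
    """Hypothetical signal that the browsing context was swapped mid-command."""


def run_with_retry(command, max_attempts=3):
    """Re-run `command` if the context is replaced; give up after `max_attempts`."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return command()
        except ContextReplacedError as error:
            last_error = error
    raise last_error


attempts = []


def flaky_command():
    """Fails once, as if the page was replaced, then succeeds."""
    attempts.append(1)
    if len(attempts) < 2:
        raise ContextReplacedError("context replaced during execution")
    return "done"


outcome = run_with_retry(flaky_command)
```

Any other error, such as an expected element being missing, is not caught and is reported to the caller straight away.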

jgraham: I don't think it's obvious what we should do if executing the script and the context goes away, e.g. waiting for a promise
… I don't think it's defined. It could be "wait for ever"...

sadym: it highlights another issue where we have navigables and contexts
… previously we said we didn't want to redo that work as it was a major breaking change

jgraham: I don't think we have a resolution... we want to perhaps have a look at doing this for all commands. I think we need to relook at it

Streaming request/response bodies

jimevans: not sure I have a specific design in mind, but I know that I want the feature

jgraham: I think this is a high priority feature for various use cases. Everybody wants it

jgraham: if we have a large response, sending that is probably a bad idea

jgraham: similarly for writes. We want to be able to do that in pieces, rather than all at once

jgraham: The question is, what does this look like?

jgraham: I was looking at this in the Fetch spec, and it's all expressed in the Streams API, so a possible design approach would be to have a representation of a stream that you could read or write bytes to.

jgraham: Potentially that would let us use a JS Stream object, so maybe we'd get a feature for free, and we'd have less to spec.

jgraham: You can imagine that there could be a way to create a stream object in JS and return that over the protocol.

jgraham: I don't have a use case for that, but if we did things that way, that functionality would be easy to implement

jgraham: I don't know if that would actually work, but I'd like it to....

simonstewart: we currently only use text messages over websockets, but we _could_ use a binary message

jgraham: I know that there's been talk of non-websocket backends, but I don't know if we need to couple that to this

simonstewart: does that mean you have a preference for just sending text messages?

jgraham: yes, because otherwise we'd need to figure out another way of structuring the data

AutomatedTester: If someone wanted to change the response of audio and video coming through in a stream, what kind of format would we need for that?

jgraham: it's just bytes from our point of view (base64 encoded)

AutomatedTester: the use case I'm thinking of is if someone wants to inject frames into their testing to find out how quickly things are coming through

jgraham: There's lots of things we could do, but I think the API we want is something that allows us to read or write chunks of data until we're done

sadym: I'm looking how CDP implements it. It just says "get me the next chunk" and it sends a base64 chunk

jimevans: it takes a chunked response

jgraham: I'm thinking of the platform stream, which gives you a read method, which gives you bytes in a "view" (not sure what a "view" is)

jgraham: it reads bytes into an array buffer

jgraham: I think the key thing is that if you had a stream object, it would give you a handle you can use in this kind of API for reading or writing data

simonstewart: it's easy enough to make a chunked approach look like a stream on the local end, so having a handle and reading and writing bytes in a chunked way seems like a sensible approach
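
A minimal sketch of how a local end might make a chunked "give me the next chunk" call look like a stream. The `read_chunk(handle)` callable and its `(base64_text, eof)` return shape are assumptions for illustration; no such BiDi command was defined in this discussion.

```python
import base64
from typing import Callable, Iterator, Tuple

# Hypothetical shape: read_chunk(handle) returns (base64_text, eof).
ReadChunk = Callable[[str], Tuple[str, bool]]


def iter_body(handle: str, read_chunk: ReadChunk) -> Iterator[bytes]:
    """Yield decoded bytes until the remote end reports end of stream."""
    while True:
        data, eof = read_chunk(handle)
        if data:
            yield base64.b64decode(data)
        if eof:
            return


def fake_remote(body: bytes, chunk_size: int) -> ReadChunk:
    """Stand-in for a remote end; serves `body` in base64 chunks."""
    offset = 0

    def read_chunk(handle: str) -> Tuple[str, bool]:
        nonlocal offset
        piece = body[offset:offset + chunk_size]
        offset += len(piece)
        return base64.b64encode(piece).decode("ascii"), offset >= len(body)

    return read_chunk
```

A caller can then consume the body lazily, for example `for piece in iter_body("h1", read_chunk): ...`, without holding the whole response in memory.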

jgraham: I think someone needs to find out whether this works in detail, since we're not sure if anyone knows for sure

jgraham: maybe there's some conflict between the ecmascript definition and what we need

jgraham: We want something that gives you a handle you can read from and a handle to something you can write to
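
For the write direction, a sketch of splitting a body into base64 chunks that fit in JSON text messages. The message fields (`handle`, `data`, `done`) are hypothetical names chosen for this example, not a proposed wire format.

```python
import base64
import json
from typing import Iterator


def write_messages(handle: str, body: bytes, chunk_size: int) -> Iterator[str]:
    """Yield JSON text messages that carry `body` in base64 chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(body), chunk_size):
        piece = body[start:start + chunk_size]
        yield json.dumps({
            "handle": handle,  # hypothetical field names throughout
            "data": base64.b64encode(piece).decode("ascii"),
            "done": start + chunk_size >= len(body),
        })
    if not body:
        # An empty body still needs one message to close the stream.
        yield json.dumps({"handle": handle, "data": "", "done": True})
```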

jgraham: maybe the conclusion is "go and do the work"

jimevans: I think we'd come to that conclusion in the GitHub issue already. I agree we need to just go and do the work

sadym: is the network module the best place to put this?

jgraham: that makes the most sense

simonstewart: we'll need to hook that into `provideResponse`

ACTION: someone to do the work of implementing the read and write streams

Interception of created navigables

github: w3c/webdriver-bidi#756

sadym: there are some use cases when users want to prepare subscriptions for events after the browsing context is created, but before anything has happened

sadym: it's analogous to connecting a debugger

sadym: we can implement this with global subscriptions, but that seems suboptimal. I'm not sure if there are scenarios that can't be implemented this way

jgraham: I agree that this is a use case we need to address.

jgraham: particularly for something like `window.open`, you'd want the same subscriptions to be taken to the new window, but in general you'd want this if a new context is created.

jgraham: I'm not sure if there's a way to pause that creation

jgraham: I don't know if the debugger in Firefox has to deal with this, since it only deals with one window at a time.

jgraham: One can imagine other ways of handling this, but it's hard to know if you can pause things

jgraham: such as an event before a navigable is created, and then respond with a list of subscriptions to add

AutomatedTester: we know that some people want to be able to have sessions per tab because of the way that they run their tests. If we do this inheritance model, how would that work for that use case?

jgraham: there is a little bit of overlap with subscribing to stuff for specific user contexts. That is, if a tab is opened in this specific user context, then <scribe gets lost attempting to listen and type at the same time>

jgraham: if you could configure events at the user context level, that might solve a bunch of issues
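
A sketch of what configuring events at the user context level could mean in practice: a context created later in that user context is covered from the moment it exists, with no extra subscribe call. The class and method names are hypothetical; only the event names are taken from WebDriver BiDi.

```python
from collections import defaultdict


class SubscriptionRegistry:
    """Tracks event subscriptions globally and per user context."""

    def __init__(self):
        self._global = set()
        self._by_user_context = defaultdict(set)
        self._user_context_of = {}  # browsing context id -> user context id

    def subscribe(self, event, user_context=None):
        """Subscribe globally, or for one user context if one is given."""
        if user_context is None:
            self._global.add(event)
        else:
            self._by_user_context[user_context].add(event)

    def context_created(self, context, user_context):
        """Record a new context; it is covered from creation onwards."""
        self._user_context_of[context] = user_context

    def events_for(self, context):
        """Return every event that should be emitted for `context`."""
        user_context = self._user_context_of[context]
        return self._global | self._by_user_context.get(user_context, set())
```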

sadym: another use case is setting the viewport when the browsing context is opened

sadym: it's not event related and it's not solvable currently

jgraham: yesterday, we discussed that we can't set some of the emulations at the top-level anyway because we think it might be process-level state.

jgraham: a possibility is that we'd only allow setting those things for user contexts

jgraham: and if you could also do "set viewport", but it applied to a user context and not a global, maybe that would be a declarative solution to a large subset of these issues

jgraham: we'd have global, user contexts, and browsing contexts, in that order
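
A sketch of resolving a setting across the three levels listed here. The minutes give only the ordering; that the most specific level wins is an assumption of this example, as are the function name and the dictionary layout.

```python
def resolve_setting(name, global_cfg, user_cfg, context_cfg):
    """Return the most specific value set for `name`, or None.

    Assumed precedence: individual context, then user context, then global.
    """
    for layer in (context_cfg, user_cfg, global_cfg):
        if name in layer:
            return layer[name]
    return None
```

For example, with a global viewport of 1280x720 and a user-context viewport of 390x844, a new window in that user context would resolve to 390x844 unless it has its own override.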

sadym: does that cover all use cases?

jgraham: it's definitely less flexible than being able to pause creation

ACTION: Julian to see if gecko can pause creation of new navigables

sadym: but we'd not be able to do what we can with CDP

sadym: but you're saying we can set things at the browser context or user context?

jgraham: yes

jgraham: I suspect that most test clients will want to be able to say something like "I'm running some tests, and I'd like to have all windows emulate the same conditions". This would allow that to happen

gsnedders: I don't know if webkit can pause the creation of new navigables

sadym: I wonder how firefox with cdp worked?

jgraham: I'm not sure it worked, but if it did, then this suggests it's possible

jgraham: it does sound complicated, and we should investigate properly

Request for a navigationCommited event to inform immediately about the navigation status

github: w3c/webdriver-bidi#788

ACTION: someone in chrome to find out what a `navigationCommited` actually is, and what it corresponds to in a spec

Support setting page content

github: w3c/webdriver-bidi#759

simonstewart: why couldn't you just do this by executing JS?

AutomatedTester: this is something that puppeteer has, but it depends on `document.write` which is deprecated.

AutomatedTester: the only major use case is maybe for WPT, but I can't see ordinary testers wanting to do this.

AutomatedTester: for component testing, people load the components into the page rather than just injecting it in because they need styling and additional content

<gsnedders> +1 for simonstewart's comment ("why couldn't you just do this by executing JS"); seems like it should be the same as navigating to about:blank and just calling document.open/write?

jgraham: does puppeteer use this a lot?

sadym: I'm not sure

<debate about whether regular users would do this>

simonstewart: what is replacing `document.write`?

gsnedders: nothing. You don't want to modify the parser state while parsing

AutomatedTester: what about navigating to a data URL?

gsnedders: it may well have a different origin

gsnedders: what are the semantics of this in CDP?

gsnedders: do those semantics matter?

simonstewart: could you use `document.outerHTML`?

<general agreement that it's not the right thing to do>

jgraham: does someone in the puppeteer team need this now, rather than just calling "executeScript"?

sadym: how far is `document.write` from being removed?

gsnedders: impossibly far away

jgraham: it might be hooked up to some logging saying "this is deprecated, please stop", but implementations could handle that by treating it differently if it came from a sandbox

jgraham: if we had a special command in gecko, it'd just be calling the JS
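
A sketch of the script-based route discussed in this topic: navigate to `about:blank`, then call `document.open`/`document.write` through script evaluation. It only builds the two command payloads (`browsingContext.navigate` and `script.evaluate`); the helper name, the fixed command ids, and the choice of `wait: "complete"` are assumptions for illustration.

```python
import json


def set_content_commands(context: str, html: str) -> list:
    """Build BiDi commands that replace a page's content via script."""
    # json.dumps produces a quoted, escaped string that is also a valid
    # JavaScript string literal, so the markup cannot break out of it.
    script = (
        "document.open();"
        f"document.write({json.dumps(html)});"
        "document.close();"
    )
    return [
        {
            "id": 1,
            "method": "browsingContext.navigate",
            "params": {"context": context, "url": "about:blank", "wait": "complete"},
        },
        {
            "id": 2,
            "method": "script.evaluate",
            "params": {
                "expression": script,
                "target": {"context": context},
                "awaitPromise": False,
            },
        },
    ]
```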

sadym: I don't have a firm answer yet

simonstewart: so there's general agreement this isn't something that needs to be done just yet?

jgraham: yes, but if puppeteer come back and say it's a priority, no-one is against adding it

RRSAgent: make minutes

BrowsingContext.captureScreenshot: restrict to top-level contexts only

github: w3c/webdriver-bidi#441

sadym: in bidi you can capture a screenshot, and you can capture a screenshot of an iframe beyond its viewport. It's not feasible to implement in chrome

sadym: our proposal is to keep the standard implementable.

sadym: we suggest restricting screenshots beyond viewports to the top-level browsing context

jgraham: gecko can implement this. You're allowed to return an unsupported operation if an implementation can't do it

jgraham: this doesn't seem like an important enough point of difference to say "if one browser can't do it, no-one can do it"

jgraham: people usually take screenshots for informative purposes

jimevans: do we need a spec change to allow people to not return a result?

jgraham: yes

<Zakim> gsnedders, you wanted to ask if we know of any use-cases for entire viewport screenshots within a frame

jgraham: no, it doesn't need a spec change

gsnedders: do we know of any use case of people wanting to take a screenshot of the viewport of a frame?

simonstewart: We used to get bug reports in selenium

simonstewart: People don't generally know the distinction between the top level viewport and an iframe

simonstewart: Can Chrome implement print for an iframe?

sadym: I don't think so.

jimevans: with respect to screenshots, the most common requested case for full-page screenshots is at the top level

jimevans: and if that page contains an iframe, typically speaking, the iframe has a size associated with it.

jimevans: what you get in the screenshot is whatever is currently displayed in the iframe, bounded by the size of the iframe's viewport

jimevans: someone asking for a screenshot of the entire content of an iframe is an unusual occurrence, even from the selenium context

jimevans: I think that if the browsing context of an iframe is given to the screenshot command, it's okay to return an unsupported operation exception (because I think it's a rare edge case)
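
A sketch of the check an implementation might run before taking a screenshot, following the position above. The function name and the `supports_frame_document` flag are hypothetical; `origin` mirrors the `"viewport"` / `"document"` values of `browsingContext.captureScreenshot`.

```python
class UnsupportedOperation(Exception):
    """Maps to the WebDriver "unsupported operation" error code."""


def check_screenshot_request(is_top_level: bool, origin: str,
                             supports_frame_document: bool) -> None:
    """Raise if this implementation cannot take the requested screenshot."""
    if origin not in ("viewport", "document"):
        raise ValueError(f"unknown origin: {origin!r}")
    beyond_viewport = origin == "document"
    if beyond_viewport and not is_top_level and not supports_frame_document:
        raise UnsupportedOperation(
            "full-document screenshots are only supported for top-level contexts"
        )
```

Viewport screenshots pass the check in every case, which matches the requirement that the viewport is always available.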

jgraham: if some implementations can support it and some can't, this doesn't seem like a major interop issue

sadym: OK. Sounds fine

AutomatedTester: so long as we have the viewport in all cases, that's fine. Just thinking of the visual testing cases

Allow browser configuration by user context and not only globally

github: w3c/webdriver-bidi#789

jgraham: some people would like to treat different user contexts as completely separate sessions so that they can have a single browser process

jgraham: from a gecko point of view, I don't think we can implement this in general

jgraham: we use containers, which are extra attributes on storage. I don't think we have proxies per user context, etc

gsnedders: we already create a separate NSURLSession for each ephemeral browsing context, so I think we should be able to configure the NSURLSessions we're using for each top-level navigable separately. However, I'm not totally sure

simonstewart: presumably this is for "speed"?

<general agreement>

sadym: for us, proxies are per process.

sadym: accepting insecure certs, we can do per user context

sadym: ah! You can set proxy

jgraham: the thing we can do here is to add some configuration options to create user context, which are roughly equivalent to ones from capabilities that we can set.

jgraham: per thing we try and set, we need to check whether or not it's possible for people to do

sadym: do we want to use capabilities when creating user contexts?

jgraham: very much no

jgraham: you can always return "unsupported operation" if someone asks for something you can't implement

simonstewart: we may just want to limit the capabilities we can pass through

jgraham: yes, to the things in capabilities that are configuration values

simonstewart: limiting to acceptInsecureCerts and proxy seems like something people can implement

sadym: unhandledPromptBehavior too
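
A sketch of the limited per-user-context configuration discussed above, written with the capability names from the WebDriver specification (`acceptInsecureCerts`, `proxy`, `unhandledPromptBehavior`). Passing these as parameters of `browser.createUserContext` is an assumption of this example, as are the helper and its `supported` argument.

```python
# Options the discussion suggests allowing when creating a user context.
ALLOWED_OPTIONS = frozenset(
    {"acceptInsecureCerts", "proxy", "unhandledPromptBehavior"}
)


class UnsupportedOperation(Exception):
    """Maps to the WebDriver "unsupported operation" error code."""


def create_user_context_command(options, supported=ALLOWED_OPTIONS):
    """Build a createUserContext command, rejecting what cannot be honoured."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"not a user context option: {sorted(unknown)}")
    unsupported = set(options) - set(supported)
    if unsupported:
        # The implementation knows the option but cannot apply it per context.
        raise UnsupportedOperation(f"cannot set per user context: {sorted(unsupported)}")
    return {"method": "browser.createUserContext", "params": dict(options)}
```

An implementation whose proxies are per process would pass `supported={"acceptInsecureCerts", "unhandledPromptBehavior"}` and return "unsupported operation" for `proxy`.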

RRSAgent: make minutes

<gsnedders> github-bot, end topic

Summary of action items

  1. David to reach out to Christian Bromann for priorities for wdio
  2. David to look at Selenium items and work with them on Priorities
  3. whimboo to add priorities for Playwright and Cypress.io to spreadsheet
  4. orkon to work with puppeteer team to get their priorities added to the spreadsheet
  5. jgraham to speak to Emilio and get advice on what we can do here
  6. sadym to update test to show that events are not emitted on about:blank
  7. someone to do the work of implementing the read and write streams
  8. Julian to see if gecko can pause creation of new navigables
  9. someone in chrome to find out what a `navigationCommited` actually is, and what it corresponds to in a spec
Minutes manually created (not a transcript), formatted by scribe.perl version 229 (Thu Jul 25 08:38:54 2024 UTC).

Diagnostics

Maybe present: jcraig, martin, mmocny, RRSAgent, spectranaut_

All speakers: alice, AutomatedTester, Christian, gsnedders, Jamie, jcraig, jdescottes, jgraham, jimevans, martin, mmocny, orkon, RRSAgent, sadym, simonstewart, spectranaut_, whimboo, Yoav

Active on IRC: alice, AutomatedTester, Christian, davehunt, gsnedders, hdv, Jamie, jcraig, jdescottes, Jem, jgraham, jimevans, mmocny, orkon, sadym, sasha, simons, simonstewart, simonstewart8, spectranaut_, Westin, whimboo, Yoav, ZoeBijl