Except doing so is probably much more complicated than actually dealing with the...

mwcampbell · on Aug 1, 2021

I doubt that. Chromium's internal accessibility tree is already serializable; it has to be, so it can be sent from the renderer process to the main process. So Cloudflare's modified Chromium could send that tree down to their JS-based client, which could then construct a DOM with the appropriate HTML tags and ARIA attributes. This DOM wouldn't have any JavaScript or any references to remote resources, so it wouldn't pose the same security risks as the original web page.
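
To make the idea concrete, here is a rough sketch of the client-side step. The `AxNode` shape is purely hypothetical and is not Chromium's actual serialization format; `renderAxNode` and `escapeHtml` are made-up names. It only shows that markup with roles and ARIA attributes, and no script or remote references, can be generated from a tree of plain data.

```typescript
// Hypothetical node shape for illustration only.
// This is NOT Chromium's real serialized accessibility tree format.
interface AxNode {
  role: string;        // e.g. "button", "heading", "link"
  name?: string;       // accessible name, if any
  level?: number;      // heading level, if any
  children?: AxNode[];
}

// Escape text so names coming from the remote page cannot inject markup.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a node and its children as inert HTML: only role and ARIA
// attributes are emitted, never scripts, event handlers, or URLs.
function renderAxNode(node: AxNode): string {
  const attrs: string[] = [`role="${escapeHtml(node.role)}"`];
  if (node.name !== undefined) {
    attrs.push(`aria-label="${escapeHtml(node.name)}"`);
  }
  if (node.level !== undefined) {
    attrs.push(`aria-level="${node.level}"`);
  }
  const children = (node.children ?? []).map(renderAxNode).join("");
  return `<div ${attrs.join(" ")}>${children}</div>`;
}
```

A real client would presumably pick native elements where they exist (`<button>`, `<h2>`, and so on) instead of generic `div`s, but the escaping and the "data in, inert markup out" shape would stay the same.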

ggreer · on Aug 1, 2021

There are several problems with that approach. First, there's not enough information in the serialized accessibility tree to reconstruct the DOM.[1]

Second, the serialization format is an internal API, so there are no constraints on backwards compatibility. It can change in any version of Chromium. In fact, the interface is updated all the time.[2] Cloudflare would have to constantly update their JS client to handle those changes. It's not an abstraction that can be relied upon.

Third, the bandwidth and latency requirements for inter-process communication are far higher than what is available for most client-server communication. Even if the API were stable, I doubt it would be feasible to use on typical Internet connections. If you don't believe me, go to chrome://accessibility/ and click "Start recording" on a tab. I did this for an IRCCloud tab and got 4500 events in approximately 2 seconds.

1. https://chromium.googlesource.com/chromium/src/+/HEAD/docs/a...

2. https://source.chromium.org/chromium/chromium/src/+/master:t...

mwcampbell · on Aug 1, 2021

> First, there's not enough information in the serialized accessibility tree to reconstruct the DOM.

There doesn't have to be enough in there to reconstruct the original DOM, just enough to expose all of the information that screen readers and other accessibility tools need. The fact that that information would be exposed through an HTML DOM in this case is irrelevant; we know the Chromium accessibility tree has all the necessary information.

> Second, the serialization format is an internal API, so there are no constraints on backwards compatibility.

OK, you got me there. Maybe the server side has to go all the way and construct the HTML.

> Third, the bandwidth and latency requirements for inter-process communication are far higher than what is available for most client-server communication.

OK, again, maybe the server side has to digest the data some more before sending it. But at least Chromium is already pushing serialized tree updates. I'll withhold a rant on how it could be much worse.
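
As one guess at what "digesting" could mean: the server could collapse a burst of updates into at most one update per node before sending anything over the network. The `NodeUpdate` shape and `coalesceUpdates` name are invented for this sketch and don't correspond to anything in Chromium.

```typescript
// Hypothetical update record for illustration; not a Chromium type.
interface NodeUpdate {
  nodeId: number;
  props: Record<string, string>;
}

// Merge a batch of updates so each node appears at most once.
// Later property values overwrite earlier ones for the same node;
// nodes keep the order in which they were first seen.
function coalesceUpdates(updates: NodeUpdate[]): NodeUpdate[] {
  const latest = new Map<number, NodeUpdate>();
  for (const u of updates) {
    const prev = latest.get(u.nodeId);
    const merged = Object.assign({}, prev ? prev.props : {}, u.props);
    latest.set(u.nodeId, { nodeId: u.nodeId, props: merged });
  }
  return Array.from(latest.values());
}
```

With something like this, a progress bar that changes value hundreds of times within one batching window would cost one message instead of hundreds. Whether that is enough for a slow link is a separate question.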

miki123211 · on Aug 1, 2021

Does this handle (lots of) (sometimes large) page updates, particularly across a semi-slow, semi-reliable network? Think lazy loading, SPA-style diff-based page transitions, or realtime progress bars. What about element positions (i.e. for switch control overlays that visually mark specific elements on the page)? Assuming this just sends keys directly to the remote browser, what about cursor-related events in editing fields? If latencies are over a few ms with those, some screen readers get confused.

mwcampbell · on Aug 1, 2021

Good questions. You have an especially good point about the latency of responses to cursor movement commands; the developers of NVDA and JAWS might have to rethink their approach to that.

But as far as I know, Cloudflare hasn't even tried yet.

x0x0 · on Aug 1, 2021

Would you need the css?

And mutations to this dom would need to be tightly synced to image updates to not confuse the hell out of nvda?

Or am I misunderstanding?

mwcampbell · on Aug 1, 2021

> Would you need the css?

Since this DOM would be invisible, hidden behind the canvas, I'd say you'd need just enough CSS to make each element have the same bounding box as the original. Bonus points if you can safely do enough CSS to make the font size and colors match; screen readers do have commands for querying those things.
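
Roughly, the "just enough CSS" part could be an inline style built from each element's rectangle. The `Rect` shape and `boundingBoxStyle` are made up for this sketch, and it assumes the coordinates arrive in CSS pixels relative to the canvas.

```typescript
// Hypothetical rectangle, assumed to be in CSS pixels relative to the canvas.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Build an inline style that gives a shadow element the same
// bounding box as the original element on the remote page.
function boundingBoxStyle(r: Rect): string {
  return [
    "position:absolute",
    `left:${r.x}px`,
    `top:${r.y}px`,
    `width:${r.width}px`,
    `height:${r.height}px`,
  ].join(";");
}
```

Font size and colors could be appended to the same list if the server sends them along.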

> And mutations to this dom would need to be tightly synced to image updates to not confuse the hell out of nvda?

Chromium has already taken pains to make sure this works, because its whole accessibility implementation is dependent on pushing tree updates from the renderer process to the main process.

x0x0 · on Aug 1, 2021

got it, thanks!

dndn1 · on Aug 2, 2021

CF are doing a remote canvas/input. Why doesn't remote audio/input work for the screenreader case?

detaro · on Aug 2, 2021

Because the screenreader is set up and runs on the user's local PC, unless you expect users to use whatever unknown-to-them screenreader setup CF happens to choose to run remotely.

dndn1 · on Aug 2, 2021

That's one option, but I think there is more to explore than that.

E.g., looking at the canvas implementation, it appears CF delivers a Chromium render to canvas. Maybe what helps here is standardisation. That render wasn't sufficient during the browser wars of old, but isn't a point of contention now because of standardisation.