Hacker News
Microsoft goes its own way with Web audio/video spec, despite W3C rebuff (arstechnica.com)
82 points by coloneltcb on Jan 19, 2013 | 50 comments


The big problem with Microsoft's spec is that it doesn't specify a protocol, so it is impossible for other browsers to implement the standard. The spec only says that RTP will be used but not what specific profile or codecs. Microsoft's prototype uses a plugin for Chrome and IE that implements their proprietary protocol.

The WebRTC spec, on the other hand, does specify a protocol via the RTCWeb IETF specs. The SDP used in WebRTC is part of that. Also, the Chrome and Firefox teams are working hard to get the two browsers talking to each other without resorting to prototype plugins, instead using their own separate implementations of the WebRTC protocol.

When/if Microsoft publishes the protocol specs, then their UC-RTC-Web spec will truly be complete and we can judge it on its merits compared to WebRTC. Also in a very real sense the WebRTC spec is more complete since it does specify a protocol.


Small correction: Chrome and Firefox teams are using basically the same implementation of WebRTC; see http://mxr.mozilla.org/mozilla-central/source/media/webrtc/t... - it's an import of the Chromium code into Mozilla. Having the source available definitely makes interoperability easier and is better than having a blobby bit, though.


Another big problem is their attempt to prevent mandating any open codecs.


A lot of people here are arguing that "Microsoft has legitimate concerns". Of course MS are framing it like that - would you really expect them to say "We made this proposal because Google+Firefox+Apple are neutering our Embrace+Extend+Extinguish ritual, and additionally the WebRTC thing is going to demolish the $8B value of our skype purchase"?

The fact that they may have good technical arguments should not confuse you. They're not doing anything to make the web better. They are doing everything to make Microsoft's hold in the market stronger. Everything else is incidental.

Story time:

Back in 1999, I worked on instant messaging infrastructure, and as a result followed the related IETF workgroup closely. That workgroup had very interesting discussions that went absolutely nowhere over a few years. At some point, everyone threw their hands up in the air and adopted XMPP - not because it was good (far from it, compared to other suggestions that were made) - but because an implementation was made and progressing, and becoming a de-facto standard.

What I realized through 1999-2002 following the standardization process was that, although every single argument ever made was of a technical nature, it was a game played between AOL, MS, Netscape and the other members of the workgroup, the game being "either you guys accept my existing infrastructure and give me dominance, or I'm not sharing my toys". And since no one gave up, toys -- to this day -- are not shared, and IM networks are still mostly disjoint.

The WebRTC game might be more delicate, but the validity of the technical arguments (which was there in the IM discussions too) should not confuse you. It's about market control and dominance, not about technical excellence.


You are spot on. I was at Microsoft during one of these protocol discussions. I worked on a popular (and contentious) directory product for which interoperability (or lack thereof) was the product's bread and butter. For the mid-2000 release we were embracing XML goodness from the W3C. XML was in style in the same way JSON is today. We settled on WS-Transfer and IBM's WS-ResourceTransfer extensions because, frankly, it didn't matter what we picked so long as it was XML.

Well, we worked on the release for 2 years, and then, 6 months before shipping, our VP decided that the decision posed too big a risk of IBM gaining 'the upper hand.' He made a point of blaming people for the decision he had approved years earlier. We then had to rip out a perfectly good and tested implementation of WS-ResourceTransfer, design a replacement, make it an industry standard, and finish on the original schedule. We came up with something 'different' called MS-WSTIM ( http://msdn.microsoft.com/en-us/library/dd302851.aspx -- fun fact, I got stuck writing most of those docs).

After the change our principal technical architect spent the next year going to conferences talking about the technical merits of how MS-WSTIM exceeds WS-ResourceTransfer in oh-so-many ways. He supposedly worked to convince minor players to adopt the new 'industry standard.'

The whole thing was a sham. MS-WSTIM is WS-ResourceTransfer by a different name. It just wasn't IBM's, and like so much at Microsoft, politics reigned supreme. I would be surprised if the same were not happening here.


Sure, but are the other players completely disinterested?

Some of them might have pure motives: take Mozilla, for instance, which seems to see WebRTC as a trojan horse to force the market to adopt non-MPEG codecs. They have the users' best interests in mind, but it's not obvious that their focus on that aspect necessarily yields the best web RTC standard.


> but it's not obvious that their focus on that aspect necessarily yields the best web RTC standard.

Best RTC standard for who?

I believe that patent-free RTC (as WebRTC provides by virtue of preferring WebM) is better than a patent-encumbered one, even if the latter has a slightly better session initiation/description protocol.

Regardless - yes, one should definitely try to map the underlying reasons and subtext of every statement. But the bottom line is:

Google has been working, in public and with a freely usable implementation, on a specification. Mozilla adopts same specification (and mostly same codebase). Said specification ALREADY works on iphones, android phones, Chrome, Firefox and even IE (with the right plugin).

Microsoft proposes something, not completely defined, only provides a plugin implementation, only for IE and Chrome, unsupported on any phone. In my book, it needs a much better story than just "better session descriptor".

And frankly, with Microsoft's past of hijacking web standards, the Office OpenXML ISO process hijacking, all the Linux patent threats and the Android patent extortion that is STILL ongoing, they need to work much, much harder before I even give them the benefit of the doubt.


There seems to be widespread confusion about WebRTC already adopting VP8/WebM as a mandatory-to-implement codec. No such decision has been made. That said, the two furthest-along implementations (Chrome and Firefox), which also happen to have about a billion users between them, are both pushing VP8, which might make it a de facto standard anyway.

They (IETF WebRTC) have adopted the Opus audio codec as mandatory-to-implement, which isn't officially part of WebM (though apparently Chrome 25 can play it in that container) as it's far more suited to real-time communication than Vorbis.

For the video codec, they were going to call for consensus (basically on VP8 vs H.264) but Google mysteriously suggested that the decision be delayed on the day before it was supposed to have happened, claiming to have something in the works that would solve the whole problem to everyone's satisfaction.

My wild guess (given the fact that the VP9 encoder and decoder are now available to download from the mainline codebase and are beating x264 and H.265 objectively and subjectively [1] for pre-recorded content--and I'd maybe guess that even VP8 beats H.264 for standard WebRTC use cases) is that they're working out some behind-the-scenes political deal with other industry partners to jump straight to VP9, either by getting all relevant parties on board regardless of patent sabre-rattling, or by working something out with patent holders to neutralise that threat. Both are a bit wild, as I said; the more boring explanation is that Google feared losing the call for consensus and had something brewing to sway a few swing voters but didn't have it ready in time.

[1]: http://forum.doom9.org/showthread.php?p=1611395#post1611395


WebRTC is being developed by the IETF and W3C, both of which have very strong preferences for royalty-free (and otherwise RAND-Z/RF) standards. Why would you imagine they would pick an MPEG codec?

Maybe the low-delay wideband voice and audio codec also developed by IETF under the same requirements (both technical and royalty-wise) would fit better?

And, on the topic at hand, have Microsoft made any public statement about their support for that codec, despite the fact that it was co-developed by their Skype purchase? Only vague murmurings that the alternative standard discussed here shouldn't be "tied to a single codec". Note the masterful misdirection that would make you think that WebRTC was "tied to a single codec" when in fact you can use any codec you like; you just have to support Opus (best in class, yet RF) and G.711 (old and busted, for legacy interop) to help ensure interoperability.
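To illustrate the point about mandatory-to-implement codecs, here is a minimal Python sketch. Only the mandatory pair (Opus and G.711) comes from the comment above; the extra codecs each endpoint prefers and the `negotiate` helper are assumptions made up for the example, not anything taken from the WebRTC drafts.

```python
# Mandatory-to-implement audio codecs, as described in the comment above.
MANDATORY = ["opus", "G.711"]


def negotiate(offered, supported):
    """Return the first codec in the offerer's preference order that the
    answerer also supports, or None if the two lists share nothing."""
    for codec in offered:
        if codec in supported:
            return codec
    return None


# Hypothetical endpoints: each prefers its own codec, but both also
# implement the mandatory set, so they always have something in common.
endpoint_a = ["SILK"] + MANDATORY
endpoint_b = {"iSAC", *MANDATORY}

chosen = negotiate(endpoint_a, endpoint_b)
print(chosen)
```

Without the mandatory set, the same two endpoints (one offering only its preferred codec, the other supporting only a different one) would have no codec in common and the call could not be set up.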

The fact that the simple logic of this seems somehow underhand to you is a testament to the great political job that Microsoft, Apple, Nokia and others have done in the web codec arena where a codec, which could never be adopted under W3C rules, has come to seem like the obvious choice when really it was a bold attempt to subvert the W3C into not specifying a royalty-free codec so that a royalty-encumbered de-facto standard of their choice would fill the gap that was left.


I'd like to hear the counter, but MS's rationale here seems solid. And given they have it working compatibly in IE10 and Chrome, it seems like they're moving in good faith.

I know that a lot of HN likes to say, "M$ can't do that because everyone knows they're evil" -- I'd like to actually see good technical arguments for why what they're doing isn't prudent.


> I'd like to hear the counter, but MS's rationale here seems solid.

That's partly because Peter Bright (the story author) overstates Microsoft's case. SDP is a problematic part of the current spec, but most of those problems are just things that it can't do yet, and if you look at the mailing list thread responses to the messages he links to, you can see most of the discussion is about moving SDP expansion discussions back to the IETF working group. This will slow things down, but that's how standards work sometimes.

But, SDP is only a part of the spec. Meanwhile, Microsoft's proposal retains essentially none of the current WebRTC spec. That was the real issue with their proposal, coming relatively late in the game, and why it's misleading to put the disagreement in terms of only SDP. Microsoft's reps have issues with how the connections are bootstrapped, but took the opportunity to rewrite everything else while they were at it. That's certainly their prerogative, and is again how standards are sometimes produced, but the issue is more complicated than "SDP is bad, but the W3C is clinging to it" as Peter tells the tale.

I agree that Microsoft is moving in good faith, however. As a good parallel, see Firefox's incompatible Audio Data API, produced just as the Webkit-born Web Audio API was picking up steam. When specs are early and still being worked on, sometimes the best thing is to produce a working implementation, to show others that your proposal really is as good as you say it is. That isn't necessarily undermining the standards process, nor is it a sign of the process not working. It can be, but at this stage, Microsoft is still engaged, even if they are just repeating why they think their proposal is better.

It will slow things down, but standards work is sometimes really slow, especially when people disagree, and growing pains are inevitable. You can only hope this gets sorted out fast enough that the spec is still relevant by the time it comes out.


> As a good parallel, see Firefox's incompatible Audio Data API, produced just as the Webkit-born Web Audio API was picking up steam.

Wait, what? Audio Data was implemented in mid 2010 - way, way before Web Audio existed. Links:

https://wiki.mozilla.org/Audio_Data_API http://www.w3.org/standards/history/webaudio

Is there something incorrect in either of those, or some prior history to WebAudio I am not aware of?


As far as I know you are correct. Mozilla released the Audio Data API and produced a number of demos based on that. A W3C group was formed to work on Web Audio standardisation and in that group Google announced the work on their API.

Mozilla then followed with a "Media Processing" API: http://robert.ocallahan.org/2011/06/media-processing_15.html

It's possible that the person you are responding to is thinking of that.


Is their rewrite of everything else better? WebRTC itself is so young that a complete rewrite shouldn't be out of consideration if needed. Nobody wants to carry bad decisions over the years anymore.


One of the working group members did an in depth look at the proposal that I thought was fairly even-handed: http://www.educatedguesswork.org/2012/08/initial_notes_on_mi...

> WebRTC itself is so young that a complete rewrite shouldn't be out of consideration if needed. Nobody wants to carry bad decisions over the years anymore.

That's partly true, but also one of the traps of discussing these things outside of the working groups in which they originate. You get new eyes on proposals, from people not caught up in the baggage of past discussions and maybe not as tired of discussing this topic ad nauseam so aren't as ready to settle for the first thing that comes along, but you also lose all the context of ongoing discussions and proposals.

And, of course, you can always rewrite to try to get the perfect spec, but at some point the rest of the world has settled on the thing that was only meant to be a temporary solution, and it's now so entrenched that it's not going anywhere. The real danger looming over any standards body is the perfect (or as he really said, le mieux -- just "the better") as the enemy of the good enough.


I think that perspective is shortsighted. It isn't that people want to carry over bad decisions. It's that everybody needs to follow the same standard, and it takes a while to get consensus among the many parties involved.

When Microsoft goes and pulls a Microsoft and completely ignores the W3C to implement their own standards, we are heading in the wrong direction. Only recently (IE 9+) did Microsoft start adhering to web standards, and it's been a relief.

This feels like they are abandoning that, and it's going to give me nightmares.


We're on the same side. But just having a 'standard' is not enough; it needs to fulfill its purpose in the most useful way.

I'm by no means pro-MS, but they are not abandoning standards; they just proposed one. Google and Mozilla have implemented things their own way for the past half decade before any discussion had started, and that doesn't preclude moving to a standard later on.


From what I've read, it just requires that JS authors do more. We'd see a library to go on top of theirs assuredly.


Former Microsoftie here.

Because they've done this time and time and time again, and it's never worked out well. Technical enough for you?


It's funny how many people just LOVE Sinofsky, but this is exactly the kind of crap he championed.

You can't create the future if you're afraid of it.


Technical enough for you?

No.


I think it's good they're taking the non-SDP route. Hopefully they will be able to help whatever the final standard becomes by deviating from the current WebRTC.

Though I have my share of frustration with the consumer end of Microsoft, I love keeping up with all the research they provide. At least this isn't as drastic as some of their complaints about hardware standards; I think it has a good chance to contribute, and not hinder, what the W3C is doing.


>The problem is SDP. SDP wasn't originally designed for browser-to-browser communications, and some of the things that WebRTC uses it for go above and beyond the specification.

But SDP is just session description. In a way, it's just metadata, so to speak. Hardly a big part of the WebRTC specification. OK, there are some rules that govern how to choose an appropriate codec and the like, which may seem complicated (but in my opinion aren't - I've worked with SIP for many years), but it seems way overboard to submit your own specification just because of SDP. I think it would be many times better to try to influence the WebRTC specs than to split the effort and almost certainly prolong adoption.
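For readers who have not seen SDP, a rough Python sketch of the "just metadata" point: the offer lists payload types in preference order on the `m=` line, `a=rtpmap` lines name them, and the answerer picks the first one it supports. The SDP fragment below is a trimmed, made-up example (real WebRTC offers carry far more attributes), the function names are hypothetical, and the parser assumes a single audio section.

```python
# A trimmed, hypothetical SDP audio section (not from any real browser).
OFFER = """\
m=audio 49170 RTP/AVP 111 0
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
"""


def parse_audio_codecs(sdp):
    """Return [(payload_type, codec_name), ...] in the offerer's order.
    Assumes at most one m=audio section."""
    order = []   # payload types as listed on the m= line
    names = {}   # payload type -> codec name from a=rtpmap
    for line in sdp.splitlines():
        if line.startswith("m=audio"):
            # m=<media> <port> <proto> <fmt> ...
            order = line.split()[3:]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            names[pt] = encoding.split("/")[0]
    return [(pt, names[pt]) for pt in order if pt in names]


def choose_codec(sdp, supported):
    """Pick the first offered codec that the answerer supports."""
    for pt, name in parse_audio_codecs(sdp):
        if name in supported:
            return pt, name
    return None


print(choose_codec(OFFER, {"opus", "PCMU"}))
```

The hard parts people complain about are not in this step; they are in everything SDP was not originally designed to describe.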


[Linked from the article]

Microsoft's reasons in Microsoft's own words:

http://blogs.msdn.com/b/interoperability/archive/2012/08/06/...

The Ars article sums this up as:

Even backers of the current WebRTC specification acknowledge that the SDP problems have so far proven intractable. The WebRTC spec hinges on getting SDP fit for purpose.

So the title of the article is a bit linkbaity.


> Microsoft's reasons in Microsoft's own words:

Microsoft's PR official reasons in their own words.

The real reasons may or may not be related. Personally, I don't trust any statement from Microsoft about improved functionality or specification. That might be right in this case, but their past actions indicate that they care about market dominance rather than any technical detail.


I'm very hopeful that MSFT will soon adopt a common standard, whether WebRTC or an adjusted WebRTC. However, the business case does not support such a push from Microsoft. They purchased Skype for $8B, and they have every incentive to ensure web telephony remains difficult and that Skype remains the best, cheapest, and simplest way for people to communicate with video and voice. They want Skype everywhere, but Skype becomes less compelling in a world where any two engineers can effectively build Skype-level quality (or better) inside a browser.


Wow, SDP and SIP. That brings back nightmares about that stateful protocol. The code to handle them was unimaginably complicated.


The interplay between open standards versus competition is fascinating. On one hand, all browser makers sharing the same standards makes my life as a software engineer easier. On the other hand, one standard to rule them all may stifle innovation and progress. My initial reaction was, "what a shocker". Then I thought a bit more deeply about it.


Seriously, why? They are basically using the exact opposite of the HTML5 video codec argument, when they and Apple said there should be only one codec. And now they are saying there should be multiple codecs for WebRTC - because maybe now it may not completely suit them? (even though Skype uses VP8).


When did MS say there should be only one codec? MS actually said that they'll support whatever codec you had on your computer -- they just weren't going to ship <choose your favorite codec> in the box. It was Google that was saying that they'd stop supporting H264 in Chrome, period.


The issue at hand here is not alternate video codecs, it's the fundamental building blocks of the system (SDP for WebRTC, which Microsoft disagrees with) as well as the core use cases.


Digital video and audio is nothing new. At all. It's like decades old.

Getting images in browsers was never a problem.

Why is it STILL a problem to get audio and video in browsers?


<audio> and <video> pretty much work fine AFAIK. We're now talking about putting a P2P/VoIP/videoconferencing stack into the browser which is a bit more complex.


I still can't put <video src=blah></video> in a page and expect it to work in every browser. I need at least webm + mp4, or ogv + mp4, so I double the space utilization on my server for every video maintaining redundant copies, in addition to the various resolutions you make video at because ISPs are assholes and we are still in the dark ages of internet speeds.


A problem exists because there's no standard way to access the microphone and camera. And this access is absolutely essential in order to implement VoIP. Hence, we've got Flash, which is becoming a big problem, because it's been discontinued on some platforms.


This is the exact reason we (Tinychat) never took WebRTC seriously. It's not a unified standard by any manner of speaking. Even Firefox and Chrome have different implementations. If we were to even attempt to get it working, we would have to have encoding and decoding since the codecs differ as well (no Speex in WebRTC). At least Flash has the same experience across browsers.


It's work in progress. No one is expected to take it seriously in production. Yet.


And in a few years from now they will be complaining again that Google is "blocking" their browsers, by not supporting their standards.


Ohh yes, good ol' 302 redirect was invented and only supported by Microsoft. I suggest you re-read the article rather than throw in the M$ bash at the first hint of Microsoft. They have legitimate reasons for proposing the alternative. I'd rather have all the companies heard and come to common ground to set the standards than have one company leading the others.


There was a time when Microsoft could do this and get away with it. It's sad that they seem to be the last ones to know that those days are behind them.


> "I'd shut it down and give the money back to the shareholders."

It is far too early to say with certainty that Microsoft's ship has sailed. IBM turned themselves around. Apple turned themselves around. Microsoft is still in a far, far better position than either of those companies were at one point.

EDIT: Parent edited their comment to tone it down. Leaving my comment as it is.


I'd argue they are in a far better position than Google, Facebook and Apple regarding privacy and honesty (shocking though that is). I think this will be the next battleground, so they may have a chance of winning.


Why? Skype in China? Xbox Live ads? Scroogled (and other anti-Google campaigns)? WGA? They've done plenty of questionable things which are shady and violate privacy.


All the other guys are doing it worse - that's the thing. I'm not saying they are suddenly perfect, but they are less bad than the others.


The worst I can think of is Facebook's changing of its TOS, and the companies are "scary" with the kind of data they're potentially compiling, but they haven't actually done anything yet (and switching to Microsoft products would simply make Microsoft the scary company).

Some things that Apple does, like patent abuse, are things that Microsoft has done as well, so it's difficult to single out Apple for something like that. Or Facebook's lock-in is similar to the Windows API lock-in.

You want to give me examples of things that make all 3 of those companies worse than Microsoft?


Did you read the article? This doesn't seem like the old "Microsoft bullying others into following" story, this seems like legitimate reasoning for the moment.


They are not in a position to bully anyone, just as they weren't when IE3 came out. So they made IE4, which mopped the floor with NS4. And it still took a lot of time until the momentum changed and IE became popular.

And then they bullied everyone with IE 5, IE 5.5 and IE 6 - like they did (and still do) with Office (OpenXML), SMB, Kerberos and a variety of other products.

Of course their statement is going to be about merit - they've never said "oh, we're bullying here" even when they were. And their statement MIGHT even have some merit - but just like in the OASIS OpenOffice document spec, instead of working with the industry, they arrive late with their own underspecified and alternative specification.

It's the same old Microsoft.


It's the same old same old... Missing spec - to replace a working spec. Preserving their Skype IP - if WebRTC is widely adopted Skype is done.


Well it won't matter. By the time enough people have migrated from the old versions of IE (which will most certainly not support this tech) tablets will be the target market.

Microsoft should really realize it is a has-been and not enter new areas.


As long as there is a JS implementation that they're providing (FTA), I don't see what's worth a big fuss. I was worried from the headline that this would pre-empt serverless p2p VoIP from Chrome<->IE, but that doesn't seem to be the case.



