> Consider that the name of local variables is never part of the binaries, only public symbols are.
"Never" and "only public" are wrong in the statement above, because non public symbols were indeed released by Microsoft.
I guess you are young enough not to know that Microsoft accidentally released some NT builds with the names of the internal variables, and that such builds were intentionally made with fewer compiler optimizations, allowing for easier reversing. These releases of the internal names resulted in some very interesting stories and statements:
https://en.wikipedia.org/wiki/NSAKEY
"_NSAKEY was a variable name discovered in Windows NT 4 Service Pack 5 (which had been released unstripped of its symbolic debugging data) in August 1999 by Andrew D. Fernandes of Cryptonym Corporation."
Also, the Windows code has been under a shared source license for nearly 20 years. It is not really the sensitive secret or crown jewels that many want us to believe.
Back in NT3.x times, for quad-processor machines, and IIRC even NT 4.x with 8-processor machines (like AXEL crazy SMP monsters), Microsoft shipped Windows NT as code, and you had to compile it on the machine it was to run on.
Microsoft does not release the kernel’s private symbols, trust me on that. But yes, there were some leaks in the past: small portions of NT4 and W2K were leaked. I think I linked to a Google query pointing to articles discussing the leaks in the Quora reply.
Well, that's categorically untrue. Sure, they don't release private symbols intentionally, but they have done so accidentally in the past. At that point it becomes a bit of a grey area: leaked/stolen source code is undoubtedly a no-no, but reversing from private symbols when they do leak seems harder to quantify, as you still need to reverse engineer the code; it's just that structures, names, etc. are already known.
Private symbols are not the only way of gleaning more information. Other examples I can think of are:
* Checked builds (prior to Win10). These builds shipped de-optimized kernels (e.g. no inlines) typically with copious debug strings which gave away important details. For example I gleaned a lot of knowledge of ALPC MSRPC from the checked build of rpcrt4.dll from Windows 8.
* SDK/DDK headers. Especially in the brave new world of insider previews with preview SDK/DDKs, there is sometimes information present which should not have been released, including "private" information. Again, a bit of a grey area.
* The private symbols MS do ship. For example a significant proportion of the COM runtime has private symbols, intentionally. You can extract from those a surprising amount of system call structure information.
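To make the debug-strings point from the checked-builds bullet concrete, here is a minimal sketch of pulling printable ASCII runs out of a binary blob, similar in spirit to the Unix `strings` tool. The sample bytes and the messages in them are invented for illustration and do not come from any real Windows binary; the function name `extract_strings` is also just a hypothetical helper.

```python
import re


def extract_strings(blob: bytes, min_len: int = 6) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


# Hypothetical bytes standing in for a slice of a debug-heavy binary.
sample = (
    b"\x00\x01DBG: ExampleQueueInsert failed, status %x\x00"
    b"\xff\xfeab\x00ASSERT: Entry != NULL\x00"
)

for s in extract_strings(sample):
    print(s)
```

This only finds single-byte ASCII text. Wide (UTF-16) strings, which are common in Windows binaries, would need a second pattern.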
I'd recommend watching Alex Ionescu's talk at OffensiveCon about how he does reverse engineering on Windows to see many of these things in action. https://www.youtube.com/watch?v=2D9ExVc0G10
I'm not saying any of this would make it a clean-room re-implementation but to say ReactOS cannot possibly have been reverse engineered without just up and copying source isn't true.
I’d love for you to point to an instance where the private symbols of the kernel actually shipped.
It is very possible that some private symbols were part of some leak, but stolen data does not qualify as “shipping” :)
Again, I stand behind my opinion. I eyeballed some of the code side-by-side and there were portions where I could literally see a line-by-line correlation, which I can hardly explain.
Then if reversing the kernel is so doable using legitimate means, why is ReactOS still largely stuck in the early 2000s, coincidentally when the major leaks happened?
Private symbols have in the past ended up on the public symbol server (and were quickly taken down), and they have ended up "shipping" in public symbol packs. I can't point to specific incidents as the links to them no longer exist. This is why I said accidentally, as they were not released as a conscious effort on MS's part.
However, you seem to want to claim the only place those symbols can come from is theft. Of course, in this case you use leak as a synonym for stolen, but leak can just as much mean they were released accidentally by the owner; MS can't steal their own private symbols and release them on the web. I'm sure there are some symbol files traded privately which were actually obtained through non-public means, but there have been actual incidents of public release of private symbols.
I'm not trying to claim that ReactOS is clean, I have no skin in the game from a project or user perspective. For all I know it might have lifted significant portions of its code from stolen source code or the WRK (which isn't stolen in so much as used without permission, which I'd regard as a totally different thing). I do however take exception to the typical software engineer's view that there are some things which cannot be reverse engineered into an almost similar form.
As to why ReactOS is stuck in the early 2000s, it could be because of all the source code which was stolen and put wholesale into the project. Although if that were the case, I'd have expected MS to have sued the living shit out of the project by now. It could also be because Windows was and is a very complex OS with many layers, and re-implementing it with a team of 10s to 100s versus 1000s is going to take a lot of time. It seems unlikely that the project would spend the millions of man hours to create the abomination that is UWP.
Perhaps the best way to determine if ReactOS is unclean is for MS to open source the Windows Kernel, hell why would you even need ReactOS then :-)
The above link points to “OffensiveCon19 - Alex Ionescu - Keynote - Reversing Without Reversing”, at the 9th minute of the talk, where it’s very relevant to the current discussion.
Often hotfixes came with private symbols, Microsoft has traditionally been very slack on this.
I’ve seen private symbols for SQL Server, with the GUID to switch editions, published on the public symbol server for at least 6 months before they were pulled.
Full releases and service packs typically are stripped very well but if you are saying that no private symbols have been published to the public symbol servers then you are incorrect.
The only product that has traditionally been effective at stripping symbols is Office; its symbols were always stripped, if you could even get hold of them, which was unlikely.
Don’t forget that you could also download the checked Windows builds, which were very open.
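Since the thread keeps coming back to what "stripped" means, here is a toy model of the idea: a full symbol set contains both public and private entries, and stripping keeps only the public ones. This is purely illustrative and is not the PDB format or any Microsoft tool; the `Symbol` class, the `strip_private` helper, and every symbol name below are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    kind: str     # e.g. "function", "global", "local", "type"
    public: bool  # assumption: True means it survives stripping


def strip_private(symbols: list[Symbol]) -> list[Symbol]:
    """Keep only the symbols marked public, as a stripped symbol file would."""
    return [s for s in symbols if s.public]


# Hypothetical full (private) symbol set for one module.
full = [
    Symbol("ExampleOpenDevice", "function", True),
    Symbol("g_ExampleTable", "global", True),
    Symbol("retryCount", "local", False),
    Symbol("_EXAMPLE_REQUEST", "type", False),
]

public_only = strip_private(full)
print([s.name for s in public_only])
```

In this model, an accidental release of private symbols is simply shipping `full` where `public_only` was intended, which is why local names and structure layouts suddenly become visible.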
> Then if reversing the kernel is so doable using legitimate means, why is ReactOS still largely stuck in the early 2000s, coincidentally when the major leaks happened?
Because they don't care about or need the newer MS stuff, and they don't have the resources either.
Also, they can exploit Microsoft's good record of backward compatibility: once you have good enough lower-level API compatibility, you can just install a lot of newer MS tech directly on top of it.
> Sure they don't release private symbols intentionally but they have done in the past accidentally.
Even if that is the case, it's an incredibly poor idea to use them, because the code then ends up with spurious similarities in spite of being (otherwise) cleanly developed.
Are you arguing that MS mistakenly releasing a version of Windows with debug symbols didn't happen or that it constitutes a leak and isn't fair game?
Because the former is fairly well documented and the latter doesn't seem right in the context of this discussion. If MS themselves messed up and published the symbols through an official channel that's fair game IMO. Although obviously IANAL etc... I'm talking from an ethical perspective, not a legal one.
I don't know much about ReactOS or the NT kernel but we have this type of controversy regularly in the emulation scene and while sometimes it's true that people reuse docs they shouldn't have, a lot of the time people underestimate the skill and cunning of reverse-engineers to figure out how things work without having access to any restricted information.
Kind of late to the game here, but MS could ship actual source code and copying that code would still be copyright infringement. Reverse engineering for inter-operability is legal in many places, but copying implementations is not. Even having seen the code, you would have to reimplement it in a way that worked equivalently, but was different. The OP is claiming that even things like macros have the same implementation and names (i.e. code that is never exposed publicly in any way). Even if you could deduce this from debug symbols (which would be quite tricky, but probably not impossible), you've got to find another way to do that work.
I don't really subscribe to a belief of absolute morality, but in the context of the discussion, I think that no matter how you got access to that code, if you say that you are reverse engineering it, then copying an implementation is not doing what you are saying you are doing (as well as being copyright infringement).
> Are you arguing that MS mistakenly releasing a version of Windows with debug symbols didn't happen or that it constitutes a leak and isn't fair game?
I think he's saying that even having access to leaked or accidentally released originals is not implicit permission to use it freely. Otherwise any piece of software that was ever legitimately released would be fair game, just throw it at a decompiler and profit.
If you're making a clean room design, having so many similarities to the original is unlikely to happen accidentally.
Anybody implementing a clean room design should theoretically have no prior knowledge of the original's inner workings. The specs are written by one person, checked to not include any of the original material by a second one, before being passed to a third to be implemented.
From far enough away, a piece of wire and an isolation transformer do the same thing. The secret sauce is in that isolation; you can't just shunt it and pretend it's the same.
Personal attacks and name-calling, as here and in your previous two comments, are not ok on HN. We ban accounts that do that. It's great if you're providing correct information, but please edit out the swipes.