You do see this in embedded for several reasons, including:
- For high-volume production, reducing ROM size can have a big impact on profitability (this is less true than it was 20 years ago, but still true), so your dev boards will have large EPROMs and your production boards will have small ROMs
- Debugging tools left in place may allow for easier reverse engineering of your devices
Obviously the devices go through a lot of testing in the production environment, but things like error-injection just may not exist at all, which limits how much you can test.
Compared to basically every other part of release qualification (manual QA, canarying, etc.) re-testing on the prod build is so unbelievably cheap there's no reason not to.
I suppose we're referring to different kinds of testing. Manual QA, etc, on prod sure.
But if you're building client software artifacts, unit testing or integration testing involves building different software, in a different configuration, and running it in a test harness. To facilitate unit testing or integration testing of client software you (see the sketch after this list):
- Build with a lower optimization level (-O0 usually) so that the generated code bears even a passing resemblance to what you actually wrote and your debugger can follow along.
- Generate debug info.
- Avoid stripping symbols.
- Enable logging.
- Build and link your test code into a library artifact.
- Run it in a test harness.
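To make that concrete, here's a minimal sketch of the kind of thing a unit-test build links in that the shipped binary never contains (the function names are hypothetical, not from any real project):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Production code: this is what also gets compiled into the release artifact.
bool parse_header(const std::string& line) {
    return line.rfind("HDR:", 0) == 0;  // true if the line starts with "HDR:"
}

// Test-only code: compiled and linked only into the test binary.
static void test_parse_header() {
    assert(parse_header("HDR:v1"));
    assert(!parse_header("BODY:x"));
}

// Test-only main; the shipped binary has its own, entirely different main.
int main() {
    test_parse_header();
    std::puts("all tests passed");
}
```

Build that at -O0 with debug info, symbols, and assertions enabled and you get something easy to test and debug -- and something that shares almost nothing with the release configuration beyond the one production function.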
That's not testing what you ship. It's testing something pretty close, obviously, but it doesn't bear any resemblance to a deterministic build.
On the contrary: it's quite possible to design automated tests that operate on release artifacts. This is true not only at the integration level (testing the external interfaces of the artifact in a black-box manner), but also at a more granular level, e.g. running lower-level unit tests in your code's dependency structure.
It's true that not all tests which are possible to run in debug configuration can also be run on a release artifact; e.g. if there are test-only interfaces that are compiled out in the release configuration.
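For instance, here's a hedged sketch of such a test-only interface (the class is hypothetical; an NDEBUG guard is just one common way to express this):

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical cache with a test-only hook that is compiled out in release.
class Cache {
public:
    void put(uint64_t key, uint64_t value) { map_[key] = value; }
    bool get(uint64_t key, uint64_t* value) const {
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        *value = it->second;
        return true;
    }

#ifndef NDEBUG
    // Test-only hook: exists only in builds where NDEBUG is not defined.
    // Any test that calls this cannot even link against the release
    // configuration, where the method is compiled out.
    std::size_t internal_size_for_testing() const { return map_.size(); }
#endif

private:
    std::unordered_map<uint64_t, uint64_t> map_;
};

int main() {
    Cache c;
    c.put(1, 42);
#ifndef NDEBUG
    return c.internal_size_for_testing() == 1 ? 0 : 1;  // white-box check
#else
    uint64_t v = 0;
    return (c.get(1, &v) && v == 42) ? 0 : 1;  // black-box check still works
#endif
}
```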
I think the source of the confusion in this conversation is perhaps the kind of artifact being tested? If I were developing ffmpeg, to choose an arbitrary example, I would absolutely have tests which operate on the production artifact -- the binary compiled in release mode -- which only exercise public interfaces of the tool; e.g. a test which transcodes file A to file B and asserts correctness in some way. This kind of test should be achievable both in dev builds and when testing the deliverable artifact.
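A minimal sketch of what that could look like, assuming ffmpeg is on PATH and using made-up file names (a real test would assert correctness more rigorously than checking the output is non-empty):

```cpp
#include <cstdlib>
#include <filesystem>
#include <iostream>

// Black-box test: drive the release binary through its public CLI only.
int main() {
    // Transcode file A to file B using nothing but the tool's public interface.
    int rc = std::system("ffmpeg -y -i input_a.wav output_b.mp3");
    if (rc != 0) {
        std::cerr << "transcode failed with status " << rc << "\n";
        return 1;
    }
    namespace fs = std::filesystem;
    // Simplified correctness assertion: the output exists and is non-empty.
    if (!fs::exists("output_b.mp3") || fs::file_size("output_b.mp3") == 0) {
        std::cerr << "output missing or empty\n";
        return 1;
    }
    std::cout << "black-box transcode test passed\n";
}
```

Because the test only shells out to the binary and inspects observable results, exactly the same test runs against a dev build or the shipped release artifact.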
> I, uhh, usually do this in my released software too.
Do you have any idea how annoying it is to get logged garbage when starting something on the command line (looking at you, IntelliJ)?
I once spent several weeks hunting through Hadoop stack traces for a null pointer exception that was being thrown in a log function. If the logging wasn’t being done in production, I wouldn’t have wasted my life and could have been doing useful things. Sadly, shutting down the cluster to patch it wasn’t an option, so I had to work around it by calling something unrelated to ensure the variable wasn’t null when it did log.
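For anyone who hasn't been bitten by this class of bug: the crash lives in the argument expression of the log call itself, so it only fires when that statement actually executes. A C++ rendering of the shape of it (hypothetical names; the original was Java/Hadoop):

```cpp
#include <iostream>
#include <string>

struct Job { std::string name; };

void log_info(const std::string& msg) { std::cout << "INFO: " << msg << "\n"; }

void process(const Job* job) {
    // The bug: job may be null, and it's dereferenced only to build the log
    // message. If this log statement never runs, the code never crashes;
    // the moment logging is enabled in production, it does.
    log_info("processing job " + job->name);
}

int main() {
    process(nullptr);  // null dereference inside the logging expression
}
```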
Yes, which is why I regularly (think quarterly or annually) check to make sure we have good log hygiene, and are logging at appropriate log levels and not logging useless information.
I have alerting set up to page me if the things I care about start logging more than the occasional item at ERROR, so I have to pay some attention or I get pestered.
Hrmm. Surely the vast majority of testing happens on non-release builds, despite the fact that release builds may also be tested. Unit tests are generally fastbuild artifacts that are linked with many objects that are not in the release, including the test's main function and the test cases themselves. Integration tests and end-to-end tests often run with NDEBUG undefined and with things like sanitizers and checked allocators. I would say that hardly anyone runs unit tests on release build artifacts just because it takes forever to produce them.
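A small illustration of why those configurations genuinely behave differently -- this is just standard assert/NDEBUG behavior, with a hypothetical function:

```cpp
#include <cassert>
#include <cstdio>

int divide(int a, int b) {
    // Active in test builds (NDEBUG undefined); compiled out with -DNDEBUG.
    assert(b != 0 && "divide by zero");
    return a / b;
}

int main() {
    std::printf("%d\n", divide(10, 2));
    // divide(1, 0) would abort cleanly in a test build, but in a release
    // build the assert no longer exists and the division is undefined behavior.
}
```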
When I was at Google we ran most tests both with production optimizations and without. There is no reason not to do it since the cost of debugging those problems is huge.
> Surely the vast majority of testing happens on non-release builds, despite the fact that release builds may also be tested.
Of course.
> I would say that hardly anyone runs unit tests on release build artifacts just because it takes forever to produce them.
I don't know that this follows: just because 99% of the invocations of your unit test are in fastbuild doesn't mean that you don't also test everything in opt at least once.
I can't remember seeing any cc_test target at Google that ran with realistic release optimizations (AutoFDO/SamplePGO+LTO), and even if they did, it's still not the release binary because it links in the test case and the test main function.
Did you look in the CI system for the configurations there? I see FDO enabled in those tests. (Speaking at a high level, configurations can be modified in bazelrc and with flags without being explicitly listed in the cc_test rule itself.)
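For readers who don't live in Bazel, a hedged sketch of that mechanism (the config name and profile label are made up; --compilation_mode and --fdo_optimize are real Bazel flags):

```
# .bazelrc
build:opttest --compilation_mode=opt
build:opttest --fdo_optimize=//tools/profiles:app.afdo
```

Then `bazel test --config=opttest //...` runs the same cc_test targets in an optimized, FDO-enabled configuration without the rules themselves mentioning any of it.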
> release binary because it links in the test case and the test main function.
Sure, but it's verifiably the same object files that get put into the release artifact.