Hacker News
rand() may call malloc() (thingsquare.com)
212 points by adunk on April 9, 2022 | 90 comments



The solution isn't to stop using rand(). The solution is to stop using newlib.

If you're doing your own custom memory management like this, you shouldn't even have a malloc implementation at all. Even newlib is too bloated for your use case. At this point, chances are you're using a trivial subset of the C library and it'd be easy to roll your own. You can import bits and pieces from other projects (I personally sometimes copy and paste bits from PDClib for this). In such a tight embedded project, chances are you don't even have threads; why even pull in that reentrancy code?

Freestanding C code with no standard library isn't scary. If you need an example, look at what we do for m1n1:

https://github.com/AsahiLinux/m1n1/tree/main/src

In particular, this is the libc subset we use (we do have malloc here, which is dlmalloc, but still not enough of libc to be worth depending on a full one):

https://github.com/AsahiLinux/m1n1/tree/main/sysinc


I don’t think there’s any reason to be worried about a “bloated” library if you understand the components of that library. I’ve used Newlib in embedded projects without malloc and it’s fine. It’s easier and faster to check what parts of Newlib you are pulling in than to try and reimplement it on your own.

It’s a trap that people fall into to think that it’s easy to roll your own. It depends on what, exactly, you are rolling, what resources you have, etc. Maybe I need a couple math functions, a decent implementation of memset/memcpy/etc, and having the C library at my disposal gives me those things.

The idea that you throw out Newlib just because one function pulled in malloc seems unjustifiable to me.

On the other hand, I think it’s quite normal to want your own PRNG.


I don't know about newlib's library or code structure, but I've got a project I'm working on where I don't want a whole libc, but it's pretty easy to pull in bits and pieces of the FreeBSD libc; when I come across something else I need, I just add the C file it's in to my Makefile and that's usually enough. For the things that are self-contained, I don't need to build the environment libc expects, and if I try to include something that's not self-contained, the linker will yell about the missing symbols.


The main use case for Newlib is embedded systems, and Newlib supports more architectures than FreeBSD does. Newlib also includes a few assembly routines for functions like memcpy and memset.


Sure, my point is that the question isn't "use newlib or write your own": you can probably pick pieces out of it rather than using the whole thing, by avoiding their build system and just using their source. I just don't know if the code (or license) is structured for that.


If rand() gives you trouble who knows what else in the C library may give you more trouble. It’s not that hard to copy paste or even write your own implementation for basic parts of the C API, minus the allocation related things. For me I’ve found GitHub Copilot great for filling in basic stubs for C stdlib APIs.


> If rand() gives you trouble who knows what else in the C library may give you more trouble.

Ooh, I can answer this one! It turns out that I know what else in the C library can give you trouble. Newlib is open-source. For embedded projects, I find myself reading the source code to Newlib, and sometimes looking at the disassembly to double-check what version of a function I’m getting.

If I had gotten burned by having malloc included when I didn't want it (never happened to me personally), I would consider running "ar d" to remove it from the copy of the library I'm using. It's pretty easy to modify static libraries.
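
For illustration, here's what the "ar d" mechanics look like on a toy archive (the member names are placeholders; for newlib you'd target the actual malloc object files, whose names vary by build):

```shell
# work in a scratch directory
cd "$(mktemp -d)"
# build a toy archive from two members (stand-ins for real .o files)
printf 'a' > keep.o
printf 'b' > drop.o
ar rc libtoy.a keep.o drop.o
# delete the unwanted member in place
ar d libtoy.a drop.o
# list what's left -- only keep.o remains
ar t libtoy.a
```

After that, any code that still references symbols from the deleted member fails at link time, which is exactly the kind of loud failure you want.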

> For me I’ve found GitHub Copilot great for filling in basic stubs for C stdlib APIs.

I might want a decent implementation of snprintf, or decent implementations of memcpy/memset/memmove. I've implemented all of these myself, but it's a pain, and I'd generally rather grab Newlib than trust Copilot.


I've been a fan of importing pdclib functions piece by piece as I need chunks of libc in deeply embedded situations. It's simple enough that I can audit the semantics of each function's dependencies without any rigmarole, but without the sirens of yak shaving singing to me like when I get the urge to write libc stuff myself. Although the CC0 license is a bit unfortunate in some cases for source code.


> Although the CC0 license is a bit unfortunate in some cases for source code.

CC0 isn't a viral license, you have literally no obligations if you use it: no attribution, no license compatibility worries, no relicensing issues, not even a need to mention it at all.


The issues are deeper than a question of virality.

> Can I apply a Creative Commons license to software?

> We recommend against using Creative Commons licenses for software. Instead, we strongly encourage you to use one of the very good software licenses which are already available. We recommend considering licenses listed as free by the Free Software Foundation and listed as “open source” by the Open Source Initiative.

> Unlike software-specific licenses, CC licenses do not contain specific terms about the distribution of source code, which is often important to ensuring the free reuse and modifiability of software. Many software licenses also address patent rights, which are important to software but may not be applicable to other copyrightable works. Additionally, our licenses are currently not compatible with the major software licenses, so it would be difficult to integrate CC-licensed work with other free software. Existing software licenses were designed specifically for use with software and offer a similar set of rights to the Creative Commons licenses.

https://creativecommons.org/faq/#can-i-apply-a-creative-comm...


Read further; it's not counting CC0 as a CC license in that context:

> Also, the CC0 Public Domain Dedication is GPL-compatible and acceptable for software. For details, see the relevant CC0 FAQ entry

following that link: https://wiki.creativecommons.org/wiki/CC0_FAQ#May_I_apply_CC...

> May I apply CC0 to computer software? If so, is there a recommended implementation?

> Yes, CC0 is suitable for dedicating your copyright and related rights in computer software to the public domain, to the fullest extent possible under law. Unlike CC licenses, which should not be used for software, CC0 is compatible with many software licenses, including the GPL. However, CC0 has not been approved by the Open Source Initiative and does not license or otherwise affect any patent rights you may have.


The other issues still exist; not everything is GPLed.

In particular, the license has been an issue getting through legal at older companies where it might not be whitelisted, especially with their FAQ still saying not to use it for software.


picolibc seems nice enough


Newlib can be configured without reentrancy support and you can do that in a multi-tasking environment while still using their malloc provided you implement some locking callbacks. This saves a ton of static state for non-reentrant library functions you're unlikely to ever need and can replace with safe variants if you do need them.
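
As a sketch of what those callbacks look like: the hook names `__malloc_lock`/`__malloc_unlock` are newlib's, but the depth counter below is just a stand-in for a real RTOS recursive mutex.

```c
#include <stdatomic.h>

struct _reent;  /* opaque here; newlib's <sys/reent.h> supplies the real type */

/* Stand-in for an RTOS recursive mutex; only counts nesting depth. */
static atomic_int malloc_lock_depth;

/* newlib calls these around every heap operation when built for
   multitasking; providing them is what makes its malloc safe to
   call from multiple tasks. */
void __malloc_lock(struct _reent *r)
{
    (void)r;
    atomic_fetch_add(&malloc_lock_depth, 1);  /* real code: take recursive mutex */
}

void __malloc_unlock(struct _reent *r)
{
    (void)r;
    atomic_fetch_sub(&malloc_lock_depth, 1);  /* real code: release it */
}

int malloc_lock_balanced(void) { return atomic_load(&malloc_lock_depth) == 0; }
```

Note that the lock must be recursive, since newlib may nest lock calls.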


When you are working in a tight embedded system, instead of fighting to slim down a library that is designed to scale to significantly larger systems, it makes much more sense to start from zero and add only what you need. You also shouldn't update your dependencies blindly, ever (even updating the compiler should be done with care).

The considerations are very different from, say, developing apps for an operating system. There is a gradient between those, and the software development considerations shift as you move along. Most people doing embedded development defer to libraries way too early - usually because they're either pure hardware people who haven't learned low-level firmware bring-up and rely on vendor tooling, or people from a higher level software background who find the idea of bare metal scary.

You can see an example of this gradient in the Asahi Linux boot chain:

- m1n1: bare metal, no threads (barebones SMP support for research only), statically linked, no device model, not readily portable, 64-bit only, assumes everything is an M1/Apple Silicon, embedded libc subset, vendored (C: dlmalloc, printf, decompression algorithms, libfdt) or git submodule (Rust: FAT32 implementation) dependencies, single-purpose NVMe & FAT32 implementations (no abstraction).

- u-boot: bare metal, no threads, statically linked, basic device model, portable, embedded libc subset, vendored dependencies, basic filesystem & block device abstraction.

- GRUB: runs on EFI environment, no threads, dynamically linked modules, portable, filesystem & block device abstractions, no actual modifications for Apple Silicon (we use vanilla bins)

- Linux kernel: bare metal, complex thread model, dynamically linked modules, rich device model, portable, embedded libc subset, vendored dependencies, rich filesystem and block device abstractions.

- Linux userspace: you know this.

Notice how none of the components before userspace use a full fat libc, and only have minimal dependencies which are carefully controlled - and this is on a system with gigabytes of RAM.

Personally, I would only use newlib on systems which are roughly equivalent, in terms of model/software stack, to a full desktop OS. That is, embedded systems with at least a threaded RTOS and a filesystem abstraction, possibly a BSD-style sockets layer. Anything smaller than that (no threads? lwip callback style sockets? no filesystem?), you're better off rolling your own.


Even with a threaded RTOS (assuming an embedded target like your typical MCUs) I still prefer going without dynamic allocation--it makes it much easier to reason about and you don't have to worry about heap fragmentation. When you have RAM you can count in kilobytes chances are you should be thinking about how and where you're using it.
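
A minimal sketch of the usual alternative, a fixed-block pool with all storage reserved statically (the sizes here are illustrative):

```c
#include <stddef.h>

/* Fixed-block pool: all storage reserved at compile time, no heap,
   no fragmentation -- every block is the same size. */
#define BLOCK_SIZE  32
#define BLOCK_COUNT 8

static unsigned char pool[BLOCK_COUNT][BLOCK_SIZE];
static unsigned char in_use[BLOCK_COUNT];

void *pool_alloc(void)
{
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL;  /* pool exhausted: a hard, visible failure */
}

void pool_free(void *p)
{
    size_t i = (size_t)((unsigned char (*)[BLOCK_SIZE])p - pool);
    in_use[i] = 0;
}
```

Exhaustion is an immediate, testable NULL, rather than a fragmentation failure that only shows up weeks into a soak test.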


There’s a footnote at the end of the article saying that’s basically what happened. They upgraded one of their toolchains and it came with a version of newlib that had been compiled reentrant when the previous version hadn’t.


My immediate thought was to turn off the re-entrancy code; there has to be a #define switch somewhere you can use to disable it, since you won't ever need that. That's the better solution, although using your own PRNG is also good.

Also, this seems to suggest they are in fact linking all of newlib, with its malloc code, into their binary, which is... kind of a waste. Again, if they could throw that re-entrancy switch somewhere, they'd probably eke out a KB or so of code on their EEPROM or whatever holds it. The better solution isn't to keep using newlib with that turned on and then implement your own rand(), which takes up a few more bytes in addition to the newlib rand() you won't use; turning off the re-entrancy is all around the right fix, assuming they want to keep using newlib and won't roll their own. Rolling their own is by far even better, assuming they have the will and the time; as you point out, you don't need a lot of the stdlib (most of stdio is out, string-to-int converters, and so on), so it might be worth a future investment to have their own mini library.


So...it wasn't really rand() at fault here, it was some reentrancy library that newlib uses...

For some reason the blog lists the source for rand_r, to which the caller supplies the state rather than using global state.

I wonder if they'll run into this with any of the non-reentrant C functions (strtok, I'm looking at you). It makes their conclusion of "we stopped using rand()" a bit wanting.

I mean, it's great that they stopped using rand(); that's a good move, and pragmatic given their issue. It just feels like a really surface-level conclusion.


Came here to raise the same thing; it's odd that this rand() was allocating. It seems a little odd to be fixing reentrancy in rand() and not rand_r() when rand() is already so bad anyway. I'm not sure what's worse between doing this reentrancy dance and accidentally dropping the TAOCP magic number if the static isn't seeded (as in musl). rand() is bad; "stop using rand()" was the right answer in any case.


To be fair they did say at the end they added a regression test to look for malloc calls in the binaries.


This sort of thing is why I prefer to leave the C standard library out of my microcontroller code altogether. It keeps me from having to worry about what PC-centric assumptions they may be making under the hood, and they're often a bit bloated anyway.

Something you do have to watch out for regardless is accidental promotions of floats to doubles. C loves float promotion, and if you're not careful (and don't use the special compiler options) it's easy to turn a one-cycle FPU operation into a hundred-cycle CPU function call.
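
The trap is visible without even running anything, since `sizeof` reports the type an expression is evaluated in (a minimal sketch):

```c
#include <stddef.h>

static float g;

/* sizeof inspects the expression's type without evaluating it */
size_t promoted_size(void)   { return sizeof(g * 2.5);  }  /* 2.5 is a double: whole expression promotes */
size_t unpromoted_size(void) { return sizeof(g * 2.5f); }  /* 2.5f keeps it single precision */
```

On a part with a single-precision-only FPU, the promoted form typically becomes a softfloat library call; the `f`-suffixed form stays a single FPU instruction.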

I keep thinking there ought to be a better language for bare-metal microcontroller programming, but the demand for it is so small I'm not sure who would bother supporting it.


Ada fits the bill well here, with zero-footprint runtimes available and also a tool for statically determining the stack size at compile-time: https://www.adacore.com/gnatpro/toolsuite/gnatstack


Is there an OSS version of such a tool?

I’d love to see Ada get “modernized” where the syntax could fundamentally change but the ideas are similar. Being able to explicitly say the range of values is pretty cool, for example.


Ada syntax takes a little while to get used to, but replace {} with begin/end and it otherwise feels like any other procedural language out there.

This is one of the better intros: http://cowlark.com/2014-04-27-ada/index.html

One of my favorite lesser-advertised features is the ability to return arbitrary-length arrays or sum types from functions without heap allocation due to the use of a secondary stack.


One of the more experienced embedded Ada devs out there wrote this: https://github.com/simonjwright/stack_usage

I’m not sure if it would meet your requirements or not.


You probably know, but for the benefit of the others, at least some compilers support warning for that with -Wdouble-promotion.


After about the fourteenth time I accidentally blew my timing and size budgets with double-precision softfloat routines, I learned of -Werror=double-promotion.


Have you looked at Rust? The Rust embedded ecosystem is still a bit immature, but there is whole ecosystem of std-lib free libraries - some specific to hardware, others generic implementations of useful functionality.


And while the headline features of Rust are memory safety and fearless concurrency, the whole language and ecosystem is designed around making it hard to do the wrong thing on accident.

Which in this instance also means that floats stay floats, doubles stay doubles and u8 stays u8 unless you explicitly convert. If you're now asking what happens when you add a float to a double: that's a compiler error, you have to explicitly cast.


Is Rust good for embedded? I thought the binaries they produced were too fat?


The extra fat in Rust binaries compiled with default settings is things like:

- the standard library

- jemalloc (this is no longer the default, but is still a factor in people's perception about binary size because it used to be)

- debug symbols

- unwinding support for panics

All of these can be turned off for use cases like embedded where binary size is important and you'll then get binaries of comparable size to those produced by C compilers.


You just have to pay attention. I work in Rust on embedded, it’s not a problem. The smallest x86_64 binary rustc has ever produced is like, 145 bytes. I’m on Arm, we have space for a lot more, but still actively try to keep sizes down. Some tasks in our system are ~500 bytes, the larger ones are on the order of 10k.

You can’t just use the broader ecosystem willy nilly, and if you want truly small stuff, you have to know how to coax the compiler from time to time. But that’s just always going to be the case.


It statically links. But bare-metal embedded generally doesn't support dynamic linking at all, so that's not a real problem there.

A lot of the rest of the fat is from things like debugging symbols and bounds checks that aren't included when building in release mode.


Embedded is a wide field. It's young but has promise.


That’s more about the compiler than C.


My friend, the compilers are implementing that behavior based on the C standard, which is the literal definition of what C is. It's absolutely about C.


What do you think the compiler should be doing differently? Is there’s no FPU support then what is the alternative?


It is one of the unfortunate things about C, a language I love and defend dearly: on one hand it makes few assumptions about the system and is "bare metal" enough to shoot yourself, but at the same time it makes no assumptions about what is now a given on modern systems.


Remove malloc from your libc and catch this entire family of problems at link time.


This is the correct answer. If you are really doing what the article says (allocating all memory at compile time), why is malloc() being linked in at all?


What we do at day job is to have a 'whitelist' of all malloc callers which is checked by the CI.


Sometimes you want to allow dynamic allocation at application startup, but never thereafter. Free happens implicitly when the system reboots.

This lets you use dynamically sized buffers, depending on the particular configuration present at startup. Very handy to not need to build many different configs and maintain separate binary images (and keep track so that you never update hardware with the wrong image). The disadvantage is that you can't just remove malloc() in such a design.
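
A sketch of that policy (the names here are made up): allow allocation only until initialization completes, then make any later call fail loudly.

```c
#include <stdlib.h>
#include <assert.h>

static int init_done;  /* set once startup configuration is complete */

/* Allocation is legal only during startup; afterwards it trips the
   assert (a real system might log and reboot instead). */
void *startup_alloc(size_t n)
{
    assert(!init_done && "allocation after startup is forbidden");
    return malloc(n);  /* sizes may depend on configuration read at boot */
}

void mark_init_done(void) { init_done = 1; }
```

Since nothing is ever freed, fragmentation can't occur; cleanup is the reboot.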


The article contains a fair bit of fluff that may be aimed at developers without familiarity with embedded systems or, I guess, C; I'm not exactly sure what's with the tone. The short of it is:

- They've got an embedded system with all static allocation (they do mention having a dynamic allocator though)

- They recently found a stack corruption crash which was traced back to rand() calling malloc(), which was unexpected; malloc should never be called in the system at all

- They traced it back to their newlib build being configured to add reentrant support for C functions that don't natively support it; this support uses malloc, so calling rand() results in a call to malloc()

- A recent tooling update caused them to be using newlib built this way; previously they were using newlib configured without this feature

- Their solution is to not use rand() and instead use a different prng, and to write a tool that detects use of malloc() and fails the build


This is why Rust has both "std" and "core". "core" lacks allocation capability.


We had a team project at university to write some graphic demo in 3d from scratch (no open gl, no fancy graphic libraries, just plotting pixels onto screen).

We did, and it was pretty slow despite all the microoptimizations (for example we used fixed point math). It was at the time when hyperthreaded CPUs were introduced and my friend added support for multithreading - 1 thread would draw 1 half of the screen. It broke our code and we found out rand() wasn't thread-safe.

So, being the inexperienced dumbasses we were, we added locks around the rand() calls :). Which made the whole multithreading useless, but we didn't realize it then :) What we should have done is implement rand() on local variables; it's like 3 lines of code :)
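
For the record, here's roughly that three-line version with caller-owned state (the constants are the classic ANSI C sample LCG; each thread passes its own seed, so there's nothing to lock):

```c
#include <stdint.h>

/* LCG with caller-owned state: no static variable, so each thread
   keeps its own seed and needs no locking */
unsigned my_rand(uint32_t *s)
{
    *s = *s * 1103515245u + 12345u;
    return (*s >> 16) & 0x7fff;
}
```

Each thread just keeps a `uint32_t` seed on its own stack and the threads never touch shared state.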


There's also rand_r() if you didn't want to write that three lines of code :)


It’s possible the architect of the article is actually a marketeer who wanted to publicise the company and its “careful approach” to IOT. Be careful what you wish for! Now your engineers just look a little inexperienced as a result.


OP here.

I think part of the message here is to not be afraid to come off as inexperienced as an engineer.

This particular call to rand() was written by a person with decades of experience with writing code like this. This person maybe isn't a rockstar ninja developer, but definitely experienced. Who did it? Let's git blame the code:

  13a3029435b core/lib/random.c (adamdunkels  2009-02-11 11:09:59 +0000 52)   return (unsigned short)rand();
(Yep, that's me!)


I had to check the author... learned that Thingsquare is Swedish, based in Stockholm even.

The article was written by Adam Dunkels, who is certainly not without creds in the embedded world. Why would he show the code to rand_r() in an article about rand()? So confused now.


You shouldn't be confused if you read the article. The actual PRNG is implemented in rand_r() and is the first place most people would look. The issue is in the reentrant wrapper in rand().


I seem to be catching downvotes, but I did read the article. The text right before the rand_r() code box reads:

This is the code of the rand() function, which looks familiar:

That has to count as at least a typo, but I think it's a bigger editing-type problem.


Interesting article, but, man, incredibly annoying writing style. It reads like a LinkedIn post. Use normal sentence/paragraph structure and say what you’re going to say.


  > The stack memory is somewhat tricky to allocate, because its maximum size is determined at runtime.
  >
  > There is no silver bullet: we can’t predict it. So we need to measure it.
When you're extremely memory constrained, you should probably know the max stack depth at different points in your program.

GCC has -Wstack-usage=BYTES to warn you if a single function uses more than BYTES of stack space (VLAs and alloca not included), which admittedly isn't too useful if your function calls another...


I know it's off-topic, but adding perf probes to glibc's malloc in a running system was quite revealing:

- qsort calls malloc, and qsort is used a lot in glibc-internal functions such as gethostbyname, so... yeah.

- snprintf can call malloc too

I'm sure I'm not finished discovering fun stuff.


As someone who dabbles in NES programming, 10KB is quite a lot of RAM for me. NES programming is almost always done in 6502 asm, and at least when I do it I don't even need the stack. Generally people do not use it for storage; they use the stack just for function calls, and generally use just a fraction of the zero page for the most part. Instead, every subroutine either gets its own variables somewhere in memory, or you use zero page and have to be careful that the interrupt handlers do not disturb the parts of zero page that other routines might use. It's a lot harder, perhaps, but it does force you to be more intentional with your memory usage. I'll admit I've never coded any crazy demos (yet), but programming in this way, I've yet to use anywhere near the full 2KB of standard RAM the NES has.

CHR RAM (like graphics RAM, sort of) is a different story perhaps, but at least for the game logic and some sprite manipulation, yeah, 2KB for an 8-bit game is plenty. I think programming in C (which requires a stack, and probably floats, which the 6502 has no native support for and which would require implementation by the compiler) adds a lot of convenience, but the overhead is too much for a NES, although there are libraries and tools for NES C programming out there.


And in Java, Math.cos(double) might allocate as well, for large-angle argument reduction. In Jafama I took care to avoid that by encoding the quadrant in two unused bits of the reduced angle's exponent.


Just use dilbert_rand(). Threadsafe reentrant and doesn’t allocate.


Can you.

Stop.

Writing blog posts.

Like Linkedin posts.

Like this.

It makes them.

Hard to read.


And too much repetition


And you know what people think about that?

Annoying.

How annoying?

Very annoying.


His solution is to use a PCG generator? As far as I know its designer withdrew submission of a journal article after negative reviews and has never resubmitted one. And this doesn't look good:

https://pcg.di.unimi.it/pcg.php


If you have enough hardware to do multiplication and aren't doing cryptography, you can always use my random number generator instead.

    uint64_t lemur64(void) {
      // unsigned __int128 is a GCC/Clang extension; needs a 64-bit target
      static unsigned __int128 s = 2131259787901769494;
      return (s *= 15750249268501108917ull) >> 64;
    }
The test suite on that page says it's fine.

    master jart@nightmare:~/cosmo2$ o/examples/getrandom.com -b lemur64 | RNG_test stdin64 -seed 2131259787901769494
    RNG_test using PractRand version 0.95
    RNG = RNG_stdin64, seed = 2131259787901769494
    test set = core, folding = standard (64 bit)
    
    rng=RNG_stdin64, seed=2131259787901769494
    length= 512 megabytes (2^29 bytes), time= 3.2 seconds
      no anomalies in 229 test result(s)
    
    rng=RNG_stdin64, seed=2131259787901769494
    length= 1 gigabyte (2^30 bytes), time= 7.5 seconds
      no anomalies in 246 test result(s)
    
    rng=RNG_stdin64, seed=2131259787901769494
    length= 2 gigabytes (2^31 bytes), time= 15.0 seconds
      no anomalies in 263 test result(s)
    
    rng=RNG_stdin64, seed=2131259787901769494
    length= 4 gigabytes (2^32 bytes), time= 28.6 seconds
      no anomalies in 279 test result(s)
    
    rng=RNG_stdin64, seed=2131259787901769494
    length= 8 gigabytes (2^33 bytes), time= 55.7 seconds
      no anomalies in 295 test result(s)
    
    rng=RNG_stdin64, seed=2131259787901769494
    length= 16 gigabytes (2^34 bytes), time= 108 seconds
      no anomalies in 311 test result(s)
There's also Scott Adams' prng:

    uint64_t dilbert_rand(void) {
      return 9;
    }
Another one of my favorites is UNIXv6:

    int rand(void) {
      static int gorp;
      gorp = (gorp + 625) & 077777;
      return gorp;
    }


Increasing the size of a PRNG's state greatly improves the quality of the generated sequence of numbers.

While multiplicative congruential PRNGs with a 32-bit state are poor, those with a 128-bit state, like yours, may be quite good.

However, for any application that needs pseudo-random numbers, you must have a good understanding of its requirements, in order to be able to choose the right PRNG.

For example, when the random numbers are used to choose the coordinates of random points in a multi-dimensional space, simple congruential PRNGs are guaranteed to show some correlations between the coordinates that should have been independent, though for a PRNG with a 128-bit state, unless it uses a poorly chosen multiplicative constant, the correlations will appear only for rather long sequences.

There is another kind of PRNG, which has exactly the same speed as yours, but which produces much better pseudo random sequences, the MLFG (Multiplicative Lagged Fibonacci Generator, 1984). Unlike yours, it is a non-linear generator, which is the reason for the good quality. While the speed is the same, it uses a larger state.

The choice of a PRNG can be important in 2 cases, for embedded microcontrollers, like in the parent article, and for applications which have to generate a huge amount of random numbers, e.g. billions of numbers per second.

Otherwise, there is a simpler choice, just use AES in counter mode as the PRNG.

Even when cryptographic strength is not needed, it does not hurt. The AES speed on all modern CPUs is not much lower than that of the fastest PRNGs, and only very few applications need faster PRNGs than AES. Moreover, a PRNG in counter mode has many advantages over the PRNGs based on a recurrence formula. It is possible to access quickly any arbitrary point in the random sequence. It is possible to split the random sequence in any number of subsequences, which can be computed in parallel and which are guaranteed to be uncorrelated, which is important in many multi-threaded programs.


O’Neill has responded to those criticisms here: https://www.pcg-random.org/posts/on-vignas-pcg-critique.html


Quality is not the problem with PCG, but size. He could have used newlib's rand_r(), because that's already provided, or another simple and short PRNG, like xoroshiro64s.

BTW, I never needed a malloc for my bare-metal firmwares so far. 10KB just for a stack and heap? Crazy, that's way too much. Recursion is forbidden anyway, and sbrk is not provided. Some unfortunate libcs need that, but MISRA will not be happy.


I agree with the article that a MWC generator could be better, but for their use case, a really crappy rand() might be more than adequate.


  int
  rand_r (unsigned int *seed)
  {
        long k;
        long s = (long)(*seed);
        if (s == 0)
          s = 0x12345987;
        k = s / 127773;
        s = 16807 * (s - k * 127773) - 2836 * k;
        if (s < 0)
          s += 2147483647;
        (*seed) = (unsigned int)s;
        return (int)(s & RAND_MAX);
  }

Who on this green Earth wrote this??

   long s = (long)(*seed);


What is wrong with that?


The size of a long is not defined, and is platform dependent. It may be the size of int.

Likewise

if (s < 0) s += 2147483647;

Overflow/underflow is UB in this case, and who says it is 32 bit in the first place?


That code is indeed wrong, but the error consists in using "long" instead of "unsigned long".

The sizes of "unsigned long" and of "long" are unknown, but they are at least 32 bit.

Had "unsigned long" been used, modular arithmetic without overflow would have been used, which was the intention of the code.

Of course the condition of the test must also be changed for an "unsigned long".

As it is written here, it is completely wrong. If this program were compiled with the correct "-fsanitize=undefined -fsanitize-undefined-trap-on-error" options, which should always be used for any C/C++ program unless there are good reasons to do otherwise, then the program would be aborted at one of the first invocations of the function.


Those compilation options did not make the program abort for me nor am I getting any warnings (at least not with -Wall -Wextra), even when changing all longs to ints in the above code, in order to investigate the case when sizeof (int) == sizeof (long) == 4. Setting *seed = 4294967295 has also no effect. What am I missing?


Well, looking now more carefully at the code, it is in fact correct.

The constants 16807 and 127773 are chosen so that their product is 2147480811, which is smaller enough than 2^31 to prevent overflows in the line:

s = 16807 * (s - k * 127773) - 2836 * k;

where s is known to be positive and less than 2^31, and (s - k * 127773) is the remainder from the division of s by 127773, and k is the quotient of that division.

Adding 2^31 - 1 to a negative 32-bit number cannot produce an overflow in:

if (s < 0) s += 2147483647;

So both the parent comment and my initial confirmation were wrong; the library code is correct. Initially I replied too quickly, without reading the source, because most such congruential generators are written to use modular arithmetic, while this one has been carefully modified to obtain an equivalent result using integer arithmetic (which also makes it much slower than what can be done with modular arithmetic, so the reason for this is unclear).

The only thing that can be criticized is the lack of a comment emphasizing that those constants cannot be modified haphazardly, otherwise integer overflows are certain.

So your test has verified that the code is correct, so no overflows happened.

Your large seed becomes -1 after conversion to long on a 32-bit system, or it retains its value on a 64-bit system.

For different reasons, in both cases there are no overflows (on a 64-bit system, long is 64-bit, so overflows would have been avoided even with much wider allowable ranges for the constants; the constants are optimized for systems with 32-bit longs, where they are just below the limit for overflows).

Some other values that are either negative or larger than 2^31 may cause overflows, but they are illegal for this function.


Thank you for the explanation. This looks to be correct: e.g., after changing 16807 to 16808 and using the sanitize options, the program does indeed terminate with SIGILL, depending on the value of the seed.


> That code is indeed wrong, but the error consists in using "long" instead of "unsigned long".

Is this not what I was referring to? I would like to learn more. I think we have the same reasoning, but I am bad at explaining. I think it could be a good learning experience for HN if you could say exactly which lines you believe are wrong.


If malloc() damages the running system, why even implement it in the standard library?


If you are developing against a supposed implementation, rather than the published interface, you are wrong. rand is not the culprit, newlib is not the culprit: you are.

You don't want to use malloc? Don't have malloc.


Wonder what other joyous opportunities for unforeseen functionality they're unknowingly baking into their "IoT" devices.


I don’t get it. Why does newlib malloc in rand() instead of just storing the global state in a static variable?


In the traditional standard C library, there are many functions which, in order to be invoked with fewer parameters, use some static variables.

However this approach works only in single-threaded programs. In multi-threaded programs, the static variables are clobbered when invocations of the function from different threads are interleaved.

The correct method to make the standard C library reentrant is to replace all the functions that use static variables with reentrant functions having one extra parameter; these typically carry the old name of the function with the suffix "_r" appended.

This has been done, but it means that if you have an old single-threaded program, when you want to make it multi-threaded you must search the program source for all the non-reentrant standard library functions and replace them with their reentrant counterparts.

Newlib offers an optional facility to avoid the editing of the source files when making a conversion to multi-threading, by replacing all the static variables with dynamically-allocated variables, which will be distinct in different threads.

This feature is better avoided, because in many cases the newer reentrant standard functions also correct other problems.

An exception is precisely the "rand_r" function, which is supposed to replace "rand", but which has been poorly defined, with an inappropriately small state parameter (a single unsigned int).

None of the ancient functions "rand", "rand_r", "srand", "random" or "srandom" should be used in any remotely serious application of random numbers.

There are a huge number of good PRNGs for which the source code is freely available.


It didn't.


Does anyone know why the reentrancy layer needed malloc in the first place? I didn't understand that part.


It's used to store internal random state.


But why is the internal RNG state dynamically allocated? I would have expected it to be a static variable allocated only once.


So that you can call it in different threads or in a signal without clobbering another invocation's state.


Is this thus a `malloc()` without a matching `free()`?


Yes. It lives forever, so this isn't a problem.


Jesus how annoyingly written.


It reads like a marketing blog post. Written as if this were one of those "I disassembled a printer driver to discover a race condition that could cause privilege escalation, and I made a CVE that saved millions of PCs around the world" posts, but really it is just someone who took the time to read 10 lines of badly written C.


Maybe in a low-quality implementation. rand() is not defined to have error conditions, and definitely not to assert when malloc fails.



