Rust vs. C++ Formatting

spoiler · on Jan 17, 2023

In the vector example, why not just `collect::<...>()` and debug print that, or am I missing the point of what they tried to illustrate?

Also, I do agree that C++'s approach to formatting the date was more user friendly/ergonomic, but on the other hand it's not something I've struggled with. I also feel like there is a more succinct version (I don't know if it would have the same performance characteristics and how much that mattered when writing the post).

They also later mentioned overall consistency and coherence in the Rust ecosystem, and I value that more over a couple of hand crafted formatters in application code

tialaramex · on Jan 17, 2023

Arguably we might not be able to store the resulting collection.

For example suppose we've got 64MB of RAM, and we've got the iterator 0..i32::MAX

If we try to collect() that, we're going to have a bad time, but while that's a lot of numbers, we certainly could write it to a file, we probably do have gigabytes of disk space, or we could send it over the network - the Internet certainly has storage for these numbers.
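A minimal sketch of that point, using only the standard library: each item is written as it is produced, so nothing is ever collected. The helper name `write_streaming` is made up for illustration, and `io::sink()` stands in for the file or socket you would really write to.

```rust
use std::io::{self, Write};

/// Writes each item on its own line as it is produced, so memory use stays
/// constant no matter how long the iterator is.
fn write_streaming<W, I>(mut out: W, items: I) -> io::Result<()>
where
    W: Write,
    I: IntoIterator,
    I::Item: std::fmt::Display,
{
    for item in items {
        writeln!(out, "{item}")?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // `io::sink()` discards everything; a File or TcpStream is used the same way.
    // A shortened range keeps the demo quick; 0..i32::MAX would not need more memory.
    write_streaming(io::sink(), 0..1_000_000)?;

    // Writing into a Vec<u8> just to show what the output looks like.
    let mut buf = Vec::new();
    write_streaming(&mut buf, 0..3)?;
    assert_eq!(buf, b"0\n1\n2\n");
    Ok(())
}
```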

estebank · on Jan 17, 2023

But even then the point still stands: you wouldn't want a non-custom debug representation of the iterator precisely because of the cost. You could create a new-type that wraps your iterator and implements a custom `Debug` impl with the contents you desire. Here the issue at play seems to be a disagreement on what is more useful in a Default implementation: a preview of the results of the iteration or a representation of the type structure. I believe that which of the two is more useful depends entirely on what you're using the Debug info for (with the caveat that going from the latter to the former in your head is possible, but the other way around isn't).

Arnavion · on Jan 17, 2023

You can't preview an arbitrary Iterator anyway because iterating is destructive. The elements you iterated will be gone forever. Besides, the typesystem won't let you do it anyway, since collect() requires either `Self` or `&mut Self`, and `fmt` only receives `&Self`.

So at best you could only do this with Clone'able Iterators, but even then it might not be ideal for formatting to create "wasteful" clones. So maybe you'd only do it for individual Iterator types like Range or std::slice::Iter for which cloning is cheap. But even .map()ping such a trivially-Cloneable Iterator would yield a non-Cloneable or non-trivially-Cloneable Iterator, so at that point you have to wonder if it's worth making a small subset of Iterators have these "useful" Debug impls.
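For the Clone-only case, a rough sketch of what such a wrapper could look like. `Preview` is a hypothetical name, not something in std: it formats a clone of the iterator, so the wrapped one is left untouched.

```rust
use std::fmt;

/// Hypothetical newtype: shows the items an iterator *would* yield by
/// iterating a clone, leaving the wrapped iterator untouched.
struct Preview<I>(I);

impl<I> fmt::Debug for Preview<I>
where
    I: Iterator + Clone,
    I::Item: fmt::Debug,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `fmt` only receives `&self`, so we clone instead of consuming.
        f.debug_list().entries(self.0.clone()).finish()
    }
}

fn main() {
    let values = vec![10, 20, 30];
    let preview = Preview(values.iter());
    println!("{preview:?}");
    // Printing consumed nothing: the wrapped iterator still yields all three items.
    assert_eq!(preview.0.count(), 3);
}
```

Cloning `std::slice::Iter` or a `Range` is cheap; for an iterator that owns a large buffer the clone inside `fmt` would be exactly the "wasteful" copy mentioned above.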

spoiler · on Jan 17, 2023

But even in Cpp, you need to eventually iterate over the underlying iterator to get the results of the map. So, if you want to debug print it, you still need to iterate over it regardless.

Regarding the mutability and collect issues you raised, I think I just explained poorly. What I meant is that the author could've done something like this https://play.rust-lang.org/?version=stable&mode=debug&editio...

However, debug printing "on the fly" (without collection first) would definitely be more involved with interior mutability and require a wrapper type, for the reasons you outlined, but not impossible.
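Roughly what I have in mind, as a sketch only. `FormatOnce` is a made-up wrapper, not a std or itertools type: a `RefCell<Option<I>>` lets `fmt` take the iterator out through `&self`, and the items are fed to the formatter one at a time instead of being collected first.

```rust
use std::cell::RefCell;
use std::fmt;

/// Hypothetical one-shot wrapper: formats the items as they are produced,
/// without collecting them. The first formatting call consumes the iterator.
struct FormatOnce<I>(RefCell<Option<I>>);

impl<I> FormatOnce<I> {
    fn new(iter: I) -> Self {
        FormatOnce(RefCell::new(Some(iter)))
    }
}

impl<I> fmt::Debug for FormatOnce<I>
where
    I: Iterator,
    I::Item: fmt::Debug,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `take()` moves the iterator out through the shared reference.
        let taken = self.0.borrow_mut().take();
        match taken {
            Some(iter) => f.debug_list().entries(iter).finish(),
            None => f.write_str("<already consumed>"),
        }
    }
}

fn main() {
    let squares = FormatOnce::new((1..4).map(|x| x * x));
    assert_eq!(format!("{squares:?}"), "[1, 4, 9]");
    // Iterating is destructive, so a second print has nothing left to show.
    assert_eq!(format!("{squares:?}"), "<already consumed>");
}
```

Printing a placeholder on the second call is just one possible choice; panicking or returning `fmt::Error` would be equally defensible.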

Arnavion · on Jan 17, 2023

Your playground is what the article author did, except they used `itertools::Itertools::format` so that the entire collection or formatted string doesn't need to be built up-front but can instead be fed to the `std::fmt::Formatter` element-by-element.

spoiler · on Jan 17, 2023

Ah, I understand that part, but I assumed they were looking for the most ergonomic/succinct and idiomatic approach (ie not needing to reach for a crate). But it doesn't have the same performance characteristics as using itertools (which is better here)

shakow · on Jan 17, 2023

If the vector is 64MB long, you probably don't want to print it either.

ridiculous_fish · on Jan 16, 2023

How do Rust projects localize their format strings?

tialaramex · on Jan 17, 2023

For a lot of Rust projects they just don't. Unlike a lot of stuff from the 1990s or so, the Rust core itself doesn't care about localization. It speaks Unicode fluently, but it doesn't try to think about locale problems at all.

For projects that are user facing and so need to do localization, you do have a few options, including both gettext and ICU at opposite ends of the scale of "How difficult is your problem/ how much work do you want to do?"

Gettext is just going to let you substitute translations like "X {1} Y {2}" in one language versus "YorX: {2}or{1}" in another language, whereas ICU understands how to render the Japanese form of today's date and which is the correct plural form for 38 of this thing in Russian (but it's still on you to prepare all the plural variants for each type of thing you want to quantify, ICU just tells you which of them to use)

est31 · on Jan 17, 2023

> Unlike a lot of stuff from the 1990s or so, the Rust core itself doesn't care about localization. It speaks Unicode fluently, but it doesn't try to think about locale problems at all.

The design of Rust's standard library is to be as minimal as possible. Full localization support would be too much for a tiny standard library such as Rust's. Contrast this to Java or Go.

In the library ecosystem, there are full implementations of fluent available, and the compiler is being translated. It's not ergonomic to use yet, meaning you have to put the strings into a separate .ftl file (maybe it will never be), but it's quite powerful and developed by a lot of experts on the topic.

ridiculous_fish · on Jan 17, 2023

That's a depressing answer, that nobody bothers.

gettext seems pretty difficult to use in Rust - there's no obvious way to apply a runtime format string "My name is {1}".

https://blog.hackeriet.no/rust-and-translation-files/ describes using gettext from Rust but not how to handle format strings.

haimez · on Jan 17, 2023

It seems reasonable to me that this is not a compile time constraint, but rather one of tooling and the deployed environment- ie: is the Japanese translation available here at runtime? How? Is there a runtime configuration flag? Is the translations file available at runtime? Is the date formatter configured appropriately?

Maybe if you bake in assumptions about the deployed environment at compile time you can catch errors earlier for your specific environment, but all environments? No. That’s not the responsibility of the standard library and given the language lacks a single specific runtime target- definitely not the responsibility of the language/compiler. This is instead a classic use case for tests and opinionated (optional/third party) libraries instead.

ridiculous_fish · on Jan 17, 2023

I agree that e.g. locating .mo files is out of scope for a stdlib, but runtime format strings seems like something the stdlib could do. Because it does not, std::fmt is borderline useless for software which wants to support multiple languages.

haimez · on Jan 17, 2023

My basic suggestion is that the standard library _should_ be useless at supporting multiple languages, because it’s not a goal that can be achieved for almost anyone’s actual use case. You’re going to have to actually design internationalization into your own codebase in a way that makes sense for your application, and a half-measured implementation just doesn’t have a place at the language core.

Format strings, for example- are essentially compile time macros in rust in order to avoid all the fun kinds of string handling bugs that C’s implementation allowed- but that means no runtime customization without also defining where alternate formats live (and how they’re verified, etc). Not supporting multiple languages is a feature, not a bug. If you need multi language support, you need a library to define those semantics in a way that fits your use case.

tialaramex · on Jan 17, 2023

Rust's format! macro only wants actual literals, because you get a compile error if the format is wrong, which would not be possible for a non-literal. C++ does this for a constant even if it's not literal, which is fancier, but Rust doesn't have a way to achieve that today.

However, you don't have to use format! to format stuff, you would presumably want an API which better reflects the runtime errors you can now encounter, such as wrong number of arguments, wrong order of arguments, incompatible formats.
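As a hedged sketch of what such an API might look like (this is not an existing std or gettext interface; `format_runtime` and `TemplateError` are invented names): positional `{1}`-style placeholders are substituted at runtime, and the mistakes that `format!` would reject at compile time come back as `Err` values instead. Escaping (`{{`) and format specs are deliberately left out.

```rust
use std::fmt;

/// Errors that a literal format string would have reported at compile time.
#[derive(Debug, PartialEq)]
enum TemplateError {
    /// A `{` without a matching `}`.
    Unclosed,
    /// The text between the braces is not an unsigned integer.
    BadIndex(String),
    /// The placeholder refers to an argument that was not supplied.
    MissingArg(usize),
}

/// Hypothetical helper: substitutes 1-based placeholders such as `{1}`.
fn format_runtime(template: &str, args: &[&dyn fmt::Display]) -> Result<String, TemplateError> {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        let close = after.find('}').ok_or(TemplateError::Unclosed)?;
        let spec = &after[..close];
        let index: usize = spec
            .parse()
            .map_err(|_| TemplateError::BadIndex(spec.to_string()))?;
        // `{0}` and out-of-range indices are both reported as missing arguments.
        let arg = index
            .checked_sub(1)
            .and_then(|i| args.get(i))
            .ok_or(TemplateError::MissingArg(index))?;
        out.push_str(&arg.to_string());
        rest = &after[close + 1..];
    }
    out.push_str(rest);
    Ok(out)
}

fn main() {
    let args: [&dyn fmt::Display; 2] = [&"apples", &38];
    // The same arguments work with translations that reorder them.
    assert_eq!(format_runtime("X {1} Y {2}", &args).unwrap(), "X apples Y 38");
    assert_eq!(format_runtime("YorX: {2}or{1}", &args).unwrap(), "YorX: 38orapples");
    // A translation that asks for too many arguments is a runtime error.
    assert_eq!(format_runtime("{3}", &args), Err(TemplateError::MissingArg(3)));
}
```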

estebank · on Jan 17, 2023

On top of that, you could provide a `format` macro that shadows the language item and does support gettext or similar (although that might not be true with regards to the f-string support).

est31 · on Jan 17, 2023

The f-string support (capturing "{hello}") doesn't need anything more than a macro can do. macro_rules macros can't dream up literals from stuff they computed, but proc macros can.

You are probably aware, but there is a currently ongoing project to move the format_args macro further into the compiler (it's a builtin macro right now, but not doing much that a proc macro can't do), to do its work during AST->HIR lowering. On top of that, some optimizations are proposed that would be impossible to implement on macros today.

For some of those optimizations, a primitive to allow proc macros to expand macros of their own would make it possible to have with a pure-macro solution. Even just the ability for macros to say that they want their input to be expanded instead of receiving pre-expanded input would be enough. These primitives are not available today, but are possible future extensions.

merb · on Jan 17, 2023

well there is: https://crates.io/crates/gettext-macros besides the problem, that it's gpl-v3 which is unsuitable for most projects, it's still a working example that it isn't so hard to use.

yencabulator · on Jan 18, 2023

And for anyone who thinks gettext and ICU are just old/brittle/clunky/unsafe things, you may enjoy https://projectfluent.org/

https://github.com/projectfluent/fluent-rs

LegionMammal978 · on Jan 17, 2023

Personally, I haven't had any issues using the i18n-embed-fl crate [0], with a wrapper over the fl!() macro that passes a lazily-initialized global loader. (I did have to add `loader.set_use_isolating(false)` to my initialization function to keep it from outputting invisible Unicode characters; you only really need it to be `true` if you're planning on outputting RTL text in an LTR context, or vice versa.)

[0] https://crates.io/crates/i18n-embed-fl

ridiculous_fish · on Jan 17, 2023

Great tip, I wasn't aware of that crate!

sedatk · on Jan 17, 2023

There's no built-in support for localization AFAIK.

mr_00ff00 · on Jan 16, 2023

Finally, after 40 years technology has advanced far enough for C++ to have a print function instead of cout

asguy · on Jan 16, 2023

It always had printf: https://cplusplus.com/reference/cstdio/printf/

Edit: It's almost like the whole world got a lot of work done with the tools they already had.

tialaramex · on Jan 16, 2023

It had C's printf, which means it isn't type safe, which is consequently a terrible primitive for this work. Like, it makes sense in C, which thinks boolean is a fancy new concept and thinks 0-terminated strings are a good idea, but it's not actually good.

std::println is more or less what you would obviously build for a modern language and it's notable because C++ could have provided something pretty similar even in C++ 98, and something eerily similar in C++ 11 but it chose not to.

hn_go_brrrrr · on Jan 16, 2023

I think you'll find implementing a type-safe print function in C++11 very challenging. What you could do in constexpr was very limited, type deduction was less powerful, and it didn't have fold expressions. I'd say this really only became feasible since C++17.

saghm · on Jan 16, 2023

Maybe the top comment in this thread wasn't so facetious after all then; it did take almost 40 years for C++ to advance enough to have a decent print function

Kranar · on Jan 16, 2023

Boost had a type-safe printf function dating back to October 10th, 2002:

https://www.boost.org/doc/libs/1_31_0/libs/format/

1ris · on Jan 16, 2023

And outstreams were not that bad, as well. Sure, the operator overloading looks a bit rough. But that's IMHO a pragmatic choice if you want to offer customisation points and didn't have variadic functions yet. They were introduced only in C++11.

tialaramex · on Jan 16, 2023

C++ did have variadic functions because it inherited them from C.

What it didn't inherit from C was a way to write variadic functions with variadic types, so that had to be home grown.

1ris · on Jan 16, 2023

Do you mean this feature [0]? I'm not aware of any differences in c and c++ about this. Can you get a type of a argument in C? How? At compile time, or at runtime? Both sound very un-C-like to me. cppreference is usually excellent documentation but it doesn't mention something like this.

I don't consider this to be "proper" variadic arguments, because a function's argument has to have a type, and these, as far as I'm aware, don't have one. This is about as powerful as passing a void**. This is essentially memcpying multiple differently typed values into a char* buffer and then passing that buffer. If you then correctly copy them back, you have pretty much the same behaviour. Both methods obviously lack important aspects of the language abstraction of a function parameter, and I don't see what that feature can bring to the table that the previous techniques don't.

[0] https://en.cppreference.com/w/cpp/utility/variadic

kevin_thibedeau · on Jan 16, 2023

> Can you get a type of a argument in C?

You can get it indirectly using _Generic().

1ris · on Jan 16, 2023

Sorry my bad. Yes, it works for proper function arguments. Does this work for variadic arguments? Parent seems to suggest, but i'm not aware of any mechanism for this.

tialaramex · on Jan 16, 2023

Sure, a hypothetical C++ 11 equivalent would e.g. probably not be able to do the trick where we reject bogus formats at compile time, which is something programmers want and which Rust only gets to do by making this a fancy macro whereas C++ 23 is doing it in the actual library code.

otherjason · on Jan 17, 2023

fmtlib (https://github.com/fmtlib/fmt), a C++11 formatting library that was the basis for C++20 std::format, is able to do this.

tialaramex · on Jan 18, 2023

fmtlib does not, in fact, do this on C++ 11 because as explained it doesn't have enough constant evaluation.

The fmtlib code uses a C-style macro to try to handle format errors at compile time, which detects whether it has enough consteval and no-ops if it does not. On a modern C++ with enough consteval it does exactly what you'd expect it to do, but on older compilers it does nothing.

The result is that fmtlib on an older compiler gets you Exceptions, at runtime, for a format that was invalid at compile time, and if you upgrade the compiler it magically switches to compile time errors.

Edited to add quote from fmtlib docs:

"Compile-time checks are enabled by default on compilers that support C++20 consteval. On older compilers you can use the FMT_STRING macro defined in fmt/format.h instead. It requires C++14 and is a no-op in C++11."

bee_rider · on Jan 17, 2023

Booleans are weird though. A single bit on a system that mostly talks about bytes and bigger? How are you supposed to represent that?

LegionMammal978 · on Jan 17, 2023

In the context of Rust, a bool is a 1-byte value with 2 valid representations (0x00 and 0x01) and 254 invalid representations (disregarding uninit bytes). The compiler can make use of these invalid representations for niche optimizations: For instance, an Option<bool> (i.e., std::optional<bool>) takes only 1 byte instead of 2, since the compiler can repurpose the invalid byte 0x02 to represent the None (i.e., std::nullopt) state of the Option. The compiler generally tries to perform these niche optimizations for all enums, although it's only 100% guaranteed to happen in certain cases.
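A quick way to observe this, as a small sketch: the sizes below are what current rustc produces. Only the reference case is one of the layouts that is actually guaranteed; the `Option<bool>` result is an optimization the compiler performs in practice.

```rust
use std::mem::size_of;

fn main() {
    // `bool` occupies one byte, and so does `Option<bool>`: `None` is stored
    // in one of the bit patterns that is invalid for `bool`.
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(size_of::<Option<bool>>(), 1);

    // `u8` has no invalid bit patterns, so `Option<u8>` needs a separate tag.
    assert_eq!(size_of::<Option<u8>>(), 2);

    // References are never null, which is a case where the niche is guaranteed.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
}
```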

jchw · on Jan 17, 2023

Basically like an enum? It's not that weird. It does not need to use a storage of one bit, just like C chars can technically be larger than eight bits. I suspect a bigger reason that boolean is missing from C is simply because C can't offer much additional value for boolean in its type system.

C++ can obviously do more with boolean types because it's got a more advanced type system, like disallowing values other than true and false to be assigned in code that only uses defined behavior. Boolean itself can take up whatever size the compiler decides it does, same idea as C. But, template types can special case boolean to use bitfields and allow for more space-efficient operations.

gmadsen · on Jan 17, 2023

i believe this is in cpp23

1ris · on Jan 16, 2023

printf is a bad joke of a formatting function.

When i want to print a string i don't want to worry about the security implications of that. With printf i have to. [0]

And i certainly don't want a turing complete contraption. [1] Also looking at log4j.

And even if everything is correct, it has to parse a string at runtime. I consider that alone unaesthetic.

>Edit: It's almost like the whole world got a lot of work done with the tools they already had.

The best metaphor i know for this attitude is "stacking chairs to reach the moon". If you don't care about the limits of the tech you will be stuck within it.

I'm time and time again amused how anti intellectual and outright hostile to technological progress the programming profession is. programmers, out of all of them.

[0] https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=printf

[1] https://news.ycombinator.com/item?id=25691598

Someone · on Jan 16, 2023

> And even if everything is correct, it has to parse a string at runtime. I consider that alone unaesthetic.

Technically, it doesn’t have to do that. If a program includes the header declaring printf using the <> header defined in the standard and then calls printf the compiler is allowed to assume that the printf that the program will be linked to will behave according to the standard, and need not compile a call to printf. It can generate code that behaves identically.

A simple example is gcc converting a printf with a constant string to a puts call (https://stackoverflow.com/questions/25816659/can-printf-get-...)

asguy · on Jan 16, 2023

> If you don't care about the limits of the tech you won't be able exceed what you think is possible.

Did you propose/implement/release something better than printf?

> I'm time and time again amused how anti intellectual and outright hostile to technological progress the programming profession is. programmers, out of all of them.

Perfect is the enemy of good. Some people talk about getting work done, some people get the actual work done and move on.

1ris · on Jan 16, 2023

>Did you propose/implement/release something better than printf?

This is what the article is about? Things much better than printf are a dime a dozen and have been available for 20 years.

>Some people talk about getting work done,

Like this article does? While you're busy arguing that you could do the same thing, but much worse?

spoiler · on Jan 16, 2023

> Perfect is the enemy of good. Some people talk about getting work done, some people get the actual work done and move on.

In my experience, people with this motto generally produce code which frustrates the whole team.

Being a perfectionist is toxic in its own way, though.

There needs to be a balance. I think that balance is to think and plan a few steps ahead (not too much, as it's counter productive) before hitting the keyboard. I know this sounds a bit like a "d'oh, of course" but it really—and unfortunately—isn't something that people practice; they just think they do.

Gibbon1 · on Jan 16, 2023

Let's consider #embed which is new for C23. It allows you to import binary blobs of data into a C program at compile time. Like say if you want to import an image or sound file or a table.

How hard was that to implement? Seriously no reason it couldn't have been part of C89. Why wasn't it? Because the compiler writers and the C++ standards committee have no personal use for it. It took 40 years of waiting and five years to get it just barely past the standards committee. If you think no one would strenuously oppose a feature like embed you'd be wrong.

Those guys also have no interest in printf type functions. And improving printf would be a lot more work than implementing #embed.

CountSessine · on Jan 16, 2023

That’s neat - Borland C had the same thing with the `emit()` pseudo-function with their C89 compiler. I guess Borland’s compiler writers wanted it more than gcc’s?

panzi · on Jan 17, 2023

You can use ld to turn any file into an object file:

    ld -r -b binary -o foo_txt.o foo.txt

foo_txt.o then has these symbols:

    extern const char _binary_foo_txt_start[];
    extern const char _binary_foo_txt_end[];
    extern const void *_binary_foo_txt_size;

So you need to write your own declarations (it doesn't generate a header file).

_binary_foo_txt_size is weird and has to be used as: (size_t)&_binary_foo_txt_size

Or use (size_t)(_binary_foo_txt_end - _binary_foo_txt_start) instead.

Gibbon1 · on Jan 16, 2023

Consider the difference between what a compiler does and say a video game or embedded firmware. Compilers are old school batch mode programs that import data from a file, parse it, transform it to something, and emit it as a file.

chlorion · on Jan 17, 2023

>some people get the actual work done and move on

These people's "actual work" often ends up causing endless streams of security vulnerabilities and bugs too.

Most of the same people you are referring to don't seem to believe that security vulnerabilities exist or are important enough to care about for some reason, but in the real world these are very important issues.

scoutt · on Jan 17, 2023

> These people's "actual work" often ends up causing endless streams of security vulnerabilities and bugs too.

On the other hand, we have people that apparently wouldn't make a program if they are not guaranteed (by another human being) that it will be safe.

If those people generating bugs and vulnerabilities had had to sit tight waiting for someone to make a safe language to do anything, today the world would be 40 years or more behind.

(safe languages that, sarcastically, were created using all those unsafe tools and infrastructure)

Also in this real world a trillion of printf are being output right now, and will be for a long long time. Is the world falling apart?

You can also list all the printf CVEs but... how many println! are being output?

usefulcat · on Jan 17, 2023

> Perfect is the enemy of good.

Sure, but we're talking about printf here. printf is manifestly mediocre.

I guess 'perfect is the enemy of mediocre' doesn't have quite the same ring.

syrrim · on Jan 17, 2023

Everything has security implications in c, but printf isn't particularly bad. Common use of it involves a fixed format string specified at the call site. This prevents the most dangerous use of it (user specified format string) and also allows the compiler to detect when the format string doesn't correspond to the types of the arguments. Both these failures can be converted into compile time errors in common compilers. Printf, for all C's other faults, really isn't that bad.

arcticbull · on Jan 16, 2023

> Edit: It's almost like the whole world got a lot of work done with the tools they already had.

This feels a little defensive, but also pretty out of line with the philosophy of the C++ standards committee. The committee has been aggressively stapling every new leg they could find to that dog for decades. They just chose not to staple this particular leg on until now.

simplotek · on Jan 17, 2023

> This feels a little defensive, but also pretty out of line with the philosophy of the C++ standards committee. The committee has been aggressively stapling every new leg they could find to that dog for decades. They just chose not to staple this particular leg on until now.

Your comment doesn't bear any resemblance to reality. C++ started with a spartan standard library and only recently did it standardize its file system API.

Compare that with what, say, POCO already offers. Or Boost. Or java/C#/Python/etc.

What exactly led you to believe that absurdity?

arcticbull · on Jan 17, 2023

> What exactly led you to believe that absurdity?

The fact that the standards committee simply chose to just add every feature every other language has.

simplotek · on Jan 17, 2023

> The fact that the standards committee simply chose to just add every feature every other language has.

Again, this take is outright wrong and totally clueless. I mean, the summary of each change introduced by any of the C++ standards is freely available. C++20's most compelling features beyond concepts and modules were small improvements over existing features like lambda captures and template resolutions, or new attributes.

What compels you to make such nonsensical claims?

arcticbull · on Jan 17, 2023

> What compels you to make such nonsensical claims?

Literally all the additions between C++03 and C++20.

Here's a small list since you'd rather attack me than review the changelogs, it seems.

- range-based for loops.

- enum class.

- digit separators and binary literals.

- consteval/constexpr/constinit.

- std::move

- std::forward

- std::variant based mock pattern matching.

- lambdas.

- structured binding declarations.

- an ABI for garbage collection.

- coroutines.

- concepts.

C++20 even added a three-way comparison operator.

This is just a random selection.

spoiler · on Jan 16, 2023

You could build anything you desire with only a hammer if you're creative enough

dureuill · on Jan 16, 2023

Yes, security researchers got quite a lot of work exploiting the antiquated tools of C coders

[1]: https://www.opencve.io/cve?cwe=CWE-134

Jensson · on Jan 16, 2023

Which is a C function and not safe at all. It is easy to make your own print function in C++ that is safe and easy to use, but there is no such function in std.

stevenhuang · on Jan 16, 2023

There is now in C++20 https://en.cppreference.com/w/cpp/utility/format/format

Which is just the great https://fmt.dev/latest/index.html that even c++11 projects can use.

unclad5968 · on Jan 16, 2023

Why are you worried about how safe printing to terminal is? Genuinely curious, I don't work in software dev.

GrumpySloth · on Jan 16, 2023

In C you can use “%g” printf format string (which indicates a value of type double), and then not pass a double to it, but e.g. an int. Easy mistake to make, when changing pre-existing code.

On x86 what will happen is the code will compile, but the function is going to read its argument from a floating point register instead of an integer register as it should. This:

1. Is a bug, since a completely unrelated garbage value is going to be printed.

2. Leaks the value of a register, which may be a security issue.

There are still other common issues which can easily turn into vulnerabilities, leaking private process memory, when people pass untrusted strings as format strings with the intention of printing them raw.

So you want a safe print to prevent trivial bugs in general, and security vulnerabilities in particular.
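For contrast, a small sketch of how both problems look on the Rust side of the comparison (the variable names are just for illustration). The format string has to be a literal, so untrusted text can only ever be an argument, and each placeholder is checked against the argument's type at compile time.

```rust
fn main() {
    // Text that would be dangerous if handed to C's printf as the format string.
    let untrusted = "%s%n {} {0}";

    // User input can only be passed as an argument; it is copied to the
    // output verbatim and never interpreted as a format string.
    let line = format!("{}", untrusted);
    assert_eq!(line, "%s%n {} {0}");

    // `{:e}` requires the `LowerExp` trait, which `f64` implements and `&str`
    // does not, so `format!("{:e}", untrusted)` is rejected by the compiler
    // instead of printing garbage at runtime.
    let sci = format!("{:e}", 1234.5_f64);
    assert_eq!(sci, "1.2345e3");
}
```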

__david__ · on Jan 17, 2023

You’re not wrong, but on the other hand, every C compiler I’ve used for the past 25 years has at least a warning you can enable for that. And you can easily add some attribute to custom functions that make use of *printf() functions under the hood to get those also type checked by the compiler. In practice that’s been good enough for me (and catches exactly the type of error you describe).

charcircuit · on Jan 17, 2023

Usually the warning is enabled by default

maccard · on Jan 16, 2023

You're not always printing to a terminal,

    char buf[10];
    const char* foo = "wrong?";
    int res = snprintf(buf, 20, "What could possibly go %d", foo);

Will compile and do... something...

jholine · on Jan 17, 2023

Solved problem since ages ago.

error: format '%d' expects argument of type 'int', but argument 4 has type 'const char*' [-Werror=format=]

You only get into trouble when you use runtime format strings (like passing a user string as first argument to printf)

maccard · on Jan 17, 2023

That doesn't work on MSVC, and ignores multiple other issues with the 3 lines of code I shared (and as you correctly identified doesn't work with runtime format strings).

It also doesn't work if your code isn't a textbook example of being wrong. [0] is _slightly_ more contrived but still suffers all of the exact same problems, despite all of the information being available at compile time.

[0] https://gcc.godbolt.org/z/9vK3W4bYh

jasode · on Jan 16, 2023

>Why are you worried about how safe printing to terminal is?

The "type-safe" means "type-checked" by the compiler for correctness to help prevent bugs. It doesn't mean "safety-as-in-not-dangerous".

gary_0 · on Jan 16, 2023

printf() was often used for logging in eg. web servers. If there's no way of strictly checking the size/type of what's being printed (HTTP headers, say) then there are lots of tricky ways you can use it to write arbitrary memory and pwn the server.

Type-unsafeness in general also just allows for hard-to-find bugs, since only certain data at runtime will introduce undefined behavior.

RcouF1uZ4gsC · on Jan 16, 2023

Actually, in defense of C++, there are not very many statically typed, compiled languages that can implement a type-safe print function that takes a variable number of arguments (that do not all share a base class) using the language itself. For many languages, print is special and the implementation is provided as part of the language implementation.

ADDENDUM:

Even iostreams back in the 1980's was a huge deal at the time, as it showed that it was possible to implement type-safe io in a statically typed, compiled language using language features just like any other library, instead of having it be special purpose.

tragomaskhalos · on Jan 17, 2023

I always thought that iostreams was very elegant. But it wasn't until the internet came along that I learned that most of the rest of the world regards its use of `operator <<` as a crime against decency, so what do I know?

joenot443 · on Jan 16, 2023

Swift can do it, yes?

meindnoch · on Jan 16, 2023

Swift can do what?

String interpolation is part of the language in Swift.

forrestthewoods · on Jan 16, 2023

huh? Every new language has type safe print functions. Having to specify the type with printf formatters is archaic and awful.

kccqzy · on Jan 16, 2023

Most new languages with type-safe print functions are either dynamically typed (which makes this whole discussion moot) or leverage some significant advance in PL theory in order to make the print function type-safe. By advance, I mean not known before the 1990s.

forrestthewoods · on Jan 16, 2023

Sorry, I explicitly meant statically typed languages. Rust, Nim, Zig, Odin, Jai, etc.

> For many languages, print is special and the implementation is provided as part of the language implementation.

Sure. Printing variables to strings is definitely worth language-level features imho. It's a bad thing if a language (C++) requires users to come up with extremely complex libraries (fmt/std::format) because the language lacks the features to make such a common operation simple and reliable.

C style printf is a dumpster fire. C++ iostreams are unusably slow. Modern languages definitely solve this particular problem much better!

lmm · on Jan 16, 2023

> Sorry, I explicitly meant statically typed languages. Rust, Nim, Zig, Odin, Jai, etc.

Requiring a macro is no better than what C does. Most of those languages have a less awful macro system than C's, but that's not an actual language advancement. The state of printing is still awful in the languages you mention.

forrestthewoods · on Jan 17, 2023

> Requiring a macro is no better than what C does.

I disagree in the strongest, most emphatic way possible.

As a user of programming languages the difference between printing in C and printing in Rust is night and day. Whether that is achieved with a powerful macro system or with the core language does not matter to me in the slightest. It carries zero weight.

Writing code that prints variables in C is painful and bad. Writing code that prints variables in Rust is easy and good. If you'd like to say that "easy and good" is the same as "painful and bad" then I disagree.

When std::format ships the gap between C++ and Rust will shrink dramatically. However Rust will still have an advantage with derive macros. C++'s lack of reflection continues to be a major pain point. Maybe in C++29.

I don't think you can fully separate "language" from "macros". If something can be implemented easily in a macro then there is less reason to bake it into a new feature in the language. I don't think it's a good idea to add language features solely so you can say it's a language feature and not a macro feature. YMMV.

lmm · on Jan 17, 2023

A language that makes it easy to represent how to process a potentially complex but semi ad-hoc structure in a coherent way is interesting. A language that's added a special case for when the processing you want to do happens to be printing is much less so. Derive is a poor substitute for a proper record system.

forrestthewoods · on Jan 17, 2023

Maybe. But a poor substitute is a radically superior alternative to literally nothing.

int_19h · on Jan 17, 2023

If the result is type safe and the syntax is consistent with the rest of the language, the users won't and shouldn't care whether it's implemented as a builtin, as a macro, or as some other kind of magic.

dilap · on Jan 16, 2023

You don't like Zig's solution? I think it's excellent. Normal zig code, executed at compile-time. No magic, no macros, fully compile-time error-checked formatting.

https://github.com/ziglang/zig/blob/master/lib/std/fmt.zig#L...

lmm · on Jan 17, 2023

I don't, no. Proper stage polymorphism is great, but running arbitrary code at compile time without a coherent model of how you're doing that (which ad-hoc const/constexpr/etc. isn't) is like substituting for generics by having dynamic types. If I'm just going to run arbitrary code at compile time I might as well do that with a make rule.

dilap · on Jan 17, 2023

Sure, it is basically a dynamic language at compile time, but I feel like the downsides of that are significantly mitigated by the fact that it's running at, you know, compile time -- you'll still get compile-time errors for any mistakes!

(Some people even try to write entire programs in dynamic languages, as crazy as that is.)

Regarding "might as well have a makefile rule," it's not really fair -- in practical terms having a seamless way to evaluate functions in your code at compile time is way different than doing codegen. Like, in practice, how would you write a type-safe format function using codegen? You could do it, but it would be very, very gnarly. Very different.

Even Go -- which famously for a while tried to advocate codegen as an alternative to generics (w/ stuff like stringer to codegen print functions for human names for constants) -- resorts to dynamic types & runtime errors for formatting.

So let's see:

formatting approaches we've seen so far:

– dynamic w/ runtime errors (e.g. Go, C#, C, ...)

– static via baked into compiler support

– static via special-case restricted compile-time evaluated language (e.g. rust macros)

– static via full-lang compile-time eval (e.g. zig)

I think you said all those are awful, so, I'm curious -- what's the better approach you have in mind?

lmm · on Jan 17, 2023

IMO the least-bad vaguely mainstream approach is what people do in Haskell with e.g. formatting or fmt. Your formatter should be a first-class strongly typed value, and the way you form/construct that value should be the normal way you form values in your language.

I do think there's space for a "write a complex value literal by writing a string in a DSL and embedding that into the source" feature. But that shouldn't be specific to format strings, and it shouldn't be by running arbitrary code at compile time; rather it should be a language feature. I haven't seen a version of that that I'm really happy with yet, but Haskell's OverloadedStrings or Scala's StringContexts are some baby steps in the right direction.
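To make the "formatter as a first-class, strongly typed value" idea concrete, here is a rough sketch in Rust rather than Haskell. `Fmt`, `text`, `then`, `on`, `User` and `user_line` are names made up for this illustration; this is not the API of Haskell's formatting or fmt packages, only the general shape: formatters are ordinary values, built and combined with ordinary function calls, and the input type is tracked by the type system.

```rust
// A formatter for values of type T is an ordinary value wrapping a function.
// (All names here are hypothetical, for illustration only.)
struct Fmt<T>(Box<dyn Fn(&T) -> String>);

impl<T: 'static> Fmt<T> {
    fn new(f: impl Fn(&T) -> String + 'static) -> Self {
        Fmt(Box::new(f))
    }

    // Apply the formatter to a value.
    fn run(&self, value: &T) -> String {
        (self.0)(value)
    }

    // Literal text that ignores its input.
    fn text(s: &'static str) -> Self {
        Fmt::new(move |_| s.to_string())
    }

    // Concatenate two formatters over the same input type.
    fn then(self, next: Fmt<T>) -> Self {
        Fmt::new(move |v| {
            let mut out = self.run(v);
            out.push_str(&next.run(v));
            out
        })
    }

    // Turn a formatter for T into a formatter for U by projecting a T out of a U.
    fn on<U: 'static>(self, project: impl Fn(&U) -> T + 'static) -> Fmt<U> {
        Fmt::new(move |u| self.run(&project(u)))
    }
}

// Primitive formatters.
fn int() -> Fmt<i64> {
    Fmt::new(|n: &i64| n.to_string())
}

fn string() -> Fmt<String> {
    Fmt::new(|s: &String| s.clone())
}

struct User {
    name: String,
    age: i64,
}

// A composite formatter; passing the wrong field type to `int()` or
// `string()` would be a compile-time type error.
fn user_line() -> Fmt<User> {
    Fmt::<User>::text("name=")
        .then(string().on(|u: &User| u.name.clone()))
        .then(Fmt::text(", age="))
        .then(int().on(|u: &User| u.age))
}

fn main() {
    let u = User { name: "Ada".to_string(), age: 36 };
    println!("{}", user_line().run(&u));
}
```

The trade-off dilap raises in the reply still applies: this is type safe without macros or compile-time evaluation, but it is wordier than a format string with interpolations.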

dilap · on Jan 18, 2023

Thanks.

My haskell knowledge is minimal, but looking at formatting and fmt library examples, what stands out is you're operating in the syntax of the language, so it's clumsier to use (imo) than a string with interpolations. (It seems not-dissimilar to C++ streams, to me.)

I guess your second paragraph gets at that. I think you're arguing for, "build it into the language, but flexibly." Fair enough!

xigoi · on Jan 17, 2023

Nim's implementation doesn't require a macro.

atq2119 · on Jan 16, 2023

Rust's print! implementation is pretty darn complicated, wouldn't you say?

forrestthewoods · on Jan 16, 2023

I don't know what Rust's print! implementation looks like. I also don't care. I am not a Programming Language programmer. I am a user of programming languages.

Using C-style printf sucks balls. It's extremely error prone and it doesn't support complex types. Using Rust's print system is delightful. I can't make a type error and arbitrarily complex types can be printed with roughly zero effort via {:?} and #[derive(Debug)].
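For anyone who hasn't used it, a minimal sketch of what that looks like. The `Point` and `Shape` types and `sample_shape` are made up for illustration; only `#[derive(Debug)]`, `{:?}` and `{:#?}` come from the standard library.

```rust
// Hypothetical types, used only for illustration.
#[allow(dead_code)] // fields are only read through the derived Debug impl
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Shape {
    name: String,
    corners: Vec<Point>,
    tag: Option<u8>,
}

fn sample_shape() -> Shape {
    Shape {
        name: String::from("segment"),
        corners: vec![Point { x: 0, y: 0 }, Point { x: 3, y: 4 }],
        tag: None,
    }
}

fn main() {
    let shape = sample_shape();
    // `{:?}` uses the derived Debug impl; nested types are handled recursively.
    println!("{:?}", shape);
    // `{:#?}` is the pretty-printed, multi-line variant.
    println!("{:#?}", shape);
    // A placeholder/argument mismatch such as `println!("{} {}", 1)`
    // is rejected at compile time rather than at run time.
}
```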

xigoi · on Jan 16, 2023

However, Nim's is very simple: https://github.com/nim-lang/Nim/blob/version-1-6/lib/system/...

elcritch · on Jan 16, 2023

The magic that makes Nim’s print / repr functions really neat is the `fieldPairs` iterator that works on any type. It’s a comptime function to iterate fields of a type, very handy and very simple to use. It can work on any type unlike Rust’s Debug macro which only works on types you own and can annotate.

https://nim-lang.org/docs/iterators.html#fieldPairs.i%2CS%2C...
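On the Rust side, the usual workaround for a type you don't own is a local wrapper (newtype) with a hand-written `Debug` impl. A minimal sketch, assuming a foreign type with public fields; the `external` module, `Reading`, `DebugReading` and `describe` are hypothetical names standing in for another crate's type and your own helper code.

```rust
use std::fmt;

// Stand-in for a type from another crate that does not implement Debug
// (hypothetical; a real foreign type could not be annotated with derive).
mod external {
    pub struct Reading {
        pub sensor: u32,
        pub celsius: f64,
    }
}

// Local newtype: because we own it, we are allowed to implement Debug for it.
struct DebugReading<'a>(&'a external::Reading);

impl fmt::Debug for DebugReading<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `debug_struct` produces the same layout a derived impl would.
        f.debug_struct("Reading")
            .field("sensor", &self.0.sensor)
            .field("celsius", &self.0.celsius)
            .finish()
    }
}

fn describe(r: &external::Reading) -> String {
    format!("{:?}", DebugReading(r))
}

fn main() {
    let r = external::Reading { sensor: 7, celsius: 21.5 };
    println!("{}", describe(&r));
}
```

Unlike `fieldPairs`, this has to be written per type and only sees public fields, which is the limitation being pointed out above.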

dureuill · on Jan 16, 2023

> For many languages, print is special and the implementation is provided as part of the language implementation.

what's the point of not having print provided as part of the language implementation, though?

int_19h · on Jan 17, 2023

Print is not the only facility where strongly typed variadic arguments are useful. If you implement it as a builtin, all those other cases will have to be implemented as a library, and as a result, argument passing looks very different from print. And it is desirable that the same thing is consistently expressed in the same manner.

Karellen · on Jan 17, 2023

Because plenty of places where C is used won't need it, or indeed any part of libc.

Also, it's a fairly simple dividing line to decide upon between the language and the library - if it can be implemented in a handful of assembly instructions (or, ideally, 1 instruction) it's part of the language. If it can't and needs a "function" (either in C, or implemented in assembly) to work, or needs to talk to some other part of the computer (like a kernel, or a BIOS) it's part of the library.

Like malloc()/free() - again, not part of the core language spec, but part of the standard library, which can be omitted in some ("freestanding", as opposed to "hosted") implementations.

Remember, C was created in the '70s. Memory was measured in kilobytes. Tens of kilobytes if you were lucky. Even through the '80s, 1 megabyte was a lot.

stevenhuang · on Jan 16, 2023

The format library in C++20 is based on the third-party {fmt} library, https://fmt.dev/latest/index.html , so even C++11 projects can use that library directly.

kzrdude · on Jan 16, 2023

after 30 years, Python also finally has printf:

world = "earth"

print(f"Hello {world}")

taspeotis · on Jan 17, 2023

How's the Networking TS coming along? I can't wait to connect to ARPANET.

andrepd · on Jan 16, 2023

C++ always had printf. If you mean Python-style fmt, there have also been libraries for that for a long time, e.g. https://fmt.dev/

mk89 · on Jan 16, 2023

And it will take another 40 to convince people to use it instead of cout. :)

renox · on Jan 17, 2023

I doubt it very much: cout is so bad (ever tried to display a hexadecimal number? Barf!) that in many cases printf is used instead even though it isn't typesafe. Which isn't so bad when you use a static checker in the CI, which you should anyway, C++ being so full of traps.
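For comparison with the other side of the thread's title, hexadecimal output in Rust's format syntax is a per-placeholder flag rather than stream state. A small sketch; `hex_forms` is just a helper name invented for this example.

```rust
// Returns three hexadecimal renderings of the same number.
fn hex_forms(n: u32) -> (String, String, String) {
    (
        format!("{:x}", n),     // lowercase digits, no prefix
        format!("{:#X}", n),    // uppercase digits with a 0x prefix
        format!("{:#010x}", n), // zero-padded to width 10, prefix included
    )
}

fn main() {
    let (plain, prefixed, padded) = hex_forms(255);
    println!("{} {} {}", plain, prefixed, padded);
}
```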

scrubs · on Jan 16, 2023

get with the program. Printf in its 86 varieties has been around for years.

Second, I'm increasingly associating that Brit 70s show "Keeping Up Appearances" with Rust, specifically the overly blushed face of the woman protagonist.