These days I think the sane option is to just add a static assert that the machine is little endian and move on with your life. Unless you're writing glibc or something do you really care about supporting ancient IBM mainframes?
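A minimal sketch of such an assert, assuming a GCC/Clang toolchain that predefines __BYTE_ORDER__ (other compilers need a different check):

    /* Refuse to build on anything that isn't little-endian.
     * __BYTE_ORDER__ / __ORDER_LITTLE_ENDIAN__ are GCC/Clang predefines. */
    _Static_assert(__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__,
                   "this code assumes a little-endian target");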
Plenty of binary formats out there containing big endian integers. An example is the format I'm dealing with right now: ELF files. Their endianness matches that of the ELF's target architecture.
Apparently they designed it that way in order to ensure all values in the ELF are naturally encoded in memory when it is processed on the architecture it is intended to run on. So if you're writing a program loader or something you can just directly read the integers out of the ELF's data structures and call it a day.
Processing arbitrary ELF inputs requires adapting to their endianness though.
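A rough sketch of what that adaptation looks like, assuming glibc's <elf.h> and <endian.h>; read_half16 is just an illustrative name:

    #include <elf.h>      /* EI_DATA, ELFDATA2LSB */
    #include <endian.h>   /* le16toh, be16toh (glibc; BSDs put these in <sys/endian.h>) */
    #include <stdint.h>

    /* Decode a 16-bit ELF field according to the byte order declared in
     * the file's e_ident bytes, not the host's. */
    static uint16_t read_half16(const unsigned char *e_ident, uint16_t raw)
    {
        return (e_ident[EI_DATA] == ELFDATA2LSB) ? le16toh(raw) : be16toh(raw);
    }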
The fun really starts if you have a CPU using big endian and a bus using little endian.
Back in the late 1990s, we moved from Motorola 68k / SBus to Power/PCI. To make the transition easy, we kept using big endian for the CPU. However, all available networking chips only supported PCI / little endian at that point. For DMA descriptor addresses and chip registers, one had to remember to use little endian.
I feel like we're at a point where you should assume little endian serialization and treat anything big endian as a slow path you don't care about. There's no real reason for any blob, stream, or socket to use big endian for anything afaict.
If some legacy system still serializes big endian data then call bswap and call it a day.
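Roughly this, as a sketch; __builtin_bswap32 is the GCC/Clang spelling (MSVC calls it _byteswap_ulong):

    #include <stdint.h>

    /* Normalize a big-endian 32-bit field from a legacy format on read;
     * on a big-endian host this is a no-op. */
    static uint32_t from_be32(uint32_t be)
    {
    #if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
        return __builtin_bswap32(be);   /* GCC/Clang built-in */
    #else
        return be;
    #endif
    }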
The internet is big-endian, and generally data sent over the wire is converted to/from BE. For example the numbers in IP or TCP headers are big-endian, and any RFC that defines a protocol including binary data will generally go with big-endian numbers.
I believe this dates from Bolt Beranek and Newman basing the IMP on a BE architecture. Similarly, computers tend to be LE these days because that's what the "winning" PC architecture (x86) uses.
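For concreteness, this is the familiar conversion at the socket boundary (a sketch assuming POSIX sockets; fill_addr is a made-up helper):

    #include <sys/socket.h>
    #include <netinet/in.h>   /* sockaddr_in, htons, htonl, INADDR_ANY */
    #include <string.h>

    /* Port and address fields in sockaddr_in are stored in network byte
     * order (big-endian); htons/htonl swap on little-endian hosts and
     * are no-ops on big-endian ones. */
    static void fill_addr(struct sockaddr_in *addr, unsigned short port)
    {
        memset(addr, 0, sizeof *addr);
        addr->sin_family = AF_INET;
        addr->sin_port = htons(port);
        addr->sin_addr.s_addr = htonl(INADDR_ANY);
    }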
The low-level parts of the network are big-endian because they date from a time when a lot of networking was done on big-endian machines. Most modern protocols and data encodings above UDP/TCP are explicitly little-endian because x86 and most modern ARM are little-endian. I can't remember the last time I had to write a protocol codec that was big-endian; that was common in the 1990s, but that was a long time ago. Even for protocols that explicitly support both big- and little-endian encodings, I never see an actual big-endian encoding in the wild and some implementations don't bother to support them even though they are part of the standard, with seemingly little consequence.
There are vestiges of big-endian in the lower layers of the network but that is a historical artifact from when many UNIX servers were big-endian. It makes no sense to do new development with big-endian formats, and in practice it has become quite rare as one would reasonably expect.
Is it though? Because my experience is very different than GP’s: git uses network byte order for its binary files, msgpack and cbor use network byte order, websocket uses network byte order, …
> any RFC that defines a protocol including binary data will generally go with big-endian numbers
I'm not sure this is true. And if it is true it really shouldn't be. There are effectively no modern big endian CPUs. If designing a new protocol there is, afaict, zero benefit to serializing anything as big endian.
It's unfortunate that TCP headers and networking are big endian. It's a historical artifact.
Converting data to/from BE is a waste. I've designed and implemented a variety of simple communication protocols. They all define the wire format to be LE. Works great, zero issues, zero regrets.
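A sketch of the kind of helper such an LE wire format needs (put_le32 is an illustrative name); writing the bytes explicitly keeps it correct regardless of host byte order:

    #include <stdint.h>

    /* Serialize a 32-bit value as four little-endian bytes, independent
     * of the host's endianness. */
    static void put_le32(uint8_t *out, uint32_t v)
    {
        out[0] = (uint8_t)(v);
        out[1] = (uint8_t)(v >> 8);
        out[2] = (uint8_t)(v >> 16);
        out[3] = (uint8_t)(v >> 24);
    }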
> There are effectively no modern big endian CPUs.
POWER9, Power10 and s390x/Telum/etc. all say hi. The first two in particular have a little endian mode and most Linuces run them little, but they all can run big, and on z/OS, AIX and IBM i, must do so.
I imagine you'll say effectively no one cares about them, but they do exist, are used in shipping systems you can buy today, and are fully supported.
Yeah those are a teeny tiny fraction of CPUs on the market. Little Endian should be the default and the rare big endian CPU gets to run the slow path.
Almost no code anyone here will write will run on those chips. It’s not something most programmers need to worry about. And those that do can easily add support where it’s necessary.
The point is that big endian is an extreme outlier.
If you're writing an implementation of one of those "early protocols", sure. If not, call a well-known library, let it do whatever bit twiddling it needs to, and get on with what you were actually doing.
Also recommending octal is sadistic!