From the gpsd homepage (https://gpsd.gitlab.io/gpsd/index.html): “GPSD is everywhere in mobile embedded systems. It underlies the map service on Android phones. It's ubiquitous in drones, robot submarines, and driverless cars. It's increasingly common in recent generations of manned aircraft, marine navigation systems, and military vehicles.”
FTA: “Will it be cherry-picked back to the 3.20, 3.21, and 3.22 branches?
gpsd does not have enough volunteers to maintain "branches". Some distros try to cherry pick, but usually make things worse.
This bug was announced on gpsd-dev and gpsd-users email lists. So the packagers for several distros already saw it. What they do is what they do.”
So, it seems gpsd is like the tz database, a few volunteers maintaining an essential part of our software infrastructure.
> So, it seems gpsd is like the tz database, a few volunteers maintaining an essential part of our software infrastructure.
More than that. Software now runs the economy and controls the safety and security of the physical world.
gpsd, like the tz database, cURL, SQLite, and the Linux kernel, should be seen as critical planetary infrastructure, period. The safety of our economy and our physical well-being increasingly depends on them being operational.
And yes, it's worrying that we've ended up in a situation where the building blocks of our technological society are maintained by underpaid volunteers.
I saw in another comment here by offmycloud [1] that this affects:
> Android phones and tablets.
"In addition, the Android smartphone operating system (from version 4.0 onwards and possibly earlier; we don't know for sure when the change happened) uses GPSD to monitor the phone's on-board GPS, so every location-aware Android app is indirectly a GPSD client."
Can someone explain how the patch for this will reach all Android devices (especially the large number of devices running older versions of the OS and not getting any updates at all)? What exactly are the consequences for these users?
> Until last year, leap seconds had been very predictable. The effect of global warming on the Earth's rotational speed was only very recently seen, or even predicted. But, yes, going forward, that needs to change.
Somewhat unrelated: Can someone explain the rationale for writing comparisons in the ordering they're using (e.g., 2180 < week)?
I've seen similar before and always thought it seemed error-prone to not write them the way they'd be spoken aloud, but I'm happy to entertain other explanations.
This is probably just me, but I always tend to write the bigger number on the right, so that I can picture a number line in my head.
(Which means I rarely use > and >= operators. It's always < or <=.)
Now, come to think of it, this might have something to do with my native language (Japanese). In Japanese, where the verb always comes at the end of a sentence, you can say "a < b" and "b > a" using the same order and the same adjective.
a < b ... a -than b -toward bigger (a よりも b のほうが 大きい)
b > a ... b -toward a -than bigger (b のほうが a よりも 大きい)
I've seen style guides recommend this to avoid typos on == that accidentally result in assignment: write "123 == foo" instead of "foo == 123", so that you can't accidentally write "foo = 123".
It's not only about not accidentally writing `if (x=0)`.
The `if (0==x)` style also makes it obvious that the check is correct when reviewing/reading code. Sure, a linter might catch this. But this way the reader doesn't need to rely on that. Besides, many codebases allow variable assignment as part of conditional/loop expressions, and sometimes, sadly, it's easier to write code this way than to get a team to use a linter.
Regarding it being unnatural... you get used to it, and especially in C one needs to take care to check the return code the right way (0!=, 0==, -1!=, 0<, !, etc.), whereas the other side of the check is often more straightforward (a function call, a variable, etc.), so it's nice to have the constant up front. It takes very little extra space at the front. As a bonus, all the constants will visually line up nicely that way.
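A minimal C sketch of the constant-first style being described; the variable name and values are made up for illustration:

```c
#include <stdio.h>

int main(void)
{
    int week = 2180;

    /* Constant-first ("Yoda") comparison: a dropped '=' becomes a
     * compile error instead of a silent assignment. */
    if (2180 == week)
        printf("at the boundary\n");

    /* if (2180 = week)  -- typo: does not compile, 2180 is not an lvalue
     * if (week = 2180)  -- same typo the other way around: compiles,
     *                      silently sets week, and is always true    */
    return 0;
}
```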
> Surely such code would be caught by some other tool
This technique was invented back in the 1980s, when compilers had none of the static analysis capabilities we take for granted today. I think the reason it's still used in 2020 is a matter of habit.
When working with lots of timeseries information I've found it helpful to order comparisons so that values always increase from left to right, so I would write 2180 < week, or even (not 2180 < week) for the negation of the condition. This becomes almost required when there are multiple values being tested: (a <= 10 && 10 <= b && b <= c) is much easier to read than (a <= 10 && b >= 10 && b <= c). As a perspective, it's focused more on establishing an invariant of the resulting data than on writing a single predicate.
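A small C illustration of that convention, with made-up values; the chained version reads like the interval it establishes:

```c
#include <stdio.h>

int main(void)
{
    int a = 3, b = 10, c = 12;

    /* Values never decrease from left to right, so the condition can
     * be read off as the chain a <= 10 <= b <= c. */
    if (a <= 10 && 10 <= b && b <= c)
        printf("%d <= 10 <= %d <= %d holds\n", a, b, c);

    /* The equivalent mixed-direction spelling,
     * (a <= 10 && b >= 10 && b <= c), forces the reader to flip
     * the arrows mentally to recover the same ordering. */
    return 0;
}
```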
I assume it's a habit formed from guarding against unintentional assignment.
Less so for most comparisons, but for equality it means you get an error if you use the wrong operator, instead of unintended side effects.
week = 2180 will set the week to 2180 in a lot of languages.
2180 = week will always throw an error.
So if I want to compare, it's safer to use the form 2180 == week, because if I forget the second '=', the compiler will tell me before I cause bigger problems.
This mistake also requires that the assignment operator has a result (and further, that the result can be silently coerced into a boolean for whatever reason).
In both Rust and Swift, this mistake doesn't compile: their assignment operators don't have a result, and so it can't very well be true (or false). ‡
‡ Technically the result of Rust's assignment operators is the empty tuple, which is the closest to "doesn't have a result" in the type system. I don't know about Swift.
Even more unrelated, but equally pedantic, I've always thought it's weird when people write "null != val" instead of "val != null" for the same reason.
When said out loud, "null is not val" just feels wrong.
I find initializing conditions like `if(Type variable = ...)` to be very nice in C++ to avoid excessive nesting while still keeping the variable scoped to the block. Of course, I also enable -Wparentheses to catch things like `if(val=null)`; you get it e.g. when using -Wall with both GCC and Clang.
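For what it's worth, a small C sketch of the kind of thing -Wparentheses flags; the names are made up, and you'd build with -Wall (which enables it on both GCC and Clang):

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *val = "gpsd";

    if (val = NULL)            /* warned: assignment used as a condition */
        printf("unreachable\n");

    /* Intentional assignment-in-condition: the extra parentheses plus an
     * explicit comparison make the intent clear and silence the warning. */
    if ((val = strchr("gpsd", 'd')) != NULL)
        printf("found %s\n", val);

    return 0;
}
```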
In both these cases, this is caused by the language having built-in binary operators that, when the left-hand side expression is a collection type, perform the operation elementwise and return an array.
Interestingly, it seems like PowerShell's operator overload resolution in general depends entirely on the type of the LHS. I say 'seems' because when I looked into it a while ago I couldn't find any sort of language specification like what C# and VB.NET have, and testing seemed to confirm that this was the case. Now, searching the PowerShell Core source, it seems from [1] that this is indeed the implementation.
This contrasts with C# [2] and VB.NET [3], where binary operator overload resolution is treated as if the candidate operator implementations were a two-parameter method group, making the resolution process 'commutative' (though not always commutative in practice as the operators themselves can still have different LHS and RHS types {Edit: example from the CLR [4]: +(Point, Size) but not +(Size, Point)}).
> it seemed error-prone to not write them the way they'd be spoken aloud
It is written the way it'd be spoken aloud. If you're not speaking it that way then you need to change the way you think. Programming is another language after all.
> I don't know about you, but "week is greater than 2180" sounds more natural than "2180 is less than week"
To you perhaps. To me, I think "if I put both sides on the number line, which way is being questioned? and is that question answered true or false?"
And therefore I almost always do an equals or less-than comparison because that's how I think about the number line: 0 in the center with negatives on the left and positives on the right.
So `if (week < 2048)` is just as valid and easy to think about as `if (!(2048 <= week))`. But then `if (!(2048 <= week))` provides an additional guarantee: that I won't accidentally assign to `week`.
I have heard that the timing on GPS is somehow delivered as weeks and that the bit size of the variable keeping track of the weeks is too small. So every now and then the weeks reset, and this is managed through overrides in the clients. Is this bug not just referencing that thing, the override of the week rollover?
Yes and no. GPS has 10 bits for the week number (so 1024 weeks).
This code is using the number of leap seconds that have happened to sanity-check which group of 1024 weeks we are in. The assumption is that by December of 2022 we would have had another leap second, so if we had fewer than 19 total leap seconds, then something has gone wrong. However, due to incorrect arithmetic, this sanity check is looking at October 2021.
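Not gpsd's actual code, just a C sketch of the general idea of using the cumulative leap-second count to pick the right 1024-week block; the constants are approximate and made up for illustration, and an error in exactly this kind of week arithmetic is what shifts the date at which the check misfires:

```c
#include <stdio.h>

#define GPS_WEEK_MODULUS  1024
#define LEAPS_SINCE_2017  18     /* GPS-UTC offset in effect since 2017-01-01 */
#define WEEK_OF_2017      1930   /* approx. absolute GPS week of 2017-01-01 */

/* GPS broadcasts only week mod 1024, so software has to guess which
 * 1024-week block ("epoch") it is in. Leap seconds only accumulate, so
 * a reported count of 18 or more means we cannot be earlier than 2017.
 * Real code combines this with other hints (e.g. the software's own
 * build date); this shows only the leap-second part of the heuristic. */
static int resolve_week(int broadcast_week, int leap_seconds)
{
    int week = broadcast_week;

    if (leap_seconds >= LEAPS_SINCE_2017) {
        while (week < WEEK_OF_2017)
            week += GPS_WEEK_MODULUS;
    }
    return week;
}

int main(void)
{
    /* Broadcast week 143 with 18 leap seconds must be absolute week
     * 2191 (143 + 2*1024), i.e. early 2022, not 1982 or 2002. */
    printf("%d\n", resolve_week(143, 18));
    return 0;
}
```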
Further comments point out that the sanity check need not be in production code at all, but should be moved to test code.
GPS also has a field indicating when the previous or next leap second was or will be; this field is 8 bits or about 5 years. The last leap second was in 2016 and no future leap second has been announced. So GPS needs to mark a non-leap to keep the week offset in bounds.
This happened before in 2003 during the previous long gap in leap seconds (1998-2005).
"In addition, the Android smartphone operating system (from version 4.0 onwards and possibly earlier; we don't know for sure when the change happened) uses GPSD to monitor the phone's on-board GPS, so every location-aware Android app is indirectly a GPSD client."
Why do people use gpsd instead of just reading $GPGLL or $GPRMC from /dev/ttyACM0 or /dev/ttyUSB0 or whatever, which always seemed far more reliable to me?
The FAQ answers this [1]. The issue is that GNSS vendors and standards need to clean up their act before you can do this reliably across different receivers.
I suppose. But I've had so many more problems with gpsd (especially when e.g. USB enumerates devices randomly) that on outdoor robots I've switched to parsing NMEA strings and binary data over serial directly, specifically for reliability reasons.
Also, I've had gpsd decide that some other non-GPS serial device was a GPS and take up the port; I got frustrated at its incompetence and apt-get removed it.
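For context, "parsing NMEA directly" can be as small as something like this sketch (in C, with a hard-coded device path, no termios setup, and no checksum validation; some receivers emit $GNRMC rather than $GPRMC, and the fields are only all populated once the receiver has a fix, i.e. status 'A'):

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *gps = fopen("/dev/ttyUSB0", "r");   /* adjust device as needed */
    char line[256];

    if (gps == NULL) {
        perror("open gps");
        return 1;
    }
    while (fgets(line, sizeof line, gps) != NULL) {
        /* $GPRMC,hhmmss.ss,A,lat,N,lon,E,speed,course,ddmmyy,... */
        if (strncmp(line, "$GPRMC", 6) == 0) {
            char utc[16] = "", date[8] = "";
            char status = 'V';

            if (sscanf(line,
                       "$GPRMC,%15[^,],%c,%*[^,],%*c,%*[^,],%*c,%*[^,],%*[^,],%7[^,]",
                       utc, &status, date) == 3 && status == 'A')
                printf("UTC %s on %s (ddmmyy)\n", utc, date);
        }
    }
    fclose(gps);
    return 0;
}
```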
Yeah, if gpsd sees anything with the right serial chip it'll assume it's a receiver. The whole thing is a pretty big pile of hacks, but it has utility nonetheless.
Technically you can (in that you won't usually get an error), but the result is not what you would want: each byte only gets delivered to one of the readers, in a somewhat unpredictable manner, either mangling the data or starving one of the readers.
This brings back so many nightmares from my old job. These kinds of things would creep up on us all the time, and we'd spend a week or so scratching our heads over how the hell our systems managed to time travel.
Well, some other comments indicate that this might affect NTP servers? Would that indicate some kind of follow-on effect on database timestamps? Timezone localisation/conversions? Replication setups?
Most people running NTP servers do not use GPS units directly, but sync them to public NTP servers - for example https://www.ntppool.org/en/. Red Hat, for example, ships these as the default upstream NTP servers. If the ntppool.org servers use GPS and gpsd, they are likely to have patched the issue well in advance.
I work in a business where we do use serial- and network-connected GPS devices as stratum 0 time sources for NTP, and yes, we have concerns about the implications of this bug for some of our remote devices. If gpsd starts sending incorrect time/date to the local ntpd, it will probably be marked as a falseticker. We have multiple GPS-based NTP servers in our datacenters as fallbacks; however, we will probably need to check with the vendor for a firmware update for this issue.
It most likely does not. You can get accurate (down to ~10 ms) time from other NTP sources. What you want from a GPS-based NTP server is the PPS output, which is accurate to a few ns.
Yes. It's the classic ice skater effect. Climate gets warmer => ice (on mountains) melts. The meltwater flows down into the ocean, the Earth's moment of inertia is reduced => the Earth's rotation speeds up.
Just to put things into perspective: the elevation of the land surface at the Earth's south pole is over 2800 m above sea level. The highest point in Greenland is over 3600 m above sea level.
The number of leap seconds required is determined by the Earth's rotation speed, which isn't constant. In the same way that an ice skater extending his arms slows down, shifts in mass on the Earth can alter its rotation speed. Earthquakes, icemelt, atmospheric warming, and even the filling of the Three Gorges Dam can have an effect at the scale required for GPS synchronization.
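The skater analogy in one line, from conservation of angular momentum for the solid Earth (ignoring external torques):

    L = I\omega = \mathrm{const} \;\Rightarrow\; \frac{\Delta\omega}{\omega} \approx -\frac{\Delta I}{I} \;\Rightarrow\; \frac{\Delta T_{\mathrm{day}}}{T_{\mathrm{day}}} \approx \frac{\Delta I}{I}

so a redistribution of mass that lowers the moment of inertia I shortens the day, and one that raises it lengthens the day.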
This isn't even really a GPS thing, it's UTC time that bounces around because it wants to stay synchronized to the day/night cycle. GPS time is continuous.
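A sketch of the bookkeeping in C; the 18 s value is the GPS-UTC offset in effect since the 2017 leap second (it changes whenever a new leap second is announced), while TAI-GPS is fixed at 19 s by definition:

```c
#include <stdio.h>

#define GPS_UTC_OFFSET 18LL   /* seconds; valid since 2017-01-01, changes
                                 each time the IERS announces a leap second */
#define TAI_GPS_OFFSET 19LL   /* seconds; fixed by definition */

/* GPS time is a continuous count; converting to UTC is a lookup of the
 * leap seconds accumulated so far, which is exactly the bookkeeping
 * gpsd and receivers have to get right. */
static long long gps_to_utc(long long gps_seconds)
{
    return gps_seconds - GPS_UTC_OFFSET;
}

static long long gps_to_tai(long long gps_seconds)
{
    return gps_seconds + TAI_GPS_OFFSET;   /* hence TAI - UTC = 37 s today */
}

int main(void)
{
    long long t = 1300000000LL;   /* an arbitrary GPS-scale timestamp */

    printf("UTC scale: %lld, TAI scale: %lld\n", gps_to_utc(t), gps_to_tai(t));
    return 0;
}
```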
We decided to have leap seconds to keep UTC (which otherwise needn't care and could just be based on TAI) matched against UT, which depends on the gentle spinning of the large rock we all live on.
The IERS https://www.iers.org/ is in charge of monitoring the spinning of the Earth. On the basis of their assessment a decision is made every six months whether to inject (or indeed remove) a leap second.
If we decided not to match UTC to UT and thus we did not care precisely how quickly or slowly the Earth is spinning, we could abolish leap seconds.
If you meant, "Why can't we precisely predict the motions of a vast rock floating in space years into the future" then I don't know what to tell you. We're not God?
> If you meant, "Why can't we precisely predict the motions of a vast rock floating in space years into the future" then I don't know what to tell you. We're not God?
IMO this feels like the more interesting one to explore: do leap seconds come wholly from things that are within measurement error of the existing rotation, or is the Earth's rotational rate actually changing? If we had leap minutes as the smallest increment, could we predict them out centuries in advance? etc.
No, they are not measurement error. The moment of inertia of the earth changes over time in irregular ways (e.g., melting glaciers). Even a major earthquake can show up in the earth's rotation.
Leap seconds account for the accumulated difference in the rotational period from time to time.
Leap seconds correct for the difference between the time as measured by atomic clocks and the time determined by solar observations.
Turns out the rotation speed of the Earth varies. Things like tides, earthquakes, and climate change can affect it. There is no formula for that; the only thing you can do is measure and issue a leap second when required.
> I don't think gpsd has any reason to be predicting when a future leap-second is going to occur
Until last year, leap seconds had been very predictable. The effect of global warming on the Earth's rotational speed was only very recently seen, or even predicted. But, yes, going forward, that needs to change.
> And the code in question is clearly expecting only positive leap-seconds.
Yes, because until 2020, a negative leap second was unthinkable. I would welcome you testing that and seeing what falls out.
It'd be cool if we could just invent a new standard time system that's independent of things like that, which add variation or unexpected randomness.
I mean, sure, it would be annoying, and our generation would have to endure the pain of upgrading our systems, but it's a one-time change, so it would be worth it, no?
I'd be interested to hear how you're going to bring it into alignment with GMT and the other Sun-and-Earth-based measuring systems which people are actually going to want to use. Note that the solar day does vary unpredictably in length (https://en.wikipedia.org/wiki/Day_length_fluctuations).
It looks like you would need to change the Earth's rotational energy by ~1.4*10^22 J to change the length of a mean solar day by 1/365 seconds (which would cause UTC to change by 1 second per year). If energy costs 1 cent per kilowatt hour, this is only around $40 trillion, which is much less than I was expecting.
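Rough arithmetic behind that figure, taking the Earth's rotational kinetic energy as E \approx 2.1 \times 10^{29}\,\mathrm{J}:

    \Delta E \approx 2E\,\frac{\Delta T}{T} \approx 2 \times 2.1\times10^{29}\,\mathrm{J} \times \frac{(1/365)\,\mathrm{s}}{86400\,\mathrm{s}} \approx 1.4\times10^{22}\,\mathrm{J}

and at 1 cent per kWh (1 kWh = 3.6\times10^{6}\,\mathrm{J}) that comes to about 3.9\times10^{15} kWh, i.e. roughly $4\times10^{13}, or $40 trillion.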
If energy becomes a few orders of magnitude cheaper and someone knows a reasonable mechanism to put that energy into the earth's rotation, Google or someone similar might find it easier to keep days at 86400 seconds than to deal with leap seconds.
There's not really anything to invent or upgrade. Just never add any more leap seconds (the past ones can't be safely removed). We need to lobby the IERS harder.
Switching everyone from UTC to TAI is too much work and if you try to use TAI while everyone else is on UTC you'll run into off-by-37 bugs everywhere. It's better to keep using UTC but add no more leap seconds to it.
If people didn't need leap seconds, they would already not be using them. There's absolutely no use case for adding leap seconds for a few decades and then stopping. Either put them in or don't.
At the time leap seconds were introduced, it was much rarer for anyone to be able to tell (much less care) that someone on a another continent had a clock a few seconds off from theirs (and those who cared most were probably astronomers, which is why we ended up with leap seconds). There's a reasonable argument that the number of bugs (and extra work for programmers) now caused by them is enough that we should just stop adding them (and perhaps change time zones every few millennia).
Of course, the 'correct' way to fix it would be to use TAI rather than UTC just about everywhere, but that change would be hard to implement compared to just not adding more leap seconds.
Leap seconds are the right thing to do but people didn't realize all the bugs and costs they would trigger. Now that we understand these costs we can and should change our minds.
Well, leap something is arguably the right thing to do, but I'm not convinced that seconds are the best size. It's very possible that a leap minute every century or two would cause less disruption. It's also possible that making leaps a lot more frequent would cause less disruption, because good code is well-tested and oblivious code is less impacted.
Which actually makes the problem more disruptive: when it eventually needs to be solved, the solution has to be rediscovered rather than retained, as it is with problems that occur more frequently.
> If people didn't need leap seconds, they would already not be using them.
Google already don't. But I think they had to patch their kernels etc. to achieve that.
Leap seconds were a bad solution and we should remove them from general-purpose computer systems (some very specialised systems may need them). But it's a massive coordination problem and most people just don't care enough to change anything.
No, but it has most of the same advantages and disadvantages as just ignoring them. I would bet that the only reason they apply a smear rather than just ignoring the leap second entirely is to keep their clocks in sync with the outside world.
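A sketch of the shape of a linear smear in C; the 24-hour window matches what Google has described publicly, and everything else (names, timestamps) is made up for illustration:

```c
#include <stdio.h>

#define SMEAR_WINDOW 86400.0   /* seconds: spread the leap over 24 hours */

/* Fraction of the extra (positive) leap second absorbed at time t,
 * where leap_end is the instant the leap second completes. During the
 * window the smeared clock runs slightly slow, so it stays monotonic
 * and ends the window back in step with post-leap UTC. */
static double smear_offset(double t, double leap_end)
{
    double start = leap_end - SMEAR_WINDOW;

    if (t <= start)
        return 0.0;                      /* smear not started yet */
    if (t >= leap_end)
        return 1.0;                      /* full second absorbed  */
    return (t - start) / SMEAR_WINDOW;   /* linear ramp in between */
}

int main(void)
{
    double leap_end = 1000000.0;         /* arbitrary illustrative epoch */

    for (double t = leap_end - SMEAR_WINDOW; t <= leap_end; t += 21600.0)
        printf("t=%9.0f  offset=%.2f s\n", t, smear_offset(t, leap_end));
    return 0;
}
```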
I believe we used to, but only recently have we begun to understand that the Earth's rotation is not slowing down in a linear/predictable fashion. A comment on the bug says this has something to do with global warming, but I don't really have much context here.
In the same vein, there is no algorithm to precisely predict moon phases. A complete cycle takes approximately 29.5 days, but not exactly. To find out you have to look into the sky, which can also be a bit subjective.