Assuming you're using a PPS signal and a kernel driver, presumably there's an interrupt handler or a timer-capture peripheral latching a hardware timer count when the PPS edge occurs. It doesn't matter much when the userspace code gets around to adjusting the clock, as long as it can compute the difference between when the PPS edge came in and when it should have come in. The Linux API for fine-tuning the system time works in deltas rather than absolute timestamps, so it too is fairly immune to userspace scheduling jitter.
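For reference, that delta interface is adjtimex(2) (ntp_adjtime() is the portable spelling). A minimal sketch, with a made-up offset:

    #include <stdio.h>
    #include <sys/timex.h>

    int main(void)
    {
        struct timex tx = {0};

        /* Hand the kernel a signed correction rather than an absolute
         * timestamp; it doesn't matter when this call actually runs,
         * only that the delta was computed correctly. Needs
         * CAP_SYS_TIME. */
        tx.modes  = ADJ_OFFSET | ADJ_NANO;  /* interpret offset in ns */
        tx.offset = -42000;                 /* example: we're 42 us fast */

        if (adjtimex(&tx) == -1)
            perror("adjtimex");
        return 0;
    }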
Even good hardware oscillators can drift quite a bit, say 50 µs per second (50 ppm), but they tend to be stable over several minutes outside of extreme thermal environments. That makes it fairly easy to estimate and compensate for drift using a PPS signal as a reference. Presumably that compensation is part of what takes the time daemon a while to converge on.
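A sketch of the estimate, assuming you've collected local-clock timestamps of successive PPS edges (the array here is a stand-in for however you get them out of the driver):

    #include <stdint.h>

    /* Estimate local oscillator drift from PPS edge timestamps taken
     * with the local clock. With a nominal 1 s spacing, the fractional
     * error of the elapsed interval is the frequency error; spanning
     * many edges smooths out interrupt-latency noise. Requires n >= 2. */
    double drift_ppm(const int64_t *edge_ns, int n)
    {
        int64_t ideal_ns = (int64_t)(n - 1) * 1000000000LL;
        int64_t meas_ns  = edge_ns[n - 1] - edge_ns[0];
        return (double)(meas_ns - ideal_ns) * 1e6 / (double)ideal_ns;
    }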
Additionally, the clock sync daemon likely takes a while to converge because it isn't directly controlling the system time. Rather, it sends hints to the kernel, which decides how best to apply them in a way that avoids breaking other userspace programs that are running; for example, it tries to keep system time monotonically increasing. That means relatively low gain in the feedback loop, so it takes a while to cancel out error.
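Concretely, those hints also go through adjtimex; the frequency half looks something like this (the scaling is the part that trips people up):

    #include <sys/timex.h>

    /* Hint a frequency correction to the kernel's clock discipline.
     * tx.freq is in parts per million with a 16-bit fractional part
     * (ppm scaled by 65536). ppm_error is positive when the local
     * clock runs fast, and would come from a drift estimate like the
     * one sketched above. Needs CAP_SYS_TIME. */
    int hint_frequency(double ppm_error)
    {
        struct timex tx = {0};
        tx.modes = ADJ_FREQUENCY;
        tx.freq  = (long)(-ppm_error * 65536.0); /* oppose the drift */
        return adjtimex(&tx);
    }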
It's possible for a userspace program to explicitly set the system time instead, but on Linux that really isn't intended to be used unless the time is more than 0.5 seconds off. The API call to do that is inherently vulnerable to userspace scheduling jitter, but that's fine, since 0.5 seconds is orders of magnitude longer than the expected jitter. You get the system time into the ballpark, then incrementally adjust it until it's right.
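That call is clock_settime(2). A sketch of the step-then-slew pattern, with the 0.5 s threshold from above hard-coded:

    #include <stdlib.h>
    #include <time.h>

    /* Step the clock only when the error is large; otherwise leave the
     * correction to slewing. err_ns is signed: positive means the
     * system clock is behind the reference. Needs CAP_SYS_TIME. */
    void correct_time(long long err_ns)
    {
        if (llabs(err_ns) > 500000000LL) {       /* > 0.5 s: step */
            struct timespec now;
            clock_gettime(CLOCK_REALTIME, &now);
            now.tv_sec  += err_ns / 1000000000LL;
            now.tv_nsec += err_ns % 1000000000LL;
            if (now.tv_nsec >= 1000000000L) { now.tv_sec++; now.tv_nsec -= 1000000000L; }
            if (now.tv_nsec < 0)            { now.tv_sec--; now.tv_nsec += 1000000000L; }
            clock_settime(CLOCK_REALTIME, &now);
        }
        /* else: hand err_ns to adjtimex() as in the earlier sketch */
    }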
If you're not using a kernel driver to capture the PPS edge's timestamp, you're going to have a rougher time. Either you accept that you can't do better than the scheduling jitter (other than hoping it averages out), or you do something clever/terrible. One idea: have your userspace process sleep until, say, 1 ms before you expect the next PPS edge, then go into a tight polling loop until the edge occurs. As long as reading the PPS pin from userspace is non-blocking and your process doesn't get preempted, you should be able to get to within a few microseconds. You can poll the system time in the same tight loop, which lets you fairly reliably detect whether the process got preempted.
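A sketch of that loop; read_pps_pin() is a hypothetical stand-in for whatever non-blocking GPIO read you have (mmap'd registers, libgpiod, etc.):

    #include <stdbool.h>
    #include <time.h>

    extern bool read_pps_pin(void);  /* hypothetical non-blocking GPIO read */

    /* Call this after sleeping (clock_nanosleep with TIMER_ABSTIME)
     * until ~1 ms before the expected edge. Returns the timestamp of
     * the first poll that saw the pin high, or tv_sec == 0 if the gap
     * between two consecutive polls suggests we were preempted. */
    struct timespec wait_for_edge(void)
    {
        struct timespec prev, now;
        clock_gettime(CLOCK_MONOTONIC, &now);

        while (!read_pps_pin()) {
            prev = now;
            clock_gettime(CLOCK_MONOTONIC, &now);
            /* Back-to-back polls should be microseconds apart; a much
             * larger gap means the scheduler took the CPU mid-loop. */
            long long gap = (long long)(now.tv_sec - prev.tv_sec) * 1000000000LL
                          + (now.tv_nsec - prev.tv_nsec);
            if (gap > 100000) {      /* > 100 us: assume preempted */
                now.tv_sec = 0;      /* flag the sample as suspect */
                break;
            }
        }
        return now;
    }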
Thank you for the detailed response! The PPS currently drives a hardware interrupt on the Raspberry Pi that is read by kernel-mode software. My project is to drive an external display. Normally I would bypass the Raspberry Pi altogether and connect the PPS signal to the strobe input of the SIPO shift register; the problem is that the PPS signal can't be trusted to always exist. Using a Raspberry Pi has a few benefits: setting the timezone based on location, handling leap seconds, and smoothing out inconsistent GPS data. So while using system time to drive the start of second adds error, I think the tradeoff for reliability is worth it.
I have considered adding complexity, such as a hardware mux to choose between the GPS PPS signal and the Raspberry Pi's start-of-second. I should walk before I run, though.
If you want to precisely generate a PPS edge in software with less jitter than you can schedule, you can use a PWM peripheral. Wake up a few milliseconds before the PPS edge is due, get the system time, and compute the precise time until the PPS is due. Initialize the PWM peripheral to transition that far into the future, then go back to sleep until a bit after the transition should have happened, and disable the PWM peripheral.
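Something like this, where pwm_arm_oneshot()/pwm_disable() are hypothetical stand-ins for whatever the SoC's peripheral driver exposes:

    #include <time.h>

    extern void pwm_arm_oneshot(long ns_from_now);  /* hypothetical driver call */
    extern void pwm_disable(void);                  /* hypothetical driver call */

    /* Fire a hardware edge at the next whole second. The thread only
     * needs to be awake near the deadline; the peripheral provides the
     * precise timing. */
    void schedule_pps_edge(void)
    {
        struct timespec now, wake;

        clock_gettime(CLOCK_REALTIME, &now);

        /* Sleep until ~5 ms before the next second boundary. */
        wake.tv_sec  = now.tv_sec + 1;
        wake.tv_nsec = -5000000L;                   /* 5 ms early */
        if (wake.tv_nsec < 0) { wake.tv_sec--; wake.tv_nsec += 1000000000L; }
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &wake, NULL);

        /* Re-read the clock and arm the peripheral for the remainder. */
        clock_gettime(CLOCK_REALTIME, &now);
        long remaining = 1000000000L - now.tv_nsec; /* ns to the boundary */
        pwm_arm_oneshot(remaining);

        /* Sleep past the edge, then shut the peripheral back down. */
        wake.tv_sec  = now.tv_sec + 1;
        wake.tv_nsec = 1000000L;                    /* 1 ms after the edge */
        clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &wake, NULL);
        pwm_disable();
    }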
This works because a thread of execution generally knows what time it is with higher precision than it can accurately schedule itself.
I'm not sure I understand how you're using a PPS signal to drive a display, though. Is it an LED segment display? I assume you want it to update once a second, precisely on the edge of each second. Displays generally exist for humans, though, and a human isn't going to perceive a few milliseconds of jitter on a 1Hz update.
Nixie tubes driven by a pair of cascaded HV5122 (driver + shift register). The strobe input is what updates the output registers with the recently shifted in contents. The driver takes 500 ns to turn on and the nixie tubes take about 10 us to fire once the voltage is applied.
I know it's absurd to worry about the last few ms, but that's part of what interests me about the project. The goal is to make The Wall Time as accurate as I can. I could go further with a delay-locked loop fed by measuring nixie tube current. There's room to push down to dozens of nanoseconds of error relative to the PPS source, but I'm content with tens of microseconds. I can't imagine ever having access to a camera that could capture that amount of error.
Thanks for the tip. Hardware timers are best. I'll likely have to take some measurements to calibrate the computation time of getting the system time and performing the subtraction.
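One quick way to measure that fixed cost, assuming clock_gettime() is the time source; the minimum over many trials is closest to the uninterrupted cost:

    #include <stdio.h>
    #include <time.h>

    /* Time back-to-back clock_gettime() calls plus the subtraction.
     * The minimum over many trials approximates the fixed overhead to
     * subtract from the computed delay when arming the timer. */
    int main(void)
    {
        long best = 1000000000L;
        for (int i = 0; i < 100000; i++) {
            struct timespec a, b;
            clock_gettime(CLOCK_MONOTONIC, &a);
            clock_gettime(CLOCK_MONOTONIC, &b);
            long d = (b.tv_sec - a.tv_sec) * 1000000000L
                   + (b.tv_nsec - a.tv_nsec);
            if (d >= 0 && d < best)
                best = d;
        }
        printf("min clock_gettime + subtract: %ld ns\n", best);
        return 0;
    }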
Sounds like fun! For what it's worth, u-blox GPS modules and their clones should be configurable to always produce a PPS signal regardless of whether they have a satellite fix. The module would probably do a better job than software on a Pi could during transient periods without a fix (given how accurate the oscillators in a GPS module need to be). So as long as you can trust the GPS module to exist and be powered, you should be able to reliably clock your display update with it. The only real reason to generate your own PPS would be if you want it to work without a GPS module at all, perhaps from NTP or something; then of course you're again looking at only a millisecond or so of accuracy.
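If you end up poking that configuration from the Pi rather than from u-center, the wire format is at least simple. A sketch of the UBX framing; the timepulse message is CFG-TP5, but I've deliberately left its payload layout to the interface manual rather than guess at it:

    #include <stddef.h>
    #include <stdint.h>

    /* Wrap a UBX payload in the standard frame: sync bytes 0xB5 0x62,
     * class, id, little-endian length, payload, then an 8-bit Fletcher
     * checksum over class..payload. The timepulse is configured with
     * CFG-TP5 (class 0x06, id 0x31); its 32-byte payload and the flag
     * bits that keep the pulse free-running without a fix are in the
     * u-blox M8 interface description. out must hold len + 8 bytes. */
    size_t ubx_frame(uint8_t cls, uint8_t id,
                     const uint8_t *payload, uint16_t len, uint8_t *out)
    {
        size_t n = 0;
        out[n++] = 0xB5; out[n++] = 0x62;
        out[n++] = cls;  out[n++] = id;
        out[n++] = (uint8_t)(len & 0xFF);
        out[n++] = (uint8_t)(len >> 8);
        for (uint16_t i = 0; i < len; i++)
            out[n++] = payload[i];

        uint8_t ck_a = 0, ck_b = 0;
        for (size_t i = 2; i < n; i++) { ck_a += out[i]; ck_b += ck_a; }
        out[n++] = ck_a; out[n++] = ck_b;
        return n;
    }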
I'm using a Uputronics GPS/RTC HAT that has a u-blox M8 engine. I set it to stationary mode for extra accuracy. I'll have to look into other configuration options.