For my own use cases, I've found that traditional cron timing specs don't match what I need from scheduled operations. I mainly care about two things: time window and periodicity -- and while it's possible to express both in the same cron timestamp, it's both too specific and not specific enough. I don't see this post making a clear argument about their functional requirements, just that fcron satisfies more technical requirements than Vixie cron.
But going in full "I want a pony"-mode: over the years, I've needed the following features from a job scheduling system:
1) time window (e.g. only run between 1am and 3am on weekdays)
2) periodicity (if job was successful, next run is X time away)
3) retry period (if job failed, retry after X time)
4) job timeout (if job is still running after X time, kill it)
5) start randomization (randomize start up to X time in window)
6) job dependencies (don't run job Y until job Z has finished successfully)
7) job anti-dependencies (don't run job Y while job Z is active)
AFAIK, Vixie cron only allows 1 and 2; 4 can be done with coreutils' timeout (I don't know if it's a POSIX command or GNU-only), and 5 can be done with something like sleep $RANDOM, but the time spent waiting is added to the job's runtime -- which means you can no longer create useful reports about job processing times. Of this list, fcron mostly just adds nicer syntax, though 5 is built in. It can kinda do 7, but serial is a global exclusion. If it were possible to make exclusion groups (e.g. jobs Y and Z have serial(1), job X has serial(2), and a user's fcrontab has serial($UID)), that would be much nicer.
I acknowledge that Cron was never meant to solve 6 and 7, but they're still features I need regularly in my systems. What I don't understand is why 3 (explicit retries) was never implemented in any cron replacement; even systemd timers don't have it, at least as far as I'm aware (RestartSec= is for services only).
1) time window (e.g. only run between 1am and 3am on weekdays)
Should just be `OnCalendar` in the timer.
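Something like this, I think (unit name made up) -- fire at 1am on weekdays and let the randomized delay spread the start across the 1am-3am window:

    # myjob.timer (sketch)
    [Unit]
    Description=Run myjob between 1am and 3am on weekdays

    [Timer]
    OnCalendar=Mon..Fri *-*-* 01:00:00
    RandomizedDelaySec=2h
    # optional: run at the next opportunity if the machine was off during the window
    Persistent=true

    [Install]
    WantedBy=timers.target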
2) periodicity (if job was successful, next run is X time away)
Not sure if this is specified as 1 && 2 or 1 || 2. 1 or 2 is pretty easy; 1 and 2 I think requires two timers: a "run on Sunday" timer that then enables the "now run every 20 minutes" timer. Both doable, though.
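If it's "X after the last run finished" you're after, monotonic timers cover that -- a sketch with the same made-up names (note it doesn't distinguish a successful finish from a failed one):

    # myjob.timer (sketch)
    [Timer]
    # start again 4 hours after the service last deactivated
    OnUnitInactiveSec=4h
    # give it a first kick after boot (or just start the service once by hand)
    OnBootSec=15min

    [Install]
    WantedBy=timers.target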
3) retry period (if job failed, retry after X time)
Unit files should already do this, but it will be disconnected from the timer: the unit can retry every 10 minutes if it fails, up to 5 times, but AFAIK the timer won't check the status of the service before running. BUT I also think you can configure the unit so only one instance runs at a time, so depending on your setup it would either not duplicate the service, or you would have killed it before the next timer fires.
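If you do want the retry to live in the service, I believe the shape is roughly this (my sketch; Restart= on a Type=oneshot service needs a fairly recent systemd):

    # myjob.service (sketch)
    [Unit]
    Description=myjob with retries
    # allow up to 5 start attempts in 90 minutes before giving up
    StartLimitIntervalSec=90min
    StartLimitBurst=5

    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/myjob.sh
    # on failure, try again 10 minutes later
    Restart=on-failure
    RestartSec=10min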
4) job timeout (if job is still running after X time, kill it)
JobTimeoutSec= (also JobTimeoutAction= might be useful); RuntimeMaxSec= on the service is another option, aimed squarely at killing a run that exceeds a wall-clock limit (sketch below).
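A minimal sketch of that last one, reusing the hypothetical service from above:

    # myjob.service (sketch)
    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/myjob.sh
    # kill the run and mark it failed if it's still going after 2 hours
    RuntimeMaxSec=2h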
5) start randomization (randomize start up to X time in window)
RandomizedDelaySec
6) job dependencies (don't run job Y until job Z has finished successfully)
7) job anti-dependencies (don't run job Y while job Z is active)
I think you could do both of these, but it would probably be at the unit level, not the timer. Properly configuring "Z is still running, try Y again in 2 minutes" might or might not be hard; since systemd can queue jobs, I think it would just queue Y until it hits a timeout and kills it.
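A rough sketch of the unit-level approach, with made-up unit names (ExecCondition= needs a newish systemd, and whether Requires=/After= fit depends on how Z itself is triggered):

    # y.service (sketch)
    [Unit]
    Description=job Y
    # 6) pull in Z and don't start until it has finished successfully
    Requires=z.service
    After=z.service

    [Service]
    Type=oneshot
    # 7) skip this run (without marking Y failed) while Z is active
    ExecCondition=/bin/sh -c '! systemctl is-active --quiet z.service'
    ExecStart=/usr/local/bin/y.sh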
I think you can actually achieve all of this, but it might require a bit of setup, building a graph that flips timers on or off. At its core, though, I think all the "if X then Y" bits exist.
My major complaint against systemd is that its documentation is so complete, it becomes quite dense and you have to understand quite a lot of interconnected parts. "The problem is the problem" though: it's complex because what it aims to do is complex. I am still positive on it as a whole though.
You could also write a systemd generator to build all these files from a single specification.
> My major complaint against systemd is that its documentation is so complete, it becomes quite dense and you have to understand quite a lot of interconnected parts. "The problem is the problem" though: it's complex because what it aims to do is complex. I am still positive on it as a whole though.
Aye, that is a problem. We have like 3-4 recipes in our internal documentation of "Here are the 12 lines of systemd for each of the 9 things you need to do to solve 90% of your use cases". I might ask around if we could put this onto some public blog or something.
Discoverability is such a thing. For example, which of these do I want?:
man systemd.{exec,unit,service}
It's not too bad to get a general feel for what's possible, but recalling where a given option lives is frustrating. These three are so closely related that splitting them up is notable. Others, like timers, are a little more obvious.
I've settled on Ansible roles. I define the important bits and config mgmt does the rest.
5 is an odd requirement potentially better implemented outside cron, such as with a single line of shell code involving the "at" command (and "expr" to scale a /dev/random value).
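Something along these lines, I suppose (path illustrative; /dev/urandom plus od instead of expr-on-/dev/random, same idea):

    # queue the job to start at a random point within the next two hours
    echo /usr/local/bin/job.sh | at now + $(expr $(od -An -N2 -tu2 /dev/urandom) % 120) minutes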
4, 6, and 7 get you into "enterprise" (not the StarTrek thing;) territory. Think about it: how do you represent that a job has or hasn't finished successfully? Using a process return status, or maybe something more persistent, restartable, or federated and independent of a PID, e.g. state stored in a database -- which involves further design trade-offs such as transactional isolation, locking, eventual consistency, etc.
> 5 is an odd requirement potentially better implemented outside cron, such as with a single line of shell code involving the "at" command (and "expr" to scale a /dev/random value).
I'm very surprised you think that. I would say that the job scheduler is the perfect place to randomize start times.
If it's a one-off then you rarely need randomization. For scheduled tasks, you often want randomization so that your fleet of servers is not all doing the task at the exact same time. Think "start between 2am and 4am every day".
You can work around it by picking an offset at install time like scheduling for 3:17am in particular on this server, or by putting a random sleep into the job, but both of those obfuscate the actual scheduling intent.
Sounds like a local LLM, coupled with a periodic wake up signal, will be your pony. Presumably even a small one could run a schedule under those (very reasonable) constraints. Maybe not general pathological cases, but ordinary cases. You'd probably want things to run in one big prompt over time and the tricky part would be finding the minimum size required to properly execute the schedule. Too small, and it loses knowledge of the past. Too large, and it won't function at all.
EDIT: I believe in sysadmin llms and think they will become common and obvious in the next few years. I will look back at the downvotes on this comment with great pride. So, thanks and keep them coming!
I am serious. It may seem like killing a fly with a sledgehammer, but a local LLM sysadmin process has a lot of applications apart from cron. The ability to simply talk about your processes in natural language is a huge bonus to those who don't do sysadmin work daily -- which is most of us. Obviously, using an LLM is not appropriate where size and precision are required. In that case, use cron and accept its limitations.
I was annoyed by cron/fcron limitations and figured systemd was the way to go because of its flexibility and power, but I was also annoyed about manually managing tons of unit files. So I wrote a tool with a config that looks kinda like a crontab, but uses systemd (or launchd on Mac) behind the scenes: https://github.com/karlicoss/dron#what-does-it-do
But it's also possible to add more properties, e.g. arbitrary systemd properties, or custom notification backends (e.g. a Telegram message or a desktop notification).
Since it's python, I can reuse variables, use for loops, import jobs from other files (e.g. if there are shared jobs between machines), and check if it's valid with mypy.
Been using this for years now and I'm very happy with it -- one of the most useful tools I've written.
It's a bit undocumented and messy, but if you're interested in trying it out, just let me know -- I'm happy to help :)
systemd requiring two files for timers is quite intentional.
A timer runs a unit. Which means that you can test any timer easily by just starting the unit itself whenever needed. With cron, to test things you need to fiddle with the configuration and schedule it to run in the next minute or two, wait, check the logs, adjust, rinse and repeat.
This is also pretty useful if you use parametrized unit files. For example, most of our long term archiving is implemented in a parametrized systemd unit `push-to-archive@`. So in order to backup a new dataset, you deploy the config for it, add a simple systemd timer triggering `push-to-archive@newdataset` and that's it. Simple and even readable in the systemd timer status.
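The timer side of that can itself be a template, something like this (illustrative, not our exact unit):

    # push-to-archive@.timer (illustrative)
    [Unit]
    Description=Archive push for dataset %i

    [Timer]
    OnCalendar=daily
    Unit=push-to-archive@%i.service

    [Install]
    WantedBy=timers.target

    # enable per dataset:
    # systemctl enable --now push-to-archive@newdataset.timer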
Fcron's approach is to provide a shell where you can examine, run, renice, and kill cron jobs: http://fcron.free.fr/doc/en/fcrondyn.1.html.
(You can run a job either independently or instead of the next scheduled run.)
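For example (job ids are whatever the listing shows; this is from memory of the man page, so double-check the exact command names):

    fcrondyn -x "ls"        # list your jobs, with ids and next run times
    fcrondyn -x "run 3"     # run job 3 now, in addition to its normal schedule
    fcrondyn -x "runnow 3"  # run job 3 now, instead of its next scheduled run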
I think this is competitive with what systemd provides by separating the service from the timer.
(Edit: I mean in this area. As a sibling comment points out, the separation also enables parametrization.)
An advantage that systemd has is that you can refer to a service by a meaningful name instead of a numeric job identifier.
What I would like to have in fcron is the ability to direct the job's output to the console when running a job immediately.
The complaints about systemd timers seem strange. To me, they're a sysadmin's dream to work with. I can see exactly when the last trigger was, when the next one will be, and I can test that the unit will work properly when it fires. And I don't have to remember cron syntax.
As for them not supporting email, if you really need to you can add an OnFailure= to the service unit to send one.
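For example, something in this spirit (the mail hook is a unit you write yourself, not a stock systemd facility; names made up, and it assumes a working local mail setup):

    # in the job's service unit:
    [Unit]
    OnFailure=mail-on-failure@%n.service

    # mail-on-failure@.service:
    [Unit]
    Description=Send a failure mail for %i

    [Service]
    Type=oneshot
    ExecStart=/bin/sh -c 'systemctl status --full --no-pager "%i" | mail -s "%i failed on %H" root'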
> A timer runs a unit. Which means that you can test any timer easily by just starting the unit itself whenever needed. With cron, to test things you need to fiddle with the configuration and schedule it to run in the next minute or two, wait, check the logs, adjust, rinse and repeat.
Not sure if I understand - a Cron "unit" is just a command that runs in a clean environment - you can test that without Cron.
systemd requires a "unit" wrapper around your command line - and is tricky to test without systemd? (No copy-pasting of the command and arguments - you would have to manually parse the ini file back into a command line.)
> Not sure if I understand - a Cron "unit" is just a command that runs in a clean environment - you can test that without Cron.
It's not that easy to do. I've wasted hours in an old life figuring out that cron ran my scripts with a different $PATH than `env -i` did. systemd timers/units on the other hand are executed through the exact same code path, so if `systemctl start` works, you can be sure it works when started by a timer too.
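In practice the test loop is just (unit name hypothetical):

    systemctl start myjob.service     # runs exactly what the timer would run
    systemctl status myjob.service    # result of the last run
    journalctl -u myjob.service -e    # its output, same place a timer-triggered run logs to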
It can take a little effort to run a command under the same environment it’ll have when cron runs it—copy-paste isn’t a full description of what you have to do. And it’s pretty easy to get wrong, then not find out until your job fails.
I’m a systemd fan, but managing unit files is definitely a pain. The units and timers I’ve written get thrown in `/lib/systemd/system` (or `/etc/systemd/system/`, who knows) along with a ton of the OS’s unit files.
By that logic, why not bring more types of entries into the service file?
Path units for inotify triggers, mount units for NFS automounts active only when the service is running, socket units that register network triggers, all of these might have some use.
It might seem cumbersome, but I do agree that configuring complex activity sometimes means many unit files. Timer units would be a clear example.
You have identified the key point, thank you. Sometimes, things are complicated enough that you want to break configuration up into multiple units. But not always. At the moment, systemd forces you to always break it up, even when that does more harm than good.
You can run the unit and run the timer separately. If you started a combined unit, should it run as a service or as a timer? And you can have multiple timer files for a single service, too.
Beware! It explicitly does not run jobs in an empty environment - it's not a great candidate for system-level cron. But for its intended use case it is very nice.
Super simple to use - no remembering the order of numbers.
Only one job runs at a time (so you don't need to pick times that don't overlap)
Of course, it doesn't handle every case.
I don't know if all Linuxes have it.
Offsets (e.g. the day before the first Friday of each month) (this can be implemented via multiple rules, but I don't want that)
Programmable times (expands on the prior one: I'd like to shell out to something else to get the next X run times, so I can have any scheme I want for times!)
I mostly live in RHEL land, so Vixie cron was replaced around a decade ago and these days I use systemd, but I think all of these can be achieved without changing your cron impl:
1. Timezones
CRON_TZ=Europe/Berlin
0 10 * * * env TZ=America/New_York echo "I run at 10am Eastern"
0 10 * * * echo "I run at 10am Central European"
2. Offsets
0 10 * * 4 [ "$(date -d tomorrow +\%d)" -le 7 ] && echo "I run on the day before the first Friday"
This one is a bit gnarly for a few reasons:
The 4 in the cron spec means Thursday, 0 and 7 mean Sunday.
You can't use the day of month and day of week fields here as you might (entirely reasonably!) expect. In cron, the other fields are ANDed together, but day of month and day of week are ORed. There are 2 types of people in this world: those who have been bitten by this and now have the rule tattooed on their brain, and those who have yet to be bitten. There isn't a 3rd type that just read the System V spec docs... :-)
The % is escaped in the GNU date format string because, in a crontab line, an unescaped % ends the command and the rest is fed to it as stdin (with % turned into newlines). Ouch.
3. GNU date shenanigans have you covered here, I think.
EDIT: nope, I just tested #1 and this idea doesn't work. It doesn't affect the run schedule; it would only impact the timezone of the echo commands here.
If you need to stick with crontab entries (perhaps they are generated/managed by some other software), it should be possible to write a systemd generator[1] to generate the required timer/service units from the crontab file(s).
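Sketch of the mechanics, if anyone wants to try (no real crontab parsing here, names made up): a generator is an executable in /usr/lib/systemd/system-generators/ that gets an output directory as its first argument and writes units into it.

    #!/bin/sh
    # invoked by systemd at boot / daemon-reload with three output directories
    outdir="$1"

    # real code would parse the crontab(s); this just emits one hard-coded pair
    cat > "$outdir/myjob.service" <<'EOF'
    [Unit]
    Description=myjob (generated from crontab)
    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/myjob.sh
    EOF

    cat > "$outdir/myjob.timer" <<'EOF'
    [Unit]
    Description=myjob timer (generated from crontab)
    [Timer]
    OnCalendar=*-*-* 03:00:00
    EOF

    # generators can't run "systemctl enable", so wire the timer up with a symlink
    mkdir -p "$outdir/timers.target.wants"
    ln -sf ../myjob.timer "$outdir/timers.target.wants/myjob.timer"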
Admittedly, cron emailing you by default every time it runs a job now seems like a ridiculous decision. But I take a certain pleasure in my morning ritual of fast-scrolling through a hundred cron emails from different servers, just in case there's an error.
Yeah, I just use MAILTO="" on the ones I don't need, but oddly I find myself addicted to reading the ones where I never set that up. And once or twice it's actually alerted me to problems that would've taken me much longer to notice.
Still a ludicrous decision to make it email by default.
It's not ludicrous, it was a very deliberate design decision.
First, you only get emails if the program produced output. The Unix Way states that programs should not produce any output at all when the command was successful and everything went fine. (Unless of course the intended job of that program is to display output.)
Also, you have to keep in mind that Unix was designed to be a multi-user operating system. Meaning one machine to hundreds or thousands of users. When you submit a job with `at` or `cron`, it _only_ makes sense that you want to see the output of that job somehow, and email was the easiest and best way at the time.
Imagine the headaches of trying to admin a Unix system with thousands of users, all of whom can create a cron job to run every minute or hour and forget about it because they _never, ever saw the output_.
Cron emails you by default, and I think that is the correct choice. It _should_ be up to the user to figure out _which_ output they do not want and tune their jobs (or programs) to match. It's not even hard: you just redirect stdout to /dev/null.
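e.g. in the crontab (path made up):

    # routine stdout is discarded; anything on stderr still gets mailed
    0 3 * * * /usr/local/bin/nightly-report.sh > /dev/null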
Yeah, I hadn't considered the utility of this in a situation with many people submitting jobs to run on mainframes. To me, the logical default would be to output to logfiles rather than depend on an email server being available, but in the historical context I suppose it makes sense.
About 20 years ago, when I set up my first server and wrote my first cron tasks on it, I didn't know that cron sent output to email. The root email address on that server was set up with a mailbox on the server itself, which was unused and never checked. After a year or more, I started mysteriously running out of disk space and was baffled until I found 10+ GB of unread emails in that account.
Maybe instead of daemons with config files that have more and more bespoke syntax, it's better to go all the way to a library that provides the basic loop functionality? Like https://github.com/c-blake/cron or something similar in C or whatever.
The fact that the timer and the job are separate in systemd is actually a blessing, because now you can start your jobs manually and actually test that they work.
I can't say much about cron; I use it for simple scripts on systems. When I have a lot of jobs that need to be scheduled routinely and I need alerts if they don't run, I use Jenkins -- it works really well.