> Putting the CUE validation in our pipeline is too confronting for others
Sad. Can you get away with boiling the frog by getting it initially added but configured to check very, very little? Maybe some specific rare category of mistake that doesn't happen enough to annoy your colleagues when it catches it but is implicated in a recent outage management still remembers?
Then slowly adding rules over time as feasible (exploiting each outage as an opportunity to enlist stakeholder support for adding more rules that would have caught that particular misconfiguration event)
Sometimes I think figuring out how to stage and phase gradually making the changes you need to improve the system in light of social inertia against it is the most complex part of corporate enterprise software work. I definitely remember a time I wrote a whole perl-based wheel-reinventing crappy puppet knockoff just for a set of nagios plugins, entirely not for technical reasons, but for the political reason that this way we would control it instead of the other department which refused to get with the program and let us do staged canary rollouts. It was wrong technically if you assume frictionless corporate politics, but it was the right and only practical way to achieve the goal of ending the drumbeat of customer outages.
Sad. Can you get away with boiling the frog by getting it initially added but configured to check very, very little? Maybe some specific rare category of mistake that doesn't happen enough to annoy your colleagues when it catches it but is implicated in a recent outage management still remembers?
Then slowly adding rules over time as feasible (exploiting each outage as an opportunity to enlist stakeholder support for adding more rules that would have caught that particular misconfiguration event)
Sometimes I think figuring out how to stage and phase gradually making the changes you need to improve the system in light of social inertia against it is the most complex part of corporate enterprise software work. I definitely remember a time I wrote a whole perl-based wheel-reinventing crappy puppet knockoff just for a set of nagios plugins, entirely not for technical reasons, but for the political reason that this way we would control it instead of the other department which refused to get with the program and let us do staged canary rollouts. It was wrong technically if you assume frictionless corporate politics, but it was the right and only practical way to achieve the goal of ending the drumbeat of customer outages.