And how do you determine that? I'm not trying to be coy here, I genuinely don't understand.
Because you're not testing for patterns, what you test is some measurable metric(s) you want to maximise (or minimise), right? So how can you determine which metrics lead to dark patterns without just using them and seeing if dark patterns emerge? And how do you spot these dark patterns if, by their very nature, they're undetectable by the metrics you chose to test first?
The "patterns" in "dark patterns" doesn't mean they're an emergent property of the system. You test whether a change improves a metric in A/B tests. You avoid accidental dark patterns in a change the same way you avoid bugs that cause accidental data loss: you think carefully about what you're doing, maybe a reviewer looks it over, and so on. This isn't perfect, but nothing is.