I can't immediately think of a way to quantify this. Maybe reasonable metrics would be:
1. What % of misleading/false posts are flagged
2. What % of those flagged are given meaningful context/corrections that are accurate
There's some circular logic here: metric 1 requires first determining truth, and metric 2 presumably needs some kind of "trust"/quality poll. I suspect a good measurement would end up looking very similar to the actual Community Notes implementation, since both of these are exactly what the system aims to do [1].
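To make the two metrics concrete, here's a toy sketch. Everything here is hypothetical: the ground-truth labels (`is_misleading`, `note_is_accurate`) are assumed to come from some oracle, which is exactly the circularity problem mentioned above.

```python
# Toy evaluation of the two proposed metrics over a hypothetical labeled dataset.
# Each post: (is_misleading, was_flagged, note_is_accurate) -- labels are assumed
# ground truth, which in practice is the hard (circular) part.
posts = [
    (True,  True,  True),   # misleading, flagged, accurate note
    (True,  True,  False),  # misleading, flagged, inaccurate note
    (True,  False, None),   # misleading, missed entirely
    (False, True,  True),   # not misleading, but flagged anyway
]

# Metric 1: what % of misleading posts are flagged (a recall-like number)
misleading = [p for p in posts if p[0]]
flag_rate = sum(1 for p in misleading if p[1]) / len(misleading)

# Metric 2: what % of flagged posts carry an accurate note
flagged = [p for p in posts if p[1]]
accuracy_rate = sum(1 for p in flagged if p[2]) / len(flagged)

print(f"flagged: {flag_rate:.0%}, accurate notes: {accuracy_rate:.0%}")
```

Note that metric 1 alone can be gamed by flagging everything, so the pair only makes sense together, much like recall and precision.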