I like your creative thinking, but that wouldn't work. An immediate problem is that it would only train the driver to pay attention when they hear a disengagement chime. Level 2 (L2) automation depends on the driver monitoring the autopilot continuously.
More productively, Tesla currently senses hands on the wheel. Perhaps they could extend that with an interior camera that visually analyzes the driver's face to ensure their eyes are on the road.
Recent Honda CR-Vs can have an attention monitoring system in them. I'm not sure how it works, but it does seem to detect when the driver isn't looking around.
Even if the system has high confidence in its ability to handle a situation, once sufficient time has passed, request that the driver resume control.
Then fuse the driver's inputs w/ the system's for either additional training data or backseat safety driving (e.g. the system monitoring the human driver).
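To make the timed-handover idea concrete, here's a rough Python sketch. Everything in it is made up for illustration: the `HandoverPolicy` name, the 10-minute interval, and the 0.9 confidence floor are assumptions, not how Tesla or any other manufacturer actually does it.

```python
from dataclasses import dataclass


@dataclass
class HandoverPolicy:
    """Hypothetical policy: hand control back periodically, even when confident."""

    max_automated_seconds: float = 600.0  # assumed interval, purely illustrative
    min_confidence: float = 0.9           # assumed confidence floor, purely illustrative

    def should_request_takeover(self, confidence: float,
                                seconds_since_manual: float) -> bool:
        # Low confidence always triggers a handover request.
        if confidence < self.min_confidence:
            return True
        # High confidence does not exempt the system: once enough time has
        # passed since the driver last had control, ask them to resume.
        return seconds_since_manual >= self.max_automated_seconds


policy = HandoverPolicy()
print(policy.should_request_takeover(confidence=0.99, seconds_since_manual=700.0))
```

The point of the second check is that the request is driven by elapsed time, not by how well the system thinks it is doing, so the driver can't learn to wait for a warning chime.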