Then again, if you have access to a model trained on sensitive data, why not ask the model directly, instead of probing it for training data? If sensitive data never is meant to be reasoned on and outputted, why did you train on sensitive data in the first place?
The entity training the data and the users of the model are not necessarily the same entity. Asking the model directly will not (or: shouldn't) work if there are guardrails in place not to give specific information. As for the reason, there are many, one of them being the fact that you train your model on such a huge number of items you can't guarantee there is nothing that shouldn't be there.
If there are guardrails in place not to output sensitive data (good practice anyway), then how would this technique suddenly bypass that?
I still have trouble seeing a direct threat or attack scenario here. If it is privacy sensitive data they are after, a regex on their comparison index should suffice and yield much more, much faster.
Then again, if you have access to a model trained on sensitive data, why not ask the model directly, instead of probing it for training data? If sensitive data never is meant to be reasoned on and outputted, why did you train on sensitive data in the first place?