The funny thing to me is that now more than ever structured data is important so that AI has a known good data set to train from, and so that search engines have contextual and semantically rich data to search.
AI isn't a solution to this; on the contrary, whatever insufficiency exists in the original data set will only be amplified, as compression artifacts can be amplified in audio.
We also can't trust any data that's created recently, because an LLM can be trained to provide correct-looking structured data that may or may not accurately model semantic structure.
The best source of structured, tagged, contextually-rich data suitable for training or searching in the future will be from people who currently AREN'T using generative AI.
AI isn't a solution to this; on the contrary, whatever insufficiency exists in the original data set will only be amplified, as compression artifacts can be amplified in audio.
We also can't trust any data that's created recently, because an LLM can be trained to provide correct-looking structured data that may or may not accurately model semantic structure.
The best source of structured, tagged, contextually-rich data suitable for training or searching in the future will be from people who currently AREN'T using generative AI.