Interesting, but probably not a general or scalable way of fighting this problem. This practice would be hard to implement for other types of content.
I think that copyright law is inherently unfit for the internet. At its core, it is a legal restriction on re-publishing content, and that restriction cannot be enforced on the internet. It does not prevent piracy, and it does not stop AI companies from collecting data. So I'd say that we should do away with copyright law altogether. This would, of course, remove a lot of the incentive for producing content, but I think people would still produce content even if they are not paid to do it, as long as their basic needs are satisfied. So if we, as a human race, progress to UBI, we can also solve the copyright problem.
But if we get stuck in the capitalist age, I guess we have to pretend that information can be owned and legally restricted from redistribution.
I mean, updating the rules would help - clarifying that feeding data to any model / doing analysis on it requires permission from the copyright holder - but I doubt that it would stop companies from doing it, because it is hard to prove in court that your work has been stolen.
But there is no real way of enforcing the rules. How would we combat piracy? If you make the BitTorrent protocol illegal, people will just switch to HTTP or anything else to share copyrighted material.