If it is true that Copilot only generates small snippets that aren't under copyright, then why doesn't Microsoft train it on its own internal source code? Having more training data is good, and they claim that there is nothing to worry about. Seems very hypocritical.
The output of a machine simply does not qualify for copyright protection – it is in the public domain. That is good news for the open movement and not something that needs fixing.
This is great. Someone should train a machine learning model on leaked Windows source code and use it to generate a public domain implementation of Windows. The same should be possible with music or movies. But it can't be a way to strip open source licenses while leaving proprietary copyright intact.
If Copilot leads to copyright being abolished completely, I am all for it.