On Fri, Dec 24, 2021, 00:11 Denny Vrandečić <dvrandecic@wikimedia.org> wrote:
Running a function for text generation from Wikifunctions on that Abstract Content would generate a text that we'd consider to be published under CC BY-SA as well.

There is something in this that I find contrary to the very idea of copyright. (Copyright is needed in order to be able to license something under CC BY-SA.) How can one get _copy_right for something one never wrote in the first place (and in many cases never even formulated as a sentence in one's mind)? Let me expand on the thoughts I shared in the office hours.

Take, for example, the millions of articles about places that could be generated in various languages. It could look something like this:
A piece of Abstract content is described as generating a sentence that ranks an item against similar items in some location by some property. It would then behave something like this (over-simplified pseudocode):

rank(this,property(P1082),location(administrative unit(this)))

Depending on where it is used, it could generate:
  • Stockholm ist die größte Stadt im Stockholms län.
  • San Francisco is the fourth largest city in California.
  • Gimo är den näst största tätorten i Östhammars kommun.
Not only is it inconceivable that anyone could know the phrasing of every combination this simple example can generate at the time the Abstract content is written. The output will also change over time: new places will be added to Wikidata and new population data will alter the ranking, so the output of the function changes without the author being in control of it, and likely without the author even being aware of it. I find it a revolting suggestion that copyright in this output could be attributed to the author of the Abstract content.
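
To make the point concrete, here is a minimal sketch in Python of how such a ranking function might behave. It is only an illustration under stated assumptions: the `places` dictionary is a hypothetical stand-in for Wikidata, the place names and population figures are invented, and `rank` and `render_en` are made-up names, not actual Wikifunctions functions.

```python
# Hypothetical stand-in for Wikidata: place -> (administrative unit, population).
# All names and numbers are invented for illustration.
places = {
    "Town A": ("Region X", 50_000),
    "Town B": ("Region X", 30_000),
    "Town C": ("Region X", 10_000),
}

# Ordinal words for the English renderer; rank 1 needs no ordinal.
ORDINALS = {1: "", 2: "second ", 3: "third ", 4: "fourth "}


def rank(place, data):
    """Rank of `place` by population within its administrative unit."""
    unit, population = data[place]
    larger = sum(1 for u, p in data.values() if u == unit and p > population)
    return larger + 1


def render_en(place, data):
    """English renderer; each language would need its own renderer."""
    unit, _ = data[place]
    ordinal = ORDINALS.get(rank(place, data), f"{rank(place, data)}th ")
    return f"{place} is the {ordinal}largest place in {unit}."


before = render_en("Town B", places)

# Someone else edits the data: a new, larger place is added.
places["Town D"] = ("Region X", 40_000)
after = render_en("Town B", places)

print(before)  # Town B is the second largest place in Region X.
print(after)   # Town B is the third largest place in Region X.
```

The sentence about Town B changes although neither the Abstract content nor the renderer was touched; only the underlying data was edited, by someone else.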

This leads me to conclude that we either need the possibility to assign different licenses to the output of Abstract content, or need to make it all CC0 (in order not to commit copyfraud). As Denny implies, some Abstract content will be more static and "handwritten", but I suspect that for the foreseeable future it will account for only a small minority of the generated output, so we cannot ignore the situation we are putting ourselves in through the choices we make.

/Jan Ainali