Hi there,
https://assets.publishing.service.gov.uk/media/67851771f0528401055d2329/ai_o... is an action plan by the Secretary of State for Science, Innovation and Technology that was released today.
Section 1.2 contains a series of recommendations that - in my opinion - overlap with some of the goals of Wikimedia, including but not limited to recommendation 13:
"13. Establish a copyright-cleared British media asset training data set, which can be licensed internationally at scale. This could be done through partnering with bodies that hold valuable cultural data like the National Archives, Natural History Museum, British Library and the BBC to develop a commercial proposition for sharing their data to advance AI."
In my opinion, this dataset could and should be labeled CC-0 or released under a permissive license such as CC-by, which would make it compatible with Wikimedia projects such as Wikimedia Commons. In return, Wikimedians could offer to help the UK government annotate, curate or otherwise improve the metadata, making this dataset more valuable for training. I am certain that these options are not at the top of the Secretary of State's mind. The 'commercial proposition' is most likely a sign that their thinking is heading in a different direction, which would be a shame in many ways.
Other recommendations could be compatible with Wikimedia goals and requirements, but the lack of wording concerning free and open licenses in section 1.2 remains a cause for concern.
In some respects, I am reminded of the early "Europeana" days, when free and open licenses were not the consensus among EU member states for content on that platform.
Anyone willing to make the necessary calls to the UK Government?
Mathias
Hi Mathias,
Thank you for sharing this! I'm tagging in @Lucy Crompton-Reid lucy.crompton-reid@wikimedia.org.uk as she leads WMUK and can speak to this in a more informed manner than I can.
From the Foundation's perspective, I can share that we are hyper-focused on the UK, but all of our attention is on the UK Online Safety Act, which includes provisions related to age-gating that could be detrimental to our model and values: https://wikimediafoundation.org/news/2023/09/19/wikimedia-foundation-calls-for-protection-and-fair-treatment-of-wikipedia/. This is taking all of our staff time.
When it comes to AI we rely on our network of allies who are more active in these discussions. These include members of the A2K coalition, for example. We make sure they know our key concerns related to AI, exchange resources, and brainstorm positions together. Lucy will have better insight than I do here, as she's the one who gets all the invitations to speaking opportunities, private roundtables and workshops on the issue!
Best, Ziski
On Mon, Jan 13, 2025 at 7:15 PM Mathias Schindler < mathias.schindler@gmail.com> wrote:
On Mon, 13 Jan 2025, Mathias Schindler wrote:
In my opinion, this dataset could and should be labeled CC-0 or released under a permissive license such as CC-by, which would make it compatible with Wikimedia projects such as Wikimedia Commons. In return, Wikimedians could offer to help the UK government annotate, curate or otherwise improve the metadata, making this dataset more valuable for training.
I wonder whether it would be more beneficial to tag "restricted by copyright" material instead, and leave everything else as fair game for AI, the free knowledge community, or anybody else.
The labels should be on the copyrighted stuff and ideally cost some money to renew every year.
I think we might not want to have a clearly marked public domain/CC-0 zoo with a very high fence around it.
Marcin