Dear Ziko,
Thank you for your thoughtful comments. My answers are inline.
On Tue, Apr 14, 2020 at 2:25 PM Ziko van Dijk zvandijk@gmail.com wrote:
Dear Denny,
Thank you for your well-written piece with some very intriguing ideas. I have read most of it, and I must confess that I have not fully understood everything about keys and contractors. Maybe I am not exactly the target group.
I appreciated reading your own scepticism, and obviously we both have read the same book by Umberto Eco. :-)
I really love Eco's book on the topic, "The Search for a Perfect Language", and I recommend it to everyone. It is sometimes a bit hard to find.
Single point of failure: I am not that worried about that, according to the "many-eyeballs principle".
Single language wiki: This seems to me the biggest problem if your new wiki (wikis) is supposed to be a place where everybody can contribute, regardless of their native language. You think that "detailed discussions and debates" are less likely (in the beginning). Well, for any meaningful participation, wouldn't it be important that everybody can communicate with everybody? Whether we would use one or several languages in the wiki, the language problem would be a limitation of the collaboration.
I actually do not think that this is crucial from the very beginning. I mean, it obviously would be great if there was a way for all contributors to be able to discuss everything from the start, but the lack of a solution for this issue didn't stop us from creating Commons, Meta, or Wikidata - and two of these are among the most active projects we have.
If there is an "edit-war"-like situation between contributors of different language backgrounds, where the lack of a common language prevents a productive discussion, there is always the option that they simply override that part in their local Wikipedias with local content. Don't forget that the content from Abstract Wikipedia is only used if a local community decides to do so. So such disagreements can be avoided, even if not always resolved.
By the way, I think that a big part of the negative attitude that many (German) Wikipedians have towards Wikidata is based on language barriers.
If this were true, I would expect the same attitude from many other wikis that do not speak English. But many other Wikipedians have embraced it. And in fact, even among the German communities I have sensed much more openness to collaboration and much more understanding between the projects in recent years than there used to be.
In short, I do not think that it was the language barrier that caused the issues that have been there. After all, I have the feeling that the Wikipedia with the strongest opposition to a measured use of Wikidata is not the German, but the English one - and there the language barrier is rather small, give or take the propensity for using Q-prefixed numbers in otherwise understandable text.
Another reason is that Wikipedians have built up their own status within Wikipedia, and when they come to Wikidata, they have to start from the beginning to build up status. The same problems we would see with regard to ("normal") Wikipedia on the one hand, and Abstract Wikipedia and Wikilambda on the other hand, I guess? So these wikis would in future be linked to each other very much, but the different communities might not get along well.
Yes, this is correct. This is already the case for our projects, and in fact, often also for the communities that have formed within the projects.
Reducing knowledge diversity: I agree that that is not so much the problem, as the Wikipedians will decide which content to take over and which not. What I would like to see: That as a reader, I can get an article (e.g. "San Francisco") in different versions: a long one, a short one, one interesting for people who live in SF, etc. In general, more modularity than now would be great.
I agree, that would be quite awesome. And whereas I don't think this to be an immediate goal, I do think that such a system will become *much* easier when we have the content in an abstract format and can do summarizations or choose different renderers for different audiences. For example, there could be different renderers that are more suitable for children of different ages and keep the readability level in mind, or renderers geared towards lay audiences and others towards specialists. Some of these are easier to do than others, but I see us working on these by 2024.
"We must make sure that it does not become too hard to contribute": Yes, that is a big problem (see above). I like the idea of "outsourcing" skills; that the people of a local Wikipedia can ask people on Abstract WP and Wikilambda. You would need enough volunteers on AWP-WL to help; and you would need at least some people on the local WP who can communicate its wishes to the helpers on AWP-WL. For very small WP communities, that would be an enormous challenge.
Agreed. Both Wikidata and English Wikipedia have managed to create such environments to help contributors, be it the Teahouse or the "Ask a SPARQL query" page. I very much hope that we will foster a community that will live this spirit.
But I do think that this project is more complicated than any of the other projects we currently have, and I think that it would be important to initially provide this kind of support also coming from the development team. I hope that from this seed, a community-owned support system will grow.
My personal approach would be the following, based on experiences with the German-language encyclopedia for children, Klexikon. It would be great for small Wikipedias to find a corpus of ca. 3000-5000 encyclopedic articles, well chosen by relevance for at least most parts of the world, in easy-to-understand English, not too long, with a good structure, and written in a way that you can easily translate and adapt them for your own language. (Many people will now say: "Simple English Wikipedia already exists", but I think it is not there yet.)
Those 3000-5000 articles would be a wonderful encyclopedia already. The local Wikipedians would enrich the content then with some hundred or thousand articles of their own. In my experience, you do not need millions of articles to fulfill the knowledge hunger of most readers.
I see and understand your approach, but respectfully disagree. I do not think that whoever runs the development of this project should be in the business of guiding the content creation of the project. I firmly believe that creating the content and deciding on which content to create should be solely in the hands of the community.
Having said that, I also will absolutely welcome community members initiating a project where they decide on a corpus of say 3000-5000 encyclopaedic articles chosen by relevance for at least most parts of the world, and make it their aim to create a good structure for these and adapt them to their own languages. In fact, I hope that people who have experience with running such projects will become contributors and do that. I do think that this would be a promising early strategy to create content.
But such a project obviously should not be exclusive.
I think that your "content translation framework" approach goes a little bit in this direction. Part of the framework could be to make suggestions about "localization". For example, the article about "Dogs" could have a note saying: "After this paragraph, you could add some sentences with regard to dogs in your own country/region."
Whereas I would love to claim that the content translation framework is mine, it very much isn't. There is a wonderful team at the Foundation that has created and maintained this over years, and they recently had a rather steep uptick in translations, having led to more than 600,000 translated articles. I cannot praise their hard work enough, and I am thankful to them for having enabled so many people to create so much content in so many languages already.
Kind regards, Ziko
Thank you for your comments, and stay safe, Denny
On Tue, Apr 14, 2020 at 2:53 AM Denny Vrandečić <vrandecic@gmail.com> wrote:
As some of you know, I have been working on the idea of a multilingual Wikipedia for a few years now. Two other publications on this are listed here, and I have bothered you with mails about it on this list previously too:
https://research.google/pubs/pub48057/
https://wikipedia20.pubpub.org/pub/vyf7ksah
I've also been giving talks about this idea in several places, some of which have also been recorded:
https://www.youtube.com/watch?v=LLiJ6E9sG6U&list=PLQVG_tuf3Q2fji-CwqEDRJ...
I gathered some awesome feedback in those few years (also from some members of this list, thank you!), and I also implemented a few prototypes trying out the idea, learning a lot from that.
All of this has helped to sharpen the idea and come up with a more concrete proposal. In short, the proposal is that we do a two-step approach: first, allow for capturing Wikipedia content in an abstract notation, and second, allow for creating functions that translate this abstract notation into natural language. (For simplicity, I gave these two steps names: Abstract Wikipedia for step 1, and Wikilambda for step 2. I realize that both names are not perfect, but that is just one of the many things that we can figure out together on the way.)
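To make the two steps a bit more tangible, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `InstanceOf` constructor, the tiny `LEXICON`, and the `render_en` / `render_de` functions are all hypothetical names made up for this example, and they are not the notation or the functions described in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Step 1 (hypothetical): language-independent content in an abstract notation.
@dataclass(frozen=True)
class InstanceOf:
    """Abstract statement: <subject> is an instance of <kind>."""
    subject: str  # made-up identifier, not a real Wikidata ID
    kind: str     # made-up identifier, not a real Wikidata ID


# Toy hand-written lexicon; an assumption for this sketch only.
LEXICON: Dict[str, Dict[str, str]] = {
    "en": {"san_francisco": "San Francisco", "city": "city"},
    "de": {"san_francisco": "San Francisco", "city": "Stadt"},
}


# Step 2 (hypothetical): one function per language turns the abstract
# statement into a sentence. Articles are hard-coded here; real renderers
# would have to handle agreement, gender, case, and much more.
def render_en(statement: InstanceOf) -> str:
    words = LEXICON["en"]
    return f"{words[statement.subject]} is a {words[statement.kind]}."


def render_de(statement: InstanceOf) -> str:
    words = LEXICON["de"]
    return f"{words[statement.subject]} ist eine {words[statement.kind]}."


RENDERERS: Dict[str, Callable[[InstanceOf], str]] = {
    "en": render_en,
    "de": render_de,
}


def render(statement: InstanceOf, language: str) -> str:
    """Pick the renderer for the requested language and apply it."""
    return RENDERERS[language](statement)


if __name__ == "__main__":
    content = InstanceOf(subject="san_francisco", kind="city")
    for language in ("en", "de"):
        print(language, "->", render(content, language))
```

The point of the sketch is the split: the content is written once, and each language community only needs to contribute and maintain its own rendering functions.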
I wrote up this proposal in a paper, which I uploaded to my website almost two weeks ago, and I also submitted it to arXiv. As soon as it was published on arXiv, I wanted to share it with you and see what you folks think (I wanted to wait for it, as arXiv would allow the URLs to remain stable - my website has gone down before and might do so again).
https://arxiv.org/abs/2004.04733
The new proposal is much more concrete than the previous proposals (and therefore there is much more to criticize). Also, obviously, none of this is set in stone, and just like the names, I am very much looking forward to hearing suggestions for how to improve the whole thing, and I will blatantly steal every good idea and proposal. I am not even sure what a good venue for this discussion is (I guess eventually it should be on Meta?), but I would like to hear proposals about that as well.
Abstract Wikipedia is a proposed extension to Wikidata that would capture the content next to the Wikidata items. Think of it as a new namespace, where we could create, maintain, and collaborate on the abstract content. Similar to the Wikidata-bridge, there should be a way to allow contributions from the Wikipedias to flow back without too much friction. The individual Wikipedias - and I cannot stress this enough - have the choice to use some or any or all or none of the content from Abstract Wikipedia, but I most definitely do not expect the content of the current Wikipedias to be replaced by this. In fact, I have no doubt that any decent article in any language Wikipedia will remain superior to the outcome of the proposed new architecture by far. This is a proposal for the places where the current system left us with gaps, not a proposal to turn the parts that are already brilliant today dull and terrible tomorrow.
Wikilambda is a proposed new Wikimedia project that allows us to share in a new form of knowledge assets: functions. You can think of it as similar to Modules or Templates, but a bit extended, with places for tests, different languages, evaluation, and also for all kinds of functions, not only those that are immediately useful for one of the Wikimedia projects, and, most importantly, shared among the projects. So one of the first goals would be to increasingly allow for a place to have global templates, another idea that has been discussed and asked for for a very long time. Wikilambda, just as Wikidata, is expected to start as a project supporting the immediate needs of the sister projects, and over time to grow into a project that stands on its own merits as well.
We don't really have an effective process for starting new projects, so I am trying to follow a path similar to the one we took for Wikidata back then. And back then it all started with Markus Krötzsch, me and others talking about the idea to anyone who would listen until everyone was bored of hearing it, trying out prototypes, and then talking about it even more, and improving all of it constantly based on your feedback. And then making increasingly concrete proposals until we managed to show some kind of consensus from the communities, you, and the Foundation to actually do it. And then, well, do it.
So, I've done some of the talking, with researchers, with the public, with some of you, and also with folks at the Foundation, to figure out what next steps could be, and how this can be made to work. Here's a more concrete proposal. Now I am here to see whether we can find consensus and be bold. I want to hear from you. I want to hear what you think the right place is to discuss this (here, this list? Another mailing list? Meta? Wikidata? Some Telegram or Facebook group? (OK, I was joking about the latter)). Which parts of the proposal are good and which need improvement? Where is more detail or clarification needed to allow for a meaningful discussion?
Just as with Wikipedia and Wikidata and our other projects, this is a crazy idea at first. Maybe even more crazy than our other projects. And the only way there is a chance of us being successful is if, eventually, thousands of us work together on it. The only way this worked in the past is by being open, starting out collaboratively, discussing the path forward, and working towards creating the project together.
Stay safe, Denny
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe