Several wikis have used bots to increase their article count in the past. Examples are the Volapük Wikipedia (vo), with 118,000 articles of which about 117,000 are bot-created stubs, and the Aromanian Wikipedia (roa-rup), with 61,000 articles at the moment and fewer than 10,000 before the bot run.
Why do they use bots? Because they have a small userbase and want to cover as many topics as possible with little effort. Most of the languages that use bots are small languages without much written literature, especially when it comes to non-fiction reference literature. There are no Aromanian encyclopedias, few or no reference books, no databases etc. An Aromanian speaker either has to learn and use foreign languages or will never be able to get information about places in China or in America. The bot operator tried to change this by creating stubs about places in China, America and elsewhere. (Geographic objects are the easiest way to cover large numbers of topics without much effort.) But he did a horrible job, producing really bad and uninformative articles. I assume the reason for the bad articles is not bad intent but just a lack of the technical skills needed to program a more useful bot.
The easiest reaction to this is to just let them do their thing and not care about it. The second easiest reaction is to run a delete bot and remove the bad articles because of their negative effects. But neither method addresses the original motivation of the bot operator: the wish to have information about a large range of entities available in the wiki's language.
How can this be addressed?
We need a datawiki. That's not a new proposal; proposals for datawikis have a long history. There never was a specific reason not to implement one. It is just that, until now, nobody has cared about it enough to implement it.
Here's my idea about it: When a search does not yield any matching articles on the local wiki, the software will look up the name in the central datawiki. If the central datawiki contains a matching entry, this entry will be loaded. It will contain an instance of a template filled with information about the entity. E.g.:
{{Town
|name=Fab City
|country=Awesomia
|pop=89042
|lat=42.0
|lon=42.0
|elevation=12
|mayor=Adam Sweet
}}
The software will now look for a template called "Town" on the local wiki. The local template [[Template:Town]] will for example look like this:
{| class="infobox"
|-
! Name
| {{{name|}}}
|-
! Country
| {{{country|}}}
|-
! Population
| {{{pop|}}}
|-
! Mayor
| {{{mayor|}}}
|-
! Elevation
| {{{elevation|}}} above sea level
|-
! Geographic position
| {{latlon| {{{lat|}}} | {{{lon|}}} }}
|}
'''{{{name|}}}''' is a place in [[{{countryname| {{{country|}}} }}]] with a population of {{{pop|}}}.

[[Category:{{countryname| {{{country|}}} }}]]
[[Category:Towns]]
Of course this template will be localized in the language of the local wiki. This information will now be shown to the user who entered the name in the search. (The above examples are just, well, examples. Real entries would most likely contain much more data.)
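The lookup flow described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not how MediaWiki or any existing extension works: the dictionaries `LOCAL_ARTICLES`, `DATAWIKI` and `LOCAL_TEMPLATES` and the functions `parse_template_call` and `search` are hypothetical names, and the local template is reduced to a plain format string instead of real wikitext.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two wikis (assumptions for this
# sketch; these are not MediaWiki APIs).
LOCAL_ARTICLES = {}  # title -> text of articles written on the local wiki
DATAWIKI = {
    "Fab City": "{{Town |name=Fab City |country=Awesomia |pop=89042 "
                "|lat=42.0 |lon=42.0 |elevation=12 |mayor=Adam Sweet }}",
}
# Localized templates of the local wiki, reduced here to format strings.
LOCAL_TEMPLATES = {
    "Town": "{name} is a place in {country} with a population of {pop}.",
}


def parse_template_call(wikitext):
    """Split a flat call like {{Town |a=1 |b=2 }} into (name, params).

    Only handles the simple, non-nested form used in the example entry.
    """
    text = wikitext.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call")
    parts = [part.strip() for part in text[2:-2].split("|")]
    params = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:  # skip fragments that are not key=value pairs
            params[key.strip()] = value.strip()
    return parts[0], params


def search(title):
    """Return local article text, else a rendered datawiki entry, else None."""
    if title in LOCAL_ARTICLES:
        return LOCAL_ARTICLES[title]
    entry = DATAWIKI.get(title)
    if entry is None:
        return None  # no match anywhere: the dead end mentioned in the text
    name, params = parse_template_call(entry)
    template = LOCAL_TEMPLATES.get(name)
    if template is None:
        return None  # the local wiki has not localized this template yet
    # Missing parameters render as empty strings, like {{{pop|}}} in wikitext.
    return template.format_map(defaultdict(str, params))


print(search("Fab City"))
```

In this sketch a local article always wins over a datawiki entry, so hand-written content is never replaced by generated content.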
The datawiki can be filled with information about any entity that has a certain set of recurring features (almost anything that has an infobox on Wikipedia), especially geographic objects. These objects also have the advantage that their names usually are international (at least among Latin script languages).
The advantages are:
- When the central datawiki is filled with info (most of which can be bot-extracted from existing Wikipedia infoboxes), every Wikipedia, however small its userbase may be, has instant access to information about hundreds of thousands or millions of objects. It just needs to implement some infobox templates.
- This solution also eliminates problems with outdated information in infoboxes (a problem even en.wp is suffering from). The data only needs to be updated in one single place instead of in every single Wikipedia separately.
With the work done by Nikola Smolenski on the Interlanguage extension (http://www.mediawiki.org/wiki/Extension:Interlanguage) it shouldn't be too hard to implement.
In view of the potential usefulness I cannot think of any argument that speaks against this in general. The prospect of providing at least basic information about millions of objects in all the different languages seems really great to me.
Many native speakers of smaller languages use foreign language wikis as their default wiki because the chance that their native wiki has an article on the topic is small. If the number of topics where a search on the native wiki yields results rises from "some thousands" to "millions", there is a chance that users will finally accept their native wiki as their default wiki. The entries will be basic, but if interwikis (of existing articles not generated from the datawiki) are included in the info obtained from the datawiki, the more extensive data is just one click away, while an unsuccessful search on the local wiki (which is what you get now) is a dead end.
It certainly is worth putting some resources into it.
What do you think?
Marcus Buck User:Slomox
On Mon, Aug 23, 2010 at 5:55 PM, Marcus Buck me@marcusbuck.org wrote: ...
> In view of the potential usefulness I cannot think of any argument that speaks against this in general. The prospect of providing at least basic information about millions of objects in all the different languages seems really great to me.
While I like the idea, I wonder how (and in which language) the community of this project will establish consensus..
-Palnatoke
On 23.08.2010 18:20, Ole Palnatoke Andersen wrote:
> While I like the idea, I wonder how (and in which language) the community of this project will establish consensus..
> -Palnatoke
It'll be multilingual in the same way as Meta or Commons or our other cross-language-border projects.
Marcus Buck User:Slomox
On 23 August 2010 17:23, Marcus Buck me@marcusbuck.org wrote:
> It'll be multilingual in the same way as Meta or Commons or our other cross-language-border projects.
So, English then? ;-p
- d.
On 23.08.2010 18:31, David Gerard wrote:
> > It'll be multilingual in the same way as Meta or Commons or our other cross-language-border projects.
> So, English then? ;-p
Yep, de facto English ;-) I don't like it and I spend a lot of time improving the usefulness of Commons for non-English speakers, but improving the participation opportunities for non-English folks on our multilingual projects is a different task with enough complexity in itself.
Although I'm sure they will establish sub-communities on the new wiki like they did on Commons. E.g. German speakers meet at the Forum (http://commons.wikimedia.org/wiki/Commons:Forum) instead of the Village pump. That will happen on a datawiki too and probably these subcommunities will focus on their respective regions, e.g. German speakers will focus on maintaining the town data for places in Germany, Austria, Switzerland etc.
Marcus Buck User:Slomox
On 23 August 2010 17:43, Marcus Buck me@marcusbuck.org wrote:
> Although I'm sure they will establish sub-communities on the new wiki like they did on Commons. E.g. German speakers meet at the Forum (http://commons.wikimedia.org/wiki/Commons:Forum) instead of the Village pump. That will happen on a datawiki too and probably these subcommunities will focus on their respective regions, e.g. German speakers will focus on maintaining the town data for places in Germany, Austria, Switzerland etc.
I do like this idea very much indeed. What will it take in software terms? Something similar to Freebase? Something like Freebase bolted onto MediaWiki? OmegaWiki?
- d.
On Mon, Aug 23, 2010 at 6:13 PM, David Gerard dgerard@gmail.com wrote:
> I do like this idea very much indeed. What will it take in software terms? Something similar to Freebase? Something like Freebase bolted onto MediaWiki? OmegaWiki?
I thought transwiki template transclusion is being worked on?
Magnus
On 23.08.2010 19:20, Magnus Manske wrote:
> I thought transwiki template transclusion is being worked on?
> Magnus
If what I proposed is planned to be part of transwiki template transclusion and if transwiki template transclusion is "real soon now" in the literal sense and not "real soon now" in the extended sense, then I'm happy and satisfied ;-) Is there a roadmap for transwiki template transclusion and is it decided that at the end of this roadmap transwiki template transclusion will go live on the Wikimedia projects?
Additional question: My idea involves calling a local template from within the transwikied template to do the localisation. Will that be possible with transwiki template transclusion?
Marcus Buck User:Slomox
On Tue, Aug 24, 2010 at 3:32 AM, Marcus Buck me@marcusbuck.org wrote:
> If what I proposed is planned to be part of transwiki template transclusion and if transwiki template transclusion is "real soon now" in the literal sense and not "real soon now" in the extended sense, then I'm happy and satisfied ;-) Is there a roadmap for transwiki template transclusion and is it decided that at the end of this roadmap transwiki template transclusion will go live on the Wikimedia projects?
http://www.mail-archive.com/foundation-l@lists.wikimedia.org/msg11368.html
> Additional question: My idea involves calling a local template from within the transwikied template to do the localisation. Will that be possible with transwiki template transclusion?
I don't think that is part of the current design.
However, the calling wiki could pass through a language code, and the datawiki could respond with the translated result. The datawiki would then host the translations for the keywords, which would be similar to how Commons templates operate. This also helps with uniformity, as the datawiki can manage out-of-date translations because they are local. translatewiki may also have ideas for how they can help manage the translations of these databoxen.
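A rough Python sketch of that alternative, in which the datawiki itself holds the keyword translations and answers a request that carries a language code. Everything here is an assumption made for illustration: the `LABELS` and `ENTRIES` tables, the `render_databox` function and the fallback-to-English rule are hypothetical and are not taken from Commons, translatewiki or any existing design.

```python
# Hypothetical keyword translations hosted centrally on the datawiki.
# The German set is deliberately incomplete to mimic an out-of-date
# translation.
LABELS = {
    "en": {"country": "Country", "pop": "Population", "mayor": "Mayor"},
    "de": {"country": "Land", "pop": "Einwohner"},
}

# Hypothetical data entries, keyed by title.
ENTRIES = {
    "Fab City": {"country": "Awesomia", "pop": "89042", "mayor": "Adam Sweet"},
}


def render_databox(title, lang, fallback="en"):
    """Render an entry with labels in `lang`, falling back per keyword."""
    entry = ENTRIES[title]
    fallback_labels = LABELS[fallback]
    labels = LABELS.get(lang, fallback_labels)
    lines = []
    for key, value in entry.items():
        # Use the requested language if it has the keyword, then the
        # fallback language, then the raw keyword itself.
        label = labels.get(key, fallback_labels.get(key, key))
        lines.append(f"{label}: {value}")
    return "\n".join(lines)


print(render_databox("Fab City", "de"))
```

Because the translations live next to the data, a missing keyword shows up in one place and can be filled in once for every calling wiki.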
-- John Vandenberg
wikimedia-l@lists.wikimedia.org