Wikipedia-l February 2003

wikipedia-l@lists.wikimedia.org

70 participants
133 discussions

[robertocasiraghi@iol.it: Can Wikipedia articles be translated or "manipulated" to teach English?]

by Jimmy Wales

Can someone help me help Roberto?

He would like to use Wikipedia texts in a magazine that helps people to learn English. He would take our texts and give line-by-line translations. He is happy to release his translations under the GNU FDL but wants advice on just what he needs to do. What sort of notice should he put on the articles?

--Jimbo

----- Forwarded message from roberto casiraghi <robertocasiraghi(a)iol.it> -----

From: "roberto casiraghi" <robertocasiraghi(a)iol.it>
Date: Thu, 27 Feb 2003 15:08:13 +0100
To: <jwales(a)bomis.com>
Subject: Can Wikipedia articles be translated or "manipulated" to teach English?

Hello.

I am the publisher of English4Life, an Italian magazine that teaches English by providing a double translation into Italian (word by word and in good Italian) and a complete pronunciation guide.

My question is: can I use Wikipedia texts for my purpose? I would leave the texts unchanged but supply suitable translations and comments. Do I have to ask for a special permission in order to do that? If this particular use of your texts is available under the Open Content license, will I lose my copyright on all the translations and comments (i.e. will they become available too under the Open Content License?).

Kind regards.

Roberto Casiraghi - English4Life
Casiraghi Jones Publishing srl
Via Marconi 28
20091 Bresso (Milano)
website: www.linguefaidate.com

----- End forwarded message -----

21 years, 1 month

interlanguage links statistics

by Chuck Smith

> I am not sure if I understand the idea correctly, but here are
> my thoughts.

No, I don't think you understood my idea correctly. I simply wanted a bot to go through the English Wikipedia, count the interlanguage links for each article, and then list which articles have the most. I know it's not going to be perfect, but it's an estimate. I mean, our article count isn't perfect either. ;-)

The statistics could also measure how many interlanguage links there are for each language. Could be interesting.

> In its early stage, I guess, users of a non-English wikipedia
> may be predominantly bilinguals, non-native speakers, etc.
> (Well, esperanto wiki and maybe some others will always
> remain so. :-)

Yeah, I think we only have one native speaker of Esperanto in the Wikipedia. He really helped out on our article about native Esperanto speakers though. :)

Chuck

=====
Learn Esperanto! - http://www.lernu.net/
Enciklopedio: http://eo.wikipedia.org/
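The bot described above could be sketched roughly as follows. This is only an illustration, not an existing tool: it assumes interlanguage links appear in the wikitext as `[[xx:Title]]`, and the `LANG_CODES` set, the `SAMPLE` articles, and all function names are made up for the example.

```python
import re
from collections import Counter

# Hypothetical subset of language codes; a real bot would need the full list.
LANG_CODES = {"de", "eo", "es", "fr", "ja", "nl", "pl", "zh"}

# Matches [[xx:Title]] where xx is a short lowercase prefix.
INTERLANG_RE = re.compile(r"\[\[([a-z-]{2,12}):([^\]|]+)\]\]")


def interlanguage_links(wikitext):
    """Return the set of known language codes that an article links to."""
    return {
        m.group(1)
        for m in INTERLANG_RE.finditer(wikitext)
        if m.group(1) in LANG_CODES
    }


def rank_articles(articles):
    """List (title, link count) pairs, most interlanguage links first."""
    counts = {t: len(interlanguage_links(text)) for t, text in articles.items()}
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def links_per_language(articles):
    """Count how many articles link to each language."""
    counter = Counter()
    for text in articles.values():
        counter.update(interlanguage_links(text))
    return counter


# Made-up sample data standing in for a dump of the English Wikipedia.
SAMPLE = {
    "Dog": "A dog is ... [[de:Hund]] [[eo:Hundo]] [[fr:Chien]]",
    "United Nations": "... [[de:Vereinte Nationen]] [[eo:Unuiĝintaj Nacioj]]",
    "Proteomics": "No links yet. [[Image:foo.png]]",
}

for title, n in rank_articles(SAMPLE):
    print(n, title)
```

As noted in the message, the result is only an estimate: an article whose interlanguage links have not been added yet is undercounted.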

21 years, 1 month

most international articles statistics?

by Tomos at Wikipedia

>>I asked this before, but I don't think anyone replied.
>>Is there any way to see a list of articles based on
>>how many languages they're in? It would be interesting
>>to see which articles are the most international or
>>the most important...

I am not sure if I understand the idea correctly, but here are my thoughts.

1. There is perhaps no easy way to see the list. It takes a well-developed Wiktionary to find out how "United Nations," "Dog," or any other good candidate is spelled in all the different languages. We can use inter-language links, but they are not complete. The only way is to build a list through an interlingual collaborative research project. It could be daunting, but the scope of the research could be limited a) to the list of articles in the smallest wikis and/or b) to just a list of the top 100 most-viewed articles on each site.

2. I have some guesses about which articles exist in the greatest number of languages.

- Wikipedia probably exists on most sites.
- GNU_FDL, because it is linked from most pages, may be as prevalent.
- Basic academic disciplines such as mathematics, linguistics, and sociology, because they are likely to be linked from main pages. Lists of world countries, languages, etc. would be perhaps as popular.

While these are not unimportant subjects, they do not represent "what's globally important" well. (I'm assuming that's what Chuck wanted to observe.)

Beyond that, there are two types. First, some articles are placed on very inactive sites because a user wants to distribute an article multilingually. I have seen at least two such articles on the Japanese wiki, perhaps machine-translated. Second, in its early stage, I guess, users of a non-English wikipedia may be predominantly bilinguals, non-native speakers, etc. (Well, esperanto wiki and maybe some others will always remain so. :-) Then computer-related topics would be popular because of the demographics of early adopters.

With many wikis still in their infancy, those may turn out to be the ultimate winners, scoring points from the small-sized wikis.

But I'm not saying the idea of a list of articles based on # of languages is uninteresting. If we really construct one, something else may come up, say, "Beatles" or "European Union," that would be interesting. I'm curious, indeed. I would be even more interested if different lists existed for people, country, artist, movie, novel, music, etc.

Did I answer your question? If my take is correct and you will initiate an interlingual research project, please let me know.

Tomos

21 years, 1 month

Let Wikipedia mostly categorize itself (was Re: Categories, back to square one)

by Daniel Mayer

On Monday 17 February 2003 12:24 pm, Magnus Manske wrote:
> So, it seems (if I interpret Jimbo's mail on wikitech and the discussion
> here correctly) that most of us would like *some kind* of category
> scheme in wikipedia. I do, too! But, we seem to differ on the details
> (shocked silence!).
>
> So far, I saw three concepts:
> 1. Simple categories like "Person", "Event", etc.; about a dozen total.
> 2. Categories and subcategories, like
> "Science/Biology/Biochemistry/Proteomics", which can be "scaled down" to
> #1 as well ("Humankind/Person" or something)
> 3. Complex object structures with machine-readable meta-knowledge
> encoded into the articles, which would allow for quite complex
> queries/summaries, like "biologists born after 1860".
>
> Pros:
> 1. Easy to edit (the wiki way!)
> 2. Still easy to edit, but making wikipedia browseable by category,
> fine-tune Recent Changes, etc.
> 3. Strong improvement in search functions, meta-knowledge available for
> data-mining.
>
> Cons:
> 1. Not much of a help...
> 2. We'd need to agree on a category scheme, and maintenance might get a
> *little* complicated.
> 3. Quite complex to edit (e.g., "<category type='person'
> occupation='biologist' birth_month='5' birth_day='24' birth_year='1874'
> birth_place='London' death_month=.....>")
>
> For a wikipedia I'd have to write myself, I'd choose #3, but with
> respect to the wiki way, #2 seems more likely to achieve consensus (if
> there is such a thing;-)
>
> Magnus

Hm. I agree that #1 would be nearly useless and #3 is asking too much of mere mortals, but #2 smells a lot like subpages. I remember one of the arguments against subpages was that there are multiple ways to express the hierarchy of most subjects. So, for example, an alternate hierarchy for [[proteomics]] might focus more on historical development instead: Science/Biology/Molecular biology/Genetics/Proteomics or, depending on your opinion, even Science/Biology/Molecular biology/Biochemistry/Proteomics.

Biographies would be even more difficult: Biographies/Science/Physics/Albert Einstein vs Biographies/Science/Physics/Theoretical physics/Albert Einstein vs Science/Physics/Albert Einstein. But he was also a peace activist, so: Biographies/Politics/Peace activists/Albert Einstein or even Biographies/Politics/Political movements/Peace activism/Albert Einstein.

Having hierarchies is also bad database design for the above reasons and because novel and interesting relationships cannot be searched for unless a human has already created a hierarchy specific to that relationship.

What would be much better is to allow a spider to follow links starting from the Main Page and automatically classify articles based on what they link to and what links to them. The spider would check the classification of articles linked to and from the unclassified article in order to classify it. Additional tweaks could be added for the spider to determine the classification of an article based on the presentation of its content and whether or not it is linked from certain pages that are given more weight.

Our many lists would be very useful here; if something that looks like a biography to the spider is listed on [[List of astronomers]] then the spider would classify that article as "astronomer". The classification "astronomer", in turn, would already be classified under "astronomy", "science" and probably many other things at varying "weights of relevance".

Biographies, for example, have a certain format (at least in en.wiki) whereby the birth and death dates are in parentheses after the name on the first line. Most of them also state on the first line the country of origin of the person and their occupation(s). And many are listed on the list of biographies and lists of people pages and in the birth and death sections of year and day articles. All this can be used to classify the article so that the types of queries that you mention in #3 could be done.

Another example is the use of the <math> tag for writing formulas. So having that tag in an article would be /one/ thing that would be considered by the spider in determining its classification (but the spider wouldn't categorize the article under "mathematics" unless other articles linked to and from it are already categorized that way).

And as time goes by we can make this categorization spider more and more sophisticated. But it will be rather crude at first, so a human-powered feedback mechanism would have to be put into place to tweak the spider. The many lists that we already have can be a very useful way to prime the spider to quickly improve its accuracy.

In short, let the complex linking and standardized article formatting already present in Wikipedia determine its own "weighted relationship" categorization with a minimal amount of human intervention. There is a goldmine of categorization information already in Wikipedia that hasn't been tapped yet (like in the BBC program Connections, Wikipedia could eventually be used to find odd but interesting connections between disparate subjects along with the more predictable relationships).

Yeah, I know - I'm just dreaming. But it would be a very neat thing to have and it would greatly minimize the amount of work (and inconsistent guesswork) that humans would have to do.

But the above is probably already patented by somebody.... Software patents are evil!

-- Daniel Mayer (aka mav)

WikiKarma: I expanded and converted [[Titanium]] over to the WikiProject Elements format.
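To make the spider idea a little more concrete, here is a minimal sketch of link-based classification with "weights of relevance". It is not an existing Wikipedia feature. It assumes a small in-memory link graph, treats list pages as human-supplied seeds, and uses a simple averaging rule with a made-up `damping` factor; `LINKS`, `SEEDS`, and `classify` are hypothetical names.

```python
from collections import defaultdict


def classify(links, seeds, rounds=3, damping=0.5):
    """Propagate weighted categories from seeded pages to linked pages.

    links: dict mapping a page title to the set of titles it links to
    seeds: dict mapping a page title to {category: weight}
    """
    # Links to a page and links from it both count, so build an
    # undirected neighbour map.
    neighbours = defaultdict(set)
    for src, targets in links.items():
        for dst in targets:
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    scores = {page: dict(cats) for page, cats in seeds.items()}
    for _ in range(rounds):
        updated = {page: dict(cats) for page, cats in scores.items()}
        for page, nbrs in neighbours.items():
            if page in seeds:
                continue  # seeded pages keep their human-supplied weights
            totals = defaultdict(float)
            for n in nbrs:
                for cat, weight in scores.get(n, {}).items():
                    totals[cat] += weight
            if totals:
                # Average over all neighbours, then damp so that weights
                # fade with distance from the seeds.
                updated[page] = {
                    cat: damping * w / len(nbrs) for cat, w in totals.items()
                }
        scores = updated
    return scores


# Made-up miniature link graph.
LINKS = {
    "List of astronomers": {"Edwin Hubble", "Tycho Brahe"},
    "Edwin Hubble": {"Galaxy"},
    "Astronomy": {"Galaxy"},
}
SEEDS = {
    "List of astronomers": {"astronomer": 1.0},
    "Astronomy": {"astronomy": 1.0},
}

result = classify(LINKS, SEEDS)
for page in sorted(result):
    print(page, result[page])
```

The human feedback mechanism mentioned above would correspond to correcting or adding entries in `SEEDS` and re-running the propagation.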

21 years, 1 month

Re: Re: Article basement (was: Re: [Wikipedia-l] Semantic links (was:

by Daniel Mayer

On Tuesday 25 February 2003 06:53 pm, wikipedia-l-request(a)wikipedia.org wrote:
> Wikipedia isn't just in English, folks. An English-dependent syntax is
> *not* acceptable.
>
> -- brion vibber (brion @ pobox.com)

*cough* HTML (<table> <font size="{}"> <center> <small> <br>...) *cough*

We should be better than that but how can you make something human readable if it isn't written in any human language? Translating the syntax would work except for the fact that contributors would not be able to copy and adapt metadata from one language to another. That is, unless we allowed all translated syntax to be in one "basement" - but then we are back to the unreadable aspect.

--mav

WikiKarma: Added a bunch of events to [[February 19]]; updated all the year pages and many of the other articles linked from that page.

21 years, 1 month

Re: Wikipedia-l digest, Vol 1 #1043 - 8 msgs

by s

As far as language links go, can't it be a simple matter of script categorization?

<LANGA - would be understood by almost any Latin-based reader as a potential start point.... This would lead to a disambiguation page of Latin-based languages...

<Simplified Kanji here) is readable by Chinese, Japanese, Koreans...etc... That alone takes care of probably 85 percent of the world...

<arabic-based text here> - could even be Aramaic! - for an icon, it is similar enough, and could be the basis of all Aramaic-based languages.. Hebrew, Arabic... Urdu..?

Three icons, and we've taken care of 92 percent of the world... The rest... what... Cyrillic? - might be combined with the Latin, with a backwards N or something... Thai, and various different types of script can be incorporated... Thai into Semitic or Asian.... etc...

In other words, icons are the way to go; you just have to understand how they work to elicit a response... To someone who prefers to read Kanji, the (middle sign) or the (Character sign) is an island in the ocean...

Stevertigo...

21 years, 1 month

most international articles statistics?

by Chuck Smith

I asked this before, but I don't think anyone replied. Is there any way to see a list of articles based on how many languages they're in? It would be interesting to see which articles are the most international or the most important...

Thanks,
Chuck

=====
Learn Esperanto! - http://www.lernu.net/
Enciklopedio: http://eo.wikipedia.org/

21 years, 1 month

japanese wikipedia: article count and embassy

by Tomos at Wikipedia

Regarding the Japanese article count: yes, the number is kind of inaccurate. The problem is not unique to Japanese, but shared with Chinese and others, I believe. The reason is as pointed out - Japanese writing can go on at length without an alphabetical comma.

The problem has been known to some Japanese users, because one user translated [[Wikipedia:What is an article]] and noticed the article count's behavior. I also experimented with a fairly long article (10921 bytes). As soon as I deleted all the commas and saved, the article count dropped by one. When I reverted, the count went up by one. I then wrote about it in [[Wikipedia_talk:What is an article]], but I guess it didn't get much attention. I am glad to find out that elian brought this up on this list.

Regarding the Embassy, I don't mind setting one up on the English wiki (I'm a native Japanese speaker and understand English okay), but I am afraid I cannot fully function as an ambassador. Is an embassy with limited staff and a part-time ambassador acceptable?

Tomos
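The behavior observed in that experiment can be reproduced with a small sketch. It assumes, based only on what is described in this message, that a page counts as an article when it contains an ASCII comma. The actual rule in the software may include other conditions; the function names and sample pages are made up.

```python
ASCII_COMMA = ","
IDEOGRAPHIC_COMMA = "\u3001"  # the Japanese/Chinese comma
FULLWIDTH_COMMA = "\uff0c"    # the full-width comma


def counts_as_article(text, commas=(ASCII_COMMA,)):
    """Assumed heuristic: a page is an article if it contains a comma."""
    return any(c in text for c in commas)


def article_count(pages, commas=(ASCII_COMMA,)):
    """Count the pages that pass the comma heuristic."""
    return sum(1 for text in pages if counts_as_article(text, commas))


# Made-up sample pages: one English sentence and one Japanese sentence
# that uses the ideographic comma instead of the ASCII one.
PAGES = [
    "Tokyo is the capital of Japan, and its largest city.",
    "東京は日本の首都であり\u3001最大の都市である。",
]

print(article_count(PAGES))
print(article_count(PAGES, (ASCII_COMMA, IDEOGRAPHIC_COMMA, FULLWIDTH_COMMA)))
```

With only the ASCII comma the Japanese page is missed. Widening the set of accepted commas is one possible remedy, offered here as an illustration and not as what the software does.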

21 years, 1 month

Wikipedia - the multilingual encyclopedia

by Chuck Smith

> Question is: what is the *most* important message to
> convey, that we are multilingual, or that we are an
> encyclopedia?

Of course, it is that we are an encyclopedia, but it is fine to have one list on the non-English Wikipedias, because we've already been relegated to using subdomain names instead of the prestigious www.

> Most people read only one language. Nobody can read
> 30 languages (very few anyway). Even people who can read
> more than one language have a preference.

You're an American, right? I live in Rotterdam (although I'm an American too) and I don't know anyone here who would only read Wikipedia in one language. I know people who go first to Wikipedia simply *because* it's the only multilingual encyclopedia in existence (from what I know, correct me if I'm wrong). It's incredibly enlightening to see the different POV that articles get in different languages, and my European friends think it's funny to see how often the English-language articles in Wikipedia are biased toward an American POV.

*Everyone* on the Esperanto Wikipedia can (and does) read two languages and most can read three or four. Most of the people who would have Internet access and would read Wikipedia can and will read more than one language! The majority of people who would only read Wikipedia in one language would be in the United States, England and Australia, which is 0.6% of the world population...

> I think the French wikipedia is correct in not placing
> language lists on three sides of the main page the way we
> do in English. I can find my four languages on one list.
> I don't need three lists.

The three lists on the English Wikipedia are a temporary measure until we can make www a multilingual portal; then we can eliminate the extra language lists on en.wikipedia.org and move down to one language list. The other question is, how many first-time visitors will see their languages on just one list? How many would even notice the language list at all if it wasn't shown three times?

Having the language list three times is there to appease the people who find it offensive that www isn't a multilingual portal. Don't forget that it was one of the reasons why the Spanish Wikipedia broke off from us AND why it won't rejoin.

> I don't understand why I seem to be coming across like
> some kind of rude crank on this. I don't mean to. I
> apologize again to everyone. I have nothing against
> anyone's language or anyone's wikipedia. I just don't see
> the point of the redundant, inconsistent information on
> available languages.

Well, it's because you don't see language discrimination around you every day. People in French Canada are getting their native language shoved aside by English and they needed laws to protect them. Most countries of Europe don't have laws to protect their languages and are getting run over by English. Some job offers are only for native English speakers, which means that even if your English is perfect, if you aren't a native speaker, you're not getting the job.

This can be quite an emotional issue for some, especially those who are fighting against language discrimination full-time. That's why I'm here in Rotterdam volunteering for TEJO (www.tejo.org). Language discrimination is a tough issue and it's not going to go away anytime soon...

I just want to add that I don't think that you personally are discriminating, but that you're simply not aware of the problem. I hope I wasn't too harsh, it's just that I feel quite strongly about these issues...

Chuck

=====
Learn Esperanto! - http://www.lernu.net/
Enciklopedio: http://eo.wikipedia.org/

21 years, 2 months

Possible Aged Copyright Violation

by Gareth Owen

At the end of August last year, anonymous contributor 24.80.230.145 contributed this,

http://www.wikipedia.org/w/wiki.phtml?title=Gardner_Fox&oldid=191349

sprung fully formed from the head of Zeus, and a few other comic book writer profiles, then abruptly vanished.

Well, that's uncannily like this:

http://www.google.com/search?q=cache:_L8AdgxKTRwC:www.geocities.com/Athens/…

Firstly, the wiki article is clearly based on the geocities one (or vice versa, but I'd say that's unlikely), and the wording is very similar but different.

How close must texts be to be a breach of copyright?

--
Gareth Owen
"I love the wikipedia, but sometimes I get the impression that certain people on this list are very bored, and so argue about something when there's really bugger all to argue about. Edit some articles, for god's sake." -- LP

21 years, 2 months
