Hi,
the first sunrise period for .eu domain registration is going to
begin soon, on 7 December 2005. During this period, only domain names
which are registered EU Community or national trademarks will be
registered.
I'm not sure about the status of Wikimedia trademarks, but I guess at
least Wikipedia and Wikimedia qualify. IMO "we" should apply at least
for wikipedia.eu and wikimedia.eu.
Because of trademark issues and EU regulations concerning who can
apply for .eu, I'm afraid it will be a bit complicated. I hope someone
from the Foundation can take care of it.
(If the process is already under way, sorry :-)
Jan Kulveit ([[User:Wikimol]])
| Date: Tue, 29 Aug 2006 09:36:30 -0400
| From: Anthony <wikilegal(a)inbox.org>
| Subject: Re: [Foundation-l] Celebrity pictures
| To: "Wikimedia Foundation Mailing List" <foundation-l(a)wikimedia.org>
| Message-ID:
| <71cd4dd90608290636p645474e7m1f132123a17319e9(a)mail.gmail.com>
| Content-Type: text/plain; charset=ISO-8859-1; format=flowed
|
| On 8/29/06, Ray Saintonge <saintonge(a)telus.net> wrote:
| > Anthony wrote:
| > >On 8/28/06, Ray Saintonge <saintonge(a)telus.net> wrote:
| > >>Perhaps more significant than whether anyone has lost is whether any
| > >>such case has ever been filed. Given that they are distributed for the
| > >>specific purpose of publicity there could be an implicit permission.
| > >>
| > >>
| > >If you're using the image for the purposes of promoting the person.
| > >If, on the other hand, you're using the image to sell an encyclopedia
| > >article which portrays the person in a way which they don't want to be
| > >portrayed, then there probably isn't implicit permission.
| > >
| > I don't know if it's to "sell" an encyclopedia. Lindsay Lohan would
| > need to think she's pretty special if she believes a picture of her will
| > make all the difference in encyclopedia sales. Is she as self-absorbed
| > as Paris Hilton? Our use is transformative, and it in no way adversely
| > affects the company's sales. It would even be interesting to hear the
| > companies comment on the function of publicity shots.
| >
| I was talking about reuse. Specifically, someone who was selling
| print encyclopedias with the current Lindsay Lohan article in it. I
| didn't mean to imply that the selling point was the picture, but
| merely that the encyclopedia was being sold.
|
| > >Maybe I'm overly paranoid, but even here in the US where we have some
| > >very strong fair use and first amendment rights, I still wouldn't feel
| > >comfortable selling an encyclopedia with the current [[Lindsay Lohan]]
| > >article in it, without first receiving permission from the copyright
| > >holders of the images.
| > >(http://en.wikipedia.org/w/index.php?title=Lindsay_Lohan&oldid=72480012
| > >in case it changes before this is read)
| > >
| > This may be a problem for the print version, and specific permissions
| > should probably be sought when we get that far. For the on-line
| > version, however, I have no problem with an active campaign to replace
| > the fair use images with "free" ones. It's clear that I'm more risk
| > tolerant than you, but that doesn't mean there's such a wide gap between
| > our views.
| >
| I don't think it makes sense to have such significant differences
| between the print version and the online version. Other than that, I
| agree with you though. I wouldn't have a problem distributing the
| current article online. In fact, I have a website where I'm doing it.
|
| Jimbo has stated, long in the past, that he doesn't want the print
| version to be a fork of the online version. Maybe he's changed his
| mind, but if not I think you have to consider the print and online
| versions to be the same thing.
|
| > >Frankly I think that case could probably be won by the museum on
| > >appeal, if they spent enough money fighting it.
| > >
| > Yeah, Dillinger has been dead since 1934.
| >
| In Indiana the right to publicity persists after death, though.
|
| > >Besides, there are
| > >always going to be crazy jurisdictions (like Indiana, apparently) with
| > >laws so out of touch with reasonableness that we just can't follow
| > >them.
| > >
| > Developing policies to account for such extremes is playing to the
| > least common denominator.
| >
| Absolutely. I agree. But at the same time, US fair use is an extreme
| too, just on the other end of the spectrum.
|
| > >As for relying on the copyright holder of the image finding the
| > >Wikipedia article "respectful", well, I just think that's a horrible
| > >thing for us to even have to consider. Would Lindsay Lohan (*) object
| > >to our portrayal of her in "Media spotlight"? I don't know, and I
| > >don't care.
| > >
| > There's also the question of who owns the copyright. I suspect it's the
| > studio who sends out fan pics to admirers.
| >
| I would think, with a publicity photo, that it'd be the publicist.
|
| Anthony
|
Might the NY Times test also apply here? Celebrities, being public
figures, should expect to be viewed in public, and a publicity photo
exists for exactly that purpose: to put the person's photo out into
the public. Besides the strict fair use argument (there is no loss to
their marketplace from putting the celebrity photo out), one could
also argue that a plain head shot (one that does not involve any
creative positioning, etc.) really belongs to the celebrity (and they
usually own those photos under contractual work-made-for-hire
agreements anyway), not to a publicist or other third party (though
photos made for specific publications are sometimes copyrighted by
that publication as part of a story being done for it). A regular
publicity photo might support a claim only by the celebrity, and in
that case one might even argue that under the NY Times test, as long
as actual malice does not apply to the use of the photo, the photo
can be reproduced anywhere else. (I haven't done any caselaw research
on this, so I cannot state that this argument would prevail, but it
seems a reasonable one to make now.)
This question also opens up all the moral rights questions associated
with altering someone's work. Obviously, if some punk rocker took a
GFDL-released photo (with a right of publicity implicitly released)
from someone's profile on WP and then altered it to make them look
"evil", that could be considered defamatory (either as a breach of
privacy or perhaps some other tort, if not actual defamation), and so
one could argue that the GFDL and CC licenses never include permission
for such transformations of an image; even a public domain image could
be the basis of a tortious transformation. If the photo were just
being used in another encyclopedia article, how could the subject
argue that an accurate image of their face damages them in any way
(including under copyright law), when that is the reason they released
the photo in the first place? Having the photo used as a thumbnail in
an encyclopedia simply enhances their celebrity status; it does not
cause them to lose money.
It is the other transformative uses that need to be looked at, I think.
Of course, if the GFDL had been drafted by a Canadian open source
foundation, the whole bundle of moral rights protections would
definitely apply, besides the right of publicity issues. In the US it
is not so clear: even though the US signed the Berne Convention, it
has only limited moral rights protections under the [[Visual Artists
Rights Act]], which apply to limited types of "works of visual art" as
defined in Title 17 USC sec. 101 and do not cover publicity photos.
alex756 (IAAL)
I have made another machine translation run. I removed particle
insertion and the erroneous Swahili lexicons identified by Martin
Benjamin, and recompiled the Swahili thesaurus based solely upon the
Kamusi Swahili lexicons, which Martin states are only partially
completed and possibly have some ambiguities. Future runs of this
project will be posted and announced after application of the grammar
rules and the full conjugation and sentence decomposition and
reconstruction rule sets based upon Dr. Benjamin's parsing rules,
which may be a month or two from now, after more work is done on the
grammar parser for this language. One other challenge is lexical
drift toward Arabic: it was explained to me that Swahili and many
other African languages have drifted to incorporate Arabic-derived
words, which may require overlapping rule sets to machine translate
properly.
I have activated the English link grammar parser for this second run
and have begun using word pairing against the Kamusi lexicons, which
are not yet set up to fully handle these cases (but are well on their
way to this goal). The Cherokee language (and most Native American
languages) produces words which are complete, self-contained
morphemes; word meanings are typically not split across word pairs,
as appears to be the case in Swahili. The Cherokee parsers and
lexicons are a lot further along, having been in development by our
linguists for several years for this precise application (in
Cherokee, each complex verb is in fact an entire self-contained
sentence of sorts, as are some nouns). As Martin points out, this
language has a lot more work to go before it reaches the point the
machine translator for Native American languages has already reached,
with comprehensive lexicons and grammar rule sets for machine
translation. Nonetheless, the tremendous potential Wikipedia machine
translation holds for African languages is compelling enough for the
Wolf Mountain Group to approve funding to move this effort forward,
along with any other interested African languages, in support of the
Wikimedia Foundation's projects and goals for African communities.
I still anticipate we can get to 90% by the end of autumn. This
project will remain under development, and regular updates will be
posted to the machine translations page set up by Sabine on Meta for
African languages. These first runs were examples to illustrate the
power of WikiTrans to rapidly create the whole of Wikipedia almost
overnight in another language (provided the lexicons and rule sets
are complete and accurate enough for the translator to rely upon).
The African languages project is very useful for allowing further
abstractions to be instrumented in WikiTrans to deal with a multitude
of languages for all of Wikimedia's projects, which is the ultimate
goal.
The real value here lies in the grammar and parsing rule sets and the
word pairing logic for each language and dialect. Over time,
WikiTrans will develop a large body of these rule sets and lexicons
for all interested languages we target. Rule sets may or may not be
published, depending on the project and the interests of the
contributors. French, Spanish, German, Dine, Italian, and other
popular and pervasive language rule sets will certainly be published
sometime this fall, so folks interested in porting a language to
WikiTrans can do so by writing rule sets and lexicons and submitting
them to the project for test runs.
The latest run for Swahili is at:
http://sw.wikigadugi.org
The latest lexicons, thesaurus, and XML dumps are at:
ftp://ftp.wikigadugi.org/africa
Jeff
>>
>>
>> In the past we have accepted codes to be used as "language" codes which
>> were non-existent and have had as a result that we are not in compliance
>> with the rules of accepted use for the ISO-639 codes. When codes for new
>> languages are used that are not consistent with the existing ISO-639
>> codes (all two and three character codes) a language should not be
>> accepted at all.
>>
We can handle ISO-639 codes; it's no problem to assign the correct code if it
exists. In my opinion, even languages without an ISO-639 code should be
accepted; however, this shouldn't be controlled by us. The decision should
be made in the New language requests vote.
>
> According to your rules, anyone (with help of
> a few friends or of a few sockpuppets) can re-open the Zorglub language
> (oldbies will understand which language is concerned).
> So, your proposal needs to mention the issue of constructed languages.
The policy is just a proposal at the moment; however, we'll take this into
account. There is a Quenya language test in Incubator; I suppose we should
delete it?
> Besides, I see you wrote "The Foundation will also have to approve the
> domain". Errrrrrrr. I'd prefer we avoid such bottleneck. How about
> something like "if at least 20 votes with a very large majority", no
> approval needed. If less votes or less obvious support, then, the
> Foundation or the spc must approve before creation ?
I removed the approval part; I think it remained there from the time when the
policy was imported to Meta and reworded by Daniel. As for your 20-votes
suggestion - unfortunately there are often cases of flash voting, where 20-30
voters can easily appear out of nowhere and declare their support. We can
counter this somewhat by requiring all voters to have accounts, but there are
still problems with that. I think this has yet to be decided somehow. We
should set a percentage range of approval, like with RfAs on the English
Wikipedia; e.g. more than 75% support gets approved automatically, between 50%
and 75% needs approval by the Foundation/SPcom, and less than 50% fails to get
a wiki established.
Michal Zlatkovsky ([[incubator:User:Timichal]])
Hello all,
thank you for your help and interest, as usual. We have now closed
the acceptance of candidates, and I hope we can release the complete
list of candidates soon. This election is getting wide attention, and
we now have close to 20 candidates. I'm really excited about that.
Quicklist of candidate statement by language:
http://meta.wikimedia.org/wiki/Election_translations_2006/En#Candidates
Some of those candidates came in late, but I hope the Wikimedia
community will pay impartial attention to all of them. All statements
should be read closely, ideally in the language most convenient for
each reader, that is, each voter. Regretfully, we have to admit that
not all voters can read those statements in the language they are
most familiar with.
But luckily and thankfully, with the help of many eager translators,
speakers of some languages will be able to read each of those
statements in their own language - or not. It depends on our
volunteer staff, and if you are multilingual, you can help. Your
translation will help your friends in your community and help assure
the neutrality and impartiality of this coming election. I think no
one will be happy if some candidates have their statements in many
languages and others in only a few, or if the speakers of some
language can read what a certain candidate thinks and promises while
others must rely on machine translation or on a language version that
is entirely obscure to them. I want to pursue equal opportunity to
the highest extent in this election. Not only because I am serving as
an officer for it, but also because I believe in the equal rights of
us Wikimedia editors and trust you all as collaborators toward the
same goal: to provide free knowledge in every portion of this planet,
in every language. And voting for our community representatives
should be one of the most significant parts of pursuing this goal, in
my opinion. And for that, we need your help - you German, Spanish,
Dutch, Italian, Polish ... and many other translators.
Again, we need your help to give all our voters enough information to
take this election as seriously as possible, and as impartially as
possible. You can help us - us, the Wikimedia community - to assure
its global character, its equality, and free access to one particular
kind of knowledge: what those candidates think about our mission. So,
take a look at our workspace and consider what you can do!
Quicklist of candidate statement by language:
http://meta.wikimedia.org/wiki/Election_translations_2006/En#Candidates
Thank you for your attention, and see you later on Meta.
--
Kizu Naoko
Wikiquote: http://wikiquote.org
* vivemus, mea Lesbia, amemus *
Jeff,
I applaud you for your initiative - your effort is impressive, albeit
unreadable. I'll give my feedback in this post, and then suggest we
take the discussion of the specifics of Swahili translation off-list
(and welcome others who want to keep track of this thread to email us to
stay in the cc loop). The last 2 or 3 paragraphs of this post do speak
to the wider discussion list, so other readers might wish to SKIP TOWARD
THE BOTTOM.
The first problem derives from your sources. The first source, "public
swahili lexicon," is a useless set of about 1000 nouns, adjectives, and
conjunctions, essentially a tourist vocabulary without any verbs. I
would be surprised if that list gave any pairings that weren't also in
the other lists. The third source, "rogets thesaurus in swahili," is
one I would like to know more about, but is not useful for machine
translation purposes in the configuration you've set up - for example,
scroll down to line 51382, and look at the following 100-odd pairs for
"idhini" in no particular order, with no way to distinguish among parts
of speech, shades of meaning, relative frequency, etc. However, I was
heartened to see line 45405 and following; I'm sure that if any
wikipedia entries need to be translated that include "assify,"
"torpedinous," or "macht nichts," this thesaurus will prove quite handy.
It looks like someone started with a smallish Swahili-English
wordlist, plugged that into an English thesaurus, and extrapolated
dozens of additional English equivalents per word, yielding an
intriguing but lexicographically suspect set of equivalencies.
Which leads us to the Kamusi Project as a source. I will be the first
to say that the Kamusi is a pretty good Swahili dictionary that will one
day be a great Swahili dictionary, but at the moment contains
significant weaknesses that prevent it from being a reliable source for
machine translation. The first issue is the quality of the data. The
initial data were manually input from an existing print dictionary to
which we were granted copyright permission. Unfortunately, the students
entering the data, before we programmed the Edit Engine, introduced a
lot of errors. I am currently in the process of going through the
database entry by entry, fixing those errors and adding in new heaps of
data, including information for many data fields that we hadn't
introduced during the initial data entry phase. This is an incredibly
time consuming, research-intensive task, and I don't foresee having a
Swahili->English dictionary that I am really happy with for another
couple of years (at best - the thesaurus above, and my wife, would
describe our current funding situation as "pauperized").
The Kamusi lexicon is much better as a Swa->Eng source than as an
English->Swahili dictionary, because that is the direction in which
we've input most of the initial data. The magic of databases makes it
possible to have our data available bi-directionally, but the E->S
version of the Kamusi needs its own careful review. That review can
only come after the S->E data are thoroughly updated. Most especially,
precious few E->S entries have been arranged with the Grouping Tool (
http://research.yale.edu/swahili/serve_pages/groupingtool_en.php ), so
most entries appear in an arbitrary order that does not account for
homographs, differing senses, frequency, etc. So, it would be premature
to use the E->S Kamusi lexicon as a platform for machine translation,
even though we do intend to get there.
When the data are ready for machine use, the program would also need to
check the four "alternate spellings" fields, to pick up all the color v.
colour issues that occur in both English and Swahili. Also, I would
think that you would want to keep part of speech info associated with
each line, which would make it much easier to employ grammar rules. A
grammar hint: in Swahili, the adjective always comes *after* the noun
that it modifies, except for the words "kila," "nusu," and "robo", and a
few other cases, including the numbers preceding "elfu" for thousands
between 11,000 and 99,000.
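To make that rule concrete, here is a minimal sketch in Python (the
function name and word lists are my own invention, purely for
illustration) of how a rule-based translator might order a Swahili
noun phrase:

    # Sketch: Swahili adjectives normally follow the noun they modify;
    # "kila", "nusu", and "robo" are among the exceptions that precede
    # it. The exception list here is only the partial one given above.
    PRE_NOMINAL = {"kila", "nusu", "robo"}

    def order_noun_phrase(modifier, noun):
        """Return the words of a Swahili noun phrase in surface order."""
        if modifier in PRE_NOMINAL:
            return [modifier, noun]    # e.g. "kila mtu" (every person)
        return [noun, modifier]        # e.g. "mtu mzuri" (good person)

A real translator would, of course, also need the part-of-speech data
mentioned above to know which word is the modifier in the first place.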
Another hint: Swahili does not use articles, so you need to get rid of
most attempts at translations of a/ an/ the. When an article is
absolutely necessary (which a computer would have a difficult time
predicting), Swahili uses variations of "one" for a/ an, and "that" for
the. Just getting rid of the articles in your articles would be a 100%
improvement (bringing them up to 2% readable).
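Purely as an illustration, that rule is a one-line filter over the
token stream (again a sketch with an invented function name, ignoring
the harder question of when to substitute a form of "one" or "that"):

    # Sketch: drop English articles before lexicon lookup,
    # since Swahili has no articles.
    ARTICLES = {"a", "an", "the"}

    def strip_articles(tokens):
        return [t for t in tokens if t.lower() not in ARTICLES]

    # strip_articles(["the", "farmer", "plants", "a", "seed"])
    # -> ["farmer", "plants", "seed"]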
Ok, now assume we have good data, with a good way of predicting which
words were appropriate in which circumstances (something that will
eventually be aided by the work now being done toward building a central
OmegaT database), and a good set of grammar rules. You would still need
to deal with the agglutinative Swahili verb in all its glory. The
Kamusi Project has a good parser embedded in our Swahili->English
search, which disentangles the front end of any conjugated Swahili verb
according to an analysis of every grammatical rule in the language. (We
have a similar analysis completed and written in pseudo-code for the
back end, the verbal extensions, but ran out of money and had to lay off
our programmer before we could code it into the search engine.) Even
taking advantage of our parser, your translating software would need to
go the other way, building Swahili verbs from conjugated English verbs.
You would need to account for the noun classes of each noun that is
referred to in the verb (as many as three different nouns, each of which
is either one of four different conversational participants or belongs
to one of 16 different noun classes), which involves trivial calls to
our database once you've identified the appropriate elements in the
English sentence and chosen the relevant nouns - the "class" field is
the key here. The real problem comes from conjugated English verbs.
You need some way of knowing that "catches/ caught" relate to "catch,"
which would involve a database of English verbs and their irregular
forms, and then you would need to map the various movable elements of
the English sentence to the appropriate fixed points of the Swahili
verb. Not an impossible task to achieve to 90% over time, but not
nearly as straightforward as you are hoping.
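To make the "catches/caught" point concrete, here is a toy sketch of
the kind of lookup that would have to happen before the Swahili verb
could even be assembled; the table and feature names are invented for
illustration, and a real system would need a full database of English
irregular verbs:

    # Sketch: map an English surface verb back to its lemma plus rough
    # tense/agreement features via a hand-built irregular table, with
    # a crude suffix-stripping fallback for regular verbs.
    IRREGULAR = {
        "caught":  ("catch", {"tense": "past"}),
        "catches": ("catch", {"tense": "present", "person": "3sg"}),
    }

    def analyze_verb(form):
        if form in IRREGULAR:
            return IRREGULAR[form]
        if form.endswith("ed"):
            return (form[:-2], {"tense": "past"})
        if form.endswith("s"):
            return (form[:-1], {"tense": "present", "person": "3sg"})
        return (form, {"tense": "present"})

The lemma then keys the lexicon lookup, and the features help select
the tense marker and subject prefix of the Swahili verb, alongside the
noun-class information from the "class" field.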
Of course, this is all for Swahili, for which we have a pretty good
initial lexicon en route to becoming excellent, a complete description
of grammatical rules, and an accepted, unicoded orthography. Most other
African languages, even those spoken by millions of people, are missing
some or all of those elements in digital form. So, even if you could
get pretty good machine translation of Wikipedia for Swahili, you would
still be a long, long way away from rolling with other languages.
And we still haven't dealt with content. What's to say that content
that is appropriate for the English Wikipedia is appropriate for the
Swahili Wikipedia? For example, the entry for Agriculture. It begins
by discussing the derivation of the word "agriculture," which is of
course irrelevant for Swahili. Then it carries an unacknowledged POV
about modern agriculture (as though the vast numbers of Africans who
earn their livings with hand hoes are pre-modern museum relics, and
let's not even click on the link to "subsistence farming" that talks
about "life outside of modern society"), and essentially ignores all of
the issues of raising crops on small farms that would be of immediate
interest to an African farmer logging in from an internet kiosk.
(Comment to those who fear paternalism in this endeavor: the people I
live and work among in Tanzania express a huge interest in having access
to this sort of information, although they are not in a position to
contribute to the development of the resource.) So, an African farmer
trying to combat an insect infestation on her farm would find a
translation of an English "agriculture" article that focuses on
technology-intensive farming to be much less useful than an article
started almost from scratch that addressed farming in the context of
speakers of that language. It just happens that "agriculture" was the
second article I clicked to by following links from the initial article
on the pseudo-Swahili test site - what similar issues would arise on the
fourth article, or the tenth, or the 997,032nd?
There's also the issue that a great many of the current English
Wikipedia articles are works in progress, of varying quality. Would you
do a one time machine translation of the current Wikipedia, and ignore
all future edits? Translate only "stable versions"? Re-translate
articles every time there is a change? Re-translate every time the
Kamusi Project data is updated (hundreds of times a week)? Have the
machine overwrite manual edits that someone did to machine translations,
when the English version changes? Do this for dozens of African
languages, and hundreds of languages around the world?
I don't want to dismiss the entire endeavor, although I've been working
on these issues for long enough to be sure that the undertaking is much
more complicated than you're estimating. Here's where I think your
translation project might prove useful: if a speaker of, for example,
Swahili went searching for an entry that didn't already exist in the
Swahili Wikipedia, an application could build a version of that page
on-the-fly from the English version that is current at that moment. The
Wikipedia user could then either (a) glean whatever information she
could from the article and move on, (b) laugh uproariously, or (c) go
into edit mode, work to turn the machine translation into something
readable in Swahili, and save that version - which would then become the
baseline page for that entry in that language, from which future edits
could take off. In this way, you would get the best of both worlds -
good articles written in the actual language whenever possible, and
fingertip access to rough machine translations from English when
articles are not initially available in the target language.
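In pseudo-code terms, the fallback I have in mind looks roughly like
this (a rough, self-contained sketch: the dictionaries stand in for
the two wikis, the title mapping between languages is glossed over,
and machine_translate is a placeholder for the actual translator):

    # Sketch of the on-the-fly fallback described above.
    sw_pages = {}      # human-edited Swahili baseline pages

    def machine_translate(text):
        return "[machine draft] " + text   # placeholder translator

    def fetch_article(title, en_pages):
        if title in sw_pages:
            return sw_pages[title]         # a human baseline exists
        return machine_translate(en_pages.get(title, ""))

    def save_edit(title, text):
        sw_pages[title] = text   # an edited draft becomes the baseline

The key property is that machine output is never stored until a human
has touched it, so manual edits are never overwritten by
re-translation.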
Over at Incubator we've been deciding our policy with regard to
starting new language (of an existing project) tests, and new project
tests.
We've come up with this:
- New languages can create a test on Incubator quite easily, needing
only to get a few people who will help.
(http://incubator.wikimedia.org/wiki/I:NTR)
- New languages can create a full wiki using an approval process on
Incubator, and it will be made (or not) after consensus has been
reached. (http://incubator.wikimedia.org/wiki/I:NLR)
- New projects will need approval from the Foundation to have a test
made, and then need further approval to make a full wiki.
Is this acceptable? We are also not sure exactly what the Foundation
needs to approve; the views we received previously seemed slightly
contradictory on this matter.
Thank you for your replies,
Dbmag9 (http://incubator.wikimedia.org/wiki/User:Dbmag9)
-
Dear all,
Thank you all for your comments. I agree with what Timichal has said
for the most part. I think that there are still problems with the
voting process, some of which can never be solved.
A percentage range of approval is a good idea, although it is again
vulnerable. Now we need to find a foolproof method of notifying the
Foundation about what's going on at Incubator :).
Thank you again for your thoughts.
-Dbmag9 (http://incubator.wikimedia.org/wiki/User:Dbmag9)
I write to advise the community that we have successfully brought in
a bookkeeper, Tricia Hoffman, to work with Michael Davis on a
part-time basis. Trish was referred to us by our audit firm and has
more than ten years of complex, full-cycle accounting experience.
During an initial transition phase, Michael will be working with Trish
as she learns the accounts and our internal processes. Following the
certification and publication of our audited financial statements, we
will resume publication of our financials, on what I anticipate will be
a monthly basis. Trish has agreed to provide professional support for
our accounting functions on a contract basis. After we have worked
together for a while, we will reassess based on the amount of time that
is appropriate for the tasks required.
mav has had an extraordinary influence on WMF and is to be heartily
congratulated for his willingness to jump in and work hard, especially
in the earliest days of the projects. As I have come to appreciate the
work that must be done to keep things going in the office, mav's energy
and initiative become all the more apparent. I'm sure he will continue
to contribute in multiple ways going forward.
Thanks, mav!
-Brad