[Foundation-l] [Wiki-research-l] WikiCite - new WMF project? Was: UPEI's proposal for a "universal citation index"

Daniel Mietchen daniel.mietchen at googlemail.com
Wed Jul 21 08:07:30 UTC 2010


On Tue, Jul 20, 2010 at 9:26 PM, Brian J Mingus
<Brian.Mingus at colorado.edu> wrote:
> I like your suggestion that the abc disambiguator be chosen based on the
> first date of publication, and I also like the prospect of using slashes
> since they can't be contained in names. Using the full year is a good idea
> too. We can combine these to come up with a key that, in principle, is
> guaranteed to be unique. This key would contain:
>
> 1) The first three author names separated by slashes
Why not separate by pluses? They don't form part of names either, and
they don't cause problems with wiki page titles.

> 2) If there are more than three authors, an EtAl
I don't think that's necessary if we get the abc part right.

> 3) Some or all of the date. For instance, if there is only one source by
> this set of authors that year, we can just use YYYY. However, once another
> source by that set of authors is added, the key should change to MMDDYYYY
> or similar.
I don't think it is a good idea to change one key as a function of
updates on another, except for a generic disambiguation tag.

> If there are multiple publications on the same day, we can
> resort to abc. Redirects and disambiguation pages can be set up when a key
> changes.
As Jodi pointed out already, the exact date is often not clearly
identifiable, so I would go simply for the year.
Instead of an alphabetic abc, one could use some function of the
article title (e.g. the first three words thereof, or the initials of
the first three words), always in lower case.

An even less ambiguous abc would be the starting page (for printed
material) or the article number (for online-only material), but this
brings us back to the 7523225 problem you mentioned above.

> Since the slashes are somewhat cumbersome, perhaps we can not make them
> mandatory, but similarly use them only when they are necessary in order to
> "escape" a name. In the case that one of the authors does not have a slash
> in their name - the dominant case - we can stick to the easily legible and
> nicely compact CamelCase format.
>
> Example keys generated by this algorithm:
>
> KangHsuKrajbichEtAl2009
Kang+Hsu+Krajbich+2009+the+wick+in
or
Kang+Hsu+Krajbich+2009+twi

Also note that the CamelCase key does not yield any results in a Google
search, whereas the first plus-separated variant brings up the right
work, and the plus-separated one with title initials tends to bring up
at least something written by or cited from these authors.

> Author1Author2/Author-Three/2009
Author1+Author2+Author-Three+2009+just+another+article
or
Author1+Author2+Author-Three+2009+jat

Of course, it does not have to be _exactly_ three authors, nor three
words from the title, and it does not solve the John Smith (or Zheng
Wang) problem.
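
Assembling such a plus-separated key could be sketched as follows. This is
only an illustration under assumptions: make_key and its parameters are
hypothetical names, the title part is passed in ready-made (either the first
words or a single string of initials), and a component that itself contains
the separator is simply rejected.

```python
def make_key(authors, year, title_part, max_authors=3, sep="+"):
    """Join up to max_authors names, the year, and the title part with sep."""
    parts = [*authors[:max_authors], str(year), *title_part]
    for part in parts:
        # The separator must not occur inside a component,
        # otherwise the key could not be split unambiguously.
        if sep in part:
            raise ValueError(f"component {part!r} contains separator {sep!r}")
    return sep.join(parts)

authors = ["Author1", "Author2", "Author-Three"]
print(make_key(authors, 2009, ["just", "another", "article"]))
# Author1+Author2+Author-Three+2009+just+another+article
print(make_key(authors, 2009, ["jat"]))
# Author1+Author2+Author-Three+2009+jat
```

Since max_authors is a parameter, a different number of authors only
requires a different argument; any further authors are dropped without an
EtAl marker.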

Daniel

-- 
http://www.google.com/profiles/daniel.mietchen


