[Wikisource-l] [ol-discuss] Open Library, Wikisource, and cleaning and translating OCR of Classics

Samuel Klein meta.sj at gmail.com
Wed Aug 12 15:05:44 UTC 2009


Keeping the wikisource list in cc: .   SJ


On Wed, Aug 12, 2009 at 11:05 AM, Samuel Klein <meta.sj at gmail.com> wrote:
> On Wed, Aug 12, 2009 at 10:14 AM, Karen Coyle <kcoyle at kcoyle.net> wrote:
>> Just a few comments on OL plans....
>
> Thank you!
>
>
>>>  * version history for manifestations (latest cleaned up version of a
>>> file) and expressions (latest cleaned up translation of a work)
>>>  ** links to manifestations archived elsewhere, if they are not
>>> mirrored by the OL/IA for some reason
>>>
>>
>> Is this referring to the metadata or the full text?
>
> Both.  (Both can be edited by people, or updated/cleaned by
> context-aware or cross-language info-retrieval scripts.)
>
>>>  * providing a namespace and format for collections and lists of
>>> works; as a normalized way of identifying collections in which a given
>>> work has been included.  This is slightly different in use, intent,
>>> and visualization than classification categories.  There might be a
>>> couple dozen subject categories for a complex work, but it could have
>>> hundreds of associations with collections, awards, designations, &c.
>>>
>>
>> Yes, this is part of the "lists" function, in development, although the
>> details have not been fully worked out. We've noted that the NY Times
>> has made its best seller lists available, so that makes sense as a
>> collection; Pulitzer prizes, Booker prize, etc. All of these should form
>> lists or collections within OL. Plus users should be able to create any
>> lists, bibliographies, etc.
>
> Excellent - do you have a link to recent discussion?
>
>
>> Adding discussion pages has been discussed. There are two things here:
>> discussion on OL about the books, and discussion about the OL project.
>> As for the latter, more than discussion perhaps we need a place where
>> people share uses of OL, changes they've made to OL (all of the
>> templates are editable by anyone, although you need to share those
>> edits... I don't think we've explained this well, and definitely haven't
>> done enough to foster a community of users.). A kind of community space.
>> Yes, this is really needed.
>
> So, why not make this one use of the OL wiki?  Discussion about a work
> and about the project will regularly overlap as style guidelines and
> community dynamics play out.
>
>
>> OL would like to show metadata in
>> the preferred language of the user. That presents lots of issues,
>> starting with the one of: what if there isn't any metadata in the
>> language of the user? But also how you do this AND give the user an idea
>> of the origins of the work (first publication date and place and
>> language). Wikipedia is able to do this because its data is created by
>> people. OL is working with metadata created for individual editions that
>> doesn't link easily to the work. Where there is a wikipedia entry for
>> the work OL may be able to use that to determine the origins, but in
>> many cases no such entry will be available.
>
> This seems solvable - define the style you'd recommend people create
> by hand where they have the time; and write scripts that can
> approximate this where there is limited data.  Script-assisted people
> can do tremendous amounts of work category by category.
>
>> In any case, all of this is being discussed and considered. Since email
>> is so non-sticky, would the OL blog be a good place to provide more of
>> this information and discussion?
>
> A blog isn't sufficiently sticky for my tastes -- limited permalinks,
> no version history or diffs, limited capacity for collaboration
> directly on ideas, texts, and overviews; poor namespace control for
> naming and classifying discussions; and limited interlinks between
> different posts/comments/contributors.
>
> Let's please use something at least as sticky as a wiki.  [NTS: we
> need a term for collaboration environments that parallels
> "Turing-complete" to describe anything that can mimic a set of basic
> wiki services.]
>
> SJ
>


