Date: Fri, 25 May 2012 20:28:58 +0200
From: emijrp <emijrp(a)gmail.com>
To: "Discussion list for the Wikidata project."
<wikidata-l(a)lists.wikimedia.org>
Subject: Re: [Wikidata-l] Who to talk to about integrating WorldCat
Message-ID:
<CAPgALA5wT6Xo7OEkzcbxzGYDnD9qe3K1WBAYF0x1LScTfFTbyg(a)mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
2012/5/25 Lydia Pintscher <lydia.pintscher(a)wikimedia.de>
>> A decision hasn't been made. We're looking at CC0 for the data to
>> begin with, but the community might decide to change this later.
>>
>>
>I think that CC-0 is a good choice.
The answer is that WorldCat is published under the Open Data Commons
Attribution license (ODC-BY).
For more information, see http://opendatacommons.org/category/odc-by/ .
I know CC0 is a tempting choice, but I hope a choice is made that can
accommodate willing partners.
Max Klein
kleinm(a)oclc.org
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT <http://code.google.com/p/avbot/> |
StatMediaWiki <http://statmediawiki.forja.rediris.es> |
WikiEvidens <http://code.google.com/p/wikievidens/> |
WikiPapers <http://wikipapers.referata.com> |
WikiTeam <http://code.google.com/p/wikiteam/>
Personal website: https://sites.google.com/site/emijrp/
Hello Wikidata Wizards,
Phoebe Ayers from the Board recommended I talk to you. My name is Max
Klein and I am the Wikipedian in Residence for OCLC. OCLC owns
WorldCat.org, the world's largest holder of library data, with 264 million
bibliographic records about books, journals, and other library items. We
would really like to partner with you as Wikidata is being built to
incorporate our data into your project.
What we can offer:
* WorldCat.org metadata http://www.worldcat.org/ .
o Typically, for any work we have most of the following: title,
authors, publisher, formats, summaries, editions, subjects, languages,
intended audience, all associated ISBNs, length, and abstract.
* APIs to this data http://oclc.org/developer/
o And some other cool APIs like xISBN, which, given any single ISBN,
returns the ISBNs of all editions of that book (see the sketch after
this list).
* Library finding tools
o When viewing a record on our site, we show you the closest library
which has that work, and links to reserve it for pick-up.
* The Virtual International Authority File (VIAF)
http://viaf.org/, which is an authoritative disambiguation file
o That means that we have certified data for disambiguating authors
* WorldCat Identities, an analytics site
http://www.worldcat.org/identities/
o For each author it gives metadata and analytics: alternative names,
significant dates, publication timelines, genres, roles, related
authors, and tag clouds of associated subjects.
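To give a feel for xISBN, here is a minimal sketch of querying it from
Python. The endpoint URL and JSON response shape below are assumptions
based on the public xid web service; treat them as illustrative rather
than a contract:

# Sketch: ask xISBN for the ISBNs of every edition of a work,
# starting from any single ISBN. Endpoint and response shape assumed.
import json
import urllib.request

def edition_isbns(isbn):
    url = ("http://xisbn.worldcat.org/webservices/xid/isbn/"
           + isbn + "?method=getEditions&format=json")
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # Each entry in "list" describes one edition and carries its ISBNs.
    return [i for entry in data.get("list", []) for i in entry.get("isbn", [])]

print(edition_isbns("0316066524"))  # any one ISBN of the work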
What's in it for us:
* We are a not-for-profit member cooperative. Our mission is
"Connecting people to knowledge through library cooperation."
* Since I work in the research group, for now this is just a
research project.
o If at some point this goes live - and you want to - we'd like to
integrate the "find it at a library near me" feature, which means
click-throughs for us.
The ideas:
There are a lot of possibilities, and I'd like to hear your input. These
are the first few that I've come up with.
* Making infoboxes for each book or author that contain all
their metadata.
o Ready to incorporate into all language projects.
* Using authority files to disambiguate or link works to their
creators.
o Solving DABs
* Using our analytics (e.g. author timelines) as Wikidata data
types to transclude.
o Curating articles with easy-to-include dynamic analytics
* Populating or creating works/author pages with their
algorithmically-derived history and details.
o Extremely experimental semantic work.
I'm raring to get this collaboration going. I know Wikidata
is at an early stage, and we are willing to accommodate you.
Send me any feedback or ideas,
Max Klein
Wikipedian in Residence
kleinm(a)oclc.org
+17074787023
Hi folks,
I made some effort to draft a Wikidata overview in the Hungarian Wikipedia.
Although it is not quite up to date, it is a useful overview for the
Hungarian-language community. As far as I remember, this was the first
article on Wikidata in a national language other than German, preceding the
current system. I created a soft redirect to it on Meta, which was eradicated
by FuzzyBot (
http://meta.wikimedia.org/w/index.php?title=Wikidata%2Fhu&diff=3728047&oldi…)
and now I cannot edit it at all. I don't want to re-translate the whole thing
week after week, and seemingly nobody else will do it, so could we get the
link to huwiki back, please? I think a long English text that is identical to
the English version, completely up to date, and yet states that it *is* the
Hungarian version is much less useful for Hungarian speakers than a link to
the Hungarian article that backlinks to the original for those who want to
read the current version in English. This solution may be uniform and easy
for a bot to handle, but it is also very aggressive: whoever made it had no
regard for the previous content, and I am really disappointed by this. So
what is the way to have at least a sentence at the top of this nice text that
leads to the useful version and will not be ruined by the bot again and again?
--
Bináris
Hello all
I have created a first preliminary draft of how data items from the Wikidata
repository may be accessed and rendered on the client wiki, e.g. to make infoboxes.
https://meta.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax
It would be great if you could have a look and let us know about any
unclear points, omissions, or other flaws - and of course about your ideas
of how to do this.
Getting this right is an important part of implementing phase 2 of the Wikidata
project, and so I feel it's important to start drafting and discussing early.
Having a powerful but not overly complex way to create infoboxes etc. from
Wikidata items is very important for the acceptance of Wikidata on the client
wikis, I believe.
Thanks,
Daniel
--
Daniel Kinzler, Software Architect
Wikimedia Deutschland e.V. | Eisenacher Straße 2 | 10777 Berlin
http://wikimedia.de | Tel. (030) 219 158 260
Wikimedia Deutschland - Society for the Promotion of Free Knowledge e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by
the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.
Heya folks,
I just wanted to let you know that the next Wikidata office hours will
be on Tuesday and Wednesday next week. Denny and I will be around on
IRC in #wikimedia-wikidata to answer any questions you might have and to
discuss. I assume there will be a few more questions than usual now
that we have a demo system. Logs will be published afterwards.
English: May 29 at 16:30 UTC
(http://www.timeanddate.com/worldclock/fixedtime.html?hour=16&min=30&sec=0&d…)
German: May 30 at 16:30 UTC
(http://www.timeanddate.com/worldclock/fixedtime.html?hour=16&min=30&sec=0&d…)
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Obentrautstr. 72
10963 Berlin
www.wikimedia.de
Hi folks,
I'm about to head off to Berlin, but I've been quietly toiling away on
this project for a while (technically since 2005) and I figured it's
time to start making a little noise.
The project is the JsonData extension for MediaWiki, which allows for
validated editing of JSON files using a form in a MediaWiki
installation. You may have stumbled upon my jsonwidget JavaScript
editor, which has been around a long time (since 2005... gah, it's
almost as old as a second grader!). When I wrote that, I always
envisioned using it on a wiki, but I never finished the job until
recently. I also have never had a strong desire to maintain my own
schema format, so over this past weekend I converted it to use a subset
of Kris Zyp's JSON Schema draft-03.
Validation is done both server-side (PHP) and client-side
(JavaScript). The license for both the server-side validator and the
client-side library is three-clause BSD. The rest of the MediaWiki
extension is GPLv2.
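For readers who have not seen draft-03, here is a minimal sketch of what
validation against a draft-03 schema looks like. The extension itself does
this in PHP and JavaScript; Python's third-party jsonschema package (which
ships a Draft3Validator) is used here only to keep the example short:

# Sketch: validating a JSON document against a JSON Schema draft-03 schema.
from jsonschema import Draft3Validator

# In draft-03, "required" is a boolean on the property itself,
# not a list on the parent object as in later drafts.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "required": True},
        "year": {"type": "integer", "minimum": 1450},
    },
}

validator = Draft3Validator(schema)
document = {"title": "Example", "year": 1998}
for error in validator.iter_errors(document):
    print(error.message)  # no output means the document validates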
The UI is still quite 2005-ish, unfortunately. There are other
draft-03-compatible editors out there (one recently announced is built on
jQuery UI) which I may take a crack at.
More information about the project can be found here:
http://jsonwidget.org/wiki/JsonData
I've got a test install there, so feel free to play around.
I haven't made a big point of getting this checked into Gerrit, but I
plan on doing that.
This is a personal project (done in my own time) and not at all
something I'm doing on behalf of the WMF. It's also not (yet?)
affiliated with the Wikidata project, and I'm happy to change names to
avoid confusion.
Rob
p.s. the security question answer, should you hit it, is "wikitech-l".
It's amazing how quickly wikis start attracting spam.
Hi emijrp,
I will have to investigate the license compatibility. Looking at the
Wikidata page on Meta, it only mentions that "The content in Wikidata
will be made available under a free license." [1] Have any further
decisions been made on this topic, or will any CC license do?
[1] http://meta.wikimedia.org/wiki/Wikidata/Notes/Requirements
Best,
Max Klein
Wikipedian in Residence
kleinm(a)oclc.org
+17074787023
Hi Max;
Is WorldCat aware of the Wikidata free license?
I think that it is a good idea to include that info in Wikidata, along with
the DOI metadata, which is currently generated like this[1] (only available
in the English Wikipedia).
Regards,
emijrp
[1] https://en.wikipedia.org/wiki/Category:Cite_doi_templates
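For the curious: metadata of the kind behind those templates can also be
fetched programmatically, via content negotiation against doi.org (standard
CrossRef/DataCite behavior). A minimal sketch, with a placeholder DOI:

# Sketch: fetch bibliographic metadata for a DOI as CSL JSON via content
# negotiation against doi.org. The DOI below is only a placeholder.
import json
import urllib.request

def doi_metadata(doi):
    req = urllib.request.Request(
        "https://doi.org/" + doi,
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

meta = doi_metadata("10.1000/xyz123")  # placeholder DOI
print(meta.get("title"))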
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT <http://code.google.com/p/avbot/> |
StatMediaWiki <http://statmediawiki.forja.rediris.es> |
WikiEvidens <http://code.google.com/p/wikievidens/> |
WikiPapers <http://wikipapers.referata.com> |
WikiTeam <http://code.google.com/p/wikiteam/>
Personal website: https://sites.google.com/site/emijrp/
Hey :)
This is the Wikidata summary of the week before 2012-05-25.
You can also find this at
http://meta.wikimedia.org/wiki/Wikidata/Status_updates/2012_05_25
= Development =
* Reworked our git workflow:
http://meta.wikimedia.org/wiki/Wikidata/Development/Process
* Wrote tests for client, lib and repo
* Introduced namespaces in lib and repo
* Finished work on diff functionality which now supports recursion
* Started work on handling changes on the client and appropriately
caching info there
* Reviewed code
* Tested, debugged, and fixed problems in the Wikidata core branch
* Fixed & refactored some minor stuff in the UI
* Fixed broken UI tests
* Got git-review running on Windows
* Trying to make all PHPUnit tests run without failures or errors
under Windows
See http://meta.wikimedia.org/wiki/Wikidata/Development/Current_sprint
for what we’re working on next.
You can follow commits at
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/extensions/WikidataClient…
and https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/extensions/WikidataRepo.g…
(stuff not merged into master yet isn't included there)
= Discussions/Press =
* Proposal for inclusion syntax:
http://meta.wikimedia.org/wiki/Wikidata/Notes/Inclusion_syntax
* Feedback on the demo system in various places has been very positive so far
= Events =
See http://meta.wikimedia.org/wiki/Wikidata/Events
* LinuxTag
* upcoming: IRC office hours next Tuesday and Wednesday
* upcoming: MediaWiki hackathon (and Wikidata/RENDER event)
* upcoming: Admin Convention of the German Wikipedia
= Other stuff =
* preparations for the hackathon and Wikidata/RENDER event next week
Anything to add? Please share! :)
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Community Communications for Wikidata
Wikimedia Deutschland e.V.
Obentrautstr. 72
10963 Berlin
www.wikimedia.de
I think this exchange with an online language expert may be of
interest to us. The discussion comes from the Unicode mailing list.
Best
Glenda.
At 16:29 25/05/2012, Philippe Verdy wrote:
2012/5/25 glenda.jerome <info(a)execlub.org>:
> At 00:15 24/05/2012, Philippe Verdy wrote:
>>
>>
>>
http://blog.wikimedia.org/2012/05/21/introducing-designs-for-the-universal-…
>> Enjoy :-)
>> -- Philippe.
>
>
> Philippe,
>
> could you please elaborate a little on your comment? It would be of
> real help to people working on the Wikidata project. I personally have no
> idea why it would be a good/bad idea, as I see pros and cons to the
> concept.
> The real issue we have is to associate a term, its display, its exact
> meaning, and its value with a linguistic community. My fear is that a term
> may have orthographic and meaning variances and that such a simplified
> approach may erode this diversity. This is not an area where we are
> experts, but I am aware that what may be decided for a MediaWiki project
> may influence many things.
>
> Thank you!
> Glenda
I did not want to be obtrusive when posting this URL. But the current
project at the Wikimedia Foundation certainly has many valuable
contributors available who could criticize the concepts and help
define them more precisely.
But if the intent is to create associations of terms across languages,
the difficulty is several orders of magnitude higher than doing that
only within the same language.
For now there is manual tuning to help create interwiki links, but we
all know that these links are not (and cannot be) completely
transitive, if only because articles on each wiki have different
structures and coverage of their subject (and it is unrealistic to have
as many articles with exactly the same coverage across all languages).
So articles split their subjects differently: some wikis will group
several subjects with similar names on the same page, while on others a
subject will be so developed that it is split into separate pages.
In my opinion, the only safe interlingual classification that works is
the one used on Commons (but Commons does not cover all subjects found
on other wikis; it is poor on history, linguistics, politics, music,
cinema, books and comics, and advanced sciences, for example, because
it is difficult to illustrate many concepts distinctly with images,
audio, or video).
So instead, I would build a dedicated search engine that can search
through existing interwiki links and find associations, and possibly
develop some automatic translation tools, or text extractors that can
be used to compare content from the various projects.
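As a concrete starting point for such an engine, here is a minimal sketch
that harvests a page's existing interwiki links through the MediaWiki API's
langlinks module. The module and parameters are real; how one would then
score or compare the linked articles is deliberately left open:

# Sketch: harvest a page's existing interwiki (language) links via the
# MediaWiki API's prop=langlinks module, as raw material for the kind of
# association-finding engine described above.
import json
import urllib.request
from urllib.parse import urlencode

def langlinks(title, wiki="en.wikipedia.org"):
    params = urlencode({
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllimit": "max",
        "format": "json",
    })
    req = urllib.request.Request(
        "https://" + wiki + "/w/api.php?" + params,
        headers={"User-Agent": "interwiki-association-sketch/0.1"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    page = next(iter(data["query"]["pages"].values()))
    # Map each language code to the linked article's title in that language.
    return {ll["lang"]: ll["*"] for ll in page.get("langlinks", [])}

print(langlinks("Berlin"))  # e.g. {'de': 'Berlin', 'fr': 'Berlin', ...}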