On Fri, Jul 16, 2010 at 9:26 AM, Aubrey <zanni.andrea84@gmail.com> wrote:
> ...
> The issue of metadata is nonetheless serious, because it's one of the most important flaws of Wikisource: not applying standards (e.g. Dublin Core) and not having proper tools to export, import, and harvest metadata still makes us amateurs, at least in the eyes of "real" digital libraries (which focus mainly on the metadata side, and usually provide either texts or images; it is really rare to have both).
This is also a problem with Wikimedia Commons.
http://strategy.wikimedia.org/wiki/Proposal:Dublin_Core
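To make the proposal concrete: the Dublin Core 1.1 element set is just fifteen flat fields, so a first export could be a plain oai_dc record per page. A minimal sketch in Python (the field values and the mapping are invented for illustration; the two namespace URIs are the real DC/OAI ones):

    import xml.etree.ElementTree as ET

    DC = "http://purl.org/dc/elements/1.1/"
    OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"

    # Hypothetical header fields for one Wikisource text.
    fields = {
        "title": "On the Origin of Species",
        "creator": "Darwin, Charles",
        "date": "1859",
        "language": "en",
        "type": "Text",
    }

    ET.register_namespace("dc", DC)
    ET.register_namespace("oai_dc", OAI_DC)
    record = ET.Element("{%s}dc" % OAI_DC)
    for name, value in fields.items():
        ET.SubElement(record, "{%s}%s" % (DC, name)).text = value
    print(ET.tostring(record, encoding="unicode"))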
> The Perseus Project is an *amazing* project, but I regard them as far ahead of us. The PP is actually a Virtual Research Environment, with tools for scholars and researchers to study texts (concordances and similar things).
I agree. I would go further; PP will always be far more advanced than a MediaWiki system.
They store their data in TEI format, which is an extremely rich standard. Wikisource can incorporate some of the TEI concepts by using templates, but I doubt we could ever be a leader in this area, nor do I think we want to.
http://en.wikipedia.org/wiki/Text_Encoding_Initiative
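To sketch what "incorporating TEI concepts by using templates" could look like: the parameters of a header template map almost one-to-one onto the minimal mandatory teiHeader. A rough Python sketch (the template field names are hypothetical; teiHeader/fileDesc/titleStmt and friends are the standard TEI elements):

    import xml.etree.ElementTree as ET

    def tei_header(template_params):
        """Build a minimal TEI header from header-template parameters."""
        header = ET.Element("teiHeader")
        file_desc = ET.SubElement(header, "fileDesc")
        title_stmt = ET.SubElement(file_desc, "titleStmt")
        ET.SubElement(title_stmt, "title").text = template_params["title"]
        ET.SubElement(title_stmt, "author").text = template_params["author"]
        pub_stmt = ET.SubElement(file_desc, "publicationStmt")
        ET.SubElement(pub_stmt, "publisher").text = "Wikisource"
        source_desc = ET.SubElement(file_desc, "sourceDesc")
        ET.SubElement(source_desc, "bibl").text = template_params["source"]
        return header

    # Hypothetical parameters of one header-template call:
    print(ET.tostring(tei_header({
        "title": "The Time Machine",
        "author": "H. G. Wells",
        "source": "London: Heinemann, 1895",
    }), encoding="unicode"))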
> It happens that I just finished my Master's thesis on collaborative digital libraries for scholars (in the Italian context), and the outcome is quite clear: researchers do want collaborative tools in DLs, but wiki systems are too simple and (right now) too naive to really help scholars in their work (and there are a lot of other issues I'm not going to explain here).
> I would love to have PP people involved in collaboration with Wikisource; I just don't know if this is possible.
I agree. PP and Wikisource are too different, and have very little to gain from each other. PP wants to improve/increase collaboration & community, but not at the expense of losing the quality of their metadata. Wikisource wants to improve quality and metadata, but not at the expense of our ability to collaborate and our simple editing interface.
Again, interoperability is the first step towards useful 'collaboration'; i.e. Wikisource needs to export TEI. Then we could feed our poorly annotated/described sources into PP, where the academic community would then add the metadata.
TEI export would also be useful for Wiktionary.
> Just one more thing: why has this awesome thread not been linked to source-l? Probably that would have been the best place to discuss it.
;-)
-- John Vandenberg
John Vandenberg, 16/07/2010 03:03:
>> The Perseus Project is an *amazing* project, but I regard them as far ahead of us. The PP is actually a Virtual Research Environment, with tools for scholars and researchers to study texts (concordances and similar things).
> I agree. I would go further; PP will always be far more advanced than a MediaWiki system.
> They store their data in TEI format, which is an extremely rich standard. Wikisource can incorporate some of the TEI concepts by using templates, but I doubt we could ever be a leader in this area, nor do I think we want to.
The en.wiki article doesn't really explain anything: there is a great article in Italian, though. But apart from TEI, how do they work? Is there someone who can describe their workflow, as André did for PGDP?
> Again, interoperability is the first step towards useful 'collaboration'; i.e. Wikisource needs to export TEI. Then we could feed our poorly annotated/described sources into PP, where the academic community would then add the metadata.
> TEI export would also be useful for Wiktionary.
OK, TEI is awesome, but if you want metadata it's simpler to implement it directly in Wikisource than to export our content to some other project and hope they will add the metadata. Aubrey mentioned LiberLiber, the Italian PG/PGDP; not only are they moving to MediaWiki, they also want to implement an extension to add OAI-PMH. The project is named Open Alexandria because with OAI-PMH you can create a huge global digital library even if the actual "shelves" are scattered across several sources: http://www.openalexandria.org/online/?lang=en
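For anyone who hasn't used it: OAI-PMH is just HTTP GET with a few fixed verbs, which is why a harvester can be tiny. A sketch (the endpoint URL is hypothetical; verb=ListRecords and metadataPrefix=oai_dc are standard OAI-PMH 2.0 parameters):

    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    ENDPOINT = "http://example.org/oai"   # hypothetical repository
    NS = {
        "oai": "http://www.openarchives.org/OAI/2.0/",
        "dc": "http://purl.org/dc/elements/1.1/",
    }

    # One ListRecords request; a real harvester also follows resumptionToken.
    query = urllib.parse.urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    with urllib.request.urlopen(ENDPOINT + "?" + query) as response:
        tree = ET.parse(response)

    for rec in tree.iterfind(".//oai:record", NS):
        identifier = rec.findtext(".//oai:identifier", default="?", namespaces=NS)
        title = rec.findtext(".//dc:title", default="(no title)", namespaces=NS)
        print(identifier, "-", title)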
>> Just one more thing: why has this awesome thread not been linked to source-l? Probably that would have been the best place to discuss it.
> ;-)
Changing the subject is good, but if you reply to another thread, mail clients will merge the new thread into the old one, and if the latter is heated and has a low signal-to-noise ratio, the former will be buried in it...
Nemo
On 07/16/2010 03:03 AM, John Vandenberg wrote:
> This is also a problem with Wikimedia Commons.
It would take 5 minutes to implement the suggested support for "Dublin Core" metadata in the MediaWiki software. Why is this in the 5-year strategic plan? Actually adding the metadata content to all pages could be a 5-year project, but not the software support.
On my website, http://runeberg.org/, I have included Dublin Core metadata for more than a decade (see for example http://runeberg.org/affdyaff/ ), but nobody has ever thanked me for this, and none of the pages were included in WorldCat because of it. Visitors find the website through Google full-text search or by direct links from Wikipedia and other sites; nobody reports having discovered the site through the DC metadata.
I must be missing something. Why is DC metadata so important? Could you give an example of a website that does this and actually benefits from it?
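For reference, "including DC metadata" in plain HTML usually means a schema link plus DC.* meta tags in the head. A sketch of that convention (the values are invented; the actual markup on runeberg.org may differ in detail):

    # Emit the usual DC-in-HTML convention: a schema <link> plus
    # "DC.*" <meta> tags. Values are invented for illustration.
    dc = {"title": "An example title", "creator": "An example author", "language": "sv"}

    print('<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">')
    for name, value in dc.items():
        print('<meta name="DC.%s" content="%s">' % (name, value))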
On Fri, Jul 16, 2010 at 3:36 PM, Lars Aronsson <lars@aronsson.se> wrote:
> On 07/16/2010 03:03 AM, John Vandenberg wrote:
>> This is also a problem with Wikimedia Commons.
> It would take 5 minutes to implement the suggested support for "Dublin Core" metadata in the MediaWiki software.
Ah, so you have a MediaWiki extension that can scale to Wikipedia project sizes?
Otherwise, this will involve parsing template calls. I have tried that. Repeatedly. It is not done in 5 minutes.
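To illustrate the problem: template calls nest, so anything regex-shaped gives wrong answers immediately, and a correct parser has to track brace depth (and that is before parser functions, comments, and nowiki). A toy demonstration on made-up wikitext:

    import re

    wikitext = "{{book|title=Foo|date={{date|1859|11|24}}}}"

    # Naive regex: stops at the first "}}", splitting the outer call.
    print(re.findall(r"\{\{(.*?)\}\}", wikitext))
    # -> ['book|title=Foo|date={{date|1859|11|24']   (wrong)

    # Correct extraction has to track brace depth.
    def top_level_templates(text):
        depth, start, found = 0, 0, []
        i = 0
        while i < len(text) - 1:
            if text[i:i + 2] == "{{":
                if depth == 0:
                    start = i
                depth += 1
                i += 2
            elif text[i:i + 2] == "}}" and depth:
                depth -= 1
                if depth == 0:
                    found.append(text[start:i + 2])
                i += 2
            else:
                i += 1
        return found

    print(top_level_templates(wikitext))
    # -> ['{{book|title=Foo|date={{date|1859|11|24}}}}']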
Magnus