Hi, please forgive me in advance if my technical knowledge isn't up to speed and I don't entirely understand the issues.
From what I've seen, there is currently an effort to allow database functions for metadata about Wikisource texts.
That in itself is of course very cool.
My question is about the actual texts themselves (not just the metadata describing them): Often there is more than one good way to format and present a single text. In the current Wikimedia environment this forces the community to decide on which format for any given text is the best one for readers and users. But in a true database environment it would be possible to tag all of the different possibilities within the text itself, allowing the reader or user to choose which format best serves his or her needs.
Is this possibility related to any of the current discussions?
Dovi
This wasn't discussed as far as I am aware. I don't really have a good idea about what *isn't* possible. But as no one else had answered you, I wanted you to at least know that the idea was not explored in the discussions I participated in.
Birgitte SB
On 2012-08-03 10:57, Dovi Jacobs wrote:
Often there is more than one good way to format and present a single text. In the current Wikimedia environment this forces the community to decide on which format for any given text is the best one for readers and users. But in a true database environment it would be possible to tag all of the different possibilities within the text itself, allowing the reader or user to choose which format best serves his or her needs.
Can you give some examples of texts where the choice between two different formats has been a problem in Wikisource?
Hi everybody,
Are you thinking of something like the multiple-layer model proposed by Aubrey in this excellent slide? --> http://en.wikipedia.org/w/index.php?title=File:Wikisource_2012_-_Aubrey.pdf&...
I'm a co-author on a recent paper in which we used Wikisource templates to implement a basic annotation system within Wikisource [1]. While we tried to make sure that the annotations were clearly demarcated from the transcribed text (see http://en.wikisource.org/wiki/Field_Notes_of_Junius_Henderson/Notebook_1#Sil... for an example), it would be awesome to have a pure-text transcription somewhere with the ability to add annotations in a user-friendly manner. As Aubrey pointed out, such an annotation layer would allow all kinds of interesting content to be added on to Wikisource pages, from comments to critical literature to TEI mark-up. However, I don't know if this is possible without a *complete* overhaul of MediaWiki/ProofreadPage specifically for Wikisource, which I don't think we have the resources for at the moment.
cheers, Gaurav http://en.wikisource.org/wiki/User:Gaurav
[1] http://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2012-07-30/Recent_...
There is an annotation JavaScript tool that I saw briefly at Wikimania. My initial reaction to it was to imagine it as a gadget that could be turned on for those who like annotation and not bother those who do not. Basically that it might solve the debate at en.WS. I really don't know how feasible the technical side of making this a MediaWiki gadget is, however.
http://okfnlabs.org/annotator/
On your point about an overhaul of MediaWiki/ProofreadPage specifically for Wikisource: ProofreadPage IS specifically for Wikisource as it is. It may need to be overhauled in order to play nicely with the visual editor. I cannot speak with confidence about anything technical, but if there are things wanting in ProofreadPage, we might want to make sure they are put on the table while this is all being looked at.
Birgitte SB
2012/8/6 Birgitte_sb@yahoo.com
On Aug 5, 2012, at 7:19 PM, Gaurav Vaidya gaurav@ggvaidya.com wrote:
Are you thinking of something like the multiple-layer model proposed by
Aubrey in this excellent slide? --> http://en.wikipedia.org/w/index.php?title=File:Wikisource_2012_-_Aubrey.pdf&...
:-) Let me spend a few words on that. I developed that "onion" model a few years ago as an outcome of my MA dissertation (here: http://unibo.academia.edu/AndreaZanni/Papers/800397/Collaboratory_Digital_Libraries_for_Humanities_in_the_Italian_context if you're interested), for a generic collaborative digital library. The idea of layers emerged from the research and interviews I did (it's not very original, but I think it is useful).
Obviously, I thought a lot about Wikisource (it is by far the digital library that I know and understand best), but we are still bound by a sort of NPOV that other DLs (digital libraries) may not have.
As you can see (image: http://en.wikipedia.org/w/index.php?title=File:Wikisource_2012_-_Aubrey.pdf&page=19), the onion model presents different layers, starting from the scan of the book and growing into more abstract layers (annotation, comments).
The idea is that you start from the most neutral and simple "dimension" (the image of a page), and then you have software that lets you develop along other dimensions which are less neutral and more complex. Moreover, each layer supports a particular form of collaboration. After the image, you have collaborative transcription. After that, a TEI mark-up layer (TEI is a form of mark-up used in Digital Humanities, for philologists). After that, we could have hypertextual links, and links to critical literature, and then (personal?) annotated versions, and then comments. As you can see, the core layers are more "collaborative", while the later ones are more "social".
In Wikisource, we collaboratively edit every page, which is indeed unique, and we "converge" on a "neutral" version of, for example, a transcription (or policy, or annotated page). We are a wiki and there is no fork, and very little space for human interpretation (or different interpretations), i.e. original research.
We could indeed think about a DL in which we have a single scan, but different transcriptions, different annotated versions, different TEI mark-up versions, different hyperlinks, different comments. The further we go up in the layers, the more important "human interpretation" becomes, and the software of this onion-like DL should allow forks and different versions...
I'm not really sure if Wikisource is the right candidate for using this layer model (I think in part it is), but surely we could think about improving Wikisource to the point where we have the first layers of the model (scans, collaborative transcriptions, maybe TEI mark-up, hyperlinks, etc.).
Moreover, there is another point, which is the one we were all discussing. Using different layers is a technological challenge, also because it could allow users to retrieve the transcription without the TEI mark-up, or the transcription without the hyperlinks, or without annotations. Having different layers, and thus NOT having inline mark-up as we currently do, could be a huge improvement in terms of accessibility. We could develop Wikisource in many different ways, with people making annotations that some users may not be interested in (but others may be). We could have a plain-text version of a book, an annotated one, a hyperlinked one, a TEI mark-up one. We could have different versions of each layer.
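To make the layer idea above a little more concrete, here is a minimal Python sketch of "stand-off" layers: the transcription is kept as plain text, and each layer only stores character offsets plus the markers to insert. Everything here is an assumption for illustration (the names Span and render, the layer names "tei" and "links", and the sample sentence); none of it is an existing MediaWiki or ProofreadPage API. For simplicity it also assumes that spans never overlap or share a start or end offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One stand-off marker pair; offsets index into the plain transcription."""
    start: int       # inclusive offset
    end: int         # exclusive offset
    open_tag: str    # text inserted before the span
    close_tag: str   # text inserted after the span


def render(text: str, layers: dict[str, list[Span]], enabled: set[str]) -> str:
    """Return the transcription with only the enabled layers applied.

    Assumes spans do not overlap or share boundaries (hypothetical
    simplification); the plain text itself is never modified.
    """
    events = []
    for name in sorted(enabled):
        for span in layers.get(name, []):
            events.append((span.start, 1, span.open_tag))
            events.append((span.end, 0, span.close_tag))
    # At the same offset, closing markers (0) sort before opening markers (1).
    events.sort(key=lambda e: (e[0], e[1]))

    out = []
    cursor = 0
    for offset, _, marker in events:
        out.append(text[cursor:offset])
        out.append(marker)
        cursor = offset
    out.append(text[cursor:])
    return "".join(out)


# Hypothetical sample data: one plain transcription, two independent layers.
text = "Call me Ishmael."
layers = {
    "tei": [Span(0, 4, '<hi rend="italic">', "</hi>")],
    "links": [Span(8, 15, "[[", "]]")],
}

plain = render(text, layers, set())
linked = render(text, layers, {"links"})
full = render(text, layers, {"tei", "links"})
```

With no layers enabled the reader gets the untouched transcription; enabling "links" or "tei" adds only that layer's markers. Different versions of a layer could be stored under different keys in the same way.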
Again, I'm still not sure whether Wikisource is the right candidate for this (maybe DPLA is), but I certainly see Wikisource working fine with the most neutral and collaborative layers, and I would like them to be interoperable with other DLs and services to come, which could focus on annotation and the like (for example, the Open Knowledge Foundation tools like the annotator BirgitteSB linked before).
I hope this explains a bit of the idea I had in mind.
Aubrey
wikisource-l@lists.wikimedia.org