Dear all, the volcanic User:Alex brollo is working on "dictionaries", i.e. generating lists of the words used in Pages and works. For example, https://it.wikisource.org/wiki/Discussioni_pagina:Il_cavallarizzo.djvu/2 holds a word list for the book Il cavallarizzo.
I bet that in the next few years (with more books, more users, Wikidata, and the world domination led by Wikisource) we will see more and more of these experiments. A list of the words used in an old book could help customize OCR and typo-correction tools, for example.
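To make the typo-correction idea concrete, here is a minimal sketch, assuming the page transcriptions are available as plain strings. The names `build_word_list` and `suspicious_words` and the tokenizing regex are hypothetical illustrations, not the tool used on it.source.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters (Unicode-aware), optionally joined by apostrophes.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")

def build_word_list(pages):
    """Count every lowercased word form used across the given page texts."""
    counts = Counter()
    for text in pages:
        counts.update(w.lower() for w in WORD_RE.findall(text))
    return counts

def suspicious_words(page_text, word_list, min_count=1):
    """Return, in order of appearance, the words on a page that occur
    fewer than min_count times in the work's word list."""
    flagged = []
    for w in WORD_RE.findall(page_text):
        lw = w.lower()
        if word_list[lw] < min_count and lw not in flagged:
            flagged.append(lw)
    return flagged

# Usage: words missing from the list are candidates for OCR errors or typos.
word_list = build_word_list(["Il cavallo corre", "il cavallo salta"])
print(suspicious_words("Il cavaIlo corre", word_list))
```

A flagged word is only a candidate: it may be a genuine rare word, so a human still has to verify it.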
Moreover, we will have Wikidata, and we may need to store some metadata (e.g. page numbers, or metadata about images and scans) in Wikisource. Lua could help us build tools for creating automatic indexes, or a textual version in ns0 (e.g. precompiling the pagelist tag...)
So, the question is: do we, the Wikisource communities, want a new Data namespace? How do you like the idea? Would you want it to use the Wikibase extension, or to be just a normal namespace?
I'm sure you will find this mail confusing, but I think we need something; I just don't know what it is :-)
Aubrey
AFAIK, Dario made a proposal in that regard some time ago: https://meta.wikimedia.org/wiki/DataNamespace
No idea what the current status is. Maybe you could ask him?
Cheers, Micru
On Fri, Dec 20, 2013 at 4:09 PM, Andrea Zanni zanni.andrea84@gmail.com wrote: [...]
Wikisource-l mailing list Wikisource-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikisource-l
I will.
That, though, is a proposal for a "special" namespace, technologically speaking. We could have a brand-new namespace using Wikibase, or that DataNamespace, or a simple blank new namespace, like the Wikisource namespace or ns0.
Right now, we just need a place to put some data.
Aubrey
On Fri, Dec 20, 2013 at 4:19 PM, David Cuenca dacuetu@gmail.com wrote: [...]
I think data that is calculated (like this word-list data) does not belong on the wiki. It is by definition regenerable, and not meant for gradual improvement by human labor. A simple _reference_ to such data for books/pages where it exists would be enough, e.g. a gadget or something that points to the word-list data on an external (even static!) Web server.
A.
On Fri, Dec 20, 2013 at 7:35 AM, Andrea Zanni zanni.andrea84@gmail.com wrote: [...]
Asaf, I respectfully disagree. Work-specific word lists are useful proofreading tools only when they can be edited often, both removing and adding words. Any added word must be verified. Distributed Proofreaders implements a powerful spell-checking tool, but, given the different policy of that website, only sysops have permission to edit dictionaries. Wiki philosophy, on the contrary, dictates that most data entry doesn't need any special privilege; therefore, dictionaries must be simple for an ordinary contributor to edit.
We are currently testing a tool on it.source that builds work-specific dictionaries page by page; the user has full control over them and can edit them very simply, as many times as they like. Such dictionaries change quickly and are far from "static". The problem is how to share these contributions; downloading and uploading them from wiki pages is IMHO the simplest way to get this result.
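As a rough illustration of that download/edit/upload cycle, here is a minimal sketch. It assumes the dictionary is stored as wiki page text with one word per line; that storage format and all the function names are hypothetical, not a description of the tool being tested.

```python
def parse_word_page(wikitext):
    """Read a word list stored one word per line; blank lines are ignored."""
    return {line.strip() for line in wikitext.splitlines() if line.strip()}

def apply_edits(words, add=(), remove=()):
    """Return a new set with a contributor's additions and removals applied."""
    return (set(words) | set(add)) - set(remove)

def serialize_word_page(words):
    """Write the list back in sorted order so that page diffs stay readable."""
    return "\n".join(sorted(words))

# Usage: a contributor removes a misread word and adds a verified one.
page_text = "cavallo\ncavaIlo\n\nmorso"
words = parse_word_page(page_text)
edited = apply_edits(words, add=["briglia"], remove=["cavaIlo"])
print(serialize_word_page(edited))
```

Keeping the page as plain sorted text means any contributor can also edit it by hand, with the usual page history recording who changed what.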
Alex
2013/12/24 Asaf Bartov abartov@wikimedia.org [...]
-- Asaf Bartov Wikimedia Foundation http://www.wikimediafoundation.org