Hello,
This is now off topic for foundation-l.
Better to continue this on wikisource-l.
Klaus Graf wrote:
>> Hello,
>>
>> I agree with Ray here, and I think that Klaus' mail does not describe
>> the situation accurately. The French Wikisource has the greatest number of
>> scanned texts so far,
>
> Is there any proof for this claim?
http://wikisource.org/wiki/Wikisource:ProofreadPage_Statistics
lists 40,043 pages for fr.ws and 16,939 for de.ws.
http://fr.wikisource.org/wiki/Wikisource:Livres_disponibles_en_mode_page
lists 62,326 scanned pages (not all of them have been OCRed and proofread
yet, and I am not sure that this page is up to date).
>> but does not make it mandatory to have them in order to
>> publish a text there. It is only a suggestion, which many contributors
>> follow.
>>
>> I think that the important point is not scanned texts, but notation on
>> whether and how the texts are proofread by editors, whatever means the
>> editors use to proofread the texts.
>
> I have been monitoring discussions on digitization projects as an archival
> professional for years. It is standard to provide not only e-texts but also
> scans. Wikisource demands no scans when a permanent web address (e.g. a
> library project) for the scans outside Commons is given.
>
> I think the average quality of other Wikisource branches is very poor.
> In most cases there is no source given: one cannot know which source
> is used, and for scholarly purposes the e-text is worthless.
I think we have already had this discussion, but the
misunderstanding continues.
There are two different issues here. It is important not to mix them:
1. Scans provided alongside texts.
2. Notation of quality.
Quality is not an absolute value. It is relative to the sources
available for a given text. Quality does not have the same meaning for a
text from 1920 and a text from the 15th century. So one should not talk
about quality, but about notation of quality.
I agree that giving the source is important and should be part of a
quality notation. The most important thing is to have a clear notation so
that readers know how and by whom the texts have been proofread. Scans alone
are not proof of quality, but they help to achieve better quality. They
are not the only way to get good-quality texts. Some texts may be
proofread by several contributors, and so be of very good quality, while
Wikisource might not be able to host scanned images if a public domain
edition is not easily available.
> Klaus Graf
Regards,
Yann
--
http://www.non-violence.org/ | Site collaboratif sur la non-violence
http://www.forget-me.net/ | Alternatives sur le Net
http://fr.wikisource.org/ | Bibliothèque libre
http://wikilivres.info | Documents libres
Hello,
I agree with Ray here, and I think that Klaus' mail does not describe
the situation accurately. The French Wikisource has the greatest number of
scanned texts so far, but does not make it mandatory to have them in order
to publish a text there. It is only a suggestion, which many contributors
follow.
I think that the important point is not scanned texts, but notation on
whether and how the texts are proofread by editors, whatever means the
editors use to proofread the texts.
Regards,
Yann
Ray Saintonge wrote:
> Klaus Graf wrote:
>> One can add de.Wikisource which is a project making historical Public
>> Domain texts in German available with high quality standards. These
>> standards are NOT (yet) shared by the other Wikisource projects, see
>> also
>>
>> http://wikisource.org/wiki/Wikisource:Scriptorium#The_huge_leap
>>
>> Only de.Wikisource demands scanned texts (or digital photos) for
>> contributions, most other Wikisource branches have a lot of texts
>> which are unsourced. De.Wikisource has notes commenting the texts for
>> lots of texts.
> Much of what you suggest is not about to happen any time soon. The fact
> is that splitting up the Wikisource communities created circumstances
> where each Wikisource develops its own standards and criteria. The
> discussions which may have taken place leading up to these policies on
> de:Wikisource either did not take place elsewhere or did not have the
> same results. At best, there have been few determined contributors
> willing to lead by example. Simply telling people to do these things gets nowhere.
>
> There is a clear benefit to having our texts supported by
> scanned texts, but many of us who may work well with textual material,
> may not have the same technical ease when working with images of any
> kind. Even adding a small number of illustrations that may otherwise
> accompany a text can be a problematic chore. I am quite prepared to
> identify where I found my material, but I am quite content to have
> others do the work of digitization.
>
> Commenting on texts is a great idea that could stand to be encouraged more.
>
> I agree with the premise that we cannot hope to keep up with the massive
> digitization projects undertaken by well-funded institutions, but a lot
> of restrictive requirements is self-defeating. The need is really for a
> balance somewhere between the minutiae of quality and the feeling that
> contributors are seeing a lot of growth. Wikisource will not become
> great by trying to beat the big institutions at their own game. Thus we
> need to ask ourselves what we can do to add value that no other similar
> project can do. In doing so we cannot afford to get bogged down in
> standardized headings that do not allow for easy expansion without a
> complete understanding of transclusion technology. We need to allow our
> imaginations the freedom to find new ways of connecting data without
> being tied to formal structures that are so strict as to close off these
> paths.
>
> Ec
On Jan 3, 2008 11:28 AM, geni <geniice(a)gmail.com> wrote:
> On 02/01/2008, Erik Moeller <erik(a)wikimedia.org> wrote:
> > FYI
> >
> > http://www.zotero.org/blog/zotero-and-the-internet-archive-join-forces/
> >
>
> Nothing new there; there are a number of sites that accept text dumps of
> copyvios already.
and there are many projects that don't permit copyvios.
Internet Archive's digital repository is at least as clean as
Commons/Wikisource, probably more so.
Zotero is backed by a university, develops open source software, and
is receiving grants from a notable funding body: I doubt that they
have neglected to consider copyright.
--
John
FYI
http://www.zotero.org/blog/zotero-and-the-internet-archive-join-forces/
Recently the Andrew W. Mellon Foundation awarded the Center for
History and New Media and the Internet Archive $1.2 million to
develop new services that will aid scholarly sharing, collaboration,
citation, and annotation.
In 2008, users will be able to drag and drop items into the "Zotero
Commons" (a dedicated part of the Internet Archive's servers) through
an icon in the left column.
Zotero Commons
Items donated to the Commons will be stored in subdirectories of the
Commons named for the donors. In addition to encouraging donations to
the commons (since those donating will receive credit for their
contributions), this feature will also enable users to identify others
who are working with and/or annotating the same content, fostering new
collaboration opportunities. The benefits to the scholarly community
of the Commons are thus threefold:
1) The availability of permanent, persistent archival, off-site
storage for long-term management and use of digital content.
2) The ability to share resources publicly for easy access by other scholars.
3) The simplified discovery of new, related resources and potential
collaboration opportunities.
As an added incentive to donate to the Commons, the Internet Archive
will provide free OCR for your contributions and send you the
transcribed text to help you search your personal library.
In addition, modifications will be made to Zotero to make it easier
for researchers to select already archived files and web pages from
the Internet Archive's existing collections rather than saving local
copies. This will enable better referencing of "born digital" items
and allow for the collaborative annotation of web documents.
Zotero Commons and Zotero 2.0
Zotero 2.0 will allow you to sync your library's metadata to the Zotero Server.
With Zotero Commons you will be able to contribute public domain
images, texts, audio and other files.
In turn, the Internet Archive will send you any text extracted from
donated documents.
--
Erik Möller
And Wikisource?
Yann
-------- Original Message --------
Subject: Wikis Go Printable: Wikimedia/PediaPress/COL/OSI partnership
FYI - please forward :-)
WIKIS GO PRINTABLE
New open source technology will bring content from Wikipedia,
WikiEducator, and other wikis to the world of paper.
DECEMBER 13, 2007 - ST. PETERSBURG, FLORIDA: The Wikimedia Foundation
today announced a partnership that will make it possible to obtain
high quality print and word processor copies of articles from
Wikipedia and other wiki educational resources. The development of the
underlying open source software is supported by the Open Society
Institute (www.soros.org) and the Commonwealth of Learning
(www.col.org), and led by PediaPress.com, a start-up company based in
Germany.
"This technology is of key strategic importance to the cause of free
education world-wide," said Sue Gardner, Executive Director of the
Wikimedia Foundation. "It will make it possible to use and remix wiki
content for a variety of purposes, both in the developing and the
developed world, in areas with connectivity and without."
Deployment of the technology will happen in three stages. The first
stage, launched today, is a public beta test running on
WikiEducator.org of functionality for remixing collections of wiki
pages and downloading them in the PDF format. WikiEducator is a
project hosted by the Commonwealth of Learning and uses the same wiki
technology as Wikipedia.
"These tools have the potential to transform and improve the way we
author and share distance education materials, textbooks and other
learning resources -- I'm thrilled that the WikiEducator will be the
first online community to implement them," said Wayne Mackintosh,
Ph.D., an education specialist for the Commonwealth of Learning and
founder of the WikiEducator project.
The second stage, planned for early 2008, will be the deployment of
the technology on the projects hosted by the Wikimedia Foundation,
including Wikipedia. At this point, users will also be given the
option to order printed copies of wiki content directly from
PediaPress.com. "The integration into Wikipedia will be a milestone
for print-on-demand technology. Users will literally be empowered to
print their own encyclopedias", according to Heiko Hees, product
manager at PediaPress.com.
The third stage, planned for mid-2008, will be the addition of the
OpenDocument format for word processors to the list of export formats.
"Imagine that you want to use a set of wiki articles in the classroom.
By supporting the OpenDocument format, we will make it easy for
educators to customize and remix content before printing and
distributing it from any desktop computer," Sue Gardner explained.
This work is funded through a US$40,000 grant by the Open Society
Institute.
The technology developed through this cooperation will be available
under an open source license, free for anyone to use for any purpose.
It ties into the MediaWiki platform, the open source technology that
runs Wikipedia. As a result, thousands of wiki platforms around the
world will have the option of providing the same services to their
users.
CONTACTS
For more information, please contact Sandra Ordonez at (727) 231-0101,
or email her at: sordonez AT wikimedia DOT org
ABOUT THE WIKIMEDIA FOUNDATION
The Wikimedia Foundation Inc. is a nonprofit charitable organization
dedicated to encouraging the growth, development and distribution of
free, multilingual content, and to providing the full content of these
wiki-based projects to the public free of charge. It operates some of
the largest collaboratively-edited reference projects in the world,
including Wikipedia, one of the world's 10 most-visited websites. The
Foundation was created in 2003 by Jimmy Wales, the founder of
Wikipedia.
ABOUT THE COMMONWEALTH OF LEARNING
COL is an intergovernmental organisation created by Commonwealth Heads
of Government to encourage the development and sharing of open
learning and distance education knowledge, resources and technologies.
ABOUT THE OPEN SOCIETY INSTITUTE
The Open Society Institute (OSI), a private operating and grantmaking
foundation, aims to shape public policy to promote democratic
governance, human rights, and economic, legal, and social reform. On a
local level, OSI implements a range of initiatives to support the rule
of law, education, public health, and independent media. At the same
time, OSI works to build alliances across borders and continents on
issues such as combating corruption and rights abuses.
ABOUT PEDIAPRESS
PediaPress is a startup creating technology and services that make it
easy to derive printed books from wiki content. The company is located
in Mainz, Germany and has entered a long term partnership with the
Wikimedia Foundation.
From [Wikimediafr-l].
In short, Emmanuel would like to develop free software for
post-processing scanned images.
Yann
-------- Original Message --------
Subject: [Wikimediafr-l] [WIKISOURCE] Scan. de livres
Date: Thu, 13 Dec 2007 11:29:46 +0100
From: Emmanuel Engelhart <emmanuel.engelhart (at) wikimedia (dot) fr>
Hi,
I am looking for a free software solution to batch-process scanned
pages of text from books.
I am looking for a program that can:
* Remove black borders and, more generally, shadows (transparency
effect)
* Straighten the text with a simple rotation.
* Automatically re-crop the page (for example, 50px margins
around the text block)
Unfortunately I cannot find anything, so I am considering tackling
the problem myself.
Since I am completely new to this problem, any remark or
advice is welcome.
Technically, I am planning to write something in Script-Fu (the
Scheme language for The GIMP). That would make it a free tool that is
easy to modify and cross-platform; what is more, I would not have to
deal with the image-processing algorithms themselves.
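To illustrate the automatic re-cropping step listed above, here is a
minimal sketch in plain Python rather than Script-Fu. It is not
Emmanuel's implementation: the function names, the grayscale
list-of-rows page representation, and the darkness threshold are all
assumptions made for this example, and it assumes that black scanner
borders have already been removed so that the only dark pixels left
belong to the text block.

```python
def crop_box(pixels, margin=50, dark_threshold=128):
    """Return (left, top, right, bottom) around the text block plus a margin.

    pixels is a list of rows of grayscale values (0 = black, 255 = white).
    The box is half-open (right and bottom are exclusive) and is clamped
    to the page so the margin never reaches outside the image.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0

    # Rows that contain at least one dark (text) pixel.
    rows = [y for y, row in enumerate(pixels)
            if any(value < dark_threshold for value in row)]
    if not rows:
        return (0, 0, width, height)  # blank page: keep everything

    # Columns that contain at least one dark pixel.
    cols = [x for x in range(width)
            if any(pixels[y][x] < dark_threshold for y in rows)]

    left = max(min(cols) - margin, 0)
    top = max(min(rows) - margin, 0)
    right = min(max(cols) + 1 + margin, width)
    bottom = min(max(rows) + 1 + margin, height)
    return (left, top, right, bottom)


def crop(pixels, box):
    """Cut the page down to the given (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]
```

With a real scan the rows would come from an image library, and the
threshold would need tuning per scanner; border removal and deskewing
are separate steps that this sketch does not attempt.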
If you also have sample pages scanned at 300 dpi (or even
150), I would be glad to have them (send them to me directly in
private). That would let me evaluate a solution on a wide range of
examples.
Emmanuel
PS: I have just bought an AVISION FB6080E
(http://www.avision.de/?content=FB6080E). The advantage of this scanner
is that it scans right from the edge of its glass. Without being a
cure-all, it lets you scan books with much less strain on the binding
and, incidentally, much less shadow (in the image) near the binding,
which makes it possible to scan (fairly thick) books that could not be
scanned otherwise (with a typical flatbed scanner). The whole thing
works perfectly under Linux... with a little effort ;)
All -
we've set up a blog to accompany our annual fundraiser. The headlines
from the blog will be featured in the sitenotice:
http://whygive.wikimedia.org/
I'd like to invite you to submit posts to the blog. These posts can be
provocative, and should give compelling reasons to support the
Wikimedia Foundation. You can draft posts here:
http://meta.wikimedia.org/wiki/Fundraising_2007/Why_Give_blog
Posts will be selected by a number of people: Cary Bass (our Volunteer
Coordinator), Sandy Ordonez (our Communications Manager), Sue Gardner
(Special Advisor to the Board), and myself. We'll probably try to have
a new post every 2-3 days at least.
Once again, the point of these posts is first and foremost to invite
the general public to donate. :-) Please submit stories in this
general spirit.
If you are willing to act as a moderator for comments to vet out spam
& trolling, please contact Cary Bass at <cbass AT wikimedia DOT org>.
For now, this is an experiment and as such, only in English. We will
set up blogs in other languages if this one has a measurable impact on
our fundraising.
Thanks for any and all help!
Erik Möller
Member of the Board