Lars,
I think we agree on what needs to happen. The only thing I am not sure of is where you would like to see the work take place. I have raised versions of this issue with the Open Library list, which I copy again here (along with the people I know who work on that fine project - hello, Peter and Rebecca). This is why I listed it below as a good group to collaborate with.
However, the project I have in mind for OCR cleaning and translation needs to:
- accept public comments and annotation about the substance or use of a work (the wiki covering their millions of metadata entries is very low traffic and used mainly to address metadata issues in their records)
- handle OCR as editable content, or translations of same
- provide a universal ID for a work, with which comments and translations can be associated (see https://blueprints.launchpad.net/openlibrary/+spec/global-work-ids)
- handle citations, with the possibility of developing something like WikiCite
Let's take a practical example. A classics professor I know (Greg Crane, copied here) has scans of primary source materials, some with approximate or hand-polished OCR, waiting to be uploaded and converted into a useful online resource for editors, translators, and classicists around the world.
Where should he and his students post that material?
Wherever they end up, the primary article about each work would surely link out to the OL and WS pages for that work (where one exists).
(Plus you would have to motivate why a copy of OpenLibrary should go into the English Wikisource and not the German or French one.)
I don't understand what you mean -- English source materials and metadata go on en:ws, German on de:ws, &c. How is this different from what happens today?
SJ
On Mon, Aug 3, 2009 at 1:18 PM, Lars Aronsson <lars@aronsson.se> wrote:
Samuel Klein wrote (in two messages):
* A wiki for book metadata, with an entry for every published work, statistics about its use and siblings, and discussion about its usefulness as a citation (a collaboration with OpenLibrary, merging WikiCite ideas)
I could see this happening on Wikisource.
Why could you not see this happening within the existing OpenLibrary? Is there anything wrong with that project? It sounds to me as if you would just copy (fork) all their book data, but for what gain?
(Plus you would have to motivate why a copy of OpenLibrary should go into the English Wikisource and not the German or French one.)
-- Lars Aronsson (lars@aronsson.se) Aronsson Datateknik - http://aronsson.se
Samuel Klein, 11/08/2009 07:00:
Let's take a practical example. A classics professor I know (Greg Crane, copied here) has scans of primary source materials, some with approximate or hand-polished OCR, waiting to be uploaded and converted into a useful online resource for editors, translators, and classicists around the world.
Where should he and his students post that material?
Slovene Wikisource did something similar: http://meta.wikimedia.org/wiki/Slovene_student_projects_in_Wikipedia_and_Wik...
Nemo
On Tue, Aug 11, 2009 at 3:00 PM, Samuel Klein <meta.sj@gmail.com> wrote:
... Let's take a practical example. A classics professor I know (Greg Crane, copied here) has scans of primary source materials, some with approximate or hand-polished OCR, waiting to be uploaded and converted into a useful online resource for editors, translators, and classicists around the world.
Where should he and his students post that material?
I am a bit confused. Are these texts currently hosted at the Perseus Digital Library?
If so, they are already a useful online resource. ;-)
If they would like to see these primary sources pushed into the Wikimedia community, they would need to upload the images (or DjVu) onto Commons, and the text onto Wikisource where the distributed proofreading software resides.
We can work with them to import a few texts in order to demonstrate our technology and preferred methods, and then they can decide whether they are happy with this technology, the community, and the potential for translations and commentary.
I made a start on creating a Perseus-to-Wikisource importer about a year ago...!
Or they can upload the DjVu to the Internet Archive, or a similar repository, and see where it goes from there.
Wherever they end up, the primary article about each work would surely link out to the OL and WS pages for that work (where one exists).
Wikisource has been adding OCLC numbers to pages, and adding links to archive.org when the djvu files came from there (these links contain an archive.org identifier). There are also links to LibraryThing and Open Library; we have very few rules ;-)
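A minimal sketch (not an existing Wikisource tool) of how those identifiers map to stable outbound links; the URL patterns are the public ones for each catalog, the OCLC number is invented, and the archive.org identifier is the Pindar volume mentioned later in this thread:

```python
# Illustrative only: map the identifiers named above to each catalog's
# public URL pattern. The OCLC number is invented for the example.
CATALOG_URLS = {
    "oclc": "https://www.worldcat.org/oclc/{}",        # OCLC number
    "olid": "https://openlibrary.org/books/{}",        # Open Library edition id
    "iarchive": "https://archive.org/details/{}",      # Internet Archive item id
    "ltwork": "https://www.librarything.com/work/{}",  # LibraryThing work id
}

def catalog_links(identifiers):
    """Return one stable outbound URL per identifier present on a page."""
    return [CATALOG_URLS[key].format(value)
            for key, value in identifiers.items() if key in CATALOG_URLS]

print(catalog_links({"oclc": "12345678",
                     "iarchive": "olympianpythiano00pinduoft"}))
```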
-- John Vandenberg
On Tue, Aug 11, 2009 at 7:32 AM, John Vandenberg <jayvdb@gmail.com> wrote:
I am a bit confused. Are these texts currently hosted at the Perseus Digital Library?
If so, they are already a useful online resource. ;-)
If they would like to see these primary sources pushed into the Wikimedia community, they would need to upload the images (or DjVu) onto Commons, and the text onto Wikisource where the distributed proofreading software resides.
I see CC-NC...
http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3atext%3a2003.02.0004
Too bad.
Magnus
On Tue, Aug 11, 2009 at 6:21 PM, Magnus Manske <magnusmanske@googlemail.com> wrote:
I see CC-NC...
http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3atext%3a2003.02.0004
Too bad.
Well, they can't copyright what is in the PD.
There is little about the XML in TEI format that can be called "creative", and any non-factual markup can be easily stripped out.
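As a rough illustration of how little stands between a TEI file and plain text, a minimal sketch using only the Python standard library; the filename is hypothetical, standing in for a saved copy of a Perseus xmlchunk like the one linked below:

```python
# Minimal sketch: strip TEI/XML markup down to bare character data.
# "perseus_chunk.xml" is a hypothetical local copy of a TEI file.
import xml.etree.ElementTree as ET

def tei_to_text(path):
    """Parse the file and keep only its text content, in document order."""
    root = ET.parse(path).getroot()
    # itertext() walks the element tree yielding character data while
    # discarding all tags and attributes (the "non-factual markup").
    return " ".join(" ".join(root.itertext()).split())

# print(tei_to_text("perseus_chunk.xml"))
```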
I remember now ... it was in March/April 2008 that I was looking at this, for the Pindar odes, and a djvu with pagescans is on archive.org.
http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0101 http://www.archive.org/details/olympianpythiano00pinduoft
The Perseus etext doesn't appear to include the 125 pages which have the complete Greek texts. (btw, here is our unverified original source: http://el.wikisource.org/wiki/%CE%9F%CE%BB%CF%85%CE%BC%CF%80%CE%B9%CF%8C%CE%... )
However, the commentary is all there, with pagination in the TEI, so it is easy to marry the text with the images.
(warning: 850kb xml file, followed by medium res. image) http://www.perseus.tufts.edu/hopper/xmlchunk?doc=Perseus%3Atext%3A1999.04.01.... http://www.archive.org/stream/olympianpythiano00pinduoft#page/124/mode/2up
-- John Vandenberg
Samuel Klein wrote:
I think we agree on what needs to happen. The only thing I am not sure of is where you would like to see the work take place.
I'm not so sure we agree. I think we're talking about two different things.
This thread started out with a discussion of why it is so hard to start new projects within the Wikimedia Foundation. My stance is that projects like OpenStreetMap.org and OpenLibrary.org are doing fine as they are, and there is no need to duplicate their effort within the WMF. The example you gave was this:
* A wiki for book metadata, with an entry for every published work, statistics about its use and siblings, and discussion about its usefulness as a citation (a collaboration with OpenLibrary, merging WikiCite ideas)
To me, that sounds exactly like what OpenLibrary already does (or could be doing in the near term), so why even set up a new project that would collaborate with it? Later you added:
I could see this happening on Wikisource.
That's when I asked why this couldn't be done inside OpenLibrary.
I added:
(Plus you would have to motivate why a copy of OpenLibrary should go into the English Wikisource and not the German or French one.)
You replied:
I don't understand what you mean -- English source materials and metadata go on en:ws, German on de:ws, &c. How is this different from what happens today?
I was talking about the metadata for all books ever published, including the Swedish translations of Mark Twain's works, which are part of Mark Twain's bibliography, of the translator's bibliography, of American literature, and of Swedish language literature. In OpenLibrary all of these are contained in one project. In Wikisource, they are split in one section for English and another section for Swedish. That division makes sense for the contents of the book, but not for the book metadata.
Now you write:
However, the project I have in mind for OCR cleaning and translation needs to
That is a change of subject. That sounds just like what Wikisource (or PGDP.net) is about. OCR cleaning is one thing, but it is an entirely different thing to set up "a wiki for book metadata, with an entry for every published work". So which of these two project ideas are we talking about?
Every book ever published means more than 10 million records. (It probably means more than 100 million records.) OCR cleaning attracts hundreds or a few thousand volunteers, which is sufficient to take on thousands of books, but not millions.
Google scanned millions of books already, but I haven't heard of any plans for cleaning all that OCR text.
Let's take a practical example. A classics professor I know (Greg Crane, copied here) has scans of primary source materials, some with approximate or hand-polished OCR, waiting to be uploaded and converted into a useful online resource for editors, translators, and classicists around the world.
Where should he and his students post that material?
On Wikisource. What's stopping them?
On Tue, Aug 11, 2009 at 9:16 PM, Lars Aronsson <lars@aronsson.se> wrote:
Let's take a practical example. A classics professor I know (Greg Crane, copied here) has scans of primary source materials, some with approximate or hand-polished OCR, waiting to be uploaded and converted into a useful online resource for editors, translators, and classicists around the world.
Where should he and his students post that material?
On Wikisource. What's stopping them?
Greg: does Wikisource seem like the right place to post and revise OCR to you? If not, where? If so, what's stopping you?
I'm not so sure we agree. I think we're talking about two different things.
This thread started out with a discussion of why it is so hard to start new projects within the Wikimedia Foundation. My stance is that projects like OpenStreetMap.org and OpenLibrary.org are doing fine as they are, and there is no need to duplicate their effort within the WMF. The example you gave was this:
I agree that there's no point in duplicating existing functionality. The best solution is probably for OL to include this explicitly in their scope and add the necessary functionality. I suggested this on the OL mailing list in March. http://mail.archive.org/pipermail/ol-discuss/2009-March/000391.html
* A wiki for book metadata, with an entry for every published work, statistics about its use and siblings, and discussion about its usefulness as a citation (a collaboration with OpenLibrary, merging WikiCite ideas)
To me, that sounds exactly like what OpenLibrary already does (or could be doing in the near term), so why even set up a new project that would collaborate with it? Later you added:
However, this is not what OL or its wiki do now. And OL is not run by its community; the community helps support the work of a centrally directed group. So there is only so much I feel I can contribute to the project by making suggestions. The wiki built into the fiber of OL is intentionally not used for general discussion.
I was talking about the metadata for all books ever published, including the Swedish translations of Mark Twain's works, which are part of Mark Twain's bibliography, of the translator's bibliography, of American literature, and of Swedish language literature. In OpenLibrary all of these are contained in one project. In Wikisource, they are split in one section for English and another section for Swedish. That division makes sense for the contents of the book, but not for the book metadata.
This is a problem that Wikisource needs to address, regardless of where the OpenLibrary metadata goes. It is similar to the Wiktionary problem of wanting some content - the array of translations of a single definition - to exist in one place and be transcluded in each language.
Now you write:
However, the project I have in mind for OCR cleaning and translation needs to
That is a change of subject. That sounds just like what Wikisource (or PGDP.net) is about. OCR cleaning is one thing, but it is an entirely different thing to set up "a wiki for book metadata, with an entry for every published work". So which of these two project ideas are we talking about?
They are closely related.
There needs to be a global authority file for works -- a [set of] universal identifier[s] for a given work in order for wikisource (as it currently stands) to link the German translation of the English transcription of OCR of the 1998 photos of the 1572 Rotterdam Codex... to its metadata entry [or entries].
I would prefer for this authority file to be wiki-like, as the Wikipedia authority file is, so that it supports renames, merges, and splits with version history and minimal overhead; hence I wish to see a wiki for this sort of metadata.
Currently OL does not quite provide this authority file, but it could. I do not know how easily.
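A minimal sketch of what such a wiki-like authority file could look like, assuming nothing about OL's internals: stable work IDs, with merges recorded as redirects plus an append-only history rather than destructive edits (all names here are invented):

```python
# Sketch of a wiki-like authority file for works: renames and merges
# leave redirects behind, so old identifiers keep resolving.
import itertools, time

class WorkAuthority:
    def __init__(self):
        self._ids = itertools.count(1)
        self.records = {}   # work_id -> metadata dict or {"redirect": id}
        self.history = []   # append-only log of (timestamp, event, detail)

    def _log(self, event, detail):
        self.history.append((time.time(), event, detail))

    def create(self, title):
        work_id = "W%d" % next(self._ids)
        self.records[work_id] = {"title": title}
        self._log("create", work_id)
        return work_id

    def merge(self, duplicate_id, canonical_id):
        # The duplicate becomes a redirect; existing links keep working.
        self.records[duplicate_id] = {"redirect": canonical_id}
        self._log("merge", (duplicate_id, canonical_id))

    def resolve(self, work_id):
        # Follow any redirect chain left behind by renames and merges.
        while "redirect" in self.records[work_id]:
            work_id = self.records[work_id]["redirect"]
        return work_id

auth = WorkAuthority()
a = auth.create("Olympian Odes (Pindar)")
b = auth.create("Olympian odes")   # accidental duplicate entry
auth.merge(b, a)
assert auth.resolve(b) == a        # old references still resolve
```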
Every book ever published means more than 10 million records. (It probably means more than 100 million records.) OCR cleaning attracts hundreds or a few thousand volunteers, which is sufficient to take on thousands of books, but not millions.
Focusing efforts on notable works with verifiable OCR, and using the sorts of helper tools that Greg's paper describes, I do not doubt that we could effectively clean and publish OCR for all primary sources that are actively used and referenced in scholarship today (and more besides). Though 'we' here is the world - certainly more than a few thousand volunteers have at least one book they would like to polish. Most of them are not currently Wikimedia contributors, that much is certain -- we don't provide any tools to make this work convenient or rewarding.
Google scanned millions of books already, but I haven't heard of any plans for cleaning all that OCR text.
Well, Google does not believe in distributed human effort. (This came up in a recent Knol thread as well.) I'm not sure that is the best comparison.
SJ
Hello,
This discussion is very interesting. I would like to make a summary, so that we can go further.
1. A database of all books ever published is one of the things still missing.
2. This needs massive collaboration by thousands of volunteers, so a wiki might be appropriate, however...
3. The data needs a structured web site, not a plain wiki like MediaWiki.
4. A big part of this data is already available, but scattered across various databases, in various languages, with various protocols, etc. So a big part of the work needs as much database-management knowledge as librarian knowledge.
5. What is most missing in these existing databases (IMO) is information about translations: nowhere is there a general database of translated works, at least not in English or French. It is very difficult to find out whether a translation exists for a given work. Wikisource has some of this information in interwiki links between work and author pages, but for a (very) small number of works and authors.
6. It would be best not to duplicate work in several places.
Personally I don't find OL very practical. Maybe I am too used to MediaWiki. ;oD
We still need to create something attractive to contributors and readers alike.
Yann
Yann & Sam
The problem is extraordinarily complex. A database of all "books" (and other media) ever published is beyond the joint capabilities of everyone interested. There are intermediate entities between "books" and "works", and important subordinate entities, such as "article", "chapter", and those like "poem", which could be at any of several levels. This is not a job for amateurs, unless they are prepared to first learn the actual standards of bibliographic description for different types of material, and to at least recognize the inter-relationships, and the many undefined areas. At research libraries, one allows a few years of training for a newcomer with just an MLS degree to work with a small subset of this. I have thirty years of experience in related areas of librarianship, and I know only enough to be aware of the problems. For an introduction to the current state of this, see http://www.rdaonline.org/constituencyreview/Phase1Chp17_11_2_08.pdf.
The difficulty of merging the many thousands of partially correct and incorrect sources of available data typically requires the manual resolution of each of the tens of millions of instances.
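A small sketch of why this resists automation, with invented thresholds: fuzzy matching can auto-merge near-identical records and flag a borderline band for review, but a translated title defeats naive string similarity entirely, so it lands in the manual queue:

```python
# Sketch: fuzzy record matching with a band reserved for human review.
# Thresholds and records are invented for illustration.
from difflib import SequenceMatcher

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def classify(rec_a, rec_b, auto=0.95, review=0.75):
    score = (similarity(rec_a["title"], rec_b["title"]) +
             similarity(rec_a["author"], rec_b["author"])) / 2
    if score >= auto:
        return "merge"          # safe to merge automatically
    if score >= review:
        return "human review"   # the long manual tail described above
    return "distinct"

# A Swedish translation of Mark Twain scores as "distinct" here, which is
# exactly the kind of record that needs a human (or better data) to link.
print(classify({"title": "Huckleberry Finns äventyr", "author": "Mark Twain"},
               {"title": "Adventures of Huckleberry Finn", "author": "Twain, Mark"}))
```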
OL rather than Wikimedia has the advantage that more of the people there understand the problems.
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
DGG, I appreciate your points. Would we be so motivated by this thread if it weren't a complex problem?
The fact that all of this is quite new, and that there are so many unknowns and gray areas, actually makes me consider it more likely that a body of Wikimedians, experienced with their own form of large-scale authority file coordination, is in a position to say something meaningful about how to achieve something similar for tens of millions of metadata records.
OL rather than Wikimedia has the advantage that more of the people there understand the problems.
In some areas that is certainly so. In others, Wikimedia communities have useful recent experience. I hope that those who understand these problems on both sides recognize the importance of sharing what they know openly -- and showing others how to understand them as well. We will not succeed as a global community if we say that this class of problems can only be solved by the limited group of people with an MLS and a few years of focused training. (how would you name the sort of training you mean here, btw?)
SJ
The training is typically an apprenticeship under the senior cataloging librarians.
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
The training is typically an apprenticeship under the senior...
To my regret, training/apprenticeship does not fit the "everyone can...", "be bold!" set of Wikimedia slogans/mottos. As for me, I would stand behind (vote for) training and apprenticeship.
Exactly. That is why Wikipedia is an inappropriate place for this project. It lacks sufficient stability. I think Wikipedia should go on being what it is, an almost completely open place, and projects which need disciplined long-term expertise should be organized separately. Wikipedia is a wonderful place to do many things, but not all.
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
(top-posting unravelled)
On Sat, Aug 15, 2009 at 1:12 PM, David Goodman <dgoodmanny@gmail.com> wrote:
On Sat, Aug 15, 2009 at 9:42 AM, Pavlo Shevelo <pavlo.shevelo@gmail.com> wrote:
On Sat, Aug 15, 2009 at 12:23 AM, David Goodman <dgoodmanny@gmail.com> wrote:
The training is typically an apprenticeship under the senior cataloging librarians.
To my regret, training/apprenticeship does not fit the "everyone can...", "be bold!" set of Wikimedia slogans/mottos. As for me, I would stand behind (vote for) training and apprenticeship.
Exactly. That is why Wikipedia is an inappropriate place for this project. It lacks sufficient stability. I think Wikipedia should go on being what it is, an almost completely open place, and projects which need disciplined long-term expertise should be organized separately. Wikipedia is a wonderful place to do many things, but not all.
The good news is that the broader Wikimedia community is not all like English Wikipedia, where "be bold" is often interpreted as demanding that the worst of anarchy be present in every situation. ;-)
Commons and Wikisource are able to build a sensible metadata layer around their collections using plain wiki text. We also have a project designed to add structure to this metadata.
http://meta.wikimedia.org/wiki/Wikicat
Either way, Wikisource and Commons will likely figure out a way to have Dublin Core and MODS records for their collections in the next few years.
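For concreteness, a hedged sketch of emitting a minimal Dublin Core record with the Python standard library; the field values come from works mentioned in this thread, and a real exporter would pull them from page metadata instead:

```python
# Sketch: serialize a minimal Dublin Core record as XML.
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

def dublin_core(meta):
    record = ET.Element("record")
    for field in ("title", "creator", "date", "language", "identifier"):
        if field in meta:
            ET.SubElement(record, "{%s}%s" % (DC_NS, field)).text = meta[field]
    return ET.tostring(record, encoding="unicode")

print(dublin_core({"title": "The Olympian and Pythian Odes",
                   "creator": "Pindar",
                   "language": "grc",
                   "identifier": "https://archive.org/details/olympianpythiano00pinduoft"}))
```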
-- John Vandenberg
Yes, I think they are quite capable of doing it, and should take the primary responsibility. What I think they are not capable of is extending it to every published book in the world.
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
David Goodman wrote:
The problem is extraordinarily complex. A database of all "books" (and other media) ever published is beyond the joint capabilities of everyone interested. There are intermediate entities between "books" and "works", and important subordinate entities, such as "article", "chapter", and those like "poem", which could be at any of several levels.
I've already been in raging arguments at Wikisource about the meaning of "work". The general tendency there has been to treat "work" as equivalent to a book or set of related books. This is highly problematical for periodicals, encyclopedias and dictionaries.
I do agree that the problem is complex, but there is a resistance on the part of many to accept standards that have been developed over a long period of time. Before the Category: namespace was made a part of Wikipedia there was considerable antipathy to adopting any kind of established category system. Muddling through from square one was the preferred option.
This is not a job for amateurs, unless they are prepared to first learn the actual standards of bibliographic description for different types of material, and to at least recognize the inter-relationships, and the many undefined areas. At research libraries, one allows a few years of training for a newcomer with just an MLS degree to work with a small subset of this. I have thirty years of experience in related areas of librarianship, and I know only enough to be aware of the problems.
This does not bode well! The big factor in wiki participation and success is amateur involvement and crowdsourcing. What are the PhDs doing to bridge the gap? What efforts are being made to at least bring the most significant points to the level of the general contributor? Saying that it takes several years to bring an MLS up to speed is not good enough. Knowledge needs to be brought to the level where it is most useful. When I went to school typing was not introduced as a subject until the 10th grade; my son learned keyboarding in the first grade.
Our wiki projects also have a superfluity of people with an IT background who do not do a very good job of bringing information to where it belongs, and end up creating a mind-boggling assortment of templates of questionable value. In theory they are trying to bring standardization and simplicity to the projects, but just as often they produce a simplistic and premature narrowing of the way knowledge is organized.
The difficulty of merging the many thousands of partially correct and incorrect sources of available data typically requires the manual resolution of each of the tens of millions of instances.
Yes, of course. There is no magic software that will do it all. Humans need to retain the right to decide the limits of technology.
OL rather than Wikimedia has the advantage that more of the people there understand the problems.
The librarians have their work cut out for them. They can help to build a system for the future, or they can let everyone muddle their way into a fuck-up.
Ec
Yann Forget wrote:
This discussion is very interesting. I would like to make a summary, so that we can go further.
- A database of all books ever published is one of the things still missing.
No, no, no, this is *not* missing. This is exactly the scope of OpenLibrary. Just as Wikipedia is not yet a complete encyclopedia, or OpenStreetMap is not yet a complete map of the world, some books are still missing from OpenLibrary's database, but it is a project aiming to compile a database of every book ever published.
Personally I don't find OL very practical. Maybe I am too used to MediaWiki. ;oD
And therefore, you would not try to improve OpenLibrary, but rather start an entirely new project based on MediaWiki? I'm afraid that this ("not invented here") is a common sentiment, and a major reason that we will get nowhere.
Hello,
Lars Aronsson wrote:
Yann Forget wrote:
This discussion is very interesting. I would like to make a summary, so that we can go further.
- A database of all books ever published is one of the things still missing.
No, no, no, this is *not* missing. This is exactly the scope of OpenLibrary. Just as Wikipedia is not yet a complete encyclopedia, or OpenStreetMap is not yet a complete map of the world, some books are still missing from OpenLibrary's database, but it is a project aiming to compile a database of every book ever published.
At least Wikipedia can say that it has the most complete encyclopedia, and OpenStreetMap the most complete free maps that ever existed. AFAIK OpenLibrary is very, very far from having anything comprehensive, though I am curious to see the figures. As I already said, the first steps would be to import existing databases, and Wikimedians are very good at this job.
Personally I don't find OL very practical. May be I am too much used too Mediawiki. ;oD
And therefore, you would not try to improve OpenLibrary, but rather start an entirely new project based on MediaWiki? I'm afraid that this ("not invented here") is a common sentiment, and a major reason that we will get nowhere.
You are wrong here. I was delighted to see a project like OL, and I inserted a few books and authors, but I have not been convinced. On books and authors, Wikimedia projects already have much more data than OL, and a lot of basic functionality is not available there: tagging two entries as identical (redirects), multilingualism, links between related entries (interwikis), etc.
I don't really care who hosts this "Universal Library", as long as it is freely available, with a powerful search engine and no restriction on reuse. What I say is that MediaWiki is really much better than anything else for massive online cooperative work. The most important point for such a project is building a community. OpenLibrary has certainly done a good job, but I don't see _a community_. The tools and the social environment available on Wikimedia projects are missing. I believe the social environment is a consequence of both the software and the leadership. Once the community exists, it may be self-sustaining if other conditions are met. OL lacks software as good as MediaWiki and a leader like Jimbo.
Yann
Yann Forget wrote:
As I already said, the first steps would be to import existing databases, and Wikimedians are very good at this job.
Do you have a bibliographic database (library catalog) of French literature that you can upload? How many records? Convincing libraries to donate copies of their catalogs has been a bottleneck for OpenLibrary.
Lars Aronsson wrote:
Yann Forget wrote:
As I already said, the first steps would be to import existing databases, and Wikimedians are very good at this job.
Do you have a bibliographic database (library catalog) of French literature that you can upload? How many records? Convincing libraries to donate copies of their catalogs has been a bottleneck for OpenLibrary.
No, I don't have such a database. There is a copyright on databases in Europe, which makes things complicated.
Probably we need to start with libraries which are already collaborating with open content projects. There was a GLAM-wiki meeting in Australia recently: there might be a possibility with an Australian library?
But even before that, if we could extract the data from Wikimedia projects, we could create a basic working frame. I have been collecting such data on Wikisource and Wikibooks, but the lack of a structured system is a bottleneck.
Examples:
1. Comprehensive bibliography of Gandhi in French: http://fr.wikibooks.org/wiki/Bibliographie_de_Gandhi
2. French translations of Russian authors: http://fr.wikisource.org/wiki/Discussion_Auteur:L%C3%A9on_Tolsto%C3%AF and http://fr.wikisource.org/wiki/Discussion_Auteur:F%C3%A9dor_Mikha%C3%AFlovitc...
Regards,
Yann
David Strauss did a quick implementation (basically a demo) of an OpenLibrary extension for MediaWiki. With a very small amount of code, he was able to search the OL (via AJAX), and when the user selected a given result, it populated a Citation template. What was nice was that when no results came up for a given search, an "add to Open Library" button brought you to the OL site to add your bibliographic information.
I think it would be easy to build upon this work and one could do a really powerful MW extension (and maybe some new templates, etc) that would allow people to contribute to both MW and OL simultaneously.
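For the curious, here is a minimal sketch of that round trip in Python rather than the demo's actual JavaScript/PHP. It assumes the present-day Open Library search endpoint and result fields (search.json, author_name, first_publish_year), which may well differ from what the 2009 extension used:

    import json, urllib.parse, urllib.request

    def ol_cite(query):
        # Search Open Library and format the first hit as a citation
        # template, roughly what the demo's AJAX handler did.
        url = ("https://openlibrary.org/search.json?"
               + urllib.parse.urlencode({"q": query, "limit": 1}))
        with urllib.request.urlopen(url) as resp:
            docs = json.load(resp).get("docs", [])
        if not docs:
            return None  # the demo offered an "add to Open Library" link here
        d = docs[0]
        return ("{{cite book | title=%s | author=%s | year=%s | ol=%s}}"
                % (d.get("title", ""),
                   (d.get("author_name") or ["?"])[0],
                   d.get("first_publish_year", "?"),
                   d["key"].rsplit("/", 1)[-1]))

    print(ol_cite("Moby Dick"))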
I think that OL should continue to do what it is trying to do. I also think people should be able to quickly and easily create new and important Wikimedia projects, especially when people are passionate about doing so. And I think that when different projects on the Internet have a lot of overlap in what they are trying to do, and share a similar philosophy and ethics, they should have their machines play nice with each other and make sharing (reading and writing) data between them easy.
-Josh
Joshua Gay wrote:
David Strauss did a quick implementation (basically a demo) of an OpenLibrary extension for MediaWiki. With a very small amount of code, he was able to search the OL (via AJAX), and when the user selected a given result, it populated a Citation template. What was nice was that when no results came up for a given search, an "add to Open Library" button brought you to the OL site to add your bibliographic information.
Interesting, I didn't know that. Is this demo available somewhere?
Yann
Interesting, I didn't know that. Is this demo available somewhere?
Here is a demo of it up and running: http://ol.fkbuild.com/w/index.php/Main_Page
Click edit, then click on the OL button on the toolbar and enter a search term.
Also, I think someone I shared this with had trouble getting it to work with IE -- I've only ever tried it on Firefox.
-Josh
I was going to say basically this, and then Josh said it better :) There's no special reason to reinvent the wheel; as DGG mentioned, there are several very difficult aspects of building a big bibliographic database (cataloging standards, getting the data in the first place, theoretical relationships between works) that the OL folks have tackled with some success; and there is value in having a project that focuses just on this hard problem. SJ is right that Wikimedian expertise lies in making large wikis functional and multilingual, and in augmenting data; but that doesn't mean such a project has to be a *Wikimedia* project. I think cooperation between the projects would be better. Interlinking with Wikip/media would raise OL's profile substantially, and would mean that WP had access to some sort of canonical catalog data; a win for everyone. -- Phoebe
Hello,
I started a proposal on the Strategy Wiki: http://strategy.wikimedia.org/wiki/Proposal:Building_a_database_of_all_books...
IMO this should be a joint project between OpenLibrary and Wikimedia. Both have an interest in, and the capacity for, working on this.
Regards,
Yann
Yann Forget wrote:
I started a proposal on the Strategy Wiki: http://strategy.wikimedia.org/wiki/Proposal:Building_a_database_of_all_books...
IMO this should be a joint project between OpenLibrary and Wikimedia.
Again, I don't understand why. What exactly is missing in OpenLibrary? Why does it need to be a new, joint project?
The page says "There is currently no database of all books ever published freely available." But OpenLibrary is a project already working towards exactly that goal. It's not done yet, and its methods are not yet fully developed. But neither would your new "joint" project be, for a very long time.
Wikipedia is also far from complete, far from containing "the sum of all human knowledge". But that doesn't create a need to start entirely new encyclopedia projects. It only means more contributors are needed in the existing Wikipedia.
Not only can OpenLibrary do it perfectly well without us; considering our rather inconsistent standards, they can probably do it better without us. We will just get in the way.
There is sufficient missing material in every Wikipedia, sufficient lack of coverage of areas outside the primary language zone and in earlier periods, sufficient unsourced material; sufficient need for updating articles, sufficient potentially free media to add, sufficient needed imagery to get; that we have more than enough work for all the volunteers we are likely to get.
To duplicate an existing project is particularly unproductive when the other project is doing it better than we are ever going to be able to. Yes, there are people here who could do it or learn to do it--but I think everyone here with that degree of bibliographic knowledge would be much better occupied in sourcing articles.
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
Hello, I have already answered some of these arguments earlier.
David Goodman wrote:
Not only can OpenLibrary do it perfectly well without us; considering our rather inconsistent standards, they can probably do it better without us. We will just get in the way.
The issue is not whether OpenLibrary is "doing it perfectly well without us", even if that were true. Currently what OpenLibrary does is not very useful for Wikimedia, and it partly duplicates what we do. Wikimedia also has important assets which OL doesn't have, so a collaboration seems obviously beneficial for both.
There is sufficient missing material in every Wikipedia, sufficient lack of coverage of areas outside the primary language zone and in earlier periods, sufficient unsourced material; sufficient need for updating articles, sufficient potentially free media to add, sufficient needed imagery to get; that we have more than enough work for all the volunteers we are likely to get.
To duplicate an existing project is particularly unproductive when the other project is doing it better than we are ever going to be able to. Yes, there are people here who could do it or learn to do it--but I think everyone here with that degree of bibliographic knowledge would be much better occupied in sourcing articles.
It is clear that you didn't even read my proposal. Please do so before raising objections. http://strategy.wikimedia.org/wiki/Proposal:Building_a_database_of_all_books...
I specifically wrote that my proposal does not necessarily mean starting a new project. I agree that working with Open Library is necessary for such a project, but I also say that if Wikimedia gets involved, it will be much more successful.
What you say here is completely the opposite of how Wikimedia projects work, i.e. with openness, and that's just what is missing in Open Library.
Regards, Yann
I have read your proposal. I continue to be of the opinion that we are not competent to do this. Since the proposal says that "this project requires as much database management knowledge as librarian knowledge," it confirms my opinion. You will never merge the data properly if you do not understand it.
You suggest 3 practical steps:
1. An extension for finding a book in OL is certainly doable--and it has been done, see [http://en.wikipedia.org/wiki/Wikipedia:Book_sources].
2. An OL field linking to WP -- as you say, this is already present.
3. An OL field linking to Wikisource. A very good project. It will be they who need to do it.
Agreed, we need translation information--I think this is a very important priority. It's not that hard to do a list or to add links that will be helpful, though not exact enough to be relied on in further work. That's probably a reasonable project, but it is very far from "a database of all books ever published".
But some of this is being done--see the frWP page for Moby Dick: http://fr.wikipedia.org/wiki/Moby_Dick (though it omits a number of the translations listed in the French Union Catalog, http://corail.sudoc.abes.fr/xslt/DB=2.1/CMD?ACT=SRCHA&IKT=8063&SRT=R...). I would not, however, warrant without seeing the items in hand, or reading an authoritative review, that they are all complete translations. The English page on the novel lists no translations; perhaps we could in practice assume that the interwiki links are sufficient. Perhaps that could be assumed on Wikisource also?
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
David Goodman wrote:
I have read your proposal. I continue to be of the opinion that we are not competent to do this. Since the proposal says that "this project requires as much database management knowledge as librarian knowledge," it confirms my opinion. You will never merge the data properly if you do not understand it.
That's exactly the point of making it a joint project: database gurus together with librarians. What I see is that OpenLibrary lacks some basic features that Wikimedia projects have had for a long time (on an Internet scale): easy redirects, interwikis, merging, a deletion process, etc. Some of these are planned for the next version of their software, but I still feel that sometimes they try to reinvent the wheel we already have.
OL claims to have 23 million book and author entries. However, many entries are duplicates of the same edition, not to mention of the same book, so the real number of unique entries is much lower. I also see that Wikisource has data which is not included in their database (and certainly Wikipedia does too, but I didn't really check).
You suggest 3 practical steps:
1. An extension for finding a book in OL is certainly doable--and it has been done, see [http://en.wikipedia.org/wiki/Wikipedia:Book_sources].
2. An OL field linking to WP -- as you say, this is already present.
3. An OL field linking to Wikisource. A very good project. It will be they who need to do it.
Yes, but I think we should go further than that. OpenLibrary has an API which would allow any relevant wiki article to be dynamically linked to their data, or an entry to be created every time new relevant data is added to a Wikimedia project. This is all about avoiding duplicate work between Wikimedia and OpenLibrary. It could also increase accuracy by double-checking facts (dates, name and title spellings, etc.) between our projects.
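A minimal sketch of that double-checking idea, assuming the Open Library search API's title parameter and first_publish_year field; the local record below is invented for illustration:

    import json, urllib.parse, urllib.request

    # An invented wiki-side record to check against Open Library.
    local = {"title": "Le tour du monde en quatre-vingts jours", "year": 1873}

    q = urllib.parse.urlencode({"title": local["title"], "limit": 1})
    with urllib.request.urlopen("https://openlibrary.org/search.json?" + q) as resp:
        docs = json.load(resp).get("docs", [])

    if docs:
        ol_year = docs[0].get("first_publish_year")
        if ol_year and ol_year != local["year"]:
            # Neither side is trusted blindly; a mismatch is flagged for humans.
            print("Year mismatch: wiki says %s, OL says %s" % (local["year"], ol_year))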
Agreed we need translation information--I think this is a very important priority. It's not that hard to do a list or to add links that will be helpful, though not exact enough to be relied on in further work. That's probably a reasonable project, but it is very far from "a database of all books ever published"
But some of this is being done--see the frWP page for Moby Dick: http://fr.wikipedia.org/wiki/Moby_Dick (though it omits a number of the translations listed in the French Union Catalog, http://corail.sudoc.abes.fr/xslt/DB=2.1/CMD?ACT=SRCHA&IKT=8063&SRT=R...). I would not, however, warrant without seeing the items in hand, or reading an authoritative review, that they are all complete translations. The English page on the novel lists no translations; perhaps we could in practice assume that the interwiki links are sufficient. Perhaps that could be assumed on Wikisource also?
That's another possible benefit: automatic list of works/editions/translations in a Wikipedia article.
You could add {{OpenLibrary|author=Jules Verne|lang=English}} and get a list of English translations of Jules Verne's works, directly imported from their database. The problem is that, right now, Wikimedia projects often have more accurate and more detailed information than OpenLibrary.
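A sketch of what such a template might do behind the scenes. Note that both the {{OpenLibrary}} template and the language:eng filter (MARC language codes in the search index) are assumptions for illustration, not existing features:

    import json, urllib.parse, urllib.request

    # Ask the Open Library search index for English-language works by Verne.
    q = urllib.parse.urlencode(
        {"q": 'author:"Jules Verne" language:eng', "limit": 10})
    with urllib.request.urlopen("https://openlibrary.org/search.json?" + q) as resp:
        docs = json.load(resp).get("docs", [])

    # Render the results as a wiki list, as the template would.
    for d in docs:
        print("* [[%s]] (first published %s)"
              % (d.get("title"), d.get("first_publish_year", "?")))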
Regards,
Yann
I have been re-reading their documentation, and they have it well in hand. We would do very well to confine ourselves to matching up the entries in the WMF projects alone. Some of the data in WMF is more accurate than some of the OL data, but I would not say this is a general rule. Far from it: the proportion of incomplete or inaccurate entries in enWP is probably well over 50% for books. (For journal articles it is better, because of a project to link to the PubMed information.) The accuracy & adequacy -- let alone completeness -- of the bibliographic information in WS is close to zero, except where there is an IA scan of the cover and title page, from which full bibliographic information might be derived, but cannot necessarily be taken at face value.
The unification of editions is non-trivial: using the algorithm you suggest, you will also pick up all works related to Verne, and additionally a mix of complete and partial translations, children's books, comic adaptations, and whatever else. Modern library metadata provides for this to a certain limited extent--unfortunately most of the entries in current online catalogs do not show full modern data, and many catalogs never had more than minimal records; Dublin Core is probably not generally considered fully up to the problem either, at least in any current implementation.
Those working on the OL side are fully aware of this. They have made the decision to work towards inclusion of all usable & obtainable data sets, rather than only the ones that can be immediately fully harmonized. This was a very wise decision, as the way in which the information is to be combined & related is not fully developed, and, if they were to wait for that, nothing would be entered. There will therefore be the problem of upgrading the records and the record structure in place--a problem that no large bibliographic system has ever fully handled properly--not that this incarnation of OL is likely to either. Bibliographers work for their time, not for all time to come.
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
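A toy illustration of why this unification is non-trivial: a naive key of normalized title plus author happily merges records that a bibliographer would keep apart (the records below are invented):

    import re
    from collections import defaultdict

    records = [
        {"title": "Twenty Thousand Leagues Under the Sea",
         "author": "Jules Verne", "year": 1872},   # early full translation
        {"title": "Twenty thousand leagues under the sea.",
         "author": "Verne, Jules", "year": 1917},  # same work, catalog style
        {"title": "Twenty Thousand Leagues Under the Sea",
         "author": "Jules Verne", "year": 1990},   # abridged children's edition
    ]

    def norm(s):
        return re.sub(r"[^a-z ]", "", s.lower()).strip()

    def author_key(a):
        parts = [p.strip() for p in a.split(",")]
        return norm(" ".join(reversed(parts)) if len(parts) == 2 else a)

    groups = defaultdict(list)
    for r in records:
        groups[(norm(r["title"]), author_key(r["author"]))].append(r)

    # All three records collapse into one group, although the 1990 abridgement
    # should arguably stay separate; a human still has to decide.
    for key, grp in groups.items():
        print(key, "->", len(grp), "records")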
You two seem to be talking past each other. Might I suggest that perhaps the quality of information on the OL and/or Wikipedia/Wikisource sites is rather different depending on whether you are reading in French or English? I don't know if this is the case, but it could explain the discrepancies between your experiences.
Birgitte SB
That could be it. We can't hide the fact that the French Wikisource is leaps and bounds ahead of the English Wikisource. ;-)
I also suspect that David is heavily biased by his predominantly English Wikipedia experience.
The underlying problem is that OL is approaching this from a traditional library perspective, and so is opening up slowly, and progress is slow and methodical. Wikisource is approaching the same goal with openness as a core philosophy, and progress is rapidly increasing.
To some, it seems that OL will reach the holy grail first; however, they have seeded their database with lots of junk records, and they don't have digital items for these. The reality is that a lot of bibliographic entries are wrong, and this data is usually only fixed once the object represented has been reviewed. Without digital objects, there is no way for the world to know which records are duplicates and which are slightly different editions that should have different records. Even if someone out in the real world knows that there was only one edition in a given year, there is no mechanism for the "community" to merge records. Without digital objects, OL is a _directory_ of works held in other locations; it is not a library.
OTOH, Wikisource only has records for items that it has the full text for, which means it rarely has duplicates, and it is much more like a "library" because people can actually read the text. And of course it has already figured out a lot of the community-process problems.
I don't think Wikisource is on an exponential growth curve yet overall; however, there are spurts of such growth, as you can see on the Hebrew Wikisource.
http://stats.wikimedia.org/wikisource/EN/PlotsPngArticlesTotal.htm
Keep in mind that the stats for Wikisource domains need to be _combined_, as French works are on the French WS and English works are on the English WS. The total growth is the sum of all of the projects - this isn't like Wikipedia, where each project intends to have the same content in different languages.
-- John Vandenberg
John Vandenberg wrote:
The underlying problem is that OL is approaching this from a traditional library perspective, and so is opening up slowly, and progress is slow and methodical.
But they are not. They are starting from the Internet Archive (Brewster Kahle) perspective. "Real" archivists and librarians have complained that the Internet Archive is not enough of an archive, and OpenLibrary is not enough of a library. This is of course very similar to people complaining that Wikipedia is not enough of an encyclopedia. Both OpenLibrary and Wikipedia are primarily Internet projects. Perhaps the most interesting criticism of OpenLibrary was launched by Tim Spalding, founder of LibraryThing.com (another Internet project, but a commercial one, albeit with some volunteer vibes). He meant (my interpretation) that OpenLibrary asks a lot from libraries (a copy of their catalog database) but doesn't give much back, and that giving something back would help OpenLibrary win more allies among libraries: http://mail.archive.org/pipermail/ol-discuss/2009-August/000638.html
The first website to appear on the domain www.openlibrary.org was an online viewer for books scanned by/for the Internet Archive, so if "being able to read" is a requirement for a library, then it did have that function from the start. Later another website appeared on demo.openlibrary.org, containing catalog records. The demo website is what you now find as openlibrary.org. It is as if the online viewer and the bibliographic database are two different projects, and the Internet Archive put the new project under the old domain. But the online viewer is still there, for the books that have been digitized.
To some, it seems that OL will reach the holy grail first,
OpenLibrary has a head start. Any project started now would have to spend a long time catching up. Any good ideas that might go into a new project could be used in the existing OpenLibrary.
For example, a new project might download the database dump from OpenLibrary and start to weed out the "junk records". But that junk sorting could also take place inside OpenLibrary. Why not?
If a new project goes to a library to ask for a copy of their catalog, they might get the question "we already gave (or didn't give) that to OpenLibrary, so how is your project any different?" And what should the new project answer to that?
I want to encourage wikipedians and wikisourcerers to join the OpenLibrary project, just like you should also join OpenStreetMap and other good projects for free knowledge and information. Bring your experience. If you get tired of one project, as I do sometimes, work on another one for a while.
OpenLibrary has author pages for 6.5 million author names. Some of these are "junk" duplicates that should be merged, but there is still quite a large number of authors there. These have a field for a Wikipedia URL, but only 1,100 records have a value. Connecting author pages in OpenLibrary to Wikipedia biographies is just one area where we can do a lot without needing to start a new project.
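A sketch of how one might measure that coverage from the published data dumps, assuming the dumps' tab-separated layout with a JSON record in the last column, and a wikipedia field on author records (the field name is an assumption):

    import gzip, json

    total = linked = 0
    # ol_dump_authors_latest.txt.gz is the authors dump, one record per line.
    with gzip.open("ol_dump_authors_latest.txt.gz", "rt") as f:
        for line in f:
            record = json.loads(line.rstrip("\n").split("\t")[-1])
            total += 1
            if record.get("wikipedia"):
                linked += 1

    print("%d of %d author records carry a Wikipedia URL" % (linked, total))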
On Fri, Sep 4, 2009 at 7:21 PM, Lars Aronssonlars@aronsson.se wrote:
... For example, a new project might download the database dump from OpenLibrary and start to weed out the "junk records". But that junk sorting could also take place inside OpenLibrary. Why not?
Because metadata without digital objects is next to useless. WorldCat already provides a directory of where physical books are held.
A database of metadata with lots of duplicates, and no means for the reader to fix or discuss them, is disrespectful.
If a new project goes to a library to ask for a copy of their catalog, they might get the question "we already gave (or didn't give) that to OpenLibrary, so how is your project any different?" And what should the new project answer to that?
See above. I don't see any value in going back to the libraries. Doing that would only end up with the same result that OpenLibrary has; it would be simpler to take the OpenLibrary dump.
I want to encourage wikipedians and wikisourcerers to join the OpenLibrary project, just like you should also join OpenStreetMap and other good projects for free knowledge and information. Bring your experience. If you get tired of one project, as I do sometimes, work on another one for a while.
Tell me _one_ thing that I can do at OpenLibrary that I can not do at Wikisource.
OpenLibrary has author pages for 6.5 million author names. Some of these are "junk" duplicates that should be merged, but there is still quite a large number of authors there. These have a field for a Wikipedia URL, but only 1,100 records have a value. Connecting author pages in OpenLibrary to Wikipedia biographies is just one area where we can do a lot without needing to start a new project.
_Most_ of them are duplicates.
http://openlibrary.org/search?q=Jules+Gabriel+Verne
I have an account at OpenLibrary, and I am responsible for 0.2% of the Wikipedia links :P
I am not keen on becoming attached to a project that is littered with so much crap, especially when I am not given the tools required to fix the crap, nor do I have any say in whether more crap can be imported.
http://openlibrary.org/user/jayvdb
These two need to be merged.
http://openlibrary.org/a/OL2296708A/Charles-C.-Nott http://openlibrary.org/a/OL2544127A/Charles-Cooper-Nott
Both of them look terrible, because I have no control over the presentation of the pages. Dups, lack of sorting, etc.
I haven't drunk the OpenLibrary Kool-Aid; I'll stick with Wikisource, for good or ill.
-- John Vandenberg
John Vandenberg wrote:
I want to encourage wikipedians and wikisourcerers to join the OpenLibrary project, just like you should also join OpenStreetMap and other good projects for free knowledge and information. Bring your experience. If you get tired of one project, as I do sometimes, work on another one for a while.
Tell me _one_ thing that I can do at OpenLibrary that I can not do at Wikisource.
Are you suggesting that in addition to collecting free texts, Wikisource should also collect information about texts, free and nonfree, like OpenLibrary does? If so, that is a very interesting suggestion, and I support it.
Yes, that is my vision. We should have bibliographic information, copyright details, lists of chapters and summaries, lists of older works which a text references and of later works which reference it, etc.
However, the Wikisource community is not yet large enough to manage that. A year ago the English Wikisource community changed the restrictions on who can have an Author page.
Previously our rule was: the author must have at least one "free" work.
It changed to: the author must either have one "free" work, or they must be deceased.
English Wikisource often includes modern works on the Author page of deceased people, listing biographies, posthumous collections, etc.
As our community grows, managed by people who are focused on old works, we can relax the inclusion criteria.
This is like the English Wikipedia becoming more inclusive as it has grown, because there are more people policing the edges.
Organic growth.
If this doesn't happen, I won't fret, as there are more than enough public domain works to keep me learning for a few lifetimes. :-) I think it is much more important that we revive interest in old works which don't have a commercial publisher pushing new copies into bookstores.
-- John Vandenberg
John Vandenberg wrote:
I haven't drunk the OpenLibrary Kool-Aid; I'll stick with Wikisource, for good or ill.
If that makes you happy, that's good for you. But now we were talking about the need for a project (either OpenLibrary or a new project) to list all the books ever published. When and how will Wikisource contain that? After every book has been scanned?
On 2 Sep 2009, at 12:35, David Goodman wrote:
There is sufficient missing material in every Wikipedia, sufficient lack of coverage of areas outside the primary language zone and in earlier periods, sufficient unsourced material; sufficient need for updating articles, sufficient potentially free media to add, sufficient needed imagery to get; that we have more than enough work for all the volunteers we are likely to get.
I apologise for taking this slightly out of context, but it touches upon something I've been wondering about recently, which is: do we have a complete set of WMF projects?
David focuses on Wikipedia, which is the main project, and also touches on Wikimedia Commons. We also have (in no particular order) Wikibooks, Wikisource, Wikinews, Wikiversity, Wiktionary, Wikiquote and Wikispecies, in all their various languages. Each of these has essentially its own set of volunteers (so I disagree with David's assertion at the end of his paragraph - different work brings in different volunteers).
The latest* one of these projects is Wikiversity, which opened on 15 September 2006. That's almost 3 years ago. In terms of internet time, that's practically a generation ago.
Do we now have all of the projects that we could have running? Are all of the gaps in our project coverage already filled sufficiently well by someone else that we couldn't improve on matters by having our own?
My personal feeling is that there's plenty of scope for new Wikimedia projects. There have been plenty mentioned on this mailing list, on the various wikis, etc.** A wiki version of OpenLibrary is a good example of something we could try; even if it failed, it wouldn't be time wasted, as the result could be fed into OpenLibrary. So, I think the answer to my question is "no".
What could be the cause of this recent dearth of new projects?
Could it be the presence of Wikia?
Are we stuck in the mindset of just Wikipedia + supporting projects?
Is the technical side of things too moribund to easily establish new projects?
Are we afraid of trying new things (or worse, unable to try new things)?
Do we lack the leadership to make new projects successful?
Is it a limitation of not being able to make a living from working on Wikimedia projects?
Wikimedia is big enough that it can launch new projects very publicly, and get a lot of support (both volunteer and financial) very quickly. It's widespread enough that you can ask a group of people in any room if they know of Wikipedia, and over half of them will.*** Actually editing Wikipedia might not appeal to them, but working on a different project could, especially if it's in their speciality.
One final question: do we need to start looking for project donations - i.e. absorbing projects started elsewhere?
Mike
PS: my questions here are posed to be provocative. Please don't take them as accurately representing my viewpoints.
* Note that increasing the number of languages that these projects use doesn't in my mind count as a new project.
** A few of my favourite examples: WikiJournal, publishing scholarly works; WikiReview, providing in-depth reviews of subjects; WikiWrite, where fiction can be written collaboratively; etc.
*** Country-dependent. Your language may vary.
The question isn't "Is there more we could do?" because there most certainly is. The question is "Is there more we want to do?" We need to decide what really is the scope of the Wikimedia movement. We never really made that decision before starting the existing projects and just started any project that had enough support to be viable (which resulted in some mistakes, such as the 9/11 memorial wiki). We still haven't made that decision but now we work as if the decision has been made to restrict ourselves to the existing projects, which, obviously, means we don't open any new projects. We need to stop and actually make that decision - I think the big strategy plan that we are just starting is a good framework to do that within.
On Tue, Sep 8, 2009 at 5:32 PM, Michael Peel email@mikepeel.net wrote:
Do we now have all of the projects that we could have running? Are all of the gaps in our project coverage already filled sufficiently well by someone else that we couldn't improve on matters by having our own?
A geographical/atlas/map kind of project. Granted, there's Wikimapia and other external equivalents, but we (Wikimedia) are lacking one.
2009/9/8 Pedro Sanchez pdsanchez@gmail.com:
A geographical/atlas/map kind of project. Granted, there's Wikimapia and other external equivalents, but we (Wikimedia) are lacking one.
Is there any point in us doing something that already exists? What would be better about a Wikimedia version?
Our current direction is to coordinate with external resources rather than create them from scratch, where we've got compatible goals and ideals.
For instance, rather than creating our own map system from scratch we're working with OpenStreetMap to integrate mapping, using our own rendering servers with a copy of the public data and making it easier to stick maps in wiki pages for starters, with easier ways to get into the upstream system to improve location name translations and mapping data.
-- brion
I have been following the maps-l list and OpenStreetMap closely. There was a status report posted just recently: http://lists.wikimedia.org/pipermail/maps-l/2009-September/000270.html
http://meta.wikimedia.org/wiki/OpenStreetMap
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Geographical_coordinates
There is also a big discussion on whether Wikipedia data can be imported into OpenStreetMap, because supposedly the coordinates from Wikipedia are copied from non-free sources: http://www.nabble.com/Wikipedia-POI-import--td23392791.html
In my opinion, what is really missing, for example, is the ability to find all the articles that relate to a geographic location.
I would like to see all the articles about Beijing, for example, but it is not easy. Google provides some of this, but it could be better.
On a different dimension, time rather than space: another project that I would like to see is a WikiTimeline. It would be great to be able to extract all the date references out of Wikipedia articles and put them on a timeline.
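A toy sketch of that idea: pull four-digit years out of article text and sort the surrounding sentences into a timeline. Real date extraction would need far more care (ranges, BC dates, decades, dates inside citations); the sample text is invented:

    import re

    text = ("The foundation stone was laid in 1406. Construction finished "
            "in 1420, and the complex was expanded in 1655.")

    events = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        # Accept years 1000-2099; everything else is ignored.
        for year in re.findall(r"\b(1[0-9]{3}|20[0-9]{2})\b", sentence):
            events.append((int(year), sentence))

    for year, sentence in sorted(events):
        print(year, "--", sentence)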
mike
On Fri, Sep 11, 2009 at 7:40 PM, jamesmikedupont@googlemail.com wrote: <snip>
In my opinion, what is really missing, for example, is the ability to find all the articles that relate to a geographic location.
I would like to see all the articles about Beijing, for example, but it is not easy. Google provides some of this, but it could be better.
<snip>
I've seriously thought about implementing this. Enhancing the existing coordinate templates by creating a searchable coordinates table in the database would not be a difficult thing. It requires a bit of thought and effort to make it efficient, but the underlying idea is simple. Locating nearby articles (and geocoded images) could have a lot of uses.
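A minimal sketch of such a coordinates table, using SQLite for brevity; a real implementation would live in the wiki's own database and would also handle the longitude wrap-around at +/-180 (coordinates below are approximate):

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE geo_tags (page_title TEXT, lat REAL, lon REAL)")
    db.execute("CREATE INDEX geo_lat_lon ON geo_tags (lat, lon)")
    db.executemany("INSERT INTO geo_tags VALUES (?, ?, ?)", [
        ("Beijing", 39.906, 116.391),
        ("Forbidden City", 39.916, 116.397),
        ("Shanghai", 31.229, 121.475),
    ])

    # Bounding-box query: a centre point plus a half-width in degrees.
    lat, lon, box = 39.9, 116.4, 0.5
    rows = db.execute(
        "SELECT page_title FROM geo_tags"
        " WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?",
        (lat - box, lat + box, lon - box, lon + box)).fetchall()
    print([r[0] for r in rows])  # ['Beijing', 'Forbidden City']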
-Robert Rohde
On Fri, Sep 11, 2009 at 10:40 PM, jamesmikedupont@googlemail.com wrote:
There is also a big discussion on whether Wikipedia data can be imported into OpenStreetMap, because supposedly the coordinates from Wikipedia are copied from non-free sources: http://www.nabble.com/Wikipedia-POI-import--td23392791.html
The biggest legitimate concern they seem to have is UK (and EU?) database laws regarding "substantial extraction". It's an interesting predicament.
Thomas Dalton wrote:
2009/9/8 Pedro Sanchez pdsanchez@gmail.com:
A geographical/atlas/map kind of project. Granted, there's Wikimapia and other external equivalents, but we (Wikimedia) are lacking one.
Is there any point in us doing something that already exists? What would be better about a Wikimedia version?
It would generate competition to the advantage of both sides. Similarly Wikipedia forks would help us by generating competition. Such a fork may, for example, have a different interpretation of NPOV, as would be its right on its own site. Readers would then be more free to draw their own conclusions from comparing the two sites. Another fork could choose to limit its scope to certain topics, and adopt editing policies that are more tailored to its topics.
Ec
Thanks for bringing this up, Mike. I think WikiReview sounds like a great idea, WikiJournal sounds like it would suffer from a number of very serious flaws, WikiWrite could be interesting, and there are probably a number of other project ideas that are equally interesting but not necessarily ideal for Wikimedia expansion.
My sense has been that some of the newer projects, including Wikiversity, tend to have limited readership and limited participation. I'd be happy if someone could provide some data to stack against this sense.[1] I think that without making a major splash early on, new Wikimedia projects tend to languish. While projects without widespread popularity are still useful, particularly if they are highly specialized, projects like WikiReview/Journal/Write would depend on public consciousness and participation levels to achieve relevance.
We'll agree, I think, that relevance isn't just a nice benefit; it's essential in order to attract readers and editors. Any new project must meet a heretofore unmet need significant enough to draw an active and self-perpetuating community. It isn't enough, then, to offer a cc-by-sa alternative to a proprietary but sufficient source - we have to be able to do whatever it is better.[2] Wikimedia has done this with fantastic success with Wikipedia, and other projects fill smaller but vibrant niches - but we have some that don't meet this criterion, and any new project ought to.
Lastly, can we reconsider the naming scheme for future projects? The "wiki-" prefix shouldn't be mandatory. Something like "writereviews.org, a project of the Wikimedia Foundation" could be an interesting alternative to "wikiwrite.org" or "wikireviews.org" that doesn't immediately bring to mind the proliferation of personal wikis on the web.
Nathan
[1]: The English Wikiversity, for example, has fewer than 12k "content pages", while the German Wikiversity has only 1800. En.wikiversity has 175k registered users, but only 25 administrators. The English Wikisource, with roughly the same number of users and administrators as en.wikiversity, has ten times as many content pages. [2]: A limited resource of uneven quality is not a preferable substitute for an easily accessible, free-to-use and reliable resource that is owned by a for-profit corporation.
Michael Peel wrote: [cut]
** A few of my favourite examples: WikiJournal, publishing scholarly works;
These works are welcomed on Wikisource, if they are under a free license, of course.
WikiReview, providing in-depth reviews of subjects;
I think this can be hosted on Wikibooks or Wikiversity for the most part.
WikiWrite, where fiction can be written collaboratively; etc.
I don't think this fits very well in the Wikimedia mission.
In the sum of all human knowledge, there are two projects which would be a nice complement to the Wikimedia family:
1. A database of all books. This is actually what OpenLibrary tries to do, with mixed success, IMO. As you said, if we try and fail, nothing would be lost, as the result could be imported into OpenLibrary. We wouldn't need to start from scratch, as the content of OpenLibrary is available and free.
2. A database of all people, i.e. genealogy. There is one project which is IMO a great technical success in this field: Rodovid (http://rodovid.org/).
I like very much how the trees are displayed: http://fr.rodovid.org/wk/Personne:29004 (Philippe III of France, 1245-1285). It shows very well how the French and English monarchies are related to each other. You can see the complete tree, but it takes ages to load because of its size: more than 7,000 people (http://fr.rodovid.org/wk/Special:Tree/29004)
See also the complete tree for Elizabeth II: http://en.rodovid.org/wk/Special:Tree/29818
The Rodovid project has asked to be hosted by the Wikimedia Foundation, although I don't know if it still does. It is based on an adapted version of MediaWiki, so it would integrate easily with the current projects.
Regards,
Yann
On Wed, Sep 9, 2009 at 9:42 AM, Yann Forget yann@forget-me.net wrote:
Michael Peel wrote: [cut]
** A few of my favourite examples: WikiJournal, publishing scholarly works;
These works are welcomed on Wikisource, if they are under a free license, of course.
And if they are beyond the scope of Wikisource, they would be suitable on Wikiversity.
WikiReview, providing in-depth reviews of subjects;
I think this can be hosted on Wikibooks or Wikiversity for the most part.
"reviews" are a mine field. If they are educational, they can probably go on Wikibooks, or even Wikipedia if written using existing sources.
Wikinews may also be interested in publishing reviews.
WikiWrite, where fiction can be written collaboratively; etc.
I don't think this fits very well in the Wikimedia mission.
If the objective is to learn, Wikiversity courses could be constructed around fiction writing.
Wikinews may be interested in collaboratively composed cartoons and fiction.
In the sum of all human knowledge, there are two projects which would be a nice complement to the Wikimedia family:
- A database of all books. This is actually what OpenLibrary tries to do, with mixed success, IMO. As you said, if we try and fail, nothing would be lost, as the result could be imported into OpenLibrary. We wouldn't need to start from scratch, as the content of OpenLibrary is available and free.
I agree this is necessary. This task is extremely large and complex, and it does not hurt to have multiple "competing" open projects. OpenLibrary publishes their data store, and our wiki would be available as well, so the result will be cross-pollination between OpenLibrary and our project until they are more or less in sync and the task is done.
- A database of all people, i.e. genealogy. There is one project which is IMO a great technical success in this field: Rodovid (http://rodovid.org/).
Bringing Rodovid under the WMF umbrella would be great, and would encourage more genealogy people to become involved in Wikimedia.
-- John Vandenberg
On 9 Sep 2009, at 00:42, Yann Forget wrote:
Michael Peel wrote:
** A few of my favourite examples: WikiJournal, publishing scholarly works;
These works are welcomed on Wikisource, if they are under a free license, of course.
WikiReview, providing in-depth reviews of subjects;
I think this can be hosted on Wikibooks or Wikiversity for the most part.
There's a big difference between starting a new section of something, and starting something completely new and fresh. With the former, you get all of the baggage of that project so far - e.g. if you want to start something slightly different on the English Wikipedia, then you have to modify huge numbers of policies, argue with many thousands of people, etc. Sometimes it's easier to split something off and do it separately - as Wikispecies has been doing, for example.
There's also a big difference between testing a project and launching a project. Tests are normally small-scale, aimed at just trying something out, rather than actually doing a project. It's very difficult to establish critical mass with that approach. Launching a project involves announcing it loudly to the world, and getting the attention of lots of people. As long as the basic idea is sound, you then get a large influx of people who want to try it out. Perhaps they don't all stick around - but some of them will.
Of course, you can't do either very often, otherwise people will stop paying any attention. But for some projects, it could work very well. Especially if there's the backing of e.g. a funding body, which could easily be attracted now that Wikimedia is so large and popular.
Mike
On Wed, Sep 9, 2009 at 11:45 AM, Michael Peel email@mikepeel.net wrote:
<snip>
I think you can test a project in the incubator, get an idea of how it will work, set up the initial structure and *then* launch it publicly. The publicity part is the simplest. We've got a built-in megaphone; any launch that is incorporated with the fundraising drive, or given a similar level of extended publicity on Wikimedia pages, would reach many millions of people who already appreciate free collaborative projects. That would require a somewhat different philosophy from the current approach to "advertising" (not in the commercial sense) the fundraising drive, which emphasizes minimal intrusion and a once-a-year limit. Perhaps the community would be more amenable to Wikimedia-wide publicity if it promoted projects?
I'd like to see a role like that in launches for future projects; the foundation hasn't been involved in promoting or fostering new projects in a deep way in the past, from my understanding, and real support from the moment of establishment would go a long way towards protecting promising ideas from abandonment in the incubator. Erik's point is well made, that developing many promising projects beyond the idea point requires the commitment of resources that remain scarce. But there are lots of avenues the Foundation can take in this direction that don't require the direct allocation of foundation money; a "lesson plan / course material" wiki, or a "student wiki" designed for collaborative use by students could be developed jointly with innovative school systems or teacher groups, or even partnerships between schools in different countries aimed at allowing international cooperative learning. We may not be able to organically generate the Wikimedia community interest and expertise necessary for building the content these projects would need, but with the Foundation as technological facilitator and enthusiastic booster...
Nathan
As Erik points out, at a certain point we have to actually write new code to support new ideas. Else "projects we could do at Wikimedia" becomes "projects we can do with a wiki engine."
e.g. OpenStreetMap would have been a natural fit for WMF, but it would have required a whole new software infrastructure. And we have no shortage of content editors, but developers appear to be rather rarer.
Proposals I recall seeing for new projects either fit into a current project (e.g. Wikibooks - really, Wikipedia is a book, too) or haven't been neutral (e.g. the victims of Soviet repression proposal, which I think is a great idea but also think just would have been way too intrinsically non-neutral for WMF; the reviews wiki). Any proposal that's "hey, let's start a wiki" will, I suspect, fall into one of those two.
We're either not thinking outside the box enough or need to build new boxes. Or both.
What interesting new engines are there out there for gathering content from masses of Internet users that aren't wikis as we know them? What could we use them for besides their original purpose?
[cc'd to wikitech-l for comment as well]
- d.
David Gerard wrote:
Proposals I recall seeing for new projects either fit into a current project (e.g. Wikibooks - really, Wikipedia is a book, too)
Sorry, Wikibooks is for *textbooks* and Wikipedia is not a textbook. (We also have a cookbook, wikijunior (kids' books) and how-tos.) Nevertheless, a lot of proposed projects do fit at Wikibooks - on strategywiki I found 2 or 3 just today.
Perhaps we're not doing a good enough job of advertising ourselves, or perhaps people are not thinking their ideas through. Whatever the reason, it seems like these proposals that already fit inside a box are not actually being nipped in the bud with "That belongs at X project, go do it there" and instead these people simply wallow in a netherworld between wanting to start a project and the community having no real capacity to evaluate proposals (including letting people know where their project might fit into the wikis we already have).
-Mike
On 9/9/09 9:41 AM, David Gerard wrote:
As Erik points out, at a certain point we have to actually write new code to support new ideas. Else "projects we could do at Wikimedia" becomes "projects we can do with a wiki engine."
IMO we need to do that for the projects we already have before we take on new obligations!
We still have very poor software support for:
* Commons -- We need a sane upload and post-upload workflow (eg review and deletion), and a clean system for handling structured metadata (descriptions, authorship, licence info).
Some of this is being worked on now with Michael Dale's video & media work, and the Ford Foundation grant will let us put more resources into the workflow & metadata side, so this is the one I worry the least about. :)
* Wiktionary -- Really needs to be rebuilt as a structured system. It's very hard to query Wiktionary or extract its data usefully, and there's a lot of duplicated manual work maintaining it. (A rough sketch of the extraction problem appears after this message.)
There was some third-party work done in this direction (Ultimate Wiktionary/WiktionaryZ/OmegaWiki) which was very interesting but never got the community buy-in to push that work back towards the live Wiktionary.
* Wikibooks -- We still have very poor native support for multiple-page "books" or "modules", which complicates navigation, search, authoring, and downloading.
Tools like the Collection extension are making it easier to download a batch of related pages for offline reading, but someone still needs to build those collections manually and they don't provide other navigation aids.
* Wikinews -- Workflow on Wikinews has been aided by tools like FlaggedRevs but is still a bit awkward. Native support for things like exporting feeds of news articles is still missing, leading to a lot of workarounds and manual effort being expended.
* Wikisource -- Better native support for side-by-side translations, annotations, and extracting/citing primary source material from other sites like Wikipedia would be very helpful.
-- brion
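A footnote on the Wiktionary point: the only "structure" today is heading conventions in wikitext, so any query starts with pattern-matching. A rough Python sketch of that extraction problem, over an invented and much-simplified entry:

import re

# The entry text is hypothetical; real entries vary wiki by wiki, which is
# precisely what makes reliable extraction so painful.
ENTRY = """==English==
===Noun===
'''book''' (plural '''books''')
# A bound collection of pages.
==French==
===Verb===
# inflection of ''bouquiner''
"""

sections = {}
language = None
for line in ENTRY.splitlines():
    if (m := re.fullmatch(r"==([^=]+)==", line.strip())):
        language = m.group(1)          # a level-2 heading names a language
        sections[language] = []
    elif (m := re.fullmatch(r"===([^=]+)===", line.strip())) and language:
        sections[language].append(m.group(1))  # level-3: part of speech

print(sections)  # {'English': ['Noun'], 'French': ['Verb']}

A structured Wiktionary would store language, part of speech, and definitions as queryable fields in the first place, making this kind of scraping unnecessary.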
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Brion Vibber wrote:
IMO we need to do that for the projects we already have before we take on new obligations!
We still have very poor software support for:...
Thanks Brion, it is good to know that the tech team is aware of these issues and will be expending energy to improve how the software supports the non-Wikipedia projects. I'm looking forward in particular to seeing how the grant money will be spent for improving Commons' software, and what ideas may come about for giving Wikibooks some in-software structure.
-Mike
2009/9/10 Brion Vibber brion@wikimedia.org:
IMO we need to do that for the projects we already have before we take on new obligations!
Oh yesss.
We still have very poor software support for:
- Commons -- We need a sane upload and post-upload workflow (eg review and deletion), and a clean system for handling structured metadata (descriptions, authorship, licence info).
Some of this is being worked on now with Michael Dale's video & media work, and the Ford Foundation grant will let us put more resources into the workflow & metadata side, so this is the one I worry the least about. :)
Categories as tags with arbitrary Boolean queries? Huh? Huh? Huh?
- d.
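Joking aside, categories as tags really would reduce arbitrary Boolean queries to set algebra, at least conceptually. A toy Python sketch with invented membership data:

# If category membership were exposed as a set per category, Boolean
# combinations of categories become ordinary set operations.
in_category = {
    "Bridges": {"Golden Gate Bridge", "Tower Bridge", "Pont Neuf"},
    "London":  {"Tower Bridge", "Big Ben"},
    "Paris":   {"Pont Neuf", "Louvre"},
}

# Bridges AND (London OR Paris) AND NOT London
wanted = in_category["London"] | in_category["Paris"]
hits = (in_category["Bridges"] & wanted) - in_category["London"]
print(hits)  # {'Pont Neuf'}

The hard part on a real wiki is scale: doing this over millions of pages is an indexing problem, not a conceptual one.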
- Wikisource -- Better native support for side-by-side translations, annotations, and extracting/citing primary source material from other sites like Wikipedia would be very helpful.
The same thing is needed for Wikiquote as well, and I do believe that
... extracting/citing primary source material from other sites like ...
is an extremely useful and very universal thing for any cross-project "linking" (and even for internal citing).
On Thu, Sep 10, 2009 at 7:07 PM, Brion Vibber brion@wikimedia.org wrote:
<snip>
Software programs (including smartphone apps), products (with UPC codes), companies, organizations, restaurants, movies, books, video games, web sites, libraries, elementary schools, points of interest. All the things removed from Wikipedia by those mean old deletionists. Original research. But yeah, not going to happen.
OpenStreetMap seems to be doing a good enough job at maps.
2009/9/8 Michael Peel email@mikepeel.net:
What could be the cause of this recent dearth of new projects?
Certainly the process for getting a new project underway is so complex and exhausting that it's not something that many people will be likely to engage in - especially considering that project ideas are often proposed by people who aren't currently very active Wikimedians. Perhaps we need to set up a formal system for long-time Wikimedians to adopt ideas they're excited about, to help push them to approval? In any event, if you want to add to the Wikimedia family, my guess is that it's currently a commitment of 2-3 months of several hours per week to get to that point, provided it's achievable to begin with.
I do think that project adoption is something that we should explore in the right circumstances; it's not something we've ever done but IMO we should be open to it. I don't think OpenStreetMap or OpenLibrary want or need to be adopted. ;-) But there may be other smaller semi-successful projects that would like to join our project family, and that would make sense as part of it.
I would also make the point that adding capabilities to existing projects can be just as effective at cultivating new communities of participants as creating an entirely new wiki, and sometimes more so. For example, as of a few weeks ago, there's now a fledgling community of people on Wikimedia Commons who add annotations to images, because a volunteer developed a cool image annotation tool. The entire community of people adding categories to Wikipedia articles could only form after the categorization functionality was developed.
Because the Wikipedia community is so vast, adding capabilities that engage more people on Wikipedia specifically, or improving access to the existing capabilities, can have dramatically greater impact than creating a blank-slate wiki.
That is not to say that I think there should be no new blank-slate wikis, or wikis with custom software, for specific purposes. But I would also not see the fact that no new top-level Wikimedia project has been created in recent years as a sign of stagnation - wonderful capabilities have been created in the existing Wikimedia ecosystem in that same time period, some of them with dramatic positive impact.
On Tue, Sep 8, 2009 at 6:28 PM, Erik Moeller erik@wikimedia.org wrote:
<snip>
I propose expanding the notion of the Wikimedia Incubator to include entirely new projects that are very, very easy to create. They don't need to be approved by the WMF - they just need to demonstrate their value by attracting a community and creating great content. This would be more like the Apache Incubator, but even more open. This gives people an easy way to prototype their ideas for new projects, to advertise them, and over time will give an overview of what kinds of projects and approaches to projects are likely to succeed and likely to fail.
On Wed, Sep 9, 2009 at 10:33 AM, Brian Brian.Mingus@colorado.edu wrote:
<snip>
Brilliant idea.
Currently, new projects proposed on Meta have Buckley's chance of ever starting. Wikiversity wasn't a new project - it was split from Wikibooks.
We would need a bit of infrastructure around new concepts before they land on the incubator, such as a detailed description of the purpose, and an experienced admin willing to monitor that area of the incubator.
-- John Vandenberg
John Vandenberg wrote:
<snip>
This sounds like a good idea to me. One difference is immediately obvious from the way the incubator works presently, though. Rather than having these projects move out of the incubator based on the decision of the language committee, that issue would have to be considered by the board directly in consultation with the broader community.
--Michael Snow
Hoi, I am glad that this is seen as obvious. The language committee has never involved itself in assessing new project proposals. It does not have the inclination to do so and I am glad that this is understood. Thanks, GerardM
2009/9/9 Michael Snow wikipedia@verizon.net
<snip>
On Tue, Sep 8, 2009 at 11:31 PM, Michael Snow wikipedia@verizon.net wrote:
<snip>
This is a brilliant and much-needed idea, on many many levels.
I suggest that we start developing such a new system for the Incubator at the strategy wiki.
Thanks, Richard (User:Pharos)
On Wed, Sep 16, 2009 at 11:26 AM, Pharos pharosofalexandria@gmail.com wrote:
<snip>
<AOL>
Michael Snow writes:
One difference is immediately obvious from the way the incubator works presently, though. Rather than having these projects move out of the incubator based on the decision of the language committee, that issue would have to be considered by the board directly in consultation with the broader community.
We have for years had a 'Board approval' bottleneck for creating new Projects. On the other hand, we've only created four since the Board got started. We managed somehow before, and I don't see why this couldn't one day be delegated to something very like the language committee, dedicated to assessing new project proposals for sufficient interest and alignment with Wikimedia goals (here goals for free knowledge coverage, not goals for real language coverage). Until and unless such a thing is worked out, review by the board and broader community is a fine alternative... but it is uncomfortably close to describing the system we have now.
SJ
In the past there were several project proposals on incubator, but we deleted them because they were not active. Since then, tests for new WMF projects are not allowed. If they were still allowed, Incubator would be full of inactive projects. Even now, there are inactive test projects for new languages, because the procedure is difficult and takes a very long time. I assume requests for creating entirely new projects would require even more difficult and longer procedures, resulting in an Incubator full of inactive tests.
2009/9/9 Brian Brian.Mingus@colorado.edu
<snip>
Are inactive projects in the Incubator really such a big problem? Couldn't strict deadlines for new incubator projects be the solution to this problem?
Jiri
On Wednesday, 09. September 2009 16:10:26 Robin P. wrote:
<snip>
Yes. Btw, if we had a deadline, what should we do when a project reaches it? The most logical option is deleting it. The problem with that, however, is that nobody would contribute to a test project knowing that it will be deleted when it reaches the deadline. If there is interest again, it would then have to be undeleted. That would also be too much work for nothing. So not really a solution.
2009/9/9 Jiri Hofman hofmanj@aldebaran.cz
<snip>
On Wed, Sep 9, 2009 at 7:10 AM, Robin P. robinp.1273@gmail.com wrote:
In the past there were several project proposals on incubator, but we deleted them because they were not active. Since then, tests for new WMF projects are not allowed. If they were still allowed, Incubator would be full of inactive projects. Even now, there are inactive test projects for new languages, because the procedure is difficult and takes a very long time. I assume requests for creating entirely new projects would require even more difficult and longer procedures, resulting in an Incubator full of inactive tests.
I don't think that deleting them is a good idea. Perhaps you can "archive" them after a certain period of inactivity, but the incubator should allow project ideas to be revived and should give projects plenty of time to become active. There must be a carrot, of course - the WMF should make some sort of statement about how successful a project should become, and what sort of vision it might have, for the WMF to commit more resources to it.
On 9/8/09, Brian Brian.Mingus@colorado.edu wrote:
On Tue, Sep 8, 2009 at 6:28 PM, Erik Moeller erik@wikimedia.org wrote:
2009/9/8 Michael Peel email@mikepeel.net:
What could be the cause of this recent dearth of new projects?
Certainly the process for getting a new project underway is so complex and exhausting that it's not something that many people will be likely to engage in - especially considering that project ideas are often proposed by people who aren't currently very active Wikimedians. Perhaps we need to set up a formal system for long-time Wikimedians to adopt ideas they're excited about, to help push them to approval?
That would be a nice idea. Three steps: propose a process and find supporters (somewhat defined on meta, can be refined), find an established Wikimedian to mentor/adopt it (define a new process and people willing to mentor), and work towards approval (define a new process involving the incubator).
I do think that project adoption is something that we should explore in the right circumstances; it's not something we've ever done but IMO we should be open to it. I don't think OpenStreetMap or OpenLibrary want or need to be adopted. ;-) But there may be other smaller semi-successful projects that would like to join our project family, and that would make sense as part of it.
Yes. Rodovid and Wikikids come to mind as projects that have asked at one point, though they may no longer have such an interest. Rodovid is certainly the largest multilingual project to make such a request... but afaict there simply wasn't a clear way for that to be considered at the time.
For example, as of a few weeks ago, there's now a fledgling community of people on Wikimedia Commons who add annotations to images, because a volunteer developed a cool image annotation tool. The entire community of people adding categories to Wikipedia articles could only form after the categorization functionality was developed.
Yes and yes. I remember the people who wondered if articles would ever be usefully categorized, or if it was just a cute side project that would never impact wikipedia. And the fascinating debates about the meta-category structure... which might [have] serve[d] as material for an entire thesis in librarianship.
That is not to say that I think there should be no new blank-slate wikis, or wikis with custom software, for specific purposes. But I would also not see the fact that no new top-level Wikimedia project has been created in recent years as a sign of stagnation - wonderful
It is a sign of stagnation. The ecosystem is nowhere near saturated with free knowledge projects; WP is dazzlingly successful; we or others should at least be considering similar projects to cover every type and format of knowledge, and for every audience -- in our case, to explicitly say 'out of scope [yet]' if nothing else.
But as you note, there are other signs of growth which counterbalance it.
Brian writes:
<snip>
Great idea. Where's the right place to suggest this on the Incubator? That's a project where I have regrettably not gotten to know any of the local policies yet.
SJ
On Thu, Sep 10, 2009 at 9:23 PM, Samuel Klein meta.sj@gmail.com wrote:
Great idea. Where's the right place to suggest this on the Incubator? That's a project where I have regrettably not gotten to know any of the local policies yet.
Here is the main project discussion:
http://incubator.wikimedia.org/wiki/Incubator:Community_Portal
However, I think a discussion on Meta would be more widely visited.
-- John Vandenberg
Erik Moeller wrote:
Certainly the process for getting a new project underway is so complex and exhausting that it's not something that many people will be likely to engage in
Another issue is that all our projects use the MediaWiki platform (and really, the MediaWiki platform is the only tool we use for the content). So, if it's not something that is conducive to being built on a MediaWiki wiki, it's not something Wikimedia can accomplish.
-Mike
On 9/9/09, Michael Peel email@mikepeel.net wrote:
On 2 Sep 2009, at 12:35, David Goodman wrote:
There is sufficient missing material in every Wikipedia, sufficient lack of coverage of areas outside the primary language zone and in earlier periods, sufficient unsourced material; sufficient need for updating articles, sufficient potentially free media to add, sufficient needed imagery to get; that we have more than enough work for all the volunteers we are likely to get.
I apologise for taking this slightly out of context, but it touches upon something I've been wondering about recently, which is: do we have a complete set of WMF projects?
great topic :-D
in my personal vision, it is rather obvious we should consider the work of the wmf as "perpetually unfinished" just as wikipedia or any of its other projects: an ongoing process, never ever {{done}} completely.
to just do a little brainstorm, let me share some ideas as well:
* a compendium to wikipedia, collecting each and every complete older encyclopedia (which is no longer copyrighted), thus also giving a peek into the history of knowledge and of encyclopedias (does this really belong in wikisource? maybe)
* a wikimusic including a musical dictionary, where one can e.g. look up themes and melodies, find sheet music and recordings, searching by notes etc (a toy sketch of note-based search follows below)
* i also thought of wikimaps, somebody mentioned this already, imnsho including "all maps" in detailed resolutions also historical maps, thus also giving a peek into the history of geography and of cartography as well as leaving room for original creations under a free license (new maps)
just my 2 cts ;-)
all the best, oscar
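On "searching by notes": one established approach is the Parsons code, which reduces a melody to its up/down/repeat contour, so a query can match a theme regardless of key or octave. A toy Python sketch (the catalogue pitches are MIDI note numbers, invented only as illustration):

def parsons(pitches):
    """Contour of a pitch sequence: u(p), d(own), r(epeat), starting '*'."""
    code = "*"
    for prev, cur in zip(pitches, pitches[1:]):
        code += "u" if cur > prev else "d" if cur < prev else "r"
    return code

catalogue = {
    "Ode to Joy (opening)":      [64, 64, 65, 67, 67, 65, 64, 62],
    "Twinkle Twinkle (opening)": [60, 60, 67, 67, 69, 69, 67],
}
index = {name: parsons(p) for name, p in catalogue.items()}

# Same tune an octave lower still matches, since only contour is compared.
query = parsons([52, 52, 53, 55, 55, 53, 52, 50])
print([name for name, code in index.items() if query in code])

A real wikimusic would want interval-based and rhythm-aware matching as well, but contour search alone already makes a hummed melody findable.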
If we are just throwing out random ideas...
I've long wanted to see an open source project to create a world family tree, i.e. document the ancestry and connections between everyone ever. There are a couple of high-profile closed-source / fee-based projects aiming to do this, but no successful projects that really have open access as part of their foundation. Even if we limited such a project to just deceased individuals (as the big projects usually do), it would still be a massive undertaking and potentially very useful for researchers.
However, while a wiki could work, it would be a suboptimal approach. Much like Wikispecies, genealogical information has a heavy component of structured data that could benefit from dedicated tools designed for that data. As has been suggested elsewhere, it seems that most of the things that can be easily done by a wiki are already being done either by us or by Wikia and similar third parties.
-Robert Rohde
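To make the structured-data point concrete: once parent links are explicit fields rather than free-form wikitext, a query like "all ancestors of X" is a few lines. A minimal Python sketch (the record layout is invented; the three people are the same French kings mentioned with the Rodovid links earlier in the thread):

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Person:
    # Deliberately minimal; real genealogy data (GEDCOM etc.) also tracks
    # sources, places, marriages, and uncertainty.
    name: str
    born: Optional[int] = None
    died: Optional[int] = None
    parents: List["Person"] = field(default_factory=list)

def ancestors(person):
    """Yield every ancestor reachable through explicit parent links."""
    for parent in person.parents:
        yield parent
        yield from ancestors(parent)

louis9 = Person("Louis IX", 1214, 1270)
philip3 = Person("Philippe III", 1245, 1285, parents=[louis9])
philip4 = Person("Philippe IV", 1268, 1314, parents=[philip3])
print([a.name for a in ancestors(philip4)])  # ['Philippe III', 'Louis IX']

Rodovid's adapted MediaWiki presumably layers something like this onto wiki pages; a purpose-built backend would make such fields first-class and queryable.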
On Wed, Sep 9, 2009 at 8:24 AM, oscar oscar@wikimedia.org wrote:
On 9/9/09, Michael Peel email@mikepeel.net wrote:
On 2 Sep 2009, at 12:35, David Goodman wrote:
There is sufficient missing material in every Wikipedia, sufficient lack of coverage of areas outside the primary language zone and in earlier periods, sufficient unsourced material; sufficient need for updating articles, sufficient potentially free media to add, sufficient needed imagery to get; that we have more than enough work for all the volunteers we are likely to get.
I apologise for taking this slightly out of context, but it touches upon something I've been wondering about recently, which is: do we have a complete set of WMF projects?
great topic :-D
in my personal vision, it is rather obvious we should consider the work of the wmf as "perpetually unfinished" just as wikipedia or any of its other projects: an ongoing process, never ever {{done}} completely.
to just do a little brainstorm, let me share some ideas as well:
- a compendium to wikipedia, collecting each and every complete older
encyclopedia (which is no longer copyrighted), thus also giving a peek into the history of knowledge and of encyclopedias (does this really belong in wikisource? maybe)
- a wikimusic including a musical dictionary, where one can e.g. look
up themes and melodies, find sheet music and recordings, searching by notes etc
- i also thought of wikimaps, somebody mentioned this already, imnsho
including "all maps" in detailed resolutions also historical maps, thus also giving a peek into the history of geography and of cartography as well as leaving room for original creations under a free license (new maps)
just my 2 cts ;-)
all the best, oscar
-- *edito ergo sum*
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Perhaps we need a peripheral Wikipedia layer for items meeting V (verifiability), but with N (notability) based on more general assumptions: a level for verifiable articles that don't meet current notability standards.
It could be a separate project, Wikidirectory--just as we moved out dicdefs, and quotations, and so on, except that there are already too many projects to keep track of. Could we do it within Wikipedia, perhaps as a namespace?
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
On Fri, Sep 11, 2009 at 9:59 AM, Robert Rohde rarohde@gmail.com wrote:
If we are just throwing out random ideas...
I've long wanted to see an open source project to create a world family tree, i.e. document the ancestry and connections between everyone ever. There are a couple high profile closed source / fee based projects aiming to do this, but no successful projects that really have open access as part of their foundation. Even if we limited such a project to just deceased individuals (as the big projects usually do) it would still be a massive undertaking and potentially very useful for researchers.
However, while a wiki could work, it would be a suboptimal approach. Much like wikispecies, genealogical information has a heavy component of structured data that could benefit from dedicated tools designed for that data. As has been suggested elsewhere, it seems that most of the things that can be easily done by a wiki are already being done either by us or by Wikia and similar third parties.
-Robert Rohde
On Wed, Sep 9, 2009 at 8:24 AM, oscar oscar@wikimedia.org wrote:
On 9/9/09, Michael Peel email@mikepeel.net wrote:
On 2 Sep 2009, at 12:35, David Goodman wrote:
There is sufficient missing material in every Wikipedia, sufficient lack of coverage of areas outside the primary language zone and in earlier periods, sufficient unsourced material; sufficient need for updating articles, sufficient potentially free media to add, sufficient needed imagery to get; that we have more than enough work for all the volunteers we are likely to get.
I apologise for taking this slightly out of context, but it touches upon something I've been wondering about recently, which is: do we have a complete set of WMF projects?
great topic :-D
in my personal vision, it is rather obvious we should consider the work of the wmf as "perpetually unfinished" just as wikipedia or any of its other projects: an ongoing process, never ever {{done}} completely.
to just do a little brainstorm, let me share some ideas as well:
- a compendium to wikipedia, collecting each and every complete older encyclopedia (which is no longer copyrighted), thus also giving a peek into the history of knowledge and of encyclopedias (does this really belong in wikisource? maybe)
- a wikimusic including a musical dictionary, where one can e.g. look up themes and melodies, find sheet music and recordings, search by notes, etc.
- i also thought of wikimaps, somebody mentioned this already, imnsho including "all maps" in detailed resolutions, also historical maps, thus also giving a peek into the history of geography and of cartography as well as leaving room for original creations under a free license (new maps)
just my 2 cts ;-)
all the best, oscar
-- *edito ergo sum*
On Fri, Sep 11, 2009 at 7:20 PM, David Goodman dgoodmanny@gmail.com wrote:
Perhaps we need a peripheral Wikipedia layer for items meeting V, but where N is based on general assumptions: a level for verifiable articles that don't meet current notability standards.
It could be a separate project, Wikidirectory--just as we moved out dicdefs, and quotations, and so on, except that there are already too many projects to keep track of. Could we do it within Wikipedia, perhaps as a namespace?
David Goodman, Ph.D, M.L.S. http://en.wikipedia.org/wiki/User_talk:DGG
Another idea that I encountered somewhere (not currently sure where) was to create a global wiki directory to essentially replace the yellow pages. Something managed under a wiki model to include the names, addresses, phone numbers, websites, and a short description of any and all local businesses. Commercial businesses are a fine example of entities that are usually verifiable but not notable from Wikipedia's point of view, and having a central repository of directory information would generally be useful. A crowd sourced directory would suffer from the general problems of accuracy that all our wikiprojects have to worry about, but probably has the potential to include more comprehensive information than the commercial providers can manage.
If people truly believe in the "sum of all human knowledge" paradigm, then eventually we'll have to confront what to do with a wide range of factual information (like yellow page listings, family trees, sports almanacs, and other things) that are permanent or semi-permanent and yet generally not in the scope of projects like Wikipedia because they aren't very notable. Wikibooks can vaguely address some of this, but shoehorning everything into a "book" model doesn't really make sense either.
-Robert Rohde
On Fri, Sep 11, 2009 at 7:43 PM, Robert Rohde rarohde@gmail.com wrote:
Another idea that I encountered somewhere (not currently sure where) was to create a global wiki directory to essentially replace the yellow pages. Something managed under a wiki model to include the names, addresses, phone numbers, websites, and a short description of any and all local businesses. Commercial businesses are a fine example of entities that are usually verifiable but not notable from Wikipedia's point of view, and having a central repository of directory information would generally be useful. A crowd sourced directory would suffer from the general problems of accuracy that all our wikiprojects have to worry about, but probably has the potential to include more comprehensive information than the commercial providers can manage.
WiserEarth http://wiserearth.org/ originally intended to do this for organizations with social missions. Its emphasis has since expanded to become more social network/community-oriented, but its directory data still exists, is editable, and is an important part of the site.
=Eugene
I find projects to build large catalogs interesting; many of these are simply not done well online. If you have a free catalog, you can visualize its elements on a map / with search; add reviews and images and comments to what was previously uncommentable. Some potential catalogs:
- Organizations - O(10M) V [ like wiserearth, whitepages? ]
- Books and published works - O(10M) new published pieces a year. [ openlibrary.org ]
- Websites - O(10M) V, non-spam sites? [ aboutus.com ]
- People/genealogy - O(1B) verifiable data points. [ rodovid : O(100k) ]
- Locations - O(1B) buildings and urban places. Fewer identifiable/V places in nature.
- Popular online files - O(1B) V + N (for some defined N), by url + date
- Mass-produced objects - O(10B) V, by UPC or other code. [ no public site ]
On Sat, Sep 12, 2009 at 12:13 PM, Eugene Eric Kim eekim@blueoxen.com wrote:
On Fri, Sep 11, 2009 at 7:43 PM, Robert Rohde rarohde@gmail.com wrote:
Another idea that I encountered somewhere (not currently sure where) was to create a global wiki directory to essentially replace the yellow pages. Something managed under a wiki model to include the names, addresses, phone numbers, websites, and a short description of any and all local businesses. Commercial businesses are a fine example of entities that are usually verifiable but not notable from Wikipedia's point of view, and having a central repository of directory information would generally be useful. A crowd sourced directory would suffer from the general problems of accuracy that all our wikiprojects have to worry about, but probably has the potential to include more comprehensive information than the commercial providers can manage.
WiserEarth http://wiserearth.org/ originally intended to do this for organizations with social missions. Its emphasis has since expanded to become more social network/community-oriented, but its directory data still exists, is editable, and is an important part of the site.
=Eugene
--
Eugene Eric Kim ................................ http://xri.net/=eekim Blue Oxen Associates ........................ http://www.blueoxen.com/ ======================================================================
On Fri, Sep 11, 2009 at 10:43 PM, Robert Rohde rarohde@gmail.com wrote:
Another idea that I encountered somewhere (not currently sure where) was to create a global wiki directory to essentially replace the yellow pages. Something managed under a wiki model to include the names, addresses, phone numbers, websites, and a short description of any and all local businesses. Commercial businesses are a fine example of entities that are usually verifiable but not notable from Wikipedia's point of view, and having a central repository of directory information would generally be useful. A crowd sourced directory would suffer from the general problems of accuracy that all our wikiprojects have to worry about, but probably has the potential to include more comprehensive information than the commercial providers can manage.
This would be nice to tie in with OpenStreetMap, so that amenity=fast_food (http://wiki.openstreetmap.org/wiki/Tag:amenity%3Dfast_food) could be additionally tagged with something like guid=309FAF94-9FC6-11DE-A978-099455D89593, and then http://xx.local.wikimedia.org/wiki/309FAF94-9FC6-11DE-A978-099455D89593 would have the directory information (in the xx language). The wiki would probably contain lots of information not included in OSM (OSM are IMHO a little bit overly copyright-paranoid), but in theory one could make a smartphone app which lets you find the nearest fast food restaurants (OSM), takes you to a description page for the nearest one (wiki), and provides driving directions (OSM)...
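To make that concrete, here is a minimal Python sketch of going from an OSM node to the proposed directory page. The guid tag and the xx.local.wikimedia.org URL scheme are the hypotheticals from the message above, not anything OSM or Wikimedia actually defines:

# Sketch only: the "guid" tag and xx.local.wikimedia.org are hypothetical.
import xml.etree.ElementTree as ET

OSM_NODE = """
<node id="-1" lat="45.76" lon="4.82">
  <tag k="amenity" v="fast_food" />
  <tag k="guid" v="309FAF94-9FC6-11DE-A978-099455D89593" />
</node>
"""

def directory_url(node_xml, lang="en"):
    # Collect the node's free-form tags into a dict.
    tags = {t.get("k"): t.get("v")
            for t in ET.fromstring(node_xml).findall("tag")}
    guid = tags.get("guid")
    if guid is None:
        return None
    # Hypothetical per-language directory URL scheme from the message above.
    return "http://%s.local.wikimedia.org/wiki/%s" % (lang, guid)

print(directory_url(OSM_NODE))
# -> http://en.local.wikimedia.org/wiki/309FAF94-9FC6-11DE-A978-099455D89593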
On Sat, Sep 12, 2009 at 8:06 PM, Anthony wikimail@inbox.org wrote:
On Fri, Sep 11, 2009 at 10:43 PM, Robert Rohde rarohde@gmail.com wrote:
Another idea that I encountered somewhere (not currently sure where) was to create a global wiki directory to essentially replace the yellow pages. Something managed under a wiki model to include the names, addresses, phone numbers, websites, and a short description of any and all local businesses. Commercial businesses are a fine example of entities that are usually verifiable but not notable from Wikipedia's point of view, and having a central repository of directory information would generally be useful. A crowd sourced directory would suffer from the general problems of accuracy that all our wikiprojects have to worry about, but probably has the potential to include more comprehensive information than the commercial providers can manage.
This would be nice to tie in with OpenStreetMap, so that amenity=fast_food (http://wiki.openstreetmap.org/wiki/Tag:amenity%3Dfast_food) could be additionally tagged with something like guid=309FAF94-9FC6-11DE-A978-099455D89593, and then http://xx.local.wikimedia.org/wiki/309FAF94-9FC6-11DE-A978-099455D89593 would have the directory information (in the xx language). The wiki would probably contain lots of information not included in OSM (OSM are IMHO a little bit overly copyright-paranoid), but in theory one could make a smartphone app which lets you find the nearest fast food restaurants (OSM), takes you to a description page for the nearest one (wiki), and provides driving directions (OSM)...
You have this directory already! Just extract all the deleted articles that are not notable; you will have all the small businesses that got deleted.
In fact that is the biggest pain that people face: they get deleted. We need some way to keep pages from being deleted outright, and have them moved instead.
Wikilocal, Wikiunnotable, Wikitrivia, Wikihometown: these are the types of things that I would like to see. Something that lowers the notability bar and allows clubs, bands, and cafes to be listed, while maintaining the wikipedia level of neutrality.
When I was in Prizren, Kosovo, I took pictures of every street intersection. There are lots of shops and places that would never make it to the wikipedia but would be interesting to someone who wants to visit the place.
here are some of my dumps on archive.org: http://www.archive.org/details/KosovoPrizren4 http://www.archive.org/details/KosovoPrizren3 http://www.archive.org/details/KosovoPrizren2 http://www.archive.org/details/KosovoPrizren
I have also started to create articles for the most famous religious buildings. What would be nice is to put all the streets on there. Even to put some streets from OSM in the wiki, sure, why not?
I was just playing this online monopoly game, and some of our work on OSM in Kosovo is now showing up there: http://img11.yfrog.com/i/bildschirmfoto1h.png/
On Sat, Sep 12, 2009 at 2:17 PM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
You have this directory already! Just extract all the deleted articles that are not notable; you will have all the small businesses that got deleted.
In fact that is the biggest pain that people face: they get deleted. We need some way to keep pages from being deleted outright, and have them moved instead.
I think first there needs to be a place to move them to. OSM has come a long way since I first proposed Wikiteer, which was meant to be a combination of items of local interest which were deleted as non-notable plus the GIS support structure to map them. Nowadays I'm comfortable letting OSM manage the GIS/mapping aspect and sticking to just the description pages (phone numbers, addresses, websites, operating hours, maybe even some topic specific info like menus or admission prices or show schedules).
Maybe it's finally time to reintroduce the idea. But I'm going to let someone else deal with the bureaucracy.
On Sat, Sep 12, 2009 at 8:37 PM, Anthony wikimail@inbox.org wrote:
On Sat, Sep 12, 2009 at 2:17 PM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
You have this directory already! Just extract all the deleted articles that are not notable; you will have all the small businesses that got deleted.
In fact that is the biggest pain that people face: they get deleted. We need some way to keep pages from being deleted outright, and have them moved instead.
I think first there needs to be a place to move them to. OSM has come a long way since I first proposed Wikiteer, which was meant to be a combination of items of local interest which were deleted as non-notable plus the GIS support structure to map them. Nowadays I'm comfortable letting OSM manage the GIS/mapping aspect and sticking to just the description pages (phone numbers, addresses, websites, operating hours, maybe even some topic specific info like menus or admission prices or show schedules).
OSM supports addresses and websites. In fact it could store all the data; it is free-form. You could have key=price-food-hamburger val=1$ if you want.
Basically you can store all the data that google local does in osm.
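As a toy illustration of that free-form model: addr:street, website, and phone are real OSM tag conventions, while price-food-hamburger is the made-up example above, and all the values here are invented:

# Any key/value pair can be attached to an OSM node; values are invented.
import xml.etree.ElementTree as ET

tags = {
    "amenity": "fast_food",
    "addr:street": "Main Street",       # real OSM tagging convention
    "website": "http://example.com/",   # real OSM tagging convention
    "phone": "+383 29 000000",          # real convention, invented number
    "price-food-hamburger": "1$",       # made-up example key from above
}

node = ET.Element("node", id="-1", lat="42.21", lon="20.74")
for k, v in tags.items():
    ET.SubElement(node, "tag", k=k, v=v)
print(ET.tostring(node, encoding="unicode"))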
But people want to have a wikipage for themselves. Small business etc.
mike
On Sat, Sep 12, 2009 at 2:51 PM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
On Sat, Sep 12, 2009 at 8:37 PM, Anthony wikimail@inbox.org wrote:
On Sat, Sep 12, 2009 at 2:17 PM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
You have this directory already! Just extract all the deleted articles that are not notable; you will have all the small businesses that got deleted.
In fact that is the biggest pain that people face: they get deleted. We need some way to keep pages from being deleted outright, and have them moved instead.
I think first there needs to be a place to move them to. OSM has come a long way since I first proposed Wikiteer, which was meant to be a combination of items of local interest which were deleted as non-notable plus the GIS support structure to map them. Nowadays I'm comfortable letting OSM manage the GIS/mapping aspect and sticking to just the description pages (phone numbers, addresses, websites, operating hours, maybe even some topic specific info like menus or admission prices or show schedules).
OSM supports addresses and websites. In fact it could store all the data; it is free-form. You could have key=price-food-hamburger val=1$ if you want.
Addresses, fine. Websites, maybe (it's kind of redundant putting website=http://www.publix.com/ on every single store location). Phone numbers, I'd say no - that's too transient and not really location-based in the first place. Prices, absolutely not - way too transient for a GIS.
Basically you can store all the data that google local does in osm.
You could, but personally I don't think that'd be very good from a design perspective. OSM could, of course, have a separate database for this type of information, but that seems like the kind of thing that the WMF does better.
I can pretty much guarantee you that Google Local doesn't put phone numbers in its GIS database under the nodes table, but puts that stuff in a separate database (or at least a separate table) and links the two databases/tables together.
This is especially true when it comes to relatively long freeform text descriptions, both from a perspective of performance (you want this stuff in separate tables) and editing (I don't want to use Potlatch or JOSM to update the operating hours of my local library).
Does OSM maintain a persistent unique ID for every node/way? That's probably the way to do it. It also solves the copyright problem to some extent: OSM can add nodes using sources it feels comfortable with, and Wikipedia can geolocate articles using sources it's comfortable with, and the two can be tied together with the OSM node's unique ID - even if the positions are a few feet different due to different sourcing.
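A minimal sketch of that separate-tables idea, using SQLite purely for illustration; the schema and column names here are invented, not OSM's or Google's actual layout:

# Invented two-table layout: slim GIS table + wiki-editable directory
# table, joined on the persistent OSM node id.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE nodes (
    node_id INTEGER PRIMARY KEY,   -- persistent OSM id
    lat REAL,
    lon REAL
);
CREATE TABLE directory (
    node_id INTEGER REFERENCES nodes(node_id),
    lang TEXT,
    hours TEXT,
    description TEXT
);
""")
db.execute("INSERT INTO nodes VALUES (26127031, 45.7644547, 4.8280137)")
db.execute("INSERT INTO directory VALUES (26127031, 'fr', NULL, "
           "'Protestant temple on the Place du Change, Lyon')")

print(db.execute("""
    SELECT n.lat, n.lon, d.description
    FROM nodes n JOIN directory d ON d.node_id = n.node_id
    WHERE n.node_id = 26127031
""").fetchone())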
On Sat, Sep 12, 2009 at 3:13 PM, Anthony wikimail@inbox.org wrote:
Does OSM maintain a persistent unique ID for every node/way?
Answering my own question, yes.
<node id="26127031" lat="45.7644547" lon="4.8280137" version="1" changeset="223034" user="FredB" uid="1626" visible="true" timestamp="2007-02-24T17:39:12Z"> <tag k="name" v="Temple du Change" /> <tag k="religion" v="christian" /> <tag k="amenity" v="place_of_worship" /> </node>
And http://fr.wikipedia.org/wiki/Temple_du_Change , which locates the temple at the rounded 45.764444, 4.827778, thus making it difficult to link the two points together.
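The two coordinate pairs can be compared directly; a rough Python sketch of tolerance-based matching follows (the 100 m threshold is an arbitrary guess, not an established rule):

# Haversine distance between the OSM node and the rounded Wikipedia
# coordinates; the tolerance value is arbitrary.
from math import radians, sin, cos, asin, sqrt

def distance_m(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * asin(sqrt(a))  # metres

osm = (45.7644547, 4.8280137)  # node 26127031
wp = (45.764444, 4.827778)     # fr.wikipedia, rounded

d = distance_m(osm[0], osm[1], wp[0], wp[1])
print(d)          # about 18 metres
print(d < 100.0)  # True: plausibly the same place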
Just updated: http://fr.wikipedia.org/wiki/Temple_du_Change now has a link to this: http://www.openstreetmap.org/browse/node/26127031
But realistically, we need to be able to match them together with error tolerances. These differences are very small; see the red dot: http://yfrog.com/elbildschirmfotoopenstreep
Of course we would have to check them all. But if you look at the satellite photo you will see: it takes up the whole block: http://maps.google.de/maps?f=q&source=s_q&hl=en&geocode=&q=T...
Therefore they are all right, within a tolerance of one city block.
mike
On Sat, Sep 12, 2009 at 9:37 PM, Anthony wikimail@inbox.org wrote:
On Sat, Sep 12, 2009 at 3:13 PM, Anthony wikimail@inbox.org wrote:
Does OSM maintain a persistent unique ID for every node/way?
Answering my own question, yes.
<node id="26127031" lat="45.7644547" lon="4.8280137" version="1" changeset="223034" user="FredB" uid="1626" visible="true" timestamp="2007-02-24T17:39:12Z"> <tag k="name" v="Temple du Change" /> <tag k="religion" v="christian" /> <tag k="amenity" v="place_of_worship" />
</node>
And http://fr.wikipedia.org/wiki/Temple_du_Change , which locates the temple at the rounded 45.764444, 4.827778, thus making it difficult to link the two points together.
On Sat, Sep 12, 2009 at 3:48 PM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
Just updated: http://fr.wikipedia.org/wiki/Temple_du_Change now has a link to this: http://www.openstreetmap.org/browse/node/26127031
But realistically, we need to be able to match them together with error tolerances.
Using the unique ID is probably better, and works with ways like:
http://www.openstreetmap.org/browse/way/6678417 (http://en.wikipedia.org/wiki/Jacqueline_Kennedy_Onassis_Reservoir)
and
http://www.openstreetmap.org/browse/way/12174465 (http://en.wikipedia.org/wiki/Broad_Street_(Philadelphia))
which are currently linked in Wikipedia by coordinates. (Hmm, I guess the latter is a problem when the road is split into multiple ways; this might need some sort of relation.)
Which is getting a little off topic, since the idea is to include less notable places that aren't accepted into Wikipedia (the playground around the corner from my house). But the same principles will apply.
Now I'm going to get yelled at for posting too much, but better links would probably be:
http://www.openstreetmap.org/?way=12174465 http://www.openstreetmap.org/?way=6678417
On Sat, Sep 12, 2009 at 4:23 PM, Anthony wikimail@inbox.org wrote:
http://www.openstreetmap.org/browse/way/6678417 (http://en.wikipedia.org/wiki/Jacqueline_Kennedy_Onassis_Reservoir)
and
http://www.openstreetmap.org/browse/way/12174465 (http://en.wikipedia.org/wiki/Broad_Street_(Philadelphia))
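The persistent way id in the links above can also be resolved programmatically: a minimal sketch against the OSM 0.6 API (a real endpoint; error handling is omitted, and the tags returned will vary as the live data is edited):

# Fetch a way by its persistent id from the OSM 0.6 API and list its tags.
import urllib.request
import xml.etree.ElementTree as ET

WAY_ID = 12174465  # Broad Street, from the links above
url = "https://api.openstreetmap.org/api/0.6/way/%d" % WAY_ID
with urllib.request.urlopen(url) as resp:
    way = ET.parse(resp).getroot().find("way")

for tag in way.findall("tag"):
    print(tag.get("k"), "=", tag.get("v"))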
When I was in Prizren, Kosovo, I took pictures of every street intersection. There are lots of shops and places that would never make it to the wikipedia but would be interesting to someone who wants to visit the place.
here are some of my dumps on archive.org: http://www.archive.org/details/KosovoPrizren4 http://www.archive.org/details/KosovoPrizren3 http://www.archive.org/details/KosovoPrizren2 http://www.archive.org/details/KosovoPrizren
I have also started to create articles for the most famous religious buildings. What would be nice is to put all the streets on there. Even to put some streets from OSM in the wiki, sure, why not?
Well, ideally we will have wikipedia articles about every street in Prizren, and they are going to be illustrated by your pictures (I hope there is FoP, freedom of panorama, in Kosovo).
Cheers Yaroslav
On Sat, Sep 12, 2009 at 9:25 PM, Yaroslav M. Blanter putevod@mccme.ru wrote:
Well, ideally we will have wikipedia articles about every street in Prizren, and they are going to be illustrated by your pictures (I hope there is FoP, freedom of panorama, in Kosovo).
Yes? I can create them.
How would I name them? Prizren/FaradinHoti?
What would they look like? Like this? http://en.wikipedia.org/wiki/Bill_Clinton_Boulevard
What would the notability be? Some streets are really not interesting.
mike
Well, ideally we will have wikipedia articles about every street in Prizren, and they are going to be illustrated by your pictures (I hope there is FoP, freedom of panorama, in Kosovo).
Yes? I can create them.
How would I name them? Prizren/FaradinHoti?
What would they look like? Like this? http://en.wikipedia.org/wiki/Bill_Clinton_Boulevard
What would the notability be? Some streets are really not interesting.
mike
I do not know the en.wp rules so well, but do not streets have inherent notability? For instance, FaradinHoti (street in Prizren), or some other name.
Cheers Yaroslav
I just added the street, and already they want to delete it. What should I put as a reason? http://en.wikipedia.org/wiki/FaradinHoti_(street_in_Prizren,_Kosovo)
On Sat, Sep 12, 2009 at 10:54 PM, Yaroslav M. Blanter putevod@mccme.ru wrote:
Well, ideally we will have wikipedia articles about every street in Prizren, and they are going to be illustrated by your pictures (I hope there is FoP, freedom of panorama, in Kosovo).
Yes? I can create them.
How would I name them? Prizren/FaradinHoti?
What would they look like? Like this? http://en.wikipedia.org/wiki/Bill_Clinton_Boulevard
What would the notability be? Some streets are really not interesting.
mike
I do not know the en.wp rules so well, but do not streets have inherent notability? For instance, FaradinHoti (street in Prizren), or some other name.
Cheers Yaroslav
Hoi, I can't respond to that question about a particular project on this mailing list. But suffice it to say, at least some language Wikipedias aren't going to include articles about streets without a reference to a published source. "I went there and took these pictures of it" doesn't qualify as verifiability.
Anthony
On Sat, Sep 12, 2009 at 5:04 PM, jamesmikedupont@googlemail.com <jamesmikedupont@googlemail.com> wrote:
I just added the street, and already they want to delete it. What should I put as a reason? http://en.wikipedia.org/wiki/FaradinHoti_(street_in_Prizren,_Kosovo)
On Sat, Sep 12, 2009 at 10:54 PM, Yaroslav M. Blanter putevod@mccme.ru wrote:
Well, ideally we will have wikipedia articles about every street in Prizren, and they are going to be illustrated by your pictures (I hope there is FoP, freedom of panorama, in Kosovo).
Yes? I can create them.
How would I name them? Prizren/FaradinHoti?
What would they look like? Like this? http://en.wikipedia.org/wiki/Bill_Clinton_Boulevard
What would the notability be? Some streets are really not interesting.
mike
I do not know the en.wp rules so well, but do not streets have inherent notability? For instance, FaradinHoti (street in Prizren), or some other name.
Cheers Yaroslav
On Fri, Sep 11, 2009 at 11:59 PM, Robert Rohde rarohde@gmail.com wrote:
If we are just throwing out random ideas...
I've long wanted to see an open source project to create a world family tree, i.e. document the ancestry and connections between everyone ever. There are a couple high profile closed source / fee based projects aiming to do this, but no successful projects that really have open access as part of their foundation. Even if we limited such a project to just deceased individuals (as the big projects usually do) it would still be a massive undertaking and potentially very useful for researchers.
However, while a wiki could work, it would be a suboptimal approach. Much like wikispecies, genealogical information has a heavy component of structured data that could benefit from dedicated tools designed for that data. As has been suggested elsewhere, it seems that most of the things that can be easily done by a wiki are already being done either by us or by Wikia and similar third parties.
Robert,
Are you familiar with Rodovid?
It has been mentioned in this thread by Yann, and is the top project mentioned here:
http://meta.wikimedia.org/wiki/Proposals_for_new_projects
http://meta.wikimedia.org/wiki/Rodovid
-- John Vandenberg
Lars Aronsson wrote:
Yann Forget wrote:
I started a proposal on the Strategy Wiki: http://strategy.wikimedia.org/wiki/Proposal:Building_a_database_of_all_books...
IMO this should be a joint project between OpenLibrary and Wikimedia.
Again, I don't understand why. What exactly is missing in OpenLibrary? Why does it need to be a new, joint project?
The page says "There is currently no database of all books ever published freely available." But OpenLibrary is a project already working towards exactly that goal. It's not done yet, and its methods are not yet fully developed. But neither would your new "joint" project be, for a very long time.
Wikipedia is also far from complete, far from containing "the sum of all human knowledge". But that doesn't create a need to start entirely new encyclopedia projects. It only means more contributors are needed in the existing Wikipedia.
You are just giving the same arguments again, which I have already answered. Did you read my answer?
Regards,
Yann