Hi folks,
0. intro/about Wikimania
1. POTY07
2. 1000FP book
3. Project with Wikimedia Israel
4. Goals statement
5. Tech
6. SVGs
7. FLOSS Manuals
8. Joi, CC
9. Licenses, GFDL, GSFDL drafts (long)
I'm back now from Wikimania. It was a really exciting event, and just so cool to be able to speak to so many different people about lots of different things. I will write below a summary of some of the Commons-related things I talked to people about.
Florence used a slide of the Wikimedia logo mosaic in her opening speech. I felt very proud for us :) and I still think that was a great project. I think maybe it made a good impression on the board in terms of our relationship with them.
I gave a 'Commons HOWTO tutorial' which started with about a dozen people attending and by the end there was another dozen or two. I talked briefly about how Commons works, deriv works, licenses, categories, CommonSense tool, Mayflower, Extra-tabs.js (although I didn't call it that, and one person commented that they had never seen them before - we DO have them installed for EVERYONE, right??). We did a transfer of an image from en.wp to Commons. We got some tough questions like 'I'm uploading these for my wife - what should I do vis-a-vis permission?' (I said ask her to send you a confirmation email. This case of genuine verbal permission is tricky to handle.) It was fun.
BTW I hope next time that a lot more Commoners are there. :) This was more or less your Commons representation: http://commons.wikimedia.org/wiki/Image:Wikimania_2007_Commons_puzzle_piece.... It was super fun, and anyone who is a 'wikiholic' should at the very least apply for a scholarship!!
==POTY07==
I will post separately about this. But I think we should start getting into gear or at least thinking about things. I think we should close submissions by the end of Nov, hold two weeks of voting, and use the results to make calendar/s. I also think we should have 12-15 different categories of voting and more or less use those as the calendar pictures.
==1000FP book==
I did a 5-min lightning talk mentioning the idea of the 1000FP book, and Evan P... (Wikitravel guy) was there. He told me that Flickr had done a partnership with Blurb.com (print on demand) on a book called something like '24 hours on Flickr'. Apparently they are interested in this kind of 'new technology' internet stuff, so I'm going to try to talk to Blurb and see what they think.
Blurb was something I originally thought might work because print-on-demand saves a lot of hassle like worrying about sales numbers, and Blurb also have a special template for photobooks. Let's face it, if you're going to make a photobook you really want decent print quality.
I think we should try to distribute books (of Wikimedian works)+CDs (with all 1000FP). According to CatScan we are up to 913.
I now think that instead of asking/wishing for the board to promote us, we need to force them to have to talk about us by creating the interest ourselves. I really think this project could create some serious interest. And maybe if enough proceeds go to the WMF, then we can demand with a little more conviction that they should be earmarked for Commons-specific development. :)
Reminder: if you're interested in book development, please join http://groups.google.com/group/commonsbook .
==Project with Wikimedia Israel==
I spent some time talking to a lovely chap named Dror (User:Drork) who describes himself as the "foreign minister" of the Wikimedia Israel chapter. Their chapter is planning to start a big project with another company in Israel to ask people to look for historically significant images in their private collections and donate them to the public domain. (Israel is having some anniversaries so it is timely for them.) He basically wanted to know that such images would be welcome in Commons. I said of course! Then we discussed how the images should be collected. We decided it would be better if the images were submitted to a gateway before being added to Commons. The organisation that he is working with is willing to write the open source software to collect all the structured data that they want to collect (and also deal with Hebrew, Arabic etc. interface stuff, which we do poorly at best). So the idea at the moment is that their partner org will write this interface, we will install it on the toolserver, and from there the submissions can be appropriately formatted etc. and go into Commons. We have quite a few toolserver users working for Commons, so that won't be a problem I think; I just thought that was a pretty cool project. They plan to run a pilot project and if it's successful (they don't know what kind of stuff people will even submit), then a full-on project may run for about a year.
==Goals statement==
I am even more convinced now that having a clear goals/aims statement will help guide the growth of the wiki by providing direction. So I plan to try and work on this. If you are interested in it then let me know (or keep an eye on my userspace :)).
==Tech==
I spent a little time talking to Tim Starling about Commons' tech priorities and even showed him the Mayflower search engine, which he was unaware of (!). I intend to write up a summary for wikitech-l of what I consider to be Commons' tech priorities. One of them will be replacing/integrating Mayflower as our search engine. I really consider this a major priority for us.
==SVGs==
Apparently the Inkscape founder went to Wikimania. I only found this out at the airport as I was leaving. :( I am thinking of trying to contact him to make them aware of the work we are doing with SVGs, because as far as I can see no one else anywhere is doing the kind of things we do with them. Surely someone somewhere must care! :P
==FLOSS Manuals==
I ended up spending quite a bit of time talking to Adam from FLOSS Manuals. http://flossmanuals.net/ He gave me a copy of their first printed manual on Audacity. Part of their thing is really concentrating on high-quality documentation. I suggested a chapter on how to edit human voice recordings would be really useful for Wikimedians. He told me they use TWiki. If you check out their site and the section called 'Remix' you can see how it's possible to pick and choose particular chapters to be included in a print-on-demand book. Now this is amazingly cool. I imagine a really useful thing - FLOSS Manuals having a 'Wikimedia Commons Media Handbook' with chapters on Inkscape & GIMP, and Audacity & Ogg Theora (same book? separate book?).
Now our Images for Cleanup people already know a lot of this stuff. If Commons people think something like this could be useful, I will let them know there is interest in it. If Commons people are interested in writing for it, that's all the better I think. I think he is even trying to get sponsorship to be able to retrospectively pay contributors.
I think there's just great power in having something in your hands to demonstrate what you're doing and help spread the word, convince others. Those of us who find this kind of tech second nature are a very small minority.
==Joi, CC==
Joi gave a speech about the 'sharing economy' and I managed to talk to him briefly afterwards, about how I have this feeling that we have this great resource that no one knows anything about. We have the Commons know-how, he has the media contacts, maybe we can make something happen.
I think it's a very cool thing that some of the Creative Commons people take part on this list. To me it shows that they are reaching out and want to work with us and I think we should be reaching back too. Wikimedia is a bit insular sometimes... We have much more in common than we do different.
==Licenses, GFDL, GSFDL drafts==
So there was actually a surprising number of talks about licenses and licensing. Although there are some of us who know a ton about licenses there are still many of us who don't. So although some of us will feel like we are repeating ourselves it still seems really important that we keep re-iterating the basic messages about how copyright works, how free licenses work, how to use them, etc. I mean at the same time we still have to deal with the nitty-gritty crap like interpreting licenses in different jurisdictions and deriv works and the like. So it's not easy but I think it's really important we keep walking along both lines.
I went to this talk http://wikimania2007.wikimedia.org/wiki/Proceedings:MH1 in which I found out a new draft of the GFDL is open for comment at the moment. The new GFDL has a clause about 'excerpts' where an excerpt can be distributed without the full license text. See http://gplv3.fsf.org/fdl-draft-2006-09-22.html clause 6. Bizarrely this mentions how an excerpt of a text document should be interpreted, and an audio and video document, but not an image. Er???? Is it that an excerpt of an image cannot exist? Is an excerpt of an image like a crop, or a thumbnail, or like nothing at all? If 'excerpt' has no meaning in the field of images then they should explicitly say that, I think.
The other complaint I have about this new draft is that the wording is still oriented to text documents (and even very long ones at that), despite the fact that they acknowledge the license can be, and is, used for other works. How should a 'title page' be interpreted in terms of an image? I intend to make a comment asking them to reword such things or else explicitly state the interpretation of the terms for various media.
It also has this important clause 8a: "If the Work has no Cover Texts and no Invariant Sections then you may relicense the Work under the GNU Simpler Free Documentation License."
So maybe we (Wikimedia) could have this: (now) GFDL1.2 -> GFDL2 -> GSFDL === CC-BY-SA?
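To make the clause 8a condition quoted above concrete, here is a minimal Python sketch. It models only the two conditions named in the quoted sentence (no Cover Texts, no Invariant Sections); the `Work` class, its field names, and the function name are hypothetical and invented for illustration, and nothing here is a reading of the rest of the draft.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    """Hypothetical record of a work and the GFDL extras attached to it."""
    title: str
    cover_texts: List[str] = field(default_factory=list)
    invariant_sections: List[str] = field(default_factory=list)


def may_relicense_to_gsfdl(work: Work) -> bool:
    """Return True when the work has no Cover Texts and no Invariant Sections.

    This mirrors only the sentence quoted from clause 8a of the draft;
    any other conditions in the draft are not modelled here.
    """
    return not work.cover_texts and not work.invariant_sections


# A typical Commons image carries neither Cover Texts nor Invariant Sections.
photo = Work(title="Example photo")
manual = Work(title="Example manual", invariant_sections=["Preface"])

print(may_relicense_to_gsfdl(photo))
print(may_relicense_to_gsfdl(manual))
```

Under this reading, a work with neither kind of attachment would pass the check, while a manual with an invariant preface would not.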
OK why is this important? Creative Commons' goals are not necessarily ours, etc etc. I think this is a direction we should try to go forward in because in essence we have the same goals, and keeping distinct licenses for no good reason makes understanding the situation unnecessarily complex, and thus harms our ability to communicate our mission and vision to as wide an audience as possible.
Note 'no good reason'. If good reasons exist, we should keep the distinctions.
Is the 'fulltext-copy' condition of the GFDL a good reason? I don't recall ever seeing anyone put forward a reason why it's actually vitally necessary.
So, GSFDL.
The GSFDL currently has a very stupid clause called '0a. FREE MANUALS ARE ESSENTIAL'. When they just acknowledged that the license can be used for even non-text works, I dislike the inclusion of this unnecessary clause a lot.
The GSFDL also has the 'Excerpts' thing. The main difference between the GFDL and the GSFDL is this in the GSFDL:
"You need not include a copy of this License in the Work if you have registered the work's license with a national agency that maintains a network server through which the general public can find out its license."
Hm, I don't know quite what that implies, but hopefully it leads to a situation of not needing to copy the license fulltext with each use. That was my impression from the talk.
regards Brianna user:pfctdayelise
I just love the sound of this project. I wish I could help with translating interfaces into he: or ar:, but unfortunately I cannot speak those languages. I wish them the best of luck. This is exactly the kind of project that could get people curious about free licenses; the urge to share family history (old documents, letters, photos, and in the case of Israel precious war and pre-war photos) can be quite strong. --Maria User:Arria Belli
On 8/7/07, Brianna Laugher brianna.laugher@gmail.com wrote: [snip]
I went to this talk http://wikimania2007.wikimedia.org/wiki/Proceedings:MH1 in which I found out a new draft of the GFDL is open for comment at the moment. The new GFDL has a clause about 'excerpts' where an excerpt can be distributed without the full license text. see
[snip]
My apologies for not mentioning the GFDL/SFDL drafts on commons-l before. I'd brought them up on foundation-l and wiken-l a few times and thought I'd pointed them out on commons-l, but it seems I did not.
We have a page set up for FDL suggestions at: http://meta.wikimedia.org/wiki/GFDL_suggestions
It does not currently have any related to the excerpting rules, though it very much needs some and I would have already written some but I didn't want to monopolize the page. *Please* take some time to put out your ideas there.
So maybe we (Wikimedia) could have this: (now) GFDL1.2 -> GFDL2 -> GSFDL === CC-BY-SA? OK why is this important? Creative Commons' goals are not necessarily ours, etc etc. I think this is a direction we should try to go forward in because in essence we have the same goals, and keeping distinct licenses for no good reason makes understanding the situation unnecessarily complex, and thus harms our ability to communicate our mission and vision to as wide an audience as possible. Note 'no good reason'. If good reasons exist, we should keep the distinctions.
The 'good reason' I see is that, according to a somewhat recent post by Lessig that I can't currently find, Creative Commons thinks -SA ought to be a minimal/weak copyleft. I.e. it's okay to use an unmodified SA work as an integral part of a non-free work.
This is more of a copyleft in the way that LGPL is copyleft. The GFDL is a more typical copyleft license (like the GPL) [http://www.fsf.org/blogs/licensing/2007-05-08-fdl-scope].
It's very important to have a clearly copyleft free content license. A full copyleft has the impact of using the pre-existing base of free content to encourage the creation of more free content, while a reduced copyleft doesn't really have that impact... but that's a discussion for another place.
I don't think having two popular distinct free licenses is harmful. But what is utterly important is that it be made abundantly clear that it is okay to create works combining them, and what the rules should be governing this. This isn't the case today.
Getting compatibility right might be somewhat hard but it will be a worthwhile effort. You might find a page I wrote on the compatibility subject some months ago interesting: http://commons.wikimedia.org/wiki/User:Gmaxwell/An_alternative_model_for_lic...
On Tue, 2007-08-07 at 08:42 -0400, Gregory Maxwell wrote:
The 'good reason' I see is that, according to a somewhat recent post by Lessig that I can't currently find,
Probably http://lists.ibiblio.org/pipermail/cc-community/2007-April/001703.html or another in that month's threads.
the creative commons thinks -SA ought to be a minimal/weak copyleft. I.e. it's okay to use an unmodified SA work as in integral part of a non-free work.
No, if the work B qualifies as a derivative work (US)/adaption -- see http://creativecommons.org/licenses/by-sa/3.0/us/legalcode / http://creativecommons.org/licenses/by-sa/3.0/legalcode -- of work A licensed under BY-SA, then work B has to be licensed under BY-SA.
The entire question is whether an article containing a photo is a derivative work/adaption of the photo. This was argued much, inconclusively as far as I can tell (early on it looked like "no" to me, but re-reading I'm not sure at all), in the thread above and also on cc-licenses, see threads in http://lists.ibiblio.org/pipermail/cc-licenses/2007-February/thread.html and http://lists.ibiblio.org/pipermail/cc-licenses/2007-March/thread.html
But BY-SA is not generally intended to be weak copyleft. It goes out of its way to say that merely syncing audio and video creates a derivative, with the SA requirement, for example.
Of course if an article that includes an image is considered an adaption of the image, this only serves to highlight the need for some form of FDL/BY-SA compatibility. :)
Mike
On 8/7/07, Mike Linksvayer ml@creativecommons.org wrote:
The entire question is whether an article containing a photo is a derivative work/adaption of the photo.
I don't think that's a matter open for question in the strict sense:
I create item X and offer it on the market. Item X contains item Y which is copyrighted by you. As a result Item X as a whole is encumbered by your copyright interest and can not be legally distributed without a license unless item Y is removed.
The open question is does the license in question permit this. Both the FDL and the CC-SA licenses have terms allowing collections/aggregations with non-free works.
I, and others, previously argued that both of them define collections/aggregations very narrowly:
From CC-by-sa-2.0: ""Collective Work" means a work, such as a periodical issue, anthology or encyclopedia, in which the Work in its entirety in unmodified form, along with a number of other contributions, constituting separate and independent works in themselves, are assembled into a collective whole"
I think the language of this text is very clear about the character of the things which are considered a collection... it's something where multiple wholly independent works are stitched together. Not something where works have been combined to produce an enhanced work.
But this has come up, and it seemed pretty conclusive to me that people on the cc-lists thought it was okay under SA to create a non-free work out of SA ones.
If that's actually the intent I presume the license text will eventually be changed to allow it, even if it doesn't today.
Perhaps I am mistaken in claiming it was the intent; I don't know the intent. I do know that Lessig claimed it was the current operation, and went as far as to claim that the FDL had the same behavior when it was argued that the FDL was better in this regard. That claim defied a common sense reading of the license, and was firmly and unambiguously rejected by the FSF.
[snip]
But BY-SA is not generally intended to be weak copyleft. It goes out of its way to say that merely syncing audio and video creates a derivative, with the SA requirement, for example.
It was argued in those threads that the video/audio case is special. I didn't really understand that argument. But I don't really understand the application for a copyleft license which, in the case of images, doesn't encumber across the most typical method of building new works from images (i.e. stock photography).
Of course if an article that includes image is considered an adaption of the image, this only serves to highlight the need for some form of FDL/BY-SA compatibility. :)
Absolutely.
That's also a reason why we encourage people to dual license their works. (And a fairly large number of our contributors do so).
On 07/08/07, Gregory Maxwell gmaxwell@gmail.com wrote:
We have a page setup for FDL suggestions at: http://meta.wikimedia.org/wiki/GFDL_suggestions
It does not currently have any related to the excerpting rules, though it very much needs some and I would have already written some but I didn't want to monopolize the page. *Please* take some time to put out your ideas there.
Sure. But what is going to happen to that page? Will someone definitely make it some kind of official submission? Because if we're talking amongst ourselves about the flaws but those discussions never reach the FSF then that's kind of a problem. :)
Incidentally I wonder if it is not worth having a separate licenses-l or copyright-l list for Wikimedia. A lot of this discussion is highly technical and not necessarily of interest to other subscribers who are interested in Commons. The scope of such discussions is also wider than the Commons community.
cheers Brianna
On 8/7/07, Brianna Laugher brianna.laugher@gmail.com wrote:
On 07/08/07, Gregory Maxwell gmaxwell@gmail.com wrote:
We have a page setup for FDL suggestions at: http://meta.wikimedia.org/wiki/GFDL_suggestions
Sure. But what is going to happen to that page? Will someone definitely make it some kind of official submission? Because if we're talking amongst ourselves about the flaws but those discussions never reach the FSF then that's kind of a problem. :)
Of course! The page says "specifically so that we can go to the FSF and talk about changes that Wikimedia would like to see".
The page was actually created after an informal recommendation by some of the FSF folks that Kat and I met with after the 2007 FSF members meeting.
There is a high level of interest in reaching out to meet every reasonable need of ours in the license. In the time since then their willingness to work with us could only have increased with the appointment of [[Mako Hill]], who is as much of a free content person as a free software person, to the board of the FSF.
So input from our community is needed... but input from our community should actually come from our community, and not from someone like me just blathering on. :) ...Which is why it's important that more people go and make comments on the GFDL suggestions page, even if it's just in the form of "I think this sounds good" or "I don't understand this" to the material already there.
Incidentally I wonder if it is not worth having a separate licenses-l or copyright-l list for Wikimedia. A lot of this discussion is highly technical and not necessarily of interest to other subscribers who are interested in Commons. The scope of such discussions is also wider than the Commons community.
Good idea. Although, I think we must try to encourage a broader set of users to at least come listen: other groups have seen their licensing discussions taken in weird directions by having the discussions only among the most interested parties. These are issues that impact *everyone* in our projects, even those who do not care about the technical aspects of licensing discussion. Without a broad set of eyeballs there, our recommendations cannot be considered a valid representation of the needs of the community.
"Brianna Laugher" brianna.laugher@gmail.com wrote on Wednesday, August 08, 2007 2:17 AM:
Incidentally I wonder if it is not worth having a separate licenses-l or copyright-l list for Wikimedia. A lot of this discussion is highly technical and not necessarily of interest to other subscribers who are interested in Commons. The scope of such discussions is also wider than the Commons community.
Sounds great. Maybe even together with other projects, so people can see that Fair use is no solution :)
On the other hand, people interested only in Commons won't have to read all that license stuff but can be notified by a discussion digest that just mentions the everyday aspects.
Best regards,
Flo
On 8/7/07, Brianna Laugher brianna.laugher@gmail.com wrote:
Hi folks,
Hi,
I gave a 'Commons HOWTO tutorial' which started with about a dozen people attending and by the end there was another dozen or two. I talked briefly about how Commons works, deriv works, licenses, categories, CommonSense tool, Mayflower, Extra-tabs.js (although I didn't call it that, and one person commented that they had never seen them before - we DO have them installed for EVERYONE, right??). We did a transfer of an image from en.wp to Commons. We got some tough questions like 'I'm uploading these for my wife - what should I do vis-a-vis permission?' (I said ask her to send you a confirmation email. This case of genuine verbal permission is tricky to handle.) It was fun.
[offtopic] I am going to do a lightning talk about Commons at Wikimedia Conference Netherlands in October. I will have a quick talk about most of the things pfctdayelise mentioned, but I wonder if you folks have some ideas about some specific things I should mention? (We do have a bad reputation at the Dutch wikis, you know...)
==POTY07== I will post separately about this. But I think we should start getting into gear or at least thinking about things. I think we should close submissions by the end of Nov, hold two weeks of voting, and use the results to make calendar/s. I also think we should have 12-15 different categories of voting and more or less use those as the calendar pictures.
Absolutely. We should start with the preparations somewhere around October. Last year's edition was a great success, but it also was a great mess. I hope we can make this year's better :)
I now think that instead of asking/wishing for the board to promote us, we need to force them to have to talk about us by creating the interest ourselves. I really think this project could create some serious interest. And maybe if enough proceeds go to the WMF, then we can demand with a little more conviction that they should be earmarked for Commons-specific development. :)
Hmm. Do we have a Commons tshirt? I want one.
Cheers, Bryan
On 08/08/07, Bryan Tong Minh bryan.tongminh@gmail.com wrote:
[offtopic] I am going to do a lightning talk about Commons on Wikimedia Conference Netherlands in October. I will have a quick talk about most of the things pfctdayelise mentioned, but I wonder if you folks have some ideas about some specific things I should mention? (We do have a bad reputation at the Dutch wikis, you know...)
Well, when you take questions, it's likely this will come up. Please just reiterate that we Commoners are human and can make mistakes, but we will also try to fix them when we can; we want to improve relationships between Commons & other wikis because it's really important to us, so please keep talking to us; we can't change past mistakes, only try and learn from them; the actions of one Commons user are not necessarily representative of Commons (just in the same way the actions of one nl.wp user are not necessarily representative of nl.wp).
Just listening to people's complaints is very important. Don't brush them off. If people get upset, it means we didn't handle it well, even if we were right. ***Being right is not enough.*** I really believe this. I think it's quite powerful if we can just listen to people's complaints, and say, "I accept that you're upset and we didn't handle this well enough, and I'm sorry. What do you think we can do to improve similar situations in the future?" Try not to get into the specifics of who was right and who was wrong.
We can (and IMO should) write help pages and interface messages specifically for people from particular wikis. So we could have Special:Upload?uselang=usersfromnlwp - if they think that could be a helpful thing.
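As a rough sketch of what such a link could look like, the Python snippet below builds an upload URL carrying a `uselang` value. The `uselang` query parameter is the one named above; the `usersfromnlwp` code is only the hypothetical example from this message and would need matching interface messages to exist, and the `index.php` base URL is an assumption about how the wiki is addressed.

```python
from urllib.parse import urlencode

# Assumed base URL for the Commons index.php entry point.
COMMONS_INDEX = "https://commons.wikimedia.org/w/index.php"


def upload_url(uselang: str) -> str:
    """Build a Special:Upload link that requests a given interface language code."""
    query = urlencode({"title": "Special:Upload", "uselang": uselang})
    return f"{COMMONS_INDEX}?{query}"


# 'usersfromnlwp' is the hypothetical per-wiki code suggested above.
print(upload_url("usersfromnlwp"))
```

A wiki such as nl.wp could then point its own "upload to Commons" links at a URL like this, so that its users land on help text written for them.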
I would mention the tools like Checkusage, Commonsticker, CommonSense. Make sure the Dutch interface for Commons is all good.
I hope this talk goes well. I was so impressed by the Dutch at Wikimania. They were some of my favourite people. :) Please report back when you've done it.
==POTY07== I will post separately about this. But I think we should start getting into gear or at least thinking about things. I think we should close submissions by the end of Nov, hold two weeks of voting, and use the results to make calendar/s. I also think we should have 12-15 different categories of voting and more or less use those as the calendar pictures.
Absolutely. We should start with the preparations somewhere around October. Last year's edition was a great success, but it also was a great mess. I hope we can make this year's better :)
Please participate at http://commons.wikimedia.org/wiki/Commons:Picture_of_the_Year/2007/Preparati...
I now think that instead of asking/wishing for the board to promote us, we need to force them to have to talk about us by creating the interest ourselves. I really think this project could create some serious interest. And maybe if enough proceeds go to the WMF, then we can demand with a little more conviction that they should be earmarked for Commons-specific development. :)
Hmm. Do we have a Commons tshirt? I want one.
No. And no stickers. Please contact a chapter and hassle them to produce these. And then send me some. :)
cheers Brianna
Hi
I saw a few hours ago, your "commons wish list" on wikitech (http://lists.wikimedia.org/pipermail/wikitech-l/2007-August/032652.html)
Brianna Laugher wrote:
Hi folks,
- intro/about Wikimania
- POTY07
- 1000FP book
- Project with Wikimedia Israel
- Goals statement
- Tech
- SVGs
- FLOSS Manuals
- Joi, CC
- Licenses, GFDL, GSFDL drafts (long)
I'm back now from Wikimania. It was a really exciting event, and just so cool to be able to speak to so many different people about lots of different things. I will write below a summary of some of the Commons-related things I talked to people about.
Florence used a slide of the Wikimedia logo mosaic in her opening speech. I felt very proud for us :) and I still think that was a great project. I think maybe it made a good impression on the board in terms of our relationship with them.
Dunno if it made a good impression on the board. I just thought the idea very very cool :-)
I gave a 'Commons HOWTO tutorial' which started with about a dozen people attending and by the end there was another dozen or two. I talked briefly about how Commons works, deriv works, licenses, categories, CommonSense tool, Mayflower, Extra-tabs.js (although I didn't call it that, and one person commented that they had never seen them before - we DO have them installed for EVERYONE, right??). We did a transfer of an image from en.wp to Commons. We got some tough questions like 'I'm uploading these for my wife - what should I do vis-a-vis permission?' (I said ask her to send you a confirmation email. This case of genuine verbal permission is tricky to handle.) It was fun.
Looks like there are no proceedings for your talk :-( And I could not go to it. Can you point me to the right place, or quickly summarize what CommonSense and Extra-tabs are?
I saw that you consider replacing the regular default search box with the Mayflower tool a priority. I kind of agree :-) I have two questions though. First, does anyone know why the regular search box and the Mayflower tool do NOT give the same results? If I search "agriculture", I get widely differing sets of results. It is a bit scary. Second, in the regular tool, the results display the caption of the picture, in several languages when available. Apparently not so in the Mayflower tool. Am I correct? It would be logical for this to appear on the little rollover thumbnail, perhaps in English by default if there is no description in the language chosen on the bottom right. Is it planned?
BTW I hope next time that a lot more Commoners are there. :) This was more or less your Commons representation: http://commons.wikimedia.org/wiki/Image:Wikimania_2007_Commons_puzzle_piece.... It was super fun, and anyone who is a 'wikiholic' should at the very least apply for a scholarship!!
==POTY07== I will post separately about this. But I think we should start getting into gear or at least thinking about things. I think we should close submissions by the end of Nov, hold two weeks of voting, and use the results to make calendar/s. I also think we should have 12-15 different categories of voting and more or less use those as the calendar pictures.
==1000FP book== I did a 5-min lightning talk mentioning the idea about the 1000FP book and Evan P... (Wikitravel guy) was there. He told me he knew that Flickr had done some partnership with Blurb.com (print on demand) with a book called something like '24 hours on Flickr'. And apparently they are interested in this kind of 'new technology' internet stuff. So I'm going to try and talk to Blurb and see what they think.
Blurb was something I originally thought might work because print-on-demand saves a lot of hassle like worrying about sales numbers, and Blurb also have a special template for photobooks. Let's face it if you're going to make a photobook you really want decent print quality.
I think we should try to distribute books (of Wikimedian works)+CDs (with all 1000FP). According to CatScan we are up to 913.
I now think that instead of asking/wishing for the board to promote us, we need to force them to have to talk about us by creating the interest ourselves. I really think this project could create some serious interest. And maybe if enough proceeds go to the WMF, then we can demand with a little more conviction that they should be earmarked for Commons-specific development. :)
Reminder: if you're interested in book development, please join http://groups.google.com/group/commonsbook .
eh :-) Note that I used the neat logo picture several times in talks, but the fact is, it is difficult to talk about a project one has never participated much in, even if one thinks it is a great project. But there is something else to fight: the desire of journalists for "big stories" (understand: shocking stories). For example, during the Wikimania press conference, we had 4 messages to give. One was
MESSAGE 3: Our projects are becoming more sophisticated and accessible thanks to the millions of volunteers we have worldwide
* WIKIMEDIA COMMONS LAUNCH RSS FEED
* Wikimedia Commons created an RSS Feed for Popular "Picture of the Day".
* Pictures of the Day are certified as Quality Images or Featured Pictures, and represent the best of the best in freely licensed media content available today.
* A great resource for creative individuals.
Okay, it was not so new, but... Anyway, it does not tickle journalists.
So, the good question is "who do we want to tickle ?" Do you want to tickle "those who talk about things" (reaching out for journalists working in professional press in video/audio/image), or "producers of content" (reaching out to artists, perhaps in art conferences ?), or "users of content" (including a CD of images in a distribution of professional press for teachers ?)
I'd be more than happy to travel to conferences with a bunch of CD containing works, but before doing anything, the best would be to define which audience you want to reach out.
A project I would find rather cool is distributing a CD with one of the professional monthlies dedicated to photography. I am not sure they would agree though. Another I would find cool is distributing a CD with one of the professional monthlies distributed to teachers. I am more convinced of possible success there. Or making a deal with a publisher so that they include an image gallery. Or a calendar to distribute at the end of the year at all the conferences/workshops/panels, or generally visits, we go to. I am sure it would be nicer than a fireman calendar. But this would be significantly more expensive than distributing a CD, I guess.
I'd say, come with a neat idea, and partners, and the job will largely be done already :-) and I guess we can probably spare a few bucks.
In case of the suggestion you have above, "book+CD", what would be the goal ? Promotion ? Distribution ? Audience ?
One suggestion I would also have (but which is more time consuming probably) is to produce selections according to audience. For example, if you have a CD with pictures of famous buildings in one country, distribute to the touristic network. But it is really much more a hassle.
==Project with Wikimedia Israel== I spent some time talking to a lovely chap named Dror (User:Drork) who describes himself as the "foreign minister" of the Wikimedia Israel chapter. Their chapter is planning to start a big project with another company in Israel to ask people to look for historically significant images in their private collections and donate them to the public domain. (Israel is having some anniversaries so it is timely for them.) He basically wanted to know that such images would be welcome in Commons. I said of course! Then we discussed how the images should be collected. We decided it would be better if the images were submitted to a gateway before being added to Commons. The organisation that he is working with is willing to write the open source SW to collect all the structured data that they want to collect (and also deal with Hebrew, Arabic etc interface stuff which we do poorly at best). So the idea at the moment is that their partner org will write this interface, we will install it on the toolserver, and from there it can be appropriately formatted etc and go into Commons. We have quite a few toolserver users working for Commons, so that won't be a problem I think; I just thought that was a pretty cool project. They plan to run a pilot project and if it's successful (they don't know what kind of stuff people will even submit), then a full-on project may run for like a year.
==Goals statement== I am even more convinced now that having a clear goals/aims statement will help guide the growth of the wiki by providing direction. So I plan to try and work on this. If you are interested in it then let me know (or keep an eye on my userspace :)).
200% behind you !
==Tech== I spent a little time talking to Tim Starling about Commons' tech priorities and even showed him the Mayflower search engine which he was unaware of (!). I intend to write up a summary for wikitech-l about what I consider are Commons' tech priorities. One of them will be replacing/integrating Mayflower as our search engine. I really consider this a major priority for us.
I read your list. My worry is the lack of feedback. I did not see one developer immediately jumping on one suggestion and saying "this one is cool, let me give it a try". My worry is really the lack of technical support, not enough developers. A lead. Hmmm
==SVGs== Apparently the Inkscape founder went to Wikimania. I only found this out at the airport as I was leaving. :( I am thinking of trying to contact him to make them aware of the work we are doing with SVGs, because as far as I can see no one else anywhere is doing the kind of things we do with them. Surely someone somewhere must care! :P
==FLOSS Manuals== I ended up spending quite a bit of time talking to Adam from FLOSS Manuals. http://flossmanuals.net/ He gave me a copy of their first printed manual on Audacity. Part of their thing is really concentrating on high-quality documentation. I suggested a chapter on how to edit human voice recordings would be really useful for Wikimedians. He told me they use TWiki. If you check out their site and the section called 'Remix' you can see how it's possible to pick and choose particular chapters to be included in a print-on-demand book. Now this is amazingly cool. I imagine a really useful thing - FLOSS Manuals having a 'Wikimedia Commons Media Handbook' with chapters on Inkscape & GIMP, and Audacity & Ogg Theora (same book? separate book?).
Now our Images for Cleanup people already know a lot of this stuff. If Commons people think something like this could be useful, I will let them know there is interest in it. If Commons people are interested in writing for it, that's all the better I think. I think he is even trying to get sponsorship to be able to retrospectively pay contributors.
I think there's just great power in having something in your hands to demonstrate what you're doing and help spread the word, convince others. Those of us who find this kind of tech second nature are a very small minority.
==Joi, CC== Joi gave a speech about the 'sharing economy' and I managed to talk to him briefly afterwards, about how I have this feeling that we have this great resource that no one knows anything about. We have the Commons know-how, he has the media contacts, maybe we can make something happen.
I think it's a very cool thing that some of the Creative Commons people take part on this list. To me it shows that they are reaching out and want to work with us and I think we should be reaching back too. Wikimedia is a bit insular sometimes... We have much more in common than we do different.
==Licenses, GFDL, GSFDL drafts== So there was actually a surprising number of talks about licenses and licensing. Although there are some of us who know a ton about licenses there are still many of us who don't. So although some of us will feel like we are repeating ourselves it still seems really important that we keep re-iterating the basic messages about how copyright works, how free licenses work, how to use them, etc. I mean at the same time we still have to deal with the nitty-gritty crap like interpreting licenses in different jurisdictions and deriv works and the like. So it's not easy but I think it's really important we keep walking along both lines.
I went to this talk http://wikimania2007.wikimedia.org/wiki/Proceedings:MH1 in which I found out a new draft of the GFDL is open for comment at the moment. The new GFDL has a clause about 'excerpts' where an excerpt can be distributed without the full license text. See http://gplv3.fsf.org/fdl-draft-2006-09-22.html clause 6. Bizarrely this mentions how an excerpt of a text document should be interpreted, and an audio and video document, but not an image. Er???? Is it that an excerpt of an image cannot exist? Is an excerpt of an image like a crop, or a thumbnail, or like nothing at all? If 'excerpt' has no meaning in the field of images then they should explicitly say that, I think.
The other complaint I have about this new draft is that the wording is still oriented to text documents (and even very long ones at that), despite the fact that they acknowledge other works can be and are used. How should a 'title page' be interpreted in terms of an image? I intend to make a comment asking them to reword such things or else explicitly state the interpretation of the terms for various media.
It also has this important clause 8a: "If the Work has no Cover Texts and no Invariant Sections then you may relicense the Work under the GNU Simpler Free Documentation License."
So maybe we (Wikimedia) could have this: (now) GFDL1.2 -> GFDL2 -> GSFDL === CC-BY-SA?
OK why is this important? Creative Commons' goals are not necessarily ours, etc etc. I think this is a direction we should try to go forward in because in essence we have the same goals, and keeping distinct licenses for no good reason makes understanding the situation unnecessarily complex, and thus harms our ability to communicate our mission and vision to as wide an audience as possible.
Note 'no good reason'. If good reasons exist, we should keep the distinctions.
Is the 'fulltext-copy' condition of the GFDL a good reason? I don't recall ever seeing anyone put forward a reason why it's actually vitally necessary.
So, GSFDL.
The GSFDL currently has a very stupid clause called '0a. FREE MANUALS ARE ESSENTIAL'. When they just acknowledged that the license can be used for even non-text works, I dislike the inclusion of this unnecessary clause a lot.
The GSFDL also has the 'Excerpts' thing. The main difference between the GFDL and the GSFDL is this in the GSFDL:
"You need not include a copy of this License in the Work if you have registered the work's license with a national agency that maintains a network server through which the general public can find out its license."
Hm, I don't know quite what that implies, but hopefully it leads to a situation of not needing to copy the license fulltext with each use. That was my impression from the talk.
regards Brianna user:pfctdayelise
Can you better explain what a "coffee table book" is ?
ant
Hi Florence, welcome to commons-l :)
I gave a 'Commons HOWTO tutorial' which started with about a dozen people attending and by the end there was another dozen or two. I talked briefly about how Commons works, deriv works, licenses, categories, CommonSense tool, Mayflower, Extra-tabs.js (although I didn't call it that, and one person commented that they had never seen them before - we DO have them installed for EVERYONE, right??). We did a transfer of an image from en.wp to Commons. We got some tough questions like 'I'm uploading these for my wife - what should I do vis-a-vis permission?' (I said ask her to send you a confirmation email. This case of genuine verbal permission is tricky to handle.) It was fun.
Looks like there is no proceeding about your talk :-( And I could not go to it. Can you orient me to the right place, or quickly summarize what CommonSense and Extra-tabs is ?
It was a very hands-on thing, not prepared, so there are no proceedings.
CommonSense is a tool on the toolserver created by user:Duesentrieb. It is here: http://tools.wikimedia.de/~daniel/WikiSense/CommonSense.php You can put in keywords on this page and it will suggest existing categories that may be relevant. It often suggests categories that are not very relevant, though, so it has to be used with caution. :) It is a useful tool because we encourage all files to be tagged by categories or placed in galleries. Unless you are familiar with the category system, it can be hard to find the right category, so this tool can give you some good hints.
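To give a feel for what "suggesting categories from keywords" means, here is a minimal sketch in Python. It is not how CommonSense actually works (I don't know its internals); the function name `suggest_categories`, the scoring by shared words, and the sample category names are all assumptions made up for illustration.

```python
def suggest_categories(keywords, categories, limit=5):
    """Rank category names by how many of the given keywords they contain.

    Hypothetical helper for illustration; not CommonSense's real algorithm.
    """
    wanted = {k.lower() for k in keywords}
    scored = []
    for name in categories:
        # Split the category name into lowercase words.
        words = set(name.lower().replace("_", " ").split())
        score = len(wanted & words)
        if score:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Made-up category names, for illustration only.
categories = ["Agriculture", "Agriculture in France", "Tractors", "Wheat fields"]
print(suggest_categories(["agriculture", "france"], categories))
```

Even this toy version shows why the suggestions need a human check: any category that happens to share a word with your keywords will be offered, relevant or not.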
Extra-tabs is a Javascript file that puts extra tabs at the top of certain pages, for logged in users. [if you are logged in and don't see these tabs, please let me know.]
On image pages after the 'watch' tab you get a 'check usage' tab, which will tell you where the image is being used in Wikimedia projects. This is fun in itself, but also used by admins when deleting files. There are also 'find categories' (get CommonSense category suggestions based on which pages this image is used in), 'log' (see logs for this file) and 'en' (look at this image page on enwikipedia; this was needed when there was a several-month lag for enwikipedia on the toolserver, but it is now fixed).
On category pages you get a tab that says 'catscan'. This links to another tool by Duesentrieb that lets you do very useful searches within the category. If you work with categories it's a really, really useful tool.
On user/talk pages you get 'gallery', 'orphans' and 'untagged'. These also are all tools by Duesentrieb. 'gallery' is like a visual Special:Contributions that shows your uploaded files and some info about them. 'Orphans' is the same but only lists the images of yours that aren't used anywhere in Wikimedia. It's good to put your files to use somewhere, so this tool helps that. 'Untagged' I think gives the user's uploaded files that haven't been given a category/put in a gallery or a license template. So use this tool to find out which of a user's images need fixing up.
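The logic behind 'orphans' and 'untagged' can be sketched roughly like this in Python. This is only an illustration of the idea as I described it above, not the code of Duesentrieb's tools; the `Upload` record, its field names and the sample file titles are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Upload:
    """Hypothetical record of one uploaded file (field names are assumptions)."""
    title: str
    categories: List[str] = field(default_factory=list)
    galleries: List[str] = field(default_factory=list)
    has_license: bool = False
    used_on: List[str] = field(default_factory=list)


def orphans(uploads):
    """Titles of uploads that are not used on any page."""
    return [u.title for u in uploads if not u.used_on]


def untagged(uploads):
    """Titles of uploads missing a category/gallery or a license template."""
    return [u.title for u in uploads
            if not (u.categories or u.galleries) or not u.has_license]


# Made-up uploads, for illustration only.
uploads = [
    Upload("Image:Example_barn.jpg", categories=["Agriculture"],
           has_license=True, used_on=["en:Barn"]),
    Upload("Image:Example_field.jpg", categories=["Agriculture"],
           has_license=True),
    Upload("Image:Example_cow.jpg", has_license=True, used_on=["fr:Vache"]),
]
print(orphans(uploads))   # the file not used anywhere
print(untagged(uploads))  # the file with no category or gallery
```

The real tools get this information from the toolserver's copy of the database, which is why they depend on the toolserver being up to date.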
As you can see we have a remarkable amount of infrastructure built on Javascript and the toolserver. The toolserver was only meant for nifty toys, I think, not core functionality. For Commons, these things are core functionality.
I will let Tangotango or Gmaxwell talk about why the Mayflower/normal search results are different...
Note that I used the neat logo picture several times in talks, but the fact is, it is difficult to talk about a project one has never participated much in, even if one thinks it is a great project.
Does that mean the Board will only talk about projects they are directly involved in? That should be about 10 at most. That would be disappointing. :P
But there is something else to fight: the desire of journalists for "big stories" (understand: shocking stories). For example, during the Wikimania press conference, we had 4 messages to give. One was
MESSAGE 3: Our projects are becoming more sophisticated and accessible thanks to the millions of volunteers we have worldwide
* WIKIMEDIA COMMONS LAUNCH RSS FEED
* Wikimedia Commons created an RSS Feed for Popular "Picture of the Day".
* Pictures of the Day are certified as Quality Images or Featured Pictures, and represent the best of the best in freely licensed media content available today.
* A great resource for creative individuals.
Okay, it was not so new, but... Anyway, it does not tickle journalists.
Well, journalists in general, I am not surprised they are only interested in Wikipedia. But maybe some graphics magazine could be interested in Commons. Definitely some education people should be interested in Wikibooks and Wikiversity.
So, the good question is "who do we want to tickle ?" Do you want to tickle "those who talk about things" (reaching out for journalists working in professional press in video/audio/image), or "producers of content" (reaching out to artists, perhaps in art conferences ?), or "users of content" (including a CD of images in a distribution of professional press for teachers ?)
Hmm, all of the above? :)
The CD of images for teachers is a really good idea. We have some fantastic SVGs that would be awesome to include.
I'd be more than happy to travel to conferences with a bunch of CD containing works, but before doing anything, the best would be to define which audience you want to reach out.
Well I think we can produce different stuff for different audiences. But yes, we do need to define them before we do anything. :)
A project I would find rather cool is distributing a CD with one of the professional monthlies dedicated to photography. I am not sure they would agree though.
Yes, another cool idea. But this may be difficult as traditional photographers are often suspicious of free license stuff. Some feel they are being undercut and free licenses are just destroying their livelihood. (An issue for another time)
Or making a deal with a publisher so that they include an image gallery.
What kind of publisher?
Or a calendar to distribute at the end of the year at all the conferences/workshops/panels, or generally visits, we go to. I am sure it would be nicer than a fireman calendar. But this would be significantly more expensive than distributing a CD, I guess.
Well I have some vague plans for that at the moment. My plan is to hold the "Picture of the Year" competition in early November, so that the winners can be put in a calendar. (This has not been widely discussed yet so I can't say the community supports it. But it is my plan. :)) Lulu.com and Blurb.com do print-on-demand calendars. So I think the hassle is in organising it not getting money to do it.
I'd say, come with a neat idea, and partners, and the job will largely be done already :-) and I guess we can probably spare a few bucks.
IMO Commons needs $$ for development, not promotion. We can do a lot of grassroots promotion before we need money.
In case of the suggestion you have above, "book+CD", what would be the goal ? Promotion ? Distribution ? Audience ?
My goal would be raising awareness, so promotion I guess. And also a celebration in our own community of our own great contributors, and reaching a milestone. Imagine if you can order a book that has your photo in it. It's a cool thing.
One suggestion I would also have (but which is more time consuming probably) is to produce selections according to audience. For example, if you have a CD with pictures of famous buildings in one country, distribute to the touristic network. But it is really much more a hassle.
Yes. Maybe we can do some more work with the chapters for this.
==Tech== I spent a little time talking to Tim Starling about Commons' tech priorities and even showed him the Mayflower search engine which he was unaware of (!). I intend to write up a summary for wikitech-l about what I consider are Commons' tech priorities. One of them will be replacing/integrating Mayflower as our search engine. I really consider this a major priority for us.
I read your list. My worry is the lack of feedback. I did not see one developer immediately jumping on one suggestion and saying "this one is cool, let me give it a try". My worry is really the lack of technical support, not enough developers. A lead. Hmmm
I also find it strange that I post about 20 items, one gets picked up and has 20 replies, and the developers reply about that one issue rather than acknowledge my post overall. If they didn't reply I would not even know if they had read it. I don't know if they consider it important. I don't know if they will give increased priority to any of the things I mentioned.
FWIW I would dearly like to see a WMF employee as a dedicated "technical/development liaison" that could respond to community requests like mine. The devs do amazing wonderful work and are generally over-worked. This is well known. But public relations are not their strong point and for people who want to make a request for some functionality for their community, I think it can be very frustrating. There is often not a sense that your ideas are listened to, considered or even wanted. MediaWiki still needs a lot of development to be suited to a dictionary, a library, a book collection. It is still essentially an encyclopedia-writing tool.
This is hard. I don't want to attack the devs. But the current situation is not ideal.
Can you better explain what a "coffee table book" is ?
There is an article on enwikipedia about it :) It's just like a big, luxurious hardcover photography book. Typically little text, so you don't really need to sit down and read it, just flick through it when you like. You put it on the coffee table or on a table in a reception area.
Thanks for participating in this discussion :)
cheers, Brianna
Brianna Laugher wrote:
Note that I used the neat logo picture several times in talks, but the fact is, it is difficult to talk about a project one has never participated much in, even if one thinks it is a great project.
Does that mean the Board will only talk about projects they are directly involved in? That should be about 10 at most. That would be disappointing. :P
tssss
Dunno about others, but nearly every time I try to put up a shot of all the project logos and quickly explain what each is about.
Once I did a presentation entirely about wikinews, but from my perspective, it was not a good talk :-( In the first year, I was rather regularly following the french wikinews, but not so well the english one. I frequently talk about wikibooks and wikicommons. Several presentations I did this spring were almost exclusively about wikibooks actually, with a little bit of wikiversity. Mostly presentations focused on education. Wikibooks and wikiversity raise a lot of attention.
I must admit I never really talked about wiktionary or wikisource, but for a few words of presentation, mostly when talking to librarians.
I have next to no idea what is going on wikispecies. Sorry :-)
And I... well, though I use it pretty frequently, I usually do not talk of wikiquote. I am aware that project is a legal spider nest in many countries. If you remember, we got into troubles with a french database on quote issues (the french wikiquote was basically a gigantic copy of this database and they did not like that :-)). The french wikiquote was since then restarted, with a good set of rules and much care from a few editors. So I guess it is okay. But I still hardly dare saying much about it publicly ;-) Maybe later.
But seriously Brianna, what is especially tricky is that in any presentation one does, there is always at least one very involved wikipedian. So, not only is it hard to "surprise" him, but he can spot mistakes ;-) What would be real cool would be to try to keep a written state of each project: what is hot, what is working, what is not working, technical wish list, biggest issues, big figures etc.... so that all participants could "follow" what is going on. I know all this is actually available, but only in a very dispersed manner, so it is not so easy to find.
ant
About my comment which projects Board members comment on, I should correct myself, because the wider world rarely cares what the Board members say. To 95% of the world there is only Jimmy. (I guess 4% acknowledge the actual chair of the Board and another 1% acknowledge the rest. maybe less.) So really what is accurate to say is that I find it somewhat disappointing that Jimmy only seems to mention Wikipedia. But probably this is like you say, journalists only want to know about Wikipedia, so they ignore the rest. So then the thing to do is change the journalists. :)
On 12/08/07, Florence Devouard Anthere9@yahoo.com wrote:
What would be real cool would be to try to keep a written state of each project, what is hot, what is working, what is not working, technical wish list, biggest issues, big figures etc.... so that all participants could "follow" what is going on. I know all this is actually available, but only in a very dispersed manner, so not so easy to find out.
Such lists tend to be kept up to date for only a month or so. On meta you can find many documents that are relevant to Feb 2006 or whenever.
Partly I think it is just the culture of the projects and partly the infrastructure. By 'culture of the projects' I mean you can spend all week inside a single project and still not follow everything, let alone wonder what the rest of the world is doing. Wikis don't have a good way to keep an 'overview' of a project, e.g. "this week's most edited pages" (especially highlighting pages that have a sudden jump in attention). And most Wikimedians are just in love with their project. It takes a long time to grow through the wiki stages to where you want to spread the word about how great your project is, I think. And many contributors may never progress to it or may stop contributing first. And both those things are ok. Too many "meta" people is also a bad state. The people doing the grunt work on the ground are the valuable ones. :)
So that is one thing. The other is the infrastructure.
Lately there have been more and more blogs created that offer a kind of overview of different projects. Why are blogs cool? Blogs are cool because of RSS. I would love to see some RSS technology integrated with MediaWiki -- or else for the Foundation to set up mass blog infrastructure that, say, any administrator or appointed "trusted user" could write to. Unofficial volunteer-written project blogs. It would be seriously cool. Just having the infrastructure in a central place would encourage a lot more people to participate I think.
OK here is an idea for Florence. Write a post to foundation-l addressing all projects (e.g. enwikipedia, frwikisource). Ask them to put together a 'state of the wiki' report, with the things you mentioned:
* progress reports on pages, users, admins, policies
* success of any special projects like printed material, wikiprojects
* any special policy or practice that they have developed, that is not seen on other projects
* technical wishlist
* "perennial debates" - controversies that often come up in the community
Tell them it's optional to submit a report, and they have a month to write it.
If nothing else it would make for seriously interesting reading. :) And the Board can just, you know, publish it on the foundation wiki. They don't have to do anything else with it. But just having this kind of 'official' request may make people think about these kind of things.
cheers Brianna
Brianna Laugher wrote:
About my comment which projects Board members comment on, I should correct myself, because the wider world rarely cares what the Board members say. To 95% of the world there is only Jimmy. (I guess 4% acknowledge the actual chair of the Board and another 1% acknowledge the rest. maybe less.)
I think you are super optimistic and super pessimistic as well on this point. In the USA, I would say that Jimmy is 99%. Beyond Jimbo, there is nothing, a dark pit. This is very boring, but it is a fact. However, in Europe, most conferences and most press interviews are not covered by Jimbo, but by me, or people from the various chapters, or just... participants. From time to time, a request requires Jimbo and no one else, but by and large, the diversity of people disseminating the message is pretty important. Most of those are from Wikipedia, but not all.
For example, the latest press release of the french association, Wikimedia France, deals with the french Wikiversity http://wikimedia.fr/index.php/Communiqu%C3%A9s_de_presse/2000_cours_sur_Wiki... And this press release was rather well relayed and generated some further press interest. After it was published, we received several press requests for more specific interviews... and our big challenge at that time was to find a "strong" participant to the french wikiversity to answer the press.
We changed the journalists !!!
I am hardly joking. We have come to a time when we know several of the journalists regularly writing about us. Or rather, about internet related projects. When you look at a serious monthly, even one with no specialization, you *know* the people in charge of the column related to web 2.0. These guys have done on average one small article about wikipedia every year. And one big study. They *know* and by and large understand the project. They are also ready to talk about something else. This is the type of journalists to focus on.
ant
So really what is accurate to say is that I
find it somewhat disappointing that Jimmy only seems to mention Wikipedia. But probably this is like you say, journalists only want to know about Wikipedia, so they ignore the rest. So then the thing to do is change the journalists. :)
On 12/08/07, Florence Devouard Anthere9@yahoo.com wrote:
What would be real cool would be to try to keep a written state of each project, what is hot, what is working, what is not working, technical wish list, biggest issues, big figures etc.... so that all participants could "follow" what is going on. I know all this is actually available, but only in a very dispersed manner, so not so easy to find out.
Such lists tend to be kept up to date for only a month or so. On meta you can find many documents that are relevant to Feb 2006 or whenever.
Partly I think it just the culture of the projects and partly the infrastructure. By 'culture of the projects' I mean you can spend all week inside a single project and still not follow everything, let alone wonder what the rest of the world is doing. Wikis don't have a good way to keep an 'overview' of a project, e.g. "this week's most edited pages" (especially highlighting pages that have a sudden jump in attention). And most Wikimedians are just in love with their project. It takes a long time to grow through the wiki stages to where you want to spread the word about how great your project is, I think. And many contributors may never progress to it or may stop contributing first. And both those things are ok. Too many "meta" people is also a bad state. The people doing the grunt work on the ground are the valuable ones. :)
So that is one thing. The other is the infrastructure.
Lately there have been more and more blogs created that offer a kind of overview of different projects. Why are blogs cool? Blogs are cool because of RSS. I would love to see some RSS technology integrated with MediaWiki -- or else for the Foundation to set up mass blog infrastruture that, say, any administrator or appointed "trusted user" could write to. Unofficial volunteer-written project blogs. It would be seriously cool. Just having the infrastructure in a central place would encourage a lot more people to participate I think.
OK here is an idea for Florence. Write a post to foundation-l addressing all projects (e.g. enwikipedia, frwikisource). Ask them to put together a 'state of the wiki' report, with the things you mentioned:
- progress reports on pages, users, admins, policies
- success of any special projects like printed material, wikiprojects
- any special policy or practice that they have developed, that is not seen on other projects
- technical wishlist
- "perennial debates" - controversies that often come up in the community
Tell them it's optional to submit a report, and they have a month to write it.
If nothing else it would make for seriously interesting reading. :) And the Board can just, you know, publish it on the foundation wiki. They don't have to do anything else with it. But just having this kind of 'official' request may make people think about these kinds of things.
cheers Brianna
On 8/12/07, Brianna Laugher brianna.laugher@gmail.com wrote:
Hi Florence, welcome to commons-l :)
[snip]
I will let Tangotango or Gmaxwell talk about why the Mayflower/normal search results are different...
They are different because the software works in differing ways. Mayflower's results should be better.
However, Florence was not actually looking at the old search results.
If you type Agriculture into the side box you get the gallery page: http://commons.wikimedia.org/wiki/Agriculture
This is a result hand-generated by humans. It's a wiki page. It's fairly useful.
This is the search results: http://commons.wikimedia.org/wiki/Special:Search?search=Agriculture&full...
Not very useful.
http://tools.wikimedia.de/~tangotango/mayflower/search.php?q=Agriculture&... Mayflower results, somewhat more useful.
As our search improves it should take on more of the qualities of the gallery page.
On 12/08/07, Florence Devouard Anthere9@yahoo.com wrote:
So, the good question is "who do we want to tickle ?" Do you want to tickle "those who talk about things" (reaching out for journalists working in professional press in video/audio/image), or "producers of content" (reaching out to artists, perhaps in art conferences ?), or "users of content" (including a CD of images in a distribution of professional press for teachers ?)
Wikimedia Commons will become our next famous project. Journalists understand immediately, and just want the search not to suck. Hey, a sorted archive of pre-cleared pictures, for free!
- d.
On 8/12/07, David Gerard dgerard@gmail.com wrote:
On 12/08/07, Florence Devouard Anthere9@yahoo.com wrote:
So, the good question is "who do we want to tickle ?" Do you want to tickle "those who talk about things" (reaching out for journalists working in professional press in video/audio/image), or "producers of content" (reaching out to artists, perhaps in art conferences ?), or "users of content" (including a CD of images in a distribution of professional press for teachers ?)
Wikimedia Commons will become our next famous project. Journalists understand immediately, and just want the search not to suck. Hey, a sorted archive of pre-cleared pictures, for free!
- d.
Journalists do not make our projects famous. As far as your average Internet user is concerned we are competing with google images. Somewhat tricky.
On 12/08/07, geni geniice@gmail.com wrote:
On 8/12/07, David Gerard dgerard@gmail.com wrote:
Wikimedia Commons will become our next famous project. Journalists understand immediately, and just want the search not to suck. Hey, a sorted archive of pre-cleared pictures, for free!
Journalists do not make our projects famous. As far as your average Internet user is concerned we are competing with google images. Somewhat tricky.
That was really two separate ideas and should have been two separate paragraphs. They can certainly greatly assist.
- d.
On 12/08/07, geni geniice@gmail.com wrote:
As far as your average Internet user is concerned we are competing with google images. Somewhat tricky.
Not really - Google Images' search sucks as well - try using it a lot and see what I mean. Their text search of the web is fantastic, but their image search is crappy.
With tagging, etc., Commons can do a hell of a lot better.
- d.
The site I suggest we target is Getty images. http://www.gettyimages.com/
Try their search with an image in mind, like "black child reading a book".
There is no reason we can't do even better than they do. The software isn't even hard, we just need to make a commitment to doing tagging right and doing it diligently.
On 8/12/07, David Gerard dgerard@gmail.com wrote:
On 12/08/07, geni geniice@gmail.com wrote:
As far as your average Internet user is concerned we are competing with google images. Somewhat tricky.
Not really - Google Images' search sucks as well - try using it a lot and see what I mean. Their text search of the web is fantastic, but their image search is crappy.
With tagging, etc., Commons can do a hell of a lot better.
- d.
On 12/08/07, Gregory Maxwell gmaxwell@gmail.com wrote:
The site I suggest we target is Getty images. http://www.gettyimages.com/ Try their search with an image in mind, like "black child reading a book". There is no reason we can't do even better than they do. The software isn't even hard, we just need to make a commitment to doing tagging right and doing it diligently.
Wasn't the problem that queries on category intersections suck mud in MySQL? (But work very fast in Postgres. Thus demonstrating the superiority of MySQL. Hold on, I'll come in again ...)
But yes, a tagging system is necessary.
- d.
On 8/12/07, David Gerard dgerard@gmail.com wrote:
Wasn't the problem that queries on category intersections suck mud in MySQL? (But work very fast in Postgres. Thus demonstrating the superiority of MySQL. Hold on, I'll come in again ...)
But yes, a tagging system is necessary.
We're not stuck using MySQL when something else works better. :)
You have seen my intersection tool, yes?
For commons: http://tools.wikimedia.de/~gmaxwell/cgi-bin/cattersect.py
For enwiki: http://tools.wikimedia.de/~gmaxwell/cgi-bin/enwiki_cattersect.py
Tangotango and I are already working on moving mayflower over to my backend. ;)
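
For readers who have not seen a category intersection before, here is a minimal sketch of the idea. It is not how cattersect.py works (its source is not shown in this thread); it just assumes an in-memory inverted index from category name to file names, and the function names `build_index` and `intersect` and the sample file names are made up for illustration.

```python
from collections import defaultdict


def build_index(file_categories):
    """Invert {file: [categories]} into {category: {files}}."""
    index = defaultdict(set)
    for filename, categories in file_categories.items():
        for category in categories:
            index[category].add(filename)
    return index


def intersect(index, *categories):
    """Return the files that are in every one of the given categories."""
    if not categories:
        return []
    # Start from the smallest set so the running result shrinks quickly.
    sets = sorted((index.get(c, set()) for c in categories), key=len)
    result = set(sets[0])
    for other in sets[1:]:
        result &= other
    return sorted(result)


# Hypothetical sample data, not taken from Commons.
sample = {
    "Kangaroo_boxing.jpg": ["Kangaroos", "Fighting", "Australia"],
    "Kangaroo_grazing.jpg": ["Kangaroos", "Australia"],
    "Stag_beetles.jpg": ["Fighting", "Insects"],
}
index = build_index(sample)
print(intersect(index, "Kangaroos", "Fighting"))
```

With an index like this the intersection itself is cheap; the harder part, as discussed elsewhere in this thread, is having enough categories on each file for the intersection to find anything.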
Cheers.
On 12/08/07, Gregory Maxwell gmaxwell@gmail.com wrote:
The site I suggest we target is Getty images. http://www.gettyimages.com/
Try their search with an image in mind, like "black child reading a book".
There is no reason we can't do even better than they do. The software isn't even hard, we just need to make a commitment to doing tagging right and doing it diligently.
Well if we had a better tool it would remove a lot of the problems we currently have.
Tagging is flawed - some people put 'wiki', some put 'wikis', some put 'wikipedia', etc etc. And yet somehow it doesn't seem to matter. this is puzzling. I haven't really seen a site do intentionally-collaborative tagging, where the users actively try to have the same understanding for the same tag. no wonder we have so many problems with categories. ;)
There are several advantages we have over all the competitors people have mentioned -- Google images, Flickr, Getty images:
- only free content licenses, and we actively remove copyvios. (insert disclaimer about reliability here)
- seriously multilingual - today's POTD had captions in 19 languages, and I think usually it is more.
- attention to detailed annotation - we kill the others when it comes to this. especially with nature images. (I just did a search on getty images for 'kangaroo'. most of the images look hokey and staged.)
- "encyclopedic" coverage and style, compared to flickr's "self-absorbed" coverage and Getty Images' often "staged" style
- wiki = instantly fixing mistakes and improving. We can only get better. our descriptions can only get more detailed. our translations can only grow. Now this is cause for cheer! How will Getty get new annotations? By paying translators. Us? We just encourage and wait.
But it doesn't matter how much great content you have: if no one can find it, you may as well not have it. Hence search is my #1 request. Further down the list there was also a request for a rating system, which would help to bring higher quality results higher up in search queries.
cheers Brianna
On 12/08/07, Brianna Laugher brianna.laugher@gmail.com wrote:
Tagging is flawed - some people put 'wiki', some put 'wikis', some put 'wikipedia', etc etc. And yet somehow it doesn't seem to matter. this is puzzling. I haven't really seen a site do intentionally-collaborative tagging, where the users actively try to have the same understanding for the same tag. no wonder we have so many problems with categories. ;)
The problem is that you need a controlled vocabulary of some form - a way of saying "mark all images with cats as 'cats', not 'cat'", twelve thousand times - and we so do not have this; we have a classic folksonomy, people tagging with whatever they feel like.
[Okay, now I've shown I can remember the buzzwords they taught me at library school...]
It's possible to turn a folksonomy into a controlled vocabulary, in two ways:
a) Manual patrolling - change all uses to conform with a controlled vocabulary
b) Tag equivalence - ensure everything *corresponds* to an entry in a controlled vocabulary
a) is essentially what gets done with categories. People keep looking at categories, merging them and renaming them and organising them; new data gets subsumed into the existing structure. (Enwiki's category intersections - "French mathematicians" - are great examples of this; there's two or three ways to phrase each one, and dozens of people doing nothing more than make sure they're all standardised). Here, you'd hunt out all incidences of "cats" and change them to "cat". The problem is the meta-standardisation of this... I'm not sure quite how long-term workable it is without constant maintenance.
b) is perhaps more interesting. In the bowels of whatever system you use for tagging, set it up so that one tag, one identifier, can be represented by many different tags. In effect, allow "cat" and "cats", but ensure a search for one displays the other as well. LibraryThing does this, and does it fairly well; their tag lists contain misspellings and foreign terms as well as variant names, which is quite useful. Configuring this and ensuring it doesn't get accidentally snarled up - inadvertently merging two large groups can be confusing - is tricky, but once it's up and running it should require less ongoing maintenance.
A few representative collections, representing one tag each:
philosophy of science, Ciencia-Filosofía, Philosophy (Science), philosophy_of_science, Science - Philosophy, science philosophy, Science-Philosophy
theology, teologia, theolgy, theologie, Theololgy, Theoloogy
wwii, 2nd world war, second world war, SecondWorldWar, second_world_war, segunda guerra, Segunda Guerra Mundial, w.w.ii, war (WWII), war world ii, word war 2, World War (1939-1945), world war 1939-1945, world war 2, world war ii, World War II 1939-1945, world war ll, world war two, world war. 1939-1945, World War2, world-war-2, worldwarII, world_war_ii, ww 2, ww ii, ww11, ww2, WW_II, Zweiter Weltkrieg
The main problem here is ambiguous terms, the classic that LT deals with being 'sf' - science fiction, or books about San Francisco? There are also long debates to be had about meaningful correspondences - are 'paranormal' and 'supernatural' the same thing? 'humor' and 'humour'? The last won't really apply for photos, but is an interesting question with regard to the written word - compare http://www.librarything.com/tag/humor and http://www.librarything.com/tag/humour - and demonstrates the subtleties that can be found in folksonomies...
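
To make option (b) concrete, here is a minimal sketch of tag equivalence. It is not LibraryThing's implementation (which is not public as far as this thread knows); the class `TagIndex` and its methods are hypothetical names, and it assumes everything fits in memory. Each spelling resolves to one canonical tag, and combining two tags merges their file sets.

```python
from collections import defaultdict


class TagIndex:
    """Toy tag store where many spellings share one canonical tag."""

    def __init__(self):
        self.canonical = {}             # variant spelling -> canonical tag
        self.files = defaultdict(set)   # canonical tag -> file names

    @staticmethod
    def _normalise(tag):
        # Assumption: case and surrounding whitespace never distinguish tags.
        return tag.strip().lower()

    def resolve(self, tag):
        tag = self._normalise(tag)
        return self.canonical.get(tag, tag)

    def combine(self, main, *variants):
        """Declare that every variant means the same thing as `main`."""
        main = self.resolve(main)
        for variant in variants:
            old = self.resolve(variant)
            if old == main:
                continue
            # Re-point spellings that were already combined into `old`.
            for key, value in self.canonical.items():
                if value == old:
                    self.canonical[key] = main
            self.canonical[old] = main
            # Files tagged under the old name now belong to the main tag.
            self.files[main] |= self.files.pop(old, set())

    def tag(self, filename, tag):
        self.files[self.resolve(tag)].add(filename)

    def search(self, tag):
        return sorted(self.files.get(self.resolve(tag), set()))


idx = TagIndex()
idx.tag("Felis_1.jpg", "cat")
idx.tag("Felis_2.jpg", "Cats")
idx.combine("cat", "cats", "gato")
print(idx.search("cats"))   # both files, whichever spelling was used
```

Note that `combine` is exactly where the 'sf' problem bites: the sketch has no way to undo a merge or to split an ambiguous tag, so a real system would want a log of combinations.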
On 13/08/07, Andrew Gray shimgray@gmail.com wrote:
b) is perhaps more interesting. In the bowels of whatever system you use for tagging, set it up so that one tag, one identifier, can be represented by many different tags. In effect, allow "cat" and "cats", but ensure a search for one displays the other as well. LibraryThing does this, and does it fairly well; their tag lists contain misspellings and foreign terms as well as variant names, which is quite useful. Configuring this and ensuring it doesn't get accidentally snarled up - inadvertently merging two large groups can be confusing - is tricky, but once it's up and running it should require less ongoing maintenance.
ok... this is interesting. I didn't know this. Don't suppose their SW happens to be open source? :)
The other problem which cannot be ignored in relation to tagging/categories is that people need to be able to use the equivalent tag in their own language, and have the tags show up in their own language, while still seeing all the files that have been tagged with the equivalent tag in other languages.
cheers Brianna
On 12/08/07, Brianna Laugher brianna.laugher@gmail.com wrote:
On 13/08/07, Andrew Gray shimgray@gmail.com wrote:
b) is perhaps more interesting. In the bowels of whatever system you use for tagging, set it up so that one tag, one identifier, can be represented by many different tags. In effect, allow "cat" and "cats", but ensure a search for one displays the other as well. LibraryThing does this, and does it fairly well; their tag lists contain misspellings and foreign terms as well as variant names, which is quite useful. Configuring this and ensuring it doesn't get accidentally snarled up - inadvertently merging two large groups can be confusing - is tricky, but once it's up and running it should require less ongoing maintenance.
ok... this is interesting. I didn't know this. Don't suppose their SW happens to be open source? :)
I have no idea; I suspect not, since it's mostly homebrewed and they are selling it as a subscription service... but then, it might not be desperately much use to us anyway, given it's homebrewed and targeted specifically for running their site.
http://www.librarything.com/blog/2005/12/combining-tags-heresy.php is the original blog post discussing it; he notes Clay Shirky's objections to it.
[on 'film', 'cinema', 'movies'] "Those terms actually encode different things, and the assertion that restricting vocabularies improves signal assumes that that there's no signal in the difference itself, and no value in protecting the user from too many matches."
http://www.shirky.com/writings/ontology_overrated.html#mind_reading
This is a real issue, but I think - *think* - it's not one we need to be desperately paranoid about. The valuable semantic ambiguities lie in intangibles, conceptual things, which are generally not what we want to be tagging a photograph with! We're basically looking at things not ideas, and there's a much more solid mapping of terminology there.
(I really don't get the workings of the MediaWiki category system very well. Can we have 'redirect categories'? That might help)
http://www.librarything.com/tagcombine_log.php gives you an idea of the kind of tag merging going on.
The other problem which cannot be ignored in relation to tagging/categories is that people need to be able to use the equivalent tag in their own language, and have the tags show up in their own language, while still seeing all the files that have been tagged with the equivalent tag in other languages.
Yeah. Though... hmm. The LT model is that each tag is basically a redirect (to use our jargon) for the "main tag", which is determined by raw popularity; it presumably wouldn't be too difficult to have the main tag vary by language if you had some way of categorising the "minor tags" by source language.
The real killer here would *probably* be the multilingual homonym problem - orthographically identical words that mean different things in different languages; they'd get combined happily because as far as a monolingual speaker knew, they were unambiguous, and have... interesting... knock-on effects.
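
A rough sketch of how the main tag could vary by language while keeping homonyms apart. Everything here is an assumption for illustration: the class `MultilingualTags`, the concept identifiers like "concept:cat", and the rule that the label is simply the most-used word in that language. The key point is that aliases are keyed by (language, word) rather than by word alone, so English "gift" and German "Gift" never get combined by accident.

```python
from collections import Counter, defaultdict


class MultilingualTags:
    def __init__(self):
        self.concept_of = {}                # (lang, word) -> concept id
        self.usage = defaultdict(Counter)   # concept id -> uses per (lang, word)
        self.files = defaultdict(set)       # concept id -> file names

    def alias(self, concept, lang, word):
        """Register `word` in `lang` as one name for `concept`."""
        self.concept_of[(lang, word.lower())] = concept

    def tag(self, filename, lang, word):
        key = (lang, word.lower())
        concept = self.concept_of[key]      # KeyError if the tag is unknown
        self.usage[concept][key] += 1
        self.files[concept].add(filename)

    def search(self, lang, word):
        """All files for the concept, whatever language they were tagged in."""
        concept = self.concept_of[(lang, word.lower())]
        return sorted(self.files[concept])

    def label(self, concept, lang):
        """Most-used (lower-cased) word for the concept in `lang`, or None."""
        counts = self.usage.get(concept, {})
        candidates = [(n, w) for (l, w), n in counts.items() if l == lang]
        return max(candidates)[1] if candidates else None


tags = MultilingualTags()
tags.alias("concept:cat", "en", "cat")
tags.alias("concept:cat", "en", "cats")
tags.alias("concept:cat", "de", "Katze")
tags.alias("concept:present", "en", "gift")
tags.alias("concept:poison", "de", "Gift")

tags.tag("a.jpg", "en", "cats")
tags.tag("b.jpg", "en", "cats")
tags.tag("c.jpg", "en", "cat")
tags.tag("d.jpg", "de", "Katze")
print(tags.search("de", "Katze"))        # files tagged in either language
print(tags.label("concept:cat", "en"))   # the popular English spelling
```

This only moves the homonym problem rather than solving it: someone still has to record which language each minor tag came from, which is the categorising step mentioned above.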
On 8/12/07, Brianna Laugher brianna.laugher@gmail.com wrote:
Well if we had a better tool it would remove a lot of the problems we currently have.
Be careful not to blame the software too much. Yes we have some serious software gaps, but none of them would take much time to close if they were getting active development. After that the gaps all become data quality, process, and manpower related.
We have 1.75 million images, 2 million including deleted images. All the software changes you would ask for will take less man hours than a single pass of data quality improvements over our whole collection.
Tagging is flawed - some people put 'wiki', some put 'wikis', some put 'wikipedia', etc etc. And yet somehow it doesn't seem to matter. this is puzzling. I haven't really seen a site do intentionally-collaborative tagging, where the users actively try to have the same understanding for the same tag. no wonder we have so many problems with categories. ;)
What is flawed is the model of "user contributed content" which lacks strong facilities for collaboration. We understand collaboration, we have strong facilities for it. This is why "Joe uses 'dog', John uses 'dogs'" isn't and shouldn't be a hard problem for us. With collaboration we can just go fix all of it.
There are several advantages we have over all the competitors people have mentioned -- Google images, Flickr, Getty images:
Woah. Now. I only mentioned Getty as an example of what we should aspire to search wise. In that one regard they simply blow us out of the water. Their search produces useful results for things in ways that we can't even hope to produce today no matter how good we make the software, because we simply do not have the data on each item in our collection required right now.
Of course, in every other regard we already blow them away. Can I not point out an area where someone else does clearly better without getting the Commons Sales Pitch? :)
[snip]
- attention to detailed annotation - we kill the others when it comes to this. especially with nature images. (I just did a search on getty images for 'kangaroo'. most of the images look hokey and staged.)
Different audience, they cater to commercial stock photography.. advertisements and such, while our primary customer is an encyclopedia, but that has nothing to do with annotation.
Their annotation is fantastic. For example, searching for "kangaroo fighting" ... gives you only pictures of kangaroos fighting. Searching for "kangaroo costume" asks you to clarify if you want "traditional clothing" or "Costume (Dressing Up)"; depending on your choice, you either get a painting of people in what appears to be tribal dress cooking a kangaroo, or you get pictures of people in cheesy kangaroo costumes.
This blows away anything that we currently offer. It's fantastically useful.
What stinks, in my view, is we're not that far from being able to have that kind of search ourselves. Their keyword data looks a lot like our categories. The keywords themselves are classified into groups (to help people find the right keywords, but not to classify the images), and there is keyword disambiguation data.
The biggest difference, as far as I can tell, is that we're utterly paranoid about "over categorization". While they apply all that are appropriate, people on commons are constantly trying to reduce images to a few.. or even one category. It's nuts and it clearly doesn't work.
A typical image in getty's web collection will have something between 20 and 40 'keywords' assigned to them. We have an average of 2.9 (including all the license cats).
On 8/12/07, Gregory Maxwell gmaxwell@gmail.com wrote:
What is flawed is the model of "user contributed content" which lacks strong facilities for collaboration. We understand collaboration, we have strong facilities for it. This is why "Joe uses 'dog', John uses 'dogs'" isn't and shouldn't be a hard problem for us. With collaboration we can just go fix all of it.
Or just pull the redirects from en and use them as data for a basic disambiguation system.
Different audience, they cater to commercial stock photography.. advertisements and such, while our primary customer is an encyclopedia, but that has nothing to do with annotation.
Their annotation is fantastic. For example, searching for "kangaroo fighting" ... gives you only pictures of kangaroos fighting. Searching for "kangaroo costume" asks you to clarify if you want "traditional clothing" or "Costume (Dressing Up)"; depending on your choice, you either get a painting of people in what appears to be tribal dress cooking a kangaroo, or you get pictures of people in cheesy kangaroo costumes.
This blows away anything that we currently offer. It's fantastically useful.
Depends. You can get something close to that if you use wikipedia as your commons search engine (of course in the case of en you will also hit a load of images that are not on commons but give it time).
What stinks, in my view, is we're not that far from being able to have that kind of search ourselves. Their keyword data looks a lot like our categories. The keywords themselves are classified into groups (to help people find the right keywords, but not to classify the images), and there is keyword disambiguation data.
The biggest difference, as far as I can tell, is that we're utterly paranoid about "over categorization". While they apply all that are appropriate, people on commons are constantly trying to reduce images to a few.. or even one category. It's nuts and it clearly doesn't work.
A typical image in getty's web collection will have something between 20 and 40 'keywords' assigned to them. We have an average of 2.9 (including all the license cats).
To an extent you could get around that by looking at the wikipedia articles images appear in.
On 12/08/07, geni geniice@gmail.com wrote:
A typical image in getty's web collection will have something between 20 and 40 'keywords' assigned to them. We have an average of 2.9 (including all the license cats).
To an extent you could get around that by looking at the wikipedia articles images appear in.
Useful for the images which *do* appear in articles, but we have a lot of surplus.
Say I go off one day, with my camera, and I take a set of photos of something. I upload a dozen from different angles, close-ups and wide shots, a nice variety of photos. All are tagged in the same way with much the same description. And then I put one in the Wikipedia article - it would be silly to include all these others, they'd just be repetitive clutter.
The search should really return them all, not just the one I happened to pick as most suitable for an encyclopedia.
On 8/12/07, Andrew Gray shimgray@gmail.com wrote:
Useful for the images which *do* appear in articles, but we have a lot of surplus.
Say I go off one day, with my camera, and I take a set of photos of something. I upload a dozen from different angles, close-ups and wide shots, a nice variety of photos. All are tagged in the same way with much the same description. And then I put one in the Wikipedia article - it would be silly to include all these others, they'd just be repetitive clutter.
The search should really return them all, not just the one I happened to pick as most suitable for an encyclopedia.
Click the image on wikipedia and then look at the cats on it if you want more.
On 12/08/07, geni geniice@gmail.com wrote:
The search should really return them all, not just the one I happened to pick as most suitable for an encyclopedia.
Click the image on wikipedia and then look at the cats on it if you want more.
Yes, but that isn't a very efficient patch for a Commons search engine.
Andrew Gray wrote:
On 12/08/07, geni wrote:
A typical image in getty's web collection will have something between 20 and 40 'keywords' assigned to them. We have an average of 2.9 (including all the license cats).
To an extent you could get around that by looking at the wikipedia articles images appear in.
Useful for the images which *do* appear in articles, but we have a lot of surplus.
Say I go off one day, with my camera, and I take a set of photos of something. I upload a dozen from different angles, close-ups and wide shots, a nice variety of photos. All are tagged in the same way with much the same description. And then I put one in the Wikipedia article - it would be silly to include all these others, they'd just be repetitive clutter.
The search should really return them all, not just the one I happened to pick as most suitable for an encyclopedia.
Add to the wikipedia article: "Commons has a gallery/category about X"
On 8/12/07, Brianna Laugher brianna.laugher@gmail.com wrote:
- seriously multilingual - today's POTD had captions in 19 languages, and I think usually it is more.
Just wondering... are there any larger projects on the internet that are multilingual in the same degree as Commons?
Bryan