Hello,
I am happy to relay that Wikimedia Commons now has over two million files. This is around 11 months after we reached one million. Since March 2007 we routinely have over 100,000 files uploaded every single month. It is becoming more and more common to have over 5,000 files in a single day.
http://commons.wikimedia.org/wiki/Commons:Press_releases/2M
This is not much of a traditional press release but will be interesting for Wikimedians who are not closely involved with Commons (and maybe those who are, too). It lists some recent innovations and tools that we have introduced to improve the quality and ease of access of the Commons collection.
I expect most people here are aware of the Picture of the Year competition, and Quality Images, and the Mayflower search engine ( http://tools.wikimedia.de/~tangotango/mayflower/ ), and the improved Ogg video/audio playback thanks to Tim Starling.
A couple of features people may not be aware of:
- geocoding - every picture that is geotagged gets a little "Earth" icon and clicking on it launches a map that shows the image located on the map, as well as other geotagged images nearby. It's really cool. The geocoding folk are very innovative and welcoming so if you are interested in helping them out or just finding out other cool ways this can be used, please talk to them. http://commons.wikimedia.org/wiki/Commons:Geocoding
- category RSS feeds - thanks to Magnus Manske each category now has an RSS feed for new files that get added to that category. If you are interested in some topic X and it has a category, you can keep tabs on what new files get added to it without having to go to Commons to check all the time. Just add the RSS feed link (it's on the category page under the toolbox) to your favourite feed reader. A full list of types of feeds available is here: http://commons.wikimedia.org/wiki/Commons:Feeds
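For anyone who would rather script this than use a feed reader, here is a minimal sketch of reading item titles and links out of an RSS 2.0 document with Python's standard library. The sample feed, its titles and the example.org links are made up for illustration; the real category feeds may carry different fields, so treat the structure as an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed: the titles and links are invented for this sketch
# and are not taken from a real Commons category feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example category feed</title>
    <item>
      <title>Image:Example one.jpg</title>
      <link>http://example.org/one</link>
    </item>
    <item>
      <title>Image:Example two.ogg</title>
      <link>http://example.org/two</link>
    </item>
  </channel>
</rss>"""


def new_files(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


if __name__ == "__main__":
    for title, link in new_files(SAMPLE_FEED):
        print(title, link)
```

In practice the XML would be downloaded from the feed link on the category page rather than embedded as a string.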
The 2M mark draws some comparison with English Wikipedia which recently passed the same milestone. However the comparison is not really valid. Where Wikipedia may expect basically one topic == one article (give or take section expansion such as 'History of X' having its own article instead of just 'X'), Commons would hope to have one topic == dozens of files.
Commons is still a very young project and it is both exciting and scary to imagine what it will be like with 10M files.
By #files we are the largest project. ;) By #users we are the 8th largest project. By #edits we are the 9th largest project.
It is a very interesting time to be involved with Commons. We face many social and technical problems that are yet to be resolved. Are our current processes for dealing with things (eg: uploading, tagging as copyvio, deletion discussion, deletion, adminship) scaling well - will they continue to cope or will they get overloaded? How can we face the joint challenges of welcoming new users on one hand and yet tackling the never-ending stream of copyvios on the other? How can we provide equal opportunity for our users to participate regardless of the languages they do or don't speak?
The social problems are ours to ponder, yet I can't help feeling that our work is made more difficult by using inappropriate tools. MediaWiki is a great tool for writing an encyclopedia in a single language, but it has some significant shortcomings when used in a multilingual media-based environment that only become more problematic as Commons gets larger. Therefore I sincerely hope that the Foundation will consider hiring or contracting developers specifically for problems that affect Commons, within the next couple of years.
regards, Brianna user:pfctdayelise
On 10/15/07, Brianna Laugher brianna.laugher@gmail.com wrote:
Hello,
A couple of features people may not be aware of:
- geocoding - every picture that is geotagged gets a little "Earth"
icon and clicking on it launches a map that shows the image located on the map, as well as other geotagged images nearby. It's really cool. The geocoding folk are very innovative and welcoming so if you are interested in helping them out or just finding out other cool ways this can be used, please talk to them. http://commons.wikimedia.org/wiki/Commons:Geocoding
I know I'm going offtopic, but anyway. I've asked around, and I haven't yet found out how to properly do scaling on geocoding.
For example, with [[Image:Uxmal.jpg]], no matter what I do to set the scale as described on many documentation pages ( | scale=N }} ), I always get a whole-continent view when clicking the little globe to open the miniatlas, instead of a zoomed view (maybe a few kilometers wide), which would be more appropriate for a photo of a building than a continent-wide one (thousands of km wide).
I've tried many things, as can be seen in the page history, and now, since people who know how will read this, could anyone fix the scaling on the image for me so I can see how it's done and proceed to do it on my other images?
Sorry again for going offtopic
What is the 2 000 000 uploaded file? ;D
Ok it must be really hard to know, but we need to pick one, at least for the press release.
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: http://lists.wikimedia.org/mailman/listinfo/foundation-l
I am happy to relay that Wikimedia Commons now has over two million files. This is around 11 months after we reached one million. Since
This is really exciting. Any stats on how quickly audio and video filetypes are growing?
A couple of features people may not be aware of:
- geocoding - every picture that is geotagged gets a little "Earth"
icon and clicking on it launches a map that shows the image located on the map, as well as other geotagged images nearby. It's really cool. The geocoding folk are very innovative and welcoming so if you are interested in helping them out or just finding out other cool ways this can be used, please talk to them. http://commons.wikimedia.org/wiki/Commons:Geocoding
These folks are amazing.
By #users we are the 8th largest project. By #edits we are the 9th largest project.
and by # of languages on one wiki, the largest by far...
It is a very interesting time to be involved with Commons. We face many social and technical problems that are yet to be resolved. Are our current processes for dealing with things (eg: uploading, tagging as copyvio, deletion discussion, deletion, adminship) scaling well - will they continue to cope or will they get overloaded? How can we face the joint challenges of welcoming new users on one hand and yet tackling the never-ending stream of copyvios on the other? How can we provide equal opportunity for our users to participate regardless of the languages they do or don't speak?
The social problems are ours to ponder, yet I can't help feeling that our work is made more difficult by using inappropriate tools. MediaWiki is a great tool for writing an encyclopedia in a single language, but it has some significant shortcomings when used in a multilingual media-based environment that only become more problematic as Commons gets larger. Therefore I sincerely hope that the Foundation will consider hiring or contracting developers specifically for problems that affect Commons, within the next couple of years.
Is there a list of outstanding feature requests for active commons users? A place for gathering problems that don't yet have good solutions?
SJ
On 10/16/07, Samuel Klein sj@laptop.org wrote:
By #users we are the 8th largest project. By #edits we are the 9th largest project.
and by # of languages on one wiki, the largest by far...
I think that we are one of the largest multilingual projects on the Internet, which is great.
Bryan
On 10/16/07, Bryan Tong Minh bryan.tongminh@gmail.com wrote:
I think that we are one of the largest multilingual projects on the Internet, which is great.
Bryan
I did some calculations, based on data in Wikipedia, and our administrators speak the mother languages of about 3.2 billion people. This means that about half of the world's population can make themselves understood in their mother language to one of our administrators. This calculation only includes native languages. If we also counted the people who speak some foreign languages fluently, I estimate that this number would be 4-5 billion.
This only concerns administrators. The number of languages in which the main page is available is far greater.
On 16/10/2007, Samuel Klein sj@laptop.org wrote:
I am happy to relay that Wikimedia Commons now has over two million files. This is around 11 months after we reached one million. Since
This is really exciting. Any stats on how quickly audio and video filetypes are growing?
None to hand. Having a regular breakdown of filetype would be an interesting thing to watch. I would guess or hope that Ogg uploads will increase now that they can be easily used. (Although having said that we have always had a decent number of pronunciation files and Spoken Wikipedia files.) Last time I checked SVGs were about 7% of total uploads which is pretty cool IMO. I was going to say Oggs are probably less than that but thinking about pronunciation files, I'm not sure.
Is there a list of outstanding feature requests for active commons users? A place for gathering problems that don't yet have good solutions?
I think my list from August is still current for me, at least: http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/32980/ Please read this link for a fuller explanation of the items below.
Big ticket items:
0. SUL. In the works.
1. Integrated Mayflower search. Mayflower is complete, I think this would just take some dedicated time (but of course things are never that easy).
2. Backups. I am told Wikimedia has backups now so hopefully this can be crossed off.
3. Multilingual tagging system (ie categories). This would be a huge messy change that would potentially affect a lot of parts in MediaWiki, so I think this will only happen with a dedicated developer.
4. Structured data. Maybe partially solved by things such as Semantic MW or Wikidata but would take some dedicated time to make acceptable to core MW.
5. Move/rename images functionality. Maybe an easier task as something that is hopefully more isolated.
6. Rating system. Travis contacted me about this so I hope we can get something going here.
7. Improved category handling (eg all subcats on front page, print total# of items in cat, sort by params other than alphabetical, 'auto-flatten' subcats). These are all individual things which could be picked away at by those interested.
8. Multilingual support. [Hopefully for categories solved with item 3] Making languages "stick" for anon users. Fixing RTL interface. Translating templates "automagically" like MW system messages. I'm guessing these are all capital D Difficult problems.
Medium ticket items:
9. Playback support. Greatly improved thanks to OggHandler, would be nice to have the same for OO files, Midi and... maybe that's it? any other file formats we can't display...
10. Category RSS - solved thanks to Magnus+toolserver. InstantCommons - don't know what its status is.
11. Global native MW Checkusage, CommonsTicker functionality. I guess the reason we don't have global native checkusage is because of performance reasons. These are maybe relatively easy to write as isolated things.
12. Enable ImportFreeImages extension. (auto-Flickr import) http://bugzilla.wikimedia.org/show_bug.cgi?id=8854 This extension has been enabled at Wikia but awaits review before being enabled here. We have since worked around it with toolserver+bot but it is less than ideal.
Small ticket items:
13. Ability to move images from another Wikimedia wiki to Commons (worked around with toolserver, bots to some extent). Maybe Special:Import just needs some tweaking? Maybe not /too/ hard?
14. User upload gallery function. Difficulty: relatively easy. (work around: toolserver)
15. Bulk upload function. Difficulty: ? (work around: user-created programs and scripts that have varying degrees of success and portability)
16. Auto-resize galleries according to window size. Difficulty: easy? (workaround: site JS)
17. Category and gallery Flickr-like 'previews'. Difficulty: at the difficult end of easy (workaround: site JS)
18. "Fotonotes" Difficulty: maybe a fun problem, difficult in itself, but relatively isolated (although how these things get recorded in the DB could be interesting)
19. Upload form flexibility. (such as ability to create purpose-specific forms) Difficulty: moderate.
We also have an incomplete list of bugs which could provide some nice bite-sized problems as well as some monsters like the above: http://commons.wikimedia.org/wiki/Commons:Bugs
An item I did not write in this list is that of "copyright checking" or "workflow" or something. I somewhat doubt that 5k images every day each get eyeballed by even one other person than the uploader. That is worrying. Patrol is not quite the right tool. Stable will not quite be the right tool. I don't know what the right tool will quite look like but I guess this is another "dedicated" difficult problem.
cheers, Brianna
On 10/16/07, Brianna Laugher brianna.laugher@gmail.com wrote:
On 16/10/2007, Samuel Klein sj@laptop.org wrote:
I am happy to relay that Wikimedia Commons now has over two million files. This is around 11 months after we reached one million. Since
This is really exciting. Any stats on how quickly audio and video filetypes are growing?
The 2 millionth file on Commons was an audio file. From the Commons database dumps, I extracted that http://commons.wikimedia.org/wiki/Image:Sr-Gornya_Crnucya.ogg, uploaded at 03:34 on 9 October 2007 by Nikola Smolenski, is the 2 millionth file.
Bryan
On 10/16/07, Brianna Laugher brianna.laugher@gmail.com wrote:
- Auto-resize galleries according to window size. Difficulty: easy?
(workaround: site JS)
Would be easy (for browsers that supply that information), but two problems:
* breaks caching => performance nightmare
* still needs JS when user resizes browser window
The JS call for rearranging could be called earlier (once the HTML is there, and don't wait for the thumbnails to load), which would make the rearranging (almost) invisible.
- Category and gallery Flickr-like 'previews'. Difficulty: at the
difficult end of easy (workaround: site JS)
Also: http://tools.wikimedia.de/~magnus/cgi-bin/flommons.pl
An item I did not write in this list is that of "copyright checking" or "workflow" or something. I somewhat doubt that 5k images every day each get eyeballed by even one other person than the uploader. That is worrying. Patrol is not quite the right tool. Stable will not quite be the right tool. I don't know what the right tool will quite look like but I guess this is another "dedicated" difficult problem.
I did start a toolserver-based review thingy, but got distracted...
Cheers, Magnus
Samuel:
This is really exciting. Any stats on how quickly audio and video filetypes are growing?
http://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm
For monthly figures scroll to 'Database records per namespace / Categorised articles / Binaries'
Latest figures are for June 2007 (later dump was broken, no surprises here)
June 2007:
Gif   Jpg   Mid   Ogg   Pdf   Png   Svg
34k   1.2M  879   31k   1.1k  262k  95k
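To put these counts in proportion, here is a small sketch that turns the June 2007 figures into percentage shares. The numbers are the rounded values from the table (34k taken as 34,000 and so on), so the results are approximate; the function name is just for illustration.

```python
# June 2007 file counts per type, taken from the rounded figures in the table
# (e.g. "1.2M" is treated as 1,200,000), so every share is approximate.
JUNE_2007 = {
    "gif": 34_000,
    "jpg": 1_200_000,
    "mid": 879,
    "ogg": 31_000,
    "pdf": 1_100,
    "png": 262_000,
    "svg": 95_000,
}


def shares(counts):
    """Return each file type's share of the total, in percent."""
    total = sum(counts.values())
    return {ext: 100.0 * n / total for ext, n in counts.items()}


if __name__ == "__main__":
    # Print the types from largest to smallest share.
    for ext, pct in sorted(shares(JUNE_2007).items(), key=lambda kv: -kv[1]):
        print(f"{ext}: {pct:.1f}%")
```

On these rounded figures SVG comes out a little under 6% of the listed files and Ogg a little under 2%.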
Erik Zachte
On Wed, 2007-10-17 at 00:30 +0200, Erik Zachte wrote:
Samuel:
This is really exciting. Any stats on how quickly audio and video filetypes are growing?
http://stats.wikimedia.org/wikispecial/EN/TablesWikipediaCOMMONS.htm
For monthly figures scroll to 'Database records per namespace / Categorised articles / Binaries'
Latest figures are for June 2007 (later dump was broken, no surprises here)
June 2007:
Gif   Jpg   Mid   Ogg   Pdf   Png   Svg
34k   1.2M  879   31k   1.1k  262k  95k
I'd be super interested in the licensing breakdown overall and by media type. :)
(Yes, I know this is complicated by many items being multi-licensed.)
On 16/10/2007, Brianna Laugher brianna.laugher@gmail.com wrote:
Hello,
I am happy to relay that Wikimedia Commons now has over two million files. This is around 11 months after we reached one million. Since March 2007 we routinely have over 100,000 files uploaded every single month. It is becoming more and more common to have over 5,000 files in a single day.
Let's hope most of those are actually within the project scope...
Brianna Laugher wrote:
Hello,
I am happy to relay that Wikimedia Commons now has over two million files. This is around 11 months after we reached one million. Since March 2007 we routinely have over 100,000 files uploaded every single month. It is becoming more and more common to have over 5,000 files in a single day.
http://commons.wikimedia.org/wiki/Commons:Press_releases/2M
This is not much of a traditional press release but will be interesting for Wikimedians who are not closely involved with Commons (and maybe those who are, too). It lists some recent innovations and tools that we have introduced to improve the quality and ease of access of the Commons collection.
Indeed not a traditional press release, but hugely interesting :-) Hugely... thank you so much for that page
It is a very interesting time to be involved with Commons. We face many social and technical problems that are yet to be resolved. Are our current processes for dealing with things (eg: uploading, tagging as copyvio, deletion discussion, deletion, adminship) scaling well - will they continue to cope or will they get overloaded? How can we face the joint challenges of welcoming new users on one hand and yet tackling the never-ending stream of copyvios on the other? How can we provide equal opportunity for our users to participate regardless of the languages they do or don't speak?
The social problems are ours to ponder, yet I can't help feeling that our work is made more difficult by using inappropriate tools. MediaWiki is a great tool for writing an encyclopedia in a single language, but it has some significant shortcomings when used in a multilingual media-based environment that only become more problematic as Commons gets larger. Therefore I sincerely hope that the Foundation will consider hiring or contracting developers specifically for problems that affect Commons, within the next couple of years.
You know... I would be interested in knowing more about how daily work is handled with regard to the multilingual situation, and whether there are specific issues that non-English languages raise and that are still unsolved.
With regards to the developers request, hopefully, if the fundraising goes well, your wish should be answered. http://wikimediafoundation.org/wiki/Job_openings Several new positions are planned with regards to developers.
Best
Ant
On 18/10/2007, Florence Devouard Anthere9@yahoo.com wrote:
You know... I would be interested in knowing more about how daily work is handled with regard to the multilingual situation, and whether there are specific issues that non-English languages raise and that are still unsolved.
A really big problem is the category one. We have a policy to only use English categories, because category redirects don't "work" properly (putting an image in one cat that redirects to another doesn't make the image available in the other cat). I think nobody is really happy with the solution, but no better one has been found to date.
A secondary problem is that of templates: although we have {{pd-self}} and {{pd-self/ru}}, if you use the Russian interface the Russian one doesn't "magically" show up, unfortunately. Also, we use a template {{information}} to store license, author etc. info about each file, and we don't really have a good way to let users use an equivalent in another language.
For image descriptions, we have little templates that mark the language, like {{en|description in english}} {{de|...}} {{fr|...}} {{zh|...}}
We have just started a bit of a drive to use these more (esp. English ones are often not used) and from that we can derive things like "find all images that are missing a description in language X". So in a few more years I can imagine that will be quite impressive. At the moment Picture of the Day often has around 15 or 20 different language descriptions. It will be good if that is extended to, say, Quality Images.
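As a rough illustration of the "find all images that are missing a description in language X" idea, here is a minimal sketch. It assumes descriptions are marked with templates whose name is a bare two- or three-letter language code, as in the examples above; the page titles and texts are made up, and a real tool would need a list of valid language codes, since this pattern would also match any other short template that takes a parameter.

```python
import re

# Assumption: a description is marked as {{xx|...}} where xx is a two- or
# three-letter lowercase language code. Longer codes (e.g. with a hyphen)
# and non-language templates are not handled by this simple pattern.
LANG_TEMPLATE = re.compile(r"\{\{\s*([a-z]{2,3})\s*\|")


def described_languages(wikitext):
    """Return the set of language codes that have a description template."""
    return {m.group(1) for m in LANG_TEMPLATE.finditer(wikitext)}


def missing_description(pages, lang):
    """Return the sorted titles of pages with no description in `lang`."""
    return sorted(title for title, text in pages.items()
                  if lang not in described_languages(text))


if __name__ == "__main__":
    # Hypothetical page texts, invented for this example.
    pages = {
        "Image:A.jpg": "{{en|A house}} {{de|Ein Haus}}",
        "Image:B.jpg": "{{fr|Une maison}}",
    }
    print(missing_description(pages, "en"))
```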
Although we try to be "multilingual" I am sad to say I don't think everyone can participate on an equal footing regardless of which language they speak. I think most policy and administration discussion occurs in English.
For participating in processes like deletion, I think you could go pretty well without using English -- if someone had come before you and done the necessary translation. The actual process is pretty automated, and I think we have done pretty well at inculcating the value that "anyone can participate here with any language". So a request in Russian might not get processed too quickly but the user at least won't be commanded to write in English.
In requests for adminship I think we look very happily on multilingual candidates and if you happen to speak a language that we don't yet have an admin for, it's really not too hard to have success. :)
Then I suppose there is the general "cultural" thing: when interacting with someone who speaks English as a second language, I sometimes wonder "this is really dramatic, I wonder if that is intentional?" So that's another element where you have to make "assume good faith" your mantra.
Anyway, it would be more interesting to hear about this from a non-native English speaker's perspective. :)
With regards to the developers request, hopefully, if the fundraising goes well, your wish should be answered. http://wikimediafoundation.org/wiki/Job_openings Several new positions are planned with regards to developers.
Yes, I sincerely hope so :)
cheers, Brianna
On 10/18/07, Florence Devouard Anthere9@yahoo.com wrote:
Indeed not a traditional press release, but hugely interesting :-) Hugely... thank you so much for that page
Seconded. Doing a "What's been happening in project X" update on milestones, similar to the release notes of software packages, is a brilliant idea.
wikimedia-l@lists.wikimedia.org