Towards a Commons API

Brianna Laugher

17 Aug 2007

1:47 p.m.

Hello,

First, an anecdote. I have had on my Google Reader for a while now, a feed from Technorati which picks up any blog posts with the words 'wikimedia commons'. A lot of crud comes through, but also a fair number of real bloggers who use our photos to illustrate their blog posts. And let me say, they almost always get it wrong. They fail to link. They fail to mention the license. They fail to mention the author. Probably we are lucky if they manage to mention the site name.

While I look at all these random blogs, I notice the proliferation of Flickr plugins... part of Flickr's success, I feel certain, is due to their API which allows anyone to easily "plug" Flickr into another application - a blog, a website, Facebook, etc. And in multiple ways: by license, by keyword, by author (or, close enough: uploader). They also have that

Commons could do this, but first we need to standardise. Anyone who actually has tried to write a tool to pick up this stuff will know it is hit and miss.

So, I am not really planning to work on this in any big hurry, but I'm just saying it for reference and in case anyone else has a particular interest in it.

The two main problems are keywords and licenses. Uploaders at least MediaWiki takes care of. :)

first, the easy one: licenses. There is a painful problem at the moment that we have no way of knowing which templates are license templates and which ones are not. New ones are created all the time and old ones may be converted to deletion templates. (ook.)

So, my proposal.

1. Ask for new License: namespace to be installed at Commons.
2. Move all license templates into the License: namespace.
3. Separate any template which conflates license and source, e.g. "PD-NASA", "GFDL-GeoDB" (or whatever). Anything which is in the public domain, regardless of how it got there, should have {{License:Public domain}}. Indicating source by text + template is fine.

Now I am not sure if there is actually a good reason we have license categories. Is it safe to assume that no one ever searches via license? If so, is there any extra functionality we gain from having the category? If not, we can quit using categories as well as templates to indicate licenses. If it is useful, we should change all license categories to be prefixed with License:. So instead of [[Category:GFDL]] we would have [[category:License:GFDL]].

I know technically "Public domain" is not a license, but close enough.

So, the second problem, keywords. Let's reduce this to an easier (although less complete :)) problem: categories. Being able to have per-category feeds would be extremely cool. Imagine such a feed on QI or FP. Totally awesome.

So in general we can assume that categories on a file act as describing keywords. There are two exceptions. One is license categories (see above). The other is maintenance categories, such as deletion, cleanup, user. So I propose we rename all these kinds of categories, to "Maintenance:X" or "Meta:X" for deletion and cleanup, and I guess "User:X" for user. I dunno. maybe these don't interfere too much. Sometimes they are useful. This change is not as important as the license one.
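A minimal sketch of the idea above: treat a file's categories as keywords once license and maintenance categories are set aside. It assumes the "License:", "Maintenance:", "Meta:" and "User:" prefixes proposed in this message had been adopted, which they have not; the function name and sample categories are made up for illustration.

```python
# Hypothetical prefixes taken from the proposal above; none of them exist on Commons.
NON_KEYWORD_PREFIXES = ("License:", "Maintenance:", "Meta:", "User:")


def keyword_categories(categories):
    """Return only the categories that describe the subject of the file."""
    # str.startswith accepts a tuple, so one call tests every prefix.
    return [c for c in categories if not c.startswith(NON_KEYWORD_PREFIXES)]


example = ["Bridges in Belgium", "License:GFDL", "Maintenance:Cleanup", "User:Example"]
print(keyword_categories(example))
```

With a naming convention like this, a per-category feed could skip non-descriptive categories without keeping a hand-maintained exclusion list.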

In general, I think it is good for us to look at Flickr and say: how do they facilitate sharing their content? how can we do that too?

cheers, Brianna user:pfctdayelise

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Magnus Manske

17 Aug 2007

3:03 p.m.

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

While I look at all these random blogs, I notice the proliferation of Flickr plugins...part of Flickr's success, I feel certain, is due to their API which allows anyone to easily "plug" Flickr into another application - a blog, a website, facebook, etc. And in multiple ways: by license, by keyword, by author(or, close enough: uploader). They also have that

Commons could do this, but first we need to standardise. Anyone who actually has tried to write a tool to pick up this stuff will know it is hit and miss.

Well, there is one catch prior to that: IIRC, Flickr plugins transclude the image from the flickr servers. Would we want this for commons - using our bandwidth to display images on blog pages? (I'm not saying yes or no here, just asking the question)

...

first, the easy one: licenses. There is a painful problem at the moment that we have no way of knowing which templates are license templates and which ones are not. New ones are created all the time and old ones may be converted to deletion templates. (ook.)

Since I have written my share of tools dealing with licenses, here's what I recommend: Don't go for templates, but for categories. See http://commons.wikimedia.org/wiki/MediaWiki:GalleryDetails.js for a brief list of good/bad categories that, in practice, cover most images quite well (note that these are /beginnings/ of category names, so "CC-" fits any category starting with "CC-").
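The prefix matching described here might look roughly like the sketch below. The prefixes are illustrative guesses only and are not copied from MediaWiki:GalleryDetails.js; the function names are likewise invented for this example.

```python
# Illustrative prefixes only; the real list lives in MediaWiki:GalleryDetails.js
# and is not reproduced here.
GOOD_PREFIXES = ("CC-", "GFDL", "PD")
BAD_PREFIXES = ("Copyright violation", "Unknown")


def classify_category(name):
    """Classify one category name as 'good', 'bad' or None by how it begins."""
    if name.startswith(BAD_PREFIXES):
        return "bad"
    if name.startswith(GOOD_PREFIXES):
        return "good"
    return None


def license_status(categories):
    """'bad' if any bad category is present, else 'good' if any good one, else 'unknown'."""
    verdicts = {classify_category(c) for c in categories}
    if "bad" in verdicts:
        return "bad"
    if "good" in verdicts:
        return "good"
    return "unknown"


print(license_status(["CC-BY-SA-2.5", "Bridges in Belgium"]))
```

Prefix matching is cheap but can over-match: a short prefix such as "PD" would also catch an unrelated category that happens to start with those letters.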

...

so, my proposal.

ask for new License: namespace to be installed at Commons.

move all license templates into the License: namespace.

separate any template which conflates license and source, e.g.

"PD-NASA", "GFDL-GeoDB" (or whatever). Anything which is in the public domain, regardless of how it got there, should have {{License:Public domain}}. Indicating source by text + template is fine. Now I am not sure if there is actually a good reason we have license categories. Is it safe to assume that no one ever searches via license? If so, is there any extra functionality we gain from having the category? If not, we can quit using categories as well as templates to indicate licenses. If it is useful, we should change all license categories to be prefixed with License:. So instead of [[Category:GFDL]] we would have [[category:License:GFDL]].

Well, few people will search starting at [[Category:GFDL]]. However, if the problem is to display the license of a certain image, it is very useful. It would be easier to prefix all the license categories, but only marginally so. For a complete list of all license categories, get the subcategories of [[Category:Copyright statuses]] (or of a few of these). This could be done centrally (toolserver) once a day.

...

So, the second problem, keywords. Let's reduce this to an easier (although less complete :)) problem: categories. Being able to have per-category feeds would be extremely cool. Imagine such a feed on QI or FP. Totally awesome.

Just FYI: Ages ago, I added a Recent Changes filter for changes in a category and its subcategories. It should still be in the code, but (guess what) deactivated.

...

So in general we can assume that categories on a file act as describing keywords. There are two exceptions. One is license categories (see above). The other is maintenance categories, such as deletion, cleanup, user. So I propose we rename all these kinds of categories, to "Maintenance:X" or "Meta:X" for deletion and cleanup, and I guess "User:X" for user. I dunno. maybe these don't interfere too much. Sometimes they are useful. This change is not as important as the license one.

In general, I think it is good for us to look at Flickr and say: how do they facilitate sharing their content? how can we do that too?

Full agreement :-)

Magnus

Andrew Gray

3:16 p.m.

On 17/08/07, Magnus Manske magnusmanske@googlemail.com wrote:

...

...
Commons could do this, but first we need to standardise. Anyone who actually has tried to write a tool to pick up this stuff will know it is hit and miss.

Well, there is one catch prior to that: IIRC, Flickr plugins transclude the image from the flickr servers. Would we want this for commons - using our bandwidth to display images on blog pages? (I'm not saying yes or no here, just asking the question)

Right now we don't take any steps to prevent this and I see it happening reasonably often - is it likely to be a significant blip compared to our usual bandwidth overheads?

In many ways, the answer to "do we want to?" is pretty intertwined with "what's the point of Commons?" - are we primarily a service for Wikimedia projects, or for the internet at large?

-- - Andrew Gray andrew.gray@dunelm.org.uk

Brianna Laugher

3:32 p.m.

On 18/08/07, Andrew Gray shimgray@gmail.com wrote:

...

In many ways, the answer to "do we want to?" is pretty intertwined with "what's the point of Commons?" - are we primarily a service for Wikimedia projects, or for the internet at large?

We are not a service "for the internet". We are a service (if we are even a service) for the people of the world.

IIRC almost all the projects started because they were decided to be "not Wikipedia". That doesn't mean they are content to define themselves with such a negative, small designation.

I find the answer to your questions easily here: http://wikimediafoundation.org/wiki/Mission "...to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally." I don't see how acting as if we only exist for Wikimedia is effective for disseminating our content at all. For one thing there are many similar efforts by different groups around the world. If we act as if we only exist for Wikimedia, we are going to create a lot of wasted unnecessary duplicate effort.

Encouraging people to realise that free content - the concept - exists, as well as the material thing, and then giving them easy ways to incorporate it into their own work - these are all small but necessary steps in spreading understanding and knowledge of free content and free culture.

Lastly: frankly, *if we don't do it, someone else will.* There is nothing to stop them since all our stuff is freely licensed. So for God's sake let us be the ones to do it and benefit from it. Lest we see some whiz-bang Yahoo app that feeds directly from Commons with our name in tiny tiny print somewhere in a disclaimer.

regards Brianna

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Andrew Gray

4 p.m.

On 17/08/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

IIRC almost all the projects started because they were decided to be "not Wikipedia". That doesn't mean they are content to define themselves with such a negative, small designation.

Hey, hey, I just threw it out there. Nothing like navelgazing for a Friday afternoon... :-)

It's honestly something that you hear mumbled a lot, and bits of Commons itself - much less anyone else - seem quite confused over. Who are we doing this *for*? Comments like the bandwidth one made me think of it again.

...

I find the answer to your questions easily here: http://wikimediafoundation.org/wiki/Mission "...to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally." I don't see how acting as if we only exist for Wikimedia is effective for disseminating our content at all. For one thing there are many similar efforts by different groups around the world. If we act as if we only exist for Wikimedia, we are going to create a lot of wasted unnecessary duplicate effort.

I don't disagree with any of this, incidentally.

...

Lastly: frankly, *if we don't do it, someone else will.* There is nothing to stop them since all our stuff is freely licensed. So for God's sake let us be the ones to do it and benefit from it. Lest we see some whiz-bang Yahoo app that feeds directly from Commons with our name in tiny tiny print somewhere in a disclaimer.

Perhaps the most pressing reason!

We could probably implement something very similar to flickr's basic "use this image" approach with a link in the toolbox; a "select the size you want" page with preformatted HTML to use it and link back to us.

(e.g. http://www.flickr.com/photo_zoom.gne?id=1116183939&size=m )

This would probably be the simplest thing to get up and running - he said, waving his hands - and probably more generally useful than most of the other API stuff. However, we'd still need to find a way of getting author and license information standardised, so that the tool could pull them in.
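As a rough sketch of what such a "use this image" page could hand out, the function below builds a preformatted HTML snippet that shows a thumbnail and links back to the image page. The function name, parameters, link text and example.org URLs are all assumptions for illustration; author and license are optional because, as noted, that data is not yet standardised.

```python
from html import escape


def embed_html(thumb_url, page_url, title, author=None, license_name=None):
    """Build an HTML snippet that shows a thumbnail and links back to the image page."""
    page = escape(page_url, quote=True)
    credit = [f'<a href="{page}">Wikimedia Commons</a>']
    if author:
        credit.insert(0, "by " + escape(author))
    if license_name:
        credit.append("license: " + escape(license_name))
    # The thumbnail itself is also a link back to the image page.
    return (
        f'<a href="{page}">'
        f'<img src="{escape(thumb_url, quote=True)}" alt="{escape(title, quote=True)}" /></a>'
        f'<br /><small>{", ".join(credit)}</small>'
    )


print(embed_html("https://example.org/thumb.jpg", "https://example.org/page",
                 "Example image", author="Jane Doe", license_name="GFDL"))
```

Escaping every value matters here because titles and author names come from wiki pages and may contain characters such as `&` or `<`.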

I wonder how many of our images use the nicely-standardised {{Information}} template for metadata?

-- - Andrew Gray andrew.gray@dunelm.org.uk

Bryan Tong Minh

6:32 p.m.

On 8/17/07, Andrew Gray shimgray@gmail.com wrote:

...

I wonder how many of our images use the nicely-standardised {{Information}} template for metadata?

{{Information}} is transcluded 939691 times. So say about 50% of the images.

Bryan

Gregory Maxwell

6:41 p.m.

On 8/17/07, Bryan Tong Minh bryan.tongminh@gmail.com wrote:

...

On 8/17/07, Andrew Gray shimgray@gmail.com wrote:

...
I wonder how many of our images use the nicely-standardised {{Information}} template for metadata?

{{Information}} is transcluded 939691 times. So say about 50% of the images.

Although there are ~520 other templates which transclude {{Information}} in order to wrap around it... I don't know that any of these wrappers are widely used, but they'll break any effort to machine read the data in information template arguments.

Magnus Manske

8:09 p.m.

On 8/17/07, Andrew Gray shimgray@gmail.com wrote:

...

We could probably implement something very similar to flickr's basic "use this image" approach with a link in the toolbox; a "select the size you want" page with preformatted HTML to use it and link back to us.

(eg/ http://www.flickr.com/photo_zoom.gne?id=1116183939&size=m )

Like - this? ;-) http://commons.wikimedia.org/wiki/Image:Oorgatbrug_schematisch.jpg?withJS=Me...

I couldn't resist *that* temptation!!!

...

This would probably be the simplest thing to get up and running - he said, waving his hands - and probably more generally useful than most of the other API stuff. However, we'd still need to find a way of getting author and license information standardised, so that the tool could pull them in.

I'm working on that ;-)

I could probably "screenscrape" the categories and find those that look like a license.

Determining the author might prove harder. There's the upload log (latest uploader if multiple?) and possible screenscraping of the Information template. Tricky.

Help is welcome! :-)

Magnus

Gregory Maxwell

8:33 p.m.

On 8/17/07, Magnus Manske magnusmanske@googlemail.com wrote:

...

Like - this? ;-) http://commons.wikimedia.org/wiki/Image:Oorgatbrug_schematisch.jpg?withJS=Me...

Change that to appear above the image at the top like flickr does, with a few less options, and a prominent original. ... and take it live site wide please. Thankx.

I get a couple emails a month now asking for higher resolution versions of my images, when the higher resolution image was always on Commons but the full resolution link was just not obvious enough.

...

I'm working on that ;-)

I could probably "screenscrape" the categories and find those that look like a license.

Don't bother yet. We should fix the license stuff to be a bit more machine readable (A prefix which will only be used for approved and acceptable licenses).

Anything else would be madness. If we're not making some effort to keep a uniform interface on the data source side you'll have to hopelessly chase a moving target forever. I.e. some random user will decide he wants to use [[Category:By-SA]] rather than [[Category:CC-By-SA]] and you'll miss it. We just need to say something like:

"An image isn't licensed unless there is a transclusion from the license namespace of the form {{License:foo}} directly included in the wikitext, and all pages in the license namespace must be commons compatible community approved licenses which also apply [[Category:Licensed under foo]] style categories."

This would be no real burden on users, and it would make the data much more machine readable. It would also solve other issues, like preventing people from randomly creating acceptable looking but invalid license templates (such as the, long ago fixed, "it came from the Library of Congress, thus it's PD" template).

If we only use categories for license integration we can't slow people from just inventing new ones at will..

If we use a template applied directly to the page, then we could skip the category entirely.. but keeping it is harmless, and can make integration with simple category based search tools simpler.

The direct applied template approach is a real boon for machine reading/editing. It's what Enwp did with the Non-free templates on enwp, and I can point you to a half dozen bot authors who were thankful for the change.
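To illustrate why a directly applied template is easy to machine read, here is a small sketch that pulls {{License:foo}} transclusions out of page wikitext. The "License:" namespace is the proposal under discussion, not something that exists, and the regular expression and function name are assumptions made for this example.

```python
import re

# Matches {{License:foo}} or {{License:foo|args}}. The "License:" namespace is
# hypothetical; it is the proposal being discussed in this thread.
LICENSE_RE = re.compile(r"\{\{\s*License:\s*([^|{}]+?)\s*(?:\||\}\})")


def direct_licenses(wikitext):
    """Return the names of license templates transcluded directly in the wikitext."""
    return [m.group(1) for m in LICENSE_RE.finditer(wikitext)]


page = "{{Information\n|Author=Jane Doe\n}}\n{{License:GFDL}}\n{{License:CC-BY-SA|2.5}}"
print(direct_licenses(page))
```

Under the rule quoted above, a page for which this returns an empty list would simply count as unlicensed, with no guessing from category names.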

...

Determining the author might prove harder. There's the upload log (latest uploader if multiple?) and possible screenscraping of the Information template. Tricky.

Help is welcome! :-)

Scrape the text out of the author field. If someone has done anything which is at all hard to read, like including a template in its value, or used a wrapper rather than Information directly: just fail to extract the data.

We can go clean it up. Anything else is futile.
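A sketch of that "extract or fail" approach, under the assumption that {{Information}} is written with one parameter per line. The function name and regular expressions are illustrative; anything the sketch cannot read plainly returns None rather than a guess.

```python
import re

# Assumes one template parameter per line, e.g. "|Author=Jane Doe".
AUTHOR_RE = re.compile(r"^\s*\|\s*[Aa]uthor\s*=(.*)$", re.MULTILINE)


def extract_author(wikitext):
    """Return the plain-text Author value, or None if it cannot be read safely."""
    # Only accept pages that use {{Information}} directly, not a wrapper template.
    if not re.search(r"\{\{\s*Information\s*(\||\n)", wikitext):
        return None
    match = AUTHOR_RE.search(wikitext)
    if match is None:
        return None
    value = match.group(1).strip()
    # Give up on anything hard to read, such as a nested template in the value.
    if not value or "{{" in value or "}}" in value:
        return None
    return value


page = "{{Information\n|Description=A bridge\n|Author=Jane Doe\n|Source=own\n}}"
print(extract_author(page))
```

Failing loudly keeps the extractor simple; the pages it rejects become a cleanup list instead of a growing pile of special cases.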

Next thing we need is "commons lint". Basically a script that checks an image page for errors (like an inability to extract the author or license data, malformed geocoding, and other machine detectable problems).

We should invoke that script using javascript when someone views the image page on commons. So if you view an image with problems you'll get some kind of red flag.

We could even go so far as hooking the edit page, so that page save triggers the check script first and interrupts saving if the check fails... or at least yells at you a lot. ... I wouldn't want to do that until it was well tested in advisory mode, however.
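The shape of such a "commons lint" could be as simple as a list of checks plus an advisory switch. The sketch below is in Python rather than the JavaScript suggested above, the two checks are placeholders, and the {{License:...}} form is the hypothetical namespace from this thread; all names are invented for illustration.

```python
import re


def check_license(wikitext):
    """Report a problem if no directly applied {{License:...}} template is found."""
    if not re.search(r"\{\{\s*License:", wikitext):
        return "no {{License:...}} template found"
    return None


def check_author(wikitext):
    """Report a problem if there is no non-empty Author line."""
    if not re.search(r"^\s*\|\s*[Aa]uthor\s*=[ \t]*\S", wikitext, re.MULTILINE):
        return "no readable Author field"
    return None


CHECKS = [check_license, check_author]


def lint(wikitext):
    """Run every check and return the list of problems found."""
    return [p for p in (check(wikitext) for check in CHECKS) if p]


def may_save(wikitext, advisory=True):
    """In advisory mode saving always proceeds; otherwise any problem blocks it."""
    problems = lint(wikitext)
    return (advisory or not problems), problems


print(may_save("{{Information\n|Author=\n}}", advisory=True))
```

Keeping the checks in a plain list makes it easy to add further ones, such as geocoding validation, and to run the whole set in advisory mode before letting it block saves.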

Magnus Manske

9:53 p.m.

OK, new gimmicks:
* On top of the image (Flickr-like) instead of toolbox
* Full resolution emphasized
* Automatic author detection from Infobox (no author info if this does not work)
* Automatic license scraping (refers to commons page for license otherwise)

I noticed something when I wanted to link to the license on commons: Where to link? The template page? Not ideal, I guess. [[Commons:GFDL]] doesn't exist. [[GFDL]] is not the real fun either. I'm linking to [[Category:GFDL]] for now.

Magnus

P.S.: Someone please fix my layout. I'm too tired to throw margin/padding CSS around today ;-)

Andrew Gray

18 Aug 2007

12:07 a.m.

On 17/08/07, Magnus Manske magnusmanske@googlemail.com wrote:

...

I noticed something when I wanted to link to the license on commons: Where to link? The template page? Not ideal, I guess. [[Commons:GFDL]] doesn't exist. [[GFDL]] is not the real fun either. I'm linking to [[Category:GFDL]] for now.

The Wikipedia articles on the GFDL, etc? That hits language issues, but the tool's already producing English output, so...

-- - Andrew Gray andrew.gray@dunelm.org.uk

Brianna Laugher

17 Aug 2007

11:55 p.m.

On 18/08/07, Gregory Maxwell gmaxwell@gmail.com wrote:

...

On 8/17/07, Magnus Manske magnusmanske@googlemail.com wrote:

...
Like - this? ;-) http://commons.wikimedia.org/wiki/Image:Oorgatbrug_schematisch.jpg?withJS=Me...

Change that to appear above the image at the top like flickr does, with a few less options, and a prominent original. ... and take it live site wide please. Thankx.

Wow. Nice work again Magnus. that is.... fricking whoa. :D

some minor comments. Instead of offering 6 choices, just offer one. 200px or 250px. Change those six links to just be a single option "Use this image on your webpage outside Wikimedia". Make sure the link has a nice mouseover so it's not scary (if you can do this with JS... btw this function strikes me as a perfect little MW extension). Now for the HTML you give them, ('Title Goes Here'? *grins*) change the linktext of 'source' to 'from Wikimedia Commons', and remove the direct link to the fullsize file. Also, I am pretty sure the devs will crack it if you point to thumb.php for every single use. We have perfectly good cached thumbs, do we not? good enough for Wikimedia, good enough for everyone else. :)

For our toolserver tools, what do they point to? Is there some 200px thumb cache they use? I had an idea that there was...

I suppose if we actually rolled this out sitewide we would need to talk to the devs and see what they think...

cheers Brianna

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Gregory Maxwell

18 Aug 2007

12:54 a.m.

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

some minor comments. Instead of offering 6 choices, just offer one. 200px or 250px.

The full option is useful to users who don't want to do HTML, but just want the full image. I hope we keep that. It's much more user friendly than the normal link (and should be made so by just making it say 'Full Size') and will reduce a lot of confusion that we see today.

I agree that we don't need six options, we should just make sure the HTML is easily changed by anyone with half a clue.

...

Change those six links to just be a single option "Use this image on your webpage outside Wikimedia"

I think we need to draw the line at being bold when we get to the point of spending the Foundation's money, or more clearly our donors' money.

A quick 'within an order of magnitude' back of the napkin calculation indicates that each 250px thumbnail image view would cost the foundation about 1/2054 cents US in bandwidth. If we're successful it would add up to real amounts of money fairly quickly.
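To see how quickly that adds up, the estimate can be scaled to a few view counts. The only figure taken from the message is 1/2054 cents per view; the view counts and the function name are arbitrary choices for illustration.

```python
# Order-of-magnitude estimate from the message above: 1/2054 US cents per view.
CENTS_PER_VIEW = 1 / 2054


def bandwidth_cost_usd(views):
    """Estimated bandwidth cost in US dollars for a number of 250px thumbnail views."""
    return views * CENTS_PER_VIEW / 100


for views in (1_000_000, 100_000_000, 1_000_000_000):
    print(f"{views:>13,} views: about ${bandwidth_cost_usd(views):,.2f}")
```

At this rate a million views cost under five dollars, while a billion views run to several thousand, which is the sense in which success "would add up to real amounts of money".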

It might well be the cheapest and best advertising Wikimedia and Commons could ever get. ...But because endorsing this activity has a clear risk of costing real money, we should make sure that the right people are involved. Once we've endorsed this behavior it will be hard to turn it off without making people angry and taking bad PR.

I also think we should make an effort to maximize the value of our expenditure if we do this. For example, we should require sites using our HTML to preserve the author's name and the link back to Commons. We could enforce this too, by watching referer logs, and blocking sites that deep link the image without a proper link back.
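One way such a referer check might work, once a referring page has been fetched, is to test whether it embeds the image without linking back. This is only a sketch: the class and function names are invented, the example.org URLs are placeholders, and a real check would also need to normalise URLs before comparing them.

```python
from html.parser import HTMLParser


class _LinkScan(HTMLParser):
    """Collect every <a href> and <img src> value found in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])
        if tag == "img" and attrs.get("src"):
            self.srcs.append(attrs["src"])


def embeds_without_link_back(page_html, image_url, image_page_url):
    """True if the page hotlinks image_url but never links to image_page_url."""
    scan = _LinkScan()
    scan.feed(page_html)
    return image_url in scan.srcs and image_page_url not in scan.hrefs


bare = '<img src="https://example.org/t.jpg" />'
print(embeds_without_link_back(bare, "https://example.org/t.jpg", "https://example.org/page"))
```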

We should also advise users to seek legal advice before embedding copylefted images into non-free works, as it may violate the terms of the licenses depending on their use; we could even customize this message by license tag.

(I for one am tired of seeing my images show up on other sites without the smallest effort at following the license ... Commons should do what it can to improve this situation).

My other worry:

A huge part of the reason people embed Flickr images is because Flickr is cheap/free image hosting for blogs, forums, and other websites. In many cases they are not using Flickr's library of existing images as much as uploading their own images. (What Web 2.0 profit dreams Flickr has for this is anyone's guess ;) ). I'm not sure what percentage of external Flickr use is the existing repository vs images uploaded for pure hosting purposes, but I do know that we shouldn't be encouraging that sort of use unless we are to abandon the "useful" part of our mission. (And abandoning that would put us outside of the scope of the Wikimedia Foundation)

I suppose we can deal with this one as it comes... but we need to be careful to not confuse popularity with success.

...

Make sure the link has a nice mouseover so it's not scary (if you can do this with JS... btw this function strikes me as a perfect little MW extension).

He should use an onclick. Set href="#". Javascript URLs are Evil(tm).

Brianna Laugher

1:27 p.m.

On 18/08/07, Gregory Maxwell gmaxwell@gmail.com wrote:

...

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...
some minor comments. Instead of offering 6 choices, just offer one. 200px or 250px.

The full option is useful to users who don't want to do HTML, but just want the full image. I hope we keep that. It's much more user friendly than the normal link (and should be made so by just making it say 'Full Size') and will reduce a lot of confusion that we see today.

Ah yes. Although it is redundant. It would be nice if eventually what we're doing was default and then the link no one uses was removed.

...

...
Change those six links to just be a single option "Use this image on your webpage outside Wikimedia"

I think we need to draw the line at being bold when we get to the point of spending the Foundation's money, or more clearly our donors' money.

I agree entirely with this part of your post. my enthusiasm for the notion may have clouded my expressing this. :)

...

A huge part of the reason people embed Flickr images is because Flickr is cheap/free image hosting for blogs, forums, and other websites. In many cases they are not using Flickr's library of existing images as much as uploading their own images. (What Web 2.0 profit dreams Flickr has for this is anyone's guess ;) ). I'm not sure what percentage of external Flickr use is the existing repository vs images uploaded for pure hosting purposes, but I do know that we shouldn't be encouraging that sort of use unless we are to abandon the "useful" part of our mission. (And abandoning that would put us outside of the scope of the Wikimedia Foundation)

Hm. I agree. But I don't think allowing our images to be easily reused implies we want people to add their images to our collection. I think we can knock that kind of behaviour (people uploading personal collections) on the head easily enough.

cheers, Brianna

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Magnus Manske

4:20 p.m.

OK, new version:
* One size only (larger dimension = 250px), plus "full resolution"
* Now also works for SVG (no "full resolution", though)
* Licenses cleanup, link to en.wikipedia now (though we might have to create some redirects there...)
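The "larger dimension = 250px" rule amounts to a small scaling calculation, sketched below. The function name is made up, and the sketch also scales up images smaller than 250px, which may or may not match what the actual script does.

```python
def thumb_size(width, height, max_side=250):
    """Scale (width, height) so the larger side equals max_side, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = max_side / max(width, height)
    # Never let the shorter side round down to zero pixels.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(thumb_size(1000, 500))
print(thumb_size(500, 1000))
```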

Example links (to save you scrolling;-)
http://commons.wikimedia.org/wiki/Image:SantAgnese_00247.JPG?withJS=MediaWik...
http://commons.wikimedia.org/wiki/Image:Flag_of_Costa_Rica_%281842-1848%29.s...

The visual aspect of the bar is still a little flimsy IMHO, somehow...

Magnus

Florian Straub

20 Aug 2007

8:22 a.m.

"Brianna Laugher" brianna.laugher@gmail.com wrote:

...

On 18/08/07, Gregory Maxwell gmaxwell@gmail.com wrote:

...
On 8/17/07, Magnus Manske magnusmanske@googlemail.com wrote:

...
Like - this? ;-)

http://commons.wikimedia.org/wiki/Image:Oorgatbrug_schematisch.jpg?withJS=Me...

Change that to appear above the image at the top like flickr does, with a few less options, and a prominent original. ... and take it live site wide please. Thankx.

Wow. Nice work again Magnus. that is.... fricking whoa. :D

I second that. Although it seems to have problems with one of these previous scripts:

includePage( 'MediaWiki:ResizeGalleries.js' );
includePage( 'MediaWiki:HotCat.js' );
includePage( 'MediaWiki:Cat-a-lot.js' );
includePage( 'MediaWiki:Check-usage.js' );
includePage( 'MediaWiki:Flickrfixr.js' );

After I commented them out, it worked ...

Regards,

Flo

Magnus Manske

9:08 a.m.

On 8/20/07, Florian Straub Flominator@gmx.net wrote:

...

I second that. Although it seems to have problems with one of these previous scripts:

includePage( 'MediaWiki:ResizeGalleries.js' );
includePage( 'MediaWiki:HotCat.js' );
includePage( 'MediaWiki:Cat-a-lot.js' );
includePage( 'MediaWiki:Check-usage.js' );
includePage( 'MediaWiki:Flickrfixr.js' );

After I commented them out, it worked ...

Just put ChooseResolution.js first :-)

Magnus

P.S.: Yes, I'm trying to fix it...

Andrew Gray

17 Aug 2007

8:38 p.m.

On 17/08/07, Magnus Manske magnusmanske@googlemail.com wrote:

...

I couldn't resist *that* temptation!!!

Good lord, you respond fast. Commons *really* needs someone to make me some soup, too... ;-)

...

...
This would probably be the simplest thing to get up and running - he said, waving his hands - and probably more generally useful than most of the other API stuff. However, we'd still need to find a way of getting author and license information standardised, so that the tool could pull them in.

I'm working on that ;-)

I could probably "screenscrape" the categories and find those that look like a license.

Determining the author might prove harder. There's the upload log (latest uploader if multiple?) and possible screenscraping of the Information template. Tricky.

Certainly my gut instinct would be to take whatever's in the "Author =" field of the information template for preference - using uploader usernames falls down heavily on the basis that a) a lot of people don't want to be credited that way ("by MrFancyName? Huh?"), and b) much of the Commons material is transferred to us from other projects by third parties. The "author" field, however, is pretty much verbatim how they want to be credited.

As long as we have a link to the image page, we're about as compliant with the license as Wikipedia is, and asking for more might get grumbles ;-)

-- - Andrew Gray andrew.gray@dunelm.org.uk

Brianna Laugher

19 Aug 2007

2:37 a.m.

OK, replying to myself is weird, but...

On 18/08/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

Lastly: frankly, *if we don't do it, someone else will.* There is nothing to stop them since all our stuff is freely licensed. So for God's sake let us be the ones to do it and benefit from it.

Witness: http://www.facebook.com/apps/application.php?id=2395958097&ref=nf 'My Wikipedia' facebook app. developed by KallOut Inc. https://www.kallout.com/login/index.php

...

...
>>My Wikipedia allows you to display sections of Wikipedia on your profile page. By default, My Wikipedia displays the daily "Featured Article" from Wikipedia's homepage but can be customized to display any article of your choosing.<<

cheers Brianna

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Stephen Bain

7:15 a.m.

...

On 18/08/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...
Lastly: frankly, *if we don't do it, someone else will.* There is nothing to stop them since all our stuff is freely licensed. So for God's sake let us be the ones to do it and benefit from it.

Witness: http://www.facebook.com/apps/application.php?id=2395958097&ref=nf 'My Wikipedia' facebook app. developed by KallOut Inc. https://www.kallout.com/login/index.php

I've been working on a similar app for Commons for a while now. It will do things like display the latest featured pictures, or the latest images in a category, or a user's most recent uploads (like Flickr's photostream thing), or any of a range of other things (the limits are what can be easily retrieved via the API).

Currently I'm hampered by the slowness of the toolserver account requesting process, so if you want to see this app in action then go bug DaB :)

-- Stephen Bain stephen.bain@gmail.com

Magnus Manske

9:46 a.m.

On 8/19/07, Stephen Bain stephen.bain@gmail.com wrote:

...

...
On 18/08/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...
Lastly: frankly, *if we don't do it, someone else will.* There is nothing to stop them since all our stuff is freely licensed. So for God's sake let us be the ones to do it and benefit from it.

Witness: http://www.facebook.com/apps/application.php?id=2395958097&ref=nf 'My Wikipedia' facebook app. developed by KallOut Inc. https://www.kallout.com/login/index.php

I've been working on a similar app for Commons for a while now. It will do things like display the latest featured pictures, or the latest images in a category, or a user's most recent uploads (like Flickr's photostream thing), or any of a range of other things (the limits are what can be easily retrieved via the API).

Currently I'm hampered by the slowness of the toolserver account requesting process, so if you want to see this app in action then go bug DaB :)

And there's always my flommons :-) http://tools.wikimedia.de/~magnus/cgi-bin/flommons.pl

Picking some examples at random: http://tools.wikimedia.de/~magnus/cgi-bin/flommons.pl?user=&mode=single_... http://tools.wikimedia.de/~magnus/cgi-bin/flommons.pl?user=&mode=categor... http://tools.wikimedia.de/~magnus/cgi-bin/flommons.pl?user=&foruser=Chri...

And yes, the toolserver is slow...

Magnus

David Gerard

17 Aug

9:42 p.m.

On 17/08/07, Andrew Gray shimgray@gmail.com wrote:

...

In many ways, the answer to "do we want to?" is pretty intertwined with "what's the point of Commons?" - are we primarily a service for Wikimedia projects, or for the internet at large?

The first and then the second. (Removing the first would mean a duplicate Commons just for the projects, and that would be silly.) However, that doesn't address whether to lock down on out-of-Wikimedia referrers.

- d.

Magnus Manske

3:07 p.m.

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

So, the second problem, keywords. Let's reduce this to an easier (although less complete :)) problem: categories. Being able to have per-category feeds would be extremely cool. Imagine such a feed on QI or FP. Totally awesome.

Would that be "New files in that category"? Or "Last files that have been added to that category"? Or "Last changes for pictures in that category"? Or a combination?

Magnus

Brianna Laugher

3:13 p.m.

On 18/08/07, Magnus Manske magnusmanske@googlemail.com wrote:

...

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...
So, the second problem, keywords. Let's reduce this to an easier (although less complete :)) problem: categories. Being able to have per-category feeds would be extremely cool. Imagine such a feed on QI or FP. Totally awesome.

Would that be "New files in that category"? Or "Last files that have been added to that category"?

Uhm... how are they different? That's what I mean, at any rate. For external services edits to image pages are not interesting.

cheers Brianna

-- They've just been waiting in a mountain for the right moment: http://modernthings.org/

Magnus Manske

3:47 p.m.

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

On 18/08/07, Magnus Manske magnusmanske@googlemail.com wrote:

...
On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...
So, the second problem, keywords. Let's reduce this to an easier (although less complete :)) problem: categories. Being able to have per-category feeds would be extremely cool. Imagine such a feed on QI or FP. Totally awesome.

Would that be "New files in that category"? Or "Last files that have been added to that category"?

Uhm... how are they different? That's what I mean, at any rate. For external services edits to image pages are not interesting.

They are identical for newly uploaded images. They are different if I add a category to an image that was uploaded a year ago.

I was talking about "date added to category" vs. "date of image upload". Maybe I'm nitpicking here ;-)
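To make the distinction concrete, here is a minimal Python sketch. The file names, dates, and field names (`uploaded`, `added_to_category`) are made up for illustration; they are not real Commons data or API fields.

```python
from datetime import date

# Hypothetical records: each file has an upload date and the date it
# was placed in the category. Names and dates are invented.
files = [
    {"title": "Old_photo.jpg", "uploaded": date(2006, 8, 1),
     "added_to_category": date(2007, 8, 17)},
    {"title": "New_photo.jpg", "uploaded": date(2007, 8, 10),
     "added_to_category": date(2007, 8, 10)},
]

def newest_first(records, key):
    """Return titles ordered from newest to oldest by the given date field."""
    ordered = sorted(records, key=lambda r: r[key], reverse=True)
    return [r["title"] for r in ordered]

by_upload = newest_first(files, "uploaded")
by_category = newest_first(files, "added_to_category")
```

The old photo that was only categorised recently comes first in `by_category` but last in `by_upload`, so a feed has to pick one of the two orderings.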

Magnus

Bryan Tong Minh

6:47 p.m.

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

ask for new License: namespace to be installed at Commons.

That sounds like a good idea. This will greatly improve machine readability.
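A rough illustration of why a dedicated prefix helps tools: with a `License:` prefix, picking out license templates becomes a string check rather than a hand-maintained list. The template titles below are invented examples, and the `License:` namespace is only the proposal from this thread, so treat this as a sketch under that assumption.

```python
LICENSE_PREFIX = "License:"  # proposed namespace prefix, assumed here

def license_templates(template_titles):
    """Return the titles that sit in the proposed License: namespace."""
    return [t for t in template_titles if t.startswith(LICENSE_PREFIX)]

# Invented titles, as a tool might collect them from one image page.
used_on_page = ["License:GFDL", "Template:Information", "License:Public domain"]
found = license_templates(used_on_page)
```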

Bryan

Gregory Maxwell

7:03 p.m.

On 8/17/07, Bryan Tong Minh bryan.tongminh@gmail.com wrote:

...

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

ask for new License: namespace to be installed at Commons.

That sounds like a good idea. This will greatly improve machine readability.

Enwp would almost certainly follow suit. Enwp made a similar change for the non-free tags there (all are prefixed with Non-free), and Brianna's suggestion of a license name space for free license templates was well received.

Stephen Bain

18 Aug

1:42 a.m.

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...

Being able to have per-category feeds would be extremely cool. Imagine such a feed on QI or FP. Totally awesome.

We do keep timestamp data in the categorylinks table, so it's possible to work out roughly when a page was added to a category. However, AFAIK this is not actually used anywhere.

On a related note, for my Facebook work I put together a patch to allow timestamp data to be exposed through API queries. This would effectively allow many of the Flickr-like functions that are so popular (latest images in a category, for example):

http://bugzilla.wikimedia.org/show_bug.cgi?id=10890

Note that it would be good if some technical types can check this out to make sure it actually works.
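For reference, a sketch of what such a query could look like from the client side, in Python. It only builds the URL and makes no network request. The parameter names (`list=categorymembers`, `cmsort=timestamp`, `cmdir=desc`, `cmprop`, `cmlimit`) are those of the MediaWiki API's categorymembers module in later releases; whether they match the patch in the bug above is an assumption. The category name is just an example.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"

def latest_in_category_url(category, limit=10):
    """Build a query URL for the pages most recently added to a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmsort": "timestamp",   # order by when the page joined the category
        "cmdir": "desc",         # newest first
        "cmprop": "title|timestamp",
        "cmlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)

url = latest_in_category_url("Featured pictures on Wikimedia Commons")
```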

-- Stephen Bain stephen.bain@gmail.com

Gregory Maxwell

2:02 a.m.

On 8/17/07, Stephen Bain stephen.bain@gmail.com wrote:

...

We do keep timestamp data in the categorylinks table, so it's possible to work out roughly when a page was added to a category. However, AFAIK this is not actually used anywhere.

Enwp used the category timestamps for some automated prod tool on toolserver a while back. I think they found the data lacking, since any blanking/revert cycle renewed it.

For us it will probably be okay.

...

Note that it would be good if some technical types can check this out to make sure it actually works.

We carry an index on cl_to,cl_timestamp and it appears your query will correctly use it. I don't see anything obviously broken from that perspective. But that's about all I can offer because my MediaWiki internal kungfu is weak.
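A small self-contained way to try the shape of that query is SQLite from Python. This is a toy stand-in, not the real MediaWiki schema: only the column names `cl_from`, `cl_to`, and `cl_timestamp` come from the categorylinks table, while the index name, the rows, and the SQL text are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_timestamp TEXT)"
)
# Composite index mirroring the (cl_to, cl_timestamp) index mentioned above.
conn.execute("CREATE INDEX cl_to_timestamp ON categorylinks (cl_to, cl_timestamp)")
conn.executemany(
    "INSERT INTO categorylinks VALUES (?, ?, ?)",
    [
        (1, "Featured_pictures", "20070801000000"),
        (2, "Featured_pictures", "20070817000000"),
        (3, "Quality_images", "20070818000000"),
    ],
)

def latest_members(category, limit=10):
    """Return page ids in a category, most recently added first."""
    rows = conn.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_to = ? "
        "ORDER BY cl_timestamp DESC LIMIT ?",
        (category, limit),
    )
    return [row[0] for row in rows]

latest = latest_members("Featured_pictures")
```

Because the filter is on `cl_to` and the sort is on `cl_timestamp`, a composite index in that order can serve both. The caveat above still applies: a blank/revert cycle rewrites the link row and so renews its timestamp.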

Bryan Tong Minh

1:38 p.m.

On 8/18/07, Stephen Bain stephen.bain@gmail.com wrote:

...

On 8/17/07, Brianna Laugher brianna.laugher@gmail.com wrote:

...
Being able to have per-category feeds would be extremely cool. Imagine such a feed on QI or FP. Totally awesome.

On wiki, and not exactly real time: [[User:BryanBot/CategoryWatch]]. But I guess that shows that it should be easy to make RSS feeds of categories.
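As a rough sketch of how little is needed on the output side, the Python below turns a list of (title, URL, date) entries into an RSS 2.0 document using only the standard library. The entries, channel title, and links are placeholders; how BryanBot gathers its data is not shown in this thread and is not implied here.

```python
import xml.etree.ElementTree as ET

def category_feed(category, entries):
    """Build an RSS 2.0 string from (title, link, pub_date) tuples."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "New files in Category:" + category
    ET.SubElement(channel, "link").text = (
        "https://commons.wikimedia.org/wiki/Category:" + category
    )
    ET.SubElement(channel, "description").text = (
        "Files recently added to " + category
    )
    for title, link, pub_date in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "pubDate").text = pub_date
    return ET.tostring(rss, encoding="unicode")

# Placeholder entry; the file name and date are invented.
feed = category_feed("Example", [
    ("File:Example.jpg",
     "https://commons.wikimedia.org/wiki/File:Example.jpg",
     "Sat, 18 Aug 2007 13:38:00 GMT"),
])
```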

Bryan

commons-l@lists.wikimedia.org

29 comments

participants (8)

Andrew Gray
Brianna Laugher
Bryan Tong Minh
David Gerard
Florian Straub
Gregory Maxwell
Magnus Manske
Stephen Bain