On Sun, May 9, 2010 at 6:37 AM, David Gerard dgerard@gmail.com wrote:
On 9 May 2010 06:09, K. Peachey p858snake@yahoo.com.au wrote:
Bugzilla 982[1]: MediaWiki should support ICRA's PICS content labeling. From my understanding, without having read much about it, ICRA is meant to be an "international" standard, or at least a standard for these things that most people seem to abide by (I see it splashed around on a lot of education sites that say they are compliant with that standard).
This came up in discussion a while ago on WHATWG - PICS is actually dead. Even its creators have given up on it. No-one implements it. As a standard, it's got no backing. So we'd be the first significant organisation to actually take it seriously, and would be reviving it.
- d.
Forwarding this to wikitech-l solely for the technical discussion.
I was unaware that it had died. Do you have any links from ICRA on the issue? I know Derk-Jan has been looking at resurrecting that bug, and I'd be interested to know what the actual state of the standard is. If nobody uses it and they've declared it dead, then we don't need to bother implementing it. Do we know if there is some standard that is used widely, or does every web filtering package reinvent the wheel?
-Chad
Some people have pointed out that we already have a categorization system in use. That system doesn't work well for end-user or third-party filtering. (Note that I haven't played with the API in this regard.) Currently a category page shows its contents, which is fine when you're looking for something in it, but to filter something you need to be able to easily get the filenames of its contents en masse, so they can be added to the relevant systems.
For example, using real content, there is a difference between adding each of these three to a filter list:
- http://en.wikipedia.org/wiki/Category:Cats (The Category)
- http://en.wikipedia.org/wiki/File:ParliamentaryCats.jpg (The Description Page)
- http://upload.wikimedia.org/wikipedia/en/e/ec/ParliamentaryCats.jpg (The Actual File)
The last one (and the subsequent thumb files) is what needs to be easily identifiable [the filenames and full paths] so that filters can deal with them as required, for example via an automatically produced per-category list that can be imported. (Are there any standards for lists that can be imported into filters?)
-Peachey
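To make the "filenames and full paths" point concrete, here is a rough Python sketch of how the file URL and its thumbnail prefix could be derived from a file name. It assumes MediaWiki's default hashed upload layout (first one and first two hex digits of the MD5 of the underscored file name) and takes the base URL from the example above. The function names are made up for illustration, and the output has not been checked against the e/ec path in the example.

```python
import hashlib
from urllib.parse import quote

# Base taken from the example URL in this thread; other wikis use other paths.
UPLOAD_BASE = "http://upload.wikimedia.org/wikipedia/en"


def hashed_dir(filename: str) -> str:
    """Return the 'x/xy' directory taken from the MD5 of the file name.

    Assumption: the default hashed upload layout with two directory levels.
    """
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return f"{digest[0]}/{digest[:2]}"


def original_url(filename: str) -> str:
    """URL of the full-size file (the third kind of URL listed above)."""
    name = filename.replace(" ", "_")
    return f"{UPLOAD_BASE}/{hashed_dir(name)}/{quote(name)}"


def thumb_prefix(filename: str) -> str:
    """URL prefix shared by the scaled copies ("thumb files") of the file."""
    name = filename.replace(" ", "_")
    return f"{UPLOAD_BASE}/thumb/{hashed_dir(name)}/{quote(name)}/"


if __name__ == "__main__":
    print(original_url("ParliamentaryCats.jpg"))
    print(thumb_prefix("ParliamentaryCats.jpg"))
```

A filter would need to block both the original URL and anything under the thumb prefix, which is why a per-category list of file names alone is not quite enough.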
On Sun, May 9, 2010 at 12:36, K. Peachey p858snake@yahoo.com.au wrote:
For example, using real content, there is a difference between adding each of these three to a filter list:
- http://en.wikipedia.org/wiki/Category:Cats (The Category)
- http://en.wikipedia.org/wiki/File:ParliamentaryCats.jpg (The Description Page)
- http://upload.wikimedia.org/wikipedia/en/e/ec/ParliamentaryCats.jpg (The Actual File)
The last one (and the subsequent thumb files) is what needs to be easily identifiable [the filenames and full paths] so that filters can deal with them as required, for example via an automatically produced per-category list that can be imported. (Are there any standards for lists that can be imported into filters?)
It's pretty easy to do arbitrary content tagging (and filtering) now. You just add a template or external link to the page, e.g. {{PG-13}}.
Then all some third party has to do is to download templatelinks.sql.gz (or externallinks.sql.gz) in addition to the image dump.
You just have to start getting people to tag things consistently. The good thing is that you can start now without any additional software support.
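As a rough illustration of the third-party side, the Python sketch below groups File: pages by the rating template they use. It assumes the rows have already been parsed out of the dumps into tuples; the dump parsing itself is not shown. Since templatelinks only records the ID of the page doing the transclusion, a page table dump is assumed as well to map IDs back to titles. The column order follows the templatelinks layout as I understand it (tl_from, tl_namespace, tl_title), and the function name and tag set are invented for the example.

```python
from collections import defaultdict

NS_FILE = 6       # MediaWiki's File: namespace
NS_TEMPLATE = 10  # MediaWiki's Template: namespace

# Hypothetical tag set; only {{PG-13}} is mentioned in this thread.
RATING_TEMPLATES = {"PG-13"}


def files_by_tag(pages, templatelinks):
    """Group File: page titles by the rating template they transclude.

    pages:         iterable of (page_id, page_namespace, page_title)
    templatelinks: iterable of (tl_from, tl_namespace, tl_title)
    """
    # Only file description pages are of interest to an image filter.
    file_titles = {pid: title for pid, ns, title in pages if ns == NS_FILE}
    tagged = defaultdict(set)
    for tl_from, tl_ns, tl_title in templatelinks:
        if (tl_ns == NS_TEMPLATE
                and tl_title in RATING_TEMPLATES
                and tl_from in file_titles):
            tagged[tl_title].add(file_titles[tl_from])
    return {tag: sorted(titles) for tag, titles in tagged.items()}


if __name__ == "__main__":
    pages = [(1, NS_FILE, "ParliamentaryCats.jpg"), (2, 0, "Cat")]
    links = [(1, NS_TEMPLATE, "PG-13"), (2, NS_TEMPLATE, "PG-13")]
    print(files_by_tag(pages, links))
```

Article pages that carry the same template are skipped here on purpose; a filter that also wants to block articles would drop the namespace check.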
On Sun, May 9, 2010 at 10:50 PM, Ævar Arnfjörð Bjarmason avarab@gmail.com wrote:
It's pretty easy to do arbitrary content tagging (and filtering) now. You just add a template or external link to the page, e.g. {{PG-13}}.
Then all some third party has to do is to download templatelinks.sql.gz (or externallinks.sql.gz) in addition to the image dump.
You just have to start getting people to tag things consistently. The good thing is that you can start now without any additional software support.
The dumps would have to be done fairly regularly, if not daily (would that even be possible, especially on the bigger sites?), to be really useful to outside [filtering] companies, and they would have to be in an easy format. If it takes them too long to work it out, they will probably just block everything instead of wasting time on it.
On Sun, May 9, 2010 at 11:08 PM, K. Peachey p858snake@yahoo.com.au wrote:
Also, do we even have image dumps anymore?
-Peachey
The dumps would have to be done fairly regularly if not daily (would that even be possible, especially on the bigger sites) for them to be
If it is only about obtaining a list of tagged images, that can even be done on the toolserver with a simple SQL query. It should not take more than a few minutes for a large Wikipedia. So, yes, daily updates would be no problem.
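The query itself was not posted; a guess at what it might look like is sketched below in Python. SQLite stands in for the toolserver's database replica so the example runs on its own, and the tables are cut down to the columns the join needs. The table and column names follow the page and templatelinks layout as I understand it, and the helper names and sample rows are made up.

```python
import sqlite3

# File: pages (namespace 6) that transclude a given template (namespace 10).
QUERY = """
SELECT page_title
FROM page
JOIN templatelinks ON tl_from = page_id
WHERE page_namespace = 6
  AND tl_namespace = 10
  AND tl_title = ?
ORDER BY page_title
"""


def tagged_files(conn, template):
    """Return the titles of File: pages that use the given template."""
    return [row[0] for row in conn.execute(QUERY, (template,))]


def demo_connection():
    """Build a tiny in-memory stand-in for the real tables (sample data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page (page_id INTEGER PRIMARY KEY, "
        "page_namespace INTEGER, page_title TEXT)"
    )
    conn.execute(
        "CREATE TABLE templatelinks (tl_from INTEGER, "
        "tl_namespace INTEGER, tl_title TEXT)"
    )
    conn.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [(1, 6, "ParliamentaryCats.jpg"), (2, 0, "Cat")],
    )
    conn.executemany(
        "INSERT INTO templatelinks VALUES (?, ?, ?)",
        [(1, 10, "PG-13"), (2, 10, "PG-13")],
    )
    return conn


if __name__ == "__main__":
    for title in tagged_files(demo_connection(), "PG-13"):
        print(title)
```

The result is one file title per line, which could be turned into full paths and published daily as a plain list.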
On Sun, May 9, 2010 at 3:09 PM, K. Peachey p858snake@yahoo.com.au wrote:
Also, do we even have image dumps anymore?
AFAIR backups exist, but I don't know if there are any public dumps. I'd imagine they're simply too big to maintain and archive...
Marco
On 09/05/10 20:55, Chad wrote:
Forwarding this to wikitech-l solely for the technical discussion.
I was unaware that it had died. Do you have any links from ICRA on the issue? I know Derk-Jan has been looking at resurrecting that bug, and I'd be interested to know what the actual state of the standard is. If nobody uses it and they've declared it dead, then we don't need to bother implementing it. Do we know if there is some standard that is used widely, or does every web filtering package reinvent the wheel?
PICS was a W3C proposed standard for tagging content. It is obsolete. It has been replaced by RDF Content Labels, a.k.a. POWDER:
http://www.w3.org/2004/12/q/doc/content-labels-schema.htm
Both PICS and RDF Content Labels are technical schemes with no moral values attached.
ICRA provides a set of labels (the "ICRA vocabulary") relevant to prevailing Christian morality. It can be used with either PICS or RDF.
Companies that sell filtering software tend to be coy about how they classify pages, since content analysis heuristics certainly play a big role. ICRA gives links to two content filters that support their tags. One is a simple browser plugin, the other is a large and complex content classification system suitable for filtering internet access for schools, businesses or ISPs.
http://www.profiltechnology.com/en/index.aspx
Profil looks big enough that if it does indeed support ICRA/RDF, then I think that's a good enough reason to write an extension.
Note that there are lots of other applications for RDF Content Labels. In particular, accessibility and copyright/license tagging have been promoted. I think we could have some generic support for RDF in the core, with the ICRA vocabulary and UI in an extension.
-- Tim Starling
wikitech-l@lists.wikimedia.org