Hi, I have launched speedydeletion.wika.com. It is updated every 30 minutes with the proposed-deletion and speedy-deletion articles (non-notable articles and hoaxes only, not the other categories). It runs against en.wikipedia.org. The sources for the script are all on GitHub and are a merger of the pywikipediabot and wikiteam codebases. Hope you enjoy it. Thanks, mike
Sorry, not speedydeletion.wika.com but http://speedydeletion.wikia.com/ Thanks, mike
On Sun, Jun 10, 2012 at 7:37 AM, Mike Dupont <jamesmikedupont@googlemail.com> wrote:
Hi, I have launched speedydeletion.wika.com. It is updated every 30 minutes with the proposed-deletion and speedy-deletion articles (non-notable articles and hoaxes only, not the other categories). It runs against en.wikipedia.org. The sources for the script are all on GitHub and are a merger of the pywikipediabot and wikiteam codebases. Hope you enjoy it. Thanks, mike
--
James Michael DuPont
Member of Free Libre Open Source Software Kosova http://flossk.org
Contributor FOSM, the CC-BY-SA map of the world http://fosm.org
Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
That's great. We've been in need of something like this for a while. Good luck with the project!
Two questions: Are you planning on letting the authors know their article has been transwikied? Are you checking whether the article actually gets deleted from Wikipedia or gets recreated?
On Sun, Jun 10, 2012 at 9:39 AM, Mike Dupont <jamesmikedupont@googlemail.com> wrote:
Sorry, not speedydeletion.wika.com but http://speedydeletion.wikia.com/ Thanks, mike
_______________________________________________
Wikimedia-l mailing list
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l
Hi,
I need one bit of help: is there a dump file of the templates from the English Wikipedia? I would like to import those into the wikia.
I will add services to postprocess the data as we go, for example to look for articles and mark them as really deleted, or to update them if they change on the main wiki but not yet on the copy. I am also considering adding a note to the article's talk page and to the users' pages; yes, we can do all that. I need to look at how to implement it.
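A minimal sketch of the "mark them as really deleted" check, assuming the standard MediaWiki query API at en.wikipedia.org is used. The function names `query_url` and `missing_titles` are hypothetical and are not part of the actual scripts. In the default JSON format, a page that does not exist comes back with a `missing` key; note that this only shows the title is absent now, not why it is absent.

```python
from urllib.parse import urlencode

# Assumed endpoint; the real scripts may talk to the wiki differently.
API = "https://en.wikipedia.org/w/api.php"


def query_url(titles):
    """Build a query URL asking the API about a batch of page titles."""
    params = {"action": "query", "format": "json", "titles": "|".join(titles)}
    return API + "?" + urlencode(params)


def missing_titles(response):
    """Return the titles that the API response reports as missing."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(p["title"] for p in pages.values() if "missing" in p)
```

The titles returned by `missing_titles` are the candidates to tag as deleted on the copy; a title that reappears later would indicate a recreated article.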
My next step will be to reduce the archive.org backups to changes only. Right now all the articles are taken as a snapshot every 30 minutes, which means there are many copies of the same articles in the dumps.
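One way the changes-only step could look, as a sketch under assumptions: each snapshot is taken to be a mapping from title to wikitext, and a content hash per title is remembered between runs. The name `changed_articles` and the use of SHA-1 are illustrative choices, not what the scripts currently do.

```python
import hashlib


def changed_articles(snapshot, seen_hashes):
    """Return only the articles whose text differs from the last archived copy.

    snapshot:    dict mapping title -> current wikitext
    seen_hashes: dict mapping title -> hash of the last archived text;
                 it is updated in place so the next run sees this one.
    """
    changed = {}
    for title, text in snapshot.items():
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if seen_hashes.get(title) != digest:
            changed[title] = text
            seen_hashes[title] = digest
    return changed
```

Persisting `seen_hashes` between the 30-minute runs (for example as a small JSON file) would be needed for this to work across process restarts.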
I would like to install these scripts on the Toolserver if there are no objections; right now they are running on my laptop.
I also wanted to look for Twitter accounts embedded in the articles, for example, and tweet the people involved; I did that for a handful of articles. But that can be done in the future, it is not needed right now.
mike
On Sun, Jun 10, 2012 at 9:45 AM, Martijn Hoekstra <martijnhoekstra@gmail.com> wrote:
That's great. We've been in need of something like this for a while. Good luck with the project!
Two questions: Are you planning on letting the authors know their article has been transwikied? Are you checking whether the article actually gets deleted from Wikipedia or gets recreated?
2012/6/10 Mike Dupont jamesmikedupont@googlemail.com:
I have launched speedydeletion.wika.com , it is updated every 30 minutes with the proposed deletions and speedy deletion articles (not notable and hoaxes, not others).
Speedy-deleted articles are either not articles at all, and thus not worth discussing, or they are copyright infringements. I am sorry to say that I do not understand why you collect those pages in a separate wiki.
Regards, Jürgen.
Not quite - Mike says that "it is updated...with the proposed deletions and speedy deletion articles (not notable and hoaxes, *not others*)."
Non-notable articles are arguably important to keep - they serve as a basis for recreating the articles in future, and for easy review by non-admins. Hoaxes are also important to keep publicly viewable - so that we can stop similar ones happening again, surely?
Richard Symonds Wikimedia UK 0207 065 0992 Disclaimer viewable at http://uk.wikimedia.org/wiki/Wikimedia:Email_disclaimer Visit http://www.wikimedia.org.uk/ and @wikimediauk
On 10 June 2012 18:29, Juergen Fenn schneeschmelze@googlemail.com wrote:
Speedy-deleted articles are either not articles at all, and thus not worth discussing, or they are copyright infringements. I am sorry to say that I do not understand why you collect those pages in a separate wiki.
Regards, Jürgen.
On Sun, Jun 10, 2012 at 2:24 PM, Richard Symonds <richard.symonds@wikimedia.org.uk> wrote:
Non-notable articles are arguably important to keep - they serve as a basis for recreating the articles in future, and for easy review by non-admins.
That's true to a point, but only for articles which are marginally non-notable, or which have the potential to demonstrate notability if rewritten. An article about, for example, someone's pet goldfish is unlikely to be of much value to anyone, regardless of where it's located.
If Mike wants to keep such articles around, of course, that's his business; as far as I can tell, there's no real harm in it.
Hoaxes are also important to keep publicly viewable - so that we can stop similar ones happening again, surely?
This is where I think things become slightly problematic. There are, in broad terms, two categories of hoaxes: those which are fundamentally harmless (these typically being of the "did you know that 'gullible' isn't in the dictionary" variety), and those which have the potential to be harmful (including, but not limited to, false statements about drugs, crimes, politically explosive issues, etc.). The speedy deletion system makes no real attempt to distinguish between these (with the exception of blatant attacks on living people, which have their own deletion tag), but retaining the second category in a publicly-viewable (and publicly-searchable?) form is probably not desirable.
Kirill
On Sun, Jun 10, 2012 at 6:38 PM, Kirill Lokshin <kirill.lokshin@gmail.com> wrote:
This is where I think things become slightly problematic. There are, in broad terms, two categories of hoaxes: those which are fundamentally harmless (these typically being of the "did you know that 'gullible' isn't in the dictionary" variety), and those which have the potential to be harmful (including, but not limited to, false statements about drugs, crimes, politically explosive issues, etc.). The speedy deletion system makes no real attempt to distinguish between these (with the exception of blatant attacks on living people, which have their own deletion tag), but retaining the second category in a publicly-viewable (and publicly-searchable?) form is probably not desirable.
I am willing to delete articles from the wikia; they are also archived on archive.org. It would be nice to work out a detailed tagging system that determines what may be archived and what may not. We can remove categories and add them.
But let's look at some of these. http://speedydeletion.wikia.com/wiki/Ipod_5.5_generation looks like someone providing information in the wrong way, but it could be valid. http://speedydeletion.wikia.com/wiki/Trantlers is funny, and I would like to keep it and add a picture of it, maybe in a humor section. This one too: http://speedydeletion.wikia.com/wiki/Trantlas
http://speedydeletion.wikia.com/wiki/Willyfor is strange, but might be some meme or urban legend; I guess it belongs on Uncyclopedia. http://speedydeletion.wikia.com/wiki/Like_Dis_Like_Dat is the same; this is also Uncyclopedia material.
Now this one looks like a hoax and could be deleted, but it is also Uncyclopedia stuff: http://speedydeletion.wikia.com/wiki/Aman_hadid
Personally I am interested in non-notable articles. My experience with encouraging editors and collecting Creative Commons licensed information on places that are not very notable, like villages in Kosovo, could be helped by these articles. If we can find authors who are willing to write about non-notable topics, maybe we can collect them and categorize them. This could be a recruiting mechanism and also provide good basic data and maybe new contributors for Wikitravel and OpenStreetMap. With a staging system like this speedydeletion wikia, you have a way to demote articles to a speedy-deletion purgatory where they might redeem themselves.
What happens to your system if an article is deleted from Wikipedia, a new article is posted again under the same name, and then that one is also deleted?
On Sun, Jun 10, 2012 at 2:55 PM, Mike Dupont <jamesmikedupont@googlemail.com> wrote:
I am willing to delete articles from the wikia; they are also archived on archive.org. It would be nice to work out a detailed tagging system that determines what may be archived and what may not. We can remove categories and add them.
But let's look at some of these. http://speedydeletion.wikia.com/wiki/Ipod_5.5_generation looks like someone providing information in the wrong way, but it could be valid. http://speedydeletion.wikia.com/wiki/Trantlers is funny, and I would like to keep it and add a picture of it, maybe in a humor section. This one too: http://speedydeletion.wikia.com/wiki/Trantlas
http://speedydeletion.wikia.com/wiki/Willyfor is strange, but might be some meme or urban legend; I guess it belongs on Uncyclopedia. http://speedydeletion.wikia.com/wiki/Like_Dis_Like_Dat is the same; this is also Uncyclopedia material.
Now this one looks like a hoax and could be deleted, but it is also Uncyclopedia stuff: http://speedydeletion.wikia.com/wiki/Aman_hadid
Personally I am interested in non-notable articles. My experience with encouraging editors and collecting Creative Commons licensed information on places that are not very notable, like villages in Kosovo, could be helped by these articles. If we can find authors who are willing to write about non-notable topics, maybe we can collect them and categorize them. This could be a recruiting mechanism and also provide good basic data and maybe new contributors for Wikitravel and OpenStreetMap. With a staging system like this speedydeletion wikia, you have a way to demote articles to a speedy-deletion purgatory where they might redeem themselves.
Good question. The raw data is not lost: archive.org will get a snapshot every 30 minutes of the items that are categorized.
The wikia will not be updated right now: there is no code for updating an article once it is in the wikia.
It would be interesting to get the full history of each article. If it is replaced after deletion, I suppose a new page or subpage would be needed if it is deleted again; maybe a fine history would be good enough.
In any case, if articles last more than 30 minutes in the speedy deletion category, they will be archived.
mike
On Sun, Jun 10, 2012 at 9:45 PM, Nathan nawrich@gmail.com wrote:
What happens to your system if an article is deleted from Wikipedia, a new article is posted again under the same name, and then that one is also deleted?
How are you attributing the work and ensuring that the license conditions are met?
Risker
On 11 June 2012 01:24, Mike Dupont jamesmikedupont@googlemail.com wrote:
Good question. The raw data is not lost: archive.org will get a snapshot every 30 minutes of the items that are categorized.
The wikia will not be updated right now: there is no code for updating an article once it is in the wikia.
It would be interesting to get the full history of each article. If it is replaced after deletion, I suppose a new page or subpage would be needed if it is deleted again; maybe a fine history would be good enough.
In any case, if articles last more than 30 minutes in the speedy deletion category, they will be archived.
mike
Hi,
About copyright: the articles do not include those tagged "G12. Unambiguous copyright infringement."; I can only trust that the category system on Wikipedia is working. I am willing to help fine-tune the speedy deletion system to be specific about which articles are original copyrighted material.
There is more work to do on attribution. The dumps on archive.org do contain an attribution to Wikipedia in general and the Creative Commons license for the entire bucket: http://archive.org/details/wikipedia-delete-2012-05 http://archive.org/details/wikipedia-delete-2012-06
The individual users' names are only visible in the data files; I assume we could collect them.
For the wikia, at this point I am using the standard wikia template http://speedydeletion.wikia.com/wiki/Template:Wikipedia-deleted, so the articles should carry a text like this, with the name of the last author:
[image: Smallwikipedialogo.png] This page uses content that was added to *Wikipedia* http://en.wikipedia.org/. The article has been deleted from Wikipedia. The original article was written by these Wikipedia users: Ammodramus. As with this wiki (Speedy deletion Wiki), the text of Wikipedia is available under the Creative Commons Attribution-ShareAlike 3.0 Unported License http://creativecommons.org/licenses/by-sa/3.0/.
Again, the full history is available on archive.org, and I think that no one is going to think that this data is from me; it is clearly marked as being from Wikipedia.
mike
On Mon, Jun 11, 2012 at 5:27 AM, Risker risker.wp@gmail.com wrote:
How are you attributing the work and ensuring that the license conditions are met?
Risker
On Mon, Jun 11, 2012 at 05:44:40AM +0000, Mike Dupont wrote:
Hi,
Again the full history is available on archive.org and i think that no one is going to think that this data is from me, it is clearly marked as being from wikipedia.
You are very conscientious, and normally this would indeed be adequate (not perfect, but definitely adequate :)
It is adequate because wikipedia itself retains full attribution information in page history. One can follow the chain from the copy back to the wikipedia original back to the page history and voila, more than you ever wanted to know.
The problem with deleted/hidden articles on en.wp is that the history information is also deleted/hidden; and therefore the attribution chain is broken.
Attribution information is important for legal and open content reasons of course.
Is it possible to keep a copy of page history somewhere also?
I know the MediaWiki export/import functions support this, and work via GET requests; see http://en.wikipedia.org/wiki/Special:Export
e.g.: http://en.wikipedia.org/wiki/Special:Export/Train
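As a small sketch of the GET form mentioned above, the export URL for any title can be built from the same pattern as the Train example. The helper name `export_url` is hypothetical. Whether this simple form returns the full history or only the current revision is not confirmed here and may depend on the wiki's export settings.

```python
from urllib.parse import quote

# Same base as the Special:Export/Train example in the message.
EXPORT_BASE = "http://en.wikipedia.org/wiki/Special:Export/"


def export_url(title):
    """Build the Special:Export URL for one page title."""
    # MediaWiki page URLs use underscores where the title has spaces.
    return EXPORT_BASE + quote(title.replace(" ", "_"))
```

Fetching that URL before the page is deleted, and storing the XML next to the article copy, would keep the revision list available after the history disappears from en.wp.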
sincerely, Kim Bruning
This has been debated numerous times: to what extent does the attribution have to relate to the exact contribution of each author?
A list of authors has been considered acceptable in the past (including on-wiki).
Tom
On 12 June 2012 23:48, Kim Bruning kim@bruning.xs4all.nl wrote:
You are very conscientious, and normally this would indeed be adequate (not perfect, but definitely adequate :)
It is adequate because wikipedia itself retains full attribution information in page history. One can follow the chain from the copy back to the wikipedia original back to the page history and voila, more than you ever wanted to know.
The problem with deleted/hidden articles on en.wp is that the history information is also deleted/hidden; and therefore the attribution chain is broken.
Attribution information is important for legal and open content reasons of course.
Is it possible to keep a copy of page history somewhere also?
I know the MediaWiki export/import functions support this, and work via GET requests; see http://en.wikipedia.org/wiki/Special:Export
e.g.: http://en.wikipedia.org/wiki/Special:Export/Train
sincerely, Kim Bruning
In theory, yes. However, if you're fetching page dumps conforming to the schema at http://www.mediawiki.org/xml/export-0.6/ (http://www.mediawiki.org/xml/export-0.6.xsd), to some extent you've already got all the information anyway. It *almost* takes more work to filter out just the authors. :-P
If you're getting page data in some different format, sure. :-)
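For what it is worth, filtering out just the authors from such a dump could look roughly like this. It is a sketch that assumes the export-0.6 namespace named above and the usual `revision`/`contributor` layout with either a `username` or an `ip` element; the function name `authors` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the export schema referenced in the message.
NS = {"mw": "http://www.mediawiki.org/xml/export-0.6/"}


def authors(xml_text):
    """Return the distinct contributors of a dump, in first-seen order."""
    root = ET.fromstring(xml_text)
    names = []
    for contributor in root.iterfind(".//mw:revision/mw:contributor", NS):
        # Registered users carry <username>, anonymous edits carry <ip>.
        node = contributor.find("mw:username", NS)
        if node is None:
            node = contributor.find("mw:ip", NS)
        if node is not None and node.text and node.text not in names:
            names.append(node.text)
    return names
```

The resulting list is the kind of plain author list that could be shown alongside the copied article.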
sincerely, Kim Bruning
On Wed, Jun 13, 2012 at 12:47:31AM +0100, Thomas Morton wrote:
This has been debated numerous times: to what extent does the attribution have to relate to the exact contribution of each author?
A list of authors has been considered acceptable in the past (including on-wiki).
Tom
On Mon, Jun 11, 2012 at 05:44:40AM +0000, Mike Dupont wrote:
Hi,
Again the full history is available on archive.org and i think that no one is going to think that this data is from me, it is clearly marked as being from wikipedia.
ps. It occurs to me that if the full history is available on archive.org, that linking to the relevant archive.org data might be adequate.
sincerely, Kim Bruning
yes, good point, and while checking this I found a problem.
It turns out that up to now I did not have the full history; I have modified the extractor to pull the full version. For the other articles, I don't have the full history.
mike
On Tue, Jun 12, 2012 at 11:03 PM, Kim Bruning kim@bruning.xs4all.nl wrote:
On Mon, Jun 11, 2012 at 05:44:40AM +0000, Mike Dupont wrote:
Hi,
Again the full history is available on archive.org and i think that no one is going to think that this data is from me, it is clearly marked as being from wikipedia.
ps. It occurs to me that if the full history is available on archive.org, that linking to the relevant archive.org data might be adequate.
sincerely, Kim Bruning
ok kim, thanks again for your constructive feedback, and thanks to everyone who has given input. It is great to have a supportive group of people like you to talk to.
Now I have upgraded the code, including templates and full history. It fetches all new articles not seen before every 10 minutes. Uploading to the wikia is also working, but without the history. I think this is a good deal. For the articles that we don't have the history for, people can ask an admin to restore the history if they really want it; I don't have it for the old articles. Here is an example of a new dump: http://archive.org/download/wikipedia-delete-2012-06/wtarchive130612111041.z...
code is here github.com/h4ck3rm1k3/pywikipediabot and github.com/h4ck3rm1k3/wikiteam
mike
On Wed, Jun 13, 2012 at 5:45 AM, Mike Dupont <jamesmikedupont@googlemail.com> wrote:
yes, good point, and while checking this I found a problem.
it turns out that up to now I dont have the full history, I have modified the extractor to pull the full version. For the other articles, I dont have the full history.
mike
-- James Michael DuPont Member of Free Libre Open Source Software Kosova http://flossk.org Contributor FOSM, the CC-BY-SA map of the world http://fosm.org Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
I'm actually really shocked that the speedy deletion wiki doesn't have a copy of http://en.wikipedia.org/wiki/Kingdom_of_Juclandia - there's an excellent version of the article on http://microwiki.org.uk/index.php?title=Juclandia , though, so all is not lost. [it seems to be a monstrously elaborate page about some kid's bedroom kingdom, by the way.]
But seriously, my point is that you don't need to pick up the crumbs that are swept off Wikipedia's table to allow people the space to indulge their creative spirits. There are other wikis that will open their doors.
David
Well, let's see: the speedy deletion wiki was only just launched, and that article was deleted on 15 December 2010. I will have to make a plan for getting older articles; it should be possible to get them, but they are outside our time window right now. mike
On Tue, Jun 19, 2012 at 9:48 AM, David Richfield davidrichfield@gmail.com wrote:
I'm actually really shocked that the speedy deletion wiki doesn't have a copy of http://en.wikipedia.org/wiki/Kingdom_of_Juclandia - there's an excellent version of the article on http://microwiki.org.uk/index.php?title=Juclandia , though, so all is not lost. [it seems to be a monstrously elaborate page about some kid's bedroom kingdom, by the way.]
But seriously, my point is that you don't need to pick up the crumbs that are swept off Wikipedia's table to allow people the space to indulge their creative spirits. There are other wikis that will open their doors.
David
-- David Richfield [[:en:User:Slashme]] +27718539985
Well, originally I wanted just the not-notable articles, but I was personally interested in hoaxes. Let me see:
http://speedydeletion.wikia.com/wiki/index.php?search=hoax&fulltext=Search
they are clearly marked and might contain something interesting some day
thanks, mike
On Sun, Jun 10, 2012 at 6:24 PM, Richard Symonds <richard.symonds@wikimedia.org.uk> wrote:
Not quite - Mike says that " it is updated...with the proposed deletions and speedy deletion articles (not notable and hoaxes, *not others*)."
Non-notable articles are arguably important to keep - they serve as a basis for recreating the articles in future, and for easy review by non-admins. Hoaxes are also important to keep publicly viewable - so that we can stop similar ones happening again, surely?
Richard Symonds Wikimedia UK 0207 065 0992 Disclaimer viewable at http://uk.wikimedia.org/wiki/Wikimedia:Email_disclaimer Visit http://www.wikimedia.org.uk/ and @wikimediauk
On 10 June 2012 18:29, Juergen Fenn schneeschmelze@googlemail.com wrote:
2012/6/10 Mike Dupont jamesmikedupont@googlemail.com:
I have launched speedydeletion.wika.com , it is updated every 30 minutes with the proposed deletions and speedy deletion articles (not notable and hoaxes, not others).
Speedy-deleted articles are either not articles at all, and thus not worth discussing, or they are copyright infringements. I am sorry to say that I do not understand why you collect those pages in a separate wiki.
Regards, Jürgen.
This is great. Thank you, Mike! It would be nice to see this done for historically speedied articles, too. Sam.
On Sun, Jun 10, 2012 at 3:37 AM, Mike Dupont jamesmikedupont@googlemail.com wrote:
Hi, I have launched speedydeletion.wika.com , it is updated every 30 minutes with the proposed deletions and speedy deletion articles (not notable and hoaxes, not others). it is running on the en.wikipedia.org. the sources for the script are all on git hub and are a merger of pywikipediabot and the wikiteam codebases. hope you enjoy it, thanks, mike
In my short experience studying this, there are some articles that might be deleted very quickly, not even waiting 30 minutes.
If I look at the monthly dumps here, http://dumps.wikimedia.org/enwiki/ , then we would be missing many articles that were created and deleted between two dumps. We could look at which articles were deleted in total.
As for historical data, I guess it could be extracted from older dumps, by building the list of articles tagged with the relevant categories across a series of dumps. Right now I am pulling them every thirty minutes; it would also be possible to scan historical dumps and find any articles that are no longer in the newer dumps.
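The dump comparison described above can be sketched as a plain set difference over title lists. This assumes the titles have already been extracted from each dump (the extraction step is not shown), and the function name is hypothetical. A title missing from the newer dump may also have been moved or renamed rather than deleted, so the result is a list of candidates, not confirmed deletions.

```python
from typing import Iterable, List


def titles_missing_from_newer(older_titles: Iterable[str],
                              newer_titles: Iterable[str]) -> List[str]:
    """Return titles present in the older dump but absent from the newer one."""
    newer = set(newer_titles)
    # Deduplicate the older list, keep only titles the newer dump lacks,
    # and sort so the output is stable between runs.
    return sorted(t for t in set(older_titles) if t not in newer)


if __name__ == "__main__":
    # Made-up titles, for illustration only.
    older = ["Train", "Example band", "Some hoax"]
    newer = ["Train", "Brand new page"]
    print(titles_missing_from_newer(older, newer))
```

Articles created and deleted between two dumps never show up in either list, which is the gap mentioned earlier in this message.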
The deletion logs are here: http://en.wikipedia.org/w/index.php?title=Special:Log/delete . We could scan those, but as I said, how do we get the text?
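For scanning the deletion log by script rather than through the web page, the MediaWiki API exposes log entries via list=logevents. A minimal sketch of building such a query follows; the helper name and the default limit are assumptions. Note that the log returns titles, timestamps and deletion comments only, so it does not answer the question of how to get the deleted text.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def deletion_log_url(limit: int = 50) -> str:
    """Build an API query URL for recent deletion log entries (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "delete",   # restrict to the deletion log
        "lelimit": limit,     # number of entries to request
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(deletion_log_url(10))
```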
The CPU usage for something like that would go way over my current processing capacity. We could, as I said, install it on the toolserver, but I would have to work on the code for a bit first. So this comes down to the question: do we have a full log of the deleted articles?
thanks, mike
On Mon, Jun 11, 2012 at 5:49 AM, Samuel Klein meta.sj@gmail.com wrote:
This is great. Thank you, Mike! It would be nice to see this done for historically speedied articles, too. Sam.
-- Samuel Klein identi.ca:sj w:user:sj +1 617 529 4266
Although I can understand the appeal of this concept, I am concerned that a deleted-articles wiki or site will perpetuate the publicity given to pages that are properly deleted from Wikipedia because they contain offensive personal attacks, harassment, cyberbullying, defamation, and BLP violations. These are not always flagged in the deletion grounds, especially in speedy situations (e.g. if a harassing or defamatory article does not assert the subject's notability, it will often be deleted on that ground without its being tagged as an attack page, etc.). This issue strikes me as extremely serious. How do you plan to address it?
Newyorkbrad
My view is that we just need a proper tagging system when deleting them, so that we know which ones are harassing and violate people's rights. I also want to help here and protect people's rights, but I also want to scrape out valid copyrighted material that is not yet worthy of Wikipedia. Is there no tag for such things? thanks, mike
On Mon, Jun 11, 2012 at 7:15 AM, Newyorkbrad newyorkbrad@gmail.com wrote:
Although I can understand the appeal of this concept, I am concerned that a deleted-articles wiki or site will perpetuate the publicity given to pages that are properly deleted from Wikipedia because they contain offensive personal attacks, harassment, cyberbullying, defamation, and BLP violations. These are not always flagged in the deletion grounds, especially in speedy situations (e.g. if a harassing or defamatory article does not assert the subject's notability, it will often be deleted on that ground without its being tagged as an attack page, etc.). This issue strikes me as extremely serious. How do you plan to address it?
Newyorkbrad
After rereading your question, it boils down to whether the tags are wrong. If the tags are wrong, we will have to deal with them on a case-by-case basis, and I see that we will have to do finer tagging of articles to be deleted. I hope that the outcome of my effort will be that articles are properly tagged and deleted according to a fair and transparent set of rules. Let's work on this together to make Wikipedia better.
What is the current policy now? You have articles in dumps that contain these offending materials, I presume, if they are not that new. What is the difference between a dump on Wikipedia, a dump on archive.org, and a dump on wikia?
thanks,
mike
On Mon, Jun 11, 2012 at 7:36 AM, Mike Dupont <jamesmikedupont@googlemail.com> wrote:
My view is that we just need a proper tagging system when deleting them to know which are harassing and violating the peoples rights. I want to also help here and protect peoples rights, but I also want to scrape out valid copyrighted material that is not worthy yet of wikipedia. Are there no tag for such things? thanks, mike
On 11 June 2012 08:40, Mike Dupont jamesmikedupont@googlemail.com wrote:
After rereading your question, it boils down to if the tags are wrong. If the tags are wrong we will have to deal with them on a case by case issue, and I see that we will have to do more finer tagging of articles to be deleted, I hope that this will be the outcome of my effort to see that articles that are properly tagged and deleted according to a fair and transparent set of rules. lets work on this together to make the wikipedia better.
I'm not sure if this is beneficial. Speedy tagging is the first line for removing junk, inappropriate content, spam and obvious hoaxes. The material is of such low quality that the only consideration is that it is deleted - not how well we tag it for deletion.
Expending more effort on this material is non-optimal.
Tom
On 11 June 2012 08:40, Mike Dupont jamesmikedupont@googlemail.com wrote:
After rereading your question, it boils down to if the tags are wrong. If the tags are wrong we will have to deal with them on a case by case issue, and I see that we will have to do more finer tagging of articles to be deleted, I hope that this will be the outcome of my effort to see that articles that are properly tagged and deleted according to a fair and transparent set of rules. lets work on this together to make the wikipedia better.
There's a couple of problems here.
One is that speedy deletion tags don't often get stacked - if people see something that meets two criteria, they'll often tag it with just one, and usually prefer the "clearer" criterion, the one with less subjectivity about interpretation. Usually, this is seen with "copyvio + not notable" = only tag for copyvio; but it also manifests itself in the cases Brad mentions - not notable + negatively slanted = usually tagged for notability rather than negativity. This is a problem, because understandably most of the cases you want to *keep* are the "not-notable" ones...
Secondly, some admins will delete abusive material on sight without waiting to tag it (which is skirting around best practice, but widely accepted for unambiguously "bad" content), and some people prefer to use "hand-written" CSD reasons. Both of these may end up with harder-to-interpret deletion logs (they don't contain the codes) - I'm not sure how your system works with these.
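To make the log-interpretation problem concrete, here is a rough sketch of pulling CSD codes out of a deletion log comment. It assumes the standard dropdown reasons look like "[[WP:CSD#A7|A7]]: ..." (an assumption about the comment format, not something stated in this thread), and the function names and blocked-code set are made up for illustration. Hand-written reasons yield no code at all, and the sketch treats those as unclassifiable.

```python
import re
from typing import List

# Matches the criterion code in links such as [[WP:CSD#G10|G10]].
CSD_LINK = re.compile(r"WP:CSD#([A-Z]\d{1,2})")

# G10 = attack pages, G12 = copyright infringement.
BLOCKED_CODES = {"G10", "G12"}


def csd_codes(comment: str) -> List[str]:
    """Return the distinct CSD codes found in a log comment, sorted."""
    return sorted(set(CSD_LINK.findall(comment)))


def looks_archivable(comment: str) -> bool:
    """True only if the comment carries codes and none of them is blocked."""
    codes = csd_codes(comment)
    if not codes:
        return False  # hand-written or missing reason: cannot classify
    return not (set(codes) & BLOCKED_CODES)


if __name__ == "__main__":
    print(csd_codes("[[WP:CSD#A7|A7]]: No indication of importance"))
    print(looks_archivable("nuking obvious junk"))
```

This only reflects the tag that was actually used. As described above, an attack page deleted under a notability criterion would still pass this check, so it cannot replace human review.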
What is the current policy now? you have articles in dumps that contain these offending materials i presume if they are not that new. What is the difference between a dump on wikpedia and a dump on archive.org, and a dump on wikia?
The difference is that one is a hard-to-accidentally-stumble-across dump, and one is a publicly readable website ;-)
I do not believe there are active moves to remove material from older dumps. *However*, the damage from problematic material surviving in dumps is relatively low, because most speedily deleted material tends never to make it into a dump.
Let's assume dumps are twice-monthly. If every article tagged for speedy deletion "on creation" lasts for six hours before being removed, then there's approximately a 1/60 chance of it still being around when the dump triggers. By contrast, around 50% of articles marked with PROD just after creation will survive into the dump, even if deleted, and substantially over 50% of articles put through AFD on creation will be dumped even if deleted, because the frequency of relisting means many of them are tagged for two weeks.
As a result, CSD material - which is where the worst content usually is - is much less likely to get dumped than AFD material.
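The arithmetic behind these estimates can be sketched as follows. The model is an assumption: a page exists for a fixed window and the dump fires at a uniformly random moment, so the chance of capture is lifetime divided by dump interval. The 1/60 figure corresponds to a six-hour lifetime against an interval of about 15 days (360 hours); that interval is inferred from the numbers, and with a 3.5-day interval the same formula would give 6/84, or 1/14. The 7-day PROD period and the two-week relisted AFD are taken as the lifetimes for the other two cases.

```python
def dump_survival_chance(lifetime_hours: float, dump_interval_hours: float) -> float:
    """Chance that a page alive for lifetime_hours is caught by a periodic dump."""
    # A page that outlives a whole interval is always captured.
    return min(1.0, lifetime_hours / dump_interval_hours)


DUMP_INTERVAL_HOURS = 15 * 24  # assumed interval: about twice a month

csd = dump_survival_chance(6, DUMP_INTERVAL_HOURS)         # speedy: 6 hours
prod = dump_survival_chance(7 * 24, DUMP_INTERVAL_HOURS)   # PROD: 7 days
afd = dump_survival_chance(14 * 24, DUMP_INTERVAL_HOURS)   # relisted AFD: 14 days

if __name__ == "__main__":
    print(f"CSD  {csd:.3f}")
    print(f"PROD {prod:.3f}")
    print(f"AFD  {afd:.3f}")
```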
Some dumped abusive material does make it out onto reuse sites, though, and it's an incredible pain to deal with - people find it a year or two later, mirrored on an abandoned ad-farm website, and write us distressed letters about trying to get it removed. We can't do anything about it - these sites are usually completely uncontactable and don't care anyway - and it's quite an unpleasant experience for all concerned.
If you're going to expose these on wikia as well as making archive files, I'd strongly recommend you give a clearly visible contact address to have material removed if needed - it'll save everyone involved a lot of stress in future!
Yes, you are raising very good points. I will have to take on the responsibility of putting more work into this, and I hope that other people will help curate the leftovers. My contact address is on the disclaimer on the main page, and anyone is free to edit and blank out articles that are a problem. There are some gems in that muck, however, and I intend to help them out of the dirt. thanks, mike
On Mon, Jun 11, 2012 at 4:47 PM, Andrew Gray andrew.gray@dunelm.org.uk wrote:
On 11 June 2012 08:40, Mike Dupont jamesmikedupont@googlemail.com wrote:
After rereading your question, it boils down to if the tags are wrong. If the tags are wrong we will have to deal with them on a case by case issue, and I see that we will have to do more finer tagging of articles to be deleted, I hope that this will be the outcome of my effort to see that articles that are properly tagged and deleted according to a fair and transparent set of rules. lets work on this together to make the wikipedia better.
There's a couple of problems here.
One is that speedy deletion tags don't often get stacked - if people see something that fills two criteria, they'll often tag them with just one, and usually prefer the "clearer" criteria, the one with less subjectivity about interpretation. Usually, this is seen with "copyvio + not notable" = only tag for copyvio; but it also manifests itself in the cases Brad mentions - not notable + negatively slanted = usually tagged for notability rather than negativity. This is a problem, because understandably most of the cases you want to *keep* are the "not-notable" ones...
Secondly, some admins will delete abusive material on sight without waiting to tag it (which is skirting around best practice, but widely accepted for unambiguously "bad" content), and some people prefer to use "hand-written" CSD reasons. Both of these may end up with harder-to-interpret deletion logs (they don't contain the codes) - I'm not sure how your system works with these.
What is the current policy now? you have articles in dumps that contain these offending materials i presume if they are not that new. What is the difference between a dump on wikpedia and a dump on archive.org, and a dump on wikia?
The difference is that one is a hard-to-accidentally-stumble-across dump, and one is a publicly readable website ;-)
I do not believe there are active moves to remove material from older dumps. *However*, the damage from problematic material surviving in dumps is relatively low, because most speedily deleted material tends never to make it into a dump.
Let's assume dumps are twice-monthly. If every article tagged for speedy deletion "on creation" lasts for six hours before being removed, then there's approximately a 1/60 chance of it still being around when the dump triggers. By contrast, around 50% of articles marked with PROD just after creation will survive into the dump, even if deleted, and substantially over 50% of articles put through AFD on creation will be dumped even if deleted, because the frequency of relisting means many of them are tagged for two weeks.
As a result, CSD material - which is where the worst content usually is - is much less likely to get dumped than AFD material.
Some dumped abusive material does make it out onto reuse sites, though, and it's an incredible pain to deal with - people find it a year or two later, mirrored on an abandoned ad-farm website, and write us distressed letters about trying to get it removed. We can't do anything about it - these sites are usually completely uncontactable and don't care anyway - and it's quite an unpleasant experience for all concerned.
If you're going to expose these on wikia as well as making archive files, I'd strongly recommend you give a clearly visible contact address to have material removed if needed - it'll save everyone involved a lot of stress in future!
-- Andrew Gray andrew.gray@dunelm.org.uk
I think we need to ensure that BLP deletions are tagged appropriately.
And this wikia needs to err on the side of caution in order to avoid causing subjects further grief in their pursuit to remove problematic content from Wikipedia. I.e., if someone jumps through all the hoops to *help* us remove problematic content from Wikipedia, they are not going to be happy to learn that the same content has appeared on Wikia - it's confusing, and they will blame Wikipedia, and IMO they are right to do so, as this Wikia is run by people in the Wikimedia community, and due to the overlap between the WMF board and the Wikia board, now and historically.
e.g. this AFD mentioned "WP:BLP1E" and was categorised into "AfD debates (Biographical)"
https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Alexander_Kiny... http://speedydeletion.wikia.com/wiki/Alexander_Kinyua
Until we are confident that BLP problems are not being imported into the wikia, the content shouldn't be indexed. I assume __NOINDEX__ works on Wikia?
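A small sketch of how an upload script could add the magic word before saving a page. __NOINDEX__ is a standard MediaWiki magic word; whether Wikia honours it is exactly the open question above, so that part is an assumption. The function name is hypothetical, and the actual upload call is left out.

```python
NOINDEX = "__NOINDEX__"


def with_noindex(wikitext: str) -> str:
    """Return the page text with __NOINDEX__ prepended, unless already present."""
    if NOINDEX in wikitext:
        return wikitext  # already marked; keep the text unchanged
    return NOINDEX + "\n" + wikitext


if __name__ == "__main__":
    print(with_noindex("'''Example band''' is a band."))
```

Checking for the magic word first keeps the function idempotent, so re-uploading the same page does not stack markers.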
Well, showing up on wikia is one thing; that should be easy to fix, and even after the fact we can update the wikia and remove articles, as there are not so many of them.
We need a list of categories that can or cannot be archived. John, are you saying WP:BLP1E should be excluded? Is there a specific tag or text to look for? What about a simple tag or category like "bin it!"?
thanks, mike
On Mon, Jun 11, 2012 at 8:00 AM, John Vandenberg jayvdb@gmail.com wrote:
I think we need to ensure that BLP deletions are tagged appropriately.
and this wikia needs to err on the side of caution in order to avoid causing subjects further grief in their pursuit to remove problematic content from Wikipedia. i.e. if someone jumps through all the hoops to *help* us remove problematic content from Wikipedia, they are not going to be happy to learn that the same content has appeared on Wikia - its confusing, and they will blame Wikipedia, and IMO they are right to do so as this Wikia is run by people in the Wikimedia community, and due to the overlap in the WMF board and Wikia board, now and historically.
e.g. this AFD mentioned "WP:BLP1E" and was categorised into "AfD debates (Biographical)"
https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Alexander_Kiny... http://speedydeletion.wikia.com/wiki/Alexander_Kinyua
Until we are confident that BLP problems are not being imported into the wikia, the content shouldnt be indexed. I assume __NOINDEX__ works on Wikia?
-- John Vandenberg
John, and others.
I have finally figured out a big problem with my plan: the articles for deletion are not tagged properly at all. There are authors who know for a fact that articles are mistagged and have no proper copyvio tagging, and now they are accusing me of hosting copyvio articles. I see this as a problem in the Wikipedia deletion system: if an editor knows for a fact that an article is in violation of copyright, then they should tag it as such. I have written scripts to strip out articles that are properly tagged. Let's sit down and work out a plan for a proper system of sorting out what is not notable and what is a copyright violation. I want to host the non-notable articles. My argument is that giving non-notable bands, actors, etc. an outlet to be hosted will reduce repeated reposting of articles. I have been sorting through all these articles and contacting people, and many of them are thankful. I would be surprised if any of them would repost the deleted article, as happened with the Jack Psyco article from .au, which someone reposted many, many times. Please support me in cleaning up the deletion and tagging process; I am willing to put some work into this, and I can write code as well. Some people have asked me not to use the mailing list, but I wanted to bring this up in response to your mail.
Please see http://en.wikipedia.org/wiki/User:Mdupont/SpeedyDeletionWikia and http://en.wikipedia.org/wiki/User_talk:Mdupont#Speedy_backup_-_copyvios_and_...
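The kind of tag-based filtering described in this message could look roughly like the sketch below. It is not the actual script from the repositories; the function names are hypothetical, and the template list is an assumption based on common speedy-deletion templates for attack pages (db-g10, db-attack) and copyright violations (db-g12, db-copyvio). It only catches pages that were tagged correctly, which is the limitation this message complains about.

```python
import re
from typing import Dict

# Assumed set of templates that should keep a page out of the archive.
EXCLUDED_TEMPLATES = ("db-g10", "db-attack", "db-g12", "db-copyvio")

# Matches "{{name}}" or "{{name|...}}", ignoring case and stray spaces.
EXCLUDED_PATTERN = re.compile(
    r"\{\{\s*(" + "|".join(re.escape(name) for name in EXCLUDED_TEMPLATES) + r")\s*[|}]",
    re.IGNORECASE,
)


def is_excluded(wikitext: str) -> bool:
    """True if the page carries one of the excluded deletion templates."""
    return EXCLUDED_PATTERN.search(wikitext) is not None


def keep_archivable(pages: Dict[str, str]) -> Dict[str, str]:
    """Filter a {title: wikitext} mapping down to pages without excluded tags."""
    return {title: text for title, text in pages.items() if not is_excluded(text)}


if __name__ == "__main__":
    sample = {
        "Example band": "{{db-a7}} An example band.",
        "Copied page": "{{db-copyvio|url=http://example.org}} Copied text.",
    }
    print(sorted(keep_archivable(sample)))
```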
thanks mike On Mon, Jun 11, 2012 at 8:00 AM, John Vandenberg jayvdb@gmail.com wrote:
I think we need to ensure that BLP deletions are tagged appropriately.
And this wikia needs to err on the side of caution in order to avoid causing subjects further grief in their pursuit to remove problematic content from Wikipedia. I.e., if someone jumps through all the hoops to *help* us remove problematic content from Wikipedia, they are not going to be happy to learn that the same content has appeared on Wikia - it's confusing, and they will blame Wikipedia, and IMO they are right to do so as this Wikia is run by people in the Wikimedia community, and due to the overlap in the WMF board and Wikia board, now and historically.
e.g. this AFD mentioned "WP:BLP1E" and was categorised into "AfD debates (Biographical)"
https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Alexander_Kiny... http://speedydeletion.wikia.com/wiki/Alexander_Kinyua
Until we are confident that BLP problems are not being imported into the wikia, the content shouldn't be indexed. I assume __NOINDEX__ works on Wikia?
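For illustration only, the suggestion above could be sketched as a small pre-upload step. `__NOINDEX__` is a standard MediaWiki behavior switch; whether Wikia honours it is exactly the open question here, and the helper name `with_noindex` is hypothetical, not something taken from the actual import scripts.

```python
# Hypothetical helper: mark imported wikitext so search engines are asked
# not to index it. Assumes the target wiki honours __NOINDEX__.
NOINDEX = "__NOINDEX__"


def with_noindex(wikitext):
    """Prepend the __NOINDEX__ switch unless the page already carries it."""
    if NOINDEX in wikitext:
        return wikitext
    return NOINDEX + "\n" + wikitext
```

Applying this to every page before upload would keep the content out of search results only as far as the host respects the switch, so it is a stopgap and not a substitute for review.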
On Mon, Jun 11, 2012 at 2:15 PM, Newyorkbrad newyorkbrad@gmail.com wrote:
Although I can understand the appeal of this concept, I am concerned that a deleted-articles wiki or site will perpetuate the publicity given to pages that are properly deleted from Wikipedia because they contain offensive personal attacks, harassment, cyberbullying, defamation, and BLP violations. These are not always flagged in the deletion grounds, especially in speedy situations (e.g. if a harassing or defamatory article does not assert the subject's notability, it will often be deleted on that ground without its being tagged as an attack page, etc.). This issue strikes me as extremely serious. How do you plan to address it?
Newyorkbrad
On 21 July 2012 22:33, Mike Dupont jamesmikedupont@googlemail.com wrote:
John, and others.
I have finally figured out a big problem with my plan. The articles for deletion are not tagged properly at all. There are authors who know for a fact that articles are mistagged and have no proper copyvio tagging, and now they are accusing me of hosting copyvio articles. I see this as a problem in the Wikipedia deletion system: if an editor knows for a fact that an article is in violation of copyright, then they should tag it as such. I have written scripts to strip out articles that are properly tagged. Let's sit down and work out a plan for a proper system of sorting out what is not notable and what is a copyright violation.
Not going to happen. The reality is that people deleting articles are going to opt for the option that takes the least effort on their part. And A7 beats out G12 in that case. The other issue is that OTRS, whose opinions we care about a lot more, complain when people go the other way (deleting things as copyvios rather than going through AFD).
On Sat, Jul 21, 2012 at 10:27 PM, geni geniice@gmail.com wrote:
Not going to happen. The reality is that people deleting articles are going to opt for the option that takes the least effort on their part. And A7 beats out G12 in that case.
OK, well, I understand that people don't want extra work, but I think we can work on some process and software. It might happen.
The other issue is that OTRS, whose opinions we care about a lot more, complain when people go the other way (deleting things as copyvios rather than going through AFD).
Is there any way to tag them like that? Can we mark them?
While I think you have a good idea (similar to one that I myself did several years ago, and eventually abandoned for the very problems you're dealing with now), I don't think putting these pages directly into Wikia is going to work. If you decide to store the articles somewhere more private, at least until they are manually reviewed, let me know. By "more private" I mean, at a minimum, not listed in search engines.
Have you looked into hosting the articles at Internet Archive? This seems like a better place than Wikia, at least before the articles are manually reviewed.
The articles are hosted on archive.org and then transferred to Wikia. I could make some review tools, but where to host them?
mike
On Sun, Jul 22, 2012 at 1:57 AM, Mike Dupont jamesmikedupont@googlemail.com wrote:
The articles are hosted on archive.org and then transferred to Wikia. I could make some review tools, but where to host them? mike
Well, why do you want to host them? Or, more specifically, why do you want to host them somewhere other than archive.org?
I definitely understand wanting to save the content of these articles just in case someone wants to work on fixing them. Wikipedia's temporary undeletion process for people who wish to do this is simply not convenient. But archive.org seems to take care of that purpose just fine.
If the *topic* is deemed "notable" and it is just the article itself that is of very poor quality, then once these have undergone manual review and intervention, the right place to host them would be back in Wikipedia. It would also be nice to have a wiki for topics deemed by Wikipedia to be "not notable". But this is a much different problem, and I don't believe it is served well by the autopopulation of mostly very low quality articles.
Also, sorry if this was already answered, but where are they on archive.org?
Hi, I created the wikia because archive.org is so hard to use and slow. I want to be able to share articles with people, and I have been contacting all the Twitter users to tell them about the articles (after checking and updating them).
I want to host them, at least the good ones, to give bands and things a place to find themselves, or to give the "non-notables" a home.
In fact, a few people have already found themselves via Google or some other means and have updated their articles. I bet that this should be good enough to prevent them from recreating the articles on WP.
The technical details are here: https://code.google.com/p/wikiteam/wiki/SpeedyDeletion
I created monthly buckets; the most recent one is here: http://archive.org/details/wikipedia-delete-v3-2012-07
If it turns out that too many copyvio articles are slipping through, I could turn off the feed to Wikia yet keep the archive.org feed; they are two separate processes.
Ideally we would agree on some tag to mark articles that would be worthy of hosting, but not ready for Wikipedia. We are pulling only some categories directly.
mike
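As a rough illustration of the two-feed arrangement described above (this is a sketch, not the actual wikiteam/pywikipediabot code), the routing could look something like the following. The template names in both sets are assumptions chosen for the example; A7 and G12 are the criteria mentioned elsewhere in this thread, but the real scripts may select pages differently. The function names `deletion_tags` and `route` are hypothetical.

```python
import re

# Illustrative tag sets (assumptions for this sketch, not the real lists).
PUBLISH_TAGS = {"db-a7", "db-band", "db-hoax", "prod"}
BLOCK_TAGS = {"db-g12", "db-copyvio", "db-g10", "db-attack"}

# Matches simple, non-nested templates such as {{db-a7}} or {{db-g12|url=...}}.
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}")


def deletion_tags(wikitext):
    """Return the lowercased names of all simple templates in the wikitext."""
    return {name.strip().lower() for name in TEMPLATE_RE.findall(wikitext)}


def route(wikitext, wikia_feed_enabled=True):
    """Decide which feeds receive a page: always archive, publish only if allowed."""
    tags = deletion_tags(wikitext)
    destinations = ["archive"]
    if tags & BLOCK_TAGS:
        # Anything tagged as copyvio or attack never reaches the public wiki.
        return destinations
    if wikia_feed_enabled and tags & PUBLISH_TAGS:
        destinations.append("wikia")
    return destinations
```

Keeping the public feed behind a single flag mirrors the point that the Wikia feed can be switched off while the archive feed keeps running. Note that a filter like this only catches pages that are tagged correctly, which is the limitation raised in this thread.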
Hi Mike,
You are going to find it difficult to find people who want to give their time to your project, as the cost benefit ratio is very low. If people had more time, they would be spending it rescuing the articles before deletion (on a non-profit website), rather than preparing them for rescuing after deletion (on a for-profit website).
By reposting the content somewhere else, you are taking responsibility for it. And by hosting it, Wikia is also taking responsibility for it.
And that responsibility requires you to work with the existing system, warts and all. Even good changes to the system will take a long time to become standard practise.
Before trying to change New Page Patrol, you should try doing New Page Patrol for a few days.
Thanks John,
Thanks for your feedback, that is a good suggestion: working on New Page Patrol.
I have been doing a lot of work on this myself and have saved a bunch of bands etc. I have been contacting people on Twitter, and it seems that there has been good feedback.
A non-profit website sounds great, but I have no funds for this. Ideally we would find a way to solve this problem.
The issue that got me started was trying to promote Wikipedia in Kosovo and Albania. Did you know that people started a new group to support Wikipedia in Albania, similar to FLOSSK? http://openlabs.cc/
The problem with new articles from these places, and from places like India, is that they have a very high deletion rate, and we need a way to save them. I am willing to put work into the articles on bands, artists, movies, athletes, etc., and also into websites to save them.
We just need to make a clear distinction between what is not notable and what is illegal.
mike
On 22 July 2012 07:09, Mike Dupont jamesmikedupont@googlemail.com wrote:
The problem with new articles from these places, and from places like India, is that they have a very high deletion rate, and we need a way to save them. I am willing to put work into the articles on bands, artists, movies, athletes, etc., and also into websites to save them.
(this is getting a bit en:wp-specific, but anyway)
Robert Horning just raised this on the Village Pump - he wants to get article incubation going again:
https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28policy%29#Article_In...
I answered that the main barrier was just having people willing to do it. You may want to mention your interest there.
- d.
On 22 July 2012 04:16, John Vandenberg jayvdb@gmail.com wrote:
Before trying to change New Page Patrol, you should try doing New Page Patrol for a few days.
+1
I'm broadly an inclusionist, but reading a few hours' Special:Newpages is enough to have me breaking out the flamethrower. Anyone complaining about new pages patrol needs to personally sample the firehose of *mostly shit* that comes into Wikipedia.
- d.
This has been tried before, i.e. wikialpha.org. Pages are speedily deleted for a reason, many of them quite properly so. Moving potentially libelous BLP attack pages and other sundry junk to a publicly viewable wiki is not a very well-thought-out idea.
From: jamesmikedupont@googlemail.com Date: Sun, 10 Jun 2012 07:37:24 +0000 To: wikimedia-l@lists.wikimedia.org Subject: [Wikimedia-l] speedydeletion.wika.com lauched
http://deletionpedia.dbatley.com/w/index.php?title=Main_Page also worked up until a few months ago.
Richard Symonds Wikimedia UK 0207 065 0992 Disclaimer viewable at http://uk.wikimedia.org/wiki/Wikimedia:Email_disclaimer Visit http://www.wikimedia.org.uk/ and @wikimediauk
I think a lot of those efforts petered out because, in the end, the material was of so little utility no one was interested in it.
Or to put it another way; it was being archived for the sake of it, more than anything else.
Tom
There was a site (I think it was deletionpedia or something) which collected non-speedied articles which were deleted. I found that interesting; a collection of fringe notable articles and topics. I think that stopped collecting years ago; is that something you would be interested in adding to your wiki?
To answer your question, any article nominated for deletion is also archived.
Well, my personal motivation was that some of my articles were deleted because of notability, and I see many Kosovo articles deleted, so I want to preserve them.
Anyway, it does not cost much and I am learning. I hope that it will have some impact; let's see. mike
Well, there are already articles showing up on my pet topic of Kosovo and Albania that would be lost otherwise, so I would say it is working for me. mike
On Mon, Jun 11, 2012 at 1:02 PM, Tarc Meridian tarc@hotmail.com wrote:
This has been tried before, i.e. wikialpha.org. Pages are speedily deleted for a reason, many of them quite properly so. Moving potentially libelous BLP attack pages and other sundry junk to a publicly viewable wiki is not a very well-thought-out idea.
From: jamesmikedupont@googlemail.com Date: Sun, 10 Jun 2012 07:37:24 +0000 To: wikimedia-l@lists.wikimedia.org Subject: [Wikimedia-l] speedydeletion.wika.com lauched
Hi, I have launched speedydeletion.wika.com , it is updated every 30 minutes with the proposed deletions and speedy deletion articles (not notable and hoaxes, not others). it is running on the en.wikipedia.org. the sources for the script are all on git hub and are a merger of pywikipediabot and the wikiteam codebases. hope you enjoy it, thanks, mike
On Mon, Jun 11, 2012 at 9:02 AM, Tarc Meridian tarc@hotmail.com wrote:
This has been tried before, i.e. wikialpha.org. Pages are speedily deleted for a reason, many of them quite properly so. Moving potentially libelous BLP attack pages and other sundry junk to a publicly viewable wiki is not a very well-thought-out idea.
Every new article is potentially libelous.
Mike is the terror of deletionists.
Good work.
2012/6/10 Mike Dupont jamesmikedupont@googlemail.com
Hi, I have launched speedydeletion.wika.com , it is updated every 30 minutes with the proposed deletions and speedy deletion articles (not notable and hoaxes, not others). it is running on the en.wikipedia.org. the sources for the script are all on git hub and are a merger of pywikipediabot and the wikiteam codebases. hope you enjoy it, thanks, mike
here is one that is worth keeping! http://speedydeletion.wikia.com/wiki/Rootstrikers
On Mon, Jun 11, 2012 at 2:57 PM, emijrp emijrp@gmail.com wrote:
Mike is the terror of deletionists.
Good work.
-- Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com Pre-doctoral student at the University of Cádiz (Spain) Projects: AVBOT http://code.google.com/p/avbot/ | StatMediaWiki http://statmediawiki.forja.rediris.es | WikiEvidens http://code.google.com/p/wikievidens/ | WikiPapers http://wikipapers.referata.com | WikiTeam http://code.google.com/p/wikiteam/ Personal website: https://sites.google.com/site/emijrp/
Indeed; and so it was...
{{facepalm}}
Tom
On 11 June 2012 17:32, Mike Dupont jamesmikedupont@googlemail.com wrote:
here is one that is worth keeping! http://speedydeletion.wikia.com/wiki/Rootstrikers
2012/6/10 Mike Dupont jamesmikedupont@googlemail.com:
Hi, I have launched speedydeletion.wika.com , it is updated every 30 minutes with the proposed deletions and speedy deletion articles (not notable and hoaxes, not others).
Hey Mike,
Great idea, do you plan to extend it to other wikis?
Also, I see that you only make the content available under CC BY-SA, which might be a problem if someone decided to move the deleted content back into Wikipedia (well, not so much a problem as a complication for reusers). Perhaps you could consider also releasing the content under the GFDL where it applies?
Strainu
Hi there, I am still waiting for an OK to install it on the Toolserver. The source is open, so you can run it as well. I don't plan on going to other wikis until I get one working perfectly. In terms of license, I am just using a template for that. I have not thought about the GFDL much, but I don't want to change the license, just make the data available. Feel free to update the templates as needed. Thanks, mike
On Tue, Jun 12, 2012 at 8:05 AM, Strainu strainu10@gmail.com wrote:
Hey Mike,
Great idea, do you plan to extend it to other wikis?
Also, I see that you only make the content available under CC BY-SA, which might be a problem if someone decided to move the deleted content back into Wikipedia (well, not so much a problem as a complication for reusers). Perhaps you could consider also releasing the content under the GFDL where it applies?
Strainu