It's completely broken (https://code.google.com/p/wikiteam/issues/detail?id=56): it will download only a fraction of the wiki, at most 500 pages per namespace.
Let me reiterate that https://code.google.com/p/wikiteam/issues/detail?id=44 is a very urgent bug, and we've seen no work on it in many months. We need an actual programmer with some knowledge of Python to fix it and make the script work properly; I know there are several on this list (and elsewhere), so please, please help. The last time I, as a non-coder, tried to fix a bug, I made things worse (https://code.google.com/p/wikiteam/issues/detail?id=26).
Only once the API support is implemented/fixed will I be able to re-archive the 4-5 thousand wikis we've recently archived on archive.org (https://archive.org/details/wikiteam), and possibly many more. Many of those dumps contain errors and/or are only partial because of the script's unreliability, and wikis die on a daily basis. (So, quoting emijrp, there IS a deadline.)
Nemo
P.s.: Cc'ing some lists out of desperation; sorry for cross-posting.
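The 500-pages-per-namespace ceiling reads like the script giving up after a single batch of titles; the API route Nemo is asking for avoids that, because api.php's list=allpages keeps returning a continuation token until every title in the namespace has been handed back. Below is a minimal, hedged sketch of that title enumeration against a standard api.php endpoint; the function and variable names are illustrative, not taken from the existing script.

import requests

def list_all_titles(api_url, namespace=0):
    """Yield every page title in one namespace via list=allpages,
    following the API's continuation instead of stopping at one batch."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": "max",  # the server caps each batch (typically 500)
        "format": "json",
    }
    while True:
        data = requests.get(api_url, params=params).json()
        for page in data["query"]["allpages"]:
            yield page["title"]
        # older MediaWiki releases use query-continue, newer ones use continue
        cont = data.get("query-continue", {}).get("allpages") or data.get("continue")
        if not cont:
            break
        params.update(cont)  # carries apfrom/apcontinue into the next request

Called as list_all_titles("http://example.org/w/api.php"), this keeps requesting batches until the wiki reports no further continuation, so the per-request cap no longer bounds the dump.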
Hi all,
I am beginning work on a port to PHP because of some issues with unit testing in another project of mine (if you follow me on GitHub, you will know). I hope to help out with fixing the script, but it would be a good idea to get someone who knows Python (pywikipedia-l people) and the MediaWiki API (mediawiki-api people) to help.
If someone wants to contact me off list, I will gladly rewrite this clusterfuck. I just need some help with the details and a step-by-step of exactly what/how you want the process of dumping a wiki done. I've done it quite a few times myself before, and getting you a functional bot shouldn't be that hard (I am both a Python and API programmer :P).
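Roughly, the step-by-step being asked for boils down to two loops: enumerate every title per namespace (as sketched above), then pull each page's complete revision history and serialize it into MediaWiki export XML. Here is a hedged sketch of the second step against a stock api.php; the names are again illustrative and not taken from the existing script.

import requests

def fetch_full_history(api_url, title):
    """Return every revision of one page via prop=revisions,
    following continuation so long histories are not cut short."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment|content",
        "rvlimit": "max",  # the API lowers this limit when content is requested
        "rvdir": "newer",  # oldest revision first, matching export-style dumps
        "format": "json",
    }
    revisions = []
    while True:
        data = requests.get(api_url, params=params).json()
        page = next(iter(data["query"]["pages"].values()))
        revisions.extend(page.get("revisions", []))
        cont = data.get("query-continue", {}).get("revisions") or data.get("continue")
        if not cont:
            return revisions
        params.update(cont)  # rvstartid/rvcontinue for the next batch

A real dumper would then wrap these revisions in the <page>/<revision> elements of the export schema; the point of the sketch is only that continuation, not a single capped request, drives the loop.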
Hi John (and all),
Can you provide me with your GitHub username? I did a git-svn conversion to a repository there, and I can add you as a member to contribute the code.
The repo is at https://github.com/WikiTeam/WikiGrabber.
It's just a few revisions out of date; I've got to figure out a way to pull changes from the main repository into this unofficial clone. (Note that I can push directly to the main repository on Google Code.)
On Fri, Nov 9, 2012 at 11:02 AM, Hydriz Wikipedia admin@alphacorp.tk wrote:
Can you provide me with your GitHub username? I did a git-svn conversion to a repository there, and I can add you as a member to contribute the code.
One of the main advantages of git/GitHub is that you don't need to do pre-coordination like this. He can just fork the repository and start working on it, then send you a pull request when it's done.
The repo is at https://github.com/WikiTeam/WikiGrabber.
It's just a few revisions out of date; I've got to figure out a way to pull changes from the main repository into this unofficial clone. (Note that I can push directly to the main repository on Google Code.)
You should still have SVN as a remote that you can pull from to get yourself synced.
Tom
On Sat, Nov 10, 2012 at 12:17 AM, Tom Morris tfmorris@gmail.com wrote:
On Fri, Nov 9, 2012 at 11:02 AM, Hydriz Wikipedia admin@alphacorp.tk wrote:
Can you provide me with your GitHub username? I did a git-svn conversion to a repository there, and I can add you as a member to contribute the code.
One of the main advantages of git/GitHub is that you don't need to do pre-coordination like this. He can just fork the repository and start working on it, then send you a pull request when it's done.
True on that; at least this way I can do a code review before things get merged in. Anyone else willing to help on this is more than welcome to fork and submit a pull request later.
The repo is at https://github.com/WikiTeam/WikiGrabber.
It's just a few revisions out of date; I've got to figure out a way to pull changes from the main repository into this unofficial clone. (Note that I can push directly to the main repository on Google Code.)
You should still have SVN as a remote that you can pull from to get yourself synced.
I probably have to figure this out, though it isn't of the highest priority: only small changes are happening in the main repo, and they probably won't have a big impact.
-- Regards, Hydriz
We've created the greatest collection of shared knowledge in history. Help protect Wikipedia. Donate now: http://donate.wikimedia.org
Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l