Hi everyone,
I recently set up a MediaWiki (http://server.bluewatersys.com/w90n740/)
and I need to extract the content from it and convert it into LaTeX
syntax for printed documentation. I have googled for a suitable OSS
solution, but nothing obvious turned up.
I would prefer a script written in Python, but any recommendations
would be very welcome.
Do you know of anything suitable?
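As a minimal starting point, a sketch of the extraction side in Python, assuming the wiki exposes raw wikitext via index.php's action=raw (stock MediaWiki does); it only handles headings, and the page title is just an example:

# Minimal sketch: fetch raw wikitext from a MediaWiki page and map
# == headings == to LaTeX sectioning commands. Everything else
# (links, lists, tables) would need further rules or a real converter.
import re
from urllib.parse import quote
from urllib.request import urlopen

WIKI = "http://server.bluewatersys.com/w90n740/index.php"

def fetch_wikitext(title):
    # action=raw makes MediaWiki return the page source as plain wikitext
    url = "%s?title=%s&action=raw" % (WIKI, quote(title))
    return urlopen(url).read().decode("utf-8")

def headings_to_latex(text):
    # Convert level-3 headings first so the level-2 rule doesn't eat them
    text = re.sub(r"^===\s*(.+?)\s*===\s*$", r"\\subsection{\1}", text, flags=re.M)
    text = re.sub(r"^==\s*(.+?)\s*==\s*$", r"\\section{\1}", text, flags=re.M)
    return text

if __name__ == "__main__":
    print(headings_to_latex(fetch_wikitext("Main_Page")))  # example page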
Kind Regards,
Hugo Vincent,
Bluewater Systems.
I've been tinkering with an extension to provide for a captcha to reduce
automated linkspamming while still staying out of the way for common use.
My preliminary code is running now on test.leuksman.com; the actual
"captcha" part is a really primitive plain text hack which would take
all of a few minutes for a dedicated attacker to crack, but don't worry
about that -- I'm not testing the protection yet, just the framework it
plugs into.
By default the captcha prompt will only kick in if an edit adds new URLs
to the text. Most regular editing shouldn't trip this -- wiki links,
plain text, or just preserving existing links. But if you add new HTTP
links that weren't there before, it'll then make you pass the captcha
before it saves.
The captcha step can also be bypassed based on user group (e.g. registered
bots, sysop accounts, optionally all registered users), and can also be
set to skip for any user who has gone through confirmation of their
account e-mail address.
I haven't coded it yet, but it should also be possible to add a URL
whitelist, for instance for the site's own local URLs.
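Roughly, the trigger check amounts to something like the following sketch (Python rather than the actual PHP hook code; the group names and whitelist entries are placeholders):

# Sketch of the trigger logic described above, not the real hook code.
import re

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
EXEMPT_GROUPS = {"bot", "sysop"}                # placeholder bypass groups
URL_WHITELIST = ("http://test.leuksman.com/",)  # e.g. the site's own URLs

def needs_captcha(old_text, new_text, user_groups, email_confirmed=False):
    # Exempt groups (and optionally confirmed e-mail addresses) skip the check
    if user_groups & EXEMPT_GROUPS or email_confirmed:
        return False
    old_urls = set(URL_RE.findall(old_text))
    added = set(URL_RE.findall(new_text)) - old_urls
    # Only genuinely new, non-whitelisted external links should trip the captcha
    added = {u for u in added if not u.startswith(URL_WHITELIST)}
    return bool(added)

# Plain edits and existing links pass; adding a new external link does not:
print(needs_captcha("See [[Foo]].", "See [[Foo]].", set()))                 # False
print(needs_captcha("See [[Foo]].", "See http://example.com/spam", set()))  # True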
As for a 'real' captcha generator to put into this system: I'm not too
sure what code is already out there that's not awful. There's a Drupal
plugin which would be easy to rip GPL'd PHP code from, but it doesn't
seem very robust.
There's a set of samples of various captcha output and their weaknesses
here: http://sam.zoy.org/pwntcha/
Obviously it would be good to either find something on the 'hard
captchas' list rather than 'defeated captchas', or roll our own that
doesn't suck too bad.
There's also the question of whether we can feasibly provide an audio
alternative or what have you.
-- brion vibber (brion @ pobox.com)
Hi,
I'm the administrator of an Italian Java website, javastaff.com. We are
a group of developers and we would like to create a J2ME client for
Wikipedia, which would be interesting software for mobile phones.
Of course we will do it only to donate good software to the users.
I have already talked with Jimmy Wales and he is enthusiastic about this project.
We have already studied the structure of Wikipedia and we have experimented
with parsing the information from a search on your website, but I would
like to know if there is some page we can access to perform a better search
(text only, for example, or a web service, or something else).
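One endpoint that may already fit: Special:Export returns the current wikitext of an article wrapped in a small XML document, which is easier for a constrained client to handle than scraping the rendered HTML. Roughly, in Python for brevity (a J2ME client would do the same over HttpConnection; the article title is an example):

# Fetch an article's wikitext via Special:Export.
from urllib.parse import quote
from urllib.request import urlopen
from xml.etree import ElementTree

def fetch_wikitext(title, lang="it"):
    url = "https://%s.wikipedia.org/wiki/Special:Export/%s" % (lang, quote(title))
    tree = ElementTree.parse(urlopen(url))
    # Locate the <text> element without pinning the export schema version
    for node in tree.iter():
        if node.tag.endswith("}text") or node.tag == "text":
            return node.text
    return None

text = fetch_wikitext("Roma")
print((text or "")[:200])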
I hope you like this idea :)
Best regards
---------------------------------------------------
Federico Paparoni
JavaStaff.com Admin
In girum imus nocte et consumimur igni
---------------------------------------------------
(was "Problem importing using importDump.php")
Having given up on importDump.php, I'm now trying to
import all Wikipedia articles using mwdumper.jar
The command I typed into the terminal was:
/System/Library/Frameworks/JavaVM.framework/Versions/1.5/Commands/java
-jar /Users/xed/Desktop/mwdumper.jar --format=sql:1.5
/Users/xed/Desktop/20051127_pages_articles.xml.bz2 |
/usr/local/mysql-standard-5.0.16-osx10.4-powerpc/bin/mysql
-u wikiuser -p wikidb
After entering the database password it came up with
this error:
ERROR 1146 (42S02) at line 31: Table 'wikidb.text'
doesn't exist
...and then immediately started doing this:
1,000 pages (49.717/sec), 1,000 revs (49.717/sec)
2,000 pages (75.106/sec), 2,000 revs (75.106/sec)
3,000 pages (92.308/sec), 3,000 revs (92.308/sec)
4,000 pages (97.611/sec), 4,000 revs (97.611/sec)
..etc..
now it's up to
408,000 pages (306.929/sec), 408,000 revs
(306.929/sec)
..which is nice.
But what is it doing? Is it actually going into the
Wiki I set up? Looking at the "All pages" Special page
in my Wiki I just see the couple of pages that I had
already made. Are there any other steps I have to take
once mwdumper has done its job?
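The progress lines above come from mwdumper itself as it reads the XML; they only show what mwdumper has processed, not what MySQL on the other end of the pipe has accepted, and the missing wikidb.text table suggests those INSERTs are failing. A direct way to check what actually landed, assuming the PyMySQL driver is installed and using the credentials from the command above (password is a placeholder):

# Count rows in MediaWiki's core tables to see whether the dump
# really reached the database.
import pymysql

conn = pymysql.connect(host="localhost", user="wikiuser",
                       password="PASSWORD_HERE", db="wikidb")
cur = conn.cursor()
for table in ("page", "revision", "text"):
    try:
        cur.execute("SELECT COUNT(*) FROM " + table)
        print(table, cur.fetchone()[0])
    except pymysql.MySQLError as err:
        print(table, err)   # e.g. the table is missing entirely
conn.close()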
Thanks
X
Would it be a good idea to tell our robots.txt not to index talk pages?
- d.
jkelly(a)fas.harvard.edu wrote:
> I wasn't aware that Google results were influenced by material on Talk pages.
> If this is true, it explains instances in which I have seen anons post some
> ideological screed in the article, have it removed, and then re-post it
> repeatedly into the article's Talk page. Is this actually that effective a
> tactic for using Wikipedia as a soapbox?
>
> Jason
>
>
>
> Quoting slimvirgin(a)gmail.com:
>
>
>>On 11/30/05, Fred Bauder <fredbaud(a)ctelco.net> wrote:
>>
>>>This case Slim Virgin mentions is in arbitration now and a blatant
>>>example of gaming Google by associating the name of the person with a
>>>lot of accusations he has only a marginal connection with ...
>>>At a minimum we need to not allow Google to index our talk pages. We
>>>talk about a lot of things. They may be about information but they
>>>are not encyclopedic.
>>
>>Fred, the case I was referring to isn't the one that's in arbitration,
>>though I know the one you mean, and it's quite similar. I'm starting
>>to wonder whether this is happening a lot: that troublemakers see our
>>talk pages as a sort of Trojan horse. They pretend to be having an
>>innocent conversation designed to sort out the good from the bad
>>material, whereas in fact the discussion is only a vehicle being used
>>to spread the bad stuff, which they know won't survive in our
>>articles.
>>
>>Sarah
Doing a CheckUser, I saw someone showing up as coming from 207.142.131.239.
That's bayle.wikimedia.org ... how does that show up as the IP an
edit's coming from?
- d.
Hello,
I had been using an extension with MediaWiki 1.4.6 but after installing
it in MediaWiki 1.5, I get the following error when using the extension:
Fatal error: Call to undefined function: fetchobject() in c:\program
files\easyphp1-7\www\latest\extensions\DocumentExport.php on line 460
It appears to be having trouble with the function fetchobject() in
Database.php. But I looked at that function in 1.4.6 and 1.5 and they
both look identical. I was wondering if anyone knows what has changed in
the way that function (fetchobject()) is used that might cause this
error. Any help would be much appreciated.
Thanks,
Andrew.
Let's say a Wikicities project has a description of Rome in a certain
language - is there a possibility to insert an interwiki link to that
language version within Wikipedia?
Thanks for any hint!
Ciao, Sabine
Hi all. Newbie here.
I've set up MediaWiki on OS X using the arcane
instructions here:
http://meta.wikimedia.org/wiki/Help:Running_MediaWiki_on_Mac_OS_X
The Wiki works fine. Tested it by creating pages and
uploading images etc.
Then I wanted to import articles from Wikipedia. So I
downloaded the latest "articles only" dump
(20051127_pages_articles.xml.bz2), and then performed
a bzip2 -d on it. Then I renamed it as "articles.xml"
Next I tried to import it.
I went to my wiki's directory, renamed AdminSettings.sample to
AdminSettings.php, and changed $wgDBadminuser and $wgDBadminpassword
to what I had specified when setting up the wiki (DB username and DB
password on the Site Config screen).
Then, after going to the maintenance directory, I
wrote this in the terminal:
php importDump.php /Users/xed/Desktop/articles.xml
It spits out this error:
<h1><img
src='/~xed/testwiki/skins/common/images/wiki.png'
style='float:left;margin-right:1em' alt=''>Testpedia
has a problem</h1><p><strong>Sorry! This site is
experiencing technical
difficulties.</strong></p><p>Try waiting a few minutes
and reloading.</p><p><small>(Can't contact the
database server: Client does not support
authentication protocol requested by server; consider
upgrading MySQL client (localhost))</small></p>
I thought it might have something to do with the size
of the "articles.xml" file (nearly 4GB), but when I
tested it on a single exported page from Wikipedia, it
does the same thing.
I'm running:
MediaWiki: 1.5.2
PHP: 5.0.4 (apache)
MySQL: 5.0.16-standard
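That particular message is the classic mismatch between MySQL's post-4.1 password hashing and an older client library: PHP's MySQL client is rejected before any data is read, which is why the file size makes no difference. The usual fixes are to upgrade the MySQL client library PHP was built against, or to reset the account password with MySQL's OLD_PASSWORD() so the old client can authenticate. To confirm the server and credentials are otherwise fine, a quick connection test outside PHP (assuming the PyMySQL driver is installed; the password is a placeholder):

# Connection sanity check outside PHP. If this works, the server and
# the wikiuser credentials are fine, and the problem is most likely the
# MySQL client library that PHP is using.
import pymysql

conn = pymysql.connect(host="localhost", user="wikiuser",
                       password="PASSWORD_HERE", db="wikidb")
with conn.cursor() as cur:
    cur.execute("SELECT VERSION()")
    print(cur.fetchone()[0])   # e.g. 5.0.16-standard
conn.close()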
Thanks