Hi everyone,
I'm new to the list and somewhat new to MediaWiki.
I am trying to set up some wrapper code that automates certain MediaWiki
functions, such as page creation or renaming. For example, an
external application would attempt to create a new article in MW. It
might check to see whether the article exists, and create it if it
does not. It might append some content to the article text. It might
move the article, or set up a redirect. At the moment, I'm
accomplishing this by issuing HTTP sub-requests to MediaWiki and
interpreting the response.
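To make that concrete, the existence check currently looks something like
the sketch below (Python; the URL layout and the behaviour of action=raw
for missing pages are just what I observe on my own 1.4.9 install, not
anything documented, so treat it as a rough illustration):
==================================================
import urllib.error
import urllib.parse
import urllib.request

WIKI = 'http://localhost/wiki/index.php'   # placeholder for my local install

def article_exists(title):
    # Fetch the raw wikitext; an empty body or an HTTP error is treated
    # as "no such article". This is still scraping, not a real API.
    url = WIKI + '?' + urllib.parse.urlencode({'title': title, 'action': 'raw'})
    try:
        with urllib.request.urlopen(url) as resp:
            return len(resp.read().strip()) > 0
    except urllib.error.HTTPError:
        return False

if __name__ == '__main__':
    print(article_exists('Main_Page'))
==================================================
Creating or appending is worse: it means fetching the edit form and
posting its fields back, which is exactly the screen-scraping I would
like to avoid.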
MediaWiki does not seem to have a public-facing API, is that right?
Are there any F/OSS projects that provide such functionality without
requiring screen-scraping? Alternatively, is there any way to
interface with the internals of MediaWiki, *without* invoking an entire
request, i.e. the stuff that requires the MEDIAWIKI constant? Failing
*that*, does documentation exist that explains MW's DB schema so I
can begin to write my own?
I'm using the current-stable version, 1.4.9, if that matters.
-mike.
----------------------------------------------------------------
michal migurski- mike(a)stamen.com
415.558.1610
Hi everybody!
I'm a newbie on this list, so apologies in advance for anything odd from
me :-))
I think my problem is important for people who, like me, have a local
xAMP+M environment (for me x => W :-)) ) and want to load wiki dumps
from time to time. Everything was fine up to June 23rd, when the last
.sql dumps of "my" Polish wiki were published. I could load the cur
table into the database in 10 minutes (300 pages per second). Now I
find only an .xml file. First, I completely do not understand this
change. For production needs, for example to restore a database, .xml
files are no match for .sql files, and the same goes for people like
me. Nowhere was there any help on how to use the new solution. When I
finally found help here (in a post from Brion), it did not succeed:
gzip -dc pages_current.xml.gz | php importDump.php
stops after loading approximately 15,000 pages (out of about 190,000)
while executing row 47 of the importDump script (1.5rc2).
Then I found bug 2979 about row 47, which is still open (for 1.5rc4
too), and bug 3182, also still open.
Next I tried Kate's importDump.phps; things went fine and made progress
for a long time, but unfortunately at around page 142,000 PHP was
suddenly terminated with no message except one from Windows. While PHP
was running I watched the constant increase in memory consumption
reported in bug 3182, though not so drastic that it killed PHP by
itself. And one more very important point: the flow of data through
gzip & php is drastically low, approximately 5 pages/s vs. 300 pages/s
when I import the .sql dump. I suspect the export to .xml is just as
slow, which is not acceptable for production needs. Are the wiki
databases really dumped to .xml, and restored from .xml when needed?
So, is there any chance that people can again get .sql dumps from
download.wikimedia.org? The XML dumps are completely useless for them.
Janusz 'Ency' Dorozynski
Hello everyone,
What is the best way to configure search under WikiMedia?
Google or the built-in search?
Something else?
Thanks in advance,
Philippe Roth
A combination of a couple of configuration errors during restoration of
one of our database servers may have corrupted data on the French,
Dutch, and Japanese Wikipedias, sending updates to the wrong server
during at least some portion of the last 24 hours.
I've temporarily locked these three wikis while we work it out.
-- brion vibber (brion @ pobox.com)
Sir, I want a few CDs on physiotherapy. If you can provide them, please
reply soon. From Raman
--- wikitech-l-bounces(a)wikimedia.org <wikitech-l-request(a)wikimedia.org> wrote:
> Message: 1
> Date: Wed, 14 Sep 2005 22:00:46 +0100
> From: Rowan Collins <rowan.collins(a)gmail.com>
> Subject: Re: [Wikitech-l] Re: Allow one-word editing
> To: Wikimedia developers <wikitech-l(a)wikimedia.org>
> Message-ID: <9f02ca4c0509141400766d4103(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On 14/09/05, Timwi <timwi(a)gmx.net> wrote:
> > Edward Z. Yang wrote:
> > > The idea is this: with a combination of Javascript and server side
> > > tools, it should be feasible to double-click any word on a page and
> > > then edit that single word. This is especially useful for correcting
> > > spelling errors that you would be too lazy to wade through an entire
> > > section to get to.
> >
> > I'd prefer if it was a sentence rather than a single word. Sometimes a
> > grammar fix requires changes to multiple words within a sentence.
>
> Well, here's a really neat idea: how about we let people edit section
> by section - that way people can do things like merging sentences too,
> and we don't need to worry about defining a sentence. Oh, wait... ;)
>
> On a more serious note, I can see that this might be useful for some
> power users, but it would definitely remain off by default to avoid
> confusion, and only be turned on by a handful of people, so may not be
> worth the development and maintenance time if it required changes to
> the core code.
>
> --
> Rowan Collins BSc
> [IMSoP]
Brion Vibber wrote:
> Dorożyński Janusz wrote:
> > So, is any chance that people can take from
> > download.wikimedia.org .sql dumps? Xml dumps are completely
> > useless for them.
>
> No you can't get SQL dumps. :)
> * The schema and compression formats keep changing, which breaks
> things for people trying to get at the data.
> * There is no longer any equivalent to the "cur table" for
> current-revision-only SQL dumps.
> [...snip...]
> If you like you can use the mwdumper tool to convert the XML dumps to
> local-import-friendly SQL instead of using importDump.php (which as you
> note needs a bug fix).
Can I please make a suggestion? Could the XML format be run through
mwdumper (or equivalent), and the resulting SQL _of that process_ be
compressed and uploaded to the database dump site? That way everything
can change from the MediaWiki perspective, and it won't make any
difference to whether or not the SQL dumps can be created (as long as
the XML dumps can be created, the SQL ones can too). Please spare a
thought for those of us who don't care for the XML religion, and who
simply want to get the data into a database.
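For reference, my understanding is that the conversion everyone
currently has to run locally looks roughly like this (going from
mwdumper's usage notes as I understand them; the exact flags, and the
wikiuser/wikidb names, are placeholders that may differ on your setup):

java -jar mwdumper.jar --format=sql:1.5 pages_current.xml.gz | mysql -u wikiuser -p wikidb

The suggestion is simply to run that step once, server-side, and publish
its output alongside the XML.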
Also, can we please have back the "is_redirect" field in the XML (and
XML->SQL) output, which used to be in the cur SQL dump? (Yes, I know I
can generate it myself, but it is useful data, and may well be useful
to many people; making each and every one of those users independently
generate this info seems counterproductive.)
A diff of a page's XML might look like this:
==================================================
<page>
<title>AccessibleComputing</title>
<id>10</id>
<revision>
<id>15898945</id>
<timestamp>2003-04-25T22:18:38Z</timestamp>
<contributor>
<username>Ams80</username>
<id>7543</id>
</contributor>
<minor />
+ <redirect />
<comment>Fixing redirect</comment>
<text xml:space="preserve">#REDIRECT [[Accessible_computing]]</text>
</revision>
</page>
==================================================
Jakob Voss wrote:
> When I tried to parse the current German XML dump I discovered the
> following malformed sequence (in [[de:India]]):
>
> [[got:��...
I got similar errors on EN running "xmllint
20050909_pages_current.xml" on Debian Linux. xmllint seems to be a
quick way to test the validity of the XML dump.
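For a dump that size it may also be worth trying xmllint's streaming
mode, so it does not build the whole document tree in memory; something
like the following (untested on the full EN dump, so take it as a
suggestion only):

xmllint --stream --noout 20050909_pages_current.xml

should still report well-formedness errors while keeping memory usage
roughly flat.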
All the best,
Nick. (aka EN user:Nickj).
Hi,
I just wondered, since I failed to download the Wikipedia image dump
over an HTTP connection: do you provide an FTP server and port so we
can download over FTP instead? Or is there any other method of
downloading the file besides HTTP? Thank you in advance for your
assistance.
Regards,
Darwin Sadeli
Hi
It was decided a lonnnng time ago that we would
1) write a privacy policy
2) translate it
3) link it at the bottom of every page
I think it is best that the privacy policy page is
hosted only on the WMF site to avoid alterations, which
would be problematic. So, could it be possible to add
this link now to the foundation website, where we will
link to as many translations as possible?
http://wikimediafoundation.org/wiki/Privacy_policy
Thanks
Ant
If there is a privacy policy, but no one knows about
it, it makes no sense...
Hi,
When I tried to parse the current German XML dump I discovered the
following malformed sequence (in [[de:India]]):
[[got:��...
It looks like someone tried to encode a Unicode surrogate pair with
XML character references. Maybe MediaWiki does not recognize &#xD800;
as an invalid Unicode character and transformed it into this form.
I have not tried sending invalid Unicode characters through an edit
form to reproduce the error.
Anyway, the dump is broken. It's not well-formed XML (so it's no XML
at all, just "looks-like-XML") and every correct XML parser will fail
to parse it.
According to the XML specification (1.0), chapter 2.2, the legal
characters in XML are any Unicode characters, excluding the surrogate
blocks, FFFE, and FFFF:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]
Using any of the following Unicode characters will make SpecialExport
and the XML dump fail:
#x0-#x8, #xB-#xC, #x0E-#x1F, #xD800-#xDFFF, #xFFFE-#xFFFF, #x110000-...
Additionally you can use hexadecimal and decimal character
references - I don't know how the wrong characters were
encoded in the SQL database.
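If someone just needs to push a broken dump through a parser in the
meantime, a quick-and-dirty filter along these lines should work (a
sketch only, not tested against a full dump; it strips raw control
characters and drops numeric character references pointing at illegal
code points):
==================================================
import io
import re
import sys

def legal(cp):
    # XML 1.0 "Char" production:
    # #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    return cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF \
        or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF

# numeric character references, decimal or hexadecimal
REF = re.compile(r'&#(x[0-9A-Fa-f]+|[0-9]+);')
# raw characters that are never legal in XML 1.0
RAW = re.compile('[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]')

def fix_ref(match):
    val = match.group(1)
    cp = int(val[1:], 16) if val.startswith('x') else int(val)
    return match.group(0) if legal(cp) else ''

# read permissively so undecodable raw bytes do not abort the run
stream = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8', errors='replace')
for line in stream:
    sys.stdout.write(RAW.sub('', REF.sub(fix_ref, line)))
==================================================
Used for example as
gzip -dc pages_current.xml.gz | python strip_illegal.py | xmllint --noout -
where strip_illegal.py is whatever you save the sketch as.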
Greetings,
Jakob
BTW: I doubt that anyone has ever tried to validate the huge XML dump as
a whole - as far as I know validating XML streams (given an XML schema)
is still a research topic. It's not the only part where MediaWiki
touches the research border of current computer science :-)
I have uploaded a new version of my MediaWiki calendar
extension. I implemented some things people requested:
http://meta.wikimedia.org/wiki/User:Cdamian/calendar
I fixed some bugs with the "format" option. There are new options to
set the date of the calendar and new views to show one or more
days. Some examples:
== work next week ==
<calendar>
date="next monday"
view=days
days=5
</calendar>
== christmas this year ==
<calendar>
day=23
month=12
view=days
days=4
</calendar>
Some options are redundant now. I might remove them at some point.
Some stuff that is still missing:
- doesn't use the user's timezone
- some strings and date formats are hardcoded at the moment
I hope you find it useful anyway,
christof
--
Christof Damian
christof(a)damian.net