Hi everyone,
I'm new to the list and somewhat new to MediaWiki.
I am trying to set up some wrapper code that automates certain MediaWiki
functions, such as page creation or renaming. For example, an
external application would attempt to create a new article in MW. It
might check to see whether the article exists, and create it if it
does not. It might append some content to the article text. It might
move the article, or set up a redirect. At the moment, I'm
accomplishing this by issuing HTTP sub-requests to MediaWiki and
interpreting the response.
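To make that concrete, the existence check currently looks something like
the sketch below (Python; the URL layout and the behaviour of action=raw
for missing pages are just what I observe on my own 1.4.9 install, not
anything documented, so treat it as a rough illustration):
==================================================
import urllib.error
import urllib.parse
import urllib.request

WIKI = 'http://localhost/wiki/index.php'   # placeholder for my local install

def article_exists(title):
    # Fetch the raw wikitext; an empty body or an HTTP error is treated
    # as "no such article". This is still scraping, not a real API.
    url = WIKI + '?' + urllib.parse.urlencode({'title': title, 'action': 'raw'})
    try:
        with urllib.request.urlopen(url) as resp:
            return len(resp.read().strip()) > 0
    except urllib.error.HTTPError:
        return False

if __name__ == '__main__':
    print(article_exists('Main_Page'))
==================================================
Creating or appending is worse: it means fetching the edit form and
posting its fields back, which is exactly the screen-scraping I would
like to avoid.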
MediaWiki does not seem to have a public-facing API, is that right?
Are there any F/OSS projects that provide such functionality without
requiring screen-scraping? Alternatively, is there any way to
interface with the internals of MediaWiki, *without* invoking an entire
request, i.e. the stuff that requires the MEDIAWIKI constant? Failing
*that*, does documentation exist that explains MW's DB schema so I
can begin to write my own?
I'm using the current-stable version, 1.4.9, if that matters.
-mike.
----------------------------------------------------------------
michal migurski- mike(a)stamen.com
415.558.1610
Hi everybody!
I'm a newbie on this list, so apologies in advance for anything odd from
me :-))
I think my problem is important for people who, like me, have a local
xAMP+M environment (for me x => W :-)) ) and want to load wiki dumps
from time to time. Everything was fine up to June 23rd, when the last
.sql dumps of "my" Polish wiki were published. I could load the cur
table into the database in 10 minutes (300 pages per second). Now I
find only an .xml file. First, I completely do not understand this
change. For production needs, for example to restore a database, .xml
files are no match for .sql files, and the same goes for people like
me. Nowhere was there any help on how to use the new solution. When I
finally found help here (in a post from Brion), it did not succeed:
gzip -dc pages_current.xml.gz | php importDump.php
stops after loading approximately 15,000 pages (out of about 190,000)
while executing row 47 of the importDump script (1.5rc2).
Then I found bug 2979 about row 47, which is still open (for 1.5rc4
too), and bug 3182, also still open.
Next I tried Kate's importDump.phps; things went fine and made progress
for a long time, but unfortunately at around page 142,000 PHP was
suddenly terminated with no message except one from Windows. While PHP
was running I watched the constant increase in memory consumption
reported in bug 3182, though not so drastic that it killed PHP by
itself. And one more very important point: the flow of data through
gzip & php is drastically low, approximately 5 pages/s vs. 300 pages/s
when I import the .sql dump. I suspect the export to .xml is just as
slow, which is not acceptable for production needs. Are the wiki
databases really dumped to .xml, and restored from .xml when needed?
So, is there any chance that people can again get .sql dumps from
download.wikimedia.org? The XML dumps are completely useless for them.
Janusz 'Ency' Dorozynski
Hello everyone,
What is the best way to configure search under WikiMedia?
Google or the built-in search?
Something else?
Thanks in advance,
Philippe Roth
A combination of a couple of configuration errors during restoration of
one of our database servers may have corrupted data on the French,
Dutch, and Japanese Wikipedias, sending updates to the wrong server
during at least some portion of the last 24 hours.
I've temporarily locked these three wikis while we work it out.
-- brion vibber (brion @ pobox.com)
Sir, I want a few CDs on physiotherapy. If you can provide them, please
reply soon. From Raman
--- wikitech-l-bounces(a)wikimedia.org <wikitech-l-request(a)wikimedia.org> wrote:
> Message: 1
> Date: Wed, 14 Sep 2005 22:00:46 +0100
> From: Rowan Collins <rowan.collins(a)gmail.com>
> Subject: Re: [Wikitech-l] Re: Allow one-word editing
> To: Wikimedia developers <wikitech-l(a)wikimedia.org>
> Message-ID: <9f02ca4c0509141400766d4103(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On 14/09/05, Timwi <timwi(a)gmx.net> wrote:
> > Edward Z. Yang wrote:
> > > The idea is this: with a combination of Javascript and server side
> > > tools, it should be feasible to double-click any word on a page and
> > > then edit that single word. This is especially useful for correcting
> > > spelling errors that you would be too lazy to wade through an entire
> > > section to get to.
> >
> > I'd prefer if it was a sentence rather than a single word. Sometimes a
> > grammar fix requires changes to multiple words within a sentence.
>
> Well, here's a really neat idea: how about we let people edit section
> by section - that way people can do things like merging sentences too,
> and we don't need to worry about defining a sentence. Oh, wait... ;)
>
> On a more serious note, I can see that this might be useful for some
> power users, but it would definitely remain off by default to avoid
> confusion, and only be turned on by a handful of people, so may not be
> worth the development and maintenance time if it required changes to
> the core code.
>
> --
> Rowan Collins BSc
> [IMSoP]
Brion Vibber wrote:
> Dorożyński Janusz wrote:
> > So, is any chance that people can take from
> > download.wikimedia.org .sql dumps? Xml dumps are completely
> > useless for them.
>
> No you can't get SQL dumps. :)
> * The schema and compression formats keep changing, which breaks
> things for people trying to get at the data.
> * There is no longer any equivalent to the "cur table" for
> current-revision-only SQL dumps.
> [...snip...]
> If you like you can use the mwdumper tool to convert the XML dumps to
> local-import-friendly SQL instead of using importDump.php (which as you
> note needs a bug fix).
Can I please make a suggestion? Could the XML format be run through
mwdumper (or equivalent), and the resulting SQL _of that process_ be
compressed and uploaded to the database dump site? That way everything
can change from the MediaWiki perspective, and it won't make any
difference to whether or not the SQL dumps can be created (as long as
the XML dumps can be created, the SQL ones can too). Please spare a
thought for those of us who don't care for the XML religion, and who
simply want to get the data into a database.
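For reference, my understanding is that the conversion everyone
currently has to run locally looks roughly like this (going from
mwdumper's usage notes as I understand them; the exact flags, and the
wikiuser/wikidb names, are placeholders that may differ on your setup):

java -jar mwdumper.jar --format=sql:1.5 pages_current.xml.gz | mysql -u wikiuser -p wikidb

The suggestion is simply to run that step once, server-side, and publish
its output alongside the XML.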
Also, can we please have back the "is_redirect" field in the XML (and
XML->SQL) output, which used to be in the cur SQL dump? (Yes, I know I
can generate it myself, but it is useful data, and may well be useful
to many people; making each and every one of those users independently
generate this info seems counterproductive.)
A diff of a page's XML might look like this:
==================================================
<page>
<title>AccessibleComputing</title>
<id>10</id>
<revision>
<id>15898945</id>
<timestamp>2003-04-25T22:18:38Z</timestamp>
<contributor>
<username>Ams80</username>
<id>7543</id>
</contributor>
<minor />
+ <redirect />
<comment>Fixing redirect</comment>
<text xml:space="preserve">#REDIRECT [[Accessible_computing]]</text>
</revision>
</page>
==================================================
Jakob Voss wrote:
> When I tried to parse the current German XML dump I discovered the
> following malformed sequence (in [[de:India]]):
>
> [[got:��...
I got similar errors on EN running "xmllint
20050909_pages_current.xml" on Debian Linux. xmllint seems to be a
quick way to test the validity of the XML dump.
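For a dump that size it may also be worth trying xmllint's streaming
mode, so it does not build the whole document tree in memory; something
like the following (untested on the full EN dump, so take it as a
suggestion only):

xmllint --stream --noout 20050909_pages_current.xml

should still report well-formedness errors while keeping memory usage
roughly flat.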
All the best,
Nick. (aka EN user:Nickj).
Hi,
I just wondered, since I failed to download the Wikipedia image dump
over an HTTP connection: do you provide an FTP server and port so we
can download over FTP instead? Or is there any other method of
downloading the file besides HTTP? Thank you in advance for your
assistance.
Regards,
Darwin Sadeli
Hi
It was decided a lonnnng time ago that we would
1) write a privacy policy
2) translate it
3) link it at the bottom of every page
I think it is best that the privacy policy page is
hosted only on the WMF site to avoid alterations, which
would be problematic. So, could it be possible to add
this link now to the foundation website, where we will
link to as many translations as possible?
http://wikimediafoundation.org/wiki/Privacy_policy
Thanks
Ant
If there is a privacy policy, but no one knows about
it, it makes no sense...
Hi,
When I tried to parse the current German XML dump I discovered the
following malformed sequence (in [[de:India]]):
[[got:��...
It looks like someone tried to encode a Unicode surrogate pair with
XML character references. Maybe MediaWiki does not recognize &#xD800;
as an invalid Unicode character and transformed it into this form.
I have not tried sending invalid Unicode characters through an edit
form to reproduce the error.
Anyway, the dump is broken. It's not well-formed XML (so it's no XML
at all, just "looks-like-XML") and every correct XML parser will fail
to parse it.
According to the XML specification (1.0), chapter 2.2, the legal
characters in XML are any Unicode characters, excluding the surrogate
blocks, FFFE, and FFFF:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]
Using any of the following Unicode characters will make SpecialExport
and the XML dump fail:
#x0-#x8, #xB-#xC, #x0E-#x1F, #xD800-#xDFFF, #xFFFE-#xFFFF, #x110000-...
Additionally you can use hexadecimal and decimal character
references - I don't know how the wrong characters were
encoded in the SQL database.
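If someone just needs to push a broken dump through a parser in the
meantime, a quick-and-dirty filter along these lines should work (a
sketch only, not tested against a full dump; it strips raw control
characters and drops numeric character references pointing at illegal
code points):
==================================================
import io
import re
import sys

def legal(cp):
    # XML 1.0 "Char" production:
    # #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    return cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF \
        or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF

# numeric character references, decimal or hexadecimal
REF = re.compile(r'&#(x[0-9A-Fa-f]+|[0-9]+);')
# raw characters that are never legal in XML 1.0
RAW = re.compile('[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]')

def fix_ref(match):
    val = match.group(1)
    cp = int(val[1:], 16) if val.startswith('x') else int(val)
    return match.group(0) if legal(cp) else ''

# read permissively so undecodable raw bytes do not abort the run
stream = io.TextIOWrapper(sys.stdin.buffer, encoding='utf-8', errors='replace')
for line in stream:
    sys.stdout.write(RAW.sub('', REF.sub(fix_ref, line)))
==================================================
Used for example as
gzip -dc pages_current.xml.gz | python strip_illegal.py | xmllint --noout -
where strip_illegal.py is whatever you save the sketch as.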
Greetings,
Jakob
BTW: I doubt that anyone has ever tried to validate the huge XML dump as
a whole - as far as I know validating XML streams (given an XML schema)
is still a research topic. It's not the only part where MediaWiki
touches the research border of current computer science :-)
I have uploaded a new version of my MediaWiki calendar
extension. I implemented some things people requested:
http://meta.wikimedia.org/wiki/User:Cdamian/calendar
I fixed some bugs with the "format" option. There are new options to
set the date of the calendar and new views to show one or more
days. Some examples:
== work next week ==
<calendar>
date="next monday"
view=days
days=5
</calendar>
== christmas this year ==
<calendar>
day=23
month=12
view=days
days=4
</calendar>
Some options are redundant now. I might remove them at some point.
Some stuff that is still missing:
- doesn't use the user's timezone
- some strings and date formats are hardcoded at the moment
I hope you find it useful anyway,
christof
--
Christof Damian
christof(a)damian.net