Wikitech-l February 2007

wikitech-l@lists.wikimedia.org

114 participants
150 discussions

by Nicolas Ruest

I am currently testing out the possibility of using mediawiki for a historical image collection. The image links and meta data are in mySql database. What is the best way to batch import a collection into mediawiki? Would the MediaWiki Bulk Page Creator be the best way? Thanks, Nick Ruest

17 years, 2 months

Small-scale distributed hosting?

by Steve Bennett

Hi all, After the previous thread, I ask out of curiosity, what if there were just a small number of servers spread around the world, not owned by the WMF. Something like: - Users can access en.wikipedia.org, en.wikipedia-brazil.org, en.wikipedia-japan.org etc interchangeably - Each site is a complete copy of the database, and capable of serving independently - Every change on one database is immediately replicated to the others - To make this worthwhile, all the servers except the original are owned and operated by third parties on a sponsorship arrangement, presumably meaning they get to discreetly stick a logo somewhere Is this sort of thing technically feasible? The clear advantage is faster response time for people near one of the overseas hosts, especially when just browsing. I can see obvious problems with the replication (ie, two competing requests arriving simultaneously), and what happens when a link fails for an extended period of time. Perhaps a simpler model: - Read requests are fulfilled by these distributed servers - All write requests are sent to the one central server which immediately pushes out a "dirty page" notification (but not page content) to the other servers - The distributed servers fetch updated pages when a user requests a "dirty page", or perhaps after some time period, to avoid the whole database becoming too out of date This model would rely on the fact that the vast majority of requests are reads, not writes, and attempts to reduce the impact of a page which is heavily modified by one server, while infrequently requested on another. Thoughts? I'm obviously not a network or database engineer so I'm just wondering if this would be a workable or useful solution. It at least avoids the problems of untrusted servers, unreliable servers, and gives an additional benefit in responsiveness. Steve

17 years, 2 months

MediaWiki automated test run failure 2007-02-19

by brion＠pobox.com

An automated run of parserTests.php showed the following failures: This is MediaWiki version 1.10alpha (r19986). Reading tests from "maintenance/parserTests.txt"... Reading tests from "extensions/Cite/citeParserTests.txt"... Reading tests from "extensions/Poem/poemParserTests.txt"... 18 still FAILING test(s) :( * URL-encoding in URL functions (single parameter) [Has never passed] * URL-encoding in URL functions (multiple parameters) [Has never passed] * TODO: Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed] * TODO: Link containing double-single-quotes '' (bug 4598) [Has never passed] * TODO: message transform: <noinclude> in transcluded template (bug 4926) [Has never passed] * TODO: message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed] * BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed] * TODO: HTML bullet list, unclosed tags (bug 5497) [Has never passed] * TODO: HTML ordered list, unclosed tags (bug 5497) [Has never passed] * TODO: HTML nested bullet list, open tags (bug 5497) [Has never passed] * TODO: HTML nested ordered list, open tags (bug 5497) [Has never passed] * TODO: Inline HTML vs wiki block nesting [Has never passed] * TODO: Mixing markup for italics and bold [Has never passed] * TODO: 5 quotes, code coverage +1 line [Has never passed] * TODO: dt/dd/dl test [Has never passed] * TODO: Images with the "|" character in the comment [Has never passed] * TODO: Parents of subpages, two levels up, without trailing slash or name. [Has never passed] * TODO: Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed] Passed 493 of 511 tests (96.48%)... 18 tests failed!

17 years, 2 months

Re: [Wikitech-l] Distributed content hosting

by Anthony

On 2/18/07, Guillaume Pierre <gpierre(a)cs.vu.nl> wrote: > As Gerard said, the Vrije Universiteit Amsterdam is working on > distributed decentralized hosting of a wikipedia-like site. Our first > results are summarized in an article available here: > http://www.globule.org/publi/DWECWH_webist2007.html > The meat of the idea seems to be to use distributed hash tables to allow the main database to be moved onto multiple mostly-independent computers (i.e. break away from the inefficient MySQL replication/cluster model). This is absolutely something which should be done. Wikipedia's data model screams for the adoption of this solution. I question the benefit of then allowing untrusted third parties to run the servers, though, because at the end of the paper you acknowledge that all the data is going to have to pass back through trusted parties anyway. I'm not convinced that there would be a significant cost savings to the introduction of untrusted third parties in this case. Once you've achieved an approximately linear scaling of the database servers, which the appropriate use of DHTs will do, it seems to me that the costs of downloading the data from untrusted third parties (doubling the bandwidth) and checking the signatures (eating up CPU) is going to be nearly as great as the cost of simply adding another database server. Of course, I see why you're proposing it - allowing untrusted third parties to interact directly with the end-user would require end-users to install some sort of client software if they want to authenticate the content. But I really think that's the way you've gotta go if you're going to achieve a real cost savings (or cost distribution). Let the end-user software check the signatures. Anthony

17 years, 2 months

An RSS extension I'd like to see

by Erik Moeller

Wikipedia has hundreds of wonderful portals on every imaginable topic. These are perhaps one of the most underexposed treasures on the site. It would be lovely if people could subscribe to a portal feed, or individual "boxes" (typical portal arrangement is into rectangles with different content). Example: http://en.wikipedia.org/wiki/Portal:Free_software How could this be achieved? One way would be to support an RSS extension that would operate as follows: 1) You put something like <makefeed> title=Selected article on free software addto=freesoftware.xml addto=freesoftware-sel.xml </makefeed> <feedicon> feed=freesoftware-sel.xml </feedicon> inside a template, or indeed any page. 2) When a user edits a page, the extension checks for the presence of <makefeed> in the wikitext. If it is present, it adds a ( ) Add as new item to RSS feed ( ) Update most recent RSS feed item for this page (x) no change selection to the page, below where the minor edit checkbox is. This selection should only be available to users with a definable permission level (e.g. autoconfirmed). 3) The feeds could be directly updated/written on the disk, in the images/ directory. In any case, the <feedicon> tag would generate a pretty link to a feed with a given name. The feed content would be the action=render output for the page where the <makefeed> instruction is found (ideally sans noinclude). It could also include the edit summary. Given that a feed could be accessed from multiple pages, you could build aggregated feeds (in the above example, freesoftware.xml would be a feed for the whole free software portal) and individual ones (freesoftware-sel.xml would only be the selected article box). You'd have to do some clever scanning of the file on disk to make safe updates, but it shouldn't be too hard. Any conceptual flaws? Any takers? I think this could really make a big difference for content re-use, not just in the context of Wikipedia. But the portals seem like a particularly attractive target application to me. -- Peace & Love, Erik DISCLAIMER: This message does not represent an official position of the Wikimedia Foundation or its Board of Trustees. "An old, rigid civilization is reluctantly dying. Something new, open, free and exciting is waking up." -- Ming the Mechanic

17 years, 2 months

Advice for PagesOnDemand, an extension to create pages when following red links

by Jim Hu

Jim Wilson and I have been working on a new extension, which I call PagesOnDemand that does the following: Hooks at ArticleFromTitle Uses a regex to look for a pattern in the requested title If the title matches the pattern and the page does not already exist, runs a function to create the desired stub content and saves it Note that if this runs when a user clicks on a red link, it acts like a blue link - the user is directed to a view of the brand new page. Currently, if the user searches for a page that matches the regex, the extension populates the page if the user clicks the red link in the search page, or clicks "create this page", but they get the edit page. PagesOnDemand is set up so that other extension writers can add additional regex/content creation combinations. Before I put up a page in mediawiki.org, I'd like to check with the experts here about how I've done a couple of things to let future extensions use PagesOnDemand: 1) The future extensions would need to register to use a "hook" inside the PagesOnDemand extension. So I'd like to reserve that hook name - or whatever the devs want me to name it -within reason ; ) - within mediawiki. I assume that this would have to be approved by Brion et al? (I use "hook" in quotes because I'm not using RunHooks to run the function). 2) To keep the regex and the function together, I've set it up so that the content creator extensions register by pushing a 2-element array onto $wgHooks, where the array is (regex, functionName). For example, to look for page titles that correspond to PubMed IDs (which is why I wrote it), the creator extension uses: $wgHooks['PagesOnDemand'][] = array('/^PMID:\\d+ $/','wfPubMedOnDemand') ; Is this kosher ? Jim W. doesn't like it, but I don't like his alternative, which is to pass the regex matching to the content creator. I don't want to distribute the code on mediawiki.org if it does something evil. Thanks in advance for advice. Jim ===================================== Jim Hu Associate Professor Dept. of Biochemistry and Biophysics 2128 TAMU Texas A&M Univ. College Station, TX 77843-2128 979-862-4054

17 years, 2 months

MediaWiki automated test run failure 2007-02-18

by brion＠pobox.com

An automated run of parserTests.php showed the following failures: This is MediaWiki version 1.10alpha (r19982). Reading tests from "maintenance/parserTests.txt"... Reading tests from "extensions/Cite/citeParserTests.txt"... Reading tests from "extensions/Poem/poemParserTests.txt"... 18 still FAILING test(s) :( * URL-encoding in URL functions (single parameter) [Has never passed] * URL-encoding in URL functions (multiple parameters) [Has never passed] * TODO: Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed] * TODO: Link containing double-single-quotes '' (bug 4598) [Has never passed] * TODO: message transform: <noinclude> in transcluded template (bug 4926) [Has never passed] * TODO: message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed] * BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed] * TODO: HTML bullet list, unclosed tags (bug 5497) [Has never passed] * TODO: HTML ordered list, unclosed tags (bug 5497) [Has never passed] * TODO: HTML nested bullet list, open tags (bug 5497) [Has never passed] * TODO: HTML nested ordered list, open tags (bug 5497) [Has never passed] * TODO: Inline HTML vs wiki block nesting [Has never passed] * TODO: Mixing markup for italics and bold [Has never passed] * TODO: 5 quotes, code coverage +1 line [Has never passed] * TODO: dt/dd/dl test [Has never passed] * TODO: Images with the "|" character in the comment [Has never passed] * TODO: Parents of subpages, two levels up, without trailing slash or name. [Has never passed] * TODO: Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed] Passed 493 of 511 tests (96.48%)... 18 tests failed!

17 years, 3 months

Feb 21+: Nonprofit Software Development Summit

by Erik Moeller

http://www.aspirationtech.org/events/devsummit This should be of interest to anyone involved in both Wikimedia Foundation issues and open source technology. Sorry for the late notice, I just found out about it. Aspiration has a couple of other cool projects, such as their index of nonprofit tools (some of them open source, some of them not): http://www.socialsourcecommons.org/ As well as Penguin Days: http://www.penguinday.org/ -- Peace & Love, Erik DISCLAIMER: This message does not represent an official position of the Wikimedia Foundation or its Board of Trustees. "An old, rigid civilization is reluctantly dying. Something new, open, free and exciting is waking up." -- Ming the Mechanic

17 years, 3 months

Applying for a toolserver account

by Neil Harris

I'd like to apply for a toolserver account, in order to run an article rating experiment with similar aims to the IWLC. Who do I apply to? -- Neil

17 years, 3 months

MediaWiki automated test run failure 2007-02-17

by brion＠pobox.com

An automated run of parserTests.php showed the following failures: This is MediaWiki version 1.10alpha (r19975). Reading tests from "maintenance/parserTests.txt"... Reading tests from "extensions/Cite/citeParserTests.txt"... Reading tests from "extensions/Poem/poemParserTests.txt"... 18 still FAILING test(s) :( * URL-encoding in URL functions (single parameter) [Has never passed] * URL-encoding in URL functions (multiple parameters) [Has never passed] * TODO: Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html) [Has never passed] * TODO: Link containing double-single-quotes '' (bug 4598) [Has never passed] * TODO: message transform: <noinclude> in transcluded template (bug 4926) [Has never passed] * TODO: message transform: <onlyinclude> in transcluded template (bug 4926) [Has never passed] * BUG 1887, part 2: A <math> with a thumbnail- math enabled [Has never passed] * TODO: HTML bullet list, unclosed tags (bug 5497) [Has never passed] * TODO: HTML ordered list, unclosed tags (bug 5497) [Has never passed] * TODO: HTML nested bullet list, open tags (bug 5497) [Has never passed] * TODO: HTML nested ordered list, open tags (bug 5497) [Has never passed] * TODO: Inline HTML vs wiki block nesting [Has never passed] * TODO: Mixing markup for italics and bold [Has never passed] * TODO: 5 quotes, code coverage +1 line [Has never passed] * TODO: dt/dd/dl test [Has never passed] * TODO: Images with the "|" character in the comment [Has never passed] * TODO: Parents of subpages, two levels up, without trailing slash or name. [Has never passed] * TODO: Parents of subpages, two levels up, with lots of extra trailing slashes. [Has never passed] Passed 493 of 511 tests (96.48%)... 18 tests failed!

17 years, 3 months

← Newer
1
...
5
6
7
8
9
10
11
...
15
Older →

Jump to page:

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

Wikitech-l February 2007