I apologize if this is not the right forum to discuss this topic.
I'm planning to hire someone to make the following modifications to the MediaWiki source code (1.5 beta version). I'd be interested in getting feedback from this list about the major change that I want implemented: "structured namespaces." I want to keep the implementation as simple as possible in order to reduce development costs, and I also want to stay in the spirit of where MediaWiki is going long-term, in case you eventually want to fold the code for structured namespaces into the main codebase.
And in case someone from this list is interested in working on the project for me, I've included the other changes I need someone to make as well.
Thanks in advance for looking this over,
-dallan
General requirement: In order to make it easier for me to update to later MediaWiki versions, as few changes as possible should be made to the existing scripts. Instead, the modifications should generally be implemented by adding hooks into the existing functions that call out to new classes and functions.
Major Changes: structured namespaces
You'll notice that several types of Wikipedia pages (e.g., countries and US states) have infoboxes to display particular data elements. I want to carry this one step further by assigning a specific structure to certain namespaces. All pages in a "structured namespace" would contain data elements that conform to the specific schema/DTD that has been assigned to that namespace. The data would be included in the page content before the regular text of the page. This affects how the page is displayed and how it is edited. Special functions would be associated with the namespace for page display and editing, as described below. The "diff" view to compare two versions of a page would not need a new function. By saving data elements at the top of a page in a pretty-printed XML format, diffs running over the entire page (including both data element and regular text portions) should make reasonable sense to the user.
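To illustrate the "data elements first, regular text after" layout, here is a rough sketch of how a page could be split into its two parts. It is written in Python rather than MediaWiki's PHP purely for illustration, and the function name split_structured_page is hypothetical. It assumes the data elements are stored as a single XML element at the very top of the page, named after the root element for that namespace.

```python
import xml.etree.ElementTree as ET


def split_structured_page(content, root_tag):
    """Split page content into (data element tree or None, regular text).

    Assumes the data elements, if present, form one XML element at the
    top of the page whose tag is root_tag.
    """
    open_tag = "<%s>" % root_tag
    close_tag = "</%s>" % root_tag
    body = content.lstrip()
    if not body.startswith(open_tag):
        return None, content
    pos = 0
    while True:
        idx = body.find(close_tag, pos)
        if idx == -1:
            # No well-formed data block found: treat everything as text.
            return None, content
        end = idx + len(close_tag)
        try:
            # A nested element may share the root's tag name, so only
            # accept a closing tag that yields a well-formed document.
            data = ET.fromstring(body[:end])
        except ET.ParseError:
            pos = end
            continue
        return data, body[end:].lstrip("\n")
```

The display and edit call-outs could then work on the parsed data while the remaining text goes through the existing code paths unchanged.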
Specifically, I would like four enhancements made in this area:
(1) When a page in a structured namespace is displayed, a call-out to a namespace-specific display function is made to display the data elements on the page. The regular text of the page (after the data element portion) is then displayed using the existing functionality.
(2) When a page in a structured namespace is edited, a call-out to a namespace-specific edit function is made to generate HTML form fields for editing the data elements on the page. The regular text of the page is then placed in the normal edit textbox using the existing functionality.
(3) Before previewing or saving an edited page in a structured namespace, a call-out to a namespace-specific generation function is made to generate the XML to insert into the page from the user-entered data. The generation function may also report errors. If it does and the user pressed "save", the "preview" page should be shown instead (i.e., saving should be disallowed while there are errors), and the errors should be listed on the preview page.
(4) When a page in a structured namespace is saved, a call-out to a namespace-specific propagation function is made to propagate data elements to related pages if necessary. This call-out should be handed both the current and the previous version of the data elements for the page. Suppose, for example, that a structured namespace contained pages for places, that one of the data elements on a place page was its enclosing "parent" place, and that another was a list of subordinate "child" places that users could not edit directly; the only way to change this list would be to change the parent place of a child. Then if someone changed the parent place of a child, the child would need to be removed from the list of children in the previous parent place and added to the list of children in the new parent place. The propagation function would therefore read the page for the previous parent place, update its list of child places, then read the page for the new parent place and update its list of child places. The propagation function must run in the same transaction as the page save, so that either both succeed or both fail. It must also be called as part of page deletion.
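To make the parent/child example concrete, here is a minimal sketch of the propagation step, in Python for illustration only (the real code would be PHP inside MediaWiki). The in-memory dict of pages, the MissingPageError class, and the function name are all hypothetical. Working on a copy stands in for the database transaction: nothing is changed unless every update succeeds.

```python
import copy


class MissingPageError(Exception):
    """Raised when a parent page to be updated does not exist."""


def propagate_parent_change(pages, child, old_parent, new_parent):
    """Return an updated copy of pages after child's parent changed.

    pages maps a page title to a dict with a "children" list.
    Pass old_parent=None for a new page and new_parent=None for a
    deleted page. The input is never modified, so a failure part-way
    through leaves the original state intact.
    """
    if old_parent == new_parent:
        return pages
    staged = copy.deepcopy(pages)
    if old_parent is not None:
        if old_parent not in staged:
            raise MissingPageError(old_parent)
        children = staged[old_parent]["children"]
        if child in children:
            children.remove(child)
    if new_parent is not None:
        if new_parent not in staged:
            raise MissingPageError(new_parent)
        children = staged[new_parent]["children"]
        if child not in children:
            children.append(child)
            children.sort()
    return staged
```

Calling it with new_parent=None covers the page-deletion case, since the child only has to be removed from its former parent's list.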
Other changes:
(1) Display, edit, generation, and propagation functions for a "given name" structured namespace. This namespace contains the following data element.
* Related names: one or more names.
When displayed, each related name displays as a link to another "given name" page (the page is possibly non-existent). In editing, the names should be entered as a list, one per line. To validate, ensure that names contain only alphabetic characters, spaces, and single quotes, no wiki/html markup. No propagation is needed.
(2) Display, edit, generation, and propagation functions for a "surname" structured namespace. This namespace contains the same elements and functions as the "given name" namespace.
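As a rough illustration of the validation for the two name namespaces, the sketch below parses the one-name-per-line textbox and collects errors. It is Python for illustration only, and the function name parse_related_names is hypothetical. It assumes "alphabetic" includes non-ASCII letters; restrict the check if only A-Z is wanted.

```python
def parse_related_names(text):
    """Parse a one-name-per-line textbox into (names, errors).

    A name may contain only alphabetic characters, spaces, and single
    quotes, which also rules out wiki and HTML markup.
    """
    names = []
    errors = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        if all(ch.isalpha() or ch in " '" for ch in name):
            names.append(name)
        else:
            errors.append("line %d: invalid name %r" % (lineno, name))
    if not names and not errors:
        errors.append("at least one related name is required")
    return names, errors
```

A non-empty errors list is what would force the preview page to be shown instead of saving.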
(3) Display, edit, generation, and propagation functions for a "place" structured namespace. This namespace contains the following data elements.
* Preferred name
* Parent: link to title of the parent (enclosing) place; e.g., for US States, this is the title of the USA country page. This page must exist.
* Date range: start - end year
* Type of place: country, state, province, county, city, etc.
* Previous parents: list of links to titles of previous parent places, along with a year range (from - to) for which this was the parent. Each of these pages must exist.
* All names: a list of all variant names/common misspellings that this place has been known by over time, and their sources (i.e., the atlas/gazetteer giving this name as the name of the place).
* Latitude and longitude
* Population
* See-also places: a list of links to related places and the reason that each is related to this place. Each of these pages must exist.
* Child/subordinate places: a list of links to all places that list this place as their Parent, shown grouped by Type.
This is the most complex namespace. To be a little more precise, an XML grammar for the data elements (in RELAX NG compact syntax) is:
element place {
  element preferredname { text },
  element parent { text },
  element daterange {
    element from { text },
    element to { text }
  },
  element type { text },
  element previousparents {
    element previousparent {
      element parent { text },
      element from { text }?,
      element to { text }?
    }*
  },
  element allnames {
    element allname {
      element name { text },
      element source { text }?
    }+
  },
  element latitude { text },
  element longitude { text },
  element population { text },
  element see_also_places {
    element see_also_place {
      element place { text },
      element reason { text }?
    }*
  },
  element childplaces {
    element childplace {
      element place { text },
      element type { text }
    }*
  }
}
Display should show these data elements in an infobox on the right-hand side of the screen. Editing previous parents, all names, and see-also places should use textboxes, listing one entry per row, with |'s separating the fields on a row. (I think this makes editing these complex fields relatively easy.) Validation should parse the textbox rows and ensure that numeric fields are numeric. Also, no wiki/html markup should be allowed in any of the data elements. Child places cannot be user-edited. Instead, they are derived from each place's Parent and propagated as described in enhancement (4) of the structured namespaces section above.
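Here is a small sketch of how the |-separated rows might be parsed and validated, using the previous-parents textbox as the example (parent|from|to, with both years optional). It is Python for illustration only. The function names and the crude markup check (rejecting square brackets, braces, and angle brackets) are my assumptions, not a complete definition of "wiki/html markup".

```python
import re

MARKUP_CHARS = "[]{}<>"  # assumed stand-in for "wiki/html markup"


def has_markup(value):
    """Crude check for characters used by wiki or HTML markup."""
    return any(ch in MARKUP_CHARS for ch in value)


def parse_previous_parents(text):
    """Parse rows of parent|from|to into (entries, errors)."""
    entries = []
    errors = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank rows
        fields = [field.strip() for field in line.split("|")]
        if len(fields) > 3:
            errors.append("line %d: too many fields" % lineno)
            continue
        fields += [""] * (3 - len(fields))  # from/to may be omitted
        parent, year_from, year_to = fields
        row_errors = []
        if not parent:
            row_errors.append("line %d: parent is required" % lineno)
        elif has_markup(parent):
            row_errors.append("line %d: markup not allowed" % lineno)
        for label, value in (("from", year_from), ("to", year_to)):
            if value and not re.fullmatch(r"[0-9]+", value):
                row_errors.append(
                    "line %d: %s year must be numeric" % (lineno, label))
        if row_errors:
            errors.extend(row_errors)
        else:
            entries.append((parent, year_from, year_to))
    return entries, errors
```

The all-names and see-also textboxes could reuse the same row-splitting logic with two fields per row. Checking that the linked pages exist would need a lookup against the wiki and is left out here.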
(4) Display, edit, generation, and propagation functions for a "resource" structured namespace. This namespace contains the following data elements.
* Category(s): one or more of: census, birth, marriage, death, obituary, etc.
* Access: one of: web site, web form, microfilm, book, etc.
* Place(s): one or more place pages. The place pages must exist.
* Surname(s): one or more surname pages. The surname pages do not have to exist.
* Year range: from - to: both must be 4-digit years, with from <= to
* Coverage: one of: good, fair, poor, or unknown
* Location:
  - if Access is web site or web form, this must be a URL;
  - otherwise this can be anything.
Display should show these elements in an infobox on the right-hand side of the screen. Edit function creates a multi-select list for category, a drop-down list for access and coverage, textboxes for places and surnames (with one entry per line), text fields for from and to years, and a textbox for location (no wiki/html formatting in any data element). No propagation is necessary.
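The year-range and location rules for the resource namespace could be checked along these lines. This is a Python sketch for illustration only, and the function name validate_resource is hypothetical. Treating only http and https addresses as valid URLs is my assumption.

```python
import re
from urllib.parse import urlparse

WEB_ACCESS = {"web site", "web form"}


def validate_resource(access, year_from, year_to, location):
    """Return a list of error messages for the given field values."""
    errors = []
    years_ok = True
    for label, value in (("from", year_from), ("to", year_to)):
        if not re.fullmatch(r"[0-9]{4}", value):
            errors.append("%s must be a 4-digit year" % label)
            years_ok = False
    if years_ok and int(year_from) > int(year_to):
        errors.append("from year must be <= to year")
    if access in WEB_ACCESS:
        parts = urlparse(location)
        # Assumption: only http(s) URLs with a host name are accepted.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            errors.append("location must be a URL when access is %s" % access)
    return errors
```

For example, validate_resource("book", "1900", "1850", "Shelf 3") reports only the year-range error, because a non-web location can be anything.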
(5) New skin - need a new CSS skin (describe later).
(6) Need a "special page" that displays all pages in the surname namespace. This should use the AllPages special page script, restricted to pages in the surname namespace. The only difference is that I want the URL for this "all surname pages" to have just a single parameter, which is the starting name (from=).
(7) Only registered users should be able to create/edit pages in the 4 namespaces mentioned above.
(8) Only sysops should be able to create/edit pages in other namespaces (outside of the 4 namespaces mentioned above).
(9) User registration should require email confirmation. That is, as part of the registration process, an email should be sent to the user containing a special URL that they need to click on in order to complete their registration.
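The confirmation flow could look roughly like the sketch below. It is Python for illustration only. The class name, the in-memory storage, and the "token" URL parameter are hypothetical, and sending the email and expiring old tokens are left out.

```python
import secrets


class PendingRegistrations:
    """Tracks registrations that still await email confirmation."""

    def __init__(self):
        self._pending = {}  # token -> username

    def start(self, username, base_url):
        """Create a one-time token and return the URL to email."""
        token = secrets.token_urlsafe(32)
        self._pending[token] = username
        return "%s?token=%s" % (base_url, token)

    def confirm(self, token):
        """Return the confirmed username, or None if token is unknown."""
        return self._pending.pop(token, None)
```

Because confirm() removes the token, each emailed URL completes a registration at most once.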
(10) There is an extension to MediaWiki that allows daily digest emails to be sent instead of an email for every page change. You need to install this extension and make sure that users who elect to be notified of changes to pages in their watchlist are sent digests rather than separate emails for each page changed.
(11) Full-text search: Add a "special page" that provides full-text searching of pages by calling a REST-based search interface that I will provide. Essentially, you'll make an HTTP call with a URL that includes the user-entered search string, namespace, start#, and the #results to display. You will be passed back an XML result set containing a total count along with a list of PageURLs and context strings to display.
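To give a feel for the search call, here is a rough Python sketch (illustration only). The interface has not been specified yet, so the query parameter names (q, ns, start, rows) and the XML element and attribute names (results, total, result, pageurl, context) are placeholders I made up; the real ones will come with the REST interface.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_search_url(base_url, query, namespace, start, count):
    """Build the search URL; parameter names are placeholders."""
    params = {"q": query, "ns": namespace, "start": start, "rows": count}
    return base_url + "?" + urlencode(params)


def parse_search_results(xml_text):
    """Return (total, [(page_url, context), ...]) from a result set.

    Assumes a placeholder format like:
      <results total="2"><result><pageurl>..</pageurl>
      <context>..</context></result>...</results>
    """
    root = ET.fromstring(xml_text)
    total = int(root.get("total"))
    hits = [
        (hit.findtext("pageurl"), hit.findtext("context"))
        for hit in root.findall("result")
    ]
    return total, hits
```

The special page would fetch the built URL over HTTP, pass the response body to parse_search_results, and use the total count together with start# to render paging links.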
wikitech-l@lists.wikimedia.org