I apologize if this is not the right forum to discuss this topic.
I'm planning to hire someone to make the following modifications to
the MediaWiki source code (1.5 beta version). I'd be interested in
getting feedback from this list about the major change that I want
implemented: "structured namespaces." I want to keep the
implementation as simple as possible in order to reduce development
costs, and also want to keep things in the spirit of where MediaWiki is
going long-term, in case you wanted to eventually fold the code for
structured namespaces into the main codebase.
And in case someone from this list is interested in working on the
project for me, I've included the other changes I need someone
to make as well.
Thanks in advance for looking this over,
-dallan
General requirement: In order to make it easier for me to update to later
MediaWiki versions in the future, as few changes as possible should be
made to the existing scripts. Instead, the modifications should
be implemented generally by adding hooks into the existing functions
that call out to new classes and functions.
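
To illustrate the kind of hook mechanism I have in mind, here is a rough
sketch. It is written in Python purely for illustration (the real code
would be PHP inside MediaWiki), and the names HANDLERS, register, and
run_hook are hypothetical, not existing MediaWiki functions. The idea is
that an existing function gains a single call to run_hook, and everything
namespace-specific lives in new code.

```python
HANDLERS = {}  # namespace name -> {hook name -> function}


def register(namespace, hook, func):
    """Attach a namespace-specific function to a named hook."""
    HANDLERS.setdefault(namespace, {})[hook] = func


def run_hook(namespace, hook, *args):
    """Call the namespace's handler for `hook`; return None if there is none."""
    func = HANDLERS.get(namespace, {}).get(hook)
    return func(*args) if func else None
```

A namespace with no registered handler simply gets None back, so pages
outside the structured namespaces would keep the existing behavior.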
Major Changes: structured namespaces
You'll notice that several types of Wikipedia pages (e.g., countries
and US states) have infoboxes to display particular data elements. I
want to carry this one step further by assigning a specific structure
to certain namespaces. All pages in a "structured namespace" would
contain data elements that conform to the specific schema/DTD that has
been assigned to that namespace. The data would be included in the
page content before the regular text of the page. This affects how
the page is displayed and how it is edited. Special functions would
be associated with the namespace for page display and editing, as
described below. The "diff" view to compare two versions of a page
would not need a new function. By saving data elements at the top of
a page in a pretty-printed XML format, diffs running over the entire
page (including both data element and regular text portions) should
make reasonable sense to the user.
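
As a rough sketch of this storage layout (Python for illustration only;
the real code would be PHP), the page content could be split into its data
portion and its regular text like this. The function names split_page and
join_page are hypothetical. The sketch assumes that the data block is a
single root element at the very top of the page, that its opening and
closing tags each sit on their own unindented line, and that Python 3.9 or
later is available for ET.indent.

```python
import xml.etree.ElementTree as ET


def split_page(content, root_tag):
    """Split stored page content into (data_element_or_None, regular_text).

    Relies on pretty-printing: nested elements are indented, so the first
    unindented closing tag line ends the data block.
    """
    lines = content.split("\n")
    if lines[0] != "<%s>" % root_tag:
        return None, content
    close_tag = "</%s>" % root_tag
    for i, line in enumerate(lines):
        if line == close_tag:
            data = ET.fromstring("\n".join(lines[: i + 1]))
            return data, "\n".join(lines[i + 1:])
    return None, content


def join_page(data, text):
    """Pretty-print the data element, then append the regular page text."""
    ET.indent(data)  # one element per line, two-space indentation
    return ET.tostring(data, encoding="unicode") + "\n" + text
```

Because each data element ends up on its own line, a line-based diff of
two page versions would show exactly which elements changed.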
Specifically, I would like four enhancements made in this area:
(1) When a page in a structured namespace is displayed, a call-out to
a namespace-specific display function is made to display the data
elements on the page. The regular text of the page (after the data
element portion) is then displayed using the existing functionality.
(2) When a page in a structured namespace is edited, a call-out to a
namespace-specific edit function is made to generate HTML form fields
for editing the data elements on the page. The regular text of the
page is then placed in the normal edit textbox using the existing
functionality.
(3) Before previewing or saving an edited page in a structured
namespace, a call-out to a namespace-specific generation function is
made to generate the XML to insert into the page from the user-entered
data. In addition, the generation function may generate errors, in
which case if the user pressed "save" the "preview" page should be
shown instead (i.e., save should be disallowed if there are errors),
and the errors should be listed on the preview page.
(4) When a page in a structured namespace is saved, a call-out to a
namespace-specific propagation function is made to propagate data
elements to related pages if necessary. This callout should be handed
both the current as well as the previous version of the data elements
for the page. Suppose for example that a structured namespace
contained pages for places, that one of the data elements on a place
page was its enclosing "parent" place, and that another of the data
elements on a place page was a list of subordinate "child" places
which was not user-editable, so that the only way to edit this list of
child places was by changing the parent place of a child. Then if
someone were to change the parent place of a child, it would need to
be removed from the list of children in the previous parent place and
added to the list of children in the new parent place, so the
propagation function would read the page for the previous parent
place, update its list of child places, then read the page for the new
parent place and update its list of child places. The propagation
function must be done in the same transaction as the page save, so
that either both succeed or both fail. It must also be called as
part of page deletion.
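
Enhancements (3) and (4) together could look roughly like the sketch
below (Python for illustration only; the real code would be PHP). All
names here are hypothetical. The pages dict stands in for the database:
it maps a place title to its parent and its list of children. Working on
a deep copy is only a stand-in for a real database transaction, and the
sketch does not handle deleting a place that still has children.

```python
import copy


def propagate_parent(pages, title, old_parent, new_parent):
    """Move `title` between the child lists of its old and new parent."""
    if old_parent in pages and title in pages[old_parent]["children"]:
        pages[old_parent]["children"].remove(title)
    if new_parent is not None:
        if new_parent not in pages:
            raise KeyError("parent place does not exist: %s" % new_parent)
        if title not in pages[new_parent]["children"]:
            pages[new_parent]["children"].append(title)


def handle_submit(pages, title, action, new_parent, errors):
    """Return (view, pages) for a "preview" or "save" request.

    `errors` is the list produced by the generation function. Any error
    turns a save into a preview. A save is applied to a copy, so the caller
    only sees the new state if the page update and the propagation both
    succeeded.
    """
    if errors or action == "preview":
        return "preview", pages
    staged = copy.deepcopy(pages)
    old_parent = staged.get(title, {}).get("parent")
    page = staged.setdefault(title, {"parent": None, "children": []})
    page["parent"] = new_parent
    propagate_parent(staged, title, old_parent, new_parent)
    return "saved", staged


def delete_place(pages, title):
    """Delete a place and remove it from its parent's child list."""
    staged = copy.deepcopy(pages)
    old_parent = staged.pop(title)["parent"]
    propagate_parent(staged, title, old_parent, None)
    return staged
```

If propagation raises an error, the staged copy is discarded and the
original pages are untouched, which is the "both succeed or both fail"
behavior I am after.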
Other changes:
(1) Display, edit, generation, and propagation functions for a "given
name" structured namespace. This namespace contains the following data
element.
* Related names: one or more names.
When displayed, each related name displays as a link to another "given
name" page (the page is possibly non-existent). In editing, the names
should be entered as a list, one per line. To validate, ensure that
names contain only alphabetic characters, spaces, and single quotes,
no wiki/html markup. No propagation is needed.
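
A rough sketch of the validation for this namespace follows (Python for
illustration only; the real code would be PHP). The function name
parse_related_names is hypothetical. I am assuming that "alphabetic"
should include accented and other non-ASCII letters; restricting names to
A-Z would be an easy change.

```python
def parse_related_names(textbox):
    """Parse one name per line; return (names, errors)."""
    names, errors = [], []
    for lineno, line in enumerate(textbox.splitlines(), start=1):
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        # Letters, spaces, and single quotes only, which also rules out markup.
        if all(ch.isalpha() or ch in " '" for ch in name):
            names.append(name)
        else:
            errors.append("line %d: invalid name %r" % (lineno, name))
    if not names and not errors:
        errors.append("at least one related name is required")
    return names, errors
```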
(2) Display, edit, generation, and propagation functions for a
"surname" structured namespace. This namespace contains the same
elements and functions as the "given name" namespace.
(3) Display, edit, generation, and propagation functions for a "place"
structured namespace. This namespace contains the following data
elements.
* Preferred name
* Parent: link to title of the parent (enclosing) place; e.g., for US
States, this is the title of the USA country page. This page must exist.
* Date range: start - end year
* Type of place: country, state, province, county, city, etc.
* Previous parents: list of links to titles of previous parent places, along
with a year range (from - to) for which this was the parent. Each of these
pages must exist.
* All names: a list of all variant names/common misspellings and their
sources (i.e., the atlas/gazetteer giving this name as the name of the
place) that this place has been known by over time.
* Latitude and longitude
* Population
* See-also places: a list of links to related places and the reason that
each is related to this place. Each of these pages must exist.
* Child/subordinate places: A list of links to all places that list this
place as their Parent, shown grouped by Type.
This is the most complex namespace. To be a little more precise, an XML
grammar for the data elements is:
element place {
  element preferredname { text },
  element parent { text },
  element daterange {
    element from { text },
    element to { text }
  },
  element type { text },
  element previousparents {
    element previousparent {
      element parent { text },
      element from { text }?,
      element to { text }?
    }*
  },
  element allnames {
    element allname {
      element name { text },
      element source { text }?
    }+
  },
  element latitude { text },
  element longitude { text },
  element population { text },
  element see_also_places {
    element see_also_place {
      element place { text },
      element reason { text }?
    }*
  },
  element childplaces {
    element childplace {
      element place { text },
      element type { text }
    }*
  }
}
Display should show these data elements in an infobox on the
right-hand side of the screen. Editing previous parents, all names,
and see-also places should use textboxes, listing one entry per row,
with |'s separating the fields on a row. (I think this makes editing
these complex fields relatively easy.) Validation should parse the
textbox rows and ensure that numeric fields are numeric. Also, no
wiki/html markup should be allowed in any of the data elements. Child
places cannot be user-edited. Instead, they are derived from
each place's Parent place, and propagated as described in item (4) of
the major changes above.
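
To make the textbox format concrete, here is a rough sketch of parsing
and validating the rows (Python for illustration only; the real code
would be PHP). The function name parse_rows and the field names are
hypothetical, and the markup check is only a crude approximation that
rejects the characters < > [ ] { }.

```python
import re

MARKUP_CHARS = set("<>[]{}")  # crude stand-in for "no wiki/html markup"


def parse_rows(textbox, fields, numeric=()):
    """Parse 'a|b|c' rows into dicts keyed by `fields`; return (rows, errors).

    Trailing fields may be omitted. Fields named in `numeric` must be
    digits only when present.
    """
    rows, errors = [], []
    for lineno, line in enumerate(textbox.splitlines(), start=1):
        if not line.strip():
            continue
        values = [v.strip() for v in line.split("|")]
        if len(values) > len(fields):
            errors.append("line %d: too many fields" % lineno)
            continue
        values += [""] * (len(fields) - len(values))
        row = dict(zip(fields, values))
        if MARKUP_CHARS & set(line):
            errors.append("line %d: markup is not allowed" % lineno)
        elif any(row[f] and not re.fullmatch(r"[0-9]+", row[f]) for f in numeric):
            errors.append("line %d: year fields must be numeric" % lineno)
        else:
            rows.append(row)
    return rows, errors
```

For previous parents, for example, a row such as "USA|1803|" would give a
parent of "USA", a from year of "1803", and an empty to year.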
(4) Display, edit, generation, and propagation functions for a "resource"
structured namespace. This namespace contains the following data elements.
* Category(s): one or more of: census, birth, marriage, death, obituary,
etc.
* Access: one of: web site, web form, microfilm, book, etc.
* Place(s): one or more place pages. The place pages must exist.
* Surname(s): one or more surname pages. The surname pages do not have to
exist.
* Year range: from - to: both must be 4-digit years, with from <= to
* Coverage: one of: good, fair, poor, or unknown
* Location:
- if Access is web site or web form, this must be a URL;
- otherwise this can be anything.
Display should show these elements in an infobox on the right-hand
side of the screen. Edit function creates a multi-select list for
category, a drop-down list for access and coverage, textboxes for
places and surnames (with one entry per line), text fields for from
and to years, and a textbox for location (no wiki/html formatting
in any data element). No propagation is necessary.
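
The validation rules for this namespace could be sketched as follows
(Python for illustration only; the real code would be PHP). The function
name validate_resource is hypothetical, and treating only http and https
addresses as URLs is my assumption.

```python
import re
from urllib.parse import urlparse

WEB_ACCESS = {"web site", "web form"}
COVERAGE = {"good", "fair", "poor", "unknown"}


def validate_resource(access, year_from, year_to, coverage, location):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    years_ok = True
    for label, value in (("from", year_from), ("to", year_to)):
        if not re.fullmatch(r"[0-9]{4}", value):
            errors.append("%s year must be a 4-digit year" % label)
            years_ok = False
    if years_ok and int(year_from) > int(year_to):
        errors.append("from year must not be later than to year")
    if coverage not in COVERAGE:
        errors.append("coverage must be one of good, fair, poor, unknown")
    if access in WEB_ACCESS:
        parts = urlparse(location)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            errors.append("location must be a URL for web access")
    return errors
```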
(5) New skin: I need a new CSS skin (to be described later).
(6) Need a "special page" that displays all pages in the surname
namespace. This should use the AllPages special page script,
restricted to pages in the surname namespace. The only difference is
that I want the URL for this "all surname pages" to have just a single
parameter, which is the starting name (from=).
(7) Only registered users should be able to create/edit pages in the 4
namespaces mentioned above.
(8) Only sysops should be able to create/edit pages in other namespaces
(outside of the 4 namespaces mentioned above).
(9) User registration should require email confirmation. That is, as
part of the registration process, an email should be sent to the user
containing a special URL that they need to click on in order to complete
their registration.
(10) There is an extension to MediaWiki that allows daily digest emails
to be sent instead of an email for every page change. You need to
install this extension and make sure that users who elect to be notified
of changes to pages in their watchlist are sent digests rather
than separate emails for each page changed.
(11) Full-text search: Add a "special page" that provides full-text
searching of pages by calling a REST-based search interface that I
will provide. Essentially, you'll make an HTTP call with a URL that
includes the user-entered search string, namespace, start number, and
the number of results to display. You will be passed back an XML result set
containing a total count along with a list of PageURLs and context
strings to display.
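
The search call could look roughly like this (Python for illustration
only; the real code would be PHP). I have not yet defined the interface,
so the parameter names (q, ns, start, count) and the XML element names
(results, total, result, url, context) below are placeholders, as are the
function names.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, namespace, start, count):
    """Build the request URL; the parameter names are placeholders."""
    params = {"q": query, "ns": namespace, "start": start, "count": count}
    return base_url + "?" + urlencode(params)


def parse_results(xml_text):
    """Return (total, [(page_url, context), ...]) from the XML result set."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("total"))
    hits = [(r.findtext("url"), r.findtext("context")) for r in root.iter("result")]
    return total, hits


def search(base_url, query, namespace, start, count):
    """Make the HTTP call and parse the response."""
    with urlopen(build_search_url(base_url, query, namespace, start, count)) as resp:
        return parse_results(resp.read())
```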