Hi everyone,
I recently set up a MediaWiki (http://server.bluewatersys.com/w90n740/)
and I need to extract the content from it and convert it into LaTeX
syntax for printed documentation. I have googled for a suitable OSS
solution but nothing was apparent.
I would prefer a script written in Python, but any recommendations
would be very welcome.
Do you know of anything suitable?
Kind Regards,
Hugo Vincent,
Bluewater Systems.
Right now in the skins system (if you consider vector part of the skins
system) we have two parallel methods of adding tabs to the page:
- Into content_actions via SkinTemplateTabs,
SkinTemplateBuildContentActionUrlsAfterSpecialPage, and
SkinTemplateContentActions,
- Into vector's navigation_urls via SkinTemplateNavigation (the missing
two hooks should be added)
The only important difference between these (besides some vector
specific stuff that can stay in vector) is that content_actions is a
flat array, and navigation_urls is an array of arrays organized into
categories. Besides that, they are basically mirrors of each other with
the same functional purpose, but you need to add tabs to both of them to
avoid tabs not showing up in vector. There's also the misfortune that
other skins can't take advantage of the organized navigation_urls
without being a vector subskin, because the actual implementation
(which is basically a reimplementation of buildContentActionUrls, with
code duplication) lives inside of vector.
Right now we have extensions using both methods of adding tabs to the
page, which means code duplication on their part, and a few extensions
that are broken in vector because they haven't added the hooks.
Doing a quick grep, it appears the following extensions are missing
vector support: Oversight, CommentPages, Todo, WikiTrust, Tasks,
CategoryTree, DeleteQueue, Wikidata, Imagetabs, purgetab, Tab0,
AuthorProtect, TidyTab, Purge, SpecialTalk
Shouldn't be too hard to fix, especially if we fix the bug of missing
hooks for navigation_urls.
Now onto my focal point. As I've been improving the skin system trying
to pull out the thorns that make building skins troublesome and mesh in
new features and helpers which are missing, I'd like to remove the
content_actions hooks and deprecate content_actions in 1.18 and start
using navigation_urls style data everywhere.
Since content_actions and navigation_urls carry the same data,
content_actions can be built by having SkinTemplate take the
navigation_urls data and flatten it into a single array, similar to how
I already have $wgFooterIcons work: folding it for skins like Monobook
which don't organize it the way vector does.
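To illustrate, a minimal sketch of that fold (the function name and
array shapes are illustrative, not actual SkinTemplate code):

```php
<?php
// Hypothetical sketch: fold vector-style navigation_urls
// (arrays of tabs grouped into categories such as 'namespaces',
// 'views', 'actions', 'variants') into a flat,
// content_actions-style array, preserving tab order.
function flattenNavigationUrls( array $navigationUrls ) {
	$contentActions = array();
	foreach ( $navigationUrls as $category => $tabs ) {
		foreach ( $tabs as $key => $tab ) {
			$contentActions[$key] = $tab;
		}
	}
	return $contentActions;
}
```

Skins that want the organized form would keep using the categorized
array directly; skins that don't would get the folded result.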
The effects will be like this:
- The three content_actions related hooks will no longer work in 1.18,
thus extensions that haven't started supporting vector tabs will also
stop showing tabs in other skins
- In their place extensions will use 3 navigation_urls related hooks
(most extensions are already using the one hook available)
- Extension code for those already using both forms of hooks will stay
the same, the only difference being that 1.18 will use the
navigation_urls related hooks and the content_actions related ones will
become redundant code which the extensions can keep for back compat but
drop once they stop supporting pre-1.18 installations
- All standard skins will be using navigation_url based data and
content_actions will be available but deprecated
- 3rd party skins will still function using content_actions but it would
be preferred for them to be updated to use the new BaseTemplate and use
the helpers in there (a navigation_urls related one would be added) once
they don't need to support pre-1.18
- SkinTemplatePreventOtherActiveTabs will probably still work, though I
may want to find a cleaner method to transition to (ie: one that says
"this is the active tab" rather than "don't make other tabs active").
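For extension authors, the migration sketched above is mostly
mechanical: move the tab-building code from a content_actions hook
into a navigation_urls one. A rough example, with the tab details made
up (the exact hook signature should be checked against the MediaWiki
version in use):

```php
<?php
// Illustrative only: add a hypothetical "My tab" via the
// navigation_urls-style hook that vector already uses.
$wgHooks['SkinTemplateNavigation'][] = 'efMyExtensionAddTab';

function efMyExtensionAddTab( $skin, &$links ) {
	// 'actions' is one of the tab categories in navigation_urls.
	$links['actions']['mytab'] = array(
		'class' => false,
		'text' => 'My tab',
		'href' => $skin->getTitle()->getLocalURL( 'action=mytab' ),
	);
	return true;
}
```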
Any comments, rejections?
--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
Earlier today, /a filled up with binlogs on db27, which was the s3 & s7
master. Nagios had warned too early and nobody noticed. Slaves lagged,
there were lots of locks, and the wikis ground to a halt.
Revisions between 6:50 and 8:20 pm UTC were lost (although they can be
manually reimported from db27).
The new s3 and s7 master is db17, with only one slave: db25.
After the master switch, we started having problems with cached
revision text in memcached, caused by the duplication of old_id values,
so we made the wikis read-only until UTC midnight.
We decided not to disable $wgRevisionCacheExpiry but to remove the
faulty entries, thus I quickly prepared the script
maintenance/purgeStaleMemcachedText.php to clean them.
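The idea is roughly this (a sketch, not the actual script; the key
scheme is assumed from Revision::loadText, and the ID range is made
up): delete the affected cache keys so the next read falls through to
the database.

```php
<?php
// Sketch of the cleanup idea: delete the memcached entries for a
// range of text IDs that may hold stale revision text, so
// subsequent reads repopulate from the (correct) database.
function purgeStaleText( $memc, $firstId, $lastId ) {
	for ( $id = $firstId; $id <= $lastId; $id++ ) {
		$memc->delete( wfMemcKey( 'revisiontext', 'textid', $id ) );
	}
}
```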
There were problems on hewiki, where the data didn't get cleaned: in
one instance a $wgMemc->get kept returning the value even after a
$wgMemc->delete on that same key (???).
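For reference, the anomaly amounts to a delete-then-get round trip
misbehaving (the key ID here is made up for the example):

```php
<?php
// After a delete, a get should return false. On hewiki the old
// value kept coming back instead; one possible cause would be
// different clients hashing the key to different servers.
$key = wfMemcKey( 'revisiontext', 'textid', 12345 ); // id made up
$wgMemc->delete( $key );
var_dump( $wgMemc->get( $key ) ); // expected: bool(false)
```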
Other than the hewiki issues, it seemed to run fine. There will be lots
of wrong entries in the diff and parser caches, but a manual
action=purge will clean them.
Flagged revs caches were not touched. Wikis using it may show the wrong
content (with the additional fun of some users viewing the right one).
There are also PPFrame_DOM->expand errors that started around the same
time, even on wikis on a different cluster. They usually happen only
once, and reloading the page succeeds.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26429
For folks who have not been following the saga on
http://wikitech.wikimedia.org/view/Dataset1
we were able to get the raid array back in service last night on the XML
data dumps server, and we are now busily copying data off of it to
another host. There's about 11T of dumps to copy over; once that's done
we will start serving these dumps read-only to the public again.
Because the state of the server hardware is still uncertain, we don't
want to do anything that might put the data at risk until that copy has
been made.
The replacement server is on order and we are watching that closely.
We have also been working on deploying a server to run one round of
dumps in the interim.
Thanks for your patience (which is a way of saying, I know you are all
out of patience, as am I, but hang on just a little longer).
Ariel
[crossposted to foundation-l and wikitech-l]
"There has to be a vision though, of something better. Maybe something
that is an actual wiki, quick and easy, rather than the template
coding hell Wikipedia's turned into." - something Fred Bauder just
said on wikien-l.
Our current markup is one of our biggest barriers to participation.
AIUI, edit rates are about half what they were in 2005, even as our
fame has gone from "popular" through "famous" to "part of the
structure of the world." I submit that this is not a good or healthy
thing in any way and needs fixing.
People who can handle wikitext really just do not understand how
off-putting the computer guacamole is to people who can cope with text
they can see.
We know this is a problem; WYSIWYG that works is something that's been
wanted here forever. There are various hideous technical nightmares in
its way, that make this a big and hairy problem, of the sort where the
hair has hair.
However, I submit that it's important enough we need to attack it with
actual resources anyway.
This is just one data point, where a Canadian government office got
*EIGHT TIMES* the participation in their intranet wiki by putting in a
(heavily locally patched) copy of FCKeditor:
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034062.html
"I have to disagree with you given my experience. In one government
department where MediaWiki was installed we saw the active user base
spike from about 1000 users to about 8000 users within a month of having
enabled FCKeditor. FCKeditor definitely has it's warts, but it very
closely matches the experience non-technical people have gotten used to
while using Word or WordPerfect. Leveraging skills people already have
cuts down on training costs and allows them to be productive almost
immediately."
http://lists.wikimedia.org/pipermail/mediawiki-l/2010-May/034071.html
"Since a plethora of intelligent people with no desire to learn WikiCode
can now add content, the quality of posts has been in line with the
adoption of wiki use by these people. Thus one would say it has gone up.
"In the beginning there were some hard core users that learned WikiCode,
for the most part they have indicated that when the WYSIWYG fails, they
are able to switch to WikiCode mode to address the problem. This usually
occurs with complex table nesting which is something that few of the
users do anyways. Most document layouts are kept simple. Additionally,
we have a multilingual english/french wiki. As a result the browser
spell-check is insufficient for the most part (not to mention it has
issues with WikiCode). To address this a second spellcheck button was
added to the interface so that both english and french spellcheck could
be available within the same interface (via aspell backend)."
So, the payoffs could be ridiculously huge: eight times the number of
smart and knowledgeable people even being able to *fix typos* on
material they care about.
Here are some problems. (Off the top of my head; please do add more,
all you can think of.)
- The problem:
* Fidelity with the existing body of wikitext. No conversion flag day.
The current body exploits every possible edge case in the regular
expression guacamole we call a "parser". Tim said a few years ago that
any solution has to account for the existing body of text.
* Two-way fidelity. Those who know wikitext will demand to keep it and
will bitterly resist any attempt to take it away from them.
* FCKeditor (now CKeditor) in MediaWiki is all but unmaintained.
* There is no specification for wikitext. Well, there almost is -
compiled as C, it runs a bit slower than the existing PHP compiler.
But it's a start!
http://lists.wikimedia.org/pipermail/wikitext-l/2010-August/000318.html
- Attempting to solve it:
* The best brains around Wikipedia, MediaWiki and WMF have dashed
their foreheads against this problem for at least the past five years
and have got *nowhere*. Tim has a whole section in the SVN repository
for "new parser attempts". Sheer brilliance isn't going to solve this
one.
* Tim doesn't scale. Most of our other technical people don't scale.
*We have no resources and still run on almost nothing*.
($14m might sound like enough money to run a popular website, but for
comparison, I work as a sysadmin at a tiny, tiny publishing company
with more money and staff just in our department than that to do
*almost nothing* compared to what WMF achieves. WMF is an INCREDIBLY
efficient organisation.)
- Other attempts:
* Starting from a clear field makes it ridiculously easy. The
government example quoted above is one. Wikia wrote a good WYSIWYG
that works really nicely on new wikis (I'm speaking here as an
experienced wikitext user who happily fixes random typos on Wikia). Of
course, I noted that we can't start from a clear field - we have an
existing body of wikitext.
So, specification of the problem:
* We need good WYSIWYG. The government example suggests that a simple
word-processor-like interface would be enough to give tremendous
results.
* It needs two-way fidelity with almost all existing wikitext.
* We can't throw away existing wikitext, much as we'd love to.
* It's going to cost money in programming the WYSIWYG.
* It's going to cost money in rationalising existing wikitext so that
the most unfeasible formations can be shunted off to legacy for
chewing on.
* It's going to cost money in usability testing and so on.
* It's going to cost money for all sorts of things I haven't even
thought of yet.
This is a problem that would pay off hugely to solve, and that will
take actual money thrown at it.
How would you attack this problem, given actual resources for grunt work?
- d.
Okay, I emailed to Anthony how he can upload it.
After he is done, the content will be at: http://dump.huiblaurens.nl
When the data is online I will make sure I have a copy of it somewhere
else on a server so it won't get lost.
Best,
Huib
Hello,
I have been a WP editor since 2006. I hope you can help me. For some reason
I no longer have Section Heading titles showing in the Articles. This is
true of all Headings including the one that carries the Article subject's
name. When there is a Table of Contents, it appears fine and, when I click
on a particular Section, it goes to that Section, but all that is there is a
straight line separating the Sections. There is also no button to edit a
Section. If I edit the page and remove the "== ==" markers from the Section
Titles, the Title then shows up, but not as a Section Heading. Also, I don't
have any Date separators on my Want List. This started 2 days ago. Any
thoughts?
Thanks,
Marc Riddell
[[User:Michael David]]
Just added 3 new committers. Let's give them a warm welcome.
Core & extensions:
Zak Greant (zak) - WMF contractor, working on documentation
Extensions only:
Jason Giglio (gigs) - Working on GoogleNewsSiteMap
Robert Scheiber (rscheiber) - General extension cleanup/fixes/etc
-Chad
Hi,
I will work on improving the DB2 MediaWiki port in 1.17. Currently
there is a problem with a missing template class, so I can't even run
the installer. Can you give me some idea of when 1.17 will be in good
enough shape for me to work on the database backend?
PHP Fatal error: Class 'BaseTemplate' not found in
/home/hsn/public_html/phase3/skins/Vector.php on line 344
If this is not the right developer mailing list, please let me know.