Does anybody know if capitalisation on the Esperanto wiktionary has been
taken care of and if it is on UTF-8? As far as I know, Gerard already
requested this in the chat (which strangely doesn't work on my computer).
Just to avoid double work.
Sorry for the message to all lists, but I'd like to start working on
that, and therefore this is essential. I just tried, and it seems
capitalisation is still on.
Thank you!!!
Ciao, Sabine
The LocalSettings.php file is not being created when I fill in the form and
hit the Install button.
I get an empty page where the 1.3 install success page is expected, due to
bug #736 in MediaWiki v1.3.x.
I am installing Wiki on a Stick using Uniform Server v3.2.
Should I try a version of MediaWiki other than 1.3.7?
Thanks in advance... I really want to get my own wiki system up and running!
Is there a way to prevent templates from being surrounded by <p></p> tags? I
begin and end a page with a template, and don't want the tags to appear
before or after the templates.
Many thanks.
Hi,
I saw in some old posts that a rating mechanism is being developed. I need
this functionality for a wiki site. It's probably not exactly the same kind of
ratings, but I'd still like to take a look and hopefully contribute. Which
version will include it? How can I access the rating code?
Thanks.
The representation of the tonos and oxia on Greek vowels has become an
issue at Wikisource. For the letter alpha modern monotonic Greek uses
U+03AC (decimal 940) for alpha with an acute accent or "tonos".
Traditional, or polytonic, Greek uses U+1F71 (decimal 8049) for alpha
with an acute accent or "oxia". I understand that the simplification of
Greek writing has been a matter of considerable controversy in the 20th
century.
Our software now converts the 1F71 to the 03AC form. It would
nevertheless be preferable not to conflate these forms. Wikisource will
include both classical and modern Greek texts, and it would seem
preferable to have the entire range of accents available for the
classical texts when required.
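For reference, this looks like ordinary Unicode normalization at work:
U+1F71 has a canonical (singleton) decomposition to U+03AC, so NFC
normalization turns the oxia form into the tonos form. A minimal PHP
sketch of that equivalence (it uses the intl extension's Normalizer
class purely as an illustration; that is an assumption about available
tooling, not the code path our software actually takes):

<?php
# U+1F71 (alpha with oxia) vs. U+03AC (alpha with tonos), as UTF-8 bytes
$oxia  = "\xE1\xBD\xB1";   # U+1F71
$tonos = "\xCE\xAC";       # U+03AC
# NFC conflates the two, because U+1F71 canonically decomposes to U+03AC.
var_dump( Normalizer::normalize( $oxia, Normalizer::FORM_C ) === $tonos );
# prints: bool(true)
?>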
Ec
Seeing as 1.5 is UTF-8 for all languages (yay), can we parse -- and ---
into en and em dashes? I know there were some problems with wiki table
syntax the last time this was attempted, but those are easy to eliminate if
you require spaces around the hyphen sequence. In that case, as a rule, the
resulting display would have spaces as well (except between numbers... more
on that later). The spaces around dashes are a mini-dispute of their own over at
[[Wikipedia talk:Manual of Style (dashes)]], but I think that people
would be happy just to have an easy, standard way to enter dashes.
The last time this came up, someone mentioned other languages that might
forbid a space around dashes. I know that in French it's common to use
dashes at the beginning of lines in dialogue. This problem has an easy
solution: French keyboards also have a dash key, unless my memory is
failing me completely. American Mac keyboards have option-hyphen. On
Windows, Word inserts dashes all over the place. People who aren't happy
with the " --- " and " -- " solution would be free to enter unicode
dashes however they wish, as they can today on UTF-8 wikipedias.
(Stubborn space-conscious anglophones might also resort to this
method... so be it.)
I've written a patch that I think is fairly well placed, since it's
adjacent to the existing code that inserts non-breaking spaces next to
guillemets. This method would make a lot of people happy, and it
promotes compliance with the Manual of Style as much as is possible.
Here's how it works:
1. Replace any ' -- ' with a non-breaking space, an en dash, and a
normal space.
2. Replace any '--' between digits with an en dash alone.
3. Replace any ' --- ' with a non-breaking space, an em dash, and a
normal space.
See below for the code.
Nathan
Index: Parser.php
===================================================================
RCS file: /cvsroot/wikipedia/phase3/includes/Parser.php,v
retrieving revision 1.383
diff --unified=3 -w -B -r1.383 Parser.php
--- Parser.php 6 Feb 2005 16:13:05 -0000 1.383
+++ Parser.php 9 Feb 2005 21:15:50 -0000
@@ -185,6 +185,9 @@
'/<br *>/i' => '<br />',
'/<center *>/i' => '<div class="center">',
'/<\\/center *>/i' => '</div>',
+ '/ -- /i' => "\xC2\xA0\xE2\x80\x93 ", # <nbsp>–<space>
+ '/([0-9])--([0-9])/i' => "\\1\xE2\x80\x93\\2", # en dash between digits
+ '/ --- /i' => "\xC2\xA0\xE2\x80\x94 " # <nbsp>—<space>
);
$text = preg_replace( array_keys($fixtags), array_values($fixtags),
$text );
$text = Sanitizer::normalizeCharReferences( $text );
@@ -195,7 +198,10 @@
# french spaces, Guillemet-right
'/(\\302\\253) /i' => '\\1 ',
'/<center *>/i' => '<div class="center">',
- '/<\\/center *>/i' => '</div>'
+ '/<\\/center *>/i' => '</div>',
+ '/ -- /i' => "\xC2\xA0\xE2\x80\x93 ", # <nbsp>–<space>
+ '/([0-9])--([0-9])/i' => "\\1\xE2\x80\x93\\2", # en dash between digits
+ '/ --- /i' => "\xC2\xA0\xE2\x80\x94 " # <nbsp>—<space>
);
$text = preg_replace( array_keys($fixtags), array_values($fixtags),
$text );
}
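For anyone who wants to sanity-check the replacements outside the
parser, here is a throwaway test with the same three regular
expressions (the variable names and the sample string are mine, not
part of the patch):

<?php
$fixtags = array(
	'/ -- /i'             => "\xC2\xA0\xE2\x80\x93 ",  # <nbsp>–<space>
	'/([0-9])--([0-9])/i' => "\\1\xE2\x80\x93\\2",     # en dash between digits
	'/ --- /i'            => "\xC2\xA0\xE2\x80\x94 ",  # <nbsp>—<space>
);
$text = "See pages 12--15 -- or, failing that --- the index.";
echo preg_replace( array_keys( $fixtags ), array_values( $fixtags ), $text );
# Prints: See pages 12–15 – or, failing that — the index.
# (the spaces before the free-standing dashes are non-breaking)
?>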
Hello folks,
I want to make PDF files from a number of wikipedia articles, but I
encountered several technical problems and wondered if there's a better
way to do this than the one I came up with. (If you want to know what
this is for, look at the end of this posting.)
The number of articles should be about 1000. That has some consequences:
1. I have to avoid wasting bandwidth and wikipedia server load as
much as possible.
2. I have to automate the task because I cannot click 1000 pages
interactively.
To save server load and bandwidth, I considered using the database dump,
but that lacks the images and the layout, right? I even downloaded the
wikipedia CDROM image, but discovered that it's Windows software with the
data stuffed into some database, from which it's probably difficult to
retrieve the articles and make PDFs.
My current idea is to use the normal web access since I have no other
working solution. I would spread the accesses over a week and only use
times when normal server load is low.
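Concretely, I have something like the following throttled fetch loop in
mind (a rough PHP sketch only; the article list, the delay and the
output directory are placeholders):

<?php
# Placeholder list; in reality this would be roughly 1000 article titles.
$titles = array( 'Kraftwerk', 'Elektrizität', 'Dampfturbine' );
# Identify the script honestly instead of pretending to be a browser.
ini_set( 'user_agent', 'cebit-demo-fetcher/0.1 (contact address here)' );
foreach ( $titles as $title ) {
	$url  = 'http://de.wikipedia.org/wiki/' . urlencode( $title );
	$html = file_get_contents( $url );
	if ( $html !== false ) {
		# assumes a local pages/ directory already exists
		file_put_contents( 'pages/' . $title . '.html', $html );
	}
	sleep( 30 );   # one page every 30 seconds to keep the load negligible
}
?>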
But: html2ps is regarded as a harvester by wikipedia.org and doesn't get
the real pages, even with the User-Agent tweaked via a Squid proxy. When
I use wget -p and run html2ps afterwards on the local files, it fetches
the images twice for some weird reason. And worse, html2ps is obviously
not capable of dealing with UTF-8-encoded pages.
So, I have two questions to the community here:
1. What's the best way to get those 1000 articles from the servers
without putting too much load on them?
2. What's a good way to automate the conversion to PDF of those pages?
Any suggestions appreciated.
(What is all this for? I work for a software company that will give a
demonstration of a document management system at the CeBIT fair in March
2005 in Hannover, Germany. I will give some talks there in which I will
demonstrate a power plant documentation system. Of course I cannot do
that with the actual documents from our customers, because those are
classified, so I intend to replace them with wikipedia articles about
power plants, electricity and related topics. That way I only show free
content, and I can also use the opportunity to demonstrate wikipedia's
quality to some business people who might not know it yet. I hope you
agree this is a perfectly valid use of wikipedia content.)
Regards
--
Maik Musall <maik(a)musall.de>
GPG public key 0x856861EB (keyserver: wwwkeys.de.pgp.net)
I see there are new dumps (btw: de/en are missing on the download page, but
were dumped recently anyway).
In order to adapt and test the stats scripts, has one of the smaller
databases been converted to the new compression scheme? I tried a few, but
no luck. I can't download the old de/en dumps for tests; they are too large.
Thx, Erik Zachte.