[Labs-l] MediaWiki to LaTeX Compiler

Liangent liangent at gmail.com
Sun May 25 17:50:02 UTC 2014


It looks much better now.

Regarding the \newcommand labels, it is more natural to write:

第4章 ("Chapter 4") instead of 章4
第2页 ("Page 2") instead of 页2
图8 ("Figure 8") instead of 图形8

Also, in Chinese you don't need to (and shouldn't) add spaces between words.
I feel there are some extra ones added, especially near links.
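A rough sketch of how the language file might encode these labels. Note the one-argument macro form is purely an assumption: it only works if mediawiki2latex could be made to expand the label as \mychapterbabel{4} rather than emitting the number after the label.

```latex
% Hypothetical revision of the server's language file (untested sketch).
% The prefix-suffix pattern 第…章 needs the number as a macro argument,
% which is an assumption about how the generator invokes these macros.
\usepackage{xeCJK}
\setCJKmainfont{WenQuanYi Zen Hei}
\newcommand{\mychapterbabel}[1]{第#1章}  % "Chapter 4" -> 第4章
\newcommand{\mypagebabel}[1]{第#1页}     % "Page 2"    -> 第2页
\newcommand{\myfigurebabel}[1]{图#1}     % "Figure 8"  -> 图8
```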

-Liangent
On May 26, 2014 1:06 AM, "Dirk Hünniger" <dirk.hunniger at googlemail.com>
wrote:

>  Hi,
> I made a language file for Chinese now and installed it on the server, so
> please have a try:
> http://mediawiki2latex.wmflabs.org/
> Yours Dirk
>
> PS: The language file:
> \HyphSubstLet{ngerman}{ngerman-x-latest}
> \usepackage{xeCJK}
> \setCJKmainfont{WenQuanYi Zen Hei}
> \newcommand{\mychapterbabel}{章}
> \newcommand{\mypagebabel}{页}
> \newcommand{\myfigurebabel}{图形}
> \newcommand{\mylangbabel}{chinese}
>
>
> On 2014-05-25 17:58, Liangent wrote:
>
>  I failed to compile your document in its original form:
>
> (../headers/babel.tex
> (/var/lib/texmf/tex/generic/babel/babel.sty
> (/usr/share/texlive/texmf-dist/tex/generic/babel-english/english.ldf
> (/usr/share/texlive/texmf-dist/tex/generic/babel/babel.def
> ! Undefined control sequence.
> \initiate@active@char #1->\bbl@ifshorthand
>                                            {#1}{\bbl@s@initiate@active@char
> ...
> l.585 \initiate@active@char{~}
>
> and I worked around it by commenting out \usepackage[english]{babel} in
> ../headers/babel.tex
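> A possible alternative to commenting babel out entirely (untested sketch; polyglossia is the usual babel replacement under XeLaTeX and avoids babel's active-character shorthands):
>
> ```latex
> % In ../headers/babel.tex, instead of \usepackage[english]{babel}:
> \usepackage{polyglossia}
> \setdefaultlanguage{english}
> ```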
>
>  In my experiment I added:
>
> \usepackage{xeCJK}
> \setCJKmainfont{WenQuanYi Zen Hei}
>
> to main.tex after \usepackage{fontspec}, and it improves CJK typesetting a
> lot.
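> The resulting preamble order would look roughly like this (sketch; the order matters, since xeCJK builds on fontspec):
>
> ```latex
> \usepackage{fontspec}
> \usepackage{xeCJK}                 % CJK-aware line breaking and font selection
> \setCJKmainfont{WenQuanYi Zen Hei} % from Debian's fonts-wqy-zenhei package
> ```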
>
> WenQuanYi Zen Hei is contained in
> https://packages.debian.org/sid/fonts-wqy-zenhei
>
>  -Liangent
>
>
> On Sun, May 25, 2014 at 8:01 PM, Dirk Hünniger <
> dirk.hunniger at googlemail.com> wrote:
>
>>  Hi,
>> I am sending you the LaTeX source of the main page of the Chinese
>> Wikipedia as an attachment.
>> You can look at it, but if you want to compile it you need Ubuntu
>> 14.04 and to run:
>> sudo apt-get install mediawiki2latex
>> xelatex main.tex
>> Yours Dirk
>>
>>
>> On 2014-05-25 13:32, Liangent wrote:
>>
>>  I don't really have an idea about how to "go for the command line
>> version and use the -c command line option"; I know nothing about Haskell
>> anyway...
>>
>> I hope this can be made available on the web; is it possible to add a checkbox
>> or something?
>>
>>  -Liangent
>>
>>
>> On Sun, May 25, 2014 at 7:18 PM, Dirk Hünniger <
>> dirk.hunniger at googlemail.com> wrote:
>>
>>>  Hi,
>>> I didn't take any special care about CJK. It's a bit hard for me since I
>>> cannot read any of these languages myself. Maybe you can have a look at the
>>> LaTeX source and tell me what I need to change. Currently no CJK package is
>>> loaded. The only thing I am doing is to switch to ttf fonts that contain
>>> CJK characters when I need to print them. Also I am using babel packages.
>>> For some languages I get proper hyphenation this way, but apparently
>>> something does not work here for Chinese.
>>> Yours Dirk
>>>
>>> On 2014-05-25 13:02, Liangent wrote:
>>>
>>>  I had a try with an article on the Chinese Wikipedia. Although I'm not
>>> sure whether the cause lies in the generated LaTeX source or in the way you
>>> invoke LaTeX, the most notable problem is that in the output PDF, word wrap
>>> doesn't take place correctly, so almost every line overflows. See
>>> https://en.wikipedia.org/wiki/Word_wrap#Word_wrapping_in_text_containing_Chinese.2C_Japanese.2C_and_Korean
>>> for more information.
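>>> If pulling in the full xeCJK package is undesirable, line breaking alone can likely be enabled with XeTeX's low-level primitives (untested sketch):
>>>
>>> ```latex
>>> % Allow line breaks between CJK characters under plain XeLaTeX:
>>> \XeTeXlinebreaklocale "zh"
>>> \XeTeXlinebreakskip = 0pt plus 1pt
>>> ```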
>>>
>>>  -Liangent
>>>
>>>
>>> On Sun, May 25, 2014 at 5:49 PM, Dirk Hünniger <
>>> dirk.hunniger at googlemail.com> wrote:
>>>
>>>> Hi,
>>>> if you want the TeX source, go for the command line version and use the
>>>> -c command line option. If you want to convert from TeX to MediaWiki markup,
>>>> use pandoc. In the imprint of each PDF there is a link to the SourceForge page.
>>>> It's slow, but I cannot make it any faster; it's mostly the runtime of LaTeX
>>>> itself. I already invested two weeks in optimizing for speed. In particular,
>>>> it uses multiple cores while in my own code, but there is not much you can
>>>> do with multiple cores when running LaTeX itself. You could actually gain
>>>> some speed by using native cores, but the administration is not that easy.
>>>> It also says on the main page that it will take up to ten minutes.
>>>>
>>>>
>>>> On 2014-05-25 11:40, Gryllida wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> On Sun, 25 May 2014, at 18:02, Dirk Hünniger wrote:
>>>>>
>>>>>> It's not a private server anymore. It's now running on Wmflabs already.
>>>>>>
>>>>>> http://mediawiki2latex.wmflabs.org
>>>>>>
>>>>> I would probably link to the source code and a bug tracker on its main
>>>>> page.
>>>>> - I see it generated a PDF. Nicely formatted. :) But the TeX source
>>>>> would be also useful.
>>>>> - It would be nice to be able to convert back from tex to wiki markup
>>>>> also.
>>>>> - It also appears to be dog slow (about 5 minutes).
>>>>>
>>>>> Gryllida.
>>>>>
>>>>> _______________________________________________
>>>>> Labs-l mailing list
>>>>> Labs-l at lists.wikimedia.org
>>>>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>

