Hello
I want to parse variable-length wikitext using the 'parse' action. My problem is that the API returns an error if the wikitext length exceeds some value, roughly 1000 characters. In that case the API returns a well-formatted HTML page that reports an unexpected error and suggests trying again later. This doesn't look like expected behavior; it looks more like a hastily coded check. The API documentation doesn't mention any upper length limit. Is there a way to deal with this? I would rather avoid a solution like 'split the text into parts and parse them separately', because that can be really problematic in some cases, e.g. with very long lists ('#' markers).
Łukasz
What are you trying to parse?
Mediawiki-api mailing list Mediawiki-api@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
2012/9/27 Betacommand Betacommand@gmail.com:
what are you trying to parse?
Here is an example of wikitext that the parser does not seem to accept:
# [[one]] (1)
#: ''Usage note: '''один''', '''одна''' and '''одно''', whether alone or in compounds (21, 31, 41, 101, etc.), govern the [[singular]]. All intervening adjectives and adjectival participles also become singular. If the noun is in nominative case, the corresponding verb also becomes singular.''
#:: один [[мальчик]] (odín mál’čik) — ''one boy''
#:: [[пятьдесят]] один [[мальчик]] (pjat’desját odín mál’čik) — ''fifty-one boys (lit., fifty-one boy)''
#:: одна [[книга]] (odná kníga) — ''one book''
#:: [[здесь#Russian|Здесь]] [[пятьдесят]] одна [[книга]] (zdes’ pjat’desját odná kníga) — ''Here are fifty-one books (lit., fifty-one book)''
#:: [[В]] [[мой|моём]] [[дом]]е [[быть|есть]] [[двадцать]] одно [[большой|большое]] [[окно]] (v mojóm dóme jest’ dvádtsat’ odnó bol’šóje oknó) — ''There are twenty-one large windows (lit. window) in my house.''
#:: [[В]] [[мой|моём]] [[дом]]е [[быть|было]] [[двадцать]] одно [[большой|большое]] [[окно]] — ''There were (lit. was) twenty-one large windows (lit. window) in my house.''
# [[alone]]
# [[only]] (especially with negative connotation)
#: В холодильнике остались одни консервы — There’s nothing but cans left in the refrigerator .
# a, a certain, some
#: Я прочитал это в одной книге — I read it in some book.
# ни один — not a single
#: На курс не записался ни один студент.
#: Во всём доме я не нашёл ни одного стула.
# ни одна собака (modern slang) — no one (''often about people who don't care about somebody'')
#: ''Когда (я) четыре года назад лежал в больницах с инфарктами, никому до меня дела не было. Ни одна собака (обо мне) не вспомнила!'' (Из интервью с Михаилом Кононовым) <ref>[http://eg.ru/daily/cadr/9314/]</ref>
# ''(contrast)'' [[some]] (do something or have some quality), while [[others]] (do not do it; do something else; do not have this quality; or have another or the opposite quality)
#: Одни биологи считают, что вирусы произошли от бактерий, другие с ними не согласны — ''Some biologists believe that viruses evolved from bacteria, while others disagree.''
#: Одни насекомые крылаты, другие лишены крыльев — ''Some insects are winged, others have no wings.''
#: Одних быков в стаде кастрировали, других ещё нет — ''Some bulls in the herd were castrated, others not yet.''
#: Одни говорят одно, другие другое — ''There are contradictory rumours about the topic.''
On Thu, Sep 27, 2012 at 02:33:27PM +0200, Łukasz Czyż wrote:
I want to parse variable-length wikitext using the 'parse' action. My problem is that the API returns an error if the wikitext length exceeds some value, roughly 1000 characters. In that case the API returns a well-formatted HTML page that reports an unexpected error and suggests trying again later. This doesn't look like expected behavior; it looks more like a hastily coded check. The API documentation doesn't mention any upper length limit. Is there a way to deal with this? I would rather avoid a solution like 'split the text into parts and parse them separately', because that can be really problematic in some cases, e.g. with very long lists ('#' markers).
If you're using GET, try using POST; it's possible that your HTTP library or a proxy is truncating the query string. If that doesn't help, please provide the URL and the POST data, as well as the exact text of the message you're getting.
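For illustration, here is a minimal sketch of sending the wikitext in a POST body instead of the query string, using only Python's standard library. The endpoint URL, the User-Agent string, the `format=json` choice, and the function names `build_parse_request` and `parse_wikitext` are assumptions made for this example and are not taken from the thread.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; substitute the api.php URL of the wiki you are querying.
API_URL = "https://en.wiktionary.org/w/api.php"


def build_parse_request(wikitext, api_url=API_URL):
    """Build a POST request for action=parse with the wikitext in the body."""
    params = {"action": "parse", "text": wikitext, "format": "json"}
    # urlencode percent-encodes non-ASCII text as UTF-8.
    body = urllib.parse.urlencode(params).encode("ascii")
    # Supplying data= makes urllib issue a POST, so the URL itself stays short.
    return urllib.request.Request(
        api_url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Placeholder User-Agent, chosen for this sketch only.
            "User-Agent": "wikitext-parse-example/0.1",
        },
    )


def parse_wikitext(wikitext, api_url=API_URL):
    """Send the request and return the decoded JSON response."""
    request = build_parse_request(wikitext, api_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `parse_wikitext(text)` with the full wikitext sends everything in the request body, so any limit on URL length no longer applies to the text itself.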
FYI, I just tried a GET request with several thousand characters of lorem ipsum text, and it worked fine here. At some point, of course, you'll get a 414 error from Wikimedia's servers, but that response doesn't come from the API at all.
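One thing that may be worth checking: non-ASCII wikitext grows when it is percent-encoded, so a GET URL can be several times longer than the character count suggests. The sketch below measures that growth. The function name `encoded_query_length` is hypothetical, and whether URL length is actually the cause of the error in this case is not established.

```python
import urllib.parse


def encoded_query_length(wikitext):
    """Return the length of the URL-encoded query string for a GET parse call."""
    query = urllib.parse.urlencode(
        {"action": "parse", "text": wikitext, "format": "json"}
    )
    return len(query)


# Each Cyrillic letter is two bytes in UTF-8 and each byte becomes "%XX",
# so one letter costs six characters in the URL; an ASCII letter costs one.
overhead = encoded_query_length("")
print("Cyrillic:", encoded_query_length("один") - overhead)
print("ASCII:", encoded_query_length("abcd") - overhead)
```

Comparing the two numbers for a real sample shows how quickly a mostly Cyrillic text can reach a length at which a client, proxy, or server starts rejecting or truncating the URL.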