Hi. I'm working on a script which edits a page, adds a section to it and then redirects to this page.
It would be nice if it went straight to the newly-created section. So I need to create a link with # in it.
The problem appears when the section title contains diacritics. For example, the link to "bażant królewski" looks like "Ba.C5.BCant_kr.C3.B3lewski".
How can I generate in JavaScript such a link which would be identical to the one generated by MediaWiki? Has somebody written such a function? Or at least, do you know where it is done in MediaWiki php code?
Thanks in advance. And sorry if this is not the right place to ask this question.
lampak
PS. Links with numbers (like "#1") will not work in this case, in case somebody was going to suggest them.
On 24 August 2010 10:43, lampak <llampak@gmail.com> wrote:
> How can I generate in JavaScript such a link which would be identical to the one generated by MediaWiki? Has somebody written such a function? Or at least, do you know where it is done in MediaWiki php code?
See Parser.php, function guessSectionNameFromWikiText; most of the work is done in Sanitizer::escapeId.
-- [[cs:User:Mormegil | Petr Kadlec]]
On 24/08/10 10:59, Petr Kadlec wrote:
Thanks :) I guess it was foolish of me to expect it to be simple. Never mind; it seems like too much work for a minor fix, so I think it will stay as it is, or I will figure out some quick-and-dirty workaround.
But thanks anyway :)
lampak
On Tue, Aug 24, 2010 at 10:59:04AM +0200, Petr Kadlec wrote:
> See Parser.php, function guessSectionNameFromWikiText; most of the work is done in Sanitizer::escapeId.
The problem there is that the output of guessSectionNameFromWikiText depends on the configuration of the particular wiki. It uses $wgHtml5 and $wgExperimentalHtmlIds, and possibly other settings.
Sometimes it may be easiest to just ask the API.
On 24 August 2010 15:09, Brad Jorsch <b-jorsch@northwestern.edu> wrote:
> Sometimes it may be easiest to just ask the API.
Yep, that is a possibility:
http://en.wikipedia.org/w/api.php?action=parse&text=%7B%7Banchorencode:b...
-- [[cs:User:Mormegil | Petr Kadlec]]
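The API route suggested above could be sketched roughly as follows. This is only an illustration: buildAnchorEncodeUrl is a made-up helper name, format=json is an added assumption, and the title is pasted into the parser function as-is, so a title containing "|" or "}}" would need extra care. The anchor is expected to come back embedded in the rendered output of the response, so it still has to be extracted from there.

```javascript
// Hypothetical helper: builds an action=parse request that asks the wiki
// itself to run {{anchorencode:...}} on the section title.
function buildAnchorEncodeUrl(apiBase, sectionTitle) {
  // The wikitext we want the wiki to parse for us.
  var wikitext = '{{anchorencode:' + sectionTitle + '}}';
  // encodeURIComponent makes the wikitext safe to use as a query value.
  return apiBase + '?action=parse&format=json&text=' +
    encodeURIComponent(wikitext);
}

var url = buildAnchorEncodeUrl(
  'http://en.wikipedia.org/w/api.php',
  'Bażant królewski'
);
```

Because the wiki does the encoding, this approach follows whatever configuration that wiki uses, at the cost of one extra request.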
lampak <llampak <at> gmail.com> writes:
> How can I generate in JavaScript such a link which would be identical to the one generated by MediaWiki? Has somebody written such a function? Or at least, do you know where it is done in MediaWiki php code?
It is pretty simple as long as there is no wikitext or HTML in the title: replace spaces with underscores, convert the result to URL-encoded UTF-8 (encodeURIComponent does that), and replace percent signs with dots.
On 24/08/10 15:25, Tgr wrote:
I have tried encodeURIComponent before. Bażant królewski becomes Bażant_królewski. Diacritics are not converted. At least not under Firefox.
lampak
At 2010-08-24 15:44, lampak wrote:
For basic examples you could use this:

    var txt = 'Zażółć';
    txt = encodeURIComponent(encodeURI(txt).replace(/%/g, '.')).replace(/%/g, '.');

I am not sure whether encodeURI existed in old IE (if you care ;-)), but it works in current browsers.
Note that this will not work if the section title contains markup. But as I understand it, you are not looking for something bulletproof.
Regards, Nux.
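To see what the snippet above produces, here is a small walk-through. The wrapper name nuxEncode is made up for this illustration. It also shows one limitation: a space ends up as ".20", whereas the MediaWiki anchor quoted at the start of the thread uses "_".

```javascript
// Hypothetical wrapper around the one-liner from the message above.
function nuxEncode(txt) {
  // Step 1: percent-encode, then turn every "%" into ".".
  var dotted = encodeURI(txt).replace(/%/g, '.');
  // Step 2: encode whatever encodeURI left alone (for example ":"),
  // and again turn "%" into ".".
  return encodeURIComponent(dotted).replace(/%/g, '.');
}

var plain = nuxEncode('Zażółć');
// plain is 'Za.C5.BC.C3.B3.C5.82.C4.87'

var withSpace = nuxEncode('Bażant królewski');
// withSpace is 'Ba.C5.BCant.20kr.C3.B3lewski' (note ".20", not "_")
```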
On 24/08/10 22:47, Maciej Jaros wrote:
Great, thanks :) I just added extra replace calls to change ":" and "_" back, and it works :)
lampak
lampak <llampak <at> gmail.com> writes:
> I have tried encodeURIComponent before. Bażant królewski becomes Bażant_królewski. Diacritics are not converted. At least not under Firefox.
I don't know what you've tried, but the result of
encodeURIComponent('Bażant królewski')
is
"Ba%C5%BCant%20kr%C3%B3lewski"
in Firefox, and I'm pretty sure it works the same way on other browsers. (encodeURI is almost the same, but leaves more characters unencoded, which in this case is a bad thing.)
Then you need to replace %20 with _, % with ., unencoded characters ~!*()' with their proper utf-8 sequence, and you have the section title fragment.
(It might be a good idea to include a function doing that in wikibits.js, if there isn't one yet.)
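The steps described above can be put together roughly like this. It is a minimal sketch for plain-text titles only, and sectionAnchor is a made-up name, not an existing wikibits.js function. Keeping ":" literal is an assumption based on the rest of this thread, and the result can still differ on wikis configured with other ID settings.

```javascript
// Hypothetical helper: approximates the dot-encoded section anchor
// for a heading that contains no wikitext or HTML.
function sectionAnchor(title) {
  return encodeURIComponent(title)
    // Spaces were encoded as %20; MediaWiki uses underscores instead.
    .replace(/%20/g, '_')
    // encodeURIComponent leaves ~ ! * ( ) ' alone, so encode them by hand.
    .replace(/[~!*()']/g, function (ch) {
      return '%' + ch.charCodeAt(0).toString(16).toUpperCase();
    })
    // Assumption: the colon stays literal in the anchor.
    .replace(/%3A/g, ':')
    // Finally, turn the percent signs into dots.
    .replace(/%/g, '.');
}

var anchor = sectionAnchor('Bażant królewski');
// anchor is 'Ba.C5.BCant_kr.C3.B3lewski', the form quoted in the first message
```

The "%20" replacement runs before the percent signs become dots, so a title that literally contains ".20" is left alone.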
On 26/08/10 10:44, Tgr wrote:
Ah, indeed. That's because I assigned it directly to window.location, and the address bar then displayed these UTF-8 escapes as ż and ó. Thanks :)
lampak
On Thu, Aug 26, 2010 at 4:44 AM, Tgr <gtisza@gmail.com> wrote:
> in Firefox, and I'm pretty sure it works the same way on other browsers. (encodeURI is almost the same, but leaves more characters unencoded, which in this case is a bad thing.)
> Then you need to replace %20 with _, % with ., unencoded characters ~!*()' with their proper utf-8 sequence, and you have the section title fragment.
> (It might be a good idea to include a function doing that in wikibits.js, if there isn't one yet.)
This isn't reliable. In particular, it will fail when the section name has any wikitext or HTML in it. The section text is parsed first, then the parsed text has HTML stripped from it, and the anchor is generated from that. This is a mess, since it makes it impossible to create section links reliably even within MediaWiki from any place other than the TOC, but that's the status quo. See bug 5019 and related (111, 2346, 2831). Bug 18700 is also related, although conceptually somewhat different.
So in other words, something like this is reliable enough for internal use in MediaWiki in some places (return to section after edit and history page links), so it will probably work most of the time. I guess a JS function for it might be useful, but it would probably fall out of sync with the core code.
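Given that caveat, a script could at least check whether a title looks like plain text before trusting a client-side encoder. The following is a rough heuristic sketch only: isPlainHeading is a made-up name, and the character list is an assumption that certainly does not cover every wikitext construct.

```javascript
// Hypothetical guard: returns false when the title contains characters
// that commonly introduce wikitext or HTML (links, templates, tags,
// entities, or '' / ''' emphasis).
function isPlainHeading(title) {
  return !/[\[\]{}<>&]|''/.test(title);
}

var safe = isPlainHeading('Bażant królewski');        // true
var risky = isPlainHeading("''Bażant'' [[królewski]]"); // false
```

When the check fails, falling back to a link without a fragment, or to asking the wiki itself, is probably the safer choice.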
wikitech-l@lists.wikimedia.org