Hi PWB friends,
I was about to write a bot script but figured I should ask here first. On fawiki, I occasionally run into pages where the infobox template at the top of the page is squished into one very long line instead of being spread over multiple lines. Look at this example I made on testwiki: https://test.wikipedia.org/w/index.php?title=Infobox_tidy&action=edit I want to turn it into something that looks like this: https://test.wikipedia.org/w/index.php?title=Infobox_tidy&action=edit&oldid=634183 I have a reasonable sense of how to write the bot script and which edge cases to watch for, but I wonder whether someone has already written a script for this purpose.
LMK, Huji
I have the same problem and have been thinking about this for a while. It affects not only infoboxes but also cite xxx templates, which make the source text unreadable for humans. So if you get to it first, please share. :-) I think it is a general problem. We guessed that some built-in editor creates these one-line templates.
Huji Lee huji.huji@gmail.com wrote (Thu, Jan 2, 2025, 4:32):
Hi PWB friends, [...]
_______________________________________________
pywikibot mailing list -- pywikibot@lists.wikimedia.org
Public archives at https://lists.wikimedia.org/hyperkitty/list/pywikibot@lists.wikimedia.org/me...
To unsubscribe send an email to pywikibot-leave@lists.wikimedia.org
Thanks for the suggestions.
I will start working on this. One open question in my mind: can either of the Python wikitext parsers extract the portion of wikitext that corresponds to a template while retaining what comes before and after it? If so, all I need to do is find the *Infobox* template, pull it out using the parser, tidy it up, and paste it back where it belongs on the page, without touching the rest of the page.
In my cursory look at the *wikitextparser* docs I couldn't find such functionality. I would prefer not to use regex to find the template wikitext myself (if I did, I might as well parse it myself too, which seems redundant given that wikitext parser libraries exist).
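The extract, tidy, and paste-back idea described above can be illustrated with a minimal standard-library sketch. It is not how any parser library does it, and a real parser would be more robust; the helper names find_template_span and replace_span are made up for this illustration, and the brace counting deliberately ignores triple-brace parameters, nowiki sections, and comments.

```python
def find_template_span(text, name_prefix):
    """Return (start, end) of the first template whose name starts with
    name_prefix (case-insensitive), or None if there is no such template.

    Hypothetical helper: plain {{ / }} counting, no handling of
    {{{param}}}, <nowiki>, or HTML comments.
    """
    pos = 0
    while True:
        start = text.find("{{", pos)
        if start == -1:
            return None
        name = text[start + 2:].lstrip().lower()
        if name.startswith(name_prefix.lower()):
            depth = 0
            i = start
            while i < len(text) - 1:
                pair = text[i:i + 2]
                if pair == "{{":
                    depth += 1
                    i += 2
                elif pair == "}}":
                    depth -= 1
                    i += 2
                    if depth == 0:
                        return start, i
                else:
                    i += 1
            return None  # braces never balanced
        pos = start + 2


def replace_span(text, span, replacement):
    """Splice replacement into text, leaving everything outside span as-is."""
    start, end = span
    return text[:start] + replacement + text[end:]


page = "Intro.\n{{Infobox person|name=A|born={{birth date|1990|1|1}}}}\nBody."
span = find_template_span(page, "Infobox")
if span is not None:
    infobox = page[span[0]:span[1]]  # the only part that would be tidied
    print(replace_span(page, span, infobox))
```

Because only the slice between start and end is replaced, the text before and after the infobox is carried over byte for byte.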
On Thu, Jan 2, 2025 at 1:15 PM info@gno.de wrote:
Hi there and happy New Year.
I think such a script would be a good idea. We already have textlib.glue_template_and_params, which can format templates extracted with extract_templates_and_params. But the formatting done by the glue… function is very rudimentary, and some formatting options should be added.
I made a test edit on test wiki. For the current Infobox content it worked fine.
Best xqt
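To make the "formatting options" idea concrete, here is a small standard-library sketch of what a more configurable glue step might look like. It is not Pywikibot code: the function name glue_multiline and its indent and align options are hypothetical, and the input is assumed to be a template name plus an ordered mapping of parameter names to values.

```python
def glue_multiline(name, params, indent="", align=False):
    """Render a template with one parameter per line.

    Hypothetical helper, not part of Pywikibot.
    indent: string placed before each "| key = value" line.
    align:  if True, pad parameter names so the "=" signs line up.
    """
    width = max((len(key) for key in params), default=0) if align else 0
    lines = ["{{" + name]
    for key, value in params.items():
        lines.append(f"{indent}| {key.ljust(width)} = {value}")
    lines.append("}}")
    return "\n".join(lines)


print(glue_multiline("Infobox person",
                     {"name": "A", "birth_date": "1990"},
                     align=True))
```

Keeping the options as plain keyword arguments means a wiki with a different house style only has to change the call, not the function.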
On 02.01.2025 at 09:46, Bináris wikiposta@gmail.com wrote:
I have the same problem [...]
If anyone is interested, I have a Python 2 version that uses mwparserfromhell.
On Thu, Jan 2, 2025 at 6:13 PM Huji Lee huji.huji@gmail.com wrote:
Thanks for the suggestions. [...]
wikitextparser template objects have a pformat method that almost achieves the desired outcome, but it formats recursively, which may not be ideal, particularly for smaller nested templates. Example code:
import wikitextparser

wikitext = """..."""

parsed = wikitextparser.parse(wikitext)
template_0 = parsed.templates[0]
template_0.string = template_0.pformat(indent=' ')
print(parsed)
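If recursive formatting is the concern, one alternative is to break only the outermost template's arguments onto separate lines and leave nested templates exactly as they are. The sketch below shows that idea with the standard library only; it does not use wikitextparser, the helper names split_top_level and format_shallow are hypothetical, and it assumes balanced {{ }} and [[ ]] pairs with no nowiki sections or comments.

```python
def split_top_level(inner):
    """Split template contents on '|' that are outside nested {{...}} or [[...]]."""
    parts, current, depth = [], [], 0
    i = 0
    while i < len(inner):
        pair = inner[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            depth -= 1
            current.append(pair)
            i += 2
        elif inner[i] == "|" and depth == 0:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(inner[i])
            i += 1
    parts.append("".join(current))
    return parts


def format_shallow(template):
    """Put each top-level argument on its own line; nested templates stay inline.

    Note: stripping whitespace is only safe for named arguments; for
    positional arguments, surrounding whitespace can be significant.
    """
    parts = split_top_level(template[2:-2])
    name, args = parts[0].strip(), parts[1:]
    lines = ["{{" + name] + ["| " + arg.strip() for arg in args] + ["}}"]
    return "\n".join(lines)


print(format_shallow(
    "{{Infobox person|name=A|born={{birth date|1990|1|1}}|city=[[Tehran|capital]]}}"
))
```

The pipes inside the nested birth date template and the piped link are not treated as argument separators, so those values come out on a single line each.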
On Fri, Jan 3, 2025 at 3:22 AM John phoenixoverride@gmail.com wrote:
If anyone is interested I have a python 2 version that uses mwparserfromhell
Hi Huji,
To be truly useful, I believe such a script should be able to read the TemplateData of the template and format the template accordingly.
Best, Strainu
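As a rough illustration of the TemplateData suggestion: a TemplateData block can carry a "format" value of "inline", "block", or a custom format string. The sketch below assumes the TemplateData has already been fetched and decoded into a Python dict (the sample dict is invented), and the helper name wants_block is hypothetical. Treating a missing format as inline, and a custom string as block whenever it contains a newline, are simplifying assumptions made here.

```python
def wants_block(templatedata):
    """Decide whether a template should be laid out one parameter per line.

    templatedata: dict decoded from a template's TemplateData JSON.
    Assumption: a missing or null "format" is treated as inline.
    Assumption: a custom format string counts as block if it has a newline.
    """
    fmt = templatedata.get("format") or "inline"
    if fmt == "block":
        return True
    if fmt == "inline":
        return False
    return "\n" in fmt


# Invented sample data, for illustration only.
sample = {
    "description": "An infobox",
    "format": "block",
    "params": {"name": {}, "birth_date": {}},
}
print(wants_block(sample))
```

A tidy script could call this per template name and skip templates whose TemplateData asks for inline layout, such as short citation templates.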