Do we have anything to exclude tables from page text? I read the mwparserfromhell docs but did not find anything.
You can use mwparserfromhell to filter the type of tag.
import mwparserfromhell

wikicode = mwparserfromhell.parse(text)
for table in wikicode.ifilter_tags(matches=lambda tag: tag.tag.lower() == "table"):
    wikicode.remove(table)
In reply to Bináris wikiposta@gmail.com, Wed, Dec 21, 2022 at 8:58 AM.
_______________________________________________
pywikibot mailing list -- pywikibot@lists.wikimedia.org
To unsubscribe send an email to pywikibot-leave@lists.wikimedia.org
Thank you!
In reply to JJMC89 jjmc89.wikimedia@gmail.com, Wed, Dec 21, 2022, 18:28.
I can't find a description of the objects and available tags in the documentation. Can you help me? Are there cells, rows, columns, headings?
mwparserfromhell.nodes.tag.Tag (https://mwparserfromhell.readthedocs.io/en/v0.6.4/api/mwparserfromhell.nodes.html#mwparserfromhell.nodes.tag.Tag) can represent any HTML-style tag. mwparserfromhell will parse wikitext tables into the HTML tag equivalents (https://en.wikipedia.org/wiki/Help:Table#Comparison_of_table_syntax).
Doing something like the below will let you visualize:
>>> import mwparserfromhell
>>> text = "before\n{| class='wikitable'\n!c1 h!!c2 h\n|r1 c1\n|r1 c2\n|}\nafter"
>>> wikicode = mwparserfromhell.parse(text)
>>> for tag in wikicode.ifilter_tags(matches=lambda tag: tag.tag.lower() in ("table", "caption", "th", "tr", "td")):
...     print(f"{tag!r} is a {tag.tag!r}")
...
"{| class='wikitable'\n!c1 h!!c2 h\n|r1 c1\n|r1 c2\n|}" is a 'table'
'!c1 h' is a 'th'
'!!c2 h\n' is a 'th'
'|r1 c1\n' is a 'td'
'|r1 c2\n' is a 'td'
In reply to Bináris wikiposta@gmail.com, Wed, Dec 21, 2022 at 5:04 PM.
Thank you! I have never used this module. I want to physically sort the rows and columns of a table, so I have to change the elements of a given table. My first thought was to write it from scratch with strings, but I hope this helps.
In reply to JJMC89 jjmc89.wikimedia@gmail.com, Thu, Dec 22, 2022, 3:21.
FWIW, I wrote a bot script to sort some tables a while ago.
https://bitbucket.org/WilliamAvery/wikipythonics/src/master/draftPickSortBot...
In reply to Bináris wikiposta@gmail.com, Thu, 22 Dec 2022, 08:56.
In reply to William Avery willm.avery@gmail.com, Thu, Dec 22, 2022, 11:12.
Thank you, I will test it!
I tried to list the cells in a table row with __children__ as written at https://mwparserfromhell.readthedocs.io/en/latest/api/mwparserfromhell.nodes...
This is very ugly throwaway code: it iterates over all table rows, prints the last row again after the for loop, and then iterates over that row's children. Result: the last row is repeated on screen. So the children of a tr seem to be the row itself, not its table cells.
for row in wikicode.ifilter_tags(matches=lambda tag: tag.tag.lower() == "tr"):
    print(row)
    print('__')
print('______')
print(row)
for elem in row.__children__():
    print(elem)
    print('__')