Merlijn van Deen wrote:
make sure both textFrom and all elements in fields are unicode.
This is the piece of code that I can't get to work yet because of these unicode problems:
---
def getText(block):
    lines = block.split('\n')
    for line in lines:
        textMatches = extract_text(line)
        if textMatches:
            for a in textMatches:
                textMatch = a
            return textMatch

pat_county = re.compile(r'''(?x)({{Geobox_contea|)(.*)}}''')
extract_text = lambda s: [u[1] for u in re.findall(pat_county, s)]
county = getText(text)

pat_inhab = re.compile(r'''(?x)(abitanti=)([^|}]*)''')
extract_text = lambda s: [u[1] for u in re.findall(pat_inhab, s)]
inhab = getText(text)

textForm = """%(municName)s è un [[Comuni della Svezia|comune]] [[Svezia|svedese]] situato nella [[Contee della Svezia|contea]] di %(countyName)s."""

fields = {'municName':munic, 'numInhab':inhab}

incipit = textForm % fields

p = re.compile('(.*) una [[municipalità]](.*);(.*)[\capoluogo]](.*).')

text = p.sub(incipit, text)
---
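For reference, a minimal sketch of what the quoted advice could look like in practice. It assumes the values may arrive as UTF-8 byte strings. The helper name `to_unicode` and the sample values are made up for illustration and are not part of the original script. The template is written as a `u"..."` literal so that both sides of the `%` operator are unicode, and the keys in `fields` have to match the `%(...)s` placeholders in the template.

```python
# -*- coding: utf-8 -*-

def to_unicode(value, encoding='utf-8'):
    """Hypothetical helper: decode byte strings, pass unicode through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Unicode template (note the u prefix); the wiki links are left out for brevity.
text_form = u"%(municName)s è un comune svedese situato nella contea di %(countyName)s."

# Sample data only: one value is a UTF-8 byte string, the other already unicode.
fields = {
    'municName': to_unicode(b'Malm\xc3\xb6'),
    'countyName': to_unicode(u'Scania'),
}

# Every key used in the template must be present in fields.
incipit = text_form % fields
```

Decoding once at the point where the text enters the script keeps the rest of the code free of mixed byte/unicode operations, which is usually where this kind of error comes from.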
The 'for' loop inside the 'if' statement should be unnecessary, since the text occurs at most once, but I couldn't make it work without it.
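A possible loop-free variant, sketched under the assumption that only the first occurrence matters. The names `get_text` and `PAT_INHAB` and the sample text are hypothetical. The pattern is a simplified version of `pat_inhab`, passed in as an argument instead of going through a global `extract_text`.

```python
import re

# Simplified, hypothetical version of pat_inhab: capture everything after
# "abitanti=" up to the next "|" or "}".
PAT_INHAB = re.compile(r'abitanti=([^|}]*)')

def get_text(block, pattern):
    """Return the first captured group found in block, or None if there is none."""
    for line in block.split('\n'):
        match = pattern.search(line)
        if match:
            return match.group(1)
    return None

# Made-up sample input, for illustration only.
sample = u"{{Geobox\n|abitanti=12345\n|nome=Test}}"
inhab = get_text(sample, PAT_INHAB)
```

Returning `None` when nothing matches avoids referring to a variable that was never assigned, which is what happens in a loop-based version if no line contains the pattern.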
Thanks again,
Raffaello