Hello,
Names with non-Latin characters in the donation comments are broken and outputting as question marks. Some people are understandably unhappy that their names are not appearing next to their donations. For example, see <http://wikimediafoundation.org/wiki/Special:ContributionHistory?offset=12280...>.
(Thanks to [[ja:user:Aotake]] for pointing it out in #wikimedia.)
this has already been reported. see the previous email from brion regarding the cause.
mark
On Sun, Nov 30, 2008 at 9:52 AM, Jesse Plamondon-Willard < pathoschild@gmail.com> wrote:
Hello,
Names with non-Latin characters in the donation comments are broken and outputting as question marks. Some people are understandably unhappy that their names are not appearing next to their donations. For example, see <
http://wikimediafoundation.org/wiki/Special:ContributionHistory?offset=12280...
.
(Thanks to [[ja:user:Aotake]] for pointing it out in #wikimedia.)
-- Yours cordially, Jesse Plamondon-Willard (Pathoschild)
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Sun, Nov 30, 2008 at 10:52 AM, Jesse Plamondon-Willard < pathoschild@gmail.com> wrote:
Hello,
Names with non-Latin characters in the donation comments are broken and outputting as question marks. Some people are understandably unhappy that their names are not appearing next to their donations. For example, see <
http://wikimediafoundation.org/wiki/Special:ContributionHistory?offset=12280...
.
here:
http://wikimediafoundation.org/wiki/Donate/Now/en
Is it me, or maybe the <form> needs charset="UTF-8" added to it? Without it, the browser chooses the charset for sending data based on the moon phase.
Tei wrote:
here:
http://wikimediafoundation.org/wiki/Donate/Now/en
Is it me, or maybe the <form> needs charset="UTF-8" added to it? Without it, the browser chooses the charset for sending data based on the moon phase.
There's no charset attribute (there is accept-charset). Still, MediaWiki doesn't use it and doesn't have such errors. When no accept-charset is provided, user agents should use the charset in which the page was transmitted (UTF-8).
On Sun, Nov 30, 2008 at 10:52 AM, Jesse Plamondon-Willard wrote:
Names with non-Latin characters in the donation comments are broken and outputting as question marks. Some people are understandably unhappy that their names are not appearing next to their donations.
*Comments* are fine. *Donor names* are incorrectly encoded.
Tei wrote:
Is it me, or maybe the <form> needs charset="UTF-8" added to it?
Considering that the part that's broken isn't even *on* our form, I'm pretty sure it's not something on our form. :) The name gets put in at PayPal's forms, and is passed on to us with the payment completion data.
-- brion
On Mon, Dec 1, 2008 at 7:02 PM, Brion Vibber brion@wikimedia.org wrote:
Considering that the part that's broken isn't even *on* our form, I'm pretty sure it's not something on our form. :) The name gets put in at PayPal's forms, and is passed on to us with the payment completion data.
Can you reverse the buggy encoding of Paypal (iconv)?
Marco
Marco Schuster wrote:
Can you reverse the buggy encoding of Paypal (iconv)?
Assuming it's buggy, perhaps. First we need to confirm simply that PayPal is set to send us UTF-8 data... :)
-- brion
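A minimal sketch of that first check, assuming the raw bytes of a field from the payment notification are available. The function name `looks_like_utf8` and the sample name are hypothetical; nothing here is PayPal's or Wikimedia's actual code.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Made-up sample name, encoded two different ways.
name = "青竹"
print(looks_like_utf8(name.encode("utf-8")))      # True
print(looks_like_utf8(name.encode("shift_jis")))  # False
```

Note that pure ASCII input passes this check under almost any encoding, so it only says something useful for names containing non-ASCII characters.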
mizusumashi wrote:
Brion Vibber wrote:
Assuming it's buggy, perhaps. First we need to confirm simply that PayPal is set to send us UTF-8 data... :)
Can you send me the raw data fed from PayPal? If you can, I'll check Japanese character's encoding.
No as it contains private data, but we can check the encoding ourselves once we've gotten in there.
-- brion
mizusumashi wrote:
Brion Vibber wrote:
No as it contains private data, but we can check the encoding ourselves once we've gotten in there.
I see. Thank you for the reply.
Ok, I believe we've tracked it down -- the data came through fine from PayPal, and into our primary databases, but got corrupted during copying from the primary database to the reporting database.
The corruption appears to have been due to a temporary variable in the trigger script using the default server encoding (Latin-1) instead of the encoding of the related databases (UTF-8).
I've updated the trigger to define the charset on the variable, which should hopefully fix it for new entries, and have rebuilt the reporting database so old entries are now correct.
-- brion
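The symptom is consistent with a lossy charset conversion. A rough Python illustration (not the actual trigger, which is not shown in this thread), assuming the Latin-1 variable replaces every character it cannot represent with a question mark; `copy_through_latin1` is a hypothetical name:

```python
def copy_through_latin1(value: str) -> str:
    """Simulate passing text through a Latin-1-only variable.

    Characters outside Latin-1 cannot be stored, so they are replaced
    with '?', and the original text cannot be recovered afterwards.
    """
    return value.encode("latin-1", errors="replace").decode("latin-1")


print(copy_through_latin1("Jos\u00e9"))  # é exists in Latin-1, so it survives
print(copy_through_latin1("青竹"))        # ?? (no Latin-1 equivalent)
```

Because the replacement is lossy, rows that were already copied could not be repaired in place, which would fit with the reporting database having been rebuilt.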
Brion Vibber wrote:
Ok, I believe we've tracked it down -- the data came through fine from PayPal, and into our primary databases, but got corrupted during copying from the primary database to the reporting database.
The corruption appears to have been due to a temporary variable in the trigger script using the default server encoding (Latin-1) instead of the encoding of the related databases (UTF-8).
I've updated the trigger to define the charset on the variable, which should hopefully fix it for new entries, and have rebuilt the reporting database so old entries are now correct.
Thank you very much!
I see that some (maybe all) Japanese names are now displayed correctly. I am very glad; thank you for your work.
I do have one small complaint, though: surnames are displayed after personal names. As you know, in East Asia we write the surname first, followed by the personal name.
If you have a good way to correct the order only for East Asian names (and for people in other countries that use surname-first ordering), please do so. If not, I think the better option is to display everyone as "Surname, Personal name".
This is not a strong complaint, but I would be happier if you could do something about it.
By the way, I sent some mails to the wikitech-l mailing list, but they are not in the archive. Why?
---- mizusumashi
mizusumashi wrote:
By the way, I sent some mails to the wikitech-l mailing list, but they are not in the archive. Why?
Mails don't always show up immediately. Also, the archives are grouped per month, so you may have been trying to find e-mails sent in late November in the December archives.
Roan Kattouw (Catrope)
mizusumashi wrote:
I see that some (maybe all) Japanese names are now displayed correctly. I am very glad; thank you for your work.
Yay!
I do have one small complaint, though: surnames are displayed after personal names. As you know, in East Asia we write the surname first, followed by the personal name.
Hmmmmmm... we'll see if we get a display ordering or if we can arrange something else nice...
-- brion
Ok, quick summary:
1) PayPal sends us a payment record with 'first_name' and 'last_name' fields.
2) We insert that record into our CiviCRM database.
3) CiviCRM combines the first name and last name into a "display name"... per standard Western ordering assumptions.
4) The display name is copied into our public reporting database and shown on the web.
It looks like we can't do much about the name split in 1); that's just what we get out of the payment processor. We may be able to fudge things at step 3) by detecting Han characters and producing a properly-sorted display name, at least for that case.
Of course this will still be wrong for Hungarians, and Romanized Japanese names may often get written either way...
-- brion
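The step 3) workaround could look roughly like the sketch below. It assumes that "Han characters" means code points in the CJK Unified Ideographs block and Extension A, and that the two name parts are joined with a space; `contains_han` and `display_name` are hypothetical names, not CiviCRM functions, and the sample names are made up.

```python
# Simplified set of Han code point ranges (an assumption, not exhaustive).
HAN_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
)


def contains_han(text: str) -> bool:
    """Return True if any character in text falls in a Han range."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in HAN_RANGES)


def display_name(first_name: str, last_name: str) -> str:
    """Build a display name, putting the surname first for Han names."""
    if contains_han(first_name) or contains_han(last_name):
        return f"{last_name} {first_name}"
    return f"{first_name} {last_name}"


print(display_name("太郎", "山田"))               # 山田 太郎
print(display_name("Jesse", "Plamondon-Willard"))  # Jesse Plamondon-Willard
```

As noted above, this still gets Hungarian names and romanized Japanese names wrong, since nothing in the characters themselves signals the preferred order.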
On Wed, Dec 3, 2008 at 7:56 PM, Brion Vibber brion@wikimedia.org wrote:
Ok, quick summary:
1) PayPal sends us a payment record with 'first_name' and 'last_name' fields.
2) We insert that record into our CiviCRM database.
3) CiviCRM combines the first name and last name into a "display name"... per standard Western ordering assumptions.
4) The display name is copied into our public reporting database and shown on the web.
It looks like we can't do much about the name split in 1); that's just what we get out of the payment processor. We may be able to fudge things at step 3) by detecting Han characters and producing a properly-sorted display name, at least for that case.
Of course this will still be wrong for Hungarians, and Romanized Japanese names may often get written either way...
Thank you for considering Hungarian. You could detect Hungarians by simply looking for donations in Hungarian Forints (HUF).
Best regards, Bence Damokos
Roan Kattouw wrote:
Bence Damokos wrote:
Thank you for considering Hungarian. You could detect Hungarians by simply looking for donations in Hungarian Forints (HUF).
Note that not all people who live in Hungary have Hungarian names, and not all Hungarians live in Hungary.
Basically, names are hard. :)
The only way to do it right reliably is to let the person type in their name the way they want it and then *not change it*.
Unfortunately we get the name already divided up from PayPal and are stuck either guessing or making an unattractive 'Surname, Given' display which looks bad for everyone. :(
-- brion
Unfortunately we get the name already divided up from PayPal and are stuck either guessing or making an unattractive 'Surname, Given' display which looks bad for everyone. :(
There is something to be said for annoying everyone equally. Being an international organisation is very important for the Foundation, so it may well be worth annoying (non-Hungarian) Westerners unnecessarily in order to show that we're not favouring any nationality over others. (This all assumes that people who use the surname-given name order will actually care; they may all be so used to having their names mangled that they barely notice anymore. A little market research may be called for.)
On Wed, Dec 3, 2008 at 10:01 PM, Roan Kattouw roan.kattouw@home.nl wrote:
Bence Damokos wrote:
Thank you for considering Hungarian. You could detect Hungarians by simply looking for donations in Hungarian Forints (HUF).
Note that not all people who live in Hungary have Hungarian names, and not all Hungarians live in Hungary.
No such data is released (you can't filter donations by currency or, even better, by currency plus location), so I'm just guessing that those donating in forints are mostly (~100%) Hungarians, while there is no easy way to find the Hungarians among those not donating in forints.
I didn't want to elaborate on this in my previous mail, but as long as the surname-first order is not considered wrong, strange or out of place in the context of English (and possibly other languages), then using this order would be a win-win: it would still be acceptable on the English and other interfaces, and on the Hungarian interface it would be correct.
However, most Hungarians themselves use the Western order when giving their names in English (and, I guess, in most foreign languages and contexts), so the Western order would be correct on every interface language except Hungarian (and possibly those of other countries that use the non-Western order). I dare say people don't or wouldn't mind, as they understand that the context is mostly English (the website of an American foundation, where even the currencies look 'foreign'). In conclusion, I would let the Hungarians' names rest for this year :).
Unfortunately we get the name already divided up from PayPal and are stuck either guessing or making an unattractive 'Surname, Given' display which looks bad for everyone. :(
You have a box for comments that is independent of the PayPal people. Maybe a solution would be to have three options instead of two at the privacy checkbox: Display my name [default], Anonymous donation, Display a custom name [this could possibly also work for donating in someone else's name, if that's not a privacy concern]. -- Bence Damokos (Damokos Bence in Hungary)
(long, complex solutions to guess the right display)
Why not have a "Show Name, Surname / Show Surname, Name" option on the donation display? Easy, consistent, and everybody should be happy with it.
Platonides wrote:
(long, complex solutions to guess the right display)
Why not have a "Show Name, Surname / Show Surname, Name" option on the donation display? Easy, consistent, and everybody should be happy with it.
Because it would show everything wrong? :)
-- brion
Brion Vibber wrote:
Because it would show everything wrong? :)
-- brion
Why? Western names would be shown in the 'wrong' order when viewed with the Eastern setting, and vice versa. But it'd be a client setting, so anyone could view the list in the order that suits them best.
Platonides wrote:
Why? Western names would be shown in the 'wrong' order when viewed with the Eastern setting, and vice versa. But it'd be a client setting, so anyone could view the list in the order that suits them best.
Because the order is dependent on the listed name, not on the viewer.
-- brion