Hello,
Some of you may be familiar with sharing links on Facebook.
If I try to share an article from the Hebrew or Arabic Wikipedia on
Facebook, the Facebook interface offers a thumbnail taken not from the
article, but from the main page of that Wikipedia. The summary text
describing the link is also taken from the main page rather than from
the article.
It may have something to do with the non-ASCII title of the article: for
example, the Hebrew article about Haifa [1] has this problem, but the Hebrew
article about Extreme Programming,[2] with an English title, is OK. However,
it does not happen with non-ASCII titles on en.wp; for example, "Ça
m'énerve" [3] is OK.
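For what it's worth, the titles in [1] and [3] appear to be percent-encoded
the same way, as plain UTF-8. Here is a quick check (my own sketch, Python
3; the titles are taken from [1] and [3]):

    # Both titles percent-encode to exactly the bytes seen in the URLs
    # [1] and [3]; MediaWiki just replaces spaces with underscores first.
    from urllib.parse import quote

    print(quote("חיפה"))                           # %D7%97%D7%99%D7%A4%D7%94, as in [1]
    print(quote("Ça m'énerve".replace(" ", "_")))  # %C3%87a_m%27%C3%A9nerve, as in [3]

So both wikis seem to produce the same kind of URL, which makes me suspect
the difference is elsewhere.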
Who should fix it: Facebook or Wikipedia? Are there any differences in
configuration in this regard between en, ar and he?
Thanks in advance.
[1] http://he.wikipedia.org/wiki/%D7%97%D7%99%D7%A4%D7%94
[2] http://he.wikipedia.org/wiki/Extreme_Programming
[3] http://en.wikipedia.org/wiki/%C3%87a_m%27%C3%A9nerve
--
אמיר אלישע אהרוני
Amir Elisha Aharoni
http://aharoni.wordpress.com
"We're living in pieces,
I want to live in peace." - T. Moore
Hi,
I tried using mwdumper (latest SVN revision 57818)
to import jawiki-20090927-pages-articles.xml [1]
into MySQL, but I got an error:
Data too long for column 'rev_comment'
The problem is that the XML file contains a revision
comment that is 257 bytes long, but the column
accepts at most 255 bytes.
At first I was stumped as to how this could happen,
but then I found that on the Wikipedia page, the
comment ends with the byte 'e3', while in the
XML file it ends with 'ef bf bd'. See [2] for details.
I think the cause is something like this:
- Comments are truncated to 255 bytes when they
are stored.
- In this case, that cuts off a three-byte UTF-8
sequence after its first byte (hex value e3), so the
comment ends with a lone lead byte, which is not
valid UTF-8.
- The dump process has to generate valid UTF-8
(otherwise most XML parsers would reject the file),
so it replaces the invalid byte with the 'replacement
character' U+FFFD, whose UTF-8 encoding is the
three-byte sequence 'ef bf bd'. See [3].
- As a result, the 255-byte comment loses its one
invalid byte but gains the three-byte replacement,
growing to 257 bytes.
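That reasoning can be reproduced in a few lines (my own sketch, Python 3;
the 'x' padding is made up, but the e3 lead byte matches the actual
comment):

    # Truncating UTF-8 text at a byte count can cut a multi-byte
    # sequence in half; replacing the invalid remainder with U+FFFD
    # then makes the text longer than the limit.
    text = "x" * 254 + "あ"           # 'あ' is three UTF-8 bytes: e3 81 82
    raw = text.encode("utf-8")[:255]  # byte-level truncation, as on save
    print(raw[-1:])                   # b'\xe3' - a lone lead byte, invalid
    fixed = raw.decode("utf-8", errors="replace")  # roughly what the dump does
    print(len(fixed.encode("utf-8")))  # 257: U+FFFD re-encodes as ef bf bd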
How to fix this? I think MediaWiki should make sure
that a comment contains only valid UTF-8 sequences,
even when it is truncated. This may mean that it
has to be truncated to fewer than 255 bytes.
Alternatively, the dump process could drop invalid
UTF-8 sequences instead of replacing them.
Yet another fix: mwdumper could make sure
that a comment is at most 255 bytes long and
truncate it if necessary.
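To illustrate the first fix, here is what truncating at a character
boundary could look like (again my own sketch, in Python 3; MediaWiki
itself is PHP, so this shows the idea, not the actual code):

    def truncate_utf8(text, max_bytes=255):
        # Truncate so the UTF-8 encoding fits in max_bytes without
        # ever cutting a multi-byte sequence in half.
        raw = text.encode("utf-8")[:max_bytes]
        # errors="ignore" drops the trailing bytes of an incomplete
        # sequence, so the result is always valid UTF-8.
        return raw.decode("utf-8", errors="ignore")

    comment = "x" * 254 + "あ"  # same example as above
    print(len(truncate_utf8(comment).encode("utf-8")))  # 254: valid, <= 255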
More details can be found at [2].
Bye,
Christopher
[1] http://download.wikimedia.org/jawiki/20090927/jawiki-20090927-pages-article…
[2] http://en.wikipedia.org/wiki/User:Chrisahn/CommentTooLong
[3] http://www.utf8-chartable.de/unicode-utf8-table.pl?start=65520
Unsubscribe please.
Thank you!
2009/10/17 Q <overlordq(a)gmail.com>
> Max Semenik wrote:
> >> On 10/15/09 7:52 AM, Tim Starling wrote:
> >>> Max Semenik (maxsem) to work on SQLite support.
> >>>
> >>> -- Tim Starling
> >> Thanks for getting the updates going Tim!
> >
> > Sorry for appearing so late in this discussion, I unsubscribed myself
> > from this list after withdrawing my request and noticed Tim's email
> > only this morning.
> >
> > Now to the point.
> >
> > I'm terribly sorry for this unbelievable display of dickheadery from
> > me. Yes, I was pissed off because, due to the long waiting period, I had
> > lost the passion to participate in development, and I have yet to regain
> > it, despite Tim's incredible kindness in approving me anyway in spite of
> > my ugly behaviour. However, this did not give me the right to attack
> > Brion publicly, especially in such a tone, or to generate this drama.
> > Please accept my humble apologies.
> >
> > Guess the only way to redeem myself would be to do my best in
> > contributing to MediaWiki.
> >
> > Once again, I'm sorry. [strikes the wall with his head]
> >
>
> If there were a Drama Jar that people had to put $1 in for silly,
> pointless drama related to Wikipedia, the WMF wouldn't have to do
> fundraisers anymore.
>
> So I really wouldn't worry too much about it :)
>
> -Q
All,
this is a quick note to let you know that we've signed on two people,
William Pietri and Howie Fung, to help us on a contract basis with the
deployment of Flagged Revisions on the English Wikipedia.
William is an IT consultant and systems/software engineer; see his
userpage at http://en.wikipedia.org/wiki/User:William_Pietri --
he is also a long-time Wikipedian. He will support the overall
roll-out coordination and requirements planning.
Howie is an experienced product manager who previously worked at
RealNetworks/Rhapsody and PayPal; see his LinkedIn profile here:
<http://www.linkedin.com/profile?viewProfile=&key=3830901>.
We met Howie during our search for the Multimedia Usability Product
Manager position, and were impressed by his background, particularly
with regard to user-focused product development, and by some great first
thoughts he sent us on how to improve the usability of Wikimedia
Commons. Given his background, we thought it would be great to have
his help in doing some more systematic analysis of usability,
terminology and workflow issues with the proposed English Wikipedia
roll-out.
We are eager to roll out the "Flagged Protection" functionality soon.
As Howie and William get up to speed, they'll post info on remaining
work, and get community feedback on what's vital to have before the
first release.
--
Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
You know... that /tests directory. It keeps fooling
people into thinking we keep unit tests there ;-)
-Chad
On Oct 16, 2009 2:29 PM, "Aryeh Gregor"
<Simetrical+wikilist(a)gmail.com> wrote:
On Fri, Oct 16, 2009 at 2:26 PM, William Pietri <william(a)scissor.com> wrote:
> [1] If I end up with ...
What existing test suite?
Hi all,
Is it possible to find out how many times any given image has been
viewed?
I know that this can be done on Wikipedia (e.g. [1]), but can it be
done on Commons?
Ideally, I'd like to know how many times an image has been viewed
across all the sites that use Commons as their image repository,
preferably via the API. I see that there is a "counter" field in the
API, but it doesn't seem to be working [2].
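For reference, here is the query from [2] as a small script (my own
sketch, Python 3, stdlib only). As far as I know, whether the response
includes a "counter" field depends on the wiki's configuration (I believe
the per-page view counter is disabled on Wikimedia sites via
$wgDisableCounters, which would explain why it appears not to work):

    import json
    from urllib.request import urlopen
    from urllib.parse import urlencode

    params = urlencode({
        "action": "query",
        "titles": "File:Queen Wilhelmina2.jpg",
        "prop": "info",
        "format": "json",
    })
    resp = urlopen("http://commons.wikimedia.org/w/api.php?" + params)
    data = json.loads(resp.read().decode("utf-8"))
    for page in data["query"]["pages"].values():
        # "counter" is only present if the wiki keeps view counts.
        print(page.get("counter", "no counter returned"))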
It's very useful to know the number of times images have been viewed
when talking to people about releasing their images onto Commons.
This was mentioned in one of the outcomes of the GLAM-WIKI event [3]:
"Provide statistics on visits to articles and images (including
thumbnail loads) related to cultural institutions, information or
images provided by them, or articles about their collections, and
work with GLAM institutions on adding this to their existing
metrics." A lot of this can be done externally if the information can
be provided by the API or some other means.
Thanks,
Mike
[1] http://stats.grok.se/en/200909/File%3AQueen%20Wilhelmina2.jpg
[2] http://commons.wikimedia.org/w/api.php?action=query&titles=File:Queen%20Wilhelmina2.jpg&prop=info&inprop=protection|talkid
[3] http://meta.wikimedia.org/wiki/GLAM-WIKI_Recommendations
I've got that page on my watchlist and it doesn't change
often. Checking it when it does is easy. The problem is
finding someone who a) is willing to be that person and
b) is trusted enough to make the decision whether or
not to grant access.
-Chad
On Oct 14, 2009 11:49 AM, "Aryeh Gregor"
<Simetrical+wikilist(a)gmail.com> wrote:
On Wed, Oct 14, 2009 at 11:42 AM, Gerard Meijssen <gerard.meijssen(a)gmail.com>
wrote: > Checking some...
Not if it doesn't change on most days... load it up with all the
sites you check every day and take ten seconds to see if there are any
new requests. Or subscribe to the RSS feed, or whatever you want.
Actions like [1] cannot possibly be considered a good thing, but I certainly
understand the sentiment behind them. Attracting new contributors is
probably the most vital task of any collaborative volunteer project;
without them, any such project is doomed to failure. This is an issue that
affects Wikimedia as a whole, but MW development is arguably one of the
subprojects least successful at it.
Who is responsible for commit access requests now that Brion has left? That
applies both in the long term, once the new CTO is hired, and in the short
term, for the requests that are in the queue Right Now. Whoever is
responsible needs, as a matter of urgency, both to clear the current queue
and to establish an ongoing procedure for clearing it more regularly in
future. Brion's last review was on 4th August and left several requests
unresolved. One of those requests is now over four months old. This is not
an acceptable or sustainable gauntlet to put eager new contributors through.
--HM
[1]
http://www.mediawiki.org/w/index.php?title=Commit_access_requests&curid=369…