Hi,
I am from Malayalam Wikipedia (ml.wikipedia - user:Praveenp), and my
language is Malayalam. Please consider a big problem of ours.
After the release of Unicode 5.1.0, there are two kinds of encoding for
some characters of the Malayalam alphabet (because of backward
compatibility). This causes serious problems with linking, searching, etc.
in the MediaWiki software. Currently Windows 7 is the only operating system
which supports Unicode 5.1.0 (as far as I know), but a lot of
third-party tools for writing and reading Malayalam support the new
version. And by now a large quantity of data in Wikimedia projects is in
the new version. It is not possible to link to or search for titles encoded
in pre-Unicode 5.1.0 from Unicode 5.1.0 or vice versa. Currently one of our
namespaces, വര്‍ഗ്ഗം (Category), also has one such character, so it is
possible to write വര്‍ഗ്ഗം as വർഗ്ഗം, which renders the same as the first but
differs in encoding. This causes problems in categorization as well.
Is it possible to put some Unicode equivalence
<http://en.wikipedia.org/wiki/Unicode_equivalence> handling in the
MediaWiki software? We need urgent help.
Please also check
http://unicode.org/versions/Unicode5.1.0/#Malayalam_Chillu_Characters
    Visual          Representation in 5.0 and prior   Preferred 5.1 representation
 1  CHILLU_NN.png   0D23, 0D4D, 200D                  0D7A
 2  CHILLU_N.png    0D28, 0D4D, 200D                  0D7B
 3  CHILLU_RR.png   0D30, 0D4D, 200D                  0D7C
 4  CHILLU_L.png    0D32, 0D4D, 200D                  0D7D
 5  CHILLU_LL.png   0D33, 0D4D, 200D                  0D7E
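If MediaWiki applied such an equivalence step, it could fold the old
sequences into the new code points before storing or comparing titles. A
minimal illustrative sketch in Python (not actual MediaWiki code; MediaWiki
itself is PHP):

```python
# Map pre-5.1 chillu sequences (BASE + VIRAMA U+0D4D + ZWJ U+200D)
# to the atomic Unicode 5.1 chillu code points from the table above.
CHILLU_MAP = {
    "\u0d23\u0d4d\u200d": "\u0d7a",  # CHILLU NN
    "\u0d28\u0d4d\u200d": "\u0d7b",  # CHILLU N
    "\u0d30\u0d4d\u200d": "\u0d7c",  # CHILLU RR
    "\u0d32\u0d4d\u200d": "\u0d7d",  # CHILLU L
    "\u0d33\u0d4d\u200d": "\u0d7e",  # CHILLU LL
}

def normalize_chillus(title: str) -> str:
    """Fold old-style chillu encodings into the 5.1 form."""
    for old, new in CHILLU_MAP.items():
        title = title.replace(old, new)
    return title
```

Titles normalized this way would compare equal no matter which encoding the
contributor typed.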
Thanks
A few months ago I created an Ideatorrent site at the request of the
Wikipedia Usability Initiative team. We wanted to have integrated
authentication with the rest of Wikimedia before promoting it, but
that will likely be a while, and we think Ideatorrent could be useful
to the community. For now, you'll need to create an account. So,
here's the link:
http://prototype.wikimedia.org/en-idea/
I thought I'd start it off by adding an initial idea:
http://prototype.wikimedia.org/en-idea/ideatorrent/idea/4/
1-3 were tests. As of right now I am the only moderator/admin. If
anyone has ideas on how we should handle moderation and/or admin
rights, please let me know.
Respectfully,
Ryan Lane
Hi Bawolff,
interesting project! I am currently preparing a "light" version of SMW that
does something very similar, but using wiki-defined properties for adding
metadata to normal pages (in essence, SMW is an extension to store and
retrieve page metadata for properties defined in the wiki -- like XMP for MW
pages; though our data model is not quite as sophisticated ;-).
The use cases for this light version are just what you describe: simple
retrieval (select) and basic inverse searches. The idea is to thus have a
solid foundation for editing and viewing data, so that more complex functions
like category intersections or arbitrary metadata conjunctive queries would be
done on external servers based on some data dump.
It would be great if the table you design could be used for such metadata as
well. As you say, XMP already requires extensibility by design, so it might
not be too much work to achieve this. SMW properties are usually identified by
pages in the wiki (like categories), so page titles can be used to refer to
them. This just requires that the meta_name field is long enough to hold MW
page title names. Your meta_schema could be used to separate wiki properties
from other XMP properties. SMW Light does not require nested structures, but
they could be interesting for possible extensions (the full SMW does support
one-level of nesting for making compound values).
Two things about your design I did not completely understand (maybe just
because I don't know much about XMP):
(1) You use mediumblob for values. This excludes range searches for numerical
image properties ("Show all images of height 1000px or more") which do not
seem to be overly costly if a suitable schema were used. If XMP has a typing
scheme for property values anyway, then I guess one could find the numbers and
simply put them in a table where the value field is a number. Is this use case
out of scope for you, or do you think the cost of reading from two tables too
high? One could also have an optional helper field "meta_numvalue" used for
sorting/range-SELECT when it is known from the input that the values that are
searched for are numbers.
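To illustrate the helper-field idea with a toy example (an sqlite3 stand-in;
the image_meta table and all column names here are hypothetical, not
MediaWiki's schema):

```python
import sqlite3

# Hypothetical table with an optional numeric copy of the value.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE image_meta (
    meta_page INTEGER,
    meta_name TEXT,
    meta_value BLOB,
    meta_numvalue REAL  -- filled only when the value parses as a number
)""")
rows = [(1, "ImageHeight", "1200", 1200.0),
        (2, "ImageHeight", "800", 800.0),
        (3, "Software", "GIMP", None)]
db.executemany("INSERT INTO image_meta VALUES (?, ?, ?, ?)", rows)

# "Show all images of height 1000px or more" as a numeric range select:
tall = db.execute(
    "SELECT meta_page FROM image_meta "
    "WHERE meta_name = 'ImageHeight' AND meta_numvalue >= 1000"
).fetchall()
```

The blob column keeps the canonical value, while the numeric copy makes
range SELECTs and ORDER BY cheap whenever the caller knows it is asking
about a number.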
(2) Each row in your table specifies property (name and schema), type, and the
additional meta_qualifies. Does this mean that one XMP property can have
values of many different types and with different flags for meta_qualifies?
Otherwise it seems like a lot of redundant data. Also, one could put stuff
like type and qualifies into the mediumblob value field if they are closely
tied together (I guess, when searching for some value, you implicitly specify
what type the data you search for has, so it is not problematic to search for
the value + type data at once). Maybe such considerations could simplify the
table layout, and also make it less specific to XMP.
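For concreteness, the columns proposed in the quoted mail could be mocked up
like this (a rough sqlite3 stand-in; the production table would presumably
be MySQL, and the meta_type names are invented):

```python
import sqlite3

# Rough stand-in for the proposed metadata table.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE meta (
    meta_id INTEGER PRIMARY KEY AUTOINCREMENT,
    meta_page INTEGER,      -- which image this row belongs to
    meta_type TEXT,         -- 'simple', 'seq', 'bag', 'alt', ... (invented)
    meta_schema TEXT,       -- XMP namespace, e.g. exif
    meta_name TEXT,
    meta_value BLOB,        -- NULL for compound entries
    meta_ref INTEGER,       -- parent meta_id for nested structures
    meta_qualifies INTEGER  -- boolean: is this row a qualifier?
)""")
db.execute("INSERT INTO meta (meta_page, meta_type, meta_schema, meta_name, "
           "meta_value, meta_ref, meta_qualifies) "
           "VALUES (7, 'simple', 'exif', 'Software', 'GIMP', NULL, 0)")

# Use case 2: a flat select that ignores the tree structure entirely.
images = db.execute(
    "SELECT meta_page FROM meta "
    "WHERE meta_name = 'Software' AND meta_value = 'GIMP'"
).fetchall()
```

The flat select never touches meta_ref, which is the point: the tree is only
reconstructed when a caller actually needs it.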
But overall, I am quite excited to see this project progressing. Maybe we
could have some more alignment between the projects later on (How about
combining image metadata and custom wiki metadata about image pages in
queries? :-) but for GSoC you should definitely focus on your core goals and
solve this task as well as possible.
Best regards,
Markus
On Friday, 28 May 2010, bawolff wrote:
> Hi all,
>
> For those who don't know me, I'm one of the GSOC students this year.
> My mentor is ^demon, and my project is to enhance support for metadata
> in uploaded files. Similar to the recent thread on interwiki
> transclusions, I'd thought I'd ask for comments about what I propose
> to do.
>
> Currently metadata is stored in img_metadata field of the image table
> as a serialized PHP array. While this works fine for the primary use
> case - listing the metadata in a little box on the image description
> page - it's not very flexible. It's impossible to do queries like get a
> list of images with some specific metadata property equal to some
> specific value, or get a list of images ordered by what software
> edited them.
>
> So as part of my project I would like to move the metadata to its own
> table. However I think the structure of the table will need to be a
> little more complicated than just <page id>, <name>, <value> triples,
> since ideally it would be able to store XMP metadata, which can
> contain nested structures. XMP metadata is pretty much the most
> complex metadata format currently popular (for metadata stored inside
> images anyways), and can store pretty much all other types of
> metadata. It's also the only format that can store multi-lingual
> content, which is a definite plus as those commons folks love their
> languages. Thus I think it would be wise to make the table store
> information in a manner that is rather close to the XMP data model.
>
> So basically my proposed metadata table looks like:
>
> *meta_id - primary key, auto-incrementing integer
> *meta_page - foreign key for page_id - what image is this for
> *meta_type - type of entry - simple value or some sort of compound
> structure. XMP supports ordered/unordered lists, associative array
> type structures, alternate arrays (things like arrays listing the
> value of the property in different languages).
> *meta_schema - XMP uses different namespaces to prevent name
> collisions. EXIF properties have their own namespace, IPTC properties
> have their own namespace, etc.
> *meta_name - The name of the property
> *meta_value - the value of the property (or null for some compound
> things, see below)
> *meta_ref - a reference to a meta_id of a different row for nested
> structures, or null if not applicable (or 0 perhaps)
> *meta_qualifies - boolean to denote if this property is a qualifier
> (in XMP there are normal properties and qualifiers)
>
> (see http://www.mediawiki.org/wiki/User:Bawolff/metadata_table for a
> longer explanation of the table structure)
>
> Now, before everyone says eww nested structures in a db are
> inefficient and whatnot, I don't think it's that bad (however, I'm new
> to the whole scalability thing, so hopefully someone more
> knowledgeable than me will confirm or deny that).
>
> The XMP specification specifically says that there is no artificial
> limit on nesting depth; however, in general practice it's not nested
> very deeply. Furthermore in most cases the tree structure can be
> safely ignored. Consider:
> *Use case 1 (the primary use case): displaying a metadata info box on an
> image page. Most of the time that'd be translating specific names and
> values into HTML table cells. The tree structure is totally
> unnecessary. For example, the EXIF property DateTimeOriginal can only
> appear once per image (also it can only appear at the root of the tree
> structure, but that's beside the point). There is no need to reconstruct
> the tree, just look through all the props for the one you need. If the
> tree structure is important it can be reconstructed on the php side,
> and would typically be only the part of the tree that is relevant, not
> the entire nested structure.
> *Use case 2 (secondary): get a list of images ordered by some
> property starting at foo, or get a list of images where property bar =
> baz. In this case it's a simple select. It does not matter where in the
> tree structure the property is.
>
> Thus, all the nestedness of XMP is preserved (so we could re-output it
> into XMP form if we so desired), and there is no evil joining of the
> metadata table with itself over and over again (or at all) - from
> what I understand, self-joining to reconstruct nested structures is
> what makes them inefficient in databases.
>
> I also think this schema would be future proof because it can store
> pretty much all metadata we can think of. We can also extend it with
> custom properties we make up that are guaranteed to not conflict with
> anything (The X in xmp is for extensible).
>
> As a side-note, based on my rather informal survey of commons (aka the
> couple people who happened to be on #wikimedia-commons at that moment)
> another use-case people think would be cool and useful is metadata
> intersections, and metadata-category intersections. I'm not planning
> to do this as part of my project, as I believe that would have
> performance issues. However doing a metadata table like this does
> leave the possibility open for people to do such intersection things
> on the toolserver or in a DPL-like extension.
>
> I'd love to get some feedback on this. Is this a reasonable approach
> for me to take?
>
> Thanks for reading.
>
> --
> -bawolff
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
--
Markus Krötzsch <markus(a)semantic-mediawiki.org>
* Personal page: http://korrekt.org
* Semantic MediaWiki: http://semantic-mediawiki.org
* Semantic Web textbook: http://semantic-web-book.org
--
Here's the last post I could find on the subject:
> For my part, I'm firmly against joining the "provider but not
> consumer" camp. It's of no benefit to anyone . . .
I just thought of a great benefit, however. Consider this true
scenario: I want to write a MediaWiki API client for editors;
something like the Wordpress Dashboard. Really give editors a modern
web experience. I'd want to do this as a Rails app: I could build it
quickly and find lots of collaborators via GitHub.
But there's one problem: people would need to log in to Wikipedia
*through my app*. They'd have to enter their username and password to
my app, which would turn around and authenticate via the MediaWiki API.
Policy-wise, this isn't a good thing; that is, it gives people the
message that it's OK to type your credentials into something other
than Wikipedia sites.
And I believe that this is why no such app exists. And further, why
the only similar apps that have been made were fat clients, and e.g.
Windows only. Because then, the credentials stay on the user's
computer.
But imagine: If Wikipedia was an OpenID Provider, or provided OAuth,
then my Rails app would be the OpenID Consumer. It'd send people to
Wikipedia to log in, and they'd bounce back and begin using the Rails
app. My app would never see any private information.
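A sketch of what that consumer side might look like; every URL and
parameter name below is hypothetical, since Wikipedia exposes no such
endpoint today:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical provider endpoint - Wikipedia offers no such URL today.
PROVIDER = "https://en.wikipedia.org/wiki/Special:OAuth/authorize"

def login_redirect_url(callback_url: str) -> str:
    # Step 1: the third-party app sends the user to the provider;
    # credentials are only ever typed into the provider's own site.
    return PROVIDER + "?" + urlencode({"callback": callback_url})

def handle_callback(url: str) -> str:
    # Step 2: the provider bounces the user back with a token the app
    # can use for API calls; the app never sees the user's password.
    return parse_qs(urlparse(url).query)["token"][0]
```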
I believe this would encourage a new wave of 3rd party app
development; everything from big ambitious projects (like my editor
dashboard) to small focussed apps (say, a simple web app just for
editing one's talk page).
Just thinking out loud here!
Robb
Hi Dan,
There is a list of browsers compatible with Selenium (See
http://seleniumhq.org/about/platforms.html#browsers ). The page states that
Selenium works with Firefox 2+ when a Linux OS is used (I think Ubuntu would
fall under this category ).
I am using Firefox 3.5.9 on Ubuntu 9.10. I have been finishing another
project (my grandfather visited me in Oregon from Ohio) and have not played
with the Selenium Framework since May 14th. I will let you know if I
see the error messages.
Michelle Knight
Date: Tue, 18 May 2010 17:44:03 +0000 (UTC)
From: Dan Nessett <dnessett(a)yahoo.com>
Subject: Re: [Wikitech-l] Selenium testing framework
To: wikitech-l(a)lists.wikimedia.org
On Tue, 18 May 2010 19:27:38 +0200, Markus Glaser wrote:
> Hi Dan,
>
> I had these error messages once when I used Firefox 3.6 for testing.
> Until recently, Selenium did not support this browser. Apparently now
> they do, but I did not have a chance to test this yet. So the solution
> for me was to point Selenium to a Firefox 3.5.
>
> Cheers,
> Markus
My OS is Ubuntu 8.04. The version of Firefox is 3.0.19. Since Ubuntu
automatically updates versions of its software, I assume this is the most
up-to-date.
Is there a list of browser versions compatible with selenium?
There are a few comments on the Wikimedia blog saying they can't access
en:wp any more using their BlackBerry. Though we tried it here on an
8900 and it works. Any other reports?
- d.
Good afternoon,
This morning I bumped the revision number to 2.0[0]. Some
people on IRC didn't like this, so I reverted it and I'm bringing
it here. I don't think anyone really wants to keep doing 1.x
releases (people seriously get confused that 1.10 comes after
1.6). The following suggestions have been put forward:
- Drop the 1 from 1.17.x and make the releases start counting
from 17.x, 18.x, etc.
- Bump 1.x to 2.0 and move forward from there.
- Drop numbers entirely, and pick silly names
Thoughts?
-Chad
[0] http://mediawiki.org/wiki/Special:Code/MediaWiki/66923
Hi Dan and Markus
I have added some troubleshooting tips, based on notes I took during the
Friday May 14 meeting, to the Selenium Framework page:
http://www.mediawiki.org/wiki/SeleniumFramework#Working_example . I think it
has the tip about port 4444. My intent was to add information for problem
solving and notes as I was using Selenium. If this is something that needs
to be updated, then let me know. Also let me know if I need to reference the
person who suggested the tip on the page (to give appropriate credit). I
will be copying the php framework into Selenium IDE and giving it a try.
Michelle Knight
---------------------------
Hi Markus,
Despite my initial problems getting the Selenium Framework to run, I think
it is a great start. Now that I have the PagedTiffHandler working, here is
some feedback on the current framework:
+ When I svn up ../tests (or any ancestor directory), the local changes I
make to RunSeleniumTests cause a local conflict error. Eventually, many of
the configuration changes I made should appear in LocalSeleniumSettings, but
it isn't clear that is possible for all of them. For example, I change the
commented out set_include_path to include my local PHP/PEAR directory. Can
this be set in LocalSeleniumSettings?
Another difference is the include_once() for each test suite. Is it possible
to move these into LocalSeleniumSettings?
+ It appears there is no way to tell RunSeleniumTests to use a selenium
server port other than 4444. It would be useful to have a -port parameter on
RunSeleniumServer for this. For example, if there are multiple developers
working on the same machine, they probably need to use different selenium
servers differentiated by different port numbers.
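The switch itself would be trivial; an illustrative sketch of the idea in
Python (RunSeleniumTests is PHP, so only the shape is shown, and the option
name is just a suggestion):

```python
import argparse

# Sketch of the proposed port option, defaulting to Selenium's
# conventional port 4444 when nothing is given on the command line.
def parse_args(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=4444,
                        help="Selenium server port (default 4444)")
    return parser.parse_args(argv)
```

Each developer on a shared machine could then start their own Selenium
server on a distinct port and point the test runner at it.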
I don't mind working on both of these issues, but since you are the original
architect of the framework, it is probably best for you to comment on them
first and perhaps suggest what you consider to be the best approach to their
resolution.
--
-- Dan Nessett