I remember we discussed using asserts and decided they're a bad idea
for WMF-deployed code - yet I see
Warning: assert() [<a href='function.assert'>function.assert</a>]: Assertion failed in /usr/local/apache/common-local/php-1.22wmf12/extensions/WikibaseDataModel/DataModel/Claim/Claims.php on line 291
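A minimal Python sketch of the underlying point (the code in question is PHP, and these function names are made up for illustration): assertions can be disabled at runtime, so any validation they carry silently disappears in production, which is exactly why an explicit exception is safer in deployed code.

```python
def remove_claim(claims, guid):
    # Fragile: under `python -O`, assert statements are stripped
    # entirely, so this check vanishes in optimized/production runs.
    assert guid in claims, "unknown claim GUID"
    del claims[guid]

def remove_claim_checked(claims, guid):
    # Robust: an explicit exception survives any runtime configuration
    # and produces an actionable error instead of a silent warning.
    if guid not in claims:
        raise KeyError(f"unknown claim GUID: {guid}")
    del claims[guid]
```

The same reasoning applies to PHP's `assert()`, which can be disabled via configuration.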
Thoughts?
--
Best regards,
Max Semenik ([[User:MaxSem]])
Hey all,
Mozilla made an announcement yesterday about a new framework called Minion:
http://blog.mozilla.org/security/2013/07/30/introducing-minion/
https://github.com/mozilla/minion
It's an automated security testing framework for use in testing web
applications. I'm currently looking into how to use it. Would there be any
interest in setting up such a framework for automated security testing of
MediaWiki?
--
Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science
www.whizkidztech.com | tylerromeo(a)gmail.com
Hallo,
I would like to announce the release of MediaWiki language extension
bundle 2013.07
* https://translatewiki.net/mleb/MediaWikiLanguageExtensionBundle-2013.07.tar…
* sha256sum: ca381ea1bc1f10c56df28353f91a25129c604ff11938b424833925e8716e2ff3
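A small sketch of how a downloaded tarball can be checked against the published sha256sum above (the filename below is illustrative, not the real bundle name):

```python
import hashlib

def sha256_of(path):
    # Stream the file in chunks so large tarballs need not fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the result against the published checksum, e.g.:
# sha256_of("MediaWikiLanguageExtensionBundle-2013.07.tar.bz2") == "ca381ea1..."
```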
Quick links:
* Installation instructions are at https://www.mediawiki.org/wiki/MLEB
* Announcements of new releases will be posted to a mailing list:
https://lists.wikimedia.org/mailman/listinfo/mediawiki-i18n
* Report bugs to https://bugzilla.wikimedia.org
* Talk with us at #mediawiki-i18n @ freenode
Release notes for each extension are below.
Amir E. Aharoni
== Babel ==
Only localization updates.
== cldr ==
No changes.
== CleanChanges ==
Only localization updates.
== LocalisationUpdate ==
Only localization updates.
== Translate ==
=== Noteworthy changes ===
Groups are sorted alphabetically in the export tab of Special:Translate.
Support for Yandex Translate API v1.5.
Edit summaries for automated edits are written in the content language
(bug 52142).
== UniversalLanguageSelector ==
=== Noteworthy changes ===
The functions for web fonts loading were optimized to improve performance.
The internals of loading translated messages were changed from the
original jquery.i18n implementation to allow loading messages from
other domains (CORS).
Language code aliases are now used properly in the Common languages
section. This allows, for example, proper display of Tagalog for users
from the Philippines.
The variable $wgULSNoImeSelectors was added to disable IME on elements
by specifying jQuery selectors that match them.
The CSS class 'uls-settings-trigger' can be added to any element so
that clicking it will make the ULS appear. It is useful for
documentation and examples.
Web fonts are applied to the IME selector menu, too.
=== Fonts ===
Persian and Malayalam no longer have a default font.
Added fonts for Canadian Syllabics and Urdu (non-default).
Updated the UnifrakturMaguntia font.
=== Input methods ===
LRM and RLM were added to the Hebrew input methods and the redundant
he-kbd input method was removed.
Danda was removed from the Marathi phonetic input method.
A bug that prevented typing some characters in the Kannada, Tamil
and Marathi input methods was fixed.
The Slovak input method was fixed to match the standard Slovak keyboard.
The names of the Oriya input methods were updated.
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
“We're living in pieces,
I want to live in peace.” – T. Moore
Hi,
after a month of work on my GSoC project Incremental Dumps [1], I now
have something worth sharing and talking about, though it's still far
from complete.
What the code can do now is to read a pages-history XML dump and create the
various kinds of dumps (pages/stub, current/history) in the new format from
that.
It can then convert a dump in the new format back to XML.
The XML output is almost the same as existing XML dumps, but there are some
differences [2].
The new format now has a detailed specification [3] (this describes
the current version; the format is still in flux and can change
daily).
If you want, you can also try running the code. [4]
It's not production-quality yet (e.g. it doesn't report errors properly),
but it should work.
Compilation instructions are in the README file.
Any comments or questions are welcome.
Petr Onderka
User:Svick
[1]: http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps
[2]:
http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps/File_format/XML_…
[3]:
http://www.mediawiki.org/wiki/User:Svick/Incremental_dumps/File_format/Spec…
[4]: https://github.com/wikimedia/operations-dumps-incremental/tree/gsoc
Today I'm noticing that if I visit someone else's user or talk
page (this is on en.wp), I see a little orange box saying
"Talk: you have new messages" even though I don't. Presumably
that user does, or something.
I may be mistaken, but I thought voting was enabled for the product
VisualEditor, and now it is not.
Could someone confirm this?
It is currently disabled for:
Commons App
Huggle
openZIM
Parsoid
(Spam)
Tool Labs tools
VisualEditor
Wiki Loves Monuments
WikiLoves Monuments Mobile
Wikimedia Labs
Wikipedia App
Voting provides a way to watch a bug without sending any bugmail. It
also lets people do a +1 without adding a comment.
Is voting disabled because of a perception that large numbers of
votes magically create new developers? If so, would it be re-enabled
if the voting interface were relabelled and better documented? E.g.:
https://bugzilla.wikimedia.org/show_bug.cgi?id=34490
--
John Vandenberg
On 2013-07-26 20:26, Amgine wrote:
> The request is to create a web-based text corpus[1] from which to
> derive frequencies and then compare with existing wiktionaries. Not
> a light undertaking, but one which has been proposed and implemented
> previously (e.g. Connel's Gutenberg project[2]).
>
> Generically speaking, someone would need to determine the
> appropriate size of the corpus sample, its temporal currency, and
> the method of creating and maintaining it. This isn't easy to do,
> and having no strictures results in unwieldy and mostly irrelevant
> products like Google's n-grams[3] (on the other hand, if someone can
> figure out how to filter n-grams usefully, it would mean we don't
> have to build our own.)
Actually, I think it would be interesting to have a trend history of
word usage over centuries (the current trend would also be
interesting, but probably harder to implement). Wikisource may be
used to achieve that.
>
> Amgine
>
> [1] https://en.wikipedia.org/wiki/Linguistic_corpus
> [2] https://en.wiktionary.org/wiki/User:Connel_MacKenzie/Gutenberg
> [3] http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
>
>
> On 26/07/13 09:18, Lars Aronsson wrote:
>> On 07/23/2013 11:23 AM, Mathieu Stumpf wrote:
>>> Here is what I would like to do: generating reports which give,
>>> for a given language, a list of words which are used on the web,
>>> with a number evaluating their occurrences, but which are not in a
>>> given wiktionary.
>>>
>>> How would you recommend implementing that within the Wikimedia
>>> infrastructure?
>>
>> Some years back, I undertook to add entries for
>> Swedish words in the English Wiktionary. You can
>> follow my diary at http://en.wiktionary.org/wiki/User:LA2
>>
>> Among the things I did was to extract a list of all
>> Swedish words that already had entries. The best
>> way was to use CatScan to list entries in categories
>> for Swedish words. Even if there is a page called
>> "men", this doesn't mean the Swedish word "men"
>> has an entry, because it could be the English word
>> "men" that is in that page.
>>
>> Then I extracted all words from some known texts,
>> e.g. novels, the Bible, government reports, and the
>> Swedish Wikipedia, counting the number of
>> occurrences of each word. Case significance is
>> a bit tricky. There should not be an entry for
>> lower-case stockholm, so you can't just convert
>> everything to lower case. But if a sentence begins
>> with a capital letter, that word should not have
>> a capitalized entry. Another tricky issue is
>> abbreviations, which should keep the period,
>> for example "i.e." rather than "i" and "e". But
>> the period that ends a sentence should be removed.
>> When splitting a text into words, I decided to keep
>> all periods and initial capital letters, even if this
>> leads to some false words.
>>
>> When you have word frequency statistics for a text,
>> and a list of existing entries from Wiktionary, you
>> can compute the coverage, and I wrote a little
>> script for this. I found that English Wiktionary already
>> had Swedish entries covering 72% of the words in the
>> Bible, and when I started to add entries for the most
>> common of the missing words, I was able to increase
>> this to 87% in just a single month (September 2010).
>>
>> Many of the common words that were missing when
>> I started were adverbs such as "thereof", "herein",
>> which occur frequently in any text but are not very
>> exciting to write entries about. This statistics-based
>> approach gave me a reason to add those entries.
>>
>> It is interesting to contrast a given text to a given
>> dictionary in this way. The Swedish entries in the
>> English Wiktionary are a different dictionary from the
>> Swedish entries in the German or Danish Wiktionary.
>> The kinds of words found in the Bible are different
>> from those found in Wikipedia or in legal texts.
>> There is not a single, universal text corpus that we
>> can aim to cover. Google has released its ngram
>> dataset. I'm not sure if it covers Swedish, but even
>> if it does, it must differ from the corpus frequencies
>> published by the Swedish Academy.
>>
>> It is relatively easy to extract a list of existing entries
>> from Wiktionary. But preparing a given text corpus
>> for frequency and coverage analysis takes more
>> work.
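The frequency-and-coverage computation described above can be sketched roughly as follows (an illustration only; the tokenization is deliberately naive, and per the email, periods and initial capitals are kept even if that yields some false words):

```python
from collections import Counter

def word_frequencies(text):
    # Naive whitespace tokenization, keeping periods (so "i.e." stays
    # intact) and initial capitals as-is.
    return Counter(text.split())

def coverage(frequencies, entries):
    # Fraction of word *occurrences* (not distinct words) that are
    # covered by an existing set of dictionary entries.
    total = sum(frequencies.values())
    covered = sum(n for word, n in frequencies.items() if word in entries)
    return covered / total if total else 0.0
```

With such frequencies in hand, sorting the uncovered words by count gives exactly the "most common missing words" list used to raise coverage from 72% to 87%.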
>
>
> _______________________________________________
> Wiktionary-l mailing list
> Wiktionary-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiktionary-l
--
Association Culture-Libre
http://www.culture-libre.org/
Hi wikitech-l!
The db replication of s5 and s6 stopped on the Toolserver. Merlissimo
searched for information and found that you stopped some of the
slaves that the Toolserver uses as masters.
Is there an ETA for when they will be back? Please provide some information!
Thanks and cheers, Silke
--
Silke Meyer
Internal IT Management and Project Management, Toolserver
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. (030) 219 158 260
http://wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognized as charitable
by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.