Once we ship the Firefogg extension support for uploading videos,
Commons should request that users select the highest quality source
video footage available, i.e. the HD video their camera captured or the
original edited DV footage from their local computer, and then Commons
will supply the transcode settings.
I think it would be good if we wrote up some documentation to explain
this to uploaders... any volunteers to help on that front?
Presently, for the Firefogg upload support, I have arbitrarily chosen 400
pixels wide (keeping the aspect ratio) and a 500 kbps bitrate.
Firefogg could let us request multiple encodes or profiles from the user.
Should we plan on supporting multiple "profiles", i.e. multiple quality
settings? E.g. one version at around 320 pixels wide and 300 kbps for
low-bandwidth / low-resolution environments, cell phones, etc. (300 kbps
should be "acceptable quality" once the new Thusnelda Theora encoder lands).
We could additionally read the resolution of the source file that users
provide and choose a "maximum quality preservation" version. We could
probably even ship the Dirac codec with Firefogg (Dirac is a wavelet codec
aimed at high quality at high resolution; for more on Dirac see your
favorite info source ;)
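To make the profile idea a bit more concrete, here is a rough sketch (in
Python, with made-up profile names; the numbers are just the ones mentioned
above, and the real Firefogg/Commons interface would likely look different):

# Hypothetical encode profiles Commons could hand to Firefogg.
# Names, keys and the selection rule are illustrative only.
TRANSCODE_PROFILES = {
    "default_400": {"maxwidth": 400, "keep_aspect": True, "bitrate_kbps": 500},
    "low_320":     {"maxwidth": 320, "keep_aspect": True, "bitrate_kbps": 300},
}

def profiles_for_source(source_width_px):
    """Only request encodes no wider than the source (always keep the low one)."""
    return {name: p for name, p in TRANSCODE_PROFILES.items()
            if p["maxwidth"] <= max(source_width_px, 320)}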
If we want to support multiple quality settings for a single "stream",
this will require a bit more infrastructure. Specifically, I propose we
add another namespace for temporal media called Stream: and have it
directly map to ROE XML, something like: http://tinyurl.com/72x57r (more
info on ROE: http://wiki.xiph.org/index.php/ROE )
File:my_movie_low_quality.ogg and File:my_movie_high_quality.ogg would
soft-redirect to Stream:my_movie, and all the meta info would be stored
there. The Stream namespace also allows us to group other media tracks
that share a temporal meaning, such as multiple-language audio dubbing
and multilingual transcripts/subtitles. The JavaScript player can then
dynamically select audio language and/or subtitles based on the user's
language.
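Just to illustrate the grouping (this is not the ROE schema, and the
audio/subtitle file names below are invented), a Stream: page would
essentially tie together something like:

# Schematic view of what Stream:my_movie would group; only the two video
# file names come from the example above, the rest are hypothetical.
STREAM_MY_MOVIE = {
    "video": {
        "low":  "File:my_movie_low_quality.ogg",
        "high": "File:my_movie_high_quality.ogg",
    },
    "audio":     {"en": "File:my_movie_audio_en.ogg", "de": "File:my_movie_audio_de.ogg"},
    "subtitles": {"en": "File:my_movie_subs_en.srt",  "de": "File:my_movie_subs_de.srt"},
}

def select_tracks(stream, user_language, low_bandwidth=False):
    """Roughly what the JavaScript player would do: pick tracks for the user."""
    return {
        "video":     stream["video"]["low" if low_bandwidth else "high"],
        "audio":     stream["audio"].get(user_language, stream["audio"]["en"]),
        "subtitles": stream["subtitles"].get(user_language),
    }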
The Stream namespace could also store "mirrors" or point to "torrents",
improving syndication and bandwidth cost distribution for high-traffic HD
content (e.g. Miro could read the ROE file and grab the torrent rather
than hit our servers), and/or a Firefox torrent extension could be
detected by our JavaScript player, which would choose the torrent over
hitting our servers for the HD content.
Not to say all these things will happen at once... just pointing out
the need for a new namespace to group files with identical temporal meaning.
--michael
This has probably been raised before (is there a bug for it?), but it
appears to me that it would be significantly more user-friendly to
append a <references/> section at the bottom of the text that is being
parsed when it is missing.
This would
* make it easier for new users to discover referencing functionality
(oldbies could still properly reformat the tag);
* solve the "references missing in section preview" problem.
Is there any obvious reason not to do this?
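To be explicit about the behavior I mean, here is a minimal sketch (Python,
operating on the raw wikitext; the real change would of course live in the
Cite/parser code, not a helper like this):

import re

def ensure_references_section(wikitext):
    # If the text uses <ref> tags but never renders them, append a section.
    has_refs = re.search(r'<ref[\s>]', wikitext, re.IGNORECASE)
    has_list = re.search(r'<references\s*/?>|\{\{\s*reflist', wikitext, re.IGNORECASE)
    if has_refs and not has_list:
        wikitext += "\n\n== References ==\n<references/>\n"
    return wikitext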
--
Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
I just recently started to play with interwiki.py (Pywikipedia bot
framework) for propagating interwiki links. My interest comes
from organizing the category tree, so I'm focusing on interwiki
links between categories. Interwiki bots normally run in
autonomous mode, but this means they give up on complicated cases.
If I run this script under manual supervision, without the
"-autonomous" option, it stops and asks me how to resolve each
conflict. This happens every so often. I have now (manually)
sorted out the interwiki links between all languages of
Category:Knowledge, which was intertwined with Category:Science,
and Category:Austrian writers which was mixed up with
Category:Austrian literature. Such mistakes easily happen, of
course. Who can spot errors in all these languages?
Many languages had interwiki links from their category for
Austrian writers to the Japanese category for Austrian literature.
I'm not sure exactly when or where this error originated. But on
June 19, 2007, the English and Spanish Wikipedias' interwiki link
to Japanese changed from Austrian novelists to Austrian
literature, i.e. from one error to another. Ten days later, this
link was copied to the Dutch Wikipedia. The error was corrected on
en.wikipedia on October 1, 2007, but remained on other languages.
Yes, that's 15 months ago.
The circular interwiki link structure from en:Category:Austrian
writers to es:Categoría:Escritores de Austria to ja:... and back
to en:Category:Austrian literature is such a conflict that makes
interwiki.py give up when it runs in autonomous mode.
Thus, corrections (as on October 1) do not propagate. Instead a
report about the conflict is given in a logfile, but apparently
nobody had fixed this problem in the last 15 months. This
conflict also blocked new interwiki links from propagating.
After I cleared up the mess, 21 new interwiki links were added to
the category on the Russian Wikipedia (one where I have a bot
flag). That means 21 languages of Wikipedia had created
categories (or announced them to the interwiki system) for
Austrian writers in the last 15 months, and they all added their
interwiki link to the English Wikipedia. But these additions did
not propagate because of the conflict.
So, my question:
Has anybody mapped exactly how many such interwiki conflicts we
have? Or how many interwiki sets do we have without conflicts?
Could/should someone make a list of current conflicts and try to
rank them by importance, so we can get started in fixing them?
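If someone wants to script such a list, the core check is simple; here is a
rough sketch (Python; the input format is made up, and interwiki.py of
course has its own internal graph handling):

from collections import defaultdict

def find_conflicts(links):
    """links maps (lang, title) -> iterable of (lang, title) pages it links to.
    A group of pages is in conflict if, following the links, we reach two
    different titles on the same language (e.g. both Austrian writers and
    Austrian literature on ja:)."""
    conflicts, seen = [], set()
    for start in links:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:                      # collect everything reachable
            page = stack.pop()
            if page in group:
                continue
            group.add(page)
            stack.extend(links.get(page, ()))
        seen |= group
        per_lang = defaultdict(set)
        for lang, title in group:
            per_lang[lang].add(title)
        if any(len(titles) > 1 for titles in per_lang.values()):
            conflicts.append(group)
    return conflicts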
In the longer term, we need to redesign the interwiki links into a
centralized system that can be maintained. I think the way to do
this is to use Wikimedia Commons. Instead of copying all the
interwiki links to every language of Wikipedia, it should be
enough to add {{commons|Category:Writers from Austria}}, and the
rest should happen automatically.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se
For a lot of (all?) text input fields we have, for whatever reason, set a
maxlength of 200, but some are set to 255.
Is there a special reason for 200? Or should we increase all to 255? Or
reduce all to 200? I like consistency :)
Raymond.
We're currently working on a grant proposal that is related to the
usability for uploading and embedding media files to Wikimedia
Commons. (This is an area that we will likely not be able to address
in detail as part of the Stanton project, so we're trying to parcel it
into a separate project.) As part of this proposal, I would like to
make a compelling case that pictures and other media uploaded to
Commons benefit strongly from the increased visibility,
especially through Wikipedia articles. I'd also like to demonstrate
that images get used in multiple languages and multiple projects.
The simplest research approach that any volunteer could take is to
take a sample (say 50 featured media files and 50 random ones) and to
catalog, in a spreadsheet, their usage across Wikimedia projects, using the
CheckUsage tool. But I'm sure there are other approaches - both
quantitative and qualitative - that might work as well, e.g. based on
Wikipedia article traffic statistics.
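For what it's worth, here is a rough sketch of the spreadsheet approach
(Python; get_usage() is a placeholder for whatever data source ends up being
used, e.g. output from the CheckUsage tool):

import csv, random

def get_usage(file_title):
    """Placeholder: return a list of (project, page) pairs where the file is used.
    In practice this would come from CheckUsage or a global-usage query."""
    raise NotImplementedError

def catalog_usage(featured_files, all_files, sample_size=50, out="commons_usage.csv"):
    sample = featured_files[:sample_size] + random.sample(all_files, sample_size)
    with open(out, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "projects_using_it", "total_uses"])
        for title in sample:
            uses = get_usage(title)
            writer.writerow([title, len({proj for proj, _ in uses}), len(uses)])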
I'd love to see some volunteer input into this question, which
essentially boils down to: Why is Wikimedia Commons awesome, and why is
it worth investing in to make it even better? I've started a page on
Meta here if you want to contribute ideas on-wiki:
http://commons.wikimedia.org/wiki/Commons:Case_for_Commons
But feel free to e-mail me off-list as well. :-)
Thanks for any and all help,
Erik
--
Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
As part of the sequencer I have been working on an add media wizard to
enable the searching and inserting of media into a given sequence. This
add media wizard could also serve as an entry point to adding media to
pages.
== demo ==
First, a disclaimer: this is still pretty early in development, more of
an early semi-working prototype than a beta or anything usable. (For
example, I have not done much cross-browser testing yet, so use Firefox.)
But if you want to check it out go ahead and add:
importScriptURI('http://mvbox2.cse.ucsc.edu/w/extensions/MetavidWiki/skins/external_media_wi…');
to your User:{username}/monobook.js page
or you can load a slightly older version at:
http://en.wikipedia.org/w/extensions/MetavidWiki/skins/external_media_wizar…
Once it is installed on your user page, go to edit some page like "sandbox",
highlight a word or place the cursor where you want to insert media, and
click the add media wizard at the top right of the edit box. You should
get a few images from Commons; you can click on an image to insert it, add an
in-line description, crop it if you like, and then preview the insert into
the page. Once you are happy, "do the insert" and it will paste in the wiki
code to insert that image into the page; you can modify it and then
re-preview the page if you like.
You can also do a political search like "Iraq" or "Obama" and pull up
metavid clips to see how setting in-and-out points of video has been
prototyped so far.
(Note: metavid falls back to Flash video while the HTML5 video tag for
Firefox is still maturing... if you are using a Firefox nightly
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/ you
may have to click the options button in the lower right to select the Ogg
video stream.)
Once you are done playing with the prototype, I recommend removing the
script from your user page. I will update the list once it's more _ready_
for wider usage.
It should work on remote wikis with the Commons import system, i.e. you
can try including this script on your local MediaWiki installation; if
you have $wgAllowCopyUploads enabled, your wiki should be able to
download and import Commons images directly into the wiki.
== Issues ==
So this brings up a whole host of issues... here are some that I thought
of. I thought I would ping this list to get some more ;)
=== security ===
* Right now the wizard pulls directly from remote repositories (i.e.
Commons and metavid.org serve up the search results in JSON with a
callback). This means the compromise of any server that we support as a
remote repository will result in an XSS issue. This is true for any remote
script that users include, but it would be a bigger problem if the add media
wizard moves into more common usage.
** We should probably proxy the results so we can just process them as
RSS and run normal script filters on the data -- this is slower and
adds more strain to our servers, but provides more security. (A rough
sketch of such a proxy is below.)
** Or we limit the "included by default" repositories and put in a kill
switch of sorts that we can flip to stop injecting from any compromised
remote repository, making it difficult to cause big XSS issues "by
default". We could have users jump through some hoops to enable less common
remote repositories, similar to how user scripts work.
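Here is the rough sketch of the proxy option mentioned above (Python just to
illustrate the flow; in practice this would be PHP inside MediaWiki, and the
remote API parameters and result fields below are invented):

import json, html
from urllib.request import urlopen
from urllib.parse import urlencode

# Whitelist of repositories we proxy for; removing an entry is the "kill switch".
ALLOWED_REPOS = {
    "metavid": "http://metavid.org/w/api.php",   # endpoint shown only as an example
}

def proxy_search(repo, query):
    if repo not in ALLOWED_REPOS:
        raise ValueError("repository not enabled")
    url = ALLOWED_REPOS[repo] + "?" + urlencode(
        {"action": "query", "search": query, "format": "json"})  # params are assumptions
    raw = json.load(urlopen(url))
    results = []
    for item in raw.get("results", []):           # field names are assumptions too
        results.append({
            "title": html.escape(str(item.get("title", ""))),
            "thumb": str(item.get("thumb_url", "")),
            "url":   str(item.get("url", "")),
        })
    # We hand the client plain, filtered JSON -- never remote script via a callback.
    return json.dumps(results)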
=== performance / maintainable/modular code / internationalization ===
* JS library loading... we want to start moving towards more modular
scripts, i.e. we don't need to include all the remote repository objects on
first load (e.g. if we have a search object for Flickr, we want to
dynamically add in that remote repository search object as necessary
when the user clicks on the Flickr repository tab).
For portability outside of MediaWiki, each JS object/file should
define any user-language messages that it includes; that way our
script server system can send out the right language messages with the
JS library that uses them.
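To sketch what I mean by pairing messages with each file (Python standing in
for the script-server side; the module names, message keys, and the
client-side mw.addMessages() hook are all just placeholders):

import json

MESSAGES = {   # per-language message tables, normally loaded from i18n files
    "en": {"mwe-search": "Search", "mwe-insert": "Insert into page"},
    "de": {"mwe-search": "Suchen", "mwe-insert": "In die Seite einfügen"},
}

MODULE_MESSAGE_KEYS = {   # keys each JS file declares that it uses
    "remoteSearchDriver.js": ["mwe-search", "mwe-insert"],
}

def serve_module(module, lang, js_source):
    """Prepend the localized messages a module declared to its JS source."""
    table = {k: MESSAGES.get(lang, MESSAGES["en"]).get(k, MESSAGES["en"][k])
             for k in MODULE_MESSAGE_KEYS.get(module, [])}
    return "mw.addMessages(" + json.dumps(table, ensure_ascii=False) + ");\n" + js_source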
Some discussion about a JavaScript loading system took place on this
list not long ago... I said I would revisit the issue, so I will try to
do that soon :)
http://lists.wikimedia.org/pipermail/wikitech-l/2008-December/040625.html
=== Licenses / archive restrictions ===
So far I have just included the metavid "remote" repository (importing
is not working yet until we enable uploading by URL and do some fixes
for cross-site issues, i.e. you're inserting content on the Wikipedia domain
but want the resource to be uploaded to Commons).
In terms of external archive license issues I am thinking we essentially
require that the external archive provide license info and we represent
that with a little icon below each image and then pull the appropriate
template into the import resource description.
The other obvious external archive restriction, for video, is that they
provide the video in Ogg Theora format. Preferably they run oggz_chop so
that a segment of the video can be dynamically selected. We have already
been working with archive.org on this front; see:
http://metavid.org/blog/2008/12/08/archiveorg-ogg-support/
== road map / up-and-coming efforts ==
* get scaling working ... (right now just defaults to thumb or cropped
size).
* license support (add in license thumbnails and wikimedia commons
template mappings for import descriptions)
* layout control (real-time layout control will let you adjust size and
float layout properties... maybe even let you move the image around on
the page).
* "add by URL" option for parsing resource pages of common repositories.
ie maybe you find a picture using flickr's search engine you want to
copy and paste that url not search for it again.
* uploading... integrate http://firefogg.org/ for uploading video from
arbitrary source content to Ogg Theora, with server-side provided
encoding settings.
* fix importing of videos (from metavid initially, but done in a general
way to support archive.org video inserts)
* javascript loader (integrate a solution to the large set of many
javascript files / localization problem)
* add annodex oggz_chop to wikipedia server side architecture so that we
can support setting in-and-out points for ogg video (like we can do on
metavid and archive.org video)
* improve generalization of search classes and add support for more remote
repositories (archive.org, Flickr, etc.)
* improve the MediaWiki API so we can query for "only videos" or only SVGs,
and/or search both the Title and Description text at the same time.
* make a more general protocol for establishing queriable properties.
This will let us do discovery of "advanced search" parameters in a
general way. A more complicated use case is a full semantic wiki. For
some examples of finding video clips with semantic searches see:
http://metavid.org/wiki/Sample_Semantic_Queries_page
* improve the image "editor": integrate (and/or improve for multi-user
collaboration) some library for simple canvas/image manipulations, e.g.:
http://editor.pixastic.com/ (with server-side support for rendering out
these transformations, for performance and for older and/or otherwise
crippled web browsers, i.e. IE)
** along those lines, add in server-side support for cropping, with the
larger transformation framework in mind.
peace,
michael
The current enwiki database dump
(http://download.wikimedia.org/enwiki/20081008/) has been crawling
along since 10/15/2008.
I realize that dumps can appear stalled in their normal processing
(http://meta.wikimedia.org/wiki/Data_dumps#Schedule), but in the
recent past (as far as I know) they have not been stalled this long
without there being something actually wrong. The completion date for
"All pages with complete page edit history" (where it is currently)
fluctuates within the latter half of 2009.
Is this purposeful? And is there anything I (or other community
members) can do about it? I personally just need the pages-articles
part. Would it be possible to dump up to that part on a different
thread?
Thank you for your time.
Gabriel Weinberg
[Breaking this thread off...]
On 12/28/08 1:32 AM, Niklas Laxström wrote:
> The anchors of non-latin headers are already (latin) gibberish:
> #.D0.A4.D0.B8.D0.BB.D1.8C.D0.BC.D0.BE.D0.B3.D1.80.D0.B0.D1.84.D0.B8.D1.8F
>
> It doesn't seem reasonable to think that people could create anchors
> in their head from text, except in special cases.
If we're going to stick with strict ASCII-limited anchors, it might be
worth considering making them more legible, say with transliteration to
ASCII Latin chars. :P
On the other hand, XHTML *doesn't* actually limit us this way!
The XHTML 1.0 recommendation of restriction to [A-Za-z][A-Za-z0-9:_.-]*
is for compatibility with HTML 4.0, which defines:
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
followed by any number of letters, digits ([0-9]), hyphens ("-"),
underscores ("_"), colons (":"), and periods (".").
XHTML specifies ID and NMTOKEN types here, which are *not* restricted
to ASCII, but rather allow a large number of scripts:
http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-NameChar
http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Letter
http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Digit
http://www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Extender
If there are no major browser compatibility problems, I would probably
recommend we roll back the nasty old .XX encoding for HTML 4
compatibility, in which case we could quite legally produce something
direct, such as:
http://ru.wikipedia.org/wiki/Уплисцихе#Уплисцихе_в_средневековье
which URL-encodes out to:
http://ru.wikipedia.org/wiki/%D0%A3%D0%BF%D0%BB%D0%B8%D1%81%D1%86%D0%B8%D1%…
(which can be nicely displayed as pretty Unicode in the URL bar of
modern browsers)
as opposed to the current:
http://ru.wikipedia.org/wiki/%D0%A3%D0%BF%D0%BB%D0%B8%D1%81%D1%86%D0%B8%D1%…
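For reference, a small sketch of how the current .XX anchors relate to plain
percent-encoding (it reproduces the Фильмография example above; this is
roughly what the escaping does, not the exact Sanitizer code):

from urllib.parse import quote

def legacy_anchor(heading):
    """Current scheme (roughly): UTF-8 percent-encode, then swap '%' for '.'
    so the id stays within HTML 4's [A-Za-z0-9:_.-] alphabet."""
    return quote(heading.replace(" ", "_")).replace("%", ".")

def direct_anchor(heading):
    """Proposed: keep the text as-is in the id; the URL bar / links handle
    percent-encoding where needed."""
    return heading.replace(" ", "_")

print(legacy_anchor("Фильмография"))
# .D0.A4.D0.B8.D0.BB.D1.8C.D0.BC.D0.BE.D0.B3.D1.80.D0.B0.D1.84.D0.B8.D1.8F
print(direct_anchor("Уплисцихе в средневековье"))
# Уплисцихе_в_средневековье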
-- brion
This is a proposal to change how the MediaWiki software displays
subpages. This isn't really an issue on Wikipedia, because
subpages in the main namespace are disabled, but using subpages at
Wikisource is a standard way of dividing up works, leaving only a
table of contents at the root article.
The problem is that this results in page titles like this:
United States Code/Title 35/Chapter 14/Section 151
or even
Nicene and Post-Nicene Fathers: Series II/Volume I/Constantine/The
Life of Constantine/Book II/Chapter 23
which IMHO looks more like a file system than a user-friendly
website. I would suggest
United States Code » Title 35 » Chapter 14 » Section 151
or my own favourite
United States Code » Title 35 » Chapter 14 »
Section 151
(With "Section 151" in bigger font.)
This would effectively involve moving the subpages div above the
title and changing the title from the entire path to just the subpage
name.
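At the string level the change is tiny; a small sketch (Python, just to show
the transformation; the real patch would be in the skin / OutputPage code):

def display_title(full_title, separator=" » "):
    """Split 'A/B/C/D' into a breadcrumb line and the subpage name to use
    as the big page heading."""
    parts = full_title.split("/")
    return separator.join(parts[:-1]), parts[-1]

crumbs, heading = display_title("United States Code/Title 35/Chapter 14/Section 151")
print(crumbs)    # United States Code » Title 35 » Chapter 14
print(heading)   # Section 151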
See:
http://en.wikisource.org/wiki/Wikisource:Scriptorium#Subpage_formatting
for the discussion I started on Wikisource and further down the same
page
http://en.wikisource.org/wiki/Wikisource:Scriptorium#Subpage_formatting:_some_more_examples
for some formatted versions of the above examples
and
http://en.wikisource.org/wiki/Nicene_and_Post-Nicene_Fathers:_Series_II/Volume_I/Constantine/The_Life_of_Constantine/Book_II/Chapter_23
and
http://en.wikisource.org/wiki/Treaty_on_European_Union/Protocol_on_the_convergence_criteria_referred_to_in_Article_109j_of_the_Treaty_establishing_the_European_Community
for examples of how ugly the current setup can be.
Michael
Hi guys,
Could someone with access please change the redirect on
http://wikimania.wikimedia.org/ to the wikimania2009.wikimedia.org site?
As it is now 2009, I think it would be best to change this to the
latest site.
Thanks,
James
--
[[User:JamesR]] (formerly [[User:E]])
English Wikipedia Administrator
Wikimedia Australia Member