At the Wikimedia Conference in Berlin I met with Felix from Wikimedia Ghana,
who is super interested in getting more immersive media available, such as
360-degree panoramic photos ("photo spheres"). I showed him the Tool Labs
widget that uses Pannellum for WebGL spherical photo viewing -- see
https://phabricator.wikimedia.org/T70719#2204864 -- and he was very excited
to see that it's something we could probably work out how to integrate in
the nearish term.
That got me thinking more generally about new media types (video, panos,
stereoscopic photos/videos/panos, 3D models, interactive diagrams, etc.) and
how we can extend them to support annotations and linking in a way that
could create immersive visual experiences with the same kind of rich
information and interlinking that Wikipedia is famous for in the world of
Ladies and gentlemen, I give you: "*Epic saga: immersive hypermedia (Myst
I would be real interested to hear y'all's ideas on medium to long term
feasibility and desirability of this sort of system, and what we can pull
more directly into the short term.
For instance, I would love to get the panoramic / spherical viewers
integrated in MMV, which is much easier than figuring out how to do
clickable annotations in a 3D environment. ;)
Medium term, I would also love to see us look at the annotation system
that's on Commons done in site JS, and see if we can build a
future-extensible system that's more integrated into the wiki and can be
used in MMV.
Longer term, I think it'll just be nice to have these kinds of long-term
goals to work towards.
Thoughts? Ideas? Am I crazy, or just crazy enough? ;)
In addition to the smaller bandwidth requirements for VP9 video encoding
versus Theora or VP8, Microsoft is adding support for VP9 video and Opus
audio in WebM to Windows 10 in the summer 2016 update.
Currently in Win10 preview builds this only works in Edge when using Media
Source Extensions, and VP9 is disabled by default if not
hardware-accelerated, but it's coming along. :)
If the final version lands with suitable config, users of Edge version 15
playback on Wikipedia. Neat!
Things still to do in TimedMediaHandler:
* add transcode output for audio-only files as Opus in a WebM container
* keep working on the Kaltura->VideoJS front-end switch to make our lives
easier fixing the UI (Derk-Jan & Brion) and to prep for...
* eventually we'll want to use MPEG-DASH manifests and Media Source
Extensions to implement playback that's responsive to network and CPU speed
and can switch resolutions seamlessly. This may or may not be a
prerequisite for Win10 Edge playback if MS sticks with the MSE requirement.
* consider improving the transcode status overview at
Special:TimedMediaHandler; it reports errors in a way that doesn't scale
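To make the adaptive-playback item above a bit more concrete, here is a minimal sketch of the decision an MSE-based player would make each time it re-evaluates conditions: pick the tallest rendition that fits both the measured bandwidth and what the device can decode in real time. The `Rendition` type, the `pick_rendition` helper, the 0.8 safety factor, and the bitrates are all placeholders of mine, not TimedMediaHandler settings or measured values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rendition:
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # assumed average bitrate (placeholder, not measured)


# Hypothetical ladder; only the heights echo sizes discussed in this message.
LADDER = [
    Rendition(240, 250),
    Rendition(360, 500),
    Rendition(480, 1000),
    Rendition(720, 2000),
    Rendition(1080, 4000),
]


def pick_rendition(
    ladder: Sequence[Rendition],
    bandwidth_kbps: float,
    max_decodable_height: int,
    safety: float = 0.8,
) -> Rendition:
    """Return the tallest rendition that fits the network and CPU budgets."""
    # Leave headroom so a small bandwidth dip does not stall playback.
    budget = bandwidth_kbps * safety
    candidates = [
        r for r in ladder
        if r.bitrate_kbps <= budget and r.height <= max_decodable_height
    ]
    if not candidates:
        # Nothing fits: fall back to the smallest rendition rather than fail.
        return min(ladder, key=lambda r: r.height)
    return max(candidates, key=lambda r: r.height)
```

A real player would call something like this whenever its bandwidth estimate or dropped-frame count changes, and switch segments at the next boundary.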
Things still to do in Wikimedia site config:
* add VP9/Opus transcodes to our config (audio, 240p, 360p, 480p, 720p,
1080p definitely; consider 1440p and 2160p for Ultra-HD videos)
* consider dropping some VP8 sizes (desktop browsers that support VP8
should all support VP9 now; old Android versions that don't grok VP9 might
be the main remaining target for VP8)
Things to consider:
* VP9 is slower to encode than VP8, and a transition will require a lot of
back-running of existing files. We *will* need to assign more video
scalers, at least temporarily.
* I started writing a client-side bot to trigger new transcodes on old
files. Would folks prefer that I finish that, or that I prep a server-side
script that someone in ops will have to babysit?
* in future, the ogv.js JS decoder shim will still be used for Safari and
IE 11, but I may be able to shift it from Theora to VP9 after making more
fixes to the WebM demuxer. Decoding is slower per pixel but at a lower
resolution you often get higher quality because of better compression and
handling of motion -- and bandwidth usage is much better, which should make
it a win on iPhones. This means eventually we may be able to reduce or drop
the Ogg output. This will also tie in with MPEG-DASH adaptive streaming, so
we should be able to pick the best size for CPU speed more reliably than current
* longer term, the AOMedia codec will arrive (initial code drop came out
recently, based on VP10) with definite support from Google, Mozilla, and
Microsoft. This should end up supplementing VP9 in a couple years, and
should be even more bandwidth-efficient for high resolutions.
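On the backfill bot question above: whichever side it runs on, the core loop is the same, namely walk the old files in small batches, trigger a transcode for each, pause so the video scalers are not flooded, and remember what failed. A rough sketch follows. The `trigger` callback is a hypothetical stand-in for whatever actually requests the transcode (an API call from a bot, or a job enqueue from a server-side script), and the batch size and pause are made-up defaults.

```python
import time
from typing import Callable, Iterable, Iterator, List


def batched(titles: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` titles."""
    batch: List[str] = []
    for title in titles:
        batch.append(title)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_backfill(
    titles: Iterable[str],
    trigger: Callable[[str], None],
    batch_size: int = 10,
    pause_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call trigger(title) for every title; return the titles that failed."""
    failed: List[str] = []
    for batch in batched(titles, batch_size):
        for title in batch:
            try:
                trigger(title)
            except Exception:
                # Keep going; failed titles can be retried in a later pass.
                failed.append(title)
        # Throttle between batches to leave scaler capacity for new uploads.
        sleep(pause_seconds)
    return failed
```

Passing `sleep` in as a parameter is only there so the loop can be exercised without actually waiting.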
Claims resizing JPEGs "any size" in 25ms or less on m3.medium AWS instances.
While this is closed source, and is part of a worrying trend of closed SaaS
frameworks one has to pay for by the hour, the author reveals enough technical
details to figure out how it works. Namely:
- the inspiration for the code is an unnamed Japanese paper which describes
how to process the Y, U and V components of a JPEG in parallel
- it uses a technique similar to the jpeg:size option of ImageMagick,
whereby only parts of the JPEG are read, instead of every pixel, according
to the needed target thumbnail size
- it leverages "vector math" in the processor, which I assume means AVX
instructions and registers
Essentially, it's parallelized decoding and resizing of JPEGs, using
hardware-specific instructions for optimization.
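As an illustration of the "only read what the thumbnail needs" point: libjpeg-style decoders can emit the image at 1/2, 1/4 or 1/8 scale straight from the DCT coefficients, which is what the jpeg:size hint takes advantage of. The sketch below only shows how a decoder wrapper might choose that factor. The function name is mine, and I have no idea whether the closed-source service picks its scale this way.

```python
def choose_dct_scale(src_w: int, src_h: int, target_w: int, target_h: int) -> int:
    """Return the largest denominator d in (8, 4, 2, 1) such that decoding
    at 1/d scale still yields an image at least as large as the target."""
    for denom in (8, 4, 2, 1):
        # Ceiling division: a scaled decode rounds partial pixels up.
        scaled_w = -(-src_w // denom)
        scaled_h = -(-src_h // denom)
        if scaled_w >= target_w and scaled_h >= target_h:
            return denom
    # Target is larger than the source: decode at full size and upscale later.
    return 1
```

For a 4000x3000 original and a 320x240 thumbnail this picks 1/8, so the decoder produces roughly 1/64 of the pixels before the final resize even starts.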
Of course writing something similar would be a large undertaking. Let's
hope that the folks who work on ImageMagick/GraphicsMagick take note and
try to do just that :)
I have confirmation that it's extremely fast from pals at deviantArt (whose
infrastructure is on AWS) who tried it out, to the point that they're
likely getting rid of their storage of intermediary resized images.
I have a feeling that we'll be seeing more of this sort of
hardware-optimized JPEG decoding/transcoding once Intel releases their
first CPUs with integrated FPGAs, which is supposed to happen soon-ish.
Unfortunately these Xeon CPUs will be released "in limited quantities, to
cloud providers first". Here's that annoying trend again...