On Sat, Aug 1, 2009 at 12:13 AM, Michael Dale<mdale(a)wikimedia.org> wrote:
true... people will never upload to site without
instant gratification (
cough youtube cough ) ...
Hm? I just tried uploading to youtube and there was a video up right
away. Other sizes followed within a minute or two.
At any rate it's not replacing the firefogg that has
gratification at point of upload; it's ~just another option~...
As another option— Okay. But video support on the site stinks because
of lack of server side 'thumbnailing' for video. People upload
multi-megabit videos, which is a good thing for editing, but then they
don't play well for most users.
Just doing it locally is hard— we've had failed SOC projects for this—
doing it distributed has all the local complexity and then some.
Also I should add that this wiki@home system just gives us distributed
transcoding as a bonus side effect ... its real purpose will be to
distribute the flattening of edited sequences. So that 1) IE users can
view them 2) We can use effects that for the time being are too
you can download and play the sequences with normal video players and 4)
we can transclude sequences and use templates with changes propagating
to flattened versions rendered on the wiki@home distributed computer
I'm confused as to why this isn't being done locally at Wikimedia.
Creating some whole distributed thing seems to be trading off
something inexpensive (machine cycles) for something there is less
supply of— skilled developer time. Processing power is really
Some old copy of ffmpeg2theora on a single core of my Core 2 desktop
processes a 352x288 input video at around 100 Mbit/sec (input video
consumption rate). Surely the time and cost required to send a bunch
of source material to remote hosts is going to offset whatever benefit
We're also creating a whole additional layer of cost in that someone
will have to police the results.
Perhaps my Tyler Durden reference was too indirect:
* Create a new account
* splice some penises 30 minutes into some talking head video
* extreme lulz.
Tracking down these instances and blocking these users seems like it
would be a full-time job for a couple of people, and it would only be
made worse if the naughtiness could be targeted at particular
resolutions or fallbacks (making it less likely that clueful people
will see the vandalism).
While presently many machines in the Wikimedia internal server cluster
grind away at parsing and rendering html from wiki-text the situation is
many orders of magnitude more costly with using transclusion and templates
with video ... so it's good to get this type of extension out in the wild
and warmed up for the near future ;)
In terms of work per byte of input the wikitext parser is thousands of
times slower than the theora encoder. Go go inefficient software. As a
result the difference may be less than many would assume.
Once you factor in the ratio of video to non-video content for the
foreseeable future this comes off looking like a time wasting
Unless the basic functionality— like downsampled videos that people
can actually play— is created I can't see there ever being a time
where some great distributed thing will do any good at all.
is going to significantly harm compression efficiency for
any inter-frame coded output format unless you perform a two pass
encode with the first pass on the server to do keyframe location
detection, because the stream will restart at cut points.
also true. Good thing theora-svn now supports two pass encoding :) ...
Yea, great, except doing the first pass for segmentation is pretty
similar in computational cost to simply doing a one-pass encode of
but an extra key frame every 30 seconds probably won't hurt
compression efficiency too much..
It's not just about keyframe locations: if you encode separately and
then merge you lose the ability to provide continuous rate control. So
there would be large bitrate spikes at the splice intervals which will
stall streaming for anyone without significantly more bandwidth than
vs the gain of having your hour long
interview trans-code a hundred times faster than non-distributed
conversion. (almost instant gratification)
Well tuned, you can expect a distributed system to improve throughput
at the expense of latency.
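
To put rough numbers on the latency side, here is a back-of-envelope
sketch. It treats the ~100 Mbit/sec figure quoted earlier in this
message as the rate at which one local core consumes source bits (an
assumption about what that figure measures). The 1 GB source size and
the 20 Mbit/sec link to volunteers are made-up illustrative values, not
measurements, and the function names are hypothetical.

```python
# Back-of-envelope comparison: encode locally vs. just shipping the source out.
# LOCAL_ENCODE_BPS is the ~100 Mbit/sec figure quoted in this message; every
# other number below is a hypothetical value chosen only for illustration.

LOCAL_ENCODE_BPS = 100e6  # bits of source consumed per second on one core


def local_encode_time(source_bits, encode_bps=LOCAL_ENCODE_BPS):
    """Seconds for a single local core to chew through the source."""
    return source_bits / encode_bps


def distribution_transfer_time(source_bits, uplink_bps):
    """Seconds spent only on sending the source to remote hosts.

    This ignores remote encode time and the trip back, so it is a lower
    bound on the latency of the distributed path.
    """
    return source_bits / uplink_bps


if __name__ == "__main__":
    source_bits = 8e9   # hypothetical 1 GB source file
    uplink_bps = 20e6   # hypothetical aggregate link to volunteers

    print("local encode:  %.0f s" % local_encode_time(source_bits))
    print("transfer only: %.0f s" % distribution_transfer_time(source_bits, uplink_bps))
```

Under these assumptions, the transfer alone takes longer than the whole
local encode whenever the link out is slower than the local encoder's
consumption rate, no matter how many hosts are on the other end.
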
Sending out source material to a bunch of places, having them crunch
on it on whatever slow hardware they have, then sending it back may
win on the dollars per throughput front, but I can't see that having
true... You also have to log in to upload to commons.... It will make
life easier and make abuse of the system more difficult.. plus it can
Having to create an account does pretty much nothing to discourage
act as a motivation factor with distributed@home teams, personal stats
and all that jazz. Just as people like to have their name show up on the
"donate" wall when making small financial contributions.
"But why donate? Wikimedia is all distributed, right?"
But whatever— It seems that the goal here is to create trendy buzzword
technology while basic functionality like simple thumbnailing sits