Hi,
stat1002 and stat1003 need a reboot to get updated Linux kernels. I'm
planning to start tomorrow (Wed 27th of Jan) at around 8am UTC.
If someone has a particularly long-running script and needs a reschedule,
please let me know. I'll drop a quick note on IRC when the systems are back
up.
Cheers,
Moritz
Hello, friends!
We have some preliminary numbers and graphs for Commons, English
Wikipedia, and German Wikipedia on the following:
* Uploads per month
* Unique uploaders per month
* New uploaders per month
* Cross-wiki uploads per month (currently wonky, patch in to fix it)
* UploadWizard uploads per month (based on categories, might be flawed)
You can find the graphs here:
https://edit-analysis.wmflabs.org/multimedia-health
The raw numbers are available, if you're into it:
http://datasets.wikimedia.org/limn-public-data/metrics/multimedia-health/
These numbers will automatically update each month, and we have
historical data as far back as is necessary (but feel free to disagree
with that assessment).
Upcoming numbers:
* Uploaders by tool per month (i.e. people using UW, CWU, etc.)
* New uploaders by tool per month
* Deletions
Numbers I want but haven't totally sussed out how to find (but I'm close!):
* Number of pages with images per month
* Number of images on pages per month
All of those numbers and graphs will show up in the same places (see
links above) and will also be updated automatically, so we never have to
think about implementing metrics ever again.
If you want to mess up my code, you can try to do so in the
analytics/limn-multimedia-data repository on gerrit, and the
configurations for Dashiki are here:
https://meta.wikimedia.org/wiki/Config:MultimediaHealth
https://meta.wikimedia.org/wiki/Dashiki:CategorizedMetrics
Let me know if you have any questions, suggestions, complaints, or
praise for these efforts - I'm available on- or off-list, on
Phabricator, or on IRC in the #wikimedia-multimedia channel as always :)
And, side plug, the wonderful Analytics humans who brought you the
reportupdater and Dashiki tools can be found on the analytics list (one
of the addressees of this message) or in #wikimedia-analytics.
Thanks everyone, here's to more great numbers this year!
--
Mark Holmquist
Lead Engineer, Multimedia
Wikimedia Foundation
mtraceur(a)member.fsf.org
http://marktraceur.info
+Analytics list so they can comment.
I don't have such a script. It's a pretty intensive job to compile top
articles, especially over a month. The pageview API was supposed to have
top articles per month per wiki, but the job is so massive that it failed to
run in Hive. Analytics knows there are better algorithms out there to
solve this problem. So the pageview API just has top articles per day per wiki.
I imagine that you are looking at some very specific wikis and countries...
not all of them. Maybe someone on the list can make an example Hive script
(given a wiki and country) that gives the top for a day.
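
To make the "top for a day" idea concrete, here is a minimal Python sketch
of the aggregation itself. It is not the Hive job and not anything the
pageview API runs; it assumes one day of rows has already been extracted
as (article, country, views) tuples, and that layout and the function
name are hypothetical, chosen only for illustration.

```python
import heapq
from collections import Counter


def top_articles(rows, country, n=10):
    """Return the n most-viewed (article, views) pairs for one country.

    rows is an iterable of (article, country_code, view_count) tuples.
    This layout is an assumption for the sketch, not a real table schema.
    """
    totals = Counter()
    for article, row_country, views in rows:
        if row_country == country:
            totals[article] += views
    # nlargest keeps only n candidates instead of sorting every article.
    # Ties on view count are broken by article title so output is stable.
    return heapq.nlargest(n, totals.items(), key=lambda kv: (kv[1], kv[0]))


if __name__ == "__main__":
    sample = [
        ("A", "US", 5),
        ("B", "US", 7),
        ("A", "US", 3),
        ("C", "DE", 100),
    ]
    print(top_articles(sample, "US", n=2))
```

A Hive version would presumably do the same thing with a GROUP BY, an
ORDER BY on the summed views, and a LIMIT.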
On Wed, Jan 20, 2016 at 12:23 PM, Dan Foy <dfoy(a)wikimedia.org> wrote:
> Hi Kevin,
>
> In your collection of scripts for Hive, do you have one that can act as a
> starting point for me to get the top N articles / URLs for Wikipedia in a
> country?
>
> Thanks,
> Dan
>
>
>
Hi all,
Barring complications, I plan to move forward with
https://phabricator.wikimedia.org/T110090 on Monday. To do this, I need to
stop the Hive and Oozie servers so I can start them up elsewhere.
I will start at 18:00 UTC on Monday, January 25th. I don’t think
this should take me more than 30 minutes, but just in case I’d like to
reserve 2 hours.
Please plan on not being able to use Hive or Oozie between 18:00 and 20:00 UTC
on Monday. Oozie jobs will be paused during this migration, and resumed
once it is finished.
Thanks!
-Andrew Otto
Hello,
The UI for vital signs will no longer display legacy pageview data
(pageview calculations that used the old definition). We are working
towards having one consistent pageview definition [1] in every tool that
surfaces pageview data.
Please keep in mind that the new definition has only existed since May 2015;
any data from before the switch was calculated using the old
(undocumented, as far as we know) definition.
You can access the Vital Signs UI at the following URL:
https://vital-signs.wmflabs.org/#projects=ruwiki,itwiki,dewiki,frwiki,enwik…
Thanks,
Nuria
[1] https://meta.wikimedia.org/wiki/Research:Page_view
Hey yo,
Just a note that EventLogging had replication problems and needed to
be backfilled yesterday. This means that if you had scripts running
early this morning over EventLogging data from yesterday or the last
few days, you're probably gonna need to rerun them, so please check.
--
Oliver Keyes
Count Logula
Wikimedia Foundation
Hi all!
This week, ops will be redirecting all mobile web traffic from the mobile
caches to the text caches, as part of a larger project to consolidate and
simplify our cache architecture. This means that the data in the
webrequest table in the webrequest_source='mobile' partition will now live
in the webrequest_source='text' partition.
Joseph and I will work to make sure our prod Hadoop jobs keep working.
This is just a heads up in case you have any of your own jobs depending on
this partition.
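
For jobs that cover dates on both sides of the switch, one option is to
read both partitions rather than only 'mobile'. Below is a minimal Python
sketch of building such a filter; the helper name is hypothetical, and the
only thing taken from this message is the webrequest_source partition key
and its 'mobile' and 'text' values.

```python
import re


def source_predicate(sources):
    """Build a Hive WHERE fragment selecting webrequest_source partitions.

    sources is an iterable of partition values such as "mobile" or "text".
    Values are limited to letters, digits and underscores so the result can
    be pasted into a query without quoting problems.
    """
    unique = sorted(set(sources))
    if not unique:
        raise ValueError("at least one webrequest_source value is required")
    for value in unique:
        if not re.fullmatch(r"[A-Za-z0-9_]+", value):
            raise ValueError("unexpected partition value: {!r}".format(value))
    quoted = ", ".join("'{}'".format(value) for value in unique)
    return "webrequest_source IN ({})".format(quoted)


if __name__ == "__main__":
    # A query spanning the migration reads the old and the new location.
    print(source_predicate(["mobile", "text"]))
```

Once a job only touches dates after the switch, the 'mobile' value can be
dropped from the list.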
https://phabricator.wikimedia.org/T109286
https://phabricator.wikimedia.org/T122651
-Ao