Hi Peter,
Re: your question about getting historical ORES article quality predictions, there's an open ticket on Phabricator https://phabricator.wikimedia.org/T146718 to add these data to the public replicas hosted on Labs (and therefore accessible via both SSH tunnel and Quarry). Chime in on the discussion if you'd like to see these tables added!
Right now I believe the best way to access historical scores is to download and parse the full dataset https://datasets.wikimedia.org/public-datasets/enwiki/article_quality/. But halfak or others may be aware of better methods.
Best, Jonathan
On Wed, Mar 15, 2017 at 7:26 AM, Peter Ekman pdekman@gmail.com wrote:
I'll suggest Wikihistory, e.g. https://tools.wmflabs.org/xtools/wikihistory/wh.php?page_title=Tulip_mania which gives all the editors (ranked by number of edits), article size(?) and edits (per year, month, or even week). There's a bit more at https://en.wikipedia.org/w/index.php?title=Tulip_mania&action=info#mw-pageinfo-watchers which includes info on page watchers, recent edits and a Wikidata link.
Page views at https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&range=latest-30&pages=Tulip_mania only go back a couple of years. Before that there is an inconsistent series (somewhere). All of these are available from the history tab on the article page.
The only other thing that I'd want is the ORES scores (AI quality prediction for any individual version, given the permid, i.e. the revision ID). Is the best place to get these https://ores.wmflabs.org/v2/scores/enwiki/wp10/?revids=769824240 ? Is there an easy way to get a regular-interval time series of these? (I wouldn't expect a complete time series for 1,000s of versions!)
Hope this helps.
Peete
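One way to approach the regular-interval question above is to pick, for each tick of a fixed interval, the revision that was current at that moment, and then request scores only for those revision IDs. The sketch below assumes you already have the page history as (timestamp, revid) pairs; the function names are made up for illustration, and batching several revids with "|" is an assumption about the ORES endpoint quoted above, which may also have changed since this thread was written.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Endpoint as quoted in the message above; treat it as an assumption.
ORES_BASE = "https://ores.wmflabs.org/v2/scores/enwiki/wp10/"


def sample_revids(history, start, end, step):
    """Return (tick, revid) pairs: the revision current at each tick.

    history: list of (datetime, revid) pairs sorted by ascending timestamp.
    Ticks that fall before the first revision are skipped.
    """
    timestamps = [ts for ts, _ in history]
    sampled = []
    tick = start
    while tick <= end:
        # Number of revisions saved at or before this tick.
        idx = bisect_right(timestamps, tick)
        if idx:
            sampled.append((tick, history[idx - 1][1]))
        tick += step
    return sampled


def ores_url(revids):
    """Build one scoring URL for a batch of revision IDs.

    Assumption: the endpoint accepts several revids separated by "|".
    """
    query = urlencode({"revids": "|".join(str(r) for r in revids)})
    return ORES_BASE + "?" + query


if __name__ == "__main__":
    # Hypothetical history, not real revision IDs.
    history = [
        (datetime(2016, 1, 10), 100),
        (datetime(2016, 3, 5), 200),
        (datetime(2016, 3, 20), 300),
    ]
    samples = sample_revids(
        history, datetime(2016, 1, 1), datetime(2016, 4, 1), timedelta(days=30)
    )
    unique = sorted({revid for _, revid in samples})
    print(ores_url(unique))
```

Fetching and parsing the response is left out on purpose, since the response layout is not described in this thread. Deduplicating the sampled revids before building the URL avoids scoring the same revision twice when an article goes unedited across several ticks.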
=====
Message: 1
Date: Wed, 15 Mar 2017 21:18:59 +1300
From: "Stuart A. Yeates" syeates@gmail.com
To: Research into Wikimedia content and communities wiki-research-l@lists.wikimedia.org
Subject: [Wiki-research-l] tool / framework for article lifecycle stats ?
Message-ID: <CAC_Lu0bjKgdcRYeKVr2U9Cr_hzatW10jiVi8=bDFcTXE4rs9Nw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Is there a tool or framework for getting article lifecycle stats in an automated fashion? Is anyone aware of something like that? Things like creator (+ their basic stats), total # of edits, who's edited the article (+ their basic stats), article age, article flags, etc.
I'm reasonably platform / language agnostic. I'll only need stats on dozens of articles an hour, so no need for a weaponised platform.
cheers
stuart
--
...let us be heard from red core to black sky
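For the stats listed in the question above (creator, total edits, who has edited, article age), a minimal sketch using the standard MediaWiki API revisions query could look like the following. It is not an existing tool: the helper names are invented for illustration, the summary function expects revision records that have already been fetched (oldest first), and a real script would need to follow the API's continuation parameter for articles with more revisions than one request returns.

```python
from collections import Counter
from datetime import datetime, timezone
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def revisions_url(title, limit=500):
    """Build a MediaWiki API URL listing a page's revisions, oldest first."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user",
        "rvdir": "newer",  # oldest first, so the first entry is the page creation
        "rvlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def lifecycle_stats(revisions, now):
    """Summarise revision dicts (oldest first) with 'user' and 'timestamp' keys."""
    created = datetime.strptime(
        revisions[0]["timestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    # Revisions whose user name is suppressed have no 'user' key.
    editors = Counter(rev.get("user", "<hidden>") for rev in revisions)
    return {
        "creator": revisions[0].get("user", "<hidden>"),
        "total_edits": len(revisions),
        "editors": editors.most_common(),  # (user, edit count), most active first
        "age_days": (now - created).days,
    }


if __name__ == "__main__":
    print(revisions_url("Tulip_mania"))
    # Hypothetical revision records, shaped like the API's revision entries.
    sample = [
        {"user": "A", "timestamp": "2017-01-01T00:00:00Z"},
        {"user": "B", "timestamp": "2017-01-05T00:00:00Z"},
        {"user": "A", "timestamp": "2017-02-01T00:00:00Z"},
    ]
    print(lifecycle_stats(sample, datetime.now(timezone.utc)))
```

At dozens of articles an hour, one request per article (plus continuation where needed) should be well within polite usage. Per-editor basic stats and article flags would need additional queries that are not sketched here.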