Thanks Christian. I do not believe that we need to backfill the TSVs that are filled from the udp2log stream.

Oliver -- GLEE uses the geo-edit data.

-Toby

On Mon, Dec 1, 2014 at 4:57 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
Thanks, Christian! :)

What do we use geowiki for, out of interest?

On 1 December 2014 at 06:59, Christian Aistleitner <christian@quelltextlich.at> wrote:
Hi,

On Sun, Nov 30, 2014 at 12:34:27AM -0800, Ori Livneh wrote:
> See message below about a network outage currently affecting multiple
> servers in eqiad.

The network is up again, and the affected machines are again reachable.

TL;DR:
+------------------------------+-----------+--------------------------------+
| Dataset                      | Affected? | Will be backfilled?            |
+------------------------------+-----------+--------------------------------+
| Analytics slave databases    | no        | ---                            |
| Analytics cluster            | no        | ---                            |
| Pagecounts-all-sites         | no        | ---                            |
| Pagecounts-raw               | yes       | yes, not completed yet         |
| TSVs                         | yes       | no                             |
| EventLogging database        | yes       | yes, done                      |
| EventLogging graphite graphs | yes       | no                             |
| geowiki                      | yes       | yes, done                      |
| Wikipedia Zero graphs        | yes       | Excluded 2014-11-30 from plots |
+------------------------------+-----------+--------------------------------+

I'll track updates on

  https://phabricator.wikimedia.org/T76334

Best regards,
Christian



* Pagecounts-raw:

  pagecounts-20141130-040000.gz
  pagecounts-20141130-050000.gz
  pagecounts-20141130-060000.gz
  pagecounts-20141130-070000.gz
  pagecounts-20141130-080000.gz
  pagecounts-20141130-090000.gz
  pagecounts-20141130-100000.gz
  pagecounts-20141130-110000.gz
  projectcounts-20141130-040000
  projectcounts-20141130-050000
  projectcounts-20141130-060000
  projectcounts-20141130-070000
  projectcounts-20141130-080000
  projectcounts-20141130-090000
  projectcounts-20141130-100000
  projectcounts-20141130-110000

are bad.

We'll backfill them from pagecounts-all-sites.
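
In case you need to skip or re-fetch the bad files in a script: the
names listed above follow a regular hourly pattern, so they can be
generated instead of typed out. A minimal Python sketch (the function
name and its parameters are made up for illustration, they are not part
of any existing tooling):

```python
from datetime import datetime, timedelta


def affected_pagecounts_files(start=datetime(2014, 11, 30, 4), hours=8):
    """Return the names of the bad pagecounts-raw files.

    Hypothetical helper: the defaults cover the hourly files from
    040000 through 110000 on 2014-11-30, as listed in this mail.
    """
    names = []
    # pagecounts files are gzipped, projectcounts files have no suffix.
    for prefix, suffix in (("pagecounts", ".gz"), ("projectcounts", "")):
        for offset in range(hours):
            stamp = (start + timedelta(hours=offset)).strftime("%Y%m%d-%H0000")
            names.append(f"{prefix}-{stamp}{suffix}")
    return names


if __name__ == "__main__":
    for name in affected_pagecounts_files():
        print(name)
```

Running it prints the 16 file names from the list above, pagecounts
first, then projectcounts.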

If you're still using pagecounts-raw, please consider switching to
pagecounts-all-sites:

  https://wikitech.wikimedia.org/wiki/Analytics/Pagecounts-all-sites



* TSVs:

All udp2log streams are affected. Calling out only the most prominent
ones:
** sampled-1000 TSVs
** mobile-sampled-100 TSVs
** zero TSVs
** edits TSVs

They are all missing data between 2014-11-30T03:50 and
2014-11-30T10:13.

Properly backfilling them from the cluster would be possible, but this
would need serious data massaging. If no one says the data for
2014-11-30 is badly needed, I would not backfill the TSVs.
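
If the TSVs stay as they are, downstream jobs may want to flag
measurements that fall into the gap. A minimal Python sketch of such a
check, assuming timestamps are already parsed into naive datetime
objects in the same timezone as the boundaries given above (the
constant and function names are hypothetical):

```python
from datetime import datetime

# Gap boundaries as stated in this mail; the timezone is assumed to
# match whatever the timestamps in the TSVs use.
GAP_START = datetime(2014, 11, 30, 3, 50)
GAP_END = datetime(2014, 11, 30, 10, 13)


def in_outage_gap(ts):
    """Return True if ts lies within the udp2log outage window (inclusive)."""
    return GAP_START <= ts <= GAP_END


if __name__ == "__main__":
    sample = datetime(2014, 11, 30, 5, 0)
    print(sample.isoformat(), in_outage_gap(sample))
```

The inclusive boundaries are a judgment call; lines right at the edges
of the window may or may not be present in a given TSV.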



* EventLogging:

** Database is up and running and backfilled.
No artifacts are expected.

** EventLogging stats on graphite
Note that this item is only about the graphs in graphite. The data in
the database (see above) is ok!

The overall counts are ok and should not show artifacts.
The per-schema counts are basically blank for 2014-11-30.  Backfilling
them would be really time-consuming, and the historic parts of those
graphs do not seem to be used anyway. So I suggest not backfilling
here.



* geowiki:

Data for the affected period has been backfilled.



* Wikipedia Zero graphs:

2014-11-30 has been added to the list of dates that will not show up
in the Wikipedia Zero plots.




--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
                           Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3     Email:  christian@quelltextlich.at
4293 Gutau, Austria          Phone:          +43 7946 / 20 5 81
                             Fax:            +43 7946 / 20 5 81
                             Homepage: http://quelltextlich.at/
---------------------------------------------------------------

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics




--
Oliver Keyes
Research Analyst
Wikimedia Foundation
