Asher,
As you can see in the conversation below, I discovered that sha1 hashes
have not been generated for historical revision content in Wikipedia, only
for revisions saved since the release (mid-April). Is this intended?
If so, is there a plan to backfill the sha1s for older revisions, or should
we (data monkeys) keep track of THE GREAT SHA1 EPOCH of April 2012 when
performing our analysis from here forward?
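
In case it helps in the meantime, here is a minimal sketch of how we could fill the gap on our side. It assumes rev_sha1 is the SHA-1 of the revision text written in base 36 and zero-padded to 31 characters (which matches the 31-character value in the query output below), and that the text is hashed as UTF-8. The function names are hypothetical, not part of any MediaWiki API.

```python
import hashlib

BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36_sha1(text: str) -> str:
    """SHA-1 of the UTF-8 encoded text, in base 36, zero-padded to 31 chars.

    Assumption: this mirrors the format stored in revision.rev_sha1.
    """
    n = int(hashlib.sha1(text.encode("utf-8")).hexdigest(), 16)
    digits = []
    while n:
        n, remainder = divmod(n, 36)
        digits.append(BASE36_DIGITS[remainder])
    # A 160-bit value needs at most 31 base-36 digits.
    return "".join(reversed(digits)).rjust(31, "0")


def checksum_for(rev_sha1: str, text: str) -> str:
    """Use the stored checksum if present; otherwise compute it locally.

    An empty rev_sha1 is treated as "not yet populated" (pre-epoch revision).
    """
    return rev_sha1 if rev_sha1 else base36_sha1(text)
```

This needs the revision text for every pre-epoch revision, so it is only practical where we already have the content on hand (e.g. from the dumps).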
Thanks!
-Aaron
On Thu, May 10, 2012 at 4:27 PM, Diederik van Liere <dvanliere(a)wikimedia.org> wrote:
> Hi Aaron,
>
> You are right, it seems not to have been calculated for older pages (I
> checked it for eswiki). I was under the impression that this either had
> finished or is in progress. Probably best to ask Asher for more details.
>
>
> Best,
>
> Diederik
>
> On Thu, May 10, 2012 at 5:09 PM, Aaron Halfaker <aaron.halfaker(a)gmail.com> wrote:
>
>> I'm using the web API and db42.
>>
>> API example:
>>
>> http://en.wikipedia.org/w/api.php?action=query&prop=revisions&title…
>>
>> MySQL example:
>> mysql> select rev_id, rev_sha1 from revision where rev_page = 15661504
>> and rev_id <= 488033783 order by rev_timestamp desc limit 2;
>> +-----------+---------------------------------+
>> | rev_id    | rev_sha1                        |
>> +-----------+---------------------------------+
>> | 488033783 | i8x0e29kxs2t1o9f1h03ks7q59yyyrv |
>> | 485404713 |                                 |
>> +-----------+---------------------------------+
>> 2 rows in set (0.00 sec)
>>
>> -Aaron
>>
>> On Thu, May 10, 2012 at 4:03 PM, Diederik van Liere <dvanliere(a)wikimedia.org> wrote:
>>
>>> Which machine are you accessing?
>>> D
>>>
>>> On Thu, May 10, 2012 at 4:58 PM, Aaron Halfaker <aaron.halfaker(a)gmail.com> wrote:
>>>
>>>> Hey guys,
>>>>
>>>> I'm trying to use the sha1 hashes of Wiki content for the first time
>>>> (woot! Props to D et al. for seeing it through) but I'm having some
>>>> trouble actually getting them out of the API/databases. It looks like
>>>> the checksums only go back to April 19th. Is this true of all pages?
>>>> Is there any plan to propagate the metric backwards?
>>>>
>>>> -Aaron
>>>>
>>>> _______________________________________________
>>>> Analytics mailing list
>>>> Analytics(a)lists.wikimedia.org
>>>>
>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>
>>>>