As far as I am concerned, the article is now done.
D. K.
2022-11-08 21:32 GMT+01:00, Dušan Kreheľ <dusankrehel(a)gmail.com>:
> [Fix]:
>
> A link to the source code has been added.
>
> @Dan Andreescu: The format is correct. A yearly summary is a typical
> basic statistical interval, and merging saves time. The file size
> problem disappears if the file is split by local wiki. The skwiki
> file is only 49 MB for 2021, which does not place heavy demands on
> end users who process it for their own purposes.
>
> 2022-11-08 21:30 GMT+01:00, Dušan Kreheľ <dusankrehel(a)gmail.com>:
>> A link to the source code has been added.
>>
>> @Dan Andreescu: The format is correct now. A yearly summary is a
>> typical basic statistical interval, and merging saves time. The
>> file size problem disappears if the file is split by wiki. The
>> skwiki file is only 49 MB for 2021, which does not place heavy
>> demands on end users who process it for their own purposes.
>>
>> 2022-10-06 19:31 GMT+02:00, Dan Andreescu <dandreescu(a)wikimedia.org>:
>>> @Dušan Kreheľ: I think there's a misunderstanding. I read your
>>> re-written article. In it, you say that the current format is:
>>>
>>> domain_code page_title count_views total_response_size
>>>
>>> For an example, you give this:
>>>
>>> sk Kreheľ 2 0
>>>
>>> But that format is actually deprecated, and the new format is
>>> pageviews_complete, which looks like this:
>>>
>>> sk.wikipedia Kreheľ null desktop 13 B2D2G2J2O2T1V1X1
>>>
>>> The B2D2G2J2O2T1V1X1 is exactly the kind of encoding you're talking
>>> about, and no 0-values are present.
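
To make the encoding in the example line concrete, here is a minimal Python sketch of how such a line could be decoded. It assumes that each letter A..X stands for an hour 0..23 and that the number after it is the view count for that hour; that mapping, the field names, and the helper names `decode_hourly` and `parse_line` are illustrative assumptions, not something stated in this thread.

```python
import re

# Assumption: letters A..X map to hours 0..23, each followed by that hour's
# view count; hours with zero views are omitted from the string.
HOUR_TOKEN = re.compile(r"([A-X])(\d+)")


def decode_hourly(encoded):
    """Return {hour: views} for a string such as 'B2D2G2J2O2T1V1X1'."""
    return {
        ord(letter) - ord("A"): int(count)
        for letter, count in HOUR_TOKEN.findall(encoded)
    }


def parse_line(line):
    """Split one line into six space-separated fields.

    The field names are hypothetical labels chosen for this sketch, and the
    page title is assumed to contain no spaces.
    """
    wiki, title, page_id, access, total, hourly = line.split(" ")
    return {
        "wiki": wiki,
        "title": title,
        "page_id": None if page_id == "null" else page_id,
        "access": access,
        "total": int(total),
        "hourly": decode_hourly(hourly),
    }


record = parse_line("sk.wikipedia Kreheľ null desktop 13 B2D2G2J2O2T1V1X1")
# Under the assumed mapping, the hourly counts add up to the total field.
assert sum(record["hourly"].values()) == record["total"]
```

Under that assumption the hourly counts in the example add up to the total field of 13, which is a useful sanity check when reading these files.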
>>>
>>> You made the point that we are missing a yearly rollup in this new
>>> format. This would be quite a large file, but if there's a good use
>>> case for such a dump, a request in Phabricator is a good way to
>>> proceed.
>>>
>>> On Sat, Oct 1, 2022 at 9:58 AM Dušan Kreheľ <dusankrehel(a)gmail.com>
>>> wrote:
>>>
>>>> The big update of the article is done. Please take a look.
>>>>
>>>> Gergő Tisza: The current hourly format for fresh data can remain.
>>>> It can be converted to another format later, which would make it
>>>> more suitable for others.
>>>>
>>>> 2022-09-18 22:35 GMT+02:00, Dušan Kreheľ <dusankrehel(a)gmail.com>:
>>>> > I have updated the document. I added the export of human pageviews
>>>> > for the year 2021. The statistics are in the article. A download
>>>> > link has been added.
>>>> >
>>>> > Dan Andreescu: I had no problem understanding you.
>>>> >
>>>> > 2022-09-05 21:48 GMT+02:00, Dan Andreescu <dandreescu(a)wikimedia.org>:
>>>> >> Hi Dušan,
>>>> >>
>>>> >> I added the details on pageviews_complete to the talk page on your
>>>> >> proposal
>>>> >> <https://en.wikipedia.org/w/index.php?title=User_talk:Du%C5%A1an_Krehe%C4%BE…>.
>>>> >> Please let me know if it's still confusing.
>>>> >>
>>>> >
>>>> _______________________________________________
>>>> Wikitech-l mailing list -- wikitech-l(a)lists.wikimedia.org
>>>> To unsubscribe send an email to wikitech-l-leave(a)lists.wikimedia.org
>>>>
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
>>>
>>
>