[Labs-l] [Analytics] doubt on GeoData / how to obtain articles with coords

Marc Miquel marcmiquel at gmail.com
Mon Mar 2 22:42:44 UTC 2015


Hi Max and Oliver,

Thanks for your answers. The geo_tags table seems quite incomplete. I just
checked some random articles: in the Nepali Wikipedia, for instance, the
article on the capital, Kathmandu, has coordinates, but they do not appear
in geo_tags. So it doesn't seem to be an option.
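
For reference, the generator query Max suggested (quoted below) does not
need one request per article: it can be paged through in batches. Here is a
minimal sketch in Python of how that might look. The wiki host, the
User-Agent string, and the function names are only illustrative, and I am
assuming that coprop=country|region is what returns the ISO 3166 fields
when they are set; I have not verified how complete they are.

```python
import json
import urllib.parse
import urllib.request

# Example host only (Nepali Wikipedia); swap in the wiki you need.
API_URL = "https://ne.wikipedia.org/w/api.php"

# Max's generator query, plus JSON output and continuation opt-in.
BASE_PARAMS = {
    "action": "query",
    "format": "json",
    "generator": "allpages",
    "gapnamespace": "0",
    "gaplimit": "max",
    "prop": "coordinates",
    "colimit": "max",
    "coprop": "country|region",  # assumption: extra fields, may be empty
    "continue": "",
}


def http_fetch(params, api_url=API_URL):
    """Send one API request and return the decoded JSON response."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; replace with one identifying your script.
    headers = {"User-Agent": "geodata-sketch/0.1 (research script)"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def collect_coordinates(fetch=http_fetch):
    """Page through all main-namespace articles, keeping those with coords.

    Returns a dict mapping article title to its list of coordinate sets.
    """
    coords_by_title = {}
    params = dict(BASE_PARAMS)
    while True:
        data = fetch(params)
        pages = data.get("query", {}).get("pages", {})
        for page in pages.values():
            # Pages without coordinates simply lack the "coordinates" key.
            for coord in page.get("coordinates", []):
                coords_by_title.setdefault(page["title"], []).append(coord)
        if "continue" not in data:
            return coords_by_title
        # Merge the continuation values into the next request.
        params = dict(BASE_PARAMS)
        params.update(data["continue"])
```

Calling collect_coordinates() with no arguments would query the live API;
passing a different fetch function makes it possible to try the paging
logic without network access.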

Marc

2015-03-02 23:38 GMT+01:00 Oliver Keyes <okeyes at wikimedia.org>:

> Max's idea is an improvement but still a lot of requests. We really need
> to start generating these dumps :(.
>
> Until the dumps are available, the fastest way to do it is probably Quarry
> (http://quarry.wmflabs.org/), an open MySQL client to our public database
> tables. So, you want the geo_tags table; getting all the coordinate sets on
> the English-language Wikipedia would be something like:
>
> SELECT * FROM enwiki_p.geo_tags;
>
> This should be available for all of our production wikis (SHOW DATABASES
> is your friend): you want [project]_p rather than [project]. Hope that
> helps!
>
> On 2 March 2015 at 17:35, Max Semenik <maxsem.wiki at gmail.com> wrote:
>
>> Use generators:
>> api.php?action=query&generator=allpages&gapnamespace=0&prop=coordinates&gaplimit=max&colimit=max
>>
>> On Mon, Mar 2, 2015 at 2:33 PM, Marc Miquel <marcmiquel at gmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> I am doing some research and I am struggling a bit to obtain geolocated
>>> articles in several languages. I was told that the best tool for getting
>>> each article's location would be the GeoData API. But it seems I would
>>> need to enter each article name, and I don't know if that is the best
>>> way.
>>>
>>> I am thinking, for instance, that for big Wikipedias like French or German
>>> I might need to make a million queries to get only those with coords...
>>> Also, I would like to obtain the region according to ISO 3166-2, which
>>> seems to be there.
>>>
>>> My objective is to obtain different lists of articles related to
>>> countries and regions.
>>>
>>> I don't know if using Wikidata with Python would be a better option. But
>>> I see that the region isn't there. Maybe I could combine Wikidata and
>>> some other tool to give me the region.
>>> Could anyone help me?
>>>
>>> Thanks a lot.
>>>
>>> Marc Miquel
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics at lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>>
>>
>>
>> --
>> Best regards,
>> Max Semenik ([[User:MaxSem]])
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation