Hi,

Since the MediaWiki API can return the size in bytes of the latest revision of a Wikipedia page, isn't it possible to retrieve this information with a generator? (It's a genuine question; I'm not at all comfortable with this API.)
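
To make the question concrete, here is a minimal Python sketch of the plain MediaWiki API side, outside the query service: `action=query` with `prop=info` returns a `length` field holding the page size in bytes. The helper names, the user-agent string and the 10000-byte default are assumptions for illustration only, and the sketch says nothing about whether `wikibase:mwapi` can expose that field.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_info_params(titles):
    """Parameters for a prop=info query; each returned page carries 'length' in bytes."""
    return {
        "action": "query",
        "prop": "info",
        "titles": "|".join(titles),
        "format": "json",
        "formatversion": "2",  # pages come back as a list rather than a dict keyed by page id
    }


def pages_at_least(response, min_bytes):
    """Return (title, length) pairs for pages whose size is >= min_bytes."""
    pages = response.get("query", {}).get("pages", [])
    return [
        (page["title"], page["length"])
        for page in pages
        if page.get("length", 0) >= min_bytes  # missing pages have no 'length'
    ]


def fetch_sizes(titles, min_bytes=10_000):
    """Call the live API (needs network access) and keep only the large pages."""
    url = API_URL + "?" + urllib.parse.urlencode(build_info_params(titles))
    request = urllib.request.Request(
        url, headers={"User-Agent": "size-filter-sketch/0.1 (example only)"}
    )
    with urllib.request.urlopen(request) as reply:
        return pages_at_least(json.load(reply), min_bytes)
```

Something like `fetch_sizes(["Title one", "Title two"])` would then return only the titles at or above the threshold. The API limits how many titles one request may carry, so a long list would have to be sent in batches.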

Ettore Rizza


On Sat, 12 Jan 2019 at 14:41, Reem Al-Kashif <reemalkashif@gmail.com> wrote:
Right, I see what you mean. Thanks a lot!

On Sat, 12 Jan 2019 at 15:35, Lucas Werkmeister <mail@lucaswerkmeister.de> wrote:

Well, if you take just the MWAPI part of the query, you’ll get exactly 10000 results, but most of them aren’t male authors (a lot of them seem to be lists of various kinds). And I think those 10000 results are all we can get from the API, so if we limit those to male authors afterwards, we only get a few results (about 100), and there’s no way to increase that number as far as I’m aware, because apparently we can’t get more than 10000 total pages from MWAPI.

Cheers,
Lucas

On 12.01.19 13:57, Reem Al-Kashif wrote:
Thank you so much, Nicolas & Lucas! 

@Lucas this helps a lot! At least I will get an idea of what I need until PetScan is sorted out. Could you elaborate a bit on what you mean by "most of its results are linked to items we don’t care about"?

Best,
Reem

On Sat, 12 Jan 2019 at 14:18, Lucas Werkmeister <mail@lucaswerkmeister.de> wrote:

You can’t directly query for the size as far as I know, but you can use the longpages query page generator to get a list of the longest enwiki pages and then filter the associated items for male authors. This will only get you about a hundred results before the longpages list is exhausted (most of its results are linked to items we don’t care about), and it won’t get you the actual size. The order of results therefore isn’t necessarily meaningful either; you just know they’re among the longest pages.

SELECT ?item ?titleEn WHERE {
  hint:Query hint:optimizer "None".
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "en.wikipedia.org";
                    wikibase:api "Generator";
                    mwapi:generator "querypage";
                    mwapi:gqppage "Longpages";
                    mwapi:gqplimit "max".
    ?title wikibase:apiOutput mwapi:title.
  }
  BIND(STRLANG(?title, "en") AS ?titleEn)
  ?sitelink schema:name ?titleEn;
            schema:isPartOf <https://en.wikipedia.org/>;
            schema:about ?item.
  ?item wdt:P31 wd:Q5;
        wdt:P106 wd:Q36180;
        wdt:P21 wd:Q6581097.
}
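
If you would rather run the query from a script than from the web interface, a minimal Python sketch could look like the following. It sends the query text to the public endpoint at https://query.wikidata.org/sparql and reads the standard SPARQL JSON results. The function names and the user-agent string are assumptions for illustration; only the `?item` and `?titleEn` variable names come from the query above.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_request(query):
    """Prepare a GET request that asks the query service for JSON results."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": "longpages-sketch/0.1 (example only)",
        },
    )


def extract_rows(result):
    """Turn SPARQL JSON results into (item URI, English title) pairs."""
    return [
        (binding["item"]["value"], binding["titleEn"]["value"])
        for binding in result["results"]["bindings"]
    ]


def run_query(query):
    """Send the query to the live service (needs network access)."""
    with urllib.request.urlopen(build_request(query)) as reply:
        return extract_rows(json.load(reply))
```

Passing the full query text above as a string to `run_query` should give back one pair per matching article.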


Cheers, Lucas

On 12.01.19 12:56, Nicolas VIGNERON wrote:
Hi Reem,

If this page https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI is up to date, it does not seem possible to get the size of a Wikipedia article (but I must admit I don't use or know "wikibase:mwapi" much; maybe it has changed or will change).

Cheers,
Nicolas

On Sat, 12 Jan 2019 at 12:16, Reem Al-Kashif <reemalkashif@gmail.com> wrote:
Hello!

Hope this finds you well. I put together a query to create a list of English Wikipedia articles about male writers. Is it possible to filter the results by size? For example, articles that are larger than or equal to 10k bytes?

I understand that this is better done by PetScan, but my PetScan query refuses to cooperate for a reason I don't know yet.. :/

Thanks in advance.

Best,
Reem

--
Kind regards,
Reem Al-Kashif


_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
