On Sun, Sep 7, 2014 at 8:11 AM, Sean Pringle <springle(a)wikimedia.org> wrote:
>
> On Sun, Sep 7, 2014 at 5:58 AM, Brad Jorsch (Anomie) <
> bjorsch(a)wikimedia.org> wrote:
>
>> The database query for that is simple enough:
>>
>> SELECT /* ApiQueryCategoryMembers::run Anomie */
>> cl_from,cl_sortkey,cl_type,page_namespace,page_title,cl_timestamp FROM
>> `page`,`categorylinks` FORCE INDEX (cl_timestamp) WHERE cl_to =
>> 'Copy_to_Wikimedia_Commons_(bot-assessed)' AND (cl_from=page_id) ORDER BY
>> cl_timestamp,cl_from LIMIT 501;
>>
>> And the PHP code doesn't do anything complicated either. Maybe Sean can
>> give us more insight if there's some subtle database thing going on here.
>>
>
> As Nik noted, the query plan walking cl_timestamp is not ideal. Plus, even
> with the forced index the query requires a filesort since cl_timestamp
> index is on (cl_to,cl_timestamp) and not (cl_timestamp,cl_from).
>
We're including a constant cl_to in the query here, so the index on
(cl_to,cl_timestamp) is exactly what we want.

As for ORDER BY cl_timestamp, cl_from, that's
https://gerrit.wikimedia.org/r/#/c/103589/ taking advantage of InnoDB's
clustered indexes, where it silently appends the primary key (or in this
case the first/only UNIQUE key, cl_from) to all other indexes.

When I EXPLAIN this query against enwiki, there's no filesort on master,
db1055, db1051, and db1066.
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => categorylinks
[type] => ref
[possible_keys] => cl_timestamp
[key] => cl_timestamp
[key_len] => 257
[ref] => const
[rows] => 635858
[Extra] => Using index condition; Using where
)
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => page
[type] => eq_ref
[possible_keys] => PRIMARY
[key] => PRIMARY
[key_len] => 4
[ref] => enwiki.categorylinks.cl_from
[rows] => 1
[Extra] =>
)
There is a filesort on db1061, db1062, db1065, db1072, and db1073.
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => categorylinks
[type] => ref
[possible_keys] => cl_timestamp
[key] => cl_timestamp
[key_len] => 257
[ref] => const
[rows] => 706656
[Extra] => Using index condition; Using where; Using filesort
)
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => page
[type] => eq_ref
[possible_keys] => PRIMARY
[key] => PRIMARY
[key_len] => 4
[ref] => enwiki.categorylinks.cl_from
[rows] => 1
[Extra] =>
)
My wild guess would be that the latter set of databases were somehow
created differently so that InnoDB is clustering using some index other
than cl_from.
>
> Removing the FORCE INDEX would allow cl_sortkey index to be used, with
> better selectivity.
>
On master, db1055, db1051, and db1066, EXPLAIN still reports that the
cl_timestamp index is chosen after the FORCE INDEX is removed. Removing it
does cause EXPLAIN to stop saying "Using index condition", though.
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => categorylinks
[type] => ref
[possible_keys] => cl_from,cl_timestamp,cl_sortkey
[key] => cl_timestamp
[key_len] => 257
[ref] => const
[rows] => 525652
[Extra] => Using where
)
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => page
[type] => eq_ref
[possible_keys] => PRIMARY
[key] => PRIMARY
[key_len] => 4
[ref] => enwiki.categorylinks.cl_from
[rows] => 1
[Extra] =>
)
On db1061, db1062, db1065, db1072, and db1073, it does choose cl_sortkey
but it still filesorts.
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => categorylinks
[type] => ref
[possible_keys] => cl_from,cl_timestamp,cl_sortkey
[key] => cl_sortkey
[key_len] => 257
[ref] => const
[rows] => 525652
[Extra] => Using index condition; Using where; Using filesort
)
stdClass Object
(
[id] => 1
[select_type] => SIMPLE
[table] => page
[type] => eq_ref
[possible_keys] => PRIMARY
[key] => PRIMARY
[key_len] => 4
[ref] => enwiki.categorylinks.cl_from
[rows] => 1
[Extra] =>
)
If I also remove the cl_from from the ORDER BY, then all databases report
the same query plan, using the cl_timestamp index. But the API can't do that
without reopening https://bugzilla.wikimedia.org/show_bug.cgi?id=24782.
--
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
You should use both: dataType: 'jsonp' in the $.ajax config and format=json
in the URL.
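
A minimal sketch of that combination, assuming jQuery is loaded on the page.
The helper names buildApiUrl and buildAjaxOptions are hypothetical, and the
query parameters are only an example. With dataType 'jsonp', jQuery adds the
callback parameter to the URL itself, but it does not set format, so the URL
has to carry format=json explicitly:

```javascript
// Hypothetical helper: build an API URL that always asks for JSON output.
function buildApiUrl(endpoint, params) {
  var merged = {};
  Object.keys(params).forEach(function (key) {
    merged[key] = params[key];
  });
  // Without an explicit format the API answers with an HTML page,
  // which the browser refuses to execute as a script.
  merged.format = 'json';
  var query = Object.keys(merged).map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(merged[key]);
  }).join('&');
  return endpoint + '?' + query;
}

// Hypothetical helper: options object for $.ajax.
// dataType 'jsonp' makes jQuery append its own callback parameter.
function buildAjaxOptions(url, onSuccess, onError) {
  return {
    url: url,
    dataType: 'jsonp',
    cache: true,
    success: onSuccess,
    error: onError
  };
}

var url = buildApiUrl('https://en.wikipedia.org/w/api.php', {
  action: 'query',
  list: 'recentchanges'
});

// Only issue the request where jQuery is actually available.
if (typeof $ !== 'undefined' && $ && typeof $.ajax === 'function') {
  $.ajax(buildAjaxOptions(
    url,
    function (data) { console.log(data); },
    function (xhr, status) { console.error('request failed: ' + status); }
  ));
}
```

Do not add a callback parameter to the URL by hand in this setup; jQuery
generates one for each request.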
On Tue, Sep 9, 2014 at 6:00 AM, <mediawiki-api-request(a)lists.wikimedia.org>
wrote:
> Today's Topics:
>
> 1. Re: Cross-domain AJAX problems with Wikipedia's API (jim andrews)
> 2. Re: Cross-domain AJAX problems with Wikipedia's API
> (Kristian Kankainen)
> 3. Re: Cross-domain AJAX problems with Wikipedia's API (jim andrews)
> 4. Re: Cross-domain AJAX problems with Wikipedia's API (Legoktm)
> 5. Re: Cross-domain AJAX problems with Wikipedia's API (jim andrews)
> 6. Re: Cross-domain AJAX problems with Wikipedia's API
> (Bartosz Dziewoński)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 8 Sep 2014 10:34:01 -0700
> From: jim andrews <jim(a)vispo.com>
> To: mediawiki-api(a)lists.wikimedia.org
> Subject: Re: [Mediawiki-api] Cross-domain AJAX problems with
> Wikipedia's API
> Message-ID: <5CF6D333-BDF3-4952-B927-F164981F563C(a)vispo.com>
> Content-Type: text/plain; charset=windows-1252
>
> I read the recent thread on cross-domain AJAX problems with Wikipedia’s
> API but I’m still having problems. I’m getting the following error:
>
> XMLHttpRequest cannot load
> https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&format=j….
> No 'Access-Control-Allow-Origin' header is present on the requested
> resource. Origin 'http://vispo.com' is therefore not allowed access.
>
> That results after I make the following call:
> $.ajax({url:url, success:ajaxSuccess, error:ajaxError});
>
> where url is the above url, ajaxSuccess is a function, and so is
> ajaxError. I also wrote an accessible function fooblah but it doesn’t get
> called.
>
> Please advise.
>
> ja
> http://vispo.com
>
>
>
> -------------------------------------------------------------------------------------------------------------
> Brad wrote:
>
> There is a whitelist, stored in the configuration variable
> $wgCrossSiteAJAXdomains in CommonSettings.php. This file can be viewed at
> [1], or in revision control at [2].
>
> You can query anonymously using JSONP (e.g. [3]), or by querying from your
> own server rather than from a webpage. If you are going to be querying from
> a webpage, do review the API Etiquette page.[4]
>
>
>
> [1]:
> https://noc.wikimedia.org/conf/highlight.php?file=CommonSettings.php
>
> [2]:
>
>
> https://git.wikimedia.org/blob/operations%2Fmediawiki-config.git/master/wmf…
>
> [3]:
>
>
> https://et.wikipedia.org/w/api.php?action=query&list=recentchanges&format=j…
>
> [4]:
> https://www.mediawiki.org/wiki/API:Etiquette
>
>
> ------------------------------
>
> Message: 2
> Date: Tue, 09 Sep 2014 07:52:44 +0300
> From: Kristian Kankainen <kristian(a)eki.ee>
> To: mediawiki-api(a)lists.wikimedia.org
> Subject: Re: [Mediawiki-api] Cross-domain AJAX problems with
> Wikipedia's API
> Message-ID: <540E879C.2090503(a)eki.ee>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Hello!
>
> I got my queries working with the following code. Try removing your
> callback parameter from the url and try this:
>
> $.ajax({
> 'url': url,
> 'dataType': 'jsonp',
> 'cache': true,
> 'success': ajaxSuccess,
> 'error': ajaxError
> });
>
> Kristian K
>
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 9 Sep 2014 00:10:47 -0700
> From: jim andrews <jim(a)vispo.com>
> To: MediaWiki API announcements & discussion
> <mediawiki-api(a)lists.wikimedia.org>
> Subject: Re: [Mediawiki-api] Cross-domain AJAX problems with
> Wikipedia's API
> Message-ID: <FDC654BC-8CE3-4F20-8144-C0CD56A421DD(a)vispo.com>
> Content-Type: text/plain; charset=windows-1252
>
> Hi Kristian,
>
> Thanks for your help. I have removed the callback parameter (and the
> format=jsonfm) from the below url and have also changed the $.ajax call as
> you advise. Consequently, I am no longer getting the error I previously
> was. However, I am getting the below error:
>
> Resource interpreted as Script but transferred with MIME type text/html: "
> https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&gsradius…".
> jquery-1.11.0.min.js:4
> Refused to execute script from '
> https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&gsradius…'
> because its MIME type ('text/html') is not executable, and strict MIME type
> checking is enabled.
>
> I am getting this error both locally and after I upload to my real server.
> Please advise.
>
> ja
>
>
>
> ------------------------------
>
> Message: 4
> Date: Tue, 09 Sep 2014 01:14:14 -0700
> From: Legoktm <legoktm.wikipedia(a)gmail.com>
> To: mediawiki-api(a)lists.wikimedia.org
> Subject: Re: [Mediawiki-api] Cross-domain AJAX problems with
> Wikipedia's API
> Message-ID: <540EB6D6.7040302(a)gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On 9/9/14, 12:10 AM, jim andrews wrote:
> > Hi Kristian,
> >
> > Thanks for your help. I have removed the callback parameter (and the
> format=jsonfm) from the below url and have also changed the $.ajax call as
> you advise. Consequently, I am no longer getting the error I previously
> was. However, I am getting the below error:
>
> Actually you should pass format=json if you want JSON to be returned. If
> no format parameter is specified, the API defaults to "xmlfm", which is
> an HTML representation of the XML format. See
> <https://www.mediawiki.org/wiki/API:Data_formats#Output> for some more
> details.
>
> -- Legoktm
>
>
>
> ------------------------------
>
> Message: 5
> Date: Tue, 9 Sep 2014 01:38:34 -0700
> From: jim andrews <jim(a)vispo.com>
> To: MediaWiki API announcements & discussion
> <mediawiki-api(a)lists.wikimedia.org>
> Subject: Re: [Mediawiki-api] Cross-domain AJAX problems with
> Wikipedia's API
> Message-ID: <4F588DB3-91D8-44C0-90D5-4ABA17170E7C(a)vispo.com>
> Content-Type: text/plain; charset=windows-1252
>
> Hi Legoktm,
>
> Did you note
>
> $.ajax({
> 'url': url,
> 'dataType': 'jsonp',
> 'cache': true,
> 'success': ajaxSuccess,
> 'error': ajaxError
> });
>
> I presume the 'dataType': 'jsonp' setting inserts some parameter in the URL
> that sets the format to jsonp.
>
> ja
>
>
>
> ------------------------------
>
> Message: 6
> Date: Tue, 9 Sep 2014 10:56:20 +0200
> From: Bartosz Dziewoński <matma.rex(a)gmail.com>
> To: "MediaWiki API announcements & discussion"
> <mediawiki-api(a)lists.wikimedia.org>
> Subject: Re: [Mediawiki-api] Cross-domain AJAX problems with
> Wikipedia's API
> Message-ID:
> <CAA-yUx1k+_0750XL=
> PA0eBBvCTSnY6MzwLJrFCxg2y4J8snHSw(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> It doesn't.
>
> --
> -- Matma Rex
>
A tool I have written, For the Common Good [1], uses the following type of query to
fetch a list of "random" files that users may like to transfer to Commons. The
category name may differ but the structure is the same:
https://en.wikipedia.org/w/api.php?format=xml&cmnamespace=6&cmtitle=Categor…
In 2011 when I was first writing FtCG, this query ran at an acceptable speed.
Recently, though, it has become extremely slow, to the point where timeouts are now
a regular occurrence. It sometimes takes 4 or 5 tries (and several minutes) before
results are returned. From then on, however, it works quickly. If you run this exact
query now, there's a good chance it will work quickly because others have been
running the query before you.
The cause seems to be the "cmsort=timestamp" portion of the request. If this is
removed, it works essentially instantaneously. However, I don't really want the
files in alphabetical order, as it doesn't seem very "random".
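
For comparison, the two variants of the request can be sketched as below.
This is only an illustration: the URL above is truncated, so action=query and
list=categorymembers are assumed here, the category title is borrowed from the
SQL quoted elsewhere in this thread, and categoryMembersUrl is a hypothetical
helper name.

```javascript
// Hypothetical helper: build a list=categorymembers request URL,
// with or without cmsort=timestamp.
function categoryMembersUrl(categoryTitle, sortByTimestamp) {
  var params = {
    action: 'query',          // assumed; not visible in the truncated URL
    list: 'categorymembers',  // assumed; not visible in the truncated URL
    format: 'xml',
    cmnamespace: 6,           // namespace 6 is the File namespace
    cmtitle: categoryTitle
  };
  if (sortByTimestamp) {
    params.cmsort = 'timestamp';
  }
  var query = Object.keys(params).map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
  }).join('&');
  return 'https://en.wikipedia.org/w/api.php?' + query;
}

// Category title taken from the SQL in this thread; treat it as an example.
var category = 'Category:Copy to Wikimedia Commons (bot-assessed)';

// Variant reported as slow: members ordered by the time they were added.
var slowUrl = categoryMembersUrl(category, true);

// Variant reported as fast: default ordering by sort key.
var fastUrl = categoryMembersUrl(category, false);

console.log(slowUrl);
console.log(fastUrl);
```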
Four questions:
1. Why does this query take so long?
2. Can anything be done on the server side to make it faster?
3. Why does it take so much longer now than it did in 2011?
4. Is there a better way to fetch a random cross-section of files in a particular
category?
TTO
[1] https://en.wikipedia.org/wiki/User:This,_that_and_the_other/For_the_Common_…
Hello!
Since I can't get even the simplest code snippet to work (e.g. [1]), I
want to ask a really simple question.
Am I allowed to query the Wikipedia API [2] from any domain if I set
the 'origin' field to match my domain? Or is there a whitelist, perhaps
containing only the relevant Wikimedia sites, that blocks me from doing
this? I have tried more elaborate ajax script parameters following all bot
best practices, but I am still blocked by CORS.
The code snippet mentioned above was modified only by adding "'origin':
location.origin", but I get the error:
"SyntaxError: JSON.parse: unexpected end of data at line 1 column 1 of
the JSON data"
(using jQuery 1.11.0 and Firefox 31.0).
Best wishes
Kristian Kankainen
[1]: https://www.mediawiki.org/wiki/Manual:CORS
[2]: https://et.wikipedia.org/w/api.php