[Wikipedia-l] minor ? about stats in new codebase. + a proposal
Daniel Mayer
maveric149 at yahoo.com
Thu Jul 11 01:36:46 UTC 2002
On Wednesday 10 July 2002 03:53 pm, Jimbo wrote:
> lcrocker at nupedia.com wrote:
> > > In the new software, would something like [[User talk archive
> > > 1:maveric149]] be counted as an article when the statistics are
> > > run, or does the stats look for and exclude any article with the
> > > string "talk" in it?
> >
> > Only the specifically recognized namespace prefixes are treated
> > as namespaces; anything else is just a regular article with a
> > colon in the title.
>
> This sounds dangerous to me, as people might inadvertently "pollute" the
> namespace name space?
>
> --Jimbo
I agree -- that's why I asked the question.
However, I don't think we should just give up on the use of the colon
character as Magnus suggested. I see the use of the colon as one of the nicer
new features of the new wikiware version. But there is still the stats issue.
Would it be just as easy to run the stats by grouping page names that
contain certain text strings? For example, search for every page name that
contains the strings "user", "talk", "wikipedia", "image", "/" and maybe
"list", then group the matches by string, count the pages in each group,
remove pages that are counted more than once, and present several 'total
number of articles' figures on the stats page.
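To make the idea concrete, here is a rough sketch of the grouping step in
Python (not the wikiware's own code). The function name title_stats and the
MARKERS tuple are made up for illustration; the marker strings are the ones
listed above, and matching is assumed to be case-insensitive.

```python
from collections import Counter

# Marker strings from the proposal; the name MARKERS is hypothetical.
MARKERS = ("user", "talk", "wikipedia", "image", "/", "list")


def title_stats(titles):
    """Group page titles by the marker strings they contain.

    A title that matches several markers is counted once under each
    marker, but only once when it is removed from the article total.
    """
    unique_titles = set(titles)      # guard against duplicate input rows
    per_marker = Counter()
    marked = set()                   # titles that matched at least one marker

    for title in unique_titles:
        lowered = title.lower()      # assumption: case-insensitive matching
        hits = [m for m in MARKERS if m in lowered]
        for m in hits:
            per_marker[m] += 1
        if hits:
            marked.add(title)

    return {
        "total_pages": len(unique_titles),
        "per_marker": dict(per_marker),
        "articles_excluding_marked": len(unique_titles) - len(marked),
    }
```

One caveat with plain substring matching: an ordinary article such as
"Realist" contains "list" and would be excluded along with the real lists,
so the per-string counts are probably best shown next to the totals rather
than hidden.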
If this proposal would be more of a CPU resource issue than grouping by
namespace, then we could simply have these stats updated once a day based on
a snapshot of the page title database (this can be processed off-server or by
the server itself in bits and pieces whenever a few CPU cycles can be
spared).
This might be a good idea anyway, since I don't see the need for a stats
page that recalculates the number of articles for each and every request
(esp. since the main reason for reworking the PHP wikiware was to maximize
efficiency).
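A minimal sketch of the "recalculate once a day, not on every request" idea,
again in Python and not tied to the actual wikiware. The class name
CachedStats, the compute callback and the 24-hour default are all
assumptions made for illustration.

```python
import time

ONE_DAY_SECONDS = 24 * 60 * 60  # assumption: refresh the stats once a day


class CachedStats:
    """Serve a stored stats result and recompute it only when it is stale."""

    def __init__(self, compute, ttl=ONE_DAY_SECONDS, clock=time.time):
        self._compute = compute  # hypothetical callback that does the expensive count
        self._ttl = ttl          # seconds a stored result stays valid
        self._clock = clock      # injectable so the behaviour can be tested
        self._value = None
        self._stamp = None       # time of the last recalculation

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._ttl:
            self._value = self._compute()
            self._stamp = now
        return self._value
```

Every request for the stats page would call get(); only the first request
after the stored result expires pays for the recount. The same effect could
be had by running the count off-server and uploading the result.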
--maveric149