sj wrote:
> It should however be possible to identify the *hour* in which the
> millionth article was created, no?
Sure, with a few hours of programming that should be possible.
Until the board puts a bounty on it, I'll pass my turn (just kidding of
course, not even then).
Someone will come up with a clever estimate by interpolating monthly figures
(figures for the current month would need to be calibrated, because the dumps
for each language are taken at different times of day).
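A minimal sketch of such an interpolation, in Python; the counts and dump
timestamps below are made-up placeholders, not real figures:

    from datetime import datetime, timedelta

    # Hypothetical totals from two consecutive monthly dumps. Real
    # figures would need calibrating first, since each language's dump
    # is taken at a different time of day.
    count_start = 960000
    count_end = 1020000
    dump_start = datetime(2004, 8, 1, 3, 0)
    dump_end = datetime(2004, 9, 1, 5, 0)

    TARGET = 1000000

    # Assume a constant creation rate between the two dumps and
    # interpolate linearly to the hour in which TARGET was crossed.
    fraction = (TARGET - count_start) / (count_end - count_start)
    span = (dump_end - dump_start).total_seconds()
    estimate = dump_start + timedelta(seconds=fraction * span)
    print(estimate.strftime("%Y-%m-%d %H:00"))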
By the way, it reminds me of a Candid Camera episode where a woman in the
supermarket asked the man ahead of her, who had a full basket, if she could
jump the queue because she only had to pay for one roll of peppermints. Of
course the request was granted. As soon as she had put down the money, she was
stormed by a photographer and the manager, who offered her a large cheque and
a holiday for two because she was the millionth customer. IMO one of their
better pranks :)
Erik Zachte
To be more precise, I think it would be possible (because deleted records
are still preserved in another database),
but I don't think anyone is willing to dig that deep, only to find that a
ten-word stub which had been marked for deletion within minutes now has to
be preserved for posterity :)
Erik Zachte
Hoi,
I want to record some audio for use within Wiktionary. Ogg is the
recommended format. What Windows software can I use to record in the Ogg
format? And when I already have .wav files, how can they be converted to
Ogg? (One possible batch conversion is sketched below.)
Thanks,
GerardM
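For the conversion question, one possible batch approach is to shell out to
the oggenc encoder from the vorbis-tools package (which also ships as a
Windows binary); for the recording itself, Audacity is a common Windows
choice that can export Ogg Vorbis directly. A sketch:

    import subprocess
    from pathlib import Path

    # Convert every .wav file in a directory to Ogg Vorbis by calling
    # oggenc; "-q 5" selects a mid-range quality level.
    for wav in Path("recordings").glob("*.wav"):
        ogg = wav.with_suffix(".ogg")
        subprocess.run(["oggenc", "-q", "5", str(wav), "-o", str(ogg)],
                       check=True)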
Hey,
Can you imagine how all that en-gb/en-us stuff appears from an international point of view?
Somewhere I saw it said that Wikipedia should not be positioned as a British-English-only or American-English-only website. If you're really trying to do something useful, at least for the international community, then improve the content, not language peculiarities.
> If people are willing to spend some time using this syntax to specify
> both spellings, why not let them? It might seem like a little thing to
> many if not most people, but it would certainly give Wikipedia's content
> a more professional and consistent overall appearance.
Cheers,
Domas
Hello! There has been talk recently about reference
tabs and cross-referencing, especially with the new
WikiProject to do fact checking here [
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Fact_and_Reference_Check
].
I think a lot of people are converging on the idea of
having some sort of tab system to enclose discrete
facts, with the quoted fact being placed at the end of
the article.
I'd guess the tabs could be any mark that hasn't been
designated elsewhere ([[, {{). Some proposed tabs
are << and ??. Ad hoc comment tabs of the form
<!--(1) --> Fact <!--(1) end--> are already being used
in this example [
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Fact_and_Reference_Check…
] to show how a tab system might work.
Would it be hard to program into MediaWiki a <<tab>>
system that would autogenerate the quotation? Other
features that could also be added would be:
-The ability to add/hide superscripts after each fact
that would direct you to the list of references below.
-The ability to hide the actual tabs themselves but
have them follow the text around if it is moved or
changed.
The second point is important, since a tab system could
clutter up the text when editing the article. I'd
assume that would be hard to incorporate :). Perhaps
a second 'Edit this page' button that includes the
<<tabs>>?
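As a rough illustration of what the autogeneration could look like, here is
a sketch in Python. The <<fact|source>> form is just one hypothetical
syntax, not an agreed proposal, and a real implementation would live inside
the MediaWiki parser rather than in a standalone script:

    import re

    # Hypothetical markup: <<some statement|cited source>>. Each tagged
    # fact is replaced by the bare statement plus a numbered superscript,
    # and the sources are collected at the end of the page.
    TAB = re.compile(r"<<(.+?)\|(.+?)>>", re.S)

    def render(wikitext):
        refs = []

        def repl(match):
            fact, source = match.group(1), match.group(2)
            refs.append(source.strip())
            return "%s<sup>[%d]</sup>" % (fact, len(refs))

        body = TAB.sub(repl, wikitext)
        if refs:
            notes = "\n".join("[%d] %s" % (i, s)
                              for i, s in enumerate(refs, 1))
            body += "\n\n== References ==\n" + notes
        return body

    print(render("<<Paris is the capital of France.|CIA World Factbook>> More text."))

Note that this only covers the autogeneration; hiding the tabs in the edit
box while keeping them attached to text that moves is the harder part.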
Thank you for your time, especially to the techs who are
volunteering to program MediaWiki.
ShaunMacPherson
> Would it be possible to know which article was number one million?
I don't think it is possible to establish this after the fact.
Let us say article x on p: was the millionth article and was added at time t.
Parsing the databases at time t+10min, another article y would seem to be the
millionth article, because someone had deleted some articles on q: in the
meantime.
And so on: every measurement would point to a different article.
Erik Zachte
Sunday or Monday, I want to spend $10,000 on new hardware. What
should it be?
1. 2 gigabit switches are ordered already, so don't say anything
about that. :-)
2. Rackspace is now at a premium. HOWEVER, we should not worry a LOT
about that, since we are going to have to get a 2nd rack very soon no
matter what we decide today. There is approximately 3U left in the
rack, maybe 4U.
A 2nd rack is going to add $700 per month to our colocation bill, not a
big deal in the grand scheme of things.
3. I have here in the office 12 4U servers (11 mobos, 12 cases) of
extremely questionable quality. Details are posted on meta:
http://meta.wikimedia.org/wiki/Hardware_donation_September_2004
4. Some have suggested more squids? Is there consensus on that?
--
"La nèfle est un fruit." - first words of 50,000th article on fr.wikipedia.org
On Sep 20, 2004, at 9:09 AM, Tân PekTiong wrote:
> On Sep 20, 2004, at 11:31 PM, Brion Vibber wrote:
>
>> On Sep 19, 2004, at 11:34 PM, Henry H. Tan-Tenn wrote:
>>> The Walon Wikipedia was fortunate to have received assistance in
>>> migrating its data into Wikipedia. The Minnan Wikipedia still has
>>> articles sitting out there. But knowing the developers' time is
>>> highly constrained, we have not complained nor do we see it as a
>>> "right" to receive such help.
>>
>> If someone had *asked* we'd have been happy to help... I at least
>> never heard any request to do this.
>
>
> http://mail.wikipedia.org/pipermail/wikitech-l/2004-May/010194.html
> I have *asked* if anybody could help to transfer the articles from the old
> Holopedia to the newly created min-nan
> Wikipedia, but it seems that the developers are too busy.
Unfortunately it seems no one noticed this at the time...
>> At this point it would be harder to merge two separate wikis that
>> have undergone separate development, but we can probably arrange
>> something if that's desired.
>>
>
> The dump is still out there:
> http://tmjiang.dyndns.org/wikipedia/wikipedia-backup.sql
> Can we copy the old articles with history to the new min-nan
> Wikipedia? To avoid conflicts, we could
> add a tag to each old article name, e.g. xxxx_old or something like that.
> Once the xxxx_old articles are moved
> to the new min-nan Wikipedia, we can do the merge or rename of the
> articles.
Can someone take a look at this? I'll try to get to it later in the
week if not (I'm at a conference right now with limited free time).
-- brion vibber (brion @ pobox.com)
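A sketch of the tagging step proposed above, assuming the old Holopedia
pages can be read out of the SQL dump as (title, text) pairs; the extraction
itself depends on the dump's actual schema and is not shown:

    # Tag each old title with an "_old" suffix so that importing into
    # the new wiki cannot overwrite an existing page; merging and
    # renaming can then happen on-wiki afterwards.
    def tag_for_import(old_pages, existing_titles):
        for title, text in old_pages:
            tagged = title + "_old"
            if tagged in existing_titles:
                raise ValueError("still collides after tagging: " + tagged)
            yield tagged, text

    # Placeholder titles, just to show the transformation.
    old_pages = [("Tai-oan", "...article text..."), ("Bun-hoa", "...")]
    for title, text in tag_for_import(old_pages, existing_titles=set()):
        print(title)  # Tai-oan_old, Bun-hoa_old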
Well, the 'what to buy' discussion fizzled out, and in the meantime
Tim Starling did some tests which suggest that the squids are
bandwidth-limited, so buying more squids no longer seems like the slam
dunk that it did the other day.
I'm leaving for Switzerland in less than 24 hours, but we can continue
a longer term discussion of what needs to be bought next, and I can
just as easily make an order from there as here.
One concept that came up in IRC that sounds good to me -- the older
apache machines accept cheaper non-ECC, non-registered RAM. It would
be relatively inexpensive (say $130/GB) to fill those up with 4GB of
RAM each (which will really only give 3.5GB usable each due to
motherboard limitations, so maybe I will just fill them with 3.5GB
each to start with).
This RAM could be used for memcached.
We have no really firm data, but Tim believes (reasonably, but he
cautions that it is uncertain) that the parser cache hit rate could go
from 50% to maybe 85% with enough RAM. A back-of-the-envelope
calculation suggests that as much as 50GB could be put to use.
One naturally supposes that the best bang for the buck is at the low
end, i.e. taking memcached from 6GB (current, if I'm not mistaken) to,
say, 12GB or 18GB would likely result in a significant increase in
that hit rate.
My understanding is that increasing the parser cache hit rate is a
great way to leverage our apache infrastructure. Spending $1300 for
10GB of RAM is likely to do a great deal more good than spending
$1050 for another apache.
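To make the back-of-the-envelope comparison concrete (the hit rates and
prices are the rough figures from this thread; treating every parser-cache
miss as one full parse is a simplifying assumption):

    ram_cost = 1300          # $ for ~10GB of cheap non-ECC RAM
    apache_cost = 1050       # $ for one more apache

    miss_now = 1 - 0.50      # current parser cache hit rate: 50%
    miss_ram = 1 - 0.85      # hoped-for hit rate with more RAM: 85%

    # Going from 50% to 85% hits cuts parser work per page view to
    # 0.15/0.50 = 30% of today's level. Matching that with hardware
    # would mean multiplying total parsing capacity by 1/0.30 = 3.3x,
    # i.e. more than tripling the apache fleet.
    factor = miss_now / miss_ram
    print("$%d of RAM ~= %.1fx current parsing capacity" % (ram_cost, factor))
    print("one $%d apache adds only one machine's worth" % apache_cost)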
Comments?
--Jimbo
--
"La nèfle est un fruit." - first words of 50,000th article on fr.wikipedia.org
OK, so attachments don't work. Thus, I'll copy and paste my brief
correspondence with Jimbo. See below.
------------------------------------------------------------------------------------------------------
Subject:
Re: using the "random page" feature in academic studies
From:
"Jimmy (Jimbo) Wales" <jwales(a)wikia.com>
Date:
Thu, 16 Sep 2004 08:33:40 -0700
To:
samuel <sha(a)chello.se>
Well, as it turns out, I do have a lack of knowledge, and so I recommend
to you the wikitech-l mailing list or the #mediawiki IRC channel on
freenode.net.
I am unsure how the random article function works; it may avoid
certain types of articles for some reason, I don't actually know.
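For reference, the random page feature appears to work roughly like this:
each main-namespace, non-redirect article stores a fixed random key chosen
at creation, and Special:Random draws a fresh value and returns the first
article whose key is at or above it. A simplified model in Python (the
wrap-around and the in-memory table are simplifications):

    import bisect
    import random

    # Each article gets a fixed random key at creation time.
    pages = {t: random.random() for t in ["Alpha", "Beta", "Gamma", "Delta"]}
    keys = sorted((k, t) for t, k in pages.items())

    def random_page():
        # Draw a fresh value; take the first article at or above it,
        # wrapping around to the first article if none is.
        i = bisect.bisect_left(keys, (random.random(),))
        return keys[i % len(keys)][1]

    # Relevant to the study below: an article's chance of selection
    # equals the gap between its key and its predecessor's key, so the
    # selection is only approximately uniform across articles.
    print(random_page())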
In terms of a study of quality, I do think that "random article" may
be a misleading starting point, but of course this depends on the
interpretation of the results.
Wikipedia is a work in progress, and so what would likely be
interesting would be to have "random articles" but sorted into
categories depending on things like: how many edits, how long the
article has been around, how "stable" it is (i.e. if it got a ton of
edits in the past but has reached an equilibrium now), how long it is,
etc.
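A sketch of that stratification; the metadata fields (edits, age_days,
recent_edits, length) and the thresholds are invented for illustration:

    # Bucket randomly sampled articles along the suggested dimensions.
    def stratify(article):
        return {
            "maturity": "young" if article["age_days"] < 90 else "established",
            "activity": "heavy" if article["edits"] >= 50 else "light",
            "stability": "stable" if article["recent_edits"] == 0 else "in flux",
            "size": "stub" if article["length"] < 500 else "full article",
        }

    sample = {"edits": 120, "age_days": 400, "recent_edits": 0, "length": 4200}
    print(stratify(sample))
    # {'maturity': 'established', 'activity': 'heavy',
    #  'stability': 'stable', 'size': 'full article'}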
Any random article is probably not as good as Britannica, for example.
But "featured articles" are generally much much better.
--Jimbo
samuel wrote:
> Hi!
>
> First off, thank you for bringing one of the best ideas in the history
> of the Internet to fruition. With that out of the way, onwards to the
> actual point of this email: I am about to conduct a study of the quality of
> Wikipedia's content. In order to do this, I will need to randomly select
> articles, and for this, the "random page" feature appears to be a natural
> choice. However, I cannot find any information on how it works, and it
> is essential that such information is described in the methodology
> section of the study. The questions I have regarding the selection are
> the following:
>
> 1) What counts as a page?
> 2) Is the selection done from all pages/articles or a subset of them?
> 3) Is there any weight attached to outcomes (for example, so that a very
> popular or frequently edited article would have a higher chance of
> appearing as a random article)?
>
> If you, for whatever reason, have a better idea than using the "random
> page" feature for a study of this sort, I would be very glad if you
> could let me know. Also, if you are unable to answer this email for lack
> of knowledge (I find this hard to believe, but you never know) or time
> constraints, or whatever, I would greatly appreciate it if you could point
> me in the direction of someone who is more likely to be up for the task.
>
> Thank you very much!
>
> Best regards,
> Samuel Härgestam,
> undergrad in mathematics, computer science and philosophy at Stockholm
> University, as well as a true Wikipedia lover