Hi,
a new parser for internal links [[...]] and ''quotes''
has been committed to cvs HEAD.
Using the new parser, image thumbnail captions can
have links via [[Image:bla.jpg|thumb|A big [[bla]] I've photographed]].
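Conceptually, nested links can be handled by resolving the innermost [[...]]
first. A toy sketch of that idea in Python (this is only an illustration of
the nesting behaviour, not the actual committed parser code):

    import re

    # Toy sketch of the nesting idea only -- not the committed parser.
    # Innermost [[...]] links are resolved first, so a thumbnail caption
    # like [[Image:bla.jpg|thumb|A big [[bla]] I've photographed]] gets
    # its inner link rendered before the outer image link is handled.
    INNER_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

    def render_links(text, make_link):
        # Repeat until no [[...]] remains; the pattern cannot match
        # across another bracket, so innermost links always match first.
        while True:
            new = INNER_LINK.sub(
                lambda m: make_link(m.group(1), m.group(2) or m.group(1)),
                text)
            if new == text:
                return new
            text = new

    # Example, rendering to plain anchors (the real parser of course
    # treats Image: links specially):
    print(render_links(
        "[[Image:bla.jpg|thumb|A big [[bla]] I've photographed]]",
        lambda target, label: '<a href="/wiki/%s">%s</a>' % (target, label)))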
The code has broken the prefixed links used by ar: (Al[[razi]]-style
links). I'll have to fix this tomorrow.
A test page is available at
http://jeluf.mormo.org/testwiki/wiki.phtml?title=London
Regards,
JeLuF
It is wrong to support old browsers, because it makes Wikipedia maintenance more crufty and
difficult.
There are three main cases where someone has an old browser.
They are in a first-world environment and:
1) are completely clueless, or
2) are part of the digital divide (e.g. old, and hence somewhat clueless);
or
3) they are in the third world.
The majority of people in the first world are not those we need to care about whatsoever, as it's
very easy for them to seek out help (remember, Linux is free: get a live CD, boot it, run
Wikipedia...).
Third world however, is the most important area to focus on.
Now, the ONLY reason these people are running an old browser is that they are running Windows.
Thus, we should tell them to run Linux. It's the ONLY viable solution for running an up-to-date OS
on legacy hardware.
Now, if they have an internet connection, I don't care where they are, they are within physical
distance of someone with more of a brain than they have (namely the ISP staff). One of
these people will be able to install Linux.
You see, these third-world users who are not able to figure out how to run something other
than NS3/IE 2.0 were somehow able to figure out how to get a Windows box. Thus, we know that
they aren't incapable of doing... things.
Thus, if they breathe and are poor, we advise them to run Linux. It's better for us, and it will
domino into being better for them, and then us... and so forth.
This really isn't worth discussion: rectify quietly and efficiently.
hey all,
I wanted to run my own wiki, and have a couple of questions that I would greatly
appreciate having answered (or getting pointers in the right direction).
I want to make a wiki, but I want to enforce some constraints on it.
First, I want the wiki to be in a generic hierarchy such that the hierarchy
follows certain administrator-defined rules, i.e.:
a) it's a hierarchy with three levels;
b) linking between levels is limited to certain places inside the web page.
Ultimately, what I'm looking for is an interface between wiki and CVS - I'd
like the pages to be gathered from mysql and dumped into source control in
corresponding directories (this would fit my application very well).
And, ultimately to go the other way, to take stuff out of CVS and put it
onto the wiki.
I wouldn't mind writing the glue code to do this.. but I'd rather not write
an entire new wiki in order to do it.. ;-(
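Roughly the kind of export glue I have in mind, assuming a MediaWiki-style
'cur' table; the connection details are placeholders, and the table/column
names would need adjusting for whatever wiki engine is actually used:

    import os
    import MySQLdb  # the MySQL-python / mysqlclient module

    # Placeholder connection details; cur_* names assume a
    # MediaWiki-style schema.
    conn = MySQLdb.connect(host="localhost", user="wikiuser",
                           passwd="secret", db="wikidb")
    db = conn.cursor()
    db.execute("SELECT cur_namespace, cur_title, cur_text FROM cur")

    for namespace, title, text in db.fetchall():
        # Map the page title onto a directory path so the checkout
        # mirrors the intended hierarchy ("Level1/Level2/Page" -> dirs).
        path = os.path.join("wiki-export", str(namespace), *title.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path + ".wiki", "w") as f:
            f.write(text)

    # The resulting tree could then be committed with ordinary
    # "cvs import" / "cvs add" + "cvs commit"; for the reverse direction,
    # a similar script would read the files back and UPDATE cur_text.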
In other words, I'm hoping that this isn't a wheel that I have to reinvent. I'm
really interested in using a wiki as a collaborative tool, but I'm not sure if it
fits the more rigorous model that I have in mind. If it doesn't, what do
people suggest as alternatives?
Thanks much,
Ed
(
ps - what's the difference between wiki and its various clones, particularly
PhpWiki? Is there a wiki coded in Perl?
)
Jimmy Wales wrote:
> To compare Wikipedia to Columbia Encyclopedia...
> http://www.encyclopedia.com/
> has the full text of Columbia.
>
> There are pages for alphabetic browsing.
> http://www.encyclopedia.com/browse/browse-Aa.asp
>
> From these pages, it should be possible to get a list of all their
> article titles.
>
> These could be matched up against Wikipedia article titles.
Well, matching them up doesn't prove very easy. For example, what they
call "Abdül Aziz" is on Wikipedia called "Abd-ul-Aziz".
I have used the following heuristics to match up articles:
- redirects (obviously)
- names in the other order ("Thomas Jefferson" rather than
"Jefferson, Thomas")
- middle names deleted
The latter two are already somewhat error-prone (though I haven't
spotted such an error yet).
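In code, the idea is roughly the following (a simplified sketch, not the
exact script I ran):

    # Generate candidate variants of each encyclopedia.com title and look
    # them up against the set of Wikipedia titles, following redirects
    # where one exists.
    def title_variants(title):
        variants = [title]
        if ", " in title:
            # "Jefferson, Thomas" -> "Thomas Jefferson"
            last, first = title.split(", ", 1)
            variants.append("%s %s" % (first, last))
        for v in list(variants):
            parts = v.split()
            if len(parts) > 2:
                # drop middle names: "James Knox Polk" -> "James Polk"
                variants.append("%s %s" % (parts[0], parts[-1]))
        return variants

    def find_match(title, wikipedia_titles, redirects):
        for v in title_variants(title):
            target = redirects.get(v, v)   # follow a redirect if any
            if target in wikipedia_titles:
                return target
        return None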
With just these, I was able to match up 24003 article titles from
encyclopedia.com with articles on Wikipedia. 25101 other article titles
did not yield Wikipedia equivalents (although many of them have one;
e.g. Aziz as mentioned above). A number of other titles (silly me forgot
to output their number) led to the same Wikipedia article; for example,
"Aachen" and "Aix-la-Chapelle" were listed seperately on theirs, but of
course they're the same thing.
The 24003 Wikipedia articles I could match up amount to 79979774 bytes
(almost 80 MB).
However, unfortunately I also had to find that some of them are
disambiguation pages; for example, where encyclopedia.com has a one and
only "Adalbert, Saint", Wikipedia's [[Saint Adalbert]] disambiguates to
[[Adalbert of Prague]] and [[Adalbert of Magdeburg]].
So, clearly, this isn't quite as easy. But anyway. Here is the complete
report:
(**WARNING!** 5.3 MB file! Very slow server! Better let it
download, have dinner, and then view locally!)
http://lionking.org/~timwi/t/wikipedia/comparison.html
Greetings,
Timwi
P.S.: More fun projects? ;-)
Timwi is, of course, completely accurate in all his statements. If I
were a college professor, I would award him A+ and try to get him a job
as a teaching fellow! :-)
My worst error was describing n log n as "log n" -- back to algebra
class for Uncle Ed!
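Just to make the distinction concrete (numbers picked arbitrarily):

    import math

    # With an arbitrary n: log n is the cost of one binary-search lookup,
    # n log n is the cost of sorting the whole list in the first place.
    n = 50000
    print(math.log(n, 2))      # ~15.6
    print(n * math.log(n, 2))  # ~780,000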
Anyway, since there are lots of sharp minds ready to pounce on any bugs,
why don't we start taking an organized look at the database structure?
I'm clearly no expert on /devising/ sort algorithms, but I'm fairly good
at recognizing whether someone has come up with a good idea.
Who wants to work on database structure with me -- or, at least, is
willing to let me sit in and watch?!
Ed Poor, aka Uncle Ed
I really wish I hadn't "upgraded" to this recommended version. But I
won't go into all the issues here.
One question: the status of links (broken vs. existing) NO LONGER UPDATES on
any wiki pages. Does anyone know why this is? I don't ever plan on
implementing memcache (the lack of real documentation as to how I can do
it being the primary reason)... is this an issue for future wikipedia
releases? Is memcache (or lack thereof) ruining the Wiki's ability to
update broken/not broken links?
I have memcache turned off, and the MediaWiki namespace turned off as well.
Thanks for answers.
ciaran
Hi,
this isn't strictly on-topic, because it's not specific to Wikipedia,
but you can probably help me with this.
1) When I load the downloaded SQL dump into MySQL, does it matter if I
have already created the indices for the table, or is this detrimental?
2) If the answer to that is "It is detrimental", then: How do I remove
those indexes? Apparently even if I delete the entire database and
re-create it with just the 'cur' table, magically the indexes are still
there.
Thanks,
Timwi
Jimmy Wales wrote:
> From these pages, it should be possible to get a list of all their
> article titles.
>
> These could be matched up against Wikipedia article titles.
>
> Then we could ask the hypothetical: suppose Wikipedia just snagged the
> same 55,000 topics as Columbia? How big would the resulting text be?
I'm taking it!
Just today I've downloaded the en.wikipedia.org database dump. I don't
have a very fast machine, so it took some time to decompress, and it's
still busy importing it into the DB. Does anyone know approximately how
long that takes? (Since it doesn't show any progress meter or anything)
But once that is done, the Perl script will be easy.
Timwi
Well, I actually snagged it some 1.5 years ago already.
It is a 50 MB TomeRaider file on my Pocket PC :)
Alas, it is not public domain, so I did not publish it.
For comparison, the en: TomeRaider file, most recent edition (Dec), is 185 MB (TR uses
internal compression).
The Columbia download was 1 GB, but that was mostly HTML.
Comparing on an article-by-article basis will be different.
Titles and organization of topics will differ.
Erik Zachte
To compare Wikipedia to Columbia Encyclopedia...
http://www.encyclopedia.com/
has the full text of Columbia.
There are pages for alphabetic browsing.
http://www.encyclopedia.com/browse/browse-Aa.asp
From these pages, it should be possible to get a list of all their
article titles.
These could be matched up against Wikipedia article titles.
Then we could ask the hypothetical: suppose Wikipedia just snagged the
same 55,000 topics as Columbia? How big would the resulting text be?
If the answer is in the ballpark of 6,500,000 words -- i.e. the same
size as Columbia -- then we have an obvious strategy. If, as I would
imagine, the answer is that we're bigger, then we can start digging
into how many of our longer articles would have to be edited down in
order to hit the same "ballpark".
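Once the titles are matched up, the size check itself is trivial; something
along these lines (a sketch, assuming the matched titles and article texts
have already been extracted from a dump):

    # Sketch only: 'articles' maps a Wikipedia title to its wikitext, and
    # 'matched_titles' lists the titles that have a Columbia counterpart.
    def estimate_size(matched_titles, articles):
        texts = [articles[t] for t in matched_titles if t in articles]
        total_bytes = sum(len(t) for t in texts)
        total_words = sum(len(t.split()) for t in texts)
        return total_bytes, total_words

    # Compare total_words against Columbia's ~6,500,000 words to see how
    # much editing down would be needed to hit the same ballpark.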
Note that we don't have an answer from a publisher as to how big we
can be. The guy I talked to expressed a desire to be "as big as
possible" but I warned him that that's a limitation that's going to
come from their end, not ours, because we're already bigger than
Britannica, so our issue is how to get *small enough*, not how to
produce *enough*.
--Jimbo