Hello
I'm new here. I joined after Jimmy seemed to like my suggestion of making an offline version of Wikipedia.
We all know Wikipedia is updated frequently, so there should be a daily database of changes that the program can use to update its local database.
Basically, the database + program could be purchased on CD or downloaded from the Internet. We could have a "static database" (since we can't rewrite CDs) and ship only "diff databases" as updates. When the diff database gets big, the program should suggest purchasing a new CD or downloading an "updated static database" (for users who downloaded the program).
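A minimal sketch of the static-plus-diffs idea, written in Python for brevity rather than the C++ proposed in this message. The class name `OfflineStore`, the dictionary-based storage, and the 50% refresh threshold are all assumptions made for illustration; nothing in this thread fixes a storage format.

```python
# Sketch only: OfflineStore, its fields, and the threshold are hypothetical.
class OfflineStore:
    """Read-only base snapshot (the CD) plus a writable overlay of diffs."""

    def __init__(self, static_db, refresh_ratio=0.5):
        self.static_db = dict(static_db)  # title -> text, never modified
        self.diffs = {}                   # title -> new text, or None if deleted
        self.refresh_ratio = refresh_ratio

    def apply_daily_diff(self, changes):
        """Merge one day's changes; later changes override earlier ones."""
        self.diffs.update(changes)

    def get(self, title):
        """Look in the diffs first, then fall back to the static snapshot."""
        if title in self.diffs:
            return self.diffs[title]      # None means the article was deleted
        return self.static_db.get(title)

    def should_suggest_new_snapshot(self):
        """True once the overlay is large relative to the snapshot."""
        if not self.static_db:
            return bool(self.diffs)
        return len(self.diffs) / len(self.static_db) >= self.refresh_ratio


store = OfflineStore(
    {"Brazil": "old text", "Lemon": "a fruit", "Obsolete": "x", "USP": "univ"}
)
store.apply_daily_diff({"Brazil": "new text", "Obsolete": None})
```

Lookups never touch the snapshot when a diff exists for the title, so the CD contents can stay read-only while the overlay lives on the hard disk.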
I'm interested in starting this program (which will be licensed under the GPL), and I can write it in C++ (compiling with Microsoft Visual C++). Since it's a GPL program, I know most people would prefer a GPL compiler like gcc, and maybe an open-source OS, but I'm not used to programming with those.
Well... Of course I want to participate in the project, but I want to do a Windows version for now (I've learned a lot of Windows APIs from programs I wrote before, and I'm really not interested in learning everything again for another OS at this time). We may change the compiler, since I have no problem making a few "adjustments" to the way I program, and I'm a computer science student, so one day or another I'll need to learn that anyway. After this program is done, we could port it to other OSes, or, if many people don't agree with doing it for Windows, we can split up and have each group write it for a different OS.
So now I have 2 questions, because I'm new here:
1) Was this already discussed here? Is there already something in progress? (Maybe I should have asked that before writing all of this ;) )
2) What do we have?
   a) How is the Wikipedia database organized?
   b) Can we get access to the "raw database" so the program can use it?
   c) Is there already a "change log"?
3) Who's with me? ;)
I think that's all for now.
Regards,
/*
+---------------------------------+
| Luís Fernando Estrozi           |
+---------------------------------+
| Ciência da Computação - USP     |
|                                 |
| mailto:lemon@grad.icmc.usp.br   |
| ICQ#: 25541891                  |
|                                 |
| http://grad.icmc.usp.br/~lemon/ |
+---------------------------------+
There are 10 types of people in the world: Those who understand binary, and those who don't
*/ EOF
Hello Luís,
Luís Fernando wrote:
So now I have 2 questions, because I'm new here:
- Was this already discussed here? Is there already something in progress? (Maybe I should have asked that before writing all of this ;) )
Well, I am already developing a Wikipedia offline reader using the XUL framework [1]. But I'm still waiting for some features for the reasons shown below.
- What do we have? a) How is the Wikipedia database organized?
There are two database tables - the "cur" and the "old". The cur table stores the current version of an article, the old table stores each previous version. Since there is no revision id attached to an article version in the cur table, you can't point to a specific article version - see [2].
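To make the cur/old split concrete, here is a small sketch using an in-memory SQLite database. The column names are simplified assumptions modelled on the table names above, not the actual MediaWiki schema, and `save_edit` is a hypothetical helper.

```python
import sqlite3

# Sketch only: a simplified, assumed layout for the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cur (cur_title TEXT PRIMARY KEY,
                      cur_text TEXT, cur_timestamp TEXT);
    CREATE TABLE old (old_id INTEGER PRIMARY KEY AUTOINCREMENT,
                      old_title TEXT, old_text TEXT, old_timestamp TEXT);
""")


def save_edit(conn, title, text, timestamp):
    """Archive the current row into old, then overwrite the row in cur."""
    row = conn.execute(
        "SELECT cur_text, cur_timestamp FROM cur WHERE cur_title = ?",
        (title,),
    ).fetchone()
    if row is not None:
        conn.execute(
            "INSERT INTO old (old_title, old_text, old_timestamp) "
            "VALUES (?, ?, ?)",
            (title, row[0], row[1]),
        )
    conn.execute(
        "INSERT OR REPLACE INTO cur (cur_title, cur_text, cur_timestamp) "
        "VALUES (?, ?, ?)",
        (title, text, timestamp),
    )
    conn.commit()


save_edit(conn, "Lemon", "A fruit.", "20040901000000")
save_edit(conn, "Lemon", "A yellow citrus fruit.", "20040906185400")

current = conn.execute(
    "SELECT cur_text FROM cur WHERE cur_title = 'Lemon'"
).fetchone()[0]
history = conn.execute(
    "SELECT old_id, old_text FROM old WHERE old_title = 'Lemon'"
).fetchall()
```

In this sketch a previous version gets an identifier only once it moves into old; the row in cur is addressed by title alone, which is the limitation described above.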
b) Can we get access to the "raw database" so the program can use it?
There is a thread in this mailing list concerning this question ([3]).
- Who's with me? ;)
Probably me. ;-)
Eckhart
On Monday 06 September 2004 18:54, Luís Fernando wrote:
Basically, the database + program could be purchased on CD or downloaded from the Internet. We could have a "static database" (since we can't rewrite CDs) and ship only "diff databases" as updates. When the diff database gets big, the program should suggest purchasing a new CD or downloading an "updated static database" (for users who downloaded the program).
I'm interested in starting this program (which will be licensed under the GPL), and I can write it in C++ (compiling with Microsoft Visual C++). Since it's a GPL program, I know most people would prefer a GPL compiler like gcc, and maybe an open-source OS, but I'm not used to programming with those.
Well... Of course I want to participate in the project, but I want to do a Windows version for now (I've learned a lot of Windows APIs from programs I wrote before, and I'm really not interested in learning everything again for another OS at this time). We may change the compiler, since I have no problem making a few "adjustments" to the way I program, and I'm a computer science student, so one day or another I'll need to learn that anyway. After this program is done, we could port it to other OSes, or, if many people don't agree with doing it for Windows, we can split up and have each group write it for a different OS.
Wouldn't it be by far the easiest if this program were a small "web server" that retrieves Wikipedia pages and serves them to the user's web browser of choice? That way it could be written quickly, and be small and portable.
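A rough sketch of such a local "web server", using only the Python standard library. The `PAGES` dictionary stands in for whatever local article store would actually be used, and the `/wiki/<title>` URL layout is likewise an assumption.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote
import html

# Hypothetical local store; a real reader would query its database here.
PAGES = {"Main_Page": "Welcome to the offline copy."}


def render(path):
    """Return (status, html_body) for a request path like /wiki/Main_Page."""
    title = unquote(path.rsplit("/", 1)[-1])
    text = PAGES.get(title)
    if text is None:
        return 404, "<h1>Not found</h1>"
    return 200, "<h1>%s</h1><p>%s</p>" % (html.escape(title), html.escape(text))


class WikiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = render(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Listen on localhost only and block until interrupted."""
    HTTPServer(("127.0.0.1", port), WikiHandler).serve_forever()
```

Calling `serve()` and opening http://127.0.0.1:8000/wiki/Main_Page in any browser would show the page; keeping `render` separate from the handler makes it testable without opening a socket.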
Wouldn't it be by far the easiest if this program were a small "web server" that retrieves Wikipedia pages and serves them to the user's web browser of choice? That way it could be written quickly, and be small and portable.
It would be. Also, it would suck (not to mention that it implies installing MySQL on the user's machine, a requirement that I wouldn't want to impose on an end-user application).
The whole point of writing an offline, native client is to provide a much better browsing and/or editing experience for users. If it's not much better, people won't bother to install it.
Users don't care about how easy (or hard) it is to develop software. Only programmers care about that. Users care about value added. If there is no value added in a native client, there is no reason for them to use it.
The only value added by such a solution would be the ability to run Wikipedia offline. While a noble thing, it's not really that much of a change. There's much more that could be done, and much more will have to be done if anyone expects people to actually use said application.
Krzysztof Kowalczyk | http://blog.kowalczyk.info
A very interesting topic of discussion.
We are trying to use MetaWiki as an internal communication, documentation, and knowledge base platform. I find it really annoying to edit and update pages online because it is slow, especially over the Internet. It is also painful to navigate back and forth through the wiki site to find the right pages to write on.
As a result, I continue to use a small local program named e-Stack Room to take notes and write my own knowledge base. I have actually used it for years. It is much faster and is well organized in a tree view.
Offline versus online is a long-standing topic in computer science. I learned in a graduate operating systems course that many researchers are working on it. It is about keeping remote and local storage coherent at a low communication cost. I can review the course material for details if necessary. :-)
Maybe MetaWiki is not suitable for our goal. But if there were a local, faster, well-organized version of it, I'd like to be the first one to try it.
On Tue, 7 Sep 2004 23:10:15 -0700, Krzysztof Kowalczyk kkowalczyk@gmail.com wrote:
Wouldn't it be by far the easiest if this program were a small "web server" that retrieves Wikipedia pages and serves them to the user's web browser of choice? That way it could be written quickly, and be small and portable.
It would be. Also, it would suck (not to mention that it implies installing MySQL on the user's machine, a requirement that I wouldn't want to impose on an end-user application).
The whole point of writing an offline, native client is to provide a much better browsing and/or editing experience for users. If it's not much better, people won't bother to install it.
Users don't care about how easy (or hard) it is to develop software. Only programmers care about that. Users care about value added. If there is no value added in a native client, there is no reason for them to use it.
The only value added by such a solution would be the ability to run Wikipedia offline. While a noble thing, it's not really that much of a change. There's much more that could be done, and much more will have to be done if anyone expects people to actually use said application.
Krzysztof Kowalczyk | http://blog.kowalczyk.info
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
Nikola Smolenski wrote:
On Monday 06 September 2004 18:54, Luís Fernando wrote:
Well... Of course I want to participate in the project, but I want to do a Windows version for now (I've learned a lot of Windows APIs from programs I wrote before, and I'm really not interested in learning everything again for another OS at this time). We may change the compiler, since I have no problem making a few "adjustments" to the way I program, and I'm a computer science student, so one day or another I'll need to learn that anyway. After this program is done, we could port it to other OSes, or, if many people don't agree with doing it for Windows, we can split up and have each group write it for a different OS.
Wouldn't it be by far the easiest if this program were a small "web server" that retrieves Wikipedia pages and serves them to the user's web browser of choice? That way it could be written quickly, and be small and portable.
Check out the "waikiki" package from MediaWiki CVS. It already contains my (slightly outdated) version of a command-line wiki-to-HTML program and comes with a no-install web server for Windows. It is written in C++ and uses the sqlite library (database-in-a-file, no server required).
I did create a CD for the German Wikipedia once, complete with an installer that can copy the software, database and/or images to the hard disk. Only installing the software is required; the rest can stay on the CD.
I also have an online update tool that can download a list of articles via the Special:Export page and update the mentioned sqlite database accordingly. What would be needed is a Special page to list "all articles new or changed since XXX", so it can synchronize with the live site.
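The update step could look roughly like the sketch below. This is not the actual tool: the sample XML is a simplified stand-in for Special:Export output (real exports carry an XML namespace and more elements), the `articles` table layout is an assumption, and the `since` filter is applied on the client only because the server-side "changed since XXX" listing does not exist yet. No network access is performed.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified, hand-written sample in the general shape of an export file.
SAMPLE_EXPORT = """<mediawiki>
  <page>
    <title>Lemon</title>
    <revision>
      <timestamp>2004-09-06T18:54:00Z</timestamp>
      <text>A yellow citrus fruit.</text>
    </revision>
  </page>
  <page>
    <title>USP</title>
    <revision>
      <timestamp>2004-08-01T10:00:00Z</timestamp>
      <text>A university in Brazil.</text>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def child(elem, name):
    """Return the first direct child with the given local name, or None."""
    for c in elem:
        if local_name(c.tag) == name:
            return c
    return None


def parse_export(xml_text):
    """Yield (title, timestamp, text) for the last revision of each page."""
    root = ET.fromstring(xml_text)
    for page in root:
        if local_name(page.tag) != "page":
            continue
        revisions = [c for c in page if local_name(c.tag) == "revision"]
        if not revisions:
            continue
        last = revisions[-1]
        yield (child(page, "title").text,
               child(last, "timestamp").text,
               child(last, "text").text or "")


def update_local(conn, xml_text, since):
    """Insert or replace every exported page newer than `since`."""
    changed = 0
    for title, timestamp, text in parse_export(xml_text):
        if timestamp > since:  # same-format UTC strings sort chronologically
            conn.execute(
                "INSERT OR REPLACE INTO articles (title, timestamp, text) "
                "VALUES (?, ?, ?)",
                (title, timestamp, text),
            )
            changed += 1
    conn.commit()
    return changed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (title TEXT PRIMARY KEY, timestamp TEXT, text TEXT)"
)
updated = update_local(conn, SAMPLE_EXPORT, since="2004-09-01T00:00:00Z")
```

With a server-side list of changed titles, the client would only need to request those titles from Special:Export instead of filtering after the download.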
Magnus
One reason I am so interested in this is that as I continue to work to expand my contacts in Africa, I keep hearing that CD-ROM is good for them, they have a pressing need for content CD-ROMs, and their internet access is terrible.
Getting wikipedia->CD working easily is one piece of the puzzle there.