[Wikipedia-l] Code to get text of page
Jimmy Wales
jwales at bomis.com
Thu Aug 30 06:29:29 UTC 2001
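# field separators: \xb3 followed by a nesting-level digit
$FS = "\xb3";
$FS1 = $FS . "1";
$FS2 = $FS . "2";
$FS3 = $FS . "3";
# read full contents of .db file into $dbfile
# each split yields alternating key/value pairs, which load straight into a hash
%Page = split(/$FS1/, $dbfile, -1); # page-level fields
%Section = split(/$FS2/, $Page{'text_default'}, -1); # default text section
%Text = split(/$FS3/, $Section{'data'}, -1); # fields of the text record
$pagetext = $Text{'text'}; # text of the page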
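
For anyone who wants to run this standalone, here is a minimal sketch that wraps the same three splits in a sub and adds the file-reading step. The sub names page_text_from_string and page_text_from_file are made up for illustration (they are not from UseMod), and the code assumes the .db file is laid out exactly as the snippet above expects.

```perl
use strict;
use warnings;

# Field separators, same values as in the snippet above.
my $FS  = "\xb3";
my $FS1 = $FS . "1";
my $FS2 = $FS . "2";
my $FS3 = $FS . "3";

# Hypothetical helper: pull the page text out of the raw contents of a .db file.
# Returns nothing if an expected field is missing.
sub page_text_from_string {
    my ($dbfile) = @_;
    my %Page = split(/$FS1/, $dbfile, -1);
    return unless defined $Page{'text_default'};
    my %Section = split(/$FS2/, $Page{'text_default'}, -1);
    return unless defined $Section{'data'};
    my %Text = split(/$FS3/, $Section{'data'}, -1);
    return $Text{'text'};
}

# Hypothetical helper: slurp a .db file as raw bytes and extract its page text.
sub page_text_from_file {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;               # undefine the record separator to read the whole file
    my $dbfile = <$fh>;
    close($fh);
    return page_text_from_string($dbfile);
}
```

Calling page_text_from_file with the path to one page's .db file should give back the same value as $pagetext above.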
I got this from Clifford Adams's UseMod website. It's highly
useful. I'm using it to provide a snippet of each page in the
search engine output.
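
As a rough illustration of the snippet idea, the sketch below flattens the page text to one line and cuts it at a word boundary. The name make_snippet and the 200-character default are assumptions for the example, not what the search engine actually does.

```perl
use strict;
use warnings;

# Hypothetical helper: build a short one-line snippet from page text.
# $max is the maximum snippet length before the trailing "..." (default 200, arbitrary).
sub make_snippet {
    my ($pagetext, $max) = @_;
    $max = 200 unless defined $max;
    return '' unless defined $pagetext;
    (my $flat = $pagetext) =~ s/\s+/ /g;   # collapse newlines and runs of whitespace
    $flat =~ s/^\s+|\s+$//g;               # trim both ends
    return $flat if length($flat) <= $max;
    my $cut = substr($flat, 0, $max);
    $cut =~ s/\s+\S*$//;                   # drop the last (possibly partial) word
    return $cut . '...';
}
```

Passing $pagetext from the code above as the first argument would give the text to show next to each search hit.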
Tomorrow I will use it to full-text index the entire site. This
will be mondo cool.
--
*************************************************
* http://www.nupedia.com/ *
* The Ever Expanding Free Encyclopedia *
*************************************************