[Wikipedia-l] Code to get text of page

Jimmy Wales jwales at bomis.com
Thu Aug 30 06:29:29 UTC 2001


        $FS  = "\xb3";        # UseMod field separator character
        $FS1 = $FS . "1";     # separates the fields of a page
        $FS2 = $FS . "2";     # separates the fields of a section
        $FS3 = $FS . "3";     # separates the fields of the text
        # read full contents of .db file into $dbfile
        %Page = split(/$FS1/, $dbfile, -1);
        %Section = split(/$FS2/, $Page{'text_default'}, -1);
        %Text = split(/$FS3/, $Section{'data'}, -1);
        $pagetext = $Text{'text'};   # text of the page

I got this from Clifford Adams's UseMod website.  It's highly
useful.  I'm using it to provide a snippet of each page in the
search engine output.
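
For anyone who wants to try the snippet idea, here is a rough sketch,
not the code actually running on the site.  The snippet() helper, the
30-character limit and the sample record are made up for illustration;
only the separators and the three split() calls come from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of UseMod): collapse whitespace and return
# at most $max characters of text, cut back to the previous whole word.
sub snippet {
    my ($text, $max) = @_;
    $text =~ s/\s+/ /g;          # collapse newlines and runs of spaces
    $text =~ s/^ | $//g;         # trim a leading or trailing space
    return $text if length($text) <= $max;
    my $cut = substr($text, 0, $max);
    $cut =~ s/\s+\S*$//;         # drop the last (possibly partial) word
    return $cut . "...";
}

# Build a tiny fake record nested the same way as the .db file contents.
my $FS = "\xb3";
my ($FS1, $FS2, $FS3) = ($FS . "1", $FS . "2", $FS . "3");
my $text    = join($FS3, 'text', "Wikipedia is a free\nencyclopedia that anyone can edit.");
my $section = join($FS2, 'data', $text);
my $dbfile  = join($FS1, 'text_default', $section);

# Same three-level split as in the code above.
my %Page    = split(/$FS1/, $dbfile, -1);
my %Section = split(/$FS2/, $Page{'text_default'}, -1);
my %Text    = split(/$FS3/, $Section{'data'}, -1);

print snippet($Text{'text'}, 30), "\n";
```

Cutting at a word boundary keeps the search results from ending in half
a word; a real version would probably also want to strip wiki markup.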

Tomorrow I will use it to build a full-text index of the entire
site.  This will be mondo cool.
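
One simple way to do the indexing is an inverted index: a hash from each
word to the pages that contain it.  This is only a sketch under that
assumption, not necessarily how the site will do it.  The %pages data and
the pages_with() helper are hypothetical; in practice the page text would
come from $pagetext as extracted above.

```perl
use strict;
use warnings;

# Hypothetical input: page title => page text.
my %pages = (
    'HomePage' => "Welcome to the free encyclopedia.",
    'Perl'     => "Perl is a free programming language.",
);

# Inverted index: lowercased word => { page title => occurrence count }.
my %index;
for my $title (keys %pages) {
    for my $word (lc($pages{$title}) =~ /(\w+)/g) {
        $index{$word}{$title}++;
    }
}

# Return the titles of the pages containing a word, sorted by title.
sub pages_with {
    my ($word) = @_;
    return sort keys %{ $index{lc $word} || {} };
}

print join(", ", pages_with('free')), "\n";
```

Keeping a count per page, rather than just a flag, leaves room to rank
results by how often the word appears.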



-- 
*************************************************
*            http://www.nupedia.com/            *
*      The Ever Expanding Free Encyclopedia     *
*************************************************


