Hi Asaf,
It looks like the script is having difficulties with the welcome page.
A few things:
- Update your code against the SVN if necessary.
- Are you sure /home/asaf/dev/hewiki_dump/index.html exists?
- I will review this part of the code this evening and add checks and logs to help find the issue.
Regards Emmanuel
On Tue 30/06/09 11:20, "Asaf Bartov" asaf.bartov@gmail.com wrote:
Hello, Emmanuel, and everyone.
I'm still trying to create a ZIM file of the Hebrew Wikipedia dump. I've made some progress -- the dump is complete, but the buildZimFileFromDirectory.pl script is still failing:
The command line I used was:
asaf@abartov-deb:~/dev/kiwix/dumping_tools/scripts$ ./buildZimFileFromDirectory.pl --htmlPath=/home/asaf/dev/hewiki_dump --welcomePage=./index.html
The output is:
NOTICE: CREATE TABLE will create implicit sequence "article_aid_seq" for serial column "article.aid"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "article_pkey" for table "article"
NOTICE: CREATE TABLE will create implicit sequence "category_cid_seq" for serial column "category.cid"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "category_pkey" for table "category"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "categoryarticles_pkey" for table "categoryarticles"
NOTICE: CREATE TABLE will create implicit sequence "zimfile_zid_seq" for serial column "zimfile.zid"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "zimfile_pkey" for table "zimfile"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "zimdata_pkey" for table "zimdata"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "zimarticles_pkey" for table "zimarticles"
NOTICE: CREATE TABLE will create implicit sequence "indexarticle_xid_seq" for serial column "indexarticle.xid"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "indexarticle_pkey" for table "indexarticle"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "words_pkey" for table "words"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "trivialwords_pkey" for table "trivialwords"
Use of uninitialized value in concatenation (.) or string at ../classes//Kiwix/ZimIndexer.pm line 805.
Use of uninitialized value $welcomePage in concatenation (.) or string at ../classes//Kiwix/ZimIndexer.pm line 746.
DBD::Pg::db do failed: ERROR: invalid input syntax for integer: "" at ../classes//Kiwix/ZimIndexer.pm line 385.
ERROR: invalid input syntax for integer: ""
The last few lines in all.log are:
[2009-06-30 07:05:43,997] builZimFileFromDirectory.pl - Rewriting url in /home/asaf/dev/hewiki_dump/articles/�×/� /22/Portal~�×� _�_� ����¢�×__79ba.html
[2009-06-30 07:05:44,010] builZimFileFromDirectory.pl - Adding to DB /home/asaf/dev/hewiki_dump/articles/�×/� /22/Portal~�×� _�_� ����¢�×__79ba.html
[2009-06-30 07:05:44,243] builZimFileFromDirectory.pl - Rewriting url in /home/asaf/dev/hewiki_dump/articles/�/�¨/� /��¨� ���_ ���.html
[2009-06-30 07:05:44,305] builZimFileFromDirectory.pl - Adding to DB /home/asaf/dev/hewiki_dump/articles/�/�¨/� /��¨� ���_ ���.html
[2009-06-30 07:05:44,547] builZimFileFromDirectory.pl - Rewriting url in /home/asaf/dev/hewiki_dump/articles/a/c/h/�§���¥~Achdut.JPG_bd38.html
[2009-06-30 07:05:44,567] builZimFileFromDirectory.pl - Adding to DB /home/asaf/dev/hewiki_dump/articles/a/c/h/�§���¥~Achdut.JPG_bd38.html
[2009-06-30 07:05:44,797] builZimFileFromDirectory.pl - Rewriting url in /home/asaf/dev/hewiki_dump/articles/�/�/�/�×�� ��×~�� �����¤26����.html
[2009-06-30 07:05:44,812] builZimFileFromDirectory.pl - Adding to DB /home/asaf/dev/hewiki_dump/articles/�/�/�/�×�� ��×~�� �����¤26����.html
[2009-06-30 07:05:45,007] builZimFileFromDirectory.pl - Rewriting url in /home/asaf/dev/hewiki_dump/articles/t/r/i/�§���¥~Triumph_of_the_Will_-_Congress_Hall.jpg_2aad.html
[2009-06-30 07:05:45,024] builZimFileFromDirectory.pl - Adding to DB /home/asaf/dev/hewiki_dump/articles/t/r/i/�§���¥~Triumph_of_the_Will_-_Congress_Hall.jpg_2aad.html
[2009-06-30 07:05:45,249] builZimFileFromDirectory.pl - Adding to DB /home/asaf/dev/hewiki_dump/articles/�×/�/�×/�×��×��×_� �������×.html
[2009-06-30 07:05:45,509] builZimFileFromDirectory.pl - Rewriting url in /home/asaf/dev/hewiki_dump/articles/�/�/�/������
[2009-06-30 07:05:45,578] builZimFileFromDirectory.pl - Adding to DB /home/asaf/dev/hewiki_dump/articles/�/�/�/������
Any idea what is wrong? Any more diagnostics I can provide?
Thanks in advance,
Asaf
On Mon, Apr 20, 2009 at 11:11 PM, Emmanuel Engelhart wrote:
Asaf Bartov wrote:
So far, I've been able to compile and run all the tools, but am having some trouble creating ZIM files: I've dumped a locally-loaded MediaWiki installation to HTML using these instructions [1], but when I try to run the builZimFileFromDirectory.pl script, I get silly postgresql errors about failing to connect using "kiwix" user.
I have committed to the Kiwix SVN a new version of builZimFileFromDirectory.pl that allows specifying the dbUser and dbPassword on the command line.
Emmanuel
--
Asaf Bartov
Links:
[1] http://openzim.org/Wiki2html
[2] http://enigmail.mozdev.org
Hi, Emmanuel. Thanks for responding.
- I am (and was) sync'd to revision 661, which is the latest for the dumping tools subdir.
- The file exists and is a simple HTML document. Are there any assumptions made about its content? If so, what are they, and can I get a sample index.html document from one of your ZIM files, without downloading the huge German ZIM file I see on the site?
Thanks!
A.
Hi Asaf,
I have just committed patched versions of ZimIndexer.pm and buildZimFileFromDirectory.pl.
Now:
* The format of the "welcomePage" parameter is checked (no '.' or '/' at the beginning of the path).
* The script checks that the welcomePage exists at all.
* If the welcomePage was a redirect and no other pages pointed to it, the page was removed, which could have generated your error message. This cannot occur anymore.
I hope the script will now work for you.
Regards Emmanuel
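
The first two checks listed in this message can be illustrated with a short sketch. It is written in Python for brevity and is not the Perl code committed to ZimIndexer.pm; the function name `check_welcome_page` and its arguments are hypothetical, and it assumes the welcome page is given relative to the HTML dump directory.

```python
import os
import tempfile


def check_welcome_page(html_path, welcome_page):
    """Return the full path of the welcome page, or raise ValueError.

    Hypothetical helper: it mirrors the two checks described in the
    message, not the actual Perl code in ZimIndexer.pm.
    """
    # Check 1: the value must not be empty or start with '.' or '/'.
    if not welcome_page or welcome_page[0] in "./":
        raise ValueError(
            "welcomePage must not start with '.' or '/': %r" % welcome_page
        )
    # Check 2: the file must actually exist inside the HTML dump directory.
    full_path = os.path.join(html_path, welcome_page)
    if not os.path.isfile(full_path):
        raise ValueError("welcomePage not found: %s" % full_path)
    return full_path


if __name__ == "__main__":
    # Build a throwaway dump directory so the example runs anywhere.
    with tempfile.TemporaryDirectory() as dump_dir:
        with open(os.path.join(dump_dir, "index.html"), "w") as handle:
            handle.write("<html><body>Welcome</body></html>")

        print(check_welcome_page(dump_dir, "index.html"))
        for bad in ("./index.html", "/index.html", "missing.html"):
            try:
                check_welcome_page(dump_dir, bad)
            except ValueError as error:
                print("rejected:", error)
```

Under these assumptions, a value such as ./index.html would be rejected by the first check, while index.html would pass as long as the file is present in the dump directory.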