On Nov 19 2009, Platonides wrote:
zh509@york.ac.uk wrote:
On Nov 19 2009, Roan Kattouw wrote:
2009/11/19 zh509@york.ac.uk:
Greetings,
May I ask a question about the Wikipedia database? I downloaded the current Wikipedia revision data and found that some records have exactly the same rev_id, rev_user, and timestamp. What does this mean? Are they the same edit or different edits?
If they belong to the same wiki, they're very likely to be the same edit. Of course such duplicates should theoretically not occur.
Roan Kattouw (Catrope)
Thanks. I noticed this because I joined the revision table and the page table. May I ask why, for the same page.page_latest, there are two identical records in the table? Is the link between revision and page rev_id = page.page_latest?
page.page_latest points to the current revision.rev_id.
However, you shouldn't be able to have several revisions with the same rev_id. Even if something went horribly wrong at the wiki level, rev_id is a PRIMARY KEY. How did you do the import? I suspect you may have broken something importing or merging.
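One way to check whether the import introduced duplicates is to count how often each key value occurs, and then see what the join returns. The following is only a minimal sketch in Python with an in-memory SQLite database, not the Oracle setup from this thread. The sample rows and the helper name duplicated_keys are invented for illustration, and the tables are deliberately created without PRIMARY KEY constraints to mimic an import that lost them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Deliberately no PRIMARY KEY constraints: this mimics an import that lost them.
    CREATE TABLE page (page_id INTEGER, page_title TEXT, page_latest INTEGER);
    CREATE TABLE revision (rev_id INTEGER, rev_page INTEGER,
                           rev_user INTEGER, rev_timestamp TEXT);
    -- Invented sample data: page 2 was imported twice.
    INSERT INTO page VALUES (1, 'Foo', 100), (2, 'Bar', 200), (2, 'Bar', 200);
    INSERT INTO revision VALUES (100, 1, 7, '20091119000000'),
                                (200, 2, 8, '20091119000100');
""")


def duplicated_keys(conn, table, column):
    """Return (value, count) pairs for values of `column` that occur more than once.

    Hypothetical helper; `table` and `column` must be trusted identifiers,
    since they are interpolated into the SQL text.
    """
    query = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


# Keys that should be unique but are not after the import.
dup_pages = duplicated_keys(conn, "page", "page_id")
dup_revs = duplicated_keys(conn, "revision", "rev_id")

# Join each page to its current revision via page_latest = rev_id.
joined = conn.execute("""
    SELECT p.page_id, r.rev_id
    FROM page p
    JOIN revision r ON r.rev_id = p.page_latest
    ORDER BY p.page_id
""").fetchall()

print(dup_pages)
print(dup_revs)
print(joined)
```

In this sketch the revision table has no duplicates, yet the join returns the same rev_id twice because the page row was imported twice. If a similar check on the real import shows repeated page_id values, the duplication most likely comes from the import rather than from the wiki.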
I took the sub-current data from MediaWiki and imported it into Oracle. I found that two rows in the page table have the same page_latest ID. Then, when I joined the revision table and the page table, this produced two rows with the same rev_id.
May I ask why I have two identical page_latest values in the page table, and what that means? If I want to join the revision table and the page table, which columns should I join on?
Thanks, Zeyi
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l