Hi Tom!
The snippet looks fine at a glance, though I wonder why you are not just using maintenance/edit.php.
On 19.09.19 at 14:17, Tom Schulze wrote:
I import pages using a custom maintenance script which reads a file's content from the file system and saves it to the MediaWiki database using:
    $title = Title::newFromText( 'Widget:MyWidget' );
    $wikiPage = new WikiPage( $title );
    $newContent = ContentHandler::makeContent( $contentFromFile, $title );
    $wikiPage->doEditContent( $newContent );
In the MW class reference https://doc.wikimedia.org/mediawiki-core/master/php/classContentHandler.html#a2f403e52fb305523b0812f37de41622d it says "If [the modelId parameter for ContentHandler::makeContent() is] not provided, $title->getContentModel() is used." I assume that it checks the namespace, among other things, and uses javascript for widgets? I ask because in my case it is a widget that causes the error. The extension was installed before the import, and the namespace 'Widget' exists.
So what should happen is that Title::getContentModel() decides that the default model for the Widget namespace should be javascript (based on an entry in $wgNamespaceContentModels made by the extension), and return the string "javascript".
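To illustrate that lookup, here is a deliberately simplified sketch in plain PHP. It is not MediaWiki's actual code: defaultModelForNamespace() and NS_WIDGET_ASSUMED are hypothetical names, the namespace ID 274 is an assumption about the Widgets extension, and the real lookup has more steps than a single array check.

```php
<?php
// Hypothetical stand-in for the extension's namespace constant (assumed value).
const NS_WIDGET_ASSUMED = 274;

// Simplified idea of the default-model lookup: a per-namespace map wins,
// otherwise fall back to wikitext. The real lookup in MediaWiki does more.
function defaultModelForNamespace( $namespace, array $namespaceContentModels ) {
    if ( isset( $namespaceContentModels[$namespace] ) ) {
        return $namespaceContentModels[$namespace];
    }
    return 'wikitext';
}

// Plays the role of $wgNamespaceContentModels after the extension has loaded.
$namespaceContentModels = [ NS_WIDGET_ASSUMED => 'javascript' ];

$widgetModel = defaultModelForNamespace( NS_WIDGET_ASSUMED, $namespaceContentModels );
$mainModel = defaultModelForNamespace( 0, $namespaceContentModels );

echo "Widget namespace: $widgetModel\n";
echo "Main namespace: $mainModel\n";
```

If the map had no entry for the Widget namespace, the sketch would fall back to wikitext, which is why it matters that the extension is loaded before the import runs.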
When recording the model of the content in the content table, that string gets normalized by creating an entry in the content_models table, if no such entry exists yet for "javascript", generating a unique integer ID (in your case, this appears to be 2). This integer gets recorded in content.content_model_id.
When reading the page's content later, the model name associated with 2 is looked up in the content_models table (actually, in a cached version of that table), returning "javascript". This however fails in your case.
The question is: since the number 2 was generated by an auto-increment key when inserting into content_models, why is the row now missing from the table? How can that be?
Is there something wrong with the snippet?
Not in an obvious way.
The only explanation I have is that the edit actually fails for some reason, and the database transaction gets rolled back. This would result in a situation where the row for "javascript" is not in content_models, but it's still in the cached version of that table (in APC memory or memcached or whatever you have your object cache set to).
So perhaps you retry after the initial failure. Since the cached table has an entry for "javascript", MediaWiki will just use that, and not write to the table again. Your edit succeeds - but now you have the number 2 in content.content_model_id, but no row for 2 in the content_models table. You can still read the page as long as you have the cached version of the content_models table in memory - but as soon as the cache expires, things blow up.
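The sequence I suspect can be reproduced with a toy model in plain PHP. This is only a sketch of the idea: FakeModelTable and FakeModelStore are hypothetical classes that stand in for the content_models table and its cached copy, and none of this is MediaWiki's real implementation.

```php
<?php
// Stands in for the content_models table, with a very crude transaction.
class FakeModelTable {
    public $rows = [];       // id => model name
    private $snapshot = [];  // row state saved at begin()
    private $nextId = 1;     // auto-increment counter, not reset by rollback

    public function begin() { $this->snapshot = $this->rows; }
    public function rollback() { $this->rows = $this->snapshot; }

    public function insert( $name ) {
        $id = $this->nextId++;
        $this->rows[$id] = $name;
        return $id;
    }
}

// Stands in for the cached name-to-ID map kept in the object cache.
class FakeModelStore {
    public $cache = [];  // model name => id
    private $table;

    public function __construct( FakeModelTable $table ) { $this->table = $table; }

    // Return the ID for a model name, inserting a row only on a cache miss.
    public function acquireId( $name ) {
        if ( !isset( $this->cache[$name] ) ) {
            $this->cache[$name] = $this->table->insert( $name );
        }
        return $this->cache[$name];
    }

    // Resolve an ID back to a name: cache first, then the table.
    public function getName( $id ) {
        $name = array_search( $id, $this->cache, true );
        if ( $name === false ) {
            if ( !isset( $this->table->rows[$id] ) ) {
                throw new RuntimeException( "No model name for ID $id" );
            }
            $name = $this->table->rows[$id];
            $this->cache[$name] = $id;
        }
        return $name;
    }
}

$table = new FakeModelTable();
$store = new FakeModelStore( $table );
$store->acquireId( 'wikitext' );  // committed earlier, gets ID 1

// First import attempt: the row is inserted, then the edit fails.
$table->begin();
$firstId = $store->acquireId( 'javascript' );
$table->rollback();  // the row is gone, but the cache entry survives

// Retry: the cache answers, so nothing is written to the table.
$retryId = $store->acquireId( 'javascript' );
$nameWhileCached = $store->getName( $retryId );

// Later the cache expires and the lookup has to go to the table.
$store->cache = [];
$errorAfterExpiry = null;
try {
    $store->getName( $retryId );
} catch ( RuntimeException $e ) {
    $errorAfterExpiry = $e->getMessage();
}
echo "Model ID used by the retry: $retryId\n";
echo "Lookup after cache expiry: $errorAfterExpiry\n";
```

In this toy model, the retry reuses the ID that the rolled-back transaction handed out, and the lookup only starts failing once the cache is emptied, which matches the symptoms you describe.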
As I said in my earlier response, I'm working on a patch to avoid this situation, see https://gerrit.wikimedia.org/r/c/mediawiki/core/+/514245.
However, I'm still not 100% sure that what I described above is what actually happened. Did you have some kind of failure when you first tried to import the widget (or any JavaScript, such as MediaWiki:Common.js)?
If you didn't, I'm back to having no clue as to what might be causing this problem. Which of course would not be good at all :)