On Wed, Jul 24, 2013 at 9:12 PM, Ori Livneh <ori@wikimedia.org> wrote:

thanks for the reply Ori.
On Wed, Jul 24, 2013 at 11:59 AM, dan entous <dan.entous.wikimedia@gmail.com> wrote:
context
i’m working on a mediawiki extension, http://www.mediawiki.org/wiki/Extension:GWToolset, which has as one of its goals the ability to upload media files to a wiki. the extension, among other tasks, will process an XML file that has a list of urls to media files and upload those media files to the wiki along with metadata contained within the XML file. our ideal goal is to have this extension run on http://commons.wikimedia.org/.
Check out the 'DataPages' subdirectory in the mediawiki/extensions/examples repository (<https://git.wikimedia.org/summary/mediawiki%2Fextensions%2Fexamples.git>). It was designed to showcase how to work with ContentHandler, and it does so by implementing an XML content type and namespace.
in our last meeting with the foundation, in july, we were asked to move away from ContentHandler since there is a potential for XML files to exceed a 1mb limit. at the moment, the extension is using ContentHandler and DOMDocument to read the XML Content because in june we were asked to use ContentHandler; we originally planned to read the XML as a file and use XMLReader, which would be more efficient.
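to illustrate why reading the XML as a stream is lighter than building a full DOM, here is a minimal sketch in Python rather than PHP. it uses the standard library's `xml.etree.ElementTree.iterparse` as a stand-in for the pull-parsing style of XMLReader; it is not the extension's code. the element names (`records`, `record`, `url`, `title`) and the helper `iter_records` are hypothetical and do not reflect the actual GWToolset XML schema.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, record_tag="record"):
    """Yield one dict per <record_tag> element without keeping the whole tree.

    Assumes the records are direct children of the root element
    (a hypothetical layout, chosen only for this illustration).
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # Copy the child values out before the element is discarded.
            yield {child.tag: (child.text or "").strip() for child in elem}
            # Drop records that were already handled so memory use stays flat.
            root.clear()


# Hypothetical sample input; a real file would be opened from disk instead.
sample = io.BytesIO(
    b"<records>"
    b"<record><url>http://example.org/a.jpg</url><title>A</title></record>"
    b"<record><url>http://example.org/b.jpg</url><title>B</title></record>"
    b"</records>"
)

for rec in iter_records(sample):
    print(rec["url"], rec["title"])
```

only one record is held in memory at a time, so the size of the file matters much less than it does when the whole document is loaded into a DOM tree first.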
in a subsequent reply to this thread, Brian offers a potential way of dealing with this issue. i’ll be able to take a look at his approach later this month, but if anyone can prove the concept beforehand or refer me to some code that has already done so, that would be great.
we initially developed the extension to store the files in the File: namespace, but we were told by the Foundation that we should use ContentHandler instead. unfortunately there is an issue with storing content > 1mb in the db so we need to find another solution.
That's a lot of XML! You can gzip page content, FWIW.

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l