[Mediawiki-l] Storing or Linking Documents
Michael Daly
mikedaly at magma.ca
Mon Apr 9 17:26:33 UTC 2007
Dave Sigafoos wrote:
> Unless MS removed every
> 'text' word from their document I don't see where an extension that
> could index the words would be a moving target.
I have a lot of old documents in IBM's Bookmanager format. They are
encrypted in such a way that no one can scan past the formatting info and
find the text. I rely on some old, buggy Bookmanager software to access
them and expect that with another change of OS version I will lose the
ability to use them. IBM has never released the internal format
specification for the documents and nothing I've been able to do has
wrested the info from them.
I keep expecting MS to pull a similar stunt with Word. They could sell
the encryption as a "security" feature.
> Do you think that MS gives a crap whether MW is or is not capable of
> indexing word documents? Unless, of course you feel able to calling
> Bill and having him change his format.
MS might give a crap about the laws currently being pushed out that
force open standards for document storage. These governments don't want
to have docs stored that become obsolete because of one vendor's
decision to change their format. This was what I was thinking of when I
mentioned saving in an open standard. Of course, the battle right now
is MS's version of its "open standard" versus the open source
community's desire for a truly open standard.
It's been in the computer news so much lately I thought you'd get the
drift of the comments. I guess I shouldn't have been so obscure. Sorry!
> I believe that the market place will decide, rightly or wrongly (and who
> decides that?), what tools and "standards" will be used.
It looks like the elected reps will beat the market to it. That of
course brings its own risks/rewards.
> Of course right now it doesn't really matter as the only TYPE is IMAGE
> and MW doesn't appear to be able to search on it (which makes sense if
> the only type I wanted to store was IMAGE.
Searching on images is a major problem. Do you search only on names of
images, on descriptions or on content? Searching content is still a
significant research effort in image processing and recognition.
The only restriction on what you can upload is the file extension list in
LocalSettings.php. My wiki allows specific text uploads. There's an
extension I'm working on that processes them (mod of an existing
extension). I don't see any reason why you couldn't make an extension
that does the same with any chosen doc format.
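For anyone following along, here is a rough sketch of what that extension list looks like in LocalSettings.php. $wgEnableUploads and $wgFileExtensions are the standard MediaWiki settings; the 'txt' and 'doc' entries are only assumed examples, not the actual list from my wiki.

```php
<?php
// Fragment for LocalSettings.php (illustrative sketch only).

// Uploads have to be switched on before the extension list matters.
$wgEnableUploads = true;

// Append extra file types to the default list of permitted uploads.
// 'txt' and 'doc' are assumed examples; use whatever formats you need.
$wgFileExtensions[] = 'txt';
$wgFileExtensions[] = 'doc';
```

This only lets the files through the upload form. Indexing or processing their contents would still need an extension written for the format in question.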
Mike