New subject: [Mediawiki-l] Storing or Linking Documents

8 Apr 2007

Identifying the type isn't the problem - that's easy. The problem is writing decoders for every document format in the world, and hacking them into the existing MySQL-based search system.
Having said that, if we had to-plaintext converters for key doc formats, couldn't we add that plaintext to the text indexed for the Image: page?  This could happen at save, and Wiki admins could configure converters by doc suffix.
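A minimal sketch of that idea, assuming a per-suffix converter registry. Every name here (CONVERTERS, register_converter, extract_text, text_to_index) is hypothetical and none of it is MediaWiki code; it only shows how plaintext from a configured converter could be appended to the text indexed for the Image: page at save time.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry: file suffix -> function turning raw bytes into plain text.
Converter = Callable[[bytes], str]
CONVERTERS: Dict[str, Converter] = {}


def register_converter(suffix: str, converter: Converter) -> None:
    """Let an admin map a document suffix (e.g. "txt") to a converter."""
    CONVERTERS[suffix.lower().lstrip(".")] = converter


def extract_text(filename: str, data: bytes) -> Optional[str]:
    """Return plaintext for the upload, or None if no converter is configured."""
    suffix = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    converter = CONVERTERS.get(suffix)
    return converter(data) if converter else None


def text_to_index(description: str, filename: str, data: bytes) -> str:
    """Text the search index would see for the upload's description page."""
    extracted = extract_text(filename, data)
    return description if extracted is None else description + "\n" + extracted


# Only a trivial converter is registered here; real formats (Word, PDF, ...)
# would each need their own decoder, which is the hard part.
register_converter("txt", lambda raw: raw.decode("utf-8", errors="replace"))
```

With this registry, a .txt upload contributes its content to the indexed text, while a suffix with no converter falls back to the description alone.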
Of course, the true answer is still to browbeat our users into using wiki markup... ;-)
Ian
-----Original Message-----
From: 	Dave Sigafoos [mailto:davesigafoos@sanmar.com]
Sent:	Saturday, April 07, 2007 06:45 PM Pacific Standard Time
To:	MediaWiki announcements and site admin list
Subject:	Re: [Mediawiki-l] Storing or Linking Documents
So how hard would it be to expand the upload process to allow selecting
the 'type' of upload?  Then the 'type' could be searched, adding a good
benefit to MW.
Also, since the upload process has a 'comment' that you can enter,
wouldn't it make sense to allow a search against this comment?
I do understand that searching the binary of an image really makes no
sense (unless you are storing hidden text :) but allowing entry / search
of keywords might be a good idea.
Thanks.
DSig
David Tod Sigafoos | SANMAR Corporation
PICK Guy
206-770-5585
davesigafoos@sanmar.com
-----Original Message-----
From: mediawiki-l-bounces@lists.wikimedia.org
[mailto:mediawiki-l-bounces@lists.wikimedia.org] On Behalf Of Jim Wilson
Sent: Friday, April 06, 2007 11:31
To: MediaWiki announcements and site admin list
Subject: Re: [Mediawiki-l] Storing or Linking Documents
...
The Image: namespace stores the meta-data for all uploaded files; I
guess the "Image" name is based on history and how it's used in WP.
But for those of us using MW for corporate nets, "Image:" means any
uploaded file.
AFAIK, the namespace is called "Image" because that's what it's meant to
store - images.  Not video, not Excel spreadsheets, not Word docs.
Using the Image upload facility for something other than pure images
represents an intentional circumvention of the spirit of the device
(regardless of business needs - which I understand).
For the record, we have a wiki here where I work, and yes, people upload
Excel spreadsheets and Word docs and PDFs and ZIP files, etc.
-- Jim
On 4/6/07, Ian Smith <ismith@good.com> wrote:
...
Dave Sigafoos:
I had gathered that images weren't searchable which makes sense to me
(except for descriptive information) but I did not realize that a
document with 'text' would not be searchable.
Documents are simply stored as-is in the filesystem; so, an uploaded
Word doc ends up stored in c:\WebServer\mediawiki\images\f\f7\foo.doc.
In contrast, Wiki pages are stored as fields in the MySQL database.
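The f\f7 directories in that path look like a hash-based layout. As a rough sketch, and assuming the two directory levels are the first one and first two hex digits of an MD5 of the file name (an assumption, not something stated in this thread), the location could be derived like this. The function name is hypothetical, and any title normalisation MediaWiki applies beyond replacing spaces is ignored.

```python
import hashlib


def hashed_upload_path(filename: str, root: str = "images") -> str:
    """Build root/<h1>/<h1h2>/<name> from the MD5 of the file name (assumed scheme)."""
    # Assumed normalisation: spaces become underscores, nothing else changes.
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # First level: one hex digit; second level: the first two hex digits.
    return "/".join([root, digest[0], digest[:2], name])


path = hashed_upload_path("foo.doc")
```

The point for search is that the file sits on disk under a path like this, outside the database rows that the MySQL search looks at.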
Search doesn't work on uploaded documents, because:

1. the search uses the MySQL search facility, and so only works on
stuff which is in the DB
2. since an uploaded doc could be in any format, there's no way to
search it: eg. if a document compresses its content using some
proprietary scheme, there's no general way to look inside it.
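To illustrate the second point, here is a small sketch that uses zlib purely as a stand-in for a format that compresses its content (a real proprietary scheme would not ship with a decoder at all). The contains helper is hypothetical and represents the naive byte-level search a format-blind indexer would be limited to.

```python
import zlib


def contains(data: bytes, term: str) -> bool:
    """Naive byte-level search, which is all a format-blind indexer could do."""
    return term.encode("utf-8") in data


# Plain text that mentions the search term many times.
text = ("quarterly budget forecast for the sales team\n" * 20).encode("utf-8")

# Stand-in for a document format that stores its content compressed.
stored = zlib.compress(text)

found_raw = contains(text, "budget")                          # plain bytes
found_stored = contains(stored, "budget")                     # bytes as stored
found_decoded = contains(zlib.decompress(stored), "budget")   # after decoding
```

The term is visible in the plain bytes and again after decoding, but not in the bytes as stored, so without a decoder for the format there is nothing useful to index.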
Note that the problems go beyond search: features like "What links
here" only work for links from Wiki pages, etc.
...
I do see now that it seems to put all uploaded 'media' to IMAGE:
which I am not sure I understand.
The Image: namespace stores the meta-data for all uploaded files; I
guess the "Image" name is based on history and how it's used in WP.
But for those of us using MW for corporate nets, "Image:" means any
uploaded file.
Believe me, I feel your pain... if you find a way to stop your users
using Word for a single sentence of plain text, let me know.  ;-)
Ian

MediaWiki-l mailing list
MediaWiki-l@lists.wikimedia.org
http://lists.wikimedia.org/mailman/listinfo/mediawiki-l