[Mediawiki-l] Storing or Linking Documents

Dave Sigafoos davesigafoos at sanmar.com
Sun Apr 8 16:49:46 UTC 2007


My thought was that if we have the ability to add 'types' we could then
define extensions to work with that 'type'.

There are a couple of formats that seem to be 'universal' whether we like
it or not: Word, Excel, PPT and PDF, plus emerging standards like those
coming from OpenOffice.

Also code documents (PHP, HTML, C, XML, etc.) could be stored.

And it isn't so much " .. The problem is writing decoders for every
document format in the world .." as writing them for a couple of the
standards. For example, in the environment that I work in I have written
several API examples to connect to a different database.  Once a couple
of good examples are there, others will be able to duplicate the process.

Also I am not sure how many 'decoders' would be needed.  For example a
word document should be able to be searched without decoding it to plain
text.  Yes?  Maybe not.

" .. couldn't we add that plaintext to the text indexed for the Image:
page ..".  This would work, but wouldn't it make more sense to have
definitions of the 'document type'?

I realize that this is more than the wiki was intended for, but MW is
such an incredible 'product' that I can see people using it more and
more for their business use.

Of course not all tools should be used for all situations.  It just
*seems* to me that document/documentation search/retrieval is a close
fit.

Thanks for the follow up

DSig
David Tod Sigafoos | SANMAR Corporation
PICK Guy
206-770-5585
davesigafoos at sanmar.com 

 

-----Original Message-----
From: mediawiki-l-bounces at lists.wikimedia.org
[mailto:mediawiki-l-bounces at lists.wikimedia.org] On Behalf Of Ian Smith
Sent: Sunday, April 08, 2007 8:30
To: MediaWiki announcements and site admin list
Subject: Re: [Mediawiki-l] Storing or Linking Documents

Identifying the type isn't the problem - that's easy.  The problem is
writing decoders for every document format in the world, and hacking
them into the existing MySQL-based search system.

Having said that, if we had to-plaintext converters for key doc formats,
this could happen at save, and Wiki admins could configure converters by
doc suffix.
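
A minimal sketch of that idea, in Python only for illustration (MediaWiki
itself is PHP). The names CONVERTERS, register_converter and
plaintext_for_upload are hypothetical, not existing MediaWiki hooks or
settings; a real converter for .doc or .pdf would have to call out to some
external tool, which is not shown here.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# A converter takes the raw uploaded bytes and returns plain text.
Converter = Callable[[bytes], str]

# Hypothetical admin-configured table: doc suffix -> converter.
CONVERTERS: Dict[str, Converter] = {}


def register_converter(suffix: str, converter: Converter) -> None:
    """Register a to-plaintext converter for one suffix, e.g. '.txt'."""
    CONVERTERS[suffix.lower()] = converter


def plaintext_for_upload(filename: str, data: bytes) -> Optional[str]:
    """Run at save time: return indexable text, or None if no converter."""
    suffix = Path(filename).suffix.lower()
    converter = CONVERTERS.get(suffix)
    if converter is None:
        return None  # unknown format: nothing extra to index
    return converter(data)


# Trivial example converter; real formats need real decoders.
register_converter(".txt", lambda data: data.decode("utf-8", errors="replace"))
```

Whatever text comes back could then be appended to the text indexed for
the file's description page; uploads with no configured converter simply
stay unsearchable, as they are today.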

Of course, the true answer is still to browbeat our users into using
wiki markup... ;-)

Ian

 -----Original Message-----
From: 	Dave Sigafoos [mailto:davesigafoos at sanmar.com]
Sent:	Saturday, April 07, 2007 06:45 PM Pacific Standard Time
To:	MediaWiki announcements and site admin list
Subject:	Re: [Mediawiki-l] Storing or Linking Documents

So how hard would it be to expand the upload process to allow selecting
the 'type' of upload?  Then the 'type' would be able to be searched thus
adding a good benefit to MW.

Also, wouldn't it make sense, since the upload process has a 'comment'
that you can enter, to allow a search against this comment?
I do understand that searching on the binary of an image really makes no
sense (unless you are storing hidden text :) but allowing entry / search
of keywords might be a good idea.

Thanks.

DSig
David Tod Sigafoos | SANMAR Corporation
PICK Guy
206-770-5585
davesigafoos at sanmar.com 

 

-----Original Message-----
From: mediawiki-l-bounces at lists.wikimedia.org
[mailto:mediawiki-l-bounces at lists.wikimedia.org] On Behalf Of Jim Wilson
Sent: Friday, April 06, 2007 11:31
To: MediaWiki announcements and site admin list
Subject: Re: [Mediawiki-l] Storing or Linking Documents

> The Image: namespace stores the meta-data for all uploaded files; I
> guess the "Image" name is based on history and how it's used in WP. But
> for those of us using MW for corporate nets, "Image:" means any uploaded
> file.

AFAIK, the namespace is called "Image" because that's what it's meant to
store - images.  Not video, not Excel spreadsheets, not Word docs.

Using the Image upload facility for something other than pure images
represents an intentional circumvention of the spirit of the device
(regardless of business needs - which I understand).

For the record, we have a wiki here where I work, and yes, people upload
Excel spreadsheets and Word docs and PDFs and ZIP files and .... etc.

-- Jim

On 4/6/07, Ian Smith <ismith at good.com> wrote:
>
> Dave Sigafoos:
> >
> > I had gathered that images weren't searchable which makes sense to me
> > (except for descriptive information) but I did not realize that a
> > document with 'text' would not be searchable.
>
> Documents are simply stored as-is in the filesystem; so, an uploaded
> Word doc ends up stored in c:\WebServer\mediawiki\images\f\f7\foo.doc.
> In contrast, Wiki pages are stored as fields in the MySQL database.
>
> Search doesn't work on uploaded documents, because:
> 1. the search uses the MySQL search facility, and so only works on stuff
> which is in the DB
> 2. since an uploaded doc could be in any format, there's no way to
> search it: eg. if a document compresses its content using some
> proprietary scheme, there's no general way to look inside it.
>
> Note that the problems go beyond search: features like "What links here"
> only work for links from Wiki pages, etc.
>
> > I do see now that it seems to put all uploaded 'media' to IMAGE: which I
> > am not sure I understand.
>
> The Image: namespace stores the meta-data for all uploaded files; I
> guess the "Image" name is based on history and how it's used in WP. But
> for those of us using MW for corporate nets, "Image:" means any uploaded
> file.
>
> Believe me, I feel your pain... if you find a way to stop your users
> using Word for a single sentence of plain text, let me know.  ;-)
>
> Ian
>
>
> _______________________________________________
> MediaWiki-l mailing list
> MediaWiki-l at lists.wikimedia.org
> http://lists.wikimedia.org/mailman/listinfo/mediawiki-l
>
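
For anyone wondering where the f\f7 part of the path quoted above comes
from: as far as I understand it, MediaWiki's default upload layout takes
the MD5 hash of the stored file name and uses its first one and first two
hex characters as directory names. A rough sketch in Python (not
MediaWiki's own PHP code); hashed_upload_path is a hypothetical helper,
and it assumes the name has already been normalised the way MediaWiki
stores it.

```python
import hashlib


def hashed_upload_path(filename: str, levels: int = 2) -> str:
    """Return '<h1>/<h1h2>/<filename>' relative to the images directory.

    Assumption: 'filename' is the already-normalised stored name.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # One directory per level: first 1 hex char, then first 2, and so on.
    prefixes = [digest[:i] for i in range(1, levels + 1)]
    return "/".join(prefixes + [filename])


print(hashed_upload_path("Foo.doc"))
```

The point is only that the directory is derived from the file name, not
from the file's contents, so nothing about the storage path helps search.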