Uploading the original PDFs to a publicly accessible website would most likely be a copyright violation, so we wouldn't want to do that anyway. However, the originals need to be available somehow (perhaps in a restricted sense) for people to verify the OCR against when marking up (I'm assuming that this is going onto Wikisource), as an error in a formula would be very hard for a layman to spot.
Another question is what to do about diagrams (assuming that there are some). I would imagine that if the RS claims copyright of the scans, we can't just extract them and use them. Simple ones I imagine we can (and probably should) convert to SVG, but more detailed ones could be tricky.
James
On 25/09/06, geni geniice@gmail.com wrote:
On 9/25/06, Rich rich.rr@gmail.com wrote:
Great, I don't mind helping, when I know where and what. An obvious suggestion, but do we want to have a wiki to 1) coordinate downloads/scans/error checking, 2) upload the PDFs to, and 3) store the OCRs and act as a base for error checking?
If you are going to go to the trouble of running it through an OCR you might as well upload in text form rather than messing around with PDFs.
-- geni
_______________________________________________
Wikimedia UK mailing list
wikimediauk-l@wikimedia.org
http://meta.wikimedia.org/wiki/Wikimedia_UK
http://mail.wikimedia.org/mailman/listinfo/wikimediauk-l