[Commons-l] Fwd: [Clipart] Clipart mining guide

Brianna Laugher brianna.laugher at gmail.com
Wed Oct 17 03:36:53 UTC 2007


It could be good for Commons to have some help pages on best
practices for extracting illustrations (if not clipart...) from scans.
So this may be of interest, and maybe folks here have some advice for
using GIMP etc. too.

cheers,
Brianna


---------- Forwarded message ----------
From: John Olsen <johnny_automatic at mac.com>
Date: 17 Oct 2007 13:21
Subject: [Clipart] Clipart mining guide
To: clipart at lists.freedesktop.org



A couple of people have asked me to go through the steps I use to
get images from old books, and I am happy to share them.  I can only
describe an Adobe Creative Suite workflow, as this is what I am most
familiar with, and it seems the GIMP needs some additional helpers to
handle PDFs.  I imagine this can all be done with an Open Source
workflow; I am just not the authority to write on it.  Maybe someone
can translate it.

Anyway, I have a draft below.  I was thinking of adding it to the Wiki
section "http://openclipart.org/wiki/Clipart_Acquisition" but wondered
if that would clutter up that section.  Maybe it is better in a separate
subsection under either "Clip Art Information" or "Contributor & User
Handbook".  If so, someone who can unlock the top level would need to
set that up for me.  Ryan?

Any suggestions or ideas are appreciated.

John Olsen


== Guideline for Mining Images from Online Book Libraries ==
* Sites such as those collected under the texts section of
www.Archive.org offer a gold mine of Public Domain images.  This guide
explains how to extract these images for use here on OCAL.  Please
note that the author describes a workflow based on Adobe Creative
Suite 3 because he is most familiar with this software.  It can
probably all be done using Open Source software.  Someone else will
need to add those instructions.
* Find a book with images you would like to extract.  Keep in mind
that the scan resolution is not extremely high, so small images may
not have enough resolution to produce good SVGs.
* Download the PDF version of the book.  It usually has the best
resolution.  The black and white PDF is made for reading the text
and might not have the best images.  It is better to get the full
color PDF and do your own adjustments.
* Open the PDF in Photoshop.  You will be asked to select a page.
Pick the page you want and open it.  Then crop the image tight around
the graphic you want.
* Alternatively you can extract individual pages using Adobe Acrobat
and then open these single pages in Photoshop.  This can be faster and
less memory intensive when mining large books.
* Using Image>Adjustments>Black & White convert the image to black &
white.  The High Contrast Red Filter Preset usually does a good job.
* Further enhance the image using
Image>Adjustments>Brightness/Contrast to increase contrast and
brightness if necessary so you get a nice high contrast image.
* Save the file as .PSD or .TIF or any format that Adobe Illustrator opens.
* Open this file in Adobe Illustrator.
* Use Live Trace to convert the image to vector art.  The following
presets usually give the best results: One Color Logo (gives black
lines only, with the smallest file size), Black & White Logo (white
parts are filled shapes as well) and Comic Art.
* When you are happy with the image, Save as SVG.  Do not preserve
Illustrator editing capabilities, so that the file is basic SVG.
* Using File>Save for Web & Devices make a PNG file to upload with the SVG.
* Upload the files to OCAL.
* You usually get better results with strong black & white images.
They make nice clean traces with reasonable file sizes.  It is easier
to color these afterwards than to try to trace a full color image and
expect clean, crisp lines.
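
For anyone attempting a translation away from Photoshop, the black & white conversion step above can be sketched in plain Python.  This is only an illustration under stated assumptions, not the workflow described in the guide: the function names, the threshold of 128 and the red-only channel weights are hypothetical choices, and the page is assumed to be already decoded into rows of (r, g, b) tuples by some image library.  Weighting the red channel is meant to loosely mimic the High Contrast Red Filter preset, on the assumption that yellowed paper is bright in red while black ink stays dark.

```python
def to_black_and_white(pixels, threshold=128, weights=(1.0, 0.0, 0.0)):
    """Turn rows of (r, g, b) tuples into rows of bits: 1 = black, 0 = white.

    The default weights use only the red channel (an assumed stand-in for
    a red filter); the threshold is a guess that needs tuning per scan.
    """
    wr, wg, wb = weights
    bitmap = []
    for row in pixels:
        bits = []
        for r, g, b in row:
            gray = wr * r + wg * g + wb * b
            bits.append(1 if gray < threshold else 0)
        bitmap.append(bits)
    return bitmap


def to_pbm(bitmap):
    """Serialize the bits as plain PBM (P1) text, where 1 means black."""
    height = len(bitmap)
    width = len(bitmap[0]) if height else 0
    lines = ["P1", f"{width} {height}"]
    for row in bitmap:
        # Plain PBM lines should stay under 70 characters, so wrap rows.
        for start in range(0, width, 35):
            lines.append(" ".join(str(bit) for bit in row[start:start + 35]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    paper = (222, 200, 160)  # made-up color for yellowed paper
    ink = (40, 30, 25)       # made-up color for dark ink
    page = [
        [paper, ink, paper],
        [ink, ink, ink],
        [paper, ink, paper],
    ]
    print(to_pbm(to_black_and_white(page)), end="")
```

The PBM text can be saved to a file and handed to a bitmap tracer; potrace, for example, reads PBM and should write SVG with something like `potrace -s page.pbm -o page.svg`.  Whether that matches the quality of Live Trace is untested here.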



-- 
They've just been waiting in a mountain for the right moment:
http://modernthings.org/


