Re: [Wikisource-l] Two requests for MediaViewer

1 Oct 2014

2014-10-01 9:18 GMT+02:00 Jane Darnell <jane023(a)gmail.com>:

...
  Actually, I would rather have a tool that pulls apart djvu files as they
  are uploaded; keeping the text in WS and the pics in Commons

This is very interesting, since abbyy.xml files contain both a full
(character-by-character) description of text mapping and format, and the
coordinates of any non-textual content (illustrations) on the scanned page.
Using such data appropriately, it would be possible to extract illustrations
and other graphical elements of pages automatically. Nevertheless, I have
seen that such "self-cropping" of illustrations sometimes fails, and is often
confused by unusual formats of illustrations or graphical elements, so that
many "illustrations" are nonsense or have to be cropped again. Unfortunately,
djvu files have no such "illustration coordinates" inside.
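
As a rough illustration of the idea, here is a minimal Python sketch that
lists the picture rectangles found in an abbyy.xml file. It assumes the usual
ABBYY FineReader layout, where each page element contains block elements
carrying a blockType attribute and l/t/r/b pixel coordinates; the function
name picture_boxes, the SAMPLE data and the namespace URI in it are
illustrative assumptions, not taken from any real scan.

```python
import xml.etree.ElementTree as ET

# Tiny hand-made sample; a real abbyy.xml is far larger and the
# namespace URI may differ (assumption), so the code below ignores it.
SAMPLE = """<document xmlns="http://www.abbyy.com/FineReader_xml/FineReader6-schema-v1.xml">
  <page width="1000" height="1500">
    <block blockType="Text" l="50" t="60" r="950" b="1400"/>
  </page>
  <page width="1000" height="1500">
    <block blockType="Text" l="50" t="60" r="950" b="180"/>
    <block blockType="Picture" l="100" t="200" r="900" b="1200"/>
  </page>
</document>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def picture_boxes(xml_text):
    """Return (page_number, left, top, right, bottom) for each Picture block.

    Page numbers are 1-based and follow the order of page elements.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    page_no = 0
    # iter() walks the tree in document order, so each page element is
    # seen before the blocks it contains.
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "page":
            page_no += 1
        elif name == "block" and elem.get("blockType") == "Picture":
            boxes.append(
                (page_no,
                 int(elem.get("l")), int(elem.get("t")),
                 int(elem.get("r")), int(elem.get("b")))
            )
    return boxes


if __name__ == "__main__":
    for box in picture_boxes(SAMPLE):
        print(box)
```

Each rectangle could then be handed to any image-cropping tool working on
the matching page scan. Given the failures mentioned above, the resulting
crops would still need a manual check before upload.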

Alex
