Hi all,
I've been attempting to generalise some code for scraping data out of
Wikisources:
https://github.com/wikisource/api
(Just in case any of you PHP devs out there are looking for such a
thing.)
It's far from complete, but it works and I'm using it in a
couple of projects. I'd love any feedback or ideas for development!
Thanks,
sam
Anyone who tries livePreview in nsPage finds that it is not very comfortable,
since the size and position of its output don't allow a useful comparison with
the facing image or with the wikicode. VisualEditor could be a good solution, but
at present it has limitations when dealing with the Wikisource nsPage, mostly when
quickly adding complex formatting without the help of our beloved tools.
Here: https://it.wikisource.org/wiki/Utente:Alex_brollo/livePreview.js is a
tentative piece of JS code that customizes the output of livePreview for
Wikisource needs: the output goes into a box superimposed on the edit area, and
the box is draggable if the user prefers a comparison with the wikicode instead
of a comparison with the facing image.
Far from being a professional script, it is just meant to prove that *it can be
done*.
If there's any better solution for that issue, please tell me!
Alex brollo (from it.source)
Hello everyone,
May I ask whether you are currently using the IA Upload tool
with IA books that *do not* have a DjVu file?
For a few weeks I've been trying to upload this book
https://archive.org/details/ComeRuinareLAutoritaImage
with the tool. In theory IA Upload can now generate the DjVu by
itself,
but in this case it's not working.
Maybe it's just this book.
Have you had any issues?
Andrea
Hi all,
I've been tinkering with an idea I've had for importing Project
Gutenberg books into Wikisource: http://tools.wmflabs.org/pg2ws/
The idea is that, if Wikidata makes a link between a PG ID number and a
Wikisource Index page, then we can go through that Index page one page
at a time, and copy the page's text from the PG book to the WS page.
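The lookup step could be sketched roughly as below. This is only an illustration, not how the pg2ws tool actually does it: it assumes the Wikidata properties P2034 (Project Gutenberg ebook ID) and P1957 (Wikisource index page URL), which should be double-checked on Wikidata, and the function name build_pg_lookup_url is made up for the example. It only builds the query URL for the Wikidata Query Service and does not send the request.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_pg_lookup_url(pg_id: int) -> str:
    """Build a query URL asking which item has this PG ID and an Index page.

    P2034 and P1957 are assumed property IDs (see the note above);
    the function name is hypothetical.
    """
    pg_id = int(pg_id)  # only accept plain numbers in the query string
    query = (
        "SELECT ?item ?index WHERE { "
        f'?item wdt:P2034 "{pg_id}" . '
        "?item wdt:P1957 ?index . }"
    )
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
```

Fetching that URL with any HTTP client should return JSON whose bindings carry the Index page URL, if Wikidata has both statements for the book.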
The interface so far isn't very brilliant, but I'm just trying to figure
out if this is worthwhile or not. Basically, it's a matter of selecting
the right chunk of text in the right-most text box (the full PG text)
and hitting the button to move it left into the centre box. Then
cleaning it up (manually and with the magic cleaning button) to make it
match the image, and then uploading it to Wikisource.
It's a bad tool though, because it doesn't handle the running header,
and the copy-across button doesn't do nice things with {{hws}} etc. —
not to mention all the other things it doesn't do.
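For the {{hws}} problem, one possible approach is sketched below. It is not what the tool does today: the function name split_at_page_break is invented for the example, it only treats runs of letters as words, and it assumes the usual {{hws|start|whole word}} and {{hwe|end|whole word}} parameter order. When the selected chunk ends in the middle of a word, the word is wrapped in {{hws}} on this page and {{hwe}} at the start of the remaining text.

```python
from __future__ import annotations


def split_at_page_break(text: str, offset: int) -> tuple[str, str]:
    """Split text at offset; wrap a word cut by the split in hws/hwe."""
    if not 0 < offset < len(text):
        raise ValueError("offset must fall strictly inside the text")
    before, after = text[:offset], text[offset:]
    # A word is cut only when letters sit on both sides of the split point.
    if not (before[-1].isalpha() and after[0].isalpha()):
        return before.rstrip(), after.lstrip()
    # Walk outwards to find the whole word that straddles the split.
    start = offset
    while start > 0 and text[start - 1].isalpha():
        start -= 1
    end = offset
    while end < len(text) and text[end].isalpha():
        end += 1
    head, tail, whole = text[start:offset], text[offset:end], text[start:end]
    page = text[:start] + "{{hws|" + head + "|" + whole + "}}"
    rest = "{{hwe|" + tail + "|" + whole + "}}" + text[end:]
    return page, rest


page, rest = split_at_page_break("It was because of rain.", 9)
print(page)  # It was {{hws|be|because}}
print(rest)  # {{hwe|cause|because}} of rain.
```

Words that already contain a real hyphen, or punctuation right at the break, would need extra handling.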
Anyway, just thought I'd mention it. :-) Anyone think this is an avenue
worth exploring? Certainly I'd love to be able to say we've got
everything PG has /and more/!
—Sam
PS changes made by this tool are all tagged as "OAuth CID: 638" —
https://en.wikisource.org/w/index.php?title=Special:RecentChanges&tagfilter…
Hello all,
We've just published the October 2016 Indic Wikisource statistics. Since we
implemented the Google OCR script on all the Indic Wikisources, they have been
growing rapidly.
Here are a few stats and the top three in each ranking:
By number of articles
1. Sanskrit Wikisource (14840 pages) - 0.05% supported by scan pages.
2. Telugu Wikisource (11708 pages) - 24.3% supported by scan pages.
3. Kannada Wikisource (7666 pages) - 1.05% supported by scan pages.
By number of pages validated
1. Telugu Wikisource (17943 pages)
2. Tamil Wikisource (5116 pages)
3. Gujarati Wikisource (3519 pages)
By number of pages proofread
1. Telugu Wikisource (19872 pages)
2. Malayalam Wikisource (8022 pages)
3. Tamil Wikisource (7157 pages)
By percentage supported by scan pages
1. Telugu Wikisource (24.30%)
2. Bengali Wikisource (22.21%)
3. Gujarati Wikisource (16.70%)
I want to specially mention that there is no visible improvement at the
Marathi and Assamese Wikisources. There has been some improvement at Odia
Wikisource, though: it is now 4.84% supported by scan pages. Just two months
ago it was at 0%.
The Sanskrit, Telugu and Kannada Wikisources need to explore moving their
text work towards scan page support.
The full Indic Wikisource stats are here:
https://wikisource.org/wiki/Wikisource:Indic_Wikisource_Stats
Regards,
Jayanta Nath
Indic Wikisource Community