I've been experimenting with pulling editions' author information in from Wikidata, and it's reasonably easy if we
a) look at the immediate page's WD item, and see if it's got an author;
b) if it doesn't, traverse up via P629 to the work, and see if that's got an author.
If neither of them has one, give up.
If an author was found somewhere, see if it's got a sitelink to Wikisource and
then display it either with or without a link.
The list of authors is then joined together with commas or whatever you want.
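
As a rough illustration, the steps above could be sketched like this in Python (a real module would be Lua and would read items through the Wikibase client; the dictionary layout, the `find_authors` and `format_authors` names, and the `Q_...` IDs here are all made up for the example, and P50 is assumed to be the "author" property being checked):

```python
# Simplified, hypothetical stand-ins for Wikidata items; real data would come
# from Wikidata itself rather than a hand-written dictionary.
AUTHOR = "P50"        # Wikidata property: author
EDITION_OF = "P629"   # Wikidata property: edition or translation of


def find_authors(item_id, items):
    """Return author item IDs for an edition, falling back to its work."""
    claims = items.get(item_id, {}).get("claims", {})
    # a) does the page's own item have an author?
    authors = claims.get(AUTHOR, [])
    if authors:
        return authors
    # b) if not, follow P629 up to the work and look there
    for work_id in claims.get(EDITION_OF, []):
        authors = items.get(work_id, {}).get("claims", {}).get(AUTHOR, [])
        if authors:
            return authors
    return []  # neither has an author: give up


def format_authors(item_id, items, site="enwikisource", separator=", "):
    """Join author names, wiki-linking those that have a sitelink on `site`."""
    parts = []
    for author_id in find_authors(item_id, items):
        author = items.get(author_id, {})
        label = author.get("label", author_id)
        title = author.get("sitelinks", {}).get(site)
        parts.append(f"[[{title}|{label}]]" if title else label)
    return separator.join(parts)


ITEMS = {
    "Q_edition": {"claims": {EDITION_OF: ["Q_work"]}},
    "Q_work": {"claims": {AUTHOR: ["Q_a", "Q_b"]}},
    "Q_a": {"label": "Author One",
            "sitelinks": {"enwikisource": "Author:Author One"}},
    "Q_b": {"label": "Author Two", "sitelinks": {}},
}

print(format_authors("Q_edition", ITEMS))
# prints: [[Author:Author One|Author One]], Author Two
```

Passing the site ID and separator as parameters is one way the same logic might be shared between Wikisources, though that is only a guess at what a cross-wiki module would need.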

That seems to work for a good chunk of cases on English Wikisource. Is that sort of thing universal amongst Wikisources?

It fails on things like organisational authors, and doesn't do anything with translators (although the same process could be followed for them... sort of).

I'd love to develop a cross-Wikisource Lua module that could display a list of authors, if it's possible.

Small steps! :-)

On Wed, 1 Nov 2017, at 05:58 PM, Andrea Zanni wrote:
Authors from Italian Wikisource have already a lot (if not all) metadata on Wikidata:
authors are *easy* compared to books (don't have the whole work-edition issue),
so I think that users Candalua, Alex brollo and others solved this problem long ago.
When you've copied all the metadata from WS authors to WD items (phase 1),
you then need a system in place to
* pull the data from WD and put it in WS (Lua templates or something)
* maintain it (the templates need to remind the user to go to WD and update the information)
This is phase 2.

Unfortunately, for books we're always pre-phase 1 :-(


On Wed, Nov 1, 2017 at 10:15 AM, Jane Darnell <jane023@gmail.com> wrote:
Yes, you definitely need this flow of useful and visible interproject links both ways: as a trigger for Wikidatans to do more with Wikisource pages, and as a trigger for Wikisourcerers to do more with Wikidata items.

On Wed, Nov 1, 2017 at 10:01 AM, Sam Wilson <sam@samwilson.id.au> wrote:

Yup, still true. We do at least have a common goal of structured HTML, as defined by http://schema.org/CreativeWork

It sounds like Tpt's scraper will do wonders, if a Wikisource just complies with that. I think that's one of the next steps we need to take.
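
To make the structured-HTML idea a bit more concrete, here is a minimal sketch of reading schema.org microdata (`itemprop` attributes) out of a page with Python's standard `html.parser`. This is not Tpt's scraper, and the sample markup is invented: what a given Wikisource header actually emits is an assumption here, as are the `ItempropParser` and `extract_itemprops` names.

```python
from html.parser import HTMLParser

VOID_TAGS = {"meta", "link", "br", "img", "hr", "input"}


class ItempropParser(HTMLParser):
    """Collect the values of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, itemprop or None, value index) per open element
        self.props = {}   # itemprop name -> list of values

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if tag in VOID_TAGS:
            # Void elements have no text; microdata puts the value in "content".
            if prop:
                self.props.setdefault(prop, []).append(attrs.get("content") or "")
            return
        index = None
        if prop:
            values = self.props.setdefault(prop, [])
            values.append("")
            index = len(values) - 1
        self.stack.append((tag, prop, index))

    def handle_endtag(self, tag):
        open_tags = [t for t, _, _ in self.stack]
        if tag in open_tags:
            # Close the nearest matching element and anything left open inside it.
            position = len(open_tags) - 1 - open_tags[::-1].index(tag)
            del self.stack[position:]

    def handle_data(self, data):
        # Text belongs to every enclosing element that has an itemprop.
        for _, prop, index in self.stack:
            if prop:
                self.props[prop][index] += data


def extract_itemprops(html):
    parser = ItempropParser()
    parser.feed(html)
    parser.close()
    return {name: [v.strip() for v in values] for name, values in parser.props.items()}


# Invented example markup, loosely following schema.org/CreativeWork.
SAMPLE = """
<div itemscope itemtype="http://schema.org/CreativeWork">
  <span itemprop="name">An Example Work</span> by
  <span itemprop="author">Author One</span>
  <meta itemprop="datePublished" content="1900">
</div>
"""

print(extract_itemprops(SAMPLE))
```

If every Wikisource marked up its header with the same property names, one scraper like this would not need per-wiki tweaks; the sketch ignores nested `itemscope` items, which a real tool would have to handle.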

I sort of figure from the English Wikisource point of view that we should do more on bringing data *in* from Wikidata, in our {{header}}, rather than working on making it easier to extract data *out* with microformats/structured-HTML. Well, we should do both, of course! :-) But my feeling from the process of getting Author data in from Wikidata is that the whole Wikidata integration becomes so much more worthwhile and clearer (and we sort out the various edge cases) when we're actively using it for real.

But of course, each Wikisource is in a similar position. :-( And are we to all be developing the Lua scripts and templates in isolation? Indeed no! :-) We shall put them all together in our brave new Wikisource extension! :)


On Wed, 1 Nov 2017, at 04:03 PM, Andrea Zanni wrote:
@Sam, Tpt,
my personal experience, too, is that HTML is the way to pull out the important Wikisource metadata,
but it's also that every Wikisource has sort of a different way to show them,
meaning that you need to tweak your scraper for each Wikisource.
Is that still true? Last time I did it was more than one year ago, but I need to try it again soon.

On Wed, Nov 1, 2017 at 1:00 AM, Sam Wilson <sam@samwilson.id.au> wrote:
Yes I think you're definitely right! The easier way to send Wikisource
data to Wikidata is going to be a clever gadget that reads the
microformat or schema'd info in each page. My hack was just a quick and
easy test at getting some things added. :)

Ultimately, I'm actually not that excited about working on the tools
that we need to transfer the data. No no I don't mean that! Well, just
that the end point we're aiming at is that a bunch of info *won't be* at
all in Wikisource, but will be pulled from Wikidata, and so I am much
more interested in making better tools for working with the data in
Wikidata. :-) If you see what I mean.

My idea with ws-search is that it will progressively pull more and more
data from Wikidata, and only resort to HTML scraping where the data is
missing from Wikidata. I'm attempting to encapsulate this logic in the
`wikisource/api` PHP library.
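
A minimal sketch of that "Wikidata first, scrape only what's missing" logic, written in Python purely for illustration (the actual library is PHP, and none of the names below come from it; `get_metadata`, the field list, and the two stand-in sources are hypothetical):

```python
def get_metadata(title, from_wikidata, from_html,
                 fields=("title", "author", "year")):
    """Prefer Wikidata values; scrape HTML only for fields still missing."""
    wikidata = from_wikidata(title)
    data = {f: wikidata[f] for f in fields if wikidata.get(f)}
    missing = [f for f in fields if f not in data]
    if missing:
        scraped = from_html(title)  # only fetch the page when we have to
        for f in missing:
            if scraped.get(f):
                data[f] = scraped[f]
    return data


# Hypothetical stand-ins for the two data sources.
def fake_wikidata(title):
    return {"title": title, "author": "Author One", "year": None}


def fake_html(title):
    return {"title": "ignored", "author": "ignored", "year": "1900"}


print(get_metadata("An Example Work", fake_wikidata, fake_html))
```

As more data lands in Wikidata, the `missing` list shrinks and the scraper is simply called less often, with no change to the calling code.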

On Tue, 31 Oct 2017, at 11:14 PM, Thomas Pellissier Tanon wrote:
> Hello Sam,
> Thank you for this nice feature!
> A few months ago I created a prototype of a Wikisource-to-Wikidata
> import tool for the French Wikisource, based on the schema.org
> annotation I have added to the main header template (I definitely think
> we should move from our custom microformat to this schema.org markup,
> which could be much more structured). It's not yet ready but I plan to move it
> forward in the coming weeks. The beginnings of a frontend to add to your
> Wikidata common.js are here:
> We should probably find a way to merge the two projects.
> Cheers,
> Thomas
> > On 31 Oct 2017, at 15:10, Nicolas VIGNERON <vigneron.nicolas@gmail.com> wrote:
> >
> > 2017-10-31 13:16 GMT+01:00 Jane Darnell <jane023@gmail.com>:
> > Sorry, I am much more of a Wikidatan than a Wikisourcerer! I was referring to items like this one
> >
> > No need to be sorry, that is actually a good question and this example is even better (I totally forgot this kind of case).
> >
> > For now, it is probably better to deal with it by hand (and I'm not sure what this tool can even do for this).
> >
> > Regards, ~nicolas
> > _______________________________________________
> > Wikisource-l mailing list