Hey,
While looking at the source of the pywikipediabot in the past, I noticed that it contained a bunch of Wikibase specific code (sometimes even Wikidata specific). The code I saw often did a poor job at separating different concerns, and did some weird things to represent parts of the Wikibase data model.
I figured it'd be a lot nicer if there was a clean and correct implementation of the Wikibase data model that can then be used by pywikipediabot, and other Python projects that need to interact with a Wikibase instance. I went ahead and created what is essentially my first ever Python project [0] to do exactly this. (This is far from fully implemented, and only linked to here to give you an idea of what I am talking about.)
Since I'm not following pywikipediabot development closely, I'm not aware of how much interest there is in having such a component. I'm also not sure what exactly would need to be implemented to serve the needs of the pywikipediabot codebase, or how to then start on the refactorings required to make use of it.
What is essentially needed for this to go forward is a pywikipediabot developer that takes charge of this project. If that happens I'll happily make the contributions required to ensure the data model implementation is done both correctly and cleanly.
[0] https://github.com/JeroenDeDauw/WikibaseDataModelPython
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. ~=[,,_,,]:3 --
Hi Jeroen,
Sorry about the late response...
On Fri, Sep 6, 2013 at 11:47 AM, Jeroen De Dauw <jeroendedauw@gmail.com> wrote:
Hey,
While looking at the source of the pywikipediabot in the past, I noticed that it contained a bunch of Wikibase specific code (sometimes even Wikidata specific). The code I saw often did a poor job at separating different concerns, and did some weird things to represent parts of the Wikibase data model.
Right now I think our biggest issue is that Claim subclasses PropertyPage. I originally wrote this thinking it would be convenient, but now after using the code for a while, the only function we actually use is PropertyPage.datatype(), which can easily be fixed. I'll start working on that.
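A rough sketch of what that decoupling could look like, assuming a Claim only needs the datatype of its property. All names here (`PropertyInfo`, the constructor arguments, the sample mapping) are hypothetical and are not the pywikibot API; the point is only that a claim can delegate the datatype lookup instead of inheriting from a page class.

```python
class PropertyInfo:
    """Hypothetical stand-in for whatever component knows a property's datatype."""

    def __init__(self, datatypes):
        # datatypes: mapping of property id -> datatype name
        self._datatypes = dict(datatypes)

    def datatype(self, property_id):
        return self._datatypes[property_id]


class Claim:
    """A claim refers to a property by id; it is not itself a page."""

    def __init__(self, property_id, value, property_info):
        self.property_id = property_id
        self.value = value
        self._property_info = property_info

    @property
    def datatype(self):
        # Delegate the lookup instead of inheriting datatype() from a page class.
        return self._property_info.datatype(self.property_id)


# Illustrative data only.
info = PropertyInfo({"P31": "wikibase-item"})
claim = Claim("P31", "Q5", info)
print(claim.datatype)
```

With this shape, Claim carries no page behaviour (title, text, save) that it never used in the first place.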
We've tried to keep most of the Wikidata-specific code in the family files, like storing globes in the Wikidata family file. If there's any more of that, we should fix it.
I figured it'd be a lot nicer if there was a clean and correct implementation of the Wikibase data model that can then be used by pywikipediabot, and other Python projects that need to interact with a Wikibase instance. I went ahead and created what is essentially my first ever Python project [0] to do exactly this. (This is far from fully implemented, and only linked to here to give you an idea of what I am talking about.)
I took a look at your code and while it probably is technically correct, I think it adds a level of complexity that we as bot developers don't actually need. For example, we have no need to have different item and item_id classes, they really are basically the same thing for us.
--Legoktm
Hi guys,
On 20-9-2013 4:12, legoktm wrote:
Hi Jeroen,
Sorry about the late response...
Talked with Jeroen a bit about this on irc. Forgot to send an email here.
On Fri, Sep 6, 2013 at 11:47 AM, Jeroen De Dauw <jeroendedauw@gmail.com> wrote:
Hey, While looking at the source of the pywikipediabot in the past, I noticed that it contained a bunch of Wikibase specific code (sometimes even Wikidata specific). The code I saw often did a poor job at separating different concerns, and did some weird things to represent parts of the Wikibase data model.
I wonder if Jeroen looked at the code in compat or in core. Core is much cleaner than compat. Jeroen?
Right now I think our biggest issue is that Claim subclasses PropertyPage. I originally wrote this thinking it would be convenient, but now after using the code for a while, the only function we actually use is PropertyPage.datatype(), which can easily be fixed. I'll start working on that.
Good idea. We should try to stick to https://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model as much as possible, implementing things in Pywikipedia as they become available on Wikidata itself. We shouldn't forget about Pywikipedia itself, though. I ran into a problem with Page earlier.
Take the class Page:
Three calling formats are supported:
- If the first argument is a Page, create a copy of that object. This can be used to convert an existing Page into a subclass object, such as Category or ImagePage. (If the title is also given as the second argument, creates a copy with that title; this is used when pages are moved.)
- If the first argument is a Site, create a Page on that Site using the second argument as the title (may include a section), and the third as the namespace number. The namespace number is mandatory, even if the title includes the namespace prefix. This is the preferred syntax when using an already-normalized title obtained from api.php or a database dump. WARNING: may produce invalid objects if page title isn't in normal form!
- If the first argument is a Link, create a Page from that link. This is the preferred syntax when using a title scraped from wikitext, URLs, or another non-normalized source.
I think it would be nice if subclasses of page (like ItemPage) also support these three formats. What do you think? Use case for this is generators. If I use a generator on Wikidata it returns page objects, would be nice to just be able to say itempage = pywikibot.ItemPage(page).
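To make the idea concrete, here is a minimal self-contained sketch of a subclass inheriting the three calling formats. The `Site`, `Link`, `Page` and `ItemPage` classes below are simplified stand-ins written for this example, not the real pywikibot classes, and the title normalization in `Link` is a toy assumption.

```python
class Site:
    """Stand-in for a wiki site."""

    def __init__(self, name):
        self.name = name


class Link:
    """Stand-in for a link scraped from a non-normalized source."""

    def __init__(self, site, title, ns=0):
        self.site = site
        # Toy normalization: trim whitespace, underscores to spaces.
        self.title = title.strip().replace("_", " ")
        self.ns = ns


class Page:
    def __init__(self, source, title=None, ns=None):
        if isinstance(source, Page):
            # Format 1: copy an existing Page, optionally with a new title.
            self.site = source.site
            self.title = source.title if title is None else title
            self.ns = source.ns
        elif isinstance(source, Site):
            # Format 2: Site, already-normalized title, mandatory namespace.
            if title is None or ns is None:
                raise ValueError("title and namespace number are required")
            self.site, self.title, self.ns = source, title, ns
        elif isinstance(source, Link):
            # Format 3: build the page from a Link.
            self.site, self.title, self.ns = source.site, source.title, source.ns
        else:
            raise TypeError("expected a Page, Site or Link")


class ItemPage(Page):
    """Gets all three calling formats by deferring to Page.__init__."""

    def __init__(self, source, title=None, ns=None):
        super().__init__(source, title, ns)


wikidata = Site("wikidata")
page = Page(wikidata, "Q42", 0)   # e.g. what a generator might yield
itempage = ItemPage(page)         # convert it into the subclass
print(type(itempage).__name__, itempage.title)
```

The subclass only has to forward its arguments; anything item-specific can be set up after the `super().__init__` call.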
Maarten
Hey,
I took a look at your code and while it probably is technically correct,
I think it adds a level of complexity that we as bot developers don't actually need.
Creating a domain model, which is the direction my code is going, is indeed not the best approach when one only needs to do simple tasks. Transaction scripts typically suffice there. For creating a truly reusable component for dealing with the DataModel, one cannot simply consider the most simple cases though. While a full blown domain model is perhaps not called for, it is useful to embed domain rules into domain objects, and have them properly separated from use case specific logic.
I'm not quite sure what complexity you are referring to. With the approach I am taking, one can hide a lot of complexity, rather than having to deal with it all over the place. Overall the code written so far has (objective) complexity quite a bit below what I saw in pywikipedia. I suspect that what you perceive as complexity is simply a lack of familiarity with the approach taken. That is actually quite a valid reason to caution against using a domain model, since the understanding of the people who are expected to deal with it should be taken into account.
we have no need to have different item and item_id classes, they really
are basically the same thing for us.
An item is quite different from an item id. There will be cases where you just need to deal with one of them, and places where you have tasks so simple that creating objects such as these does not make sense. If you treat these things as the same concept, you'll end up shooting yourself in the foot sooner or later, much like making Claim derive from PropertyPage, which at some point might also have seemed like a thing you could get away with.
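As a small illustration of the distinction (a sketch only, not the code in the linked repository; the class layout, the `labels` field and the id pattern are assumptions made for this example): an item id is a small immutable value that can be compared and used as a dictionary key, while an item is an entity that has an id and carries content.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ItemId:
    """Immutable value object: just the identifier, e.g. "Q42"."""

    serialization: str

    def __post_init__(self):
        # Assumed format for this sketch: "Q" followed by a positive integer.
        if not re.fullmatch(r"Q[1-9][0-9]*", self.serialization):
            raise ValueError("not an item id: %r" % self.serialization)


@dataclass
class Item:
    """Entity: has an id plus content such as labels."""

    id: ItemId
    labels: dict = field(default_factory=dict)


q42 = ItemId("Q42")
item = Item(q42, {"en": "Douglas Adams"})

# An id can be passed around, compared and used as a key without the item
# itself ever being loaded.
cache = {q42: item}
print(ItemId("Q42") == q42, ItemId("Q42") in cache)
```

Code that only needs to point at an item, such as the value of a claim, can then depend on `ItemId` alone.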
I wonder if Jeroen looked at the code in compat or in core. Core is much
cleaner than compat. Jeroen?
Can you link to the specific code?
We should try to stick to
https://meta.wikimedia.org/wiki/Wikidata/Notes/Data_model as much as possible, so implementing things in Pywikipedia when things come available on Wikidata itself.
I agree that sticking to the canonical data model spec is a good idea. Do however keep in mind that the most up to date and accurate spec of what is actually used exists as the PHP DataModel component. If a structurally completely different implementation is made in Python that follows the on-wiki spec, serious problems can be caused by certain kinds of changes. In particular, if the PHP and Python implementations get designed to be open against different kinds of changes, then pain will occur if a change is made in the PHP component that cannot easily be matched in the Python one.
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. ~=[,,_,,]:3 --
Hi Jeroen,
On 20-9-2013 16:25, Jeroen De Dauw wrote:
I wonder if Jeroen looked at the code in compat or in core. Core is
much cleaner than compat. Jeroen?
Can you link to the specific code?
* https://git.wikimedia.org/tree/pywikibot%2Fcore.git
* https://git.wikimedia.org/blob/pywikibot%2Fcore.git/58800272e1b55323bd1989f1...
* https://git.wikimedia.org/blob/pywikibot%2Fcore.git/58800272e1b55323bd1989f1...
Maarten
On Sat, Sep 21, 2013 at 12:00 AM, Maarten Dammers <maarten@mdammers.nl> wrote:
Hi Jeroen,
On 20-9-2013 16:25, Jeroen De Dauw wrote:
I wonder if Jeroen looked at the code in compat or in core. Core is
much cleaner than compat. Jeroen?
Agreed with Maarten. The core branch handles Wikibase according to the data model, but compat has just a subclass of Page named "DataPage" with some methods for handling items (it doesn't have anything for properties at all). Actually, I wrote the DataPage class, and I did it that way because I think we don't need to implement the whole data model just for running bots in a Wikibase-repo wiki.
Can you link to the specific code?
- https://git.wikimedia.org/tree/pywikibot%2Fcore.git
- https://git.wikimedia.org/blob/pywikibot%2Fcore.git/58800272e1b55323bd1989f1a1457cb5299ed626/pywikibot%2Fpage.py#L2304
- https://git.wikimedia.org/blob/pywikibot%2Fcore.git/58800272e1b55323bd1989f1a1457cb5299ed626/pywikibot%2Fsite.py#L3406
Maarten
_______________________________________________
Pywikipedia-l mailing list
Pywikipedia-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
Best
pywikipedia-l@lists.wikimedia.org