Tl;dr: is there a place for Wikipedia-related code? What do we do for code reusability?
I proposed a Page method in https://phabricator.wikimedia.org/T328769. This would have shown if an article is a biography (it is about a person). My idea was opposed because the codebase of BasePage and APIPage is not Wikipedia-related, and some developers don't want to see any Wikipedia-specific code in it. OK, let's say, this is a valid argument. Let's call it *code purity*.
On the other hand, Pywikibot is mostly developed by Wikipedians and mostly used on Wikipedia, and biographies are a primary part of Wikipedia's scope. I definitely think that biographies SHOULD be subclassed to have a lot of methods that are useful for a lot of Wikipedias and bot owners. If they write this code for themselves again and again, that's a waste. Now, I will definitely use this feature. What can I do? My first idea is to subclass Page in my own code. That is the natural solution. Of course, when I publish my scripts, others won't be able to use them, and I won't be able to use others' scripts as they are, because I need this subclass. Let's call this point of view *code reusability*, which is generally considered important in the world of programming.
Now, what we do (see the above task) is that we throw away code reusability for the sake of code purity. Is that OK?
I am honestly curious where the place of Wikipedia is in this framework.
What I WILL do in the present situation: write a module called huwiki, place my code in it, and not bother other Wikipedias. It will contain functions that take the page as a parameter (not nice, not natural, I don't like it, but this is the only way if I cannot subclass) and nice pagegenerators, all for Hungarian Wikipedia alone, because I need them.
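For illustration only, the "functions that take the page as a parameter" style might look like the sketch below. Everything here is an assumption: `FakePage` and `FakeCategory` are stand-ins so the example runs without Pywikibot, the function name `is_biography` is hypothetical, and the marker category is a placeholder. The only behaviour assumed of a real page object is a `categories()` method yielding objects that have a `title()` method.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCategory:
    """Stand-in for a category page; only title() is used."""
    name: str

    def title(self):
        return self.name


@dataclass
class FakePage:
    """Stand-in for a page object; only categories() is used."""
    name: str
    cats: list = field(default_factory=list)

    def categories(self):
        return iter(self.cats)


# Placeholder marker set; a per-wiki module would supply its own names.
PERSON_CATEGORIES = frozenset({"Category:Living people"})


def is_biography(page, markers=PERSON_CATEGORIES):
    """Return True if any category of the page is a known person marker."""
    return any(cat.title() in markers for cat in page.categories())
```

The function works on any object with the right methods, so no subclass is needed, at the price of the less natural call `is_biography(page)` instead of `page.is_biography()`.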
Is that really what we want? Please help to find the place of Wikipedia-related code.
I'm relatively new to pywikibot, so I'm not really familiar with the culture of the project, i.e. which new features are accepted and which are rejected. But, yes, I did comment on the phab ticket that I thought this didn't belong as a part of the Page class.
Let me give some of my more general experience as a software engineer. On any project, there's always going to be a tension between which features get added to the core codebase and which don't. This is equally true of open source and proprietary projects. Everything that gets added to the API makes the code bigger, more complicated for the user to understand, more complicated for a developer to maintain, etc. So there's a cost to adding anything new. What's important to you may not be important to other people.
Also, other people may have their own ideas about how something should work. For example, as I noted on the phab ticket, I've got my own code which determines if an article is a biography. It uses an entirely different method than your code does. For what I was doing, my method is the right way. For what you're doing, your method is the right way. It's not clear that either method is universally the right way, so putting either of those in the core codebase may impose on everybody something which isn't the right thing for them.
Most big projects have an ecosystem that grows up around them: collections of add-on packages that are designed to interface easily with the core product. I think that's what makes sense here. Package up your code and publish it. People who like your way of doing things can just install your package along with the core pywikibot package. Your users get to reuse your code. Other people don't have to pay the cost of it living in the core codebase. The best of both worlds.
Again, this is all just some of my personal wisdom based on my own real-world experiences.
On Mar 31, 2023, at 3:28 AM, Bináris wikiposta@gmail.com wrote:
Is that really what we want? Please help to find the place of Wikipedia-related code. -- Bináris _______________________________________________ pywikibot mailing list -- pywikibot@lists.wikimedia.org Public archives at https://lists.wikimedia.org/hyperkitty/list/pywikibot@lists.wikimedia.org/me... To unsubscribe send an email to pywikibot-leave@lists.wikimedia.org
On Fri, Mar 31, 2023, at 17:25, Roy Smith roy@panix.com wrote:
For example, as I noted on the phab ticket, I've got my own code which determines if an article is a biography. It uses an entirely different method than your code does. For what I was doing, my method is the right way. For what you're doing, your method is the right way. It's not clear that either method is universally the right way, so putting either of those in the core codebase may impose on everybody something which isn't the right thing for them.
Thank you for your thoughts. I read your code and found it useful. As you say there, it involves some heuristics, so determining whether an article is a biography is not exact, which is a problem.
My approach is that there are basic indicators for it, especially P31/Q5. If that is present, the article is very likely to be a biography. If the check returns False, we don't say it is not a biography, only that we have to investigate further. What I do is very similar to what you do: looking for categories and infoboxes. You found the "People and person infobox templates" category. I found Kategória:Személyek infoboxsablonjai in huwiki. These two are connected through Wikidata. https://www.wikidata.org/wiki/Q4574 is something that could be really useful for writing basic code for all Wikipedias, and it has 81 interwikis. Then everybody can add their own criteria.
Once we have decided, by whatever specific criteria, that a certain article is a biography, we could instantiate a subclass of Page that would do a lot of things, primarily based on Wikidata. That's how I see it.
An example I have just thought of: living person templates on talk pages can again be connected through Wikidata, or listed somehow, and so can living person categories. So we could have a maintenance script that lists possible contradictions between the article, the talk page and Wikidata as to whether the person is living or dead. We could offer this in the scripts directory for those who cannot code themselves.
I wrote something, but nobody will use it in any other Wikipedia. Now if there is nobody in a small wiki who is able and willing to code this, they won't have the feature.
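The layered check described in this reply (P31/Q5 first, then criteria each wiki adds for itself) could be sketched roughly as follows. This is a simplified illustration, not Pywikibot code: the function name `looks_like_biography`, the `local_checks` parameter and the dictionary used as a page are all hypothetical, and the Wikidata claims are reduced to a plain list of item IDs.

```python
INSTANCE_OF = "P31"  # Wikidata property "instance of"
HUMAN = "Q5"         # Wikidata item "human"


def looks_like_biography(p31_targets, page, local_checks=()):
    """Decide whether a page is probably a biography.

    p31_targets:  item IDs that the page's Wikidata item lists under P31
                  (simplified to plain strings for this sketch).
    page:         any object the local checks know how to inspect.
    local_checks: per-wiki callables taking the page and returning a bool,
                  e.g. category or infobox tests.
    """
    # Basic indicator: P31 = Q5 means a biography is very likely.
    if HUMAN in p31_targets:
        return True
    # No P31/Q5 is not a "no"; it only means we investigate further.
    return any(check(page) for check in local_checks)


def has_person_infobox(page):
    """Hypothetical local criterion on a dict-based stand-in page."""
    return "Person infobox" in page.get("templates", [])
```

A wiki with its own conventions would pass its own callables in `local_checks`, leaving the shared P31/Q5 test untouched.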
What about the suggestion to add a derived class WikipediaPage (and a module _wikipedia_page)? I don't see any problem with this solution. We already have the _wikibase module, and all its classes are imported into the main module's __init__. I don't think it's a problem to open another specific module.
And I say that as a person who uses pywikibot not for Wikipedia but for Wikisource. Still, I agree that it can't be a method on the main 'Page' class.
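As a rough sketch of the derived-class idea, assuming nothing about how it would actually be implemented in Pywikibot: the `Page` class below is a minimal stand-in written for this example (it stores P31 targets directly, which a real page does not), and the method `is_biography` is hypothetical. The point is only that the Wikipedia-specific method lives on the subclass and the generic class stays untouched.

```python
class Page:
    """Minimal stand-in for a generic, wiki-agnostic page class."""

    def __init__(self, title, p31_targets=()):
        self._title = title
        # Simplification: a real page would look these up via Wikidata.
        self._p31_targets = tuple(p31_targets)

    def title(self):
        return self._title


class WikipediaPage(Page):
    """Hypothetical derived class holding Wikipedia-specific helpers."""

    def is_biography(self):
        """Return True if the linked item is an instance of human (Q5)."""
        return "Q5" in self._p31_targets


generic = Page("Budapest", ["Q515"])
person = WikipediaPage("Ada Lovelace", ["Q5"])
```

Scripts that only need generic behaviour keep using `Page`; scripts that want the Wikipedia helpers instantiate `WikipediaPage` explicitly.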
On Fri, Mar 31, 2023, at 17:07, Bináris wikiposta@gmail.com wrote:
Is that really what we want? Please help to find the place of Wikipedia-related code. -- Bináris