I noticed that core files have @author annotations.
My take on this is as follows: any active codebase (such as MediaWiki) sees constant change. Code is refactored, rewritten, and renamed, and files are moved around, split up, etc., so the only meaningful interpretation of "@author" would be the person who first created that file / function, no matter how small that piece of code was. At that level, it is not that meaningful, especially in the face of refactoring and restructuring. git log --follow might provide a better picture for an individual file. I think all @author annotations should be removed. When editing a piece of code, I imagine some developers might find it a little annoying ... and confusing, especially during refactoring ... to decide whether to retain the annotation or not.
I find these annotations misleading and wonder why they exist and what purpose they serve. Would appreciate a discussion on this. Alternatively, I would appreciate it if someone could point me to a wiki page / phab task / essay that explains the rationale for why these annotations should exist and be preserved.
Thanks,
Subbu.
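As an illustration of the git log --follow point above, here is a minimal sketch in a throwaway repository. The file names (Tidy.php, Balancer.php), the author identity, and the commit messages are made up for the example and are not taken from MediaWiki.

```shell
#!/bin/sh
# Sketch only: shows that `git log --follow` keeps tracking a file across a rename.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
# Hypothetical identity, set locally so no global Git config is needed.
git config user.name "Alice Example"
git config user.email "alice@example.org"

# First commit: create the file under its original (made-up) name.
printf '<?php\n// line 1\n// line 2\n// line 3\n' > Tidy.php
git add Tidy.php
git commit -q -m "Add Tidy.php"

# Second commit: a pure rename, as happens during restructuring.
git mv Tidy.php Balancer.php
git commit -q -m "Rename Tidy.php to Balancer.php"

# Without --follow, the history of the new path stops at the rename.
plain=$(git log --format='%s' -- Balancer.php | wc -l | tr -d ' ')
# With --follow, the commit that created the file is listed as well.
followed=$(git log --follow --format='%s' -- Balancer.php | wc -l | tr -d ' ')

echo "commits without --follow: $plain"
echo "commits with --follow:    $followed"
```

Note that --follow works on a single file at a time, so it answers "who touched this file over its lifetime" rather than "who wrote this function".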
Hello,
Jon Robson opened a task about it a year or so ago:
"Remove @author lines from code" https://phabricator.wikimedia.org/T139301
My understanding is that the @author and @copyright tags in MediaWiki code represent ownership of the original code placed under the GPL, with subsequent modifications being derivative works.
I am not a lawyer, but I strongly suspect that dropping the copyright information would be a breach of the license.
We also had a conversation about the CREDITS file: https://lists.gt.net/wiki/wikitech/714928
We've had the same discussion in the team earlier this year and I dug up this: https://www.softwarefreedom.org/resources/2012/ManagingCopyrightInformation....
The relevant part for us is this: "But be careful when removing the notices of other developers. Since free software licenses require licensees to preserve notices, wrongfully removing one is a violation of the license from that contributor and may be copyright infringement. If it’s absolutely clear that every remnant of a developer’s contribution has been removed, then it is probably OK to remove the associated copyright notice; otherwise, it’s best to keep it around. However, a requirement to “preserve” or “reproduce” a developer’s copyright notice does not necessarily require that the notice be kept in exactly the same place it started; it’s usually acceptable to move notices from individual source files to a central attribution file, for example."
Cheers Lydia
On 06/13/2017 03:33 AM, Antoine Musso wrote:
Jon Robson opened a task about it a year or so ago: "Remove @author lines from code" https://phabricator.wikimedia.org/T139301
Aha .. thanks! I was looking at a 2016 code review y'day and noticed my comment about @author there ( https://gerrit.wikimedia.org/r/#/c/279669/9/includes/tidy/Balancer.php@24 ) and remember seeing a phab task afterwards but couldn't find it easily ... so I figured I dreamt it up ;-) But, anyway, looks like that is the place to add additional comments and figure out if this can be acted upon.
Subbu.
Hi!
My understanding is that the @author and @copyright tags in MediaWiki code represent ownership of the original code placed under the GPL, with subsequent modifications being derivative works.
But there's no way to verify that the code is indeed an original creation of whoever is listed under @author, and not a derivative work of something else.
I am not a lawyer, but I strongly suspect that dropping the copyright information would be a breach of the license.
AFAIK the GPL itself does not protect attribution. It optionally allows adding clauses that protect attribution, but does not require them.
I wonder though, given that Git has all the change history including authorship, what is the need to duplicate that information in the source code (and risk the two getting out of sync)?
And if we are not considering Git logs to be part of the distribution, we're already violating this GPL clause:
You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.
After all, you can commit a change (thus causing the original work to be modified) without any such notice other than the Git metadata. We obviously consider Git metadata to be enough in this case, so why not in the others?
I agree @author tags should be removed and replaced with a mention (or maybe even a new section) in the CREDITS file.
Zppix Volunteer developer for WMF enwp.org/User:Zppix
On Jun 13, 2017 3:02 PM, "Stas Malyshev" smalyshev@wikimedia.org wrote:
I wonder though, given that Git has all the change history including authorship, what is the need to duplicate that information in the source code (and risk the two getting out of sync)?
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
On Tue, Jun 13, 2017 at 7:11 AM, Subramanya Sastry ssastry@wikimedia.org wrote:
I find these annotations misleading and wonder why they exist and what purpose they serve.
It can sometimes tell you whom to ask for advice or reviews. (git log would too but it's more effort.)
On Jun 13, 2017 6:24 AM, "Gergo Tisza" gtisza@wikimedia.org wrote:
It can sometimes tell you whom to ask for advice or reviews. (git log would too but it's more effort.)
For the record (since one of my patches was specifically mentioned) Gergo's reason matches mine. This is common (if informal) practice for a number of open source software projects---a way to easily indicate the original author of the code, as a pointer to who to ask if you've got questions about it. The informal practice is to add yourself to the list if you undertake any major change or refactoring, some contribution which would make you "at least as good a person to ask questions of" for some part of the file.
It's admittedly imperfect. We have a list of module owners on-wiki which is "better" but much harder to find. And as Gergo says, `git log` is more authoritative, although anyone who has tried this sort of archeology knows that it can be exhausting to scroll through the long history of trivial fixes, language updates, code style tweaks, etc to find the few people who made significant contributions.
In my experience a central CREDITS file is actually worse for code archeology (although perhaps it meets certain legal obligations better). Because it is separate from the file itself, it is harder to find (and if it's hard to find, we might as well use our on-wiki list), and less likely to be kept up to date. I'm listed in the Linux kernel CREDITS file for several projects which have subsequently been completely removed from the source tree! If you keep @author tags in the source file, they at least stand a better chance of being removed if/when the code is, or of surviving a renaming or refactoring if still relevant. (The Linux kernel has both a standalone CREDITS file, akin to our on-wiki list of module maintainers, and in-flight authorship info, and I still get occasional inquiries from folks diving into the pty code for the first time.)
If we were to make a change, I'd suggest including an @author tag in each file explicitly naming the appropriate on-wiki module owners list, i.e.: @author https://meta/wiki/Module_owners#Parser
Or what have you. Simply removing the @author tag seems like it is removing a resource for new contributors with no replacement. --scott
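One way to shorten the archeology described above is to let Git summarize who committed to a file. This is a minimal sketch, assuming a throwaway repository with made-up authors and a hypothetical Parser.php; it ranks people by commit count, which is only a rough proxy for who made the significant contributions.

```shell
#!/bin/sh
# Sketch only: summarize the committers of one file instead of scrolling its full log.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
# Hypothetical identity, set locally so no global Git config is needed.
git config user.name "Alice Example"
git config user.email "alice@example.org"

printf 'a\n' > Parser.php
git add Parser.php
git commit -q -m "Add parser"

printf 'a\nb\n' > Parser.php
git commit -q -am "Extend parser"

# A second (hypothetical) contributor makes one small change.
printf 'a\nb\nc\n' > Parser.php
git -c user.name="Bob Example" -c user.email="bob@example.org" \
    commit -q -am "Fix typo"

# Commit counts per author for this file, highest first.
# HEAD is given explicitly: without a revision, shortlog reads from
# standard input whenever standard input is not a terminal.
git --no-pager shortlog -sn --no-merges HEAD -- Parser.php

# Name of the most frequent committer (strip the leading count).
top=$(git shortlog -sn --no-merges HEAD -- Parser.php \
    | head -n 1 | sed 's/^[[:space:]]*[0-9]*[[:space:]]*//')
echo "most frequent committer: $top"
```

A count like this cannot tell a refactoring from a run of typo fixes, so it narrows the list of people to ask rather than replacing judgment.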
On 06/13/2017 10:14 AM, C. Scott Ananian wrote:
For the record (since one of my patches was specifically mentioned) Gergo's reason matches mine. This is common (if informal) practice for a number of open source software projects---a way to easily indicate the original author of the code, as a pointer to who to ask if you've got questions about it.
I think the @author tag is at best a documentation hack for this scenario. What happens when people leave the project, or when there is more than one person who understands that code, or when expertise shifts as the codebase changes?
It would be better to actually add a documentation line to point to a wiki page where this information can be found (and kept up to date).
Subbu.
It's the internet, nobody ever really goes away. ;)
I will say, this convention makes more sense for a large project like MediaWiki core with many contributors; even more so if some of those contributors are volunteers or short-term, since the @author tag can help to track them down absent any long-term association with the WMF. It is less necessary for a project like Parsoid, which is authored in the main by five or fewer folks.
But I'd support an effort to add @author tags which dereference via the wiki module owners table. --scott
Hi!
It can sometimes tell you whom to ask for advice or reviews. (git log would too but it's more effort.)
I feel @author is a bit misleading in this case - if code is refactored/amended, the original author who wrote it, possibly 10 years ago, may not be the best person to ask what's going on in it now. OTOH, the person who knows it best now may not be comfortable listing themselves as the author of the code after just refactoring and amending it, not originally authoring it.
Additionally, some @author clauses only list name or nick, without any contact information. If the person is still active in the project under the same name, it may be easy to track them, but if not, it's mostly hopeless.
Piling on here, but I wanted to point out that this, in my opinion, is what git blame (https://git-scm.com/docs/git-blame) is for. PhpStorm, Sublime Text, Vim, and others I'm sure, all have plugins to quickly see who last touched each line of code. To dig deeper (maybe the last person just fixed a typo), try searching the log for just that set of lines, e.g. for lines 110 to 115 in api.php:
git log -L110,115:/path/to/api.php
~Leon
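To make the git blame and git log -L suggestion above concrete, here is a minimal sketch in a throwaway repository. The contributors, commit messages, and the three-line api.php are invented for the example; only the git commands themselves are the point.

```shell
#!/bin/sh
# Sketch only: line-level history with `git blame` and `git log -L`.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
# Hypothetical identity, set locally so no global Git config is needed.
git config user.name "Alice Example"
git config user.email "alice@example.org"

printf 'line one\nline two\nline three\n' > api.php
git add api.php
git commit -q -m "Add api.php"

# A second (hypothetical) contributor changes only line 2.
printf 'line one\nline 2, rewritten\nline three\n' > api.php
git -c user.name="Bob Example" -c user.email="bob@example.org" \
    commit -q -am "Rewrite line two"

# Who last touched each line?
git --no-pager blame api.php

# Every commit that touched line 2, including the one that first added it.
git --no-pager log -L 2,2:api.php --format='%an: %s'

# The same information in script-friendly form.
blame_author=$(git blame --porcelain -L 2,2 api.php | sed -n 's/^author //p')
line_commits=$(git log -L 2,2:api.php --format='AUTHOR:%an' | grep -c '^AUTHOR:')
echo "last author of line 2: $blame_author"
echo "commits touching line 2: $line_commits"
```

Here blame only reports the last person to touch the line, while log -L also surfaces the commit that first added it, which is the "dig deeper" step.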
On Tue, Jun 13, 2017 at 4:06 PM, Stas Malyshev smalyshev@wikimedia.org wrote:
Additionally, some @author clauses only list name or nick, without any contact information.
I am inclined to agree with Subbu, but of course there are legal implications. Zhou said it would be fine to move all these tags to a centralised CREDITS file and point to that file, and that doing so wouldn't breach the licence. He is a lawyer and is therefore qualified to make such determinations.
Whether moving CREDITS to a centralised file actually solves the problem, rather than just shifting it around, is debatable.
Dan
There are also some newer initiatives for having a list of contributors that is more prominently featured on the readme or docs, and that also includes other kinds of contributions, not just code creation (docs, design, review, funding, tutorials...).
I find the idea of recognizing and empowering all kinds of contributions and contributors very appealing; it would be great to make a plan to move to something similar, such as all-contributors: https://github.com/kentcdodds/all-contributors
On Tue, Jun 13, 2017 at 12:30 PM Dan Garry dgarry@wikimedia.org wrote:
Whether moving CREDITS to a centralised file actually solves the problem, rather than just shifting it around, is debatable.
I think a centralized credits / contributors file is far better since it recognizes contributions even if those changes have since been edited, rewritten, or deleted. It also doesn't try to associate authorship with specific project artifacts.
Subbu.
wikitech-l@lists.wikimedia.org