Hi all!
In my opinion, MediaWiki’s PHPCS ruleset feels largely rooted in an older version of PHP, where static type declarations (formerly known as “type hints”) did not exist. As we move towards more modern code, I think some rules should be relaxed, and others adjusted. More specifically, I’d like to know if most people agree with the following propositions and conclusion:
Proposition 1: *Some code is sufficiently documented by names and types*, and does not require additional documentation. Cases where additional documentation is required do certainly exist, but they can only be identified by human reviewers, not by automated tools.
You can see this in our existing code wherever a doc comment specifies only a type (with @var, @param, or @return), but no additional text. For example, in CreditsAction https://gerrit.wikimedia.org/g/mediawiki/core/+/de752f45af/includes/actions/CreditsAction.php, nobody needs to be told that the LinkRenderer will be used to render links, or that the UserFactory creates User objects:
class CreditsAction extends FormlessAction {
/** @var LinkRenderer */
private $linkRenderer;
/** @var UserFactory */ private $userFactory;
Likewise, it’s not necessary to explain in great detail that the string returned by LinksTable::getTableName() https://gerrit.wikimedia.org/g/mediawiki/core/+/de752f45af/includes/deferred/LinksUpdate/LinksTable.php#175 is the table name, that the $actor parameter of ActorCache::remove( UserIdentity $actor ) https://gerrit.wikimedia.org/g/mediawiki/core/+/de752f45af/includes/user/ActorCache.php#103 represents the actor to remove from the cache, or what the meaning of the Message $m and returned MessageValue are in Message\Converter::convertMessage() https://gerrit.wikimedia.org/g/mediawiki/core/+/de752f45af/includes/Message/Converter.php#48 :
/**
* Convert a Message to a MessageValue
* @param Message $m
* @return MessageValue
*/ public function convertMessage( Message $m ) {
(I want to clarify that in this last example I’m only talking about the @param and @return tags that already don’t have any prose text. While the method comment “Convert a Message to a MessageValue” might also be considered redundant, I think this would be more contentious, and I’m not advocating for removing that today.)
Proposition 2: *Adding types as static types is generally preferable.* Unlike doc comments, static types are checked at runtime and thus guaranteed to be correct (as long as the code runs at all); the small runtime cost should be partially offset by performance improvements in newer PHP versions, and otherwise considered to be worth it. New code should generally include static types where possible, and existing code may have static types added as part of other work on it. I believe this describes our current development practice as MediaWiki developers.
Note that some older MediaWiki classes are considered unsuitable for static types, and should only be used in comments; this is expected to help in a future migration away from these classes (see T240307#6191788 https://phabricator.wikimedia.org/T240307#6191788).
Proposition 3: *Where types can be losslessly represented as PHP static types, types in doc comments are unnecessary.* When doc comments are considered necessary for actual documentation beyond types, then the type is generally still included (and Phan will check that it matches the static type), but when no further documentation is needed (see proposition 1 above), then the @var, @param, etc. doc comment can be omitted.
Note that depending on the PHP version, not all types can be losslessly represented as PHP static types yet (e.g. union types and mixed both need to wait for PHP 8.0, null and false for PHP 8.2); in such cases, doc comments can remain necessary.
Conclusion: *We should update our PHPCS ruleset to require fewer doc comments.* Exact rules are probably to be decided, depending on how much work we’re willing to put into the sniff implementations (e.g. is it feasible to require /** @param */ doc comments only if a parameter has no static type?), but generally, I argue that we want code such as the following to be allowed by our standard PHPCS ruleset:
class CreditsAction extends FormlessAction {
private LinkRenderer $linkRenderer;
private UserFactory $userFactory;
/** Convert a Message to a MessageValue */
public function convertMessage( Message $m ): MessageValue {
When doc comments are still necessary or at least beneficial because the type alone isn’t enough information, it’s up to humans to decide this while writing the code or point it out during code review.
What do people think about this? :)
PS: In PHP 8, we could abbreviate some of this code even more using constructor property promotion:
class CreditsAction extends FormlessAction {
public function __construct(
Page $page,
IContextSource $context,
private LinkRenderer $linkRenderer,
private UserFactory $userFactory
) {
parent::__construct( $page, $context );
}
(Again, I’m not saying that all code should look like this – but I think we have plenty of existing code that effectively carries no additional information in its documentation, and which could be converted into this form without losing anything.)
Cheers, Lucas