On Fri, Aug 30, 2019 at 10:09 PM Krinkle <krinklemail(a)gmail.com> wrote:
For anything else, it doesn't really work in my
experience because PHPUnit
won't actually provide a valid implementation of the interface. It returns
null for anything, which is usually not a valid implementation of the
contract the class under test depends on. As a result, you end up with
hundreds of lines of phpunit-mockery to basically reimplement the
dependency's general functionality from scratch, every time.
In my experience, classes that use a particular interface usually call
only one or two methods from it, which can usually be replaced by
returning fixed values or at most two or three lines of code. The
implementations of the stub methods usually can be very simple,
because you know exactly the context that they'll be called and you
can hard-code the results. It could sometimes not be worth it to mock
for the reason you give, but in my experience that's very much the
exception rather than the rule.
I have seen
this too many times and fail to see the added value for all that complexity
and maintenance, compared to simply letting the actual class do its work.
Practically speaking, what kind of problem would that test methodology
catch? I believe others may have a valid answer to this, but I haven't yet
seen it. Perhaps it's a source of problems I tend to prevent by other
means. There's more than one way to catch most problems :)
Consider a class like ReadOnlyMode. It has an ILoadBalancer injected.
The only thing it uses it for is getReadOnlyReason(). When testing
ReadOnlyMode, we want to test "What happens if the load balancer
returns true? What about false?" A mock allows us to inject the
required info with a few lines of code. If you use a real load
balancer, you're creating a number of problems:
1) Where do you get the load balancer from? You probably don't want to
manually supply all the parameters. If you use MediaWikiServices,
you're dependent on global state, so other tests can interfere with
yours. (Unless you tear down the services for every test, and see next
point.)
2) It's a waste of resources to run so much code just to generate
"true" or "false". One reason unit tests are so much faster than
integration tests is because they don't have to do all this setup of
the services.
3) I want an ILoadBalancer whose getReadOnlyReason() will return true
or false, so I can test how ReadOnlyMode will behave. How do I get a
real one to do that? There's no setReadOnlyReason() in ILoadBalancer.
Now I have to explore a concrete implementation and its dependencies
to figure out how to get it to return "true" or "false". Then anyone
reviewing my code has to do the same to verify that it's correct. This
is instead of a few lines of code that are contained within my test
file itself and are checkable for correctness via cursory inspection.
4) If the contract or implementation details of this totally separate
interface/class change, it will break my test. Now anyone who changes
LoadBalancer's implementation details risks breaking tests of classes
that depend on it. Worse, they might make the tests incorrect but
still pass. Perhaps I was setting readOnlyReason to true somehow, and
now that way no longer works, so it's really returning false -- but
maybe for some other reason my test now passes (perhaps other changes
made at the same time). Probably nobody will ever notice until an
actual bug arises.
5) In some cases you want to know that a certain method is being
called with certain parameters, but it's nontrivial to check
indirectly. For instance, suppose a certain method call is supposed to
update a row in the database. Even if you didn't mind giving it a real
database despite all the above reasons, how are you going to test the
correct row is being updated? You'd have to populate an initial row
(make sure the database gets reset afterwards!) and check that it was
updated properly. But what if a refactor introduced a bug in the WHERE
clause that happens not to stop the query from working in your case
but might cause problems on production data? Maybe the bug changed the
WHERE clause to one that happens to select your row of test data
despite being wrong. Trying to test every possible set of preexisting
data is not feasible. What's extremely simple is just testing that the
expected query is issued to a mock.
6) Performance is important to us. The direct way to test performance
is by measuring test run time, but this is unreliable because it's
heavily affected by load on the test servers. Also, lots of
performance issues only come up on large data sets, and we're probably
not going to run tests routinely on databases the size of Wikipedia's.
One way to help catch inadvertent performance regressions is to test
that the means of ensuring performance are being used properly. For
instance, if a method is supposed to first check a cache and only fall
back to the database for a cache miss, we want to test that the
database query is actually only issued in the event of a cache miss.
With mocks this is simple: inject a cache that will return a hit, and
inject a database that expects not to be called. This is not really
doable if you're injecting a real database.