On Fri, Aug 30, 2019 at 10:09 PM Krinkle krinklemail@gmail.com wrote:
For anything else, it doesn't really work in my experience because PHPUnit won't actually provide a valid implementation of the interface. It returns null for anything, which is usually not a valid implementation of the contract the class under test depends on. As a result, you end up with hundreds of lines of phpunit-mockery to basically reimplement the dependency's general functionality from scratch, every time.
In my experience, classes that use a particular interface usually call only one or two methods from it, which can usually be replaced by returning fixed values or at most two or three lines of code. The implementations of the stub methods usually can be very simple, because you know exactly the context that they'll be called and you can hard-code the results. It could sometimes not be worth it to mock for the reason you give, but in my experience that's very much the exception rather than the rule.
I have seen this too many times and fail to see the added value for all that complexity and maintenance, compared to simply letting the actual class do its work. Practically speaking, what kind of problem would that test methodology catch? I believe others may have a valid answer to this, but I haven't yet seen it. Perhaps it's a source of problems I tend to prevent by other means. There's more than one way to catch most problems :)
Consider a class like ReadOnlyMode. It has an ILoadBalancer injected. The only thing it uses it for is getReadOnlyReason(). When testing ReadOnlyMode, we want to test "What happens if the load balancer returns true? What about false?" A mock allows us to inject the required info with a few lines of code. If you use a real load balancer, you're creating a number of problems:
1) Where do you get the load balancer from? You probably don't want to manually supply all the parameters. If you use MediaWikiServices, you're dependent on global state, so other tests can interfere with yours. (Unless you tear down the services for every test, and see next point.)
2) It's a waste of resources to run so much code just to generate "true" or "false". One reason unit tests are so much faster than integration tests is because they don't have to do all this setup of the services.
3) I want an ILoadBalancer whose getReadOnlyReason() will return true or false, so I can test how ReadOnlyMode will behave. How do I get a real one to do that? There's no setReadOnlyReason() in ILoadBalancer. Now I have to explore a concrete implementation and its dependencies to figure out how to get it to return "true" or "false". Then anyone reviewing my code has to do the same to verify that it's correct. This is instead of a few lines of code that are contained within my test file itself and are checkable for correctness via cursory inspection.
4) If the contract or implementation details of this totally separate interface/class change, it will break my test. Now anyone who changes LoadBalancer's implementation details risks breaking tests of classes that depend on it. Worse, they might make the tests incorrect but still pass. Perhaps I was setting readOnlyReason to true somehow, and now that way no longer works, so it's really returning false -- but maybe for some other reason my test now passes (perhaps other changes made at the same time). Probably nobody will ever notice until an actual bug arises.
5) In some cases you want to know that a certain method is being called with certain parameters, but it's nontrivial to check indirectly. For instance, suppose a certain method call is supposed to update a row in the database. Even if you didn't mind giving it a real database despite all the above reasons, how are you going to test the correct row is being updated? You'd have to populate an initial row (make sure the database gets reset afterwards!) and check that it was updated properly. But what if a refactor introduced a bug in the WHERE clause that happens not to stop the query from working in your case but might cause problems on production data? Maybe the bug changed the WHERE clause to one that happens to select your row of test data despite being wrong. Trying to test every possible set of preexisting data is not feasible. What's extremely simple is just testing that the expected query is issued to a mock.
6) Performance is important to us. The direct way to test performance is by measuring test run time, but this is unreliable because it's heavily affected by load on the test servers. Also, lots of performance issues only come up on large data sets, and we're probably not going to run tests routinely on databases the size of Wikipedia's. One way to help catch inadvertent performance regressions is to test that the means of ensuring performance are being used properly. For instance, if a method is supposed to first check a cache and only fall back to the database for a cache miss, we want to test that the database query is actually only issued in the event of a cache miss. With mocks this is simple: inject a cache that will return a hit, and inject a database that expects not to be called. This is not really doable if you're injecting a real database.