When looking for resources to answer Tim's question at https://www.mediawiki.org/wiki/Architecture_guidelines#Clear_separation_of_concerns, I found a very nice and concise overview of principles to follow for writing testable (and extendable, and maintainable) code:
"Writing Testable Code" by Miško Hevery http://googletesting.blogspot.de/2008/08/by-miko-hevery-so-you-decided-to.html.
It's just 10 short and easy points, not some rambling discussion of code philosophy.
As far as I am concerned, these points can be our architecture guidelines. Beyond that, all we need is some best practices for dealing with legacy code.
MediaWiki violates at least half of these principles in pretty much every class. I'm not saying we should rewrite MediaWiki to conform. But I wish it were recommended that all new code follow these principles, and that (local) "just in time" refactoring of old code in accordance with these guidelines were encouraged.
-- daniel
On 31/05/13 20:15, Daniel Kinzler wrote:
I'm not convinced that unit testing is worth doing down to the level of detail implied by that blog post. Unit testing is essential for certain kinds of problems -- especially complex problems where the solution and verification can come from two different (complementary) directions.
But if you split up your classes to the point of triviality, and then write unit tests for a couple of lines of code at a time with an absolute minimum of integration, then the tests become simply a mirror of the code. The application logic, where flaws occur, is at a higher level of abstraction than the unit tests.
Even if you test code in larger chunks, very simple code tends to fail in ways that are invisible to unit tests. For example, you can get 100% test coverage of some code that generates an HTML form by confirming that it generates the HTML you wanted it to generate, but that's trivial. The most likely place for a flaw is in the specification -- i.e. in the HTML code which you typed into the unit tests and then typed again (perhaps with a trivial transformation) into the implementation.
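Tim's scenario can be sketched concretely. The function and its markup below are invented for illustration; the point is that the test merely restates the same string that was typed into the implementation:

```python
# A hypothetical form renderer: the "specification" is the HTML itself.
def render_login_form(action):
    return ('<form action="%s"><input name="user">'
            '<input name="pw" type="password"></form>' % action)

# This test achieves 100% coverage, but it only mirrors the HTML that
# was typed into the implementation. A flaw in the specification (say,
# a wrong field name) appears identically in both places, so the test
# can never catch it.
def test_render_login_form():
    assert render_login_form("/login") == (
        '<form action="/login"><input name="user">'
        '<input name="pw" type="password"></form>'
    )

test_render_login_form()
```

The test passes, yet it verifies nothing beyond "the code is the code".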
So my question is not "how do we write code that is maximally testable", it is: does convenient testing provide sufficient benefits to outweigh the detrimental effect of making everything else inconvenient?
As for the rest of the blog post: I agree with items 3-8. I would agree with item 1 with the caveat that value objects can be constructed directly, which seems to be implied by item 9 anyway. The rest of item 9, and item 2, are the topics which I have been discussing here and on the wiki.
Regarding item 10: certainly separation of concerns is a fundamental principle, but there are degrees of separation, and I don't think I would go quite as far as requiring every method in a class to use every field that the class defines.
-- Tim Starling
Thanks for your thoughtful reply, Tim!
On 03.06.2013 07:35, Tim Starling wrote:
I'm not convinced that unit testing is worth doing down to the level of detail implied by that blog post. Unit testing is essential for certain kinds of problems -- especially complex problems where the solution and verification can come from two different (complementary) directions.
I think testability is important, but I think it's not the only (or even main) reason to support the principles from that post. I think these principles are also important for maintainability and extensibility.
Essentially, they enforce modularization of code in a way that makes all parts as independent of each other as possible. This means they can also be understood by themselves, and can easily be replaced.
But if you split up your classes to the point of triviality, and then write unit tests for a couple of lines of code at a time with an absolute minimum of integration, then the tests become simply a mirror of the code. The application logic, where flaws occur, is at a higher level of abstraction than the unit tests.
That's why we should have unit tests *and* integration tests.
I agree though that it's not necessary or helpful to enforce the maximum possible breakdown of the code. However, I feel that the current code sits way too far toward the monolithic end of the spectrum - we could and should do a lot better.
So my question is not "how do we write code that is maximally testable", it is: does convenient testing provide sufficient benefits to outweigh the detrimental effect of making everything else inconvenient?
If there are indeed such detrimental effects. I see two main inconveniences:
* More classes/files. This is, in my opinion, mostly a question of using the proper tools.
* Working with "passive" objects, e.g. $chargeProcessor->process( $card ) instead of $card->charge(). This means additional code for injecting the processor, and more code for calling the logic.
That is inconvenient, but not detrimental, IMHO: it makes responsibilities clearer and allows for easy substitution of logic.
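The trade-off Daniel describes, translated from the thread's PHP example into a small hypothetical Python sketch (all names invented): the card stays a passive value object, and the "active" processor is injected along with its own dependency, so a test can swap in a fake.

```python
class Card:
    """A plain value object: data, no business logic."""
    def __init__(self, number, balance):
        self.number = number
        self.balance = balance

class ChargeProcessor:
    """The 'active' collaborator; its gateway is an explicit,
    injected dependency rather than being created internally."""
    def __init__(self, gateway):
        self.gateway = gateway

    def process(self, card, amount):
        self.gateway.charge(card.number, amount)
        card.balance -= amount

# In a test, the gateway is replaced by a trivial fake that records calls:
class FakeGateway:
    def __init__(self):
        self.calls = []
    def charge(self, number, amount):
        self.calls.append((number, amount))

gateway = FakeGateway()
processor = ChargeProcessor(gateway)
card = Card("4111", 100)
processor.process(card, 30)
assert card.balance == 70
assert gateway.calls == [("4111", 30)]
```

The extra wiring is the inconvenience; the clear responsibility split and the one-line substitution of the gateway are the payoff.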
As for the rest of the blog post: I agree with items 3-8.
yay :)
I would agree with item 1 with the caveat that value objects can be constructed directly, which seems to be implied by item 9 anyway.
Yes, absolutely: value objects can be constructed directly. I'd even go so far as to say that it's ok, at least at first, to construct controller objects directly, using services injected into the local scope (though it would be better to have a factory for the controllers).
The rest of item 9, and item 2, are the topics which I have been discussing here and on the wiki.
To me, 9 is pretty essential, since without that principle, value objects will soon cease to be thus, and will again grow into the monsters we see in the code base now.
Item 2 is less essential, though still important, I think; basically, it requires every component (class) to make explicit which other components it relies on for collaboration. Only then can it easily be isolated and "transplanted" - that is, re-used in a different context (like testing).
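A minimal sketch of that difference, with invented names: the first class wires up its collaborator internally and so cannot be isolated, while the second declares it as a constructor parameter and can be "transplanted" into a test with a fixed clock.

```python
import time

class SystemClock:
    def now(self):
        return time.time()

# Hidden collaboration: the dependency is created inside the class,
# so every test of SessionHidden implicitly tests SystemClock too.
class SessionHidden:
    def __init__(self):
        self.clock = SystemClock()   # invisible from the outside

# Explicit collaboration: the dependency is a constructor parameter.
class Session:
    def __init__(self, clock, ttl):
        self.clock = clock
        self.created = clock.now()
        self.ttl = ttl

    def expired(self):
        return self.clock.now() - self.created > self.ttl

# A test substitutes a deterministic clock and "advances" time at will:
class FixedClock:
    def __init__(self, t):
        self.t = t
    def now(self):
        return self.t

clock = FixedClock(1000)
session = Session(clock, ttl=60)
assert not session.expired()
clock.t = 1100
assert session.expired()
```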
Regarding item 10: certainly separation of concerns is a fundamental principle, but there are degrees of separation, and I don't think I would go quite as far as requiring every method in a class to use every field that the class defines.
Yes, I agree. Separation of concerns can be driven to the atomic level, and at some point becomes more of a pain than an aid. But we definitely should split more than we do now.
-- daniel
I have no qualms with any of the guidelines. They are good guidelines, but like all guidelines they are made to be bent when appropriate, so long as you leave a good explanatory comment. My main concern is that the article is about how to write more unit-testable code, which is something I think people take too far. The thing that unit tests are good for is testing that a "unit" of code does what you expect it to. The problem is that people sometimes test portions of atomic units without testing the whole unit. Java folks are especially dogmatic about testing just one class at a time, which is a great guideline but tends to be the wrong thing to do about 20% of the time.
My favorite example of this is testing a Repository or a DAO with a mock database. A repository's job is to issue the correct queries to the database and spit the results back correctly. Without talking to an actual database you aren't testing this. Without some good test data in that database you aren't testing this. I'd go so far as to say you have to talk to _exactly_ the right database (MySQL in our case) but other very smart people disagree with me on that point.
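A self-contained sketch of the idea, in Python with an in-memory SQLite database purely for illustration (Nik's actual point is that you should talk to exactly the production engine, MySQL; the names here are invented):

```python
import sqlite3

class UserRepository:
    """Hypothetical repository: its whole job is to issue correct SQL
    and hand the results back."""
    def __init__(self, conn):
        self.conn = conn

    def find_by_name(self, name):
        cur = self.conn.execute(
            "SELECT id, name FROM users WHERE name = ?", (name,))
        return cur.fetchone()

# An integration-style test against a real (if in-memory) database
# with real test data. A mocked connection would only prove that we
# called execute() with the string we expected, not that the SQL
# actually parses and returns the right rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

repo = UserRepository(conn)
assert repo.find_by_name("alice") == (1, "alice")
assert repo.find_by_name("bob") is None
```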
While this example is especially silly, I'm sure we've all finished writing a test, looked at the test code and thought, "This test proves that I'm interacting correctly with collaborator objects but doesn't prove that my functionality is correct." Sometimes this is caused by collaborators being non-obvious. Sometimes this is caused by global state that you have to work around. In any case, I'd argue that these tests should really be deleted, because all they really do is make your code coverage statistics better, give you a false sense of security, and slow down your builds.
So I just wrote a nice little wall of text about what is wrong with the world, and like any good preacher I'll propose a few solutions:
1. Live with having bigger units. Call the tests integration tests if it makes you feel better; I don't really care. But you have to stand up the whole database connection, populate it with test data that mimics production in a useful sense, and then run the query.
2. Build smaller components sensibly and carefully. The goal is to be able to hold all of the component in your head at once, and for the component to present such a clean API that when you mock it out the tests are meaningful.
3. Write tests that exercise the entire application after it is started, with tools like Selenium. The disadvantage here is that these run way slower than unit tests and require you to learn yet another tool. Too bad. Some stuff is simply untestable without a real browser, like Tim's HTML forms.
4. Use lots of static analysis tools. They really do help identify dumb mistakes and don't require you to do anything other than turn them on, run them before you commit, and fail the build when they fail. Worth it.
5. Don't write automated tests at all, and do lots of code reviews and manual testing. Sometimes this is really the most sensible thing. I'll leave it to you to figure out when that is, though.
There is a great presentation on InfoQ about unit testing that I can't find anymore where the presenter likens testing to guard rails. He claims that just because you have guard rails you shouldn't stop paying attention and expect them to save you.
Sorry for the rambling wall of text.
Nik
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Hey,
He claims that just because you have guard rails you shouldn't stop paying attention and expect them to save you.
Being someone who practices TDD, I fully agree with that. Writing tests first and getting 100% coverage does not prove that your code is correct and bug free. That tests can prevent all bugs, or that they remove the need to think about what you are doing, are two common arguments made against a mythical strawman version of TDD. Both are clearly fallacious, and neither is the goal of TDD or of writing tests in general. [0]
4. Don't write automated tests at all and do lots of code reviews and manual testing. Sometimes this is really the most sensible thing. I'll leave it to you to figure out when that is though.
Absolutist statements are typically wrong; there are almost always cases in which some practice is not applicable. However, I strongly disagree with your recommendation of not writing and automating tests. I disagree even more strongly with the notion that manual testing is generally something you want to do. I've seen many experts in the field of software design recommend strongly against manual testing, and I'm seeing the same theme being pretty prevalent here at the International PHP Conference I'm currently attending.
So my question is not "how do we write code that is maximally testable", it is: does convenient testing provide sufficient benefits to outweigh the detrimental effect of making everything else inconvenient?
This suggests that testable code is inherently badly designed. That is certainly not the case: good design and testability go hand in hand. One of the selling points of testing is that it strongly encourages you to create well designed software.
I'm not convinced that unit testing is worth doing down to the level of detail implied by that blog post. Unit testing is essential for certain kinds of problems -- especially complex problems where the solution and verification can come from two different (complementary) directions.
I think testability is important, but I think it's not the only (or even main) reason to support the principles from that post. I think these principles are also important for maintainability and extensibility.
Essentially, they enforce modularization of code in a way that makes all parts as independent of each other as possible. This means they can also be understood by themselves, and can easily be replaced.
There are other advantages to writing tests as well. Just off the top of my head:
* Regression detection
* Replaces manual testing with automated testing, saves lots of time, esp in projects with multiple devs. Manual testing tends to be incomplete and skipped as well, so the number of bugs caught is much lower. And it does not scale. At all.
* Documentation so formal it can be executed and is never out of date
* Perhaps the most important: removes the fear of change. One can refactor code to clean up some mess without having to fear one broke existing behavior. Tests are a great counter to code rot. Without tests, your code quality is likely to decline.
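The "executable documentation" and "fear of change" points can be illustrated with a small hypothetical sketch (the normalization rules below are invented for illustration, not MediaWiki's actual ones):

```python
def normalize_title(title):
    """Hypothetical title normalization: strip surrounding whitespace,
    collapse runs of spaces/underscores into single underscores, and
    uppercase the first letter."""
    words = title.replace("_", " ").split()
    if not words:
        return ""
    joined = "_".join(words)
    return joined[0].upper() + joined[1:]

# The assertions below act as an executable specification: they pin
# down the observable behaviour, so the implementation above can be
# refactored without fear. If a refactoring changes any behaviour,
# the failing assertion points straight at the regression.
assert normalize_title("main page") == "Main_page"
assert normalize_title("  Main   Page ") == "Main_Page"
assert normalize_title("main__page") == "Main_page"
assert normalize_title("") == ""
```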
I'm sure I missed some other good points. Lots of great literature on the subject out there btw.
[0] http://codemanship.co.uk/parlezuml/blog/?postid=1170
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. ~=[,,_,,]:3 --
On Mon, Jun 3, 2013 at 10:20 AM, Jeroen De Dauw jeroendedauw@gmail.com wrote:
- Don't write automated tests at all and do lots of code reviews and manual testing. Sometimes this is really the most sensible thing. I'll leave it to you to figure out when that is though.
Absolutist statements are typically wrong; there are almost always cases in which some practice is not applicable. However, I strongly disagree with your recommendation of not writing and automating tests. I disagree even more strongly with the notion that manual testing is generally something you want to do. I've seen many experts in the field of software design recommend strongly against manual testing, and I'm seeing the same theme being pretty prevalent here at the International PHP Conference I'm currently attending.
I think not having automated tests is right in some situations, but I certainly wouldn't recommend it. Manual testing sucks, and having nice tests with Selenium or some such tool is way better in most situations, but there are totally times where a good code review and manual verification are perfect. I'm thinking of temporary solutions, or styling issues that are difficult to verify with automated tests. I'm certainly no expert and I'd _love_ to learn more about things that help in the situations where I feel like manual testing is best. I'd love nothing more than to be wrong.
So my question is not "how do we write code that is maximally testable", it is: does convenient testing provide sufficient benefits to outweigh the detrimental effect of making everything else inconvenient?
This suggests that testable code is inherently badly designed. That is certainly not the case: good design and testability go hand in hand. One of the selling points of testing is that it strongly encourages you to create well designed software.
IMHO you can design code so that it is both easy to understand and easy to test, but there is a real temptation to sacrifice comprehensibility for testability. Mostly I see this in components being split into incomprehensibly small chunks and then tested via an intricate mock waltz. I'm not saying this happens all the time, only that it happens and we need to be vigilant. The guidelines in the article help prevent such craziness.
There are other advantages to writing tests as well. Just out of the top of my head:
- Regression detection
- Replaces manual testing with automated testing, saves lots of time, esp in projects with multiple devs. Manual testing tends to be incomplete and skipped as well, so the number of bugs caught is much lower. And it does not scale. At all.
- Documentation so formal it can be executed and is never out of date
- Perhaps the most important: removes the fear of change. One can refactor code to clean up some mess without having to fear one broke existing behavior. Tests are a great counter to code rot. Without tests, your code quality is likely to decline.
This is perfect! If you think of your tests as formal verification documents then you are in good shape because this implies that the tests are readable.
If I had my druthers I'd like all software to be designed in such a way that it can be tested automatically with informative tests that read like documentation. We'd all like that. To me it looks like there are three problems:
1. How do you keep out tests that are incomprehensible as documentation?
2. What do you do with components for which no unit test can be written that could serve as documentation?
3. What do you do when the formal documentation will become out of date so fast that it feels like a waste of time to write it?
I really only have a good answer for #2 and that is to test components together like the DB and Repository or the server side application and the browser.
#1 troubles me quite a bit because I've found those tests to be genuinely hurtful, in that they give you the sense that you are accomplishing something when you aren't.
Nik
On Mon, Jun 3, 2013 at 7:20 AM, Jeroen De Dauw jeroendedauw@gmail.com wrote:
So my question is not "how do we write code that is maximally testable", it is: does convenient testing provide sufficient benefits to outweigh the detrimental effect of making everything else inconvenient?
This suggests that testable code is inherently badly designed. That is certainly not the case: good design and testability go hand in hand. One of the selling points of testing is that it strongly encourages you to create well designed software.
I've fixed several security bugs in the past year where we had unit tests covering the code. The code's author just didn't expect their code to be used in certain ways, which led to the vulnerability. So speaking solely from a security perspective, testable/tested code is not always well designed code. I think everyone would agree with that, but from my perspective, good design needs to trump testability. I would guess that in most cases there shouldn't be a conflict, but there are times when it will come up.
On Mon, Jun 3, 2013 at 6:04 AM, Nikolas Everett neverett@wikimedia.org wrote:
- Build smaller components sensibly and carefully. The goal is to be able to hold all of the component in your head at once and for the component to present such a clean API that when you mock it out tests are meaningful.
Yep. Very few security issues come up from a developer saying, "I'm going to choose a lower security option", and the attacker plowing through it. It's almost always that the attacker is exploiting something that the developer didn't even consider in their design. So the more things a developer needs to hold in their head between the request and the response, the more likely vulnerabilities are to be introduced. Simplifying some of our complex components and clearly documenting their security properties would be very helpful towards a more secure codebase. Adding layers of abstraction without making the security easy to understand and demonstrate will hurt us.
On 03.06.2013 18:48, Chris Steipp wrote:
On Mon, Jun 3, 2013 at 6:04 AM, Nikolas Everett neverett@wikimedia.org wrote:
- Build smaller components sensibly and carefully. The goal is to be able to hold all of the component in your head at once and for the component to present such a clean API that when you mock it out tests are meaningful.
Yep. Very few security issues come up from a developer saying, "I'm going to choose a lower security option", and the attacker plowing through it. It's almost always that the attacker is exploiting something that the developer didn't even consider in their design. So the more things a developer needs to hold in their head between the request and the response, the more likely vulnerabilities are to be introduced. Simplifying some of our complex components and clearly documenting their security properties would be very helpful towards a more secure codebase. Adding layers of abstraction without making the security easy to understand and demonstrate will hurt us.
I agree with the sentiment, but disagree with the metric used.
Currently, we have relatively few components, which have very complex internal information flows, and quite complex dependency networks (or quite simple: everything depends on everything).
I'm advocating a system of many more components with several dependencies each, but with simple internal information flow and a clear hierarchy of dependency.
So, which one is simpler to hold in your head? Well, it's simpler to remember fewer components. But not fully understanding their internal information flow (EditPage, anyone?) or how they interact and depend on each other is what is really hurting security (and overall code quality).
So, I'd argue that even if you have to remember 15 (well named) classes instead of 5, you are still better off if these 15 classes only depend on a total of 5000 lines of code, as opposed to 50k or more with the current system.
tl;dr: the number of lines is a better metric for the impact of dependencies than the number of classes is. Big, multi-purpose classes and context objects (and global state) keep the number of classes low, but cause dependency on a huge number of LoC.
-- daniel
On 06/03/2013 10:20:26 AM, Jeroen De Dauw - jeroendedauw@gmail.com wrote: <snip>
- Regression detection
- Replaces manual testing with automated testing, saves lots of time, esp in projects with multiple devs. Manual testing tends to be incomplete and skipped as well, so the number of bugs caught is much lower. And it does not scale. At all.
- Documentation so formal it can be executed and is never out of date
- Perhaps the most important: removes the fear of change. One can refactor code to clean up some mess without having to fear one broke existing behavior. Tests are a great counter to code rot. Without tests, your code quality is likely to decline.
There is an economic argument as well. Consider the extra cost of writing automated tests versus doing manual tests as an investment to reduce long-term costs. With manual testing someone (maybe you) has to remember and recreate a lot more each time testing is required. The more successful a software project is, the longer it will be in production; thus "maintenance" will be performed more times, and the bigger the payoff on the investment will be.
<snip>
On 04/06/13 00:20, Jeroen De Dauw wrote:
Hey,
He claims that just because you have guard rails you shouldn't stop paying attention and expect them to save you.
Being someone who practices TDD, I fully agree with that. Writing tests first and getting 100% coverage does not prove that your code is correct and bug free. That tests can prevent all bugs, or that they remove the need to think about what you are doing, are two common arguments made against a mythical strawman version of TDD. Both are clearly fallacious, and neither is the goal of TDD or of writing tests in general. [0]
Indeed, TDD doesn't produce bug-free code. In fact, according to a literature review published in the book "Making Software" (Ed. Andy Oram & Greg Wilson), TDD does not even significantly *reduce* the post-release defect density in controlled trials, let alone eliminate bugs altogether.
"There is some evidence to suggest that TDD improves external quality. Although the outcomes of controlled experiments are mostly inconclusive, industrial use and pilot studies strongly favor TDD. However, the supporting evidence from industrial use and controlled experiments disappears after filtering out the less rigorous studies (i.e., L0 and L1 trials). Furthermore, the evidence from pilot studies and controlled experiments is contradictory once L0 and L1 trials are filtered out."
The review was even more negative about the effect of TDD on productivity:
"The available evidence from the trials suggests that TDD does not have a consistent effect on productivity. The evidence from controlled experiments suggests an improvement in productivity when TDD is used. However, the pilot studies provide mixed evidence, some in favor of and others against TDD. In the industrial studies, the evidence suggests that TDD yields worse productivity. Even when considering only the more rigorous studies (L2 and L3), the evidence is equally split for and against a positive effect on productivity."
And also in the discussion section:
"One basic fact on which almost everyone agrees is that TDD is difficult to learn. It involves a steep learning curve that requires skill, maturity, and time, particularly when developers are entrenched in the code-then-test paradigm."
Because of this, I can be fairly confident in recommending that my team avoids the use of TDD.
-- Tim Starling
Hey,
Because of this, I can be fairly confident in recommending that my team avoids the use of TDD.
Clearly you are not a fan of TDD, which is entirely fine. Whether you adopt this practice is a personal choice, and not one that should be forced upon anyone. Like all practices, its effectiveness varies depending on the people and the environment. You are, however, writing tests for most of your code, I hope? MediaWiki has a serious lack of tests, and we are paying the cost of that whether you realize it or not, so be careful with what you recommend.
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. ~=[,,_,,]:3 --
As the dumb ass trying to merge a lot of the code last year at Wikidata, I would say: stop bitching about whether to make tests or not. Any tests are better than no tests; without tests, merging code is pure gambling. Yes, you can create a small piece of code and be fairly sure that your own code works when you merge it in, but no, you cannot be sure that it will continue to work when a bunch of other nerds are wild-guessing what your code is doing and then merge their own code in. There is only one sure method to keep the code working, and that is adding tests. Those tests are not primarily for you, unless you do TDD; they are for everyone else.
If you want some real comparison of what can be achieved with proper testing, take a look at the Coverity Scan: 2012 Open Source Report [1], which tracks 254 active projects. It is worth checking what those projects do to keep their code bug-free. (Amanda uses an automatic gatekeeper, and _they_have_extensive_tests_.)
You should not make tests that test the internals of a unit; you should make tests that operate on the API of the unit. You should be able to change the internals of a unit without changing the tests. If you must change the tests because of internal changes in a unit, then there is probably something wrong with the test.
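A small illustration of this principle, with invented names: the tests below touch only the public get/put API, so the underscored internals could be rewritten (say, as a linked list) without changing a single test line.

```python
class Cache:
    """A tiny bounded cache. The public API is get/put; the storage
    strategy is an internal detail."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = {}     # internal: representation could change
        self._order = []     # internal: eviction bookkeeping

    def put(self, key, value):
        # Evict the oldest entry when inserting a new key at capacity.
        if key not in self._items and len(self._items) >= self.capacity:
            oldest = self._order.pop(0)
            del self._items[oldest]
        if key in self._order:
            self._order.remove(key)
        self._order.append(key)
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)

# API-level tests: they state observable behaviour only. A test that
# inspected _order directly would break on any internal refactoring,
# even one that preserves behaviour.
c = Cache(capacity=2)
c.put("a", 1)
c.put("b", 2)
c.put("c", 3)                 # evicts "a", the oldest entry
assert c.get("a") is None
assert c.get("b") == 2 and c.get("c") == 3
```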
Security issues in an implementation are most of the time not found by unit tests, because the unit tests should test the API and not the implementation. Another thing is that if the unit tests do in fact test the API, the necessary seams are probably in place to do source code fault injection, and that is a very efficient tool for finding security issues. Take a look at "Software Security Assessment Tools Review" [2] for a list of proper methods for security assessment.
Also, tests are not a replacement for proper code review, but they will speed up the process in those cases where bugs are detected during the testing process. In some cases you can in fact use the time spent on merges, and the frequency with which you break the build, as a measure of how good your test coverage is: less coverage gives a higher breakage frequency.
TDD is nice, but often it turns into testing the internals of some idea the developer has, which is wrong. It also has a strange tendency to create lock-in on certain solutions, which can be a problem: you choose an implementation by how easy it is to test, not by how efficient it is. The same can be said about other formal processes, so I'm not sure it is a valid point against TDD specifically. It is said that those familiar with TDD write the tests and the implementation in about the same time as the implementation alone. Rumor has it that this is because the prewritten tests make the implementation more testable, and that this is easier than adding tests to the implementation later. I don't know.
Code with some tests is sometimes compared to code with no tests at all, and claims are then made that the code without tests is somehow similarly bug-free. This argument usually doesn't hold: "some tests" isn't good enough. You need real numbers for the coverage before you can make valid claims, and if the coverage is low it can be very difficult to say anything about the real effects. If you really want to make an effort to get bug-free code you need a lot of tests, above 50% coverage and probably more like 80%. And not only do you need a lot of tests, you need a lot of people to fix the bugs those tests will find. It is like writing a document without a spell checker versus with one.
[1] http://www.coverity.com/library/pdf/coverity-scan-2011-open-source-integrity... [2] http://samate.nist.gov/docs/NAVSEA-Tools-Paper-2009-03-02.pdf
On 06/04/2013 11:37 AM, John Erling Blad wrote:
It is like writing a document without a spell checker versus with one.
Which would be the right moment to remind you of the Cupertino effect, which illustrates so well how the combination of automation and trust in that automation is known to cause as many problems as it solves, and often more insidious ones.
My own experience is that "test coverage" is a poor evaluation metric for anything but "test coverage"; it doesn't produce better code, and tends to produce code that is considerably harder to understand conceptually because it has been over-factorized into simple bits that hide the actual code and data flow. "Forest for the trees".
-- Marc
Hey,
My own experience is that "test coverage" is a poor evaluation metric
for anything but "test coverage"; it doesn't produce better code, and tends to produce code that is considerably harder to understand conceptually because it has been over-factorized into simple bits that hide the actual code and data flow. "Forest for the trees".
Test coverage is a metric that tells you how much of your code is executed by your tests. From this alone you cannot say whether some code is good or bad. You can have bad code with 100% coverage, and good code without any coverage. You first state that it is a poor metric for measuring quality, and then proceed to claim that more coverage implies bad code. Aside from contradicting yourself, this is pure nonsense. Perhaps you just expressed yourself badly, as test coverage does not "produce" code to begin with.
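The "bad code with 100% coverage" case is easy to demonstrate. Here is a hedged, hypothetical sketch: a buggy function and a test that executes every line of it (perfect line coverage) while asserting nothing, so the bug sails through.

```python
# Hypothetical: a buggy function plus a test that achieves 100% line
# coverage without catching the bug.
def word_count(text):
    # Bug: splits on a literal space only, so tabs and newlines
    # are not treated as word separators.
    return len(text.split(" "))

def test_word_count_runs():
    word_count("one two")   # executes every line: coverage reports 100%
    word_count("one\ttwo")  # ...but with no assertions, the bug slips by

test_word_count_runs()
# A meaningful test would assert behaviour, and would fail here:
# assert word_count("one\ttwo") == 2   # actually returns 1
```

Coverage counts execution, not verification; the metric is only as good as the assertions behind it.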
Cheers
-- Jeroen De Dauw http://www.bn2vs.com Don't panic. Don't be evil. ~=[,,_,,]:3 --
On Tue, Jun 4, 2013 at 12:36 PM, Jeroen De Dauw jeroendedauw@gmail.com wrote:
Hey,
My own experience is that "test coverage" is a poor evaluation metric
for anything but "test coverage"; it doesn't produce better code, and tends to produce code that is considerably harder to understand conceptually because it has been over-factorized into simple bits that hide the actual code and data flow. "Forest for the trees".
Test coverage is a metric to see how much of your code is executed by your tests. From this alone you cannot say if some code is good or bad. You can have bad code with 100% coverage, and good code without any coverage. You are first stating it is a poor metric to measure quality and then proceed to make the claim that more coverage implies bad code. Aside from contradicting yourself, this is pure nonsense. Perhaps you just expressed yourself badly, as test coverage does not "produce" code to begin with.
The thing is quite a few of us have seen cases where people bend over backwards for test coverage, sacrificing code quality and writing tests that don't provide any real value. In this respect high test coverage can poison your code. It shouldn't but it can.
The problem is rejecting changes like this while still encouraging people to write the useful kinds of tests - tests for usefully large chunks that serve as formal documentation. Frankly, one of my favorite tools in the world is Python's doctests because the test _is_ the documentation.
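For readers who have not used them, a minimal doctest sketch (hypothetical function, not MediaWiki code) shows what "the test _is_ the documentation" means: the usage examples in the docstring are executed verbatim as tests.

```python
def slugify(title):
    """Convert a page title to a URL slug (hypothetical example).

    The examples below are both the documentation and the test suite:

    >>> slugify("Main Page")
    'Main_Page'
    >>> slugify("  spaced  out  ")
    'spaced_out'
    """
    # split() with no argument collapses all runs of whitespace
    return "_".join(title.split())

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # runs every >>> example found in the docstrings
```

If the implementation drifts from the documented behaviour, `doctest.testmod()` fails, so the documentation can never silently go stale.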
Nik
On 06/04/2013 12:57 PM, Nikolas Everett wrote:
The thing is quite a few of us have seen cases where people bend over backwards for test coverage, sacrificing code quality and writing tests that don't provide any real value.
Probably better expressed than I did.
My point is: clearly test coverage doesn't /produce/ bad code -- but writing *for* test coverage does. Or at least, I've observed a strong correlation between mandated test coverage metrics and code with atrocious factorization and poor conceptual coherency.
Tests are good. Unit testing has valuable uses in a number of cases. Trying to universally shoehorn either into the development process is rarely useful, and often disastrous.
(Often, for instance, coherency of the code is sacrificed atop the altar of separation of concerns for a vacuous gain in "testability" at the expense of clarity).
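A hedged, deliberately exaggerated sketch of the pattern being complained about (all names hypothetical): a one-line computation refactored behind injected abstractions "for testability", at the cost of being able to read the data flow at a glance.

```python
# The direct version: the logic is visible in one line.
def thumbnail_width(full_width, max_width=220):
    return min(full_width, max_width)

# An over-factored version of the same logic, split up "for testability":
class WidthProvider:
    def __init__(self, width):
        self._width = width

    def get(self):
        return self._width

class WidthLimiter:
    def __init__(self, provider, max_width=220):
        self._provider = provider  # injected dependency
        self._max_width = max_width

    def limit(self):
        return min(self._provider.get(), self._max_width)

# Both compute the same value, but in the second the actual data flow is
# hidden behind two classes and an injection point nobody needed.
assert thumbnail_width(800) == 220
assert WidthLimiter(WidthProvider(800)).limit() == 220
```

Whether the second form is ever justified depends on whether the "provider" varies in practice; when it never does, the extra indirection buys only coverage-friendly seams, which is Marc's point.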
-- Marc
Can you give any examples of real code that became less clear after it was rewritten for testability, and explain why it is worse after the rewrite?
On Tue, Jun 4, 2013 at 7:20 PM, Marc A. Pelletier marc@uberbox.org wrote:
On 06/04/2013 12:57 PM, Nikolas Everett wrote:
The thing is quite a few of us have seen cases where people bend over backwards for test coverage, sacrificing code quality and writing tests that don't provide any real value.
Probably better expressed than I did.
My point is: clearly test coverage doesn't /produce/ bad code -- but writing *for* test coverage does. Or at least, I've observed a strong correlation between mandated test coverage metrics and code with atrocious factorization and poor conceptual coherency.
Tests are good. Unit testing has valuable uses in a number of cases. Trying to universally shoehorn either into the development process is rarely useful, and often disastrous.
(Often, for instance, coherency of the code is sacrificed atop the altar of separation of concerns for a vacuous gain in "testability" at the expense of clarity).
-- Marc
Test coverage is not a quality metric, it is a quantity metric: it says something about the amount of tests. Coverage can say something about the overall code only if the code under test in fact reflects the remaining code. Since the code under test is usually better than the remaining code, the actual number of faults in the untested code is usually higher than the coverage figure would suggest.
My €0.05 is to enforce testing wherever possible, possibly by using automatic gatekeepers, and also to add other quality and security metrics to the reports. Err, automated reports, that is another problem...
On 04/06/13 22:29, Jeroen De Dauw wrote:
Hey,
Because of this, I can be fairly confident in recommending that my team avoids the use of TDD.
Clearly you are not a fan of TDD. Which is entirely fine. Whether or not you adopt this practice is a personal choice, and not one that should be forced upon anyone. Like with all practices, effectiveness varies depending on the people and the environment. You are writing tests for most of your code, though, I hope? MediaWiki has a serious lack of tests, and we are paying the cost of that, whether you realize it or not, so be careful with what you recommend.
The studies were comparing writing unit tests before writing code with writing unit tests after writing code. Coverage was not equivalent, but nobody is saying there should be no unit tests.
My position on unit testing is like I said in my original post. I think it is essential for complex code, with decreasing gains as code becomes simpler. So I decide whether or not to write a unit test for a given module based on a cost/benefit analysis.
-- Tim Starling
I generally agree with 2-8, and 10. I think points 2 and 10 are pretty subjective and must be applied very pragmatically.