BDD Redux
November 03, 2024
A long while ago I - Ryan Wilcox - wrote an article for Toptal on Behavior Driven Development (or BDD). I’ve learned a lot in the years since that article was published, so it is time to revisit that entry (which can no longer be found online).
Introduction: What is BDD
Behavior Driven Development, from its own wiki:
BDD aims to bridge the gap between the differing views of computer systems held by Business users and Technologists. … Its focus is on minimizing the hurdles between specification, design, implementation and confirmation of the behavior of a system
BDD can be very effective when a developer collaborates with an Agile product owner or business analyst, along with a QA representative, to jointly create pending tests that the developer will later implement.
The business person specifies the behaviors they want to see in the system. The developer asks questions based on their understanding of the system and adds some behaviors from the development perspective. The QA person is there to note, and ask questions about, edge cases in the functional requirements or acceptance criteria.
Yes, the three amigos! And, in particular, look at QA helping to refine and ensure implementation quality right at the beginning!
Ideally, (and how I attempt to practice BDD) tests created this way should be traceable from a ticket’s acceptance criteria to the changed source code. By that I mean “I should be able to find the words from the acceptance criteria in the ticket in the code differences”.
There will certainly be tests created that are specific to the code in play: tests that are specifically about the acceptance criteria won’t be all of your tests, but they can be some of them.
Implementation of Behavior Driven Development in code bases
A brief talk about Cucumber
Often when people think about BDD they think about test frameworks like Cucumber where you can write tests in supported phrases that look kind of like English (or other human languages):
Scenario: I want to log in
Given I am on the login page
When I log in with a valid user
Then I will be logged in
Each of those lines is a phrase, but the phrases essentially come from a dictionary of commands the computer knows how to execute. A developer (or Software Engineer In Test, if you’re lucky enough to have one) needs to write the phrases and also their implementation: what happens when the computer encounters a specific phrase. A phrase could be very specific, or could be a slight mad-lib: “I log in with a [valid user]”, “I am on [the login page]”, etc.
Something that often catches less-technical people is that these phrases are not English, they are matched expressions: it’s “I am on the login page”, but not “I am at the login page”. The former is a phrase the computer recognizes (it matches a known expression), the latter not.
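To make the "matched expressions" point concrete, here is a minimal sketch of how such a phrase dictionary could work. It is not Cucumber's actual implementation: the `step` decorator, `run_step`, and the two step functions are hypothetical names invented for illustration, and the matching is plain regular expressions.

```python
import re

# Hypothetical phrase dictionary: each known phrase is a regular expression
# paired with the function that runs when a line matches it.
STEPS = []


def step(pattern):
    """Register a function as the implementation of a phrase."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


# A "mad-lib" phrase: the page name is a blank to be filled in.
@step(r"I am on the (?P<page>.+) page")
def visit_page(page):
    return f"visiting {page}"


@step(r"I log in with a (?P<kind>valid|invalid) user")
def log_in(kind):
    return f"logging in as a {kind} user"


def run_step(line):
    """Strip the Given/When/Then keyword, then look for a matching phrase."""
    text = re.sub(r"^(Given|When|Then|And)\s+", "", line.strip())
    for pattern, func in STEPS:
        match = pattern.fullmatch(text)
        if match:
            return func(**match.groupdict())
    raise LookupError(f"no step matches: {text!r}")
```

With this sketch, `run_step("Given I am on the login page")` finds a phrase and runs it, while `run_step("Given I am at the login page")` raises `LookupError`: one changed word and the computer no longer recognizes the sentence.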
You may be familiar with languages that look like, but aren’t exactly, English. If not, I present AppleScript, a language with a series of problems, one of which is this very thing: human languages are very flexible and offer a bunch of different ways to say the same thing. In programming, usually not so much.
With that background stated, I’m sour on Cucumber. One reason is that as a developer I need to implement those phrases (fine), and - because of that “not as flexible as human language” characteristic - less technical people will often get frustrated at needing to get things exact, then toss half-done tests my way. Sure, with Cucumber I’m writing tests at a slightly higher level of abstraction, but in a way that’s more work, as I have to write all the layers underneath too. This is a bit of ceremony I very much dislike.
The second reason is, Cucumber tests are often UI driven tests, high up on the test pyramid: these are slow to execute and (unless you are very clever) tend to be brittle as the UI changes underneath them (re-themes, new features, etc).
Now, Cucumber tests don’t have to be UI focused, but almost always are. In fact, complex behaviors aren’t just about things we can trigger in the UI: What about a bill that goes out automatically? No UI there, but still behaviors we need to describe and test.
I’ve been on maybe ten different projects that have used Cucumber, and nine have managed to get themselves into a pickle with it: UI based tests that are brittle (in spite of best practices from the Cucumber community), never updated when “that one Cucumber loving developer” left the company, or eventually deemed too slow to execute or write.
BDD without Ceremony
Thankfully, more and more test frameworks support descriptions for tests: instead of being limited to valid function names in (whatever language you’re using), you can use descriptions which can then be printed as part of the test output.
JUnit 5 introduces this as @DisplayName. Example:
@Test
@DisplayName("Bills on the 30th of the month")
void test() {
    // TODO: FILL OUT THIS STUB
}
Or, in Ruby’s Minitest:
it "Bills on the 30th day of the month" do
  # TODO: FILL OUT THIS STUB
end
While there’s some “weird programming punctuation” in those examples (especially the Java version) it’s easy to pick out the acceptance criteria signal in the noise.
In Python it’s not as nice, but doable:
def test_my_function(self):
    """It bills on the 30th day of the month"""
    # TODO: FILL OUT THIS STUB
If I sit down with my three amigos and start writing this test, someone (ideally QA) is thinking about what additional edge cases they’d like the developer to implement, and what edge cases they want to explore in more manual tests.
For example, in our billing example, what about February? Did the product owner write “the 30th” in the Acceptance Criteria, and we need to go back for clarification? Or did they really mean “the last day of the month”, and the developer made an assumption?
Maybe we go back to the product owner, have a discussion, and end up adding two more tests:
it "Bills on the 28th of the month in February" do
...
end
it "Bills on the 29th of February in leap years" do
...
end
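Once the stubs are filled in, the rule those three test descriptions pin down might look something like the sketch below. It assumes the product owner confirmed "the 30th, or the last day of the month when the month is shorter". The function name `billing_date` and its `preferred_day` parameter are hypothetical, made up for this illustration.

```python
import calendar
from datetime import date


def billing_date(year, month, preferred_day=30):
    """Return the billing date: the preferred day, or the month's last day if it is shorter."""
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(preferred_day, last_day))
```

Each acceptance criterion maps to one call: `billing_date(2024, 11)` for the 30th, `billing_date(2023, 2)` for the 28th in February, and `billing_date(2024, 2)` for the 29th in a leap year.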
Note that all of these examples are stubs with “FILL OUT THIS STUB” (or an equivalent placeholder) in them. The amigos get together to write down the work to be done, but this isn’t a pair programming session: write as little structure as it takes to compile.
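One wrinkle with empty stubs is that many test runners report them as passing. As a sketch of one way around that in Python's built-in unittest (my choice for illustration, not a requirement of BDD), each stub can skip itself so it shows up as pending work. The class name, method names, and the `run_pending` helper are hypothetical.

```python
import unittest


class BillingBehavior(unittest.TestCase):
    def test_bills_on_the_30th(self):
        """It bills on the 30th day of the month"""
        self.skipTest("pending")  # TODO: FILL OUT THIS STUB

    def test_bills_on_the_28th_in_february(self):
        """It bills on the 28th of the month in February"""
        self.skipTest("pending")  # TODO: FILL OUT THIS STUB

    def test_bills_on_the_29th_in_leap_years(self):
        """It bills on the 29th of February in leap years"""
        self.skipTest("pending")  # TODO: FILL OUT THIS STUB


def run_pending():
    """Run the stubs and return the result so pending work can be counted."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BillingBehavior)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

After `run_pending()`, `result.skipped` holds one entry per pending behavior, and `test.shortDescription()` returns the docstring's first line, which is the acceptance criteria wording. Running the same class with `python -m unittest -v` should show those descriptions in the output, marked as skipped.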
Collaboration, Artifacts and hope for a more efficient development cycle
Ideally the output of such collaboration is better acceptance criteria of the functionality in question, and more complete tests of the same. If we are very very lucky it may also mean better understanding of the product requirement under design.
To frame this in terms of software artifacts and logistics, we can drive a unified understanding of the functionality across four artifacts: the Product Description / Refinement, the Acceptance Criteria, the Tests, and the Integration Approval.
The Product Description / Refinement can influence - and be influenced by - the creation of Acceptance Criteria. Acceptance Criteria influence - and can be influenced by - Tests. When we are checking things for Integration Approval we should know that our acceptance criteria are implemented (because they were part of the automated tests we wrote together, while we were describing the behaviors of the system together!).
Conclusion
Behavior Driven Development enables a bunch of collaboration work to happen before lots of code is written: it’s much easier to change English than it is to change a bunch of code!
The expected behaviors of the functionality become a form of shared capital. They should be as important as the machine instructions of the final running program to all members of the team: development (it gives them a blueprint), QA (they have helped ensure quality by identifying edge cases in the automated test cases), and Product (they can walk away knowing that the team’s understanding of the functionality is correct, or have a better understanding themselves of the business domain in question).
At the end of the day, you have automated tests that implement your acceptance criteria AND (most importantly!) you can audit that they have been automated with a simple Find in the code differences! A common, shared understanding between product and development!