Ipeleng Molete
The Home of Ipelatech's Blog

The Shepherd System

How to write Evolvable Code

Ipeleng Molete
·Nov 22, 2021·

Table of contents

  • Introduction
  • TLDR;
  • Background
  • The System
  • Naming Conventions
  • Testing
  • Conclusion
  • References

Introduction

I've struggled to write good code for most of my development career. The pain begins at even defining what good code is! And what are the step-by-step guidelines for doing so, once I've decided on a project name?

My quest to answer this burning question was ignited by a talk I watched, Get A Whiff Of This by Sandi Metz. This started me on a journey of reading articles and tutorials, listening to podcasts, watching talks and interviews and immersing myself in the topic. I was looking for patterns that answered questions like, "What does good code look like?", "How do I write good code?", "How do I test it?". I then started experimenting, taking a little from here, a little from there, trying to mould a methodology I could easily follow but that would get me the best results.

As the owner of The Simplest Brain in the World™, I like my methodologies and processes simple and easily repeatable with little logic. For this particular domain, coding, I don't want to agonise over whether the code is testable or not, I need to push out a feature! I don't want to have to put breakpoints over every step when I'm debugging because I can't follow the execution path. I want my paths to be as linear and logical as possible.

This process has led me to create Shepherd. Or, its long name, The Shepherd System for Writing Evolvable Code. It's the result of trial and error, refinement and re-refinement, swearing, multiple rewrites (the number is far too embarrassing to mention in public), small breakthroughs, bigger breakthroughs and finally - something I can use.

The purpose of this article is to introduce the main concepts, as well as provide a little extra background to show my thinking processes. It's also a nice way to store it for posterity. The System is really just one rule, but following it, like many things in life, has consequences (not all are bad), so after that, there are tips and guidelines on how to deal with those consequences. Over time, I'll be writing tutorial-style articles that provide further creature comforts to help implement the System. They aren't wholly necessary, or even specific to the System: you can use them in your daily coding whether or not you're following Shepherd. Enjoy.

TLDR;

  • Separate Data and Behaviour

Classes should only have methods (ideally one public, and as many private helpers as you need) or variables, not both. Method-only classes I call "Handlers"; variable-only classes I call "Data Classes". The latter are also better known as Plain Old x Objects (POxO) or Data Transfer Objects (DTOs).

  • Don't use Constructors

Construct your object in a method.

  • Use a sensible naming convention

Whatever it is, make sure it's consistent. If you can't find a file you're looking for in under 30 seconds, your naming convention is inefficient!

  • Use BDD scenarios for end-to-end tests, unit and integration tests are done with your unit testing framework

Background

Shepherd is a set of defensive programming practices designed to write Evolvable Code. Now, before we can delve into what that means, I think we should first explore what it means to write good code, because I think evolvable code is good code.

I've come across many a forum or thread where it's suggested that good code is testable code. I mean this in the sense that, if you make your code testable by design, then it is good code. It's true to an extent, but in my personal experience, testable code often means code with well-managed dependencies, often coming out like this (this is, of course, ignoring other nuances for the sake of brevity and simplicity)

$dependency = new Dependency;

class Client{
    public function constructor_or_other_method(Dependency $dependency) {}
}

$client = new Client($dependency);

as opposed to this

class Client {
    public function constructor_or_other_method() {
        $this->dependency = new Dependency;
    }
}

for code where the dependency will change.
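To make the testability point concrete, here is a minimal sketch of the first form under stated assumptions: the Dependency interface, the FakeDependency stand-in and the run method are hypothetical names chosen for illustration, not part of any real library.

```php
<?php
// All names here (Dependency, FakeDependency, Client::run) are illustrative.

interface Dependency {
    public function fetch(): string;
}

// Stand-in used only in tests, so no real service is touched.
class FakeDependency implements Dependency {
    public function fetch(): string {
        return "fake data";
    }
}

class Client {
    // The dependency is handed in, so the caller decides which one is used.
    public function run(Dependency $dependency): string {
        return strtoupper($dependency->fetch());
    }
}

$client = new Client;
$result = $client->run(new FakeDependency);
echo $result, PHP_EOL; // FAKE DATA
```

Because the Client never creates its own Dependency, a test can swap in the fake without touching the Client at all. In the second form, the only way to change the dependency is to edit the class.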

But it rarely says much about what the code inside the actual methods or classes should look like. Functional Programmers will say it should not have side effects, Object Oriented programmers will want to see objects and messages. All valid points, however I think good code should be testable and refactorable by design, not just testable. It's a subtle distinction, but I'll explain.

If you write testable code, it should be refactorable. This implies you can choose whether your code will be easy to refactor or not. I'm arguing that good code should be testable and refactorable. You write it from the outset with the explicit intent that you will need to test it and you may need to change it in future. You still have to balance that with You Ain't Gonna Need It (YAGNI), but you can still provide yourself little breakpoints or escape points, where you can make changes ideally without disturbing other pieces of the codebase. Then you get all the other juicy benefits we like to throw around, Separation of Concerns, loose coupling, etc...

Which brings us to the definition of Evolvable Code. Evolvable Code is code that is:

  • testable, and
  • refactorable

I don't claim that Shepherd is an original idea, it's a collection of well-known practices put together in a (hopefully) original way, designed to produce evolvable code. Over the last year, I've devoted a lot of time to answering the question, "What is good code?". For me, code written by following Shepherd is good code. Note I say good, not necessarily pretty or optimised! It's good because it always allows me to prettify and optimise it, I never get stuck in dead ends.

I also like to look at Shepherd as a "configuration" of concepts. If you take x,y,z concept and put them together in a certain way, it's Shepherd. If you swap out y for n, it's not Shepherd. But that's ok because that configuration solves a different use- or edge-case.

The System

At its core, Shepherd is actually one rule, with a set of trade-offs and ways to manage those trade-offs. It's like in soccer, if you're going to press high you need a high defensive line, if you're going to have a high defensive line, you need a sweeper keeper, and so on. So without further ado - the rule.

Separate Data and Behaviour

  • The what

Have classes that contain only variables (think "pure" DTO) and classes that only contain functions. I call the former Data Classes and the latter, Handlers. That's it! That's Shepherd. Really. This rule is heavily inspired by the Entity Component System model used in the gaming industry. But since we're not all processing thousands of objects in realtime and in memory, I had to come up with a way of reaping the same benefits but in a way applicable to general programming.

  • What it looks like

Let's say I'm making an Order Processing system. I'm going to have a thing called an Order and some operations I need to perform on that Order, let's say I'll need to Open, Process, then Close. So I'm going to do this:

//Order.php
class Order {
    public DateTime $date;
    public array $line_items;
    public int $total;
}

//OrderOpener.php
class OrderOpener {
    public [static] function open(Order $order) : Order {
        //
    }

    //any private methods needed
}

//OrderProcessor.php
class OrderProcessor {
    public [static] function process(Order $order) : Order {
        //
    }

    //any private methods needed
}

//OrderCloser.php
class OrderCloser {
    public [static] function close(Order $order) : Order {
        //
    }

    //any private methods needed
}

//and if the situation allows
//OrderManager.php
class OrderManager {
    public [static] function manage(Order $order) {
        $open_order = OrderOpener::open($order);
        $processed_order = OrderProcessor::process($open_order);
        return OrderCloser::close($processed_order);
    }
}

I put static in brackets there because I like to make the methods static but it's not a strict requirement.
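For a version that actually runs, here is a minimal sketch of the same pipeline with the methods made static. The method bodies are assumptions made purely for illustration: the status field, the name-to-price shape of line_items, and what each step does are hypothetical and not part of the original example.

```php
<?php
// Data Class: variables only. The $status field is an assumption added
// for this sketch; the Order in the article has no such field.
class Order {
    public array $line_items = [];   // hypothetical shape: item name => price in cents
    public int $total = 0;
    public string $status = "new";
}

// Handlers: one public static method each, no state of their own.
class OrderOpener {
    public static function open(Order $order): Order {
        $opened = clone $order;      // leave the caller's object untouched
        $opened->status = "open";
        return $opened;
    }
}

class OrderProcessor {
    public static function process(Order $order): Order {
        $processed = clone $order;
        $processed->total = array_sum($order->line_items);
        $processed->status = "processed";
        return $processed;
    }
}

class OrderCloser {
    public static function close(Order $order): Order {
        $closed = clone $order;
        $closed->status = "closed";
        return $closed;
    }
}

class OrderManager {
    public static function manage(Order $order): Order {
        $open_order = OrderOpener::open($order);
        $processed_order = OrderProcessor::process($open_order);
        return OrderCloser::close($processed_order);
    }
}

$order = new Order;
$order->line_items = ["Item1" => 20000, "Item2" => 30000];

$closed_order = OrderManager::manage($order);
echo $closed_order->status, " ", $closed_order->total, PHP_EOL; // closed 50000
echo $order->status, PHP_EOL;                                   // new
```

Each Handler works on a clone and returns it, so the original Order is never modified. That is one way to get the "no side effects" property mentioned above; mutating and returning the same object would also fit the rule, at the cost of that property.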

  • Why?

Evolvable Code. The above example, though quite trivial and contrived, illustrates what my code looks like. I get:

  • four very focussed classes
  • three classes I can test easily with the added benefits that the concept of pure functions from Functional Programming bring
  • one I can mock in tests with minimal yak-shaving, my tests would look like:
$order = new Order;
$order->date = new DateTime();
$order->line_items = ["Item1", "Item2"];
$order->total = 50000;

//OrderOpenerTest.php
$open_order = OrderOpener::open($order);

//OrderProcessorTest.php
$processed_order = OrderProcessor::process($order);

//sometimes, I may have to do this in the test, but it's not very painful
$open_order = OrderOpener::open($order);
$processed_order = OrderProcessor::process($open_order);
  • a neat and clean API (one method for a class, with an easy-to-follow Naming Convention)
  • OPTIONS! I can extend any of these classes horizontally and vertically without disturbing other parts of the codebase and a lot of times, the original code itself

Say a requirement comes in, we have 2 types of Orders, ones with and without discounts. Ok, Data-side, I can do this:

class OrderDiscounted extends Order {
    public int $discount;
}

//or

trait HasDiscount {
    public int $discount;
}

Handler-side, I can do this:

interface IOrderOpener {
    function open(Order $order);
}

class OrderNormalOpener implements IOrderOpener {}
class OrderDiscountedOpener implements IOrderOpener {}

//etc...

or, in a language that supports method overloading, like C#

public class OrderOpener {
    public [static] Order open(OrderNormal order) {}
    public [static] Order open(OrderDiscounted order) {}
}

I can evolve and change the code sometimes without even having to touch the original code or I'll have to modify it only structurally without worrying that I've broken anything. If requirements change the way a Normal Order is opened, then I only have one place to change it, and it's not going to mess with the logic for other Order types. This is unlike if I just went to the original OrderOpener and threw an "if" in there. Funny enough, though, under Shepherd, it's still a legitimate tactic, as long as you understand how that code will evolve in future, if it needs to. Evolvability deserves an article on its own, which I'll get around to, but even with this simple example, you can already see the possibilities are endless!
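A filled-in sketch of the interface route might look like the following. What "opening" a discounted order actually does is an assumption here (the discount is subtracted from the total), and OrderOpenerSelector is a hypothetical helper added only to show how a caller could pick the right Handler.

```php
<?php
class Order {
    public int $total = 0;
}

class OrderDiscounted extends Order {
    public int $discount = 0;
}

interface IOrderOpener {
    public function open(Order $order): Order;
}

class OrderNormalOpener implements IOrderOpener {
    public function open(Order $order): Order {
        return clone $order;
    }
}

class OrderDiscountedOpener implements IOrderOpener {
    // Assumed behaviour for illustration: apply the discount when opening.
    public function open(Order $order): Order {
        $opened = clone $order;
        if ($opened instanceof OrderDiscounted) {
            $opened->total = max(0, $opened->total - $opened->discount);
        }
        return $opened;
    }
}

// Hypothetical helper: chooses a Handler based on the type of the Data Class.
class OrderOpenerSelector {
    public static function select(Order $order): IOrderOpener {
        return $order instanceof OrderDiscounted
            ? new OrderDiscountedOpener
            : new OrderNormalOpener;
    }
}

$discounted = new OrderDiscounted;
$discounted->total = 50000;
$discounted->discount = 5000;

$normal = new Order;
$normal->total = 50000;

$opened_discounted = OrderOpenerSelector::select($discounted)->open($discounted);
$opened_normal = OrderOpenerSelector::select($normal)->open($normal);

echo $opened_discounted->total, PHP_EOL; // 45000
echo $opened_normal->total, PHP_EOL;     // 50000
```

If the rules for normal orders change, only OrderNormalOpener is touched. The discounted path and its tests stay as they are.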

In reality, I actually end up with relatively few Data classes but I'm heavy on the Handlers. They are my Service Layer, where all the Business Logic lives.

  • Cons

Lots of files in the Handler folder. We'll deal a little with that tradeoff later under Naming Conventions and how you can make life easier by adopting a sensible one, but I'd say I prefer to have 10 files with 10 lines of code each than one with 100 lines - to break it down simply. And even better if those 10 lines have no side effects.

  • Recipe
  1. take a class, cut the methods out and paste each into its own separate class - this includes constructors! We'll deal with those just now.
  2. use the data class as an argument, or if you need a subset of the data in the class, overload the relevant method or even better, just make it its own class
  3. test each class on its own
  4. profit
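As a rough illustration of steps 1 to 3 of the recipe, here is a before-and-after sketch. Cart, CartItemAdder and CartTotalCalculator are hypothetical names that follow the noun-then-operation pattern; they are not taken from a real project.

```php
<?php
// Before: data and behaviour live in one class (kept only for comparison).
class CartBefore {
    public array $prices = [];

    public function add(int $price): void {
        $this->prices[] = $price;
    }

    public function total(): int {
        return array_sum($this->prices);
    }
}

// After: the Data Class keeps only the variables...
class Cart {
    public array $prices = [];
}

// ...and each method becomes its own Handler that takes the Data Class
// as an argument.
class CartItemAdder {
    public static function add(Cart $cart, int $price): Cart {
        $updated = clone $cart;
        $updated->prices[] = $price;
        return $updated;
    }
}

class CartTotalCalculator {
    public static function calculate(Cart $cart): int {
        return array_sum($cart->prices);
    }
}

// Each Handler can now be exercised on its own.
$cart = CartItemAdder::add(new Cart, 1500);
$cart = CartItemAdder::add($cart, 2500);
$total = CartTotalCalculator::calculate($cart);
echo $total, PHP_EOL; // 4000
```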

Now we deal with the trade-offs.

No constructors

  • The What

Constructors are a special breed of menace in my books. I don't like them... intensely! In my experience, they create the potential for unnecessary pain and suffering. Where I can, I aggressively avoid them. Here's one reason, in C#:

class Example {

    public Example() {
        //get stuff from db
    }

    public void do_stuff() {}
}

How do I use this somewhere when I don't need the stuff from the db? Hmm:

public Example(bool get_stuff_from_db = false) {
    if (get_stuff_from_db) {
        //get stuff from db
    }
}

Ok, but now I get this:

var example = new Example(true); //what am I trueing?

Here's another one.

class Example {
   public Example() {

   } 

   public Example(string something) {

   }
}

Business says the string is optional... oh...

class Example {
   public Example() {

   } 

   public Example(string something = "") {

   }
}

Crap! What does new Example() call? The pain levels ramp up when you add more overloads, more arguments and turn that up even more if the constructors are calling each other!

  • What it looks like

Splitting each method, including constructors, into their own classes completely erases this class of issues. Even if you've never had this issue, well, you can extend that run by killing the possibility of it ever becoming an issue by killing the constructor.

  • Why?

Pain avoidance

  • Cons

I'm so biased on this topic, I'm not even going to think of any. Maybe you belong to the Society Of Constructor Dependency Injection and if you do this you'll lose your membership, and now you have to choose between being happy and being a member. * shrugs shoulders * I'm not sorry for making you choose.

I'll deal more with Dependency Injection in another article.

  • Recipe
  1. cut them out of your class
  2. put each into its own class, rename it and make it return an instance of your desired class
  3. profit
  4. (optional) cut up SOCDI membership card
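Translated to PHP, the recipe could be sketched like this. ExampleCreator and ExampleFromRowsCreator are hypothetical names, and the $rows argument stands in for a real database query, which is outside the scope of the sketch.

```php
<?php
// Data Class: no constructor, only defaults.
class Example {
    public array $stuff = [];
}

// Handler that builds an empty Example; nothing is loaded.
class ExampleCreator {
    public static function create(): Example {
        return new Example;
    }
}

// Handler that builds an Example from stored rows. The $rows argument
// is a stand-in for whatever a real database call would return.
class ExampleFromRowsCreator {
    public static function create(array $rows): Example {
        $example = new Example;
        $example->stuff = $rows;
        return $example;
    }
}

$empty = ExampleCreator::create();
$loaded = ExampleFromRowsCreator::create(["a", "b"]);
echo count($empty->stuff), " ", count($loaded->stuff), PHP_EOL; // 0 2
```

The call site now names what is being built, so there is no boolean flag to decode and no overload to guess at.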

Naming Conventions

  • The What

Ranking in on the Top 2 of Hardest Things to do In Computer Science, naming things is hard when you're programming. I guess the best advice here is have a scheme and be consistent with it.

  • What it looks like

Classes

The requirements will give you all the domain names, so there's not much to add here. For my Handler and Helper classes, I tried to come up with some kind of nice formula, but it doesn't scale in any direction, so this is going to be example only.

In Laravel, we have a Models folder and in it, there's a model called User. Say now I have to create a Role class. Here's my solution:

//Models folder
class UserRole {}

They need permissions?

//Models folder
class UserPermission {}

Requirements say we need to be able to create them:

//Handlers folder
class UserRoleCreator {}
class UserPermissionCreator {}

and delete?

//Handlers folder
class UserRoleDeleter {}
class UserPermissionDeleter {}

Here's another, say I need to make a helper class to check if an array is empty.

//Helpers folder
class EmptyArrayChecker {}

Though, sometimes something like this may happen:

//Helpers folder
class EmptyArrayChecker {}
class AssociativeArrayChecker {}

In that case, I whip out Find and Replace, because I like similar things grouped together by prefix, so the names here, out of necessity in my mind, would change to:

//Helpers folder
class ArrayEmptyChecker {}
class ArrayAssociativeChecker {}

Ideally, the names will start with the noun of the thing they're operating on, then followed by the operation.

Method Names

They often write themselves if you use the above class-naming scheme:

class UserRoleCreator {
    public function create() {}
}

class ArrayEmptyChecker {
    public function check() {}
}

Methods that return booleans read like a question:


function is_hidden() {}
function should_delete() {}
function will_change_if_input_is_a_positive_number() {}

I'm not terse with names, I'd rather have a long descriptive method name than try to "save paper" by shortening it, then have to play detective and decode it every time I need to use it.

1001 days out of a 1000, I'll take:

function is_associative_array(){}

over

function is_assoc(){}

Variables

This works similarly to method names. Boolean variables will read like a question. Variables that represent a class will usually have the class name, so:

$user = new User;
$happy_user = new UserHappy;

//or
$system = new UserSortingAndRankingSystem;

And again, for the most part, I don't abbreviate. I think the only things I ever abbreviate are "res" for result and "args" for arguments, otherwise, I try to be as descriptive as possible.

On a related note, with local variables - abuse the crap out of them! Just like methods provide a convenient place to cut and move code to different classes, local variables create a nice, what I call Fowler-esque (read the book or watch Sandi's talk, both in the references), place to cut something and move it to a method. I like to think of creating local variables and methods as drawing a dotted line where I can cut later, so I do this as often as I can because each line is another option. Sometimes I end up using them, sometimes I end up removing them, but at least I gave Future Me the option to make the correct decision when it needed to be made, once things were clearer. A small trivial example, instead of:

class Example {

    public function method_one($user) {
        if ($user->age >= 18) {
            //
        }
    }
}

I'll do this:

class Example {
    public function method_one($user){
        $is_over_age = $user->age >= 18;

        if ($is_over_age) {
            //
        }
    }
}

So it's easier to see when something like this happens:

class Example {
    public function method_one($user){
        $is_over_age = $user->age >= 18;
    }

    public function method_two($user){
        $is_over_age = $user->age >= 18;
    }
}

because we're starting to see some explicit repetition now that we've named the concept of checking the age, instead of hiding it in the if statement and potentially missing it. After noticing that, we can do a nice little refactor

class Example {
    public function method_one($user) {
        $is_over_age = $this->is_over_age($user);
    }
    public function method_two($user) {
        $is_over_age = $this->is_over_age($user);
    }
    public function is_over_age($user) {
        return $user->age >= 18;
    }
}

and the final, evolved form, if it ever needed to come to this:

class Example {
    public function method_one($user) {
        // the check now lives in its own Handler, so call that instead of $this
        $is_over_age = (new UserOverAgeChecker)->check($user);
    }
    public function method_two($user) {
        $is_over_age = (new UserOverAgeChecker)->check($user);
    }
}

class UserOverAgeChecker {
    public function check($user) {
        return $user->age >= 18;
    }
}

It fits in nicely with keeping options open and following YAGNI. I never do more than I have to, and if I need to, I can always improve it later - and that's probably the property I most appreciate.

  • Why?

Admin/filing/file system reasons (for classes). Everything related to Users is in and around the User file. This makes searching a little easier. Plus, a lot of the time, I can predictably guess a name if I need to find something. Requirements say we need to change the way we create a user. Ok, where would I do that? Ah, UserCreator. New requirement says there's a Free User and Paid User - File > New > UserFreeCreator, File > New > UserPaidCreator.

Other than that, I like my code to read English-like because it has the basic structure of we're doing something to a something.

$user = UserHappyRoleCreator::create();

almost reads like a sentence, it tells you everything you need to know about the comings and goings on of what you're looking at, and it makes the code somewhat self-documenting. On top of that, if anything goes wrong, I know which model to go to if I need to investigate. In that way, everything aligns. Everything in the code base that has to do with a UserHappyRole, whether it's a model, controller, event, enum, listener, email, is going to start with UserHappyRole and it'll tell me exactly what it's doing with it.

Other special cases follow on from what I said above, I'll tend to leave the model just like that, UserHappyRole, then I append suffixes for:

Controller => UserHappyRoleController
Enum => UserHappyRoleEnum
//or
UserHappyRoleRegistrationStateEnum
Handlers => common suffixes are Creator, Updater, Deleter,  Manager, Processor 
# (one can be fussy and still append 'Handler', 
# but my namespace does that for me, 
# plus it's annoying to type)
Event => UserHappyRolePaidSubscriptionEvent

You get the idea. I also don't mind reusing names, and letting namespaces sort out which is which.

  • Cons

Sometimes you get really weird looking or long names.

  • Recipe
  1. choose a sensible naming convention
  2. be consistent with it
  3. profit

Testing

  • The What

Testing is related to the actual process of evolving code and also deserves its own whole article, but here I'll just explore some high-level ideas I have about testing. I'll start with the overall process/methods.

BDD

I absolutely love this method because done right, my clients write the software, I just write the code. I use the Gherkin syntax/method of writing requirements. I've found even the most non-technical of non-technical clients take to it. A lot of times, it really helps make their ideas clearer. Turning onto this method also made me realise that testing actually starts at the Requirements Gathering phase. And the most important output of that phase is the Assertion - the "Then" part of Gherkin tests. Why? These are the things the client actually wants! Let's look at this real-life scenario:

Given a Buyer wants to buy
And a Seller wants to sell
When the Seller sells to the Buyer
Then the Buyer receives the intended product
And the Seller receives the intended amount

Where in this process are the two parties actually happy? Right at the end. The true nature of their intentions is revealed, like a good movie, right at the end. No matter what your client says about anything else, what they really want is in the assertions - so get those nailed! A good way to think about them is to make them provable. Can you reasonably prove something happened? For example:

Then the System should print a receipt 

#vs

Then the Transaction is closed

The first is very provable, and you can use it all the way down to your unit tests. Closed can mean anything. With the second, when you present to the client, they're expecting to see a receipt and a dancing monkey, yet you thought it was just a state change and you only changed a field in the database. If this particular transaction needs the receipt with the red logo printed, then put it in there. If there's one place to be disgustingly pedantic throughout the lifecycle of any software project, it's with the BDD Assertions!
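To show how a provable assertion can carry down to a unit-level check, here is a small sketch. Transaction, TransactionReceiptPrinter and the receipt layout are all hypothetical; the only point is that "prints a receipt" gives a test something concrete to inspect.

```php
<?php
// Hypothetical Data Class for the sketch.
class Transaction {
    public int $total = 0;
}

// Hypothetical Handler. It returns the receipt text instead of echoing it,
// so a test can look at exactly what the client would see.
class TransactionReceiptPrinter {
    public static function print(Transaction $transaction): string {
        return sprintf("RECEIPT\nTotal: %d", $transaction->total);
    }
}

$transaction = new Transaction;
$transaction->total = 50000;

$receipt = TransactionReceiptPrinter::print($transaction);

// "Then the System should print a receipt" becomes a direct check:
$receipt_shows_total = strpos($receipt, "Total: 50000") !== false;
echo $receipt_shows_total ? "receipt ok" : "receipt missing total", PHP_EOL;
```

"Then the Transaction is closed" has no equivalent check until someone decides what closed means, which is exactly the conversation to have with the client up front.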

Next is where to use them on the Testing Triangle, before you read on, take a gues... hint: End-to-end tests.

Did you answer end-to-end tests? Oh wow, that's cool, we think alike then. For no more complicated reason than that's the part you and the client spoke about, that's the only part of the system they'll actually see, and end-to-end tests test exactly that. The rest, integration and unit tests, are done with the unit testing framework.

We mock data. We mock output from third party libraries - and change the tests when that output changes. And we mock as much as humanly possible. We don't mock the DB, or at least, I don't, I just use an in-memory SQLite DB. And our assertions support the initial assertions made in the BDD requirements.
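A minimal sketch of the in-memory SQLite approach, assuming PHP's PDO SQLite driver is available. The users table, the UserCreator body and the sample name are hypothetical; a real project would normally run its own migrations against the in-memory database.

```php
<?php
// Hypothetical Data Class and Handler for the sketch.
class User {
    public string $name = "";
}

class UserCreator {
    public static function create(PDO $db, string $name): User {
        $statement = $db->prepare("INSERT INTO users (name) VALUES (:name)");
        $statement->execute([":name" => $name]);

        $user = new User;
        $user->name = $name;
        return $user;
    }
}

// A real database engine, but it lives in memory and disappears when the
// connection is closed, so every test run starts clean.
$db = new PDO("sqlite::memory:");
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)");

$user = UserCreator::create($db, "Test User");
$count = (int) $db->query("SELECT COUNT(*) FROM users")->fetchColumn();

echo $user->name, " ", $count, PHP_EOL; // Test User 1
```

The Handler takes the connection as an argument, so the same code runs against the production database or the throwaway one without any change.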

And lastly, do we write tests before, during or after? My answer to that is, yes. It doesn't really matter, as long as they're there before you deploy to staging. I have a preference for working TDD-style as often as I can.

I think the most romantic version of this would be where if you follow the tests in a certain order, they read like the long-form version of the BDD spec.

  • What it looks like

Something like this:

//spec
Given a User exists
When they click Log Out
Then the System should log them out
//tests
//Test User Creator
public function test_can_create();
//Test Session Creator
public function test_can_create();
//Test Session Deleter
public function test_can_delete();

P.S. you also just learned the naming conventions I use for my tests. If I need extra descriptions, you'll often see something like:

public function test_can_create_with_just_name();
public function test_can_create_with_name_and_email();
public function test_throws_error_cannot_create_with_bad_credit_score();
  • Why?

I once saw a talk where the speaker said good tests are like a Save Point in a game, as long as they're passing, you know even if you're not going forward, at least you're not going backwards. Another article I read provided another bombshell - something along the lines of "If you write code with tests, once you're done coding, you're done; if you don't write code with tests, once you're done coding, you've only just started."

I have legacy code out there I dread coming back to because there are no/few/crap tests. I speak a bit about this in my Speed of Dev article but this is partly why, nowadays, I don't brag about how quickly I can write code, but how accurately. It takes a little while longer, but I do it gladly knowing I could be writing code I'll never have to look at again, no matter how ugly or pretty it comes out.

  • Cons

You're coding twice and maintaining two codebases. You can't get around that, so just pretend it's all one code base, the tests are an extension of your code, as opposed to an imposition or something else to maintain.

Development time is a little longer. Actually, using Shepherd makes your development time a little longer, since there is a bit extra you will do, like making many files instead of one, so you'll never develop at the fastest pace you can. But maybe that's also a good thing because you'll know when you log off you may never have to look at that code again, even if (or especially if!) it took you 50% longer to do it than you normally would. And that's a powerful feeling. If there's one thing I'm trying to do less of, it's going back to look at code I wrote at breakneck speed. The best way to do that is to not write it that way in the first place!

  • Recipe
  1. do your requirements gathering properly, you and the client need to be very clear on what you're trying to achieve, especially in the Assertions
  2. base your end-to-end tests on the gherkin scenarios
  3. use your unit testing framework to do unit and integration tests
  4. mock fairly aggressively
  5. profit, and never look back

Conclusion

And there it is - the entire body of knowledge on the Shepherd System! From here you should be able to be fairly productive with little further guidance. There are tips and tricks that I've compiled over the time I've been using Shepherd, but they're more creature comforts than any more rules. These will be turned into articles, along with anything new I discover along the way.

Overall, the process of developing Shepherd has provided me with many great insights on the superhuman feat we perform called coding. It's a genuine talent to crank out really good code. I'm not even sure I possess it in huge amounts, but I've found that sticking as closely to this one rule and these little tips as often as I can has levelled the playing field somewhat for me, going by what I've seen people say is good code. I have code I can easily test, easily refactor and extend, and that doesn't keep me up at night, except maybe the prospect of writing more good code. So, give it a try, but use common sense where it dictates, this isn't a one-size-fits-all approach, but it should fit most.

Thanks for dropping in. Le saleng hanthle!

References

These are the significant videos/articles/books (in no particular order) I feel I gained the most insight from while creating Shepherd.

Background vector created by rawpixel.com - www.freepik.com

General Coding

Testing

Entity Component Systems

Data Oriented Design
