5  Implementation

Now that you have a plan, it’s finally time to get started with the implementation.

From Design to Code: Fill in the Blanks

Armed with your design from the last chapter, you can now translate your sketch into a code skeleton. Start by outlining the functions, place calls to them where needed, and add comments for any steps you’ll figure out later. For example, the design from Figure X could result in a draft along these lines (shown here as a hypothetical skeleton for a generic analysis pipeline; the details will follow from your own design):
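
# imports (e.g., numpy) go here

def load_data(path):
    # TODO: read the raw measurements from disk
    ...

def preprocess(data):
    # TODO: remove outliers and normalize, as sketched in the design
    ...

def fit_model(data):
    # TODO: model choice still open, see design notes
    ...

def plot_results(model, data):
    # TODO: which plots do we need for the paper?
    ...

def main():
    data = preprocess(load_data("data.csv"))
    model = fit_model(data)
    plot_results(model, data)

if __name__ == "__main__":
    main()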

Order of functions

Your script likely includes multiple functions, so you’ll need to decide on their order from top to bottom. Since scripts typically start with imports (e.g., of libraries like numpy) and end with a main function, I prefer to put general functions that rely only on external dependencies at the top (i.e., those at the lower levels of abstraction in your call hierarchy). This ensures that, as you read the script from top to bottom, each function depends only on what was defined before it. Maintaining this order avoids circular dependencies and encourages you to write reusable, modular functions that serve as building blocks for the code that follows.

Once your skeleton stands, you “only” need to fill in the details, which is a lot less intimidating than facing a blank page. Plus, since you started with a thoughtful design, your final program is more likely to be well-structured and easy to understand. Compare this to writing code on the fly, where decisions about functions are often made haphazardly—you’ll appreciate the difference.

Using AI Code Generators

AI assistants like ChatGPT or GitHub Copilot can be helpful tools when writing code, especially at the level of individual functions. However, remember that these tools only reproduce patterns from their training data, which includes both good and bad code. As a result, the code they generate may not always be optimal. For instance, they might use inefficient for-loops instead of more elegant matrix operations. Similarly, support for less popular programming languages may be subpar.
To get better results, consider crafting prompts like: “You are a senior Python developer with 10 years of experience writing efficient, edge-case-aware code. Write a function …”

Minimum Viable Results

In product development, there’s a concept called the Minimum Viable Product (MVP). This refers to the simplest version of a product that still provides value to users. The MVP serves as a prototype to gather feedback on whether the product meets user needs and to identify which features are truly essential. By iterating quickly and testing hypotheses, teams can increase the odds of creating a successful product that people will actually pay for.

This approach also has motivational benefits. Seeing something functional—even if basic—early on makes it easier to stay engaged. It’s far better than toiling for months without tangible results. We recommend applying this mindset to your research software development by starting with a script that generates “Minimum Viable Results.”

This means creating a program that produces outputs resembling your final results, like plots or tables, but using placeholder data instead of actual values. For instance:

  • If your goal is to build a prediction model, start with one that simply predicts the mean of the observed data.
  • If you’re developing a simulation, begin with random outputs, such as a random walk.
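
For the first case, such a baseline takes only a few lines (a minimal sketch, assuming numpy and a scikit-learn-style fit/predict interface):

import numpy as np

class MeanBaseline:
    """Minimum viable model: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        # one constant prediction per input row
        return np.full(len(X), self.mean_)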

This approach also serves as a “stupid baseline”—a simple, easy-to-beat reference point for your final method. It’s a sanity check: if your sophisticated solution can’t outperform this baseline, something’s off.

By starting with Minimum Viable Results, you can test your code end-to-end early on, see tangible progress, and iteratively improve from there.

Breaking Code into Components

A new project often begins with all your code in a single script or notebook. This is fine for quick and small tasks, but as your project grows, keeping everything in one file becomes messy and overwhelming. To keep your code organized and easier to understand, it’s a good idea to move functionality into separate files, also called (sub)modules. Separating code into modules makes your project easier to navigate, test, and reuse.

A typical first step is splitting the main logic of your analysis (main.py) from general-purpose helper functions (utils.py). Over time, as utils.py expands, you’ll notice clusters of related functionality that can be moved into their own files, such as preprocessing.py, models.py, or plot_results.py. This modular approach naturally leads to a clean directory structure, which might look like this for a larger Python project:1

src/
└── my_package/
    ├── __init__.py
    ├── main.py
    ├── models/
    │   ├── __init__.py
    │   ├── baseline_a.py
    │   ├── baseline_b.py
    │   └── my_model.py
    └── utils/
        ├── __init__.py
        ├── preprocessing.py
        └── plot_results.py

In main.py, you can import the relevant classes and functions from these modules to keep the main script clean and focused:

from models.my_model import MyModel
from utils import preprocessing

if __name__ == '__main__':
    # steps that will be executed when running `python main.py`
    model = MyModel()

Keep helper functions separate

Always separate reusable helper functions from the main executable code. This also means that files like utils/preprocessing.py should not include a main function, as they are not standalone scripts. Instead, these modules provide functionality that can be imported by other scripts—just like external dependencies such as numpy.

As you tackle more projects, you may develop a set of functions that are so versatile and useful that you find yourself reusing them across multiple projects. At that point, you might consider packaging them as your own open-source library, allowing others to install and use it just like any other external library.

Keep It Compact

When writing code, aim to achieve your goals while using as little screen space as possible—this applies to both the number of lines and their length.

Tips to create compact, reusable code
  • Avoid duplication: Instead of copying and pasting code in multiple places, consolidate it into a reusable function to save lines.

  • Prefer ‘deep’ functions: Avoid extracting very short code fragments (1-2 lines) into a separate function, especially if this function would require many arguments. Such shallow functions with wide interfaces increase complexity without meaningfully reducing line count. Instead, strive for deep functions (spanning multiple lines) with narrow interfaces (e.g., only 1-3 input arguments, i.e., fewer arguments than the function has lines of code), which tend to be more general and reusable [3].

  • Address nesting: If your code becomes overly nested, this can be a sign that parts of the code should be moved into a separate function. This simplifies logic and shortens lines.

  • Use Guard Clauses: Deeply nested if-statements can make code harder to read. Instead, use guard clauses [1] to handle preconditions (e.g., checking for wrong user input) early, leaving the “happy path” clear and concise. For example:

    if condition:
        if not other_condition:
            # do something
            return result
    else:
        return None

    Can be refactored into:

    if not condition:
        return None
    if other_condition:
        return None
    # do something
    return result

    This approach reduces nesting and improves readability.

Documentation & Comments: A Note to Your Future Self

While you write it, everything seems obvious. However, when revisiting your code a few months later (e.g., to address reviewer feedback), you’re often left wondering what the heck you were doing. This is especially true when some external constraint (like a library quirk) forced you to create a workaround instead of opting for the straightforward solution. When returning to such code, you might be tempted to replace the awkward implementation with something more elegant, only to rediscover why you chose that approach in the first place. This is where comments can save you some trouble. And they are even more important when collaborating with others who need to understand your code.

We distinguish between documentation and comments: Documentation provides the general description of when and how to use your code, such as function docstrings explaining what the function computes, its input parameters, and return values. This is particularly important for open source libraries where you can’t personally explain the code’s purpose and usage to others. Comments help developers understand why your code was written in a certain way, like explaining that unintuitive workaround. Additionally, for scientific code, you may also need to document the origin of certain values or equations by referencing the corresponding paper in the comments.
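
For example, the docstring states what a hypothetical function computes and how to call it, while the comment records the why (a minimal sketch, loosely following the NumPy docstring convention):

import numpy as np

def normalize(values, eps=1e-12):
    """Rescale values linearly to the range [0, 1].

    Parameters
    ----------
    values : array-like
        Numeric values to rescale.
    eps : float, optional
        Small constant added to the denominator.

    Returns
    -------
    numpy.ndarray
        The rescaled values.
    """
    values = np.asarray(values, dtype=float)
    # eps avoids a division by zero for constant input, where np.ptp() is 0
    return (values - values.min()) / (np.ptp(values) + eps)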

Code should be self-documenting

Ideally, your code should be written so clearly that it’s self-explanatory. Comments shouldn’t explain what the code does, only why it does that (when not obvious). Comments and documentation, like code, need to be maintained—if you modify code, update the corresponding comments, or they become misleading and harmful rather than helpful. Using comments sparingly minimizes the risk of confusing, outdated comments.

Informative variable and function names are essential for self-explanatory code. When you’re tempted to write a comment that summarizes what the following block of code does (e.g., # preprocess data), consider moving these lines into a separate function with an informative name, especially if they contain significant, reusable logic.
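
For instance (a hypothetical sketch, assuming a pandas DataFrame):

# Before: a comment labels an anonymous block inside a long script
# preprocess data
df = df.dropna()
df = (df - df.mean()) / df.std()

# After: the function name carries the same information, and the logic is reusable
def preprocess_data(df):
    """Drop missing values and standardize each column."""
    df = df.dropna()
    return (df - df.mean()) / df.std()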

Naming is hard

There are only two hard things in Computer Science: cache invalidation and naming things.
– Phil Karlton2

Finding informative names for variables, functions, and classes can be challenging, but good names are crucial to make the code easier to understand for you and your collaborators.

Tips for effective naming
  • Names should reveal intent. Longer names (consisting of multiple words in snake_case or camelCase, depending on the conventions of your chosen programming language) are usually better. However, stick to domain conventions—if everyone understands X and y as feature matrix and target vector, use these despite the common advice against single-letter names.
  • Be consistent: similar names should indicate similar things.
  • Avoid reserved keywords and built-in names (i.e., words your code editor colors differently, like Python’s built-in input function).
  • Use verbs for functions, nouns for classes.
  • Use affirmative phrases for booleans (e.g., is_visible instead of is_invisible).
  • Use plurals for collections (e.g., cats instead of list_of_cats).
  • Avoid encoding types in names (e.g., color_dict), since if you decide to change the data type later, you either need to rename the variable everywhere or the name is now misleading.
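
A few of these tips in action (hypothetical before/after names):

# Before: intent hidden, type encoded in the name, negated boolean
def proc(data_list, no_norm):
    ...

# After: a verb for the function, a plural for the collection, an affirmative boolean
def preprocess_samples(samples, normalize=True):
    ...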

Tests: Protect What You Love

We all want our code to be correct. During development, we often verify this manually by running the code with example inputs to check if the output matches our expectations. While this approach helps ensure correctness initially, it becomes cumbersome to recreate these test cases later when the code needs changes. The simple solution? Package your manual tests into a reusable test suite that you can run anytime to check your code for errors.

Tests typically use assert statements to confirm that the actual output matches the expected output. For example:

def add(x, y):
    return x + y

def test_add():
    # verify correctness with examples, including edge cases
    # syntax: assert (expression that should evaluate to True), "error message"
    assert add(2, 2) == 4, "2 + 2 should equal 4"
    assert add(5, -6) == -1, "5 - 6 should equal -1"
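    # exact float comparison happens to work here; pytest.approx is safer in general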
    assert add(-2, 10.6) == 8.6, "-2 + 10.6 should equal 8.6"
    assert add(0, 0) == 0, "0 + 0 should equal 0"

Pure functions—those without side effects like reading or writing external files—are especially easy to test because you can directly supply the necessary inputs. Placing your main logic into pure functions therefore simplifies testing the critical parts of your code. For impure functions, such as those interacting with databases or APIs, you can use techniques like mocking to simulate external dependencies.
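
For example, a mock object can stand in for an external call (a minimal sketch using Python’s built-in unittest.mock; describe_weather and its API-calling dependency are hypothetical):

from unittest.mock import Mock

def describe_weather(city, fetch_temperature):
    # the impure dependency (an API call) is passed in, which keeps this function testable
    return "warm" if fetch_temperature(city) > 20 else "cold"

def test_describe_weather():
    fake_fetch = Mock(return_value=25)  # simulates the external API call
    assert describe_weather("Berlin", fake_fetch) == "warm"
    fake_fetch.assert_called_once_with("Berlin")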

Testing in Python with pytest

Consider using the pytest framework for your Python tests. Organize all your test scripts in a dedicated tests/ folder to keep them separate from the main source code. pytest automatically discovers files named test_*.py and the test_* functions inside them, so the whole suite runs with a single pytest command.

When designing your tests, focus on edge cases—unusual or extreme scenarios like values outside the normal range or invalid inputs (e.g., dividing by zero or passing an empty list). The more thorough your tests, the more confident you can be in your code. Each time you make significant changes, run all your tests to ensure the code still behaves as expected.

Some developers even adopt Test-Driven Development (TDD), where they write tests before the actual code. The process begins with writing tests that fail, then creating the code to make them pass. TDD can be highly motivating as it provides clear goals, but it requires discipline and may not always be practical in the early stages of development when function definitions are still evolving.

Testing at different levels

Ideally, you’ll test your software at all levels:

  • Unit Tests: Test individual components (e.g., single functions) to verify basic logic.
  • Integration/System Tests: Check that different parts of the system work together as expected. These often require more complex setups, like running multiple services at the same time.
  • Manual Testing: Identify unexpected behavior or overlooked edge cases. Whenever a bug is found, create an automated test to reproduce it and prevent regression.
  • User Testing: Evaluate the user interface (UI) with real users to ensure clarity and usability. UX designers often perform these tests using design mockups before coding begins.

Debugging

When your code doesn’t work as intended, you’ll need to debug—systematically identify and fix the problem. Debugging becomes easier if your code is organized into small, testable functions covered by unit tests. These tests often help narrow down the source of the issue. If none of your tests caught the bug, write a new test to reproduce it and ensure this case is covered in the future.

To isolate the exact line causing the error:

  • Use print statements to log variable values at key points and understand the program’s flow.
  • Add assert statements to verify intermediate results.
  • Use a debugger, often integrated into your IDE, to set breakpoints where execution will pause, allowing you to step through the program manually and inspect variables.
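
For example, combining the last two techniques (a hypothetical sketch; breakpoint() is Python’s built-in hook that drops you into the pdb debugger):

def fractions(values):
    total = sum(values)
    # check an intermediate assumption while narrowing down the bug
    assert total != 0, f"unexpected zero total for {values!r}"
    breakpoint()  # pauses execution here; inspect values and total, then type c to continue
    return [v / total for v in values]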

Debugging is an essential skill that not only fixes bugs but also improves your understanding of the code and its behavior.

Make It Fast

Make it run, make it right, make it fast.
– Kent Beck (or rather his dad, Douglas Kent Beck3)

Now that your code works and produces the right results (as you’ve dutifully confirmed with thorough testing), it’s time to think about performance.

Readability over performance

Always prioritize writing code that’s easy to understand. Performance optimizations should never come at the cost of readability. More time is spent by humans reading and maintaining code than machines executing it.

Find and fix the bottlenecks

Instead of randomly trying to speed up everything, focus on the parts of your code that are actually slow. A quick way to find bottlenecks is to manually interrupt your code during a long run; if it always stops in the same place, that’s likely the issue. For a more systematic approach, use a profiler. Profilers analyze your code and show you how much time each part takes, helping you decide where to focus your efforts.
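
In Python, the built-in cProfile module is a good starting point (a minimal sketch; main() stands in for your actual entry point):

import cProfile
import pstats

def main():
    # hypothetical workload standing in for your actual analysis
    sum(i**0.5 for i in range(10**6))

# alternatively, from the command line: python -m cProfile -s cumtime main.py
cProfile.run("main()", "profile.stats")
stats = pstats.Stats("profile.stats")
stats.sort_stats("cumulative").print_stats(10)  # show the 10 most expensive calls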

Run it in the cloud

Working with large datasets may trigger Out of Memory errors as your computer runs out of RAM. While optimizing your code can help, sometimes the quickest solution is to run it on a larger machine in the cloud. Platforms like AWS, Google Cloud, Azure, or your institution’s own compute cluster make this cost-effective and accessible. That said, always look for simple performance improvements first!

Think About Big O

Some computations have unavoidable limits. For example, finding the maximum value in an unsorted list requires checking every item—there is no way around this. The “Big O” notation is used to describe these limits, helping you understand how your code scales as data grows (both in terms of execution time and required memory).

  • Constant time (\(\mathcal{O}(1)\)): Independent of dataset size (e.g., looking up a key in a dictionary).
  • Linear time (\(\mathcal{O}(n)\)): Grows proportionally to data size (e.g., finding the maximum in a list).
  • Problematic growth (e.g., \(\mathcal{O}(n^3)\) or \(\mathcal{O}(2^n)\)): Polynomial or exponential scaling can make algorithms impractical for large datasets.

When developing a novel algorithm, you should examine its scaling behavior both theoretically (e.g., using proofs) and empirically (e.g., timing it on datasets of different sizes). Designing a more efficient algorithm is a major achievement in computational research!
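
A quick empirical check is to time your code on inputs of increasing size and watch how the runtime grows (a minimal sketch using Python’s timeit module):

import timeit

for n in [10_000, 100_000, 1_000_000]:
    data = list(range(n))
    # max() must scan every element, so we expect roughly linear growth
    t = timeit.timeit(lambda: max(data), number=10)
    print(f"n={n:>9}: {t:.4f}s")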

Divide & Conquer

If your code is too slow or your dataset too large, try splitting the work into smaller, independent chunks and combining the results. This “divide and conquer” approach is used in many algorithms, like the merge sort algorithm, and in big data frameworks like MapReduce.

Example: MapReduce

MapReduce [2] was one of the first frameworks developed to work with ‘big data’ that does not fit on a single computer anymore. The data is split into chunks and distributed across multiple machines, where each chunk is processed in parallel (map step), and then the results are combined into the final output (reduce step).
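
As a toy illustration, here is the classic word-count example sketched with Python’s multiprocessing module (a minimal single-machine stand-in for a real distributed setup):

from collections import Counter
from functools import reduce
from multiprocessing import Pool

def map_step(chunk):
    # count the words in one chunk of text lines
    return Counter(word for line in chunk for word in line.split())

def reduce_step(total, partial):
    total.update(partial)  # merge one partial count into the running total
    return total

if __name__ == "__main__":
    chunks = [["a b a"], ["b c"], ["a c c"]]  # toy data, one chunk per worker
    with Pool() as pool:
        partial_counts = pool.map(map_step, chunks)  # map: process chunks in parallel
    print(reduce(reduce_step, partial_counts, Counter()))  # reduce: combine the results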

For instance, if you’re training a machine learning model on a very large dataset, you could train separate models on subsets of the data and then aggregate their predictions (e.g., by averaging them), thereby creating an ensemble model.

Replace For-Loops with Map/Filter/Reduce

Sequential for loops can often be replaced with map, filter, and reduce operations for better readability and potential parallelism:

  • map: Transforms each element in a sequence.
  • filter: Keeps elements that meet a condition.
  • reduce: Aggregates elements recursively (e.g., summing values).

For example:

from functools import reduce

### Simplify this loop:
current_sum = 0
current_max = -float('inf')
for i in range(10000):
    new_i = i**0.5
    # the modulo operator x % y gives the remainder when dividing x by y
    # i.e., we're checking for even numbers, where the remainder is 0
    if (round(new_i) % 2) == 0:
        current_sum += new_i
        current_max = max(current_max, new_i)

### Using map/filter/reduce:
# map(function to apply, iterable of elements)
new_i_all = map(lambda x: x**0.5, range(10000))
# filter(function that returns True or False, iterable of elements);
# we materialize the result as a list, since the lazy filter iterator
# would otherwise be exhausted after the first reduce call below
new_i_filtered = list(filter(lambda x: (round(x) % 2) == 0, new_i_all))
# reduce(function to combine current result with next element, iterable, initial value)
current_sum = reduce(lambda acc, x: acc + x, new_i_filtered, 0)
current_max = reduce(lambda acc, x: max(acc, x), new_i_filtered, -float('inf'))
# (of course, for these simple cases you could just use sum() and max() on the list directly)

In Python, list comprehensions also offer concise alternatives:

new_i_filtered = [i**0.5 for i in range(10000) if (round(i**0.5) % 2) == 0]

Exploit Parallelism

Many scientific computations are “embarrassingly parallelizable,” meaning tasks can run independently. For example, running simulations with different model configurations, initial conditions, or random seeds. Each of these experiments can be submitted as a separate job and run in parallel on a compute cluster. By identifying parts of your code that can be parallelized, you can save time and make full use of available resources.
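
On a single machine, such independent jobs can be parallelized with a process pool (a minimal sketch using concurrent.futures; run_simulation is a hypothetical experiment):

import random
from concurrent.futures import ProcessPoolExecutor

def run_simulation(seed):
    # hypothetical stand-in for one independent experiment
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(1_000_000))

if __name__ == "__main__":
    # each seed is an independent job; worker processes run them in parallel
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(run_simulation, range(8)))
    print(results)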

Refactoring: Make Change Easy

Refactoring is the process of modifying existing code without altering its external behavior. In other words, it preserves the “contract” (interface) between your code and its users while improving its internal structure.
Common refactoring tasks include:

  • Renaming: Giving variables, functions, or classes more meaningful and descriptive names.
  • Extracting Functions: Breaking large functions into smaller, more focused ones (\(\to\) one function should do one thing).
  • Eliminating Duplication: Consolidating repeated code into reusable functions.
  • Simplifying Logic: Reducing deeply nested code structures or introducing guard clauses for clarity.
  • Reorganizing Code: Grouping related functions or classes into appropriate files or modules.

Why refactor?

Refactoring is typically done for two main reasons:

  1. Addressing Technical Debt:
    When code is written quickly—often to meet deadlines—it may include shortcuts that make future changes harder. This accumulation of compromises is called “technical debt.” Refactoring cleans up this debt, improving code quality and making the code easier to understand.
    • Example: Revisiting old code can be like tidying up a messy campsite. Just as a good scout leaves the campground cleaner than they found it, a responsible developer leaves the codebase better for the next person (or themselves in the future).
  2. Making Change Easier:
    Sometimes, implementing a new feature in your existing code feels like forcing a square peg into a round hole. Instead of struggling with awkward workarounds, you should first refactor your code to align with the new requirements. The goal of software design isn’t to predict every possible future change (which is impossible) but to adapt gracefully when those changes arise.
    • Before adding a new feature, clean up your code so that the change feels natural and seamless. This not only simplifies the task at hand but also results in more general, reusable functions and classes.

Refactorings to simplify changes

For each desired change, make the change easy (warning: this may be hard), then make the easy change.
– Kent Beck4

  • Replace Magic Numbers with Constants: Magic numbers—values with unclear meaning—can make code harder to understand and maintain. By replacing them with constants, you create a single source of truth that’s easy to modify.

    # Before:
    if status == 404:
        ...
    
    # After:
    ERROR_NOT_FOUND = 404
    if status == ERROR_NOT_FOUND:
        ...
  • Don’t Repeat Yourself (DRY): Copying and pasting code may seem like a quick fix, but it leads to problems later. If the logic changes, you’ll need to update it everywhere it’s duplicated, which is error-prone. Instead, move the logic into a reusable function or method.

    # Before:
    if (model.a > 5) and (model.b == 3) and (model.c < 8):
        ...
    
    # After:
    class MyModel:
        def is_ready(self):
            return (self.a > 5) and (self.b == 3) and (self.c < 8)
    
    if model.is_ready():
        ...
  • Organize for Coherence: Keep code elements that need to change together in the same file or module. Conversely, separate unrelated parts of your code to prevent unnecessary entanglement. This way, changes are localized, which reduces cognitive load.

    In larger codebases shared by multiple teams, this is even more critical. When changes require excessive communication and coordination, it signals a need to reorganize the code. Clear ownership and reduced dependencies help teams work independently while keeping the system coherent through agreed-upon interfaces.

Additional tips
  • Test as you refactor: Always run tests before and after refactoring to ensure no functionality is accidentally broken. Writing or expanding automated tests is often part of the process to safeguard against regressions.
  • Leverage IDE support: Modern IDEs like PyCharm or Visual Studio Code provide tools for automated refactoring, such as renaming, extracting functions, or moving files. These can save time and reduce errors.
  • Avoid over-refactoring: While cleaning up code is valuable, avoid making unnecessary changes that don’t improve functionality or clarity. Over-refactoring wastes time and can confuse collaborators.

By refactoring regularly and following these practices, you’ll create a cleaner, more maintainable codebase that is adaptable to future needs and fun to work with.

Before you continue

At this point, you should have a clear understanding of:

  • How to transform your ideas into code.
  • Some best practices to write code that is easy to understand and maintain.

  1. The __init__.py file is needed to turn a directory into a package from which other scripts can import functionality. Usually, the file is completely empty.↩︎

  2. https://martinfowler.com/bliki/TwoHardThings.html↩︎

  3. https://x.com/KentBeck/status/704385198301904896↩︎

  4. https://x.com/KentBeck/status/250733358307500032↩︎