pytest BDD: Write Tests in Given-When-Then Format (2026)

The most surprising thing about writing tests in a Given-When-Then format with pytest-bdd is that you’re not actually writing tests in that format; you’re writing specifications that your tests then implement.

Let’s see this in action. Imagine we’re testing a simple calculator.

features/calculator.feature:

Feature: Basic Calculator Operations

  Scenario: Adding two numbers
    Given the calculator is on
    And the first number is 5
    And the second number is 3
    When I add the numbers
    Then the result should be 8

Now, how do we make pytest-bdd understand this? We need to connect these Gherkin steps to Python code.

tests/test_calculator.py:

from pytest_bdd import scenarios, given, when, then, parsers

# Generate one pytest test per scenario in the feature file.
# The path is resolved relative to this test module's directory.
scenarios('../features/calculator.feature')

# Define step implementations

@given("the calculator is on", target_fixture="calculator")
def calculator_on():
    # The returned dict becomes the "calculator" fixture for later steps
    return {'value': 0}

@given(parsers.parse("the first number is {number:d}"))
def first_number(calculator, number):
    calculator['first'] = number  # {number:d} already converts to int

@given(parsers.parse("the second number is {number:d}"))
def second_number(calculator, number):
    calculator['second'] = number

@when("I add the numbers")
def add_numbers(calculator):
    calculator['value'] = calculator['first'] + calculator['second']

@then(parsers.parse("the result should be {expected_result:d}"))
def result_should_be(calculator, expected_result):
    assert calculator['value'] == expected_result

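When you run pytest, the scenarios call runs while tests/test_calculator.py is being collected: pytest-bdd parses features/calculator.feature and generates one pytest test function per scenario. When a generated test runs, each step is matched against the step implementations visible from tests/test_calculator.py. A step with no matching implementation makes that test fail with a StepDefinitionNotFoundError.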
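To make the parsing stage concrete, here is a minimal standard-library sketch of turning the feature text into a list of typed steps. It is an illustration only, not pytest-bdd's parser: parse_scenarios, STEP_RE, and FEATURE_TEXT are hypothetical names, and the sketch ignores most of Gherkin (tags, backgrounds, outlines, data tables).

```python
import re

# The feature file from above, inlined so the sketch is self-contained.
FEATURE_TEXT = """\
Feature: Basic Calculator Operations

  Scenario: Adding two numbers
    Given the calculator is on
    And the first number is 5
    And the second number is 3
    When I add the numbers
    Then the result should be 8
"""

STEP_RE = re.compile(r"^(Given|When|Then|And|But)\s+(.*)$")


def parse_scenarios(text):
    """Return {scenario name: [(step type, step text), ...]}."""
    scenarios = {}
    current = None    # step list of the scenario being read
    last_type = None  # type that a following And/But inherits
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Scenario:"):
            name = line[len("Scenario:"):].strip()
            current = scenarios.setdefault(name, [])
            last_type = None
            continue
        match = STEP_RE.match(line)
        if match is None or current is None:
            continue  # feature title, blank line, or text outside a scenario
        keyword, step_text = match.groups()
        if keyword in ("And", "But"):
            if last_type is None:
                raise ValueError(f"{keyword!r} cannot be the first step")
            keyword = last_type  # And/But continue the previous step type
        last_type = keyword
        current.append((keyword.lower(), step_text))
    return scenarios


parsed = parse_scenarios(FEATURE_TEXT)
for step_type, step_text in parsed["Adding two numbers"]:
    print(step_type, "|", step_text)
```

The point to notice is that the two And lines come back typed as given steps, which is why the Python file implements them with @given rather than a separate decorator.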

Here’s the mental model:

Feature Files (.feature): These are your living documentation and specifications. They describe what the system should do from a user or business perspective using Gherkin syntax (Given-When-Then, And, But). They are intentionally high-level and declarative.
Step Implementations (Python): These are the actual test code. Each Gherkin step in your feature file needs a corresponding Python function decorated with @given, @when, or @then. These functions contain the logic to set up the state, perform an action, or assert an outcome.
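The scenarios() Call: This is the bridge. scenarios('path/to/your.feature') is a plain function call, not a decorator. It tells pytest-bdd to find all scenarios in the specified feature file and create corresponding test functions. To bind a single scenario to a test function you write yourself, pytest-bdd also provides the @scenario decorator.
Context/State Management: Notice the calculator argument in the step functions. pytest-bdd passes state between steps through pytest fixtures. A given step that declares target_fixture="calculator" publishes its return value under that name, so when calculator_on returns a dictionary, subsequent steps can request calculator as an argument and then read or modify the same dictionary. Without target_fixture, the return value is discarded and pytest reports that the calculator fixture is not found. This is crucial for maintaining state across Given, When, and Then steps within a single scenario.
Parameterization: Steps can carry values, such as the 5 in "the first number is 5". To capture them, wrap the step text in a parser, for example parsers.parse("the first number is {number:d}"). pytest-bdd extracts the value, applies the conversion (here :d for an integer), and passes it to the argument of the same name in your step function. A plain string passed to @given, @when, or @then is matched literally. The <number> angle-bracket syntax is something different: it marks a placeholder in a Scenario Outline inside the feature file, filled in from the Examples table, and is not a pattern for step implementations.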
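The matching, parameter extraction, and state passing described above can be sketched with a toy step registry built on the standard library. This is a mental-model aid under loud assumptions: step, STEPS, and run_scenario are hypothetical names, a plain dict stands in for the fixture, and regular expressions stand in for pytest-bdd's parsers. The real library resolves steps through pytest's fixture machinery instead.

```python
import re

STEPS = []  # registry of (step type, compiled pattern, function)


def step(step_type, pattern):
    """Hypothetical stand-in for @given/@when/@then: register a step function."""
    def register(func):
        STEPS.append((step_type, re.compile(f"^{pattern}$"), func))
        return func
    return register


@step("given", r"the calculator is on")
def calculator_on(state):
    state["value"] = 0


@step("given", r"the (?P<which>first|second) number is (?P<number>-?\d+)")
def store_number(state, which, number):
    state[which] = int(number)  # captured groups arrive as strings


@step("when", r"I add the numbers")
def add_numbers(state):
    state["value"] = state["first"] + state["second"]


@step("then", r"the result should be (?P<expected>-?\d+)")
def result_should_be(state, expected):
    assert state["value"] == int(expected)


def run_scenario(steps):
    """Run (step type, text) pairs in order, sharing one state dict."""
    state = {}
    for step_type, text in steps:
        for registered_type, pattern, func in STEPS:
            match = pattern.match(text)
            if registered_type == step_type and match:
                # Named groups become keyword arguments of the step function.
                func(state, **match.groupdict())
                break
        else:
            raise LookupError(f"no {step_type} step matches {text!r}")
    return state


final_state = run_scenario([
    ("given", "the calculator is on"),
    ("given", "the first number is 5"),
    ("given", "the second number is 3"),
    ("when", "I add the numbers"),
    ("then", "the result should be 8"),
])
print(final_state["value"])
```

One design difference worth noting: the toy version passes a single mutable dict to every step, whereas pytest-bdd lets each step request only the fixtures it names, which keeps step functions independent of one another.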

The core problem pytest-bdd solves is bridging the gap between human-readable specifications and executable test code. It enforces a disciplined way of writing tests where the specification is always the primary artifact.

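A subtle but powerful aspect is how pytest-bdd handles step sharing and reuse. If you have multiple feature files that use the exact same Gherkin step text (e.g., "the user is logged in"), you only need to write the Python implementation for that step once, provided it lives where every test module can see it, typically in a conftest.py. Step implementations are registered through pytest's fixture mechanism, so they follow the same visibility rules as fixtures: a step defined inside one test module is not available to another unless it is imported or moved to a shared conftest.py. This promotes a DRY (Don’t Repeat Yourself) principle not just in your test code, but also in your specifications.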

The next concept you’ll likely encounter is managing more complex state and data tables within your feature files.