Testing Guide¶

JoyfulJay includes comprehensive testing infrastructure including unit tests, integration tests, property-based testing with Hypothesis, fuzz testing, and performance benchmarks.

Test Organization¶

Text Only

tests/
├── unit/                          # Unit tests for individual components
│   ├── test_stats_property.py     # Property-based tests for statistics
│   ├── test_entropy_property.py   # Property-based tests for entropy
│   ├── test_prometheus_metrics.py # Prometheus integration tests
│   └── extractors/                # Extractor unit tests
│       ├── test_tls.py
│       ├── test_quic.py
│       └── ...
├── integration/                   # End-to-end tests
│   ├── test_e2e_extraction.py
│   └── test_tranalyzer_e2e.py
├── fuzz/                          # Fuzz testing
│   └── test_extractors_fuzz.py
└── fixtures/                      # Shared test fixtures
    └── tranalyzer_packets.py

benchmarks/
├── benchmark_stats.py             # Statistics function benchmarks
└── benchmark_pipeline.py          # Pipeline processing benchmarks

Running Tests¶

All Tests¶

Bash

# Run all tests
pytest tests/

# With coverage
pytest tests/ --cov=src/joyfuljay --cov-report=html

# Verbose output
pytest tests/ -v

Specific Test Categories¶

Bash

# Unit tests only
pytest tests/unit/

# Integration tests only
pytest tests/integration/

# Fuzz tests (slow)
pytest tests/fuzz/ -m slow

# Skip slow tests
pytest tests/ -m "not slow"

Single Test File¶

Bash

pytest tests/unit/test_stats_property.py -v

Property-Based Testing¶

Property-based testing uses Hypothesis to automatically generate test inputs and verify that properties hold for all inputs.

Installation¶

Bash

pip install joyfuljay[dev]  # Includes hypothesis>=6.0

Statistics Properties¶

Located in tests/unit/test_stats_property.py:

Python

from hypothesis import given, strategies as st
from joyfuljay.utils.stats import compute_statistics

@given(st.lists(st.floats(allow_nan=False, allow_infinity=False), min_size=1))
def test_statistics_invariants(values):
    """Verify statistical invariants hold for any valid input."""
    stats = compute_statistics(values)

    # Property 1: min <= mean <= max
    assert stats.min <= stats.mean <= stats.max

    # Property 2: standard deviation is non-negative
    assert stats.std >= 0.0

    # Property 3: count equals input length
    assert stats.count == len(values)

Tested Properties¶

Statistics (`test_stats_property.py`)¶

Property	Description
Ordering	`min <= mean <= max` always holds
Non-negative std	`std >= 0` for all inputs
Count consistency	`count == len(input)`
Sum accuracy	`sum` matches `sum(input)` within tolerance

Entropy (`test_entropy_property.py`)¶

Property	Description
Non-negative	Entropy is always >= 0
Upper bound	Entropy <= log2(alphabet_size)
Zero for uniform	Single-value input has entropy 0

Inter-arrival Times¶

Property	Description
Length	Output has length n-1 for n inputs
Non-negative	All gaps are >= 0 for sorted input

Writing Property Tests¶

Python

from hypothesis import given, settings, strategies as st

# Define strategies for generating test data
packet_strategy = st.fixed_dictionaries({
    "timestamp": st.floats(min_value=0, max_value=1e9),
    "payload_len": st.integers(min_value=0, max_value=65535),
    "protocol": st.sampled_from([6, 17]),  # TCP or UDP
})

@given(st.lists(packet_strategy, min_size=1, max_size=100))
@settings(max_examples=100)  # Control test iterations
def test_flow_properties(packets):
    """Test that flow properties hold for any packet sequence."""
    # Your assertions here
    pass

Fuzz Testing¶

Fuzz testing generates random, potentially malformed inputs to find crashes and edge cases in parsers.

Protocol Extractor Fuzzing¶

Located in tests/fuzz/test_extractors_fuzz.py:

Python

@pytest.mark.slow
@given(
    st.lists(st.binary(min_size=0, max_size=512), min_size=1, max_size=5),
    st.sampled_from([6, 17]),  # TCP or UDP
)
@settings(max_examples=50)
def test_protocol_extractors_fuzz(payloads, protocol):
    """Fuzz all protocol extractors with random payloads."""
    flow = build_flow(payloads, protocol)

    extractors = [
        TLSExtractor(),
        SSHExtractor(),
        QUICExtractor(),
        DNSExtractor(),
    ]

    for extractor in extractors:
        # Should never crash, even with garbage input
        features = extractor.extract(flow)
        assert isinstance(features, dict)

Fuzz Targets¶

Extractor	Attack Surface
`TLSExtractor`	ClientHello parsing, extension handling
`SSHExtractor`	Banner parsing, key exchange
`QUICExtractor`	Long/short header parsing, version negotiation
`DNSExtractor`	Query/response parsing, compression pointers

Running Fuzz Tests¶

Bash

# Run fuzz tests (marked as slow)
pytest tests/fuzz/ -v -m slow

# Increase examples for deeper testing
pytest tests/fuzz/ -v --hypothesis-seed=42 \
  --hypothesis-settings='{"max_examples": 1000}'

Writing Fuzz Tests¶

Python

from hypothesis import given, strategies as st

# Strategy for generating potentially malicious TLS data
tls_fuzz_strategy = st.one_of(
    st.binary(min_size=0, max_size=1024),  # Random bytes
    st.builds(  # Structured but invalid
        lambda: b'\x16\x03\x01' + os.urandom(100),
    ),
)

@given(tls_fuzz_strategy)
def test_tls_parser_robustness(data):
    """TLS parser should handle any input without crashing."""
    try:
        result = parse_tls_record(data)
    except ParseError:
        pass  # Expected for invalid input
    # Should never raise other exceptions

Performance Benchmarks¶

Running Benchmarks¶

Bash

# Pipeline benchmark
python benchmarks/benchmark_pipeline.py

# Statistics benchmark
python benchmarks/benchmark_stats.py

Pipeline Benchmark¶

Located in benchmarks/benchmark_pipeline.py:

Python

def run_benchmark():
    """Benchmark end-to-end pipeline processing."""
    pcap_path = Path("tests/data/sample.pcap")
    config = Config()
    pipeline = Pipeline(config)

    start = time.perf_counter()
    features = pipeline.process_pcap(str(pcap_path), output_format="dict")
    elapsed = time.perf_counter() - start

    print(f"Flows: {len(features)}")
    print(f"Elapsed: {elapsed:.2f}s")
    print(f"Flows/sec: {len(features) / elapsed:.2f}")

Sample Output¶

Text Only

======================================================================
JoyfulJay Pipeline Benchmark
======================================================================
PCAP: tests/data/sample.pcap
Flows: 1234
Elapsed: 2.45s
Flows/sec: 503.67
======================================================================

Statistics Benchmark¶

Tests performance of statistical functions:

Python

from joyfuljay.utils.stats import compute_statistics

def benchmark_statistics():
    data_sizes = [10, 100, 1000, 10000]

    for size in data_sizes:
        data = [random.random() for _ in range(size)]

        start = time.perf_counter()
        for _ in range(1000):
            compute_statistics(data)
        elapsed = time.perf_counter() - start

        print(f"Size {size}: {elapsed/1000*1e6:.2f} us/call")

Test Markers¶

JoyfulJay uses pytest markers to categorize tests:

Marker	Description
`@pytest.mark.slow`	Long-running tests (fuzz, large data)
`@pytest.mark.integration`	End-to-end integration tests

pytest.ini Configuration¶

INI

[pytest]
markers =
    slow: marks tests as slow (deselect with '-m "not slow"')
    integration: marks tests as integration tests

addopts = -ra -q --strict-markers --strict-config
testpaths = tests
pythonpath = src

Test Fixtures¶

Shared Packet Fixtures¶

Located in tests/fixtures/tranalyzer_packets.py:

Python

def create_tcp_flow(num_packets=10):
    """Create a synthetic TCP flow for testing."""
    packets = []
    for i in range(num_packets):
        pkt = Packet(
            timestamp=1000.0 + i * 0.01,
            src_ip="192.168.1.100" if i % 2 == 0 else "10.0.0.1",
            dst_ip="10.0.0.1" if i % 2 == 0 else "192.168.1.100",
            src_port=54321 if i % 2 == 0 else 443,
            dst_port=443 if i % 2 == 0 else 54321,
            protocol=6,
            payload_len=100,
            total_len=140,
            tcp_flags=0x18,
        )
        packets.append(pkt)

    flow = Flow.from_first_packet(packets[0])
    for pkt in packets[1:]:
        flow.add_packet(pkt)
    return flow

Using Fixtures¶

Python

from tests.fixtures.tranalyzer_packets import create_tcp_flow

def test_tcp_extractor():
    flow = create_tcp_flow(num_packets=20)
    extractor = TCPExtractor()
    features = extractor.extract(flow)

    assert features["tcp_is_tcp"] is True
    assert features["tcp_data_packets"] > 0

Coverage¶

Generating Coverage Reports¶

Bash

# Run tests with coverage
pytest tests/ --cov=src/joyfuljay --cov-report=html --cov-report=term

# Open HTML report
open htmlcov/index.html

Coverage Configuration¶

In pyproject.toml:

TOML

[tool.coverage.run]
source = ["src/joyfuljay"]
branch = true

[tool.coverage.report]
exclude_lines = [
    "pragma: no cover",
    "if TYPE_CHECKING:",
    "@abstractmethod",
]

Continuous Integration¶

GitHub Actions Example¶

YAML

name: Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: pip install -e ".[dev]"

      - name: Run tests
        run: pytest tests/ -v --cov=src/joyfuljay

      - name: Upload coverage
        uses: codecov/codecov-action@v4

Best Practices¶

Writing Good Tests¶

Test one thing per test - Keep tests focused
Use descriptive names - test_tls_extracts_ja3_from_valid_handshake
Test edge cases - Empty inputs, boundary values, malformed data
Use fixtures - Share setup code via pytest fixtures
Mark slow tests - Use @pytest.mark.slow for long-running tests

Property Test Guidelines¶

Identify invariants - What should ALWAYS be true?
Use appropriate strategies - Match input constraints
Set reasonable limits - max_examples=100 is often enough
Reproduce failures - Use --hypothesis-seed for reproducibility

Fuzz Test Guidelines¶

Never crash - Extractors should handle any input
Return valid types - Always return a dict, even if empty
Log interesting inputs - Use hypothesis.note() for debugging
Set timeouts - Prevent infinite loops with deadline settings

Testing Guide¶

Test Organization¶

Running Tests¶

All Tests¶

Specific Test Categories¶

Single Test File¶

Property-Based Testing¶

Installation¶

Statistics Properties¶

Tested Properties¶

Statistics (test_stats_property.py)¶

Entropy (test_entropy_property.py)¶

Inter-arrival Times¶

Writing Property Tests¶

Fuzz Testing¶

Protocol Extractor Fuzzing¶

Fuzz Targets¶

Running Fuzz Tests¶

Writing Fuzz Tests¶

Performance Benchmarks¶

Running Benchmarks¶

Pipeline Benchmark¶

Sample Output¶

Statistics Benchmark¶

Test Markers¶

pytest.ini Configuration¶

Test Fixtures¶

Shared Packet Fixtures¶

Using Fixtures¶

Coverage¶

Generating Coverage Reports¶

Coverage Configuration¶

Continuous Integration¶

GitHub Actions Example¶

Best Practices¶

Writing Good Tests¶

Property Test Guidelines¶

Fuzz Test Guidelines¶

Statistics (`test_stats_property.py`)¶

Entropy (`test_entropy_property.py`)¶