Architecture Layers¶
Engineering Documentation
This is internal engineering documentation. For general usage, see the Architecture page.
Package Structure¶
JoyfulJay is organized into clear architectural layers:
src/joyfuljay/
├── core/ # Core abstractions (Flow, Packet, Pipeline, Config)
├── capture/ # Packet capture backends (Scapy, dpkt, remote)
├── extractors/ # Feature extraction modules
├── schema/ # Feature registry, profiles, schema generation
├── resources/ # Runtime resources (profiles, schemas)
├── output/ # Output formats (CSV, JSON, Parquet, database)
├── remote/ # Remote capture protocol and server
├── monitoring/ # Prometheus metrics integration
├── utils/ # Shared utilities
├── cli/ # Command-line interface
└── extensions/ # Optional Cython accelerators
Layer Responsibilities¶
Core Layer (core/)¶
The foundation of JoyfulJay:
- packet.py: Packet dataclass representing parsed network packets
- flow.py: Flow and FlowKey for bidirectional flow tracking
- pipeline.py: Pipeline orchestrates extraction from capture to output
- config.py: Config dataclass for all configuration options
Dependency rule: Core depends only on stdlib and typing.
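To make "bidirectional flow tracking" concrete, here is a minimal sketch of a direction-independent flow key. The field names and the from_endpoints helper are assumptions for illustration, not the actual definitions in flow.py. Like the Core layer, the sketch uses only the standard library.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical direction-independent 5-tuple (not the real flow.py class)."""

    ip_a: str
    port_a: int
    ip_b: str
    port_b: int
    protocol: int

    @classmethod
    def from_endpoints(
        cls, src_ip: str, src_port: int, dst_ip: str, dst_port: int, protocol: int
    ) -> FlowKey:
        # Order the two endpoints so both directions map to the same key.
        first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
        return cls(first[0], first[1], second[0], second[1], protocol)


forward = FlowKey.from_endpoints("10.0.0.1", 51000, "93.184.216.34", 443, 6)
reverse = FlowKey.from_endpoints("93.184.216.34", 443, "10.0.0.1", 51000, 6)
assert forward == reverse  # one key per conversation, regardless of direction
```

Because the dataclass is frozen, the key is hashable and can index a dictionary of active flows.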
Capture Layer (capture/)¶
Packet capture backends:
- scapy_backend.py: Primary backend using Scapy (default)
- dpkt_backend.py: Alternative backend using dpkt (faster for some workloads)
- remote_backend.py: Client for remote capture from remote/ servers
Dependency rule: Capture depends on Core and external capture libraries.
Extractors Layer (extractors/)¶
Feature extraction modules:
- base.py: BaseExtractor abstract class defining the extractor interface
- Each extractor file implements one feature group (timing, tls, quic, etc.)
Extractor contract:
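# Import paths are inferred from the package layout described on this page.
from typing import Any
from joyfuljay.core.flow import Flow
from joyfuljay.extractors.base import BaseExtractor
from joyfuljay.schema.registry import FeatureMeta

class MyExtractor(BaseExtractor):
    @staticmethod
    def feature_ids() -> list[str]:
        """Return list of feature IDs this extractor produces."""

    @staticmethod
    def feature_meta() -> list[FeatureMeta]:
        """Return metadata for all features."""

    def extract(self, flow: Flow) -> dict[str, Any]:
        """Extract features from a flow."""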
Dependency rule: Extractors depend on Core and Schema (for FeatureMeta).
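As an illustration of the contract, the following sketch implements a small extractor against stand-in classes so that it runs on its own. The Flow and FeatureMeta fields, the SizeExtractor class, and the feature IDs are all hypothetical. The real definitions live in core/flow.py, schema/registry.py, and extractors/base.py and may differ.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


# Stand-ins so the sketch is self-contained; not the real JoyfulJay classes.
@dataclass
class Flow:
    packet_sizes: list[int] = field(default_factory=list)  # hypothetical field


@dataclass(frozen=True)
class FeatureMeta:
    feature_id: str   # hypothetical field
    description: str  # hypothetical field


class BaseExtractor(ABC):
    @staticmethod
    @abstractmethod
    def feature_ids() -> list[str]: ...

    @staticmethod
    @abstractmethod
    def feature_meta() -> list[FeatureMeta]: ...

    @abstractmethod
    def extract(self, flow: Flow) -> dict[str, Any]: ...


class SizeExtractor(BaseExtractor):
    """Hypothetical extractor producing two packet-size features."""

    @staticmethod
    def feature_ids() -> list[str]:
        return ["size.total_bytes", "size.packet_count"]

    @staticmethod
    def feature_meta() -> list[FeatureMeta]:
        return [
            FeatureMeta("size.total_bytes", "Sum of packet sizes in the flow"),
            FeatureMeta("size.packet_count", "Number of packets in the flow"),
        ]

    def extract(self, flow: Flow) -> dict[str, Any]:
        return {
            "size.total_bytes": sum(flow.packet_sizes),
            "size.packet_count": len(flow.packet_sizes),
        }


features = SizeExtractor().extract(Flow(packet_sizes=[60, 1500, 40]))
# The returned keys should match the declared feature IDs exactly.
assert list(features) == SizeExtractor.feature_ids()
```

Keeping feature_ids() and feature_meta() static lets a registry collect IDs and metadata without instantiating the extractor.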
Schema Layer (schema/)¶
Feature registry and profiles:
- registry.py: FeatureMeta dataclass, feature ID collection
- profiles.py: Profile loading (JJ-CORE, JJ-EXTENDED, JJ-EXPERIMENTAL)
- tiering.py: Validation that all features are assigned to profiles
- generate.py: Schema JSON generation
Dependency rule: Schema depends only on Core and Resources.
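The tiering validation can be pictured as a set comparison between the registered feature IDs and the union of all profiles. This is a minimal sketch of that idea, not the code in tiering.py. The check_tiering function and the sample feature IDs are hypothetical.

```python
from __future__ import annotations


def check_tiering(
    feature_ids: set[str], profiles: dict[str, set[str]]
) -> tuple[set[str], set[str]]:
    """Return (unassigned, unknown) feature IDs."""
    assigned = set().union(*profiles.values())
    unassigned = feature_ids - assigned  # produced but listed in no profile
    unknown = assigned - feature_ids     # listed in a profile but not produced
    return unassigned, unknown


registered = {"timing.iat_mean", "tls.version", "quic.version"}
profiles = {
    "JJ-CORE": {"timing.iat_mean"},
    "JJ-EXTENDED": {"timing.iat_mean", "tls.version"},
    "JJ-EXPERIMENTAL": set(),
}
unassigned, unknown = check_tiering(registered, profiles)
print(unassigned)  # {'quic.version'}
print(unknown)     # set()
```

A non-empty result on either side would be a reasonable condition for failing a build or a unit test.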
Resources Layer (resources/)¶
Runtime resources shipped with the package:
- profiles/: Profile text files listing feature IDs
- schema/v1.0/: Generated feature schema JSON
Loaded via importlib.resources for reliable access in installed packages.
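A minimal sketch of loading a packaged text resource with importlib.resources follows. The helper names, the profile file name in the commented call, and the one-ID-per-line format with "#" comments are assumptions. The runnable part reads from a standard-library package so the example works without JoyfulJay installed.

```python
from __future__ import annotations

from importlib import resources


def read_resource_text(package: str, *parts: str) -> str:
    """Read a text resource shipped inside an importable package."""
    node = resources.files(package)
    for part in parts:
        node = node.joinpath(part)
    return node.read_text(encoding="utf-8")


def parse_profile(text: str) -> list[str]:
    """Parse one feature ID per line, skipping blanks and '#' comments (assumed format)."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


# Hypothetical call against the installed package (file name is an assumption):
#   ids = parse_profile(
#       read_resource_text("joyfuljay.resources", "profiles", "JJ-CORE.txt")
#   )

# Runnable demonstration against a standard-library package instead:
source = read_resource_text("json", "__init__.py")
assert "def loads" in source

print(parse_profile("# core profile\ntiming.iat_mean\n\ntls.version\n"))
# ['timing.iat_mean', 'tls.version']
```

Unlike paths built from `__file__`, resources.files() also resolves resources when the package is installed as a zipped distribution.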
Remote Layer (remote/)¶
Remote capture protocol:
- protocol.py: Message types and serialization (msgpack)
- server.py: WebSocket server for streaming packets
- discovery.py: mDNS/Bonjour service discovery
Relationship to capture:
- remote/ defines the transport protocol and server
- capture/remote_backend.py is a capture backend that consumes remote/
┌─────────────────┐     WebSocket      ┌─────────────────┐
│    capture/     │ ◄────────────────► │     remote/     │
│ remote_backend  │  (msgpack proto)   │     server      │
└─────────────────┘                    └─────────────────┘
Output Layer (output/)¶
Output format handlers:
- formats.py: CSV, JSON, Parquet writers
- database.py: SQLite, PostgreSQL writers
- kafka.py: Kafka streaming output
- schema.py: Feature documentation generation
Extensions Layer (extensions/)¶
Optional Cython accelerators:
- _fast_entropy.pyx: Accelerated entropy calculation
- _fast_stats.pyx: Accelerated statistical computations
- build_extensions.py: Build script for Cython modules
Pure Python fallback: All accelerated code has pure Python alternatives. If Cython modules aren't available, the library works with slightly lower performance.
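The fallback pattern is typically an import guarded by ImportError. The sketch below assumes a function named shannon_entropy exported by the compiled module. That name is hypothetical, and the pure Python version shown is an illustration rather than the library's own implementation.

```python
import math
from collections import Counter

try:
    # Hypothetical symbol: the name exported by _fast_entropy.pyx is an assumption.
    from joyfuljay.extensions._fast_entropy import shannon_entropy

    ACCELERATED = True
except ImportError:
    ACCELERATED = False

    def shannon_entropy(data: bytes) -> float:
        """Pure Python Shannon entropy in bits per byte."""
        if not data:
            return 0.0
        total = len(data)
        return sum(
            -(count / total) * math.log2(count / total)
            for count in Counter(data).values()
        )


# Two equally likely byte values carry exactly one bit of entropy per byte.
assert math.isclose(shannon_entropy(b"ab"), 1.0)
# All 256 byte values, each seen once, reach the maximum of 8 bits per byte.
assert math.isclose(shannon_entropy(bytes(range(256))), 8.0)
```

Callers use the same name either way, so the rest of the code does not need to know which variant was loaded.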
Dependency Graph¶
                    ┌──────────┐
                    │   CLI    │
                    └────┬─────┘
                         │
         ┌───────────────┼────────────────┐
         │               │                │
         ▼               ▼                ▼
    ┌─────────┐     ┌──────────┐    ┌────────────┐
    │ Output  │     │ Pipeline │    │ Monitoring │
    └────┬────┘     └────┬─────┘    └─────┬──────┘
         │               │                │
         │   ┌───────────┴───────┐        │
         │   │                   │        │
         ▼   ▼                   ▼        │
    ┌─────────┐            ┌────────────┐ │
    │ Schema  │◄───────────│ Extractors │ │
    └────┬────┘            └─────┬──────┘ │
         │                       │        │
         ├───────────────────────┤        │
         │                       │        │
         ▼                       ▼        │
    ┌─────────┐            ┌────────────┐ │
    │Resources│            │  Capture   │◄┘
    └─────────┘            └─────┬──────┘
                                 │
                           ┌─────┴──────┐
                           │    Core    │
                           └────────────┘
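The dependency rules stated in the layer descriptions can also be written down as data and checked for cycles. This sketch encodes only the rules given explicitly on this page. Treating Resources as a layer with no dependencies is an assumption, and the lowercase layer names are illustrative.

```python
from graphlib import TopologicalSorter

# Each layer maps to the layers it is allowed to depend on.
# Only rules stated explicitly in the layer descriptions are included.
LAYER_DEPS = {
    "core": set(),
    "resources": set(),  # assumption: no stated rule, treated as a leaf
    "capture": {"core"},
    "schema": {"core", "resources"},
    "extractors": {"core", "schema"},
}

# static_order() raises graphlib.CycleError if the rules contain a cycle.
order = list(TopologicalSorter(LAYER_DEPS).static_order())
print(order)  # dependencies are listed before the layers that use them
```

The same table could drive an import-linting test that rejects imports not permitted by the rules.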
Extension Points¶
Adding a New Extractor¶
- Create extractors/my_feature.py
- Implement BaseExtractor with feature_ids(), feature_meta(), extract()
- Register in extractors/__init__.py
- Add feature IDs to the appropriate profile in resources/profiles/
- Add tests in tests/unit/extractors/test_my_feature.py
Adding a New Capture Backend¶
- Create capture/my_backend.py
- Implement the backend interface (iter_packets_offline, iter_packets_live)
- Register in capture/__init__.py
- Add to Config.backend choices
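A backend along these lines might look like the sketch below. Only the two method names come from the interface mentioned above. Their parameters and return types, the Packet fields, and the ListBackend class itself are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator


@dataclass
class Packet:
    """Stand-in for the Packet in core/packet.py; fields are hypothetical."""

    timestamp: float
    payload: bytes


class ListBackend:
    """Hypothetical backend that replays in-memory packets, e.g. in tests."""

    def __init__(self, packets: list[Packet]) -> None:
        self._packets = packets

    def iter_packets_offline(self, path: str) -> Iterator[Packet]:
        # A real backend would parse the capture file at `path`; this one ignores it.
        yield from self._packets

    def iter_packets_live(self, interface: str) -> Iterator[Packet]:
        # Live capture needs OS-level access, so this sketch does not support it.
        raise NotImplementedError("live capture is not supported by ListBackend")


backend = ListBackend([Packet(0.0, b"\x16\x03\x01"), Packet(0.5, b"")])
timestamps = [p.timestamp for p in backend.iter_packets_offline("ignored.pcap")]
print(timestamps)  # [0.0, 0.5]
```

Yielding packets lazily keeps memory use flat for large captures, which is presumably why the interface methods are named as iterators.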
Adding a New Output Format¶
- Add writer function to output/formats.py or create new module
- Register in CLI --format choices
- Add tests
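As a rough sketch of the first step, a writer function for a JSON Lines format could look like this. The function name, its signature, and the row shape are hypothetical. The writer signature actually expected by output/formats.py is not documented here.

```python
from __future__ import annotations

import io
import json
from typing import Any, Iterable, TextIO


def write_jsonl(rows: Iterable[dict[str, Any]], stream: TextIO) -> int:
    """Write one JSON object per line; return the number of rows written."""
    count = 0
    for row in rows:
        stream.write(json.dumps(row, sort_keys=True) + "\n")
        count += 1
    return count


buffer = io.StringIO()
written = write_jsonl([{"flow_id": 1, "duration": 0.25}], buffer)
print(written)            # 1
print(buffer.getvalue())  # {"duration": 0.25, "flow_id": 1}
```

Writing to a stream rather than a path keeps the function easy to test with io.StringIO.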