Quick Start Guide¶
Get started with JoyfulJay in 5 minutes.
Installation¶
Verify installation:
Your First Extraction¶
Python¶
Python
import joyfuljay as jj
# Extract features from a PCAP file
df = jj.extract("capture.pcap")
# View results
print(f"Flows: {len(df)}")
print(f"Features: {len(df.columns)}")
print(df.head())
Command Line¶
Bash
jj extract capture.pcap -o features.csv
Select Feature Groups¶
Don't need all 387 features? Select specific groups:
Python
import joyfuljay as jj
# Only timing and TLS features
df = jj.extract("capture.pcap", features=["timing", "tls"])
# Only flow metadata
df = jj.extract("capture.pcap", features=["flow_meta"])
Available groups:
| Group | Description |
|---|---|
| `flow_meta` | IPs, ports, duration, packet counts |
| `timing` | Inter-arrival times, burst metrics |
| `size` | Packet length statistics |
| `tls` | TLS version, ciphers, JA3 fingerprints |
| `quic` | QUIC protocol metadata |
| `ssh` | SSH version, HASSH fingerprints |
| `dns` | DNS queries and responses |
| `tcp` | TCP flags, handshake analysis |
| `fingerprint` | Tor/VPN/DoH detection |
| `entropy` | Payload entropy |
| `padding` | Padding detection |
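Group names are passed as plain strings, and how JoyfulJay reports an unknown name is not covered here. The following is a minimal sketch of an up-front check against the groups listed in the table; `KNOWN_GROUPS` and `validate_groups` are hypothetical helpers for illustration, not part of the JoyfulJay API.

```python
# Hypothetical helper: the group names are copied from the table above.
KNOWN_GROUPS = frozenset({
    "flow_meta", "timing", "size", "tls", "quic", "ssh",
    "dns", "tcp", "fingerprint", "entropy", "padding",
})


def validate_groups(groups):
    """Return the groups as a list, or raise ValueError naming unknown ones."""
    requested = list(groups)
    unknown = sorted(set(requested) - KNOWN_GROUPS)
    if unknown:
        raise ValueError(f"Unknown feature groups: {', '.join(unknown)}")
    return requested


selected = validate_groups(["timing", "tls"])
print(selected)
```

The validated list can then be passed as `features=selected` to `jj.extract`.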
Live Capture¶
Capture traffic from a network interface:
Python¶
Python
import joyfuljay as jj
# Capture for 30 seconds
df = jj.extract_live("eth0", duration=30)
print(f"Captured {len(df)} flows")
Command Line¶
Bash
jj live eth0 --duration 60 -o live.csv
Note: Live capture requires root/admin privileges.
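A script can check for these privileges before starting a capture and fail with a clearer message. The sketch below uses only the Python standard library; `has_capture_privileges` is a hypothetical helper, and the check is deliberately conservative (on Linux, for instance, a process granted `CAP_NET_RAW` can usually capture without being root, which this function does not detect).

```python
import ctypes
import os


def has_capture_privileges():
    """Best-effort check for root (POSIX) or Administrator (Windows) rights."""
    if hasattr(os, "geteuid"):
        # POSIX: effective user ID 0 means root.
        return os.geteuid() == 0
    try:
        # Windows: ask the shell whether the current user is an administrator.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (AttributeError, OSError):
        return False


if not has_capture_privileges():
    print("Live capture will likely fail: re-run with sudo or as Administrator.")
```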
Configure Extraction¶
Use Config for fine-grained control:
Python
import joyfuljay as jj
# Create custom configuration
config = jj.Config(
    features=["timing", "tls", "fingerprint"],
    flow_timeout=30.0,          # 30 second flow timeout
    include_ip_addresses=True,  # Include IPs in output
    anonymize_ips=False,        # Don't hash IPs
)
# Create pipeline
pipeline = jj.Pipeline(config)
# Process PCAP
df = pipeline.process_pcap("capture.pcap")
# Or iterate over flows (memory efficient)
for features in pipeline.iter_features("large_capture.pcap"):
    print(features)
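When a capture is too large to hold as a single DataFrame, the iterator can be streamed straight to disk instead of printed. This is a sketch under two assumptions that are not confirmed here: each item yielded by `iter_features` behaves like a flat dict of feature name to value, and all flows share the keys of the first flow. `stream_to_csv` is a hypothetical helper, and the demo uses stand-in records rather than a real capture.

```python
import csv
import os
import tempfile


def stream_to_csv(feature_iter, out_path):
    """Write dict-like flow records to CSV one row at a time.

    The header comes from the first record. Keys missing from a later
    record are left blank; keys not in the header are ignored.
    Returns the number of rows written.
    """
    count = 0
    writer = None
    with open(out_path, "w", newline="") as fh:
        for record in feature_iter:
            row = dict(record)
            if writer is None:
                writer = csv.DictWriter(
                    fh, fieldnames=list(row), extrasaction="ignore", restval=""
                )
                writer.writeheader()
            writer.writerow(row)
            count += 1
    return count


# Stand-in records; with JoyfulJay the iterable would be
# pipeline.iter_features("large_capture.pcap"), given the assumptions above.
demo = [{"duration": 1.5, "packets": 10}, {"duration": 0.2, "packets": 3}]
out = os.path.join(tempfile.gettempdir(), "features_stream.csv")
print(stream_to_csv(demo, out))
```

Only one record is held in memory at a time, so peak memory stays roughly constant regardless of capture size.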
Output Formats¶
DataFrame (Default)¶
NumPy Array¶
Python
import joyfuljay as jj
config = jj.Config(features=["timing", "size"])
pipeline = jj.Pipeline(config)
array = pipeline.process_pcap("capture.pcap", output_format="numpy")
# Returns numpy array with numeric features only
CSV File¶
JSON File¶
Use with Machine Learning¶
JoyfulJay returns one row per flow, so the numeric columns can be passed directly to an ML library such as scikit-learn:
Python
import joyfuljay as jj
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
# Extract features
df = jj.extract("labeled_traffic.pcap", features=["timing", "size"])
# Prepare data: 'label' is a ground-truth column you supply; keep it out of X
y = df['label']
X = df.drop(columns=['label']).select_dtypes(include=['number']).fillna(0)
# Train
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
# Evaluate
print(f"Accuracy: {clf.score(X_test, y_test):.2%}")
Detect Encrypted Traffic Types¶
Use the fingerprint extractor:
Python
import joyfuljay as jj
df = jj.extract("capture.pcap", features=["fingerprint"])
# Check for Tor
tor_flows = df[df['likely_tor'] == True]
print(f"Tor flows: {len(tor_flows)}")
# Check for VPN
vpn_flows = df[df['likely_vpn'] == True]
print(f"VPN flows: {len(vpn_flows)}")
# Check for DoH
doh_flows = df[df['likely_doh'] == True]
print(f"DoH flows: {len(doh_flows)}")
Process Multiple Files¶
Python
import joyfuljay as jj
import pandas as pd
from pathlib import Path
# Create pipeline once
config = jj.Config(features=["timing", "tls"])
pipeline = jj.Pipeline(config)
# Process multiple files
pcap_dir = Path("./captures")
all_features = []
for pcap_file in sorted(pcap_dir.glob("*.pcap")):
    df = pipeline.process_pcap(str(pcap_file))
    df['source_file'] = pcap_file.name  # Track which capture each flow came from
    all_features.append(df)
# Combine (pd.concat raises ValueError on an empty list, so check first)
if all_features:
    combined = pd.concat(all_features, ignore_index=True)
    combined.to_csv("all_features.csv", index=False)
Command Line Cheat Sheet¶
Bash
# Basic extraction
jj extract capture.pcap -o features.csv
# Select features
jj extract capture.pcap -o features.csv --features timing tls
# Live capture
jj live eth0 --duration 60 -o live.csv
# With BPF filter
jj live eth0 --filter "tcp port 443" -o tls.csv
# View PCAP info
jj info capture.pcap
# Check system status
jj status
# List available features
jj features
# Export feature schema
jj schema -o schema.json
# Start remote capture server
jj serve eth0 --port 8765
# Discover remote servers
jj discover
# Connect to remote server
jj connect jj://192.168.1.100:8765 -o features.csv
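For batch jobs it can be convenient to drive the CLI from Python. The sketch below only assembles a `jj extract` invocation from the `-o` and `--features` flags shown in the cheat sheet; `build_extract_cmd` is a hypothetical helper, and actually running the command requires the `jj` executable to be on `PATH`.

```python
import shlex


def build_extract_cmd(pcap, output, features=None):
    """Assemble a `jj extract` argument list from the flags in the cheat sheet."""
    cmd = ["jj", "extract", str(pcap), "-o", str(output)]
    if features:
        # Feature groups are passed as separate, space-delimited arguments.
        cmd += ["--features", *features]
    return cmd


cmd = build_extract_cmd("capture.pcap", "features.csv", ["timing", "tls"])
print(shlex.join(cmd))
# To execute: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a string avoids shell quoting problems when file names contain spaces.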
Next Steps¶
- Configuration - All configuration options
- Features - Complete feature reference
- CLI Reference - All CLI commands
- Extractors - Extractor documentation
- Remote Capture - Distributed capture setup