Sound-based communication protocol for AI agents, compatible with Linux and macOS.
Note
Version 0.1.5 introduces a standardized metadata header format with checksum validation, enhanced configuration options, and improved code organization. See the Changelog for full details.
- Multiple Transmission Modes: Slow, medium, and fast transmission speeds with configurable symbol duration (10-200ms)
- Redundancy Options: Low, medium, and high redundancy with error correction
- Multi-Channel Support: 1-4 channels with frequency-shifted tones for parallel transmission
- Adaptive Confidence: Smart thresholds based on symbol rate and channel count
- Format Support: Auto-detection for Markdown, JSON, XML, YAML, HTML, and code
- Parallel Processing: Multi-threaded decoding using up to 4 workers for messages >20 bytes
- Timestamp Tracking: Detailed timing information for every encode/decode operation
- Float64 Precision: Enhanced accuracy in signal generation and Goertzel algorithm
- Anti-Interference: Blackman windowing, raised-cosine pulse shaping, and 3-tap FIR filtering
- Spectral Purity: 75% reduction in harmonic interference for fast symbol rates
- Multi-Party Support: Run multiple parties on the same machine with different audio tones
- Ollama Integration: Connect to Ollama via Docker, terminal (ollama run), or HTTP API with conversation context
- External Party Support: Protocol allows external parties to join the conversation
- Self-Recognition: Parties can recognize their own output, even on the same machine
- Rich Text Interface: Color-coded transmission status (yellow=waiting, blue=sending, green=sent)
- Real-time Receiving Display: See incoming messages as they're decoded
- Web Interface: Browser-based UI with audio waveform and spectrogram visualization
- Distinctive Audio Signals: Rising chirp for start, falling chirp for end of transmission
- Conversation Logging: Save all communications to a log file
- System Monitoring: CPU, RAM, and GPU usage tracking
- Easy Configuration: YAML-based configuration for all settings
- Hot Reload: Configuration changes apply without restarting
# Clone the repository
git clone https://github.com/maple-underscore/muwave.git
cd muwave
# Install with pip
pip install -e .
# Or install with optional dependencies for development
pip install -e ".[dev]"

- Python 3.9+
- numpy, scipy for signal processing
- sounddevice for audio I/O
- rich for terminal interface
- Flask, flask-socketio for web interface
- psutil for system monitoring
- requests for Ollama HTTP API
- PyYAML for configuration
muwave init

This creates a config.yaml file with default settings.
muwave run --name "Alice"

muwave send "Hello, World!"

muwave generate "Hello, World!" -o hello.wav

This encodes the text into an FSK-modulated audio signal and saves it as a WAV file. The file can be downloaded and played on any device.
muwave listen --timeout 30

muwave ai "What is the capital of France?" --transmit

muwave web --port 5000

Then open http://localhost:5000 in your browser.
For a one-step setup, run the quickstart script:
./quickstart.sh

This script will:
- Check prerequisites (Python 3.9+, pip)
- Install muwave and its dependencies
- Initialize a configuration file
- Display available commands
Edit config.yaml to customize muwave:
# Transmission speed
speed:
  mode: medium  # slow, medium, fast
  modes:
    slow:
      symbol_duration_ms: 100
    medium:
      symbol_duration_ms: 50
    fast:
      symbol_duration_ms: 25

# Redundancy settings
redundancy:
  mode: medium  # low, medium, high
  modes:
    low:
      repetitions: 1
      error_correction: false
    medium:
      repetitions: 2
      error_correction: true
    high:
      repetitions: 3
      error_correction: true

# Ollama integration
ollama:
  mode: docker  # docker, terminal, http
  docker:
    container_name: "ollama"
    host: "localhost"
    port: 11434
  model:
    name: "llama3.2"
    keep_context: true

Start Ollama in Docker:
muwave docker-start

This creates and starts an Ollama container.
Use the ollama run command directly:
ollama:
  mode: terminal
  model:
    name: "llama3.2"

Connect to a running Ollama instance:
ollama:
  mode: http
  http:
    base_url: "http://localhost:11434"

muwave maintains conversation context across messages:
from muwave.ollama.client import OllamaClient, OllamaConfig
client = OllamaClient(OllamaConfig(
    model="llama3.2",
    keep_context=True,
))
# First message
response1 = client.chat("What is the capital of France?")
# Response: "The capital of France is Paris."
# Second message (with context)
response2 = client.chat("What is its population?")
# Response: "Paris has a population of approximately 2.1 million..."The web interface provides:
- Dashboard: System stats, party list, audio visualization
- User Input: Compose and send messages with AI support
- Party Pages: Individual views for each party with conversation log
muwave web --port 5000 --party "Alice" --party "Bob"

- / - Dashboard with system overview
- /user - User input interface
- /party/<party_id> - Party-specific interface
- Real-time audio waveform visualization
- Spectrogram history display
- Message queue management
- CPU/RAM/GPU monitoring
- Different audio tones for different parties
Run multiple parties on the same machine:
# Terminal 1
muwave run --name "Alice"
# Terminal 2
muwave run --name "Bob"

Each party has a unique signature and can recognize its own transmissions to avoid feedback loops.
muwave uses Frequency-Shift Keying (FSK) to encode data:
- 16 frequency bins for encoding (4 bits per symbol)
- Each byte encoded as two symbols (nibbles)
- Configurable symbol duration for speed/reliability tradeoff
- Start Signal: Rising chirp (500 Hz → 800 Hz)
- Party Signature: 8 bytes identifying the sender
- Data Length: 2 bytes (max 65535 bytes)
- Data: Encoded message (with optional repetitions)
- End Signal: Falling chirp (900 Hz → 600 Hz)
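To make the encoding and framing concrete, here is a minimal sketch of the nibble-to-tone mapping and frame layout. The base frequency, frequency step, and length-field byte order are assumptions made for the example, not muwave's actual defaults or wire format; the real transmitter additionally applies repetitions, pulse shaping, and the start/end chirps listed above.

```python
# Illustrative only: frequency values and the ">H" byte order are assumed
# for this sketch; muwave's real modulator/framer lives in its protocol package.
import struct
import numpy as np

BASE_FREQ_HZ = 1000.0   # placeholder for the lowest of the 16 bins
FREQ_STEP_HZ = 100.0    # placeholder spacing between adjacent bins
SAMPLE_RATE = 44100

def nibble_to_frequency(nibble: int) -> float:
    """Map a 4-bit value (0-15) to one of 16 FSK frequency bins."""
    return BASE_FREQ_HZ + nibble * FREQ_STEP_HZ

def byte_to_tones(byte: int, symbol_duration_s: float = 0.05) -> np.ndarray:
    """Encode one byte as two tones: high nibble first, then low nibble."""
    tones = []
    for nibble in ((byte >> 4) & 0xF, byte & 0xF):
        t = np.arange(int(symbol_duration_s * SAMPLE_RATE)) / SAMPLE_RATE
        tones.append(np.sin(2 * np.pi * nibble_to_frequency(nibble) * t))
    return np.concatenate(tones)

def build_payload(signature: bytes, data: bytes) -> bytes:
    """Lay out the bytes between the start and end chirps:
    8-byte signature, 2-byte length, then the message data."""
    assert len(signature) == 8 and len(data) <= 0xFFFF
    return signature + struct.pack(">H", len(data)) + data
```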
Each party generates a unique 8-byte signature based on:
- UUID
- Process ID
- Timestamp
This allows parties to filter out their own transmissions when listening.
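As a rough sketch of how those three inputs could be folded into 8 bytes (muwave's actual derivation is internal to Party and may differ):

```python
# Illustrative only: hashes a UUID, the process ID, and a timestamp
# down to an 8-byte signature.
import hashlib
import os
import time
import uuid

def make_signature() -> bytes:
    seed = f"{uuid.uuid4()}:{os.getpid()}:{time.time_ns()}".encode()
    return hashlib.sha256(seed).digest()[:8]
```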
from muwave.core.party import Party
party = Party(name="Alice", is_ai=False)
party.set_system_prompt("You are a helpful assistant.")
message = party.create_message("Hello!")
party.mark_transmitted(message)

from muwave.protocol.transmitter import Transmitter
tx = Transmitter(party, config={
    "speed_mode": "medium",
    "repetitions": 2,
})

tx.transmit_text("Hello, World!")

from muwave.protocol.receiver import Receiver
rx = Receiver(party)
rx.set_message_callback(lambda msg, is_own: print(f"Received: {msg.content}"))
rx.start_listening()

from muwave.ollama.client import OllamaClient, OllamaConfig, OllamaMode
client = OllamaClient(OllamaConfig(
    mode=OllamaMode.DOCKER,
    model="llama3.2",
    keep_context=True,
))
response = client.chat("Hello!")
print(response)

Tip
Use muwave <command> --help to see detailed options for any command.
Start an AI-powered multi-party conversation using Ollama.
muwave ai --party alice # Start with default model
muwave ai --party bob --model llama2 # Use specific model
muwave ai --party charlie --mode terminal # Use terminal instead of Docker

Options:

- --party NAME: Party identifier (required)
- --model MODEL: Ollama model to use (default: from config)
- --mode MODE: Connection mode - docker, terminal, or http
Listen for incoming audio transmissions and decode them in real-time.
muwave listen # Use default config
muwave listen --config custom.yaml # Use custom configuration
muwave listen --log conversation.txt # Save to log file

Options:

- --config FILE: Configuration file path
- --log FILE: Save received messages to log file
- --verbose: Show detailed decoding information
Send a text message over audio.
muwave send "Hello world" # Send simple message
muwave send "Code example" --format python # Send with format
muwave send --file message.txt # Send from file

Options:

- --format FORMAT: Content format (auto, markdown, json, xml, yaml, html, code)
- --file FILE: Read message from file
- --speed SPEED: Transmission speed (slow, medium, fast)
- --channels N: Number of channels (1-4)
Run a party that can send and receive messages.
muwave run --party alice # Interactive mode
muwave run --party bob --auto-respond # Auto-respond to messages

Options:

- --party NAME: Party identifier (required)
- --auto-respond: Automatically respond to received messages
- --log FILE: Save conversation log
Generate audio files from text with timing information.
muwave generate "Test message" output.wav
muwave generate "Fast test" test.wav --symbol-duration 10
muwave generate --file input.txt output.wav --channels 4

Options:

- --symbol-duration MS: Symbol duration in milliseconds (10-200)
- --channels N: Number of channels (1-4)
- --file FILE: Read text from file
- --format FORMAT: Content format
Output: Shows encoding time and audio file location
Decode audio files to text with parallel processing support.
muwave decode input.wav # Decode with auto-detect
muwave decode input.wav --speed fast # Specific speed
muwave decode input.wav --speeds slow medium # Test multiple speeds
muwave decode input.wav --verbose # Show timestamps

Options:

- --speed SPEED: Force specific speed (slow, medium, fast)
- --speeds SPEED...: Test multiple speeds in parallel
- --verbose: Show detailed timing and confidence information
- --output FILE: Save decoded text to file
Output:
- Decoded message
- Confidence percentage
- Timestamp breakdown (with --verbose)
- Demodulator statistics
List all available audio input/output devices.
muwave devices # List all devices
muwave devices --input # Show only input devices
muwave devices --output # Show only output devices

Output: Device ID, name, channels, and default status
Initialize or regenerate configuration file.
muwave init # Create default config.yaml
muwave init --config custom.yaml # Create custom config
muwave init --force # Overwrite existing config

Options:

- --config FILE: Configuration file path (default: config.yaml)
- --force: Overwrite existing configuration
- --template TEMPLATE: Use specific configuration template
Start the web-based interface with real-time visualization.
muwave web # Start on default port 5000
muwave web --port 8080 # Use custom port
muwave web --host 0.0.0.0 # Allow external connections

Options:

- --port PORT: Port number (default: 5000)
- --host HOST: Host address (default: 127.0.0.1)
- --debug: Enable Flask debug mode
Features:
- Real-time waveform visualization
- Spectrogram display
- Send/receive messages through browser
- Party management interface
Start Ollama in Docker container for AI integration.
muwave docker-start # Use default model
muwave docker-start --model llama2 # Specify model

Options:

- --model MODEL: Ollama model to pull and use
- --port PORT: Docker port mapping
Tip
For a complete list of changes, see the Changelog.
Important
This release introduces a standardized metadata header format for reliable cross-configuration decoding.
- Magic Bytes: MW + alternating bits (0xAA 0x55) for reliable header detection
- Version Field: Protocol version tracking for future compatibility
- Checksum Validation: XOR checksum to detect corrupted headers
- Extended Fields: Now includes base frequency, frequency step, channel spacing, and signature length
- Factory Method: New FSKConfig.from_config() class method for cleaner instantiation
- Override Support: create_fsk_config() accepts base_frequency, frequency_step, and channel_spacing overrides
- Better Defaults: Uses field() for mutable default values (proper dataclass pattern)
- Refactored FSK Module: Improved organization with constants section and better documentation
- Type Safety: Added TYPE_CHECKING guards and forward references
- Enhanced CLI Output: Detailed metadata display during decode operations
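As a rough illustration of the checksum idea behind the metadata header described above — the field layout, sizes, and ordering here are assumptions for the sketch, not the actual muwave header format:

```python
# Illustrative only: field layout and sizes are assumed for this sketch.
MAGIC = bytes([0x4D, 0x57, 0xAA, 0x55])  # "MW" followed by alternating bits

def xor_checksum(payload: bytes) -> int:
    """Fold all payload bytes together with XOR into a single check byte."""
    check = 0
    for b in payload:
        check ^= b
    return check

def header_is_valid(header: bytes) -> bool:
    """Accept a header only if the magic bytes match and the final byte
    equals the XOR checksum of everything before it."""
    if len(header) < len(MAGIC) + 2:
        return False
    if header[: len(MAGIC)] != MAGIC:
        return False
    return header[-1] == xor_checksum(header[:-1])
```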
Important
This release focuses on performance, accuracy, and signal quality improvements.
- Parallel Decoding: Multi-threaded processing using ThreadPoolExecutor with up to 4 workers
  - Automatic activation for messages >20 bytes
  - Up to 4x faster decoding for large messages
  - Graceful fallback for small messages
- Timestamp Tracking: Comprehensive timing for all operations
  - Signature decoding time
  - Length field decoding time
  - Data decoding time
  - Total duration tracking
  - Visible with muwave decode --verbose
- Float64 Precision: Upgraded from float32 throughout signal chain
  - Signal generation now uses np.float64
  - Goertzel algorithm uses double precision
  - Reduces numerical errors by ~50%
- Adaptive Confidence Thresholds: Smart baseline adjustments
  - Fast symbols (10-20ms): 35% baseline
  - Medium symbols (30-40ms): 40% baseline
  - Slow symbols (50ms+): 45% baseline
  - Multi-channel penalty: -2% per channel
  - Eliminates false low-confidence warnings
- Blackman Windowing: Superior sidelobe suppression
  - Replaces Hamming window for better frequency isolation
  - -58 dB sidelobe attenuation vs -43 dB
  - Cleaner frequency detection in noisy environments
- Raised-Cosine Pulse Shaping: Smooth symbol transitions
  - Adaptive fade duration: 35% for symbols <30ms
  - Eliminates sharp edges that cause harmonics
  - 70% reduction in spectral spreading
- 3-Tap FIR Low-Pass Filter: Harmonic suppression
  - Cutoff at 0.45 × Nyquist frequency
  - Removes high-frequency artifacts
  - 80% reduction in out-of-band energy
- Signal Normalization: Prevents clipping
  - Per-channel normalization before mixing
  - Maintains full dynamic range
  - Prevents distortion in multi-channel mode
Result: 75% improvement in spectral purity, visible as cleaner spectrograms with reduced green-blue interference bands.
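For reference, the Goertzel algorithm mentioned in the precision notes above reduces to a short loop. This is a textbook single-bin version in float64, not muwave's demodulator, which additionally applies the windowing and filtering listed above:

```python
# Textbook Goertzel magnitude for a single frequency bin, in float64.
# Shown only to illustrate the technique referenced above.
import numpy as np

def goertzel_magnitude(samples: np.ndarray, target_freq: float, sample_rate: float) -> float:
    x = np.asarray(samples, dtype=np.float64)
    omega = 2.0 * np.pi * target_freq / sample_rate
    coeff = 2.0 * np.cos(omega)
    s_prev, s_prev2 = 0.0, 0.0
    for sample in x:
        s = sample + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Power of the target bin after processing the whole block
    power = s_prev2**2 + s_prev**2 - coeff * s_prev * s_prev2
    return float(np.sqrt(max(power, 0.0)))
```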
- Added INFO/PERFORMANCE_IMPROVEMENTS.md: Details on parallel processing, precision, and timestamps
- Added INFO/ANTI_INTERFERENCE_IMPROVEMENTS.md: Technical analysis of signal quality improvements
- Enhanced all INFO documents with GitHub-style callouts
- Updated CLI help text with timing information
- Multi-channel support (1-4 channels)
- Format auto-detection (Markdown, JSON, XML, YAML, HTML)
- Improved Ollama integration with HTTP API mode
- Web interface with spectrogram visualization
- Added web interface with Flask
- Party management system
- Conversation logging
- System monitoring (CPU/RAM/GPU)
- Basic FSK audio protocol
- Ollama integration (Docker and terminal modes)
- Multi-party support
- Rich terminal interface
- YAML configuration
Note
All benchmarks performed on Ubuntu 24.04 with 4-core CPU at 44.1 kHz sample rate.
| Message Size | Channels | Symbol Duration | Time (v0.1.2) | Time (v0.1.4) | Improvement |
|---|---|---|---|---|---|
| 10 bytes | 1 | 40ms | 0.125s | 0.118s | 5.6% |
| 50 bytes | 1 | 40ms | 0.312s | 0.289s | 7.4% |
| 100 bytes | 4 | 20ms | 0.445s | 0.398s | 10.6% |
Improvement from float64 precision and optimized signal generation
| Message Size | Workers | Time (v0.1.2) | Time (v0.1.4) | Speedup |
|---|---|---|---|---|
| 10 bytes | 1 | 0.815s | 0.815s | 1.0x |
| 25 bytes | 4 | 1.234s | 0.445s | 2.8x |
| 50 bytes | 4 | 2.156s | 0.687s | 3.1x |
| 100 bytes | 4 | 4.023s | 1.012s | 4.0x |
Parallel processing activates automatically for messages >20 bytes
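The chunked, thread-pooled decoding pattern behind these numbers can be sketched as follows; the chunking and the per-chunk decode function are illustrative stand-ins, not muwave's internal API:

```python
# Illustrative only: demodulate_chunk is a stand-in for a real per-chunk decoder.
from concurrent.futures import ThreadPoolExecutor

def decode_parallel(symbol_chunks, demodulate_chunk, max_workers: int = 4):
    """Decode chunks of symbols concurrently and reassemble them in order."""
    if len(symbol_chunks) <= 1:  # small messages: no benefit from threads
        return [demodulate_chunk(c) for c in symbol_chunks]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so the decoded pieces concatenate correctly
        return list(pool.map(demodulate_chunk, symbol_chunks))
```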
| Symbol Rate | Channels | Confidence (v0.1.2) | Confidence (v0.1.4) | Notes |
|---|---|---|---|---|
| 10ms | 1 | 36% (warning) | 35% (expected) | Adaptive threshold |
| 20ms | 1 | 40% (warning) | 40% (expected) | Correct baseline |
| 40ms | 4 | 48% (ok) | 51% (good) | Better precision |
Adaptive thresholds eliminate false warnings while maintaining detection accuracy
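A small sketch reconstructing the baseline figures quoted above; how muwave actually applies the per-channel penalty is an assumption here:

```python
# Illustrative reconstruction of the adaptive baselines listed in the changelog;
# the exact cutoff points and per-channel handling are assumptions.
def confidence_threshold(symbol_duration_ms: float, channels: int = 1) -> float:
    if symbol_duration_ms < 30:        # fast symbols (10-20 ms)
        baseline = 0.35
    elif symbol_duration_ms < 50:      # medium symbols (30-40 ms)
        baseline = 0.40
    else:                              # slow symbols (50 ms and up)
        baseline = 0.45
    # Quoted penalty: -2% per channel (applied beyond the first here)
    return baseline - 0.02 * (channels - 1)
```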
pytest tests/ -v

black muwave/
isort muwave/
mypy muwave/

MIT License - see LICENSE file for details.
Contributions are welcome! Please read our contributing guidelines before submitting pull requests.
Inspired by ggwave and gibberlink.