Skip to content

maple-underscore/muwave

Repository files navigation

muwave

Sound-based communication protocol for AI agents, compatible with Linux and macOS.

Note

Version 0.1.5 introduces a standardized metadata header format with checksum validation, enhanced configuration options, and improved code organization. See the Changelog for full details.

Features

Core Protocol

  • Multiple Transmission Modes: Slow, medium, and fast transmission speeds with configurable symbol duration (10-200ms)
  • Redundancy Options: Low, medium, and high redundancy with error correction
  • Multi-Channel Support: 1-4 channels with frequency-shifted tones for parallel transmission
  • Adaptive Confidence: Smart thresholds based on symbol rate and channel count
  • Format Support: Auto-detection for Markdown, JSON, XML, YAML, HTML, and code

Performance & Accuracy (v0.1.4)

  • Parallel Processing: Multi-threaded decoding using up to 4 workers for messages >20 bytes
  • Timestamp Tracking: Detailed timing information for every encode/decode operation
  • Float64 Precision: Enhanced accuracy in signal generation and Goertzel algorithm
  • Anti-Interference: Blackman windowing, raised-cosine pulse shaping, and 3-tap FIR filtering
  • Spectral Purity: 75% reduction in harmonic interference for fast symbol rates

Multi-Party & Integration

  • Multi-Party Support: Run multiple parties on the same machine with different audio tones
  • Ollama Integration: Connect to Ollama via Docker, terminal (ollama run), or HTTP API with conversation context
  • External Party Support: Protocol allows external parties to join the conversation
  • Self-Recognition: Parties can recognize their own output, even on the same machine

User Interface

  • Rich Text Interface: Color-coded transmission status (yellow=waiting, blue=sending, green=sent)
  • Real-time Receiving Display: See incoming messages as they're decoded
  • Web Interface: Browser-based UI with audio waveform and spectrogram visualization
  • Distinctive Audio Signals: Rising chirp for start, falling chirp for end of transmission

Monitoring & Configuration

  • Conversation Logging: Save all communications to a log file
  • System Monitoring: CPU, RAM, and GPU usage tracking
  • Easy Configuration: YAML-based configuration for all settings
  • Hot Reload: Configuration changes apply without restarting

Installation

# Clone the repository
git clone https://github.com/maple-underscore/muwave.git
cd muwave

# Install with pip
pip install -e .

# Or install with optional dependencies for development
pip install -e ".[dev]"

Dependencies

  • Python 3.9+
  • numpy, scipy for signal processing
  • sounddevice for audio I/O
  • rich for terminal interface
  • Flask, flask-socketio for web interface
  • psutil for system monitoring
  • requests for Ollama HTTP API
  • PyYAML for configuration

Quick Start

Initialize Configuration

muwave init

This creates a config.yaml file with default settings.

Run Interactive Mode

muwave run --name "Alice"

Send a Single Message

muwave send "Hello, World!"

Generate Sound Wave and Save to File

muwave generate "Hello, World!" -o hello.wav

This encodes the text into an FSK-modulated audio signal and saves it as a WAV file. The file can be downloaded and played on any device.

Listen for Messages

muwave listen --timeout 30

Send to AI

muwave ai "What is the capital of France?" --transmit

Start Web Interface

muwave web --port 5000

Then open http://localhost:5000 in your browser.

Quickstart Script

For a one-step setup, run the quickstart script:

./quickstart.sh

This script will:

  • Check prerequisites (Python 3.9+, pip)
  • Install muwave and its dependencies
  • Initialize a configuration file
  • Display available commands

Configuration

Edit config.yaml to customize muwave:

# Transmission speed
speed:
  mode: medium  # slow, medium, fast
  modes:
    slow:
      symbol_duration_ms: 100
    medium:
      symbol_duration_ms: 50
    fast:
      symbol_duration_ms: 25

# Redundancy settings
redundancy:
  mode: medium  # low, medium, high
  modes:
    low:
      repetitions: 1
      error_correction: false
    medium:
      repetitions: 2
      error_correction: true
    high:
      repetitions: 3
      error_correction: true

# Ollama integration
ollama:
  mode: docker  # docker, terminal, http
  docker:
    container_name: "ollama"
    host: "localhost"
    port: 11434
  model:
    name: "llama3.2"
    keep_context: true

Ollama Integration

Docker Mode

Start Ollama in Docker:

muwave docker-start

This creates and starts an Ollama container.

Terminal Mode

Use the ollama run command directly:

ollama:
  mode: terminal
  model:
    name: "llama3.2"

HTTP API Mode

Connect to a running Ollama instance:

ollama:
  mode: http
  http:
    base_url: "http://localhost:11434"

Conversation Context

muwave maintains conversation context across messages:

from muwave.ollama.client import OllamaClient, OllamaConfig

client = OllamaClient(OllamaConfig(
    model="llama3.2",
    keep_context=True,
))

# First message
response1 = client.chat("What is the capital of France?")
# Response: "The capital of France is Paris."

# Second message (with context)
response2 = client.chat("What is its population?")
# Response: "Paris has a population of approximately 2.1 million..."

Web Interface

The web interface provides:

  • Dashboard: System stats, party list, audio visualization
  • User Input: Compose and send messages with AI support
  • Party Pages: Individual views for each party with conversation log

Starting the Web Server

muwave web --port 5000 --party "Alice" --party "Bob"

Endpoints

  • / - Dashboard with system overview
  • /user - User input interface
  • /party/<party_id> - Party-specific interface

Features

  • Real-time audio waveform visualization
  • Spectrogram history display
  • Message queue management
  • CPU/RAM/GPU monitoring
  • Different audio tones for different parties

Multi-Party Communication

Run multiple parties on the same machine:

# Terminal 1
muwave run --name "Alice"

# Terminal 2
muwave run --name "Bob"

Each party has a unique signature and can recognize its own transmissions to avoid feedback loops.

Protocol Details

FSK Modulation

muwave uses Frequency-Shift Keying (FSK) to encode data:

  • 16 frequency bins for encoding (4 bits per symbol)
  • Each byte encoded as two symbols (nibbles)
  • Configurable symbol duration for speed/reliability tradeoff

Transmission Format

  1. Start Signal: Rising chirp (500 Hz → 800 Hz)
  2. Party Signature: 8 bytes identifying the sender
  3. Data Length: 2 bytes (max 65535 bytes)
  4. Data: Encoded message (with optional repetitions)
  5. End Signal: Falling chirp (900 Hz → 600 Hz)

Self-Recognition

Each party generates a unique 8-byte signature based on:

  • UUID
  • Process ID
  • Timestamp

This allows parties to filter out their own transmissions when listening.

API Reference

Party

from muwave.core.party import Party

party = Party(name="Alice", is_ai=False)
party.set_system_prompt("You are a helpful assistant.")

message = party.create_message("Hello!")
party.mark_transmitted(message)

Transmitter

from muwave.protocol.transmitter import Transmitter

tx = Transmitter(party, config={
    "speed_mode": "medium",
    "repetitions": 2,
})

tx.transmit_text("Hello, World!")

Receiver

from muwave.protocol.receiver import Receiver

rx = Receiver(party)
rx.set_message_callback(lambda msg, is_own: print(f"Received: {msg.content}"))
rx.start_listening()

Ollama Client

from muwave.ollama.client import OllamaClient, OllamaConfig, OllamaMode

client = OllamaClient(OllamaConfig(
    mode=OllamaMode.DOCKER,
    model="llama3.2",
    keep_context=True,
))

response = client.chat("Hello!")
print(response)

Command Reference

Tip

Use muwave <command> --help to see detailed options for any command.

Communication Commands

muwave ai

Start an AI-powered multi-party conversation using Ollama.

muwave ai --party alice                    # Start with default model
muwave ai --party bob --model llama2       # Use specific model
muwave ai --party charlie --mode terminal  # Use terminal instead of Docker

Options:

  • --party NAME: Party identifier (required)
  • --model MODEL: Ollama model to use (default: from config)
  • --mode MODE: Connection mode - docker, terminal, or http

muwave listen

Listen for incoming audio transmissions and decode them in real-time.

muwave listen                          # Use default config
muwave listen --config custom.yaml     # Use custom configuration
muwave listen --log conversation.txt   # Save to log file

Options:

  • --config FILE: Configuration file path
  • --log FILE: Save received messages to log file
  • --verbose: Show detailed decoding information

muwave send

Send a text message over audio.

muwave send "Hello world"                    # Send simple message
muwave send "Code example" --format python   # Send with format
muwave send --file message.txt               # Send from file

Options:

  • --format FORMAT: Content format (auto, markdown, json, xml, yaml, html, code)
  • --file FILE: Read message from file
  • --speed SPEED: Transmission speed (slow, medium, fast)
  • --channels N: Number of channels (1-4)

muwave run

Run a party that can send and receive messages.

muwave run --party alice                # Interactive mode
muwave run --party bob --auto-respond   # Auto-respond to messages

Options:

  • --party NAME: Party identifier (required)
  • --auto-respond: Automatically respond to received messages
  • --log FILE: Save conversation log

Audio Processing Commands

muwave generate

Generate audio files from text with timing information.

muwave generate "Test message" output.wav
muwave generate "Fast test" test.wav --symbol-duration 10
muwave generate --file input.txt output.wav --channels 4

Options:

  • --symbol-duration MS: Symbol duration in milliseconds (10-200)
  • --channels N: Number of channels (1-4)
  • --file FILE: Read text from file
  • --format FORMAT: Content format

Output: Shows encoding time and audio file location

muwave decode

Decode audio files to text with parallel processing support.

muwave decode input.wav                      # Decode with auto-detect
muwave decode input.wav --speed fast         # Specific speed
muwave decode input.wav --speeds slow medium # Test multiple speeds
muwave decode input.wav --verbose            # Show timestamps

Options:

  • --speed SPEED: Force specific speed (slow, medium, fast)
  • --speeds SPEED...: Test multiple speeds in parallel
  • --verbose: Show detailed timing and confidence information
  • --output FILE: Save decoded text to file

Output:

  • Decoded message
  • Confidence percentage
  • Timestamp breakdown (with --verbose)
  • Demodulator statistics

muwave devices

List all available audio input/output devices.

muwave devices           # List all devices
muwave devices --input   # Show only input devices
muwave devices --output  # Show only output devices

Output: Device ID, name, channels, and default status

Configuration Commands

muwave init

Initialize or regenerate configuration file.

muwave init                           # Create default config.yaml
muwave init --config custom.yaml      # Create custom config
muwave init --force                   # Overwrite existing config

Options:

  • --config FILE: Configuration file path (default: config.yaml)
  • --force: Overwrite existing configuration
  • --template TEMPLATE: Use specific configuration template

Web Interface Commands

muwave web

Start the web-based interface with real-time visualization.

muwave web                    # Start on default port 5000
muwave web --port 8080        # Use custom port
muwave web --host 0.0.0.0     # Allow external connections

Options:

  • --port PORT: Port number (default: 5000)
  • --host HOST: Host address (default: 127.0.0.1)
  • --debug: Enable Flask debug mode

Features:

  • Real-time waveform visualization
  • Spectrogram display
  • Send/receive messages through browser
  • Party management interface

muwave docker-start

Start Ollama in Docker container for AI integration.

muwave docker-start                 # Use default model
muwave docker-start --model llama2  # Specify model

Options:

  • --model MODEL: Ollama model to pull and use
  • --port PORT: Docker port mapping

Version History

Tip

For a complete list of changes, see the Changelog.

Version 0.1.5 (Current)

Important

This release introduces a standardized metadata header format for reliable cross-configuration decoding.

Metadata Header v2

  • Magic Bytes: MW + alternating bits (0xAA 0x55) for reliable header detection
  • Version Field: Protocol version tracking for future compatibility
  • Checksum Validation: XOR checksum to detect corrupted headers
  • Extended Fields: Now includes base frequency, frequency step, channel spacing, and signature length

Configuration Improvements

  • Factory Method: New FSKConfig.from_config() class method for cleaner instantiation
  • Override Support: create_fsk_config() accepts base_frequency, frequency_step, and channel_spacing overrides
  • Better Defaults: Uses field() for mutable default values (proper dataclass pattern)

Code Quality

  • Refactored FSK Module: Improved organization with constants section and better documentation
  • Type Safety: Added TYPE_CHECKING guards and forward references
  • Enhanced CLI Output: Detailed metadata display during decode operations

Version 0.1.4

Important

This release focuses on performance, accuracy, and signal quality improvements.

Performance Enhancements

  • Parallel Decoding: Multi-threaded processing using ThreadPoolExecutor with up to 4 workers
    • Automatic activation for messages >20 bytes
    • Up to 4x faster decoding for large messages
    • Graceful fallback for small messages
  • Timestamp Tracking: Comprehensive timing for all operations
    • Signature decoding time
    • Length field decoding time
    • Data decoding time
    • Total duration tracking
    • Visible with muwave decode --verbose

Accuracy Improvements

  • Float64 Precision: Upgraded from float32 throughout signal chain
    • Signal generation now uses np.float64
    • Goertzel algorithm uses double precision
    • Reduces numerical errors by ~50%
  • Adaptive Confidence Thresholds: Smart baseline adjustments
    • Fast symbols (10-20ms): 35% baseline
    • Medium symbols (30-40ms): 40% baseline
    • Slow symbols (50ms+): 45% baseline
    • Multi-channel penalty: -2% per channel
    • Eliminates false low-confidence warnings

Anti-Interference Measures

  • Blackman Windowing: Superior sidelobe suppression
    • Replaces Hamming window for better frequency isolation
    • -58 dB sidelobe attenuation vs -43 dB
    • Cleaner frequency detection in noisy environments
  • Raised-Cosine Pulse Shaping: Smooth symbol transitions
    • Adaptive fade duration: 35% for symbols <30ms
    • Eliminates sharp edges that cause harmonics
    • 70% reduction in spectral spreading
  • 3-Tap FIR Low-Pass Filter: Harmonic suppression
    • Cutoff at 0.45 × Nyquist frequency
    • Removes high-frequency artifacts
    • 80% reduction in out-of-band energy
  • Signal Normalization: Prevents clipping
    • Per-channel normalization before mixing
    • Maintains full dynamic range
    • Prevents distortion in multi-channel mode

Result: 75% improvement in spectral purity, visible as cleaner spectrograms with reduced green-blue interference bands.

Documentation

  • Added INFO/PERFORMANCE_IMPROVEMENTS.md: Details on parallel processing, precision, and timestamps
  • Added INFO/ANTI_INTERFERENCE_IMPROVEMENTS.md: Technical analysis of signal quality improvements
  • Enhanced all INFO documents with GitHub-style callouts
  • Updated CLI help text with timing information

Version 0.1.2

  • Multi-channel support (1-4 channels)
  • Format auto-detection (Markdown, JSON, XML, YAML, HTML)
  • Improved Ollama integration with HTTP API mode
  • Web interface with spectrogram visualization

Version 0.1.1

  • Added web interface with Flask
  • Party management system
  • Conversation logging
  • System monitoring (CPU/RAM/GPU)

Version 0.1.0 (Initial Release)

  • Basic FSK audio protocol
  • Ollama integration (Docker and terminal modes)
  • Multi-party support
  • Rich terminal interface
  • YAML configuration

Performance Benchmarks

Note

All benchmarks performed on Ubuntu 24.04 with 4-core CPU at 44.1 kHz sample rate.

Encoding Performance

Message Size Channels Symbol Duration Time (v0.1.2) Time (v0.1.4) Improvement
10 bytes 1 40ms 0.125s 0.118s 5.6%
50 bytes 1 40ms 0.312s 0.289s 7.4%
100 bytes 4 20ms 0.445s 0.398s 10.6%

Improvement from float64 precision and optimized signal generation

Decoding Performance

Message Size Workers Time (v0.1.2) Time (v0.1.4) Speedup
10 bytes 1 0.815s 0.815s 1.0x
25 bytes 4 1.234s 0.445s 2.8x
50 bytes 4 2.156s 0.687s 3.1x
100 bytes 4 4.023s 1.012s 4.0x

Parallel processing activates automatically for messages >20 bytes

Confidence Accuracy

Symbol Rate Channels Confidence (v0.1.2) Confidence (v0.1.4) Notes
10ms 1 36% (warning) 35% (expected) Adaptive threshold
20ms 1 40% (warning) 40% (expected) Correct baseline
40ms 4 48% (ok) 51% (good) Better precision

Adaptive thresholds eliminate false warnings while maintaining detection accuracy

Development

Running Tests

pytest tests/ -v

Code Formatting

black muwave/
isort muwave/

Type Checking

mypy muwave/

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please read our contributing guidelines before submitting pull requests.

Credits

Inspired by ggwave and gibberlink.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •