e2e: add daily devnet QA test for device provisioning#2832

Open
martinsander00 wants to merge 2 commits into main from ms/2755

martinsander00 commented Feb 5, 2026

Resolves: #2755

Summary

  • Add a QA test that exercises the full device provisioning lifecycle as defined in RFC12
  • The test deletes and recreates a device and its links, which triggers assignment of a new pubkey via PDA
  • Runs Ansible playbooks to restart both the doublezero-agent and doublezero-telemetry daemons
  • The next day's run validates that provisioning succeeded by checking device health and status
  • Command-line flags (-device, -bm-host) allow future use on testnet and mainnet-beta
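For reviewers unfamiliar with custom test flags, here is a minimal sketch of how `-env`, `-device`, and `-bm-host` could be parsed with the standard `flag` package. The type and function names (`provisioningConfig`, `parseProvisioningFlags`) are hypothetical, and treating `-device` and `-bm-host` as required is an assumption for illustration, not necessarily what the test in this PR does.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// provisioningConfig holds the per-run targets (hypothetical type).
type provisioningConfig struct {
	Env    string // e.g. devnet
	Device string // device code to reprovision
	BMHost string // bare-metal host used to reach the CLI over SSH
}

// parseProvisioningFlags parses the three flags from args.
// Assumption: -device and -bm-host are mandatory.
func parseProvisioningFlags(args []string) (provisioningConfig, error) {
	var cfg provisioningConfig
	fs := flag.NewFlagSet("qa-provisioning", flag.ContinueOnError)
	fs.StringVar(&cfg.Env, "env", "", "target environment")
	fs.StringVar(&cfg.Device, "device", "", "device code to delete and recreate")
	fs.StringVar(&cfg.BMHost, "bm-host", "", "bare-metal host to SSH into")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	if cfg.Device == "" || cfg.BMHost == "" {
		return cfg, errors.New("both -device and -bm-host are required")
	}
	return cfg, nil
}

func main() {
	// Same flag values as the invocation shown in this PR.
	args := []string{"-env=devnet", "-device=chi-dn-dzd4", "-bm-host=chi-dn-bm2"}
	cfg, err := parseProvisioningFlags(args)
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("env=%s device=%s bm-host=%s\n", cfg.Env, cfg.Device, cfg.BMHost)
}
```

In a real `_test.go` file the flags would more likely be registered at package level with `flag.String`, so that `go test` forwards them after the package path; the separate `FlagSet` here only keeps the sketch self-contained.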

Testing Verification

  • Manual testing will be done once the GitHub workflow is added to the infra repo
  • The test is designed to run daily; each run validates the previous day's provisioning

Current state:

doublezero ms/2755 ❯ go test -tags qa -run TestQA_DeviceProvisioning -v -count=1 ./e2e -env=devnet -device=chi-dn-dzd4 -bm-host=chi-dn-bm2
=== RUN   TestQA_DeviceProvisioning
    qa_provisioning_test.go:50: Starting provisioning test for device chi-dn-dzd4 (CLI via SSH to chi-dn-bm2)
    qa_provisioning_test.go:56: ==> Verifying device is healthy (validates previous provisioning)
    qa_provisioning_test.go:65: Current device pubkey: 76vzLg3gha3jVYfCKEfMjwxjNAYzWGewySczhysura98
    qa_provisioning_test.go:67: ==> Capturing device and link configuration
    qa_provisioning_test.go:73: Found 0 links connected to device
    qa_provisioning_test.go:79: ==> Deleting links connected to device
    qa_provisioning_test.go:86: ==> Deleting 0 interfaces on device
    qa_provisioning_test.go:94: ==> Waiting for device reference count to reach zero
    qa_provisioning_test.go:98: ==> Deleting device chi-dn-dzd4 (pubkey: 76vzLg3gha3jVYfCKEfMjwxjNAYzWGewySczhysura98)
    qa_provisioning_test.go:102: ==> Recreating device
    qa_provisioning_test.go:106: New device pubkey: EhsfjghCizvjoVdSRUJbgmF4c26uJ6SwZfqfaVHYVJr8
    qa_provisioning_test.go:109: ==> Creating 0 interfaces
    qa_provisioning_test.go:120: ==> Recreating links
    qa_provisioning_test.go:129: ==> Setting device max-users and desired-status
    qa_provisioning_test.go:135: ==> Restarting agents with new pubkey via Ansible
    qa_test.go:73: 2026-02-06T16:57:55.310Z INF Running Ansible to restart doublezero-agent device=chi-dn-dzd4 pubkey=EhsfjghCizvjoVdSRUJbgmF4c26uJ6SwZfqfaVHYVJr8
    qa_provisioning_test.go:137: 
                Error Trace:    /home/martin/Documents/malbec/doublezero/e2e/qa_provisioning_test.go:137
                Error:          Received unexpected error:
                                ansible agents.yml failed: exit status 1, output: [ERROR]: the playbook: ../infra/ansible/playbooks/agents.yml could not be found
                Test:           TestQA_DeviceProvisioning
                Messages:       failed to restart agents via Ansible
--- FAIL: TestQA_DeviceProvisioning (5.58s)
FAIL
FAIL    github.com/malbeclabs/doublezero/e2e    5.585s
FAIL

martinsander00 force-pushed the ms/2755 branch 7 times, most recently from 3c6e2b8 to 7ddb3cd on February 6, 2026 17:24
Add a QA test that exercises the full device provisioning lifecycle as
defined in RFC12. The test:

1. Verifies device is healthy (validates previous day's provisioning)
2. Deletes device and links from the ledger
3. Recreates device and links (gets new pubkey via PDA)
4. Creates required interfaces (Loopback255, Loopback256)
5. Runs Ansible to restart doublezero-agent and doublezero-telemetry
6. Verifies device was recreated with new pubkey

The next day's run validates provisioning succeeded by checking
health=ready-for-users and status=activated.

Command-line flags (-device, -bm-host) allow future use on testnet
and mainnet-beta environments.

Resolves: #2755
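
The two checks described in this commit message (step 1 on the following day, and step 6 right after recreation) could look roughly like the sketch below. The `deviceState` struct and both function names are hypothetical; only the expected values `ready-for-users` and `activated` come from the text above. The state in `main` is illustrative input, not observed output.

```go
package main

import (
	"errors"
	"fmt"
)

// deviceState is a hypothetical, simplified view of a device record.
type deviceState struct {
	Code   string
	Pubkey string
	Health string
	Status string
}

// checkProvisioned mirrors step 1: the previous run's provisioning is
// considered successful only if health and status have the expected values.
func checkProvisioned(d deviceState) error {
	if d.Health != "ready-for-users" || d.Status != "activated" {
		return fmt.Errorf("device %s (pubkey %s): health=%q status=%q, want health=%q status=%q",
			d.Code, d.Pubkey, d.Health, d.Status, "ready-for-users", "activated")
	}
	return nil
}

// checkRecreated mirrors step 6: recreation must yield a new, non-empty pubkey.
func checkRecreated(oldPubkey, newPubkey string) error {
	if newPubkey == "" {
		return errors.New("recreated device has no pubkey")
	}
	if newPubkey == oldPubkey {
		return fmt.Errorf("pubkey unchanged after recreation: %s", oldPubkey)
	}
	return nil
}

func main() {
	// Illustrative state only; pubkeys are the ones printed in the test log.
	d := deviceState{
		Code:   "chi-dn-dzd4",
		Pubkey: "EhsfjghCizvjoVdSRUJbgmF4c26uJ6SwZfqfaVHYVJr8",
		Health: "ready-for-users",
		Status: "activated",
	}
	fmt.Println("provisioned check error:", checkProvisioned(d))
	fmt.Println("recreated check error:",
		checkRecreated("76vzLg3gha3jVYfCKEfMjwxjNAYzWGewySczhysura98", d.Pubkey))
}
```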
Development

Successfully merging this pull request may close these issues.

e2e: add a daily devnet QA test that exercises the provisioning process