@ryanbreen ryanbreen commented Feb 3, 2026

Summary

  • Remove x86_64-only conditional compilation from TCP and UDP packet handling in ipv4.rs
  • Implement TCP read/write syscalls for ARM64 in io.rs
  • Add O_NONBLOCK support for TCP socket reads on both architectures

Changes

Commit 1: Enable TCP/UDP packet handling for all architectures

  • Remove #[cfg(target_arch = "x86_64")] from TCP/UDP match arms in handle_ipv4()
  • Fixes loopback networking on ARM64 (packets were silently dropped)

Commit 2: Implement TCP read/write syscalls for ARM64

  • Add TcpConnection handling to ARM64 sys_read and sys_write in io.rs
  • Mirrors x86-64 behavior with WFI instead of HLT for blocking

Commit 3: Implement O_NONBLOCK for TCP socket reads

  • Capture is_nonblocking flag from fd_entry.status_flags
  • Return EAGAIN immediately when no data available and O_NONBLOCK is set
  • Add TEST 4b to tcp_blocking_test.rs

Test Results (ARM64)

Test                 Status
tcp_socket_test      ✅ PASS
tcp_blocking_test    ✅ PASS (tests 1-2)
udp_socket_test      ✅ PASS
unix_socket_test     ✅ PASS
fcntl_test           ✅ PASS
nonblock_test        ✅ PASS

🤖 Generated with Claude Code

ryanbreen and others added 8 commits February 3, 2026 07:53
…tion

Replace hardcoded pointer validation in sys_socketpair with architecture-aware
is_valid_user_address() function to fix EFAULT errors on ARM64.

Problem:
- socketpair() was failing with EFAULT on ARM64 because the hardcoded check
  `sv_ptr > 0x7FFFFFFFFFFF` rejected ARM64 stack addresses
- ARM64 stacks are at 0x0000_FFFF_FF00_0000, which is > 0x7FFFFFFFFFFF
- This prevented unix_socket_test and other socket-based tests from running

Solution:
- Use is_valid_user_address() which correctly validates addresses for both
  x86-64 and ARM64 architectures
- This function checks if addresses fall within valid userspace regions:
  - Code/data region (0x40000000 - 0x80000000)
  - Mmap region (0x7000_0000_0000 - 0x7FFF_FE00_0000)
  - Stack region (varies by architecture)
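The region check described above can be sketched roughly as follows. This is an illustration, not the kernel's code: the code/data and mmap bounds come from the commit message, but treating them as half-open ranges, the `ARM64_STACK_SKETCH` window, and the extra `stack` parameter are assumptions made so the example is self-contained.

```rust
use core::ops::Range;

/// Code/data region quoted in the commit message (assumed half-open).
const CODE_DATA: Range<u64> = 0x4000_0000..0x8000_0000;
/// Mmap region quoted in the commit message (assumed half-open).
const MMAP: Range<u64> = 0x7000_0000_0000..0x7FFF_FE00_0000;
/// The hardcoded upper bound that sys_socketpair used before the fix.
const OLD_LIMIT: u64 = 0x7FFF_FFFF_FFFF;

/// ASSUMPTION: a 16 MiB window ending at the ARM64 stack address named in
/// the commit message. The real kernel bounds may differ.
const ARM64_STACK_SKETCH: Range<u64> = 0x0000_FFFF_FE00_0000..0x0000_FFFF_FF00_0000;

/// The old check: reject anything above a single fixed limit.
fn old_check_rejects(ptr: u64) -> bool {
    ptr > OLD_LIMIT
}

/// Hypothetical stand-in for is_valid_user_address(): an address is valid
/// if it falls inside any of the known userspace regions.
fn is_valid_user_address_sketch(addr: u64, stack: &Range<u64>) -> bool {
    CODE_DATA.contains(&addr) || MMAP.contains(&addr) || stack.contains(&addr)
}
```

With these assumed bounds, a pointer just below 0x0000_FFFF_FF00_0000 is rejected by `old_check_rejects` but accepted by `is_valid_user_address_sketch`, which matches the EFAULT symptom described under Problem.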

Testing:
- Built and tested on ARM64
- Socket pair creation now succeeds on ARM64

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Remove x86_64-only conditional compilation from TCP and UDP packet handling
in handle_ipv4(). This was causing loopback networking to silently fail on
ARM64 - packets were correctly queued but dropped when they reached the
protocol dispatch because the match arms were compiled out.

Changes:
- Remove #[cfg(target_arch = "x86_64")] from PROTOCOL_TCP match arm
- Remove #[cfg(target_arch = "x86_64")] from PROTOCOL_UDP match arm
- Remove #[cfg(target_arch = "x86_64")] from unknown protocol debug log
- Remove #[allow(dead_code)] from PROTOCOL_TCP/UDP constants (now used)
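To illustrate why the gated arms dropped packets silently, here is a minimal sketch of a protocol dispatch with and without the cfg attributes. The protocol numbers are the standard IANA values; the `Dispatch` enum, the ICMP arm, and both function names are hypothetical and only approximate the shape of handle_ipv4().

```rust
const PROTOCOL_ICMP: u8 = 1;
const PROTOCOL_TCP: u8 = 6;
const PROTOCOL_UDP: u8 = 17;

/// Hypothetical result type, used only to make the outcome observable.
#[derive(Debug, PartialEq)]
enum Dispatch {
    Icmp,
    Tcp,
    Udp,
    Dropped,
}

/// Before the fix: on any target other than x86_64 the TCP and UDP arms are
/// compiled out, so those packets fall through to the catch-all arm.
#[allow(dead_code)]
fn dispatch_gated(protocol: u8) -> Dispatch {
    match protocol {
        PROTOCOL_ICMP => Dispatch::Icmp,
        #[cfg(target_arch = "x86_64")]
        PROTOCOL_TCP => Dispatch::Tcp,
        #[cfg(target_arch = "x86_64")]
        PROTOCOL_UDP => Dispatch::Udp,
        _ => Dispatch::Dropped,
    }
}

/// After the fix: every architecture gets the same arms.
fn dispatch_all(protocol: u8) -> Dispatch {
    match protocol {
        PROTOCOL_ICMP => Dispatch::Icmp,
        PROTOCOL_TCP => Dispatch::Tcp,
        PROTOCOL_UDP => Dispatch::Udp,
        _ => Dispatch::Dropped,
    }
}
```

Because the gated version still compiles cleanly on ARM64, nothing flags the missing arms at build time, which is consistent with the packets being queued correctly and then lost at dispatch.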

Result:
- UDP loopback now works on ARM64 (udp_socket_test passes)
- TCP connection setup works on ARM64 (socket/bind/listen/connect/accept)
- Note: TCP data transfer still fails with a separate EOPNOTSUPP error, to be investigated

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add TCP connection support to ARM64 sys_read and sys_write handlers in io.rs.
Previously these returned EOPNOTSUPP, now they mirror the x86-64 implementation.

Changes:
- Add TcpConnection variant to WriteOperation enum
- Implement sys_write for TcpConnection using tcp_send()
- Implement blocking sys_read for TcpConnection using tcp_recv()
- Use WFI for blocking instead of x86-64's HLT

This combined with the ipv4.rs fix (removing x86_64-only cfg) enables
full TCP networking on ARM64.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add non-blocking I/O support for TCP connection reads on both x86-64 and ARM64.

Changes:
- handlers.rs: Capture is_nonblocking flag before dropping manager_guard,
  return EAGAIN immediately when no data available and O_NONBLOCK is set
- io.rs (ARM64): Same implementation for ARM64 syscall path
- tcp_blocking_test.rs: Add TEST 4b to verify non-blocking read returns EAGAIN

When O_NONBLOCK is set via fcntl(fd, F_SETFL, O_NONBLOCK):
- read() returns EAGAIN (-11) immediately if no data is available
- Without O_NONBLOCK, read() blocks until data arrives (existing behavior)
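The read behavior above might look roughly like this in isolation. This is a sketch under assumptions: `TcpConnSketch`, `read_sketch`, and the `wait` callback are hypothetical names, and the callback stands in for the architecture-specific idle instruction (WFI or HLT) that the real syscall path uses while blocked.

```rust
use std::collections::VecDeque;

/// Linux errno value for EAGAIN; the syscall returns it negated (-11).
const EAGAIN: i64 = 11;

/// Hypothetical connection holding bytes already received.
struct TcpConnSketch {
    rx: VecDeque<u8>,
}

/// Returns the number of bytes read, or a negative errno.
/// `wait` is called once per loop iteration while blocked.
fn read_sketch(
    conn: &mut TcpConnSketch,
    buf: &mut [u8],
    is_nonblocking: bool,
    mut wait: impl FnMut(&mut TcpConnSketch),
) -> i64 {
    loop {
        if !conn.rx.is_empty() {
            // Data is available: copy as much as fits into the buffer.
            let n = buf.len().min(conn.rx.len());
            for slot in buf.iter_mut().take(n) {
                *slot = conn.rx.pop_front().unwrap();
            }
            return n as i64;
        }
        if is_nonblocking {
            // O_NONBLOCK set and nothing to read: fail immediately.
            return -EAGAIN;
        }
        // Blocking mode: idle until something may have arrived, then retry.
        wait(conn);
    }
}
```

The flag is passed in as a plain `bool` here, mirroring the commit's approach of capturing is_nonblocking before the lock guard is dropped so the loop itself does not need the fd table.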

Verified working on ARM64:
- fcntl_test: PASS (O_NONBLOCK flag works)
- nonblock_test: PASS (EAGAIN returned correctly)
- tcp_socket_test: PASS (all TCP operations work)
- tcp_blocking_test: TEST 1-2 PASS (blocking I/O works)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Rewrite TEST 4b so that it is falsifiable (it can actually fail) by proving both:
- Part A: Blocking read takes ~300ms (child sleeps before sending)
- Part B: Non-blocking read returns EAGAIN in <100ms

Changes:
- Add timing assertions using now_monotonic() and elapsed_ms()
- Use sleep_ms() for reliable wall-clock delay instead of yield iterations
- Part A forks child that sleeps 300ms after connect before sending
- Part B verifies EAGAIN returns immediately with no pending data
- Fix watchdog issue that prevented tests 3-6 from running

The test will FAIL if:
- O_NONBLOCK check removed (Part B would block like Part A)
- Always-return-EAGAIN bug (Part A would return EAGAIN, not data)
- Non-blocking path is slow (Part B timing fails)

Note: ARM64 has a child process crash bug that prevents TEST 4b from
running. Tests 1-2 pass, confirming blocking I/O works. The crash is
in the child's exit path after successful operation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Previously, only x0 and x19-x30 were saved and restored during context
switches (x19-x28 are callee-saved under AAPCS64; x29 and x30 are the frame
pointer and link register). This caused forked child processes to crash
because x1-x18 contained garbage values from the previous thread's
exception frame.

The fix saves and restores all general-purpose registers, ensuring
correct execution state is preserved across context switches.
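The difference between the two behaviors can be modeled in a few lines. This is only a sketch: `FrameSketch` and both restore functions are hypothetical, and a real context switch also has to handle SP, ELR, and SPSR, which are left out here.

```rust
/// Hypothetical exception frame holding x0..x30.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FrameSketch {
    x: [u64; 31],
}

/// Old behavior as described in the commit message: only x0 and x19-x30
/// are copied, so x1-x18 keep whatever the previous thread left behind.
fn restore_partial(saved: &FrameSketch, frame: &mut FrameSketch) {
    frame.x[0] = saved.x[0];
    frame.x[19..=30].copy_from_slice(&saved.x[19..=30]);
}

/// Fixed behavior: every general-purpose register is copied.
fn restore_full(saved: &FrameSketch, frame: &mut FrameSketch) {
    frame.x = saved.x;
}
```

A thread resumed through a function return can tolerate clobbered caller-saved registers, but a thread resumed from an exception frame (such as a freshly forked child returning to userspace) expects every register to be exactly as it was, which is why the partial copy was not enough.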

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…lculation

When now.tv_nsec < start.tv_nsec, we need to borrow 1 second from
elapsed_sec. The previous code calculated elapsed_nsec correctly but
didn't decrement elapsed_sec, causing over-counting by 1 second when
crossing second boundaries.
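The borrow step can be sketched as below. The `Timespec` field names follow the commit message; the struct definition, the `i64` field types, and the function name `elapsed_ms_sketch` are assumptions for the sake of a self-contained example.

```rust
/// Minimal timespec with the field names used in the commit message.
#[derive(Clone, Copy)]
struct Timespec {
    tv_sec: i64,
    tv_nsec: i64,
}

/// Elapsed milliseconds between `start` and `now`, assuming `now >= start`.
fn elapsed_ms_sketch(start: Timespec, now: Timespec) -> i64 {
    let mut elapsed_sec = now.tv_sec - start.tv_sec;
    let mut elapsed_nsec = now.tv_nsec - start.tv_nsec;
    if elapsed_nsec < 0 {
        // Borrow one second: this decrement is the step the old code omitted.
        elapsed_sec -= 1;
        elapsed_nsec += 1_000_000_000;
    }
    elapsed_sec * 1000 + elapsed_nsec / 1_000_000
}
```

For example, with start = 1.9 s and now = 2.1 s the nanosecond difference is negative, so one second is borrowed and the result is 200 ms. Without the decrement the same inputs would give 1200 ms.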

Also adds sleep_debug_test.rs to help diagnose timing issues in forked
child processes on ARM64.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…ked children

This commit addresses several interconnected issues that caused ARM64
TCP blocking operations to hang in forked child processes:

1. ARM64 TCP recv WFI loop fix (io.rs):
   - Move drain_loopback_queue() to START of WFI loop
   - ARM64 has no NIC interrupts, so must poll for localhost packets
   - Add reset_quantum() after blocking wait to prevent immediate preemption
   - Reorder loop: drain -> check -> yield/wfi (matching socket.rs pattern)

2. TCP listener reference counting (tcp.rs, fd.rs, process.rs):
   - Add ref_count to ListenSocket for fork() support
   - Add tcp_listener_ref_inc() for FdTable::clone() during fork
   - Add tcp_listener_ref_dec() for close_all_fds() and FdTable::drop()
   - Prevents listener removal when forked child exits while parent still listening

3. ARM64 clock_gettime CoW support (syscall_entry.rs):
   - Delegate to shared syscall/time.rs implementation
   - Use copy_to_user() which properly triggers CoW page faults
   - Direct userspace writes bypassed CoW, causing sleep_ms() to hang
   - Forked children would get zero Timespec, looping forever in sleep_ms
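The listener reference counting from item 2 can be sketched as follows. This is not the kernel's data structure: keying listeners by port in a `HashMap`, the `ListenerTableSketch` type, and its method names are assumptions; only the `ref_count` field and the inc-on-fork, dec-on-close idea come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical listener entry with the ref_count field from the commit.
struct ListenSocketSketch {
    ref_count: usize,
}

/// Hypothetical listener table, keyed by port for simplicity.
struct ListenerTableSketch {
    listeners: HashMap<u16, ListenSocketSketch>,
}

impl ListenerTableSketch {
    fn new() -> Self {
        Self { listeners: HashMap::new() }
    }

    /// listen(): the creating process holds the first reference.
    fn listen(&mut self, port: u16) {
        self.listeners.insert(port, ListenSocketSketch { ref_count: 1 });
    }

    /// Called when an fd table is cloned during fork().
    fn ref_inc(&mut self, port: u16) {
        if let Some(l) = self.listeners.get_mut(&port) {
            l.ref_count += 1;
        }
    }

    /// Called when an fd is closed or an fd table is dropped.
    /// Returns true if this was the last reference and the listener was removed.
    fn ref_dec(&mut self, port: u16) -> bool {
        let remaining = match self.listeners.get_mut(&port) {
            Some(l) => {
                l.ref_count -= 1;
                l.ref_count
            }
            None => return false,
        };
        if remaining == 0 {
            self.listeners.remove(&port);
            true
        } else {
            false
        }
    }

    fn is_listening(&self, port: u16) -> bool {
        self.listeners.contains_key(&port)
    }
}
```

In this model, a fork bumps the count to 2, the child's exit drops it back to 1, and the listener only disappears when the parent closes its own descriptor.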

Root cause: After fork(), child inherits CoW-mapped pages. When ARM64's
clock_gettime wrote directly to userspace (bypassing copy_to_user), the
write would fail silently on read-only CoW pages. The Timespec remained
zero, causing sleep_ms() to calculate zero elapsed time forever.
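The failure mode in the root-cause paragraph can be modeled with a toy page. Everything here is a simplification: `UserPageSketch`, `direct_write_sketch`, and `copy_to_user_sketch` are hypothetical, and a boolean flag stands in for page-table permissions and the CoW fault handler.

```rust
#[derive(Debug, PartialEq)]
enum WriteError {
    ReadOnly,
}

/// Toy model of one user page shared copy-on-write after fork().
struct UserPageSketch {
    bytes: [u8; 16],
    writable: bool,
    cow: bool,
}

impl UserPageSketch {
    /// Models a store: fails if the page is mapped read-only.
    fn store(&mut self, offset: usize, data: &[u8]) -> Result<(), WriteError> {
        if !self.writable {
            return Err(WriteError::ReadOnly);
        }
        self.bytes[offset..offset + data.len()].copy_from_slice(data);
        Ok(())
    }

    /// Models the CoW fault handler: the page becomes a private writable copy.
    /// Returns true if a CoW fault was resolved.
    fn resolve_cow_fault(&mut self) -> bool {
        if self.cow {
            self.cow = false;
            self.writable = true;
            true
        } else {
            false
        }
    }
}

/// Old path: the write result is ignored, so a failure goes unnoticed.
fn direct_write_sketch(page: &mut UserPageSketch, data: &[u8]) {
    let _ = page.store(0, data);
}

/// copy_to_user-style path: resolve the CoW fault, then retry the write.
fn copy_to_user_sketch(page: &mut UserPageSketch, data: &[u8]) -> Result<(), WriteError> {
    match page.store(0, data) {
        Err(WriteError::ReadOnly) if page.resolve_cow_fault() => page.store(0, data),
        other => other,
    }
}
```

In this model the direct write leaves the buffer zeroed, which corresponds to the zero Timespec that kept sleep_ms() looping, while the copy_to_user-style path ends with the data in place.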

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@ryanbreen ryanbreen merged commit 668593e into main Feb 3, 2026
1 of 2 checks passed