Requesting offers is quick when the offers cache is already populated but can be painfully slow when the cache is cold, taking 30s or more. Getting offers is one of the most common actions in dstack, so its performance has a large effect on the UX. We need to research how to reduce get_offers time for the slowest backends.
On dstack Sky, returning offers may take >30s:
dstack apply 1.67s user 0.69s system 6% cpu 36.534 total
Locally, I get the following numbers for Sky backends:
[13:26:43] INFO dstack._internal.server.services.backends:365 Requesting instance offers
from backends: ['gcp', 'runpod', 'aws', 'azure', 'lambda', 'nebius',
'cudo', 'verda']
DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.RUNPOD in 0.551435s
[13:26:45] DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.LAMBDA in 1.839518s
DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.VERDA in 1.817387s
[13:26:47] DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.CUDO in 3.355270s
DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.NEBIUS in 4.055332s
[13:26:48] DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.GCP in 5.753253s
[13:26:55] DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.AWS in 12.271486s
[13:26:57] DEBUG dstack._internal.server.services.backends:404 Got offers from backend
BackendType.AZURE in 14.012781s
For some reason, Sky is slower than a local server for the same backends.
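To make the local numbers easier to compare, here is a minimal sketch that parses the "Got offers from backend" lines from the log above and ranks the backends by duration. The regex and the `parse_timings` helper are my own illustration, not part of dstack. Judging by the timestamps (13:26:43 to 13:26:57), the backends appear to be queried concurrently, so the wall time is roughly the slowest backend rather than the sum; that is an inference from the log, not something I verified in the code.

```python
import re

# Durations copied from the local debug log above (log lines unwrapped).
LOG = """\
Got offers from backend BackendType.RUNPOD in 0.551435s
Got offers from backend BackendType.LAMBDA in 1.839518s
Got offers from backend BackendType.VERDA in 1.817387s
Got offers from backend BackendType.CUDO in 3.355270s
Got offers from backend BackendType.NEBIUS in 4.055332s
Got offers from backend BackendType.GCP in 5.753253s
Got offers from backend BackendType.AWS in 12.271486s
Got offers from backend BackendType.AZURE in 14.012781s
"""

# Hypothetical helper pattern: matches "BackendType.<NAME> in <seconds>s".
PATTERN = re.compile(r"BackendType\.(\w+) in ([\d.]+)s")


def parse_timings(log: str) -> dict[str, float]:
    """Map backend name to the reported get-offers duration in seconds."""
    return {name: float(seconds) for name, seconds in PATTERN.findall(log)}


timings = parse_timings(LOG)

# Slowest backends first: these are the ones worth investigating.
ranked = sorted(timings.items(), key=lambda item: item[1], reverse=True)
for name, seconds in ranked:
    print(f"{name:<8} {seconds:7.3f}s")

# With concurrent requests the wall time is bounded below by the slowest backend;
# the sum is what a fully sequential implementation would cost.
print(f"slowest backend: {max(timings.values()):.3f}s")
print(f"sum of all:      {sum(timings.values()):.3f}s")
```

On these numbers, AZURE and AWS dominate: dropping or speeding up only those two would bring the cold-cache time down to roughly the GCP duration.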