feat: update lassie to sync Retriever (before provider rewrite revert) #167
force-pushed from d6c4231 to 8814195
force-pushed from 8814195 to f7ec395
Codecov Report: Base 4.87% // Head 5.24% // increases project coverage by +0.36%.
Additional details and impacted files:

@@            Coverage Diff            @@
##           master     #167      +/-  ##
=========================================
+ Coverage     4.87%    5.24%   +0.36%
=========================================
  Files           15       14       -1
  Lines         1723     1697      -26
=========================================
+ Hits            84       89       +5
+ Misses        1634     1603      -31
  Partials         5        5
force-pushed from 0d024b6 to e3a5629
* Retriever#Retrieve() calls are now synchronous, so we get to wait for the direct return value and error synchronously
* Change the AwaitGet call order and make it cancellable
* Make the provider context-cancel aware for cleaner shutdown
* Other minor fixes and adaptations to the new Lassie code
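For orientation, a minimal sketch of the shape of these changes. The interfaces, signatures, and call order below are assumptions made for the example, not the actual provider or Lassie APIs:

package provider

import (
	"context"

	"github.com/ipfs/go-cid"
)

// retriever and blockManager are illustrative stand-ins; the real types
// and signatures in the PR may differ.
type retriever interface {
	// Assumed: blocks until the retrieval attempt completes, returning its error.
	Retrieve(ctx context.Context, c cid.Cid) error
}

type blockManager interface {
	// Assumed: returns a channel that yields once the block for c is stored.
	AwaitGet(ctx context.Context, c cid.Cid) <-chan struct{}
}

// retrieveAndAwait shows the overall shape: a synchronous retrieval call
// followed by a cancellable wait, both tied to the provider context.
func retrieveAndAwait(ctx context.Context, r retriever, bm blockManager, c cid.Cid) error {
	if err := r.Retrieve(ctx, c); err != nil {
		// Synchronous call: the error is available right here.
		return err
	}
	select {
	case <-ctx.Done():
		// Context-cancel aware: provider shutdown unblocks the wait.
		return ctx.Err()
	case <-bm.AwaitGet(ctx, c):
		return nil
	}
}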
force-pushed from e3a5629 to b6db210
hannahhoward left a comment:
These are just some possibilities. I don't know whether they fix the underlying issues, though.
blockManager:           blockManager,
retriever:              retriever,
requestQueue:           peertaskqueue.New(peertaskqueue.TaskMerger(&overwriteTaskMerger{}), peertaskqueue.IgnoreFreezing(true)),
requestQueueSignalChan: make(chan struct{}, 10),
so here's my recommendation for these signals:
- make them buffer 1
- when writing, call:
select {
case provider.requestQueueSignalChan <- struct{}{}:
default:
}
oh, nice, so if it blocks then bail, I didn't think of that!
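For context, a small self-contained sketch of the recommended scheme: a buffer of 1 plus the non-blocking send. The field name comes from the diff; the signalRequestQueue helper is a hypothetical name used only for the example:

package provider

// If a wake-up is already pending, the extra signal is simply dropped;
// the worker will pick up the queued work either way.
type Provider struct {
	requestQueueSignalChan chan struct{}
}

func newProvider() *Provider {
	return &Provider{
		// buffer 1 instead of 10
		requestQueueSignalChan: make(chan struct{}, 1),
	}
}

// signalRequestQueue wraps the non-blocking send shown in the comment above.
func (provider *Provider) signalRequestQueue() {
	select {
	case provider.requestQueueSignalChan <- struct{}{}:
	default:
	}
}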
select {
case <-ctx.Done():
case <-time.After(time.Millisecond * 250):
case <-provider.responseQueueSignalChan:
when len(tasks) != 0, you had better still optionally drain the signal chan, i.e.
if len(tasks) == 0 {
	// ...
	continue
}
select {
case <-provider.responseQueueSignalChan:
default:
}
// ...
I wouldn't be surprised if you're getting stuck cause of this.
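To make that suggestion concrete, here is a standalone sketch of the drain in a peertaskqueue-style worker loop. The queue and channel are passed as parameters only to keep the example self-contained, and workerLoop is a hypothetical name; in the provider they would be struct fields:

package provider

import (
	"context"
	"time"

	"github.com/ipfs/go-peertaskqueue"
)

func workerLoop(ctx context.Context, queue *peertaskqueue.PeerTaskQueue, signalChan chan struct{}) {
	for ctx.Err() == nil {
		peerID, tasks, _ := queue.PopTasks(1)
		if len(tasks) == 0 {
			// Nothing to do: wait for a signal, a timeout, or shutdown.
			select {
			case <-ctx.Done():
			case <-time.After(time.Millisecond * 250):
			case <-signalChan:
			}
			continue
		}

		// We got tasks without consuming a signal, so drain any pending
		// signal non-blockingly; otherwise a stale signal can wake the
		// loop later when there is nothing left to pop.
		select {
		case <-signalChan:
		default:
		}

		// ... process the tasks, then mark them done.
		queue.TasksDone(peerID, tasks...)
	}
}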
@@ -325,13 +339,15 @@ func (provider *Provider) handleResponses() {
}
why are we calling TasksDone twice when an error occurs sending a message?
}
continue
}
same, better drain the signal queue
for {
func (provider *Provider) handleRetrievals(ctx context.Context) {
	for ctx.Err() == nil {
		peerID, tasks, _ := provider.retrievalQueue.PopTasks(1)
now that retrieval is synchronous, this appears to limit things to one retrieval per worker queue, no?
yes, the default is 8 workers, but I haven't seen a reason to increase this (yet) because the pipe of incoming requests appears to be so small; but maybe I'm not seeing it right
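A standalone sketch of the fan-out being discussed. startRetrievalWorkers and the retrieve callback are hypothetical names used only so the example is self-contained, but the shape follows the discussion: 8 workers, each popping one task and running a blocking retrieval to completion, so concurrency is bounded by the worker count:

package provider

import (
	"context"
	"sync"

	"github.com/ipfs/go-peertaskqueue"
	"github.com/ipfs/go-peertaskqueue/peertask"
)

const workerCount = 8 // the default mentioned above

func startRetrievalWorkers(ctx context.Context, queue *peertaskqueue.PeerTaskQueue,
	signalChan chan struct{}, retrieve func(context.Context, *peertask.Task) error) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < workerCount; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for ctx.Err() == nil {
				peerID, tasks, _ := queue.PopTasks(1)
				if len(tasks) == 0 {
					select {
					case <-ctx.Done():
					case <-signalChan:
					}
					continue
				}
				for _, task := range tasks {
					// Blocks until this retrieval attempt finishes, which is
					// why one slow retrieval occupies this worker entirely.
					_ = retrieve(ctx, task)
				}
				queue.TasksDone(peerID, tasks...)
			}
		}()
	}
	return &wg
}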
DRAFT for now because it's using filecoin-project/lassie#41
@elijaharita I'd like your eyes on this for the synchronous call in the provider: provider.retriever.Retrieve() now blocks, holding up one of the goroutines while a retrieval attempt happens. Most of them still fail from indexer lookup failures, but the ones that do attempt a retrieval may cause a backlog.
Is there any mechanism in here to prevent an excessive backlog of tasks on the retrievalQueue that we can't process fast enough to keep up with incoming requests? I don't see one, but I might be missing something. I'm also not experiencing a huge queue while running this locally, but that might change on a faster open network.
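Purely as a hypothetical illustration of what such a mechanism could look like (nothing like this exists in the PR, and all names below are invented): the provider could keep its own count of pending retrieval tasks and refuse new ones past a cap.

package provider

import (
	"errors"
	"sync/atomic"
)

const maxPendingRetrievals = 256 // arbitrary illustrative cap

var errBacklogFull = errors.New("retrieval backlog full, dropping request")

type backlogGate struct {
	pending int64
}

// tryAcquire reserves a slot for a new retrieval task before it is pushed
// onto the queue, or reports that the backlog is already at its cap.
func (g *backlogGate) tryAcquire() error {
	if atomic.AddInt64(&g.pending, 1) > maxPendingRetrievals {
		atomic.AddInt64(&g.pending, -1)
		return errBacklogFull
	}
	return nil
}

// release frees a slot once the retrieval attempt has finished,
// whether it succeeded or failed.
func (g *backlogGate) release() {
	atomic.AddInt64(&g.pending, -1)
}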