Bugfix/blips on empty events #901

Jjm321814 · 2026-01-20T21:33:32Z

A few bugs existed in old versions of blip reco

Running blipreco on mostly empty event displays (somewhat common in simple MC productions of single neutrino events) would lead to empty hit collections being accessed and a subsequent memory issue/crash (Not specifically a segfault though).
A few blip-related hitcluster variables were not copying quite correctly. Effected variables are blip.clusters[i].blipID, and blip.clusters[i].isMatched, which both had default values. The independently saved clusters objects were correctly updated.
The MC truth PDG values for blip-energy deposits have puzzling labels and need their evaluation looked into.

The first bug was related to a hardcoded hitHandleGH label, which looked at gaushit rather than specialBlipGausHit. On ~emtpy events the lower threshold (specialBlipGausHit) will have a few entries, but the code uses hitHandleGH for truth matching, and that higher threshold hit collection can be empty. That leads to crashes.
For now the hardcoded label is just updated to accommodate specialBlipGausHit.

The second bug is not critical, as access hitclusters this way is done through a blip, so the ID is the same as the blip you are looking at and the clusters included must be matched (into this blip). The hitcluster collection values are updated after the struct assignment (copy constructor), so I updated the blip alg to also update the blip.Cluster[i].blipID, and isMatched.

The third bug I am still digging into. I was worried that the switch from sim::energy_deposition to sim::channel->IDE was causing issues, but in the new arrangement (with an optional interface to use either) gives consistent results between them.

I have some notes I can build into slides for the first two but would like to spend a little more time on the third bug.

Jjm321814 · 2026-01-26T21:37:23Z

https://docs.google.com/presentation/d/1igXPcS_Qst99layiHuMEZiUxz4EHMxGHutf7Mr-jbT0/edit?slide=id.g3bbdbff95f2_0_1#slide=id.g3bbdbff95f2_0_1 here is a useful overview set of slides for these changes. Sorry for the many commits most are just adding/removing debug.

For the hardcoded gaushit parameter the change in this PR is sufficient for any standard running someone does/for Fall production. Longterm I think these should not be hardcoded and just use the fcl-provided hit collection, and we assume the user knows what they are doing.

The existing MC truth back tracking works. I just added a few additional breaks in MC truth blips. I found a way to get labeling I am totally happy with but the performance hit is too large for anything other than a small production aimed specifically at blips.

I am happy with the current state of the MCTruth matching in this PR now. I promise some future slides looking at how the MC matching is doing in Fall production pre/post this PR in the next ~week

henrylay97

Hi Jacob, thanks for the PR. I have a few comments.

My main concern is around the fix for the empty event problem. Rather than finding the root cause of the problem, the solution seems to be to provide a second vector that you hope won't be empty as well.

I would rather the actual cause of the segfault be found and prevented. This would also remove any need for hard coded module labels.

The rest looks good to me, happy to discuss more offline!

sbndcode/BlipRecoSBND/Alg/BlipRecoAlg.cc

henrylay97 · 2026-01-28T08:54:02Z

sbndcode/BlipRecoSBND/Alg/BlipRecoAlg.cc

+    if(evt.getByLabel("specialblipgaushit",hitHandleGH)) art::fill_ptr_vector(hitlistGH, hitHandleGH);
+    else if(evt.getByLabel("gaushit",hitHandleGH))       art::fill_ptr_vector(hitlistGH, hitHandleGH);


Module labels should never be hard coded. I appreciate this was true before this PR but given you're touching the code at this point it needs fixing.

I also don't understand the use of two different hit collections - only one of them will have been an input to the BlipReco and so only one of them is a valid input for the truth matching. If the code crashes on an empty vector somewhere then the fix should be a catch to stop that happening rather than the providing of a different vector.

I have removed the hitHandleGH, and will remove the other spots that gaushit gets a special default privileging. It will just be the user's responsibility to make sure they have ran all the necessary inputs

sbndcode/BlipRecoSBND/Alg/BlipRecoAlg.cc

sbndcode/BlipRecoSBND/BlipAna_module.cc

… some branches to assume the hitHandle the user provided was run through truthmatching and pandoraTrack

Jjm321814 · 2026-01-28T16:09:33Z

Thanks for the suggestions, most are in. Let me check compilation and a simple output in a typical Fall production run

Jjm321814 · 2026-01-28T17:56:51Z

Merged in develop and the compiler works and entries in the reco2 files look good.

After getting the normal successful processing table I got this error
*** Error in `lar': corrupted size vs. prev_size: 0x000000002a0308b0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x80c37)[0x7fb0b2cb2c37]
/lib64/libc.so.6(+0x8120e)[0x7fb0b2cb320e]
/lib64/libc.so.6(+0x39ce9)[0x7fb0b2c6bce9]
/lib64/libc.so.6(+0x39d37)[0x7fb0b2c6bd37]
/lib64/libc.so.6(__libc_start_main+0xfc)[0x7fb0b2c5455c]
lar[0x4012d0]
Followed by a long memory map. I don't think this is my fault.

Jjm321814 · 2026-01-28T19:08:30Z

Fixed the above error.
After merging in develop it became v10_14_02_01 and my localproducts setup script was still asking for v10_14_02. Fixing that removed the error on exit.

henrylay97 · 2026-01-29T12:37:33Z

Thanks so much for responding so quickly. Looking much better.

I have a question. Will this now not segfault on empty events? Regardless of the label being applied.

I note you say: "I have removed the hitHandleGH, and will remove the other spots that gaushit gets a special default privileging. It will just be the user's responsibility to make sure they have ran all the necessary inputs"

Does this mean if the user mislabels the input we get a segfault? If so we need to avoid that and instead have a more graceful failure mode :)

Jjm321814 · 2026-01-29T15:22:25Z

I'll check that now. Noting down the possible combinations of wrong label:

Hitlabel doesn't exist
Track label and hit label are from different sets (regular pandora and special blip hits for example)
MC Truth Matching done on a different hit set
Track label doesn't exist
MC truth label doesn't exist
I'll check each combination and report back how it exits

Jjm321814 · 2026-01-29T16:06:28Z

We don't ever segfault. But we do have some cases where a normal output will be produced and the user must be careful that the fcl used to make it was correct.

For case 1

%MSG-s ArtException:  PostEndJob 29-Jan-2026 09:25:51 CST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- ProductNotFound BEGIN
      Found zero products matching all selection criteria
        C++ type: std::vector<recob::Hit>
        Module label: 'specialblipgaushitIdontexist'
        Product instance name: ''
        Process name: (empty)
      The above exception was thrown while processing module BlipRecoProducer/blipreco run: 1001 subRun: 18 event: 1
    ---- ProductNotFound END
    Exception going through path reco2
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
---- FatalRootError BEGIN
  Fatal Root Error: TTree::SetEntries
  Tree branches have different numbers of entries, eg EventAuxiliary has 0 entries while sim::AuxDetSimChannels_genericcrt__G4. has 13 entries.
  ROOT severity: 2000
---- FatalRootError END

For case 2
The code seems to just run normally. The track input is really just to skip a few hits that are not-blip-like. Different indexing in the two track associations may cause some hits that are really blips to be skipped. This would be a tricky bug to catch so its important the fcl are ironed out.

For case 3:
Runs normally. Inspection of output should give you some clue that things aren't working well. But again this is a be careful with fcl case.

For case 4:
This one actually just runs normally and I would expect basically correct outputs. It is mildly slower, but not enough for me to notice in the time I did this check. It will just run over extra hits that form tracks and notice they don't look blip like, so it will end up dropping the cluster.

For case 5:

%MSG-s ArtException:  PostEndJob 29-Jan-2026 09:54:03 CST ModuleEndJob
---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- LogicError BEGIN
      Invalid FindMany
      ---- ProductNotFound BEGIN
        Found zero products matching all selection criteria
          C++ type: art::Assns<recob::Hit,simb::MCParticle,anab::BackTrackerHitMatchingData>
          Module label: 'blipgaushitTruthMatchwjejefiej'
          Product instance name: ''
          Process name: (empty)
      ---- ProductNotFound END
      Attempt to use a FindMany where the underlying art::Assns product was not found.The above exception was thrown while processing module BlipRecoProducer/blipreco run: 860 subRun: 71 event: 1
    ---- LogicError END
    Exception going through path reco2
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END
---- FatalRootError BEGIN
  Fatal Root Error: TTree::SetEntries
  Tree branches have different numbers of entries, eg EventAuxiliary has 0 entries while sim::AuxDetSimChannels_genericcrt__G4. has 22 entries.
  ROOT severity: 2000
---- FatalRootError END
%MSG

henrylay97

Brilliant. Thanks for the detailed checks Jacob! There is always some need for analysers / devs / release management to be careful with fcl labels so the responses to 2 and 3 are reasonable. Overall I'm happy that this will fail gracefully if at all.

As long as you're happy with the production fcl state then I'm happy with this now :D

Jjm321814 added 23 commits January 15, 2026 14:41

Fixed a hit mapper line to match new inputs

058d850

Checking what branch causes new crash

647fcac

debug

3cf18b9

make special blip gaus hit more prominent

64b158a

change hardcoded label

9e746c7

change hardcoded label

bcd7ca5

debug

fa03069

debug

43254cb

Adding back sim energyDeposit interface

fb096ac

Removed default arg

666f85f

wrong type

d4854a1

wrong type

645587b

debug

f1a0d28

checking on sim energy deposit patch

1ae2dd9

change ide performance

24c72c0

fixing blip ID markers? May not update on blip object

0f1ceda

Better spot for blip id update

0960ff9

Maybe we add it to the vector too fast?

baeb539

Try explicitly accessing array

0043a95

That fixed it. Add a check for filling in blipID on real clusters

803ed88

Removed debug

1a077c8

temp debug

70cec63

removed debug

a33d9ac

Jjm321814 requested review from henrylay97 and nathanielerowe January 20, 2026 21:33

Jjm321814 self-assigned this Jan 20, 2026

Jjm321814 added bug Something isn't working reco1/reco2 Reconstruction labels Jan 20, 2026

linyan-w added this to SBND 2025 Fall Production Jan 21, 2026

linyan-w moved this to Waiting on Reviewer in SBND 2025 Fall Production Jan 21, 2026

Jjm321814 added 18 commits January 22, 2026 17:20

debug tracking

4a0d59d

debug tracking

c5af4ee

debug tracking

e9214fa

debug tracking

30bfe5b

debug tracking

0e92018

debug tracking

5f9d19d

debug tracking

ee4eb00

Hiding away branch that shouldn't be accessed

6582f4c

special debug

4b6d952

remove old debug

37d2a28

additional special debug

dc3aaf4

additional special debug

851365f

A guess

79aef20

A guess2

b386844

Final guess

3d2dd5b

back to original

8d06c28

cleanup all my debug

63e10fc

Merge branch 'develop' into bugfix/BlipsOnEmptyEvents

90c1dd7

henrylay97 requested changes Jan 28, 2026

View reviewed changes

Jjm321814 added 2 commits January 28, 2026 09:50

generally cleaning up indentation

6b0f6a8

fixed general hardcoding of the producer labels. Minor refactoring of…

4dbab82

… some branches to assume the hitHandle the user provided was run through truthmatching and pandoraTrack

Updated fcl config with default param

fb1dc5f

henrylay97 approved these changes Jan 30, 2026

View reviewed changes

		if(evt.getByLabel("specialblipgaushit",hitHandleGH)) art::fill_ptr_vector(hitlistGH, hitHandleGH);
		else if(evt.getByLabel("gaushit",hitHandleGH)) art::fill_ptr_vector(hitlistGH, hitHandleGH);

Bugfix/blips on empty events #901

Are you sure you want to change the base?

Bugfix/blips on empty events #901

Conversation

Jjm321814 commented Jan 20, 2026

Uh oh!

Jjm321814 commented Jan 26, 2026

Uh oh!

henrylay97 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

henrylay97 Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Jjm321814 Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Jjm321814 commented Jan 28, 2026

Uh oh!

Jjm321814 commented Jan 28, 2026

Uh oh!

Jjm321814 commented Jan 28, 2026

Uh oh!

henrylay97 commented Jan 29, 2026

Uh oh!

Jjm321814 commented Jan 29, 2026

Uh oh!

Jjm321814 commented Jan 29, 2026 • edited by henrylay97 Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

henrylay97 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Jjm321814 commented Jan 29, 2026 •

edited by henrylay97

Loading