Skip to content

Conversation

@michaelkedar
Copy link
Member

@michaelkedar michaelkedar commented Jan 23, 2026

#3850
This change makes OSV finally stop writing new Bug entities, moving the source of truth to solely be the Vulnerability entity and proto in GCS (for non-oss-fuzz stuff).

  • Changed worker to work with / write proto directly, instead of relying on the Bug's _post_put_hook
  • Changed importer to use Vulnerability entity in place of Bug during import time
    • Have not made the importer proactively write un-enriched records yet
  • Changed recoverer to handle missing GCS files by triggering a forced re-import in the worker, instead of trying to regenerate from the Bug entities. (and added worker logic to support this)
  • Updated some datafix tools to use Vulnerability, and moved some Bug-dependent tools to and old/ directory for reference.

Also added some missing type hints in some places.

@michaelkedar
Copy link
Member Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a significant refactoring to migrate from using Bug entities to Vulnerability entities as the source of truth. The changes are extensive and touch the importer, worker, and recoverer components, as well as associated tests and datafix tools.

Key changes include:

  • Removing Bug-dependent logic from the importer and worker.
  • Updating the worker to operate directly on Vulnerability entities and their corresponding protobufs in GCS.
  • Modifying the recoverer to trigger a forced re-import for missing GCS files instead of regenerating from Bug entities.
  • Updating data models and helper functions to reflect the new data structure.
  • Archiving old datafix tools that operated on Bug entities.

The changes appear to be well-executed and consistent with the goal of the migration. The test suite has been updated to reflect the new implementation, which is great. My review focuses on a minor logging improvement for better debuggability.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@michaelkedar michaelkedar marked this pull request as ready for review January 23, 2026 04:20
@michaelkedar michaelkedar requested review from Ly-Joey, another-rex, cuixq and jess-lowe and removed request for Ly-Joey and cuixq January 23, 2026 04:20
Copy link
Contributor

@another-rex another-rex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically LGTM first try!

Just confirming that we want to delete all affected information on withdraw?

ds_vuln.is_withdrawn = True
ds_vuln.modified = proto_vuln.modified.ToDatetime(datetime.UTC)
osv.models.put_entities(ds_vuln, proto_vuln)
osv.update_affected_commits(vuln_id, [], False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be deleting all affected commits information?
It does not look like the existing implementation is deleting affected commits.


# NB: Only OSS-Fuzz is editable - all other sources are read-only.
# This should not be reachable by this worker.
output_path = os.path.join(osv.repo_path(repo), path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this add a warning log if it gets to this point.

else:
to_put.append(ListedVulnerability.from_vulnerability(vuln_pb))
new_affected = affected_from_proto(vuln_pb)
added, removed = diff_affected_versions(old_affected, new_affected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, all affected versions are deleted when a record is withdrawn.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants