-
Notifications
You must be signed in to change notification settings - Fork 1
Ap 527 healthcheck #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
anarchivist
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally looks good; i have a couple of questions. thanks for tackling this!
awilfox
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this! This is a good first pass.
I did find some issues that need to be worked out.
| def check | ||
| result = validate_iiif_server | ||
| mark_message result[:message] | ||
| mark_failure if result[:warning] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the key called warning if it is marking a failure? This seems unclear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call - originally I had them as "warnings", then changed failed checks to "failures", but didn't think about the keys.
| mark_failure if result[:warning] | ||
| rescue StandardError => e | ||
| logger.error(e) | ||
| mark_message "#{e.class}: #{e.message}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't want to expose the full exception message to the public Internet as it may have sensitive info. (For example, consider PG::ConnectionBad: role "foo" does not exist - it has the database username in the short message.)
We can probably just settle for e.class.
| class TestItemExists < OkComputer::Check | ||
| include BerkeleyLibrary::Logging | ||
|
|
||
| ERR_NO_COMPLETE_ITEM = 'Unable to locate complete item'.freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a constant only for the later reference in the test? Is it important for another reason?
I'm not necessarily saying it shouldn't be, but this doesn't match the way we do this anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it was referenced in the spec, but to stay more consistent I've removed it and updated the spec
| mark_failure if result[:warning] | ||
| rescue StandardError => e | ||
| logger.error(e) | ||
| mark_message "#{e.class}: #{e.message}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this should probably just be something like "Error: #{e.class}".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| mark_failure | ||
| end | ||
| rescue StandardError => e | ||
| mark_message "Failed to check item: #{e.message}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above:
| mark_message "Failed to check item: #{e.message}" | |
| mark_message "Failed to check item: #{e.class}" |
Also, we aren't logging this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I've added logging
| @@ -0,0 +1,3 @@ | |||
| module HealthChecks | |||
| Dir[File.join(__dir__, 'health_checks', '*.rb')].each { |f| require f } | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I understand the concept of making this less of a maintenance hassle, requiring a glob like this feels dangerous to me. It's easy code injection for any malicious actor with minimal ability to even be detected (just drop into app/lib/health_checks; this could even target developer workstations).
We could do something like a frozen array of checks and then iterate over them, calling require on each one.
| visit health_path | ||
| begin | ||
| # Register a check that always fails | ||
| OkComputer::Registry.register 'failing-check', OkComputer::HttpCheck.new('http://localhost:9999/nonexistent') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably use example.test or such as we did above (which is guaranteed to never exist). Otherwise, if another service is running locally on a dev machine, it could hit that (and if it's a "magic" service that returns a response for every URL, it might even pass).
| @@ -1,3 +1,7 @@ | |||
| # 1.7.6 (2026-01-16) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This date is already wrong. It should not be there until the release is actually made and tagged (in its own commit).
| Config.instance_variable_set(:@lending_root_path, nil) | ||
| end | ||
|
|
||
| it 'raises ConfigException when ENV IIIF base URL is invalid (covers line 70)' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, it's a bad idea to reference specific line numbers. If the source file is changed, these test descriptions will quickly stop having any meaning and become outdated. Keeping them updated is an unexpected maintenance burden.
The test as you have written it is descriptive enough without the line numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UGH... bad me... letting AI do a bunch of the heavy lifting for test coverage and made the mistake of not proof reading all of the spec. Fixed those
No description provided.