I am wondering whether it would be better to load the Keras model once and assign it to a global variable. As written, get_predictions appears to load the Keras model every time it is called.
https://github.com/PyDataBlog/fastapi-model-deployment/blob/master/app/data/predictions_handler.py