Machine Learning Model DB as a Service

In many machine learning and data science workflows, it is common to save (checkpoint) several models and compare how they perform on a held-out validation dataset. These saved models could be:

  • checkpoints from a single training trial, saved at different epochs. In this case the network structure is the same across checkpoints, but the weights and other learned parameters differ. Checkpointing lets us compare training and validation accuracy across the models saved at different epochs.
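
    As a rough illustration of per-epoch checkpointing, the sketch below saves one record per epoch and then picks the epoch with the highest validation accuracy. It uses only the Python standard library; the helper names (`save_checkpoint`, `load_checkpoints`, `best_checkpoint`), the JSON file layout, and the weight and accuracy values are all hypothetical and stand in for whatever a real training framework would produce.

```python
import json
import tempfile
from pathlib import Path


def save_checkpoint(directory, epoch, weights, val_accuracy):
    """Write one checkpoint file; the structure is identical for every epoch."""
    path = Path(directory) / f"checkpoint_epoch_{epoch:03d}.json"
    record = {"epoch": epoch, "weights": weights, "val_accuracy": val_accuracy}
    path.write_text(json.dumps(record))
    return path


def load_checkpoints(directory):
    """Read all checkpoint files back, ordered by epoch via the file name."""
    paths = sorted(Path(directory).glob("checkpoint_epoch_*.json"))
    return [json.loads(p.read_text()) for p in paths]


def best_checkpoint(checkpoints):
    """Return the checkpoint with the highest validation accuracy."""
    return max(checkpoints, key=lambda c: c["val_accuracy"])


# Illustrative values only: (epoch, weights, validation accuracy).
history = [
    (1, [0.10, -0.20], 0.71),
    (2, [0.35, -0.05], 0.84),
    (3, [0.50, 0.10], 0.79),
]

with tempfile.TemporaryDirectory() as tmp:
    for epoch, weights, accuracy in history:
        save_checkpoint(tmp, epoch, weights, accuracy)
    checkpoints = load_checkpoints(tmp)
    best = best_checkpoint(checkpoints)

print(f"best epoch: {best['epoch']}, val_accuracy: {best['val_accuracy']}")
```

    In this made-up history, validation accuracy peaks at an intermediate epoch and then drops, which is the kind of pattern that comparing checkpoints is meant to expose.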