At AnthologyAI, we are developing models for diverse use cases based on over 100 million datapoints per day that we ethically acquire via our app, Caden. Since joining the team, one of my primary responsibilities has been to figure out how to productionize the models our team has developed on Databricks.
Over the course of managing machine learning workflows on Databricks, I’ve become familiar with the best practices for model deployment using MLflow. For those who are new to MLflow, it logs a model together with its dependencies, making it reproducible across environments such as virtualenv and Docker. Registering a model in MLflow allows for seamless lifecycle management, from version control and staging to deployment on platforms like AWS and Azure ML or behind REST APIs.
When I recently tried following the same steps to register a model to the workspace with MLflow, I encountered errors I hadn’t seen before. I was able to log a model to the MLflow workspace, but I couldn’t register the model the same way I previously did.
The error I received while trying to register the model was CATALOG_DOES_NOT_EXIST: Catalog ‘x’ does not exist. Here x is a placeholder for our internal workspace name. That name did not appear anywhere in my code, and I had never needed to specify it for model deployment before, so I was confused about why the error was appearing now. I looked up the error and also tried the Databricks virtual assistant, but couldn’t resolve the issue. Since I couldn’t register the model from code, I went to the MLflow experiment tracking UI to see if I could manually register the model I had logged.
After clicking the Register model option on the top right, I found the Unity Catalog option.
I followed the steps from the image above. I updated MLflow and tried running the code to register the model, but I received an error mentioning that no model signature was included.
Model Signature
I had never heard of model signatures before this, so I did some investigating and found documentation that helped me refactor the code to include one. A model signature should be included when logging the model to MLflow so that the model’s input and output schema is captured.
I found that the model signature is a required step for model management on Unity Catalog. Signature inference works with pandas DataFrames, and with Spark DataFrames only when every column is a scalar type. While working with Spark ML and Spark DataFrames, I was stuck for a while trying to log the model because creating the model signature raised this error:
Exception: Unsupported Spark Type '<class 'pyspark.ml.linalg.VectorUDT'>', MLflow schema is only supported for scalar Spark types.
The infer_signature function does not support the Spark type VectorUDT, which is used in Spark MLlib. While trying to resolve the error, I noticed on a forum that others had faced similar issues when trying to deploy a model to Unity Catalog.
I tried the solution listed in the forum but got errors while running the code. I also tried constructing the input and output schema manually with the ModelSignature class mentioned in the documentation, but that produced errors as well. I found it simpler to convert a sample of the training data to pandas so that infer_signature works. This resolved the issue of not being able to deploy Spark ML models to Unity Catalog.
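The pandas workaround can be sketched roughly as follows. This is a minimal sketch, not the exact code from my notebook. It assumes the logged model is a Spark ML pipeline that takes raw scalar columns as input and assembles the vector column internally, so the signature only has to describe those scalar columns and the prediction column. The function names, `feature_cols`, and the `prediction` column name are assumptions made for illustration.

```python
from typing import Any, List


def sample_to_pandas(spark_df: Any, columns: List[str], n_rows: int = 5) -> Any:
    """Return the first n_rows of the chosen columns as a pandas DataFrame."""
    # select, limit and toPandas are standard PySpark DataFrame methods.
    return spark_df.select(*columns).limit(n_rows).toPandas()


def infer_spark_model_signature(
    input_df: Any,
    predictions_df: Any,
    feature_cols: List[str],
    prediction_col: str = "prediction",
    n_rows: int = 5,
) -> Any:
    """Infer an MLflow signature from pandas samples instead of Spark DataFrames."""
    # Imported here so the sampling helper can be used without MLflow installed.
    from mlflow.models import infer_signature

    # Only the named scalar columns are sampled; a VectorUDT features column
    # is never passed to infer_signature.
    model_input = sample_to_pandas(input_df, feature_cols, n_rows)
    model_output = sample_to_pandas(predictions_df, [prediction_col], n_rows)
    return infer_signature(model_input, model_output)
```

Here `predictions_df` would be the output of the fitted model’s `transform` call on the training data. A handful of rows is enough, since only the column names and types end up in the signature.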
With MLflow autologging, the model signature does not have to be defined explicitly. However, I have not yet tried autologging for model deployment.
Logging a model
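A minimal sketch of this step for a scikit-learn model, assuming `model` is already fitted and `X_sample` is a small pandas DataFrame of input rows. The function name and the default artifact path `model` are choices made for this example only.

```python
from typing import Any


def log_sklearn_model(model: Any, X_sample: Any, artifact_path: str = "model") -> str:
    """Log a fitted scikit-learn model with a signature and return its model URI."""
    # Imported inside the function so this file can be loaded without MLflow installed.
    import mlflow
    from mlflow.models import infer_signature

    # The signature records the input and output schema, inferred from example data.
    signature = infer_signature(X_sample, model.predict(X_sample))

    with mlflow.start_run():
        model_info = mlflow.sklearn.log_model(
            model,
            artifact_path,
            signature=signature,
        )
    # The returned URI can be passed on to the registration step.
    return model_info.model_uri
```

For a Spark ML model, the analogous call is `mlflow.spark.log_model`, which also accepts a `signature` argument.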
Registering a model
After including the model signature while logging the model, I was able to successfully register both sklearn and Spark models.
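Registration can be sketched as below, assuming the model was already logged with a signature and that the target catalog and schema exist. Unity Catalog model names have three levels, `<catalog>.<schema>.<model>`. The function name is hypothetical, and any catalog, schema, or model name you pass in is your own.

```python
from typing import Any


def register_in_unity_catalog(model_uri: str, full_name: str) -> Any:
    """Register a logged model under a three-level Unity Catalog name."""
    if full_name.count(".") != 2:
        raise ValueError("expected a name of the form <catalog>.<schema>.<model>")

    # Imported inside the function so the name check works without MLflow installed.
    import mlflow

    # Point the MLflow client at the Unity Catalog registry rather than
    # the legacy workspace model registry.
    mlflow.set_registry_uri("databricks-uc")

    # Returns a ModelVersion; its .version attribute identifies the new version.
    return mlflow.register_model(model_uri, full_name)
```

The `model_uri` argument is the URI of the logged model, for example a `runs:/<run_id>/model` URI.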
Staging model to production
While figuring out how to stage a registered model to production in Unity Catalog, I finally found out that registering models to the workspace model registry is now legacy and no longer the current standard. I found a guide that explains the process of managing a model lifecycle in Unity Catalog; had I found it earlier, I would have understood the new way of logging and registering models much sooner.
I learned to promote models to production with aliases, which is different from the stage transitions used in the workspace model registry.
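A rough sketch of assigning an alias to a registered model version, assuming the model is already registered in Unity Catalog. The function name is hypothetical, and `champion` is just the alias used in this article; any alias name works.

```python
def promote_with_alias(full_name: str, version: str, alias: str = "champion") -> None:
    """Point an alias at one version of a Unity Catalog registered model."""
    # Imported inside the function so this file can be loaded without MLflow installed.
    from mlflow import MlflowClient

    client = MlflowClient(registry_uri="databricks-uc")
    # Reassigning an existing alias moves it to the new version.
    client.set_registered_model_alias(full_name, alias, version)
```

The `version` value can be taken from the `ModelVersion` object returned when the model was registered.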
Load model for inference
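A minimal sketch of loading a model by alias, assuming a `champion` alias has already been assigned. The helper names are hypothetical. The `models:/<name>@<alias>` URI form is how MLflow refers to an aliased model version.

```python
from typing import Any


def alias_model_uri(full_name: str, alias: str = "champion") -> str:
    """Build the MLflow URI for the model version an alias points to."""
    return f"models:/{full_name}@{alias}"


def load_model_by_alias(full_name: str, alias: str = "champion") -> Any:
    """Load the aliased model version as a generic pyfunc model."""
    # Imported inside the function so the URI helper works without MLflow installed.
    import mlflow

    mlflow.set_registry_uri("databricks-uc")
    return mlflow.pyfunc.load_model(alias_model_uri(full_name, alias))
```

The returned object exposes a `predict` method that accepts a pandas DataFrame matching the logged signature.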
This is an example of the steps required to manage the model lifecycle in Unity Catalog. I have included full code examples with sklearn and Spark models to illustrate the MLOps lifecycle in Databricks. You can adapt the code to your own training data and models; the deployment process should remain similar.
The overall process for model deployment on Databricks with Unity Catalog is as follows:
- Create a catalog and schema in Unity Catalog.
- Train your machine learning model on a dataset in a notebook.
- Log the model with a signature (save the model and any metadata in Unity Catalog). Including the signature captures the model’s input and output schema, which keeps the model consistent as you move it to another stage.
- Register the model: this creates a versioned record of the model in Unity Catalog so that it can be managed. The model must be registered before it can be served.
- Transition the model to production: move models through the lifecycle with aliases, and label your top-performing model “champion” for production use (model serving).
Reflection
As I have transitioned to deploying models on Unity Catalog, the main reasons I have found for Databricks making this change are unified governance and enhanced model management. With unified governance, it is easier to track model versions and data lineage and to reduce errors caused by using the wrong model version. I personally find that creating model aliases or adding tags makes it easier to keep track of a model than transitioning it to the production stage the old way. Model deployment on Unity Catalog is smooth once you understand how the steps differ between deploying to the workspace model registry and deploying to Unity Catalog. I struggled for a while because I wasn’t aware of the changes I have outlined here, including model signatures and aliases. I hope this article helps as you navigate model deployment and manage machine learning lifecycles on Databricks with MLflow.
Awnish is a Senior Machine Learning Engineer at AnthologyAI. He previously worked in the energy and AdTech space. He has successfully managed end-to-end projects as both a Data Engineer and Data Scientist, using unstructured data to create models that deliver real business value with tools like PySpark and Databricks. Additionally, Awnish has MLOps expertise, allowing him to deploy models efficiently to AWS within a strong machine learning pipeline that includes MLflow and Docker. He holds a Master of Science degree in Computer Science from the Georgia Institute of Technology.