A machine learning engineer is in the process of implementing a concept drift monitoring solution. They are planning to use the following steps:
1. Deploy a model to production and compute predicted values
2. Obtain the observed (actual) label values
3. _____
4. Run a statistical test to determine if there are changes over time
Which of the following should be completed as Step #3?
A data scientist has developed a scikit-learn modelsklearn_modeland they want to log the model using MLflow.
They write the following incomplete code block:
Which of the following lines of code can be used to fill in the blank so the code block can successfully complete the task?
A machine learning engineer needs to select a deployment strategy for a new machine learning application. The feature values are not available until the time of delivery, and results are needed exceedingly fast for one record at a time.
Which of the following deployment strategies can be used to meet these requirements?
In a continuous integration, continuous deployment (CI/CD) process for machine learning pipelines, which of the following events commonly triggers the execution of automated testing?
After a data scientist noticed that a column was missing from a production feature set stored as a Delta table, the machine learning engineering team has been tasked with determining when the column was dropped from the feature set.
Which of the following SQL commands can be used to accomplish this task?
Which of the following is a probable response to identifying drift in a machine learning application?
Which of the following Databricks-managed MLflow capabilities is a centralized model store?
A machine learning engineer is manually refreshing a model in an existing machine learning pipeline. The pipeline uses the MLflow Model Registry model "project". The machine learning engineer would like to add a new version of the model to "project".
Which of the following MLflow operations can the machine learning engineer use to accomplish this task?
A machine learning engineer has created a webhook with the following code block:
Which of the following code blocks will trigger this webhook to run the associate job?
A)
B)
C)
D)
E)
A machine learning engineer wants to move their model versionmodel_versionfor the MLflow Model Registry modelmodelfrom the Staging stage to the Production stage using MLflow Clientclient. At the same time, they would like to archive any model versions that are already in the Production stage.
Which of the following code blocks can they use to accomplish the task?
A)
B)
C)
D)
A machine learning engineer is using the following code block as part of a batch deployment pipeline:
Which of the following changes needs to be made so this code block will work when theinferencetable is a stream source?
A machine learning engineer wants to programmatically create a new Databricks Job whose schedule depends on the result of some automated tests in a machine learning pipeline.
Which of the following Databricks tools can be used to programmatically create the Job?
A machine learning engineer has developed a random forest model using scikit-learn, logged the model using MLflow as random_forest_model, and stored its run ID in the run_id Python variable. They now want to deploy that model by performing batch inference on a Spark DataFrame spark_df.
Which of the following code blocks can they use to create a function called predict that they can use to complete the task?
A)
B)
It is not possible to deploy a scikit-learn model on a Spark DataFrame.
C)
D)
E)
Which of the following tools can assist in real-time deployments by packaging software with its own application, tools, and libraries?
Which of the following is a benefit of logging a model signature with an MLflow model?