Databricks-Machine-Learning-Professional Databricks Certified Machine Learning Professional Questions and Answers

Questions 4

A machine learning engineer is in the process of implementing a concept drift monitoring solution. They are planning to use the following steps:

1. Deploy a model to production and compute predicted values

2. Obtain the observed (actual) label values

3. _____

4. Run a statistical test to determine if there are changes over time

Which of the following should be completed as Step #3?

Options:

Obtain the observed (actual) feature values

Measure the latency of the prediction time

Retrain the model

None of these should be completed as Step #3

Compute the evaluation metric using the observed and predicted values

Questions 5

A data scientist has developed a scikit-learn model sklearn_model and they want to log the model using MLflow.

They write the following incomplete code block:

Which of the following lines of code can be used to fill in the blank so the code block can successfully complete the task?

Options:

mlflow.spark.track_model(sklearn_model, "model")

mlflow.sklearn.log_model(sklearn_model, "model")

mlflow.spark.log_model(sklearn_model, "model")

mlflow.sklearn.load_model("model")

mlflow.sklearn.track_model(sklearn_model, "model")

Questions 6

A machine learning engineer needs to select a deployment strategy for a new machine learning application. The feature values are not available until the time of delivery, and results are needed exceedingly fast for one record at a time.

Which of the following deployment strategies can be used to meet these requirements?

Options:

Edge/on-device

Streaming

None of these strategies will meet the requirements.

Batch

Real-time

Questions 7

In a continuous integration, continuous deployment (CI/CD) process for machine learning pipelines, which of the following events commonly triggers the execution of automated testing?

Options:

The launch of a new cost-efficient SQL endpoint

CI/CD pipelines are not needed for machine learning pipelines

The arrival of a new feature table in the Feature Store

The launch of a new cost-efficient job cluster

The arrival of a new model version in the MLflow Model Registry

Questions 8

After a data scientist noticed that a column was missing from a production feature set stored as a Delta table, the machine learning engineering team has been tasked with determining when the column was dropped from the feature set.

Which of the following SQL commands can be used to accomplish this task?

Options:

VERSION

DESCRIBE

HISTORY

DESCRIBE HISTORY

TIMESTAMP

Questions 9

Which of the following is a probable response to identifying drift in a machine learning application?

Options:

None of these responses

Retraining and deploying a model on more recent data

All of these responses

Rebuilding the machine learning application with a new label variable

Sunsetting the machine learning application

Answer: All of these responses

Explanation:

Drift is a change over time in the statistical properties of the data a model receives, relative to the data that was used to train it. This can cause the model to become less accurate or to perform differently than it was designed to [1]. Drift can be detected by monitoring the statistics of the input and output data over time and comparing them with the baseline statistics from the training data [2]. Depending on the type and severity of the drift, different responses may be appropriate. Some possible responses are:

Retraining and deploying a model on more recent data: This can help the model adapt to the changes in the data and improve its performance. However, it may require frequent retraining and deployment cycles, which can be costly and time-consuming. Retraining may also be insufficient if the drift is caused by a change in the underlying concept or relationship between the input and output variables [3].
Rebuilding the machine learning application with a new label variable: This can help the model capture a new concept or relationship that has emerged in the data. However, it may require a significant redesign of the application and the data pipeline, as well as collecting and labeling new data. Rebuilding may not be feasible if the concept or relationship is constantly changing or unknown [3].
Sunsetting the machine learning application: This can help avoid the risks and costs of maintaining a model that is no longer reliable or useful. However, it means losing the benefits and value of the application and the data. Sunsetting may not be an option if the application is critical or mandatory for the business or the users [3].

Therefore, all of these responses are probable, depending on the situation and the trade-offs involved.

References:

Databricks Machine Learning Professional Exam Guide, Section 4: Solution and Data Monitoring, p. 5
Databricks Machine Learning Documentation, Monitoring ML Models, Data Drift Detection, p. 2-3
A Gentle Introduction to Concept Drift in Machine Learning, Types of Concept Drift, p. 3-4
Understanding Data Drift and Model Drift: Drift Detection in Python, Types of Drift, p. 2-3
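
As a rough illustration of the detection step described in the explanation above (comparing recent data with baseline statistics from the training data), the following sketch computes a two-sample Kolmogorov-Smirnov statistic in plain Python. The function names ks_statistic and drift_detected, the alpha default, and the choice of the KS test itself are assumptions made for illustration; the references above do not prescribe a specific test, and the critical value used here is the large-sample approximation.

```python
import math
from bisect import bisect_right


def ks_statistic(baseline, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(baseline), sorted(current)
    # The largest gap is always attained at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def drift_detected(baseline, current, alpha=0.05):
    """Flag drift when the statistic exceeds the asymptotic critical value.

    Assumption: both samples are large enough for the approximation
    c(alpha) * sqrt((n + m) / (n * m)) to be reasonable.
    """
    n, m = len(baseline), len(current)
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(baseline, current) > critical


# Hypothetical feature values: training baseline vs. a shifted recent window.
training_values = [i / 100 for i in range(100)]
recent_values = [1 + i / 100 for i in range(100)]
print(drift_detected(training_values, training_values))
print(drift_detected(training_values, recent_values))
```

A statistic near 0 means the two samples have similar distributions. A flagged result is a prompt to investigate which of the responses above fits, not proof that retraining is required.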

Questions 10

Which of the following Databricks-managed MLflow capabilities is a centralized model store?

Options:

Models

Model Registry

Model Serving

Feature Store

Experiments

Questions 11

A machine learning engineer is manually refreshing a model in an existing machine learning pipeline. The pipeline uses the MLflow Model Registry model "project". The machine learning engineer would like to add a new version of the model to "project".

Which of the following MLflow operations can the machine learning engineer use to accomplish this task?

Options:

mlflow.register_model

MlflowClient.update_registered_model

mlflow.add_model_version

MlflowClient.get_model_version

The machine learning engineer needs to create an entirely new MLflow Model Registry model

Questions 12

A machine learning engineer has created a webhook with the following code block:

Which of the following code blocks will trigger this webhook to run the associated job?

Options:

Option A

Option B

Option C

Option D

Option E

Questions 13

A machine learning engineer wants to move their model version model_version for the MLflow Model Registry model model from the Staging stage to the Production stage using MLflow Client client. At the same time, they would like to archive any model versions that are already in the Production stage.

Which of the following code blocks can they use to accomplish the task?

Options:

Option A

Option B

Option C

Option D

Questions 14

A machine learning engineer is using the following code block as part of a batch deployment pipeline:

Which of the following changes needs to be made so this code block will work when the inference table is a stream source?

Options:

Replace "inference" with the path to the location of the Delta table

Replace schema(schema) with option("maxFilesPerTrigger", 1)

Replace spark.read with spark.readStream

Replace format("delta") with format("stream")

Replace predict with a stream-friendly prediction function

Answer: Replace spark.read with spark.readStream

Explanation:

To read data from a stream source, spark.readStream should be used instead of spark.read. Loading data through spark.readStream yields a streaming DataFrame that represents the unbounded input data stream. It accepts many of the same formats and options as spark.read, such as schema, delta, csv, and json, alongside streaming-only sources such as Kafka, socket, and rate. It can also read from a Delta table as a stream source by specifying format("delta") and the path or table name of the Delta table [1][2][3].

The other options are incorrect because:

A. Replacing "inference" with the path to the location of the Delta table still leaves spark.read in place, which performs a batch read rather than a streaming read. Once spark.readStream is used, either the path or the table name of the Delta table can be specified.
B. Replacing schema(schema) with option("maxFilesPerTrigger", 1) still leaves spark.read in place. maxFilesPerTrigger is an optional configuration that limits the number of files processed in each trigger for file-based stream sources, such as delta, csv, and json. It does not turn a batch read into a streaming read [4].
D. Replacing format("delta") with format("stream") still leaves spark.read in place, and format("stream") is not a valid format for reading from a stream source. The supported formats include delta, kafka, socket, and rate [1].
E. Replacing predict with a stream-friendly prediction function is not required. The same predict function can be applied to a streaming DataFrame as usual, as long as it accepts the input columns and returns a column of predictions [5].
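
The change described above can be sketched as follows. The original code block is not shown in this section, so this is not a reconstruction of it: the function name score_inference_stream, the feature_cols parameter, and the "prediction" column name are hypothetical, and predict is assumed to be a Spark UDF (for example, one created with mlflow.pyfunc.spark_udf). The maxFilesPerTrigger option is included only to show where such an optional setting would go.

```python
def score_inference_stream(spark, predict, feature_cols, table_name="inference"):
    """Read a Delta table as a stream source and append a prediction column.

    Assumptions (hypothetical, not taken from the original code block):
    `spark` is an active SparkSession, `predict` is a Spark UDF, and
    `feature_cols` lists the model's input column names.
    """
    stream_df = (
        spark.readStream                  # streaming reader instead of spark.read
        .format("delta")                  # a Delta table is a valid stream source
        .option("maxFilesPerTrigger", 1)  # optional: cap files per micro-batch
        .table(table_name)
    )
    # The same UDF used for batch scoring is applied to the streaming columns.
    return stream_df.withColumn("prediction", predict(*feature_cols))
```

The result is a streaming DataFrame, so it still has to be written out with writeStream before any predictions are produced.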

References:

Input Sources - Structured Streaming Programming Guide - Spark 3.2.0 Documentation
Structured Streaming + Delta Lake - Databricks
Structured Streaming Programming Guide - Spark 3.2.0 Documentation
Configuration - Structured Streaming Programming Guide - Spark 3.2.0 Documentation
Machine Learning with Structured Streaming - Databricks

Questions 15

A machine learning engineer wants to programmatically create a new Databricks Job whose schedule depends on the result of some automated tests in a machine learning pipeline.

Which of the following Databricks tools can be used to programmatically create the Job?

Options:

MLflow APIs

AutoML APIs

MLflow Client

Jobs cannot be created programmatically

Databricks REST APIs

Questions 16

A machine learning engineer has developed a random forest model using scikit-learn, logged the model using MLflow as random_forest_model, and stored its run ID in the run_id Python variable. They now want to deploy that model by performing batch inference on a Spark DataFrame spark_df.

Which of the following code blocks can they use to create a function called predict that they can use to complete the task?

It is not possible to deploy a scikit-learn model on a Spark DataFrame.

Options:

Option A

Option B

Option C

Option D

Option E

Questions 17

Which of the following tools can assist in real-time deployments by packaging software with its own application, tools, and libraries?

Options:

Cloud-based compute

None of these tools

REST APIs

Containers

Autoscaling clusters

Questions 18

Which of the following is a benefit of logging a model signature with an MLflow model?

Options:

The model will have a unique identifier in the MLflow experiment

The schema of input data can be validated when serving models

The model can be deployed using real-time serving tools

The model will be secured by the user that developed it

The schema of input data will be converted to match the signature

Exam Code: Databricks-Machine-Learning-Professional

Exam Name: Databricks Certified Machine Learning Professional

Last Update: Jun 29, 2025

Questions: 60
