Winter Special Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: geek65

DP-203 Data Engineering on Microsoft Azure Questions and Answers

Questions 4

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

Options:

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Buy Now
Questions 5

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Buy Now
Questions 6

You need to design a data storage structure for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 7

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction

dataset requirements.

What should you create?

Options:

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint

Buy Now
Questions 8

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 9

You need to implement an Azure Synapse Analytics database object for storing the sales transactions data. The solution must meet the sales transaction dataset requirements.

What solution must meet the sales transaction dataset requirements.

What should you do? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 10

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area

NOTE: Each correct selection b worth one point.

Options:

Buy Now
Questions 11

What should you recommend using to secure sensitive customer contact information?

Options:

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Buy Now
Questions 12

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

Options:

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Buy Now
Questions 13

You have an Azure Data Factory pipeline that has the activities shown in the following exhibit.

Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 14

You have an Azure Synapse Analytics dedicated SQL pool named Pool1 and a database named DB1. DB1 contains a fact table named Table1.

You need to identify the extent of the data skew in Table1.

What should you do in Synapse Studio?

Options:

A.

Connect to the built-in pool and query sysdm_pdw_sys_info.

B.

Connect to Pool1 and run DBCC CHECKALLOC.

C.

Connect to the built-in pool and run DBCC CHECKALLOC.

D.

Connect to Pool! and query sys.dm_pdw_nodes_db_partition_stats.

Buy Now
Questions 15

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 16

What should you do to improve high availability of the real-time data processing solution?

Options:

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Buy Now
Questions 17

You have an Azure subscription that contains a storage account. The account contains a blob container named blob1 and an Azure Synapse Analytic serve-less SQL pool

You need to Query the CSV files stored in blob1. The solution must ensure that all the files in a (older named csv and all its subfolders are queried

How should you complete the query? to answer, select the appropriate options in the answer area

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 18

You have an Azure Data Factory that contains 10 pipelines.

You need to label each pipeline with its main purpose of either ingest, transform, or load. The labels must be available for grouping and filtering when using the monitoring experience in Data Factory.

What should you add to each pipeline?

Options:

A.

a resource tag

B.

a correlation ID

C.

a run group ID

D.

an annotation

Buy Now
Questions 19

You have an Azure Synapse Analytics dedicated SQL pod.

You need to create a pipeline that will execute a stored procedure in the dedicated SQL pool and use the returned result set as the input (or a downstream activity. The solution must minimize development effort.

Which Type of activity should you use in the pipeline?

Options:

A.

Notebook

B.

U-SQL

C.

Script

D.

Stored Procedure

Buy Now
Questions 20

You are designing a monitoring solution for a fleet of 500 vehicles. Each vehicle has a GPS tracking device that sends data to an Azure event hub once per minute.

You have a CSV file in an Azure Data Lake Storage Gen2 container. The file maintains the expected geographical area in which each vehicle should be.

You need to ensure that when a GPS position is outside the expected area, a message is added to another event hub for processing within 30 seconds. The solution must minimize cost.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 21

You are responsible for providing access to an Azure Data Lake Storage Gen2 account.

Your user account has contributor access to the storage account, and you have the application ID and access key.

You plan to use PolyBase to load data into an enterprise data warehouse in Azure Synapse Analytics.

You need to configure PolyBase to connect the data warehouse to storage account.

Which three components should you create in sequence? To answer, move the appropriate components from the list of components to the answer area and arrange them in the correct order.

Options:

Buy Now
Questions 22

Note: This question it part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Data Lake Storage account that contains a staging zone.

You need to design a daily process to ingest incremental data *rom the staging zone, transform the data by executing an R script and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that executes a mapping data flow, and then inserts the data into the data warehouse.

Does this meet the goal?

Options:

A.

Yes

B.

NO

Buy Now
Questions 23

You have a SQL pool in Azure Synapse.

A user reports that queries against the pool take longer than expected to complete.

You need to add monitoring to the underlying storage to help diagnose the issue.

Which two metrics should you monitor? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Cache used percentage

B.

DWU Limit

C.

Snapshot Storage Size

D.

Active queries

E.

Cache hit percentage

Buy Now
Questions 24

You are designing an Azure Synapse Analytics dedicated SQL pool.

You need to ensure that you can audit access to Personally Identifiable information (PII).

What should you include in the solution?

Options:

A.

dynamic data masking

B.

row-level security (RLS)

C.

sensitivity classifications

D.

column-level security

Buy Now
Questions 25

You need to build a solution to ensure that users can query specific files in an Azure Data Lake Storage Gen2 account from an Azure Synapse Analytics serverless SQL pool.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Options:

Buy Now
Questions 26

You have an Azure Storage account that generates 200,000 new files daily. The file names have a format of {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv.

You need to design an Azure Data Factory solution that will load new data from the storage account to an Azure Data Lake once hourly. The solution must minimize load times and costs.

How should you configure the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 27

You have an Azure Synapse Analytics dedicated SQL pool named pool1.

You plan to implement a star schema in pool1 and create a new table named DimCustomer by using the following code.

You need to ensure that DimCustomer has the necessary columns to support a Type 2 slowly changing dimension (SCD). Which two columns should you add? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

Options:

A.

[HistoricalSalesPerson] [nvarchar] (256) NOT NULL

B.

[EffectiveEndDate] [datetime] NOT NULL

C.

[PreviousModifiedDate] [datetime] NOT NULL

D.

[RowID] [bigint] NOT NULL

E.

[EffectiveStartDate] [datetime] NOT NULL

Buy Now
Questions 28

You have an Azure subscription that contains an Azure Cosmos DB database. Azure Synapse Link is implemented on the database. You configure a full fidelity schema for the analytical store. You perform the following actions:

How many columns will the analytical store contain?

Options:

A.

1

B.

2

C.

3

D.

4

Buy Now
Questions 29

You are building an Azure Stream Analytics job to retrieve game data.

You need to ensure that the job returns the highest scoring record for each five-minute time interval of each game.

How should you complete the Stream Analytics query? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 30

You have a C# application that process data from an Azure IoT hub and performs complex transformations.

You need to replace the application with a real-time solution. The solution must reuse as much code as

possible from the existing application.

Options:

A.

Azure Databricks

B.

Azure Event Grid

C.

Azure Stream Analytics

D.

Azure Data Factory

Buy Now
Questions 31

You have an Azure subscription that contains an Azure Synapse Analytics workspace named Workspaces a Log Analytics workspace named Workspace2, and an Azure Data Lake Storage Gen2 container named Container1.

Workspace1 contains an Apache Spark job named Job1 that writes data to Container1. Workspace1 sends diagnostics to Workspace2.

From Synapse Studio, you submit Job1.

What should you use to review the LogQuery output of the job?

Options:

A.

the files in the result subfolder of Container1

B.

the Spark monitoring URL returned after Job1 is submitted

C.

a table in Workspace2

D.

the Apache Spark applications option on the Monitor tab

Buy Now
Questions 32

A company uses Azure Stream Analytics to monitor devices.

The company plans to double the number of devices that are monitored.

You need to monitor a Stream Analytics job to ensure that there are enough processing resources to handle the additional load.

Which metric should you monitor?

Options:

A.

Early Input Events

B.

Late Input Events

C.

Watermark delay

D.

Input Deserialization Errors

Buy Now
Questions 33

You have an Azure Synapse Analytics dedicated SQL pool that hosts a database named DB1 You need to ensure that D81 meets the following security requirements:

• When credit card numbers show in applications, only the last four digits must be visible.

• Tax numbers must be visible only to specific users.

What should you use for each requirement? To answer, select the appropriate options in the answer area

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 34

You are building an Azure Stream Analytics job to identify how much time a user spends interacting with a feature on a webpage.

The job receives events based on user actions on the webpage. Each row of data represents an event. Each event has a type of either 'start' or 'end'.

You need to calculate the duration between start and end events.

How should you complete the query? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 35

You are designing a real-time dashboard solution that will visualize streaming data from remote sensors that connect to the internet. The streaming data must be aggregated to show the average value of each 10-second interval. The data will be discarded after being displayed in the dashboard.

The solution will use Azure Stream Analytics and must meet the following requirements:

    Minimize latency from an Azure Event hub to the dashboard.

    Minimize the required storage.

    Minimize development effort.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point

Options:

Buy Now
Questions 36

You have an Azure Synapse Analytics serverless SQL pool named Pool1 and an Azure Data Lake Storage Gen2 account named storage1. The AllowedBlobpublicAccess porperty is disabled for storage1.

You need to create an external data source that can be used by Azure Active Directory (Azure AD) users to access storage1 from Pool1.

What should you create first?

Options:

A.

an external resource pool

B.

a remote service binding

C.

database scoped credentials

D.

an external library

Buy Now
Questions 37

You have a trigger in Azure Data Factory configured as shown in the following exhibit.

Use the drop-down menus to select the answer choice that completes each statement based upon the information presented in the graphic.

Options:

Buy Now
Questions 38

You configure version control for an Azure Data Factory instance as shown in the following exhibit.

Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 39

You plan to create an Azure Synapse Analytics dedicated SQL pool.

You need to minimize the time it takes to identify queries that return confidential information as defined by the company's data privacy regulations and the users who executed the queues.

Which two components should you include in the solution? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

sensitivity-classification labels applied to columns that contain confidential information

B.

resource tags for databases that contain confidential information

C.

audit logs sent to a Log Analytics workspace

D.

dynamic data masking for columns that contain confidential information

Buy Now
Questions 40

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data.

You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a session window that uses a timeout size of 10 seconds.

Does this meet the goal?

Options:

A.

Yes

B.

No

Buy Now
Questions 41

You have an enterprise data warehouse in Azure Synapse Analytics named DW1 on a server named Server1.

You need to verify whether the size of the transaction log file for each distribution of DW1 is smaller than 160 GB.

What should you do?

Options:

A.

On the master database, execute a query against the sys.dm_pdw_nodes_os_performance_counters dynamic management view.

B.

From Azure Monitor in the Azure portal, execute a query against the logs of DW1.

C.

On DW1, execute a query against the sys.database_files dynamic management view.

D.

Execute a query against the logs of DW1 by using the

Get-AzOperationalInsightSearchResult PowerShell cmdlet.

Buy Now
Questions 42

You have two fact tables named Flight and Weather. Queries targeting the tables will be based on the join between the following columns.

You need to recommend a solution that maximum query performance.

What should you include in the recommendation?

Options:

A.

In each table, create a column as a composite of the other two columns in the table.

B.

In each table, create an IDENTITY column.

C.

In the tables, use a hash distribution of ArriveDateTime and ReportDateTime.

D.

In the tables, use a hash distribution of ArriveAirPortID and AirportID.

Buy Now
Questions 43

You are designing a dimension table for a data warehouse. The table will track the value of the dimension attributes over time and preserve the history of the data by adding new rows as the data changes.

Which type of slowly changing dimension (SCD) should use?

Options:

A.

Type 0

B.

Type 1

C.

Type 2

D.

Type 3

Buy Now
Questions 44

You are designing an Azure Databricks interactive cluster. The cluster will be used infrequently and will be configured for auto-termination.

You need to ensure that the cluster configuration is retained indefinitely after the cluster is terminated. The solution must minimize costs.

What should you do?

Options:

A.

Clone the cluster after it is terminated.

B.

Terminate the cluster manually when processing completes.

C.

Create an Azure runbook that starts the cluster every 90 days.

D.

Pin the cluster.

Buy Now
Questions 45

A company plans to use Apache Spark analytics to analyze intrusion detection data.

You need to recommend a solution to analyze network and system activity data for malicious activities and policy violations. The solution must minimize administrative efforts.

What should you recommend?

Options:

A.

Azure Data Lake Storage

B.

Azure Databricks

C.

Azure HDInsight

D.

Azure Data Factory

Buy Now
Questions 46

You are designing an Azure Synapse Analytics workspace.

You need to recommend a solution to provide double encryption of all the data at rest.

Which two components should you include in the recommendation? Each coned answer presents part of the solution

NOTE: Each correct selection is worth one point.

Options:

A.

an X509 certificate

B.

an RSA key

C.

an Azure key vault that has purge protection enabled

D.

an Azure virtual network that has a network security group (NSG)

E.

an Azure Policy initiative

Buy Now
Questions 47

You are designing an Azure Synapse Analytics dedicated SQL pool.

Groups will have access to sensitive data in the pool as shown in the following table.

You have policies for the sensitive data. The policies vary be region as shown in the following table.

You have a table of patients for each region. The tables contain the following potentially sensitive columns.

You are designing dynamic data masking to maintain compliance.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Questions 48

You are planning a streaming data solution that will use Azure Databricks. The solution will stream sales transaction data from an online store. The solution has the following specifications:

* The output data will contain items purchased, quantity, line total sales amount, and line total tax amount.

* Line total sales amount and line total tax amount will be aggregated in Databricks.

* Sales transactions will never be updated. Instead, new rows will be added to adjust a sale.

You need to recommend an output mode for the dataset that will be processed by using Structured Streaming. The solution must minimize duplicate data.

What should you recommend?

Options:

A.

Append

B.

Update

C.

Complete

Buy Now
Questions 49

You have an Azure Data Lake Storage Gen2 account named account1 that stores logs as shown in the following table.

You do not expect that the logs will be accessed during the retention periods.

You need to recommend a solution for account1 that meets the following requirements:

    Automatically deletes the logs at the end of each retention period

    Minimizes storage costs

What should you include in the recommendation? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Buy Now
Exam Code: DP-203
Exam Name: Data Engineering on Microsoft Azure
Last Update: Jan 18, 2025
Questions: 351
DP-203 pdf

DP-203 PDF

$33.25  $94.99
DP-203 Engine

DP-203 Testing Engine

$38.5  $109.99
DP-203 PDF + Engine

DP-203 PDF + Testing Engine

$50.75  $144.99