A bank wants to use an algorithm to determine which applicants should be given a loan. The bank hires a data scientist to construct a logistic regression model to predict whether the applicant will repay the loan or not. The bank has enough data on past customers to randomly split the data into a training data set and a test/validation data set. A logistic regression model is constructed on the training data set using the following independent variables:
Gender
Marital status
Number of dependents
Education
Income
Loan amount
Loan term
Credit score
The model reveals that those with higher credit scores and larger total incomes are more likely to repay their loans. The data scientist has suggested that there might be bias present in the model based on previous models created for other banks.
Given this information, what is the best test approach to check for potential bias in the model?
“BioSearch” is creating an Al model used for predicting cancer occurrence via examining X-Ray images. The accuracy of the model in isolation has been found to be good. However, the users of the model started complaining of the poor quality of results, especially inability to detect real cancer cases, when put to practice in the diagnosis lab, leading to stopping of the usage of the model.
A testing expert was called in to find the deficiencies in the test planning which led to the above scenario.
Which ONE of the following options would you expect to MOST likely be the reason to be discovered by the test expert?
SELECT ONE OPTION
Which of the following characteristics of AI-based systems make it more difficult to ensure they are safe?
Which ONE of the following tests is MOST likely to describe a useful test to help detect different kinds of biases in ML pipeline?
SELECT ONE OPTION
An airline has created a ML model to project fuel requirements for future flights. The model imports weather data such as wind speeds and temperatures, calculates flight routes based on historical routings from air traffic control, and estimates loads from average passenger and baggage weights. The model performed within an acceptable standard for the airline throughout the summer but as winter set in the load weights became less accurate. After some exploratory data analysis it became apparent that luggage weights were higher in the winter than in summer.
Which of the following statements BEST describes the problem and how it could have been prevented?
Which ONE of the following options does NOT describe a challenge for acquiring test data in ML systems?
SELECT ONE OPTION
A beer company is trying to understand how much recognition its logo has in the market. It plans to do that by monitoring images on various social media platforms using a pre-trained neural network for logo detection. This particular model has been trained by looking for words, as well as matching colors on social media images. The company logo has a big word across the middle with a bold blue and magenta border.
Which associated risk is most likely to occur when using this pre-trained model?
Arihant Meditation is a startup using Al to aid people in deeper and better meditation based on analysis of various factors such as time and duration of the meditation, pulse and blood pressure, EEG patters etc. among others. Their model accuracy and other functional performance parameters have not yet reached their desired level.
Which ONE of the following factors is NOT a factor affecting the ML functional performance?
SELECT ONE OPTION
Which ONE of the following is the BEST option to optimize the regression test selection and prevent the regression suite from growing large?
SELECT ONE OPTION
A company is using a spam filter to attempt to identify which emails should be marked as spam. Detection rules are created by the filter that causes a message to be classified as spam. An attacker wishes to have all messages internal to the company be classified as spam. So, the attacker sends messages with obvious red flags in the body of the email and modifies the from portion of the email to make it appear that the emails have been sent by company members. The testers plan to use exploratory data analysis (EDA) to detect the attack and use this information to prevent future adversarial attacks.
How could EDA be used to detect this attack?
Upon testing a model used to detect rotten tomatoes, the following data was observed by the test engineer, based on certain number of tomato images.
For this confusion matrix which combinations of values of accuracy, recall, and specificity respectively is CORRECT?
SELECT ONE OPTION
Which ONE of the following characteristics is the least likely to cause safety related issues for an Al system?
SELECT ONE OPTION
Consider a natural language processing (NLP) algorithm that attempts to predict the next word that you would like to type in a text message. An update to the algorithm has been created that should increase the accuracy of the predictions based on user typing patterns. The old algorithm was rated for accuracy by the users. Then, after the new update was released, the users rated the updated algorithm. A statistical test was used to compare between the two versions of the algorithm to see whether or not the update should remain in place.
This is an example of what type of testing?
A team of software testers is attempting to create an AI algorithm to assist in software testing. This particular team has gone through over 40 iterations of testing and cannot afford to spend as much time as it takes to run the full regression test suite. They are hoping to have the algorithm reduce the amount of testing required thus reducing the time needed for each testing cycle.
How can an AI-based tool be expected to assist in this reduction?
"Splendid Healthcare" has started developing a cancer detection system based on ML. The type of cancer they plan on detecting has 2% prevalence rate in the population of a particular geography. It is required that the model performs well for both normal and cancer patients.
Which ONE of the following combinations requires MAXIMIZATION?
SELECT ONE OPTION
An image classification system is being trained for classifying faces of humans. The distribution of the data is 70% ethnicity A and 30% for ethnicities B, C and D. Based ONLY on the above information, which of the following options BEST describes the situation of this image classification system?
SELECT ONE OPTION
A tourist calls an airline to book a ticket and is connected with an automated system which is able to recognize speech, understand requests related to purchasing a ticket, and provide relevant travel options. When the tourist asks about the expected weather at the destination or potential impacts on operations because of the tight labor market the only response from the automated system is: "Idon't understand your question."
This AI system should be categorized as?
Which ONE of the following options describes a scenario of A/B testing the LEAST?
SELECT ONE OPTION
"AllerEgo" is a product that uses sell-learning to predict the behavior of a pilot under combat situation for a variety of terrains and enemy aircraft formations. Post training the model was exposed to the real-
world data and the model was found to be behaving poorly. A lot of data quality tests had been performed on the data to bring it into a shape fit for training and testing.
Which ONE of the following options is least likely to describes the possible reason for the fall in the performance, especially when considering the self-learning nature of the Al system?
SELECT ONE OPTION
The difficulty of defining criteria for improvement before the model can be accepted.
The fast pace of change did not allow sufficient time for testing.
The unknown nature and insufficient specification of the operating environment might have caused the poor performance.
Pairwise testing can be used in the context of self-driving cars for controlling an explosion in the number of combinations of parameters.
Which ONE of the following options is LEAST likely to be a reason for this incredible growth of parameters?
SELECT ONE OPTION