A machine learning (ML) specialist is training a linear regression model. The specialist notices that the model is overfitting. The specialist applies an L1 regularization parameter and runs the model again. This change results in all features having zero weights.
What should the ML specialist do to improve the model results?
Correct : B
Applying L1 regularization encourages sparsity by penalizing weights directly, often driving many weights to zero. In this case, the ML specialist observes that all weights become zero, which suggests that the L1 regularization parameter is set too high. This high value overly penalizes non-zero weights, effectively removing all features from the model.
To improve the model, the ML specialist should reduce the L1 regularization parameter, allowing some features to retain non-zero weights. This adjustment will make the model less prone to excessive sparsity, allowing it to better capture essential patterns in the data without dropping all features. Introducing L2 regularization is another approach but may not directly resolve this specific issue of all-zero weights as effectively as reducing L1.
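As a rough sketch of that adjustment, assuming a scikit-learn Lasso workflow with synthetic data and illustrative alpha values (none of which appear in the question), the effect of lowering the L1 strength shows up directly in the coefficient counts:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic stand-in for the specialist's data (illustration only).
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# An overly large L1 penalty can drive every coefficient to zero.
too_strong = Lasso(alpha=1000.0).fit(X, y)
print("non-zero weights with large alpha:", (too_strong.coef_ != 0).sum())

# Reducing the penalty lets informative features keep non-zero weights.
reduced = Lasso(alpha=1.0).fit(X, y)
print("non-zero weights with smaller alpha:", (reduced.coef_ != 0).sum())
```

With the large placeholder alpha every coefficient collapses to zero, while the smaller value leaves the informative features with non-zero weights.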
A machine learning (ML) specialist at a retail company must build a system to forecast the daily sales for one of the company's stores. The company provided the ML specialist with sales data for this store from the past 10 years. The historical dataset includes the total amount of sales on each day for the store. Approximately 10% of the days in the historical dataset are missing sales data.
The ML specialist builds a forecasting model based on the historical dataset. The specialist discovers that the model does not meet the performance standards that the company requires.
Which action will MOST likely improve the performance for the forecasting model?
Correct : D
When forecasting sales data, missing values can significantly impact model accuracy, especially for time series models. Approximately 10% of the days in this dataset lack sales data, which may cause gaps in patterns and disrupt seasonal trends. Linear interpolation is an effective technique for estimating and filling in missing data points based on adjacent known values, thus preserving the continuity of the time series.
By interpolating the missing values, the ML specialist can provide the model with a more complete and consistent dataset, potentially enhancing performance. This approach maintains the daily data granularity, which is important for accurately capturing trends at that frequency.
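A minimal sketch of the interpolation step, assuming the sales history is held in a pandas Series with a daily DatetimeIndex (the synthetic values below are illustrative only):

```python
import numpy as np
import pandas as pd

# Illustrative 10-year daily sales series with roughly 10% of days missing.
rng = np.random.default_rng(42)
dates = pd.date_range("2014-01-01", "2023-12-31", freq="D")
sales = pd.Series(rng.normal(10_000, 2_000, len(dates)), index=dates, name="sales")
sales[rng.random(len(dates)) < 0.10] = np.nan

# Linear interpolation estimates each gap from the adjacent known days,
# preserving the continuity and daily granularity of the time series.
filled = sales.interpolate(method="linear", limit_direction="both")
print(f"missing before: {sales.isna().sum()}, after: {filled.isna().sum()}")
```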
A company plans to build a custom natural language processing (NLP) model to classify and prioritize user feedback. The company hosts the data and all machine learning (ML) infrastructure in the AWS Cloud. The ML team works from the company's office, which has an IPsec VPN connection to one VPC in the AWS Cloud.
The company has set both the enableDnsHostnames attribute and the enableDnsSupport attribute of the VPC to true. The company's DNS resolvers point to the VPC DNS. The company does not allow the ML team to access Amazon SageMaker notebooks through connections that use the public internet. The connection must stay within a private network and within the AWS internal network.
Which solution will meet these requirements with the LEAST development effort?
Correct : A
In this scenario, the company requires that access to the Amazon SageMaker notebook remain within the AWS internal network and never traverse the public internet. Creating a VPC interface endpoint for SageMaker keeps traffic to the notebook inside the VPC and reachable over the private VPN connection. The interface endpoint provides private access to the service over AWS's internal network, satisfying the security and connectivity policies the company requires.
This solution requires minimal development effort compared to options involving bastion hosts or NAT gateways, as it directly provides private network access to the SageMaker notebook.
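A hedged sketch of creating such an interface endpoint with boto3; the VPC, subnet, and security group IDs are placeholders, and the service name shown should be confirmed for the target Region:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Placeholder resource IDs for illustration only.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    # Service name for SageMaker notebook access from within a VPC;
    # the SageMaker API and runtime use com.amazonaws.<region>.sagemaker.api
    # and com.amazonaws.<region>.sagemaker.runtime respectively.
    ServiceName="aws.sagemaker.us-east-1.notebook",
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,  # resolve the notebook DNS name to private IPs in the VPC
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```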
A bank has collected customer data for 10 years in CSV format. The bank stores the data in an on-premises server. A data science team wants to use Amazon SageMaker to build and train a machine learning (ML) model to predict churn probability. The team will use the historical data. The data scientists want to perform data transformations quickly and to generate data insights before the team builds a model for production.
Which solution will meet these requirements with the LEAST development effort?
Correct : B
To prepare and transform historical data efficiently with minimal setup, Amazon SageMaker Data Wrangler is the optimal tool. Data Wrangler simplifies data preprocessing and exploratory data analysis (EDA) by providing a graphical interface for transformations and insights. By first uploading the CSV data to Amazon S3, the data becomes easily accessible to SageMaker and can be imported directly into Data Wrangler.
Once in Data Wrangler, the team can perform required data transformations and generate insights in a single workflow, avoiding the need for additional tools like Amazon QuickSight or further notebook configuration. This approach provides the simplest and most integrated solution for the data science team.
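A small sketch of the first step, uploading the CSV to Amazon S3 with boto3 so it can be imported into Data Wrangler; the bucket, key, and file names are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket, key, and local file names; the bank would use its own.
bucket = "example-churn-data-bucket"
s3.upload_file("customer_history.csv", bucket, "raw/customer_history.csv")

# From the SageMaker Studio UI, the team can then import the S3 object into
# Data Wrangler to apply transformations and generate data insights.
```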
A machine learning (ML) engineer is preparing a dataset for a classification model. The ML engineer notices that some continuous numeric features have a significantly greater value than most other features. A business expert explains that the features are independently informative and that the dataset is representative of the target distribution.
After training, the model's inference accuracy is lower than expected.
Which preprocessing technique will result in the GREATEST increase of the model's inference accuracy?
Correct : A
In a classification model, features with significantly larger scales can dominate the model training process, leading to poor performance. Normalization scales the values of continuous features to a uniform range, such as [0, 1], which prevents large-value features from disproportionately influencing the model. This is particularly beneficial for algorithms sensitive to the scale of input data, such as neural networks or distance-based algorithms.
Given that the problematic features are informative and representative of the target distribution, removing or bootstrapping these features is not advisable. Normalization will bring all features to a similar scale and improve the model's inference accuracy without losing important information.
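A brief sketch of min-max normalization with scikit-learn, using made-up values in which one feature sits on a much larger scale than the others:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative feature matrix: the third column dwarfs the other features.
X = np.array([
    [0.5, 1.2, 25_000.0],
    [0.8, 0.9, 310_000.0],
    [0.1, 1.5, 87_500.0],
])

# Min-max normalization rescales every feature to [0, 1], so no single
# feature dominates scale-sensitive models during training or inference.
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)
print(X_scaled)
```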