Title of test:
OCI DS MSA2955

Description:
OCI DS MSA2955

Creation Date: 2025/10/07

Category: Computers

Number of questions: 50

Content:

You have received machine learning model training code without clear information about the optimal shape to run the training. How would you proceed to identify the optimal compute shape for your model training that balances cost and processing time?. Start with a random compute shape and monitor the utilization metrics and time required to finish the model training. Perform model training optimizations and performance tests in advance to identify the right compute shape before running the model training as a job. Start with a smaller shape and monitor the Job Run metrics and time required to complete the model training. Tune the model so that it utilizes as much compute resources as possible, even at an increased cost. Start with a smaller shape and monitor the utilization metrics and time required to complete the model training.

As a data scientist, you are tasked with creating a model training job that is expected to take different hyperparameter values on every run. What is the most efficient way to set those parameters with Oracle Data Science Jobs?. Create a new job every time you need to run your code and pass the parameters as environment variables. Create a new job to expect different parameters as command-line arguments and create a new task each time the code runs. Create a new job by setting the required parameters in the code and create a new task each time the code changes. Create your code to expect different parameters either as environment variables or command-line arguments which are set on every job run with different values.
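
For illustration only (not part of the original question set): a minimal sketch of training code that reads its hyperparameters either from environment variables or from command-line arguments, so the same job can be run with different values on every job run. The parameter names (LEARNING_RATE, N_ESTIMATORS) are hypothetical.

# hypothetical sketch: accept hyperparameters as env vars or CLI arguments
import argparse
import os

parser = argparse.ArgumentParser(description="Training entry point")
# fall back to environment variables set on the job run when no CLI args are given
parser.add_argument("--learning-rate", type=float,
                    default=float(os.environ.get("LEARNING_RATE", 0.01)))
parser.add_argument("--n-estimators", type=int,
                    default=int(os.environ.get("N_ESTIMATORS", 100)))
args = parser.parse_args()

print(f"Training with lr={args.learning_rate}, n_estimators={args.n_estimators}")
# ... the actual model training would use args.learning_rate / args.n_estimators ...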

You have a complex Python code project that could benefit from using Data Science Jobs as it is a repeatable machine learning model training task. The project contains many subfolders and classes. What is the best way to run this project as a Job?. Rewrite your code so that it is a single executable Python or Bash/Shell script file. ZIP the entire code project folder, upload it as a Job artifact on job creation, and set JOB_RUN_ENTRYPOINT to point to the main executable file. ZIP the entire code project folder and upload it as a Job artifact on job creation. Jobs identifies the main executable file automatically. ZIP the entire code project folder and upload it as a Job artifact. The Job will automatically identify the __main__ top level where the code is run.
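
As a hedged illustration of the zipped-artifact approach, the sketch below uses the ADS jobs API (ads.jobs) to upload a project archive and point the run at its main script; the archive name, entry-point path, shape, and conda slug are placeholders, and the exact builder methods should be checked against the ADS documentation.

# sketch only: create a Job from a zipped multi-file project (names are placeholders)
from ads.jobs import DataScienceJob, Job, ScriptRuntime

job = (
    Job(name="train-from-zip")
    .with_infrastructure(
        DataScienceJob().with_shape_name("VM.Standard2.4")   # placeholder shape
    )
    .with_runtime(
        ScriptRuntime()
        .with_source("my_project.zip")               # whole project folder, zipped
        .with_entrypoint("my_project/main.py")       # file the job run should execute
        .with_service_conda("generalml_p38_cpu_v1")  # placeholder conda slug
    )
)
job.create()
job_run = job.run()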

You want to use ADSTuner to tune the hyperparameters of a supported model you recently trained. You have just started your search and want to reduce the computational cost as well as assess the quality of the model class that you are using. What is the most appropriate search space strategy to choose?. ADSTuner doesn't need a search space to tune the hyperparameters. Perfunctory. Pass a dictionary that defines a search space. Detailed.
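
A hedged sketch of requesting the built-in low-cost search space with ADSTuner; module paths and arguments are per my understanding of the ADS HPO API and should be verified against the ADS documentation.

# sketch: quick, low-cost hyperparameter search with the "perfunctory" strategy
from ads.hpo.search_cv import ADSTuner
from ads.hpo.stopping_criterion import NTrials
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier

X, y = load_iris(return_X_y=True)
tuner = ADSTuner(SGDClassifier(), strategy="perfunctory", cv=3)
tuner.tune(X, y, exit_criterion=[NTrials(10)], synchronous=True)  # small budget to start
print(tuner.best_params)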

You are a data scientist designing an air traffic control model, and you choose to leverage Oracle AutoML. You understand that the Oracle AutoML pipeline consists of multiple stages and automatically operates in a certain sequence. What is the correct sequence for the Oracle AutoML pipeline?. Adaptive sampling, Feature selection, Algorithm selection, hyperparameter tuning. Algorithm selection, Adaptive sampling, Feature selection, hyperparameter tuning. Adaptive sampling, Algorithm selection, Feature selection, hyperparameter tuning. Algorithm selection, Feature selection, Adaptive sampling, hyperparameter tuning.

Using Oracle AutoML, you are tuning hyperparameters on a supported model class and have specified a time budget. AutoML terminates computation once the time budget is exhausted. What would you expect AutoML to return in case the time budget is exhausted before hyperparameter tuning is completed?. A random hyperparameter configuration is returned. The last generated hyperparameter configuration is returned. The current best-known hyperparameter configuration is returned. A hyperparameter configuration with a minimum learning rate is returned.

You are using a git repository that is stored on GitHub to track your notebooks. You are working with another data scientist on the same project but in different notebook sessions. Which two statements are true?. To share your work, you commit it and push it to GitHub. Your coworker can then pull your changes on to their notebook session. It is a best practice that you and your coworker should work in the same branch because you are working on the same project. Once you have staged your changes, you run the git commit command to save a snapshot of the state of your code. Only one of you has to clone the GitHub repo as you can share it. You do not have to clone the GitHub repo as you can commit directly from the notebook session to GitHub.

Which two statements are true about published conda environments?. You can only create a published conda environment by modifying a data science conda environment. In addition to service job run environment variables, conda environment variables can be used in Data Science jobs. They are curated by Oracle Cloud Infrastructure (OCI) data science. The odsc conda init command is used to configure the location of published conda environments. Your notebook session acts as the source to share published conda environments with team members.

You have created a conda environment in your notebook session. This is the first time you are working with published conda environments. You have also created an Object Storage bucket with permission to manage the bucket. Which two commands are required to publish the conda environment?. odsc conda publish --slug. odsc conda list override. odsc conda create --file manifest.yaml. odsc conda init --bucket_namespace --bucket_name. conda activate /home/datascience/conda//.

During a job run, you receive an error message that no space is left on your disk device. To solve the problem, you must increase the size of the job storage. What would be the most efficient way to do this with Data Science Jobs?. On the job run, set the environment variable that helps increase the size of the storage. Edit the job, change the size of the storage of your job, and start a new job run. Create a new job with increased storage size and then run the job. Your code is using too much disk space. Refactor the code to identify the problem.

As a data scientist, you create models for cancer prediction based on mammographic images. Correct identification is crucial in this case. After evaluating two models, you arrive at the following results: Model 1: Test accuracy is 80% and recall is 70%. Model 2: Test accuracy is 75% and recall is 85%. Which model would you prefer and why?. Model 2, because recall is high. Model 2, because recall has more impact on predictions in this use case. Model 1, because recall has lesser impact on predictions in this use case. Model 1, because the test accuracy is high.
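
To make the trade-off concrete, here is a small self-contained example (synthetic labels, not from the question) computing accuracy and recall with scikit-learn; in cancer screening the costly error is a missed positive (false negative), which is exactly what recall penalizes.

# synthetic example: why recall matters more than accuracy for cancer screening
from sklearn.metrics import accuracy_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 1 = malignant, 0 = benign
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # two malignant cases missed

print("accuracy:", accuracy_score(y_true, y_pred))  # 0.8 - looks acceptable
print("recall:  ", recall_score(y_true, y_pred))    # 0.5 - half the cancers missed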

You train a model to predict housing prices for your city. Which two metrics from the Accelerated Data Science (ADS) ADSEvaluator class can you use to evaluate the regression model?. Weighted Recall. Explained Variance Score. Weighted Precision. Mean Absolute Error. F-1 Score.
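
The two regression metrics named in the options can also be computed directly with scikit-learn, which the ADS evaluation tooling builds on; a small illustrative example, not tied to any particular housing dataset.

# illustrative regression metrics for a housing-price model
from sklearn.metrics import explained_variance_score, mean_absolute_error

y_true = [250_000, 310_000, 180_000, 420_000]
y_pred = [240_000, 320_000, 200_000, 400_000]

print("Mean Absolute Error:     ", mean_absolute_error(y_true, y_pred))
print("Explained Variance Score:", explained_variance_score(y_true, y_pred))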

When preparing your model artifact to save it to the Oracle Cloud Infrastructure (OCI) Data Science model catalog, you create a score.py file. What is the purpose of the score.py file?. Define the compute scaling strategy. Execute the inference logic code. Configure the deployment infrastructure. Define the inference server dependencies.
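
For context, score.py is built around a pair of functions, load_model() and predict(); a minimal hedged sketch (the pickle filename is a placeholder):

# minimal score.py sketch: deserialize the model and run the inference logic
import os
import pickle

def load_model():
    """Load the serialized model shipped inside the model artifact."""
    artifact_dir = os.path.dirname(os.path.realpath(__file__))
    with open(os.path.join(artifact_dir, "model.pkl"), "rb") as f:  # placeholder filename
        return pickle.load(f)

def predict(data, model=load_model()):
    """Run inference on the incoming payload and return predictions."""
    return {"prediction": model.predict(data).tolist()}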

You realize that your model deployment is about to reach its utilization limit. What would you do to avoid the issue before requests start to fail?. Update the deployment to add more instances. Reduce the load balancer bandwidth limit so that fewer requests come in. Update the deployment to use a larger virtual machine (more CPUs/memory). Delete the deployment. Update the deployment to use fewer instances.

You are setting up a fine-tuning job for a pre-trained model on Oracle Data Science. You obtain the pre-trained model from Hugging Face, define the training job using the ADS Python API, and specify the OCI bucket. The training script includes downloading the model and dataset. Which of the following steps will be handled automatically by ADS during the job run?. Setting up the conda environment and installing additional dependencies. Specifying the replica and shape of instances required for the training job. Saving the outputs to OCI Object Storage after the training finishes. Fetching the source code from GitHub and checking out the specified commit.

What is the purpose of continuous training in MLOps?. To manually update software systems. To eliminate the need for data validation. To replace DevOps practices. To retrain machine learning models for redeployment.

You want to build a multistep machine learning workflow by using the Oracle Cloud Infrastructure (OCI) Data Science Pipeline feature. How would you configure the conda environment to run a pipeline step?. Use command-line variables. Configure a block volume. Use environmental variables. Configure a compute shape.

You want to ensure that all stdout and stderr from your code are automatically collected and logged, without implementing additional logging in your code. How would you achieve this with Data Science Jobs?. Make sure that your code is using the standard logging library and then store all the logs to Object Storage at the end of the job. You can implement custom logging in your code by using the Data Science Jobs logging service. Create your own log group and use a third-party logging service to capture job run details for log collection and storing. On job creation, enable logging and select a log group. Then, select either a log or the option to enable automatic log creation.

You need to make sure that the model you have deployed using AI Quick Actions is responding with suitable responses. How can AI Quick Actions help here?. By fine-tuning the model. By evaluating the model. By deploying the model.

What detector in PII Operator are you likely to use if you need to obfuscate the detected sensitive information?. Anonymize. Mask. Remove.

What is the sequence of steps you are likely to follow to use OCI Data Science Operator?. Install conda -> Initialize operator -> Configure operator -> Run operator -> Check results. Configure operator -> Install conda -> Initialize operator -> Run operator -> Check results. Check results -> Install conda -> Initialize operator -> Run operator -> Check results. Initialize operator -> Install conda -> Check results -> Configure operator -> Run operator.

You have created a Data Science project in a compartment called Development and shared it with a group of collaborators. You now need to move the project to a different compartment called Production after completing the current development iteration. Which statement is correct?. You cannot move a project to a different compartment after it has been created. Moving a project to a different compartment requires deleting all its associated notebook sessions and models first. You can move a project to a different compartment without affecting its associated notebook sessions and models. Moving a project to a different compartment also moves its associated notebook sessions and models to the new compartment.

You are creating an Oracle Cloud Infrastructure (OCI) Data Science job that will run on a recurring basis in a production environment. This job will pick up sensitive data from an Object Storage bucket, train a model, and save it to the model catalog. How would you design the authentication mechanism for the job?. Package your personal OCI config file and keys in the job artifact. Store your personal OCI config file and keys in the Vault, and access the Vault through the job run resource principal. Create a pre-authenticated request (PAR) for the Object Storage bucket and use it in the job code. Use the job run's resource principal as the signer in the job code, ensuring that a dynamic group is created for this job run and has appropriate access permissions to Object Storage and the model catalog.
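
A hedged sketch of the resource-principal approach in job code using the OCI Python SDK; the namespace, bucket, and object names are placeholders, and the dynamic group and policy setup happens outside the code.

# sketch: authenticate a job run with its resource principal (no keys in the artifact)
import oci

signer = oci.auth.signers.get_resource_principals_signer()
object_storage = oci.object_storage.ObjectStorageClient(config={}, signer=signer)

# placeholder namespace/bucket/object names
obj = object_storage.get_object("my-namespace", "training-data-bucket", "train.csv")
with open("train.csv", "wb") as f:
    f.write(obj.data.content)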

You are working as a data scientist for a healthcare company. They decide to analyze the data to find patterns in a large volume of electronic medical records. You are asked to build a PySpark solution to analyze these records in a JupyterLab notebook. What is the order of recommended steps to develop a PySpark application in Oracle Cloud Infrastructure (OCI) Data Science?. Install the Spark conda environment -> Configure core-site.xml -> Start a notebook session -> Create a Data Flow application using the Accelerated Data Science (ADS) SDK -> Develop your PySpark application. Launch a notebook session -> Install the PySpark conda environment -> Configure core-site.xml -> Develop your PySpark application -> Create a Data Flow application using the Accelerated Data Science (ADS) SDK. Launch a notebook session -> Configure core-site.xml -> Install the PySpark conda environment -> Develop a PySpark application -> Create a Data Flow application using the Accelerated Data Science (ADS) SDK. Configure core-site.xml -> Install the PySpark conda environment -> Create a Data Flow application using the Accelerated Data Science (ADS) SDK -> Develop a PySpark application -> Launch a notebook session.
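
Once the PySpark conda environment is active in the notebook session, development typically starts from a local SparkSession; a small generic sketch (the file path and column name are placeholders).

# generic PySpark sketch for exploring records in a notebook session
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("medical-records-eda").getOrCreate()
records = spark.read.json("medical_records.json")    # placeholder path
records.printSchema()
records.groupBy("diagnosis_code").count().show(10)   # hypothetical column name
spark.stop()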

A developer wants to create a reusable Spark application template that includes dependencies, default parameters, and runtime specifications. Which component of OCI Data Flow should they use?. Application. Logs. Library. Run.

For your next data science project, you need access to public geospatial images. Which Oracle Cloud service provides free access to those images?. Oracle Big Data Service. Oracle Cloud Infrastructure Data Science. Oracle Analytics Cloud. Oracle Open Data.

As a data scientist, you are working on a global health data set that has data from more than 50 countries. You want to encode three features such as 'countries', 'race' and 'body organ' as categories. Which option would you use to encode the categorical feature?. DataFrameLabelEncoder(). OneHotEncoder(). show_in_notebook(). auto_transform().

What is the key difference between one-hot encoding and label encoding in categorical feature transformation?. Label encoding creates binary values, while one-hot encoding assigns numerical values. One-hot encoding converts a categorical column into multiple binary columns, while label encoding assigns unique integers. Label encoding is only used for ordinal data, while one-hot encoding is only for nominal data. One-hot encoding reduces dimensionality, while label encoding increases it.
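
The difference is easiest to see side by side; a small pandas/scikit-learn example on synthetic data.

# one-hot encoding -> multiple binary columns; label encoding -> one integer column
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"country": ["India", "Brazil", "India", "Kenya"]})

one_hot = pd.get_dummies(df["country"], prefix="country")  # three binary columns
labels = LabelEncoder().fit_transform(df["country"])       # one integer per category, e.g. [1, 0, 1, 2]

print(one_hot)
print(labels)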

Once the LangChain application is deployed to OCI Data Science, what are two ways to invoke it as an endpoint?. Use .predict method or Use CLI. Use CLI or Use .invoke(). Use .invoke() method or Use .predict method.

You are a data scientist trying to load data into your notebook session. You understand that Accelerated Data Science (ADS) SDK supports loading various data formats. Which of the following THREE are ADS supported formats?. Pandas DataFrame. DOCX. JSON. Raw Images. XML.

A data scientist is analyzing customer churn data and wants to visualize the relationship between monthly charges (a continuous variable) and churn status (a categorical variable). What is the best visualization that ADS will likely generate?. A violin plot. A scatterplot. A histogram. A line chart.
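
As a hedged illustration (drawn with seaborn rather than the ADS call itself), this is the kind of violin plot such a pairing produces, using synthetic churn data.

# violin plot: monthly charges (continuous) split by churn status (categorical)
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "monthly_charges": [20, 35, 80, 95, 60, 110, 25, 70],
    "churn": ["no", "no", "yes", "yes", "no", "yes", "no", "yes"],
})
sns.violinplot(data=df, x="churn", y="monthly_charges")
plt.show()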

You want to make your model more frugal to reduce the cost of collecting and processing data. You plan to do this by removing features that are highly correlated. You would like to create a heat map that displays the correlation so that you can identify candidate features to remove. Which Accelerated Data Science (ADS) SDK method is appropriate to display the correlation between continuous and categorical features?. pearson_plot(). corr. correlation_ratio_plot(). cramersv_plot().

Select two reasons why it is important to rotate encryption keys when using Oracle Cloud Infrastructure (OCI) Vault to store credentials or other secrets. Key rotation allows you to encrypt no more than five keys at a time. Periodically rotating keys limits the amount of data encrypted by one key version. Key rotation reduces risk if a key is ever compromised. Periodically rotating keys makes it easier to reuse keys. Key rotation improves encryption efficiency.

As a data scientist, you have stored sensitive data in a database. You need to protect this data by using a master encryption algorithm, which uses symmetric keys. Which master encryption algorithm would you choose in the Oracle Cloud Infrastructure (OCI) Vault service?. Elliptical Curve Cryptography Digital Signature Algorithm. Triple Data Encryption Standard Algorithm. Rivest-Shamir-Adleman Keys. Advanced Encryption Standard Keys.

You are a data scientist with a set of text and image files that need to be annotated, and you want to use the Oracle Cloud Infrastructure (OCI) Data Annotation Tool. Which three of the following annotation categories does the tool support?. Semantic Segmentation. Object Detection. Classification (Single/Multi-label). Named Entity Extraction. Keypoints and Labels. Polygon Segmentation.

You are a data scientist and have a large number of legal documents that need to be classified. You decided to use the OCI Data Labeling service to get your data labeled. What are the annotation classes available for annotating document data using the OCI Data Labeling service?. Single, Multiple, Key Value. Single, Multiple, Entity Extraction. Single, Multiple, Object Detection.

You need to build a machine learning workflow that has sequential and parallel steps. You have decided to use the Oracle Cloud Infrastructure (OCI) Data Science Pipeline feature. How is Directed Acyclic Graph (DAG) having sequential and parallel steps built using Pipeline?. Using Pipeline Designer. By running a Pipeline. Using dependencies. Using environmental variables.

Which of the following statements is true regarding metric-based autoscaling in Oracle Data Science model deployment?. The cool-down period starts when the Model Deployment is first created. Only custom metrics can be used for metric-based autoscaling. Metric-based autoscaling relies on performance metrics that are averaged across all instances in the model deployment resource. Multiple metric-based autoscaling policies can be added simultaneously.

You are deploying a machine learning model on Oracle Data Science and decide to use metric-based autoscaling to manage resources efficiently. You set the autoscaling policy to trigger when CPU utilization exceeds 70% for five consecutive monitoring intervals. The cool-down period is set to 10 minutes. During peak usage, the CPU utilization hits 75% for six consecutive intervals, triggering the autoscaling event. What will happen immediately after the autoscaling event is triggered?. The system will immediately trigger another autoscaling event if CPU utilization exceeds 70%. The model deployment will return to its original size after the cool-down period. The cool-down period will prevent any performance metrics from being evaluated. The cool-down period will begin, and no further autoscaling events will be triggered for 10 minutes.

You have built a machine learning model to predict whether a bank customer is going to default on a loan. You want to use Local Interpretable Model-Agnostic Explanations (LIME) to understand a specific prediction. What is the key idea behind LIME?. Global behavior of a machine learning model may be complex, while the local behavior may be approximated with a simpler surrogate model. Global and local behaviors of machine learning models are similar. Model-agnostic techniques are more interpretable than techniques that are dependent on the types of models. Local explanation techniques are model-agnostic, while global explanation techniques are not.
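
A hedged sketch with the open-source lime package, showing how a single loan-default prediction is approximated locally by a simple surrogate model; the dataset, model, and feature names are synthetic placeholders.

# sketch: local surrogate explanation of one prediction with LIME
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 4))                        # synthetic features
y_train = (X_train[:, 0] + X_train[:, 2] > 0).astype(int)  # synthetic default flag
model = RandomForestClassifier().fit(X_train, y_train)

explainer = LimeTabularExplainer(
    X_train,
    feature_names=["income", "age", "loan_amount", "tenure"],  # hypothetical names
    class_names=["no_default", "default"],
    mode="classification",
)
explanation = explainer.explain_instance(X_train[0], model.predict_proba, num_features=4)
print(explanation.as_list())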

You want to evaluate the relationship between feature values and target variables. You have a large number of observations having a near uniform distribution and the features are highly correlated. Which model explanation technique should you choose?. Feature Dependence Explanations. Local Interpretable Model-Agnostic Explanations. Accumulated Local Effects. Feature Permutation Importance Explanations.

Six months ago, you created and deployed a model that predicts customer churn for a call center. Initially, it was yielding quality predictions. However, over the last two months, users have been questioning the credibility of the predictions. Which two methods would you employ to verify the accuracy of the model?. Operational monitoring. Validate the model using recent data. Retrain the model. Drift monitoring. Redeploy the model.

You are a data scientist leveraging Oracle Cloud Infrastructure (OCI) Data Science to create a model and need some additional Python libraries for processing genome sequencing data. Which of the following THREE statements are correct with respect to installing additional Python libraries to process the data?. You can install any open-source package available on a publicly accessible Python Package Index (PyPI) repository. OCI Data Science allows root privileges in notebook sessions. You cannot install a library that's not preinstalled in the provided image. You can only install libraries using yum and pip as a normal user. You can install private or custom libraries from your own internal repositories.

You want to write a Python script to create a collection of different projects for your data science team. Which Oracle Cloud Infrastructure (OCI) Data Science interface would you use?. The OCI Software Development Kit (SDK). OCI Console. Command line interface (CLI). Mobile App.
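
A hedged sketch of such a script with the OCI Python SDK; the compartment OCID and project names are placeholders, and a configured ~/.oci/config is assumed.

# sketch: create several Data Science projects from a script via the OCI Python SDK
import oci

config = oci.config.from_file()   # assumes a configured ~/.oci/config
client = oci.data_science.DataScienceClient(config)

compartment_id = "ocid1.compartment.oc1..example"   # placeholder OCID
for name in ["churn-analysis", "fraud-detection", "demand-forecasting"]:
    details = oci.data_science.models.CreateProjectDetails(
        compartment_id=compartment_id,
        display_name=name,
    )
    project = client.create_project(details).data
    print("created:", project.display_name, project.id)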

After you have created and opened a notebook session, you want to use the Accelerated Data Science (ADS) SDK to access your data and get started with an exploratory data analysis. From which two places can you access or install the ADS SDK?. Conda environments in Oracle Cloud Infrastructure (OCI) Data Science. Oracle Autonomous Data Warehouse. Oracle Machine Learning (OML). Oracle Big Data Service. Python Package Index (PyPI).

You loaded data into Oracle Cloud Infrastructure (OCI) Data Science. To transform the data, you want to use the Accelerated Data Science (ADS) SDK. When you applied the get_recommendations() tool to the ADSDataset object, it showed you the detected issues with all the recommended changes to apply to the dataset. Which option should you use to apply all the recommended transformations at once?. auto_transform(). fit_transform(). visualize_transforms(). get_transformed_dataset().
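
A hedged sketch of the workflow this question describes, assuming the (now legacy) DatasetFactory loader is available; the path and target name are placeholders.

# sketch: review recommendations, then apply them all at once
from ads.dataset.factory import DatasetFactory

ds = DatasetFactory.open("data.csv", target="label")   # placeholder path and target
ds.get_recommendations()              # lists detected issues and suggested fixes
transformed_ds = ds.auto_transform()  # applies all recommended transformations at once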

What is the main advantage of using a Resource Principal for authentication in OCI Data Science?. It is only required when using the OCI CLI. It eliminates the need to store credentials manually. It prevents the need for policies in IAM. It provides unrestricted access to all OCI resources.
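
In ADS, switching to resource principal authentication is a single call, after which SDK operations pick up the dynamic-group permissions of the notebook session or job run instead of credentials stored on disk; a brief hedged sketch.

# sketch: use the resource principal instead of API keys stored on disk
import ads

ads.set_auth(auth="resource_principal")
# subsequent ADS calls (model catalog, jobs, object storage helpers) use this signer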

You are attempting to save a model from a notebook session to the model catalog by using the Accelerated Data Science (ADS) SDK, with resource principal as the authentication signer, and you get a 404 authentication error. Which two should you look for to ensure permissions are set up correctly?. The model artifact is saved to the block volume of the notebook session. The policy for a dynamic group grants manage permissions for the model catalog in this compartment. The policy for your user group grants manage permissions for the model catalog in this compartment. The networking configuration allows access to Oracle Cloud Infrastructure services through a Service Gateway. A dynamic group with matching rules and permissions for the notebook sessions in this compartment.

You are a data scientist working for a utilities company. You have developed an algorithm that detects anomalies from a utility reader in the grid. The size of the model artifact is about 2 GB, and you are trying to store it in the model catalog. Which three interfaces could you use to save the model artifact into the model catalog?. Git CLI. Console. OCI Python SDK. Oracle Cloud Infrastructure (OCI) Command Line Interface (CLI). Accelerated Data Science (ADS) Software Development Kit (SDK). Data Science Continuous Integration (ODSC) CLI.
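
A hedged sketch of the ADS SDK route (one of the valid interfaces listed); the artifact directory, conda slug, and display name are placeholders, and the exact prepare()/save() arguments should be checked against the ADS documentation.

# sketch: package and save a trained estimator to the model catalog with ADS
import numpy as np
from ads.model.generic_model import GenericModel
from sklearn.ensemble import IsolationForest

estimator = IsolationForest().fit(np.random.rand(100, 3))   # stand-in anomaly model

anomaly_model = GenericModel(estimator=estimator, artifact_dir="./anomaly_artifact")
anomaly_model.prepare(inference_conda_env="generalml_p38_cpu_v1",  # placeholder conda slug
                      force_overwrite=True)
anomaly_model.save(display_name="grid-anomaly-detector")  # stores the artifact in the model catalog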

You are a data scientist working for a manufacturing company. You have developed a forecasting model to predict the sales demand in the upcoming months. You created a model artifact that contained custom logic requiring third-party libraries. When you deployed the model, it failed to run because you did not include all the third-party dependencies in the model artifact. What file should be modified to include the missing libraries?. requirements.txt. score.py. runtime.yaml. model_artifact_validate.py.
