Course Name: Data Science Tools
Module 1. Language of Data Science
Question 1. Which of the following statements is true?
- 80% of data scientists worldwide use Python.
- Python is useful for AI, machine learning, web development, and IoT.
- Python is the most popular language in data science.
- Keras, Scikit-learn, Matplotlib, Pandas, and TensorFlow are all built with Python.
- All of the above
Question 2. Which of the following are SQL databases? (Select all that apply.)
- MongoDB
- MariaDB
- MySQL
- PostgreSQL
- CouchDB
- Oracle
Question 3. Which statements are true about Open Source and Free Software? (Select all that apply.)
- Free Software and Open Source can be used interchangeably.
- Free Software can always be run, studied, modified and redistributed with or without changes.
- Most Free Software licenses also qualify as Open Source.
- Open Source Software can be modified without sharing the modified source code depending on the Open Source license.
Question 4. Is the following statement true or false: “R integrates well with other computer languages like C++, Java, C, .NET and Python.”
- True
- False
Question 5. Which of the following languages can be used for data science?
- SQL
- Java
- Julia
- R
- Scala
- JavaScript
- All of the above
Question 6. Which of the following is used to make artificial intelligence and machine learning possible? (Select all that apply.)
- Oracle
- PyTorch
- TensorFlow.js
- Apache Spark
- GNU
- Caffe
Module 2. Data Science Tools
Question 1. Which of the following are common tasks in data science?
- Data Management
- Model Deployment
- Data Integration and Transformation
- Data Visualization
- Model Monitoring and Assessment
- Model Building
- All of the above
Question 2. Which of the following are data management tools? (Select all that apply.)
- GitHub
- MySQL
- PostgreSQL
- Kubeflow
- PixieDust
Question 3. Which of the following are Data Integration and Transformation tools? (Select all that apply.)
- Cassandra
- Apache Kafka
- Apache NiFi
- Apache Airflow
- Ceph
Question 4. Which statement about JupyterLab is correct?
- JupyterLab can run R and Python code in addition to other programming languages.
- JupyterLab can run R code only.
- JupyterLab can run R and Python code only.
- JupyterLab can run Python code only.
Question 5. Which statement about RStudio is correct?
- RStudio is the primary choice for web development.
- RStudio is the primary choice for development in the Python programming language.
- RStudio is the primary choice for development in the R programming language.
Question 6. Which statements about IBM Watson Studio and OpenScale are correct? (Select all that apply.)
- Watson Studio together with Watson OpenScale is a database management system.
- Watson Studio together with Watson OpenScale covers the complete development life cycle for all data science, machine learning and AI tasks.
- Watson Studio together with Watson OpenScale is available as a cloud offering as well as a package running on top of Kubernetes/Red Hat OpenShift in a local data center, called IBM Cloud Pak for Data.
Module 3. Packages, APIs, Datasets and Models
Question 1. Which scientific computing library provides data structures and data analysis tools for Python?
- Seahorse
- YumPies
- Pandas
- TensorFlow
Question 2. What does the acronym API stand for?
- Abstract Programming Interface
- Abstract Python Interface
- Algorithmic Programming Interface
- Application Programming Interface
Question 3. True or False: Open data is always distributed under a Community Data License Agreement.
- True
- False
Question 4. Which of the following is not a type of Machine Learning?
- Reinforcement learning
- Supervised teaching
- Unsupervised learning
- Supervised learning
Question 5. Which of the following is NOT a deep learning framework?
- Keras
- PyTorch
- TensorFlow
- Tommy
Question 6. Fill in the blank: The MAX model-serving microservices expose a _________________ that applications use to consume a model.
- Python API
- Java API
- REST API
- Scala API
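To make the idea of an application consuming a model through a web interface concrete, here is a minimal sketch that only builds an HTTP request with Python's standard library and does not send it. The host, port, path, and JSON payload are all hypothetical placeholders, not details taken from the course.

```python
import json
import urllib.request

# Hypothetical endpoint of a locally running model-serving microservice.
url = "http://localhost:5000/model/predict"

# Hypothetical input payload; a real service defines its own schema.
payload = json.dumps({"text": ["An example sentence to score."]}).encode("utf-8")

# Build (but do not send) a POST request carrying JSON.
req = urllib.request.Request(
    url,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(req.get_method(), req.full_url)
# Sending would look like: urllib.request.urlopen(req), which needs a running service.
```

Because the interface is plain HTTP with JSON, the calling application could be written in any language that can issue web requests.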
Module 4. GitHub
Question 1. Which of the following statements are true?
- Git is an integrated development environment for data science.
- Git is a system for version control of source code.
- Git is very useful for data science as well since data science often involves a lot of source code to be written and managed.
Question 2. Which of the following statements about repositories are correct? (Select all that apply.)
- The remote repository is only accessible by myself.
- The local repository is only accessible by myself.
- The staging is only accessible by myself.
- The remote repository is accessible by all contributors.
- The local repository is accessible by all contributors.
Question 3. What is the best process for contributing a bugfix to a foreign repository?
- Fork the repository, update the fork and create a pull request.
- Send the fix via email to the author.
- Ask the repository owner for write access to the repository.
Module 5. Jupyter Notebooks and JupyterLab
Question 1. Which of the following functions do Jupyter Notebooks unify?
- Editing and display of documentation
- Visualization of charts
- Editing and execution of source code
- All of the above
Question 2. Which statement is true about Jupyter Notebooks?
- Jupyter Notebooks are a commercial product of IBM.
- Jupyter Notebooks are free and open source.
Question 3. What is a Jupyter Notebook kernel?
- It is part of the operating system the Jupyter server runs on.
- It is a wrapper running on the Jupyter server encapsulating the programming language interpreter.
Module 6. RStudio IDE
Question 1. Which of the following functions does RStudio unify? (Select all that apply.)
- Storing of data.
- Editing and execution of source code.
- Display of the R Console.
- Visualization of plots.
- Visualization of data in table form.
Question 2. Which statement is true about the RStudio IDE?
- RStudio is a commercial product of IBM.
- RStudio is free and open source.
Question 3. Which statement about R packages is correct?
- R doesn’t require any packages to be installed since it already contains all the functionality a data scientist will ever require.
- R currently supports more than 15,000 packages which can be installed to extend R’s functionality.
Module 7. Watson Studio
Question 1. Fill in the blank: In Watson Studio, a ____________ is how you organize your resources to achieve a particular goal. Resources can include data, collaborators, and analytic assets like notebooks and models.
- Asset
- Job
- Notebook
- Project
Question 2. Fill in the blank: It’s a best practice to remove or replace _____________ before publishing to GitHub.
- Charts
- Code cells
- Markdown text
- Credentials
Question 3. Which of the following do you need to create in order to publish a notebook to your GitHub repository?
- Profile
- Access token
- Apps
- Login credential
Question 4. Fill in the blank: If you’d like to schedule a notebook in Watson Studio to run at a different time you can create a(n) _____________.
- Job
- API
- Markdown cell
- Asset
Question 5. Fill in the blank: On the environments tab you can define the _________________.
- Runtime configuration for notebook editor
- Runtime configuration for flow editor
- Hardware size
- Software configuration
- All of the above
Question 6. Fill in the blank: When sharing a read only version of a notebook, you can choose to share __________________.
- Only text and output
- All content including code
- A permalink
- All content, excluding sensitive code cells
- All of the above
Question 7. Fill in the blank: When working in a Jupyter Notebook, before returning to a project, it’s important to ________________________.
- Run cells
- Save your notebook
- Insert cells
- Insert to code
Question 8. Fill in the blank: Before running a notebook, it’s a best practice to ____________ to describe what the notebook does.
- Insert a cell at the top of the notebook
- Insert a cell at the bottom of the notebook
- Delete notebook cells
- Refresh your page
Question 9. Fill in the blank: In the _____________ tab you can define the hardware size and software configuration for the runtime associated with Watson Studio tools such as notebooks.
- Overview
- Assets
- Environments
- Settings
Question 10. Fill in the blank: IBM Cloud uses ______________ as a way for you to organize your account resources in customizable groupings so that you can quickly assign users access to more than one resource at a time.
- Resource groups
- Services
- Catalogs
- Projects
Data Science Tools Cognitive Class Final Exam Answers:
Question 1. Which are the three most used languages for data science? (Select all that apply.)
- Java
- R
- Scala
- Python
- SQL
Question 2. Which of these is a database query language?
- SQL
- Python
- Julia
- All of the Above
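As a small illustration of what querying a database with SQL looks like, the sketch below runs a SQL statement from Python using the built-in sqlite3 module. The table name, columns, and rows are made up for the example.

```python
import sqlite3

# In-memory database; the table and its rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (model TEXT, price REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("hatchback", 7500.0), ("sedan", 12000.0), ("coupe", 9800.0)],
)

# The SQL text states which rows are wanted; the database engine works out how to fetch them.
rows = conn.execute(
    "SELECT model, price FROM cars WHERE price < ? ORDER BY price",
    (10000.0,),
).fetchall()
conn.close()

print(rows)
```

Here Python only hands the statement to the database and collects the result; the query itself is written in SQL.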
Question 3. Is it possible to use machine learning within a web browser with JavaScript?
- No
- Yes
Question 4. Which of these is not a machine learning or deep learning library for Python?
- Scikit-learn
- PyTorch
- Keras
- NumPy
Question 5. Comma Separated Values (CSV) is a commonly used format to store:
- Tabular data
- Hierarchical or network data
- All of the above
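To show how rows and columns map onto CSV text, here is a short sketch using Python's standard csv module. The column names and values are invented for the example.

```python
import csv
import io

# A tiny CSV document: the first line names the columns, each later line is one row.
csv_text = "model,year,price\nhatchback,2015,7500\nsedan,2018,12000\n"

# DictReader turns each row into a dict keyed by the header line.
records = list(csv.DictReader(io.StringIO(csv_text)))

# Every value is read as a string, so numeric columns need explicit conversion.
prices = [int(row["price"]) for row in records]

print(records[0]["model"], prices)
```

Each line holds one flat record with the same fields, which is why the format suits tables; nesting has no native representation in it.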
Question 6. Classification models can be used to determine whether:
- an image contains a dog
- a video contains a specific sound
- an email is likely spam
- All of the above
Question 7. Generally speaking, which type of model is used to predict a numerical value, such as the potential sales price of a used car?
- Classification model
- Regression model
- Clustering model
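For a sense of what predicting a numerical value involves, the sketch below fits a straight line by ordinary least squares in plain Python and uses it to estimate a price. The car ages and prices are made-up numbers, and a one-variable line is only the simplest possible case.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: car age in years and observed sale price.
ages = [1, 3, 5, 7]
prices = [18000, 14000, 10000, 6000]

slope, intercept = fit_line(ages, prices)

# Predict the price of a car whose age was not in the training data.
predicted = slope * 4 + intercept
print(slope, intercept, predicted)
```

The output is a continuous number rather than a category label, which is the distinguishing feature of this kind of model.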
Question 8. Fill in the blank: ________________ is the heart of every organization.
- The cloud
- Data
- Integration
- Open source
Question 9. What does the “BI” in BI Tools stand for?
- Build Information
- Business Intelligence
- Build Integration
- Business Integration
Question 10. Which of the following are true about Data Asset Management?
- To be done effectively, data must be versioned and annotated with metadata.
- A crucial part of data science at the enterprise level.
- Also known as data governance.
- All of the above
Question 11. Which are the two most used open source tools for data science? (Select all that apply.)
- Notepad
- RStudio
- Jupyter Notebooks / JupyterLab
- Spyder
- VSCode
Question 12. What tool do most R developers use?
- Jupyter Notebooks / JupyterLab
- RStudio
Question 13. What tool do most Python developers use?
- Jupyter Notebooks / JupyterLab
- RStudio
Question 14. True or false? Jupyter Notebooks / JupyterLab support development in R.
- True
- False
Question 15. Which tool unifies documentation, source code and data visualizations into a single document?
- Notepad
- VSCode
- Jupyter Notebooks / JupyterLab
Question 16. Which command is used to install packages in R?
- install.package("package name")
- install("package name")
- package("package name")
- install.packages("package name")
Question 17. Which of the following functions does RStudio provide?
- Documenting R code applications.
- Storing data in tables.
- Creating relationships between data tables.
- Editing and execution of R code.
Question 18. True or False: The Jupyter Notebook kernel must be installed on a local server.
- True
- False
Question 19. Which of the following statements about Jupyter Notebooks is correct?
- Jupyter Notebooks are only available if installed locally on your computer.
- Jupyter Notebooks support the visualization of data in charts.
- Jupyter Notebooks are a commercial product of IBM.
- Jupyter Notebooks provide storage of massive quantities of data in data lakes.
Question 20. True or false? RStudio supports development in Python.
- True
- False
Question 21. Which feature in Watson Studio helps to keep track of and discover relevant Machine Learning assets?
- Watson Knowledge Catalog
- AutoAI
- Modeler Flows
- OpenScale
- All of the above
Question 22. Fill in the blank: If you’d like to schedule a notebook in Watson Studio to run at a different time you can create a(n) ________.
- API
- markdown cell
- job
- asset
Question 23. Fill in the blank: In the __________ tab you can define the hardware size and software configuration for the runtime associated with Watson Studio tools such as notebooks.
- settings
- assets
- environments
- overview
Question 24. Fill in the blank: It’s a best practice to remove or replace _____________ before publishing to GitHub.
- credentials
- charts
- markdown text
- code cells