Saturday, April 20, 2024

Hadoop 101 Cognitive Class Exam Answers:-

 Course Name:- Hadoop 101

Module 1. Introduction to Hadoop

Question 1. Hadoop is designed for Online Transactional Processing. True or False?

  • True
  • False

Question 2. When is Hadoop useful for an application?

  • When all of the application data is unstructured
  • When work can be parallelized
  • When the application requires low latency data access
  • When random data access is required

Question 3. With the help of InfoSphere Streams, Hadoop can be used with data-at-rest as well as data-in-motion. True or false?

  • True
  • False

Module 2. Hadoop Architecture & HDFS

Question 1. Network bandwidth between any two nodes in the same rack is greater than bandwidth between two nodes on different racks. True or False?

  • True
  • False

Question 2. Hadoop works best on a large data set. True or False?

  • True
  • False

Question 3. HDFS is a fully POSIX compliant file system. True or False?

  • True
  • False

Module 3. Hadoop Administration

Question 1. You can add or remove nodes from the open source Apache Ambari console. True or False?

  • True
  • False

Question 2. It is recommended that you start all of the services in Ambari in order to speed up communications. True or False?

  • True
  • False

Question 3. To remove a node using Ambari, you must first remove all of the services using that node. True or False?

  • True
  • False

Module 4. Hadoop Components

Question 1. The output of the shuffle operation goes into the mapper before going into the reducer. True or False?

  • True
  • False

Question 2. What is true about Pig and Hive in relation to the Hadoop ecosystem?

  • HiveQL requires that you create the data flow
  • PigLatin requires that the data have a schema
  • Fewer lines of code are required compared to a Java program
  • All of the above

Question 3. Which of the following tools is designed to move data to and from a relational database?

  • Pig
  • Flume
  • Oozie
  • Sqoop

Hadoop 101 Cognitive Class Final Exam Answers:-

Question 1. HDFS is designed for:

  • Large files, streaming data access, and commodity hardware
  • Large files, low latency data access, and commodity hardware
  • Large files, streaming data access, and high-end hardware
  • Small files, streaming data access, and commodity hardware
  • None of the options is correct

Question 2. The Hadoop distributed file system (HDFS) is the only distributed file system supported by Hadoop. True or false?

  • True
  • False

Question 3. The input to a mapper takes the form < k1, v1 >. What form does the mapper’s output take?

  • < list(k2), v2 >
  • list( < k2, v2 > )
  • < k2, list(v2) >
  • < k1, v1 >
  • None of the options is correct
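To make the key/value shapes in this question concrete, here is a minimal Python sketch. It is not Hadoop API code: the function name `word_count_mapper` and the sample line are hypothetical and chosen only for illustration. The sketch takes a single (k1, v1) pair and emits zero or more (k2, v2) pairs.

```python
from typing import Iterator, Tuple


def word_count_mapper(k1: int, v1: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical mapper: k1 is a line offset, v1 is the line text.

    Emits one (word, 1) pair per word, so a single input pair can
    produce many output pairs.
    """
    for word in v1.split():
        yield (word.lower(), 1)


# One input pair in, a list of output pairs out.
pairs = list(word_count_mapper(0, "Hadoop stores big data Hadoop"))
print(pairs)
```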

Question 4. What is Flume?

  • A service for moving large amounts of data around a cluster soon after the data is produced.
  • A distributed file system.
  • A programming language that translates high-level queries into map tasks and reduce tasks.
  • A platform for executing MapReduce jobs.
  • None of the options is correct

Question 5. What is the purpose of the shuffle operation in Hadoop MapReduce?

  • To pre-sort the data before it enters each mapper node.
  • To distribute input splits among mapper nodes.
  • To transfer each mapper’s output to the appropriate reducer node based on a partitioning function.
  • To randomly distribute mapper output among reducer nodes.
  • None of the options is correct
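The idea of routing mapper output through a partitioning function can be sketched in plain Python. This is an illustration only, not Hadoop code: the names `partition` and `shuffle` are hypothetical, and `zlib.crc32` is used here as a stand-in for a key hash so that the result is stable across runs. The point it shows is that every value for a given key ends up at the same reducer, regardless of which mapper produced it.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    """Hypothetical partitioner: hash the key, then take it modulo the reducer count."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(mapper_outputs, num_reducers):
    """Group each mapper's (key, value) pairs by the reducer chosen for the key."""
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for output in mapper_outputs:
        for key, value in output:
            reducers[partition(key, num_reducers)][key].append(value)
    return reducers


# Assumed sample data: the output of two separate mappers.
mapper_outputs = [
    [("hadoop", 1), ("hdfs", 1)],
    [("hadoop", 1), ("yarn", 1)],
]
reducers = shuffle(mapper_outputs, num_reducers=2)

# Both "hadoop" values arrive at the same reducer.
target = partition("hadoop", 2)
print(target, dict(reducers[target]))
```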

Question 6. Which of the following is a duty of the DataNodes in HDFS?

  • Control the execution of an individual map task or a reduce task.
  • Maintain the file system tree and metadata for all files and directories.
  • Manage the file system namespace.
  • Store and retrieve blocks when told to by clients or the NameNode.
  • None of the options is correct

Question 7. Which of the following is a duty of the NameNode in HDFS?

  • Control the MapReduce job from end-to-end
  • Maintain the file system tree and metadata for all files and directories
  • Store the block data
  • Transfer block data from the data nodes to the clients
  • None of the options is correct

Question 8. Which component determines the specific nodes that a MapReduce task will run on?

  • The NameNode
  • The JobTracker
  • The TaskTrackers
  • The JobClient
  • None of the options is correct

Question 9. Which of the following characteristics is common to Pig, Hive, and Jaql?

  • All translate high-level languages to MapReduce jobs
  • All operate on JSON data structures
  • All are data flow languages
  • All support random reads/writes
  • None of the options is correct

Question 10. Which of the following is NOT an open source project related to Hadoop?

  • Pig
  • UIMA
  • Jackal
  • Avro
  • Lucene

Question 11. During the replication process, a block of data is written to all specified DataNodes in parallel. True or false?

  • True
  • False

Question 12. With IBM BigInsights, Hadoop components can be started and stopped from a command line and from the Ambari Console. True or false?

  • True
  • False

Question 13. When loading data into HDFS, data is held at the NameNode until the block is filled and then the data is sent to a DataNode. True or false?

  • True
  • False

Question 14. Which of the following is true about Hadoop federation?

  • Uses JournalNodes to decide the active NameNode
  • Allows non-Hadoop programs to access data in HDFS
  • Allows multiple NameNodes with their own namespaces to share a pool of DataNodes
  • Implements a resource manager external to all Hadoop frameworks

Question 15. Which of the following is true about Hadoop high availability?

  • Uses JournalNodes to decide the active NameNode
  • Allows non-Hadoop programs to access data in HDFS
  • Allows multiple NameNodes with their own namespaces to share a pool of DataNodes
  • Implements a resource manager external to all Hadoop frameworks

Question 16. Which of the following is true about YARN?

  • Uses JournalNodes to decide the active NameNode
  • Allows non-Hadoop programs to access data in HDFS
  • Allows multiple NameNodes with their own namespaces to share a pool of DataNodes
  • Implements a resource manager external to all Hadoop frameworks

Question 17. Which of the following sentences is true?

  • Hadoop is good for OLTP, DSS, and big data
  • Hadoop includes open source components and closed source components
  • Hadoop is a new technology designed to replace relational databases
  • All of the options are correct
  • None of the options is correct

Question 18. In which of these scenarios should Hadoop be used?

  • Processing billions of email messages to perform text analytics
  • Obtaining stock price trends on a per-minute basis
  • Processing weather sensor information to predict a hurricane path
  • Analyzing vital signs of a baby in real time
  • None of the options is correct
