Course Name :- Moving Data into Hadoop
Module 1 :- Load Scenario
Question 1 : What is Data at rest?
- Data that is being transferred over
- Data that is already in a file in some directory
- Data that hasn’t been used in a while
- Data that needs to be copied over
Question 2 : Data can be moved using BigSQL Load. True or false?
- True
- False
Question 3 : Which of the following does not relate to Flume?
- Pipe
- Sink
- Interceptors
- Source
Module 2 :- Using Sqoop
Question 1: Sqoop is designed to
- export data from HDFS to streaming software
- read and understand data from a relational database at a high level
- prevent “bad” data in a relational database from going into Hadoop
- transfer data between relational database systems and Hadoop
Question 2: Which of the following is NOT an argument for Sqoop?
- --update-key
- --split-from
- --target-dir
- --connect
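For reference, a minimal Sqoop import command using the genuine arguments from this list (the real splitting option is --split-by; the host, database, and table names below are hypothetical):

    sqoop import \
      --connect jdbc:mysql://dbserver.example.com:3306/sales \
      --username dbuser -P \
      --table customers \
      --split-by customer_id \
      --target-dir /user/hadoop/customers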
Question 3 : By default, Sqoop assumes that it’s working with space-separated fields and that each record is terminated by a newline. True or false?
- True
- False
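Sqoop's defaults are comma-separated fields and newline-terminated records, and both delimiters can be overridden explicitly. A sketch (connection details hypothetical):

    sqoop import \
      --connect jdbc:mysql://dbserver.example.com:3306/sales \
      --table customers \
      --fields-terminated-by '\t' \
      --lines-terminated-by '\n' \
      --target-dir /user/hadoop/customers_tsv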
Module 3 :- Flume Overview
Question 1: Avro is a remote procedure call and serialization framework, developed within a separate Apache project. True or false?
- True
- False
Question 2 : Data sent through Flume
- may have different batching but must be in a constant stream
- may have different batching or a different reliability setup
- must be in a particular format
- has to be in a constant stream
Question 3: A single Avro source can receive data from multiple Avro sinks. True or false?
- True
- False
Module 4 :- Using Flume
Question 1 : Which of the following is NOT a supplied Interceptor?
- Regex extractor
- Regex sinker
- HostType
- Static
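Interceptors are attached to a source in the agent's configuration file. A minimal sketch using the supplied Static interceptor (agent, source, and header names are hypothetical):

    agent1.sources.src1.interceptors = i1
    agent1.sources.src1.interceptors.i1.type = static
    agent1.sources.src1.interceptors.i1.key = datacenter
    agent1.sources.src1.interceptors.i1.value = NYC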
Question 2: Channels are:
- where the data is staged after having been read in by a source and not yet written out by a sink
- where the data is staged after having been read in by a sink and not yet written out by a source
- where the data is staged after having been written in by a source and not yet read out by a sink
- where the data is staged after having been written in by a sink and not yet written out by a source
Question 3 : One property for sources is selector.type. True or false?
- True
- False
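selector.type is a source property that controls how events are routed to channels. A sketch of the two standard settings, shown as alternatives (agent and component names are hypothetical):

    # replicating (the default) copies each event to every configured channel
    agent1.sources.src1.selector.type = replicating

    # multiplexing routes events to channels based on a header value
    agent1.sources.src1.selector.type = multiplexing
    agent1.sources.src1.selector.header = state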
Moving Data into Hadoop Cognitive Class Final Exam Answers :-
Question 1. The HDFS copyFromLocal command can be used to
- capture streaming data that you want to store in Hadoop
- ensure that log files which are actively being used to capture logging from a web server are moved into Hadoop
- move data from a relational database or data warehouse into Hadoop
- None of the above
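For reference, copyFromLocal moves a static file from the local file system into HDFS (the paths below are hypothetical):

    hdfs dfs -copyFromLocal /local/logs/archive.csv /user/hadoop/archive.csv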
Question 2. What is the primary purpose of Sqoop in the Hadoop architecture?
- To “catch” logging data as it is written to log files and move it into Hadoop
- To schedule scripts that can be run periodically to collect data into Hadoop
- To import data from a relational database or data warehouse into Hadoop
- To move static files from the local file system into HDFS
- To stream data into Hadoop
Question 3. A Sqoop JDBC connection string must include
- the name of the database you wish to connect to
- the hostname of the database server
- the port that the database server is listening on
- the name of the JDBC driver to use for the connection
- All of the above
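The connection string bundles the driver scheme, hostname, port, and database name in one URL; the server details below are hypothetical:

    # jdbc:<driver scheme>://<hostname>:<port>/<database>
    --connect jdbc:mysql://dbserver.example.com:3306/sales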
Question 4. Sqoop can be used to either import data from relational tables into Hadoop or export data from Hadoop to relational tables. True or false?
- True
- False
Question 5. When importing data via Sqoop, the imported data can include
- a collection of data from multiple tables via a join operation, as specified by a SQL query
- specific rows and columns from a specific table
- all of the data from a specific table
- All of the Above
Question 6. When importing data via Sqoop, the incoming data can be stored as
- Serialized Objects
- JSON
- XML
- None of the Above
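Sqoop's actual storage options are delimited text (the default), SequenceFiles, and Avro data files, selected with flags such as --as-avrodatafile; the connection details below are hypothetical:

    sqoop import \
      --connect jdbc:mysql://dbserver.example.com:3306/sales \
      --table customers \
      --as-avrodatafile \
      --target-dir /user/hadoop/customers_avro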
Question 7. Sqoop uses MapReduce jobs to import and export data, and you can configure the number of Mappers used. True or false?
- True
- False
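The number of parallel map tasks is set with -m (or --num-mappers); the connection details below are hypothetical:

    sqoop import \
      --connect jdbc:mysql://dbserver.example.com:3306/sales \
      --table customers \
      --target-dir /user/hadoop/customers \
      -m 8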
Question 8. What is the primary purpose of Flume in the Hadoop architecture?
- To “catch” logging data as it is written to log files and move it into Hadoop
- To schedule scripts that can be run periodically to collect data into Hadoop
- To import data from a relational database or data warehouse into Hadoop
- To move static files from the local file system into HDFS
- To stream data into Hadoop
Question 9. When you create the configuration file for a Flume agent, you must configure
- an Interceptor
- a Sink
- a Channel
- a Source
- All of the above
Question 10. When using Flume, a Source and a Sink are “wired together” using an Interceptor. True or false?
- True
- False
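For reference, a minimal agent configuration wiring a source to a sink through a channel; it is the channel, not an interceptor, that connects them. Agent name, component names, and port are hypothetical:

    agent1.sources = src1
    agent1.channels = ch1
    agent1.sinks = snk1

    # a netcat source listening on a local port
    agent1.sources.src1.type = netcat
    agent1.sources.src1.bind = localhost
    agent1.sources.src1.port = 44444
    agent1.sources.src1.channels = ch1

    # an in-memory channel staging events between source and sink
    agent1.channels.ch1.type = memory

    # a logger sink draining the same channel
    agent1.sinks.snk1.type = logger
    agent1.sinks.snk1.channel = ch1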
Question 11. Flume agents can run on multiple servers in the enterprise, and they can communicate with each other over the network to move data. True or false?
- True
- False
Question 12. Possible Flume channels include
- The implementation of your own channel
- File Storage
- Database Storage
- In Memory
- All of the Above
Question 13. Flume provides a number of source types including
- Elastic Search
- HBase
- Hive
- HDFS
- None of the Above
Question 14. Flume agent configuration is specified using
- CSV
- a text file, similar to the Java .properties format
- JSON
- XML, similar to Sqoop configuration
Question 15. To pass data from a Flume agent on one node to another, you can configure an Avro sink on the first node and an Avro source on the second. True or false?
- True
- False
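A sketch of the two-node setup described here, with an Avro sink on the first agent pointing at an Avro source on the second (hostnames, ports, and component names are hypothetical):

    # agent on node A: Avro sink sends events to node B
    agentA.sinks.avroSink.type = avro
    agentA.sinks.avroSink.hostname = nodeB.example.com
    agentA.sinks.avroSink.port = 4141
    agentA.sinks.avroSink.channel = chA

    # agent on node B: Avro source listens for events from node A
    agentB.sources.avroSrc.type = avro
    agentB.sources.avroSrc.bind = 0.0.0.0
    agentB.sources.avroSrc.port = 4141
    agentB.sources.avroSrc.channels = chB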