# Practice Test: Question Set - 02

1. The denominator (bottom) of the z-score formula is
(A) The standard deviation
(B) The difference between a score and the mean
(C) The range
(D) The mean

2. Which of the following is not an example of social media?
(C) Instagram

3. How many main statistical methodologies are used in data analysis?
(A) 2
(B) 3
(C) 4
(D) 5

4. Files are divided into ________-sized chunks.
(A) Static
(B) Dynamic
(C) Fixed
(D) Variable

5. Which type of analytics is used to improve supply chain management by optimizing stock management, replenishment, and forecasting?
(A) Descriptive
(B) Diagnostic
(C) Predictive
(D) Prescriptive

6. ________ provides performance through distribution of data and fault tolerance through replication
(A) HDFS
(B) Pig
(C) Hive

7. Sentiment Analysis is an example of

1. Regression
2. Classification
3. Clustering
4. Reinforcement Learning

(A) 1, 2 and 4
(B) 1, 2 and 3
(C) 1 and 3
(D) 1 and 2

8. Text analytics is also referred to as text mining.
(A) True
(B) False
(C) Can be true or false
(D) Cannot say

9. The process of quantifying data is referred to as ________
(A) Topology
(B) Diagramming
(C) Enumeration
(D) Coding

10. A hypothesis that is assumed to be true and then tested for possible rejection is called a
(A) Null Hypothesis
(B) Statistical Hypothesis
(C) Simple Hypothesis
(D) Composite Hypothesis

11. In descriptive statistics, data from the entire population or a sample is summarized with a
(A) Integer descriptor
(B) Floating descriptor
(C) Numerical descriptor
(D) Decimal descriptor

12. The ________ phase sorts the data and ________ creates logical clusters.
(A) Reduce, YARN
(B) Map, YARN
(C) Reduce, Map
(D) Map, Reduce

13. Using a recommendation engine to increase same-customer sales by adding more items to the market basket is an example of
(A) Lowering costs
(B) Increasing revenues
(C) Increasing productivity
(D) Reducing risk

14. What is a hypothesis?
(A) A statement that the researcher wants to test through the data collected in a study
(B) A research question the results will answer
(C) A theory that underpins the study
(D) A statistical method for calculating the extent to which the results could have happened by chance

15. By 2025, the volume of digital data is projected to grow to the order of
(A) TB
(B) YB
(C) ZB
(D) EB

