NOTE: There are 11 Questions in all.
· Question 1 is compulsory and carries 16 marks. The answer to Q.1 must be written in the space provided for it in the answer book supplied and nowhere else.
· Answer any THREE Questions each from Part I and Part II. Each of these questions carries 14 marks.
· Any required data not explicitly given may be suitably assumed and stated.
Q.1 Choose the correct or best alternative in the following: (2x8)
a. Retrieval of task relevant data in Data Mining is called
(A) Data selection. (B) Pattern presentation.
(C) Data integration. (D) Task data.
b. Multiple data sources are combined in
(A) Data cleaning. (B) Data integration.
(C) Data transforming. (D) Knowledge representation.
c. A set of views over operational databases is called a
(A) Data mart. (B) Enterprise warehouse.
(C) Data cube. (D) Virtual warehouse.
d. Market Basket Analysis is an example of
(A) Association. (B) Clustering.
(C) Classification. (D) Integration.
e. An efficient association rule-mining algorithm is
(A) Apriori. (B) Nearest neighbour.
(C) Principal component analysis. (D) Back propagation.
f. Tree pruning methods are used in
(A) Classification. (B) Clustering.
(C) Data cleaning. (D) Pattern recognition.
g. Which of the following databases contain word descriptions for data objects?
(A) Time series databases. (B) Text databases.
(C) Multimedia databases. (D) Spatial databases.
h. The normalized schema is called
(A) Star schema. (B) Snowflake schema.
(C) Multidimensional schema. (D) Cube.
Answer any THREE Questions. Each question carries 14 marks.
Q.2 a. Present an example where data mining is crucial to the success of a business. Which data mining function does this business need? (7)
b. How is a Data Warehouse different from a database? How are they similar? (7)
Q.3 a. Explain the data warehouse back-end tools and utilities to populate and refresh data. (7)
b. Describe two challenges to Data Mining regarding performance issues. (7)
Q.4 Briefly compare the concepts in each of the following groups. You may use an example to explain your points.
(i) Snowflake schema, fact constellation, and starnet query model.
(ii) Data cleaning, data transformation, and refresh. (14)
Q.5 a. What are the differences among the following three main types of data warehouse usage?
(i) Information Processing.
(ii) Analytical Processing.
(iii) Data Mining. (9)
b. Discuss OLAM. Is OLAP related to it? (5)
Q.6 a. What is the role of an EIS analyst in a data warehouse? How does a data warehouse assist the EIS analyst? (7)
b. Explain the knowledge discovery process. (7)
Answer any THREE Questions. Each question carries 14 marks.
Q.7 a. Explain why concept hierarchies are useful in data mining. (4)
b. Describe the different views that must be considered while designing a data warehouse. (10)
Q.8 a. While mining association rules, how does the Data Mining system know which rules are likely to be interesting to the user? (8)
b. What is the importance of metadata in a data warehouse? Explain its components. (6)
Q.9 Write notes on
a. Roll-up analysis. (5)
b. Machine learning. (5)
c. Any data reduction method. (4)
Q.10 a. Explain the three-tier data warehouse architecture. (8)
b. Explain the three data warehouse models. (6)
b. Why is tree pruning useful in decision tree induction? What is the drawback of using a separate set of samples to evaluate pruning? (8)