MGM’s College of Engineering, Nanded.
Department of IT
Semester I (2019-20)
Class: BE(IT) Subject: DMDW Assignment II
1. Define Data Mining. What types of attributes are used for data mining?
2. Suppose that the data for analysis includes the attribute Marks. The Marks values for the data
tuples are (in increasing order) 23, 25, 26, 26, 29, 30, 30, 31, 32, 32, 35, 35, 35, 35, 40,
43, 43, 45, 45, 45, 45, 46, 50, 55, 56, 62, 80. Calculate Mean, Median, Mode and Midrange of
the data.
3. Suppose we have the following values for sales (in thousands of rupees), shown in
increasing order: 40, 46, 57, 60, 62, 62, 66, 70, 73, 80, 80, 120. Calculate Variance and
Standard Deviation.
4. Given two objects represented by the tuples (12, 5) and (25, 20):
(a) Compute the Euclidean distance between the two objects.
(b) Compute the Manhattan distance between the two objects.
5. Compare (i) Discrete and continuous attributes.
(ii) Interval-scaled and ratio-scaled attributes.
6. Describe the major steps involved in data preprocessing.
7. Define Noise in data. For the following data: 23, 25, 26, 26, 29, 30, 30, 31, 32, 32, 35, 35,
35, 35, 40, 43, 43, 45, 45, 45, 45, 46, 50, 55, 56, 62, 80. Use smoothing by bin means to
smooth these data, using a bin depth of 3.
8. Compare min-max and z-score normalization.
9. Use these methods to normalize the following group of data:
100, 200, 300, 500, 900
(a) min-max normalization by setting min = 0 and max = 1
(b) z-score normalization
10. Define: (i) Support (ii) Confidence (iii) closed frequent itemset (iv) maximal frequent
itemset.
11. Discuss the steps of the Apriori algorithm.
12. Apply the Apriori algorithm to the following transaction database and find the frequent itemsets using a minimum support count of 2.
13. Identify the drawbacks of the Apriori algorithm.
14. Evaluate the Bayesian classifier with an appropriate example.
15. Evaluate the k-means clustering algorithm with an appropriate example.
16. Classify web mining techniques.
Faculty In-charge: Hashmi S A
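
The numerical questions (2, 3, 4, 7 and 9) can be checked against hand-worked answers with a short Python sketch such as the one below. It is not part of the assignment. The helper names (midrange, euclidean, manhattan, smooth_by_bin_means, min_max, z_score) are illustrative, and the sketch assumes the population form of variance and standard deviation (dividing by N) and equal-depth partitioning of the sorted data for binning. If the course uses the sample form (dividing by N - 1), statistics.variance and statistics.stdev would apply instead.

```python
import math
import statistics

# Data copied from Questions 2/7, 3 and 9.
marks = [23, 25, 26, 26, 29, 30, 30, 31, 32, 32, 35, 35, 35, 35, 40,
         43, 43, 45, 45, 45, 45, 46, 50, 55, 56, 62, 80]
sales = [40, 46, 57, 60, 62, 62, 66, 70, 73, 80, 80, 120]
group = [100, 200, 300, 500, 900]


def midrange(values):
    """Average of the smallest and largest value."""
    return (min(values) + max(values)) / 2


def euclidean(p, q):
    """Straight-line distance between two tuples of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def smooth_by_bin_means(sorted_values, depth):
    """Equal-depth binning: replace each value by the mean of its bin."""
    smoothed = []
    for i in range(0, len(sorted_values), depth):
        bin_values = sorted_values[i:i + depth]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]


def z_score(values):
    """Standardize using the population standard deviation (assumption)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Question 2
    print("mean:", statistics.fmean(marks))
    print("median:", statistics.median(marks))
    print("mode(s):", statistics.multimode(marks))
    print("midrange:", midrange(marks))
    # Question 3 (population form, an assumption)
    print("variance:", statistics.pvariance(sales))
    print("std dev:", statistics.pstdev(sales))
    # Question 4
    print("euclidean:", euclidean((12, 5), (25, 20)))
    print("manhattan:", manhattan((12, 5), (25, 20)))
    # Question 7
    print("bin means:", smooth_by_bin_means(marks, 3))
    # Question 9
    print("min-max:", min_max(group))
    print("z-score:", z_score(group))
```

statistics.multimode is used rather than statistics.mode because the Marks data may have more than one most frequent value, and multimode reports all of them.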