Welcome To Blog of Hashmi S A, Department of IT, MGM's College of Engineering, Nanded, MS, India: Class: BE(IT) Subject: DMDW Assignment II Semester I (2018-19)

MGM’s College of Engineering, Nanded.
Department of IT
Semester I (2018-19)

Class: BE(IT) Subject: DMDW Assignment II

1. List and describe the types of attributes used for data mining.

2. Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. Calculate the mean, median, mode, and midrange of the data.
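As a way to check a hand calculation for Q.No.2, the four measures can be sketched in Python. This is only an illustrative sketch, not a prescribed method: the variable names are assumptions, and `statistics.multimode` (Python 3.8+) is used because the data may have more than one mode.

```python
import statistics

# Age values exactly as listed in Q.No.2
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

mean_age = statistics.mean(ages)         # sum of values / number of values
median_age = statistics.median(ages)     # middle value of the ordered data
modes = statistics.multimode(ages)       # every value with the highest frequency
midrange = (min(ages) + max(ages)) / 2   # average of smallest and largest value

print("mean:", round(mean_age, 2))
print("median:", median_age)
print("mode(s):", modes)
print("midrange:", midrange)
```

Because two values tie for the highest frequency, the data is bimodal, which is why a list of modes is returned rather than a single number.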

3. Suppose we have the following values for salary (in thousands of dollars), shown in
    increasing order: 30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110. Calculate Variance and
    Standard Deviation.
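A minimal Python sketch for checking Q.No.3 follows. The question does not say whether the population formula (divide by N) or the sample formula (divide by N - 1) is expected, so both are shown; which one applies is an assumption the reader must settle.

```python
import statistics

# Salary values (in thousands of dollars) as listed in Q.No.3
salaries = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

mean_salary = statistics.mean(salaries)

# Population variance: sum of squared deviations divided by N
pop_variance = statistics.pvariance(salaries)
pop_std = statistics.pstdev(salaries)

# Sample variance: sum of squared deviations divided by N - 1
sample_variance = statistics.variance(salaries)
sample_std = statistics.stdev(salaries)

print("mean:", mean_salary)
print("population variance:", round(pop_variance, 2), "std:", round(pop_std, 2))
print("sample variance:", round(sample_variance, 2), "std:", round(sample_std, 2))
```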

4. Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):

(a) Compute the Euclidean distance between the two objects.

(b) Compute the Manhattan distance between the two objects.
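Both distances in Q.No.4 follow directly from the attribute-wise differences of the two tuples. The short Python sketch below (variable names are illustrative assumptions) computes the Euclidean distance as the square root of the sum of squared differences and the Manhattan distance as the sum of absolute differences.

```python
import math

# The two objects from Q.No.4
x = (22, 1, 42, 10)
y = (20, 0, 36, 8)

# Euclidean distance: square root of the sum of squared differences
euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Manhattan distance: sum of absolute differences
manhattan = sum(abs(a - b) for a, b in zip(x, y))

print("Euclidean:", round(euclidean, 3))
print("Manhattan:", manhattan)
```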

5. Compare (i) Discrete and continuous attributes.

(ii) Interval-scaled and ratio-scaled attributes.

6. Suppose that a hospital tested the age and body fat data for 18 randomly selected adults with the following results:

Calculate the mean, median, and standard deviation of age and %fat.

7. Draw the boxplots for age and %fat for the data of Q.No.6 above.

8. Describe the major steps involved in data preprocessing.

9. Imagine that you need to analyze AllElectronics sales and customer data. You note that
many tuples have no recorded value for several attributes such as customer income. Which
methods will be employed to fill the missing values?

10. Define noise in data. For the following data: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25,
25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70, use smoothing by bin means to
smooth the data, using a bin depth of 3.
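Smoothing by bin means can be sketched as follows, assuming equal-depth (equal-frequency) partitioning: the sorted values are split into consecutive bins of 3, and every value in a bin is replaced by that bin's mean. The function name `smooth_by_bin_means` is a hypothetical helper introduced only for this illustration.

```python
def smooth_by_bin_means(values, depth):
    """Replace each value by the mean of its equal-depth bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]      # one bin of up to `depth` values
        bin_mean = sum(bin_values) / len(bin_values)   # mean of the bin
        smoothed.extend([bin_mean] * len(bin_values))  # every member takes the bin mean
    return smoothed


# Data from Q.No.10
data = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

smoothed = smooth_by_bin_means(data, depth=3)
for i in range(0, len(smoothed), 3):
    print("bin", i // 3 + 1, [round(v, 2) for v in smoothed[i:i + 3]])
```

With 27 values and a depth of 3, this yields nine bins of equal size.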

11. Compare min-max and z-score normalization.

12. Use these methods to normalize the following group of data:

200, 300, 400, 600, 1000

(a) min-max normalization by setting min = 0 and max = 1

(b) z-score normalization
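The two normalizations in Q.No.12 can be sketched in Python as below. The helper names `min_max` and `z_score` are assumptions made for illustration, and the z-score here uses the population standard deviation (divide by N); if the sample standard deviation is expected instead, `statistics.stdev` would replace `statistics.pstdev`.

```python
import statistics


def min_max(values, new_min=0.0, new_max=1.0):
    """Map values linearly so that min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Centre values on the mean and scale by the population standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # assumption: population form (divide by N)
    return [(v - mu) / sigma for v in values]


# Data from Q.No.12
data = [200, 300, 400, 600, 1000]

mm = min_max(data)
zs = z_score(data)

print("min-max:", mm)
print("z-score:", [round(z, 3) for z in zs])
```

Min-max keeps the result inside a fixed range but depends on the extreme values, while z-score has no fixed range but is less dominated by a single minimum or maximum.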

13. Define: (i) Support (ii) Confidence (iii) Closed frequent itemset (iv) Maximal frequent
itemset.

14. Discuss the steps of Apriori algorithm.
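The level-wise steps of Apriori (count single items, join frequent (k-1)-itemsets into k-item candidates, prune candidates that have an infrequent subset, then count support) can be sketched in Python. This is a simplified illustration rather than an optimized implementation, and the four transactions at the bottom are hypothetical toy data, not the database of any question in this assignment.

```python
from itertools import combinations


def apriori(transactions, min_support_count):
    """Return a dict mapping each frequent itemset (frozenset) to its support count."""
    transactions = [frozenset(t) for t in transactions]

    # Step 1: count single items and keep the frequent ones (L1)
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support_count}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Count step: scan the transactions to get each candidate's support count
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support_count}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy transactions, for illustration only
toy_transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "D"}]
result = apriori(toy_transactions, min_support_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The repeated scan of all transactions in the count step is the part that becomes expensive on large databases.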

15. For the following transaction database, calculate the frequent itemsets using the Apriori algorithm, where the minimum support count is 2.

16. List the drawbacks of Apriori algorithm.

Faculty Incharge: Hashmi S A

Wednesday, October 3, 2018