MGM’s College of Engineering, Nanded.
Department of IT
Semester I (2017-18)
Class: BE(IT) Subject: DMDW Assignment II
1. Suppose we have the following values for salary (in thousands of dollars), shown in increasing order: 30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110.
(a) Calculate the mean, median, mode, midrange, and standard deviation of the salary.
(b) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?
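For checking hand calculations in question 1, the following is a minimal sketch using Python's standard `statistics` module. It assumes the population form of the standard deviation (dividing by N); the sample form (dividing by N - 1) gives a slightly larger value. Quartile values depend on the interpolation convention, so the ones printed here are only one of several acceptable rough answers.

```python
import statistics

# Salary values (in thousands of dollars) from question 1.
salary = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

mean = statistics.mean(salary)
median = statistics.median(salary)
modes = statistics.multimode(salary)          # every value tied for most frequent
midrange = (min(salary) + max(salary)) / 2
pop_std = statistics.pstdev(salary)           # population standard deviation (assumption)

# Quartile cut points; the default interpolation method is one convention among several.
q1, q2, q3 = statistics.quantiles(salary, n=4)

print("mean:", mean)
print("median:", median)
print("modes:", modes)
print("midrange:", midrange)
print("population std dev:", round(pop_std, 2))
print("Q1, Q3 (one convention):", q1, q3)
```

The data is bimodal, which is why `multimode` is used rather than `mode`.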
2. For the above salary values:
(a) Give the five-number summary of the data.
(b) Show a boxplot of the data.
3. Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):
(a) Compute the Euclidean distance between the two objects.
(b) Compute the Manhattan distance between the two objects.
(c) Compute the Minkowski distance between the two objects, using q = 3.
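All three distances in question 3 are instances of the Minkowski distance, with q = 1 (Manhattan), q = 2 (Euclidean), and q = 3. A minimal sketch follows; the function name `minkowski` is chosen here for illustration.

```python
def minkowski(x, y, q):
    """Minkowski distance of order q between two equal-length tuples."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1 / q)

# The two objects from question 3.
p1 = (22, 1, 42, 10)
p2 = (20, 0, 36, 8)

manhattan = minkowski(p1, p2, 1)   # sum of absolute differences
euclidean = minkowski(p1, p2, 2)   # square root of the sum of squared differences
mink3 = minkowski(p1, p2, 3)       # cube root of the sum of cubed absolute differences

print("Manhattan:", manhattan)
print("Euclidean:", round(euclidean, 4))
print("Minkowski (q = 3):", round(mink3, 4))
```

The attribute-wise absolute differences are 2, 1, 6, and 2, so the sums under the roots are 11, 45, and 233 respectively.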
4. Explain min-max and z-score normalization with an example.
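The two normalization formulas can be sketched as below. Reusing the salary values from question 1 as the example data is a choice made here for illustration, as are the function names and the target range [0, 1]; the z-score uses the population standard deviation, which is also an assumption.

```python
import statistics

def min_max(v, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map v from [old_min, old_max] to [new_min, new_max]."""
    return (v - old_min) / (old_max - old_min) * (new_max - new_min) + new_min

def z_score(v, mean, std):
    """Express v as a number of standard deviations away from the mean."""
    return (v - mean) / std

# Illustrative data: the salary values from question 1 (in thousands of dollars).
salary = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]
mean = statistics.mean(salary)
std = statistics.pstdev(salary)   # population standard deviation (assumption)

v = 70
v_minmax = min_max(v, min(salary), max(salary))
v_z = z_score(v, mean, std)

print("min-max normalized:", v_minmax)
print("z-score normalized:", round(v_z, 3))
```

Min-max normalization preserves the relative spacing of values but needs the minimum and maximum to be known; z-score normalization is useful when they are unknown or when outliers dominate the range.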
5. Imagine that you need to analyze AllElectronics sales and customer data. You note that many tuples have no recorded value for several attributes, such as customer income. How can you go about filling in the missing values for this attribute?
6. The prices (in dollars) for a product are: 8, 4, 21, 15, 21, 24, 34, 25, 28. Apply binning methods for smoothing the price data.
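A minimal sketch of smoothing by binning for the price data in question 6. The question does not state a bin size, so equal-frequency bins of depth 3 are assumed here; the values are sorted first, as binning requires.

```python
import statistics

prices = [8, 4, 21, 15, 21, 24, 34, 25, 28]
depth = 3  # assumed bin size; the question does not specify one

# Sort, then partition into equal-frequency bins.
sorted_prices = sorted(prices)
bins = [sorted_prices[i:i + depth] for i in range(0, len(sorted_prices), depth)]

# Smoothing by bin means: every value becomes the mean of its bin.
by_means = [[sum(b) / len(b)] * len(b) for b in bins]

# Smoothing by bin medians: every value becomes the median of its bin.
by_medians = [[statistics.median(b)] * len(b) for b in bins]

# Smoothing by bin boundaries: every value becomes the closer of the bin's min and max
# (ties go to the lower boundary in this sketch).
by_boundaries = [
    [b[0] if v - b[0] <= b[-1] - v else b[-1] for v in b]
    for b in bins
]

print("bins:", bins)
print("by means:", by_means)
print("by medians:", by_medians)
print("by boundaries:", by_boundaries)
```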
7. Write an overview of data transformation strategies.
8. Discuss issues to consider during data integration.
9. A database has five transactions. Let min_sup = 60% and min_conf = 80%.
TID Items Bought
T100 M, O, N, K, E, Y
T200 D, O, N, K, E, Y
T300 M, A, K, E
T400 M, U, C, K, Y
T500 C, O, O, K, I, E
Find all frequent itemsets using Apriori.
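For checking the hand-worked answer to question 9, the following is a simplified sketch of level-wise Apriori mining. It treats each transaction as a set (so the repeated O in T500 counts once), and its join step forms unions of frequent (k-1)-itemsets rather than the prefix-based join of the textbook algorithm; the prune step then discards any candidate with an infrequent subset. The function and variable names are chosen here for illustration.

```python
from itertools import combinations

def apriori(transactions, min_sup_percent):
    """Return {frequent itemset: support count} using level-wise search."""
    n = len(transactions)

    def support_count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def is_frequent(count):
        # Integer comparison avoids floating-point issues with percentages.
        return count * 100 >= min_sup_percent * n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        single = frozenset([item])
        count = support_count(single)
        if is_frequent(count):
            current[single] = count

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        prev = list(current)
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count support and keep the frequent candidates.
        current = {}
        for c in candidates:
            count = support_count(c)
            if is_frequent(count):
                current[c] = count
    return frequent

# Transactions T100 to T500 from question 9.
transactions = [
    {"M", "O", "N", "K", "E", "Y"},
    {"D", "O", "N", "K", "E", "Y"},
    {"M", "A", "K", "E"},
    {"M", "U", "C", "K", "Y"},
    {"C", "O", "O", "K", "I", "E"},
]

frequent = apriori(transactions, 60)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With five transactions, min_sup = 60% corresponds to a minimum support count of 3.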
10. Define: (i) Support, (ii) Confidence, (iii) Frequent itemset.
11. State and explain the steps of the Apriori algorithm.
12. For the following transaction database, generate association rules from the frequent itemsets (take support = 2).
TID List of item IDs
T100 I1, I2, I5
T200 I2, I4
T300 I2, I3
T400 I1, I2, I4
T500 I1, I3
T600 I2, I3
T700 I1, I3
T800 I1, I2, I3, I5
T900 I1, I2, I3
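As an illustration of rule generation for question 12, the sketch below computes the confidence of every rule that can be formed from a single frequent itemset, using confidence(A => B) = support_count(A ∪ B) / support_count(A). The itemset {I1, I2, I5} is picked here as an example because it occurs in two transactions (T100 and T800); the question gives no minimum confidence, so no threshold is applied. Function names are chosen for illustration.

```python
from itertools import combinations

# Transactions T100 to T900 from question 12.
transactions = [
    {"I1", "I2", "I5"},
    {"I2", "I4"},
    {"I2", "I3"},
    {"I1", "I2", "I4"},
    {"I1", "I3"},
    {"I2", "I3"},
    {"I1", "I3"},
    {"I1", "I2", "I3", "I5"},
    {"I1", "I2", "I3"},
]

def support_count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def rules_from(itemset):
    """Map (antecedent, consequent) to confidence for each non-empty proper subset."""
    itemset = frozenset(itemset)
    total = support_count(itemset)
    rules = {}
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            a = frozenset(antecedent)
            rules[(a, itemset - a)] = total / support_count(a)
    return rules

rules = rules_from({"I1", "I2", "I5"})
for (a, c), conf in rules.items():
    print(sorted(a), "=>", sorted(c), f"confidence = {conf:.0%}")
```

Given a minimum confidence threshold, only the rules at or above it would be reported as strong.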
13. How can we improve the efficiency of Apriori-based mining?
14. Create an FP-tree for the above transaction database (Q.12).
15. What is classification? List applications of classification.
16. Write a short note on: Pattern evaluation methods.
Faculty Incharge: Hashmi S A