Wednesday, October 3, 2018

Class: BE(IT) Subject: DMDW Assignment II Semester I (2018-19)


MGM’s College of Engineering, Nanded.
Department of IT

1. List and describe the types of attributes used for data mining.
2. Suppose that the data for analysis includes the attribute age. The age values for the data
    tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
    33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. Calculate Mean, Median, Mode and Midrange of     
    the data.
3. Suppose we have the following values for salary (in thousands of dollars), shown in   
    increasing order: 30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110. Calculate Variance and
    Standard Deviation.
4. Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):
     (a) Compute the Euclidean distance between the two objects.
     (b) Compute the Manhattan distance between the two objects.
5. Compare:
     (i) Discrete and continuous attributes.
     (ii) Interval-scaled and ratio-scaled attributes.
6. Suppose that a hospital tested the age and body fat data for 18 randomly selected adults
    with the following results:


   Calculate the mean, median, and standard deviation of age and %fat.

7. Draw the boxplots for age and %fat for the data of Q.No.6 above.
8. Describe the major steps involved in data preprocessing.
9. Imagine that you need to analyze AllElectronics sales and customer data. You note that
    many tuples have no recorded value for several attributes such as customer income. Which
    methods will be employed to fill the missing values?
10. Define noise in data. For the following data: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25,
    25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. Use smoothing by bin means to
    smooth these data, using a bin depth of 3.
11. Compare min-max and z-score normalization.
12. Use these methods to normalize the following group of data:
                        200, 300, 400, 600, 1000
(a) min-max normalization by setting min = 0 and max = 1
(b) z-score normalization
13. Define: (i) Support (ii) Confidence (iii) Closed frequent itemset (iv) Maximal frequent
      itemset.
14. Discuss the steps of the Apriori algorithm.
15. For the following transaction database, calculate the frequent itemsets using the Apriori algorithm, where the minimum support count is 2.

16. List the drawbacks of the Apriori algorithm.
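
For the numerical questions above (Q.2 to Q.4), the following is a minimal Python sketch rather than a model answer. It assumes Python 3.8 or later (for statistics.multimode), and the helper names summarize, euclidean and manhattan are illustrative only. The variance shown is the population form (dividing by N); if the sample form (dividing by N - 1) is expected, statistics.variance and statistics.stdev would be used instead.

```python
import math
import statistics

def summarize(values):
    """Return the mean, median, mode(s) and midrange of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),        # every most-frequent value
        "midrange": (min(values) + max(values)) / 2,  # average of smallest and largest
    }

def euclidean(p, q):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Q.2: age data as given in the question
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]
age_summary = summarize(ages)   # median 25, modes [25, 35], midrange 41.5

# Q.3: salary data (in thousands of dollars), population variance and std. deviation
salaries = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]
salary_variance = statistics.pvariance(salaries)
salary_std = statistics.pstdev(salaries)

# Q.4: the two objects from the question
x = (22, 1, 42, 10)
y = (20, 0, 36, 8)
d_euclid = euclidean(x, y)      # sqrt(45)
d_manhattan = manhattan(x, y)   # 11

print(age_summary)
print(salary_variance, salary_std)
print(d_euclid, d_manhattan)
```

The age data has two values tied for the highest frequency, which is why multimode is used here instead of statistics.mode.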

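For Q.10 and Q.12, smoothing by bin means and the two normalization methods can be sketched as below. This is one possible reading under stated assumptions: the bins are equal-depth partitions of the already sorted data, the z-score uses the population standard deviation (the sample form would give different values), and the function names smooth_by_bin_means, min_max and z_scores are hypothetical.

```python
import statistics

def smooth_by_bin_means(sorted_values, depth):
    """Split sorted data into bins of `depth` values and replace
    every value in a bin by that bin's mean."""
    smoothed = []
    for start in range(0, len(sorted_values), depth):
        bin_values = sorted_values[start:start + depth]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed

def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min
    and the largest maps to new_max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

def z_scores(values):
    """Centre values on their mean and scale by the population std. deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Q.10: age data from the question, bin depth of 3 (27 values give 9 bins)
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]
smoothed_ages = smooth_by_bin_means(ages, 3)

# Q.12: data from the question
data = [200, 300, 400, 600, 1000]
scaled = min_max(data)          # [0.0, 0.125, 0.25, 0.5, 1.0]
standardized = z_scores(data)

print(smoothed_ages)
print(scaled)
print(standardized)
```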


Faculty Incharge: Hashmi S A

Class: TE(IT) Subject: DBMS Assignment II Semester I (2018-19)


MGM’s College of Engineering, Nanded.
Department of IT

1. Classify the parts of SQL language.
2. Discuss the built-in data types supported by SQL. What is the difference between char and varchar in SQL?
3. What are the different integrity constraints supported by SQL? Explain briefly.
4. Design the university database in SQL with the following relations: Department, Course, Instructor, Section and Teaches. Use appropriate primary keys and foreign keys for the relations.
5. Categorize built-in set operations in SQL with examples.
6. Explain how the GROUP BY clause works. Differentiate between WHERE and HAVING clauses with the help of an example.
7. Construct queries for the different join operations in SQL.
8. Explain the concept of nested queries with examples.
9. Discuss the different operators in SQL, which are used with subqueries/nested queries.
10. Analyze the purpose of a view in SQL. Construct a SQL query for creating a view.
11. Create the following queries in SQL, using the university schema. (Refer to Question 3.1, Korth 6th edition)
a. Find the titles of courses in the Comp. Sci. department that have 3 credits.
b. Find the IDs of all students who were taught by an instructor named Einstein; make sure 
there are no duplicates in the result.
c. Find the highest salary of any instructor.
12. Create the following inserts, deletes or updates in SQL, using the university schema.
(Refer to Question 3.3, Korth 6th edition)
a. Increase the salary of each instructor in the Comp. Sci. department by 10%.
b. Delete all courses that have never been offered (that is, do not occur in the section relation).
c. Insert every student whose tot_cred attribute is greater than 100 as an instructor in the same
department, with a salary of $10,000.
13. Consider the following insurance database, where the primary keys are underlined.
  (Refer to Question 3.4, Korth 6th edition)

person (driver_id, name, address)
car (license, model, year)
accident (report_number, date, location)
owns (driver_id, license)
participated (report_number, license, driver_id, damage_amount)
Construct the following SQL queries for this relational database.
a. Find the total number of people who owned cars that were involved in accidents in 2009.
b. Add a new accident to the database; assume any values for required attributes.
c. Delete the Mazda belonging to “John Smith”.

14. List the built-in aggregate functions in SQL with appropriate example queries.
15. Explain scalar queries with an example.
16. Categorize the string matching operations in SQL with example queries.
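
To try out queries such as those in Q.11 and Q.12, one option is an in-memory SQLite database driven from Python. The sketch below is an illustration under assumptions, not a model answer: only the course and instructor relations of the university schema are created, the sample rows are hypothetical, and SQLite's dialect may differ in places from the SQL used in the course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway database, nothing is written to disk
conn.executescript("""
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        title     TEXT,
        dept_name TEXT,
        credits   INTEGER
    );
    CREATE TABLE instructor (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        dept_name TEXT,
        salary    NUMERIC
    );
    -- Hypothetical sample rows, for illustration only.
    INSERT INTO course VALUES
        ('C1', 'Intro to Programming', 'Comp. Sci.', 4),
        ('C2', 'Databases',            'Comp. Sci.', 3),
        ('C3', 'Mechanics',            'Physics',    3);
    INSERT INTO instructor VALUES
        ('I1', 'Einstein', 'Physics',    95000),
        ('I2', 'Rao',      'Comp. Sci.', 60000);
""")

# Q.11 (a): titles of Comp. Sci. courses that have 3 credits
titles = [row[0] for row in conn.execute(
    "SELECT title FROM course WHERE dept_name = 'Comp. Sci.' AND credits = 3")]

# Q.11 (c): highest salary of any instructor
highest = conn.execute("SELECT MAX(salary) FROM instructor").fetchone()[0]

# Q.12 (a): 10% raise for every Comp. Sci. instructor
conn.execute(
    "UPDATE instructor SET salary = salary * 1.10 WHERE dept_name = 'Comp. Sci.'")
raised = conn.execute(
    "SELECT salary FROM instructor WHERE ID = 'I2'").fetchone()[0]

print(titles, highest, raised)
```

The remaining relations (department, section, teaches, student, takes) and their foreign keys would be added in the same way before attempting the other parts.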



Faculty Incharge: Hashmi S A