Mid KD

Title of test:

Mid KD

Description:
Midterm KD

Creation Date: 2026/04/02

Category: Others

Number of questions: 28

Content:

Feature scaling is the process of converting data to a uniform scale. True. False.

Data reduction techniques involve compressing and summarizing large amounts of data. True. False.

Every subset of a frequent itemset is also frequent. True. False.

The support of every subset J of I is at least equal to the support of itemset I. True. False.

The confidence of a rule is the percentage of transactions that contain the antecedent of the rule that also contain the consequent. True. False.
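The support and confidence measures used in the statements above can be sketched in a few lines of Python. This is a minimal illustration only: the `transactions` list and the helper names `support` and `confidence` are hypothetical and are not part of the test or its data.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy database, NOT the transaction table used by the test.
transactions: List[FrozenSet[str]] = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "jam"}),
    frozenset({"bread", "butter", "jam"}),
    frozenset({"tea"}),
]


def support(itemset: Iterable[str], db: List[FrozenSet[str]]) -> float:
    """Fraction of transactions in db that contain every item of itemset."""
    items = frozenset(itemset)
    return sum(1 for t in db if items <= t) / len(db)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               db: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    a = frozenset(antecedent)
    sup_a = support(a, db)
    if sup_a == 0:
        raise ValueError("antecedent never occurs, confidence is undefined")
    return support(a | frozenset(consequent), db) / sup_a


print(support({"bread"}, transactions))
print(support({"bread", "butter"}, transactions))
print(confidence({"bread"}, {"butter"}, transactions))
```

In this toy database the superset {bread, butter} cannot occur in more transactions than {bread} alone, which is the relationship between an itemset and its subsets that the support statements refer to.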

The basic idea of the Apriori algorithm is to use the downward closure property to prune the candidate search space. True. False.

Discretization is the process of transforming continuous data into discrete values. True. False.
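As a rough illustration of discretization, the sketch below uses equal-width binning, which is only one of several possible schemes. The function name `equal_width_bins` and the sample values are assumptions made for this example, not material from the test.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Map each continuous value to a bin index in the range 0..n_bins-1."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into the first bin.
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [1.0, 2.5, 4.0, 7.0, 10.0]  # hypothetical continuous attribute
print(equal_width_bins(ages, 3))
```

Equal-frequency binning or supervised methods may suit skewed data better; equal width is used here only because it is the shortest to write down.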

One of the key challenges in data mining is dealing with the curse of dimensionality. True. False.

The Euclidean distance is a measure of similarity between two data points. True. False.

When minSup increases, the number of frequent itemsets increases. True. False.

Generating association rules from frequent itemsets is the difficult part of the mining operation. True. False.

Data preprocessing is an optional step in the data mining process. True. False.

minSup ensures a sufficient strength in terms of conditional probabilities. True. False.

Which of the following is a common method for reducing the number of association rules generated by a frequent pattern mining algorithm? Increasing the minimum support threshold. Increasing the minimum confidence threshold. Decreasing the minimum support threshold. Decreasing the minimum confidence threshold.

Which of the following is a common approach for handling noise and outliers in frequent pattern mining? Increasing the minimum support threshold. Decreasing the minimum support threshold. Removing transactions with low support. Using clustering to identify groups of similar transactions.

What is the relationship between data mining and knowledge discovery in databases (KDD)? Data mining is a synonym for KDD. KDD is a synonym for data mining. Data mining is a step in the KDD process. KDD is a step in the data mining process.

Which of the following is a method for outlier detection? Local Outlier Factor (LOF). Principal Component Analysis (PCA). Linear Discriminant Analysis (LDA). Support Vector Machines (SVM).

What is the primary goal of data reduction in data preprocessing? To eliminate irrelevant or redundant data. To combine multiple datasets into a single dataset. To convert unstructured data into structured data. To scale data to a standardized range.

What is the primary challenge of data preprocessing in data mining? Finding the right data. Handling big data. Cleaning and integrating data from multiple sources. Building accurate models.

What is the primary difference between data mining and machine learning? Data mining is a subset of machine learning. Machine learning is a subset of data mining. Data mining focuses on discovering patterns and relationships in data, while machine learning focuses on making predictions and decisions. Machine learning focuses on discovering patterns and relationships in data, while data mining focuses on making predictions and decisions.

What is the primary goal of data discretization in data preprocessing? To correct errors and inconsistencies in the data. To reduce the size of the data. To transform the data into a more meaningful format. To convert continuous data into categorical data.

Which of the following is a disadvantage of using the Euclidean distance for similarity measurement? It is not suitable for high-dimensional data. It is sensitive to outliers. It is not suitable for continuous data. It assumes that the data is normally distributed.

What is the primary goal of data normalization in data preprocessing? To correct errors and inconsistencies in the data. To reduce the size of the data. To transform the data into a more meaningful format. To scale the data to a common range.

Which of the following is an example of a sequential pattern mining problem? Finding frequent itemsets in a transaction database. Identifying groups of customers with similar purchasing behavior. Detecting patterns of activity in a time series of events. Classifying images based on their content.

Which of the following distance measures is not appropriate for categorical data? Euclidean distance. Hamming distance. Jaccard distance. Minkowski distance.
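For reference, two of the measures named above can be computed directly on non-numeric values. The sketch below is a minimal illustration under the usual textbook definitions; the function names and the sample records are hypothetical and do not come from the test.

```python
from typing import Hashable, Iterable, Sequence


def hamming_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Number of positions at which two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """1 - |A intersect B| / |A union B|, computed on the two item sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention assumed here: two empty sets are at distance 0.
        return 0.0
    return 1 - len(set_a & set_b) / len(union)


# Hypothetical records with purely categorical attributes.
print(hamming_distance(("red", "small", "round"), ("red", "large", "round")))
print(jaccard_distance({"milk", "bread"}, {"milk", "eggs"}))
```

Neither function needs the attribute values to be ordered or numeric; each only checks whether values are equal or shared.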

The support of {Cheese, Milk} is: 0.2. 0.4. 0.6. 3.

For minSupport = 0.5, is {Eggs, Milk} considered a frequent itemset? Yes. No. No available data to judge. It depends on the minimum confidence value.

For minSupport = 0.3, which of the following itemsets is a maximal frequent itemset? {Bread, Milk}. {Cheese, Milk}. {Eggs, Milk, Yogurt}. All of the above.
