Data mining techniques
Arun K. Pujari
- 4th ed.
- Kolkata University press 2024
- 407p. P.B.
Foreword, Prologue, Preface to the Fourth Edition, Preface to the First Edition, Acknowledgements
1. INTRODUCTION 1.1 Introduction 1.2 Data Mining as a Subject 1.3 Guide to this Book
2. DATA WAREHOUSING 2.1 Introduction 2.2 Data Warehouse Architecture 2.3 Dimensional Modelling 2.4 Categorisation of Hierarchies 2.5 Aggregate Function 2.6 Summarisability 2.7 Fact–Dimension Relationships 2.8 OLAP Operations 2.9 Lattice of Cuboids 2.10 OLAP Server 2.11 ROLAP 2.12 MOLAP 2.13 Cube Computation 2.14 Multiway Simultaneous Aggregation (ArrayCube) 2.15 BUC - Bottom-Up Cubing Algorithm 2.16 Condensed Cube 2.17 Coalescing 2.18 Dwarf 2.19 Other Cubing Techniques 2.20 Skycube 2.21 View Selection - Partial Materialisation 2.22 Data Marting 2.23 ETL 2.24 Data Cleaning 2.25 ELT vs. ETL 2.26 Cloud Data Warehousing Further Reading Exercises Bibliography
3. DATA MINING 3.1 Introduction 3.2 What is Data Mining? 3.3 Data Mining: Definitions 3.4 KDD vs. Data Mining 3.5 DBMS vs. DM 3.6 Other Related Areas 3.7 DM Techniques 3.8 Other Mining Problems 3.9 Issues and Challenges in DM 3.10 DM Application Areas 3.11 DM Applications - Case Studies 3.12 Conclusions Further Reading Exercises Bibliography
4. ASSOCIATION RULES 4.1 Introduction 4.2 What is an Association Rule? 4.3 Methods to Discover Association Rules 4.4 Apriori Algorithm 4.5 Partition Algorithm 4.6 Pincer-Search Algorithm 4.7 Dynamic Itemset Counting Algorithm 4.8 FP-tree Growth Algorithm 4.9 Eclat and dEclat 4.10 Rapid Association Rule Mining (RARM) 4.11 Discussion on Different Algorithms 4.12 Incremental Algorithm 4.13 Border Algorithm 4.14 Generalised Association Rule 4.15 Association Rules with Item Constraints 4.16 Summary Further Reading Exercises Bibliography
5. CLUSTERING TECHNIQUES 5.1 Introduction 5.2 Clustering Paradigms 5.3 Partitioning Algorithms 5.4 k-Medoid Algorithms 5.5 CLARA 5.6 CLARANS 5.7 Hierarchical Clustering 5.8 DBSCAN 5.9 BIRCH 5.10 CURE 5.11 Categorical Clustering Algorithms 5.12 STIRR 5.13 ROCK 5.14 CACTUS 5.15 Conclusions Further Reading Exercises Bibliography
6. DECISION TREES 6.1 Introduction 6.2 What is a Decision Tree? 6.3 Tree Construction Principle 6.4 Best Split 6.5 Splitting Indices 6.6 Splitting Criteria 6.7 Decision Tree Construction Algorithms 6.8 CART 6.9 ID3 6.10 C4.5 6.11 CHAID 6.12 Summary 6.13 Decision Tree Construction with Presorting 6.14 RainForest 6.15 Approximate Methods 6.16 CLOUDS 6.17 BOAT 6.18 Pruning Technique 6.19 Integration of Pruning and Construction 6.20 Summary: An Ideal Algorithm 6.21 Other Topics 6.22 Conclusions Further Reading Exercises Bibliography
7. ROUGH SET THEORY 7.1 Introduction 7.2 Definitions 7.3 Example 7.4 Reduct 7.5 Propositional Reasoning and PIAP to Compute Reducts 7.6 Types of Reducts 7.7 Rule Extraction 7.8 Decision Tree 7.9 Rough Sets and Fuzzy Sets 7.10 Granular Computing Further Reading Exercises Bibliography
8. GENETIC ALGORITHM 8.1 Introduction 8.2 Basic Steps of GA 8.3 Selection 8.4 Crossover 8.5 Mutation 8.6 Data Mining Using GA 8.7 GA for Rule Discovery 8.8 GA and Decision Tree 8.9 Clustering Using GA Conclusions Further Reading Exercises Bibliography
9. OTHER TECHNIQUES 9.1 Introduction 9.2 What is a Neural Network? 9.3 Learning in NN 9.4 Unsupervised Learning 9.5 Data Mining Using NN: A Case Study 9.6 Support Vector Machines 9.7 Conclusions Further Reading Exercises Bibliography
10. PERFORMANCE EVALUATION - ROC CURVE 10.1 Introduction 10.2 Classification Accuracy 10.3 ROC Space 10.4 ROC Curves 10.5 ROC Curves and Class Distribution 10.6 ROC Convex Hull (ROCCH) 10.7 Method to Find the Optimal Threshold Point 10.8 Combining Classifiers 10.9 Area Under the ROC Curve (AUC) 10.10 Methods to Compute AUC 10.11 Averaging ROC Curves 10.12 ROC for Multi-class Classifiers 10.13 Precision–Recall Graph 10.14 DET Curves 10.15 Cost Curves Further Reading Exercises Bibliography
11. WEB MINING 11.1 Introduction 11.2 Web Mining 11.3 Web Content Mining 11.4 Web Structure Mining 11.5 Web Usage Mining 11.6 Text Mining 11.7 Unstructured Text 11.8 Episode Rule Discovery for Texts 11.9 Hierarchy of Categories 11.10 Text Clustering 11.11 Conclusions Further Reading Exercises Bibliography
12. TEMPORAL AND SPATIAL DATA MINING 12.1 Introduction 12.2 What is Temporal Data Mining? 12.3 Temporal Association Rules 12.4 Sequence Mining 12.5 The GSP Algorithm 12.6 SPADE 12.7 SPIRIT 12.8 WUM 12.9 Episode Discovery 12.10 Event Prediction Problem 12.11 Time-series Analysis 12.12 Spatial Mining 12.13 Spatial Mining Tasks 12.14 Spatial Clustering 12.15 Spatial Trends 12.16 Conclusions Further Reading Exercises Bibliography
Index