Introduction to Data Mining

Serigne DIAW
Jun 16, 2021

Data mining is the process of extracting and discovering patterns in large data sets, using methods at the intersection of machine learning, statistics, and database systems.

Data mining is the analysis step of the Knowledge Discovery in Databases (KDD) process.

Why mine data?

Computerization and automated data gathering have resulted in extremely large data repositories.

Raw Data -> Patterns -> Knowledge

Scalability issues and the desire for more automation make traditional techniques less effective. These include:

  • Statistical Methods
  • Relational Query Systems
  • OLAP (OnLine Analytical Processing)

The Data Mining (KDD) Process

[Figure: the data mining (KDD) process]

Data Mining Techniques

The most popular data mining techniques include:

  • Classification
  • Clustering
  • Regression
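As a rough illustration of the three techniques listed above, here is a minimal sketch in plain Python. The function names (`classify_nearest`, `cluster_two_means`, `fit_line`) and the toy numbers are made up for this example. Real projects would normally rely on a dedicated library, and the algorithms chosen here (1-nearest-neighbour, a two-cluster k-means, ordinary least squares) are just one simple instance of each family.

```python
from statistics import mean


def classify_nearest(train, point):
    """Classification: return the label of the closest labelled example
    (1-nearest-neighbour on a single numeric feature)."""
    nearest = min(train, key=lambda item: abs(item[0] - point))
    return nearest[1]


def cluster_two_means(values, iterations=10):
    """Clustering: split unlabelled 1-D values into two groups with a
    tiny k-means (k = 2), starting from the smallest and largest value."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the closer of the two centers.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


def fit_line(xs, ys):
    """Regression: ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labelled = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
    print(classify_nearest(labelled, 7.5))
    print(cluster_two_means([1.0, 1.5, 2.0, 8.0, 8.5, 9.0]))
    print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
```

The difference between the three is in what they are given and what they return: classification needs labelled examples and predicts a category, clustering gets no labels and finds the groups itself, and regression predicts a numeric value.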

Other significant ideas include:

  • Association rule learning
  • Topic identification, tracking, and drift analysis
  • Concept hierarchy creation
  • Relevance of content
  • Anomaly detection
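Two of the ideas in this list are easy to sketch in a few lines of plain Python. The example below assumes the usual support and confidence measures for association rules and a simple z-score test for anomaly detection. The function names, the threshold of 2.0, and the toy baskets are hypothetical choices made for illustration, not a reference implementation.

```python
from statistics import mean, pstdev


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent: the share of
    transactions containing the antecedent that also contain the consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


def zscore_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` population standard
    deviations away from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values are identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    # Toy shopping baskets, invented for illustration only.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
    print(zscore_anomalies([10, 11, 9, 10, 12, 10, 11, 50]))
```

A rule such as "bread -> milk" is usually considered interesting only when both its support and its confidence exceed thresholds chosen by the analyst. The z-score test is one of the simplest anomaly detectors and works best when the data are roughly bell-shaped.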

Written by Serigne DIAW

Data Engineer / Data Architect / Data Scientist