Technology projects: Data Mining
SCOPE
Data mining is the process of discovering valuable information in large data sets, using mathematical analysis to identify trends and patterns in the data. Because the relationships are complex and the volume of data is large, traditional data analysis methods are not applicable. The discovered trends and patterns can be captured in a data mining model.
APPLICATION
Working with our clients, we have developed a range of data mining models with valuable applications in areas such as:
- Risk and probability – determining the best customers for targeted actions, determining the break-even point for risk scenarios, assigning probabilities to diagnoses, etc.;
- Forecasting – sales and revenue, etc.;
- Clustering – separating customers or events into clusters of related items, analyzing and predicting affinities;
- Product analysis – determining which products are likely to be sold together or which products are complementary, in order to establish product efficiency;
- Finding sequences – analyzing customer behavior and predicting next likely events;
- And others.
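As an illustration of the product analysis item above, the following is a minimal sketch of how products that tend to be sold together might be identified. It assumes the sales data is available as a list of baskets (one list of product names per purchase) and uses lift, a standard co-occurrence measure; the function name `pair_lift` and the toy baskets are hypothetical and serve only as an example, not as a description of any model built for a client.

```python
from collections import Counter
from itertools import combinations


def pair_lift(baskets):
    """Return {(a, b): lift} for every product pair bought together at least once."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the order within each pair
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    lifts = {}
    for (a, b), together in pair_counts.items():
        # lift = P(a and b) / (P(a) * P(b)); a value above 1 suggests
        # the two products are bought together more often than by chance
        lifts[(a, b)] = (together / n) / ((item_counts[a] / n) * (item_counts[b] / n))
    return lifts


# Hypothetical toy baskets, invented for illustration only.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["milk"],
    ["bread", "butter"],
]
lifts = pair_lift(baskets)

# Print pairs from strongest to weakest association.
for pair, lift in sorted(lifts.items(), key=lambda kv: -kv[1]):
    print(pair, round(lift, 2))
```

Pairs with a lift well above 1 are candidates for complementary products; on real data a minimum number of joint purchases would normally be required before a pair is trusted.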
TECHNOLOGY
As mentioned above, data mining is a process of identifying trends and patterns in large datasets, drawing on methods from machine learning, statistics, and database systems. In addition to the analysis step itself, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
In other words, it is a long and intensive process involving several steps, the main ones being:
- Problem specification;
- Data preparation;
- Data exploration;
- Model building;
- Model validation;
- Model update and improvement.
The final goal of the process is to extract information from the available datasets and transform it into an understandable structure for further use.
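The main steps listed above can be sketched in a few lines of code. The example below is only a minimal illustration under stated assumptions: the problem (predicting one numeric value from another), the synthetic data, and all function names are hypothetical, and a simple least-squares line stands in for whatever model a real project would require.

```python
import random
import statistics


def prepare(rows):
    """Data preparation: drop records with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def explore(rows):
    """Data exploration: basic summary statistics."""
    xs, ys = zip(*rows)
    return {"n": len(rows), "mean_x": statistics.mean(xs), "mean_y": statistics.mean(ys)}


def build_model(rows):
    """Model building: least-squares fit of y = slope * x + intercept."""
    xs, ys = zip(*rows)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    return slope, my - slope * mx


def validate(model, rows):
    """Model validation: mean absolute error on held-out records."""
    slope, intercept = model
    return statistics.mean(abs(y - (slope * x + intercept)) for x, y in rows)


# Problem specification (hypothetical): predict y from x.
# Synthetic data, invented for illustration: y = 3x + 10 plus noise.
rng = random.Random(0)
raw = [(x, 3.0 * x + 10.0 + rng.gauss(0, 1)) for x in range(1, 41)]
raw += [(None, 5.0), (7, None)]  # two incomplete records to be cleaned out

clean = prepare(raw)
summary = explore(clean)

# Hold out a quarter of the records for validation.
rng.shuffle(clean)
train, holdout = clean[:30], clean[30:]

model = build_model(train)
mae = validate(model, holdout)

# Model update: refit on all available data once validation looks acceptable.
updated_model = build_model(clean)
print(summary["n"], round(mae, 2), updated_model)
```

Keeping each step in its own function mirrors the list above and makes it straightforward to replace one stage, for example a different model, without touching the others.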