Data Mining Analysis
Data mining analysis is the process of using data mining software
and practices to analyze and understand patterns in data. The process
starts with taking data from the source systems
and storing it in a form that is more readily available for
analysis. In larger deployments, data mining analysis is
typically performed on what are known as "data warehouses",
but it can also be performed on something as simple
and straightforward as an Excel spreadsheet.
Two Types of Data Components
Data mining analysis is a major component that will ultimately influence
how useable and valuable the acquired information turns out to be.
Once the data has been extracted, stored in an accessible database
system, and provided to the IT professional, consultant or analyst,
it is time to begin data mining analysis. Just like there are different
types of data mining software available, there are varying levels of
data mining analysis available as well. Some frequently used data mining
analysis methods are artificial neural networks, genetic algorithms,
decision trees, the nearest neighbor method, rule induction and data
visualization.
Two Types of Data Mining Analysis Methods
Artificial neural networks are frequently used in data mining analysis. They
are non-linear predictive models that learn through training and
resemble biological neural networks in basic structure. Another type of
data mining analysis method is genetic algorithms: optimization techniques
that search for progressively better solutions using processes that mimic those
of natural evolution, including genetic combination, mutation, and
natural selection.
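To make the evolutionary steps concrete, here is a minimal sketch of a genetic algorithm in Python. It is an illustration only, not how any particular data mining product works: the toy objective (counting 1-bits), the function names `fitness` and `evolve`, and all parameter values are assumptions chosen for brevity.

```python
import random

def fitness(bits):
    """Toy objective (an assumption for illustration): count of 1-bits."""
    return sum(bits)

def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Natural selection: only the fitter half may reproduce.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Carry the best individual over unchanged so quality never drops.
        children = [parents[0][:]]
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Genetic combination: one-point crossover of two parents.
            cut = rng.randrange(1, n_bits)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b
                     for b in child]
            children.append(child)
        population = children
    return max(population, key=fitness)

best = evolve()
print(fitness(best), best)
```

In a real analysis the bit string would encode something meaningful, such as which input fields a model should use, and `fitness` would score how well that choice performs.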
Additional Data Mining Analysis Methods
Decision trees are, as the name suggests, tree-shaped
structures that represent sets of decisions. Each decision
generates rules for classifying the data. Specific
decision tree methods used in data mining analysis include CART
(Classification and Regression Trees), which uses 2-way splits to segment
a set of data, and CHAID (Chi Square Automatic Interaction Detection),
which uses chi square tests to create multi-way splits. CART
typically requires somewhat less data preparation than CHAID.
The nearest neighbor method
(also called the k-nearest neighbor technique) classifies
each record based on a combination
of the classes of the k record(s) most similar to it in a historical
dataset. The rule induction method develops “if-then” rules
and tests them against the historical data, keeping only the rules
that hold up. The last type of data mining analysis
is data visualization, which relies on the visual interpretation
of relationships in the data, such as through charts, graphs or other
graphic tools.
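The k-nearest neighbor technique described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name `knn_classify`, the sample records, the use of straight-line (Euclidean) distance as the similarity measure, and majority voting as the way to combine classes are all choices made for the example.

```python
from collections import Counter
import math

def knn_classify(history, query, k=3):
    """Classify `query` by majority vote among its k most similar records.

    history: list of (features, label) pairs from previously seen data.
    query:   tuple of numeric features for the new record.
    """
    # Similarity here is Euclidean distance (an assumption for this sketch).
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical historical records: (feature values, known class).
history = [
    ((1.0, 1.0), "low"),
    ((1.2, 0.8), "low"),
    ((0.9, 1.1), "low"),
    ((5.0, 5.2), "high"),
    ((5.1, 4.9), "high"),
    ((4.8, 5.0), "high"),
]

print(knn_classify(history, (1.1, 0.9)))  # prints "low"
print(knn_classify(history, (5.0, 5.0)))  # prints "high"
```

The value of k is a tuning choice: a small k follows the closest records closely, while a larger k smooths out the effect of unusual records.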
After Data Mining Analysis
Once the data mining analysis has taken place, the results
are placed in a user-friendly format that can easily be interpreted
by other users, usually graphics or tables. Through
data mining analysis a business can gain greater insight into key
business metrics and drivers. This ability
has made data mining analysis a high-growth area for enterprise companies. Data
mining analysis, however, is also available to smaller businesses. Companies
like EMANIO are making data mining analysis software available at price
points and ease-of-use levels that were previously unavailable.