Data mining (also known as knowledge discovery in databases, or KDD) is defined as the process of analyzing data from different perspectives and summarizing it into useful information. This information can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.
Data mining is a class of database applications that look for hidden patterns in a group of data that can be used to predict future behavior. For example, data mining software can help retail companies find customers with common interests. It is basically the idea of looking for patterns in large amounts of data. The most common application has to do with data that describe people's purchases and with trying to understand the relationships among them. For example, a grocer used data mining software to analyze local buying patterns. He found that when men bought diapers, they also tended to buy beer. The grocer could use this newly discovered information in various ways to increase revenue: he could move the beer display closer to the diaper display, and he could make sure beer and diapers were sold at full price.
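A pattern like the beer-and-diapers one is commonly quantified with association-rule measures. The following is a minimal Python sketch, assuming a small invented list of shopping baskets; the data and the function name rule_metrics are illustrative only and do not come from any real store. It computes support (how often both items appear together), confidence (how often beer appears when diapers do), and lift (how much more often than chance the two co-occur).

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    if n == 0:
        raise ValueError("no transactions")
    has_a = sum(1 for t in transactions if antecedent in t)
    has_c = sum(1 for t in transactions if consequent in t)
    has_both = sum(1 for t in transactions if antecedent in t and consequent in t)
    support = has_both / n                                 # share of baskets with both items
    confidence = has_both / has_a if has_a else 0.0        # P(consequent | antecedent)
    lift = confidence / (has_c / n) if has_c else 0.0      # confidence relative to baseline
    return support, confidence, lift


# Hypothetical baskets, invented for illustration only.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "bread"},
    {"bread", "chips"},
    {"milk"},
]

support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
# prints: support=0.375 confidence=0.750 lift=1.500
```

A lift above 1 suggests the two items appear together more often than they would if purchases were independent, which is the kind of signal the grocer in the example would act on.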
Nowadays, data mining is widely used in both the public and private sectors. In the private sector, industries such as medicine, banking, and insurance use data mining to increase sales, decrease costs, and enhance research. In the public sector, data mining applications are used to measure and improve program performance, assess risk, and support product retailing.
Data mining is the core of the knowledge discovery process, a sequence of steps leading from raw data collections to some form of new knowledge. The process starts with data cleaning. In this phase, irrelevant and noisy data are removed from the collection because they will not be used in the analysis.
The second step is data integration, which involves combining data from different sources and providing users with a unified view of these data. The third step is data selection, in which the data relevant to the analysis is decided on and retrieved from the data collection.
The fourth step is data transformation, which converts the selected data into forms appropriate for the mining procedure. Next comes data mining itself, the phase in which suitable techniques are applied to extract useful patterns. In the interpretation and evaluation step, the interesting patterns are identified based on given measures.
The final phase is knowledge representation, in which the discovered knowledge is presented to the user. This is an important step, and visualization techniques are used to help users understand and interpret the data mining results.
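The steps of the knowledge discovery process can be sketched as a chain of small functions. This is only an illustrative Python outline under stated assumptions: the order records, the customer table, the noise rules (missing or non-positive amounts), and the threshold min_count are all invented for the example, and a real system would use far richer techniques at each step.

```python
from collections import Counter

# Hypothetical raw data from two sources, invented for illustration.
orders = [
    {"customer_id": 1, "item": "diapers", "amount": 12.0},
    {"customer_id": 1, "item": "beer", "amount": 8.5},
    {"customer_id": 2, "item": "beer", "amount": None},   # noisy: missing amount
    {"customer_id": 2, "item": "milk", "amount": 2.0},
    {"customer_id": 3, "item": "beer", "amount": 9.0},
    {"customer_id": 3, "item": "chips", "amount": -1.0},  # noisy: negative amount
]
customers = {1: {"region": "north"}, 2: {"region": "south"}, 3: {"region": "north"}}


def clean(records):
    """Data cleaning: drop records with missing or impossible amounts."""
    return [r for r in records if r["amount"] is not None and r["amount"] > 0]


def integrate(records, customer_table):
    """Data integration: join order records with customer attributes."""
    return [{**r, **customer_table[r["customer_id"]]} for r in records]


def select(records, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]


def transform(records):
    """Data transformation: turn each record into a (region, item) pair."""
    return [(r["region"], r["item"]) for r in records]


def mine(pairs):
    """Data mining: count how often each (region, item) pair occurs."""
    return Counter(pairs)


def evaluate(pattern_counts, min_count):
    """Interpretation and evaluation: keep patterns that meet a threshold."""
    return {p: c for p, c in pattern_counts.items() if c >= min_count}


def present(patterns):
    """Knowledge representation: render the patterns as readable lines."""
    return [f"{item} bought in {region}: {count} times"
            for (region, item), count in sorted(patterns.items())]


cleaned = clean(orders)
unified = integrate(cleaned, customers)
selected = select(unified, ["region", "item"])
pairs = transform(selected)
patterns = evaluate(mine(pairs), min_count=2)
report = present(patterns)
print("\n".join(report))
# prints: beer bought in north: 2 times
```

Keeping each phase as its own function mirrors the description above: the output of one step is the input of the next, so any single step can be replaced without touching the others.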
Data mining involves building a data warehouse and electronic database to organize the wealth of information collected. Without a data warehouse, companies lack the infrastructure to mine useful knowledge out of the data available. Like word processing programs and computer operating systems, data mining has grown more user-friendly and graphics-based as its application has spread throughout society to less technically inclined users.
E-Commerce and Data Mining
Electronic commerce, also known as e-commerce (or electronic marketing), consists of the buying and selling of products or services over electronic systems such as the Internet and other computer networks. The amount of trade conducted electronically has grown enormously with widespread Internet usage. Historically, until about 1994 electronic commerce was not web-based: businesses used computers and telecommunications to automatically forward and process commercial documents such as invoices. They then began to use electronic data interchange to communicate between computers through the shipment of magnetic tapes.
In the 1990s, the first version of electronic commerce focused on business-to-business transactions because electronic data interchange systems were expensive and personal computers were rare. Later, the infrastructure grew rapidly, as did the number of personal computers and the size of the Internet. Yahoo, Amazon, Google, and eBay all began on a small scale with simple business models, which typically had customers place orders over the Internet through web-browsing sessions. As e-commerce grew more complex, computer security software became a necessary enabler, as did spam filters and technologies such as PayPal that facilitate commercial purchases. This growth went hand in hand with the application of data mining algorithms.
E-commerce also allows businesses to access new data streams that inform management in ways that were not previously possible. One example is the online auctions at eBay, which provide data on how much money customers are willing to pay for a product. Another example is clickstream analysis, which provides clues on how people weigh information in making purchase decisions.
E-commerce plays an important role in changing the face of most business functions in a competitive enterprise. E-business and e-commerce enable online transactions, which generate large amounts of data that are not easy to control and manage. To make sense of these data, we need data mining services.
A great deal of information can be found on the Internet, but the problem is how to manage, manipulate, and exploit this information so that it is useful. Data mining is a way to extract useful knowledge from the wealth of information stored in databases, computer systems, sales data, communications records, financial data, and other sources. Data mining is valuable to any firm because it helps the firm understand customers' behavior, lower transaction costs, improve management, offer new services, and build better customer relations.
In the e-commerce world, data mining carries an additional range of benefits. In particular, as e-commerce merchants worked to create the maximum amount of value out of what the Web has to offer, they moved to personalize products and services. The extraction of personal information allowed by data mining greatly facilitated this process. By plugging data-mining analysis into customer-service databases and their Web applications, companies can tailor products and services to accord with individual customers' habits and preferences, thereby maximizing value.
Electronic commerce is an information-intensive environment in which businesses typically collect huge amounts of information about consumers. Retailers and suppliers need to customize their shops and services. On-line consumers have shown interest in making quicker, better-informed decisions rather than just demanding the lowest price. All these requirements call for data warehousing (the process of extracting, cleaning, augmenting, and organizing operational data) and data mining (the extraction of structure from the warehoused data). Data mining tools for clustering, associating, and performing pattern analysis on the data generated by electronic commerce enable usage tracking.
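As an illustration of the clustering tools mentioned above, the following is a minimal k-means sketch in plain Python. It assumes each customer is described by two invented features (site visits per month and average order value); the data, the choice of k = 2, and the simple initialization from the first k points are assumptions made for the example, not a recommended production setup.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; initial centroids are the first k points."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignment


# Hypothetical customers: (visits per month, average order value).
customers = [(2, 15.0), (30, 120.0), (3, 20.0), (28, 110.0), (1, 10.0), (32, 130.0)]

centroids, assignment = kmeans(customers, k=2)
print(assignment)  # prints: [0, 1, 0, 1, 0, 1]
```

Here the two groups correspond to occasional low-spending visitors and frequent high-spending ones. In practice the features would normally be scaled first, since k-means is sensitive to the units of each dimension.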
It has been argued that electronic commerce is the killer app for data mining. The Internet, and e-commerce websites in particular, gather huge amounts of information about customers' orders, browsing patterns, offerings, prices, competitors, usage times, and preferences. Many of the traditional barriers to the effective application of data mining, in areas such as automation, data transformation, and data access, are lower in e-commerce.
E-commerce allows better customer management, an expanded range of products, and new marketing strategies. Data mining tools help enable all of these changes in the business world.
Data Mining Limitations
Although data mining is a powerful tool, it has limits. Data mining is still an art and requires skilled technical and analytical specialists to structure the analysis and interpret the output. It requires subject-area expertise, experience with large databases, and skill with data mining algorithms.
Data mining helps to find patterns and relationships, but it does not tell the user the value or significance of these patterns; the user has to determine that. In addition, the validity of the patterns found depends on how they compare to real-world circumstances. For example, to assess the validity of a data mining application designed to identify terrorists, the user may test it using data that includes information about known terrorists. However, while this may reaffirm a particular profile, it does not necessarily mean that the application will identify a criminal whose behavior deviates significantly from the original profile.
In addition, although data mining can identify connections between variables and behaviors, it does not identify causal relationships. For example, an application may find that the tendency to purchase airline tickets just before the flight is scheduled to depart is related to characteristics such as income or Internet use. That does not indicate that the ticket-purchasing behavior is caused by one of these variables.
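The difference between correlation and causation can be made concrete with a small calculation. The sketch below, in plain Python, computes the Pearson correlation coefficient on invented figures for six hypothetical customers; the numbers and variable names are assumptions chosen only so that income drives both of the other variables.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data for six customers, invented for illustration.
income = [20, 35, 50, 65, 80, 95]          # annual income, in thousands
internet_hours = [5, 9, 12, 16, 21, 24]    # weekly Internet use, in hours
late_purchases = [0, 1, 1, 2, 3, 3]        # last-minute ticket purchases per year

r_internet_late = pearson(internet_hours, late_purchases)
r_income_late = pearson(income, late_purchases)
print(f"internet vs late purchases: r = {r_internet_late:.2f}")
print(f"income vs late purchases:   r = {r_income_late:.2f}")
```

Both coefficients come out close to 1, yet nothing in the calculation says whether Internet use causes last-minute purchases, whether income causes both, or whether some unmeasured factor is responsible. Settling that question takes domain knowledge or a controlled experiment, not a mined pattern alone.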