AUTHORED BY DONALD C. GILLETTE, PH.D., DATA CONSULTANT @ GUIDEIT
Last weekend I read a very interesting book, “The Quants: How a New Breed of Math Whizzes Conquered Wall Street and Nearly Destroyed It” by Scott Patterson. I highly recommend it as a must-read for anyone doing Business Intelligence, and especially Data Mining.
So what is Data Mining? Basically, it is the practice of examining large databases in order to generate new information. OK, let’s dig into that to understand the business value.
Let us consider the US Census. By law it is conducted every ten years, and it produces petabytes of data (1 petabyte is one quadrillion bytes) crammed full of facts that matter to almost anyone doing data mining for almost any consumer-based product or service. Quick sidebar and promo: in part 2 of this micro series, I will share where databases like the census can be accessed to help make your data mining exercise valuable.
So if the marketing department asked me to help predict how much to spend on a new advertising campaign for a health care product that enhances the existing dental benefits of people already in qualified dental plans, I would have a need for data mining. With these criteria in hand, I would, for example, query the average commute time of people over 16 in the state of Texas. It is 25 minutes. We now have a cornerstone insight to work from. The age cutoff, of course, focuses the group on people who are earning incomes rather than relying on Social Security and Medicare. To validate a possible conclusion, we run a secondary query on additional demographic criteria and learn that the 25-minute commute figure doesn’t change. We also learn that 35% of these people belong to one particular minority segment.
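To make the two queries above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record layout, the field names (state, age, commute_min, segment), and the sample rows are invented and are not drawn from actual Census data. It only shows the shape of the work: filter to Texans over 16, average their commute, then break the same group down by demographic segment.

```python
from collections import Counter
from statistics import mean

# Hypothetical respondent records. Field names and values are invented
# for illustration only; they are not real Census figures.
RECORDS = [
    {"state": "TX", "age": 34, "commute_min": 30, "segment": "A"},
    {"state": "TX", "age": 52, "commute_min": 20, "segment": "B"},
    {"state": "TX", "age": 15, "commute_min": 0, "segment": "A"},
    {"state": "TX", "age": 41, "commute_min": 25, "segment": "A"},
    {"state": "OK", "age": 29, "commute_min": 18, "segment": "B"},
]


def select(records, state, min_age=16):
    """Keep only respondents in the given state who are older than min_age."""
    return [r for r in records if r["state"] == state and r["age"] > min_age]


def average_commute(records, state, min_age=16):
    """First query: mean commute time, in minutes, for the selected group."""
    rows = select(records, state, min_age)
    return mean(r["commute_min"] for r in rows)


def segment_shares(records, state, min_age=16):
    """Secondary query: fraction of the selected group in each segment."""
    rows = select(records, state, min_age)
    counts = Counter(r["segment"] for r in rows)
    return {segment: n / len(rows) for segment, n in counts.items()}


if __name__ == "__main__":
    print("Average commute (minutes):", average_commute(RECORDS, "TX"))
    print("Share by segment:", segment_shares(RECORDS, "TX"))
```

In a real exercise the rows would come from a published dataset rather than a hand-typed list, and the filter criteria would be whatever the marketing question calls for.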
I pass this information to the marketing department, and they now have a basis for deciding how much to pay for a statewide marketing campaign to promote their new product, when to run the campaign, and which channels and platforms to use.
DATA MINING, can’t live without it. Next week we’ll cover how and where to mine.