Business Intelligence: OLAP

Within a Business Intelligence (BI) program, online analytical processing (OLAP) and customer relationship management (CRM) are both applications with strategic uses for the company, and both depend on the data warehouse: they analyze the multidimensional data sets stored there to provide data-driven answers to queries. Both systems apply data analytics to turn multidimensional data into insightful information. OLAP's multidimensional view of the data warehouse is possible because the data is mapped onto n-dimensional data cubes, where it can easily be rolled up, drilled down, sliced and diced, and pivoted (Connolly & Begg, 2014). OLAP has many applications beyond customer relationships, which makes it more versatile than CRM; a CRM takes a more targeted approach, analyzing the customer's relationship to the company and its products. A CRM's main goal is to analyze the internal and external data stored in the data warehouse to produce insights such as a customer's "predicted affinity to buy," the "cost or profit" of a customer, "predictions of future customer behavior," etc. (Ahlemeyer-Stubbe & Coleman, 2014). The information gained from the CRM can inform employees about a customer's affinity toward a product, so they can sell similar items or the items surfaced by a market basket analysis.

OLAP is an online analytical processing application, which allows people to examine data in real time from different points of view, in aid of more data-driven decisions (McNurlin et al., 2008). With OLAP, computers can perform what-if analyses and goal-based decisions using data. The key ability of OLAP systems is to help answer the "Why?" question, as well as the typical "Who?" and "What?" questions (Connolly & Begg, 2014). Connolly and Begg (2014) further explain that OLAP is a specialized implementation of SQL. The data queried is assumed to be static and unchanging; hence the low-volatility aspect of a data warehouse with multidimensional databases is ideal for OLAP applications. The value of the data warehouse comes not just from storing the right kind of data, but from conducting the analyses that answer queries and, in the end, support the data-driven decisions that are best for the company. According to Connolly and Begg (2014), OLAP tools have been used to study the effectiveness of marketing campaigns, forecast product sales, and plan capacity. However, Connolly and Begg (2014) are of the opinion that data mining tools can surpass the capabilities of OLAP tools.
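
To make the cube operations concrete, here is a minimal sketch in Python using pandas. The sales figures, dimension names, and data are invented for illustration; this is only one plausible way to express roll-up, drill-down, slicing, and pivoting outside a dedicated OLAP server.

```python
import pandas as pd

# Hypothetical sales facts with three dimensions: region, product, quarter.
sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "East", "West"],
    "product": ["TV", "Radio", "TV", "Radio", "TV", "Radio"],
    "quarter": ["Q1", "Q1", "Q1", "Q1", "Q2", "Q2"],
    "revenue": [100, 40, 80, 30, 120, 35],
})

# Pivot: revenue by (region, product) rows against quarter columns.
cube = sales.pivot_table(index=["region", "product"], columns="quarter",
                         values="revenue", aggfunc="sum")

# Roll up: aggregate away the product dimension to totals per region.
rollup = cube.groupby(level="region").sum()

# Drill down is the inverse: return to the finer product-level view (cube).
# Slice: fix one dimension member, e.g. only the East region.
east_slice = cube.xs("East", level="region")

print(rollup)
print(east_slice)
```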

CRM, on the other hand, covers a wide range of concepts revolving around how companies capture, store, and analyze customer, vendor, and partner relationship data. Information stored in a CRM could include interactions with customers, vendors, or partners, which allows the company to gain insights from previous interactions and can even be grouped into customer segments, market basket analyses, etc. (Ahlemeyer-Stubbe & Coleman, 2014). CRMs can assist in real time with making data-driven decisions with respect to a company's customers (McNurlin, Sprague, & Bui, 2008). The goal is to use the current data to help the company build better communications and relationships with its customers, vendors, and partners. Both internal and external company data are usually added to the data warehouse for the CRM. Through the internet, companies can learn more about their customers and their non-customers, helping the company become more customer-centric (McNurlin et al., 2008). McNurlin et al. (2008) describe a case study in which Wachovia Bank purchased a pay-by-use CRM system from salesforce.com. After the system was set up within six weeks, sales reps had 30 more hours to spend selling bank services, and managers could use the data collected by the CRM to tell the sales reps which customers would yield the highest return.
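
As a concrete illustration of a CRM-style "predicted affinity to buy" score, here is a minimal sketch using scikit-learn's logistic regression. The feature names and the tiny training set are hypothetical, not drawn from the cited texts; a real CRM would train on warehouse data at far larger scale.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features: [visits_last_month, past_purchases, avg_basket_$]
X = np.array([[2, 0, 15], [8, 3, 60], [1, 0, 10],
              [12, 5, 90], [5, 1, 30], [9, 4, 75]])
y = np.array([0, 1, 0, 1, 0, 1])  # 1 = bought the promoted product

model = LogisticRegression().fit(X, y)

# "Affinity to buy" for a new customer profile.
new_customer = np.array([[6, 2, 45]])
print(model.predict_proba(new_customer)[0, 1])  # probability of purchase
```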


Business Intelligence: Decision Support Systems

Many years ago, a measure of Business Intelligence (BI) systems was how big the data warehouse was (McNurlin, Sprague, & Bui, 2008). This measure made no sense, as it is not the quantity of the data that matters but the quality: a lot of bad data in the warehouse means a lot of bad data-driven decisions. Both BI and Decision Support Systems (DSS) help provide data to support data-driven decisions. However, McNurlin et al. (2008) state that the DSS is one of five principles of BI, along with data mining, executive information systems, expert systems, and agent-based modeling.

  • A BI strategy can include, but is not limited to, data extraction, data processing, data mining, data analysis, reporting, dashboards, performance management, actionable decisions, etc. (Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Padhy, Mishra, & Panigrahi, 2012; McNurlin et al., 2008). This definition, along with the fact that the DSS is one of the five principles of BI, suggests that the DSS predates BI and that BI is a newer, more holistic view of data-driven decision making.
  • A DSS helps execute the project, expand the strategy, improve processes, and improve quality controls in a quick and timely fashion. A data warehouse's main role is to support the DSS (Carter, Farmer, & Siegel, 2014). The three components of a DSS are the data component (comprising databases or a data warehouse), the model component (comprising a model base), and the dialog component (the software system through which a user interacts with the DSS) (McNurlin et al., 2008); a toy sketch of these three components follows this list.
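
The following is a deliberately minimal Python sketch of the three DSS components named above. All class and field names are invented, and the "model base" is reduced to a single naive forecast; this illustrates the architecture only, not any system described by McNurlin et al. (2008).

```python
# A toy DSS with the three components: data, model, and dialog.

class DataComponent:
    """Stands in for the database / data warehouse."""
    def __init__(self):
        self.monthly_sales = [100, 110, 125, 140]  # hypothetical history

class ModelComponent:
    """Stands in for the model base, here a single simple forecast model."""
    def forecast(self, history):
        growth = (history[-1] - history[0]) / (len(history) - 1)
        return history[-1] + growth  # naive linear extrapolation

class DialogComponent:
    """The software layer the decision maker actually interacts with."""
    def __init__(self):
        self.data, self.model = DataComponent(), ModelComponent()

    def ask(self, question):
        if question == "next month sales?":
            return self.model.forecast(self.data.monthly_sales)

dss = DialogComponent()
print(dss.ask("next month sales?"))  # -> 153.33...
```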

McNurlin et al. (2008) describe a case study in which Ore-Ida Foods, Inc. had a marketing DSS supporting its data-driven decisions through: data retrieval (internal data and external market data), market analysis (70% of the use of their DSS, where data was combined and relationships were discovered), and modeling (which was frequently updated). The modeling offered great insight for the marketing management. McNurlin et al. (2008) emphasize that DSSs tend to be well defined but rely heavily on internal data with little external data, and that validation testing of the model/data is rarely done.

The incorporation of internal and external data into the data warehouse helps both BI strategies and DSSs. However, the one thing a BI strategy provides that a DSS does not is an answer to "What is the right data to collect and present?" A DSS is more of the "how" component, whereas BI systems generate the why, what, and how, because of their constant feedback loop into the business and its decision makers. This was seen in a hospital case study and was one of the main reasons that effort succeeded (Topaloglou & Barone, 2015). As illustrated in the hospital case study, all the data types were consolidated under a unifying definition and type, with defined roles and responsibilities assigned to each. Every data element entered into the data warehouse had a particular reason, defined through interviews with all the different levels of the hospital, from the business level to the process level.

BI strategies can also affect supply chain management in a manufacturing setting. The 787-8, 787-9, and 787-10 Boeing Dreamliners outsource roughly 30% or more of their parts and components, a new approach given that the current Boeing 747 is only about 5% outsourced (Yeoh & Popovič, 2016). As more companies increase the outsourcing percentage of their product mix, it becomes more crucial to capture data on fault tolerances for each outsourced part. BI data could also be used to decide which suppliers to keep. Companies as huge as Boeing can have multiple suppliers for the same part; if an inventory analysis finds an unusually larger-than-average variance in the performance of an item, they can either (1) negotiate a lower price to compensate for that variance, or (2) give the supplier notice that the contract will be terminated unless the variance for that part is reduced. The same applies to auto manufacturing plants, steel mills, etc.
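
A small sketch of what such a supplier-variance check could look like in Python. The part measurements, supplier names, and the flagging rule (variance above the fleet-wide variance) are all hypothetical, chosen only to illustrate the comparison described above.

```python
import statistics

# Hypothetical tolerance measurements (mm) for the same part, two suppliers.
measurements = {
    "supplier_a": [10.01, 9.99, 10.02, 9.98, 10.00],
    "supplier_b": [10.10, 9.85, 10.20, 9.75, 10.05],
}

# Baseline: variance across all deliveries of this part.
fleet_variance = statistics.variance(
    [m for batch in measurements.values() for m in batch])

for supplier, batch in measurements.items():
    v = statistics.variance(batch)
    flag = "review contract" if v > fleet_variance else "ok"
    print(f"{supplier}: variance={v:.5f} ({flag})")
```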


Business Intelligence: Data Mining

Data mining is a subset of the knowledge discovery process (or the concept flow of Business Intelligence); data mining provides the algorithms/math that aid in developing actionable data-driven results (Fayyad, Piatetsky-Shapiro, & Smyth, 1996). It should be noted that success has as much to do with the events leading up to the main event as with the main event itself. To incorporate data mining processes into Business Intelligence, one must understand the business task/question behind the problem, properly process all the required data, analyze the data, evaluate and validate the data while analyzing it, apply the results, and finally learn from the experience (Ahlemeyer-Stubbe & Coleman, 2014). Connolly and Begg (2014) state that there are four operations of data mining: predictive modeling, database segmentation, link analysis, and deviation detection. Fayyad et al. (1996) classify data mining operations by their outcomes: predictive and descriptive.

It is crucial to understand the business task/question behind the problem you are trying to solve, because some types of business applications are associated with particular operations; marketing strategies, for instance, use database segmentation (Connolly & Begg, 2014). However, any of the data mining operations can be implemented for any business application, and many business applications can use multiple operations. Customer profiling can use database segmentation first and predictive modeling next (Connolly & Begg, 2014). Thinking outside the box about which combination of operations and algorithms to use, rather than defaulting to previously used ones, can generate even better results (Minelli, Chambers, & Dhiraj, 2013).
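
The segmentation-then-prediction pattern just described can be sketched in a few lines of Python with scikit-learn. The customer data is synthetic and the two-step pipeline is only one plausible reading of the customer-profiling example, not Connolly and Begg's own implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic customers: [age, annual_spend_$]
X = rng.normal(loc=[[35, 500]], scale=[[10, 200]], size=(200, 2))
bought = (X[:, 1] > 550).astype(int)  # toy purchase label

# Step 1: database segmentation via clustering.
segment = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Step 2: predictive modeling, using the segment as an extra feature.
features = np.column_stack([X, segment])
model = DecisionTreeClassifier(max_depth=3).fit(features, bought)
print(model.score(features, bought))  # training accuracy of the profile model
```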

A consolidated list (Ahlemeyer-Stubbe & Coleman, 2014; Berson, Smith, & Thearling, 1999; Connolly & Begg, 2014; Fayyad et al., 1996) of the different types of data mining operations, algorithms, and purposes is given below; a small illustration of the anomaly-detection operation follows the list.

  • Prediction – “What could happen?”
    • Classification – data is classified into different predefined classes
      • C4.5
      • Chi-Square Automatic Interaction Detection (CHAID)
      • Support Vector Machines
      • Decision Trees
      • Neural Networks (also called Neural Nets)
      • Naïve Bayes
      • Classification and Regression Trees (CART)
      • Bayesian Network
      • Rough Set Theory
      • AdaBoost
    • Regression (Value Prediction) – data is mapped to a prediction formula
      • Linear Regression
      • Logistic Regression
      • Nonlinear Regression
      • Multiple linear regression
      • Discriminant Analysis
      • Log-Linear Regression
      • Poisson Regression
    • Anomaly Detection (Deviation Detection) – identifies significant changes in the data
      • Statistics (outliers)
  • Descriptive – “What has happened?”
    • Clustering (database segmentation) – identifies a set of categories to describe the data
      • Nearest Neighbor
      • K-Nearest Neighbor
      • Expectation-Maximization (EM)
      • K-means
      • Principal Component Analysis
      • Kolmogorov-Smirnov Test
      • Kohonen Networks
      • Self-Organizing Maps
      • Quartile Range Test
      • Polar Ordination
      • Hierarchical Analysis
    • Association Rule Learning (Link Analysis) – builds a model that describes the data dependencies
      • Apriori
      • Sequential Pattern Analysis
      • Similar Time Sequence
      • PageRank
    • Summarization – smaller description of the data
      • Basic probability
      • Histograms
      • Summary Statistics (max, min, mean, median, mode, variance, ANOVA)
  • Prescriptive – “What should we do?” (an extension of predictive analytics)
    • Optimization
      • Decision Analysis
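
As one small illustration of the operations above, here is a minimal anomaly-detection sketch using basic outlier statistics (z-scores), the "Statistics (outliers)" entry in the list. The sensor readings and the 2-standard-deviation threshold are made up for illustration.

```python
import statistics

readings = [10.1, 9.8, 10.0, 10.2, 9.9, 14.7, 10.1]  # hypothetical values
mean = statistics.mean(readings)
stdev = statistics.stdev(readings)

# Flag anything more than 2 standard deviations from the mean.
outliers = [x for x in readings if abs(x - mean) / stdev > 2]
print(outliers)  # -> [14.7]
```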

Finally, Ahlemeyer-Stubbe and Coleman (2014) state that even though there is a ton of versatile data mining software available that can perform any of the abovementioned operations and algorithms, good data mining software should be deployable across different environments and include tools for data preparation and transformation.


Business Intelligence: Corporate Planning

Corporate Planning

The main difference between business planning and corporate planning is the actors. Both define strategies that will help meet the business goals and objectives. However, business planning describes how the business will do it, focusing on business operations, marketing, and products and services (Smith, n.d.). Meanwhile, corporate planning describes how the employees will do it, focusing on staff responsibilities and procedures (Smith, n.d.). Smith (n.d.) implies that corporate planning will succeed if it is aligned with the company's strategy and mission, drawing on the company's strengths and improving on its weaknesses. A successful and realistic corporate and business plan can help the company succeed, and Business Intelligence can help in creating these plans. To make the right plans, we must make better decisions that help the company: data-driven decisions (through Business Intelligence). Business Intelligence helps provide answers to questions much faster and more easily, makes better use of corporate time, and aids in making improvements for the future (Carter, Farmer, & Siegel, 2014).

Small, medium, and large organizations deal with planning differently, so BI solutions are not one-size-fits-all. Small companies have the freedom, creativity, motivation, and flexibility that large companies lack (McNurlin, Sprague, & Bui, 2008). Large companies have the economies of scale and knowledge that small companies do not (McNurlin et al., 2008). Large companies are beginning to advocate centralized corporate planning with decentralized execution, a structure similar to that of a medium-sized company (McNurlin et al., 2008). Thus, medium-sized companies have the benefits of both large and small companies, but also the disadvantages of both. Unfortunately, a huge drawback of large organizations is a fear of collaboration and a tight hold on proprietary information (Carter et al., 2014). Holding tightly to proprietary information and lacking collaboration is not conducive to a solid Knowledge Management or Business Intelligence plan.

Business Intelligence

Business Intelligence uses data to create information that supports data-driven decisions, which can be especially important for corporate planning. Thus, we can reap the benefits of Business Intelligence if we balance the needs of the company, the corporate vision, and the size of the company when choosing which model the company should use. In a centralized model, one team in the entire corporation owns all the data and provides all the needed analytical services (Minelli, Chambers, & Dhiraj, 2013). In a decentralized model of Business Intelligence, each business function owns its data infrastructure and its own team of data scientists (Minelli et al., 2013). Finally, Minelli et al. (2013) define a federated model as one where each function is allowed to access the data to make data-driven decisions, but is also kept aligned to a centralized data infrastructure.

Knowledge Management

McNurlin et al. (2008) define knowledge management as managing the transition between two states of knowledge: tacit (information kept privately in one's mind) and explicit (information made public, which is articulated and codified). We need to discover the key people who hold the key knowledge, to aid the knowledge sharing that benefits the company. Knowledge management can rely on technology to capture and share knowledge appropriately, so that it can be used to sustain both the individual and the business performance (McNurlin et al., 2008).

Knowledge management can also include domain knowledge (knowledge of a particular field or subject). Incorporating domain knowledge into data mining, a component of a Business Intelligence system, has aided in pruning association rules to extract meaningful data for data-driven decisions (Cristina, Garcia, Ferraz, & Vivacqua, 2009). In that study, engineers helped build the domain understanding needed to interpret the results and to steer the search for specific if-then rules, which helped find more significant patterns in the data (Cristina et al., 2009).
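
A minimal sketch of what such domain-driven pruning might look like in Python. The rules, confidence values, and the "plausible causes" constraint are all invented for illustration; this is not the method of Cristina et al. (2009), only the general idea of filtering mined rules against expert knowledge.

```python
# Mined if-then association rules as (antecedent, consequent, confidence).
rules = [
    ({"high_temp"}, "bearing_failure", 0.81),
    ({"night_shift"}, "bearing_failure", 0.35),
    ({"high_temp", "low_oil"}, "bearing_failure", 0.92),
    ({"blue_paint"}, "bearing_failure", 0.60),  # spurious co-occurrence
]

# Domain knowledge from engineers: only these factors can plausibly
# cause bearing failure, so rules citing anything else are pruned.
plausible_causes = {"high_temp", "low_oil", "high_vibration"}

pruned = [r for r in rules
          if r[0] <= plausible_causes and r[2] >= 0.5]
print(pruned)  # keeps only plausible, confident rules
```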

The addition of domain experts helped capture tacit knowledge and transform it into explicit knowledge, which was then used to find significant patterns in the mined data. This eventually leads to a more manageable set of highly significant information from which data-driven decisions can be made to support corporate planning. Thus, knowledge management can be an integral part of Business Intelligence. Finally, Business Intelligence uses data to create information; when that information is combined with the experience of employees (through knowledge management), it creates explicit knowledge, which can drive more meaningful data-driven decisions than a Business Intelligence system alone.

The effectiveness of capturing and adding domain knowledge to a company's Business Intelligence system depends on the quality of the company's employees and their willingness to share that knowledge. At the end of the day, a corporate plan that focuses on staff responsibilities and procedures in both Business Intelligence and Knowledge Management will gain more insight and a higher return on investment, which will eventually feed back into the corporate and business plans.

References

  • Carter, K. B., Farmer, D., & Siegel, C. (2014). Actionable intelligence: A guide to delivering business results with big data fast! John Wiley & Sons P&T. VitalBook file.
  • Cristina, A., Garcia, B., Ferraz, I., & Vivacqua, A. S. (2009). From data to knowledge mining. https://doi.org/10.1017/S089006040900016X
  • McNurlin, B., Sprague, R., & Bui, T. (2008). Information systems management (8th ed.). Pearson Learning Solutions. VitalBook file.
  • Minelli, M., Chambers, M., & Dhiraj, A. (2013). Big data, big analytics: Emerging business intelligence and analytic trends for today's businesses. John Wiley & Sons P&T. VitalBook file.
  • Smith, C. (n.d.). The difference between business planning and corporate planning. Small Business Chron. Retrieved from http://smallbusiness.chron.com/differences-between-business-planning-corporate-planning-882.html

Business Intelligence: Zero Latency & Item Affinity

Types of data (based on Laursen & Thorlund, 2010)

  • Real-time data: data that reveals events as they are happening, like chat rooms, radar data, dropwindsonde data
  • Lag information: information that explains events that have recently happened, like satellite data, weather balloon data
  • Lead information: information that helps predict future events based on lag data, like regression output, forecasting model output, GPS ETA times (a small sketch follows this list)
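
Here is a minimal sketch of turning lag data into lead information with a simple regression forecast. The sales numbers and the one-step linear extrapolation are invented for illustration, following the lag/lead distinction above rather than any method of Laursen and Thorlund's.

```python
import numpy as np

# Lag information: the last six observed daily sales (hypothetical).
days = np.arange(6)
sales = np.array([102, 108, 111, 117, 121, 128])

# Fit a least-squares line to the lag data...
slope, intercept = np.polyfit(days, sales, deg=1)

# ...and produce lead information: a forecast for the next day.
next_day = 6
print(slope * next_day + intercept)  # -> 132.0
```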

Everyone can easily find lag data; it's old news. What is interesting is developing lead information from real-time data, which has the biggest impact on any business trying to gain a competitive edge over its competitors. When a company can apply data mining and analytical formulas to real-time data, it has a head start on generating lead information (Ahlemeyer-Stubbe & Coleman, 2014), allowing it to make data-driven decisions much faster. To do so, one needs to fully automate the modeling toward a predictive target in an efficient manner (which is of particular importance when dealing with Big Data). An example of zero-latency (real-time) data analysis is seen on the production line of any manufacturing plant (e.g., Toyota, Tesla), where data is stored in an enterprise resource planning (ERP) system (Carter, Farmer, & Siegel, 2014). Speed is vital: zero latency means a manufacturing plant can meet its demand without incurring additional costs, keeping profit margins up and manufacturing programs in the black. Carter et al. (2014) claim that General Electric could extract $150 billion of unrealized efficiencies just by analyzing its data, and it could get there much faster by driving latency down to zero. But there is a caveat: the data must be not only real-time but accurate (Carter et al., 2014).
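
A toy sketch of zero-latency monitoring on a production line: each new sensor reading is scored the moment it arrives, rather than batched for later analysis. The simulated readings, the injected fault, and the 3-sigma alert rule are all hypothetical.

```python
import random

def reading_stream(n=100):
    """Simulates sensor readings arriving one at a time from the line."""
    for i in range(n):
        # Inject a hypothetical fault partway through the run.
        yield 11.5 if i == 60 else random.gauss(10.0, 0.1)

count, mean, m2 = 0, 0.0, 0.0  # Welford's running mean/variance
for x in reading_stream():
    count += 1
    delta = x - mean
    mean += delta / count
    m2 += delta * (x - mean)
    if count > 10:  # wait for a stable baseline before alerting
        stdev = (m2 / (count - 1)) ** 0.5
        if abs(x - mean) > 3 * stdev:
            print(f"reading {count}: {x:.2f} out of tolerance, act now")
```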

Item affinity (market basket analysis) uses rule-based analytics to understand which items frequently co-occur in transactions (Snowplow Analytics, 2016). Item affinity is akin to Amazon.com's current method of driving more sales by getting customers to consume more. But to successfully implement market basket analysis, a company like Amazon.com must implement real-time (zero-latency) analysis to find those co-occurring items and make suggestions to the consumer. As the consumer adds more items to the shopping cart, Amazon applies probabilistic mining (item affinity analysis) in real time to find other items the consumer might like to purchase in conjunction with the primary purchase (Pophal, 2014). For instance, buyers of a $40 swimsuit also bought this suntan lotion and beach towel. Item affinity analysis affects not only the online shopping experience but also shopping catalog placements, email marketing, and store layout (Snowplow Analytics, 2016).
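
A minimal co-occurrence sketch of item affinity in plain Python. The baskets are invented, and real systems use dedicated algorithms such as Apriori rather than this brute-force pair count; the sketch only makes the "items that co-occur" idea concrete.

```python
from itertools import combinations
from collections import Counter

baskets = [  # hypothetical transactions
    {"swimsuit", "suntan lotion", "beach towel"},
    {"swimsuit", "suntan lotion"},
    {"swimsuit", "sandals"},
    {"suntan lotion", "beach towel"},
]

pair_counts = Counter()
for basket in baskets:
    pair_counts.update(combinations(sorted(basket), 2))

# Confidence of "swimsuit -> suntan lotion": P(lotion | swimsuit).
swimsuit_baskets = sum("swimsuit" in b for b in baskets)
both = pair_counts[("suntan lotion", "swimsuit")]
print(both / swimsuit_baskets)  # -> 0.67: suggest lotion with swimsuits
```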


Business Intelligence: Data Warehouse

A data warehouse is a central database containing a collection of decision-related internal and external data sources, used for analysis across the entire company (Ahlemeyer-Stubbe & Coleman, 2014). The authors state that data warehouse content has four main features:

  • Topic orientation – data that affects the decisions of a company (e.g., customers, products, payments, ads)
  • Logical integration – the integration of the company's common data structures with relevant unstructured big data (e.g., social media data, social networks, log files)
  • Presence of a reference period – time is an important structural component of the data, because historical data is needed and should be maintained for a long time
  • Low volatility – data shouldn't change once it is stored. However, amendments are still possible; therefore, data shouldn't be overwritten, because corrections give us additional information about our data (a small sketch of such append-only storage follows this list)
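
A minimal sketch of how low volatility and a reference period might be honored in practice: records are appended with timestamps and never updated in place. The field names and the amendment convention are invented for illustration, not a prescribed warehouse design.

```python
from datetime import date

warehouse = []  # append-only store: rows are added, never overwritten

def record(customer, balance, as_of, amends=None):
    """Append a fact; an amendment points at the row index it corrects."""
    warehouse.append({"customer": customer, "balance": balance,
                      "as_of": as_of, "amends": amends})

record("C042", 100.0, date(2016, 1, 31))
record("C042", 110.0, date(2016, 2, 29))
# An error is found in January's figure: append a correction instead of
# overwriting, preserving the history of what we believed and when.
record("C042", 105.0, date(2016, 1, 31), amends=0)

current = {}  # latest belief per (customer, reference period)
for row in warehouse:
    current[(row["customer"], row["as_of"])] = row["balance"]
print(current)
```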

Given the type of data stored in a data warehouse, it is designed to support data-driven decisions. Making decisions on a gut feeling can cost millions of dollars and degrade your service. For continuous service improvement, decisions must be driven by data. Your non-profit can use a data warehouse to drive priorities and improve services, yielding short-term as well as long-term wins. The question you need to ask is: "How should we liberate key data from these esoteric systems and allow it to help us?"

To do that, you need to build a BI program: one where key stakeholders at each business level agree on the logical integration of data and on common data structures, and are transparent about the metrics they would like to see, who will support the data, and so on. We look for key stakeholders at the business level, process level, and data level (Topaloglou & Barone, 2015), because we need to truly understand the business and its needs; from there we can understand the data you have now and the data you will need to start collecting. Once the data is collected, we prepare it before entering it into the data warehouse, to ensure low volatility in the data, so that data modeling can be conducted reliably, enabling your evaluation and data-driven decisions on how best to move forward (Padhy, Mishra, & Panigrahi, 2012).

Another non-profit service organization that implemented a successful BI program through the creation of a data warehouse is described by Topaloglou and Barone (2015). The hospital in their study experienced positive effects from its BI program: end users can make strategic data-based decisions and act on them; attitudes shifted toward the use and usefulness of information; data scientists came to be perceived as problem solvers rather than developers; data prompts immediate action; continuous improvement is a byproduct of the BI system; real-time views with drill-down to data details enable more data-driven decisions and actions; and meaningful dashboards were developed that support business queries (Topaloglou & Barone, 2015).

However, Topaloglou and Barone (2015) stressed multiple times that establishing a common data structure and definition, with defined stakeholders accountable for supporting the company's goal in light of how current processes are doing, is key to realizing these benefits. That key lives in the data warehouse: your centralized store of external and internal data, which will give you the insights to make data-driven decisions in support of your company's goal.


Business Intelligence: Online Profiling

Online profiling is the use of a person's online identity to collect information about them, their behaviors, their interactions, their tastes, etc., to drive targeted advertising (McNurlin et al., 2008). Online profiling straddles the line between being useful, annoying, or "Big Brother is watching" (Pophal, 2014). Profiling can be based on simple third-party cookies, which are unknowingly placed when an end user visits a website; depending on the priority of the cookie, it can change the entire end-user experience when they visit a site, such as targeted messages in banner ads (McNurlin et al., 2008). More complex tracking occurs when an end user scans a QR code with a mobile device or walks near an NFC area: the phone then transmits about 40 different variables about that person to the company, which can then provide more precise advertising (Pophal, 2014).

This data collection is all about gaining more information about consumers in order to make better decisions about what to offer them, such as precise advertisements, deals, etc. (McNurlin et al., 2008). The best description comes from a marketer quoted in Pophal (2014): "So if I'm in L.A., and it's a pretty warm day here-85 degrees-you shouldn't be showing me an ad for hot coffee; you should be showing me a cool drink." But advertisers have to find a way to let consumers know about their product without overwhelming them with information overload. How do advertisers say, "Hey, look at me, I am important, and nothing else is... wouldn't this look nice in your possession?" If they do this too much, they can alienate the buyer from using the technology and from buying the product altogether. These advertisers need to find a meaningful, influential connection with their consumers if they want to drive up their revenues.

At the end of the day, all this online profiling aims to collect enough (or more than enough) data to predict what the consumer is most likely to buy and to give them enough incentive to influence the purchasing decision. The operating cost of such a tool must be kept low enough that there is still a profit when the consumer completes a transaction and buys the product. This then becomes an important part of a BI program, because you are aiming to draw consumers away from your competitors and toward your product.

The fear comes when end users don't know what their data is currently being used for, what data these companies or governments hold, etc. Richards and King (2014) and McEwen, Boyer, and Sun (2013) expressed that it is the flow of information and the lack of transparency that feed the public's fear; hence, "Big Brother is watching." McEwen et al. (2013) did propose several possible solutions; one that could gain traction in this case is letting consumers (end users) know which variables are being collected and offering an opt-out feature, where a subset of those variables stays with them and is never transmitted.
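
A small sketch of what such an opt-out filter could look like on the client side. The variable names and the opt-out list are hypothetical, not drawn from McEwen et al. (2013); the point is only that consented variables are transmitted and the rest stay on the device.

```python
# Variables a device might collect for profiling (hypothetical subset).
profile = {
    "device_id": "abc123",
    "location": "47.6N,122.3W",
    "age_range": "25-34",
    "browsing_category": "outdoor gear",
}

# Variables this end user has opted out of sharing.
opted_out = {"location", "age_range"}

def transmit(profile, opted_out):
    """Send only the variables the user has consented to share."""
    return {k: v for k, v in profile.items() if k not in opted_out}

print(transmit(profile, opted_out))
# -> {'device_id': 'abc123', 'browsing_category': 'outdoor gear'}
```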
