Quant: Central tendencies and variances

Central to quantitative research is understanding the numerical, ordinal, or categorical dataset and what the data represent. This can be done through descriptive statistics, where the researcher uses statistics to describe a data set, or through inferential statistics, where conclusions are drawn about the data set (Miller, n.d.). However, researchers should avoid drawing insights and conclusions from extreme or non-representative data, and understanding the central tendency can help avoid this scenario. For instance, in data mining for business and industry, current practice is to compare multiple random samples based on their central tendency (Ahlemeyer-Stubbe & Coleman, 2014). Field (2013) and Schumacker (2014) defined central tendency as an all-encompassing term for describing the “center of a frequency distribution” through the commonly used measures: mean, median, and mode.

Central Tendency

In a symmetrical distribution, the central tendency is where most of the data values tend to occur, so the mean can help describe it (Schumacker, 2014). The mean is the arithmetic average of the distribution: the sum of all the data values divided by the number of data points (Field, 2013). The median is the data value at the center of the distribution when the data values are placed in ascending order (Field, 2013). Medians are easily found when the total number of data points is odd; when it is even, the two most central values are averaged to obtain the median. The mode is the data value that occurs most frequently in the distribution; a distribution can be bimodal (having two modes) or multimodal (having three or more modes) (Field, 2013). Miller (n.d.) stated that the mean is best when the data are interval data (continuous with equal spacing) and the distribution is well balanced and not skewed. However, if the data are heavily skewed, the median is best to use, since it ignores the extreme values on both ends of the distribution (Miller, n.d.). Modes can help the researcher identify patterns and are best for nominal or ordinal data (Miller, n.d.).

Example:  A random sample of fictitious Twitter users’ follower counts consists of {22, 40, 57, 57, 68, 93, 116, 121, 168, 405, 2380, 8746}. The mean is 1022.75, the mode is 57, and the median is 104.5.
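
As a quick check of these values, here is a minimal Python sketch (standard library only) that reproduces the mean, median, and mode for this sample, and compares the mean to the median as a rough indicator of skew:

```python
from statistics import mean, median, mode

# Fictitious Twitter follower counts from the example above
followers = [22, 40, 57, 57, 68, 93, 116, 121, 168, 405, 2380, 8746]

print(mean(followers))    # 1022.75
print(median(followers))  # 104.5 (average of the 6th and 7th ordered values)
print(mode(followers))    # 57 (the only value that appears twice)

# A mean far above the median suggests a positively skewed distribution
print(mean(followers) > median(followers))  # True
```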

Outliers, missing values, multiplication by a constant, and addition of a constant are some factors that can affect the central tendency (Schumacker, 2014). Outliers pull the mean away from the center of the data and toward the extreme values, skewing the distribution, as in the example above (Miller, n.d.; Schumacker, 2014). This is the danger of including extreme values in measures of central tendency, and it creates a need to know more about the center of the data. One approach is to analyze the mean and median together to understand how skewed the data are and in which direction: heavy skew increases the distance between these two values, and if the mean is less than the median, the distribution is negatively skewed (Field, 2013). To understand the distribution better, other measures such as the variance and standard deviation can be used.

Variance and Standard Deviation

The variance and standard deviation are measures of dispersion, which describe how the data values are spread across the distribution and how far they fall from the mean (Field, 2013; Schumacker, 2014).  The difference between a data value and the central tendency is that value’s deviance (Field, 2013).  Squaring each deviance to remove negative signs and summing the squared deviances across all data values yields the sum of squared errors (Field, 2013).  Dividing the sum of squared errors by the number of data values minus one yields the variance of the distribution (Field, 2013).  The variance is therefore the average squared deviation between the central tendency and the data (Field, 2013).  The variance is the standard deviation squared, so the variance is not in the same units as the data set, whereas the standard deviation is (Field, 2013; Miller, n.d.; Schumacker, 2014).  In the example above, the variance about the mean is 6349719 and the standard deviation about the mean is 2519.865.  Removing the two extreme values, the variance about the mean is 12293.34 and the standard deviation about the mean is 110.8754.  This shows that variability scores can change drastically with and without the extreme values, illustrating that variability shrinks when there is agreement in the data (Miller, n.d.).
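
A minimal sketch of those variance and standard deviation calculations, again using Python’s standard library (statistics.variance and statistics.stdev divide by n − 1, matching the sample formulas above):

```python
from statistics import variance, stdev

followers = [22, 40, 57, 57, 68, 93, 116, 121, 168, 405, 2380, 8746]
trimmed = followers[:-2]  # drop the two extreme values, 2380 and 8746

print(round(variance(followers)))   # 6349719
print(round(stdev(followers), 3))   # 2519.865
print(round(variance(trimmed), 2))  # 12293.34
print(round(stdev(trimmed), 4))     # 110.8754
```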

References:

  • Ahlemeyer-Stubbe, A., & Coleman, S. (2014). A practical guide to data mining for business and industry. UK: Wiley-Blackwell. VitalBook file.
  • Field, A. (2013). Discovering statistics using IBM SPSS statistics (4th ed.). UK: Sage Publications Ltd. VitalBook file.
  • Miller, R. (n.d.). Week 1: Central tendency [Video file]. Retrieved from http://breeze.careeredonline.com/p9fynztexn6/?launcher=false&fcsContent=true&pbMode=normal
  • Schumacker, R. E. (2014). Learning statistics using R. California: SAGE Publications, Inc. VitalBook file.

Quantitative vs. Qualitative Analysis

Field (2013) states that quantitative and qualitative methods are complementary, not competing, approaches to solving the world’s problems, although the two methods are quite different from each other. Creswell (2014) explains how quantitative and qualitative methods can be combined to study a phenomenon through what is called a “mixed methods” approach, which is out of scope for this discussion.  Simply put, quantitative methods are used when the research involves numerical variables, and qualitative methods are used when the research involves variables based on language (Field, 2013).  Thus, each method’s goals and procedures are quite different.

Goals and procedures

Quantitative methods derive from a positivist, numerically driven epistemology (Joyner, 2012).   They use closed-ended questions, i.e., hypotheses, and collect their data numerically through instruments (Creswell, 2014). In quantitative research, there is an emphasis on experiments, measurement, and a search for relationships by fitting data to a statistical model and by observing collections of data graphically to identify trends via deduction (Field, 2013; Joyner, 2012). According to Creswell (2014), quantitative researchers build protections against bias and control for alternative explanations through experiments that are generalizable and replicable. Quantitative studies can be experimental, quasi-experimental, causal-comparative, correlational, descriptive, or evaluative (Joyner, 2012).  According to Edmondson and McManus (2007), quantitative methodologies fit best when the underlying research theory is mature.  The maturity of the theory should drive researchers toward one method over the other along the spectrum of quantitative, mixed, or qualitative methodologies (Creswell, 2014; Edmondson & McManus, 2007).

Comparatively, Edmondson and McManus (2007) stated that qualitative methodologies fit best when the underlying research theory is nascent. Qualitative methods derive from a phenomenological view, the perceptions of people (Joyner, 2012).  They use open-ended questions, i.e., interview questions, and collect their data through observations of a situation (Creswell, 2014).  Qualitative research focuses on the meaning and understanding of a situation, where the researcher searches for meaning through interpretation of the data via induction (Creswell, 2014; Joyner, 2012).  Qualitative research can take the form of case studies or ethnographic, action, philosophical, historical, legal, or educational research, among others (Joyner, 2012).

Commonalities and differences

The commonality between these two methods is that each has a question to answer, an identified area of interest (Creswell, 2014; Edmondson & McManus, 2007; Field, 2013; Joyner, 2012).  Each method requires a survey of the current literature to help develop the research question (Creswell, 2014; Edmondson & McManus, 2007). Finally, each requires designing a study to collect and analyze data to help answer that research question (Creswell, 2014; Edmondson & McManus, 2007; Field, 2013; Joyner, 2012).  Therefore, the similarities between these two methods lie in why research is conducted and, at a high level, in what and how research is conducted.  They differ in the particulars of the what and the how.

The research question(s) can become a central question with or without sub-questions, but in quantitative research the question is driven by a series of statistically testable theoretical hypotheses (Creswell, 2014; Edmondson & McManus, 2007). For quantitative data analysis, statistical tests are run to seek relationships, with the hope of testing a theory-driven hypothesis and providing a precise model, via a collection of numerical measures and established constructs (Edmondson & McManus, 2007). Given the need to statistically accept or reject theoretical hypotheses, the sample sizes for quantitative methods tend to be larger than those for qualitative methods (Creswell, 2014).  Qualitative research, by contrast, is driven by exploration and observation (Creswell, 2014; Edmondson & McManus, 2007). For qualitative data analysis, there should be an iterative and exploratory content analysis, with the hope of building a new construct (Edmondson & McManus, 2007).  These are some of the many differences that exist between these two methods.

When are the advantages of quantitative methods maximized

According to Edmondson and McManus (2007), the best time to use quantitative methods is when the underlying theory of the research subject is mature.  Maturity means there is extensive literature to review, established theoretical constructs, and extensively tested measures (Edmondson & McManus, 2007).  In this situation, quantitative methods build effectively on prior work and help fill the gaps in knowledge on a particular topic, in ways that qualitative or mixed methods would not. Applying qualitative methods to a mature theory amounts to reinventing the wheel, and applying mixed methods to it creates unevenness in the status of the evidence (Edmondson & McManus, 2007).

References:

  • Creswell, J. W. (2014). Research design: Qualitative, quantitative and mixed method approaches (4th ed.). California: SAGE Publications, Inc. VitalBook file.
  • Edmondson, A. C., & McManus, S. E. (2007). Methodological fit in management field research. Academy of Management Review, 32(4), 1155–1179. http://doi.org/10.5465/AMR.2007.26586086
  • Field, A. (2013). Discovering statistics using IBM SPSS statistics (4th ed.). UK: Sage Publications Ltd. VitalBook file.
  • Joyner, R. L. (2012). Writing the winning thesis or dissertation: A step-by-step guide (3rd ed.). Corwin. VitalBook file.

Business Intelligence: Compelling Topics

Departments are often organized in silos, so their information sits in siloed systems, which makes it difficult to leverage that information across the company.  A data warehouse, a central database containing a collection of decision-related internal and external data sources, can aid data analysis for the entire company (Ahlemeyer-Stubbe & Coleman, 2014). When a multi-level Business Intelligence (BI) system is built on top of a centralized data warehouse, the siloed data systems disappear and data-driven decisions become possible.  To support data-driven decisions while moving away from department-kept silo data to a centralized data warehouse, Curry, Hasan, and O’Riain (2012) created a system that presents results from a centralized data warehouse at different levels of the organization: the organizational level (stakeholders are executive members, shareholders, regulators, suppliers, and consumers), the functional level (stakeholders are functional managers and the organization manager), and the individual level (stakeholders are the employees).  Data may be centralized, but specialized permissions on data reports can exist in a multi-level system.

The types of data that can be stored in a centralized data warehouse are: real-time data, which reveals events that are happening right now; lag information, which explains events that have recently happened; and lead information, which helps predict future events based on lag data, such as regression output or forecasting model output (based on Laursen & Thorlund, 2010).  All of this helps decision makers determine whether certain target measures are being met.  Target measures are used to improve marketing efforts by tracking measures like ROI, NPV, revenue, lead and lag generation, growth rates, etc. (Liu, Laguna, Wright, & He, 2014).
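
To illustrate the difference between lag information, lead information, and a target measure, here is a minimal, hypothetical Python sketch: the monthly revenue figures are lag information (what already happened), a naive moving-average projection stands in for lead information (a guess about what comes next), and ROI is one possible target measure. The numbers, the marketing spend, and the three-month averaging window are all made up for illustration.

```python
# Hypothetical monthly revenue (lag information: events that already happened)
revenue = [110_000, 118_000, 125_000, 131_000, 140_000]
marketing_spend = 60_000  # hypothetical campaign cost over the same period

# Lead information: a naive forecast of next month's revenue
# using the average of the last three months (illustrative only)
forecast_next_month = sum(revenue[-3:]) / 3
print(f"Forecast for next month: {forecast_next_month:,.0f}")

# Target measure: ROI of the marketing spend against the revenue gained
revenue_gain = revenue[-1] - revenue[0]
roi = (revenue_gain - marketing_spend) / marketing_spend
print(f"ROI: {roi:.1%}")  # negative if the gain did not cover the spend
```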

Decision Support Systems (DSS) were created before BI strategies.  A DSS helps execute projects, expand strategy, improve processes, and improve quality controls in a quick and timely fashion.  A data warehouse’s main role is to support the DSS (Carter, Farmer, & Siegel, 2014).  Unfortunately, the discussion above about data types and ways to store data to enable data-driven decisions does not explain the “how,” “what,” “when,” “where,” “who,” and “why.”  However, a strong BI strategy is imperative to making this all work.  A BI strategy can include, but is not limited to, data extraction, data processing, data mining, data analysis, reporting, dashboards, performance management, actionable decisions, etc. (Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Padhy, Mishra, & Panigrahi, 2012; McNurlin, Sprague, & Bui, 2008).  This definition, along with the fact that the DSS is one of five principles of BI, suggests that the DSS predates BI and that BI is a newer, more holistic view of data-driven decision making.

What can we do with a strong BI strategy? With a strong BI strategy, a company can increase its revenue through online profiling.  Online profiling uses a person’s online identity to collect information about them, their behaviors, their interactions, their tastes, etc., to drive targeted advertising (McNurlin et al., 2008).  Unfortunately, fear arises when end users don’t know what their data is currently being used for, what data these companies or governments hold, and so on.  Richards and King (2014) and McEwen, Boyer, and Sun (2013) expressed that the flow of information and the lack of transparency are what feed public fear. McEwen et al. (2013) proposed several possible solutions; one that could gain traction in this case is letting consumers (end users) know which variables are being collected and offering an opt-out feature, where a subset of those variables stays with the user and is never transmitted.

 

References:

  • Ahlemeyer-Stubbe, A., & Coleman, S. (2014). A practical guide to data mining for business and industry (1st ed.). [VitalSource Bookshelf Online]. Retrieved from https://bookshelf.vitalsource.com/#/books/9781118981863/
  • Carter, K. B., Farmer, D., & Siegel, C. (2014). Actionable intelligence: A guide to delivering business results with big data fast! (1st ed.). [VitalSource Bookshelf Online]. Retrieved from https://bookshelf.vitalsource.com/#/books/9781118920657/
  • Curry, E., Hasan, S., & O’Riain, S. (2012, October). Enterprise energy management using a linked dataspace for energy intelligence. In Sustainable Internet and ICT for Sustainability (SustainIT), 2012 (pp. 1-6). IEEE.
  • Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI Magazine, 17(3), 37. Retrieved from http://www.aaai.org/ojs/index.php/aimagazine/article/download/1230/1131/
  • Laursen, G. H. N., & Thorlund, J. (2010). Business analytics for managers: Taking business intelligence beyond reporting. Wiley & SAS Business Institute.
  • Liu, Y., Laguna, J., Wright, M., & He, H. (2014). Media mix modeling–A Monte Carlo simulation study. Journal of Marketing Analytics, 2(3), 173-186.
  • McEwen, J. E., Boyer, J. T., & Sun, K. Y. (2013). Evolving approaches to the ethical management of genomic data. Trends in Genetics, 29(6), 375-382.
  • McNurlin, B., Sprague, R., & Bui, T. (2008). Information systems management (8th ed.). [VitalSource Bookshelf Online]. Retrieved from https://bookshelf.vitalsource.com/#/books/9781323134702/
  • Padhy, N., Mishra, D., & Panigrahi, R. (2012). The survey of data mining applications and feature scope. arXiv preprint arXiv:1211.5723. Retrieved from https://arxiv.org/ftp/arxiv/papers/1211/1211.5723.pdf
  • Richards, N. M., & King, J. H. (2014). Big data ethics. Wake Forest Law Review, 49, 393.

Business Intelligence: Predictions Follow-Up

  • Potential Opportunities:

o    Health monitoring.  Currently, smart watches track our heart rate, steps, standing time, stairs climbed, sitting time, workouts, biking, sleep, etc.  But what if we had a device that measured the chemicals in our blood daily, without the pain of a finger prick for diabetics?  Such technology could not only measure your blood chemistry but could also send alerts to EMTs and doctors if there is a dangerous chemical imbalance in your blood (Carter et al., 2014).  This would require a strong BI program spanning emergency responders, individuals, and doctors.

o    As Moore’s law continues to drive computational speed forward, companies have more opportunities to interpret real-time data and produce lead information that can drive actionable, data-driven decisions. Companies can finally get answers to strategic business questions in minutes as well (Carter et al., 2014).

o    Both internal data (corporate data) and external data (competitor analysis, customer analysis, social media, affinity and sentiment analysis) will be reported, on a frequent basis, to senior leaders and executives who have the authority to make decisions on behalf of the company.  These results may show up in a dashboard with a set number of indicators/metrics, as successfully implemented in a hospital case study (Topaloglou & Barone, 2015).

  • Potential Pitfalls:

o    Tools for threat detection, like those being piloted in New York City, could increase the level of discrimination (Carter et al., 2014). As big data analytics is used for facial recognition of photographs and live video to identify threats, it can lead to more racial profiling if the a priori knowledge fed into the system contains elements of racial profiling.  This could bias reporting, track a particular demographic at higher rates, and ignore the fact that past behavior does not necessarily indicate future behavior.

o    Data must be validated before it is published to a data warehouse.  Because of the low data volatility of data warehouses, we need to ensure that the data we receive is correct, so expected-value thresholds must be set to capture errors before records are entered (a minimal sketch of such a check appears after this list).  Wrong data in means wrong data analysis and wrong data-driven decisions.  An example of an expected-value threshold could be that Earth’s surface temperature cannot exceed 500 K.

o    Amplified customer experience.  As BI incorporates social media to gauge what is going on in the minds of customers, anything that goes viral and hurts the company can be devastating; essentially, we are giving the customer an amplified voice.  This can range from rumors of software or hardware leaks, as happens with every Apple iPhone generation/release, which can put proprietary information into the hands of competitors, to a nasty comment or post that gets out of control on a social media platform, to celebrity boycotts.  The opportunity, though, lies in receiving key information on how to improve products, identifying leakers of information, and settling nasty rumors, issues, or comments.

  • Potential Threats:

o    Loss of data to hackers who aim to steal identities.  Firewalls must be tighter than ever, and networks more secure than ever, as a company moves to a centralized data warehouse.  Data warehouses are vital for BI initiatives, but if HR data is located in the warehouse (for example, to help HR estimate the likelihood of disgruntled employees leaving and aid retention efforts), then a hacker who gets hold of that data can compromise thousands of people’s information.  This is nothing new, but it is a potential threat that must be mitigated as we proceed into BI systems.  This applies not only to personal data but also to proprietary company data.

o    Consumer advertisement blitz. Companies may use BI to blast their customers with ads in the hope of marketing to people better, using item affinity analysis to send coupons, attract more sales, and drive higher revenues.  There is a personal example here for me:  XYZ is a clothing store, and when I moved into my first house, the previous owner never updated their information in the store’s database.  Since the previous owner was a frequent buyer, and those magazines, coupons, flyers, and sales were working on them, the address kept getting blasted with marketing ads.  When I moved in, I got a magazine every two days.  It was a waste of paper and made me less likely to shop there.  Eventually, I had enough and called customer service.  They resolved the issue, but it took six weeks after that call for my address to be removed from their marketing and customer database.  I haven’t shopped there since.

o    Informational overload.  As companies move forward with implementing BI systems, they must meet with the entire multi-level organization to find out its data needs.  Just because we have the data doesn’t mean we should display it.  The goal is to find the right number of key success factors, key performance indicators, and metrics to help decision makers at all the different levels.  Getting this part wrong can compromise the adoption of BI in the organization, and BI will be seen as a waste of money rather than a tool that could help in today’s competitive market.  This is a hard line to walk, but it is one of the biggest threats.  It was recognized in the hospital case study (Topaloglou & Barone, 2015) and mitigated through extensive planning, buy-in, and documentation.
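
Returning to the data-validation pitfall above, here is a minimal sketch of the kind of expected-value threshold check that could run before records are loaded into a warehouse. The field names and limits are hypothetical; the surface-temperature bound simply mirrors the 500 K example.

```python
# Hypothetical expected-value thresholds applied before loading into the warehouse
THRESHOLDS = {
    "surface_temp_kelvin": (150.0, 500.0),   # e.g., Earth's surface cannot exceed 500 K
    "follower_count": (0, 10_000_000),
}

def validate(record):
    """Return a list of threshold violations for one incoming record."""
    errors = []
    for field, (low, high) in THRESHOLDS.items():
        value = record.get(field)
        if value is None:
            errors.append(f"{field}: missing value")
        elif not (low <= value <= high):
            errors.append(f"{field}: {value} outside expected range [{low}, {high}]")
    return errors

# Records that fail validation are quarantined instead of entering the warehouse
incoming = [{"surface_temp_kelvin": 288.0, "follower_count": 57},
            {"surface_temp_kelvin": 823.0, "follower_count": 57}]
for rec in incoming:
    problems = validate(rec)
    print("load" if not problems else f"quarantine: {problems}")
```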

 

Resources:

Business Intelligence: Predictions

The future of …

  • Data mining:

o    Web structure mining (studying the link structure of web pages) and web usage analysis (studying how web pages are used) will become more prominent in the future.  Victor and Rex (2016) stated that web mining differs from traditional data mining in scale (web information is much larger in volume, making 10M web pages seem small), access (web information is mostly public, whereas traditional data can be private), and structure (web pages contain unstructured and semi-structured data, whereas traditional data mining has some explicit level of structure).  The structure of a website can be described by its PageRank, page number, damping factor, number of pages, out-links, in-links, etc.  A page is considered an authority if it has many in-links, and a hub if it has many out-links, and this helps define the page rank and structure of the website (Victor & Rex, 2016); a minimal PageRank sketch appears after this list.  But page rank is too trivial a calculation.  One day we will be able not only to know a website’s page rank, but also to learn its domain authority, page authority, and domain validity, which will help define how much value a particular site can bring to a person.  If Google were to adopt these measures, we could see search results ranked by the value a site offers, not just by its link structure.

  • Data mining’s link to knowledge management (KM):

o    A move away from KM tools and toolsets toward seeing knowledge as embedded in as many processes and people as possible (Ferguson, 2016). KM relies on sharing, and as we move away from tools, processes will be set up to allow this sharing to happen.  Sharing occurs more frequently in more interactive and social environments (Ferguson, 2016).  Thus, internal corporate social media platforms may become the central data warehouse, hosting all kinds of knowledge.  The open issue, and where further research is needed, is how to get more people engaged on a new social media platform so that knowledge sharing can eventually happen. Currently, forums, YouTube, and blogs are inviting, highly inclusive environments that share knowledge, such as how to solve a particular issue (evident in YouTube video tutorials).  In my opinion, these social platforms and methods of sharing show how social, inclusive, and interactive an environment needs to be for knowledge sharing to happen more organically.

o    IBM (2013) gives us a glimpse of how the knowledge of veteran police officers, crime data stored in a crime data warehouse, and the power of IBM data mining can combine to identify criminals.  Criminals mostly commit similar crimes with similar patterns and motives.  The IBM tools augment officers’ knowledge by narrowing the list of possible suspects for a crime down to about 20 people and ranking them by how likely they are to have committed the new crime.  This has been used in Miami-Dade County, the 7th largest county in the US, and tools like this will become more widespread with time.

  • Business Intelligence (BI) program and strategy:

o    Potential applications of BI and strategy will extend into the health care industry.  Thanks to ObamaCare (not being political here), there will be more data coming in due to an increase in patients having coverage, and thus more chances to integrate hospital data, insurance data, doctor diagnoses, patient care, patient flow, research data, financial data, etc. into a data warehouse and run analytics on the data to support beneficial data-driven decisions (Yeoh & Popovič, 2016; Topaloglou & Barone, 2015).

o    Potential applications of BI and strategy will also affect supply chain management.  The Boeing 787 Dreamliner has outsourced about 30% of its parts and components, unlike the current Boeing 747, which is only about 5% outsourced (Yeoh & Popovič, 2016).  As more companies increase the outsourcing percentages of their product mix, it becomes more crucial to capture data on the fault tolerances of each of those outsourced parts to make sure they meet regulatory standards and provide sufficient reliability, utility, and warranty to the end customer.  This is where a great deal of money and R&D will be spent in the next few years.
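
Returning to the web-structure-mining prediction above, here is a minimal PageRank sketch over a tiny made-up link graph. It only illustrates the in-link/out-link idea described by Victor and Rex (2016); the graph, the 0.85 damping factor, and the iteration count are assumptions for illustration, not how any search engine actually ranks pages.

```python
# Tiny hypothetical link graph: page -> pages it links to (its out-links)
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],   # C has many in-links, so it behaves like an "authority"
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1 / n for p in pages}
    for _ in range(iterations):
        new_rank = {}
        for p in pages:
            # Sum the rank flowing into p from every page that links to it
            incoming = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
            new_rank[p] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank

for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))  # C should rank highest: it has the most in-links
```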

References

  • Ferguson, J. E. (2016). Inclusive perspectives or in-depth learning? A longitudinal case study of past debates and future directions in knowledge management for development. Journal of Knowledge Management, 20(1).
  • IBM (2013). Miami-Dade Police Department: New patterns offer breakthroughs for cold cases. Smarter Planet Leadership Series.  Retrieved from http://www.ibm.com/smarterplanet/global/files/us__en_us__leadership__miami_dade.pdf
  • Topaloglou, T., & Barone, D. (2015). Lessons from a hospital business intelligence implementation. Retrieved from http://www.idi.ntnu.no/~krogstie/test/ceur/paper2.pdf
  • Victor, S. P., & Rex, M. M. X. (2016). Analytical Implementation of Web Structure Mining Using Data Analysis in Educational Domain. International Journal of Applied Engineering Research, 11(4), 2552-2556.
  • Yeoh, W., & Popovič, A. (2016). Extending the understanding of critical success factors for implementing business intelligence systems. Journal of the Association for Information Science and Technology, 67(1), 134-147.

Business Intelligence: OLAP

Within a Business Intelligence (BI) program, online analytical processing (OLAP) and customer relationship management (CRM) are both applications that have strategic uses for the company and depend on the data warehouse, analyzing the multidimensional datasets stored there to provide data-driven answers to queries. Both are systems that require data analytics to turn multidimensional data into insightful information. OLAP’s multidimensional view of the data warehouse is possible because the data is mapped onto n-dimensional data cubes, where it can easily be rolled up, drilled down, sliced and diced, and pivoted (Connolly & Begg, 2014). OLAP has many applications outside of customer relationships, so it is more versatile than CRM, because CRM is more targeted and focused in its approach: analyzing the customer’s relationship to the company and its products.  A CRM’s main goal is to analyze internal and external data stored in the data warehouse to produce insights such as a customer’s predicted affinity to buy, the cost or profit of a customer, predictions of future customer behavior, etc. (Ahlemeyer-Stubbe & Coleman, 2014).  The information gained from the CRM can inform employees about a customer’s affinity toward a product, so they can sell similar items or items identified through a market basket analysis.

OLAP, the online analytical processing application, allows people to examine data in real time from different points of view to drive more data-driven decisions (McNurlin et al., 2008).  With OLAP, computers can perform what-if analysis and support goal-based decisions using data. The key ability of OLAP systems is to help answer the “Why?” question, as well as the typical “Who?” and “What?” questions (Connolly & Begg, 2014).  Connolly and Begg (2014) further explain that OLAP is a specialized implementation of SQL. The data queried is assumed to be static and unchanging; hence, the low-volatility aspect of a data warehouse with multidimensional databases is ideal for OLAP applications.  The value of the data warehouse comes not just from storing the right kind of data, but from conducting analyses to solve queries that will, in the end, help make the data-driven decisions that are best for the company.  According to Connolly and Begg (2014), OLAP tools have been used to study the effectiveness of marketing campaigns, product sales forecasting, and capacity planning.  However, Connolly and Begg (2014) are of the opinion that data mining tools can surpass the capabilities of OLAP tools.
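
To make the pivot, roll-up, and slice operations concrete, here is a minimal pandas sketch over a tiny made-up sales cube. The data, column names, and dimensions are hypothetical and purely illustrative, not drawn from any of the cited sources.

```python
import pandas as pd

# A tiny fact table: one row per (year, region, product) cell of the cube
sales = pd.DataFrame({
    "year":    [2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016],
    "region":  ["East", "East", "West", "West", "East", "East", "West", "West"],
    "product": ["A", "B", "A", "B", "A", "B", "A", "B"],
    "revenue": [100, 80, 90, 60, 120, 95, 105, 70],
})

# Pivot: view the cube with years as rows and regions as columns
print(sales.pivot_table(index="year", columns="region", values="revenue", aggfunc="sum"))

# Roll-up: aggregate away the product and region dimensions to totals per year
print(sales.groupby("year")["revenue"].sum())

# Slice: fix one dimension (region == "East") and examine the remaining sub-cube
print(sales[sales["region"] == "East"])
```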

CRM, on the other hand, covers a wide range of concepts revolving around how companies capture, store, and analyze customer, vendor, and partner relationship data. Information stored in a CRM could be interactions with customers, vendors, or partners, which allows the company to gain insights from previous interactions and even group them into customer segments, market basket analyses, etc. (Ahlemeyer-Stubbe & Coleman, 2014). CRMs can assist in real time with making data-driven decisions with respect to a company’s customers (McNurlin, Sprague, & Bui, 2008).  The goal is to use current data to help the company build more optimal communications and relationships with its customers, vendors, or partners.  Both internal and external company data are usually added to the data warehouse for the CRM. Through the internet, companies can learn more about their customers and non-customers, helping the company become more customer-centric (McNurlin et al., 2008).  McNurlin et al. (2008) described a case study in which Wachovia Bank purchased a pay-by-use CRM system from salesforce.com.  After the system was set up within six weeks, sales reps had 30 more hours to spend selling bank services, and managers could use the data collected by the CRM to tell the sales reps which customers would have the highest yield.

References:

Business Intelligence: Decision Support Systems

Many years ago, a measure of Business Intelligence (BI) systems was how big the data warehouse was (McNurlin, Sprague, & Bui, 2008).   This measure made little sense, as it is not about the quantity of the data but the quality of the data: a lot of bad data in the warehouse means a lot of bad data-driven decisions. Both BI and Decision Support Systems (DSS) help provide data to support data-driven decisions.  However, McNurlin et al. (2008) state that a DSS is one of five principles of BI, along with data mining, executive information systems, expert systems, and agent-based modeling.

  • A BI strategy can include, but is not limited to, data extraction, data processing, data mining, data analysis, reporting, dashboards, performance management, actionable decisions, etc. (Fayyad, Piatetsky-Shapiro, & Smyth, 1996; Padhy, Mishra, & Panigrahi, 2012; McNurlin et al., 2008). This definition, along with the fact that the DSS is one of five principles of BI, suggests that the DSS predates BI and that BI is a newer, more holistic view of data-driven decision making.
  • A DSS helps execute projects, expand strategy, improve processes, and improve quality controls in a quick and timely fashion. A data warehouse’s main role is to support the DSS (Carter, Farmer, & Siegel, 2014).  The three components of a DSS are a data component (comprising databases or a data warehouse), a model component (comprising a model base), and a dialog component (the software system through which a user interacts with the DSS) (McNurlin et al., 2008); a minimal sketch of these three components follows this list.
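
As referenced above, here is a minimal, purely illustrative Python sketch of those three DSS components. The data, the naive growth model, and the supported question are all hypothetical; the point is only to show how the data, model, and dialog pieces relate.

```python
# Data component: the stored facts (a stand-in for a database or data warehouse)
data_component = {"monthly_sales": [110, 118, 125, 131, 140]}

# Model component: a model applied to the stored data
# (here, a naive average-growth projection; illustrative only)
def project_next(sales):
    growth = [b - a for a, b in zip(sales, sales[1:])]
    return sales[-1] + sum(growth) / len(growth)

# Dialog component: the interface through which a user interacts with the DSS
def ask(question):
    if question == "projected sales next month?":
        return project_next(data_component["monthly_sales"])
    return "question not supported in this sketch"

print(ask("projected sales next month?"))  # 147.5
```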

McNurlin et al. (2008) describe a case study in which Ore-Ida Foods, Inc. had a marketing DSS to support its data-driven decisions, built around the data retrieved (internal data and external market data), market analysis (which accounted for 70% of the use of their DSS, where data was combined and relationships were discovered), and modeling (which was frequently updated).  The modeling offered great insight for marketing management.  McNurlin et al. (2008) emphasize that DSSs tend to be well defined but rely heavily on internal data with little or some external data, and that variation testing on the model/data is rarely done.

The incorporation of internal and external data into the data warehouse helps both BI strategies and DSS.  However, one thing that BI strategies provide that a DSS doesn’t is an answer to “What is the right data to collect and present?” A DSS is more of the “how” component, whereas BI systems generate the why, what, and how, because of their constant feedback loop into the business and its decision makers.  This was seen in the hospital case study and was one of the main reasons it succeeded (Topaloglou & Barone, 2015).  As illustrated in that case study, all the data types were consolidated under unifying definitions and types and had defined roles and responsibilities assigned to them.  Each piece of data entered into the data warehouse had a particular reason for being there, which was defined through interviews with all the different levels of the hospital, ranging from the business level to the process level.

BI strategies can also affect supply chain management in a manufacturing setting.  The 787-8, 787-9, and 787-10 Boeing Dreamliners have outsourced roughly 30% or more of their parts and components; this degree of outsourcing across a product mix is new, since the current Boeing 747 is only about 5% outsourced (Yeoh & Popovič, 2016).  As more companies increase the outsourcing percentages of their product mix, it becomes more crucial to capture data on the fault tolerances of each of those outsourced parts.  BI data could also be used to decide which suppliers to keep.  Companies as large as Boeing can have multiple suppliers for the same part; if their inventory analysis finds an unusually larger-than-average variance in the performance of an item, they can either (1) negotiate a lower price to compensate for that variance, or (2) notify the supplier that the contract will be terminated if the variance for that part is not reduced.  The same applies to auto manufacturing plants, steel mills, etc.
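
A minimal sketch of the kind of supplier-variance check described above; the supplier names, measurements, and the flagging rule (variance above the average across suppliers) are all hypothetical.

```python
from statistics import mean, variance

# Hypothetical measurements of the same part dimension (mm) from three suppliers
measurements = {
    "supplier_a": [10.01, 9.99, 10.02, 10.00, 9.98],
    "supplier_b": [10.05, 9.90, 10.20, 9.85, 10.15],
    "supplier_c": [10.00, 10.01, 9.99, 10.02, 10.00],
}

# Sample variance per supplier, then flag any supplier above the average variance
variances = {s: variance(vals) for s, vals in measurements.items()}
avg_variance = mean(variances.values())

for supplier, var in variances.items():
    flag = "review contract / renegotiate price" if var > avg_variance else "ok"
    print(f"{supplier}: variance={var:.5f} -> {flag}")
```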

Resources: