Quant: Variances

When looking at the performance of two groups on a given task, one speaks of two kinds of variance (between-groups and within-groups). What does each represent?

Variance is a measure of average dispersion (Field, 2013; Schumacker, 2014).  It is a numerical value that describes how the observed data values are spread across the distribution and how much, on average, they differ from the mean (Huck, 2011; Field, 2013; Schumacker, 2014).  A smaller variance indicates that the observed data values lie close to the mean, and a larger variance indicates that they are more spread out (Field, 2013). What happens when researchers want to know whether the difference between the means of two groups of data is statistically significant? Researchers could use ANOVA, an analysis of variance that tests whether or not to reject the null hypothesis that the mean of one group is equal to the mean of another group (Huck, 2011; Schumacker, 2014).  ANOVAs usually test categorical independent variables (groups) and continuous dependent variables (Creswell, 2014).  One of the outputs of a one-way analysis of variance is a table that presents the variance between groups and the variance within groups (Huck, 2011).  Schumacker (2014) explained that the variance between groups reflects how much the group means vary around the overall grand mean, while the variance within groups reflects how much the individual scores vary around the mean of their own group.  The between-groups variance has degrees of freedom equal to the number of groups minus 1, whereas the within-groups variance has degrees of freedom equal to the total number of data points minus the number of groups (Huck, 2011).  The ratio of the between-groups variance to the within-groups variance gives the F-statistic, which is used to establish statistical significance and allows the researcher to reject or fail to reject their null hypothesis (Field, 2013; Huck, 2011; Schumacker, 2014).
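To make the two kinds of variance concrete, here is a minimal Python sketch that computes the between-groups and within-groups pieces of a one-way ANOVA by hand. The function name `one_way_anova` and the two small groups of scores are made up for illustration and do not come from the cited texts.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the between- and within-groups pieces of a one-way ANOVA."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of data points
    grand_mean = mean(x for g in groups for x in g)

    # Between-groups: how far each group mean sits from the grand mean,
    # weighted by the size of the group.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-groups: how far each score sits from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between     # between-groups variance
    ms_within = ss_within / df_within        # within-groups variance

    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical task scores for two groups.
group_a = [4, 5, 6]
group_b = [7, 8, 9]
result = one_way_anova([group_a, group_b])
print(result)
```

With these made-up scores, the group means (5 and 8) sit far from the grand mean (6.5) while the scores inside each group sit close to their own mean, so the between-groups variance is large relative to the within-groups variance and the F ratio is large. Judging significance would still require comparing F against the F distribution with the two degrees of freedom, which this sketch leaves out.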

References

  • Creswell, J. W. (2014). Research design: Qualitative, quantitative and mixed method approaches (4th ed.). California: SAGE Publications, Inc. VitalBook file.
  • Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics (4th ed.). UK: Sage Publications Ltd. VitalBook file.
  • Huck, S. W. (2011). Reading Statistics and Research (6th ed.). Pearson Learning Solutions. VitalBook file.
  • Schumacker, R. E. (2014). Learning statistics using R. California: SAGE Publications, Inc. VitalBook file.

Quant: Independent and Dependent Variables

Remember that variables have to be clearly definable and measurable. Remember that a variable has more than one value or level.

Below is a list of examples of a scenario for each of the following sets of variables:

  • One independent variable and one dependent variable
    • Example 1:
      • Independent variable: Demographics of Gender (Male, Female, Other)
      • Dependent variable: management reported job performance level
    • Example 2:
      • Independent variable: Satisfaction level with their job (5 point Likert scale response)
      • Dependent variable: management reported job performance level
    • Example 3:
      • Independent variable: years of service at their company (number of years)
      • Dependent variable: management reported job performance level
  • Two independent variables and one dependent variable
    • Example 1:
      • Independent variable #1: Demographics of Gender (Male, Female, Other)
      • Independent variable #2: years of service at their company (number of years)
      • Dependent variable: management reported job performance level
    • Example 2:
      • Independent variable #1: Satisfaction level with their job (5 point Likert scale response)
      • Independent variable #2: years of service at their company (number of years)
      • Dependent variable: management reported job performance level

Quant: Getting Lost in the Numbers

It is easy to get lost in numbers when you do quantitative research.
These are suggestions that can help keep the focus on people and organizations when you are dealing with numbers representing them.

In quantitative research, the data collected is numerical in nature. Rarely is every member of the population studied; instead, a sample is randomly drawn from that population to represent it in the analysis (Gall, Gall, & Borg, 2006). At the end of the day, the insights gained from this type of research should be impersonal, objective, and generalizable.  To generalize the insights gained from a sample of data, researchers need to use the correct mathematical procedures for working with probabilities and information, known as statistical inference (Gall et al., 2006).  Gall et al. (2006) stated that statistical inference dictates the order of procedures. For instance, a hypothesis and a null hypothesis must be defined before a statistical significance level, which in turn has to be defined before calculating a z or t statistic value.
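The order of procedures described above can be sketched in a few lines of Python using only the standard library. This is a minimal two-sided z-test under stated assumptions: the null-hypothesis mean, the known population standard deviation, and the sample values are all invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

# Step 1: state the hypotheses first (hypothetical values).
# H0: the population mean equals 100.  H1: it does not.
mu_0 = 100.0
sigma = 15.0   # population standard deviation, assumed to be known

# Step 2: fix the significance level before looking at the data.
alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided cutoff

# Step 3: only now compute the test statistic from the sample.
sample = [112, 98, 105, 120, 101, 109, 95, 116, 108]
z = (mean(sample) - mu_0) / (sigma / sqrt(len(sample)))

# Step 4: compare the statistic with the cutoff chosen in step 2.
reject_h0 = abs(z) > z_critical
print(f"z = {z:.3f}, critical z = {z_critical:.3f}, reject H0: {reject_h0}")
```

Each number in `sample` still stands for one person's score, which is easy to forget once the data has been reduced to a single z value. When the population standard deviation is not known, a t statistic would be used instead of z.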

Essentially, statistical inference allows quantitative researchers to make inferences about a population.  Researchers must remember where the data about that population was generated and collected from during the quantitative research process.  To design a quantitative research project, researchers must understand the purpose and rationale of their own research designs and their research methods (Creswell, 2014).  Knowing the purpose and rationale helps in developing the research question(s) and hypothesis.  With a clear research question and hypothesis, a researcher can design and review their data collection from people, organizations, or instruments.  It is in the methods section that researchers can keep their focus on the people and organizations, by identifying the population, considering whether to stratify the population before sampling, describing the sampling design and procedures, describing the selection process for the individuals, and specifying which variables to study (their names, how they relate to the research question, and how they are collected) (Creswell, 2014).  However, it is easy to get lost in the numbers during quantitative research, so here is a list of some of the ways to keep the focus on the people and organizations when researchers deal with the numbers that represent their population:

  • The numerical data used in quantitative research was generated and collected from people, a social group, an organizational entity, or an instrument. A numerical value alone has neither meaning nor value to the research. But when the numerical value is paired with contextual information, it gives researchers a wealth of information on which to conduct their statistical analysis (Ahlemeyer-Stubbe & Coleman, 2014; Miller, n.d.a.).
  • Remember that each data point, row, or column represents a person, group, or thing with all its features and bugs. It would be wise to create a metadata file that describes the variables behind the data points to help keep the focus on the people and organizations.  In SPSS, the metadata section is called the “Variable View”, and each person is represented as an entity or row of data in the “Data View” (Field, 2013; Miller, n.d.b.).
  • Data sets are never neutral, theory-free repositories; they require researchers to interpret the data through their personal lenses (Crawford, Miltner, & Gray, 2014). One must gather and analyze data ethically to avoid social and legal concerns. Thus, the researcher must be aware of how their analysis of the data could be used to cause harm to others or to facilitate discrimination against disenfranchised groups of people (Robinson, 2015).

References:

  • Ahlemeyer-Stubbe, A., & Coleman, S. (2014). A practical guide to data mining for business and industry. UK: Wiley-Blackwell. VitalBook file.
  • Crawford, K., Miltner, K., & Gray, M. L. (2014). Critiquing Big Data: Politics, Ethics, Epistemology. Special Section Introduction. International Journal of Communication, 8, 1663–1672.
  • Creswell, J. W. (2014). Research design: Qualitative, quantitative and mixed method approaches (4th ed.). California: SAGE Publications, Inc. VitalBook file.
  • Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics (4th ed.). UK: Sage Publications Ltd. VitalBook file.
  • Gall, M. D., Gall, J., & Borg, W. (2006). Educational research: An introduction (8th ed.). Pearson Learning Solutions. VitalBook file.
  • Miller, R. (n.d.a.). Week 1: Central tendency [Video file]. Retrieved from http://breeze.careeredonline.com/p9fynztexn6/?launcher=false&fcsContent=true&pbMode=normal
  • Miller, R. (n.d.b.). Week 2: All about SPSS. [Video file]. Retrieved from http://breeze.careeredonline.com/p99kywtldbw/?launcher=false&fcsContent=true&pbMode=normal
  • Robinson, S. C. (2015). The good, the bad, and the ugly: Applying Rawlsian ethics in data mining marketing. Journal of Mass Media Ethics, 30(1), 19–30. http://doi.org/10.1080/08900523.2014.985297

Quant: Introduction to SPSS

SPSS is at the mercy of your input. What are variables and in what ways can you enter data into SPSS? Once you have numeric data in SPSS, what steps are required to define the meanings of the numbers for SPSS? (This requires explaining the components of Variable View in SPSS.) Why is it important to SPSS that you define these meanings?

IBM SPSS supports the entire quantitative analytical process, helping users gain insights from their data and make better data-driven decisions (IBM, n.d.).  SPSS allows for quick statistical practice and analysis of data without getting bogged down in the statistical equations (Field, 2013). SPSS allows the end user to graphically tell a story about their data by discovering hidden relationships and patterns through tables, graphs, charts, and maps that allow pivoting (IBM, n.d.).  This tool also provides high accuracy, flexibility, and advanced statistical procedures, which are available through the graphical user interface or through programmable options such as internal command syntax and external programming interfaces with R, Python, Java, .NET, etc. for automating procedures (IBM, n.d.).  However, Field (2013) warned that software like SPSS, which can automate statistical equations and procedures, should not be used without fully understanding the statistical theory.

Variables and how to insert them into SPSS

A variable is a measurable and observable characteristic, attribute, or object that can differ across time, space, entity, person, organization, etc. (Creswell, 2014; Field, 2013). How a variable interacts with other variables helps define what type of variable it is.  There are many types of variables, such as dependent variables, independent variables, intervening/mediating variables, moderating variables, control variables, confounding variables, and extraneous variables (Creswell, 2014; Field, 2013). Dependent variables measure the outcome variation and are explained and influenced by independent variables (Schumacker, 2014). Thus, the dependent variables depend on the independent variables (Creswell, 2014).   Independent variables are those that can be manipulated to help explain the dependent variable’s variation (Schumacker, 2014). Thus, the independent variables are the probable cause of, or influence on, the dependent variable (Creswell, 2014).  Intervening/mediating variables stand between the independent and dependent variables as a probable causal link between the two (Creswell, 2014).  Moderating variables are a type of independent variable that influences the direction or strength of the relationship between the independent and dependent variables (Creswell, 2014). Control variables are a type of independent variable that is restricted in some way to help find possible influences on the dependent variable.  Confounding variables are not measured or observed, so their influence cannot be directly detected.  Finally, extraneous variables are a type of independent variable that is not controlled in quasi-experimental research and can influence the variation of the dependent variable (Schumacker, 2014).

In SPSS, one could enter a variable in the data editor through the “Data View” window (see Figure 1) or through the “Variable View” window (see Figure 2).  In the “Data View”, data can be entered in the cells below the variable name, and new variables can be added by right-clicking on the topmost cell and selecting “Insert Variable,” though this should be avoided (Field, 2013; Miller, n.d.).  The “Variable View”, in contrast, allows the end user not only to add new variables but also to add defining descriptions and characteristics for each variable (Field, 2013; Miller, n.d.).  Every row in “Variable View” is a variable, and to add a new variable, just select the cell below the last variable shown and start typing the variable’s name (Field, 2013).

Figure 1: SPSS “Data View” on a sample dataset called bodyfat.sav.

Figure 2: SPSS “Variable View” on a sample dataset called bodyfat.sav.

Data consists of numbers.  Numbers alone do not mean a thing.  The number 3 alone doesn’t mean a thing; however, three apples, three diamonds, or 3 °C means something. Once numerical data has been collected and entered into SPSS, it must be defined.  It is good practice to define the data in the “Variable View” immediately after collecting it and populating SPSS, because memory fades as time goes on, and if a variable is not defined it is easy to forget what all those numbers mean.  Thus, defining the meaning of the data through the “Variable View” allows the end user to remember what the data in each column of SPSS is, and tells SPSS how to treat, categorize, analyze, and display the variable. To do that, the end user needs to enter the: name of the variable, type of variable (numeric, string, currency, date, etc.), width of the variable (number of digits or characters in the cell), decimals (how many decimals are displayed), label (a place to write the full name or description of the variable), values (the numbers assigned to represent groups), missing (which values should be treated as missing data), columns (width of the display column), align (cell data display alignment), measure (nominal, ordinal, or scale), and the variable’s role (input, target, both, split, partition, or none, which is used for regression analysis) (Field, 2013; Miller, n.d.).
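As a rough analogy, and not SPSS's own syntax or API, the “Variable View” attributes can be pictured as a small record kept for each column of data. The Python sketch below is purely illustrative: the class `VariableDefinition`, its method `describe`, and the gender coding (1, 2, 3, with 99 as the missing code) are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class VariableDefinition:
    """Hypothetical stand-in for one row of the SPSS Variable View."""
    name: str
    type: str = "numeric"
    width: int = 8
    decimals: int = 2
    label: str = ""
    values: dict = field(default_factory=dict)  # code -> group name
    missing: tuple = ()                         # codes treated as missing
    measure: str = "scale"                      # nominal, ordinal, or scale

    def describe(self, raw):
        """Translate a raw code into its meaning, or None if it is missing."""
        if raw in self.missing:
            return None
        return self.values.get(raw, raw)


# Without this definition, the column is just the numbers 1, 3, 99, 2.
gender = VariableDefinition(
    name="gender",
    decimals=0,
    label="Self-reported gender",
    values={1: "Male", 2: "Female", 3: "Other"},
    missing=(99,),
    measure="nominal",
)

decoded = [gender.describe(code) for code in [1, 3, 99, 2]]
print(decoded)
```

The point of the sketch is the same as the point of the “Variable View”: the stored numbers only become meaningful, and only get treated correctly as categories rather than quantities, once the definition is written down next to them.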

References: