Quant: Lack of detail

You have found a notice about a research study examining two styles of leadership. The researchers only told you that they are trying to recruit subjects for a research project to determine which leadership style is more effective. They have put out this scant, general description of the project on a website, asking for volunteers as subjects.

Concerns about the lack of detail

In this scenario there is a lack of detail, and as Miller (n.d.) noted, to get subjects to participate in a research project, "People need to know the specifics."  From the scenario described above, there is no indication of who these researchers are or what their credentials are.  Without at least a brief biography on the website, it is hard to discern whether these researchers are credible enough to conduct the research.  The call for subjects also appears to lack a statement of purpose, which sets out the stage, intent, objectives, and major idea of the study (Creswell, 2014).  The statement of purpose gives the reader (the prospective subject) the reason why these researchers want to examine the two styles of leadership; it builds on the problem statement and defines the specific research questions the researchers are studying (Creswell, 2014).  Creswell (2014) stated that effective purpose statements for quantitative research are written in deductive language and should include the variables, the relationships between the variables, the participants, and the research location.  In quantitative research, the intent is demonstrated in the purpose statement by describing the relationships (or lack thereof) between the variables, to be tested through either a survey or an experiment.  Miller (n.d.) and Creswell (2014) stated that identifying a theory or conceptual framework is needed to build a strong statement of purpose.  Miller (n.d.) goes further, explaining that the notice should state which two leadership style theories or dimensions will be evaluated in the study.

There is no mention of whether the recruitment of subjects is part of a pilot study, which is used to help develop and try out methods and procedures, or the main study, where the actual data for the study are collected (Gall, Gall, & Borg, 2006).  The methodology section of this call for subjects should have addressed this.  It should also address what type of instrument these researchers are using to collect data from the subjects.  There are two main types of data collection: surveys and experiments.  It is more likely that this study recruiting subjects to examine two leadership styles will use surveys as its means of quantitative data collection.  Creswell (2014) defines surveys as numerical data collected, studied, and analyzed from a sample of a population to determine participants' opinions and attitudes.  If done correctly, statistical inference can then be used to generalize the results from the sample to the population these researchers are trying to understand with respect to these two leadership styles (Gall et al., 2006). Miller (n.d.) suggested that the surveys could ask about the subjects' opinions of or attitudes toward certain leadership style traits, or the survey could present a few scenarios and have the subjects select a multiple-choice answer.

The survey instrument should be both valid and reliable.  Ideally it has been used before in other studies, perhaps with slight modifications to fit the parameters of this study, and it should be listed on the website.  However, even a slightly modified instrument may not hold the same validity and reliability as the original.  Moreover, if the study's instrument lacks validity or reliability, why should the subjects participate and waste their time?  Validity and reliability ensure that the results captured through the instrument will be valid and meaningful (Creswell, 2014; Miller, n.d.).  If the current instrument is not fully valid and reliable, this could indicate a pilot study intended to refine the instrument and build its validity and reliability (Gall et al., 2006).   According to Creswell (2014), there are internal, external, and statistical conclusion threats to validity that must be controlled or mitigated in order to draw correct inferences about the population.

There is no mention of the population these researchers are trying to study with respect to the two leadership styles.  If a prospective subject does not fall within the conditions of the population, applying would be a waste of time, and the subject has no way of knowing this.  Creswell (2014) states that, depending on the population of a given study, certain instruments work better than others, while some are simply not well-suited enough to provide the validity and reliability needed to generalize results to that population.  The researchers could narrow their population by stating, for example, "This study aims to understand the relationship between X, Y, and Z, as displayed in A & B leadership styles, among the Latin(x)-American population in the state of Oklahoma, ages 25-35 and 45-55."  Subjects who do not fall within this population would then not need to apply, saving time for both prospective subjects and the researchers.  The notice does not mention how the population has been narrowed into a few dimensions to fit the study.  Thus, one can assume that these researchers may be trying to study the general population, which has a huge number of diverse dimensions that are impossible to study (Miller, n.d.).  Unless otherwise stated, any assumption must go on the facts of this scenario.  The scenario also does not mention how the researchers plan to obtain a random selection from this population; submitting a call through their website would only draw a particular type of respondent, who may or may not represent the population these researchers are trying to study.  The closer the sample represents the study's target population, the more powerful the statistical inference, and the more representative of the population the resulting inferences will be (Gall et al., 2006; Miller, n.d.).

Finally, there is a need for participation information that would entice subjects: how long will the survey take, is there compensation, and will the subjects be informed of the results at the end of the study?  If the survey takes too much time and the population these researchers are trying to sample does not have that time readily available, the participation rate will decrease.  The longer an assessment takes to complete, the greater the need to compensate the subjects.  There are two common ways to compensate subjects in a study: hand out small amounts of compensation to each participant, or hold a random drawing at the conclusion of the study for 2-3 prizes of substantial size (Miller, n.d.).  Whether or not any form of compensation is available, the researchers should consider whether there are at least some results or "lessons learned" that subjects would gain through participating in their study.

Quant: Parametric and Non-Parametric Stats

There are numerous times when the information collected from a real organization will not conform to the requirements of a parametric analysis. That is, a practitioner would not be able to analyze the data with a t-test or F-test (ANOVA). Presume that a young professional came to you and said he or she had read about tests—such as the Chi-Square, the Mann-Whitney U test, the Wilcoxon Signed-Rank test, and Kruskal-Wallis one-way analysis of variance—and wanted to know when you would use each and why each would be used instead of the t-tests and ANOVA.

Parametric statistics are inferential, based on random sampling from a well-defined population, and use the sample data to make strict inferences about the population's parameters; thus tests like t-tests, F-tests (ANOVA), and chi-square can be used (Huck, 2011; Schumacker, 2014).  Nonparametric statistics, or "assumption-free tests," are used with ranked data, in tests like the Mann-Whitney U-test, the Wilcoxon signed-rank test, the Kruskal-Wallis H-test, and chi-square (Field, 2013; Huck, 2011).

First, there is a need to define the types of data.  Continuous data is interval/ratio data, and categorical data is nominal/ordinal data.  Modified from Schumacker (2014) with data added from Huck (2011):

Statistic                          Dependent Variable   Independent Variable
Analysis of Variance (ANOVA)
     One-way                       Continuous           Categorical
t-Tests
     Single sample                 Continuous           (none)
     Independent groups            Continuous           Categorical
     Dependent (paired groups)     Continuous           Categorical
Chi-square                         Categorical          Categorical
Mann-Whitney U-test                Ordinal              Ordinal
Wilcoxon                           Ordinal              Ordinal
Kruskal-Wallis H-test              Ordinal              Ordinal

ANOVAs (or F-tests) are used to analyze differences among three or more group means by studying the variation between the groups; the null hypothesis tested is that the group means are equal (Huck, 2011). Student's t-tests, or t-tests, test the null hypothesis that a population mean equals some specified value, and are used when the sample size is relatively small compared to the population size (Field, 2013; Huck, 2011; Schumacker, 2014).  The test assumes a normal distribution (Huck, 2011). With large sample sizes, t-tests/values approach z-tests/values; the same happens with chi-square, since both the t and chi-square distributions include sample size in their functions (Schumacker, 2014).  In other words, at large sample sizes the t-distribution and chi-square distribution begin to look like a normal curve.  Chi-square is related to the variance of a sample, and chi-square tests can be used to test the null hypothesis that the sample comes from a normal distribution (Schumacker, 2014).  The chi-square test is versatile enough to be used as both a parametric and a nonparametric test (Field, 2013; Huck, 2011; Schumacker, 2014).
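
As an illustrative aside (SciPy is my choice of tool here, not one used by the cited authors, and the data are simulated), the parametric tests just described can be sketched as follows:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Three groups with a continuous dependent variable (simulated data)
group_a = rng.normal(loc=100, scale=15, size=30)
group_b = rng.normal(loc=110, scale=15, size=30)
group_c = rng.normal(loc=105, scale=15, size=30)

# Independent-groups t-test: do two group means differ?
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# One-way ANOVA (F-test): do three or more group means differ?
f_stat, f_p = stats.f_oneway(group_a, group_b, group_c)

# Chi-square test of independence: two categorical variables
observed = np.array([[30, 10],
                     [20, 20]])
chi2, chi_p, dof, expected = stats.chi2_contingency(observed)
```

Each test returns a statistic and a p-value; the p-value is compared against the chosen alpha level to decide whether to reject the null hypothesis.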

The Mann-Whitney U-test and the Wilcoxon rank-sum test (not to be confused with the Wilcoxon signed-rank test) are equivalent: both are nonparametric counterparts of the independent t-test, and the two samples do not even have to be the same size (Field, 2013).

The nonparametric Mann-Whitney U-test can be substituted for a t-test when a normal distribution cannot be assumed; it was designed for two independent samples without repeated measures (Field, 2013; Huck, 2011). This makes it a great substitute for the independent-groups t-test (Field, 2013). A benefit of choosing the Mann-Whitney U-test is that it is less likely to produce a Type II error, a false negative (Huck, 2011). The null hypothesis is that the two independent samples come from the same population (Field, 2013; Huck, 2011).

The nonparametric Wilcoxon signed-rank test is best for distributions that are skewed, where variance homogeneity cannot be assumed, and where a normal distribution cannot be assumed (Field, 2013; Huck, 2011).  The Wilcoxon signed-rank test compares two related/correlated samples from the same population (Huck, 2011). Each pair of data is chosen randomly and independently, with no repetition between pairs (Huck, 2011).  This makes it a great substitute for the dependent (paired) t-test (Field, 2013; Huck, 2011).  The null hypothesis is that the central tendency of the differences is 0 (Huck, 2011).

The nonparametric Kruskal-Wallis H-test can be used to compare two or more independent samples from the same distribution; it is considered analogous to a one-way analysis of variance (ANOVA) and focuses on central tendencies (Huck, 2011).  It is essentially an extension of the Mann-Whitney U-test to more than two groups (Huck, 2011). The null hypothesis is that the medians of all groups are equal (Huck, 2011).
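
The three substitutions described above can also be sketched with SciPy (again an illustrative aside with invented data, not a procedure from the cited texts):

```python
from scipy import stats

a = [12, 15, 11, 19, 22, 14, 17]   # independent sample 1 (invented)
b = [21, 25, 18, 27, 23, 26, 20]   # independent sample 2 (invented)
c = [30, 28, 33, 29, 35, 31, 27]   # a third independent sample (invented)

# Mann-Whitney U: substitute for the independent-groups t-test
u_stat, u_p = stats.mannwhitneyu(a, b, alternative="two-sided")

# Wilcoxon signed-rank: substitute for the dependent (paired) t-test
pre  = [85, 90, 78, 92, 88, 76, 81]
post = [88, 93, 80, 95, 90, 79, 85]
w_stat, w_p = stats.wilcoxon(pre, post)

# Kruskal-Wallis H: substitute for the one-way ANOVA (three or more groups)
h_stat, h_p = stats.kruskal(a, b, c)
```

Each call mirrors its parametric counterpart: two independent samples for Mann-Whitney, paired samples for Wilcoxon, and three or more groups for Kruskal-Wallis.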

References

Field, A. (2013). Discovering statistics using IBM SPSS statistics (4th ed.). UK: Sage Publications Ltd. VitalBook file.

Huck, S. W. (2011). Reading statistics and research (6th ed.). Pearson Learning Solutions. VitalBook file.

Schumacker, R. E. (2014). Learning statistics using R. California: SAGE Publications, Inc. VitalBook file.

Quant: Validity and Reliability

Presume that you are considered an expert in the area of survey construction and administration. What would you tell a young, aspiring professional about the construction process of a survey that would ensure a valid, reliable assessment instrument? What would you tell him or her about selecting a target group for the administration of the survey, as well as an administrative procedure to maximize the consistency of the survey?

Construction process of a survey that would ensure a valid and reliable assessment instrument

Most flaws in research methodology exist because validity and reliability were not established (Gall, Gall, & Borg, 2006). Thus, it is important to ensure a valid and reliable assessment instrument.  When using any existing survey as an assessment instrument, one should report the instrument's development, its items and scales, and reports on its reliability and validity from past uses (Creswell, 2014; Joyner, 2012).  Permission must be secured for using any instrument, and the instrument placed in the appendix (Joyner, 2012).    The validity of the assessment instrument is key to drawing meaningful and useful statistical inferences (Creswell, 2014). Creswell (2014) stated that there are multiple types of validity an instrument can have: content validity (measuring what we want to measure), predictive or concurrent validity (measurements aligning with other results), and construct validity (measuring constructs or concepts).  Establishing validity in the assessment instrument helps ensure that it is the best instrument for the situation.  Reliability in assessment instruments means that authors report internal consistency and that the instrument has been tested multiple times to ensure stable results every time (Creswell, 2014).
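
Creswell describes internal consistency conceptually; one common index of it is Cronbach's alpha. The formula and the Likert-scale data below are my own illustration, not material from the cited authors:

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha; rows = respondents, columns = survey items."""
    scores = np.asarray(item_scores, dtype=float)
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (n_items / (n_items - 1)) * (1 - item_vars.sum() / total_var)

# Five respondents answering a three-item Likert scale (invented data)
responses = [[4, 5, 4],
             [3, 3, 3],
             [5, 5, 4],
             [2, 2, 3],
             [4, 4, 5]]
alpha = cronbach_alpha(responses)
```

Values near 1 suggest the items measure the same construct consistently; a common rule of thumb treats alpha above roughly 0.7 as acceptable.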

Unfortunately, picking up an assessment instrument that does not match the content exactly will not benefit anyone, nor will the results be accepted by the greater community.  Modifying an assessment instrument that does not quite match can damage the reliability of the new version, and it can take huge amounts of time to establish validity and reliability for that new version (Creswell, 2014).  Likewise, creating a brand-new assessment instrument would require extensive pilot studies and tests, along with an explanation of how it was developed, to help establish the instrument's validity and reliability (Joyner, 2012).

Selecting a target group for the administration of the survey

Through sampling of a population, and using a valid and reliable survey instrument for assessment, attitudes and opinions of a population can be correctly inferred from the sample (Creswell, 2014).  Thus, not only are validity and reliability important, but selecting the right target group for the survey is key.  A targeted group for this survey means that the population from which information will be inferred must be stratified, meaning the characteristics of the population are known ahead of time (Creswell, 2014; Gall et al., 2006). From this stratified population, a random sample of participants should be selected, to ensure that statistical inferences can be made about that population (Gall et al., 2006). Sometimes a survey instrument does not fit those in the target group, and thus it would not produce valid or reliable inferences for the targeted population. One must select a targeted population and determine the size of that stratified population (Creswell, 2014).  Finally, one must consider the sample size of the targeted group.
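
As a rough sketch of this idea (the strata labels, sizes, and 10% sampling fraction below are invented for illustration), stratified random sampling can look like:

```python
import random

random.seed(7)

# A toy population whose strata (age bands) are known ahead of time
population = ([("25-35", i) for i in range(600)] +
              [("45-55", i) for i in range(400)])

def stratified_sample(pop, strata, fraction):
    """Draw a simple random sample within each known stratum."""
    sample = []
    for stratum in strata:
        members = [p for p in pop if p[0] == stratum]
        k = round(len(members) * fraction)
        sample.extend(random.sample(members, k))   # random draw within stratum
    return sample

sample = stratified_sample(population, ["25-35", "45-55"], 0.10)
```

Because sampling happens within each stratum, the sample preserves the population's strata proportions (here 60 and 40 members), which supports generalizing back to the target population.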

Administrative procedure to maximize the consistency of the survey

Once a stratified population and a random sample from that population have been carefully selected, there is a need to maximize the consistency of the survey.  Researchers must consider how the sample can be reached: mail, email, websites, and survey tools like SurveyMonkey.com are all ways to gather data (Creswell, 2014). However, mail has a low rate of return (Miller, n.d.), so face-to-face methods or online providers may be the best bet to maximize the consistency of the survey.

References

Creswell, J. W. (2014). Research design: Qualitative, quantitative and mixed method approaches (4th ed.). California: SAGE Publications, Inc. VitalBook file.

Gall, M. D., Gall, J. P., & Borg, W. R. (2006). Educational research: An introduction (8th ed.). Pearson Learning Solutions. VitalBook file.

Joyner, R. L. (2012). Writing the winning thesis or dissertation: A step-by-step guide (3rd ed.). Corwin. VitalBook file.

Miller, R. (n.d.). Week 5: Research study construction. [Video file]. Retrieved from http://breeze.careeredonline.com/p8v1ruos1j1/?launcher=false&fcsContent=true&pbMode=normal

Quant: Exploring Data with SPSS

Introduction

The aim of this analysis is to run a distribution analysis on diastolic blood pressure (DBP58), examining its distribution separately for individuals with no history of cardiovascular heart disease and individuals with a history of cardiovascular heart disease (CHD). The variable that captures individual history is CHD.

From the SPSS outputs the following questions will be addressed:

  • What can be determined from the measures of skewness and kurtosis about a normal curve? What are the mean and median?
  • Does one seem better than the other to represent the scores?
  • What differences can be seen in the pattern of responses of those with history versus those with no history?
  • What information can be determined from the box plots?

Methodology

For this project, the electric.sav file is loaded into SPSS (Electric, n.d.).  The goal is to look at the relationship between the variables DBP58 (Average Diastolic Blood Pressure) and CHD (Incidence of Coronary Heart Disease). To conduct a descriptive analysis, navigate through Analyze > Descriptive Statistics > Explore.  The variable DBP58 was placed in the "Dependent List" box, and CHD was placed in the "Factor List" box.  In the Explore dialog box, the "Statistics" button was clicked; "Descriptives" with a 95% "Confidence interval for the mean" was selected, along with outliers and percentiles.  Back in the Explore dialog box, the "Plots" button was clicked; under the "Boxplots" section only "Factor levels together" was selected, under the "Descriptive" section both options were selected, and under the "Spread vs. Level with Levene Test" section "None" was selected.  The procedures for this analysis are provided in video tutorial form by Miller (n.d.). The output is shown in the following four tables and five figures.
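
For readers without SPSS, a comparable set of Explore-style descriptives can be approximated in Python. This is an illustrative aside; the values below are invented and are not the electric.sav data:

```python
import numpy as np
from scipy import stats

# A small invented sample of diastolic blood pressure readings
dbp = np.array([80, 85, 87, 90, 95, 78, 88, 92, 87, 110, 84, 89])

desc = {
    "mean": dbp.mean(),
    "median": float(np.median(dbp)),
    "std": dbp.std(ddof=1),                      # sample standard deviation
    "skewness": stats.skew(dbp, bias=False),     # bias-corrected, as SPSS reports
    "kurtosis": stats.kurtosis(dbp, bias=False), # excess kurtosis, as SPSS reports
    "iqr": float(np.percentile(dbp, 75) - np.percentile(dbp, 25)),
}
```

(SPSS's HAVERAGE percentile method can differ slightly from NumPy's default interpolation, so interquartile ranges may not match SPSS exactly.)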

Results

Table 1: Case Processing Summary.

Incidence of Coronary Heart Disease              Cases
                                       Valid          Missing          Total
                                       N    Percent   N    Percent    N    Percent
Average Diast Blood     none          119    99.2%    1     0.8%     120   100.0%
Pressure 58             chd           120   100.0%    0     0.0%     120   100.0%

According to Table 1, 99.2% or more of the data are valid (not missing) both when there is a history of Coronary Heart Disease (CHD) and when there is not. There is one missing data point among the cases with no history of CHD. The data set contains 120 participants.

Table 2: Descriptive Statistics on the Incidents of Coronary Heart Disease and the Average Diastolic Blood Pressure.

Incidence of Coronary Heart Disease                        Statistic   Std. Error
Average Diast Blood Pressure 58
  none  Mean                                                 87.66       1.005
        95% Confidence Interval for Mean   Lower Bound       85.66
                                           Upper Bound       89.65
        5% Trimmed Mean                                      87.31
        Median                                               87.00
        Variance                                            120.312
        Std. Deviation                                       10.969
        Minimum                                              65
        Maximum                                             125
        Range                                                60
        Interquartile Range                                  15
        Skewness                                              .566        .222
        Kurtosis                                              .671        .440
  chd   Mean                                                 89.92       1.350
        95% Confidence Interval for Mean   Lower Bound       87.24
                                           Upper Bound       92.59
        5% Trimmed Mean                                      88.89
        Median                                               87.00
        Variance                                            218.732
        Std. Deviation                                       14.790
        Minimum                                              65
        Maximum                                             160
        Range                                                95
        Interquartile Range                                  18
        Skewness                                             1.406        .221
        Kurtosis                                             3.620        .438

According to Table 2, the mean diastolic blood pressure is about 2.3 points higher (and the standard error 0.345 higher) when there is a history of CHD than when there is not.  The median for both groups is 87.  The mean for patients with CHD is 89.92, and that distribution is positively skewed, as seen in its skewness of 1.406 and kurtosis of 3.620.  For the cases without CHD, the mean blood pressure is 87.66, showing relatively little skewness in the data, as evidenced by the skewness of 0.566 and kurtosis of 0.671.  Upon further inspection of Figures 1 & 2, the skewness appears to be the result of some outliers, and the box plot in Figure 3 confirms these outliers.  The positive kurtosis values of 0.671 and 3.620 indicate that both distributions are leptokurtic, meaning they have higher peaks than a normal distribution.

Figure 1: Histogram on the Incidents of Coronary Heart Disease = none and the Average Diastolic Blood Pressure.

Figure 2: Histogram on the Incidents of Coronary Heart Disease = chd and the Average Diastolic Blood Pressure.

Figure 3: Box plots on the Incidents of Coronary Heart Disease and the Average Diastolic Blood Pressure.

Comparing the two histograms in Figures 1 & 2, the data are more strongly positively skewed when there is CHD than when there is not.  The spread also increases by about 3.8 points (the standard deviation) when there is CHD.  This shows that blood pressure varies more widely in the sample with CHD, whereas blood pressure is more stable in the sample without CHD.  The range of average diastolic blood pressure likewise increases when there is CHD, which is supported by the greater standard deviation and can be seen in Figure 3.  The interquartile range (which represents the middle 50% of the participants) is smaller for the group with no CHD than for the participants with CHD. Participant 120 was excluded from the interquartile range due to its extreme value.

Table 3: Percentiles on the Incidents of Coronary Heart Disease and the Average Diastolic Blood Pressure.

                                                            Percentiles
Incidence of Coronary Heart Disease          5      10     25     50     75     90      95
Weighted Average     Average Diast   none   71.00  75.00  80.00  87.00  95.00  102.00  105.00
(Definition 1)       Blood Pressure  chd    70.05  75.00  80.00  87.00  98.00  109.90  117.95
Tukey’s Hinges       Average Diast   none                 80.00  87.00  94.50
                     Blood Pressure  chd                  80.00  87.00  98.00

In Table 3, the percentiles of average diastolic blood pressure are mapped out by incidence of CHD.  Ninety-five percent of cases fall below a diastolic blood pressure of 105 when there is no history of CHD, versus 117.95 when there is a history of CHD.  These percentiles show that, where there is no CHD, the diastolic blood pressure values are centered more tightly around the median value of 87, which is supported by the tables and figures above.

Table 4: Extreme Values on the Incidents of Coronary Heart Disease and the Average Diastolic Blood Pressure.

Incidence of Coronary Heart Disease         Case Number   Value
Average Diast Blood Pressure 58
  none   Highest   1                            163        125
                   2                            232        119
                   3                            144        115
                   4                            126        110
                   5                            131        109
         Lowest    1                            157         65
                   2                            156         65
                   3                            175         68
                   4                            153         68
                   5                            237         69
  chd    Highest   1                            120        160
                   2                             56        133
                   3                             42        125
                   4                             26        121
                   5                            111        120
         Lowest    1                             73         65
                   2                             34         68
                   3                            101         70
                   4                             33         70
                   5                              7         70a
a. Only a partial list of cases with the value 70 are shown in the table of lower extremes.

Examining the extreme values in Table 4, the five highest and five lowest cases are considered.  Where there is no CHD, the lowest diastolic blood pressure value is 65, which is the same as for those with CHD.  However, the highest value with CHD (160) is 35 points greater than the highest value without CHD (125).

 Frequency    Stem &  Leaf
      .00        6 .
     5.00        6 .  55889
     4.00        7 .  1144
    18.00        7 .  555677777777888899
    21.00        8 .  000000000001122223344
    21.00        8 .  555556666777777888999
    20.00        9 .  00000111111222233334
    14.00        9 .  55666777888899
     8.00       10 .  00012233
     4.00       10 .  5559
     1.00       11 .  0
     1.00       11 .  5
     2.00 Extremes    (>=119)
 Stem width:   10
 Each leaf:     1 case(s)

Figure 4: Stem and leaf plot on the Incidents of Coronary Heart Disease = none and the Average Diastolic Blood Pressure.

 Frequency    Stem &  Leaf
      .00        6 .
     2.00        6 .  58
     9.00        7 .  000012233
    14.00        7 .  55555677788899
    23.00        8 .  00000000000111233333344
    24.00        8 .  555556667777777788999999
    11.00        9 .  00001122223
    13.00        9 .  6677788888999
     5.00       10 .  02333
     7.00       10 .  5557789
     4.00       11 .  0003
     3.00       11 .  578
     2.00       12 .  01
     1.00       12 .  5
     2.00 Extremes    (>=133)
 Stem width:   10
 Each leaf:     1 case(s)

Figure 5: Stem and leaf plot on the Incidents of Coronary Heart Disease = chd and the Average Diastolic Blood Pressure.

Figures 4 and 5 show more detail than the histograms by stating the actual frequency to the left of the stem values, as well as flagging what is considered extreme.  In the case of CHD, a diastolic blood pressure of 133 or greater is considered an outlier; when there is no CHD, values of 119 or more are considered extreme.

Conclusions

There is a difference between the distributions of average diastolic blood pressure for participants with a history of Coronary Heart Disease (CHD) and those without, as represented by the range, skewness, and shape of the two distributions.  Both groups have similar medians and lowest values, but vary greatly in their means, standard deviations, and highest values of diastolic blood pressure.

SPSS Code

DATASET NAME DataSet1 WINDOW=FRONT.
EXAMINE VARIABLES=dbp58 BY chd
  /PLOT BOXPLOT STEMLEAF HISTOGRAM
  /COMPARE GROUPS
  /PERCENTILES(5,10,25,50,75,90,95) HAVERAGE
  /STATISTICS DESCRIPTIVES EXTREME
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

Quant: Crosstabs in SPSS

Introduction

The aim of this analysis is to answer the question: if someone were rich, would they continue or stop working? The question is examined against the respondents' highest degree earned, gender, and job satisfaction.

Methodology

For this project, the gss.sav file is loaded into SPSS (GSS, n.d.).  The goal is to look at the relationships between the following variables: richwork (willingness to work if wealthy), sex (gender), satjob (satisfaction level with the job), and degree (highest education degree earned).   The variable richwork is the dependent variable, and the other three variables are treated as independent variables for this analysis. To conduct a crosstabs analysis, navigate through Analyze > Descriptive Statistics > Crosstabs.  The variable richwork was placed in the "Row(s)" box, and the other three variables were placed in the "Column(s)" box.  In the Crosstabs dialog box, the "Cells" button was clicked; under the "Counts" section "Observed" was selected, and all three boxes were selected under the "Percentages" section. The procedures for this analysis are provided in video tutorial form by Miller (n.d.).  The output is shown in the following four tables.
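
The same crosstab-with-percentages output can be reproduced with pandas. This is an illustrative aside using an invented six-row data frame, not the gss.sav file:

```python
import pandas as pd

df = pd.DataFrame({
    "richwork": ["continue", "stop", "continue", "continue", "stop", "continue"],
    "sex":      ["male", "female", "female", "male", "male", "female"],
})

counts    = pd.crosstab(df["richwork"], df["sex"])                       # observed counts
row_pct   = pd.crosstab(df["richwork"], df["sex"], normalize="index")    # % within row
col_pct   = pd.crosstab(df["richwork"], df["sex"], normalize="columns")  # % within column
total_pct = pd.crosstab(df["richwork"], df["sex"], normalize="all")      # % of total
```

The three `normalize` options correspond to the three percentage boxes selected in the SPSS Cells dialog.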

Results

Table 1: Cases Processing Summary.

                                                             Cases
                                                   Valid          Missing          Total
                                                 N   Percent    N    Percent     N     Percent
IF RICH, CONTINUE OR STOP WORKING
  * Respondent’s highest degree                 625   44.0%    794    56.0%    1419    100.0%
  * Respondent’s sex                            628   44.3%    791    55.7%    1419    100.0%
  * JOB OR HOUSEWORK                            624   44.0%    795    56.0%    1419    100.0%

According to Table 1, about 44% (~625) of the cases are valid in each of the three cross tabulations, and about 56% (~793) have missing data, out of a total of 1,419 respondents.

Table 2: If rich do people continue or stop working with respondent’s highest degree cross tabulation.

                                                Respondent’s highest degree
                                      Less than HS  High school  Junior college  Bachelor  Graduate   Total
CONTINUE WORKING  Count                    52           210            39            84        36       421
                  % within IF RICH       12.4%         49.9%          9.3%         20.0%      8.6%    100.0%
                  % within degree        69.3%         64.6%         81.3%         67.2%     69.2%     67.4%
                  % of Total              8.3%         33.6%          6.2%         13.4%      5.8%     67.4%
STOP WORKING      Count                    23           115             9            41        16       204
                  % within IF RICH       11.3%         56.4%          4.4%         20.1%      7.8%    100.0%
                  % within degree        30.7%         35.4%         18.8%         32.8%     30.8%     32.6%
                  % of Total              3.7%         18.4%          1.4%          6.6%      2.6%     32.6%
Total             Count                    75           325            48           125        52       625
                  % within IF RICH       12.0%         52.0%          7.7%         20.0%      8.3%    100.0%
                  % within degree       100.0%        100.0%        100.0%        100.0%    100.0%    100.0%
                  % of Total             12.0%         52.0%          7.7%         20.0%      8.3%    100.0%

According to Table 2, 67.4% of respondents overall would continue working if they were rich and 32.6% would stop.  In this data set, about 12% have less than a high school diploma, 52% have a high school diploma, 7.7% have a junior college degree, 20% have a bachelor's degree, and 8.3% have a graduate degree. Looking at who would stop working, respondents with only a high school diploma make up 56.4% of those who would leave, the largest share of any degree group in this "what if" scenario.  Meanwhile, 81.3% of those with a junior college degree would stay at their jobs if they were rich, making them the degree group most likely to stay. Approximately 65-69% of those with a high school diploma, bachelor's degree, or graduate degree would continue working if they were rich.

Table 3: If rich do people continue or stop working with respondent’s gender cross tabulation.

                                          Respondent’s sex
                                        Male      Female     Total
CONTINUE WORKING  Count                  214        209        423
                  % within IF RICH      50.6%      49.4%     100.0%
                  % within sex          69.3%      65.5%      67.4%
                  % of Total            34.1%      33.3%      67.4%
STOP WORKING      Count                   95        110        205
                  % within IF RICH      46.3%      53.7%     100.0%
                  % within sex          30.7%      34.5%      32.6%
                  % of Total            15.1%      17.5%      32.6%
Total             Count                  309        319        628
                  % within IF RICH      49.2%      50.8%     100.0%
                  % within sex         100.0%     100.0%     100.0%
                  % of Total            49.2%      50.8%     100.0%

According to Table 3, about 49.2% of the sample were male and 50.8% were female. Broken down by gender, 34.5% of women and 30.7% of men would choose to leave work if they were rich.  Gender does not seem to be a strong indicator of whether a respondent would continue or stop working if they were rich in this "what if" scenario.

Table 4: If rich would people continue or stop working with respondent’s job satisfaction cross tabulation.

                                                     JOB OR HOUSEWORK
                                      Very satisfied  Mod. satisfied  A little dissat.  Very dissatisfied   Total
CONTINUE WORKING  Count                    199             172              36                 14            421
                  % within IF RICH        47.3%           40.9%            8.6%               3.3%         100.0%
                  % within JOB            71.8%           64.9%           60.0%              63.6%          67.5%
                  % of Total              31.9%           27.6%            5.8%               2.2%          67.5%
STOP WORKING      Count                     78              93              24                  8            203
                  % within IF RICH        38.4%           45.8%           11.8%               3.9%         100.0%
                  % within JOB            28.2%           35.1%           40.0%              36.4%          32.5%
                  % of Total              12.5%           14.9%            3.8%               1.3%          32.5%
Total             Count                    277             265              60                 22            624
                  % within IF RICH        44.4%           42.5%            9.6%               3.5%         100.0%
                  % within JOB           100.0%          100.0%          100.0%             100.0%         100.0%
                  % of Total              44.4%           42.5%            9.6%               3.5%         100.0%

Finally, according to Table 4, about 44.4% of respondents are very satisfied at work, 42.5% are moderately satisfied, 9.6% are a little dissatisfied, and 3.5% are very dissatisfied. Cross-tabulating the continue-or-stop question by the respondent's job satisfaction level, 40.0% of respondents who are a little dissatisfied would choose to leave work if they were rich, making them the demographic most likely to leave in this "what if" scenario. In fact, respondents who were anything but very satisfied with their job were approximately 7-12 percentage points more likely to want to leave if they were rich.  Conversely, 71.8% of those who are very satisfied with their jobs would stay at their job if they were rich, making them the demographic most likely to stay.

Conclusions

Overall, this analysis has shown that, in answering whether someone who became rich would continue or stop working, the respondent's highest degree earned and job satisfaction may be contributing factors to the decision in this "what if" scenario.  Gender, however, does not appear to play an important role in answering this question.


SPSS Code

* Bring the active dataset to the front.
DATASET NAME DataSet1 WINDOW=FRONT.

* Cross-tabulate richwork against degree, sex, and job satisfaction,
* reporting cell counts plus row, column, and total percentages.
CROSSTABS
  /TABLES=richwork BY degree sex satjob
  /FORMAT=AVALUE TABLES
  /CELLS=COUNT ROW COLUMN TOTAL
  /COUNT ROUND CELL.
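For readers working outside SPSS, the same CROSSTABS output can be approximated with pandas. This is a hedged sketch: the variable names (richwork, sex) mirror the SPSS syntax above, but the tiny data frame is invented for illustration rather than drawn from the survey file.

```python
import pandas as pd

# Hypothetical stand-in rows; the real analysis used the survey dataset.
df = pd.DataFrame({
    "richwork": ["continue", "continue", "stop", "continue"],
    "sex": ["male", "female", "male", "female"],
})

# Cell counts with row/column totals, like /CELLS=COUNT with margins.
counts = pd.crosstab(df["richwork"], df["sex"], margins=True)

# Column percentages, like the "% within Respondent's sex" rows above.
col_pct = pd.crosstab(df["richwork"], df["sex"], normalize="columns") * 100

print(counts)
print(col_pct)
```

Passing normalize="index" or normalize="all" instead would reproduce the row and total percentages requested by /CELLS=COUNT ROW COLUMN TOTAL.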


Quant: Understanding Variance

If you looked at a measure of job performance resulting from 2 different manufacturing processes and found that the mean performance of process A was 82.5 and the mean performance of process B was 78.5, why can you not automatically assume that process A will consistently outperform process B?

If a researcher were to look at a measure of job performance resulting from two different manufacturing processes and found that the mean performance of process A was 82.5 and the mean performance of process B was 78.5, they could not automatically assume that process A will consistently outperform process B.  The researcher cannot reach that conclusion until an analysis of variance is done on the data.  There is variation between the two process means (between-group variance), and there is also variation among the individual performance scores within each process (within-group variance).  A four-point difference in means says little on its own: if the scores within each process are widely scattered, the two distributions may largely overlap, and process B could outperform process A on any given run.  These two types of variance feed into the F-statistic, which allows the researcher to decide whether to reject the null hypothesis that the two mean performances are equal.
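To make the between/within distinction concrete, here is a minimal pure-Python sketch of a one-way ANOVA for two hypothetical samples whose means are 82.5 and 78.5; the individual scores are invented for illustration:

```python
from statistics import mean

# Hypothetical performance scores; means are 82.5 (A) and 78.5 (B).
a = [80.0, 82.0, 85.0, 83.0]
b = [76.0, 78.0, 81.0, 79.0]

groups = [a, b]
grand = mean(a + b)                  # grand mean of all scores
k = len(groups)                      # number of groups
n = sum(len(g) for g in groups)      # total number of observations

# Between-group sum of squares: group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)

# Within-group sum of squares: scores around their own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between = k - 1                   # 1
df_within = n - k                    # 6
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(round(f_stat, 3))  # 7.385
```

Comparing the resulting F against the critical value of the F-distribution with (1, 6) degrees of freedom is what lets the researcher decide whether to reject the null hypothesis.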

Quant: Variances

When looking at the performance of two groups on a given task, one speaks of two kinds of variance (between-groups and within-groups). What does each represent?

Variance is a measure of average dispersion (Field, 2013; Schumacker, 2014).  It is a numerical value that describes how the observed data values spread across the distribution and how far, on average, they differ from the mean (Huck, 2011; Field, 2013; Schumacker, 2014).  A smaller variance indicates that the observed values lie close to the mean, and vice versa (Field, 2013).

What happens when researchers want to study whether the difference between the means of two groups is statistically significant?  Researchers can use ANOVA, an analysis of variance that tests whether to reject the null hypothesis that the mean of one group equals the mean of another (Huck, 2011; Schumacker, 2014).  ANOVAs typically involve categorical independent variables (the groups) and a continuous dependent variable (Creswell, 2014).  The results of a one-way analysis of variance are presented in a table showing the variance between groups and within groups (Huck, 2011).  Schumacker (2014) explained that the between-group variance reflects how far the group means vary around the overall grand mean, while the within-group variance reflects how the individual scores vary around their own group means.  The between-group variance has degrees of freedom equal to the number of groups minus 1, whereas the within-group variance has degrees of freedom equal to the total number of data points minus the number of groups (Huck, 2011).  The within- and between-group information is used to calculate the F-statistic to establish statistical significance, which allows the researcher to reject or fail to reject the null hypothesis (Field, 2013; Huck, 2011; Schumacker, 2014).
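The dispersion idea described above can be illustrated in a few lines of Python: two hypothetical samples share the same mean, but the more scattered one has a far larger variance.

```python
from statistics import mean, pvariance

tight = [79, 80, 81]    # values close to the mean
spread = [60, 80, 100]  # same mean, widely scattered

# Both samples have a mean of 80, yet their dispersion differs greatly.
assert mean(tight) == mean(spread) == 80
print(pvariance(tight), pvariance(spread))  # the tight sample's variance is far smaller
```

This is why the two process means in the previous question cannot be compared without also considering how spread out each group's scores are.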

References

  • Creswell, J. W. (2014). Research design: Qualitative, quantitative and mixed method approaches (4th ed.). California: SAGE Publications, Inc. VitalBook file.
  • Field, A. (2013). Discovering statistics using IBM SPSS statistics (4th ed.). UK: SAGE Publications Ltd. VitalBook file.
  • Huck, S. W. (2011). Reading statistics and research (6th ed.). Pearson Learning Solutions. VitalBook file.
  • Schumacker, R. E. (2014). Learning statistics using R. California: SAGE Publications, Inc. VitalBook file.