Quant: In-depth Analysis in SPSS

Abstract

This short analysis examines how marital happiness relates to a couple’s combined income. Marital happiness levels were found to depend on combined income, although the happiest couples were happy regardless of how much money they had. This quantitative analysis of the sample data shows that when happiness levels are low, lower levels of combined income are more likely.

Introduction

Mulligan (1973) was among the first to state that arguments about money were one of the top reasons for divorce between couples. Financial arguments could stem from goals and savings, record keeping, delaying tactics, apparel cost-cutting strategies, controlling expenditures, financial statements, do-it-yourself techniques, and cost-cutting techniques (Lawrence, Thomasson, Wozniak, & Prawitz, 1993). Lawrence et al. (1993) assert that financial arguments are common within families. However, when does money stop being an issue? Does an increase in combined family income affect marital happiness levels? This analysis attempts to answer these questions.

Methods

Crosstabulation was conducted to get a descriptive exploration of the data. Box plots helped show the spread and distribution of combined income per marital happiness level. In this analysis, two alternative hypotheses are tested:

  • There is a difference between the mean values of combined income per marital happiness level.
  • There is a dependence between the combined income and marital happiness level.

To test these hypotheses, a one-way analysis of variance and a two-way chi-square test were conducted, respectively.
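To make the first test concrete, here is a minimal sketch of how a one-way ANOVA F statistic is formed from between-group and within-group variation. The three groups are hypothetical numbers chosen for easy arithmetic; they are not taken from gss.sav, and the function name is an assumption for illustration.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times the squared
    # distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each value
    # from its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical incomes (in $1,000s) for three happiness levels; NOT from gss.sav.
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(hypothetical_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group spread would suggest; the ONEWAY command in SPSS reports the same ratio along with its significance.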

Results

Table 1: Case processing summary for analyzing happiness level versus family income.

Table 2: Crosstabulation for analyzing happiness level versus family income (<$21,250).

Table 3: Crosstabulation for analyzing happiness level versus family income (>$21,250).

Table 4: Chi-square test for analyzing happiness level versus family income.

Table 5: Analysis of Variance for analyzing happiness level versus family income.

Figure 1: Boxplot diagram per happiness level of a marriage versus the family incomes.

Figure 2: Line diagram per happiness level of a marriage versus the mean of the family incomes.

Discussions and Conclusions

Of the 1419 participants, only 38.5% responded to both the marital happiness and family income questions (Table 1).  One likely contributor to this high non-response rate is that unmarried participants could not answer the marital happiness question.  Thus, it is suggested that future versions of this survey instrument include an N/A classification, to see whether the response rate improves.  Given that there are still 547 responses, there is still information to be gained from analyzing this data.

As a family unit gains more income, its marital happiness level tends to increase (Tables 2-3).  As the dollar value increases, the percentage within “Family income; ranges recoded to midpoints” for the very happy category rises from about 50% to about 75%.  The unhappiest couples seem to be earning a combined income of $7,500-9,000 or $27,500-45,000.  For marriages that are pretty happy, the share is fairly stable at 30-40% of respondents at $13,750 or more.

The mean family income per happiness level (Figure 2) shows that, on average, happier couples make more money together.  On closer examination using boxplots (Figure 1), however, the happiest couples seem to be happy regardless of how much money they make, as the whiskers of the box plot extend far from the median.  One interesting feature is that the spread of combined family income shrinks as happiness decreases (Figure 1).  This could suggest that although money is not a major factor for couples that are happy, unhappiness could be driven by lower combined incomes.

The two-way chi-square test shows a statistically significant association between combined family income and marital happiness, allowing us to reject null hypothesis #2, which states that these two variables are independent of each other (Table 4).  The analysis of variance, by contrast, does not allow a rejection of null hypothesis #1, which states that the mean combined income is the same across the marital happiness groups (Table 5).

There could be many reasons for these results; thus, future work could include analyzing other variables that may help explain marital happiness.  A multivariate analysis may be necessary, with marital happiness as the dependent variable and combined income as one of many independent variables.

SPSS Code

* Load the GSS data file.
GET
  FILE='C:\Users\mkher\Desktop\SAV files\gss.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.

* Crosstabulation of marital happiness by family income, with chi-square test.
CROSSTABS
  /TABLES=hapmar BY incomdol
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ CORR
  /CELLS=COUNT ROW COLUMN
  /COUNT ROUND CELL.

* One-way ANOVA of income by marital happiness.
* The dependent variable here is rincome, while the other commands use incomdol.
ONEWAY rincome BY hapmar
  /MISSING ANALYSIS.

* Chart Builder: boxplot of family income per marital happiness level.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=hapmar incomdol MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: hapmar=col(source(s), name("hapmar"), unit.category())
  DATA: incomdol=col(source(s), name("incomdol"))
  DATA: id=col(source(s), name("$CASENUM"), unit.category())
  GUIDE: axis(dim(1), label("HAPPINESS OF MARRIAGE"))
  GUIDE: axis(dim(2), label("Family income; ranges recoded to midpoints"))
  SCALE: cat(dim(1), include("1", "2", "3"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: schema(position(bin.quantile.letter(hapmar*incomdol)), label(id))
END GPL.

* Chart Builder: line chart of mean family income per marital happiness level.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=hapmar MEAN(incomdol)[name="MEAN_incomdol"]
    MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: hapmar=col(source(s), name("hapmar"), unit.category())
  DATA: MEAN_incomdol=col(source(s), name("MEAN_incomdol"))
  GUIDE: axis(dim(1), label("HAPPINESS OF MARRIAGE"))
  GUIDE: axis(dim(2), label("Mean Family income; ranges recoded to midpoints"))
  SCALE: cat(dim(1), include("1", "2", "3"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: line(position(hapmar*MEAN_incomdol), missing.wings())
END GPL.

References

Quant: Chi-Square Test in SPSS

Introduction

The aim of this analysis is to determine the strength of the association between the variables agecat and degree, as well as the major contributing cells, through a chi-square analysis. Standardized residuals are used to determine the cell contributions.

Hypothesis

  • Null: There is no association between agecat and degree
  • Alternative: There is an association between agecat and degree

Methodology

For this project, the gss.sav file is loaded into SPSS (GSS, n.d.).  The goal is to look at the relationships between the following variables: agecat (Age category) and degree (Respondent’s highest degree).

To conduct a chi-square analysis, navigate through Analyze > Descriptive Statistics > Crosstabs.

The variable degree was placed in the “Row(s)” box and agecat was placed in the “Column(s)” box.  Select the “Statistics” button, select “Chi-square”, and under the “Nominal” section select “Lambda”. Select the “Cells” button and select “Standardized” under the “Residuals” section. The procedures for this analysis are provided in video tutorial form by Miller (n.d.).  The resulting output is shown in the next four tables.

Results

Table 1: Case processing summary.

| | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| Degree * Age category | 1411 | 99.4% | 8 | 0.6% | 1419 | 100.0% |

From the total sample size of 1419 participants, 8 cases are reported to be missing, leaving 99.4% valid cases (Table 1).   Examining the crosstabulation (Table 2), in the “Less than high school” row the standardized residuals for the 30-39, 40-49, and 50-59 age groups are below -1.96, and the residual for the 60-89 age group is far above +1.96.  In the “Junior college or more” row, the standardized residual for the 60-89 age group is below -1.96.  Thus, for all these cells, the observed frequencies differ significantly from the expected frequencies (Miller, n.d.).  A significant negative standardized residual indicates that the expected count under independence over-predicts the number of people of that age group with that respective diploma (or lack thereof).  A significant positive standardized residual indicates that the expected count under-predicts the number of people of that age group, here those 60-89 with a lack of a diploma.

Table 2: Degree by Age category crosstabulation.

| Degree | Statistic | 18-29 | 30-39 | 40-49 | 50-59 | 60-89 | Total |
|---|---|---|---|---|---|---|---|
| Less than high school | Count | 42 | 33 | 36 | 20 | 112 | 243 |
| | Standardized Residual | -.1 | -2.8 | -2.3 | -2.7 | 7.1 | |
| High school | Count | 138 | 162 | 154 | 113 | 158 | 725 |
| | Standardized Residual | .9 | .2 | -.2 | .4 | -1.2 | |
| Junior college or more | Count | 68 | 115 | 114 | 78 | 68 | 443 |
| | Standardized Residual | -1.1 | 1.8 | 1.9 | 1.4 | -3.7 | |
| Total | Count | 248 | 310 | 304 | 211 | 338 | 1411 |
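The standardized residuals in Table 2 can be reproduced from the counts alone. The sketch below assumes the usual definition, (observed - expected) / sqrt(expected), with the expected count taken as row total times column total divided by N; the variable and function names are illustrative and do not come from SPSS.

```python
import math

# Observed counts from Table 2 (rows: degree; columns: age category).
age_groups = ["18-29", "30-39", "40-49", "50-59", "60-89"]
observed = {
    "Less than high school": [42, 33, 36, 20, 112],
    "High school": [138, 162, 154, 113, 158],
    "Junior college or more": [68, 115, 114, 78, 68],
}

row_totals = {degree: sum(counts) for degree, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())


def standardized_residual(obs, row_total, col_total, total):
    """(observed - expected) / sqrt(expected), expected under independence."""
    expected = row_total * col_total / total
    return (obs - expected) / math.sqrt(expected)


# Collect the cells whose residual falls outside +/-1.96.
flagged = []
for degree, counts in observed.items():
    for age, obs, col_total in zip(age_groups, counts, col_totals):
        resid = standardized_residual(obs, row_totals[degree], col_total, n)
        if abs(resid) > 1.96:
            flagged.append((degree, age, resid))

for degree, age, resid in flagged:
    print(f"{degree:24s} {age:6s} {resid:+.1f}")
```

Under these assumptions, five cells are flagged: four in the “Less than high school” row and one in the “Junior college or more” row.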

Deriving the degrees of freedom from Table 2, df = (5-1)*(3-1) = 8.  None of the expected counts were less than five, because the minimum expected count is 36.3 (Table 3), which is desirable.  The chi-square value is 96.364 and is significant at the 0.05 level. Thus, the null hypothesis is rejected, and there is a statistically significant association between a person’s age category and diploma level.  This test doesn’t tell us anything about the direction or strength of the relationship.

Table 3: Chi-Square Tests

| | Value | df | Asymptotic Significance (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 96.364 [a] | 8 | .000 |
| Likelihood Ratio | 90.580 | 8 | .000 |
| Linear-by-Linear Association | 23.082 | 1 | .000 |
| N of Valid Cases | 1411 | | |

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 36.34.
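As a check on the Pearson chi-square row of Table 3, the statistic, its degrees of freedom, and the minimum expected count can be recomputed from the Table 2 counts. This is a plain-Python sketch rather than SPSS output, and the variable names are assumptions for illustration.

```python
# Observed counts from Table 2 (columns: age category 18-29 ... 60-89).
observed = [
    [42, 33, 36, 20, 112],      # Less than high school
    [138, 162, 154, 113, 158],  # High school
    [68, 115, 114, 78, 68],     # Junior college or more
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)
min_expected = min(min(row) for row in expected)

print(f"chi-square = {chi_square:.3f}, df = {df}, "
      f"minimum expected count = {min_expected:.2f}")
```

The p-value is omitted here to keep the sketch dependency-free; a statistics library would be needed to obtain it from the chi-square distribution with 8 degrees of freedom.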

Table 4: Directional Measures

Nominal by Nominal

| Measure | Dependent variable | Value | Asymptotic Standard Error [a] | Approximate T [b] | Approximate Significance |
|---|---|---|---|---|---|
| Lambda | Symmetric | .029 | .013 | 2.278 | .023 |
| | Degree Dependent | .000 | .000 | . [c] | . [c] |
| | Age category Dependent | .048 | .020 | 2.278 | .023 |
| Goodman and Kruskal tau | Degree Dependent | .024 | .005 | | .000 [d] |
| | Age category Dependent | .019 | .004 | | .000 [d] |

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
c. Cannot be computed because the asymptotic standard error equals zero.
d. Based on chi-square approximation

Although there is a statistically significant association between a person’s age category and diploma level, the chi-square test doesn’t show how strongly these variables are related to each other. The symmetric lambda value is 0.029, meaning that knowing one variable reduces the error in predicting the other by only 2.9% (Table 4). Lambda is a proportional-reduction-in-error measure rather than a share of variance explained, so the association, while statistically significant, is very weak.
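The lambda values in Table 4 can be reproduced from the Table 2 counts. The sketch below assumes the standard Goodman and Kruskal definition of lambda as a proportional reduction in prediction error; the helper name `prediction_errors` is hypothetical.

```python
# Observed counts from Table 2 (rows: degree; columns: age category).
observed = [
    [42, 33, 36, 20, 112],      # Less than high school
    [138, 162, 154, 113, 158],  # High school
    [68, 115, 114, 78, 68],     # Junior college or more
]


def prediction_errors(table):
    """Errors when predicting the column variable, without and with the row variable."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    errors_without = n - max(col_totals)           # always guess the modal column
    errors_with = n - sum(max(row) for row in table)  # guess the modal column per row
    return errors_without, errors_with


transposed = [list(col) for col in zip(*observed)]

# Age category dependent: predict the column (age) from the row (degree).
age_without, age_with = prediction_errors(observed)
lambda_age = (age_without - age_with) / age_without

# Degree dependent: predict degree from age, using the transposed table.
deg_without, deg_with = prediction_errors(transposed)
lambda_degree = (deg_without - deg_with) / deg_without

# Symmetric lambda pools the errors from both directions.
lambda_symmetric = ((age_without - age_with) + (deg_without - deg_with)) / (
    age_without + deg_without
)

print(f"lambda (age dependent)    = {lambda_age:.3f}")
print(f"lambda (degree dependent) = {lambda_degree:.3f}")
print(f"lambda (symmetric)        = {lambda_symmetric:.3f}")
```

Lambda for degree is zero because “High school” is the modal degree in every age group, so knowing the age category never changes the best guess.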

Conclusions

There is a statistically significant association between a person’s age category and diploma level.  According to the crosstabulation, the expected counts under independence significantly over-predict the number of people with less education than a high school diploma for the age groups of 30-59, as well as those with a college degree for the 60-89 age group, and significantly under-predict those without a high school diploma in the 60-89 age group.  These differences in the standardized residuals helped drive a large and statistically significant chi-square value. With a lambda of 0.029, knowing one variable reduces the error in predicting the other by only 2.9%, so the association is very weak.

SPSS Code

* Crosstabulation with chi-square, lambda, and standardized residuals.
CROSSTABS
  /TABLES=ndegree BY agecat
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ CC LAMBDA
  /CELLS=COUNT SRESID
  /COUNT ROUND CELL.

References:

Quant: Crosstabs in SPSS

Introduction

The aim of this analysis is to examine whether people would continue or stop working if they were rich, and how that choice relates to their highest degree earned, gender, and job satisfaction.

Methodology

For this project, the gss.sav file is loaded into SPSS (GSS, n.d.).  The goal is to look at the relationships between the following variables: richwork (whether the respondent would continue or stop working if rich), sex (gender), satjob (satisfaction level with the job), and degree (education degree level).   The variable richwork is the dependent variable and the other three variables are considered independent variables for this analysis. To conduct a crosstabs analysis, navigate through Analyze > Descriptive Statistics > Crosstabs.  The variable richwork was placed in the “Row(s)” box, and the other three variables were placed in the “Column(s)” box.  Then, in the crosstabs dialog box, the “Cells” button was clicked; under the “Counts” section “Observed” was selected, and all three boxes were selected under the “Percentages” section. The procedures for this analysis are provided in video tutorial form by Miller (n.d.).  The resulting output is shown in the next four tables.

Results

Table 1: Cases Processing Summary.

| Crosstabulation | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| IF RICH, CONTINUE OR STOP WORKING * Respondent’s highest degree | 625 | 44.0% | 794 | 56.0% | 1419 | 100.0% |
| IF RICH, CONTINUE OR STOP WORKING * Respondent’s sex | 628 | 44.3% | 791 | 55.7% | 1419 | 100.0% |
| IF RICH, CONTINUE OR STOP WORKING * JOB OR HOUSEWORK | 624 | 44.0% | 795 | 56.0% | 1419 | 100.0% |

According to Table 1, about 44% (~625) of all cases are valid in all three scenarios and about 56% (~793) had missing data, from a total of 1419 respondents.

Table 2: If rich do people continue or stop working with respondent’s highest degree cross tabulation.

| IF RICH, CONTINUE OR STOP WORKING | Statistic | Less than HS | High school | Junior college | Bachelor | Graduate | Total |
|---|---|---|---|---|---|---|---|
| CONTINUE WORKING | Count | 52 | 210 | 39 | 84 | 36 | 421 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 12.4% | 49.9% | 9.3% | 20.0% | 8.6% | 100.0% |
| | % within Respondent’s highest degree | 69.3% | 64.6% | 81.3% | 67.2% | 69.2% | 67.4% |
| | % of Total | 8.3% | 33.6% | 6.2% | 13.4% | 5.8% | 67.4% |
| STOP WORKING | Count | 23 | 115 | 9 | 41 | 16 | 204 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 11.3% | 56.4% | 4.4% | 20.1% | 7.8% | 100.0% |
| | % within Respondent’s highest degree | 30.7% | 35.4% | 18.8% | 32.8% | 30.8% | 32.6% |
| | % of Total | 3.7% | 18.4% | 1.4% | 6.6% | 2.6% | 32.6% |
| Total | Count | 75 | 325 | 48 | 125 | 52 | 625 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 12.0% | 52.0% | 7.7% | 20.0% | 8.3% | 100.0% |
| | % within Respondent’s highest degree | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |
| | % of Total | 12.0% | 52.0% | 7.7% | 20.0% | 8.3% | 100.0% |

According to Table 2, 67.4% of respondents would continue working and 32.6% would stop working if they were rich.  In our data about 12% have less than a high school diploma, 52% have a high school diploma, 7.7% have gone to junior college, 20% have a bachelor degree, and 8.3% have a graduate degree. Broken down by the respondent’s highest degree earned, 35.4% of respondents who have only a high school diploma would choose to leave work if they were rich, the highest share of any degree group; they also make up 56.4% of all respondents who would stop working.  Finally, 81.3% of those with a junior college degree would stay at their job if they were rich, making them the group most likely to stay in this “what if” scenario. Among those with less than a high school diploma, a high school diploma, a bachelor degree, or a graduate degree, approximately 65-69% would continue working if they were rich.
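The three percentages in each cell of Table 2 answer different questions, which is easy to mix up when reading the output. The sketch below recomputes them for the STOP WORKING row from the counts; the helper `pct` and the variable names are assumptions for illustration.

```python
degrees = ["Less than HS", "High school", "Junior college", "Bachelor", "Graduate"]
counts = {
    "CONTINUE WORKING": [52, 210, 39, 84, 36],
    "STOP WORKING": [23, 115, 9, 41, 16],
}

row_totals = {answer: sum(values) for answer, values in counts.items()}
col_totals = [sum(col) for col in zip(*counts.values())]
n = sum(row_totals.values())


def pct(part, whole):
    """Percentage rounded to one decimal, as shown in the SPSS tables."""
    return round(100 * part / whole, 1)


stop_share_by_degree = {}
for degree, stop, col_total in zip(degrees, counts["STOP WORKING"], col_totals):
    row_pct = pct(stop, row_totals["STOP WORKING"])  # share of all who would stop
    col_pct = pct(stop, col_total)                   # share of this degree group
    total_pct = pct(stop, n)                         # share of all respondents
    stop_share_by_degree[degree] = col_pct
    print(f"{degree:15s} row {row_pct:5.1f}%  column {col_pct:5.1f}%  total {total_pct:5.1f}%")

most_likely_to_stop = max(stop_share_by_degree, key=stop_share_by_degree.get)
print("Highest stop rate:", most_likely_to_stop)
```

To compare how likely each degree group is to stop working, the column percentage (% within Respondent’s highest degree) is the one to read.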

Table 3: If rich do people continue or stop working with respondent’s gender cross tabulation.

| IF RICH, CONTINUE OR STOP WORKING | Statistic | Male | Female | Total |
|---|---|---|---|---|
| CONTINUE WORKING | Count | 214 | 209 | 423 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 50.6% | 49.4% | 100.0% |
| | % within Respondent’s sex | 69.3% | 65.5% | 67.4% |
| | % of Total | 34.1% | 33.3% | 67.4% |
| STOP WORKING | Count | 95 | 110 | 205 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 46.3% | 53.7% | 100.0% |
| | % within Respondent’s sex | 30.7% | 34.5% | 32.6% |
| | % of Total | 15.1% | 17.5% | 32.6% |
| Total | Count | 309 | 319 | 628 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 49.2% | 50.8% | 100.0% |
| | % within Respondent’s sex | 100.0% | 100.0% | 100.0% |
| | % of Total | 49.2% | 50.8% | 100.0% |

In our sample data set about 49.2% were male and 50.8% were female, according to Table 3. Broken down by the respondent’s gender, 34.5% of women and 30.7% of men would choose to leave work if they were rich.  Gender does not seem to be a strong indicator of whether a respondent would continue or stop working if they were rich in this “what if” scenario.

Table 4: If rich would people continue or stop working with respondent’s job satisfaction cross tabulation.

| IF RICH, CONTINUE OR STOP WORKING | Statistic | VERY SATISFIED | MOD. SATISFIED | A LITTLE DISSAT | VERY DISSATISFIED | Total |
|---|---|---|---|---|---|---|
| CONTINUE WORKING | Count | 199 | 172 | 36 | 14 | 421 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 47.3% | 40.9% | 8.6% | 3.3% | 100.0% |
| | % within JOB OR HOUSEWORK | 71.8% | 64.9% | 60.0% | 63.6% | 67.5% |
| | % of Total | 31.9% | 27.6% | 5.8% | 2.2% | 67.5% |
| STOP WORKING | Count | 78 | 93 | 24 | 8 | 203 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 38.4% | 45.8% | 11.8% | 3.9% | 100.0% |
| | % within JOB OR HOUSEWORK | 28.2% | 35.1% | 40.0% | 36.4% | 32.5% |
| | % of Total | 12.5% | 14.9% | 3.8% | 1.3% | 32.5% |
| Total | Count | 277 | 265 | 60 | 22 | 624 |
| | % within IF RICH, CONTINUE OR STOP WORKING | 44.4% | 42.5% | 9.6% | 3.5% | 100.0% |
| | % within JOB OR HOUSEWORK | 100.0% | 100.0% | 100.0% | 100.0% | 100.0% |
| | % of Total | 44.4% | 42.5% | 9.6% | 3.5% | 100.0% |

Finally, in Table 4, about 44.4% of respondents are very satisfied at work, 42.5% are moderately satisfied, 9.6% are a little dissatisfied, and 3.5% are very dissatisfied. Broken down by the respondent’s job satisfaction level, 40% of respondents who are a little dissatisfied would choose to leave work if they were rich, making them the group most likely to leave in this “what if” scenario. In fact, respondents who were anything but very satisfied with their job were approximately 7-12 percentage points more likely to want to leave their jobs if rich.  By contrast, 71.8% of those who are very satisfied with their jobs would stay at their job if they were rich, making them the group most likely to stay in this “what if” scenario.
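The gap between the very satisfied group and the other satisfaction levels can be worked out from the STOP WORKING counts and column totals in Table 4. This is a small arithmetic sketch with illustrative variable names, not SPSS output.

```python
satisfaction = ["VERY SATISFIED", "MOD. SATISFIED", "A LITTLE DISSAT", "VERY DISSATISFIED"]
stop_counts = [78, 93, 24, 8]       # STOP WORKING counts from Table 4
group_totals = [277, 265, 60, 22]   # column totals from Table 4

# Share of each satisfaction group that would stop working if rich.
stop_rates = [100 * stop / total for stop, total in zip(stop_counts, group_totals)]

# Difference, in percentage points, from the very satisfied group.
baseline = stop_rates[0]
gaps = [round(rate - baseline, 1) for rate in stop_rates[1:]]

for level, rate in zip(satisfaction, stop_rates):
    print(f"{level:18s} {rate:5.1f}% would stop working")
print("Gap vs. very satisfied (percentage points):", gaps)
```

These are differences in percentage points between groups, not relative increases in risk.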

Conclusions

Overall, this analysis suggests that highest degree earned and job satisfaction may contribute to a respondent’s decision to continue or stop working if they were rich in this “what if” scenario.  However, gender may not play an important role in answering this question.

SPSS Code

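DATASET NAME DataSet1 WINDOW=FRONT.

* Crosstabulation of richwork by degree, sex, and job satisfaction with row, column, and total percentages.
CROSSTABS
  /TABLES=richwork BY degree sex satjob
  /FORMAT=AVALUE TABLES
  /CELLS=COUNT ROW COLUMN TOTAL
  /COUNT ROUND CELL.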

References: