Quant: Exploring Data with SPSS


Introduction

The aim of this analysis is to run a distribution analysis on diastolic blood pressure (DBP58), comparing individuals with no history of coronary heart disease (CHD) and individuals with a history of CHD. The grouping variable that records this history is CHD.

From the SPSS outputs the following questions will be addressed:

  • What can be determined from the measures of skewness and kurtosis about a normal curve? What are the mean and median?
  • Does one seem better than the other to represent the scores?
  • What differences can be seen in the pattern of responses of those with history versus those with no history?
  • What information can be determined from the box plots?

Methodology

For this project, the electric.sav file was loaded into SPSS (Electric, n.d.). The goal was to examine the relationship between two variables: DBP58 (Average Diastolic Blood Pressure) and CHD (Incidence of Coronary Heart Disease). To run the descriptive analysis, the Explore procedure was opened through Analyze > Descriptive Statistics > Explore. The variable DBP58 was placed in the “Dependent List” box, and CHD was placed in the “Factor List” box. In the Explore dialog box, the “Statistics” button was clicked, and “Descriptives” with a 95% “Confidence interval for the mean” was selected along with “Outliers” and “Percentiles”. Back in the Explore dialog box, the “Plots” button was clicked: under “Boxplots” only “Factor levels together” was selected, under “Descriptive” both options (stem-and-leaf and histogram) were selected, and under “Spread vs. Level with Levene Test” “None” was selected. The procedures for this analysis are provided in video tutorial form by Miller (n.d.). The output is presented in the following four tables and five figures.

Results

Table 1: Case Processing Summary.

Average Diast Blood Pressure 58, by Incidence of Coronary Heart Disease

Group   Valid N   Valid %   Missing N   Missing %   Total N   Total %
none    119       99.2%     1           0.8%        120       100.0%
chd     120       100.0%    0           0.0%        120       100.0%

According to Table 1, at least 99.2% of the data is valid in both groups. One data point is missing in the group with no history of Coronary Heart Disease (CHD), and none are missing in the group with a history of CHD. The data set contains 240 participants, 120 in each group.

Table 2: Descriptive Statistics for Average Diastolic Blood Pressure by Incidence of Coronary Heart Disease.

Average Diast Blood Pressure 58, by Incidence of Coronary Heart Disease

                                none                      chd
                                Statistic   Std. Error    Statistic   Std. Error
Mean                            87.66       1.005         89.92       1.350
95% CI for Mean, Lower Bound    85.66                     87.24
95% CI for Mean, Upper Bound    89.65                     92.59
5% Trimmed Mean                 87.31                     88.89
Median                          87.00                     87.00
Variance                        120.312                   218.732
Std. Deviation                  10.969                    14.790
Minimum                         65                        65
Maximum                         125                       160
Range                           60                        95
Interquartile Range             15                        18
Skewness                        .566        .222          1.406       .221
Kurtosis                        .671        .440          3.620       .438

According to Table 2, the mean diastolic blood pressure is 2.26 points higher, and its standard error 0.345 higher, in the group with CHD than in the group without. The median is 87 in both groups. For the group with CHD, the mean of 89.92 sits well above the median, which is consistent with the positive skewness of 1.406 and the kurtosis of 3.620. For the group without CHD, the mean of 87.66 is close to the median, and the skewness of 0.566 and kurtosis of 0.671 indicate only mild positive skew. Because high values pull the mean upward but leave the median unchanged, the median appears to be the more robust representation of the scores, particularly for the group with CHD. Figures 1 and 2 suggest that the skewness is largely the result of a few outliers, and the box plot in Figure 3 confirms these outliers. Both kurtosis values (0.671 and 3.620) are positive, which indicates leptokurtic distributions: they are more peaked than a normal distribution, markedly so for the group with CHD.
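As a rough check on how far each distribution departs from normal, the skewness and kurtosis in Table 2 can be divided by their standard errors. The sketch below is in Python rather than SPSS and is not part of the original analysis. The function names are illustrative, the standard-error formulas are the usual small-sample ones (they reproduce the Std. Error column of Table 2 from the valid N in Table 1), and the 1.96 cutoff is a common rule of thumb, not something SPSS reports.

```python
import math

def se_skewness(n):
    """Standard error of sample skewness for n valid cases."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of sample excess kurtosis for n valid cases."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

def shape_ratios(n, skew, kurt):
    """Return (skewness / SE, kurtosis / SE) for one group."""
    return skew / se_skewness(n), kurt / se_kurtosis(n)

# Values copied from Tables 1 and 2: (valid N, skewness, kurtosis).
groups = {
    "none": (119, 0.566, 0.671),
    "chd": (120, 1.406, 3.620),
}

CUTOFF = 1.96  # assumed rule-of-thumb threshold, not an SPSS output

for name, (n, skew, kurt) in groups.items():
    z_skew, z_kurt = shape_ratios(n, skew, kurt)
    print(f"{name}: SE skew={se_skewness(n):.3f}, SE kurt={se_kurtosis(n):.3f}, "
          f"skew ratio={z_skew:.2f}, kurt ratio={z_kurt:.2f}, "
          f"skew beyond cutoff={abs(z_skew) > CUTOFF}")
```

With these inputs the ratios come out at roughly 2.6 (skewness) and 1.5 (kurtosis) for the group without CHD, and roughly 6.4 and 8.3 for the group with CHD. Under this rule of thumb, only the kurtosis of the group without CHD stays within the range expected for a normal distribution.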

Figure 1: Histogram of Average Diastolic Blood Pressure for Incidence of Coronary Heart Disease = none.

Figure 2: Histogram of Average Diastolic Blood Pressure for Incidence of Coronary Heart Disease = chd.

Figure 3: Box plots of Average Diastolic Blood Pressure by Incidence of Coronary Heart Disease.

Comparing the two histograms in Figures 1 and 2, both distributions are positively skewed, and the skew is stronger when there is CHD. The standard deviation is about 3.8 points larger when there is CHD (14.790 versus 10.969). This shows that blood pressure in the sample varies more when there is CHD and is somewhat more stable in the group without CHD. The range of the average diastolic blood pressure is also wider with CHD (95 versus 60), which is consistent with the larger standard deviation and can be seen in Figure 3. The interquartile range, which covers the middle 50% of the participants, is smaller in the group without CHD (15) than in the group with CHD (18). Case 120, with a value of 160, lies far above the interquartile range and is treated as an extreme value.
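The spread comparison can be checked directly from the figures in Table 2. The following is a minimal Python sketch, not part of the original SPSS workflow; the dictionary layout and the helper name `spread_gap` are illustrative choices.

```python
# Spread statistics copied from Table 2.
table2 = {
    "none": {"sd": 10.969, "variance": 120.312, "range": 60, "iqr": 15},
    "chd": {"sd": 14.790, "variance": 218.732, "range": 95, "iqr": 18},
}

def spread_gap(stats, key):
    """Difference (chd minus none) for one spread statistic."""
    return stats["chd"][key] - stats["none"][key]

sd_gap = spread_gap(table2, "sd")
range_gap = spread_gap(table2, "range")
iqr_gap = spread_gap(table2, "iqr")
# Ratio of the two group variances (chd over none).
variance_ratio = table2["chd"]["variance"] / table2["none"]["variance"]

print(f"SD gap: {sd_gap:.3f}")
print(f"Range gap: {range_gap}")
print(f"IQR gap: {iqr_gap}")
print(f"Variance ratio: {variance_ratio:.2f}")
```

The standard deviation gap is 3.821 points, and the variance in the group with CHD is about 1.8 times the variance in the group without.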

Table 3: Percentiles of Average Diastolic Blood Pressure by Incidence of Coronary Heart Disease.

Average Diast Blood Pressure 58, by Incidence of Coronary Heart Disease

                                           Percentiles
Method                            Group    5       10      25      50      75      90       95
Weighted Average (Definition 1)   none     71.00   75.00   80.00   87.00   95.00   102.00   105.00
Weighted Average (Definition 1)   chd      70.05   75.00   80.00   87.00   98.00   109.90   117.95
Tukey’s Hinges                    none                     80.00   87.00   94.50
Tukey’s Hinges                    chd                      80.00   87.00   98.00

Table 3 lists the percentiles of the average diastolic blood pressure for each CHD group. 95% of cases fall below 105 in the group with no history of CHD and below 117.95 in the group with a history of CHD. The two groups share the same 10th, 25th, and 50th percentiles, but the 75th, 90th, and 95th percentiles are lower without CHD, so the values in that group sit closer to the median of 87, which is consistent with the tables and figures above.
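The “Weighted Average” percentiles in Table 3 interpolate at position (n + 1)p in the sorted data. The Python sketch below illustrates that rule for the group with CHD. It is not the SPSS implementation: the data are rebuilt from the stem-and-leaf rows in Figure 5 plus the two extreme cases in Table 4, and the names `expand` and `haverage` are illustrative.

```python
import math

# Stem-and-leaf rows for the chd group, transcribed from Figure 5.
# Each pair is (stem, leaves); stem 7 with leaf 5 is the value 75.
CHD_ROWS = [
    (6, "58"), (7, "000012233"), (7, "55555677788899"),
    (8, "00000000000111233333344"), (8, "555556667777777788999999"),
    (9, "00001122223"), (9, "6677788888999"),
    (10, "02333"), (10, "5557789"),
    (11, "0003"), (11, "578"), (12, "01"), (12, "5"),
]
# The two "Extremes" cases are listed individually in Table 4.
CHD_EXTREMES = [133, 160]

def expand(rows, extremes):
    """Turn stem-and-leaf rows plus listed extremes into sorted values."""
    values = [stem * 10 + int(leaf) for stem, leaves in rows for leaf in leaves]
    return sorted(values + list(extremes))

def haverage(sorted_values, pct):
    """Weighted-average percentile: interpolate at position (n + 1) * pct / 100."""
    n = len(sorted_values)
    position = (n + 1) * pct / 100.0
    k = math.floor(position)   # 1-based rank of the lower neighbour
    frac = position - k        # weight given to the upper neighbour
    if k < 1:
        return float(sorted_values[0])
    if k >= n:
        return float(sorted_values[-1])
    lower, upper = sorted_values[k - 1], sorted_values[k]
    return lower + frac * (upper - lower)

chd = expand(CHD_ROWS, CHD_EXTREMES)
for pct in (5, 10, 25, 50, 75, 90, 95):
    print(pct, round(haverage(chd, pct), 2))
```

For example, with 120 cases the 5th percentile sits at position 6.05, between the 6th value (70) and the 7th value (71), which gives the 70.05 shown in Table 3.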

Table 4: Extreme Values of Average Diastolic Blood Pressure by Incidence of Coronary Heart Disease.

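Average Diast Blood Pressure 58, by Incidence of Coronary Heart Disease

Group   Extreme   Rank   Case Number   Value
none    Highest   1      163           125
none    Highest   2      232           119
none    Highest   3      144           115
none    Highest   4      126           110
none    Highest   5      131           109
none    Lowest    1      157           65
none    Lowest    2      156           65
none    Lowest    3      175           68
none    Lowest    4      153           68
none    Lowest    5      237           69
chd     Highest   1      120           160
chd     Highest   2      56            133
chd     Highest   3      42            125
chd     Highest   4      26            121
chd     Highest   5      111           120
chd     Lowest    1      73            65
chd     Lowest    2      34            68
chd     Lowest    3      101           70
chd     Lowest    4      33            70
chd     Lowest    5      7             70a

a. Only a partial list of cases with the value 70 are shown in the table of lower extremes.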

Table 4 lists the five highest and five lowest cases in each group. The lowest diastolic blood pressure value is 65 both without and with CHD. The highest value, however, is 35 points greater with CHD (160) than without CHD (125).

 Frequency    Stem &  Leaf
      .00        6 .
     5.00        6 .  55889
     4.00        7 .  1144
    18.00        7 .  555677777777888899
    21.00        8 .  000000000001122223344
    21.00        8 .  555556666777777888999
    20.00        9 .  00000111111222233334
    14.00        9 .  55666777888899
     8.00       10 .  00012233
     4.00       10 .  5559
     1.00       11 .  0
     1.00       11 .  5
     2.00 Extremes    (>=119)
 Stem width:   10
 Each leaf:        1 case(s)

Figure 4: Stem-and-leaf plot of Average Diastolic Blood Pressure for Incidence of Coronary Heart Disease = none.

 Frequency    Stem &  Leaf
      .00        6 .
     2.00        6 .  58
     9.00        7 .  000012233
    14.00        7 .  55555677788899
    23.00        8 .  00000000000111233333344
    24.00        8 .  555556667777777788999999
    11.00        9 .  00001122223
    13.00        9 .  6677788888999
     5.00       10 .  02333
     7.00       10 .  5557789
     4.00       11 .  0003
     3.00       11 .  578
     2.00       12 .  01
     1.00       12 .  5
     2.00 Extremes    (>=133)
 Stem width:   10
 Each leaf:        1 case(s)

Figure 5: Stem-and-leaf plot of Average Diastolic Blood Pressure for Incidence of Coronary Heart Disease = chd.

Figures 4 and 5 show more detail than the histograms by stating the actual frequency to the left of the stem values and by stating which values are considered extreme. With CHD, a diastolic blood pressure of 133 or more is considered extreme, and without CHD a diastolic blood pressure of 119 or more is considered extreme.
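The cutoffs of 119 and 133 are consistent with the conventional box-plot fences built from Tukey’s hinges in Table 3. The Python sketch below assumes the usual rule (more than 1.5 box lengths beyond the upper hinge is an outlier, more than 3 box lengths is an extreme case); that rule and the function names are assumptions for illustration and are not taken from the SPSS output.

```python
# Tukey's hinges (lower, upper) copied from Table 3.
HINGES = {"none": (80.0, 94.5), "chd": (80.0, 98.0)}
# Five highest values per group copied from Table 4.
HIGHEST = {"none": [125, 119, 115, 110, 109], "chd": [160, 133, 125, 121, 120]}

def upper_fences(lower_hinge, upper_hinge):
    """Return (inner, outer) upper fences at 1.5 and 3 box lengths."""
    box = upper_hinge - lower_hinge
    return upper_hinge + 1.5 * box, upper_hinge + 3.0 * box

def classify(value, inner, outer):
    """Label a value relative to the upper fences."""
    if value > outer:
        return "extreme"
    if value > inner:
        return "outlier"
    return "inside"

for group, (low, high) in HINGES.items():
    inner, outer = upper_fences(low, high)
    labels = {v: classify(v, inner, outer) for v in HIGHEST[group]}
    print(group, "inner fence:", inner, "outer fence:", outer, labels)
```

Under these assumptions the inner fences are 116.25 without CHD and 125 with CHD, so 119 and 133 are the smallest flagged values in each group, matching the cutoffs in Figures 4 and 5. The value of 160 for case 120 also exceeds the outer fence of 152, which fits its description as an extreme case.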

Conclusions

There is a difference between the distributions of average diastolic blood pressure for participants with a history of Coronary Heart Disease (CHD) and those without. This is seen in the range, skewness, and overall shape of the two distributions. Both groups have the same median and lowest value, and their means differ by only about 2 points, but they differ considerably in standard deviation and in the highest values of diastolic blood pressure.

SPSS Code

DATASET NAME DataSet1 WINDOW=FRONT.
* Explore DBP58 by CHD group with plots, percentiles, descriptives, and extreme values.
EXAMINE VARIABLES=dbp58 BY chd
  /PLOT BOXPLOT STEMLEAF HISTOGRAM
  /COMPARE GROUPS
  /PERCENTILES(5,10,25,50,75,90,95) HAVERAGE
  /STATISTICS DESCRIPTIVES EXTREME
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

References:
