Fox Module 3: Univariate displays HW


NEAS
Supreme Being (5.7K reputation)

Group: Administrators
Posts: 4.2K, Visits: 1.2K

Module 3: Univariate displays

 

(The attached PDF file has better formatting.)

 

Homework assignment: stem and leaf display

 

A stem and leaf display for assault rates in the fifty U.S. states appears below. The assault rates range from 4.5% to 33.7%.

 

   4 | 568
   5 | 367
   6 |
   7 | 2
   8 | 136
   9 |
  10 | 2699
  11 | 035
  12 | 000
  13 |
  14 | 59
  15 | 1699
  16 | 1
  17 | 48
  18 | 8
  19 | 0
  20 | 14
  21 | 1
  22 |
  23 | 68
  24 | 99
  25 | 2459
  26 | 3
  27 | 69
  28 | 5
  29 | 4
  30 | 0
  31 |
  32 |
  33 | 57

 


A.   What is the median assault rate? There are 50 states, so average two points.

B.   What is the lower hinge (the 25th percentile)?

C.   What is the upper hinge (the 75th percentile)?

D.   What is the value of (HU – Median) / (Median – HL), where HU is the upper hinge and HL is the lower hinge?

E.   This ratio indicates the skewness of the distribution. Is this distribution positively skewed or negatively skewed?

 


Ron
Forum Newbie (5 reputation)

Group: Forum Members
Posts: 5, Visits: 1
For the given data on assault rates,
X(13) = X(14) = 10.9%
X(25) = X(26) = 15.9%
X(37) = X(38) = 24.9%

The answers I'm getting are:
Median: 15.9%
Lower Hinge: 10.9%
Upper Hinge: 24.9%
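
To see where these order statistics come from, the display can be expanded into a sorted list by attaching each leaf digit to its stem as the first decimal place. The following is a minimal sketch, not part of the assignment: the names `DISPLAY`, `expand`, and `x` are illustrative, and it assumes each leaf is a tenth of a percent, which is consistent with the stated range of 4.5% to 33.7%.

```python
# Stem and leaf display from the assignment, one "stem | leaves" row per line.
DISPLAY = """
 4 | 568
 5 | 367
 6 |
 7 | 2
 8 | 136
 9 |
10 | 2699
11 | 035
12 | 000
13 |
14 | 59
15 | 1699
16 | 1
17 | 48
18 | 8
19 | 0
20 | 14
21 | 1
22 |
23 | 68
24 | 99
25 | 2459
26 | 3
27 | 69
28 | 5
29 | 4
30 | 0
31 |
32 |
33 | 57
"""

def expand(display):
    """Return the sorted values encoded by a stem and leaf display."""
    values = []
    for row in display.strip().splitlines():
        stem, leaves = row.split("|")
        # Assumption: each leaf digit is the first decimal place of the stem.
        values.extend((int(stem) * 10 + int(d)) / 10 for d in leaves.strip())
    return sorted(values)

rates = expand(DISPLAY)

def x(i):
    """1-based order statistic X(i)."""
    return rates[i - 1]

print(len(rates))        # 50 states
print(x(13), x(14))      # 10.9 10.9  (around the lower hinge)
print(x(25), x(26))      # 15.9 15.9  (around the median)
print(x(37), x(38))      # 24.9 24.9  (around the upper hinge)
```

Counting leaves row by row gives the same result by hand: the row `10 | 2699` holds observations 11 through 14, `15 | 1699` holds 23 through 26, and `24 | 99` holds 37 and 38.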

In order to gain an understanding of where the formulas for the hinges came from, I thought about how they relate to the quantile function.

Suppose the cumulative probability for the i-th sorted data point X(i) is given by the formula
P(i) = (i - 0.5)/n [Fox, page 34]
The corresponding quantile function is
P_inv(z) = n*z + 0.5

Using this quantile function with n=50 gives
P_inv(0.5) = 25.5
P_inv(0.25) = 13

Note that position 25.5 lies midway between positions 25 and 26 (equivalently, 0.5 = (P(25) + P(26))/2), so the median can be computed as (X(25) + X(26))/2. Also, the lower hinge is X(13).
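
The quantile function above follows from solving the cumulative-probability formula for the index. A short worked version, writing z for the target cumulative probability (the argument of P_inv) and adding the z = 0.75 case by the same arithmetic:

```latex
\begin{align*}
z &= \frac{i - 0.5}{n}
  \;\Longrightarrow\; i = n z + 0.5, \\
n = 50,\ z = 0.50 &:\quad i = 50(0.50) + 0.5 = 25.5, \\
n = 50,\ z = 0.25 &:\quad i = 50(0.25) + 0.5 = 13, \\
n = 50,\ z = 0.75 &:\quad i = 50(0.75) + 0.5 = 38.
\end{align*}
```

Under this convention the quartile positions for n = 50 land exactly on whole-numbered observations, 13 and 38.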

As an alternative, consider this formula for CDF:
P(i) = (i - 1)/(n - 1)
The corresponding quantile function is
P_inv(z) = (n-1)*z + 1

With n=50:
P_inv(0.5) = 25.5
P_inv(0.25) = 13.25
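
The two conventions can be compared side by side. This sketch is illustrative only: the function names are hypothetical, and the values X(13) = X(14) = 10.9 are the order statistics quoted at the top of this post.

```python
def position_midpoint(z, n):
    """1-based position for P(i) = (i - 0.5)/n, i.e. i = n*z + 0.5."""
    return n * z + 0.5

def position_endpoint(z, n):
    """1-based position for P(i) = (i - 1)/(n - 1), i.e. i = (n - 1)*z + 1."""
    return (n - 1) * z + 1

def interpolate(x_lo, x_hi, frac):
    """Linear interpolation between two adjacent order statistics."""
    return (1 - frac) * x_lo + frac * x_hi

n = 50
for z in (0.25, 0.5):
    print(z, position_midpoint(z, n), position_endpoint(z, n))

# Position 13.25 lies a quarter of the way from X(13) to X(14).
pos = position_endpoint(0.25, n)
frac = pos - int(pos)
lower_quartile = interpolate(10.9, 10.9, frac)   # X(13) = X(14) = 10.9
print(lower_quartile)
```

Because X(13) and X(14) are equal in this data set, both conventions give about 10.9% for the lower quartile. They would differ if the two neighbouring values differed.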

CalLadyQED
Forum Guru (66 reputation)

Group: Forum Members
Posts: 62, Visits: 2
Ron,

Thanks for your help. I had noticed that we'd be taking a weighted average of two X(i)'s that were the same. However, I figure that the final may be slightly different, so I need to know why it works the way it does.

I'm probably forgetting something I learned in Stats years ago, but... where did you get that CDF formula?
jgorab17
Forum Newbie (4 reputation)

Group: Forum Members
Posts: 4, Visits: 6
Can someone please explain what they mean by averaging 2 points for the first question? I've just never used a stem and leaf plot before...
CalLadyQED
Forum Guru (66 reputation)

Group: Forum Members
Posts: 62, Visits: 2

For the median, the averaging is by definition and has little to do with this being a stem and leaf plot. When an ordered data set has an odd number of observations n, median = x((n+1)/2). When n is even, median = (x(n/2) + x(n/2 + 1)) / 2. Does that help?

[NEAS: Correct]
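
The definition above translates directly into code. A minimal sketch, with `median` as an illustrative function name and 1-based order statistics mapped onto Python's 0-based lists:

```python
def median(values):
    """Median using 1-based order statistics x(1) <= ... <= x(n)."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("median of empty data")
    if n % 2 == 1:
        # Odd n: the single middle point x((n + 1)/2).
        return x[(n + 1) // 2 - 1]
    # Even n: average x(n/2) and x(n/2 + 1).
    return (x[n // 2 - 1] + x[n // 2]) / 2

print(median([3, 1, 2]))       # odd n  -> 2
print(median([4, 1, 3, 2]))    # even n -> 2.5
```

For the homework, n = 50 is even, so the two points averaged are x(25) and x(26).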


jgorab17
Forum Newbie (4 reputation)

Group: Forum Members
Posts: 4, Visits: 6

So the number of observations here is 50, right? Then we'd be using x(25) and x(26), but what are those numbers? I see that they both equal 15.9%, but I'm not sure how you get those numbers, and I think it's just because I don't know how to read a stem and leaf plot...

 

Also, how are we supposed to tell from the ratio whether this is positively or negatively skewed? I understand how we'd know that from looking at a graph, but I can't find anything about a ratio...

[NEAS: The ratio uses the first and third quartiles.]

[NEAS: Positive and negative skew can sometimes be unclear. For this homework assignment, if the upper hinge minus the median is more than the median minus the lower hinge, the distribution is positively skewed; if the upper hinge minus the median is less than the median minus the lower hinge, the distribution is negatively skewed. The relevant ratio is (upper hinge - median) / (median - lower hinge). For exact analysis of skewness, we should evaluate the significance of this ratio, but that is not covered in the text.]
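
The rule in the note above can be written as a small check. This is a sketch under stated assumptions: the function names are hypothetical, the inputs are the median and hinge values posted earlier in the thread, and the "symmetric" label for a ratio of exactly 1 is added here for completeness and is not part of the NEAS note.

```python
def hinge_ratio(lower_hinge, median, upper_hinge):
    """(upper hinge - median) / (median - lower hinge).

    Assumes median > lower_hinge, so the denominator is not zero.
    """
    return (upper_hinge - median) / (median - lower_hinge)

def skew_direction(ratio):
    """Direction of skew under the homework's rule of thumb."""
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "negative"
    return "symmetric"   # assumption: not covered by the NEAS note

# Values posted earlier in the thread for the assault-rate data.
ratio = hinge_ratio(10.9, 15.9, 24.9)
print(round(ratio, 2), skew_direction(ratio))
```

With those inputs the ratio is 9.0 / 5.0, or about 1.8, so the upper half of the middle 50% is more spread out than the lower half.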


slocal
Forum Newbie (1 reputation)

Group: Awaiting Activation
Posts: 1, Visits: 33

Two things. First, for each row (e.g. 15 | 1699), do we read each leaf digit as the first decimal place, so that this row is 15.1, 15.6, 15.9, and 15.9?

Second, should we be using (.25)*x(12) + (.75)*x(13) for the first hinge, since 12.75 = (50+1)/4? If not, what am I missing?
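
Both questions can be checked numerically. The sketch below is illustrative, not an official answer: `read_row` and `value_at` are hypothetical helper names, leaves are assumed to be tenths (consistent with the stated range of 4.5% to 33.7%), and the (n + 1)/4 position is the convention proposed in this post rather than the hinge rule used in the answers posted earlier.

```python
def read_row(row):
    """Expand one 'stem | leaves' row, treating each leaf as a tenth."""
    stem, leaves = row.split("|")
    return [(int(stem) * 10 + int(d)) / 10 for d in leaves.strip()]

def value_at(sorted_values, pos):
    """Linearly interpolate at a 1-based, possibly fractional, position."""
    lo = int(pos)        # order statistic just below the position
    frac = pos - lo      # weight on the next order statistic
    if frac == 0:
        return sorted_values[lo - 1]
    return (1 - frac) * sorted_values[lo - 1] + frac * sorted_values[lo]

print(read_row("15 | 1699"))

# The first 14 observations come from the rows with stems 4 through 10.
lowest = []
for row in ["4 | 568", "5 | 367", "7 | 2", "8 | 136", "10 | 2699"]:
    lowest.extend(read_row(row))

pos = (50 + 1) / 4                   # 12.75
print(pos, value_at(lowest, pos))    # 0.25*x(12) + 0.75*x(13)
print(value_at(lowest, 13))          # x(13) on its own
```

Here x(12) = 10.6 and x(13) = 10.9, so the interpolated value is about 10.825%, slightly below x(13) = 10.9%. The two conventions therefore give different answers for this data set.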


bubba gump
Forum Newbie (6 reputation)

Group: Forum Members
Posts: 6, Visits: 1
Isn't the lower hinge simply point 13 (10.9) and the upper hinge point 38 (24.9)?
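
One common way to locate the hinges is by depth, counting inward from each end of the sorted data. The sketch below assumes the usual Tukey-style rule, depth(hinge) = (floor(depth(median)) + 1) / 2 with depth(median) = (n + 1) / 2. The thread does not state this rule, so treat it as an assumption and check it against the text. `hinge_positions` is a hypothetical helper name.

```python
import math

def hinge_positions(n):
    """Return 1-based positions (lower, upper) of the hinges for n points.

    Assumes depth(median) = (n + 1)/2 and
    depth(hinge) = (floor(depth(median)) + 1)/2.
    A fractional position means averaging the two neighbouring points.
    """
    median_depth = (n + 1) / 2
    hinge_depth = (math.floor(median_depth) + 1) / 2
    return hinge_depth, n - hinge_depth + 1

print(hinge_positions(50))   # (13.0, 38.0)
```

For n = 50 this gives positions 13 and 38, matching the points named above.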
Briggs
Forum Newbie (3 reputation)

Group: Forum Members
Posts: 2, Visits: 1

I'm confused by something on page 39. The subscript for the lower hinge comes from 197-49+1=149. Where does the 197 come from? There are 193 data points. I'm clearly missing something...


wb_munchausen
Forum Newbie (1 reputation)

Group: Forum Members
Posts: 1, Visits: 4

NEAS, is the 197 in the equation 197-49+1=149 (p. 39, midpage) a typo?  I thought it should be 193, since n = 193.

[NEAS: Yes, it looks like a typo.]

