quote:
Don't worry. If you'd ever like to rap about stats I'm down, and do have a MS in it....there are several other posters here that know their shite too. Just Ignore him.

Yeah. I definitely enjoy these discussions. My university offered graduate students in other fields a minor or an MS in statistics. I ended up getting the MS because I liked the courses, think it is a great skill set to have, and it makes me more marketable in my own field. And although I feel competent enough to identify inaccuracies and misrepresentations (e.g., Tuba in this thread), I realize that I am far less knowledgeable and experienced in many of these areas compared to you and other posters (Taxing Authority seems to know a lot about complex modeling). Anyways, it's usually a good learning experience.

Posted by buckeye_vol

Member since Jul 2014

35250 posts

Posted on 10/1/14 at 10:37 pm to HubbaBubba

quote:
Your assertion is seriously flawed. If you have a population of 1,000,000 and the disease rate is 0.1%, then only 1000 people will get the disease. The biggest number you left out is "what percentage of the population are suspected of having the disease and are tested?" "What percentage of those tested, test negative?" "What percentage of those who test negative are falsely negative?"

Let's assume it is .15 percent being tested (1500 people). Of those, 1000 have tested positive for the disease, 15 people had a false positive and 485 tested negative. Therefore, the chance that someone who tested positive actually has the disease is 98.5%

I will say that his example is a good thought experiment for thinking about conditional probability; however, you bring up a good point that highlights why some preventative techniques are quite costly if we just do blanket screening instead of using empirical evidence to narrow it down to individuals showing symptoms. Luckily, I think your example (1500 tested) is more consistent with reality. Tuba, though, seems to have had difficulty seeing how theory and practice relate, as evidenced by his impractical hypotheticals on top of his misrepresentations.
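
The arithmetic behind the two scenarios can be checked with a short sketch. The counts for the targeted case come from the quoted post; the blanket-screening case assumes the numbers from the original thought experiment (0.1% prevalence, 1% false positive rate, no false negatives). The function name `ppv_from_counts` is made up for illustration.

```python
def ppv_from_counts(true_positives, false_positives):
    """Share of positive results that are real cases (positive predictive value)."""
    return true_positives / (true_positives + false_positives)

# Targeted testing, using the counts from the quoted post:
# 1,000 true positives and 15 false positives among 1,500 people tested.
targeted = ppv_from_counts(1000, 15)

# Blanket screening of 1,000,000 people. Assumed here: 0.1% prevalence,
# 1% false positive rate, no false negatives (the thought experiment's numbers).
population = 1_000_000
sick = population // 1000                   # 1,000 real cases
false_alarms = (population - sick) // 100   # 9,990 healthy people flagged
blanket = ppv_from_counts(sick, false_alarms)

print(f"targeted: {targeted:.1%}, blanket: {blanket:.1%}")
# targeted: 98.5%, blanket: 9.1%
```

Under these assumptions the same test goes from roughly 9% to roughly 98.5% useful, purely because of who gets tested.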

Posted by SpidermanTUba

my house

Member since May 2004

36129 posts

Posted on 10/1/14 at 11:11 pm to buckeye_vol

quote:
. Regardless you seem to be arguing that if a sample doesn't meet each assumption of the CLM that the scores won't be normally distributed;

No. That is not what I am arguing. I am arguing that "Half the schools are below average" is not a tautology, even for large numbers of schools. That is it.

quote:
however, the underlying distribution is normal because the variables are theoretically normal and raw test scores are "transformed" into a scale that is normal (e.g., t-scores, standard scores, normal curve equivalents, etc).

Yes - IF the raw scores are transformed to conform to a normal distribution, the distribution will be normal - I certainly agree with that. It's not a presumption I was making.

It still doesn't guarantee a normal distribution of school performance as measured by average test scores, as the individuals' test scores that a school has aren't independent of one another.

Posted by SpidermanTUba

my house

Member since May 2004

36129 posts

Posted on 10/1/14 at 11:18 pm to buckeye_vol

quote:

Tests are designed to have an underlying normal distribution. Furthermore, test scores are then converted from raw scores to another scale so that the scores are truly normal.

That may or may not be true for a given exam, but it doesn't matter. School performance does not necessarily follow a normal distribution - therefore neither will the average test scores that indicate that performance.

Just look at our own public education system. It is way skewed! There are a couple of exceptionally good public high schools - the Louisiana School for Math, Science, and the Arts comes to mind - a few places with decent high schools - and a lot of places with crap schools. The distribution is in no way normal!

Now - I agree - if you transform school performance measures themselves onto a normal scale - the distribution is normal.
That would be a statement like "half the schools are below the 50th percentile" - which is a tautology. That's not what I meant though.
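
The difference between "below average" and "below the 50th percentile" can be seen in a small simulation. This is only a sketch: the lognormal shape and its parameters are assumptions chosen to mimic "a few great schools, many poor ones", not real school data.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical school-level scores from a right-skewed (lognormal) distribution:
# a handful of very high performers and a long run of low ones.
scores = [random.lognormvariate(0.0, 1.0) for _ in range(10_000)]

mean = statistics.mean(scores)
median = statistics.median(scores)

# Fraction of schools strictly below each reference point.
below_mean = sum(s < mean for s in scores) / len(scores)
below_median = sum(s < median for s in scores) / len(scores)

print(f"below mean: {below_mean:.3f}, below median: {below_median:.3f}")
```

With a skew like this, well over half of the simulated schools land below the mean, while the share below the median stays at one half by construction, even with 10,000 schools.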

quote:

Then maybe the score that represents the average will be different but since these are norm-referenced tests, they usually adjust accordingly when converted (e.g., Flynn Effect for IQ scores). Although the distribution may be slightly impacted, it should still be relatively normal because there will be ceiling and floor effects (i.e., usually can't go lower than a 200 on SAT composite or higher than an 800). Not to mention, there are so many other variables at play in your hypothetical that aren't accounted for (e.g., demographics, developmental limitations, cognitive limitations).

And it's your idea that those unaccounted for variables will just happen to add up in exactly the right way to cancel out the skewed distribution of resource allocation in the 90/10 example? Come on man!

Posted by lsu480

Downtown Scottsdale

Member since Oct 2007

92877 posts

Posted on 10/1/14 at 11:24 pm to SpidermanTUba

quote:
You'll still assert the distribution of differences in performances of different schools is due entirely to random selection of students.

I actually read all of this stupid arguing, and as much as I hate to agree with Tuba, this is the only quote that matters in all of this, no matter how smart you wannabe statisticians are.

Posted by Nuts4LSU

Washington, DC

Member since Oct 2003

25468 posts

Posted on 10/1/14 at 11:37 pm to SpidermanTUba

quote:
the statement "half the schools are below average" is NOT automatically true,

quote:
4 schools, each with the same population. Three have an average test score of 90 - while the 4th has an average test score of 70. The average is thus (3*90+70)/4 = 85. Thus only 25% of the schools in this case are below average.

With a small sample size like that, and an unusual distribution of scores like that, you are right. It is possible for much less than half of schools to be below average. But over a large sample size, and with anything but a VERY unusual distribution of scores, the number that will fall below the average will almost always be extremely close to 50%.
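
The four-school example is easy to verify directly. A minimal sketch, using only the numbers from the quoted post (the helper name `share_below_mean` is made up here):

```python
from statistics import mean

def share_below_mean(values):
    """Fraction of values strictly below the arithmetic mean."""
    avg = mean(values)
    return sum(v < avg for v in values) / len(values)

# Three schools averaging 90 and one averaging 70, as in the quoted example.
school_averages = [90, 90, 90, 70]

print(share_below_mean(school_averages))  # 0.25
```

The mean here is 85, so only the school at 70 falls below it.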

Posted by SpidermanTUba

my house

Member since May 2004

36129 posts

Posted on 10/1/14 at 11:38 pm to buckeye_vol

quote:
Furthermore, it is often difficult to get a truly random sample of an entire population and instead use convenience sampling.

The difficulty in obtaining a random sample doesn't make non-random samples random!

In this case all you need is a list of individual scores - NOT by school - but in some random order.

quote:
and there are statistical methods that are used to account for the non-independence of participants (HLM) if that is necessary.

To account for it for what purpose? Shifting enough schools around in the distribution to make it look normal when it isn't?

Posted by buckeye_vol

Member since Jul 2014

35250 posts

Posted on 10/2/14 at 7:13 am to SpidermanTUba

quote:
And it's your idea that those unaccounted for variables will just happen to add up in exactly the right way to cancel out the skewed distribution of resource allocation in the 90/10 example? Come on man!

Besides the implausibility of your (90/10) example, there are still so many other conditions at play here; resources being only one. Furthermore, we are dealing with variables that have developmental limits. You could spend a trillion dollars on one school, and you would quickly start to get diminishing returns.

Posted by buckeye_vol

Member since Jul 2014

35250 posts

Posted on 10/2/14 at 7:14 am to Nuts4LSU

quote:
With a small sample size like that, and an unusual distribution of scores like that, you are right. It is possible for much less than half of schools to be below average. But over a large sample size, and with anything but a VERY unusual distribution of scores, the number that will fall below the average will almost always be extremely close to 50%.

Exactly.

Posted by buckeye_vol

Member since Jul 2014

35250 posts

Posted on 10/2/14 at 9:19 am to SpidermanTUba

quote:
To account for it for what purpose? Shifting enough schools around in the distribution to make it look normal when it isn't?

The purpose isn’t to unnecessarily shift anything or normalize anything; in fact, it has broad applications because you can model both random effects (e.g., non-random individual effects) and fixed effects (e.g., treatment effects).

For our case, we are discussing schools but we may have individual student data. Therefore, we know that a score will be dependent on differences within schools (i.e., student-level differences) and between schools (the purpose of this discussion). We would “nest” the students within their schools and identify the variability that occurs within and between schools. Individual student scores can then vary within their schools because they are dependent on school-specific variables (dependence can be quantified by the intra-class correlation coefficient).

By then partialing out the within-school error, we can more accurately identify the quality of schools (at least based on our outcome variability). Value-added modeling, which is used to measure school and teacher performance, is essentially a special case of this model.
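
To make the intra-class correlation concrete, here is a minimal sketch. It is not HLM itself: it uses the simpler one-way ANOVA estimator, often written ICC(1), and assumes every school contributes the same number of students. The scores and the function name `icc_oneway` are made up for illustration.

```python
from statistics import mean

def icc_oneway(groups):
    """ICC(1) from a one-way ANOVA; assumes all groups have the same size."""
    g = len(groups)        # number of schools
    k = len(groups[0])     # students per school
    if any(len(grp) != k for grp in groups):
        raise ValueError("this sketch only handles equal group sizes")
    grand = mean([x for grp in groups for x in grp])
    group_means = [mean(grp) for grp in groups]
    # Mean square between schools and mean square within schools.
    ms_between = k * sum((m - grand) ** 2 for m in group_means) / (g - 1)
    ms_within = sum((x - m) ** 2
                    for grp, m in zip(groups, group_means)
                    for x in grp) / (g * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Made-up scores: three schools, three students each.
schools = [[90, 92, 88], [70, 72, 68], [80, 82, 78]]
print(round(icc_oneway(schools), 3))  # 0.961
```

A value near 1 means most of the variation in scores sits between schools rather than between students in the same school, which is exactly the non-independence being discussed.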

In the case of your discussion on normality, it is true you can find examples (especially with an N of 4) of non-normal data. We shouldn’t just blindly assume a sample is normal; we should check this assumption. That being said, not only are the variables used to measure school quality theoretically normal, but extensive studies of these variables across fields (economics, psychology, education, etc.) have verified the distribution. Therefore, we are relying on the theory that as the sample size increases, the sample distribution will converge on the probability distribution of that variable, which in this case is the normal distribution.

Overall, I think it is important to know that there will be exceptions to any rule. But your insistence on arguing the exception as if it is the norm does not take away from the legitimacy of the general statement that half the schools will be below average.

This post was edited on 10/2/14 at 9:23 am

Posted by SpidermanTUba

my house

Member since May 2004

36129 posts

Posted on 10/12/14 at 5:43 pm to SpidermanTUba

quote:

If that is indeed true - then you are correct.

In fact statistics may be the most widely misunderstood concept amongst the educated professions (scientists, doctors, etc).

Nassim Taleb makes a point of this in his book "Fooled by Randomness".

Here's a good example he gives for how doctors don't understand statistics:

Given
A) that the false positive rate for a given test for a disease is 1% - and -
B) that the known rate of that disease in the population is 0.1%
EDIT - C) there are NO false negatives

What are the chances that a patient who tests positive for the disease actually has it?

According to Taleb (I forget if he cites a source or not) - doctors will give the wrong answer almost every time.
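
For anyone who wants to check the answer, the problem is a direct application of Bayes' theorem. A minimal sketch using only the numbers stated in A, B, and C above (the function name is made up here):

```python
def positive_predictive_value(prevalence, false_positive_rate, sensitivity=1.0):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prevalence                   # sick and flagged
    false_pos = false_positive_rate * (1.0 - prevalence)  # healthy but flagged
    return true_pos / (true_pos + false_pos)

# 0.1% prevalence, 1% false positive rate, no false negatives (sensitivity = 1).
answer = positive_predictive_value(prevalence=0.001, false_positive_rate=0.01)
print(f"{answer:.1%}")  # 9.1%
```

So a positive result means roughly a 9% chance of actually having the disease, not the 99% that the false positive rate might suggest.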

Just thought I'd bring up the false positive rate problem again - as it relates to Ebola.

Given that the false positive rate for a certain rapid test is 3 in 1000:

quote:

One major issue with this kind of rapid-testing quarantine is the phenomenon of false positives. But P.C.R.-based testing for Ebola has a low false-positive rate (three per 1,000), and its accuracy could be further improved by focusing on patients who come from particular geographic regions or by using more refined questionnaires.

LINK

I would say we should only be testing planes from countries with active Ebola outbreaks, where the incidence rate exceeds 3 per 1000. Otherwise, most of the folks we isolate for the disease won't actually have it.
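
The 3-per-1000 cutoff can be worked out explicitly. This is a sketch under one assumption not stated in the quoted article: that the test has no false negatives. Here $p$ is the incidence rate among those tested and $f$ is the false positive rate.

```latex
\[
\mathrm{PPV} = \frac{p}{p + f\,(1 - p)} \;\ge\; \frac{1}{2}
\iff p \ge f\,(1 - p)
\iff p \ge \frac{f}{1 + f}
\]
\[
f = 0.003 \;\Rightarrow\; p \ge \frac{0.003}{1.003} \approx 0.00299
\]
```

So the break-even point, where a positive result is as likely to be real as not, sits at just under 3 per 1000.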

This post was edited on 10/12/14 at 5:44 pm

Posted by Vols&Shaft83