
re: Question for Statistics Gurus

Posted by Volvagia
Fort Worth
Member since Mar 2006
51935 posts
Posted on 7/13/14 at 7:47 pm to
quote:

Anything outside the cl are outliers, traditionally.



Unfortunately the underlying chemistry here is such that you can expect to see SOME outliers due to unknown factors mucking up the univariate calibration.

Part of the expertise of doing this is separating the "valid" outliers from the ones that should be excluded from the model calibration. The ones that remain are not separate enough from the rest of the group to legitimately exclude them, regardless of confidence interval.
Posted by Volvagia
Fort Worth
Member since Mar 2006
51935 posts
Posted on 7/13/14 at 7:51 pm to
FWIW, here is the cross-validation plot for the two models whose residuals I already posted:







Posted by gaetti15
AK
Member since Apr 2013
13371 posts
Posted on 7/13/14 at 7:54 pm to
The process you are doing is correct.

Cross-validation is definitely the way to go with a regression problem like this.
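
For anyone following along, here is a rough sketch of what leave-one-out cross-validation looks like for a straight-line (univariate) calibration. The data and the function names (`fit_line`, `loo_cv_residuals`, `rmsecv`) are made up for illustration and are not from the OP's model; plain least squares is assumed.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def loo_cv_residuals(x, y):
    """Residual for each point, predicted by a fit that left that point out."""
    residuals = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        residuals.append(y[i] - (intercept + slope * x[i]))
    return residuals


def rmsecv(x, y):
    """Root-mean-square error of the leave-one-out predictions."""
    res = loo_cv_residuals(x, y)
    return (sum(r * r for r in res) / len(res)) ** 0.5


# Hypothetical calibration data: concentration vs. instrument response.
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
resp = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
print(rmsecv(conc, resp))
```

Comparing this number between two candidate models is one common way to judge which one predicts better on samples it has not seen.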

If you are concerned with telling a true outlier apart from something that is wrong because of the process, I would look at the R-student (externally studentized) residuals.


ETA: If you want I can give you a reference to a professional statistician I know who loves this kind of stuff. Actually works with professors in Food Science on similar issues to yours.
These types of residuals are similar to z-scores.

If you have rstudent values beyond about +/- 2.5, that means a residual that large would show up by chance only about 1% of the time if the model is right: P(|Z| >= 2.5) is roughly 0.012, or about 0.006 in each tail.
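
A minimal sketch of how the rstudent values can be computed by hand for a straight-line fit, in case the software does not report them. The data are invented, `rstudent` and `two_sided_normal_tail` are illustrative names, and the tail probability uses the normal approximation (strictly these residuals follow a t distribution with n - 3 degrees of freedom for a two-parameter line, so with few points the true tail is somewhat fatter).

```python
import math
from statistics import mean


def rstudent(x, y):
    """Externally studentized residuals for the fit y = intercept + slope * x."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    out = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - mx) ** 2 / sxx            # leverage of this point
        # Residual variance with this point deleted (n - p - 1 df, p = 2).
        s2_deleted = (sse - e * e / (1.0 - h)) / (n - 3)
        out.append(e / math.sqrt(s2_deleted * (1.0 - h)))
    return out


def two_sided_normal_tail(z):
    """P(|Z| >= z) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical data with one suspicious point at index 5.
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
resp = [2.0, 4.1, 5.9, 8.0, 10.1, 15.0, 14.0, 16.1]
t_vals = rstudent(conc, resp)
flagged = [i for i, t in enumerate(t_vals) if abs(t) > 2.5]
print(flagged)
print(two_sided_normal_tail(2.5))
```

The cutoff only flags candidates. Whether a flagged point is actually dropped from the calibration is still a judgment call about the chemistry.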

This post was edited on 7/13/14 at 7:57 pm