re: Question for Statistics Gurus
Posted on 7/13/14 at 7:47 pm to Winkface
quote:
Anything outside the cl are outliers, traditionally.
Unfortunately the underlying chemistry here is such that you can expect to see SOME outliers due to unknown factors mucking up the univariate calibration.
Part of the expertise of doing this is separating the "valid" outliers from the ones that should be excluded from the model calibration. The ones that remain are not separate enough from the rest of the group to legitimately exclude them, regardless of confidence interval.
Posted on 7/13/14 at 7:51 pm to Volvagia
FWIW, here is the cross-validation plot for the two models whose residuals I already posted:
Posted on 7/13/14 at 7:54 pm to Volvagia
The process you are doing is correct.
Cross-validation is definitely the way to go with a regression problem like this.
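For anyone following along, here is a rough sketch of what leave-one-out cross-validation looks like for a straight-line calibration. The function name `loo_cv_rmse` and the data are made up for illustration, and the straight-line model is an assumption, not necessarily the model used in this thread.

```python
import numpy as np

def loo_cv_rmse(x, y):
    """Leave-one-out CV for a straight-line fit y = intercept + slope * x.

    Returns the cross-validated RMSE and the per-point prediction errors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                    # drop observation i
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        errors[i] = y[i] - (intercept + slope * x[i])  # error on the held-out point
    return float(np.sqrt(np.mean(errors ** 2))), errors

# Hypothetical calibration data, purely illustrative
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.1])
rmse, errors = loo_cv_rmse(x, y)
print(rmse)
```

Each point is predicted by a model that never saw it, so a point that only fits well because it dragged the line toward itself shows up with a large held-out error.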
If you are trying to tell the difference between a true outlier and something that would be considered wrong because of the process, I would look at the r-studentized residuals.
ETA: If you want I can give you a reference to a professional statistician I know who loves this kind of stuff. Actually works with professors in Food Science on similar issues to yours.
These types of residuals are similar to z-scores.
If you have rstudent values beyond about +/- 2.5, a residual that large is unlikely if the model is right: under a standard normal approximation, P(Z >= 2.5) is about 0.006, or about 0.012 counting both tails.
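A minimal sketch of how r-studentized (externally studentized) residuals can be computed by hand, assuming a straight-line fit with an intercept. The names `rstudent` and `normal_two_sided_tail` and the data are hypothetical. The normal tail is only the z-score approximation mentioned above; strictly, these residuals follow a t distribution with n - p - 1 degrees of freedom.

```python
import math
import numpy as np

def rstudent(x, y):
    """Externally studentized residuals for the fit y = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    X = np.column_stack([np.ones(n), x])            # design matrix with intercept
    p = X.shape[1]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    # Leverage h_i: diagonal of the hat matrix X (X'X)^-1 X'
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    sse = resid @ resid
    # Residual variance re-estimated with observation i left out
    s2_deleted = (sse - resid ** 2 / (1 - h)) / (n - p - 1)
    return resid / np.sqrt(s2_deleted * (1 - h))

def normal_two_sided_tail(z):
    """P(|Z| >= z) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical data: a clean line plus small noise, with one point pushed off
x = np.arange(10, dtype=float)
noise = np.array([0.1, -0.1, 0.05, -0.05, 0.1, 0.0, -0.1, 0.05, -0.05, 0.1])
y = 2.0 * x + 1.0 + noise
y[5] += 5.0                                         # the planted outlier

t = rstudent(x, y)
flagged = np.flatnonzero(np.abs(t) > 2.5)           # same ~2.5 cutoff as above
print(flagged, normal_two_sided_tail(2.5))
```

Because each residual is scaled by a variance estimate that excludes its own point, a single bad observation cannot inflate the scale and hide itself, which is the reason to prefer these over raw or internally standardized residuals when screening outliers.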
This post was edited on 7/13/14 at 7:57 pm