Started By
Message

Excel help

Posted on 9/15/15 at 9:07 am
Posted by Sigma
Fairhope, AL
Member since Dec 2005
3643 posts
Posted on 9/15/15 at 9:07 am
I have two columns of data to be plotted in simple Y vs X fashion:

X
8.105
7.620
7.186
6.688
6.339
5.772

Y
1260
3790
10050
30300
60450
189300

I then need to fit the data to an appropriate regression, and I need that equation.

Using that equation, I would like to use an X value of 6.270 to calculate the corresponding Y value.

When I use a 4th order polynomial (R=0.9999) I get a Y value of -3x10^6, which cannot be correct. However when I use a power fit, I get the reasonable value of 6.58x10^4. Why is the calculation using the polynomial fit so off? I've done the calculation in both Excel and by hand using a calculator.

I'm sure I'm missing something very basic, so apologies in advance.
Posted by OysterPoBoy
City of St. George
Member since Jul 2013
35215 posts
Posted on 9/15/15 at 9:08 am to
Turn it off then turn it on again.
Posted by The Boob
Member since Mar 2010
767 posts
Posted on 9/15/15 at 9:09 am to
The answer is 288 (or 2, depending on whether you multiply or divide first)
Posted by musick
the internet
Member since Dec 2008
26125 posts
Posted on 9/15/15 at 9:09 am to
Help Board
Posted by Sigma
Fairhope, AL
Member since Dec 2005
3643 posts
Posted on 9/15/15 at 9:12 am to
quote:

The answer is 288 (or 2, depending on whether you multiply or divide first)


First thing I thought of before posting
Posted by Sigma
Fairhope, AL
Member since Dec 2005
3643 posts
Posted on 9/15/15 at 9:16 am to
quote:

Turn it off then turn it on again.


Generally, my first solution to every problem

But seriously I've done this on two different computers running two different version of excel.
Posted by Pintail
Member since Nov 2011
10460 posts
Posted on 9/15/15 at 9:22 am to
ETA: Don't know what Excel just did... that was wrong
This post was edited on 9/15/15 at 9:24 am
Posted by pioneerbasketball
Team Bunchie
Member since Oct 2005
132409 posts
Posted on 9/15/15 at 9:24 am to
quote:

Turn it off then turn it on again.
Posted by ksayetiger
Centenary Gents
Member since Jul 2007
68319 posts
Posted on 9/15/15 at 9:26 am to
X=about tree fiddy
Posted by Pintail
Member since Nov 2011
10460 posts
Posted on 9/15/15 at 9:31 am to
=5*10^12*EXP(-2.878*X)
This post was edited on 9/15/15 at 9:34 am
Posted by NoFlexZone
Member since Sep 2015
429 posts
Posted on 9/15/15 at 9:31 am to
quote:

When I use a 4th order polynomial (R=0.9999) I get a Y value of -3x10^6, which cannot be correct. However when I use a power fit, I get the reasonable value of 6.58x10^4. Why is the calculation using the polynomial fit so off? I've done the calculation in both Excel and by hand using a calculator.

I'm sure I'm missing something very basic, so apologies in advance.
First of all just because you got an R = 0.999 doesn't mean your line is a good fit. People who throw data into Excel and quote this number don't know jack.

The reason power function is a better fit is rooted in the nature of your data. Generally biological data that shows growth is best fitted using a power function.
Posted by Sigma
Fairhope, AL
Member since Dec 2005
3643 posts
Posted on 9/15/15 at 9:59 am to
quote:

First of all just because you got an R = 0.999 doesn't mean your line is a good fit. People who throw data into Excel and quote this number don't know jack.

The reason power function is a better fit is rooted in the nature of your data. Generally biological data that shows growth is best fitted using a power function.


If you have time, can you explain?
Posted by NoFlexZone
Member since Sep 2015
429 posts
Posted on 9/15/15 at 10:32 am to
I'll start of by mentioning again that how your data fits a function is rooted in the nature of the data. If you're looking at an association between the length of fish and weight, it turns out fitting a higher degree polynomial provides a good fit. Why? If you cube length, you get...volume and volume has a decent association with weight of fish.

On the other hand, biological data (e.g. population growth, bacteria growth, or decay) and financial data (e.g. interest on loan) and Moore's law(number of transistors on a chip), have a better fit when the relationship is a power function (logarithms, exponential functions).

I'm curious, what is your data about? Just curious - don't need much details.
Posted by Sigma
Fairhope, AL
Member since Dec 2005
3643 posts
Posted on 9/15/15 at 10:37 am to
Thanks.

It's the relationship between the size of a particle (Y) and the time it takes for the particle to move through a volume (X).

ETA: size in terms of molecular weight
This post was edited on 9/15/15 at 10:41 am
Posted by K E V 8 4
Member since Jul 2010
608 posts
Posted on 9/15/15 at 11:16 am to
I get 68,167 using an exponential function. Just using the "Add trend line" in XLS. The equation is 5E+10e^-2.154x

R Sq = 0.9991
Posted by NoFlexZone
Member since Sep 2015
429 posts
Posted on 9/15/15 at 11:23 am to
Interesting association.

I'm sure you're aware of R-squared, residual plots, etc - check those.

If you're trying to make a prediction, I'd steer away from using high degree polynomials. I don't know if this is something that is common in your line of work but high degree polynomials are pretty misleading.
Posted by Sigma
Fairhope, AL
Member since Dec 2005
3643 posts
Posted on 9/15/15 at 12:50 pm to
quote:

I get 68,167 using an exponential function. Just using the "Add trend line" in XLS. The equation is 5E+10e^-2.154x R Sq = 0.9991


This was part of my problem. Several other types of regressions would work, just the calculation using the 4th order fit was crazy.

quote:

If you're trying to make a prediction, I'd steer away from using high degree polynomials. I don't know if this is something that is common in your line of work but high degree polynomials are pretty misleading.


My knowledge of statistics is admittedly poor, but the plotted data are very smooth. I can read off the Y value using the given X value and be within 5%. I was just looking to understand the math I suppose. Thanks for your help.
Posted by KG6
Member since Aug 2009
10920 posts
Posted on 9/15/15 at 1:12 pm to
I've gotten some funny numbers over time trying to fit equations to curves in Excel. I like you don't really understand it. I can understand that you may not be able to get an extremely accurate answer, but sometimes the equation it spits out is just BS.
Posted by K E V 8 4
Member since Jul 2010
608 posts
Posted on 9/15/15 at 1:40 pm to
I think I might see what's happening. A four-order poly gives the following (and the line is perfectly through the points):

y = 15297x4 - 457991x3 + 5E+06x2 - 3E+07x + 5E+07
R² = 0.9999

The x4, x3, etc. are actually x^4, x^3, etc.

Yet.....I plugged in 6.27 and get some nonsensical answer- negative 30.8 million. I think the XLS reported coefficients are not providing enough significant digits and that is what's messed up. Note the 5.00 million, 30.0 million and 50.0 million coefficients. The rounded (truncated) diffs is what's busting the equation. Would need to do the calcs manually (or program) to get the precise coefficients. An XLS bug, it appears. Somebody shout if that's not it.
Posted by K E V 8 4
Member since Jul 2010
608 posts
Posted on 9/15/15 at 1:43 pm to
Solution: divide all of your Y data points by 10,000. Regress. Then take result of equation and multiply by 10,000. The equation and RSq are as follows:

y = 1.5297x4 - 45.799x3 + 515.07x2 - 2580.4x + 4862.3
R² = 0.9999



Answer: 71,386
This post was edited on 9/15/15 at 1:51 pm
first pageprev pagePage 1 of 2Next pagelast page

Back to top
logoFollow TigerDroppings for LSU Football News
Follow us on Twitter, Facebook and Instagram to get the latest updates on LSU Football and Recruiting.

FacebookTwitterInstagram