LeoStatistic software for data presentation, statistical analysis, marketing and prediction.
Multivariate regression.
When the fitting function used to model experimental data has more than one independent argument, we speak of multivariate regression. For multivariate analysis LeoStatistic offers linear and parabolic approximation, the near neighbors method, and fitting by a user defined formula. All of this presumes that the status of the parameters in the data series is set so that the independent parameters are assigned as arguments and the modeled parameter as the value. Go to the "Statistics" tab of the control panel.
Check the corresponding control to select the modeling method used to approximate the data.

Linear multivariate equation:
y = a0 + a1*x1 + a2*x2 + ... + an*xn (1)
where ai are the fitted approximation coefficients and xi is the i-th argument. LeoStatistic also finds the standard deviation of each fitted coefficient.

Parabolic multivariate equation:
y = a0 + a1*(x1+b1)^2 + a2*(x2+b2)^2 + ... + an*(xn+bn)^2 (2)
where ai and bi are the fitted approximation coefficients and xi is the i-th argument. The coefficients bi give the position of the extremum along each argument: for xi it lies at xi = -bi. It is important to understand that the point (-b1, -b2, ... -bn) is a global extremum in n-dimensional space only if all ai coefficients have the same sign: a global minimum if they are all positive, a global maximum if they are all negative. If the signs are mixed, the variable y has no global extremum.

In the case of two arguments, multivariate approximation with the linear equation is equivalent to fitting a plane in three-dimensional space, and it is shown as a plane mesh in the result panel. The parabolic approximation is likewise equivalent to fitting a surface, which has either a paraboloid or a saddle shape.

For more than two arguments, LeoStatistic presents multivariate regression as an x-y chart with experimental values along the x-axis and theoretical values along the y-axis. For an ideal fit by the theoretical formula, all points on the chart lie on the 0 - 1 diagonal that is shown on the chart. The larger the dispersion of the points around the diagonal, the worse the approximation.

Near neighbors estimation.
This method is based on the presumption that we have no advance knowledge about
the mutual dependence between the variables. One can assume that, for non-sporadic
data, the closer a point in multidimensional space lies to other points, the more
reason there is to expect that their values will be approximately the same. Another
way to describe the method is to say that the most probable value at a point in
n-dimensional space is a weighted average of the values of the closest points
around it. LeoStatistic implements several schemes for this calculation. The
distance dpi in n-dimensional space between the probe point and the i-th point
is calculated by a formula in which the summation runs over all n arguments.
There is also an option to take into consideration only a specified part of the
closest records.
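
The idea of a distance-weighted average over the closest records can be sketched in Python as follows. This is only an illustration under stated assumptions, not LeoStatistic's actual implementation: the Euclidean distance, the 1/distance weighting, and the names `near_neighbors_estimate` and `fraction` are all hypothetical choices made for the example.

```python
import math


def euclidean_distance(probe, point):
    """Distance between two points, summing over all n arguments.

    Euclidean distance is an assumption; the original formula is not known.
    """
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(probe, point)))


def near_neighbors_estimate(probe, points, values, fraction=1.0, eps=1e-12):
    """Estimate the value at `probe` as a weighted average of nearby records.

    Weights are 1/distance (an assumed scheme). `fraction` is a hypothetical
    parameter that keeps only that share of the closest records.
    """
    if not points or len(points) != len(values):
        raise ValueError("points and values must be non-empty and equal in length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    # Pair every record with its distance to the probe, closest first.
    ranked = sorted(
        ((euclidean_distance(probe, pt), val) for pt, val in zip(points, values)),
        key=lambda pair: pair[0],
    )
    keep = max(1, math.ceil(fraction * len(ranked)))
    nearest = ranked[:keep]

    # If the probe coincides with a record, return that record's value
    # instead of dividing by zero.
    if nearest[0][0] < eps:
        return nearest[0][1]

    weights = [1.0 / dist for dist, _ in nearest]
    weighted_sum = sum(w * val for w, (_, val) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

For a probe point at the center of four records placed on the corners of a square, all distances are equal, so the sketch returns the plain average of the four values. Lowering `fraction` restricts the average to the closest records only.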
In general terms, the closest everyday analog of the near neighbors method is
estimating the height of some point from numerous measurements taken all around
it. One can presume that it will be the average of the heights of the nearby
measured points. For a plain-like landscape this is quite reasonable; for
mountains it is too, but the measured points have to be much denser. The visual
presentation of the result is the same as for the linear and parabolic
regression described above.

User defined formula.
Clicking the "User defined formula" button opens the free format interface. It
is almost identical to the analogous interface for curve fitting with a user
defined formula of one argument, except that instead of using "x" as a
substitute argument name, in the multivariate case the user has to put the
actual names of the arguments in the fitting formula. This adds another natural
limitation: only one data set can be used. A typical user interface for a free
format formula in the case of multivariate approximation is shown in the image.
The user has to enter a fitting formula that contains the names of the
arguments and the fitting coefficients. Fitting coefficients can be added,
edited and deleted. Clicking the "Run fitting" button starts one of the
incorporated algorithms for finding the best values of the coefficients, in the
sense of the minimum sum of squared deviations between the values calculated by
the user formula and the experimental values of the parameter it defines.

Several "Fitting schemes" are incorporated in LeoStatistic. All of them are
based on the idea of looking around a given set of fitting coefficients in
n-dimensional space, where n is the number of fitting coefficients. As soon as
one of the attempts finds a better fit, that point is taken as the new vantage
point and the search continues. The schemes differ in how the next point to
check is chosen, and the result depends on the profile of the deviation in the
multidimensional space of coefficients.

The procedure to find the best fit stops automatically when the absolute values
of all Steps become less than the Stop condition given by the user, or the user
can break the process manually.
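
A search of this kind can be sketched in Python as a simple compass (coordinate) search. This is a hedged illustration, not one of LeoStatistic's actual fitting schemes: the step-halving rule, the function names, and the parameters `step`, `stop` and `max_iter` are assumptions made for the example. The model refers to arguments by their actual names, as the free format interface requires.

```python
def sum_squared_deviations(model, coeffs, rows, observed):
    """Sum of squared differences between model values and experimental values."""
    return sum((model(row, coeffs) - y) ** 2 for row, y in zip(rows, observed))


def compass_search(model, start, rows, observed, step=1.0, stop=1e-6, max_iter=100000):
    """Look around the current coefficients and move whenever the fit improves.

    Each coefficient has its own step. When neither direction improves the
    fit, that step is halved (an assumed rule). The search ends when all
    steps are smaller than `stop`, or after `max_iter` sweeps.
    """
    best = list(start)
    steps = [step] * len(best)
    best_err = sum_squared_deviations(model, best, rows, observed)

    for _ in range(max_iter):
        if all(abs(s) < stop for s in steps):
            break
        for i in range(len(best)):
            improved = False
            for direction in (1.0, -1.0):
                trial = list(best)
                trial[i] += direction * steps[i]
                err = sum_squared_deviations(model, trial, rows, observed)
                if err < best_err:
                    # Better fit found: take this point as the new vantage point.
                    best, best_err = trial, err
                    improved = True
                    break
            if not improved:
                steps[i] *= 0.5
    return best, best_err


def example_model(row, c):
    """Hypothetical user formula: z = c0 + c1 * x * y, with named arguments."""
    return c[0] + c[1] * row["x"] * row["y"]
```

As a usage example, calling `compass_search(example_model, [0.0, 0.0], rows, observed)` with `rows` given as a list of dictionaries such as `{"x": 1.0, "y": 2.0}` returns the fitted coefficients and the remaining sum of squared deviations. A search like this can stall in a local minimum for less regular formulas, which is one reason a tool might offer several schemes and a manual stop.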