Curve fitting to a set of data
By Michigan Tech's Open Sustainability Technology Lab.
- 1 Introduction
- 2 Theory
- 3 Curve fitting Programs
- 4 Residuals
- 5 Sources and Further Reading
Introduction
To determine how a data set can be modeled as a mathematical curve, a curve fitting approach can be used. This approach is often taken with experimental data to find the function that best represents the observations. The best fit is defined quantitatively as the curve that minimizes the difference between the data and the curve.
Curve fitting is based on the underlying assumption that the observed data is driven by some process that can be modeled as a mathematical function. The small differences that arise between the observations and predicted values are then due to measurement errors and uncontrolled influencing factors. The driving mathematical function can be linear or non-linear, and different approaches to curve fitting can be undertaken depending on the type of function being fit to the data.
Theory
The main theory behind curve fitting revolves around minimizing the sum of the squares of the residuals, where the residual for each data point is the difference between the observed value and the value predicted by the fitted function. This approach is known as the method of least squares. The goodness of fit can be measured in several ways, including the coefficient of determination and the chi-square test.
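The quantities named above can be written out as follows. This is a sketch in assumed notation, not taken from the article: n data points (x_i, y_i), a model f with m parameters B_1 ... B_m, the mean of the observed values \bar{y}, and an assumed measurement uncertainty \sigma_i for each point.

```latex
% Residual of data point i (observed minus predicted)
r_i = y_i - f(x_i;\, B_1, \dots, B_m)

% Least squares: choose the parameters that minimize the sum of squared residuals
S(B_1, \dots, B_m) = \sum_{i=1}^{n} r_i^{2}

% Coefficient of determination
R^{2} = 1 - \frac{\sum_{i=1}^{n} r_i^{2}}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^{2}}

% Chi-square statistic, assuming each point has a known uncertainty \sigma_i
\chi^{2} = \sum_{i=1}^{n} \left( \frac{r_i}{\sigma_i} \right)^{2}
```

An R^2 close to 1 means the curve accounts for most of the variation in the data. The chi-square form is only meaningful when the uncertainties \sigma_i are actually known or can be estimated.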
For linear functions, the best-fit solution has a closed form that can be solved directly. For non-linear functions, however, the solution typically has to be reached through an iterative approach. For linear least squares methods, a function is considered linear if it is linear in its coefficients (the parameters B1, B2, B3...); a polynomial is a common example, because each coefficient simply multiplies a fixed power of x.
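As an illustration of the closed-form case, here is a minimal Python sketch that fits the straight line y = B1 + B2*x by ordinary least squares. The function name `fit_line` and the sample data are made up for this example.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = b1 + b2 * x.

    Returns (b1, b2). No iteration is needed because the model is
    linear in its parameters.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b2 = sxy / sxx               # slope
    b1 = mean_y - b2 * mean_x    # intercept
    return b1, b2


# Hypothetical measurements, roughly following y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
b1, b2 = fit_line(xs, ys)
print(f"intercept B1 = {b1:.2f}, slope B2 = {b2:.2f}")
```

The same idea extends to any model that is linear in its parameters, such as higher-order polynomials, although those are normally solved with a matrix routine rather than by hand-written sums.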
For non-linear functions, an iterative non-linear least squares approach is used to converge to the best fit curve. Several algorithms have been formulated to help the solution converge. One of the most popular is the Levenberg-Marquardt algorithm, which blends the Gauss-Newton algorithm with gradient descent to converge to an optimal solution.
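The iterative idea can be sketched in plain Python for a two-parameter model. This is a simplified illustration, not a production implementation. The model y = a*exp(b*x), the synthetic noise-free data, the starting guess, and all function names are assumptions made for this example. The damping rule used here is the common variant that scales the diagonal of J^T J.

```python
import math


def sum_sq(xs, ys, a, b):
    """Sum of squared residuals for the model y = a * exp(b * x)."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))


def levenberg_marquardt(xs, ys, a, b, lam=1e-3, max_iter=200, tol=1e-12):
    """Simplified Levenberg-Marquardt loop for y = a * exp(b * x)."""
    cost = sum_sq(xs, ys, a, b)
    for _ in range(max_iter):
        # Accumulate J^T J (2x2, symmetric) and J^T r (2-vector)
        jaa = jab = jbb = ga = gb = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e        # residual
            da = e               # d(model)/da
            db = a * x * e       # d(model)/db
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * r
            gb += db * r
        # Damped normal equations: (J^T J + lam * diag(J^T J)) step = J^T r
        maa = jaa * (1.0 + lam)
        mbb = jbb * (1.0 + lam)
        det = maa * mbb - jab * jab
        if det == 0:
            break
        step_a = (mbb * ga - jab * gb) / det
        step_b = (maa * gb - jab * ga) / det
        if abs(step_a) + abs(step_b) < tol:
            break
        try:
            new_cost = sum_sq(xs, ys, a + step_a, b + step_b)
        except OverflowError:
            new_cost = math.inf  # treat an exploding trial step as rejected
        if new_cost < cost:
            # Accept the step and reduce damping (closer to Gauss-Newton)
            improvement = cost - new_cost
            a, b, cost = a + step_a, b + step_b, new_cost
            lam /= 10.0
            if improvement < tol:
                break
        else:
            # Reject the step and increase damping (closer to gradient descent)
            lam *= 10.0
    return a, b, cost


# Synthetic, noise-free data generated from a = 2.0, b = 0.5
xs = [0.5 * i for i in range(9)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a_fit, b_fit, final_cost = levenberg_marquardt(xs, ys, a=1.0, b=0.1)
print(f"a = {a_fit:.4f}, b = {b_fit:.4f}")
```

Small damping makes each step behave like Gauss-Newton, while large damping shortens the step and turns it toward the steepest-descent direction. This is why the method tolerates poor starting guesses better than Gauss-Newton alone.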
There are many Internet resources documenting the derivation of these algorithms and the mathematics associated with them. The different algorithms vary in complexity and speed of computation/convergence. As the complex mathematics are detailed in depth on the Internet, this article will focus more on user-friendly software methods of curve fitting. See the Further Reading section at the bottom of this page for more on the theory of curve fitting and in-depth articles on the different methods of achieving the best fit.
Curve fitting Programs
With the advent of computer numerical simulation software, curve fitting is rarely done without computers due to the ease and speed that can be achieved through computer calculations. There are many options available for computer curve fitting of data. This article will briefly look at a few options, each suited for different applications and users.
- Microsoft Excel is a commonly used spreadsheet program that can perform basic curve fitting.
- The following standard function types can be fit using Excel: exponential, linear, logarithmic, polynomial and power.
- To use the standard curve fitting function, graph the data using a scatter plot and right-click the data points, selecting 'Add Trendline'.
- For more advanced curve fitting, including fitting non-standard functions, the Solver add-in in Excel can be used. Specific directions for more advanced Solver functionality in Excel are found in this PDF file.
- Origin is a spreadsheet and data analysis program popularly used by the scientific community.
- Origin offers powerful data analysis capabilities including advanced curve fitting functionality. The program has around 200 built-in functions that can be fit, and offers the ability to easily create new user-defined functions to fit.
- The program offers powerful non-linear fitting, global variable fitting and an easy visual interface.
- To use the curve fitting functionality, graph the data and select a curve fitting option from the 'Analysis' menu.
- MATLAB is a numerical computing platform that is widely used in scientific and engineering applications. It is an extremely powerful and flexible program, but it requires some training and programming knowledge.
- MATLAB allows a user to write custom scripts and programs and offers a variety of built-in functionality.
- In terms of curve fitting, a custom program can be written or MATLAB's Curve Fitting Toolbox can be used.
- A free add-in toolbox called EzyFit is a powerful utility that simplifies the curve fitting process.
- MATLAB is the most powerful of the three commercial programs listed above (Excel, Origin and MATLAB), but its complexity and cost may deter some users.
- Scilab is an open source, cross-platform numerical computational package and a high-level, numerically oriented programming language. It can be used for signal processing, statistical analysis, image enhancement, fluid dynamics simulations, numerical optimization, and modeling and simulation of explicit and implicit dynamical systems. MATLAB code, which is similar in syntax, can be converted to Scilab. Scilab is one of several open source alternatives to MATLAB.
fitteia.com: online fitting of user-defined n-dimensional functions
- No fees, advertisements or requests for donations
- User accounts and project sharing
- n-dimensional function fitting, data plotting, report writing, programmable calculator
- Publication-ready plots
- Professionally oriented interface
- Accessible from any browser
Zunzun.com: online curve and surface fitting
- No fees, advertisements or requests for donations.
- Performs both 2D curve fitting and 3D surface fitting with data histograms, error histograms, error plots, curve plots, surface plots, contour plots, and PDF file output.
- Automatically generates source code output in the C++, Java, C#, Python, VBA, Scilab and MATLAB computer languages.
- Fits data to tens of thousands of equations, with several hundred named standard equations available. Users can also define their own functions.
- The online resource is freely available 24 hours a day from any computer with an internet connection at http://zunzun.com/
- The actual site fitting source code is available under a liberal BSD-style license on the source code repository at https://bitbucket.org/zunzuncode/zunzunsite3
Residuals
The residual of a curve fit for each data point is the difference between the observed value and the value predicted by the fitted function. Plotting the residuals provides a visual method of verifying the fit of the curve. Residuals from a good fit should scatter randomly about zero; any trend in the residuals plot can indicate a non-random error, an unaccounted trend or a poor choice of model.
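A short Python sketch shows why residuals are worth inspecting even when a summary statistic looks good. It deliberately fits a straight line to data that are actually quadratic. The data, the helper names, and the sign-change count (a crude stand-in for looking at a residuals plot) are all assumptions made for this example.

```python
def fit_line(xs, ys):
    """Closed-form least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sign_changes(values, eps=1e-12):
    """Count how often consecutive non-zero values switch sign."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > eps]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# Hypothetical data that really follow y = x^2, fitted with the wrong model
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)

# Residual = observed value minus value predicted by the fitted line
res = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Coefficient of determination for the same fit
mean_y = sum(ys) / len(ys)
ss_res = sum(r * r for r in res)
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1.0 - ss_res / ss_tot

changes = sign_changes(res)
print("residuals:", res)
print(f"R^2 = {r_squared:.3f}, sign changes = {changes}")
```

For this example the coefficient of determination is about 0.92, which looks acceptable on its own. The residuals, however, run positive, then negative, then positive again. That U-shaped pattern is the kind of trend that points to a missing term in the model, here the quadratic one.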
Sources and Further Reading
- Gerald, C., Wheatley, P. "Applied Numerical Analysis (Fourth edition)", (1989) Addison-Wesley Publishing Co.
- Press, W. H. "Numerical Recipes in C (2nd edition)", (1992) Cambridge University Press.
- Moisy, F. "EzyFit", (2009) Accessible at http://www.fast.u-psud.fr/ezyfit/
- Sebastião, P. J. "The art of model fitting to experimental results", (2014) Eur. J. Phys. 35 015017, doi:10.1088/0143-0807/35/1/015017
Suggested reading for further in-depth discussion of curve fitting:
Curve Fitting Made Easy by Marko Ledvij
Curve Fitting in Microsoft Excel by William Lee
Curve Fitting, Data Modelling, Interpolation, Extrapolation by Mathcom Solutions
Curve Fitting in MatLab (Video) by Jake Blanchard
Online data fit through your browser by LelandStanfordJunior
"The art of model fitting to experimental results" by Pedro José Sebastião