非线性最小二乘

阅读: 评论:0

非线性最小二乘

非线性最小二乘

非线性最小二乘

目录

文章目录

  • 非线性最小二乘
    • 目录 @[toc]
    • 1 非线性最小二乘估计
    • 3 非线性最小二乘实现

1 非线性最小二乘估计

在经典最小二乘法估计中,假定被解释变量的条件期望是关于参数的线性函数,例如
E ( y ∣ x ) = a + b x E(y|x) = a+bx E(y∣x)=a+bx
其中 a , b a,b a,b为待估参数, E ( y ∣ x ) E(y|x) E(y∣x)是关于参数 a , b a,b a,b的线性函数。但 E ( y ∣ x ) E(y|x) E(y∣x)是关于参数的非线性函数,则利用ols求出的正规方程组没有解析解。只能通过相关数值计算。考虑一个简单的非线性模型
Y i = β X 1 i + β 2 X 2 i + ε i Y_{i}=beta X_{1 i}+beta^{2} X_{2 i}+varepsilon_{i} Yi​=βX1i​+β2X2i​+εi​
其中扰动项 ε i varepsilon_i εi​满足$ mathrm{E}left(varepsilon_{i}right)=0, operatorname{var}left(varepsilon_{i}right)=sigma^{2}$,且为独立同分布。其残差平方和为
S ( β ) = ∑ i = 1 n ε i 2 = ∑ i = 1 n [ Y i − f ( X i , β ) ] 2 = ∑ i = 1 n [ Y i − β X 1 i − β 2 X 2 i ] 2 begin{aligned} S(beta) &=sum_{i=1}^{n} varepsilon_{i}^{2}=sum_{i=1}^{n}left[Y_{i}-fleft(X_{i}, betaright)right]^{2} \ &=sum_{i=1}^{n}left[Y_{i}-beta X_{1 i}-beta^{2} X_{2 i}right]^{2} end{aligned} S(β)​=i=1∑n​εi2​=i=1∑n​[Yi​−f(Xi​,β)]2=i=1∑n​[Yi​−βX1i​−β2X2i​]2​
为了使回归线尽可能接近观测值,要求残差平方和最小。根据微积分的知识
d S d β = 2 ∑ i = 1 n [ Y i − f ( X i , β ) ] ( − d f ( X i , β ) d β ) = 2 ∑ i = 1 n [ Y i − β X 1 i − β 2 X 2 i ] [ − X 1 i − 2 β X 2 i ] = 0 begin{aligned} frac{mathrm{d} S}{mathrm{~d} beta} &=2 sum_{i=1}^{n}left[Y_{i}-fleft(X_{i}, betaright)right]left(-frac{mathrm{d} fleft(X_{i}, betaright)}{mathrm{d} beta}right) \ &=2 sum_{i=1}^{n}left[Y_{i}-beta X_{1 i}-beta^{2} X_{2 i}right]left[-X_{1 i}-2 beta X_{2 i}right]=0 end{aligned}  dβdS​​=2i=1∑n​[Yi​−f(Xi​,β)](−dβdf(Xi​,β)​)=2i=1∑n​[Yi​−βX1i​−β2X2i​][−X1i​−2βX2i​]=0​
整理得:
2 β 3 ∑ i = 1 n X 2 i 2 + 3 β 2 ∑ i = 1 n X 1 i X 2 i + β ( ∑ i = 1 n X 1 i 2 − 2 ∑ i = 1 n X 2 i Y i ) − ∑ i = 1 n X 1 i Y i = 0 2 beta^{3} sum_{i=1}^{n} X_{2 i}^{2}+3 beta^{2} sum_{i=1}^{n} X_{1 i} X_{2 i}+betaleft(sum_{i=1}^{n} X_{1 i}^{2}-2 sum_{i=1}^{n} X_{2 i} Y_{i}right)-sum_{i=1}^{n} X_{1 i} Y_{i}=0 2β3i=1∑n​X2i2​+3β2i=1∑n​X1i​X2i​+β(i=1∑n​X1i2​−2i=1∑n​X2i​Yi​)−i=1∑n​X1i​Yi​=0
这是关于参数 β beta β的三次函数。尽管三次函数存在解析解(利用卡丹或盛金公式),其结果极为复杂。若上述三次方程存在实根 β i ( i = 1 , 2 , 3 ) beta_i(i=1,2,3) βi​(i=1,2,3)(最多三个),则将 β i beta_i βi​代入残差平方和,取 S ( β ) S(beta) S(β)最小所对应的 β i beta_i βi​。上述例子中,被解释变量条件期望是关于参数的二次函数,如果将这种函数形式改为指数、对数或三角函数形式,则一般不存在解析解。


因此,数值分析自然成为解决上述问题的有力武器。考虑一般化的非线性回归问题,设总体回归模型满足
Y = f ( X , β ) + ε Y=f(X, beta)+varepsilon Y=f(X,β)+ε
对应的残差平方和为
S ( β ) = ∑ i = 1 n [ Y i − f ( X i , β ) ] 2 S(beta)=sum_{i=1}^{n}left[Y_{i}-fleft(X_{i}, betaright)right]^{2} S(β)=i=1∑n​[Yi​−f(Xi​,β)]2
要使其最小化,需要满足一阶条件
d S d β = − 2 [ ∑ i = 1 n [ Y i − f ( X i , β ) ] ( − d f ( X i , β ) d β ) ] = 0 frac{mathrm{d} S}{mathrm{~d} beta}=-2left[sum_{i=1}^{n}left[Y_{i}-fleft(X_{i}, betaright)right]left(-frac{mathrm{d} fleft(X_{i}, boldsymbol{beta}right)}{mathrm{d} beta}right)right]=0  dβdS​=−2[i=1∑n​[Yi​−f(Xi​,β)](−dβdf(Xi​,β)​)]=0
显然,上述问题不存在解析解,因此考虑对 f ( X i , β ) f(X_i, beta) f(Xi​,β)进行一阶泰勒展开。设参数向量 β beta β的初始值为 β 1 beta_1 β1​,则可以在 β 1 beta_1 β1​附近找到函数 f ( X i , β ) f(X_i, beta) f(Xi​,β)使得
f ( X i , β ) ≈ f ( X i , β 1 ) + d f ( X i , β ) d β ∣ β = β 1 ( β − β 1 ) fleft(X_{i}, betaright) approx fleft(X_{i}, beta_{1}right)+frac{mathrm{d} fleft(X_{i}, betaright)}{mathrm{d} beta} mid_{beta = beta_{1}}left(beta-beta_{1}right) f(Xi​,β)≈f(Xi​,β1​)+dβdf(Xi​,β)​∣β=β1​​(β−β1​)
记 d f ( X i , β ) d β ∣ β 1 ≈ f ( X i , β ) − f ( X , β ) β − β 1 left.frac{mathrm{d} fleft(X_{i}, betaright)}{mathrm{d} beta}right|_{beta_{1}} approx frac{fleft(X_{i}, betaright)-f(X, beta)}{beta-beta_{1}} dβdf(Xi​,β)​ ​β1​​≈β−β1​f(Xi​,β)−f(X,β)​,简记 X ~ i ( β 1 ) = d f ( X i , β ) d β ∣ β 1 widetilde{X}_{i}left(beta_{1}right)=left.frac{mathrm{d} fleft(X_{i}, betaright)}{mathrm{d} beta}right|_{beta_{1}} X i​(β1​)=dβdf(Xi​,β)​ ​β1​​,则
S ( β ) = ∑ i = 1 n [ Y i − f ( X i , β 1 ) − X ~ i ( β 1 ) ( β − β 1 ) ] 2 = ∑ i = 1 n [ Y ~ i ( β 1 ) − X i ( β 1 ) β ] 2 begin{aligned} S(beta) &=sum_{i=1}^{n}left[Y_{i}-fleft(X_{i}, beta_{1}right)-widetilde{X}_{i}left(beta_{1}right)left(beta-beta_{1}right)right]^{2} \ &=sum_{i=1}^{n}left[widetilde{Y}_{i}left(beta_{1}right)-X_{i}left(beta_{1}right) betaright]^{2} end{aligned} S(β)​=i=1∑n​[Yi​−f(Xi​,β1​)−X i​(β1​)(β−β1​)]2=i=1∑n​[Y i​(β1​)−Xi​(β1​)β]2​
其中
Y ~ i ( β 1 ) = Y i − f ( X i , β 1 ) + X ~ i ( β 1 ) β 1 widetilde{Y}_{i}left(beta_{1}right)=Y_{i}-fleft(X_{i}, beta_{1}right)+widetilde{X}_{i}left(beta_{1}right) beta_{1} Y i​(β1​)=Yi​−f(Xi​,β1​)+X i​(β1​)β1​
给定初始值向量 β i beta_i βi​,则 Y ~ i ( β 1 ) widetilde{Y}_{i}left(beta_{1}right) Y i​(β1​)与 X ~ i ( β 1 ) widetilde{X}_{i}left(beta_{1}right) X i​(β1​)可计算,从而求出最小残差平方和。 S ( β ) S(beta) S(β)对应的回归方程为
Y ~ i ( β 1 ) = X ~ i ( β ) β + ε i widetilde{Y}_{i}left(beta_{1}right)=widetilde{X}_{i}(beta) beta+varepsilon_{i} Y i​(β1​)=X i​(β)β+εi​
最小二乘估计量为
β 2 = [ X ~ ( β 1 ) ′ X ~ ( β 1 ) ] − 1 X ~ ( β 1 ) ′ Y ~ ( β 1 ) beta_{2}=left[widetilde{X}left(beta_{1}right)^{prime} widetilde{X}left(beta_{1}right)right]^{-1} widetilde{X}left(beta_{1}right)^{prime} widetilde{Y}left(beta_{1}right) β2​=[X (β1​)′X (β1​)]−1X (β1​)′Y (β1​)
其中
X ~ ( β 1 ) = [ X ~ 1 ( β 1 ) ⋮ X ~ n ( β 1 ) ] , Y ^ ( β 1 ) = [ Y ~ 1 ( β 1 ) ⋮ Y ~ n ( β 1 ) ] widetilde{X}left(beta_{1}right)=left[begin{array}{c} widetilde{X}_{1}left(beta_{1}right) \ vdots \ widetilde{X}_{n}left(beta_{1}right) end{array}right], quad hat{Y}left(beta_{1}right)=left[begin{array}{c} widetilde{Y}_{1}left(beta_{1}right) \ vdots \ widetilde{Y}_{n}left(beta_{1}right) end{array}right] X (β1​)= ​X 1​(β1​)⋮X n​(β1​)​ ​,Y^(β1​)= ​Y 1​(β1​)⋮Y n​(β1​)​
此时我们求出 β 2 beta_2 β2​,再将 β 2 beta_2 β2​作为初始值依次迭代计算,得到关于向量参数 β i beta_i βi​的一个序列,当且仅当
∣ ∣ β ( k + 1 ) − β ( k ) ∣ ∣ < δ ||beta^{(k+1)}-beta^{(k)}||<delta ∣∣β(k+1)−β(k)∣∣<δ
其中 δ > 0 delta>0 δ>0为事先预定的绝对误差。不难得到,参数 β beta β满足递推关系
β n + 1 = [ X ~ ( β n ) ′ X ~ ( β n ) ] − 1 X ~ ( β n ) ′ Y ~ ( β n ) = [ X ~ ( β n ) ′ X ~ ( β n ) ] − 1 X ~ ( β n ) ′ [ Y − f ( X ~ , β n ) + X ~ ( β n ) β n ] = β n + [ X ~ ( β n ) ′ X ~ ( β n ) ] − 1 X ~ ( β n ) ′ [ Y − f ( X , β n ) ] begin{aligned} boldsymbol{beta}_{n+1} &=left[widetilde{X}left(boldsymbol{beta}_{n}right)^{prime} widetilde{X}left(boldsymbol{beta}_{n}right)right]^{-1} widetilde{X}left(boldsymbol{beta}_{n}right)^{prime} widetilde{Y}left(boldsymbol{beta}_{n}right) \ &=left[widetilde{X}left(boldsymbol{beta}_{n}right)^{prime} widetilde{X}left(boldsymbol{beta}_{n}right)right]^{-1} widetilde{X}left(boldsymbol{beta}_{n}right)^{prime}left[boldsymbol{Y}-fleft(widetilde{X}, boldsymbol{beta}_{n}right)+widetilde{X}left(boldsymbol{beta}_{n}right) boldsymbol{beta}_{n}right] \ &=boldsymbol{beta}_{n}+left[widetilde{X}left(boldsymbol{beta}_{n}right)^{prime} widetilde{X}left(boldsymbol{beta}_{n}right)right]^{-1} widetilde{X}left(boldsymbol{beta}_{n}right)^{prime}left[Y-fleft(X, boldsymbol{beta}_{n}right)right] end{aligned} βn+1​​=[X (βn​)′X (βn​)]−1X (βn​)′Y (βn​)=[X (βn​)′X (βn​)]−1X (βn​)′[Y−f(X ,βn​)+X (βn​)βn​]=βn​+[X (βn​)′X (βn​)]−1X (βn​)′[Y−f(X,βn​)]​
通过证明,随着样本容量 n → ∞ ntoinfty n→∞,参数 β beta β估计量服从渐进正态分布,即
β ~ ∼ N ( β , σ ^ 2 [ X ~ ( β ) ′ X ~ ( β ) ] − 1 ) , σ ^ 2 = S ( β ~ ) n − 1 widetilde{beta} sim Nleft(beta, hat{sigma}^{2}left[widetilde{X}(beta)^{prime} widetilde{X}(beta)right]^{-1}right), hat{sigma}^{2}=frac{S(widetilde{beta})}{n-1} β ​∼N(β,σ^2[X (β)′X (β)]−1),σ^2=n−1S(β ​)​


3 非线性最小二乘实现

在R语言中,可以适用nls函数实现非线性最小二乘法。以C-D函数为例,

设一国产出取决于资本、劳动与全要素的投入,即
Y = A K α L β μ Y = AK^{alpha}L^{beta}mu Y=AKαLβμ
下面通过R代码运行实现对参数 α , β alpha,beta α,β的估计

t = 1:12 #时间设定
Y=c(26.74, 34.81, 44.72, 57.46, 73.84, 88.45, 105.82,126.16, 150.9, 181.6, 204.3, 222.8) #产出序列
K=c(23.66,30.55,38.12,46.77,56.45,67.15,78.92,91.67,105.5, 121.3, 128.6, 132.5) #资本序列
L=c(26, 28, 32, 36, 41, 45, 48, 52, 56, 60, 66, 70) #劳动投入序列
Cdnls <- nls(Y~A*K^a*L^b,start = list(A = 0.1,a = 0.5,b = 0.5)) #非线性最小二乘,start为参数初始值向量
summary(Cdnls)
#-------------------运行结果---------------------------
#Formula: Y ~ A * K^a * L^b
Parameters:Estimate Std. Error t value Pr(>|t|)    
A   0.1129     0.0159    7.12  5.6e-05 ***
a   0.6568     0.0652   10.07  3.4e-06 ***
b   1.0298     0.1044    9.86  4.0e-06 ***
---
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1Residual standard error: 1.7 on 9 degrees of freedomNumber of iterations to convergence: 9 
Achieved convergence tolerance: 7.55e-06

结果显示,参数 α = 0.6568 alpha = 0.6568 α=0.6568, β = 1.0298 beta = 1.0298 β=1.0298。对比直接取对数的OLS,即估计
l n Y = l n A + α l n K + β l n L + e lnY = lnA+alpha lnK+beta lnL+e lnY=lnA+αlnK+βlnL+e

CDlm <- lm(log(Y)~log(K)+log(L))  #对数形式
summary(CDlm)
#--------------------运行结果--------------
Call:
lm(formula = log(Y) ~ log(K) + log(L))Residuals:Min       1Q   Median       3Q      Max 
-0.02714 -0.00595 -0.00118  0.00764  0.02557 Coefficients:Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -2.0737     0.2355   -8.80  1.0e-05 ***
log(K)        0.6258     0.0916    6.83  7.6e-05 ***
log(L)        1.0379     0.1621    6.40  0.00012 ***
---
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1Residual standard error: 0.0173 on 9 degrees of freedom
Multiple R-squared:     1,	Adjusted R-squared:  0.999 
F-statistic: 9.16e+03 on 2 and 9 DF,  p-value: 1.29e-15

结果显示,参数 α = 0.6268 alpha = 0.6268 α=0.6268, β = 1.0379 beta = 1.0379 β=1.0379。因此,CD函数对数化的结果回归与非线性最小二乘回归的参数基本一致。但一些不能对数化的方程,非线性最小二乘的作用更为明显。考虑真实模型
y = 2 s i n ( x ) + 4 c o s ( x ) y = 2sin(x)+4cos(x) y=2sin(x)+4cos(x)
接下来我们进行仿真模拟

set.seed(123) #随机种子
x <- seq(1,100,by = 0.1) #1-100,步长为0.1
e <- rnorm(length(x),0,1) #长度为序列x的长度,服从标准正态分布的误差
y <- 2*sin(x)+4*cos(x)+e #实际观测的被解释变量
plot(x,y,type = "o") #打印散点图nls1 <- nls(y~a*sin(x)+b*cos(x),start = list(a = 0,b =0)) #非线性最小二乘,初始值设定为0,0
nls1
#-------------运行结果------------------
Nonlinear regression modelmodel: y ~ a * sin(x) + b * cos(x)data: parent.frame()a    b 
1.92 4.03 residual sum-of-squares: 974Number of iterations to convergence: 1 
Achieved convergence tolerance: 6.73e-10

结果显示估计量 a = 1.92 a = 1.92 a=1.92, b = 4.03 b = 4.03 b=4.03,与总体参数 a = 2 , b = 4 a = 2,b = 4 a=2,b=4即为接近


-END-

参考文献

王斌会(2015).计量经济学建模及R语言应用[M].北京大学出版社

本文发布于:2024-02-04 23:05:32,感谢您对本站的认可!

本文链接:https://www.4u4v.net/it/170718446060591.html

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。

标签:小二
留言与评论(共有 0 条评论)
   
验证码:

Copyright ©2019-2022 Comsenz Inc.Powered by ©

网站地图1 网站地图2 网站地图3 网站地图4 网站地图5 网站地图6 网站地图7 网站地图8 网站地图9 网站地图10 网站地图11 网站地图12 网站地图13 网站地图14 网站地图15 网站地图16 网站地图17 网站地图18 网站地图19 网站地图20 网站地图21 网站地图22/a> 网站地图23