New nonlinear adjustment approaches for

applications in Geodesy and related fields

submitted by

M.Sc. Georgios Malissiovas

Dissertation approved by Faculty VI – Planen Bauen Umwelt

of the Technische Universität Berlin

for the award of the academic degree

Doktor der Ingenieurwissenschaften (Dr.-Ing.)

Doctoral committee:

Chair: Prof. Dr.-Ing. Martin Kada

Reviewer: Prof. Dr.-Ing. Frank Neitzel

Reviewer: Prof. Dr. techn. Wolf-Dieter Schuh

Reviewer: Prof. Dr. Andreas Wieser

Reviewer: Priv.-Doz. Dr. techn. habil. Svetozar Petrović

Date of the scientific defense: 10 May 2019

Berlin 2019

Summary

This dissertation deals with a class of nonlinear adjustment problems that has a direct least squares solution

for certain weighting cases. In the literature of mathematical statistics these problems are expressed in a

nonlinear model called Errors-In-Variables (EIV) and their solution became popular as total least squares

(TLS). The TLS solution is direct and involves the use of singular value decomposition (SVD), presented in

most cases for adjustment problems with equally weighted and uncorrelated measurements. Additionally,

several weighted total least squares (WTLS) algorithms have been published in recent years for deriving

iterative solutions, when more general weighting cases have to be taken into account and without linearizing

the problem in any step of the solution process.

This research first provides a well-defined mathematical relationship between TLS and direct least squares

solutions. As a by-product, a systematic approach for the direct solution of these adjustments is established,

using a consistent and complete mathematical formalization. By transforming the problem to the solution

of a quadratic or cubic algebraic equation, which is identical with those resulting from TLS, it will be shown

that TLS is an algorithmic approach already known to the geodetic community and not a new method.

A second contribution of this work is a clear overview of weighted least squares solutions for the discussed

class of problems, i.e. the WTLS solution in the terminology of the statistical community. It will be shown

that for certain weighting cases a direct solution still exists, for which two new solution strategies will be

proposed. Further, stochastic models with more general weight matrices are examined, including correlations

between the measurements or even singular cofactor matrices. New algorithms are developed and presented

that provide iterative weighted least squares solutions without linearizing the original nonlinear problem.

The aim of this work is the popularization of the TLS approach, by presenting a complete framework for

obtaining a (weighted) least squares solution for the investigated class of nonlinear adjustment problems.

The proposed approaches and the implemented algorithms can be employed for obtaining direct solutions

in engineering tasks for which efficiency is important, while iterative solutions can be derived for stochastic

models with more general weights.

Zusammenfassung

This dissertation deals with a class of nonlinear adjustment problems that have a direct least squares solution for specific weighting cases. In the literature of mathematical statistics, such problems are expressed in a nonlinear model called Errors-In-Variables (EIV), and their solution became popular as Total Least Squares (TLS). In most cases, a TLS solution for equally weighted and uncorrelated measurements can be determined directly by a singular value decomposition (SVD). In addition, several Weighted Total Least Squares (WTLS) algorithms have been published in recent years for deriving iterative solutions that can take more general weighting cases into account without linearizing the problem in any step of the solution process.

First, this work presents a clearly defined mathematical relationship between TLS and direct least squares solutions. Furthermore, a systematic approach for the direct solution of such adjustment problems is developed using a consistent and complete mathematical formalization. By transforming the problem into the solution of a quadratic or cubic algebraic equation, it is shown that TLS is an algorithmic approach that is already known to the geodetic community and not a new method.

A further contribution of this work is a clear overview of weighted least squares solutions for the class of problems discussed here, such as the WTLS solution in the terminology of the statistical community. It is shown that a direct solution still exists for certain weighting cases, for which two new solution strategies are presented. Furthermore, stochastic models with more general weight matrices are examined, including correlations between the measurements or even singular cofactor matrices. New algorithms are developed and presented that compute weighted least squares solutions iteratively without linearizing the original nonlinear problem.

The aim of this work is the popularization of the TLS approach by providing a comprehensive strategy for computing a (weighted) least squares solution for the considered class of nonlinear adjustment problems. The proposed approaches and implemented algorithms can be used to compute direct solutions in many engineering tasks in which efficiency is important, while the iterative solution approaches can be used for stochastic models with more general weights.

Contents

Title page
Summary
Zusammenfassung
Contents
List of Figures
List of Tables
Abbreviations
1 Introduction and Motivation
1.1 Research contributions
1.2 Organization of this thesis
Part I - Fundamentals
2 Adjustment calculus
2.1 Mathematical modelling of adjustment problems
2.1.1 The functional model
2.1.2 The stochastic model
2.1.3 Criteria for the solution of adjustment problems
2.2 Adjustment of observations with the method of least squares
2.2.1 Statistical formulation of least squares problems
2.2.2 Least squares parameter estimation
2.2.3 Definition of linear and nonlinear least squares problems
2.3 Error estimation of adjustment results
2.4 Synopsis of the basics in adjustment calculus

3 Solutions of nonlinear least squares problems
3.1 Traditional geodetic solutions
3.1.1 Adjustment with observation equations and constraints
3.1.1.1 Least squares parameter estimation within the GMM
3.1.1.2 Error estimation within the GMM
3.1.1.3 Least squares parameter estimation within the GMM with constraints
3.1.1.4 Error estimation within the GMM with constraints
3.1.2 Adjustment with condition equations and constraints
3.1.2.1 Least squares parameter estimation within the GHM
3.1.2.2 Error estimation within the GHM
3.1.2.3 Least squares parameter estimation within the GHM with constraints
3.1.2.4 Error estimation within the GHM with constraints
3.2 Total least squares
3.2.1 Nonlinear adjustments within the EIV model
3.2.1.1 Least squares parameter estimation using TLS
3.2.1.2 Least squares parameter estimation using WTLS
3.3 Discussion and open questions
Part II - Methodological contributions

4 Direct solutions of nonlinear least squares problems with equal weights
4.1 Basic idea and general methodology
4.2 Fitting of a straight line in 2D
4.2.1 Least squares adjustment with a direct solution
4.2.1.1 Definition of the problem
4.2.1.2 Simplification of the problem by substituting one unknown parameter
4.2.2 TLS solution with SVD
4.2.2.1 TLS solution based on the minimum eigenvalue principle
4.2.2.2 Solution by the eigenvalue/eigenvector decomposition
4.3 Fitting of a straight line in 3D
4.3.1 Direct least squares solution for fitting a straight line in 3D
4.3.2 TLS fitting of a straight line in 3D
4.4 Fitting of a plane in 3D
4.4.1 Direct least squares solution for fitting a plane in 3D
4.4.2 TLS fitting of a plane in 3D
4.5 2D similarity transformation of coordinates
4.5.1 Direct least squares solution for the 2D similarity transformation
4.5.2 TLS 2D similarity transformation
4.6 General formulation and classification
4.7 Discussion and open questions

5 Direct and iterative solutions of weighted nonlinear least squares problems
5.1 Basic idea and general methodology
5.2 Fitting of a straight line in 2D
5.2.1 Weighting case 1 - Equally weighted observations in each direction
5.2.1.1 Direct least squares solution in a scaled coordinate system
5.2.2 Weighting case 2 - Individually weighted points in 2D
5.2.2.1 Direct weighted least squares solution
5.2.3 Weighting case 3 - Individually weighted 2D coordinates
5.2.3.1 Iterative least squares solution without linearization
5.2.4 Weighting case 4 - Individually weighted and correlated 2D coordinates
5.2.4.1 Iterative least squares solution without linearization
5.2.4.2 Solution for singular cofactor matrices
5.3 Fitting of a plane in 3D
5.3.1 Weighting case 1 - Equally weighted observations in each direction
5.3.2 Weighting case 2 - Individually weighted points in 3D
5.3.3 Weighting case 3 - Individually weighted 3D coordinates
5.3.4 Weighting case 4 - Individually weighted and correlated 3D coordinates
5.4 2D similarity transformation of coordinates
5.4.1 Weighting case 1 - Equally weighted observations in each coordinate system
5.4.2 Weighting case 2 - Individual weight for each pair of homologous points in both systems
5.4.3 Weighting case 3 - Individually weighted coordinates
5.4.4 Weighting case 4 - Individually weighted and correlated coordinates in each coordinate system
5.5 Discussion of weighted nonlinear least squares solutions

6 Numerical Investigations
6.1 Fitting of a straight line in 2D
6.2 2D similarity transformation of coordinates
7 Conclusion and outlook
7.1 Conclusion
7.2 Outlook
Appendices
A Stochastic models for the numerical investigations
A.1 Singular cofactor matrix for fitting a straight line in 2D
A.2 Singular cofactor matrix for the 2D similarity transformation
Bibliography

List of Figures

2.1 Simple example of linear variance-covariance propagation
2.2 Simple example of a first order variance-covariance propagation
3.1 Two optimization approaches for the solution of a class of nonlinear least squares problems.
4.1 Flowchart for two possible direct solutions of a class of nonlinear least squares problems.
4.2 Representation of a straight line in 2D using equation (4.1).
4.3 Example of fitting a straight line to points in 2D with both x and y coordinates subject to measurement errors.
4.4 Example of fitting a straight line to points in 2D with coordinates reduced to the centre of mass of the measured points.
5.1 Flowchart for possible direct and iterative solutions of a class of nonlinear weighted least squares problems.
5.2 Example of fitting a straight line to points in 2D, with observed x and y coordinates and p_x, p_y individual constant weights for each coordinate axis.
5.3 Example of fitting a straight line to the scaled points in 2D.
5.4 Example of fitting a straight line to points in 2D with x and y measured coordinates and p_xi, p_yi being equal weights for each point.
5.5 Example of fitting a straight line to points in 2D with observed x_i and y_i coordinates and p_xi, p_yi individual weights for the coordinates.
6.1 Fitting a straight line to points in 2D, with observed x and y coordinates of equal precision.
6.2 Fitting a straight line to points in 2D, with observed x and y coordinates and p_x, p_y individual constant weights for each coordinate axis.
6.3 Fitting a straight line to points in 2D, with individual weight for the coordinates of each point.
6.4 Fitting a straight line to points in 2D, with individual weight for each measured coordinate.
6.5 Fitting a straight line to points in 2D, with individually weighted and correlated coordinates for each point.
6.6 Fitting a straight line to points in 2D, with a singular cofactor matrix.

List of Tables

6.1 Example dataset of measured points in 2D.
6.2 Solution within the GHM using the algorithm of Neitzel and Petrovic (2008).
6.3 Direct least squares solution (section 4.2.1).
6.4 TLS solution (section 4.2.2).
6.5 Direct least squares solution (section 5.2.1)
6.6 Individual weights for each point.
6.7 Direct least squares solution (section 5.2.2).
6.8 Individual weights for each coordinate.
6.9 Iterative least squares solution using Algorithm 1 (section 5.2.3).
6.10 Individual weights for each coordinate and correlations for each point.
6.11 Iterative least squares solution using Algorithm 2 (section 5.2.4.1).
6.12 Iterative least squares solution using Algorithm 3 (section 5.2.4.2).
6.13 Example dataset for the 2D similarity transformation
6.14 Results from Neitzel (2010)
6.15 Direct least squares solution for the 2D similarity transformation
6.16 Direct least squares solution (section 5.4.1)
6.17 Individual weights for homologous points in both systems.
6.18 Direct least squares solution (section 5.4.2)
6.19 Individual weight for each coordinate.
6.20 Iterative least squares solution (section 5.4.3)
6.21 Weights and correlations for the coordinates of the points in the target system.
6.22 Weights and correlations for the coordinates of the points in the source system.
6.23 Iterative least squares solution (section 5.4.4)
6.24 Example dataset from Neitzel and Schaffrin (2016).
6.25 Iterative least squares solution (section 5.4.4)

Abbreviations

CTLS Constrained Total Least Squares
EIV Errors-In-Variables
EVD Eigenvalue Decomposition
GHM Gauss-Helmert Model
GMM Gauss-Markov Model
MCS Monte Carlo Simulation
NS Neitzel-Schaffrin
STLS Structured Total Least Squares
SVD Singular Value Decomposition
TLS Total Least Squares
UT Unscented Transformation
VC Variance-Covariance
WTLS Weighted Total Least Squares
2D Two-dimensional
3D Three-dimensional

1 Introduction and Motivation

In geodetic practice, engineers are engaged in performing measurements for the numerical description of

reality, including the characteristics of some physical phenomena or geometrical properties of real objects.

The desired values often cannot be measured directly, but they are linked to the measured values via a

mathematical model. The mathematical modelling of the measurement results, together with the errors that

influence them, results in an under-determined algebraic problem. The target is in most cases the “optimal”

estimation of some unknown parameters. For more than two centuries mathematicians and geodesists have

solved these adjustment problems using the method of least squares, according to the fundamental studies of

Gauss (1809) and Gauss (1823). A least squares estimate can be obtained by minimizing a defined objective

function (the sum of squared residuals) and thus by solving a system of normal equations, i.e. a system of

equations that follow from the partial derivatives of the objective function with respect to all unknowns.

Depending on the nature of the problem, a least squares adjustment can be linear or nonlinear. Obviously

linear least squares problems can be solved using the rules of linear algebra, whilst the nonlinear cases usually

require a numerical method for obtaining a solution. This thesis investigates only the second type

of problems.
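
For the linear case, the step from the objective function to the normal equations can be illustrated with a short sketch. The example below is only an illustration under stated assumptions: the names `solve_normal_equations`, `A`, `l` and `P` and the sample data are hypothetical and are not taken from this thesis.

```python
import numpy as np

def solve_normal_equations(A, l, P=None):
    """Minimise v^T P v for the linear model l + v = A x (illustrative sketch).

    A : (n, u) design matrix, l : (n,) observations, P : (n, n) weight matrix.
    Setting the partial derivatives of the objective function to zero gives
    the normal equations (A^T P A) x = A^T P l.
    """
    A = np.asarray(A, dtype=float)
    l = np.asarray(l, dtype=float)
    if P is None:
        P = np.eye(len(l))          # assumption: equal weights, no correlations
    N = A.T @ P @ A                 # normal equation matrix
    n = A.T @ P @ l                 # right-hand side
    x_hat = np.linalg.solve(N, n)   # estimated parameters
    v = A @ x_hat - l               # residuals
    return x_hat, v

# Hypothetical data: straight line y = a*x + b, only y regarded as observed.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
A = np.column_stack([x, np.ones_like(x)])
x_hat, v = solve_normal_equations(A, y)
print("slope, intercept:", x_hat)
```

Because the unknowns enter the model linearly, a single solve of the normal equations suffices; no starting values or iterations are involved.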

The solution of nonlinear adjustment problems with the use of least squares has a long history and the

simplicity of the “recipe” of this method is reflected in its wide application in all scientific fields that

deal with redundant observations and seek an “optimal” solution. Helmert (1924), for instance,

proposed a least squares solution by linearizing the functional model of the nonlinear adjustment problem.

Deming (1931) and Deming (1934) tackled the same problem by developing algorithms based on the iterative

linearization of the functional model following the Gauss-Newton approach. Pope (1974) pointed out that a

least squares solution of a nonlinear adjustment problem can be obtained either by using the Gauss-Newton

approach, or by setting up and linearizing the nonlinear normal equations, according to the Newton-Raphson

approach. In the geodetic literature, the least squares principle is often applied in the form of two

adjustment models, namely the Gauss-Markov Model (GMM), see (Niemeier 2008, p. 137 ff.), and the

Gauss-Helmert Model (GHM), see e.g. (Niemeier 2008, p. 172 ff.). In (Krakiwsky 1975, pp. 7-26) these

models can also be found under the names parametric (case) adjustment and combined (case) adjustment.

The least squares solution in both models is based on the Gauss-Newton approach.

Furthermore, a class of nonlinear least squares problems exists, for which a direct solution is possible for

the nonlinear normal equations, especially in cases where solving the normal equations can be

converted into an eigenvalue problem. Thus, a solution is obtained by computing the roots of a polynomial

(i.e. the characteristic equation of the eigenvalues) and a direct solution can be possible depending on the

polynomial’s degree. Such adjustment problems have long been discussed in the geodetic literature.

Linkwitz (1960), for instance, presented a least squares solution for two adjustment problems that belong to

this class, the fitting of a straight line in two dimensions (2D) and the fitting of a plane in three dimensions

(3D), while the coordinates of the points in all directions are considered as measurements. Another example

is the work of Jovičić et al. (1982), who investigated the fitting of a straight line in 3D. Therefore, it can

be seen from past publications that the solution of such nonlinear least squares problems, using eigenvalue

decomposition (EVD), has long been a standard procedure in the geodetic community.
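
The link between the polynomial and the eigenvalue formulation can be made concrete for the simplest member of this class, the fitting of a straight line y = a*x + b in 2D with equally weighted and uncorrelated coordinates. The following sketch is an illustration only, not the formalization developed later in this thesis; the function names and the sample data are hypothetical, and it assumes a non-vertical line with a non-zero mixed sum `sxy`.

```python
import numpy as np

def centred_sums(x, y):
    """Sums of squares and products of the coordinates reduced to the centre of mass."""
    u, v = x - x.mean(), y - y.mean()
    return u @ u, v @ v, u @ v

def fit_line_quadratic(x, y):
    """Direct solution: the normal equations reduce to a quadratic in the slope a."""
    sxx, syy, sxy = centred_sums(x, y)
    # sxy*a**2 + (sxx - syy)*a - sxy = 0; the root carrying the sign of sxy
    # minimises the sum of squared orthogonal distances (assumes sxy != 0).
    a = ((syy - sxx) + np.hypot(syy - sxx, 2.0 * sxy)) / (2.0 * sxy)
    return a, y.mean() - a * x.mean()

def fit_line_evd(x, y):
    """Same problem via the eigenvalue decomposition of the 2x2 scatter matrix."""
    sxx, syy, sxy = centred_sums(x, y)
    scatter = np.array([[sxx, sxy], [sxy, syy]])
    _, eigenvectors = np.linalg.eigh(scatter)   # eigenvalues in ascending order
    nx, ny = eigenvectors[:, 0]                 # line normal: smallest eigenvalue
    a = -nx / ny                                # assumes a non-vertical line
    return a, y.mean() - a * x.mean()

# Hypothetical measured points
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
print(fit_line_quadratic(x, y))
print(fit_line_evd(x, y))
```

Both routines minimise the same objective function, so they return the same slope and intercept up to rounding; the characteristic polynomial of the 2x2 scatter matrix is of degree two, which is why a closed-form solution exists.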

Nevertheless, an alternative approach was developed during the last decades for the solution of this class of

nonlinear least squares problems by the mathematical community. This is called total least squares (TLS)

and was first defined and presented by Golub and Van Loan (1980). Since then, many researchers have dealt

with the solution of adjustment problems using TLS and have developed modern and sophisticated algorithms.

As defined in (Golub and Van Loan 1980) or (Van Huffel and Vandewalle 1991, p. 33 ff.), the TLS

solution applies only to the class of nonlinear adjustment problems that can be expressed within an

errors in variables (EIV) model and solved with the use of singular value decomposition (SVD), without

involving any kind of linearization of the functional model.

In the literature, the work coming from the TLS community is often distinguished from classical least

squares by stating that TLS functions differently. There are expectations that TLS might produce a “more

realistic” result than the classic least squares, as indicated for example in (Felus and Schaffrin 2005) or

(Schaffrin et al. 2006). Petrovic (2003) has already pointed out that this view was possibly caused

by the work of Golub and Van Loan (1980), where the solution of TLS was compared with that of least

squares for fitting a straight line in 2D. In that study, for the least squares solution it was assumed that

only the y-coordinates are regarded as observed values and the x-coordinates as error free, which led to the

misleading conclusion that TLS functions differently from least squares. For geodesists it has long been

clear that the most important steps in the adjustment of observations are to formulate a correct model and

minimize the correct objective function. When these requirements are fulfilled, then for a linear problem

the solution will be unique, regardless of the solution strategy that has been followed.
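
The role of the objective function in this comparison can be checked numerically. The sketch below uses hypothetical data and names and is not taken from the cited studies; it evaluates two different objective functions for a straight line through the centre of mass of the points. Each slope is optimal only for its own objective function, which is why the two results differ.

```python
import numpy as np

# Hypothetical measured points
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.7, 2.6, 2.4, 4.9, 4.6])
u, v = x - x.mean(), y - y.mean()
sxx, syy, sxy = u @ u, v @ v, u @ v

def vertical_objective(a):
    """Sum of squared residuals when only the y-coordinates are observations."""
    return np.sum((v - a * u) ** 2)

def orthogonal_objective(a):
    """Sum of squared residuals when x and y are equally weighted observations."""
    return np.sum((v - a * u) ** 2) / (1.0 + a ** 2)

# Slope for error-free x-coordinates (regression of y on x)
a_y_only = sxy / sxx
# Slope when both coordinates are observed (root of the quadratic, assumes sxy != 0)
a_both = ((syy - sxx) + np.hypot(syy - sxx, 2.0 * sxy)) / (2.0 * sxy)

print("slope, only y observed :", a_y_only)
print("slope, x and y observed:", a_both)
```

The difference between the two slopes stems from the different functional and stochastic models, not from the numerical technique used to minimise the respective objective function.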

Contrary to the belief that TLS is an additional method like least squares (or even a generalisation of it),

several scientists have shown that TLS describes just a particular algorithmic approach to find the (weighted)

least squares solution. One of the first critical views on TLS can be found in the appendix of (Petrovic 2003).

This author concluded that least squares and TLS are not individual methods, but applications of minimizing

the sum of squared residuals. Afterwards, Neitzel and Petrovic (2008) and Neitzel (2010) showed with two

practical examples that in fact TLS can be regarded as a special case of the least squares method within the

GHM. The iterative solution of the GHM has been proven to be numerically equivalent to the TLS solution

in both cases. Other contributions that follow this line of thinking are (Reinking 2008) and (Mihajlovic and

Cvijetinovic 2016). These studies provided the motivation for the first part of this research, which has been

partly published in (Malissiovas et al. 2016). In that article, a clear mathematical relationship between TLS

and direct least squares solutions has been presented.

A TLS solution has been mainly investigated when the measurements are equally weighted and uncorrelated.

Consequently, subsequent research focused on the algorithmic development of TLS when individual

precisions or correlations are postulated for the measurements. The solutions from these algorithms

are iterative, they do not include a linearization of the functional model and have been published under the

name “weighted TLS” (WTLS). For instance, Schaffrin and Wieser (2008) developed a WTLS algorithm for

the weighted least squares solution of fitting a straight line in 2D, which was the basis for further algorithmic

developments by Shen et al. (2011), Fang (2011), Amiri-Simkooei and Jazaeri (2012) and Mahboub

(2012). Additionally, various names for the TLS solution emerged due to algorithmic complications caused

by the stochastic model of each problem. For instance, the term STLS (structured TLS) was presented in

(Schaffrin et al. 2012) for the solution of a 2D similarity transformation of coordinates, and the term CTLS

(constrained TLS) was presented in (Abatzoglou et al. 1991) or (Schaffrin 2006). Despite the name TLS, in

all of the above cases the solution has been obtained iteratively and does not follow the definition of TLS that

was established by Golub and Van Loan (1980). A clear overview of the latest WTLS algorithms has been

presented in the dissertation of Snow (2012), who also covered special cases of singular cofactor matrices

being present in the model.

Nevertheless, York (1966) and Williamson (1968) had already presented weighted least squares solutions

for the problems that have been tackled in terms of WTLS, i.e. iterative solutions of the nonlinear normal

equations without linearizing the functional model. York (1968) and Petrović et al. (1983) also developed

algorithms for the case of correlated observations. These contributions provide the motivation for the

next research question of this dissertation. In this second part, the solution of the discussed class of nonlinear

least squares problems is investigated for various cases where the observations have different precisions or

correlations.
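
How an iterative solution of the nonlinear normal equations can work without linearizing the functional model may be illustrated for the straight line y = a + b*x with individually weighted, uncorrelated coordinates. The sketch below uses a commonly known fixed-point form in the spirit of the York-type solutions; it is an assumption-laden illustration and not one of the algorithms developed in this thesis. The function name, the starting slope taken from a regression of y on x, the tolerance and the sample data are all hypothetical choices.

```python
import numpy as np

def york_type_line_fit(x, y, wx, wy, tol=1e-12, max_iter=200):
    """Fit y = a + b*x with individual weights wx, wy (uncorrelated coordinates).

    Fixed-point iteration on the slope b derived from the nonlinear normal
    equations; the functional model itself is never linearized.
    """
    b = np.polyfit(x, y, 1)[0]                 # starting slope (assumption)
    for _ in range(max_iter):
        w = wx * wy / (wx + b ** 2 * wy)       # combined weight of each point
        x_bar = np.sum(w * x) / np.sum(w)      # weighted centre of mass
        y_bar = np.sum(w * y) / np.sum(w)
        u, v = x - x_bar, y - y_bar
        beta = w * (u / wy + b * v / wx)
        b_new = np.sum(w * beta * v) / np.sum(w * beta * u)
        if abs(b_new - b) < tol:
            return y_bar - b_new * x_bar, b_new   # intercept a, slope b
        b = b_new
    raise RuntimeError("iteration did not converge")

# Hypothetical measured points and weights
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])
wx = np.array([1.0, 2.0, 1.0, 4.0, 2.0])
wy = np.array([2.0, 1.0, 3.0, 1.0, 2.0])
a_hat, b_hat = york_type_line_fit(x, y, wx, wy)
print("intercept, slope:", a_hat, b_hat)
```

For equal weights the fixed point of this iteration satisfies the same quadratic equation as the direct solution, so the iterative and the direct results coincide in that special case.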

1.1 Research contributions

First, a mathematical relationship between direct least squares solutions and TLS is developed

by investigating four individual adjustment problems, namely: the fitting of a straight line in 2D, the fitting

of a straight line in 3D, the fitting of a plane in 3D and the 2D similarity transformation of coordinates. As

a consequence, a systematic approach is established for the direct least squares solution of these nonlinear

adjustment problems. This leads to a deeper understanding of the underlying principles of TLS adjustment

and of the class of problems that can be solved via SVD, i.e. problems for which solving the nonlinear normal

equations can be transformed into an eigenvalue problem.

The second part of this study focuses on the solution of the class of nonlinear least squares problems, when

individual precisions and correlations between the measurements have to be taken into account. Novel

algorithms are implemented to provide direct and iterative weighted least squares solutions, depending on

the stochastic model of each problem. In contrast to the known WTLS algorithms, the presented solutions

do not always need iterations but admit a direct solution for some weighting cases. When iterations are

necessary, the developed algorithms can provide a weighted least squares solution without performing any

linearization of the problem under investigation. The cases of singular cofactor matrices in the stochastic

model are also covered (when the criterion of Neitzel and Schaffrin (2016) is fulfilled) by the developed

approaches, without the need for any special treatment of the problem.

From an engineering point of view, the proposed solutions and algorithms can be an asset for providing

the (weighted) least squares solution for this class of problems without any iteration or starting values for

the unknown parameters. Thus, this is an advantageous algorithmic option for researchers and engineers

in terms of efficiency. The adjustment approaches that were developed so far have a significant impact on

several applications in geodesy. For instance, 3D point clouds obtained by terrestrial laser scanners can

be easily handled, depending on the needs of the engineering task. The developed algorithms can efficiently

handle the vast amount of data, providing for example direct solutions for fitting planes or estimating

the transformation parameters between several data sets. On the other hand, the presented algorithms

that provide iterative weighted least squares solutions can also be utilized when correlations between the

observations have to be taken into account.

One of the objectives of this dissertation is to present a simple explanation of the

TLS and WTLS solutions to scientists, researchers and engineers from all scientific fields that deal with

adjustment calculus. This can enable them to use the TLS/WTLS algorithms appropriately for the

solution of nonlinear adjustment problems. Furthermore, the proposed solution strategies, accompanied by

individual algorithms for each case, can be utilized for the solution of this class of problems in the future

without the need of TLS or WTLS theory and practice.

1.2 Organization of this thesis

This thesis is organized into two main parts, one describing the fundamental ideas of adjustment and one presenting the methodological contributions, which together comprise six chapters. The content of each chapter is summarized in the following.

Fundamentals

Chapter 2 describes fundamental requirements for the adjustment of observations and gives a definition of

the method of least squares and hence sets the basis for the definition of nonlinear least squares problems

in this dissertation. These requirements are deduced from a review of related work on adjustment calculus.

Since the mathematical formalization of nonlinear normal equations with a solution using EVD or SVD is

one of the main goals of this thesis, basic mathematical notions and concepts which serve as foundation for

this purpose are briefly discussed.

Chapter 3 presents a selection of existing solution strategies and numerical methods for the solution of nonlinear

least squares problems, based on the requirements of chapter 2. In this core chapter of the thesis, both the

traditional geodetic and the most modern solution strategies are elaborated in depth. Three fundamental

models, namely the GMM, the GHM and the EIV model, are defined and mathematically described. Based

on these models, further concepts such as the Gauss-Newton, the TLS and the WTLS approach are defined

and formalized. Scientific questions emerge from the developed concepts and are outlined at the end of this

chapter.

Methodological contributions

Chapter 4 is dedicated to the development of a well defined mathematical relation between direct least squares

solutions and TLS solutions for a class of nonlinear adjustment problems. It involves the elaboration of both

solutions for four adjustment problems that often occur in practice. Based on this, a new strategy for the

direct solution of such problems is elaborated and presented in detail. Subsequently, common features are

identified between the investigated adjustment problems that can serve as classification factors for others

that belong to this group. The chapter concludes with a discussion on the nature of TLS adjustment and

raises further scientific questions.

Chapter 5 addresses different solutions of the same class of nonlinear least squares problems as in chapter 4,

by postulating individual stochastic models for each case. In contrast to the iterative algorithms known as

WTLS, various weighting cases are identified in this chapter that can still lead to a direct solution, based

on the same concepts as in chapter 4. Additionally, a standard solution strategy is presented for cases

where iterations are necessary for a weighted least squares solution, without applying any linearization to

the investigated problems. From this, individual algorithms are designed that fit into the framework of the

proposed solution approach. The feasibility of the developed algorithms is also demonstrated on cases where

the cofactor matrix of the problem is singular.

Chapter 6 illustrates the application of the developed methodologies and algorithms for two adjustment

cases. The first case presents the proposed direct and iterative least squares solution for fitting a straight

line to a set of measured points in 2D, while postulating various weights and correlations between the

observations. In the same line of thinking the 2D similarity transformation of coordinates is examined as a

second example.

Chapter 7 summarizes this work and draws conclusions with respect to the stated research goals as well as the

scientific questions that emerged in each chapter. It reviews and evaluates the results of this investigation,

lists and discusses contributions to the field of adjustment calculus and identifies and outlines scientific

problems that could be tackled in future research.

Part I - Fundamentals

2 Adjustment calculus

This chapter briefly summarizes the concept of adjustment calculus. This is necessary for understanding the

fundamentals of the adjustment of observations and the estimation of unknown parameters. Many parts of

this chapter have been adopted from the work of Pasioti (2015) and have been extended appropriately to

fit the needs of this dissertation.

It can be said that adjustment calculus is a part of geodesy and all scientific fields that deal with redundant

measurements. The measurement results are necessary for describing numerically the characteristics of some

physical phenomena or geometrical properties of real objects. The target is in most cases the “optimal”

estimation of some unknown parameters by means of an under-determined algebraic system that occurs

from the mathematical modelling of the measured quantities¹. Additionally, the statistical properties of

the estimated parameters are of great importance for drawing conclusions concerning the precision and the

reliability of the adjustment results. Based on the adjustment phases presented by Neitzel and Petrović

(2008), five basic steps can be identified for the gradual process of any adjustment problem. These can be

summarized with the following systematic stages:

1. Definition of the problem/task:

In this step it must be clearly defined what are the measurements that are subject to errors, which

parameters are fixed (error free) and which parameters are unknown.

2. Clear description of the mathematical model:

The mathematical model is the combination of an appropriate functional model, which is selected by

the user, with the stochastic model that incorporates the stochastic properties of the measurements.

3. Selection of the method for a solution of the adjustment problem:

A solution of an adjustment problem can be derived by selecting an appropriate criterion for the

unknown errors, depending on their distribution. The choice of such a criterion implies the method

that will be followed. For example, in case of normally distributed errors the least squares method

should be employed.

4. Calculation of the adjustment results:

A least squares solution for an adjustment problem can be obtained by solving a system of equations

(i.e. the normal equations). Various approaches exist for the solution of the normal equations in

numerical mathematics, that depend on the nature (linear or nonlinear) and the well-posedness of the

problem.

¹ As will be explained in subsection 2.1.1, the functional model of an adjustment problem includes unknown parameters to be estimated and residuals, which are also unknown, leading to an under-determined set of equations.

5. Computation of stochastics:

The precision measures associated with the adjustment results can be computed after obtaining the

solution of the problem using, for example, the rules of error propagation.

These fundamental stages for the solution of adjustment problems serve as a basis for the explanation of

the theory of adjustment calculus that is presented in this chapter.

2.1 Mathematical modelling of adjustment problems

Under the theoretical assumption that measurements are contaminated by errors (Taylor 1982, p. 3), an under-determined algebraic system emerges that is strongly related to the stochastic properties of the measurements. Important for the solution of an adjustment problem is the clear description of the deterministic part (the derived under-determined system of equations) and the stochastic part² (the measurement errors and their distribution). The first is described by the functional model and the second by the stochastic model, which are always combined to form the mathematical model of the problem, as defined in (Mikhail and Ackermann 1976, p. 5) or (Perović 2005, p. 72).

For obtaining meaningful adjustment results, a correct formulation of the mathematical model is essential

in every case and should clearly answer the following questions:

- What are the observations that are subject to measurement errors?

- What is the precision of the observations?

- What are the unknown parameters to be estimated?

- Which parameters are error-free or fixed/constant?

The two fundamental parts of a mathematical model, i.e. the functional and stochastic model, are explained

in the following subsections.

2.1.1 The functional model

Observations $l_i$ (with $i = 1, \dots, n$) and unknown parameters $x_j$ ($j = 1, \dots, m$) can be mathematically related through a set of functions, which are often referred to as the functional model. Three main categories of functional relationships are distinguished in the geodetic literature, see for example (Wells and Krakiwsky 1971, pp. 102-104) or (Perović 2005, p. 63):

1. Implicit functional relationships between the observations

\[ f(l_i) \approx 0. \]

² See (Perović 2005, p. 55) for a definition of the deterministic and stochastic model.

2. Explicit functional relationships between the observations and the unknown parameters

\[ l_i \approx f(x_j). \]

3. Explicit or implicit functional relationships between the observations and the unknown parameters

\[ f(l_i, x_j) \approx 0. \]

The presented equations are approximately equal, as the observed quantities are subject to errors and have

to be corrected, as it is explained below. According to (Wells and Krakiwsky 1971, p. 104), the mathematical

relationships of categories 1 and 2 are only special cases of the most general case in 3. This study investigates

adjustment problems in which some unknown parameters need to be estimated³.

Explicit functional relationships between observations and unknown parameters

Let a number of observations $l_1, \dots, l_n$ be performed and $x_1, \dots, x_m$ unknown parameters to be estimated. The mathematical relationship between the observations and the unknown parameters can be expressed by the equation system

\[
\begin{aligned}
l_1 &\approx f_1(x_1, \dots, x_m),\\
l_2 &\approx f_2(x_1, \dots, x_m),\\
&\;\;\vdots\\
l_n &\approx f_n(x_1, \dots, x_m).
\end{aligned}
\tag{2.1}
\]

Depending on the number of equations in (2.1), three individual cases can emerge (Helmert 1924, p. 47):

1. The number of observations is smaller than the number of the parameters to be estimated (n < m).

There is no unique solution (there are infinitely many solutions).

2. The number of observations is equal to the number of the parameters to be estimated (n=m). There

exists a unique solution, but the presence of blunders cannot be identified.

3. The number of observations is larger than the number of the parameters to be estimated (n > m). This is an overdetermined system which, owing to the measurement errors, is in general inconsistent and has no unique solution.
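The three cases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_system`, the helper `design_matrix` and the straight-line model with m = 2 parameters are hypothetical and do not appear in the thesis, and the design matrix is assumed to have full column rank.

```python
import numpy as np

def classify_system(A):
    """Compare the number of observations n (rows) with the number of
    unknown parameters m (columns) of a design matrix A."""
    n, m = A.shape
    if n < m:
        return "under-determined"      # case 1: infinitely many solutions
    if n == m:
        return "uniquely determined"   # case 2: no check for blunders possible
    return "over-determined"           # case 3: adjustment problem

def design_matrix(x):
    """Hypothetical model: straight line y = a*x + b, i.e. m = 2 parameters."""
    return np.column_stack([x, np.ones_like(x)])

# Classify the system for n = 1, 2 and 5 observed points
cases = {n: classify_system(design_matrix(np.arange(n, dtype=float)))
         for n in (1, 2, 5)}
print(cases)
```

Only the third case leaves redundant observations, which is the situation treated in the remainder of this chapter.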

A usual geodetic problem consists of repeated observations, seeking a solution for a set of unknown parameters. Obviously, the third case presented above is what geodesists usually face. Furthermore, in equation (2.1) the measured values $l$ have to be corrected for their random errors. According to (Bjerhammar 1973, p. 1), the true value $\tilde{l}$ can be obtained by excluding the error $e$ from the measurement:

\[ \tilde{l} = l - e. \tag{2.2} \]

³ Category 1 leads to the famous adjustment of condition equations (when taking into account the necessary residuals).

However, this adjustment problem is out of the scope of this work. More information for this case can be found in most common

literature of adjustment calculations in geodetic science, see for example (Helmert 1924, p. 228 ff.).

However, because the true values are only a theoretical concept, the adjusted value of a measurement $\hat{l}$ is usually considered instead, obtained by adding a correction/residual $v$:

\[ \hat{l} = l + v. \tag{2.3} \]

Thus, including the necessary residuals in the functional model yields the under-determined⁴ system of equations

\[
\begin{aligned}
l_1 + v_1 &= f_1(x_1, \dots, x_m),\\
l_2 + v_2 &= f_2(x_1, \dots, x_m),\\
&\;\;\vdots\\
l_n + v_n &= f_n(x_1, \dots, x_m).
\end{aligned}
\tag{2.4}
\]

The developed equations (with included residuals) are the observation equations. As claimed in (Wells and Krakiwsky 1971, p. 102), adjustment problems of this type are widely known as adjustment of observation equations, parametric adjustment or adjustment of indirect observations⁵.

Implicit functional relationships between observations and unknown parameters

A number of observations has been collected and some unknown parameters need to be estimated. Assume that the mathematical relationship between the observations and the unknown parameters can be expressed by

\[
\begin{aligned}
f_1(l_1, \dots, l_n, x_1, \dots, x_m) &\approx 0,\\
f_2(l_1, \dots, l_n, x_1, \dots, x_m) &\approx 0,\\
&\;\;\vdots\\
f_r(l_1, \dots, l_n, x_1, \dots, x_m) &\approx 0.
\end{aligned}
\tag{2.5}
\]

Depending on the number of equations $r$ in (2.5), an adjustment problem exists when the number of equations is larger than the number of the parameters to be estimated ($r > m$). Adding the necessary residuals to the observations yields

\[
\begin{aligned}
f_1(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0,\\
f_2(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0,\\
&\;\;\vdots\\
f_r(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0.
\end{aligned}
\tag{2.6}
\]

This system of equations is known as condition equations with unknowns. Adjustment problems of this

type can be found under the name combined adjustment or adjustment of condition equations and unknown

parameters (Wells and Krakiwsky 1971, p. 102)⁶.

⁴ The developed functional model is under-determined, because the residuals $v_i$ are also unknown quantities.

⁵ More information from the traditional German literature can be found in (Helmert 1924, p. 43) or (Linkwitz 1960, p. 156).

It is worth noticing that the German term “Vermittelnde Beobachtungen” for this type of adjustment cannot be translated

literally.

⁶ In the German literature this type of adjustment problems is known as “Bedingte Beobachtungen mit Unbekannten”, see

for example (Helmert 1924, p. 52) or (Linkwitz 1960, p. 192).

Constraints between the unknown parameters

Additionally, part of the functional model can be a number of constraints that have to be enforced on the

unknown parameters. For example, such a constraint can be formulated as

\[ u(x_1, x_2, \dots, x_m) = 0, \tag{2.7} \]

with $u$ denoting here a function of the unknowns. This is just a special case of the condition equations (2.6).

Adjustment problems with imposed constraints can be divided into two cases:

- Adjustment of observation equations with constraints between the unknown parameters, see for example (Helmert 1924, p. 262) or (Linkwitz 1960, p. 197).

- Adjustment of condition equations and unknowns with constraints between the unknown parameters, as explained in (Wells and Krakiwsky 1971, p. 142) and (Mikhail and Ackermann 1976, p. 213 ff.).

2.1.2 The stochastic model

In a complete mathematical model the stochastic properties of the obtained observations must be taken

into account. In the literature, the theoretical error that can influence an observation is denoted with the

term “standard deviation” of the measurement, as for example in (Niemeier 2008, p. 6). The measurement

standard deviations are often based on the physical characteristics of some instrument that has been used

for measurements and imply how precise the observations are. This a priori⁷ information influences the

adjustment procedure in a sense of how much an observation contributes to the adjustment (i.e. a precise

observation is more valuable than another with a low precision).

Theoretical variances and covariances of a random variable

In statistics the precision of an observation can be expressed in terms of its variance. Based on the theory of errors the theoretical variance of a discrete random variable is defined as

\[
\sigma^2 = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} e_i^2
= \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (l_i - \tilde{l})^2
= E\{(l_i - \tilde{l})^2\},
\tag{2.8}
\]

see, e.g. (Bjerhammar, 1973, p. 22), (Niemeier, 2008, p. 23) or any other textbook on statistics. $\tilde{l}$ denotes the true value of $l$, $e$ the observation errors and $E\{\}$ the expectation of a variable. $N$ is the total number of random errors with the same probability that belong to a universal set. The standard deviation can be defined as the positive square root of the variance

\[ \sigma = +\sqrt{\sigma^2}. \tag{2.9} \]

⁷ In the literature the term “a priori” is often confused with the term “prior”. While the former defines some information

that has been obtained after theoretical deduction and has been traditionally used by scientists for expressing such arguments,

the latter refers only to some moment in time.

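The theoretical covariance between two measurements, for example $l_1$ and $l_2$, can be written as

\[
\sigma_{1,2} = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (l_{1i} - \tilde{l}_1)(l_{2i} - \tilde{l}_2)
= E\{(l_{1i} - \tilde{l}_1)(l_{2i} - \tilde{l}_2)\},
\tag{2.10}
\]

with $\tilde{l}_1$ and $\tilde{l}_2$ denoting the true values of $l_1$ and $l_2$, respectively.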

Empirical variances and covariances of a random variable

In most cases the true value $\tilde{l}$ is not known and thus it is replaced by the expected value $E\{l\}$ (for least squares problems the mean value $\bar{l}$ is regarded as the expected value $E\{l\}$), see for example (Montgomery and Runger 2010, p. 74) or (Everitt and Skrondal 2010, p. 156) for a definition.

An estimate for the variance of $l$ has been given by (Bjerhammar, 1973, p. 37) or (Dekking et al., 2005, p. 292) as

\[
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} v_i^2
= \frac{1}{n-1} \sum_{i=1}^{n} (l_i - E\{l\})^2
= \frac{1}{n-1} \sum_{i=1}^{n} (l_i - \bar{l})^2,
\tag{2.11}
\]

with $n$ denoting a finite number of residuals $v$ that belong to a subset of the universal set containing the random errors of the problem. In (Linnik, 1961, p. 79) and (Bjerhammar, 1973, p. 127) it has been shown that for a linear adjustment problem, the estimated variance $s^2$ is an unbiased estimate of $\sigma^2$, i.e.

\[ E\{s^2\} = \sigma^2. \tag{2.12} \]

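Equivalently, an estimate for the covariance between two measurements $l_1$ and $l_2$ can be derived by

\[
s_{1,2} = \frac{1}{n-1} \sum_{i=1}^{n} (l_{1i} - E\{l_1\})(l_{2i} - E\{l_2\})
= \frac{1}{n-1} \sum_{i=1}^{n} (l_{1i} - \bar{l}_1)(l_{2i} - \bar{l}_2).
\tag{2.13}
\]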
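As a numerical illustration of the empirical variance and covariance, the following minimal sketch evaluates both estimates for two short series of repeated measurements. The measurement values and the variable names are invented for illustration only; the cross-check relies on NumPy's `var` with `ddof=1` and on `cov`, which both use the divisor n - 1.

```python
import numpy as np

# Hypothetical repeated measurements of two quantities (invented values)
l1 = np.array([10.02, 9.98, 10.01, 9.99, 10.00])
l2 = np.array([20.03, 19.96, 20.02, 19.98, 20.01])
n = l1.size

# Empirical variance: squared deviations from the mean, divided by n - 1
s2_l1 = np.sum((l1 - l1.mean()) ** 2) / (n - 1)

# Empirical covariance: products of deviations, divided by n - 1
s_12 = np.sum((l1 - l1.mean()) * (l2 - l2.mean())) / (n - 1)

# Cross-check against NumPy's own estimators
print(np.isclose(s2_l1, np.var(l1, ddof=1)))
print(np.isclose(s_12, np.cov(l1, l2)[0, 1]))
```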

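Storing all the observations in one vector

\[ \mathbf{L} = [l_1, l_2, \cdots, l_n]^T, \tag{2.14} \]

the variances and covariances of all measurements can be expressed in matrix notation, like in (Wells and Krakiwsky 1971, p. 89) or (Niemeier 2008, p. 29), stored in the variance-covariance (VC) matrix

\[
\mathbf{\Sigma}_{LL} = \frac{1}{n-1} (\mathbf{L} - E\{\mathbf{L}\})(\mathbf{L} - E\{\mathbf{L}\})^T =
\begin{bmatrix}
\sigma_1^2 & \sigma_{1,2} & \cdots & \cdots \\
\sigma_{2,1} & \sigma_2^2 & \cdots & \cdots \\
\vdots & \vdots & \ddots & \vdots \\
\vdots & \vdots & \cdots & \sigma_n^2
\end{bmatrix}.
\tag{2.15}
\]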

(Ghilani 2010, p. 166) is one of the many textbooks where it was shown that the cofactor matrix of the observations can be computed by

\[ \mathbf{Q}_{LL} = \frac{1}{\sigma_0^2} \mathbf{\Sigma}_{LL}, \tag{2.16} \]

which for a regular cofactor matrix results in the weight matrix

\[ \mathbf{P} = \mathbf{Q}_{LL}^{-1}. \tag{2.17} \]

Matrices $\mathbf{\Sigma}_{LL}$, $\mathbf{Q}_{LL}$ and $\mathbf{P}$ become diagonal when the observed quantities are uncorrelated. Nevertheless, it is worth mentioning that for correlated observations the cofactor matrix $\mathbf{Q}_{LL}$ is not always regular, as pointed out by (Ghilani, 2010, p. 160). In such cases the inverse of the cofactor matrix, i.e. the weight matrix, cannot be computed and a specific solution strategy that is able to deal with singular cofactor matrices has to be employed, as for example in the work of Neitzel and Schaffrin (2016).
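The step from the VC matrix to the cofactor matrix and the weight matrix can be sketched as follows. The numerical values, the chosen variance of the unit weight and the helper `weight_matrix` are assumptions made for this illustration; the rank test is one simple way of detecting a singular cofactor matrix and is not the criterion of Neitzel and Schaffrin (2016).

```python
import numpy as np

sigma0_sq = 1.0e-4  # assumed (arbitrary) variance of the unit weight

# Hypothetical VC matrix of three observations, the first two correlated
Sigma_LL = np.array([[4.0e-4, 1.0e-4, 0.0],
                     [1.0e-4, 1.0e-4, 0.0],
                     [0.0,    0.0,    2.0e-4]])

# Cofactor matrix: VC matrix scaled by the variance of the unit weight
Q_LL = Sigma_LL / sigma0_sq

def weight_matrix(Q):
    """Return the inverse of Q, or None if Q is not regular (rank deficient)."""
    if np.linalg.matrix_rank(Q) < Q.shape[0]:
        return None
    return np.linalg.inv(Q)

P = weight_matrix(Q_LL)

# Two fully correlated observations lead to a singular cofactor matrix
Q_singular = np.array([[1.0, 1.0],
                       [1.0, 1.0]])
P_singular = weight_matrix(Q_singular)
print(P_singular)
```

In the singular case no weight matrix exists, which is why such problems require a solution strategy that works with the cofactor matrix directly.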

The theoretical variance of the unit weight⁸ $\sigma_0^2$ has been defined by (Mikhail and Ackermann, 1976, p. 65) as an arbitrary scalar value that influences the stochastic model. Following (Linnik 1961, p. 100), in case of large measurement samples it can be estimated by

\[ s_0^2 = \frac{1}{rd} \sum_{i=1}^{n} v_i^2, \qquad \text{redundancy: } rd = n - m, \tag{2.18} \]

with

\[ E\{s_0^2\} = \sigma_0^2. \tag{2.19} \]

According to (Niemeier 2008, p. 165), listing all residuals in one vector $\mathbf{v}$ and introducing the weight matrix from equation (2.17), an estimate for the theoretical variance of the unit weight can be computed by

\[ s_0^2 = \frac{\mathbf{v}^T \mathbf{P} \mathbf{v}}{rd}. \tag{2.20} \]
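A minimal sketch of this estimate is given below. It assumes that the weighted sum of squared residuals is divided by the redundancy rd = n - m; the function name `variance_of_unit_weight`, the residuals and the weights are hypothetical values chosen for illustration.

```python
import numpy as np

def variance_of_unit_weight(v, P, m):
    """Estimate s0^2 as v^T P v divided by the redundancy rd = n - m."""
    n = v.size
    rd = n - m
    if rd <= 0:
        raise ValueError("no redundancy: n must be larger than m")
    return float(v @ P @ v) / rd

# Hypothetical residuals of n = 4 observations, m = 1 estimated parameter
v = np.array([0.01, -0.02, 0.01, 0.00])
P = np.diag([1.0, 1.0, 2.0, 2.0])   # assumed diagonal weight matrix

s0_sq = variance_of_unit_weight(v, P, m=1)
print(s0_sq)
```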

2.1.3 Criteria for the solution of adjustment problems

Both types of adjustment that were discussed in the previous subsection resulted in an under-determined mathematical problem. Infinitely many possibilities exist for the solution of an adjustment problem, by choosing a suitable criterion for the residuals. The most common of all is the least squares criterion, also known as the method of least squares or the minimization of the $L_2$-norm. A least squares estimate $\hat{x}_j$ of the unknown parameters can be computed by minimizing the sum of squared residuals, formulated symbolically for equally weighted and uncorrelated observations by the objective function

\[ \Omega(v_1, v_2, \dots, v_n) = \sum_{i=1}^{n} v_i^2 \to \min. \tag{2.21} \]

Taking into account the individual precisions of the observations, the objective function obtains the form

\[ \Omega(v_1, v_2, \dots, v_n) = \sum_{i=1}^{n} p_i v_i^2 \to \min, \tag{2.22} \]

⁸ Also known as a priori variance of the unit weight or reference variance.

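or with correlated observations

\[ \Omega(v_1, v_2, \dots, v_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} p_{i,j} v_i v_j \to \min. \tag{2.23} \]

The objective function can be expressed equivalently in matrix notation as

\[ \Omega(\mathbf{v}) = \mathbf{v}^T \mathbf{P} \mathbf{v} \to \min. \tag{2.24} \]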
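That the matrix form covers the three preceding formulations can be checked numerically. The following sketch uses invented residuals and weights; the variable names are chosen for this illustration only.

```python
import numpy as np

v = np.array([0.3, -0.1, 0.2])            # hypothetical residuals

# Equal weights, uncorrelated observations: plain sum of squares
omega_equal = np.sum(v ** 2)

# Individual weights p_i, uncorrelated observations
p = np.array([1.0, 4.0, 2.0])
omega_weighted = np.sum(p * v ** 2)

# Correlated observations: full weight matrix and explicit double sum
P_full = np.array([[2.0, 0.5, 0.0],
                   [0.5, 1.0, 0.0],
                   [0.0, 0.0, 3.0]])
omega_double_sum = sum(P_full[i, j] * v[i] * v[j]
                       for i in range(3) for j in range(3))

# Matrix notation v^T P v reproduces all three cases
omega_matrix = v @ P_full @ v
print(np.isclose(omega_double_sum, omega_matrix))
print(np.isclose(v @ np.diag(p) @ v, omega_weighted))
print(np.isclose(v @ np.eye(3) @ v, omega_equal))
```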

As already mentioned by many authors, for example by Gonin (1989) or Neitzel (2004), different criteria can be employed for a solution of an adjustment problem, other than least squares. An example of such a criterion is the minimization of the $L_1$ norm, also known as the method of least absolute deviations. A solution for the estimation of the unknown parameters for this case can be obtained by minimizing the sum of absolute residuals, i.e. the objective function

\[ \Omega(\mathbf{v}) = \sum_{i=1}^{n} |v_i| \to \min. \tag{2.25} \]

Algorithmic approaches for the solution of this problem can be distinguished between the rigorous solutions via linear programming or using the Simplex-Algorithm, as presented for example in (Dantzig 1949), (Dantzig 1963) or (Fuchs 1980) and the simulated $L_1$ solutions via reweighted least squares adjustment as offered for example by (Krarup et al., 1980). In general, the minimization of any norm can be employed in this way. The minimization of the $L_p$ norm, with $p \in \mathbb{R}$ and $p \geq 1$, can be formulated similarly to the presented objective functions, see for example (Marx, 2013), by

\[ \Omega(v_1, v_2, \dots, v_n) = \sum_{i=1}^{n} |v_i|^p \to \min. \tag{2.26} \]

Additionally, the M-estimators are based on the maximum likelihood method and can be employed for a

robust solution of an adjustment problem. Such solutions have been presented for example by Huber (1964)

or Hampel (1980).
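The different behaviour of the criteria can be illustrated with repeated measurements of a single quantity. The sketch below is not one of the cited algorithms (linear programming or reweighted least squares); it simply evaluates the objective function on a fine grid, which is sufficient for one unknown. The data, including one deliberately gross error, and the function name `omega` are invented for illustration.

```python
import numpy as np

# Hypothetical repeated measurements of one length; the last value is a blunder
l = np.array([10.01, 9.99, 10.02, 9.98, 10.00, 10.01, 10.50])

def omega(x, p):
    """Sum of |v_i|^p for the model l_i + v_i = x with a single unknown x."""
    return np.sum(np.abs(x - l) ** p)

# Brute-force search on a fine grid (step 0.0001), for illustration only
grid = np.linspace(9.9, 10.6, 7001)
x_l2 = grid[np.argmin([omega(x, 2) for x in grid])]
x_l1 = grid[np.argmin([omega(x, 1) for x in grid])]

print(x_l2, np.mean(l))     # the L2 estimate agrees with the arithmetic mean
print(x_l1, np.median(l))   # the L1 estimate agrees with the median
```

For this model the least squares estimate is pulled towards the blunder, while the least absolute deviations estimate stays with the bulk of the measurements, which is the reason why such criteria are considered for robust solutions.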

2.2 Adjustment of observations with the method of least squares

Amongst all methods (e.g. $L_1$, $L_2$, $L_p$ or M-estimators) for the solution of adjustment problems, the method of least squares is the most popular. There is a rich literature that deals with adjustment solutions based on the principles of least squares, for example (Helmert 1924), (Linkwitz 1960), (Linnik 1961), (Deming 1964), (Wells and Krakiwsky 1971), (Bjerhammar 1973), (Krakiwsky 1975), (Meissl 1982), (Perović 2005), (Niemeier 2008) or (Ghilani 2010).

The method of least squares has been utilized for around two centuries to provide “optimal” solutions

for under-determined algebraic problems that often occur in the mathematical modelling of measurement

results. From a historical perspective, the French mathematician Adrien-Marie Legendre (1752-1833) was the first to apply this method to the astronomical problem of determining the orbits of comets. In his book “Nouvelles méthodes pour la détermination des orbites des comètes”, which was published in 1805, he

included an appendix under the title “Sur la méthode des moindres quarrés”, which can be translated into

English as “On the method of least squares”.

Four years later, the famous German geodesist and mathematician Johann Carl Friedrich Gauss (1777-1855)

presented his theory for the calculation of the orbits of celestial bodies in (Gauss 1809). In his work, Gauss

made extensive use of the method of least squares and claimed that he had already been using it since 1795. Although Gauss's statement that he was the first to use least squares can be regarded as controversial (in 1795 Gauss was 18 years old), he is deservedly acknowledged as the pioneer of the method of least squares, as already discussed by Stigler (1981). Gauss not only used least squares for the solution of his geodetic problem, but also introduced the statistical distribution of the errors, as well as the precision of the observed quantities, as important elements for obtaining the most probable estimate of the unknown parameters. In this first work the famous Gaussian (or normal) distribution was presented for the first time. Nevertheless, the most important outcome is that both

Legendre and Gauss contributed greatly to the scientific community, with a method for the adjustment of

observations that is widely used in various scientific fields.

A year after Gauss’s first publication, Pierre-Simon Laplace (1749–1827) investigated Gauss’s method of

least squares and the error distribution utilizing the central limit theorem. The development of the method

of least squares continues with the second publication of Gauss (also known as the 2nd foundation of least

squares), see (Gauss 1823). In his second book Gauss argued that the method of least squares could be

employed also in case of not normally distributed errors. However, in this case the solution is not the most

probable but can be considered as the most appropriate or the most plausible. After that, the application of

least squares met success in various scientific fields and in geodetic science the work of Helmert (1872) became

a standard textbook for the application of this method to geodetic problems with redundant observations.

2.2.1 Statistical formulation of least squares problems

Since the work of Gauss, the method of least squares has been related to the normal distribution of the measurement errors, as already pointed out in (Bjerhammar 1973, p. 80). For normally distributed observations the method of least squares will result in the most probable solution for the unknown quantities or equivalently in the solution of maximum likelihood. The relationship between least squares and the error distribution can be illustrated with the following simple example adopted from (Petrović et al. 1983):

Example 2.2.1. Least squares and normally distributed errors:

An adjustment problem is under investigation, where $n$ observations $l$ and their residuals $v$ are related to $m$ unknown parameters $x$. Suppose that the measurement errors originate from a universal set of normally distributed errors⁹, expressed symbolically by

\[ e_i \sim N(0, \sigma_i). \tag{2.27} \]

In this line of thinking, the residuals $v$ are also assumed to be normally distributed, $v_i \sim N(0, \sigma_i)$, with known expectations

\[ E(v_i) = 0 \tag{2.28} \]

⁹ In fact, errors of different measurements (or type of measurements) belong to different normal distributions.

and variances

\[ E\left(v_i^2\right) = \sigma_i^2. \tag{2.29} \]

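The probability density function (or likelihood function, see Linnik 1961, p. 321 for a similar example) of each individual residual is then

\[
\begin{aligned}
\varphi_1(v_1) &= \frac{1}{\sigma_1 \sqrt{2\pi}} \exp\left(-\frac{v_1^2}{2\sigma_1^2}\right),\\
\varphi_2(v_2) &= \frac{1}{\sigma_2 \sqrt{2\pi}} \exp\left(-\frac{v_2^2}{2\sigma_2^2}\right),\\
&\;\;\vdots\\
\varphi_n(v_n) &= \frac{1}{\sigma_n \sqrt{2\pi}} \exp\left(-\frac{v_n^2}{2\sigma_n^2}\right).
\end{aligned}
\tag{2.30}
\]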

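Assuming that the observations are uncorrelated and their random errors independent, the vector of residuals

\[ \mathbf{v} = [v_1, v_2, \dots, v_n]^T \tag{2.31} \]

would have the probability density function

\[
L(v_1, v_2, \dots, v_n) = \varphi_1 \varphi_2 \cdots \varphi_n
= \prod_{i=1}^{n} \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left(-\frac{v_i^2}{2\sigma_i^2}\right)
= \left( \prod_{i=1}^{n} \frac{1}{\sigma_i \sqrt{2\pi}} \right) \exp\left(-\sum_{i=1}^{n} \frac{v_i^2}{2\sigma_i^2}\right).
\tag{2.32}
\]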

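Expressing the weights for the residuals by

\[ p_i = \frac{1}{\sigma_i^2}, \tag{2.33} \]

function (2.32) can be reformulated to

\[
L(v_1, v_2, \dots, v_n)
= \frac{\prod_{i=1}^{n} \sqrt{p_i}}{\left(\sqrt{2\pi}\right)^n} \exp\left(-\frac{1}{2} \sum_{i=1}^{n} p_i v_i^2\right)
= K_1 \exp\left(-\frac{1}{2} \sum_{i=1}^{n} p_i v_i^2\right),
\tag{2.34}
\]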

with the introduced parameter $K_1$ being a positive constant. There are infinitely many solutions for the residuals $v$, as well as the unknown parameters $x$. A solution is required that maximizes the probability density function $L$ (i.e. the maximum likelihood solution for the unknowns). Thus, for $K_1 > 0$ it is obvious that

\[
L(v_1, v_2, \dots, v_n) \to \max \;\Leftrightarrow\; \exp\left(-\frac{1}{2} \sum_{i=1}^{n} p_i v_i^2\right) \to \max.
\tag{2.35}
\]

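According to (Bronshtein et al. 2005, pp. 49-50), the exponential function $\exp(-t)$ is strictly monotonically decreasing in $t$, so that $L(v_1, v_2, \dots, v_n)$ is strictly monotonically decreasing in the weighted sum of squared residuals. It follows

\[
\exp\left(-\frac{1}{2} \sum_{i=1}^{n} p_i v_i^2\right) \to \max \;\Leftrightarrow\; \frac{1}{2} \sum_{i=1}^{n} p_i v_i^2 \to \min,
\tag{2.36}
\]

i.e.

\[
L(v_1, v_2, \dots, v_n) \to \max \;\Leftrightarrow\; \sum_{i=1}^{n} p_i v_i^2 \to \min.
\tag{2.37}
\]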

Thus, it can be said that for uncorrelated observations with normally distributed errors the solution of

maximum likelihood (by maximizing the probability density function) is equivalent to the least squares

solution (by minimizing the sum of weighted squared residuals). Equivalent explanations can be found in

(Merriman 1877, p. 16), (Helmert 1924, pp. 94-98), (Wells and Krakiwsky 1971, p. 93) or (Petrović et al. 1983). The first of these authors states in a few sentences that “... the most probable values of

quantities, which are the object of measurements, are those which render the sum of the squares of the errors

a minimum.”.
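The equivalence can also be verified numerically for the simplest case of direct observations of one unknown. The following sketch is an illustration under stated assumptions: the observations and standard deviations are invented, the logarithm of the likelihood is used for numerical convenience (the logarithm is strictly increasing, so the maximizer is unchanged), and a grid search replaces a proper optimizer.

```python
import numpy as np

# Hypothetical direct observations of one unknown x with individual precisions
l = np.array([5.02, 4.97, 5.05, 5.00])
sigma = np.array([0.02, 0.03, 0.05, 0.01])
p = 1.0 / sigma ** 2                     # weights as inverse variances

def log_likelihood(x):
    v = x - l                            # residuals from l_i + v_i = x
    return np.sum(-np.log(sigma * np.sqrt(2.0 * np.pi)) - v ** 2 / (2.0 * sigma ** 2))

def weighted_sum_of_squares(x):
    v = x - l
    return np.sum(p * v ** 2)

grid = np.linspace(4.9, 5.1, 2001)       # grid step 0.0001
x_ml = grid[np.argmax([log_likelihood(x) for x in grid])]
x_ls = grid[np.argmin([weighted_sum_of_squares(x) for x in grid])]
x_closed = np.sum(p * l) / np.sum(p)     # weighted mean as closed-form solution

print(x_ml, x_ls, x_closed)
```

Both searches select the same grid point, which lies next to the weighted mean of the observations.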

It should be pointed out that the method of least squares can be applied in cases that the errors are not

normally distributed as well. However, the estimated unknown parameters would not be the most probable,

but in some cases can be acceptable or appropriate.

2.2.2 Least squares parameter estimation

A least squares adjustment problem can be seen as an optimization problem, due to the fact that always

the extreme values of an objective function are requested (i.e., for least squares problems the minimum of

the sum of squared residuals). To clearly demonstrate the procedure for obtaining a least squares solution,

a simple adjustment problem will be employed. Consider a linear functional model between a set of $n$ observations $l$, their residuals $v$ and the unknown parameters $x$, which can be expressed by the observation equations

\[
\begin{aligned}
l_1 + v_1 &= f_1(x_1, x_2, \dots, x_m),\\
l_2 + v_2 &= f_2(x_1, x_2, \dots, x_m),\\
&\;\;\vdots\\
l_n + v_n &= f_n(x_1, x_2, \dots, x_m),
\end{aligned}
\tag{2.38}
\]

with $f$ denoting linear functions of the unknown parameters. Obviously an adjustment problem exists when

the number of observations is larger than the number of unknown parameters (n > m). Such a functional

model occurs in practice, for example when the length of a side of an object has been measured repeatedly

and the length needs to be estimated (in this case only one unknown parameter is requested). Another

adjustment example of such a linear functional model, often presented in geodetic and statistical literature,

assumes that the y-coordinates from a set of $n$ points in 2D have been measured, while the x-coordinates

are taken as error-free and the parameters of a straight line that fits best to the points are unknown and

need to be estimated. Nevertheless, the adjustment problem will be kept more general here. Solving (2.38) for the residuals yields the system of equations

\[
\begin{aligned}
v_1 &= f_1(x_1, x_2, \dots, x_m) - l_1,\\
v_2 &= f_2(x_1, x_2, \dots, x_m) - l_2,\\
&\;\;\vdots\\
v_n &= f_n(x_1, x_2, \dots, x_m) - l_n.
\end{aligned}
\tag{2.39}
\]

Postulating uncorrelated observations of equal precision, the least squares method leads to the objective function

\[ \Omega(v_1, v_2, \dots, v_n) = \sum_{i=1}^{n} v_i^2, \tag{2.40} \]

which after substituting the residuals from (2.39) can be written as

\[ \Omega(x_1, x_2, \dots, x_m) = \sum_{i=1}^{n} \left[ f_i(x_1, x_2, \dots, x_m) - l_i \right]^2. \tag{2.41} \]

Estimates of the unknown parameters $\hat{x}_1, \hat{x}_2, \dots, \hat{x}_m$ are required that minimize the objective function

(2.41). According to Fermat’s Theorem (Bronshtein et al. 2005, p. 388), the extreme values of a function

can be determined by setting the first derivative with respect to the unknown terms equal to zero. This is a

necessary condition to obtain the stationary points of the objective function Ω but it is not sufficient for the

least squares solution, in the sense that the derived stationary points can still be a maximum, a minimum or

even a saddle point (also called inflection point for cases of one variable). Stationary points of the objective

function (2.41) can be computed by

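\[
\begin{aligned}
\frac{\partial \Omega(x_1, x_2, \dots, x_m)}{\partial x_1} &= \sum_{i=1}^{n} 2 \left[ f_i(x_1, x_2, \dots, x_m) - l_i \right] \frac{\partial f_i(x_1, x_2, \dots, x_m)}{\partial x_1} = 0,\\
\frac{\partial \Omega(x_1, x_2, \dots, x_m)}{\partial x_2} &= \sum_{i=1}^{n} 2 \left[ f_i(x_1, x_2, \dots, x_m) - l_i \right] \frac{\partial f_i(x_1, x_2, \dots, x_m)}{\partial x_2} = 0,\\
&\;\;\vdots\\
\frac{\partial \Omega(x_1, x_2, \dots, x_m)}{\partial x_m} &= \sum_{i=1}^{n} 2 \left[ f_i(x_1, x_2, \dots, x_m) - l_i \right] \frac{\partial f_i(x_1, x_2, \dots, x_m)}{\partial x_m} = 0,
\end{aligned}
\tag{2.42}
\]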

which is a system of $m$ equations with $m$ unknown parameters, known as the system of normal equations¹⁰.

To ensure that the computed stationary point of an objective function of several variables is a minimum, the matrix of the second order partial derivatives is built as







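\[
D =
\begin{bmatrix}
\dfrac{\partial^2 \Omega}{\partial x_1^2} & \dfrac{\partial^2 \Omega}{\partial x_1 \partial x_2} & \cdots & \cdots\\[2mm]
\dfrac{\partial^2 \Omega}{\partial x_2 \partial x_1} & \dfrac{\partial^2 \Omega}{\partial x_2^2} & \cdots & \cdots\\[2mm]
\vdots & \vdots & \ddots & \vdots\\[1mm]
\cdots & \cdots & \cdots & \dfrac{\partial^2 \Omega}{\partial x_m^2}
\end{bmatrix}
=
\begin{bmatrix}
d_{11} & d_{12} & \cdots & \cdots\\
d_{21} & d_{22} & \cdots & \cdots\\
\vdots & \vdots & \ddots & \vdots\\
\cdots & \cdots & \cdots & d_{mm}
\end{bmatrix}.
\tag{2.43}
\]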

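According to (Bronshtein et al. 2005, p. 402), if all leading subdeterminants (leading principal minors) of matrix D are positive (i.e. D is a positive definite matrix):
\[
\begin{aligned}
d_{11} &> 0,\\
d_{11} d_{22} - d_{12} d_{21} &> 0,\\
&\;\;\vdots
\end{aligned}
\tag{2.44}
\]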

then it is guaranteed that the computed extreme value of Ω is a minimum. Furthermore, if the functions $f_i(x_1, x_2, \ldots, x_m)$ in equation (2.38) are linear, their partial derivatives with respect to the unknown parameters are constants, resulting in linear normal equations in (2.42).

¹⁰ In (Wells and Krakiwsky 1971, p. 87) an equivalent explanation of the normal equations is presented using matrix notation.

In this case the least squares estimates of the unknown parameters $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m$ can be computed directly. The estimated unknowns can be used in equation (2.39) to determine the residuals

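\[
\begin{aligned}
\hat{v}_1 &= f_1(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m) - l_1,\\
\hat{v}_2 &= f_2(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m) - l_2,\\
&\;\;\vdots\\
\hat{v}_n &= f_n(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_m) - l_n,
\end{aligned}
\tag{2.45}
\]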

leading to the adjusted observations

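\[
\begin{aligned}
\hat{l}_1 &= l_1 + \hat{v}_1,\\
\hat{l}_2 &= l_2 + \hat{v}_2,\\
&\;\;\vdots\\
\hat{l}_n &= l_n + \hat{v}_n.
\end{aligned}
\tag{2.46}
\]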
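The chain from the normal equations (2.42) to the adjusted observations (2.46) can be sketched numerically for a straight-line fit. This is a minimal sketch under stated assumptions: the data values, the names `t`, `l`, `A` and the use of NumPy are illustrative choices and do not come from the text.

```python
import numpy as np

# Hypothetical data: error-free abscissas t and measured ordinates l (illustrative values).
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
l = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Linear functions f_i(a, b) = a * t_i + b; A holds the constant partial derivatives.
A = np.column_stack([t, np.ones_like(t)])

# Normal equations (2.42) for equally weighted, uncorrelated observations.
N = A.T @ A
x_hat = np.linalg.solve(N, A.T @ l)

# Matrix of second derivatives (2.43) is D = 2 A^T A; check the subdeterminants (2.44).
D = 2.0 * N
minors = [np.linalg.det(D[:k, :k]) for k in (1, 2)]
is_minimum = all(minor > 0 for minor in minors)

# Residuals (2.45) and adjusted observations (2.46).
v_hat = A @ x_hat - l
l_adj = l + v_hat

print("estimated slope and intercept:", x_hat)
print("stationary point is a minimum:", is_minimum)
print("residuals:", v_hat)
```

Because the functional model is linear here, no initial values or iterations are needed; the normal equations are solved once.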

Least squares parameter estimation with constraints

In many cases there are additional constraints between the unknowns that have to be taken into account, for example a constraint
\[
u(x_1, x_2, \ldots, x_m) = 0 \tag{2.47}
\]
that has to be imposed on the unknown parameters. The functional model then includes not only the observation equations (2.38), but also the constraint (2.47). Following (Bronshtein et al. 2005, p. 403), the Lagrange multiplier method can be employed in this case for obtaining a least squares estimate for the unknown parameters that minimizes the objective function (2.41) while satisfying the constraint. Combining $\Omega(x_1, x_2, \ldots, x_m)$ with the constraint (2.47) leads to the Lagrange function (also known as the Lagrangian)

\[
K(x_1, x_2, \ldots, x_m, k) = \sum_{i=1}^{n} \left[ f_i(x_1, x_2, \ldots, x_m) - l_i \right]^2 - 2k\, u(x_1, x_2, \ldots, x_m), \tag{2.48}
\]

where the parameter k denotes the Lagrange multiplier. The stationary points of K can be obtained by taking the partial derivatives with respect to all unknown parameters and the Lagrange multiplier and setting them to zero, which yields the normal equation system

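\[
\begin{aligned}
\frac{\partial K(x_1, x_2, \ldots, x_m, k)}{\partial x_1} &= \sum_{i=1}^{n} 2 \left[ f_i(x_1, x_2, \ldots, x_m) - l_i \right] \frac{\partial f_i(x_1, x_2, \ldots, x_m)}{\partial x_1} - 2k \frac{\partial u(x_1, x_2, \ldots, x_m)}{\partial x_1} = 0,\\
\frac{\partial K(x_1, x_2, \ldots, x_m, k)}{\partial x_2} &= \sum_{i=1}^{n} 2 \left[ f_i(x_1, x_2, \ldots, x_m) - l_i \right] \frac{\partial f_i(x_1, x_2, \ldots, x_m)}{\partial x_2} - 2k \frac{\partial u(x_1, x_2, \ldots, x_m)}{\partial x_2} = 0,\\
&\;\;\vdots\\
\frac{\partial K(x_1, x_2, \ldots, x_m, k)}{\partial x_m} &= \sum_{i=1}^{n} 2 \left[ f_i(x_1, x_2, \ldots, x_m) - l_i \right] \frac{\partial f_i(x_1, x_2, \ldots, x_m)}{\partial x_m} - 2k \frac{\partial u(x_1, x_2, \ldots, x_m)}{\partial x_m} = 0,\\
\frac{\partial K(x_1, x_2, \ldots, x_m, k)}{\partial k} &= -2\, u(x_1, x_2, \ldots, x_m) = 0.
\end{aligned}
\tag{2.49}
\]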

The derived system of normal equations is linear as long as the functional model of the problem is also linear (i.e. the observation equations and the constraint are linear). In that case the least squares solution for the unknown parameters and the Lagrange multiplier can be obtained directly by solving the normal equations (2.49). Matrix notation can easily be utilized for such solutions of linear least squares problems. Examples of this procedure using matrices can be found in most common adjustment literature, like (Koch and Pope 1969), (Wolf 1979), (Wells and Krakiwsky 1971, p. 142ff.), (Mikhail and Ackermann 1976, p. 213ff.) or (Perović 2005, p. 189ff.).
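For a linear functional model, the normal equations (2.49) can be arranged as one bordered linear system in the unknowns and the Lagrange multiplier. The following is a minimal sketch under assumptions that are not part of the text: a hypothetical triangle with three measured angles, equal weights, and the linear constraint that the adjusted angles sum to 180 degrees; all variable names are illustrative.

```python
import numpy as np

# Hypothetical example: three measured angles of a plane triangle (degrees).
# Unknowns x = (alpha, beta, gamma), observation equations f_i(x) = x_i.
l = np.array([59.98, 60.03, 60.05])
A = np.eye(3)                       # constant partial derivatives of f_i
c = np.ones(3)                      # constraint u(x) = c^T x - d = 0
d = 180.0

# Normal equations (2.49) in the linear case, divided by 2:
#   A^T A x - c k = A^T l
#   c^T x         = d
m = A.shape[1]
M = np.zeros((m + 1, m + 1))
M[:m, :m] = A.T @ A
M[:m, m] = -c
M[m, :m] = c
rhs = np.append(A.T @ l, d)

solution = np.linalg.solve(M, rhs)
x_hat, k_hat = solution[:m], solution[m]
v_hat = A @ x_hat - l               # residuals of the observations

print("adjusted angles:", x_hat, "sum:", x_hat.sum())
print("Lagrange multiplier:", k_hat)
```

In this small case the misclosure of 0.06 degrees is distributed equally, so each residual is -0.02 degrees, which is what one would expect for equally weighted angles.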

2.2.3 Definition of linear and nonlinear least squares problems

Before discussing any strategy for solving nonlinear least squares problems, it is necessary to define what a linear and what a nonlinear least squares adjustment problem is. A thorough analysis of adjustment theory can be found in (Pasioti 2015), which makes clear the different interpretations that exist in the geodetic literature concerning the nature of least squares problems. For example, Teunissen and Knickmeyer (1988) described the nature of an adjustment in terms of the solution space curvature. Nevertheless, two clear definitions are given in (Pasioti 2015) regarding linear and nonlinear least squares, which are adopted here and extended by taking into account the special cases with constraints between the unknown parameters. Therefore, the following definitions can be stated:

Definition 2.1. Linear least squares problems

A least squares problem is linear, when the observation/condition equations and the additional constraints

are linear both with respect to the unknown parameters and the residuals.

Definition 2.2. Nonlinear least squares problems

A least squares problem is nonlinear, when the observation/condition equations or the additional constraints

are nonlinear with respect to the unknown parameters or the residuals.

Consequently, a linear functional model leads to linear normal equations, and a nonlinear functional model leads to nonlinear normal equations.


2.3 Error estimation of adjustment results

In addition to the parameter estimation and the adjustment of the observed quantities using least squares

or some other criterion, it is important to know the precision of the derived parameters in order to verify the

quality of the adjustment results and draw valid conclusions. The target is to calculate how an infinitesimal

change of the observed quantities would affect the unknown parameters. In other words, it is investigated

how the variances and covariances of the observations (l1, l2, . . . , ln) propagate to the unknown parameters

(x1, x2, . . . , xm) and the residuals (v1, v2, . . . , vn). This procedure can be found under the name “error

propagation” or “variance and covariance propagation” and has been presented in various publications, for

example in (Wells and Krakiwsky 1971, p. 20), (Mikhail and Ackermann 1976, p. 76ff.), (Cross 1994, p.

32) or (Niemeier 2008, p. 51ff.). A brief explanation of the error propagation is presented below for two

individual cases, regarding linear and nonlinear functional relationships.

Computation of stochastic parameters in linear functional relationships

Assuming for a moment a linear functional relationship between the unknown parameters and the observations
\[
\begin{aligned}
x_1 &= f_1(l_1, l_2, \ldots, l_n),\\
x_2 &= f_2(l_1, l_2, \ldots, l_n),\\
&\;\;\vdots\\
x_m &= f_m(l_1, l_2, \ldots, l_n)
\end{aligned}
\tag{2.50}
\]

and taking into account that the functions $f_1, f_2, \ldots, f_m$ are linear, the latter equation system can be formulated equivalently in matrix notation as
\[
X = F L, \tag{2.51}
\]
with vector $X = [x_1, x_2, \ldots, x_m]^T$ listing all the unknown parameters and vector $L = [l_1, l_2, \ldots, l_n]^T$ the observed quantities. The computation of the standard deviations of the unknown parameters can be geometrically interpreted as a projection of the error distribution of the measurements onto the unknown parameters. A simple example of such a projection is depicted in Figure 2.1.

Figure 2.1: Simple example of linear variance-covariance propagation

Further, applying the expectation operator to equation (2.51) yields
\[
E\{X\} = E\{F L\}. \tag{2.52}
\]
As long as matrix F is deterministic and only L can be considered as stochastic, the last equation results in
\[
E\{X\} = F\,E\{L\} \;\Rightarrow\; E\{X\} = F (L + v), \tag{2.53}
\]
with vector v holding the residuals of the observations in L. According to Koch and Pope (1969), the theoretical VC matrix of the unknown parameters can be expressed by definition as

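\[
\begin{aligned}
\Sigma_{XX} &= E\left\{ \left[ X - E\{X\} \right] \left[ X - E\{X\} \right]^T \right\}\\
\Rightarrow \Sigma_{XX} &= E\left\{ \left[ F L - F (L + v) \right] \left[ F L - F (L + v) \right]^T \right\}\\
\Rightarrow \Sigma_{XX} &= F\, \underbrace{E\left\{ v v^T \right\}}_{\Sigma_{LL}}\, F^T\\
\Rightarrow \Sigma_{XX} &= F\, \Sigma_{LL}\, F^T.
\end{aligned}
\tag{2.54}
\]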

The last equation is known as the propagation law of variances and covariances and can be found e.g. in (Wells and Krakiwsky 1971, p. 20) or (Niemeier 2008, p. 56). Introducing the variance of the unit weight $\sigma_0^2$, the developed VC matrix can be written equivalently as
\[
\Sigma_{XX} = \sigma_0^2\, F\, Q_{LL}\, F^T, \tag{2.55}
\]
with the respective cofactor matrix being defined as
\[
Q_{XX} = F\, Q_{LL}\, F^T. \tag{2.56}
\]

Analogously, the variances and covariances of the adjustment results (estimated unknown parameters, resid-

uals and adjusted observations) can be derived by applying the law of error propagation that is presented

above. A detailed formulation of these VC matrices is given in chapter 3, covering various adjustment cases.
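The propagation law (2.54) and its cofactor form (2.55)-(2.56) can be illustrated with a small numerical sketch. The functional relationship (sum and difference of two measured distances), the standard deviations and all variable names are assumptions made for this illustration only.

```python
import numpy as np

# Hypothetical linear relationship X = F L: sum and difference of two measured distances.
F = np.array([[1.0,  1.0],
              [1.0, -1.0]])

# Assumed stochastic model: uncorrelated observations with 3 mm and 4 mm standard deviation.
sigma0_sq = 1.0                               # variance of the unit weight (assumed)
Q_LL = np.diag([0.003**2, 0.004**2])          # cofactor matrix of the observations
Sigma_LL = sigma0_sq * Q_LL

# Propagation law (2.54) and cofactor matrix (2.56).
Sigma_XX = F @ Sigma_LL @ F.T
Q_XX = F @ Q_LL @ F.T

std_X = np.sqrt(np.diag(Sigma_XX))            # standard deviations of sum and difference
print("standard deviations [m]:", std_X)
print("VC matrix:\n", Sigma_XX)
```

Although the two observations are uncorrelated, the off-diagonal element of the propagated VC matrix is nonzero, since the sum and the difference depend on the same observations.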

Computation of stochastic parameters in nonlinear functional relationships

In practice, nonlinear functional relationships between the unknown parameters and the observed quantities

occur more often than linear ones. For example, assume a nonlinear least squares problem with the estimates

of the unknown parameters being expressed by the nonlinear system of equations

\[
\begin{aligned}
x_1 &= \psi_1(l_1, l_2, \ldots, l_n),\\
x_2 &= \psi_2(l_1, l_2, \ldots, l_n),\\
&\;\;\vdots\\
x_m &= \psi_m(l_1, l_2, \ldots, l_n),
\end{aligned}
\tag{2.57}
\]
with $\psi_1, \psi_2, \ldots, \psi_m$ denoting nonlinear functions of the observations. The propagation law of variances and covariances, in the form applied in the linear case, cannot be utilized here directly. Several solution strategies exist for obtaining estimates for the variances and covariances of parameters in nonlinear problems. Following the study of Lösler et al. (2016), four main procedures (among others) can be distinguished:

- First order variance-covariance propagation.

- Second order variance-covariance propagation.

- Monte Carlo simulation (MCS).

- Unscented Transformation (UT).

A detailed explanation of the first procedure is presented in chapter 3 for various adjustment examples. The analysis of the remaining procedures is out of the scope of this work. Nevertheless, it is important to discuss some main characteristics of all these procedures and to point out their main drawbacks, which could lead to a wrong interpretation of the adjustment results and thus to wrong conclusions.

The first order variance-covariance propagation is a procedure based on the linear approximation of the problem and it has been presented e.g. in (Wells and Krakiwsky 1971, p. 21), (Mikhail and Ackermann 1976, pp. 79-81), (Taylor 1982, p. 80), (Jäger et al. 2005, p. 68), (Niemeier 2008, pp. 63-64) or (Ghilani 2010, p. 89). The derived linearized expressions can be utilized for obtaining approximate solutions for the variances and covariances of the unknown parameters, based on the law of variance-covariance propagation. Geometrically, the application of the error propagation to a linearized problem can be seen as the projection of the error distribution of the measurements onto the unknown parameters, by a linear approximation of the “original” nonlinear function. A simple example of this procedure is depicted in Figure 2.2.

Figure 2.2: Simple example of a first order variance-covariance propagation

The solution of the error propagation from a linearized function cannot always be trusted, as it is only a linear approximation of the “real” solution. Therefore, the propagation law of variances and covariances should not be applied to an arbitrary nonlinear problem, but only in those cases where the linearized solution is a “good” approximation of the original nonlinear one. This was already noted by Mikhail and Ackermann (1976, p. 80), who stated: “Although in practical applications linearized functions are used regularly for the propagation of variances and covariances, it should be pointed out that this is permitted if the range of dispersion in $\tilde{x}_1, \tilde{x}_2$ is small when linear approximation is compared to the curvature of the function in the neighborhood of $x_1^0, x_2^0$. In other words the function should be approximated well by its tangent within the region of interest - that is, the region of dispersion of the random variables.” In the same line of thinking, Lösler et al. (2016) considered this approximate solution for the variances and covariances as “distorted”.
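A first order variance-covariance propagation can be sketched as follows. The nonlinear relationship chosen here (conversion of a measured distance and direction into Cartesian coordinates), the numerical values and the function names `psi` and `jacobian` are assumptions for illustration and are not taken from the text.

```python
import numpy as np

def psi(L):
    """Hypothetical nonlinear relationship X = Psi(L): polar (s, t) to Cartesian (x, y)."""
    s, t = L
    return np.array([s * np.cos(t), s * np.sin(t)])

def jacobian(L):
    """Analytical Jacobian of psi, evaluated at the point of linearization L."""
    s, t = L
    return np.array([[np.cos(t), -s * np.sin(t)],
                     [np.sin(t),  s * np.cos(t)]])

L0 = np.array([100.0, np.pi / 6])             # measured distance [m] and direction [rad]
Sigma_LL = np.diag([0.01**2, 0.001**2])        # assumed variances of distance and direction

# Linear approximation of psi at L0, followed by the propagation law (2.54).
F = jacobian(L0)
Sigma_XX = F @ Sigma_LL @ F.T

print("first order VC matrix of (x, y):\n", Sigma_XX)
```

The result is only as good as the tangent approximation within the dispersion of the measurements, which is exactly the caveat raised in the quotation above.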

A second order variance-covariance propagation also yields an approximate solution for the variances and covariances of the unknown parameters, by taking into account the first and second order terms of the nonlinear functional relationships. It can be seen as a better approximation than the first order variances-covariances; however, it needs to be verified in the same manner as the first order approximation. A numerical example for this approach can be found in (Lösler et al. 2016).


The Monte Carlo method is a well-known statistical approach that is based on the repeated generation of random data samples for performing simulations. Authors who employed MCS for estimating the variances and covariances of adjustment results include Alkhatib (2007), Alkhatib and Schuh (2007) and Lösler et al. (2016). The advantage of MCS is that the variances and covariances can be estimated directly from the nonlinear functional relationships, in contrast to VC propagation, which is restricted to approximate functional relationships. To demonstrate a solution for the variances and covariances using MCS, equation (2.57) is expressed in vector form as
\[
X = \Psi(L), \tag{2.58}
\]

with Ψ denoting a vector that lists the nonlinear functions. Following (Lösler et al. 2016), the expectation values and the variances and covariances of the unknown parameters X can be obtained by means of a MC simulation, using N independent repetitions of a random experiment. The first step involves the generation of N random samples of the measurements
\[
L_j = L + \Delta L_j, \quad \text{for } j = 1, \ldots, N, \tag{2.59}
\]
with the error vector ΔL being normally distributed ($\Delta L \sim N(0, \Sigma_{LL})$), which results in
\[
\begin{aligned}
E\{X\} &= \frac{1}{N} \sum_{j=1}^{N} X_j = \frac{1}{N} \sum_{j=1}^{N} \Psi(L_j),\\
\Sigma_{XX} &= \frac{1}{N} \sum_{j=1}^{N} E\left\{ \left[ X_j - E\{X_j\} \right] \left[ X_j - E\{X_j\} \right]^T \right\}.
\end{aligned}
\tag{2.60}
\]
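The two steps (2.59) and (2.60) can be sketched in a few lines. This is an illustration under assumptions that are not part of the text: the nonlinear function `psi` (polar to Cartesian conversion), the numerical values, the sample size and the fixed random seed are all hypothetical, and the expectation in (2.60) is replaced by the sample mean over all N realizations.

```python
import numpy as np

def psi(L):
    """Hypothetical nonlinear relationship X = Psi(L): polar (s, t) to Cartesian (x, y)."""
    s, t = L[..., 0], L[..., 1]
    return np.stack([s * np.cos(t), s * np.sin(t)], axis=-1)

rng = np.random.default_rng(seed=1)           # fixed seed, for reproducibility only
L = np.array([100.0, np.pi / 6])              # measured distance [m] and direction [rad]
Sigma_LL = np.diag([0.01**2, 0.001**2])        # assumed VC matrix of the measurements
N = 200_000                                    # number of repetitions (assumed)

# (2.59): L_j = L + dL_j with dL_j ~ N(0, Sigma_LL)
L_samples = L + rng.multivariate_normal(np.zeros(2), Sigma_LL, size=N)

# (2.60): sample mean and sample VC matrix of X_j = Psi(L_j)
X_samples = psi(L_samples)
X_mean = X_samples.mean(axis=0)
dX = X_samples - X_mean
Sigma_XX = dX.T @ dX / N

print("estimated expectation:", X_mean)
print("estimated VC matrix:\n", Sigma_XX)
```

No derivatives of `psi` are needed; the price is the sampling noise of the estimates, which decreases only slowly with growing N.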

Moreover, the Unscented Transformation is a statistical approach developed in the last decades that is also utilized for estimating variances and covariances from nonlinear functional relationships. It was first presented by Julier et al. (1995) for filtering nonlinear systems, and was later extended in (Julier and Uhlmann 1996) and (Julier and Uhlmann 2000) to the approximation of distribution functions and variances-covariances of unknown parameters. This approach involves a sampling strategy of “sigma points”¹¹ based on the a priori statistical information of the measurements. The derived sigma points are in turn used to approximate the distribution functions or variances and covariances of the unknown parameters, even in cases of nonlinear functional relationships. Julier and Uhlmann (2000) state that this approximation is similar to a second order approximation, however without the need to compute derivatives. A solution from the UT approach has also been discussed in the study of Lösler et al. (2016).

¹¹ “Sigma points” can be seen as synthetic measurements that have been generated by adding randomly distributed errors to the original measurements.

2.4 Synopsis of the basics in adjustment calculus

It has been shown that the mathematical modelling of redundant measurement results, as well as the statistical properties of the measurement errors, embody the fundamental parts of every adjustment problem. Only a correct mathematical model can lead to meaningful conclusions regarding estimates of unknown parameters and their standard deviations.

An under-determined system of equations is a consequence of every adjustment of measurements that are

contaminated by errors. Depending on the nature of the errors (e.g. random errors) an appropriate method

is employed by means of a criterion for the residuals, which can lead to “optimal” adjustment results.

A least squares solution for the unknown parameters, the residuals and the adjusted measurements can

be obtained by solving a system of normal equations, which is a result of the minimization of a clearly

defined objective function. The adjustment results can be evaluated in terms of precision and reliability by

computing and interpreting their stochastic parameters. These are statistical measures that can be obtained

by using the rules of error propagation or some other approach, depending on the nature of the problem,

i.e. the functional relationship between the estimated parameters and the measurements.

A definition of linear and nonlinear least squares problems has been presented in this chapter. Linear functional models lead to linear normal equations, and a straightforward solution of the adjustment problem is possible. For nonlinear cases, however, the solution can be more complicated and a specific strategy may have to be followed. In the next chapter various approaches are discussed for the solution of nonlinear least squares problems. These include traditional approaches that have been utilized in geodetic science for many years and can be found in most standard adjustment textbooks, as well as more modern approaches that have been presented by the mathematical/statistical community in the last decades.

3 Solutions of nonlinear least squares problems

This chapter summarizes two main strategies for solving nonlinear least squares problems. The first has

been presented in most common geodetic literature and used extensively by geodesists, while the second

has been developed recently by the mathematical and statistical scientific community. Nevertheless, before

any discussion about solving nonlinear adjustment problems with the method of least squares, it would

be advantageous to make clear the viewpoint on the various models and optimization approaches that are

considered in the following.

Nonlinear adjustments occur often not only in geodetic science, but also in geodetic practice. Depending on

the functional relationship between the observed quantities, the unknown parameters and the fixed/constant

parameters, two nonlinear cases are distinguished here:

- The adjustment of nonlinear observation equations (extended to the case with nonlinear constraints);

- The adjustment of nonlinear condition equations with unknown parameters (extended to the case with

nonlinear constraints).

The least squares solution of such adjustment problems can be obtained iteratively involving sometimes a

linearization, or in some specific cases directly. Various optimization approaches exist for the solution of such

problems and can be classified into global optimization or local optimization. Some of those approaches have

been extensively applied in geodetic science, for instance in (Pope 1974), (Madsen et al. 2004) or (Neitzel

and Petrovic 2008), as for example the

- Gauss-Newton,

- Newton-Raphson,

- Levenberg–Marquardt,

- heuristic optimization.

The numerical characteristics of a class of iterative algorithms for the solution of nonlinear least squares

problems, with particular focus on the Gauss-Newton approach, have been investigated in (Teunissen 1985)

and (Teunissen 1990) in terms of differential geometric concepts.


A comparison between these approaches¹ is out of the scope of this work, and only the Gauss-Newton approach will be utilized in the following sections. The others have been mentioned here for the sake of completeness and will not be analysed further.

An additional algorithmic approach for a class of nonlinear least squares problems has been defined and

presented by (Golub and Van Loan 1980) under the name TLS. In the last three decades many authors dealt

with this approach and various modern algorithms have been developed since then. A thorough analysis of

TLS and its development is provided at a later point in this chapter.

Depending on the nature of the adjustment problem, as well as the chosen optimization strategy, three

individual adjustment models are identified and discussed:

- The Gauss-Markov model (GMM)

- The Gauss-Helmert model (GHM)

- The Errors in variables model (EIV)

Figure 3.1 depicts a diagram with the solutions of nonlinear least squares problems that are discussed in

this chapter.

[Diagram: the class of nonlinear least-squares problems is treated either by the geodetic solution (Gauss-Newton approach: linearized functional model, Gauss-Markov model or Gauss-Helmert model, iterative solution, local optimization) or by the mathematical solution (total least-squares approach: nonlinear functional model, errors in variables model, either as a direct solution, TLS with SVD, global optimization, or as an iterative solution, WTLS algorithms, local optimization).]

Figure 3.1: Two optimization approaches for the solution of a class of nonlinear least squares problems.

It must be clarified that Gauss-Newton is a general optimization approach that can be employed for finding a minimum of any nonlinear least squares problem. The TLS approach, on the other hand, can be utilized only for a special class of nonlinear least squares problems. For the scope of this work it is therefore necessary to discuss only adjustment problems that have a solution using TLS.

¹ It is important to mention that all presented approaches are iterative and can be employed for the solution of nonlinear least squares problems, however without necessarily guaranteeing convergence. Thus, a solution is not always possible with a specific approach, and some approaches may provide a solution while others do not, depending on the individual adjustment problem. More information about additional optimization approaches and a thorough analysis of the presented ones can be found in (Madsen et al. 2004).

3.1 Traditional geodetic solutions

Solutions of nonlinear least squares problems with the Gauss-Newton approach have already been discussed in (Wells and Krakiwsky 1971), (Pope 1974), (Niemeier 2008), (Neitzel and Petrovic 2008) or (Neitzel 2010). The approach involves a linear approximation of the observation/condition equations (as well as of the additional constraints between the unknowns in special cases), thus initial values for the unknown parameters are necessary. The objective function derived from the linearized model can have only one minimum, and after an iterative process a local minimum of the “original” nonlinear problem is reached. The corrections estimated in each iteration are used to update the initial values for the next iteration. This iterative process continues until a predefined threshold is reached (or another stopping condition is met). Thus, the main characteristics of a least squares solution of an adjustment problem using Gauss-Newton are:

- linear approximation of the nonlinear observation/condition equations,

- linear approximation of the nonlinear constraints between the unknown parameters,

- initial values for the unknown parameters,

- iterative procedure,

- local optimization.

Depending on the individual nonlinear adjustment problem (for example adjustment with observation equa-

tions or adjustment with condition equations), a least squares solution can be obtained within the GMM or

the GHM. Both adjustment models are thoroughly discussed and analysed in the next subsections.

3.1.1 Adjustment with observation equations and constraints

The starting point is the functional model that explicitly relates the observations $l_i$, the residuals $v_i$ (with $i = 1, 2, \ldots, n$) and the unknown parameters $x_j$ (with $j = 1, 2, \ldots, m$) and can be expressed by the nonlinear observation equations²
\[
\begin{aligned}
l_1 + v_1 &= \varphi_1(x_1, \ldots, x_m),\\
l_2 + v_2 &= \varphi_2(x_1, \ldots, x_m),\\
&\;\;\vdots\\
l_n + v_n &= \varphi_n(x_1, \ldots, x_m),
\end{aligned}
\tag{3.1}
\]
with $\varphi_i$ denoting nonlinear differentiable functions of the unknown parameters $x_j$. Storing the observations $l_i$ in a column vector $L$, the residuals $v_i$ in vector $v$ and the nonlinear functions $\varphi_i(x_1, \ldots, x_m)$ in the formal vector $\Phi(X)$, it is possible to write the system of observation equations (3.1) in vector notation as
\[
L + v = \Phi(X). \tag{3.2}
\]

² The linear case of this problem has been discussed in section 2.2.2, expressed by the linear system of equations (2.38) with $f_i$ denoting in that case linear functions.

Linearization of the functional model

The first step in the Gauss-Newton approach is the linear approximation of the nonlinear functional model. In the case of observation equations, the linearization is performed on the nonlinear differentiable functions $\varphi_i(x_1, \ldots, x_m)$. For instance, the first order Taylor series expansion of the formal vector $\Phi(X)$ at the point $X^0$ can be taken, which reads
\[
L + v = \Phi(X) \approx \Phi(X^0) + \left. \frac{\partial \Phi(X)}{\partial X} \right|_{X = X^0} \left( X - X^0 \right), \tag{3.3}
\]
with $X^0$ denoting the vector of approximate unknown parameters $x_1^0, x_2^0, \ldots, x_m^0$ (i.e. initial values are necessary for approximating $X$). The partial derivatives of the nonlinear functions in $\Phi(X)$ with respect to the unknown parameters can be expressed equivalently by the Jacobian matrix

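\[
J_x = \left. \frac{\partial \Phi(X)}{\partial X} \right|_{X = X^0} =
\begin{bmatrix}
\dfrac{\partial \varphi_1}{\partial x_1} & \dfrac{\partial \varphi_1}{\partial x_2} & \cdots & \dfrac{\partial \varphi_1}{\partial x_m}\\[2mm]
\dfrac{\partial \varphi_2}{\partial x_1} & \dfrac{\partial \varphi_2}{\partial x_2} & \cdots & \dfrac{\partial \varphi_2}{\partial x_m}\\[2mm]
\vdots & \vdots & \ddots & \vdots\\[1mm]
\dfrac{\partial \varphi_n}{\partial x_1} & \dfrac{\partial \varphi_n}{\partial x_2} & \cdots & \dfrac{\partial \varphi_n}{\partial x_m}
\end{bmatrix}.
\tag{3.4}
\]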

A vector of reduced observations is introduced with
\[
l = L - \Phi(X^0) \tag{3.5}
\]
and the vector of corrections, containing the differences between the unknown parameters ($X$) to be estimated and the approximate ones ($X^0$), is written as
\[
x = X - X^0. \tag{3.6}
\]
Substituting the Jacobian matrix $J_x$, the vector of reduced observations $l$ and the vector of corrections $x$ in equation (3.3) results in the linearized observation equations
\[
l + v = J_x\, x. \tag{3.7}
\]
Following (Pasioti 2015), the Jacobian matrix $J_x$ can be represented symbolically as the design matrix $A$, whose elements are the partial derivatives of the nonlinear functions $\varphi_i(x_1, x_2, \ldots, x_m)$ with respect to all unknown parameters. Thus, the linearized observation equations can be expressed equivalently by
\[
l + v = A x. \tag{3.8}
\]


Taking into account the precision of the observed quantities, the stochastic model of the problem can be expressed by the weight matrix $P$. The mathematical model of this problem represents the well-known GMM, as explained in (Niemeier 2008, p. 137). Assuming normally distributed residuals ($v \sim N(0, \Sigma_{LL})$), the method of least squares will provide the most probable solution for the vector of unknown parameters and the residuals.

3.1.1.1 Least squares parameter estimation within the GMM

For obtaining a least squares solution of an adjustment problem within the GMM, an appropriate objective function has to be built and the weighted sum of squared residuals needs to be minimized. In matrix notation the objective function is
\[
\Omega(v) = v^T P v \rightarrow \min. \tag{3.9}
\]
Solving for the residual vector in equation (3.8) yields
\[
v = A x - l, \tag{3.10}
\]
which can be used to reformulate the objective function to
\[
\Omega(x) = (A x - l)^T P (A x - l). \tag{3.11}
\]

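A solution for the unknown correction vector $x$ is required that minimizes $\Omega(x)$. Taking the partial derivatives of the objective function with respect to the unknowns and setting them to zero yields the normal equation system
\[
\frac{\partial \Omega(x)}{\partial x^T} = \left[ \frac{\partial \Omega}{\partial x_1}, \frac{\partial \Omega}{\partial x_2}, \cdots, \frac{\partial \Omega}{\partial x_m} \right]^T = 2 \left( A^T P A\, x - A^T P\, l \right) = 0. \tag{3.12}
\]
Therefore, the least squares estimate for the correction vector is
\[
\hat{x} = (A^T P A)^{-1} (A^T P\, l). \tag{3.13}
\]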

However, it is worth mentioning that explicit matrix inversion is prone to rounding errors, and even the use of the double-precision floating-point format for the computation is in many cases insufficient. An error analysis of matrix computations and alternative direct and iterative solutions for linear systems can be found e.g. in (Golub and Van Loan 1996) or (Björck 2015).
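The remark on matrix inversion can be made concrete with a short sketch that computes the estimate (3.13) in three ways. Everything in it is an assumption for illustration: the synthetic design matrix, the diagonal weight matrix and the choice of NumPy routines; it is not the procedure used in this work.

```python
import numpy as np

rng = np.random.default_rng(seed=0)              # synthetic data, for illustration only
n, m = 20, 3
A = rng.normal(size=(n, m))                      # hypothetical design matrix
l = rng.normal(size=n)                           # hypothetical reduced observations
P = np.diag(rng.uniform(0.5, 2.0, size=n))       # hypothetical diagonal weight matrix

N = A.T @ P @ A
rhs = A.T @ P @ l

# (a) Textbook form (3.13) with an explicit inverse.
x_inv = np.linalg.inv(N) @ rhs

# (b) Solve the normal equations without forming the inverse.
x_solve = np.linalg.solve(N, rhs)

# (c) Avoid the normal equations: with P = W^T W, minimize ||W A x - W l||_2
#     (numpy.linalg.lstsq uses an SVD-based LAPACK routine).
W = np.sqrt(P)                                   # valid because P is diagonal here
x_lstsq, *_ = np.linalg.lstsq(W @ A, W @ l, rcond=None)

# Forming A^T P A squares the condition number of the weighted design matrix.
print(np.linalg.cond(N), np.linalg.cond(W @ A) ** 2)
```

For this well-conditioned toy problem all three variants agree to rounding level; the difference becomes relevant when $A^T P A$ is ill-conditioned, because its condition number is the square of that of the weighted design matrix.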

According to (Perović 2005, p. 84), the second derivative of the objective function with respect to the unknowns is
\[
\frac{\partial^2 \Omega(x)}{\partial x^2} = 2 (A^T P A). \tag{3.14}
\]
As long as the matrix product $A^T P A$ is positive semi-definite, the stationary point (3.13) of the objective function will be a minimum; the inverse in (3.13) additionally requires $A^T P A$ to be regular, i.e. positive definite.


Iterative solution for the adjustment results

Madsen et al. (2004) explained that a solution for the unknown vector of parameters $\hat{X}$ can be obtained iteratively, by means of a local minimizer $\hat{x}$. The initial values given in the first iteration step define the area where the algorithm starts descending towards the local minimum of the objective function. From all local minima the global minimum is requested, thus “good” initial values are necessary. In every iteration step the estimated vector $\hat{X}$ is utilized as initial approximation for the next iteration
\[
X^0_{i+1} = \hat{X}_i, \tag{3.15}
\]

with $i$ denoting the iteration step. After the termination of the iterative procedure, the final solution for the vector of corrections $\hat{x}$ is utilized for computing an estimate for the vector of unknown parameters
\[
\hat{X} = X^0_{\text{final}} + \hat{x}_{\text{final}}, \tag{3.16}
\]
with the subscript “final” denoting the last iteration step. The solution for the residuals can be obtained by
\[
\hat{v} = A \hat{x}_{\text{final}} - l \tag{3.17}
\]
and the vector of adjusted observations by
\[
\hat{L} = L + \hat{v}. \tag{3.18}
\]

The iterative process can be terminated when adequate break-off conditions (stopping criteria) are fulfilled. According to (Pasioti 2015), suitable stopping criteria for the iterations can be the following:

1. Computation error:
The element of the vector of corrections $\hat{x}$ with the maximum absolute value should become smaller than or at least equal to a predefined threshold $\varepsilon$:
\[
\max |\hat{x}| \le \varepsilon. \tag{3.19}
\]
In (Pasioti 2015) it is stated “in practice all computations are performed with software compilers and the usage of decimal places is translated differently to the computer’s world. Instead significant places are used. Therefore, a small value for $\varepsilon$ is highly recommended to be chosen”. Moreover, it must be pointed out that a meaningful choice for the threshold parameter $\varepsilon$ depends on the significant digits and their position inside the number system (a computer uses a binary system, while humans usually think in the decimal system).

2. Linearization error:
The maximum absolute difference between the elements of the estimated vector $\hat{L}$ (linearized problem) and the vector $\Phi(\hat{X})$ (“original” nonlinear problem) should become smaller than or equal to a predefined value $\delta$:
\[
\max |\hat{L} - \Phi(\hat{X})| \le \delta. \tag{3.20}
\]
For the second criterion, a value for $\delta$ should be chosen as close to zero as possible. This threshold value ensures that the linear approximation of the functional model has been performed correctly, as explained in (Pasioti 2015) in a few sentences: “The linearisation error is a safeguard against wrong linearisation so that the solution of the linearised problem is also the solution of the original nonlinear problem. If the tolerance criterion is not met then the linearisation is inconsistent.”
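The iterative scheme of equations (3.5) to (3.20) can be sketched compactly. This is a minimal sketch under stated assumptions and not the implementation used in this work: the exponential observation model `phi`, its Jacobian `design_matrix`, the error-free synthetic observations, the initial values and the thresholds `eps` and `delta` are all hypothetical.

```python
import numpy as np

def phi(X, t):
    """Hypothetical nonlinear observation equations: phi_i(X) = a * exp(b * t_i)."""
    a, b = X
    return a * np.exp(b * t)

def design_matrix(X, t):
    """Jacobian of phi with respect to X = (a, b), evaluated at X, cf. (3.4)."""
    a, b = X
    e = np.exp(b * t)
    return np.column_stack([e, a * t * e])

def gauss_newton(L, t, P, X0, eps=1e-10, delta=1e-8, max_iter=50):
    X0 = np.asarray(X0, dtype=float)
    for iteration in range(1, max_iter + 1):
        A = design_matrix(X0, t)                            # linearization at X0
        l = L - phi(X0, t)                                  # reduced observations (3.5)
        x_hat = np.linalg.solve(A.T @ P @ A, A.T @ P @ l)   # corrections (3.13)
        v_hat = A @ x_hat - l                               # residuals (3.17)
        X_hat = X0 + x_hat                                  # parameters (3.16)
        L_hat = L + v_hat                                   # adjusted observations (3.18)
        crit_1 = np.max(np.abs(x_hat))                      # computation error (3.19)
        crit_2 = np.max(np.abs(L_hat - phi(X_hat, t)))      # linearization error (3.20)
        if crit_1 <= eps and crit_2 <= delta:
            return X_hat, v_hat, iteration
        X0 = X_hat                                          # update (3.15)
    raise RuntimeError("no convergence within max_iter iterations")

t = np.linspace(0.0, 4.0, 9)
L = phi(np.array([2.0, -0.5]), t)              # error-free synthetic observations
P = np.eye(t.size)                             # equal weights (assumed)
X_hat, v_hat, n_iter = gauss_newton(L, t, P, X0=[1.8, -0.45])
print(X_hat, n_iter)
```

With initial values far from the solution the same loop may fail to converge or may end in a different local minimum, which is why the sketch raises an error after `max_iter` iterations instead of returning the last iterate.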

3.1.1.2 Error estimation within the GMM

The precision of the estimated unknown parameters in $\hat{x}$ can be expressed by the VC matrix $\Sigma_{\hat{X}\hat{X}}$, as explained in (Niemeier 2008, p. 272). Approximate solutions for the stochastic properties of the unknown parameters in a nonlinear adjustment problem can be obtained in terms of a first order variance-covariance propagation. This procedure is based on the utilization of the linearized functional model and the employment of the propagation law of variances and covariances that was discussed in section 2.3.

Stochastic properties of the estimated unknown parameters

The starting point is the linear functional relationship between the vector of corrections $\hat{x}$ and the vector of reduced observations $l$:
\[
\hat{x} = (A^T P A)^{-1} A^T P\, l.
\]
This equation can be equivalently formulated as
\[
\hat{x} = F\, l, \tag{3.21}
\]
after introducing matrix
\[
F = (A^T P A)^{-1} A^T P. \tag{3.22}
\]

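Following (Niemeier 2008, p. 140), only $l$ is a stochastic quantity, while $F$ can be taken as “fixed”, i.e. deterministic. Utilizing the propagation law of variances and covariances from section 2.3, the cofactor matrix of $\hat{x}$ can be computed here by
\[
\begin{aligned}
Q_{\hat{x}\hat{x}} &= F\, Q_{ll}\, F^T\\
\Rightarrow Q_{\hat{x}\hat{x}} &= (A^T P A)^{-1} A^T P\, Q_{ll} \left[ (A^T P A)^{-1} A^T P \right]^T\\
\Rightarrow Q_{\hat{x}\hat{x}} &= (A^T P A)^{-1} A^T \underbrace{P\, Q_{ll}}_{I} P A\, (A^T P A)^{-1}\\
\Rightarrow Q_{\hat{x}\hat{x}} &= (A^T P A)^{-1} \underbrace{A^T P A\, (A^T P A)^{-1}}_{I}\\
\Rightarrow Q_{\hat{x}\hat{x}} &= (A^T P A)^{-1}.
\end{aligned}
\tag{3.23}
\]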


In case of large measurement samples the a posteriori variance of the unit weight

$$s_0^2 = \frac{\hat{v}^T P \hat{v}}{r_d}, \quad \text{with redundancy: } r_d = n - m, \tag{3.24}$$

converges stochastically to $\sigma_0^2$, with

$$E\{s_0^2\} = \sigma_0^2 \tag{3.25}$$

and the VC matrix for the estimated unknown parameters can be computed by

$$\Sigma_{\hat{x}\hat{x}} = s_0^2\, Q_{\hat{x}\hat{x}}. \tag{3.26}$$

Finally, a first order approximate solution for the variances and covariances of the estimated unknown parameters $\hat{X}$ can be derived as

$$\Sigma_{\hat{X}\hat{X}} = \Sigma_{\hat{x}\hat{x}} \quad \text{and} \quad Q_{\hat{X}\hat{X}} = Q_{\hat{x}\hat{x}}. \tag{3.27}$$

Stochastic properties of the residuals and the adjusted observations

Approximate cofactor and VC matrices for the adjusted observations and the computed residuals can be

derived following the same line of thinking as in (Niemeier 2008, p. 141). Making use of the linearized

functional model (3.8) it is possible to express the adjusted reduced observations by

$$\hat{l} = l + \hat{v} = A\hat{x}, \tag{3.28}$$

with the respective cofactor matrix being

$$Q_{\hat{l}\hat{l}} = A\, Q_{\hat{x}\hat{x}} A^T. \tag{3.29}$$

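Reformulating equation (3.28) appropriately results in

$$\begin{aligned}
\hat{v} &= A\hat{x} - l \\
\Rightarrow\; \hat{v} &= A\, Q_{\hat{x}\hat{x}} A^T P\, l - l \\
\Rightarrow\; \hat{v} &= \left(A\, Q_{\hat{x}\hat{x}} A^T P - I_n\right) l.
\end{aligned} \tag{3.30}$$

Thus, the cofactor matrix of the residuals can be computed by

$$Q_{\hat{v}\hat{v}} = \left(A\, Q_{\hat{x}\hat{x}} A^T P - I_n\right) Q_{ll} \left(A\, Q_{\hat{x}\hat{x}} A^T P - I_n\right)^T, \tag{3.31}$$

which after some examination can be simplified to

$$Q_{\hat{v}\hat{v}} = Q_{ll} - Q_{\hat{l}\hat{l}}. \tag{3.32}$$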
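The simplification from (3.31) to (3.32) can be verified numerically. The sketch below assumes NumPy and hypothetical random matrices; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 2                                   # hypothetical problem size
A = rng.normal(size=(n, m))
Qll = np.diag(rng.uniform(0.5, 2.0, size=n))  # cofactor matrix of the observations
P = np.linalg.inv(Qll)

Qxx = np.linalg.inv(A.T @ P @ A)              # eq. (3.23)
Ql_hat = A @ Qxx @ A.T                        # adjusted observations, eq. (3.29)
G = A @ Qxx @ A.T @ P - np.eye(n)             # v_hat = G l, eq. (3.30)
Qvv_long = G @ Qll @ G.T                      # eq. (3.31)
Qvv_short = Qll - Ql_hat                      # eq. (3.32)
```

Both expressions should coincide up to rounding. As a side check, the trace of `P @ Qvv_short` reproduces the redundancy $n - m$ of (3.24).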


3.1.1.3 Least squares parameter estimation within the GMM with constraints

In this case a set of constraints between the unknown parameters has to be taken into account in the

functional model. If these constraints are represented by nonlinear functional relationships, then they have

to be linearized together with the observation equations for a solution using the Gauss-Newton approach.

For example, a number of $n_c$ constraints are enforced in the adjustment problem, which can be expressed by the system of nonlinear equations

$$\begin{aligned}
\psi_1(x_1, x_2, \dots, x_m) &= 0, \\
\psi_2(x_1, x_2, \dots, x_m) &= 0, \\
&\;\;\vdots \\
\psi_{n_c}(x_1, x_2, \dots, x_m) &= 0.
\end{aligned} \tag{3.33}$$

Listing the nonlinear functions $\psi_1, \psi_2, \dots, \psi_{n_c}$ in the formal vector $\Psi(X)$, it is possible to write the system of constraints in vector notation

$$\Psi(X) = 0. \tag{3.34}$$

The first order Taylor series expansion of $\Psi(X)$ at the point $X^0$ reads

$$\Psi(X) \approx \Psi(X^0) + \left.\frac{\partial \Psi(X)}{\partial X}\right|_{X = X^0} (X - X^0). \tag{3.35}$$

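The partial derivatives of the constraint functions $\psi$ with respect to the unknown parameters can be represented by a Jacobian matrix

$$J_c = \left.\frac{\partial \Psi(X)}{\partial X}\right|_{X = X^0} =
\begin{bmatrix}
\dfrac{\partial \psi_1}{\partial x_1} & \dfrac{\partial \psi_1}{\partial x_2} & \cdots & \dfrac{\partial \psi_1}{\partial x_m} \\
\dfrac{\partial \psi_2}{\partial x_1} & \dfrac{\partial \psi_2}{\partial x_2} & \cdots & \dfrac{\partial \psi_2}{\partial x_m} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial \psi_{n_c}}{\partial x_1} & \dfrac{\partial \psi_{n_c}}{\partial x_2} & \cdots & \dfrac{\partial \psi_{n_c}}{\partial x_m}
\end{bmatrix}. \tag{3.36}$$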

Introducing the vector of corrections $x = X - X^0$ from equation (3.6), the Jacobian matrix $J_c$ and the vector of misclosures

$$w = \Psi(X^0) \tag{3.37}$$

into equation (3.34), yields the system of linearized constraints

$$J_c\, x + w = 0. \tag{3.38}$$

The Jacobian $J_c$ can be regarded as a design matrix, with its elements being the linear approximations of the nonlinear functions in (3.33) with respect to all unknown parameters. Denoting $J_c$ by $C$, equation (3.38) can be written as

$$C x + w = 0. \tag{3.39}$$

The mathematical model in this case can be regarded as a GMM with constraints between the unknown

parameters. A definition can also be found in (Perović 2005, p. 189).

In the special case that constraints are imposed on the unknown parameters, a least squares estimate can be acquired with the method of Lagrange multipliers, as explained in subsection 2.2.2. Thus, combining the objective function (3.9) and the constraints (3.39) yields the Lagrangian

$$K(v, x, k) = v^T P v + 2 k^T (C x + w) \to \min, \tag{3.40}$$

which can be expressed equivalently as

$$K(x, k) = (A x - l)^T P (A x - l) + 2 k^T (C x + w). \tag{3.41}$$

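The auxiliary vector $k$ holds the Lagrange multipliers. A least squares solution for vectors $x$ and $k$ is required that minimizes the developed Lagrange function. Taking the partial derivatives of $K(x, k)$ with respect to all unknowns and setting them to zero yields the system of normal equations

$$\begin{aligned}
\frac{\partial K}{\partial x^T} &= 2\,(A^T P A x - A^T P l + C^T k) = 0 \\
\Rightarrow\; A^T P A x + C^T k &= A^T P l,
\end{aligned} \tag{3.42}$$

$$\begin{aligned}
\frac{\partial K}{\partial k^T} &= 2\,(C x + w) = 0 \\
\Rightarrow\; C x &= -w.
\end{aligned} \tag{3.43}$$

Equations (3.42) and (3.43) can be combined with the block matrices

$$\begin{bmatrix} A^T P A & C^T \\ C & 0 \end{bmatrix}
\begin{bmatrix} x \\ k \end{bmatrix} =
\begin{bmatrix} A^T P l \\ -w \end{bmatrix}. \tag{3.44}$$

The least squares estimate for the vector of corrections and the vector of Lagrange multipliers is

$$\begin{bmatrix} \hat{x} \\ \hat{k} \end{bmatrix} =
\begin{bmatrix} A^T P A & C^T \\ C & 0 \end{bmatrix}^{-1}
\begin{bmatrix} A^T P l \\ -w \end{bmatrix}. \tag{3.45}$$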

According to (Niemeier 2008, p. 265), an equivalent solution can be obtained in case of nonsingular products $[A^T P A]$ and $[C (A^T P A)^{-1} C^T]$ by

$$\begin{bmatrix} \hat{x} \\ \hat{k} \end{bmatrix} =
\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}
\begin{bmatrix} A^T P l \\ -w \end{bmatrix}, \tag{3.46}$$

with the respective quantities

$$\begin{aligned}
Q_{22} &= -\left[C (A^T P A)^{-1} C^T\right]^{-1}, \\
Q_{12} &= -(A^T P A)^{-1} C^T Q_{22}, \\
Q_{21} &= Q_{12}^T, \\
Q_{11} &= (A^T P A)^{-1} (I_m - C^T Q_{21}).
\end{aligned} \tag{3.47}$$

$I_m$ is the identity matrix, whose subscript $m$ denotes the number of unknown parameters and thus specifies its dimension. The least squares estimate for the vector of corrections can be explicitly expressed by

$$\hat{x} = Q_{11} (A^T P l) - Q_{12} w \tag{3.48}$$

and the vector of Lagrange multipliers by

$$\hat{k} = Q_{21} (A^T P l) - Q_{22} w. \tag{3.49}$$
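The bordered system (3.45) and its block-wise inverse can be compared numerically. The sketch below assumes NumPy and hypothetical random data; it writes the blocks such that $Q_{12}$ has dimension $m \times n_c$, so that the products in (3.48) and (3.49) are conformable.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, nc = 8, 3, 1                        # hypothetical sizes
A = rng.normal(size=(n, m))
P = np.eye(n)
l = rng.normal(size=n)
C = rng.normal(size=(nc, m))              # linearized constraints C x + w = 0
w = rng.normal(size=nc)

N = A.T @ P @ A
# Direct solution of the bordered normal equation system, eq. (3.45).
K = np.block([[N, C.T], [C, np.zeros((nc, nc))]])
rhs = np.concatenate([A.T @ P @ l, -w])
sol = np.linalg.solve(K, rhs)
x_hat, k_hat = sol[:m], sol[m:]

# Block-wise inverse of the same system.
N_inv = np.linalg.inv(N)
Q22 = -np.linalg.inv(C @ N_inv @ C.T)
Q12 = -N_inv @ C.T @ Q22
Q21 = Q12.T
Q11 = N_inv @ (np.eye(m) - C.T @ Q21)
x_blocks = Q11 @ (A.T @ P @ l) - Q12 @ w   # eq. (3.48)
k_blocks = Q21 @ (A.T @ P @ l) - Q22 @ w   # eq. (3.49)
```

Both routes should give the same corrections and multipliers, and the estimate should satisfy the linearized constraints exactly.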

A local minimizer of the Lagrange function (3.41) can be obtained iteratively. Equations (3.16), (3.17) and

(3.18) can be further employed to compute the unknown parameters $\hat{X}$, the residuals $\hat{v}$ and the vector of adjusted observations $\hat{L}$.

3.1.1.4 Error estimation within the GMM with constraints

Approximate error estimates for the parameters in $\hat{x}$ can be derived by making use of the linearized functional relationship

$$\hat{x} = Q_{11} (A^T P l) - Q_{12} w. \tag{3.50}$$

Applying the propagation law of variances and covariances to the last equation, the cofactor matrix for the unknown parameters is found after some investigation to be

$$Q_{\hat{x}\hat{x}} = Q_{11}. \tag{3.51}$$

In this case the number of constraint equations must be taken into account when computing the redundancy

of the problem, with the estimated variance of the unit weight

$$s_0^2 = \frac{\hat{v}^T P \hat{v}}{r_d}, \quad \text{with redundancy: } r_d = n - m + n_c. \tag{3.52}$$

Assuming that $s_0^2$ converges stochastically to $\sigma_0^2$, the VC matrix for $\hat{x}$ is

$$\Sigma_{\hat{x}\hat{x}} = s_0^2\, Q_{\hat{x}\hat{x}}. \tag{3.53}$$

Furthermore, the VC and cofactor matrices of the estimated unknown parameters $\hat{X}$, the adjusted observations $\hat{L}$ and the residuals $\hat{v}$ are equivalent to those of section 3.1.1.2.


3.1.2 Adjustment with condition equations and constraints

In this section a nonlinear functional model is under consideration that implicitly relates the observations $l_i$, their residuals $v_i$ (with $i = 1, \dots, n$) and the unknown parameters $x_j$ ($j = 1, \dots, m$) through the condition equations

$$\begin{aligned}
\phi_1(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0, \\
\phi_2(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0, \\
&\;\;\vdots \\
\phi_r(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0,
\end{aligned} \tag{3.54}$$

where $\phi_1, \phi_2, \dots, \phi_r$ are nonlinear differentiable functions of the unknown parameters and the residuals. This system of condition equations is expressed equivalently in matrix notation as

$$\Phi(X, L + v) = 0, \tag{3.55}$$

with the formal vector $\Phi$ holding the nonlinear functional relationship between the vector of observations $L$, the vector of residuals $v$ and the vector of unknown parameters $X$.

Following the Gauss-Newton approach, a linear approximation of the nonlinear condition equations has to

be introduced. Here it is important to mention that a correct linearization requires approximate values for both the unknown parameters, $x_j^0$, and the unknown residuals, $v_i^0$. This type of linearization leads to the rigorous solution of the nonlinear adjustment problem, as has already been examined in (Pope 1972), (Lenzmann and Lenzmann 2004) and demonstrated on a practical example by Neitzel and Petrovic (2008). These contributions show very clearly which terms, when neglected, produce merely approximate formulas for the linearized problem³ that yield an unusable solution. Unfortunately, these

approximate formulas can be found in many popular textbooks on adjustment calculus. For more details

please refer to (Neitzel 2010).

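A rigorous solution within the GHM is presented here for a combined adjustment problem, according to the remarks of (Lenzmann and Lenzmann 2004) and (Neitzel 2010). The first order Taylor series approximation of the formal vector $\Phi(X, L + v)$ at the point $X^0$ and $v^0$ reads

$$\begin{aligned}
\Phi(X, L + v) \approx{}& \Phi^0(X^0, L + v^0) + \left.\frac{\partial \Phi(X, L + v)}{\partial X}\right|_{X = X^0,\, v = v^0} (X - X^0) \\
&+ \left.\frac{\partial \Phi(X, L + v)}{\partial v}\right|_{X = X^0,\, v = v^0} (v - v^0).
\end{aligned} \tag{3.56}$$

A linear approximation of the condition equations (3.55) can be expressed by

$$\Phi^0(X^0, L + v^0) + \left.\frac{\partial \Phi(X, L + v)}{\partial X}\right|_{X = X^0,\, v = v^0} (X - X^0) + \left.\frac{\partial \Phi(X, L + v)}{\partial v}\right|_{X = X^0,\, v = v^0} (v - v^0) = 0. \tag{3.57}$$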

³ This approximate linearization has been introduced in standard adjustment textbooks in the past, for example in (Helmert,

1924, pp. 171-174) and could lead to simpler algebraic equations for the approximate solution of the nonlinear adjustment

problem without iterating (i.e. the approximate solution of the problem was obtained after one iteration).


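Forming a first Jacobian matrix that contains the partial derivatives of the condition equations with respect to the unknown parameters

$$J_x = \left.\frac{\partial \Phi(X, L + v)}{\partial X}\right|_{X = X^0,\, v = v^0} =
\begin{bmatrix}
\dfrac{\partial \phi_1}{\partial x_1} & \dfrac{\partial \phi_1}{\partial x_2} & \cdots & \dfrac{\partial \phi_1}{\partial x_m} \\
\dfrac{\partial \phi_2}{\partial x_1} & \dfrac{\partial \phi_2}{\partial x_2} & \cdots & \dfrac{\partial \phi_2}{\partial x_m} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial \phi_r}{\partial x_1} & \dfrac{\partial \phi_r}{\partial x_2} & \cdots & \dfrac{\partial \phi_r}{\partial x_m}
\end{bmatrix} \tag{3.58}$$

and a second that contains the partial derivatives of the condition equations with respect to the residuals

$$J_v = \left.\frac{\partial \Phi(X, L + v)}{\partial v}\right|_{X = X^0,\, v = v^0} =
\begin{bmatrix}
\dfrac{\partial \phi_1}{\partial v_1} & \dfrac{\partial \phi_1}{\partial v_2} & \cdots & \dfrac{\partial \phi_1}{\partial v_n} \\
\dfrac{\partial \phi_2}{\partial v_1} & \dfrac{\partial \phi_2}{\partial v_2} & \cdots & \dfrac{\partial \phi_2}{\partial v_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial \phi_r}{\partial v_1} & \dfrac{\partial \phi_r}{\partial v_2} & \cdots & \dfrac{\partial \phi_r}{\partial v_n}
\end{bmatrix} \tag{3.59}$$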

and introducing them into equation (3.57) results in

$$\Phi^0(X^0, L + v^0) + J_x (X - X^0) + J_v (v - v^0) = 0. \tag{3.60}$$

The Jacobians $J_x$ and $J_v$ can be regarded as design matrices. Denoting $J_x$ by $A$ and $J_v$ by $B$, this linearized equation system can be equivalently written as

$$\Phi^0(X^0, L + v^0) + A (X - X^0) + B (v - v^0) = 0. \tag{3.61}$$

Introducing the vector of misclosures

$$w = \Phi^0(X^0, L + v^0) - B v^0, \tag{3.62}$$

and the vector of corrections $x = X - X^0$ into equation (3.61), yields the linearized functional model

$$B v + A x + w = 0. \tag{3.63}$$

The combination of the developed linearized functional model together with the stochastic model for the

observed quantities results in the famous Gauss-Helmert model. An equivalent definition of this model can


be found in various textbooks and publications in geodetic literature, like (Wolf 1978), (Lenzmann and

Lenzmann 2004), (Perović 2005, p. 203), (Neitzel and Petrovic 2008) or (Neitzel 2010).

3.1.2.1 Least squares parameter estimation within the GHM

A least squares solution for the unknown corrections $x$ can be estimated by minimizing the objective function

$$\Omega(v) = v^T P v. \tag{3.64}$$

Due to the implicit functional relationship between the parameters of this adjustment case, a Lagrangian

$$K(x, v, k) = v^T P v - 2 k^T (B v + A x + w) \tag{3.65}$$

can be formed. Vector $k$ is the vector of Lagrange multipliers. Computing the partial derivatives of $K$ with respect to all unknowns and setting them to zero yields the conditions for a stationary point

$$\frac{\partial K}{\partial v^T} = 2 P v - 2 B^T k = 0 \;\Rightarrow\; v = Q_{ll} B^T k, \tag{3.66}$$

$$\frac{\partial K}{\partial x^T} = -2 A^T k = 0 \;\Rightarrow\; A^T k = 0, \tag{3.67}$$

$$\frac{\partial K}{\partial k^T} = -2\,(B v + A x + w) = 0. \tag{3.68}$$

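Inserting the residual vector from equation (3.66) into (3.68) gives

$$B Q_{ll} B^T k + A x + w = 0. \tag{3.69}$$

Combining equations (3.67) and (3.69) with the block matrices

$$\begin{bmatrix} B Q_{ll} B^T & A \\ A^T & 0 \end{bmatrix}
\begin{bmatrix} k \\ x \end{bmatrix} =
\begin{bmatrix} -w \\ 0 \end{bmatrix}, \tag{3.70}$$

the solution for the unknown parameters is obtained by

$$\begin{bmatrix} \hat{k} \\ \hat{x} \end{bmatrix} =
\begin{bmatrix} B Q_{ll} B^T & A \\ A^T & 0 \end{bmatrix}^{-1}
\begin{bmatrix} -w \\ 0 \end{bmatrix}. \tag{3.71}$$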

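Under the condition that the product $[B Q_{ll} B^T]$ is not singular, the last equation can be expressed by

$$\begin{bmatrix} \hat{k} \\ \hat{x} \end{bmatrix} =
\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}
\begin{bmatrix} -w \\ 0 \end{bmatrix}, \tag{3.72}$$

as presented in (Niemeier 2008, p. 177), with the respective quantities

$$\begin{aligned}
Q_{22} &= -\left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1}, \\
Q_{12} &= -(B Q_{ll} B^T)^{-1} A\, Q_{22}, \\
Q_{21} &= Q_{12}^T, \\
Q_{11} &= (B Q_{ll} B^T)^{-1} (I_r - A\, Q_{21}).
\end{aligned} \tag{3.73}$$

Explicit expressions for the vector of corrections and the vector of Lagrange multipliers are

$$\hat{x} = -Q_{21} w = -\left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} A^T (B Q_{ll} B^T)^{-1} w \tag{3.74}$$

and

$$\hat{k} = -Q_{11} w = -(B Q_{ll} B^T)^{-1} (A \hat{x} + w). \tag{3.75}$$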

A least squares solution for the vector of unknown parameters can be computed iteratively by

$$\hat{X}_i = \hat{x}_i + X^0_i \tag{3.76}$$

and approximate estimates for the residuals by

$$\hat{v}_i = Q_{ll} B^T \hat{k}_i, \tag{3.77}$$

with $i$ indicating the iteration step. Due to the iterative procedure of the Gauss-Newton approach, a local minimizer $\hat{x}$ for the Lagrange function $K(x, v, k)$ will be estimated by a series of adjustments within the GHM. Two vectors have to be updated in each iteration step in this case. The solution for the vector containing the unknown parameters $\hat{X}$ will be introduced as the vector of initial values for the unknown parameters in the next iteration

$$X^0_{i+1} = \hat{X}_i,$$

and the computed residual vector will be used as an initial residual vector

$$v^0_{i+1} = \hat{v}_i. \tag{3.78}$$

A solution can be obtained after the fulfillment of the two stopping criteria (break-off conditions), according to equations (3.19) and (3.20). The final estimated vector of corrections can be utilized for computing the vector of unknown parameters

$$\hat{X} = X^0_{\text{final}} + \hat{x}_{\text{final}}, \tag{3.79}$$

the residuals

$$\hat{v}_{\text{final}} = Q_{ll} B^T \hat{k}_{\text{final}}, \tag{3.80}$$

and the adjusted observations

$$\hat{L} = L + \hat{v}_{\text{final}}, \tag{3.81}$$

with the subscript “final” denoting the last iteration step.

3.1.2.2 Error estimation within the GHM

The starting point for the error estimates in the case of a GHM is the linearized functional relationship between the estimated parameters $\hat{x}$ and the vector of misclosures $w$ of equation (3.74), written as

$$\hat{x} = -\left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} A^T (B Q_{ll} B^T)^{-1} w.$$

Following (Niemeier 2008, p. 178), it is necessary to derive an expression for the vector of misclosures $w$ as a linear function of the vector of the observations. Therefore, taking a first order Taylor approximation of $w$ from equation (3.62) with respect to the vector of observations $L$, results in

$$w = \underbrace{\Phi^0(X^0, L^0 + v^0)}_{0} + \left.\frac{\partial w}{\partial L}\right|_{L = L^0} (L - L^0). \tag{3.82}$$

Forming a Jacobian matrix that contains the partial derivatives of $w$ with respect to the observations

$$J_w = \left.\frac{\partial w}{\partial L}\right|_{L = L^0} \tag{3.83}$$

and by introducing a vector of reduced observations

$$l = L - L^0 \tag{3.84}$$

in equation (3.82), returns

$$w = J_w\, l. \tag{3.85}$$

At the last step of the iterative procedure, i.e. for the final results for the unknown parameters $\hat{x}_{\text{final}}$, it can be shown that the elements of the Jacobian matrix $J_w$ will be equal to the elements of matrix $B$ from equation (3.63). The vector of misclosures can be related linearly with the reduced vector of observations, with

$$w = B\, l, \tag{3.86}$$

which can be introduced in (3.74) to derive

$$\begin{aligned}
\hat{x} &= -\left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} A^T (B Q_{ll} B^T)^{-1} B\, l, \\
\hat{x} &= F\, l.
\end{aligned} \tag{3.87}$$

The auxiliary matrix $F$ can be defined in this case as

$$F = -\left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} A^T (B Q_{ll} B^T)^{-1} B. \tag{3.88}$$

Assuming that $l$ has the same stochastic properties as $L$, the former can be regarded as the only stochastic parameter in equation (3.87). The cofactor matrix for the vector of corrections is

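$$\begin{aligned}
Q_{\hat{x}\hat{x}} &= F\, Q_{ll}\, F^T \\
\Rightarrow\; Q_{\hat{x}\hat{x}} &= \left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} A^T (B Q_{ll} B^T)^{-1} \underbrace{B Q_{ll} B^T (B Q_{ll} B^T)^{-1}}_{I_r} A \left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} \\
\Rightarrow\; Q_{\hat{x}\hat{x}} &= \underbrace{\left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} A^T (B Q_{ll} B^T)^{-1} A}_{I_m} \left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} \\
\Rightarrow\; Q_{\hat{x}\hat{x}} &= \left[A^T (B Q_{ll} B^T)^{-1} A\right]^{-1} = -Q_{22}.
\end{aligned} \tag{3.89}$$

The estimated variance of the unit weight can be computed in this case by

$$s_0^2 = \frac{\hat{v}^T P \hat{v}}{r_d}, \quad \text{with redundancy: } r_d = r - m, \tag{3.90}$$

or using the expression for the residuals from equation (3.66), by

$$s_0^2 = \frac{\hat{k}^T B Q_{ll} P \hat{v}}{r_d} = \frac{\hat{k}^T B \hat{v}}{r_d} = \frac{-\hat{k}^T (A \hat{x} + w)}{r_d}. \tag{3.91}$$

The VC matrix for the vector of corrections is

$$\Sigma_{\hat{x}\hat{x}} = s_0^2\, Q_{\hat{x}\hat{x}}. \tag{3.92}$$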
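The error estimates within the GHM can be reproduced numerically. The sketch below assumes NumPy and hypothetical random Jacobians $A$ and $B$; it only illustrates the algebra of the propagation and of the two expressions for the variance of the unit weight.

```python
import numpy as np

rng = np.random.default_rng(4)
n, r, m = 8, 5, 2                          # hypothetical sizes: observations, conditions, unknowns
A = rng.normal(size=(r, m))                # Jacobian w.r.t. the unknown parameters
B = rng.normal(size=(r, n))                # Jacobian w.r.t. the residuals
Qll = np.diag(rng.uniform(0.5, 2.0, size=n))
P = np.linalg.inv(Qll)
l = rng.normal(size=n)                     # reduced observations
w = B @ l                                  # misclosures, eq. (3.86)

R_inv = np.linalg.inv(B @ Qll @ B.T)
Qxx = np.linalg.inv(A.T @ R_inv @ A)       # cofactor matrix of the corrections
F = -Qxx @ A.T @ R_inv @ B                 # eq. (3.88)
x_hat = F @ l                              # eq. (3.87)
k_hat = -R_inv @ (A @ x_hat + w)           # eq. (3.75)
v_hat = Qll @ B.T @ k_hat                  # eq. (3.66)

rd = r - m                                 # redundancy
s0_sq = (v_hat @ P @ v_hat) / rd           # eq. (3.90)
s0_sq_alt = -(k_hat @ (A @ x_hat + w)) / rd    # eq. (3.91)
Sigma_xx = s0_sq * Qxx                     # eq. (3.92)
```

The propagated matrix `F @ Qll @ F.T` should reproduce `Qxx`, and both expressions for $s_0^2$ should give the same value.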

The residual vector and the vector of adjusted observations can be expressed as functions of the observed quantities. Introducing $k$ and $w$ from equations (3.75) and (3.86) into equation (3.66) results in

$$\hat{v} = Q_{ll} B^T \hat{k} = -Q_{ll} B^T Q_{11} w = -Q_{ll} B^T Q_{11} B\, l \tag{3.93}$$

and

$$\hat{l} = l + \hat{v} = \left(I_n - Q_{ll} B^T Q_{11} B\right) l. \tag{3.94}$$

Following the same line of reasoning as (Niemeier 2008, p. 179), the required cofactor matrix of the residuals can be found from

$$Q_{\hat{v}\hat{v}} = Q_{ll} B^T Q_{11} B\, Q_{ll} \tag{3.95}$$

and the cofactor matrix for the adjusted observations from

$$Q_{\hat{l}\hat{l}} = Q_{ll} - Q_{\hat{v}\hat{v}} = Q_{ll} \left(I_n - B^T Q_{11} B\, Q_{ll}\right). \tag{3.96}$$

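Finally, the necessary VC and cofactor matrices of the adjustment results can be computed as

$$\Sigma_{\hat{X}\hat{X}} = \Sigma_{\hat{x}\hat{x}}, \quad Q_{\hat{X}\hat{X}} = Q_{\hat{x}\hat{x}}$$

and

$$\Sigma_{\hat{L}\hat{L}} = \Sigma_{\hat{l}\hat{l}}, \quad Q_{\hat{L}\hat{L}} = Q_{\hat{l}\hat{l}}.$$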

3.1.2.3 Least squares parameter estimation within the GHM with constraints

An adjustment with condition equations and constraints between the unknown parameters can be treated

like the one presented in subsection 3.1.1.4. For constraints that are represented by nonlinear functional

relationships, a linear approximation will result in the linearized constraint equations (3.39), which are

expressed in this subsection by

$$C x + w_2 = 0, \tag{3.97}$$

with the vector of misclosures $w_2$ for the constraints. The linearized condition equations (3.63) and the constraints (3.97) represent the linearized functional model of this adjustment problem. Taking into account the stochastic information of the observed quantities results in the case of a GHM with constraints. A least squares solution can be found by minimizing the Lagrange function

$$K(x, v, k_1, k_2) = v^T P v - 2 k_1^T (B v + A x + w_1) - 2 k_2^T (C x + w_2) \to \min, \tag{3.98}$$

with the vectors of Lagrange multipliers $k_1$ and $k_2$ and the vector of misclosures $w_1$ for the linearized condition equations⁴. From the standard procedure for obtaining a least squares estimate for the unknowns, the partial derivatives of $K(x, v, k_1, k_2)$ with respect to all unknown parameters are computed and set equal to zero:

$$\frac{\partial K}{\partial v^T} = 2\,(P v - B^T k_1) = 0 \;\Rightarrow\; v = Q_{ll} B^T k_1, \tag{3.99}$$

$$\frac{\partial K}{\partial x^T} = -2\,(A^T k_1 + C^T k_2) = 0 \;\Rightarrow\; A^T k_1 + C^T k_2 = 0, \tag{3.100}$$

$$\frac{\partial K}{\partial k_1^T} = -2\,(B v + A x + w_1) = 0, \tag{3.101}$$

$$\frac{\partial K}{\partial k_2^T} = -2\,(C x + w_2) = 0. \tag{3.102}$$

⁴ The subscript “1” is introduced here to differentiate it from the vector of misclosures $w_2$ for the linearized constraint equations.

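Introducing the vector of residuals from equation (3.99) into (3.101) yields

$$B Q_{ll} B^T k_1 + A x + w_1 = 0. \tag{3.103}$$

Equations (3.100), (3.102) and (3.103) can be expressed as the block matrices

$$\begin{bmatrix} B Q_{ll} B^T & A & 0 \\ A^T & 0 & C^T \\ 0 & C & 0 \end{bmatrix}
\begin{bmatrix} k_1 \\ x \\ k_2 \end{bmatrix} =
\begin{bmatrix} -w_1 \\ 0 \\ -w_2 \end{bmatrix}, \tag{3.104}$$

with the least squares solution for the unknown parameters being obtained by

$$\begin{bmatrix} \hat{k}_1 \\ \hat{x} \\ \hat{k}_2 \end{bmatrix} =
\begin{bmatrix} B Q_{ll} B^T & A & 0 \\ A^T & 0 & C^T \\ 0 & C & 0 \end{bmatrix}^{-1}
\begin{bmatrix} -w_1 \\ 0 \\ -w_2 \end{bmatrix}. \tag{3.105}$$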

For an equivalent solution of the problem, the matrices

$$R = B Q_{ll} B^T \tag{3.106}$$

and

$$M = -A^T (B Q_{ll} B^T)^{-1} A = -A^T R^{-1} A \tag{3.107}$$

can be introduced. If matrix $R$ is regular, then the vector of Lagrange multipliers $k_1$ is

$$k_1 = -R^{-1} (A x + w_1), \tag{3.108}$$

which can be substituted in equation (3.100), yielding

$$\begin{aligned}
-A^T R^{-1} (A x + w_1) + C^T k_2 &= 0 \\
\Rightarrow\; M x - A^T R^{-1} w_1 + C^T k_2 &= 0.
\end{aligned} \tag{3.109}$$

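A least squares solution can be obtained by expressing equations (3.102) and (3.109) as

$$\begin{bmatrix} M & C^T \\ C & 0 \end{bmatrix}
\begin{bmatrix} x \\ k_2 \end{bmatrix} =
\begin{bmatrix} A^T R^{-1} w_1 \\ -w_2 \end{bmatrix}, \tag{3.110}$$

resulting in

$$\begin{bmatrix} \hat{x} \\ \hat{k}_2 \end{bmatrix} =
\begin{bmatrix} M & C^T \\ C & 0 \end{bmatrix}^{-1}
\begin{bmatrix} A^T R^{-1} w_1 \\ -w_2 \end{bmatrix}. \tag{3.111}$$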

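For a regular matrix $M$, the last equation system can be equivalently written as

$$\begin{bmatrix} \hat{x} \\ \hat{k}_2 \end{bmatrix} =
\begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}
\begin{bmatrix} A^T R^{-1} w_1 \\ -w_2 \end{bmatrix}, \tag{3.112}$$

with the respective matrices being

$$\begin{aligned}
Q_{22} &= -\left[C (M^T M)^{-1} M^T C^T\right]^{-1}, \\
Q_{12} &= -(M^T M)^{-1} M^T C^T Q_{22}, \\
Q_{21} &= Q_{12}^T, \\
Q_{11} &= (M^T M)^{-1} M^T (I_m - C^T Q_{21}).
\end{aligned} \tag{3.113}$$

Thus, the vector of corrections can be expressed explicitly by

$$\hat{x} = Q_{11} (A^T R^{-1} w_1) - Q_{12} w_2 \tag{3.114}$$

and the vector of Lagrange multipliers by

$$\hat{k}_2 = Q_{21} (A^T R^{-1} w_1) - Q_{22} w_2. \tag{3.115}$$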


An iterative procedure is necessary also in this adjustment case. A local minimizer of the Lagrange function

(3.98) is derived, which results in the least squares solution for the unknown parameters $\hat{X}$, the residuals $\hat{v}$ and the vector of adjusted observations $\hat{L}$, similarly to the discussed adjustment cases of the previous

subsections.

3.1.2.4 Error estimation within the GHM with constraints

The starting point is the solution for the vector of corrections within the GHM with constraints

$$\hat{x} = Q_{11} (A^T R^{-1} w_1) - Q_{12} w_2.$$

Introducing the same concept as in section 3.1.2.2, the vector of misclosures $w_1$ can be related linearly with the vector of observed quantities $l$ by

$$w_1 = B\, l.$$

Thus, substituting $w_1$ into (3.114) gives

$$\hat{x} = Q_{11} (A^T R^{-1} B\, l) - Q_{12} w_2. \tag{3.116}$$

Applying the propagation law of variances and covariances and after some investigation, the cofactor matrix for $\hat{x}$ can be computed by

$$Q_{\hat{x}\hat{x}} = -Q_{11}, \tag{3.117}$$

where the negative sign follows from the definition of $M$ in (3.107). Taking into account the number of constraint equations for computing the redundancy of the problem, the estimated variance of the unit weight is

$$s_0^2 = \frac{\hat{v}^T P \hat{v}}{r_d}, \quad \text{with redundancy: } r_d = r - m + n_c. \tag{3.118}$$

The VC matrix for the corrections is

$$\Sigma_{\hat{x}\hat{x}} = s_0^2\, Q_{\hat{x}\hat{x}}. \tag{3.119}$$

Furthermore, the VC and cofactor matrices of the estimated unknown parameters $\hat{X}$, the adjusted observations $\hat{L}$ and the residuals $\hat{v}$ can be computed as in section 3.1.2.2.

3.2 Total least squares

Modern and sophisticated algorithms have been presented by the mathematical community since the 1980s,

for the solution of nonlinear adjustments. These algorithms deal with a class of nonlinear least squares

problems, which can be expressed within an EIV model and solved by TLS. A solution coming from TLS

does not involve any kind of linearization of the functional model but presupposes the use of SVD, as it

was defined in (Golub and Van Loan 1980) or (Van Huffel and Vandewalle 1991, p. 33 ff.). Thus, a TLS

solution is obtained by computing the roots of a polynomial (i.e. by solving the characteristic equation of


the eigenvalues) and a direct solution can be possible depending on the polynomial’s degree. Such solutions

have been presented in the literature when postulating, in most cases, equally weighted and uncorrelated

measurements.

Various approaches and algorithms have been implemented for the solution of this class of nonlinear least

squares problems when different precision is associated with each measurement. The solutions from these

algorithms are iterative, they do not include a linearization of the functional model and have been published

under the name WTLS. For example, Schaffrin and Wieser (2008) presented a WTLS solution for linear

regression, which inspired Shen et al. (2011), Fang (2011), Amiri-Simkooei and Jazaeri (2012) and Mahboub

(2012) to present modern WTLS algorithms. Despite the name TLS, in all of the above cases the solution has been obtained iteratively and does not follow the definition that was established by Golub and Van Loan (1980), i.e. a direct solution using SVD. A clear overview of this type of algorithmic solution has been presented in (Snow 2012), covering also the special cases of singular cofactor matrices. In that work the term TLS has been used in a more general sense, as implied by the following statement: “the terms TLS and TLS solution as used in this dissertation will mean the least squares solution within the EIV model without linearization”.

Two different perspectives can be distinguished for the terms TLS and WTLS solution from the discussion

above. The first follows the definition of Golub and Van Loan (1980), and the second that of Snow (2012).

In this dissertation the term TLS will refer to the former and the term WTLS to the latter definition. Thus,

the main characteristics of a least squares solution of an adjustment problem within the EIV model will be

distinguished here by

the TLS approach:

- postulating equally weighted and uncorrelated observations,

- treatment of the nonlinear adjustment problem,

- direct solution derived by SVD,

- global optimization,

the WTLS approach:

- postulating individually weighted and correlated/uncorrelated observations,

- reduction of the derived normal equations,

- iterative solution,

- local optimization.

3.2.1 Nonlinear adjustments within the EIV model

In this section, the modelling of nonlinear least squares problems within the EIV model will be introduced. Therefore, a nonlinear functional model that implicitly relates the observations $l_i$, the residuals $v_i$ (with $i = 1, 2, \dots, n$) and the unknown parameters $x_j$ (with $j = 1, 2, \dots, m$) is under consideration. Similarly to equation (3.54), the discussed functional model can be written as the system of nonlinear condition equations

$$\begin{aligned}
\phi_1(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0, \\
\phi_2(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0, \\
&\;\;\vdots \\
\phi_r(l_1 + v_1, \dots, l_n + v_n, x_1, \dots, x_m) &= 0,
\end{aligned} \tag{3.120}$$

with $\phi_i$ denoting nonlinear differentiable functions of the unknown parameters and the residuals. For a certain class of such nonlinear adjustment problems⁵ it is possible to formulate this equation system in matrix notation by the functional model

$$L + v_L = (A + V_A)\, X, \quad \dim(A) = n \times m, \quad \operatorname{rank}(A) = m < n, \tag{3.121}$$

where $L$ and $v_L$ are the vectors of observations and their residuals⁶, respectively. Matrix $A$ contains the coefficients of the functional model with respect to the unknown parameters $x_j$, while the residuals $v_i$ of the measured quantities contained in $A$ are stored in the residual matrix $V_A$. $X$ is the vector containing the unknown parameters. Here, it is worth mentioning the difference, by definition, between the observation vector $L$ and the vector of reduced observations $l$, and between the vector of unknown parameters $X$ and the vector of corrections $x$, as it has already

been explained in section 3.1.1. The presented functional model in (3.121), accompanied by the stochastic

model of the measurements, leads to the nonlinear mathematical model known as “Errors In Variables”. A

definition of the EIV model can be found alternatively in (Golub and Van Loan 1980), (Bickel and Ritov

1987), (Van Huffel and Vandewalle 1989) or (Van Huffel and Vandewalle 1991, p. 5).

In contrast to the classical representation of the functional model of an adjustment problem (see for example

the GMM or the GHM), the latter formulation involves a coefficient matrix Athat includes measured

quantities that are under the influence of random errors. Therefore, the necessary residuals are added to the

measurements, symbolized in A, by means of the residual matrix VA. It is of course not the elements (or

the variables) of matrix Athat are subject to errors, but the measurements that are symbolized by these

elements of A. The following simple example illustrates this type of functional modelling:

Example 3.2.1. Assume that the coordinates of a set of points in 2D have been observed in both directions (i.e. in the $x$ and $y$ direction). The question is how to fit a straight line to the observed points. The simplest representation of a straight line in the plane is

$$y = a\, x + b. \tag{3.122}$$

⁵ These adjustment problems can be solved using a traditional geodetic approach, which involves a linearization of the functional model that results in the formulation of the GHM and an iterative procedure for obtaining a least squares solution.

⁶ For notational reasons, residuals and residual vectors are introduced here instead of errors and error vectors, as is usual in the TLS literature, see e.g. (Golub and Van Loan 1980, Van Huffel and Vandewalle 1989).


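Adding the necessary residuals to the measurements yields

$$\begin{aligned}
y_1 + v_{y_1} &= a\,(x_1 + v_{x_1}) + b, \\
y_2 + v_{y_2} &= a\,(x_2 + v_{x_2}) + b, \\
&\;\;\vdots \\
y_n + v_{y_n} &= a\,(x_n + v_{x_n}) + b.
\end{aligned} \tag{3.123}$$

$y$ and $x$ are the observed coordinates of the 2D points, with their corresponding residuals denoted by $v_y$ and $v_x$. $a$ and $b$ are the unknown line parameters and need to be estimated. This nonlinear equation system can be formulated equivalently by the EIV model (3.121):

$$L = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
v_L = \begin{bmatrix} v_{y_1} \\ v_{y_2} \\ \vdots \\ v_{y_n} \end{bmatrix}, \quad
X = \begin{bmatrix} a \\ b \end{bmatrix}, \quad
A = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}, \quad
V_A = \begin{bmatrix} v_{x_1} & 0 \\ v_{x_2} & 0 \\ \vdots & \vdots \\ v_{x_n} & 0 \end{bmatrix}. \tag{3.124}$$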

In this adjustment example it can easily be seen that the elements of the “design matrix” $A$ symbolize some of the measured quantities of the adjustment problem, which are subject to errors, and thus the corresponding residuals are listed in matrix $V_A$. Therefore, the term “Errors in Variables” is misleading: it is always the measured quantities that are subject to errors, not the variables of a matrix.
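The structure of Example 3.2.1 can be made concrete with a few numbers. The sketch below assumes NumPy and invented values for the line parameters and the point coordinates; it only shows how $L$, $A$, $v_L$ and $V_A$ are assembled for the straight line and that the column of ones in $A$ receives no residuals.

```python
import numpy as np

a, b = 0.5, 2.0                                    # hypothetical line parameters
x_true = np.array([0.0, 1.0, 2.0, 3.0])
y_true = a * x_true + b
rng = np.random.default_rng(6)
x_obs = x_true + rng.normal(scale=0.1, size=4)     # observed x coordinates
y_obs = y_true + rng.normal(scale=0.1, size=4)     # observed y coordinates

L = y_obs                                          # observation vector
A = np.column_stack([x_obs, np.ones(4)])           # coefficient matrix holding the measured x
v_L = y_true - y_obs                               # residuals of the y coordinates
V_A = np.column_stack([x_true - x_obs, np.zeros(4)])   # residuals of x; ones are error-free
X = np.array([a, b])
```

With residuals that bring the observed points back onto the line, the model equation $L + v_L = (A + V_A) X$ is fulfilled exactly.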

It is necessary to point out that the stochastic model within the EIV model involves the stochastic properties of the measurements both in vector $L$ and in the design matrix $A$. Thus, an appropriate weight matrix can be described by

$$P = \begin{bmatrix} P_L & P_{LA} \\ P_{AL} & P_A \end{bmatrix}, \tag{3.125}$$

with the cofactor matrix

$$Q_{LL} = \begin{bmatrix} Q_L & Q_{LA} \\ Q_{AL} & Q_A \end{bmatrix} \tag{3.126}$$

and the variance-covariance matrix

$$\Sigma_{LL} = \begin{bmatrix} \Sigma_L & \Sigma_{LA} \\ \Sigma_{AL} & \Sigma_A \end{bmatrix}. \tag{3.127}$$

For uncorrelated observations the off-diagonal terms in matrices $P$, $Q_{LL}$ and $\Sigma_{LL}$ become zero. In case of

normally distributed errors the most probable solution for the undetermined parameters of this adjustment

problem can be obtained by employing the method of least squares. Therefore, two individual solution

strategies are presented in the following sections regarding adjustment problems that can be expressed by

an EIV model. The first is direct and presumes the use of an orthogonal decomposition (TLS), while the

second is iterative but without involving any kind of linearization (WTLS).


3.2.1.1 Least squares parameter estimation using TLS

By the definition of TLS (Golub and Van Loan 1980), (Van Huffel and Vandewalle 1991, p. 33), a solution of an adjustment problem within the EIV model is based on the minimization of the objective function

$$\| [V_A, v_L] \|_F = \| [\hat{A}, \hat{L}] - [A, L] \|_F \to \min, \tag{3.128}$$

with $\| \cdot \|_F$ being the Frobenius norm of a matrix, defined in (Van Huffel and Vandewalle 1991, p. 21) or in (Felus and Burtch 2009) with

$$\| [V_A, v_L] \|_F = \sqrt{\operatorname{trace}\left([V_A, v_L]^T [V_A, v_L]\right)}. \tag{3.129}$$

The adjusted matrix $\hat{A}$ and vector $\hat{L}$ are

$$[\hat{A}, \hat{L}] = [A, L] + [V_A, v_L]. \tag{3.130}$$

A solution for vector $X$ has been presented in the TLS literature by decomposing the augmented matrix $[A, L]$ (i.e. the matrix containing the coefficient matrix $A$ and the observation vector $L$) with the help of SVD, resulting in

$$U\, \Sigma\, W^T = [A, L], \tag{3.131}$$

with the following matrices described in (Bronshtein et al. 2005, p. 285):

• Matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal and contains the left singular vectors ($u$) of matrix $[A, L]$:

$$U = [u_1, u_2, \cdots, u_n], \quad \text{with } U^T U = I_n \tag{3.132}$$

and $n$ is the number of observation equations (this is the number of rows of matrix $A$).

• Matrix $W \in \mathbb{R}^{(m+1) \times (m+1)}$ is orthogonal and contains the right singular vectors ($w$) of matrix $[A, L]$:

$$W = [w_1, w_2, \cdots, w_{m+1}], \quad \text{with } W^T W = I_{m+1} \tag{3.133}$$

and $m$ is the number of unknown parameters (this is the number of columns of matrix $A$).

• Matrix $\Sigma \in \mathbb{R}^{n \times (m+1)}$ has the form

$$\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix}, \tag{3.134}$$

with the diagonal matrix $\Sigma_1 \in \mathbb{R}^{(m+1) \times (m+1)}$ carrying the singular values ($\sigma$) of matrix $[A, L]$:

$$\Sigma_1 = \begin{bmatrix} \sigma_1 & & & 0 \\ & \sigma_2 & & \\ & & \ddots & \\ 0 & & & \sigma_{m+1} \end{bmatrix}, \tag{3.135}$$

with $\sigma_1 \ge \dots \ge \sigma_{m+1} \ge 0$.

According to the procedure of (Van Huffel and Vandewalle 1991, p. 35) or (Felus and Schaffrin 2005), the TLS solution can be derived by appropriately scaling the right singular vector ($w_{\min}$) of matrix $W$ that corresponds to the minimum singular value ($\sigma_{\min}$). This is the last column of $W$ and can be written as

$$w_{\min} = w_{m+1} = [w_{1,m+1}, \cdots, w_{m,m+1}, w_{m+1,m+1}]^T. \tag{3.136}$$

The TLS solution for the vector of unknowns is

$$\hat{X} = -\frac{1}{w_{m+1,m+1}}\, [w_{1,m+1}, \cdots, w_{m,m+1}]^T. \tag{3.137}$$

It must be mentioned that the last equation is usually presented with a negative sign, as in (Felus and

Schaffrin 2005), which is caused by the form of the functional model. However, the functional model

can always be expressed in such a way that the negative sign is not necessary anymore, see for instance

(Malissiovas et al. 2016).
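The scaling of the last right singular vector in (3.137) can be sketched in a few lines. The example assumes NumPy, whose `numpy.linalg.svd` returns the singular values in descending order and the right singular vectors as rows of the third output; the function name `tls_svd` and the test data are hypothetical.

```python
import numpy as np

def tls_svd(A, L):
    """TLS solution of L + vL = (A + VA) X from the SVD of the augmented matrix [A, L]."""
    m = A.shape[1]
    _, s, Wt = np.linalg.svd(np.column_stack([A, L]))
    w_min = Wt[-1]                       # right singular vector of the smallest singular value
    X_hat = -w_min[:m] / w_min[m]        # scaling according to eq. (3.137)
    return X_hat, s[-1]

# Hypothetical example: all columns of A and the vector L carry random errors.
rng = np.random.default_rng(7)
X_true = np.array([1.5, -0.7])
A_true = rng.normal(size=(20, 2))
A = A_true + rng.normal(scale=0.01, size=(20, 2))
L = A_true @ X_true + rng.normal(scale=0.01, size=20)

X_hat, sigma_min = tls_svd(A, L)
```

For error-free data the smallest singular value vanishes and the true parameters are recovered. Note that this sketch treats every column of $A$ as erroneous, which is the setting of equally weighted and uncorrelated observations assumed by the TLS definition.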

An equivalent formulation of the objective function

To derive the TLS solution, Schaffrin et al. (2012) and Snow (2012) minimized the sum of the weighted squared residuals

$$\Omega(v_L, v_A) = v_L^T P_L v_L + v_A^T P_A v_A \to \min, \quad \text{with } v_A := \operatorname{vec}(V_A), \tag{3.138}$$

for the case of non-singular cofactor matrices $Q_L$ and $Q_A$. “vec” denotes the operator that stacks the columns of the residual matrix $V_A$ into one vector. Postulating uncorrelated observed quantities of equal precision, the last equation can be formulated equivalently by

$$\Omega(v_L, v_A) = v_L^T v_L + v_A^T v_A \to \min, \tag{3.139}$$

which is equal to the objective function (3.128). Thus, it is already visible that least squares is the method

being used for the solution of the adjustment problem. Several contributions have already pointed out that

TLS can be regarded as an approach/solution strategy for a special class of nonlinear least squares problems

and not as a different method than least squares, for example (Neitzel and Petrovic 2008), (Reinking 2008),

(Neitzel 2010) or (Malissiovas et al. 2016).

In this dissertation, the terms TLS and TLS approach will refer to the least squares solution of an adjustment

problem within the EIV model, with equally weighted and uncorrelated observations. The solution is derived

by minimizing the objective function (3.139) through SVD of the augmented matrix [A,L].

3.2.1.2 Least squares parameter estimation using WTLS

The least squares solution of adjustment problems within the EIV model can be derived using WTLS,

especially for cases of individually weighted or correlated observations. A variety of WTLS algorithms exists

in the literature, like for example in (Schaffrin and Wieser 2008), (Fang 2011), (Mahboub 2012), (Snow

2012) or (Schaffrin and Snow 2014).

For obtaining a WTLS solution, the objective function (3.138) is combined with the condition equations in

(3.121) to build the Lagrangian

\[ K(v_L, v_A, k, X) = v_L^T P_L v_L + v_A^T P_A v_A + 2 v_L^T P_{LA} v_A - 2 k^T \left( L + v_L - (A + V_A) X \right), \tag{3.140} \]

with $k$ denoting the vector of Lagrange multipliers. The authors dealing with WTLS, e.g. Schaffrin and Wieser (2008), introduced the Kronecker-Zehfuss product (symbolized by $\otimes$) to express the Lagrange function equivalently as

\[ K(v_L, v_A, k, X) = v_L^T P_L v_L + v_A^T P_A v_A + 2 v_L^T P_{LA} v_A - 2 k^T \left[ L + v_L - A X - (X^T \otimes I_n) v_A \right]. \tag{3.141} \]

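The necessary conditions for a stationary point are obtained by computing the partial derivatives of $K$ with respect to all unknowns and setting them to zero, which according to (Schaffrin and Snow 2014) yields the nonlinear normal equation system

\[ \frac{\partial K}{\partial v_L^T} = 2 \left( P_L v_L + P_{LA} v_A - k \right) = 0, \tag{3.142} \]

\[ \frac{\partial K}{\partial v_A^T} = 2 \left( P_A v_A + P_{AL} v_L + (X \otimes I_n) k \right) = 0, \tag{3.143} \]

\[ \frac{\partial K}{\partial X^T} = 2 \left[ A^T k + (I_m \otimes k^T) v_A \right] = 0 \;\Rightarrow\; (A + V_A)^T k = 0, \tag{3.144} \]

\[ \frac{\partial K}{\partial k^T} = -2 \left[ L + v_L - A X - (X^T \otimes I_n) v_A \right] = 0. \tag{3.145} \]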

Two individual iterative approaches for solving the system of normal equations (3.142)-(3.145) are developed in the remainder of this section.

WTLS - Approach 1

A solution for the unknown parameters can be obtained by appropriately reducing the derived normal equations. According to (Snow 2012), if $Q_{LL}$ is regular, then equations (3.142) and (3.143) can be rewritten as

\[ v_L = \left[ Q_L - Q_{LA} (X \otimes I_n) \right] k, \tag{3.146} \]

\[ v_A = \left[ Q_{AL} - Q_A (X \otimes I_n) \right] k, \tag{3.147} \]

using a “bidirectional” substitution. Inserting the explicit expressions of the residual vectors into (3.145) yields

\[ \begin{aligned} & L + \left[ Q_L - Q_{LA} (X \otimes I_n) \right] k - A X - (X^T \otimes I_n) \left[ Q_{AL} - Q_A (X \otimes I_n) \right] k = 0 \\ \Rightarrow\; & \underbrace{\left[ Q_L - Q_{LA} (X \otimes I_n) - (X^T \otimes I_n) Q_{AL} + (X^T \otimes I_n) Q_A (X \otimes I_n) \right]}_{Q_1} k = (A X - L). \end{aligned} \tag{3.148} \]

Introducing approximate values $X^0$ for the vector of unknowns only on the left-hand side of the last equation, it is possible to express the vector of Lagrange multipliers by

\[ k = Q_1^{-1} (A X - L), \tag{3.149} \]

with the auxiliary matrix⁷

\[ Q_1 = Q_L - Q_{LA} (X^0 \otimes I_n) - ((X^0)^T \otimes I_n) Q_{AL} + ((X^0)^T \otimes I_n) Q_A (X^0 \otimes I_n). \tag{3.150} \]

Consequently, substituting $k$ in equations (3.146) and (3.147) results in the residual vectors

\[ v_L = \left[ Q_L - Q_{LA} (X \otimes I_n) \right] Q_1^{-1} (A X - L), \tag{3.151} \]

\[ v_A = \left[ Q_{AL} - Q_A (X \otimes I_n) \right] Q_1^{-1} (A X - L). \tag{3.152} \]

Substituting $v_A$ and $k$ in (3.144) yields

\[ \begin{aligned} & A^T k + (I_m \otimes k)^T v_A = 0 \\ \Rightarrow\; & A^T Q_1^{-1} (A X - L) = \underbrace{-(I_m \otimes k)^T \left[ Q_{AL} - Q_A (X \otimes I_n) \right] Q_1^{-1}}_{R_1} (A X - L) \\ \Rightarrow\; & A^T Q_1^{-1} (A X - L) = R_1 (A X - L), \end{aligned} \tag{3.153} \]

with the auxiliary matrix⁸

\[ R_1 = -(I_m \otimes k^0)^T \left[ Q_{AL} - Q_A (X^0 \otimes I_n) \right] Q_1^{-1}. \tag{3.154} \]

In this last equation the vector of Lagrange multipliers also needs to be approximated, with

\[ k^0 = Q_1^{-1} (A X^0 - L). \tag{3.155} \]

⁷ The auxiliary matrix $Q_1$ coincides with that presented in equation (2.11) of (Snow 2012, p. 23) and is identical to the matrix product $B Q_{ll} B^T$ from equation (4.16) of (Fang 2011, p. 22). In the TLS literature, matrix $Q_1$ is usually built without introducing approximate values for the vector of unknowns ($X^0$). However, this can mislead readers into thinking that they are dealing with a linear problem.

⁸ It must be pointed out that “Algorithm 1” of section 2.1 in (Snow 2012) contains a typo: matrix $R_1$ in “Algorithm 1” differs from the correct definition of $R_1$ in equation (2.13c) of that dissertation.

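Furthermore, appropriately rearranging equation (3.153) gives

\[ \left[ A^T Q_1^{-1} - R_1 \right] A X = \left[ A^T Q_1^{-1} - R_1 \right] L, \tag{3.156} \]

which, provided that the matrix product $\left[ A^T Q_1^{-1} - R_1 \right] A$ is non-singular, yields the solution for the vector of unknown parameters

\[ X = \left\{ \left[ A^T Q_1^{-1} - R_1 \right] A \right\}^{-1} \left[ A^T Q_1^{-1} - R_1 \right] L. \tag{3.157} \]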

Iterative solution for the adjustment results

A least squares solution for the unknown parameters can be obtained iteratively following the WTLS procedure. The auxiliary matrices $Q_1$ and $R_1$ are functions of the unknown parameters. Thus, an initial approximation $X^0$ for the vector of unknowns gives

\[ Q_{1,i} = Q_L - Q_{LA} (X^0_i \otimes I_n) - ((X^0_i)^T \otimes I_n) Q_{AL} + ((X^0_i)^T \otimes I_n) Q_A (X^0_i \otimes I_n), \tag{3.158} \]

\[ k_i = (Q_{1,i})^{-1} (A X^0_i - L), \tag{3.159} \]

and

\[ R_{1,i} = -(I_m \otimes k_i)^T \left[ Q_{AL} - Q_A (X^0_i \otimes I_n) \right] (Q_{1,i})^{-1}, \tag{3.160} \]

with the index $i$ denoting the iteration step. A solution for the unknown parameters is obtained by

\[ \hat{X}_i = \left\{ \left[ A^T (Q_{1,i})^{-1} - R_{1,i} \right] A \right\}^{-1} \left[ A^T (Q_{1,i})^{-1} - R_{1,i} \right] L \tag{3.161} \]

and is then introduced as the initial approximation for the unknown parameters in the next iteration step,

\[ X^0_{i+1} = \hat{X}_i. \]

The final solution is obtained once a suitable stopping criterion is fulfilled. Since no linearization has been applied in any step of the adjustment, this iterative procedure is terminated only after the “computational error” condition defined in subsection 3.1.1 has been fulfilled. To this end, let the vector of corrections be computed as the difference between the estimated vector of unknown parameters and its approximation in an iteration step,

\[ \Delta X_i = \hat{X}_i - X^0_i. \]

The necessary condition for stopping the iteration is then given by

\[ \max |\Delta X| \leq \varepsilon. \tag{3.162} \]

The element of the vector of corrections $\Delta X$ with the maximum absolute value must become smaller than or equal to a predefined threshold $\varepsilon$. A similar stopping criterion can be found in (Snow 2012). The developed WTLS procedure is similar to the one presented in (Snow 2012) as “Algorithm 1” and was originally developed and presented by Fang (2011) as “Algorithm 2”.
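The iteration (3.158)-(3.162) can be sketched in NumPy as follows. This is an illustrative reading of the equations above, not the implementation of the cited authors: the function name `wtls_approach1`, the ordinary least squares starting values, the default threshold and the sample data are assumptions.

```python
import numpy as np

def wtls_approach1(A, L, QL, QA, QLA=None, tol=1e-10, max_iter=200):
    """Sketch of WTLS approach 1; cofactor blocks QL (n x n), QA (nm x nm), QLA (n x nm)."""
    A = np.asarray(A, dtype=float)
    L = np.asarray(L, dtype=float).ravel()
    n, m = A.shape
    QLA = np.zeros((n, n * m)) if QLA is None else np.asarray(QLA, dtype=float)
    QAL = QLA.T
    # Assumption: ordinary least squares supplies the initial approximation X^0.
    X0 = np.linalg.lstsq(A, L, rcond=None)[0]
    for _ in range(max_iter):
        XkI = np.kron(X0.reshape(-1, 1), np.eye(n))               # (X^0 kron I_n)
        Q1 = QL - QLA @ XkI - XkI.T @ QAL + XkI.T @ QA @ XkI      # eq. (3.158)
        Q1_inv = np.linalg.inv(Q1)
        k = Q1_inv @ (A @ X0 - L)                                 # eq. (3.159)
        ImK = np.kron(np.eye(m), k.reshape(-1, 1))                # (I_m kron k)
        R1 = -(ImK.T @ (QAL - QA @ XkI) @ Q1_inv)                 # eq. (3.160)
        N = A.T @ Q1_inv - R1
        X_hat = np.linalg.solve(N @ A, N @ L)                     # eq. (3.161)
        if np.max(np.abs(X_hat - X0)) <= tol:                     # eq. (3.162)
            return X_hat
        X0 = X_hat                                                # next approximation
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical data: line through the origin, both coordinates observed
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
A, L = x.reshape(-1, 1), y
n, m = A.shape
x_wtls = wtls_approach1(A, L, QL=np.eye(n), QA=np.eye(n * m))
```

With identity cofactor matrices the result should agree with the SVD-based TLS estimate of subsection 3.2.1.1, which gives a simple plausibility check for the sketch.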

WTLS - Approach 2

Following (Snow 2012), an alternative WTLS approach exists for obtaining the least squares solution for the vector of unknown parameters. The first step is to reformulate equation (3.144) as

\[ \begin{aligned} & A^T k = -V_A^T k \\ \Rightarrow\; & A^T Q_1^{-1} (A X - L) = -V_A^T Q_1^{-1} (A X - L) \\ \Rightarrow\; & (A + V_A)^T Q_1^{-1} A X = (A + V_A)^T Q_1^{-1} L. \end{aligned} \tag{3.163} \]

Adding the term $(A + V_A)^T Q_1^{-1} V_A X$ to both sides of the last equation yields

\[ \left[ (A + V_A)^T Q_1^{-1} (A + V_A) \right] X = (A + V_A)^T Q_1^{-1} (L + V_A X). \tag{3.164} \]

The vector of unknown parameters reads

\[ X = \left[ (A + V_A)^T Q_1^{-1} (A + V_A) \right]^{-1} (A + V_A)^T Q_1^{-1} (L + V_A X). \tag{3.165} \]

Iterative solution for the adjustment results

Also in this case, a solution can be obtained iteratively. An initial approximation of the vector of unknowns $X^0$, as well as of the residual matrix $V_A^0$, is necessary and leads to

\[ Q_{1,i} = Q_L - Q_{LA} (X^0_i \otimes I_n) - ((X^0_i)^T \otimes I_n) Q_{AL} + ((X^0_i)^T \otimes I_n) Q_A (X^0_i \otimes I_n), \tag{3.166} \]

\[ \hat{X}_i = \left[ (A + V^0_{A,i})^T (Q_{1,i})^{-1} (A + V^0_{A,i}) \right]^{-1} (A + V^0_{A,i})^T (Q_{1,i})^{-1} (L + V^0_{A,i} X^0_i) \tag{3.167} \]

and

\[ \hat{v}_{A,i} = \left[ Q_{AL} - Q_A (\hat{X}_i \otimes I_n) \right] (Q_{1,i})^{-1} (A \hat{X}_i - L), \tag{3.168} \]

with $i$ denoting the iteration step. The estimates $\hat{X}$ and $\hat{v}_A$ can be used to update the initial values for the next iteration step,

\[ X^0_{i+1} = \hat{X}_i \quad \text{and} \quad V^0_{A,i+1} = \mathrm{invec}(\hat{v}_{A,i}), \]

with “invec” denoting the inverse operator of “vec”, i.e. rearranging a vector back into a matrix. The iterative procedure should continue until an appropriate “break-off” condition is met. The solution for the unknown parameters of equation (3.165) was first presented by Fang (2011) within “Algorithm 3” and is identical to the solution of (Snow 2012) presented in “Algorithm 2”.
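A minimal NumPy sketch of the iteration (3.166)-(3.168) is given below. It is an illustration under stated assumptions rather than the published algorithm: the function name `wtls_approach2`, the starting values (ordinary least squares for $X^0$, zeros for $V_A^0$), the threshold and the sample data are hypothetical. The break-off test additionally monitors the change of $V_A^0$, which is a choice made here because with these starting values the first estimate can coincide with the ordinary least squares solution.

```python
import numpy as np

def wtls_approach2(A, L, QL, QA, QLA=None, tol=1e-10, max_iter=200):
    """Sketch of WTLS approach 2; cofactor blocks QL (n x n), QA (nm x nm), QLA (n x nm)."""
    A = np.asarray(A, dtype=float)
    L = np.asarray(L, dtype=float).ravel()
    n, m = A.shape
    QLA = np.zeros((n, n * m)) if QLA is None else np.asarray(QLA, dtype=float)
    QAL = QLA.T
    X0 = np.linalg.lstsq(A, L, rcond=None)[0]   # assumed starting values for X^0
    V0 = np.zeros((n, m))                       # assumed starting values for V_A^0
    for _ in range(max_iter):
        XkI = np.kron(X0.reshape(-1, 1), np.eye(n))
        Q1 = QL - QLA @ XkI - XkI.T @ QAL + XkI.T @ QA @ XkI          # eq. (3.166)
        Q1_inv = np.linalg.inv(Q1)
        Ap = A + V0
        X_hat = np.linalg.solve(Ap.T @ Q1_inv @ Ap,
                                Ap.T @ Q1_inv @ (L + V0 @ X0))        # eq. (3.167)
        XhI = np.kron(X_hat.reshape(-1, 1), np.eye(n))
        v_A = (QAL - QA @ XhI) @ Q1_inv @ (A @ X_hat - L)             # eq. (3.168)
        V_new = v_A.reshape(n, m, order="F")                          # "invec"
        # Break-off test on both updated quantities (assumption, see lead-in).
        done = (np.max(np.abs(X_hat - X0)) <= tol
                and np.max(np.abs(V_new - V0)) <= tol)
        X0, V0 = X_hat, V_new
        if done:
            return X_hat, V_new
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical data: line through the origin, both coordinates observed
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
A, L = x.reshape(-1, 1), y
n, m = A.shape
x_wtls2, V_A_hat = wtls_approach2(A, L, QL=np.eye(n), QA=np.eye(n * m))
```

For identity cofactor matrices the estimate should again coincide with the SVD-based TLS solution.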

3.3 Discussion and open questions

Two existing strategies have been analysed in this chapter for the solution of nonlinear adjustment problems

with the method of least squares. The first has been traditionally used in geodesy and is based on the

Gauss-Newton approach. It involves a linearization of the nonlinear functional model, which allows the

representation of the mathematical model within a GMM or a GHM. This is a local optimization approach,

as the solution from the iterative procedure converges to a local minimum of the objective function. After

the assumption of “good” initial values for the unknown parameters, the estimated least squares solution

would correspond to the global minimum.

The second approach that has been discussed is TLS, as proposed by Golub and Van Loan (1980) for the

direct solution of a class of nonlinear least squares problems using SVD. In that work the solutions of two

individual adjustment problems were presented for fitting a straight line in 2D. The least squares solution was

estimated when only the y-coordinates of the points were regarded as measurements and the x-coordinates

as error free (called ordinary least squares), in contrast to the TLS solution where both coordinates of the

points were measurements. Petrovic (2003) has already pointed out that this comparison caused a confusion

regarding the method of least squares, as many investigations draw the conclusion that TLS is a different

method than least squares, or as stated by Groen (1996) that TLS is a generalization of the least squares

method. For geodesists it has long been clear that the most important steps in the adjustment of observations are to build a correct mathematical model and to minimize the objective function composed of correct residuals.

The relationship between nonlinear least squares problems and TLS was first placed under scrutiny by

Neitzel and Petrovic (2008) and Neitzel (2010) for two individual nonlinear adjustment problems, namely fitting a straight line to equally weighted 2D data and the 2D similarity transformation of coordinates. It

has been shown that the TLS solution is identical to the least squares solution within the GHM, concluding

that TLS can be regarded as a special case of least squares within the GHM. Additionally, Reinking (2008)

showed that the TLS solution can be obtained using the traditional geodetic approaches. From (Neitzel and

Petrovic 2008) and (Neitzel 2010) it can be seen that TLS is not a new method, but a new strategy or an

approach for the solution of a class of nonlinear least squares problems. Therefore, the investigations in the

next chapter try to answer the following arising questions:

- If it is possible to solve an adjustment problem with TLS and SVD, is it also possible to obtain the

same eigenvalue problem from a classical least squares approach and solve the problem directly?

- Are there additional nonlinear least squares problems (besides the generally well-known case of the

straight line fitting to equally weighted 2D data) which can be solved directly?

- Is it possible to classify those nonlinear least squares problems with a direct solution and solve them

by using a systematic approach?

Furthermore, the cases of weighted nonlinear least squares problems, as well as the solution by using WTLS

will be examined in a later chapter.

Part II - Methodological contributions

4 Direct solutions of nonlinear least squares problems

with equal weights

The current chapter is based on the study of Malissiovas et al. (2016). It is an extended version of this

article and includes the most important facts for the solution of adjustment problems with TLS. A clear

mathematical relationship is presented between TLS and direct least squares solutions. Additionally, a

systematic approach has been developed as a by-product of this investigation, for the direct solution of a

class of nonlinear adjustment problems.

4.1 Basic idea and general methodology

The centre of interest is a class of nonlinear least squares problems that can be transformed into solving a

polynomial equation (or the characteristic equation of an eigenvalue problem) and have a direct solution,

depending on the degree of the resulting polynomial. Solutions for these adjustment problems have been

presented in the TLS literature by using SVD. Therefore, the mathematical relationship is examined between

direct solutions of nonlinear least squares and solutions coming from TLS for the following four adjustment

cases:

1. Fitting of a straight line in 2D;

2. Fitting of a straight line in 3D;

3. Fitting of a plane in 3D;

4. 2D similarity transformation of coordinates.

In all four cases under investigation the coordinates in all directions are regarded as measurements. In TLS

literature these problems are often distinguished as EIV model. Moreover, a regular adjustment model is

always postulated here with the observed quantities being equally weighted and uncorrelated.

The concept of solving nonlinear least squares problems applied here is based directly on (Jovičić et al. 1982), where the adjustment problem of fitting a straight line to a set of points in 3D space was examined. In that work, the least squares estimate has been obtained by solving an eigenvalue problem, which is one of the key elements of TLS as well. Following the solution strategy from (Jovičić et al. 1982), a systematic approach for solving the four investigated adjustment problems has been established by Malissiovas et al. (2016). The proposed mathematical approach involves a sophisticated parametrization of the problem which can always be solved by building a Lagrange function that results in a quadratic or cubic algebraic equation.

In the following sections the least squares solution of the proposed approach from (Malissiovas et al. 2016)

is derived and compared with the TLS solution for the four problems under investigation. It is shown that

a clearly defined mathematical model of the adjustment problem, leading to an objective function based on

the principle of least squares, is the Occam’s razor¹ for TLS. A flowchart presenting both ways of solving

directly the discussed nonlinear least squares problems is depicted in Figure 4.1.

[Figure 4.1: Flowchart for two possible direct solutions of a class of nonlinear least squares problems. Starting from special cases of nonlinear least squares problems, the least squares approach proceeds via a sophisticated parametrization, the Lagrange function and the characteristic polynomial, while the total least squares approach proceeds via the augmented matrix and its singular value decomposition; both arrive at a direct solution for the unknowns.]

4.2 Fitting of a straight line in 2D

One of the first attempts to solve the nonlinear problem of least squares for fitting a straight line to a

set of points in plane (i.e. in the 2D space) non-iteratively was done by Adcock (1878), who provided an

elegant way of finding the direct solution to the problem. Pearson (1901) investigated the same problem by

minimizing the sum of the squared orthogonal distances of every point to the requested line and he extended

his study to fitting a plane to a set of points in the 3D space as well. On the other hand, the work of Golub

and Van Loan (1980) provided an analysis of the TLS solution followed by the contributions of Groen (1996),

Van Huffel (2004), Markovsky and Van Huffel (2007) and Schaffrin (2007). These authors consistently used the straight line fit as the most appropriate example for illustrating the idea of TLS.

¹ Occam’s razor can be defined as “the principle (attributed to William of Occam) that in explaining a thing no more assumptions should be made than are necessary. The principle is often invoked to defend reductionism or nominalism.”, see for example the Oxford dictionary.

At the beginning, a set of 2D data is observed, e.g. a set of points with coordinates in the $x$ and $y$ directions. The question is how to fit a straight line to the measured points. The general form of a straight line in 2D² is (Bronshtein et al. 2005, p. 194)

\[ a x + b y + c = 0, \tag{4.1} \]

with the constant line parameters $a$, $b$, $c$. The first two parameters denote the components of a vector normal to the requested straight line, which intercepts the $x$-axis at $-\frac{c}{a}$ and the $y$-axis at $-\frac{c}{b}$, as depicted in Figure 4.2.

Figure 4.2: Representation of a straight line in 2D using equation (4.1).

The Hessian normal form of the straight line can be derived by multiplying equation (4.1) by the normalization parameter³

\[ f = \pm \frac{1}{\sqrt{a^2 + b^2}}. \tag{4.2} \]

Introducing residuals $v_x$ for the coordinates in the $x$ direction and $v_y$ for the coordinates in the $y$ direction results in the nonlinear system of condition equations

\[ a (x_i + v_{x_i}) + b (y_i + v_{y_i}) + c = 0, \tag{4.3} \]

with $i = 1, \dots, n$, where $n$ denotes the number of observed points. Since the system of equations (4.3) is underdetermined, the least squares criterion can be used for an “optimal” solution by minimizing the sum of the squared residuals

\[ \sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 \right) \to \min. \tag{4.4} \]

² The same problem has been investigated in (Malissiovas et al., 2016), where the straight line in 2D has been represented in coordinate form.

³ The sign of the scaling factor $f$ is opposite to the sign of parameter $c$, as explained in (Bronshtein et al. 2005, p. 195).

4.2.1 Least squares adjustment with a direct solution

In this section, a direct least squares solution is developed for fitting a straight line in 2D when both

coordinates are measurements and subject to errors. The unknown line parameters can be estimated directly

by constructing and minimizing an appropriate Lagrange function and by solving a system of homogeneous

normal equations. The goal is to show that the proposed approach leads, according to the chosen technique,

to the solution of such algebraic equations that are equivalent to TLS. The solution for fitting a straight line

in 2D by TLS is presented and analysed in the following subsection.

4.2.1.1 Definition of the problem

For solving an adjustment problem, it is important to clarify which quantities are observations and hence

subject to random errors. This is necessary in order to define the objective function of the problem in an

appropriate way. In this investigation both coordinates (in the $x$ and $y$ directions) are subject to measurement errors. Furthermore, let all measurements be uncorrelated and of the same precision. Therefore, the aim is to find the shortest distance of each “measured” point to the requested straight line. As already noticed by Adcock (1878), the same precision of all coordinate measurements corresponds to the normal distances

\[ D_i^2 = v_{x_i}^2 + v_{y_i}^2, \tag{4.5} \]

as the measure of deviations, with $i = 1, \dots, n$ ($n$ is the number of observed points). This problem is depicted in Figure 4.3. Moreover, the normal distance of every point to the requested line can be expressed by (Bronshtein et al. 2005, p. 195)

\[ D_i = \frac{a x_i + b y_i + c}{\sqrt{a^2 + b^2}}. \tag{4.6} \]

There are infinitely many choices for a condition that connects the three unknown parameters $a$, $b$ and $c$ of the general equation of the straight line. It is possible to restrict the problem to the usual $a = 1$ or $b = 1$, but in these cases some lines in the plane are excluded⁴. Of all remaining restrictions, the most appropriate for this study is

\[ a^2 + b^2 = 1, \tag{4.7} \]

as it allows all lines in the plane to be calculated. Geometrically, this restriction can be seen as a normalization of the orthogonal distances from every point to the requested line (i.e. the denominator of the orthogonal distance of equation (4.6) becomes 1), which results in

\[ D_i = a x_i + b y_i + c. \tag{4.8} \]

The developed expressions for the orthogonal distances can serve as observation equations and be utilized

as an alternative to the nonlinear condition equations (4.3) for estimating the unknown line parameters.

An important remark is that the point coordinates $x_i$ and $y_i$ in this transformed functional model can be treated as fixed parameters. The orthogonal distances $D_i$ serve as random deviations, so the observed quantities are zero pseudo-observations denoting the Euclidean distances of the points to the requested line.

⁴ Choosing $a = 1$, there is no solution for lines parallel to the $x$ direction, and for $b = 1$ there is no solution for lines parallel to the $y$ direction.

Figure 4.3: Example of fitting a straight line to points in 2D with both $x$ and $y$ coordinates subject to measurement errors.

This leads to the linear observation equations⁵ for the distances

\[ 0_i + D_i = a x_i + b y_i + c. \tag{4.9} \]

⁵ Although the observation equations (4.9) are linear, the least squares problem is nonlinear, due to the nonlinear constraint (4.7).

Equivalently to the objective function (4.4), the least squares criterion can be applied to obtain the minimum normal distances from a set of points to the fitted line by minimizing the objective function

\[ \Omega(a, b, c) = \sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 \right) = \sum_{i=1}^{n} (a x_i + b y_i + c)^2. \tag{4.10} \]

We seek a least squares solution for the unknown line parameters $a$, $b$ and $c$ that minimizes (4.10), subject to the restriction (4.7). Consequently, the Lagrangian

\[ K(a, b, c, k) = \Omega(a, b, c) - k (a^2 + b^2 - 1), \tag{4.11} \]

can be introduced, with $k$ denoting the Lagrange multiplier. Differentiating the function $K$ with respect to all unknown parameters and setting the results to zero yields the system of normal equations

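\[ \frac{\partial K}{\partial a} = 2 \left[ a \left( \sum_{i=1}^{n} x_i^2 - k \right) + b \sum_{i=1}^{n} y_i x_i + c \sum_{i=1}^{n} x_i \right] = 0, \tag{4.12} \]

\[ \frac{\partial K}{\partial b} = 2 \left[ a \sum_{i=1}^{n} y_i x_i + b \left( \sum_{i=1}^{n} y_i^2 - k \right) + c \sum_{i=1}^{n} y_i \right] = 0, \tag{4.13} \]

\[ \frac{\partial K}{\partial c} = 2 \left[ a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i + c\, n \right] = 0 \tag{4.14} \]

and

\[ \frac{\partial K}{\partial k} = -\left( a^2 + b^2 - 1 \right) = 0. \tag{4.15} \]

Rearranging equation (4.14) yields a solution for parameter

\[ c = -a \left( \frac{1}{n} \sum_{i=1}^{n} x_i \right) - b \left( \frac{1}{n} \sum_{i=1}^{n} y_i \right), \tag{4.16} \]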

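in terms of $a$ and $b$. Introducing $c$ into the normal equations (4.12) and (4.13) results in the reduced system of normal equations

\[ a \left[ \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 - k \right] + b \left[ \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right) \right] = 0, \tag{4.17} \]

\[ a \left[ \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right) \right] + b \left[ \sum_{i=1}^{n} y_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} y_i \right)^2 - k \right] = 0. \tag{4.18} \]

If the Lagrange multiplier $k$ were known, then equations (4.17) and (4.18) would form a homogeneous system of linear equations in $a$ and $b$. Thus, for a nontrivial solution the determinant of the equation system must be equal to zero,

\[ \begin{vmatrix} \sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2 - k & \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right) \\ \sum_{i=1}^{n} x_i y_i - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right) & \sum_{i=1}^{n} y_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} y_i \right)^2 - k \end{vmatrix} = 0, \tag{4.19} \]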

which leads to a quadratic characteristic equation with two real and positive solutions for the unknown parameter $k$. The minimum solution, denoted by $k_{\min}$, corresponds to the minimum of the Lagrange function (4.11). The solution for the unknown line parameters $a$ and $b$ can be computed by substituting the Lagrangian factor $\hat{k}_{\min}$ into equations (4.17)-(4.18) subject to the chosen restriction (4.7). An equivalent solution can be obtained by transforming the equation system (4.17)-(4.18) into an eigenvalue problem.

4.2.1.2 Simplification of the problem by substituting one unknown parameter

A simplification of this adjustment problem can be easily achieved by eliminating the unknown parameter $c$ from the functional model (4.3), which leads to more elegant expressions for the orthogonal distances than equation (4.8). However, such a simplification makes sense only if the requested line passes through the centre of mass of the measured points, located at

\[ y_c = \frac{1}{n} \sum_{i=1}^{n} y_i \quad \text{and} \quad x_c = \frac{1}{n} \sum_{i=1}^{n} x_i. \tag{4.20} \]

This has already been shown by the expression for parameter $c$ in equation (4.16). A similar proof has also been presented by Jovičić et al. (1982) for the 3D line case and by Adcock (1878) and Malissiovas et al. (2016) for the 2D line. Therefore, introducing parameter $c$ from equation (4.16) into (4.1) yields

\[ a (x - x_c) + b (y - y_c) = 0. \tag{4.21} \]

The last equation can be further simplified by reducing the coordinates to a coordinate system with its origin located at the centre of mass of the given points. Geometrically, this is equivalent to shifting the coordinate system to a point that coincides numerically with the centre of mass of the measured set of points, as depicted in Figure 4.4.

Thus, denoting the reduced coordinates of a point by

\[ y' = y - y_c \quad \text{and} \quad x' = x - x_c, \tag{4.22} \]

leads to a system of simplified condition equations

\[ a (x'_i + v_{x_i}) + b (y'_i + v_{y_i}) = 0 \tag{4.23} \]

and to the simplified expression for the normal distances

\[ D_i = a x'_i + b y'_i. \tag{4.24} \]

Figure 4.4: Example of fitting a straight line to points in 2D with coordinates reduced to the centre of mass of the measured points.

Consequently, the objective function (4.10) can be rewritten as

\[ \Omega(a, b) = \sum_{i=1}^{n} (a x'_i + b y'_i)^2 = a^2 \sum_{i=1}^{n} {x'_i}^2 + b^2 \sum_{i=1}^{n} {y'_i}^2 + 2ab \sum_{i=1}^{n} y'_i x'_i. \tag{4.25} \]

We seek a least squares solution for the unknown line parameters $a$ and $b$ that minimizes equation (4.25) subject to the restriction (4.7). The Lagrangian

\[ K(a, b, k) = \Omega(a, b) - k (a^2 + b^2 - 1), \tag{4.26} \]

is introduced, where $k$ is the Lagrange multiplier. Differentiating $K$ with respect to all unknowns and setting the partial derivatives to zero yields the normal equations

\[ \frac{\partial K}{\partial a} = 2a \left( \sum_{i=1}^{n} {x'_i}^2 - k \right) + 2b \left( \sum_{i=1}^{n} y'_i x'_i \right) = 0, \tag{4.27} \]

\[ \frac{\partial K}{\partial b} = 2a \left( \sum_{i=1}^{n} y'_i x'_i \right) + 2b \left( \sum_{i=1}^{n} {y'_i}^2 - k \right) = 0 \tag{4.28} \]

and

\[ \frac{\partial K}{\partial k} = -\left( a^2 + b^2 - 1 \right) = 0. \tag{4.29} \]

If the Lagrange multiplier were known, then equations (4.27)-(4.28) would represent a homogeneous system of equations which is linear in the unknown line parameters $a$ and $b$. A nontrivial solution can be obtained by setting the determinant of the equation system equal to zero:

\[ \begin{vmatrix} \sum_{i=1}^{n} {x'_i}^2 - k & \sum_{i=1}^{n} y'_i x'_i \\ \sum_{i=1}^{n} y'_i x'_i & \sum_{i=1}^{n} {y'_i}^2 - k \end{vmatrix} = 0, \tag{4.30} \]

which leads to the quadratic characteristic equation

\[ \left( \sum_{i=1}^{n} {x'_i}^2 - k \right) \left( \sum_{i=1}^{n} {y'_i}^2 - k \right) - \left( \sum_{i=1}^{n} y'_i x'_i \right)^2 = 0, \tag{4.31} \]

with one unknown parameter $k$ and two real and positive solutions $k_{\min}$ and $k_{\max}$. It can be shown that the smaller of the two solutions for $k$, denoted by $k_{\min}$, corresponds to the minimum of the Lagrange function (4.26). There are two possibilities to determine a solution for the unknown parameters $a$ and $b$: either by substituting the Lagrangian factor $k_{\min}$ into equations (4.27)-(4.28) or by solving an eigenvalue problem.

It can be easily seen that both equations (4.19) and (4.30) result in quadratic characteristic equations that produce identical results for the unknown Lagrange multipliers $k$ and the requested line parameters.
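The direct solution of subsection 4.2.1.2 can be sketched in a few lines of NumPy. The function name `fit_line_direct`, the way the normal vector is picked from the two rows of the homogeneous system, and the sample points are assumptions made for this illustration.

```python
import numpy as np

def fit_line_direct(x, y):
    """Direct least squares fit of a*x + b*y + c = 0 with a^2 + b^2 = 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x.mean(), y.mean()                  # centre of mass, eq. (4.20)
    xr, yr = x - xc, y - yc                      # reduced coordinates, eq. (4.22)
    sxx, syy, sxy = np.sum(xr**2), np.sum(yr**2), np.sum(xr * yr)
    # Characteristic equation (4.31): k^2 - (sxx + syy) k + (sxx*syy - sxy^2) = 0
    p, q = sxx + syy, sxx * syy - sxy**2
    k_min = 0.5 * (p - np.sqrt(max(p * p - 4.0 * q, 0.0)))
    # Nontrivial solutions of (4.27) and (4.28); both describe the same line,
    # the candidate with the larger norm is used to avoid a zero vector.
    cand1 = np.array([sxy, k_min - sxx])         # satisfies row (4.27)
    cand2 = np.array([k_min - syy, sxy])         # satisfies row (4.28)
    ab = cand1 if np.linalg.norm(cand1) >= np.linalg.norm(cand2) else cand2
    norm = np.linalg.norm(ab)
    if norm == 0.0:
        raise ValueError("line direction is not unique for these points")
    a, b = ab / norm                             # enforce restriction (4.7)
    c = -a * xc - b * yc                         # eq. (4.16)
    return a, b, c, k_min

# Hypothetical points lying exactly on y = 2x + 1
x_exact = np.array([0.0, 1.0, 2.0, 3.0])
y_exact = 2.0 * x_exact + 1.0
a0, b0, c0, k0 = fit_line_direct(x_exact, y_exact)

# Hypothetical noisy points
x_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_obs = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
a1, b1, c1, k1 = fit_line_direct(x_obs, y_obs)
omega = np.sum((a1 * x_obs + b1 * y_obs + c1) ** 2)   # objective function (4.10)
```

Multiplying (4.27) by $a$, (4.28) by $b$ and adding them shows that the minimum of the objective function equals $k_{\min}$, so `omega` and `k1` should agree up to rounding.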

4.2.2 TLS solution with SVD

An alternative solution for finding the line that fits best to a set of points in 2D can be provided by TLS

(Golub and Van Loan 1980, Groen 1996). According to (Golub and Van Loan, 1980) this solution can be represented geometrically by minimizing the orthogonal distances, as depicted in Figure 4.3. It is noteworthy that in that contribution the least squares problem of fitting a straight line in 2D was considered only for the case in which the y-coordinates are observations, in contrast to the definition and solution of the least squares problem presented in the previous section. Thus, the aim is to provide an insight into the TLS approach and to show that the TLS solution is equivalent to the one from the developed direct least squares approach of section 4.2.1.

Equation (4.1) for the straight line can be rewritten⁶ as

\[ y = \beta x + \gamma, \tag{4.32} \]

with

\[ \beta = -\frac{a}{b}, \quad \gamma = -\frac{c}{b}. \tag{4.33} \]

Such a formulation for the straight line implies that the restriction $b = 1$ is taken into account. Therefore, it is not possible to describe all straight lines in the plane; the excluded lines are, however, limiting cases (in this problem all lines that are parallel to the $y$ direction). Rearranging the functional model of equation (4.23), which already contains the coordinates of the measured points reduced to the centre of mass, yields the system of nonlinear equations

\[ (y'_i + v_{y_i}) = \beta (x'_i + v_{x_i}), \tag{4.34} \]

leading to the EIV model of equation (3.121) with the respective quantities

\[ L = \begin{bmatrix} y'_1 \\ y'_2 \\ \vdots \\ y'_n \end{bmatrix}, \quad v_L = \begin{bmatrix} v_{y_1} \\ v_{y_2} \\ \vdots \\ v_{y_n} \end{bmatrix}, \quad X = \begin{bmatrix} \beta \end{bmatrix}, \quad A = \begin{bmatrix} x'_1 \\ x'_2 \\ \vdots \\ x'_n \end{bmatrix}, \quad V_A = \begin{bmatrix} v_{x_1} \\ v_{x_2} \\ \vdots \\ v_{x_n} \end{bmatrix}. \tag{4.35} \]

Matrix $A$ contains the coefficients of equation (4.34) with respect to the unknown parameter $\beta$, except for the residuals, which are introduced into matrix $V_A$.

⁶ Greek letters are chosen to describe the parameters of the straight line in the TLS approach just for readability reasons.

4.2.2.1 TLS solution based on the minimum eigenvalue principle

The TLS solution has been presented, amongst others, by Felus and Schaffrin (2005), as already explained in subsection 3.2.1.1. The first step is to construct the augmented matrix

\[ [A, L] = \begin{bmatrix} x'_1 & y'_1 \\ x'_2 & y'_2 \\ \vdots & \vdots \\ x'_n & y'_n \end{bmatrix} \tag{4.36} \]

and decompose it with the help of SVD, which yields

\[ U \Sigma W^T = [A, L], \tag{4.37} \]

where the matrices $U$ and $W$ contain the left and right singular vectors of the augmented matrix, respectively, and matrix $\Sigma$ is diagonal, carrying the singular values. The TLS solution is contained in the right singular vector of matrix $W$ that corresponds to the minimum singular value:

\[ w_{\min} = w_{m+1} = [w_{1,m+1}, \cdots, w_{m,m+1}, w_{m+1,m+1}]^T, \tag{4.38} \]

with the vector of unknowns computed by

\[ X = -\frac{1}{w_{m+1,m+1}} [w_{1,m+1}, \cdots, w_{m,m+1}]^T. \tag{4.39} \]

4.2.2.2 Solution by the eigenvalue/eigenvector decomposition

To gain a deeper understanding of the operation of SVD and of the derivation of the adjusted unknowns in equation (4.39), it is important to explain SVD as the solution of the eigenproblem of the symmetric non-negative definite matrices $[A, L]^T [A, L]$ and $[A, L] [A, L]^T$. According to (Lawson and Hanson 1974, p. 18), matrix $W$ containing the right singular vectors of $[A, L]$ can also be obtained from the eigenvalue decomposition (EVD) of the square matrix $[A, L]^T [A, L]$:

\[ W \Lambda W^T = [A, L]^T [A, L], \tag{4.40} \]

where $\Lambda$ is a diagonal matrix carrying the eigenvalues of $[A, L]^T [A, L]$. A relationship between eigenvalues and singular values can be found in (Golub and Van Loan 1989, p. 427), expressed as

\[ \lambda_i = \sigma_i^2, \tag{4.41} \]

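with $\lambda$ and $\sigma$ being the eigenvalues and singular values, respectively. For an explicit solution of this eigenproblem, matrix

\[ [A, L]^T [A, L] = \begin{bmatrix} x'_1 & x'_2 & \cdots & x'_n \\ y'_1 & y'_2 & \cdots & y'_n \end{bmatrix} \begin{bmatrix} x'_1 & y'_1 \\ x'_2 & y'_2 \\ \vdots & \vdots \\ x'_n & y'_n \end{bmatrix} = G, \tag{4.42} \]

can be introduced, which can be rewritten in a more compact form as

\[ G = \begin{bmatrix} \sum_{i=1}^{n} {x'_i}^2 & \sum_{i=1}^{n} y'_i x'_i \\ \sum_{i=1}^{n} y'_i x'_i & \sum_{i=1}^{n} {y'_i}^2 \end{bmatrix}. \tag{4.43} \]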

The eigenvalues and eigenvectors of matrix Gcan be computed from the eigenvalue problem

Gy =λy⇒(G−λI)y=0,(4.44)

as explained in (Bronshtein et al. 2005, p. 278). Iis an identity matrix and yan eigenvector of G.

The eigenvalues of Gcan be determined by searching for non-trivial solutions y6= 0, i.e. by solving the

characteristic equation of the eigenvalues



i=1

2−λ

i=1

ix0

i=1

ix0

i=1

2−λ



= 0,(4.45)

or equivalently n

i=1

2−λ! n

i=1

2−λ!− n

i=1

ix0

i!2

= 0.(4.46)

This quadratic equation has two solutions for the unknown eigenvalues, λmin and λmax. By rearranging the

eigenvalues and eigenvectors appropriately, the TLS solution for the line parameter βcan be found from

equation (4.39). Thus, by normalizing the eigenvector that corresponds to the smallest eigenvalue yields

β=

i=1

ix0

i=1

2−λmin

.(4.47)

As expected, the TLS solution is identical with the developed direct least squares solution. This can be seen by simply comparing the characteristic equation of the eigenvalues (4.46) with the quadratic equation (4.31) from the direct least squares solution, to which it corresponds. The conclusion is that the presented direct least squares solution for the nonlinear straight line fit in 2D already provides the exact result for TLS.
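The equivalence stated above can be checked numerically. The following is a minimal sketch, not the author's implementation: it assumes the reduced model $y' = \beta x'$ with $\mathbf{A}$ holding the $x_i'$ and $\mathbf{L}$ holding the $y_i'$ (as suggested by the ordering in (4.42)), and the function name `fit_line_2d` and the sample data are purely illustrative. The smaller root of the quadratic (4.46) is written in closed form and compared with the SVD of the augmented matrix.

```python
import numpy as np

def fit_line_2d(x, y):
    """Return (beta_direct, beta_tls, lam_min, sigma_min**2) for y' = beta * x'."""
    xr, yr = x - x.mean(), y - y.mean()       # reduce to the centre of mass
    sxx, syy, sxy = xr @ xr, yr @ yr, xr @ yr
    # Smaller root of (sxx - lam)(syy - lam) - sxy**2 = 0, cf. (4.46)
    lam_min = 0.5 * (sxx + syy) - np.hypot(0.5 * (sxx - syy), sxy)
    beta_direct = sxy / (sxx - lam_min)       # cf. (4.47)
    # TLS route: right singular vector of [A, L] for the smallest singular value
    _, s, vt = np.linalg.svd(np.column_stack([xr, yr]), full_matrices=False)
    beta_tls = -vt[-1, 0] / vt[-1, 1]
    return beta_direct, beta_tls, lam_min, s[-1] ** 2

# Illustrative (made-up) observations
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
beta_direct, beta_tls, lam_min, sigma_min_sq = fit_line_2d(x, y)
print(beta_direct, beta_tls, lam_min, sigma_min_sq)
```

Both routes should agree up to rounding, and $\lambda_{\min}$ should equal the squared smallest singular value, in line with (4.41). The sketch does not handle the degenerate case of a vertical line, where the denominator of (4.47) vanishes.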

4.3 Fitting of a straight line in 3D

The problem of fitting a straight line to points in 3D space has been examined e.g. by Kampmann and Renner (2004), Kupferer (2004) or Späth (2004). The last of these authors developed an iterative algorithm for minimizing the sum of the squared orthogonal distances of the measured points to the fitted line and thus obtaining a least squares estimate for the unknown line parameters. A similar iterative solution can be found in the investigations of Snow and Schaffrin (2016), who have solved the problem using several models and always obtained identical results. Non-iterative adjustment solutions for the straight line fit in space can be found in the studies of (Jovičić et al. 1982) or (Drixler 1993, p. 46). Both can be transformed into an eigenvalue problem, which gives the motivation for investigating the relationship with the TLS solution.

A representation of a straight line in 3D is given in (Bronshtein et al. 2005, p. 217), expressed as

\[
\frac{y - y_0}{a} = \frac{x - x_0}{b} = \frac{z - z_0}{c}, \tag{4.48}
\]

for a line that passes through a point with coordinates $x_0$, $y_0$ and $z_0$ and is parallel to a direction vector with components $a$, $b$ and $c$. The target is to minimize the errors in all $x$, $y$ and $z$ coordinates, which implies the nonlinearity of the system of condition equations

\[
\begin{aligned}
a(x_i + v_{x_i} - x_0) - b(y_i + v_{y_i} - y_0) &= 0, \\
b(z_i + v_{z_i} - z_0) - c(x_i + v_{x_i} - x_0) &= 0, \\
c(y_i + v_{y_i} - y_0) - a(z_i + v_{z_i} - z_0) &= 0,
\end{aligned} \tag{4.49}
\]

with $i = 1, \ldots, n$, where $n$ is the number of observed points in 3D space. The best line passing through the 3D point cloud can be obtained by minimizing the sum of squared residuals from all coordinates

\[
\sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2 \right) \rightarrow \min. \tag{4.50}
\]

In order to solve the problem stated above, two additional constraints (or restrictions between the unknown parameters) have to be taken into account, as has already been discussed by Snow and Schaffrin (2016). However, the selection of a proper restriction is avoided at this point. An appropriate parametrization of the problem is attempted that involves a substitution of some unknown parameters with known ones, following the same procedure presented in (Malissiovas et al. 2016).

4.3.1 Direct least squares solution for fitting a straight line in 3D

Analogously to the investigated case of the previous section (fitting of a straight line in 2D), the same precision of all coordinate measurements would correspond to the normal distances

\[
D_i^2 = v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2, \tag{4.51}
\]

as measures of deviations between the observed points and the requested line. As explained in (Bronshtein et al. 2005, p. 218), the squared normal distance between a point and a line in space is

\[
D^2 = \frac{[a(x - x_0) - b(y - y_0)]^2 + [b(z - z_0) - c(x - x_0)]^2 + [c(y - y_0) - a(z - z_0)]^2}{a^2 + b^2 + c^2}. \tag{4.52}
\]

Furthermore, it is possible to reduce the number of the unknowns of the model by replacing the parameters $x_0$, $y_0$ and $z_0$ with the coordinates of the centre of mass⁷

\[
y_c = \frac{1}{n} \sum_{i=1}^{n} y_i, \quad x_c = \frac{1}{n} \sum_{i=1}^{n} x_i, \quad z_c = \frac{1}{n} \sum_{i=1}^{n} z_i, \tag{4.53}
\]

of the $n$ 3D points. Therefore, equation (4.48) can be rewritten as

\[
\frac{y - y_c}{a} = \frac{x - x_c}{b} = \frac{z - z_c}{c}. \tag{4.54}
\]

Solution with coordinates reduced to the centre of mass

A reduction of all coordinates to the centre of mass leads to the simplified functional model

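\[
\frac{y'}{a} = \frac{x'}{b} = \frac{z'}{c}, \tag{4.55}
\]

and the condition equations

\[
\begin{aligned}
a(x_i' + v_{x_i}) - b(y_i' + v_{y_i}) &= 0, \\
b(z_i' + v_{z_i}) - c(x_i' + v_{x_i}) &= 0, \\
c(y_i' + v_{y_i}) - a(z_i' + v_{z_i}) &= 0,
\end{aligned} \tag{4.56}
\]

with $x'$, $y'$ and $z'$ being coordinates reduced to the centre of mass of the 3D point cloud. The squared normal distances of the reduced points can be formulated as

\[
D_i^2 = \frac{(a x_i' - b y_i')^2 + (b z_i' - c x_i')^2 + (c y_i' - a z_i')^2}{a^2 + b^2 + c^2}. \tag{4.57}
\]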

In this case the most appropriate restriction between the unknown parameters can be selected as

\[
a^2 + b^2 + c^2 = 1, \tag{4.58}
\]

which makes it possible to derive a simplified expression for the squared orthogonal distances

\[
D_i^2 = (a x_i' - b y_i')^2 + (b z_i' - c x_i')^2 + (c y_i' - a z_i')^2. \tag{4.59}
\]

⁷ The proof that this parameter replacement is allowed can be found in (Jovičić et al. 1982).

Therefore, the best line can be estimated by minimizing the objective function

\[
\Omega(a, b, c) = \sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2 \right) = \sum_{i=1}^{n} \left[ (a x_i' - b y_i')^2 + (b z_i' - c x_i')^2 + (c y_i' - a z_i')^2 \right]. \tag{4.60}
\]

In order to derive the minimum of the objective function under the restriction of equation (4.58), the Lagrangian

\[
K(a, b, c, k) = \Omega(a, b, c) - k(a^2 + b^2 + c^2 - 1) \tag{4.61}
\]

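can be built. Differentiating the function $K$ with respect to all unknowns and setting the resulting partial derivatives to zero leads to the system of normal equations

\[
\frac{\partial K}{\partial a} = 2a \left( \sum_{i=1}^{n} (x_i'^2 + z_i'^2) - k \right) - 2b \sum_{i=1}^{n} y_i' x_i' - 2c \sum_{i=1}^{n} y_i' z_i' = 0, \tag{4.62}
\]

\[
\frac{\partial K}{\partial b} = -2a \sum_{i=1}^{n} y_i' x_i' + 2b \left( \sum_{i=1}^{n} (y_i'^2 + z_i'^2) - k \right) - 2c \sum_{i=1}^{n} x_i' z_i' = 0, \tag{4.63}
\]

\[
\frac{\partial K}{\partial c} = -2a \sum_{i=1}^{n} y_i' z_i' - 2b \sum_{i=1}^{n} x_i' z_i' + 2c \left( \sum_{i=1}^{n} (x_i'^2 + y_i'^2) - k \right) = 0, \tag{4.64}
\]

and

\[
\frac{\partial K}{\partial k} = -\left( a^2 + b^2 + c^2 - 1 \right) = 0. \tag{4.65}
\]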

Equations (4.62) to (4.64) can be interpreted as a homogeneous system of equations, with the solution for parameter $k$ obtained from

\[
\begin{vmatrix}
(p_1 - k) & q_1 & q_2 \\
q_1 & (p_2 - k) & q_3 \\
q_2 & q_3 & (p_3 - k)
\end{vmatrix} = 0, \tag{4.66}
\]

with the respective elements

\[
\begin{aligned}
p_1 &= \sum_{i=1}^{n} (x_i'^2 + z_i'^2), \quad p_2 = \sum_{i=1}^{n} (y_i'^2 + z_i'^2), \quad p_3 = \sum_{i=1}^{n} (x_i'^2 + y_i'^2), \\
q_1 &= -\sum_{i=1}^{n} y_i' x_i', \quad q_2 = -\sum_{i=1}^{n} y_i' z_i' \quad \text{and} \quad q_3 = -\sum_{i=1}^{n} x_i' z_i'.
\end{aligned} \tag{4.67}
\]

Equation (4.66) is a cubic characteristic equation⁸ with the unknown parameter $k$. The adjusted line parameters $a$, $b$ and $c$ can be estimated by substituting $k_{\min}$ into equations (4.62) - (4.64) under the specified restriction or by transforming the equation system into an eigenvalue problem.

⁸ See for example (Bronshtein et al. 2005, p. 261) for the calculation of the value of a determinant of third order.
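The eigenvalue route mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the author's code: the function name `fit_line_3d` is hypothetical, and NumPy's symmetric eigensolver `numpy.linalg.eigh` is used in place of an explicit solution of the cubic (4.66). Note that, according to the representation (4.48), the direction vector of the line in $(x, y, z)$ order is $(b, a, c)$.

```python
import numpy as np

def fit_line_3d(points):
    """Direct least squares fit of a 3D line; returns centroid, (a, b, c), k_min."""
    centroid = points.mean(axis=0)
    x, y, z = (points - centroid).T                  # reduced coordinates x', y', z'
    p1, p2, p3 = x @ x + z @ z, y @ y + z @ z, x @ x + y @ y
    q1, q2, q3 = -(y @ x), -(y @ z), -(x @ z)        # elements of (4.67)
    N = np.array([[p1, q1, q2],
                  [q1, p2, q3],
                  [q2, q3, p3]])                     # coefficient matrix of (4.62)-(4.64)
    k, vecs = np.linalg.eigh(N)                      # eigenvalues in ascending order
    a, b, c = vecs[:, 0]                             # unit eigenvector for k_min, cf. (4.58)
    return centroid, (a, b, c), k[0]

# Illustrative points lying exactly on a line with direction (1, 2, 2)/3
t = np.linspace(-2.0, 2.0, 9)
pts = np.array([1.0, 2.0, 3.0]) + np.outer(t, [1.0, 2.0, 2.0]) / 3.0
centroid, (a, b, c), k_min = fit_line_3d(pts)
print(centroid, (b, a, c), k_min)                    # direction in (x, y, z) order is (b, a, c)
```

For error-free points the smallest eigenvalue is zero up to rounding, which reflects that $k_{\min}$ equals the minimised sum of squared orthogonal distances.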

4.3.2 TLS fitting of a straight line in 3D

A TLS solution for fitting a straight line in 3D using SVD has been presented for the first time in (Malissiovas

et al. 2016). In this section an equivalent solution is presented using a slightly modified functional model.

In order to build the adjustment model of equation (3.121) it is necessary to derive an appropriate functional

model. Thus, rearranging appropriately equation (4.56) yields

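\[
\begin{aligned}
\alpha (x_i' + v_{x_i}) - \beta (y_i' + v_{y_i}) &= 0, \\
0\,\alpha - \beta (z_i' + v_{z_i}) &= x_i' + v_{x_i}, \\
-\alpha (z_i' + v_{z_i}) + 0\,\beta &= y_i' + v_{y_i},
\end{aligned} \tag{4.68}
\]

with

\[
\alpha = -\frac{a}{c} \quad \text{and} \quad \beta = -\frac{b}{c}. \tag{4.69}
\]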

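The derived system of nonlinear equations can be expressed within an EIV model, with the respective quantities being

\[
\mathbf{L} = \begin{bmatrix} 0 \\ x_1' \\ y_1' \\ \vdots \\ 0 \\ x_n' \\ y_n' \end{bmatrix}, \quad
\mathbf{v}_L = \begin{bmatrix} 0 \\ v_{x_1} \\ v_{y_1} \\ \vdots \\ 0 \\ v_{x_n} \\ v_{y_n} \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \quad
\mathbf{A} = \begin{bmatrix} x_1' & -y_1' \\ 0 & -z_1' \\ -z_1' & 0 \\ \vdots & \vdots \\ x_n' & -y_n' \\ 0 & -z_n' \\ -z_n' & 0 \end{bmatrix}, \quad
\mathbf{V}_A = \begin{bmatrix} v_{x_1} & -v_{y_1} \\ 0 & -v_{z_1} \\ -v_{z_1} & 0 \\ \vdots & \vdots \\ v_{x_n} & -v_{y_n} \\ 0 & -v_{z_n} \\ -v_{z_n} & 0 \end{bmatrix}. \tag{4.70}
\]

The first column of matrix $\mathbf{A}$ contains the coefficients of the functional model (4.68) with respect to the unknown parameter $\alpha$, whilst the second column contains the coefficients with respect to the unknown parameter $\beta$.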

The augmented matrix is

\[
[\mathbf{A}, \mathbf{L}] =
\begin{bmatrix}
x_1' & -y_1' & 0 \\
0 & -z_1' & x_1' \\
-z_1' & 0 & y_1' \\
\vdots & \vdots & \vdots \\
x_n' & -y_n' & 0 \\
0 & -z_n' & x_n' \\
-z_n' & 0 & y_n'
\end{bmatrix}. \tag{4.71}
\]

The right singular vectors of the augmented matrix can be estimated by the eigenvalue/eigenvector decomposition. Thus, the squared augmented matrix is

\[
[\mathbf{A}, \mathbf{L}]^T [\mathbf{A}, \mathbf{L}] =
\begin{bmatrix}
p_1 & q_1 & q_2 \\
q_1 & p_2 & q_3 \\
q_2 & q_3 & p_3
\end{bmatrix} = \mathbf{G}, \tag{4.72}
\]

with the respective elements $p$ and $q$ corresponding to those of equation (4.67). The eigenvalues and eigenvectors of matrix $\mathbf{G}$ can be found by employing the generalised eigenvalue problem, which results in

\[
\begin{vmatrix}
(p_1 - \lambda) & q_1 & q_2 \\
q_1 & (p_2 - \lambda) & q_3 \\
q_2 & q_3 & (p_3 - \lambda)
\end{vmatrix} = 0. \tag{4.73}
\]

The derived determinant provides the characteristic equation of the eigenvalues. In this case this is a cubic characteristic equation with three solutions for the unknown eigenvalues $\lambda$. The adjusted line parameters $\hat{\alpha}$ and $\hat{\beta}$ can be found by employing the minimum eigenvalue principle of equation (4.38). The eigenvector corresponding to the smallest eigenvalue of matrix $\mathbf{G}$ holds the TLS solution for the 3D line parameters.

Obviously the elements of matrix $\mathbf{G}$ coincide with those from the direct least squares solution. The determinants (4.66) and (4.73) are equal, leading to identical characteristic equations. Therefore, the TLS solution for the nonlinear problem of the straight line fit in 3D space is identical with the presented direct least squares solution.
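The identity between the squared augmented matrix (4.72) and the elements (4.67) can also be verified numerically. The sketch below is an illustration only: the function name `tls_line_3d` and the test data are assumptions, and the three rows per point follow the layout of (4.71).

```python
import numpy as np

def tls_line_3d(points):
    """TLS estimate (alpha, beta) of (4.69) from the augmented matrix (4.71)."""
    x, y, z = (points - points.mean(axis=0)).T       # reduced coordinates
    o = np.zeros_like(x)
    # Three rows per point: [x', -y', 0], [0, -z', x'], [-z', 0, y']
    AL = np.stack([np.column_stack([x, -y, o]),
                   np.column_stack([o, -z, x]),
                   np.column_stack([-z, o, y])], axis=1).reshape(-1, 3)
    _, s, vt = np.linalg.svd(AL, full_matrices=False)
    v = vt[-1]                                       # right singular vector, smallest singular value
    alpha, beta = -v[0] / v[2], -v[1] / v[2]
    G = AL.T @ AL                                    # squared augmented matrix, cf. (4.72)
    return alpha, beta, G, s[-1] ** 2

# Illustrative points on a line with direction (1, 2, 2)/3, i.e. (a, b, c) = (2, 1, 2)/3
t = np.linspace(-2.0, 2.0, 9)
pts = np.array([1.0, 2.0, 3.0]) + np.outer(t, [1.0, 2.0, 2.0]) / 3.0
alpha, beta, G, sigma_min_sq = tls_line_3d(pts)
print(alpha, beta)
```

For the error-free points used here the estimates follow from (4.69) as $\alpha = -a/c$ and $\beta = -b/c$. The sketch presumes $c \neq 0$, since otherwise the ratios in (4.69) are not defined.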

4.4 Fitting of a plane in 3D

The third case under investigation is the nonlinear problem of fitting a plane to a 3D point cloud with all

coordinates being subject to measurement errors. Also for this case all coordinates of the points are regarded

as uncorrelated observations of equal precision. Several results from various TLS algorithms were presented

for this problem in (Schaffrin et al. 2006), which resulted in a slight deviation from the least squares solution.

Therefore, a mathematical relation between the TLS and least squares solution is built for fitting a plane in

3D, following the same line of thinking as in the previous application cases.

The general equation of a plane in 3D can be found in (Bronshtein et al. 2005, p. 214), which reads

\[
a x + b y + c z + d = 0, \tag{4.74}
\]

with $x$, $y$ and $z$ being the 3D coordinates of a point that lies in the plane, and $a$, $b$, $c$ and $d$ being the plane parameters. Assuming that the coordinates in all directions are observed quantities, a system of nonlinear condition equations emerges:

\[
a(x_i + v_{x_i}) + b(y_i + v_{y_i}) + c(z_i + v_{z_i}) + d = 0. \tag{4.75}
\]

Applying the least squares criterion, the plane that fits best to the observed point cloud can be estimated by minimizing the sum of squared residuals

\[
\sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2 \right) \rightarrow \min. \tag{4.76}
\]

4.4.1 Direct least squares solution for fitting a plane in 3D

Fitting a plane to points in 3D, with all coordinates being subject to measurement errors, is similar to the case of fitting a straight line in the plane. Therefore, the objective function (4.76) is equal to the sum of squared normal distances of the points to the requested plane

\[
\sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2 \right) = \sum_{i=1}^{n} D_i^2, \tag{4.77}
\]

with the normal distances being expressed by

\[
D_i = \frac{a x_i + b y_i + c z_i + d}{\sqrt{a^2 + b^2 + c^2}}, \tag{4.78}
\]

as shown in (Bronshtein et al. 2005, p. 214). A simplification of the problem is also possible in this case by replacing one unknown parameter. Thus, reducing the coordinates of the point cloud to the centre of mass (see equation 4.53) results in the elimination of parameter $d$ and the simplified functional model

\[
a x' + b y' + c z' = 0, \tag{4.79}
\]

with $x'$, $y'$ and $z'$ denoting the coordinates of a point reduced to the centre of mass of the 3D point cloud.

Solution with coordinates reduced to the centre of mass

The developed functional model (4.79) leads to the system of nonlinear condition equations

\[
a(x_i' + v_{x_i}) + b(y_i' + v_{y_i}) + c(z_i' + v_{z_i}) = 0 \tag{4.80}
\]

and the orthogonal distances

\[
D_i = \frac{a x_i' + b y_i' + c z_i'}{\sqrt{a^2 + b^2 + c^2}}. \tag{4.81}
\]

Since the parameters in equation (4.81) can be scaled by an arbitrary common factor without changing the distances, only two out of the three parameters $a$, $b$ and $c$ are independent. An appropriate constraint would therefore be

\[
a^2 + b^2 + c^2 = 1. \tag{4.82}
\]

Thus, the expression for the normal distances can be rewritten as

\[
D_i = a x_i' + b y_i' + c z_i' \tag{4.83}
\]

and the objective function under minimization is

\[
\Omega(a, b, c) = \sum_{i=1}^{n} (a x_i' + b y_i' + c z_i')^2. \tag{4.84}
\]

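A least squares solution for the unknown parameters $a$, $b$ and $c$ is required that minimizes $\Omega(a, b, c)$, subject to the constraint (4.82). The Lagrangian

\[
K(a, b, c, k) = \Omega(a, b, c) - k(a^2 + b^2 + c^2 - 1) \tag{4.85}
\]

can be built. The differentiation of $K$ with respect to the unknown plane parameters leads, after setting the partial derivatives to zero, to the system of normal equations

\[
\frac{\partial K}{\partial a} = 2 \left[ a \left( \sum_{i=1}^{n} x_i'^2 - k \right) + b \left( \sum_{i=1}^{n} y_i' x_i' \right) + c \left( \sum_{i=1}^{n} x_i' z_i' \right) \right] = 0, \tag{4.86}
\]

\[
\frac{\partial K}{\partial b} = 2 \left[ a \left( \sum_{i=1}^{n} y_i' x_i' \right) + b \left( \sum_{i=1}^{n} y_i'^2 - k \right) + c \left( \sum_{i=1}^{n} y_i' z_i' \right) \right] = 0, \tag{4.87}
\]

\[
\frac{\partial K}{\partial c} = 2 \left[ a \left( \sum_{i=1}^{n} x_i' z_i' \right) + b \left( \sum_{i=1}^{n} y_i' z_i' \right) + c \left( \sum_{i=1}^{n} z_i'^2 - k \right) \right] = 0 \tag{4.88}
\]

and

\[
\frac{\partial K}{\partial k} = -\left( a^2 + b^2 + c^2 - 1 \right) = 0. \tag{4.89}
\]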

The solution for the Lagrange multiplier can be derived from

\[
\begin{vmatrix}
(r_1 - k) & s_1 & s_2 \\
s_1 & (r_2 - k) & s_3 \\
s_2 & s_3 & (r_3 - k)
\end{vmatrix} = 0, \tag{4.90}
\]

with the quantities

\[
\begin{aligned}
r_1 &= \sum_{i=1}^{n} x_i'^2, \quad r_2 = \sum_{i=1}^{n} y_i'^2, \quad r_3 = \sum_{i=1}^{n} z_i'^2, \\
s_1 &= \sum_{i=1}^{n} y_i' x_i', \quad s_2 = \sum_{i=1}^{n} x_i' z_i' \quad \text{and} \quad s_3 = \sum_{i=1}^{n} y_i' z_i'.
\end{aligned} \tag{4.91}
\]

This is a cubic characteristic equation and has three solutions for $k$. The unknown plane parameters $a$, $b$ and $c$ can be estimated either by inserting $\hat{k}_{\min}$ in equations (4.86) - (4.88), under the restriction (4.82), or by transforming the equation system into an eigenvalue problem. The presented direct solution for fitting a plane in 3D coincides with that of Linkwitz (1976).

4.4.2 TLS fitting of a plane in 3D

The TLS solution for fitting a plane in 3D can be derived analogously to the investigations of Schaffrin

et al. (2006), following however a different functional model. Based on the presented approach for obtaining

a TLS estimate, the functional model of equation (4.79) can be rewritten as

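\[
z' = -\frac{a}{c} x' - \frac{b}{c} y' \;\Rightarrow\; z_i' = \alpha x_i' + \beta y_i', \tag{4.92}
\]

with

\[
\alpha = -\frac{a}{c} \quad \text{and} \quad \beta = -\frac{b}{c}. \tag{4.93}
\]

Therefore, the system of condition equations (4.80) becomes

\[
z_i' + v_{z_i} = \alpha (x_i' + v_{x_i}) + \beta (y_i' + v_{y_i}), \tag{4.94}
\]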

and can be expressed by an EIV model, after introducing the following matrices:

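\[
\mathbf{L} = \begin{bmatrix} z_1' \\ z_2' \\ \vdots \\ z_n' \end{bmatrix}, \quad
\mathbf{v}_L = \begin{bmatrix} v_{z_1} \\ v_{z_2} \\ \vdots \\ v_{z_n} \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \quad
\mathbf{A} = \begin{bmatrix} x_1' & y_1' \\ x_2' & y_2' \\ \vdots & \vdots \\ x_n' & y_n' \end{bmatrix}, \quad
\mathbf{V}_A = \begin{bmatrix} v_{x_1} & v_{y_1} \\ v_{x_2} & v_{y_2} \\ \vdots & \vdots \\ v_{x_n} & v_{y_n} \end{bmatrix}. \tag{4.95}
\]

The first column of the coefficient matrix $\mathbf{A}$ contains the coefficients of the condition equations (4.94) with respect to the unknown parameter $\alpha$, while the second column contains the coefficients with respect to $\beta$.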

Furthermore, it is possible to build the augmented matrix

\[
[\mathbf{A}, \mathbf{L}] =
\begin{bmatrix}
x_1' & y_1' & z_1' \\
x_2' & y_2' & z_2' \\
\vdots & \vdots & \vdots \\
x_n' & y_n' & z_n'
\end{bmatrix} \tag{4.96}
\]

and the square matrix

\[
[\mathbf{A}, \mathbf{L}]^T [\mathbf{A}, \mathbf{L}] =
\begin{bmatrix}
x_1' & x_2' & \cdots & x_n' \\
y_1' & y_2' & \cdots & y_n' \\
z_1' & z_2' & \cdots & z_n'
\end{bmatrix}
\begin{bmatrix}
x_1' & y_1' & z_1' \\
x_2' & y_2' & z_2' \\
\vdots & \vdots & \vdots \\
x_n' & y_n' & z_n'
\end{bmatrix} = \mathbf{G}. \tag{4.97}
\]

This is equivalent to

\[
\mathbf{G} =
\begin{bmatrix}
r_1 & s_1 & s_2 \\
s_1 & r_2 & s_3 \\
s_2 & s_3 & r_3
\end{bmatrix}, \tag{4.98}
\]

with the respective elements being identical to those of equation (4.91). The eigenvalues and eigenvectors of matrix $\mathbf{G}$ can be computed from the generalised eigenvalue problem, by solving

\[
\begin{vmatrix}
(r_1 - \lambda) & s_1 & s_2 \\
s_1 & (r_2 - \lambda) & s_3 \\
s_2 & s_3 & (r_3 - \lambda)
\end{vmatrix} = 0. \tag{4.99}
\]

This characteristic cubic equation has three solutions for the eigenvalue $\lambda$. The unknown parameters $\alpha$ and $\beta$ can be estimated using the minimum eigenvalue principle. The presented least squares solution for fitting a plane in 3D coincides perfectly with the TLS solution. Equations (4.90) and (4.99) are identical and only the name of the unknown ($k$ or $\lambda$) is different.
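Both routes for the plane fit can be placed side by side in a few lines. The following sketch is illustrative rather than the author's implementation: the function name `fit_plane` is hypothetical, and the recovery of $d$ from the centre of mass, $d = -(a x_c + b y_c + c z_c)$, is the standard back-substitution and is stated here as an assumption, since the text above only mentions that $d$ is eliminated.

```python
import numpy as np

def fit_plane(points):
    """Plane fit via the eigenvalue problem of G (4.98) and via SVD of [A, L] (4.96)."""
    centroid = points.mean(axis=0)
    R = points - centroid                        # columns x', y', z' = augmented matrix [A, L]
    G = R.T @ R                                  # elements r and s of (4.91)
    k, vecs = np.linalg.eigh(G)                  # ascending eigenvalues
    a, b, c = vecs[:, 0]                         # unit normal for k_min, cf. (4.82)
    d = -(vecs[:, 0] @ centroid)                 # assumption: back-substitution of d
    _, _, vt = np.linalg.svd(R, full_matrices=False)
    v = vt[-1]                                   # right singular vector, smallest singular value
    return {"normal": np.array([a, b, c]), "d": d, "k_min": k[0],
            "direct": (-a / c, -b / c),          # alpha, beta from (4.93)
            "tls": (-v[0] / v[2], -v[1] / v[2])}

# Illustrative points lying exactly on the plane z = 2x - y + 3
xs, ys = np.meshgrid(np.arange(4.0), np.arange(3.0))
pts = np.column_stack([xs.ravel(), ys.ravel(), 2.0 * xs.ravel() - ys.ravel() + 3.0])
res = fit_plane(pts)
print(res["direct"], res["tls"], res["k_min"])
```

The sign of the eigenvector is arbitrary, but the ratios in (4.93) do not depend on it, so both routes return the same $\alpha$ and $\beta$.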

4.5 2D similarity transformation of coordinates

The 2D similarity transformation of coordinates is one of the most frequent geodetic and photogrammetric applications. A first attempt to estimate the TLS solution of the problem using SVD was that of Felus and Schaffrin (2005), who presented a Structured TLS (STLS) algorithm for solving the problem. Neitzel (2010) has shown that this algorithm needs to be modified for estimating the correct solution. For this reason, the same problem has been examined again by Schaffrin et al. (2012). Their modified solution is iterative; however, they state that a TLS solution using SVD could be possible. Here, a new approach is presented for a direct solution of the problem (least squares and also TLS solution via SVD).

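The well-known equation for the planar coordinate transformation is

\[
\begin{bmatrix} X_i \\ Y_i \end{bmatrix} =
\begin{bmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{bmatrix}
\begin{bmatrix} \mu & 0 \\ 0 & \mu \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \end{bmatrix} +
\begin{bmatrix} t_x \\ t_y \end{bmatrix}, \tag{4.100}
\]

see for example (Felus and Schaffrin 2005). This can be written equivalently as

\[
\begin{aligned}
X_i &= (\mu \cos\varphi)\, x_i - (\mu \sin\varphi)\, y_i + t_x, \\
Y_i &= (\mu \sin\varphi)\, x_i + (\mu \cos\varphi)\, y_i + t_y,
\end{aligned} \tag{4.101}
\]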

with $i = 1, \ldots, n$, where $n$ is the number of observed homologous points in the source $xy$ and target $XY$ system. The unknown transformation parameters between the two systems are:

- $\varphi$ = rotation angle

- $\mu$ = scale factor

- $t_x$ = translation in $x$ direction

- $t_y$ = translation in $y$ direction

Introducing the parameters

\[
\xi_1 = \mu \cos\varphi \quad \text{and} \quad \xi_2 = \mu \sin\varphi, \tag{4.102}
\]

it is possible to obtain the simplified equation system

\[
\begin{aligned}
X_i &= \xi_1 x_i - \xi_2 y_i + t_x, \\
Y_i &= \xi_2 x_i + \xi_1 y_i + t_y.
\end{aligned} \tag{4.103}
\]

If all point coordinates are considered as measured quantities, the necessary residuals are introduced in the

functional model resulting in the nonlinear system of condition equations

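\[
\begin{aligned}
X_i + v_{X_i} - \xi_1 (x_i + v_{x_i}) + \xi_2 (y_i + v_{y_i}) - t_x &= 0, \\
Y_i + v_{Y_i} - \xi_2 (x_i + v_{x_i}) - \xi_1 (y_i + v_{y_i}) - t_y &= 0.
\end{aligned} \tag{4.104}
\]

The least squares criterion can be employed for an “optimal” solution by minimizing the sum of the squared residuals in both coordinate systems:

\[
\sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 + v_{X_i}^2 + v_{Y_i}^2 \right) \rightarrow \min. \tag{4.105}
\]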

4.5.1 Direct least squares solution for the 2D similarity transformation

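For a realistic functional model the translation vector has to be present. However, the elimination of the translations from the functional model is possible, and this can be proven in the same way as for the previously investigated cases, by showing that

\[
\begin{aligned}
t_x &= X_c - \xi_1 x_c + \xi_2 y_c, \\
t_y &= Y_c - \xi_2 x_c - \xi_1 y_c,
\end{aligned} \tag{4.106}
\]

with $x_c$ and $y_c$ denoting the coordinates of the centre of mass of the points in the source system and $X_c$ and $Y_c$ those in the target system, computed by

\[
x_c = \frac{1}{n} \sum_{i=1}^{n} x_i, \quad y_c = \frac{1}{n} \sum_{i=1}^{n} y_i, \quad X_c = \frac{1}{n} \sum_{i=1}^{n} X_i, \quad Y_c = \frac{1}{n} \sum_{i=1}^{n} Y_i. \tag{4.107}
\]

Therefore, a reduction of all coordinates to their centre of mass leads to a simplified functional model

\[
\begin{aligned}
X_i' + v_{X_i} - \xi_1 (x_i' + v_{x_i}) + \xi_2 (y_i' + v_{y_i}) &= 0, \\
Y_i' + v_{Y_i} - \xi_2 (x_i' + v_{x_i}) - \xi_1 (y_i' + v_{y_i}) &= 0.
\end{aligned} \tag{4.108}
\]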

Appropriate parametrization of the problem

In order to obtain a direct solution in the same manner as in the previous sections, an additional unknown parameter has to be taken into consideration⁹. For this reason the functional model (4.108) can be rewritten as

\[
\begin{aligned}
\gamma (X_i' + v_{X_i}) + \alpha (x_i' + v_{x_i}) - \beta (y_i' + v_{y_i}) &= 0, \\
\gamma (Y_i' + v_{Y_i}) + \beta (x_i' + v_{x_i}) + \alpha (y_i' + v_{y_i}) &= 0,
\end{aligned} \tag{4.109}
\]

with

\[
\xi_1 = -\frac{\alpha}{\gamma} \quad \text{and} \quad \xi_2 = -\frac{\beta}{\gamma}. \tag{4.110}
\]

The enforced additional unknown parameter ($\gamma$ can be seen as an additional parameter) requires a restriction between the unknowns. For the purposes of this research, a “meaningful” constraint is chosen as

\[
\alpha^2 + \beta^2 + \gamma^2 = 1. \tag{4.111}
\]

⁹ Here the problem is overparametrized and a meaningful constraint between the unknown parameters is chosen. However, this is not necessary for obtaining a solution; it only serves to be consistent with the solution strategy that was followed in the adjustment problems of the previous sections, and especially to show that the TLS solution is identical with the solution from the proposed direct least squares approach.

Solution with coordinates reduced to the centre of mass

The coordinates of the points in both coordinate systems are subject to measurement errors. By employing the least squares criterion, the goal is to minimize the errors in all homologous points and in both directions. This is equivalent to the minimization of the sum of the squared Euclidean distances between the points in the target system and the transformed homologous points from the source system

\[
\sum_{i=1}^{n} \left( v_{x_i}^2 + v_{y_i}^2 + v_{X_i}^2 + v_{Y_i}^2 \right) = \sum_{i=1}^{n} D_i^2 \rightarrow \min, \tag{4.112}
\]

with the squared distances between two homologous points expressed as

\[
D_i^2 = (\gamma X_i' + \alpha x_i' - \beta y_i')^2 + (\gamma Y_i' + \beta x_i' + \alpha y_i')^2. \tag{4.113}
\]

Therefore, the objective function under minimization becomes

\[
\Omega(\alpha, \beta, \gamma) = \sum_{i=1}^{n} D_i^2 = \sum_{i=1}^{n} \left[ (\gamma X_i' + \alpha x_i' - \beta y_i')^2 + (\gamma Y_i' + \beta x_i' + \alpha y_i')^2 \right]. \tag{4.114}
\]

A least squares solution for the unknown transformation parameters $\alpha$, $\beta$ and $\gamma$ is desired that minimizes $\Omega$ under the restriction (4.111). Thus, the Lagrange function can be built as

\[
K(\alpha, \beta, \gamma, k) = \Omega(\alpha, \beta, \gamma) - k(\alpha^2 + \beta^2 + \gamma^2 - 1). \tag{4.115}
\]

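Differentiating the Lagrangian $K$ with respect to all unknown parameters and setting the partial derivatives to zero yields the system of normal equations

\[
\frac{\partial K}{\partial \alpha} = 2 \left[ \alpha \left( \sum_{i=1}^{n} (x_i'^2 + y_i'^2) - k \right) + \gamma \left( \sum_{i=1}^{n} x_i' X_i' + \sum_{i=1}^{n} y_i' Y_i' \right) \right] = 0, \tag{4.116}
\]

\[
\frac{\partial K}{\partial \beta} = 2 \left[ \beta \left( \sum_{i=1}^{n} (x_i'^2 + y_i'^2) - k \right) + \gamma \left( \sum_{i=1}^{n} x_i' Y_i' - \sum_{i=1}^{n} y_i' X_i' \right) \right] = 0, \tag{4.117}
\]

\[
\frac{\partial K}{\partial \gamma} = 2 \left[ \alpha \left( \sum_{i=1}^{n} x_i' X_i' + \sum_{i=1}^{n} y_i' Y_i' \right) + \beta \left( \sum_{i=1}^{n} x_i' Y_i' - \sum_{i=1}^{n} y_i' X_i' \right) + \gamma \left( \sum_{i=1}^{n} (X_i'^2 + Y_i'^2) - k \right) \right] = 0 \tag{4.118}
\]

and

\[
\frac{\partial K}{\partial k} = -\left( \alpha^2 + \beta^2 + \gamma^2 - 1 \right) = 0. \tag{4.119}
\]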

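Similarly to the previous cases it is possible to estimate $k$ by solving

\[
\begin{vmatrix}
(v_1 - k) & w_1 & w_2 \\
w_1 & (v_1 - k) & w_3 \\
w_2 & w_3 & (v_2 - k)
\end{vmatrix} = 0, \tag{4.120}
\]

which leads to a cubic equation with one unknown parameter. The respective elements are

\[
\begin{aligned}
v_1 &= \sum_{i=1}^{n} (x_i'^2 + y_i'^2), \quad v_2 = \sum_{i=1}^{n} (X_i'^2 + Y_i'^2), \\
w_1 &= 0, \quad w_2 = \sum_{i=1}^{n} x_i' X_i' + \sum_{i=1}^{n} y_i' Y_i' \quad \text{and} \quad w_3 = \sum_{i=1}^{n} x_i' Y_i' - \sum_{i=1}^{n} y_i' X_i'.
\end{aligned} \tag{4.121}
\]

The unknown transformation parameters $\alpha$, $\beta$ and $\gamma$ can be estimated either by substituting parameter $k_{\min}$ into equations (4.116) - (4.118), under the condition (4.119), or by transforming the equation system into an eigenvalue problem.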

4.5.2 TLS 2D similarity transformation

In this section the TLS solution of the 2D similarity transformation is presented. By utilizing the functional

model of equation (4.108) and following the same approach as in the presented TLS solutions (subsections

4.2.2, 4.3.2, 4.4.2), the EIV model is introduced with the relevant matrices







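\[
\mathbf{A} = \begin{bmatrix} x_1' & -y_1' \\ y_1' & x_1' \\ \vdots & \vdots \\ x_n' & -y_n' \\ y_n' & x_n' \end{bmatrix}, \quad
\mathbf{V}_A = \begin{bmatrix} v_{x_1} & -v_{y_1} \\ v_{y_1} & v_{x_1} \\ \vdots & \vdots \\ v_{x_n} & -v_{y_n} \\ v_{y_n} & v_{x_n} \end{bmatrix}, \quad
\mathbf{L} = \begin{bmatrix} X_1' \\ Y_1' \\ \vdots \\ X_n' \\ Y_n' \end{bmatrix}, \quad
\mathbf{v}_L = \begin{bmatrix} v_{X_1} \\ v_{Y_1} \\ \vdots \\ v_{X_n} \\ v_{Y_n} \end{bmatrix}, \quad
\hat{\mathbf{X}} = \begin{bmatrix} \hat{\xi}_1 \\ \hat{\xi}_2 \end{bmatrix}. \tag{4.122}
\]

The augmented matrix $[\mathbf{A}, \mathbf{L}]$ can be described in this case by

\[
[\mathbf{A}, \mathbf{L}] =
\begin{bmatrix}
x_1' & -y_1' & X_1' \\
y_1' & x_1' & Y_1' \\
\vdots & \vdots & \vdots \\
x_n' & -y_n' & X_n' \\
y_n' & x_n' & Y_n'
\end{bmatrix}. \tag{4.123}
\]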

The right singular vectors of the augmented matrix can be derived by the eigenvalue/eigenvector decomposition of the squared matrix

\[
[\mathbf{A}, \mathbf{L}]^T [\mathbf{A}, \mathbf{L}] =
\begin{bmatrix}
v_1 & w_1 & w_2 \\
w_1 & v_1 & w_3 \\
w_2 & w_3 & v_2
\end{bmatrix} = \mathbf{G}, \tag{4.124}
\]

with the respective elements being equal to those of equation (4.121). The solution for the transformation parameters can be determined from the generalised eigenvalue problem by solving

\[
\begin{vmatrix}
(v_1 - \lambda) & w_1 & w_2 \\
w_1 & (v_1 - \lambda) & w_3 \\
w_2 & w_3 & (v_2 - \lambda)
\end{vmatrix} = 0. \tag{4.125}
\]

As expected, equation (4.125) is the same as equation (4.120), and the resulting cubic polynomial equation for the solution of $k$ is identical to the characteristic equation of the eigenvalues $\lambda$. The translation terms $t_x$ and $t_y$ can be computed by substituting the estimated parameters into equation (4.106).
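The complete chain from reduced coordinates to the four transformation parameters can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `similarity_2d` and the test data are hypothetical, `numpy.linalg.eigh` replaces an explicit solution of the cubic, and the recovery of $\mu$ and $\varphi$ from $\xi_1$ and $\xi_2$ simply inverts (4.102).

```python
import numpy as np

def similarity_2d(src, dst):
    """Return (xi1, xi2, tx, ty, mu, phi) for the transformation src -> dst."""
    sc, dc = src.mean(axis=0), dst.mean(axis=0)      # centres of mass, cf. (4.107)
    x, y = (src - sc).T
    X, Y = (dst - dc).T
    v1, v2 = x @ x + y @ y, X @ X + Y @ Y            # elements of (4.121)
    w2, w3 = x @ X + y @ Y, x @ Y - y @ X            # w1 = 0
    G = np.array([[v1, 0.0, w2],
                  [0.0, v1, w3],
                  [w2, w3, v2]])                     # cf. (4.124)
    _, vecs = np.linalg.eigh(G)
    alpha, beta, gamma = vecs[:, 0]                  # eigenvector for the smallest eigenvalue
    xi1, xi2 = -alpha / gamma, -beta / gamma         # cf. (4.110)
    tx = dc[0] - xi1 * sc[0] + xi2 * sc[1]           # cf. (4.106)
    ty = dc[1] - xi2 * sc[0] - xi1 * sc[1]
    return xi1, xi2, tx, ty, np.hypot(xi1, xi2), np.arctan2(xi2, xi1)

# Illustrative error-free data generated with mu = 1.5, phi = 0.3 rad, t = (10, -5)
src = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [5.0, 3.0]])
c, s = 1.5 * np.cos(0.3), 1.5 * np.sin(0.3)
dst = np.column_stack([c * src[:, 0] - s * src[:, 1] + 10.0,
                       s * src[:, 0] + c * src[:, 1] - 5.0])
print(similarity_2d(src, dst))
```

With error-free input the generating parameters are reproduced up to rounding. With noisy input the same code returns the estimate that minimises (4.112) under the equal-weight assumption of this chapter.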

4.6 General formulation and classification

The normal equations of the nonlinear least squares problems discussed in this chapter can be transformed into an eigenvalue problem and be solved directly when the characteristic equation is a polynomial of degree four or less. Such adjustment cases are the fitting of a straight line in 2D and 3D, the fitting of a plane in 3D and the 2D similarity transformation of coordinates. The following common features have been identified for the direct solution of these problems:

1. The measured quantities in all adjustment cases were equally weighted and uncorrelated.

2. In the beginning a nonlinear and over-parametrised functional model was used to express each individual problem, see for example equation (4.3) for fitting of a straight line in 2D.

3. By choosing an appropriate restriction between the unknown parameters for the adjustment of each investigated problem, it was possible to obtain an apparently linear relationship between the observations and the unknowns.

4. A reduction of the observed coordinates to the centre of mass was proven to be admissible in every case. This reduction always leads to the substitution of some unknown parameters with known ones. However, it must be mentioned that this parameter substitution is not necessary and is only performed for simplifying the problem. Thus, an equivalent solution can be obtained from the respective normal equations, including all unknown parameters.

5. The developed objective function for minimising the sum of squared residuals leads to a homogeneous system of normal equations which is linear with respect to the unknown parameters and has a direct solution (provided that the derived characteristic equation is a polynomial of degree four or less).

6. The derived direct solutions have been proven to be identical with the TLS solutions obtained by using SVD.

These features can be used in the future as criteria for identifying more easily and quickly those nonlinear least squares problems that belong to this class. A general formulation of these adjustment problems can be considered, based on the replacement of the “original” nonlinear functional model with an apparently linear one, see features (2.) and (3.) above.

General formulation in matrix notation

All discussed adjustments can be solved in their “original” nonlinear form iteratively by linearizing the condition equations and expressing the problem within a GHM. However, in all cases the problem could be “transformed” in such a way that it could be expressed in matrix notation by the system of observation equations

\[
\mathbf{L} + \mathbf{v} = \mathbf{A} \mathbf{X}, \tag{4.126}
\]

with the nonlinear constraint between the unknown parameters being described by the quadratic function¹⁰

\[
\mathbf{X}^T \mathbf{X} = 1. \tag{4.127}
\]

Vector $\mathbf{L}$ contains pseudo-observations (zero elements). Vector $\mathbf{v}$ includes the residuals of the pseudo-observations, which are the orthogonal distances. The design matrix $\mathbf{A}$ contains the coefficients of the linear observation equations with respect to the unknown parameters in each problem and vector $\mathbf{X}$ holds the unknown parameters to be estimated.

In order to illustrate this type of functional modeling, the example of fitting of a straight line in 2D can be considered. Here, the $n$ orthogonal distances of the measured points to the straight line describe the functional model of the problem, with the observation equations

\[
\begin{aligned}
0_1 + D_1 &= a x_1' + b y_1', \\
0_2 + D_2 &= a x_2' + b y_2', \\
&\;\;\vdots \\
0_n + D_n &= a x_n' + b y_n',
\end{aligned} \tag{4.128}
\]

under the constraint $a^2 + b^2 = 1$. This system of observation equations can be expressed in matrix notation, as in equation (4.126), with the respective matrices being defined by

\[
\mathbf{L} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
\mathbf{v} = \begin{bmatrix} D_1 \\ D_2 \\ \vdots \\ D_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} a \\ b \end{bmatrix}, \quad
\mathbf{A} = \begin{bmatrix} x_1' & y_1' \\ x_2' & y_2' \\ \vdots & \vdots \\ x_n' & y_n' \end{bmatrix}. \tag{4.129}
\]

¹⁰ In strict notation the product of matrices will also be a matrix, which in this case will have only one element.

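Taking into account the stochastic information of the measured quantities, it can be seen that the problem can be expressed within a GMM with a quadratic constraint (as in the presented solution of section 3.1). Avoiding any kind of linearization, a least squares solution can be obtained by minimizing the objective function

\[
\Omega(\mathbf{v}, \mathbf{X}) = \mathbf{v}^T \mathbf{v} \rightarrow \min, \tag{4.130}
\]

or, taking into account the constraint (4.127), by finding the minimum of the Lagrange function

\[
K(\mathbf{v}, \mathbf{X}, k) = \mathbf{v}^T \mathbf{v} - k(\mathbf{X}^T \mathbf{X} - 1) \rightarrow \min, \tag{4.131}
\]

with $k$ denoting the Lagrange multiplier. Rearranging the observation equations (4.126) and substituting the solution for the residuals in the Lagrangian yields

\[
\begin{aligned}
K(\mathbf{X}, k) &= (\mathbf{A}\mathbf{X} - \mathbf{L})^T (\mathbf{A}\mathbf{X} - \mathbf{L}) - k(\mathbf{X}^T \mathbf{X} - 1) \\
\Rightarrow K(\mathbf{X}, k) &= \mathbf{X}^T \mathbf{A}^T \mathbf{A} \mathbf{X} - 2 \mathbf{L}^T \mathbf{A} \mathbf{X} + \mathbf{L}^T \mathbf{L} - k(\mathbf{X}^T \mathbf{X} - 1).
\end{aligned} \tag{4.132}
\]

Taking into consideration that the vector of pseudo-observations $\mathbf{L}$ contains only zero values, the last function can be equivalently written as

\[
K(\mathbf{X}, k) = \mathbf{X}^T \mathbf{A}^T \mathbf{A} \mathbf{X} - k(\mathbf{X}^T \mathbf{X} - 1). \tag{4.133}
\]

The minimization of the derived Lagrangian can be found already in (Perović 2005, p. 33) as the solution of mathematical problems in quadratic forms, with their extrema being derived using an EVD. This can be proven by taking the partial derivatives of $K$ with respect to the unknowns and setting them to zero, which yields the normal equation system

\[
\frac{\partial K}{\partial \mathbf{X}^T} = 2 \mathbf{A}^T \mathbf{A} \mathbf{X} - 2 k \mathbf{X} = \mathbf{0}
\;\Rightarrow\; \mathbf{A}^T \mathbf{A} \mathbf{X} - k \mathbf{X} = \mathbf{0}, \tag{4.134}
\]

\[
\frac{\partial K}{\partial k} = \mathbf{X}^T \mathbf{X} - 1 = 0. \tag{4.135}
\]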

In equation (4.134), parameter $k$ can be seen as an eigenvalue and $\mathbf{X}$ as an eigenvector of the squared matrix $\mathbf{A}^T \mathbf{A}$. The solution can be computed from the generalised eigenvalue problem

\[
\mathbf{A}^T \mathbf{A} \mathbf{X} - k \mathbf{X} = \mathbf{0} \;\Rightarrow\; (\mathbf{A}^T \mathbf{A} - k \mathbf{I}) \mathbf{X} = \mathbf{0}, \tag{4.136}
\]

with $\mathbf{I}$ denoting an identity matrix. The eigenvalues of matrix $\mathbf{A}^T \mathbf{A}$ can be determined by searching for non-trivial solutions $\mathbf{X} \neq \mathbf{0}$, i.e. by solving the characteristic equation of the eigenvalues, or equivalently by

\[
\det(\mathbf{A}^T \mathbf{A} - k \mathbf{I}) = 0. \tag{4.137}
\]

From the latter developments it can be seen that all discussed problems can be expressed within a GMM, while finding the minimum of the objective function is equivalent to finding the minimum of a quadratic function by employing EVD.
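The general formulation lends itself to a generic routine. The sketch below is an illustration, not part of the original derivation: the function name `solve_quadratic_constrained` is hypothetical and `numpy.linalg.eigh` is assumed as the EVD. It also makes one consequence of (4.134) and (4.135) visible: multiplying (4.134) from the left by $\mathbf{X}^T$ gives $\mathbf{v}^T \mathbf{v} = \mathbf{X}^T \mathbf{A}^T \mathbf{A} \mathbf{X} = k$, so the smallest eigenvalue is the minimised sum of squared orthogonal distances.

```python
import numpy as np

def solve_quadratic_constrained(A):
    """Minimise X^T A^T A X subject to X^T X = 1, cf. (4.133)-(4.137)."""
    k, vecs = np.linalg.eigh(A.T @ A)     # eigenvalues in ascending order
    X = vecs[:, 0]                        # unit eigenvector for the smallest eigenvalue
    v = A @ X                             # residuals of the zero pseudo-observations, cf. (4.126)
    return X, k[0], v

# Illustrative use with the 2D line example of (4.128)-(4.129)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
A = np.column_stack([x - x.mean(), y - y.mean()])
X, k_min, v = solve_quadratic_constrained(A)
print(X, k_min, v @ v)
```

The same routine applies unchanged to the design matrices of the plane fit and, with the appropriate three-column matrix, to the other problems of this chapter.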

4.7 Discussion and open questions

In this chapter two individual solution strategies have been examined, the systematic approach for direct least squares solutions that has been established already in (Malissiovas et al. 2016) and TLS. A mathematical relationship between the two approaches has been presented by comparing their solutions for four nonlinear least squares problems: the fitting of a straight line in 2D and 3D, the fitting of a plane in 3D, as well as the 2D similarity transformation of coordinates. The discussed adjustment problems have been identified as belonging to a certain class of nonlinear least squares problems that can be transformed into the solution of a polynomial equation. Thus, depending on the polynomial's degree, a direct solution can be possible¹¹.
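To illustrate what a direct solution of the cubic characteristic equation can look like, the following sketch computes the smallest eigenvalue of a symmetric 3×3 matrix without iteration. It uses the trigonometric (Viète) form of the cubic solution, which is a swapped-in textbook technique and not necessarily the formula set of the handbook cited in this chapter. The function name `smallest_eigenvalue_sym3` is hypothetical.

```python
import numpy as np

def smallest_eigenvalue_sym3(G):
    """Smallest root of det(G - k I) = 0 for a symmetric 3x3 matrix, without iteration."""
    q = np.trace(G) / 3.0
    B = G - q * np.eye(3)                    # shift so that trace(B) = 0
    p = np.sqrt(np.sum(B * B) / 6.0)
    if p == 0.0:                             # G is a multiple of the identity
        return q
    r = np.clip(np.linalg.det(B / p) / 2.0, -1.0, 1.0)   # guard against rounding
    phi = np.arccos(r) / 3.0                 # phi lies in [0, pi/3]
    return q + 2.0 * p * np.cos(phi + 2.0 * np.pi / 3.0)

# Illustrative symmetric matrix (made-up values)
G = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
print(smallest_eigenvalue_sym3(G), np.linalg.eigvalsh(G)[0])
```

Both printed values should agree up to rounding. For nearly degenerate matrices the closed form can lose accuracy, which is one practical reason to prefer a library eigensolver or SVD.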

An “optimal” estimate for the unknowns is derived by employing the method of least squares and minimizing

a well-defined Lagrange function. In all discussed cases the normal equations can be solved with various

techniques, for example SVD or EVD and by solving a characteristic equation, which was always identical to

the characteristic equation of the eigenvalues from the corresponding TLS solution. The developed approach

provides a deep understanding of the concept of TLS for the solution of nonlinear least squares problems.

It has been already known from (Neitzel and Petrovic 2008), (Neitzel 2010), as well as (Reinking 2008), that

TLS is not a new method per se but a solution strategy for a class of nonlinear least squares problems. In

addition, the presented direct solutions of this chapter reveal that TLS is an algorithmic approach for the

solution of a class of nonlinear least squares problems using SVD.

Nevertheless, in order to obtain a direct solution for the discussed adjustment problems, either using the proposed systematic approach or with TLS and SVD, it is always assumed that the observations are uncorrelated and have been obtained with equal precision. When postulating a different precision for each observation, different solution strategies can be utilized. A weighted least squares solution can be obtained in this case using, for example, the Gauss-Newton approach from the traditional geodetic solutions of section 3.1, or by employing one of the WTLS algorithms from section 3.2.1.2.

¹¹ Direct least squares solutions have been presented for the investigated adjustment problems, as the roots of the derived polynomials could be computed directly using known formulas from mathematics; for more information see (Bronshtein et al. 2005, p. 62 ff.).

The following questions arise out of the findings from this chapter:

- If it is possible to obtain directly a solution for this class of nonlinear least squares problems by

transforming them into solving a polynomial equation, is it also possible to obtain a similar solution

by transforming the weighted least squares problem?

- Are there specific weighted cases of nonlinear least squares problems (besides the generally well-known

case of equally weighted observations) which can be solved directly?

- Is it possible to detect those weighted nonlinear least squares problems with a direct solution and solve

them by using a systematic approach?

- In cases where a direct weighted least squares solution is not possible, what are the alternative ways?

Is it possible to obtain an iterative solution without making any use of linearization of the problem?

Therefore, possible direct solutions are investigated in the next chapter for the discussed class of adjustment problems by postulating different weighting cases for the measured quantities. In those scenarios where a direct solution is not possible, an iterative approach is examined that does not involve a linearization of the problem at any step of the procedure.

5 Direct and iterative solutions of weighted nonlinear least squares problems

5.1 Basic idea and general methodology

Direct solutions have been presented in the previous chapter for a special class of nonlinear least squares

problems. The established solution strategy involved the transformation of the normal equations into the

solution of a quadratic or cubic algebraic equation (characteristic equation). The mathematical derivations

of those solutions were based on the fact that uncorrelated observations have been obtained with equal

precision. Thus, the postulated weights of the adjustment were in all cases equal.

The investigations in this chapter focus on the aforementioned questions from section 4.7. Three of the

adjustment problems that belong to this class with a direct solution are examined here¹, namely:

- Fitting of a straight line in 2D;

- Fitting of a plane in 3D;

- 2D similarity transformation of coordinates,

and for four individual weighting cases for each problem:

1. Same precision for the coordinates in each direction;

2. Individual precision for the coordinates of each point;

3. Individual precision for each coordinate;

4. Individual precision and correlations between the observations (covering also the cases of singular

cofactor matrices).

Direct weighted least squares solutions are proposed in this chapter for the first time for the discussed class

of problems. The general idea of these solutions corresponds to the methodology suggested in (Malissiovas

et al. 2016). It involves a parameterization of the problem and the formation of a Lagrange function that

results in a quadratic or cubic equation. In cases where the problem cannot be transformed to such algebraic

equations, modern iterative algorithms are presented that do not require a linearization of the original problem and are based mainly on the approaches from (York 1966) and (Petrović et al. 1983). The iterative solutions of the developed algorithms can be compared with those from WTLS. Figure 5.1 depicts a flowchart with the proposed solutions for this class of weighted nonlinear least squares problems.

¹ The case of fitting a straight line in 3D will not be discussed further in this chapter in order to avoid repetition, as it has similarities to the problems of fitting a straight line in 2D and fitting a plane in 3D and could therefore be easily solved following the same strategy that is presented for the latter problems.

[Flowchart: the class of nonlinear least-squares problems (Malissiovas et al. 2016) is split into four weighting cases. Constant weights in each direction: scaling of the coordinate system, sophisticated parametrization, Lagrange function, characteristic polynomial. Individual weight for each point: sophisticated parametrization including weights, weighted Lagrange function, characteristic polynomial. Both of these paths end in a direct weighted least-squares solution. Individual weight for each coordinate, and individually weighted and correlated coordinates: weighted Lagrange function, differentiation with respect to all unknowns, reduced normal equations, ending in an iterative weighted least-squares solution without linearization.]

Figure 5.1: Flowchart for possible direct and iterative solutions of a class of nonlinear weighted least squares problems.

5.2 Fitting of a straight line in 2D

One of the first attempts to solve the nonlinear least squares problem of fitting a straight line to a set of

points in 2D, that have been observed with different precisions, goes back to (York, 1966). In that article

the problem has been expressed by a pseudo-cubic equation and a solution has been obtained iteratively

and without any kind of linearization. It was further Williamson (1968) who pointed out that the same

adjustment problem can be formulated either as a pseudo-quadratic or even as a pseudo-linear equation. A

year later York (1968) published an iterative least squares solution including this time correlations between

the observations. A thorough investigation and a detailed explanation of the same problem can be found in

(Petrović et al., 1983) as well.

Schaffrin and Wieser (2008) have presented a weighted TLS algorithm for linear regression, that inspired

Shen et al. (2011) and Amiri-Simkooei and Jazaeri (2012) to develop modern TLS algorithms for solving

the same problem. On the other hand, Neitzel and Petrović (2008) presented a solution within the linearized

GHM, following the traditional geodetic procedure for solving nonlinear least squares problems. This includes


a linearization of the condition equations and an iterative process that stops after a predefined threshold,

as it has been discussed in chapter 2.

The starting point of this investigation is the set of coordinates of points in 2D, assuming that they

have been observed with different precisions. The functional model of this problem can be expressed by

the general form of a straight line in 2D, presented already by equation (4.1) in section 4.2. Including the

necessary residuals in the measured quantities results in the nonlinear condition equations

$$a(x_i + v_{x_i}) + b(y_i + v_{y_i}) + c = 0, \tag{5.1}$$

with $i = 1, \dots, n$, where $n$ is the number of measured points. The selection of an appropriate restriction between the unknown parameters will be considered at a later point in this section. In the case of measurements that have been obtained with different precisions, $\sigma_{y_i}$ for the y-coordinates and $\sigma_{x_i}$ for the x-coordinates, a least squares solution for the unknown line parameters could be based on the minimization of the sum of weighted squared residuals

$$\Omega(v_{x_i}, v_{y_i}) = \sum_{i=1}^{n} \left( p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 \right) \to \min, \tag{5.2}$$

where $p_{x_i}$ is a weight for the residual $v_{x_i}$ and $p_{y_i}$ for $v_{y_i}$. The respective weights have been defined in (Helmert 1924, p. 81) as

$$p_{x_i} = \frac{1}{\sigma_{x_i}^2} \quad \text{and} \quad p_{y_i} = \frac{1}{\sigma_{y_i}^2}. \tag{5.3}$$

Therefore, a least squares solution for fitting a straight line in 2D is investigated for four individual weighting

cases that often occur in practice:

1. Same precision $\sigma_x$ for the coordinates in x direction and $\sigma_y$ in y direction.

2. Individual precision for each point: $\sigma_{x_i} = \sigma_{y_i} \;\; \forall i$.

3. Individual precision for each measured coordinate.

4. Individual precision and correlations between the measured 2D coordinates.

5.2.1 Weighting case 1 - Equally weighted observations in each direction

For the first weighting case the coordinates in x direction have been observed with the same precision $\sigma_x$, and those in y direction with $\sigma_y$. The weights $p_x$ and $p_y$ can be computed from equation (5.3) and the objective function under minimization becomes

$$\Omega(v_{x_i}, v_{y_i}) = \sum_{i=1}^{n} \left( p_x v_{x_i}^2 + p_y v_{y_i}^2 \right) \to \min, \tag{5.4}$$

with

$$\frac{p_x}{p_y} = m, \quad m = \text{constant}. \tag{5.5}$$


This problem has the geometry that is portrayed in Figure 5.2. Thus, it results in the minimization of slanted distances from each observed point to the requested line. However, it can be observed that the ratio between the slanted and the orthogonal distances (i.e. the angles between the orthogonal and the slanted distances) will be constant for every point. From a geometric perspective, the postulated weights can be seen as a homogeneous scale of the coordinate system in both x and y direction.

Figure 5.2: Example of fitting a straight line to points in 2D, with observed x and y coordinates and $p_x$, $p_y$ individual constant weights for each coordinate axis.

A direct approach is presented here for solving the discussed nonlinear weighted least squares problem. The strategy that will be followed involves the scaling of the observed coordinates beforehand with

$$x_i^s = x_i \sqrt{p_x} \quad \text{and} \quad y_i^s = y_i \sqrt{p_y}, \tag{5.6}$$

with the superscript “s” indicating scale. The scaled coordinates $x_i^s$ and $y_i^s$ can be utilized to derive the requested line in a different coordinate system, where the weights of the observations are equal. In this line of thinking the residuals of the point coordinates will also be scaled accordingly, with

$$v_{x_i}^s = v_{x_i} \sqrt{p_x} \quad \text{and} \quad v_{y_i}^s = v_{y_i} \sqrt{p_y}. \tag{5.7}$$


Substituting the scaled coordinates and their residuals from equations (5.6) and (5.7) into the condition equations (5.1) yields

$$a \frac{1}{\sqrt{p_x}} \left( x_i^s + v_{x_i}^s \right) + b \frac{1}{\sqrt{p_y}} \left( y_i^s + v_{y_i}^s \right) + c = 0. \tag{5.8}$$

Introducing the auxiliary scaled line parameters

$$a^s = a \frac{1}{\sqrt{p_x}}, \qquad b^s = b \frac{1}{\sqrt{p_y}}, \qquad c^s = c, \tag{5.9}$$

into the condition equations (5.8) yields an alternative functional model to equation (5.1), expressed by

$$a^s \left( x_i^s + v_{x_i}^s \right) + b^s \left( y_i^s + v_{y_i}^s \right) + c^s = 0. \tag{5.10}$$

A meaningful constraint² for the solution of this adjustment problem can be chosen here as

$$(a^s)^2 + (b^s)^2 = 1. \tag{5.11}$$

The least squares criterion is employed for a solution of the unknown line parameters by minimizing the sum of scaled squared residuals

$$\sum_{i=1}^{n} \left( (v_{x_i}^s)^2 + (v_{y_i}^s)^2 \right) \to \min. \tag{5.12}$$

Thus, it has been shown that the discussed weighted least squares problem can be transformed into a problem

with equal weights in the scaled coordinates, as it is depicted in Figure 5.3.

5.2.1.1 Direct least squares solution in a scaled coordinate system

The sum of squared scaled residuals can be replaced by the sum of squared orthogonal distances

$$\Omega(v_{x_i}^s, v_{y_i}^s) = \sum_{i=1}^{n} \left( (v_{x_i}^s)^2 + (v_{y_i}^s)^2 \right) = \sum_{i=1}^{n} D_i^2. \tag{5.13}$$

The orthogonal distances of the points to the straight line, cf. equation (4.6), can be expressed for this problem by

$$D_i = \frac{a^s x_i^s + b^s y_i^s + c^s}{\sqrt{(a^s)^2 + (b^s)^2}}, \tag{5.14}$$

which under the constraint (5.11) become

$$D_i = a^s x_i^s + b^s y_i^s + c^s. \tag{5.15}$$

² A “meaningful” constraint is chosen here, in the sense that it will lead to simpler equations for the orthogonal distances

of the 2D points to the requested line, as it has been discussed in subsection 4.2.


Figure 5.3: Example of fitting a straight line to the scaled points in 2D.

Therefore, using the scaled coordinates it is possible to obtain a direct least squares solution for the unknown

line parameters, following the same procedure as the one presented in section 4.2.1.

Computation of the line parameters in the original coordinate system

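The original line parameters can be computed by substituting the estimated “scaled” line parameters $\hat{a}^s$, $\hat{b}^s$ and $\hat{c}^s$ into equation (5.9):

$$\hat{a} = \hat{a}^s \sqrt{p_x}, \qquad \hat{b} = \hat{b}^s \sqrt{p_y}, \qquad \hat{c} = \hat{c}^s. \tag{5.16}$$

However, this solution has been restricted to

$$(a^s)^2 + (b^s)^2 = \left( a \frac{1}{\sqrt{p_x}} \right)^2 + \left( b \frac{1}{\sqrt{p_y}} \right)^2 = 1. \tag{5.17}$$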


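The least squares solution which restricts the line parameters to $a^2 + b^2 = 1$ can be easily derived by multiplying the original condition equations with the term

$$\frac{1}{\sqrt{a^2 + b^2}}, \tag{5.18}$$

which yields

$$\frac{a}{\sqrt{a^2 + b^2}} (x_i + v_{x_i}) + \frac{b}{\sqrt{a^2 + b^2}} (y_i + v_{y_i}) + \frac{c}{\sqrt{a^2 + b^2}} = 0. \tag{5.19}$$

Using the information of equation (5.16), the line parameters which are restricted to $a^2 + b^2 = 1$ can be computed by

$$\hat{a} = \frac{\hat{a}^s \sqrt{p_x}}{\sqrt{\left( \hat{a}^s \sqrt{p_x} \right)^2 + \left( \hat{b}^s \sqrt{p_y} \right)^2}}, \qquad
\hat{b} = \frac{\hat{b}^s \sqrt{p_y}}{\sqrt{\left( \hat{a}^s \sqrt{p_x} \right)^2 + \left( \hat{b}^s \sqrt{p_y} \right)^2}}, \qquad
\hat{c} = \frac{\hat{c}^s}{\sqrt{\left( \hat{a}^s \sqrt{p_x} \right)^2 + \left( \hat{b}^s \sqrt{p_y} \right)^2}}. \tag{5.20}$$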
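The scaling strategy of this subsection can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function names are hypothetical, and the equal-weight orthogonal fit is done here through the eigenvector of the centred scatter matrix, which stands in for the direct procedure of section 4.2.1.

```python
import numpy as np

def fit_line_orthogonal(x, y):
    """Equal-weight orthogonal fit of a*x + b*y + c = 0 with a^2 + b^2 = 1."""
    xm, ym = x.mean(), y.mean()
    dx, dy = x - xm, y - ym
    # Centred scatter matrix: the eigenvector of its smallest eigenvalue
    # is the line normal (a, b) that minimizes the orthogonal distances.
    S = np.array([[dx @ dx, dx @ dy],
                  [dx @ dy, dy @ dy]])
    _, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
    a, b = vecs[:, 0]
    c = -(a * xm + b * ym)
    return a, b, c

def fit_line_equal_weights_per_axis(x, y, sigma_x, sigma_y):
    """Weighting case 1: one precision per coordinate axis."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    px, py = 1.0 / sigma_x**2, 1.0 / sigma_y**2            # weights, eq. (5.3)
    xs, ys = x * np.sqrt(px), y * np.sqrt(py)              # scaling, eq. (5.6)
    a_s, b_s, c_s = fit_line_orthogonal(xs, ys)            # fit in scaled system
    a, b, c = a_s * np.sqrt(px), b_s * np.sqrt(py), c_s    # back-substitution, eq. (5.16)
    norm = np.hypot(a, b)                                  # normalization, eq. (5.20)
    return a / norm, b / norm, c / norm
```

The sign of the returned triple (a, b, c) is arbitrary, since multiplying all three parameters by -1 describes the same line.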

5.2.2 Weighting case 2 - Individually weighted points in 2D

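In the second weighting case under investigation, each measured point has been obtained with individual precision

$$\sigma_{x_i} = \sigma_{y_i} \quad \text{and} \quad p_{x_i} = p_{y_i} = p_i \quad \forall i. \tag{5.21}$$

The ratio between the weights for each point is constant with

$$\frac{p_{x_i}}{p_{y_i}} = 1 \quad \forall i. \tag{5.22}$$

Taking into consideration (5.21), the objective function (5.2) can be reformulated to

$$\sum_{i=1}^{n} \left( p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 \right) = \sum_{i=1}^{n} p_i \left( v_{x_i}^2 + v_{y_i}^2 \right) \to \min. \tag{5.23}$$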

5.2.2.1 Direct weighted least squares solution

In the case of individually weighted points, the sum of weighted squared residuals can be expressed equivalently by the weighted squared orthogonal distances of each point to the requested line:

$$p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 = p_i \left( v_{x_i}^2 + v_{y_i}^2 \right) = p_i D_i^2. \tag{5.24}$$


The orthogonal distances are

$$D_i = \frac{a x_i + b y_i + c}{\sqrt{a^2 + b^2}}, \tag{5.25}$$

which after taking into account the restriction between the unknown line parameters

$$a^2 + b^2 = 1, \tag{5.26}$$

can be simplified to

$$D_i = a x_i + b y_i + c. \tag{5.27}$$

Thus, the objective function (5.23) can be equivalently written as

$$\Omega(a, b, c) = \sum_{i=1}^{n} p_i \left( v_{x_i}^2 + v_{y_i}^2 \right) = \sum_{i=1}^{n} p_i D_i^2 = \sum_{i=1}^{n} p_i \left( a x_i + b y_i + c \right)^2. \tag{5.28}$$

This adjustment problem is depicted in Figure 5.4.

Figure 5.4: Example of fitting a straight line to points in 2D with x and y measured coordinates and $p_{x_i}$, $p_{y_i}$ being equal weights for each point.


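We seek a least squares solution for the unknown line parameters $a$, $b$ and $c$ that minimizes (5.28), subject to the restriction (5.26). Consequently, the Lagrangian

$$K(a, b, c, k) = \Omega(a, b, c) - k \left( a^2 + b^2 - 1 \right), \tag{5.29}$$

can be written, with $k$ denoting the Lagrange multiplier. Differentiating the function $K$ with respect to the unknown parameters and setting the partial derivatives to zero results in the system of normal equations

$$\frac{\partial K}{\partial a} = 2 \left[ a \left( \sum_{i=1}^{n} p_i x_i^2 - k \right) + b \sum_{i=1}^{n} p_i y_i x_i + c \sum_{i=1}^{n} p_i x_i \right] = 0, \tag{5.30}$$

$$\frac{\partial K}{\partial b} = 2 \left[ a \sum_{i=1}^{n} p_i y_i x_i + b \left( \sum_{i=1}^{n} p_i y_i^2 - k \right) + c \sum_{i=1}^{n} p_i y_i \right] = 0, \tag{5.31}$$

$$\frac{\partial K}{\partial c} = 2 \left[ a \sum_{i=1}^{n} p_i x_i + b \sum_{i=1}^{n} p_i y_i + c \sum_{i=1}^{n} p_i \right] = 0 \tag{5.32}$$

and

$$\frac{\partial K}{\partial k} = - \left( a^2 + b^2 - 1 \right) = 0. \tag{5.33}$$

Rearranging equation (5.32) leads to

$$c = -a \frac{\sum_{i=1}^{n} p_i x_i}{\sum_{i=1}^{n} p_i} - b \frac{\sum_{i=1}^{n} p_i y_i}{\sum_{i=1}^{n} p_i}. \tag{5.34}$$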

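Introducing the expression for $c$ into the normal equations (5.30) and (5.31) yields the reduced normal equations

$$a \left[ \sum_{i=1}^{n} p_i x_i^2 - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i x_i \right)^2 - k \right] + b \left[ \sum_{i=1}^{n} p_i y_i x_i - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i y_i \right) \right] = 0 \tag{5.35}$$

and

$$a \left[ \sum_{i=1}^{n} p_i y_i x_i - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i y_i \right) \right] + b \left[ \sum_{i=1}^{n} p_i y_i^2 - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i y_i \right)^2 - k \right] = 0. \tag{5.36}$$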


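Equations (5.35) and (5.36) form a homogeneous system of linear equations with respect to the unknown line parameters. For a nontrivial solution the determinant of the equation system must be equal to zero,

$$\begin{vmatrix}
\displaystyle \sum_{i=1}^{n} p_i x_i^2 - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i x_i \right)^2 - k &
\displaystyle \sum_{i=1}^{n} p_i y_i x_i - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i y_i \right) \\[2ex]
\displaystyle \sum_{i=1}^{n} p_i y_i x_i - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i y_i \right) &
\displaystyle \sum_{i=1}^{n} p_i y_i^2 - \frac{1}{\sum_{i=1}^{n} p_i} \left( \sum_{i=1}^{n} p_i y_i \right)^2 - k
\end{vmatrix} = 0, \tag{5.37}$$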

which leads to a quadratic characteristic equation with two real and positive solutions for $k$. The minimum solution, denoted by $k_{\min}$, corresponds to the minimum of the Lagrange function (5.29). The solution for the unknown line parameters $a$ and $b$ can be computed by substituting the Lagrange multiplier $k_{\min}$ into equations (5.35)-(5.36), subject to the chosen restriction, or by transforming the equation system into an eigenvalue problem.
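The direct solution for individually weighted points can be sketched as follows. This is a minimal illustration, assuming NumPy arrays as input; the function name and the way the null vector of the 2x2 system is picked are choices made here and are not taken from the text.

```python
import numpy as np

def fit_line_weighted_points(x, y, p):
    """Weighting case 2: one weight p_i per point, a^2 + b^2 = 1."""
    x, y, p = (np.asarray(v, dtype=float) for v in (x, y, p))
    P = p.sum()
    sx, sy = p @ x, p @ y
    # Coefficients of the reduced normal equations (5.35)-(5.36) without k
    n11 = p @ x**2 - sx**2 / P
    n12 = p @ (x * y) - sx * sy / P
    n22 = p @ y**2 - sy**2 / P
    # Characteristic equation (5.37):
    # k^2 - (n11 + n22) k + (n11 n22 - n12^2) = 0, smaller root taken
    k_min = 0.5 * (n11 + n22) - np.hypot(0.5 * (n11 - n22), n12)
    # Nontrivial solution of the singular 2x2 system; use the row that
    # gives the numerically larger vector (both vanish only if n12 = 0
    # and n11 = n22, where the line direction is undetermined).
    cand1 = np.array([n12, k_min - n11])
    cand2 = np.array([k_min - n22, n12])
    v = cand1 if np.linalg.norm(cand1) >= np.linalg.norm(cand2) else cand2
    a, b = v / np.linalg.norm(v)          # restriction (5.26)
    c = -(a * sx + b * sy) / P            # eq. (5.34)
    return a, b, c, k_min
```

Under these assumptions the returned `k_min` equals the minimized objective (5.28), which gives a quick plausibility check for a computed solution.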

5.2.3 Weighting case 3 - Individually weighted 2D coordinates

The third weighting case under investigation is more general than the previous two, as the measured coordinates have been observed with individual precisions. As far as is known, a direct least squares solution for this nonlinear problem has not been found. Iterative solutions, however, have been presented by various authors in the past. Some of those do not make use of any linearization of the original problem, for example York (1966), York (1968) or Petrović et al. (1983). In the next subsection it is shown that a direct solution for the discussed adjustment problem is not possible. However, an iterative procedure is presented that is based to a large extent on the derivations of (Petrović et al. 1983).

Minimizing the sum of weighted squared residuals

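Starting from the condition equations (5.1) and the objective function (5.2), it is possible to build the Lagrangian

$$K(v_{x_i}, v_{y_i}, a, b, c, k_i) = \Omega(v_{x_i}, v_{y_i}) - 2 \sum_{i=1}^{n} k_i \left( a (x_i + v_{x_i}) + b (y_i + v_{y_i}) + c \right), \tag{5.38}$$

with $k_i$ denoting the Lagrange multipliers. The normal equation system of this adjustment problem can be described by

$$\frac{\partial K}{\partial v_{x_i}} = 2 p_{x_i} v_{x_i} - 2 a k_i = 0 \;\Rightarrow\; v_{x_i} = \frac{a k_i}{p_{x_i}}, \tag{5.39}$$

$$\frac{\partial K}{\partial v_{y_i}} = 2 p_{y_i} v_{y_i} - 2 b k_i = 0 \;\Rightarrow\; v_{y_i} = \frac{b k_i}{p_{y_i}}, \tag{5.40}$$

$$\frac{\partial K}{\partial k_i} = -2 \left[ a (x_i + v_{x_i}) + b (y_i + v_{y_i}) + c \right] = 0, \tag{5.41}$$

$$\frac{\partial K}{\partial a} = -2 \sum_{i=1}^{n} k_i (x_i + v_{x_i}) = 0, \tag{5.42}$$

$$\frac{\partial K}{\partial b} = -2 \sum_{i=1}^{n} k_i (y_i + v_{y_i}) = 0, \tag{5.43}$$

$$\frac{\partial K}{\partial c} = -2 \sum_{i=1}^{n} k_i = 0. \tag{5.44}$$

Equations (5.39) to (5.44) form a nonlinear system of $3n + 3$ equations. Introducing the residuals from equations (5.39) and (5.40) into (5.41) yields the expression for the Lagrange multipliers

$$k_i = -w_i (a x_i + b y_i + c), \tag{5.45}$$

with the auxiliary weighting factors³

$$w_i = \left( \frac{a^2}{p_{x_i}} + \frac{b^2}{p_{y_i}} \right)^{-1}. \tag{5.46}$$

Introducing $k_i$ into equation (5.44) results in the expression for parameter

$$c = - \frac{a \sum_{i=1}^{n} w_i x_i + b \sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}. \tag{5.47}$$

Additionally, substituting $k_i$ into equations (5.39) and (5.40) yields explicit expressions for the residuals

$$v_{x_i} = - \frac{w_i}{p_{x_i}} a (a x_i + b y_i + c) \tag{5.48}$$

and

$$v_{y_i} = - \frac{w_i}{p_{y_i}} b (a x_i + b y_i + c). \tag{5.49}$$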

³ It is interesting to note the relationship between the developed weighting factors $w_i$, the coefficients “$L_i$” and the weighting factors “$W_i$” from (Deming 1964, p. 134, 181) and (York 1966).


Utilizing the developed expressions for $v_{y_i}$ and $v_{x_i}$, a minimum of the original objective function $\Omega$ can be found, instead of minimizing the Lagrangian $K$. This approach makes it possible to show why a direct least squares solution is not possible for this weighting case. Substituting the developed residuals directly into the objective function (5.2) yields

$$\begin{aligned}
\Omega(v_{x_i}, v_{y_i}) &= \sum_{i=1}^{n} \left( p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 \right) \\
&= \sum_{i=1}^{n} \left[ p_{x_i} \left( \frac{w_i}{p_{x_i}} a (a x_i + b y_i + c) \right)^2 + p_{y_i} \left( \frac{w_i}{p_{y_i}} b (a x_i + b y_i + c) \right)^2 \right] \\
&= \sum_{i=1}^{n} \frac{\left( \frac{a^2}{p_{x_i}} + \frac{b^2}{p_{y_i}} \right) (a x_i + b y_i + c)^2}{\left( \frac{a^2}{p_{x_i}} + \frac{b^2}{p_{y_i}} \right)^2} \\
&= \sum_{i=1}^{n} \frac{1}{\frac{a^2}{p_{x_i}} + \frac{b^2}{p_{y_i}}} (a x_i + b y_i + c)^2 \\
&= \sum_{i=1}^{n} w_i (a x_i + b y_i + c)^2 = \sum_{i=1}^{n} D_i^2.
\end{aligned} \tag{5.50}$$

From the last equation it can be seen that the problem of minimizing the sum of weighted squared residuals can be transformed into the minimization of the slanted distances

$$D_i = \frac{1}{\sqrt{\frac{a^2}{p_{x_i}} + \frac{b^2}{p_{y_i}}}} (a x_i + b y_i + c) = \sqrt{w_i} \, (a x_i + b y_i + c). \tag{5.51}$$

A direct solution for this weighting case is not possible, as there is no restriction for the unknown parameters that could lead to a linear formulation of the distances in equation (5.51). From a different perspective it could be said that a direct least squares solution is possible only when the auxiliary weighting factors $w_i$ in equation (5.46) can be set equal to a constant value by selecting a meaningful restriction between the unknown parameters. The geometry of this problem is depicted in Figure 5.5.
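The relation between the residuals and the slanted distances can be checked numerically for any trial line. The following sketch assumes NumPy arrays and a hypothetical helper name; it evaluates the auxiliary factors, the residuals that satisfy the stationarity conditions (5.39)-(5.41), and both forms of the objective function.

```python
import numpy as np

def residuals_for_line(x, y, px, py, a, b, c):
    """Residuals of the individually weighted problem for given a, b, c."""
    r = a * x + b * y + c                      # misclosure of each condition
    w = 1.0 / (a**2 / px + b**2 / py)          # auxiliary factors, eq. (5.46)
    vx = -(w / px) * a * r                     # eq. (5.48)
    vy = -(w / py) * b * r                     # eq. (5.49)
    return w, r, vx, vy

# Illustrative numbers only (not taken from the text)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.2, 2.7, 5.1, 7.3])
px = np.array([4.0, 1.0, 25.0, 2.0])
py = np.array([1.0, 9.0, 4.0, 16.0])
a, b, c = -2.0, 1.0, -1.0

w, r, vx, vy = residuals_for_line(x, y, px, py, a, b, c)
omega_residuals = np.sum(px * vx**2 + py * vy**2)   # eq. (5.2)
omega_distances = np.sum(w * r**2)                  # last line of eq. (5.50)
```

Both values of the objective function agree, and the adjusted points satisfy the condition equations (5.1). The factors `w` depend on `a` and `b`, which is why no single restriction turns the distances into a linear expression.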

5.2.3.1 Iterative least squares solution without linearization

An iterative solution for the discussed adjustment problem can be obtained without performing any kind of linearization, similar to (York 1966) and (Petrović et al. 1983). The unknown line parameters $a$ and $b$ can be estimated by introducing the residuals from equations (5.48) and (5.49) and $k_i$ from (5.45) into equations (5.42) and (5.43).

Figure 5.5: Example of fitting a straight line to points in 2D with observed $x_i$ and $y_i$ coordinates and $p_{x_i}$, $p_{y_i}$ individual weights for the coordinates.

This yields the reduced normal equations

$$a \sum_{i=1}^{n} \frac{k_i^2}{p_{x_i}} = \sum_{i=1}^{n} w_i (a x_i + b y_i + c) x_i \tag{5.52}$$

and

$$b \sum_{i=1}^{n} \frac{k_i^2}{p_{y_i}} = \sum_{i=1}^{n} w_i (a x_i + b y_i + c) y_i. \tag{5.53}$$

Rearranging equations (5.52) and (5.53) appropriately, after inserting $c$ from (5.47), gives

$$a f_1 = b f_2 \tag{5.54}$$

and

$$b f_3 = a f_2, \tag{5.55}$$


with the respective quantities being

$$\begin{aligned}
f_1 &= \sum_{i=1}^{n} \frac{k_i^2}{p_{x_i}} - \sum_{i=1}^{n} w_i x_i^2 + \frac{1}{\sum_{i=1}^{n} w_i} \left( \sum_{i=1}^{n} w_i x_i \right)^2, \\
f_2 &= \sum_{i=1}^{n} (w_i x_i y_i) - \frac{1}{\sum_{i=1}^{n} w_i} \left( \sum_{i=1}^{n} w_i y_i \sum_{i=1}^{n} w_i x_i \right), \\
f_3 &= \sum_{i=1}^{n} \frac{k_i^2}{p_{y_i}} - \sum_{i=1}^{n} w_i y_i^2 + \frac{1}{\sum_{i=1}^{n} w_i} \left( \sum_{i=1}^{n} w_i y_i \right)^2.
\end{aligned} \tag{5.56}$$

Furthermore, a simplification of the problem is feasible by setting a restriction between the unknown line parameters $a$ and $b$. It is possible to take into account the general restriction $a^2 + b^2 = 1$; however, for convenience the problem is restricted here to $b = 1$. Thus, a solution for the remaining unknown is

$$a = \frac{f_2}{f_1}, \tag{5.57}$$

with $f_1$ and $f_2$ themselves depending on the unknown parameters through $w_i$ and $k_i$, according to their definition in (5.56). Thus, equation (5.57) becomes pseudo-linear after selecting an approximate value $a^0$. The estimated line parameters can be utilized as new starting values in each iteration step, until a break-off condition is met. As a linearization has not been applied in any step of the adjustment, this iterative procedure can be terminated after the condition of the “computational error” is fulfilled, as it was presented in chapter 3. Algorithm 1 has been developed for estimating the weighted least squares solution for fitting a line to a set of points in 2D, based on the presented procedure of this subsection.

Algorithm 1 Least squares fitting of a straight line in 2D with general weights

1: Choose approximate value for $a^0$.

2: Define parameter $b = 1$.

3: Set threshold $\varepsilon$ for the break-off condition of the iteration process.

4: Set parameter $d_a = |\hat{a} - a^0| = \infty$, for entering the iteration process.

5: while $d_a > \varepsilon$ do

6: Compute parameters $w_i$, $k_i$ and estimate $\hat{c}$.

7: Compute the coefficients $f_1$ and $f_2$.

8: Estimate parameter $\hat{a}$.

9: Compute parameter $d_a = |\hat{a} - a^0|$.

10: Update the approximate values with the estimated ones ($a^0 = \hat{a}$).

11: end while

12: return $\hat{a}$ and $\hat{c}$, with $b = 1$.
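The steps of Algorithm 1 can be sketched in Python as follows. This is an illustrative reading of the algorithm, not a reference implementation: the function name, the default threshold and the iteration cap `max_iter` are assumptions added here, and the multipliers are computed as they follow from substituting (5.39) and (5.40) into (5.41).

```python
import numpy as np

def fit_line_general_weights(x, y, px, py, a0, eps=1e-12, max_iter=100):
    """Iterative fit of a*x + y + c = 0 (b = 1) with individual weights."""
    x, y, px, py = (np.asarray(v, dtype=float) for v in (x, y, px, py))

    def estimate_c(a, w):
        return -(a * (w @ x) + (w @ y)) / w.sum()       # eq. (5.47), b = 1

    a = float(a0)
    for _ in range(max_iter):                            # guard against stalling
        w = 1.0 / (a**2 / px + 1.0 / py)                 # eq. (5.46), b = 1
        c = estimate_c(a, w)
        k = -w * (a * x + y + c)                         # Lagrange multipliers
        sw = w.sum()
        f1 = np.sum(k**2 / px) - w @ x**2 + (w @ x)**2 / sw
        f2 = w @ (x * y) - (w @ y) * (w @ x) / sw        # eq. (5.56)
        a_new = f2 / f1                                  # eq. (5.57)
        da = abs(a_new - a)
        a = a_new                                        # update a0
        if da <= eps:
            break
    # Final c consistent with the converged a
    c = estimate_c(a, 1.0 / (a**2 / px + 1.0 / py))
    return a, c
```

With `b = 1` the fitted line reads `y = -a*x - c`, so a starting value `a0` can be taken, for example, as the negative slope of an ordinary regression of y on x.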


Iterative procedure of pseudo-quadratic and pseudo-cubic equations

For the sake of a complete overview of the iterative algorithms that can produce the weighted least squares

solution for fitting a straight line in 2D, two alternatives to the developed pseudo-linear equations are

presented here. A thorough analysis of (5.52) and (5.53) can lead to a pseudo-cubic equation instead, as it

has been shown by York (1966), which reads

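$$a^3 \sum_{i=1}^{n} \frac{w_i^2 x_i'^2}{p_{x_i}} + 2 a^2 \sum_{i=1}^{n} \frac{w_i^2 x_i' y_i'}{p_{x_i}} - a \left( \sum_{i=1}^{n} w_i x_i'^2 - \sum_{i=1}^{n} \frac{w_i^2 y_i'^2}{p_{x_i}} \right) - \sum_{i=1}^{n} w_i x_i' y_i' = 0, \tag{5.58}$$

restricted to $b = 1$, which makes the signs of the terms consistent with equations (5.52) to (5.57). The coordinates $x'$ and $y'$ are the coordinates reduced to the pseudo-centre of mass and can be computed by

$$x_i' = x_i - \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \quad \text{and} \quad y_i' = y_i - \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}. \tag{5.59}$$

The term “pseudo-centre” originates from (Deming 1964, p. 134, 181) and owes its name to the auxiliary parameters $w_i$, which change their values in each iteration step. Taking into account the comments of Williamson (1968), this pseudo-cubic equation can be alternatively expressed as a pseudo-quadratic or even as a pseudo-linear one. Such a pseudo-quadratic equation has been presented in (York, 1968) including correlations between the observations, which is equivalent to the development of (Petrović et al., 1983) when setting the correlations equal to zero and is expressed by

$$a^2 \sum_{i=1}^{n} \frac{w_i^2}{p_{x_i}} (x_i' y_i') + a \left( \sum_{i=1}^{n} \frac{w_i^2}{p_{x_i}} y_i'^2 - \sum_{i=1}^{n} \frac{w_i^2}{p_{y_i}} x_i'^2 \right) - \sum_{i=1}^{n} \frac{w_i^2}{p_{y_i}} x_i' y_i' = 0, \tag{5.60}$$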

under the restriction of b= 1. Iterative algorithms can be easily built for the least squares solution of the

line parameters, by making use either of the pseudo-cubic (5.58) or the pseudo-quadratic equation (5.60).

5.2.4 Weighting case 4 - Individually weighted and correlated 2D coordinates

The developed iterative procedure of the previous subsection can be extended to include correlations between

the observed quantities. Correlations are often considered in geodetic applications, as it is typical that the 2D

coordinates of points are not the original measured quantities, but they have been obtained for example by

polar measurements. In another example, these coordinates are the outcome of some previous adjustment,

for instance of a 2D network. The precision of the adjusted 2D coordinates is obtained in most cases from a

linear error propagation that sometimes results in a cofactor matrix that includes correlations between the

observations, depending on the adjustment problem.

It is very important to point out here that the precisions of the coordinates coming from a linearized

error propagation are only an approximation, as it has been discussed in section 2.3. Therefore, when the

original measurements are polar coordinates, then these should be utilized for obtaining a rigorous least

squares solution. However, the adjusted 2D Cartesian coordinates of points and their approximated stochastic

model from the error propagation are often used in practice. Therefore, this subsection is dedicated to

those practical least squares solutions for fitting straight lines in 2D, taking into account the approximated

variances and covariances of the 2D point coordinates.
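How correlated Cartesian coordinates arise from polar measurements can be illustrated with a small sketch of the linear error propagation. The setup is an assumption made for this example: a distance `s` and a direction `t` (in radians), measured uncorrelated with standard deviations `sigma_s` and `sigma_t`, and the convention x = s cos t, y = s sin t.

```python
import numpy as np

def polar_to_cartesian_cov(s, t, sigma_s, sigma_t):
    """First-order (linearized) covariance matrix of (x, y) from (s, t)."""
    # Jacobian of x = s*cos(t), y = s*sin(t) with respect to (s, t)
    J = np.array([[np.cos(t), -s * np.sin(t)],
                  [np.sin(t),  s * np.cos(t)]])
    C_polar = np.diag([sigma_s**2, sigma_t**2])   # uncorrelated polar observations
    return J @ C_polar @ J.T                      # linear error propagation

C = polar_to_cartesian_cov(s=10.0, t=np.pi / 4, sigma_s=0.02, sigma_t=0.001)
rho = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])        # correlation between x and y
```

The off-diagonal element of `C` is generally nonzero, so the derived coordinates are correlated even though the original observations are not. As stated above, this covariance matrix is only a first-order approximation.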


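It would be beneficial at this point to introduce vector/matrix notation, in order to derive simpler equations for the solution of the adjustment problem. Firstly, the cofactor matrix $\mathbf{Q}_{LL}$ is given or obtained from a previous adjustment and can be written as

$$\mathbf{Q}_{LL} = \begin{bmatrix} \mathbf{Q}_{xx} & \mathbf{Q}_{xy} \\ \mathbf{Q}_{yx} & \mathbf{Q}_{yy} \end{bmatrix}, \quad \text{with } \mathbf{Q}_{xy} = \mathbf{Q}_{yx}^T. \tag{5.61}$$

$\mathbf{Q}_{xx}$ and $\mathbf{Q}_{yy}$ are the cofactor matrices of the measured 2D coordinates and matrices $\mathbf{Q}_{xy}$, $\mathbf{Q}_{yx}$ hold the correlations between the coordinates. The respective weight matrix is

$$\mathbf{P} = \mathbf{Q}_{LL}^{-1} = \begin{bmatrix} \mathbf{P}_{xx} & \mathbf{P}_{xy} \\ \mathbf{P}_{yx} & \mathbf{P}_{yy} \end{bmatrix}, \quad \text{with } \mathbf{P}_{xy} = \mathbf{P}_{yx}^T. \tag{5.62}$$

The nonlinear condition equations (5.1) can be expressed equivalently in vector notation by

$$a (\mathbf{x}_c + \mathbf{v}_x) + b (\mathbf{y}_c + \mathbf{v}_y) + c \, \mathbf{e} = \mathbf{0}, \tag{5.63}$$

with vectors $\mathbf{x}_c$ and $\mathbf{y}_c$ listing the coordinates of the measured points

$$\mathbf{x}_c = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \mathbf{y}_c = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \tag{5.64}$$

and vectors $\mathbf{v}_x$ and $\mathbf{v}_y$ carrying the corresponding residuals

$$\mathbf{v}_x = \begin{bmatrix} v_{x_1} \\ v_{x_2} \\ \vdots \\ v_{x_n} \end{bmatrix}, \quad \mathbf{v}_y = \begin{bmatrix} v_{y_1} \\ v_{y_2} \\ \vdots \\ v_{y_n} \end{bmatrix}. \tag{5.65}$$

Vector $\mathbf{e}$ is a vector of ones

$$\mathbf{e} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \tag{5.66}$$

with its dimension being equal to the number $n$ of measured points. A least squares solution of the problem can be derived by minimizing an objective function expressed in matrix notation by

$$\Omega(\mathbf{v}_x, \mathbf{v}_y) = \mathbf{v}_x^T \mathbf{P}_{xx} \mathbf{v}_x + \mathbf{v}_y^T \mathbf{P}_{yy} \mathbf{v}_y + 2 \, \mathbf{v}_x^T \mathbf{P}_{xy} \mathbf{v}_y. \tag{5.67}$$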


5.2.4.1 Iterative least squares solution without linearization

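Combining the developed condition equations (5.63) and the objective function (5.67) leads to the Lagrange function

$$K(a, b, c, \mathbf{v}_x, \mathbf{v}_y, \mathbf{k}) = \Omega(\mathbf{v}_x, \mathbf{v}_y) - 2 \mathbf{k}^T \left[ a (\mathbf{x}_c + \mathbf{v}_x) + b (\mathbf{y}_c + \mathbf{v}_y) + c \, \mathbf{e} \right], \tag{5.68}$$

with $\mathbf{k}$ denoting the vector of Lagrange multipliers. Following the same procedure as in the previous sections, a minimum for $K$ is obtained by differentiation with respect to all unknown parameters and setting the partial derivatives to zero, resulting in the system of nonlinear normal equations

$$\frac{\partial K}{\partial \mathbf{v}_x^T} = 2 \left( \mathbf{P}_{xx} \mathbf{v}_x + \mathbf{P}_{xy} \mathbf{v}_y - a \mathbf{k} \right) = \mathbf{0}, \tag{5.69}$$

$$\frac{\partial K}{\partial \mathbf{v}_y^T} = 2 \left( \mathbf{P}_{yy} \mathbf{v}_y + \mathbf{P}_{yx} \mathbf{v}_x - b \mathbf{k} \right) = \mathbf{0}, \tag{5.70}$$

$$\frac{\partial K}{\partial \mathbf{k}^T} = -2 \left[ a (\mathbf{x}_c + \mathbf{v}_x) + b (\mathbf{y}_c + \mathbf{v}_y) + c \, \mathbf{e} \right] = \mathbf{0}, \tag{5.71}$$

$$\frac{\partial K}{\partial a} = -2 \mathbf{k}^T (\mathbf{x}_c + \mathbf{v}_x) = 0, \tag{5.72}$$

$$\frac{\partial K}{\partial b} = -2 \mathbf{k}^T (\mathbf{y}_c + \mathbf{v}_y) = 0, \tag{5.73}$$

$$\frac{\partial K}{\partial c} = -2 \mathbf{k}^T \mathbf{e} = 0. \tag{5.74}$$

A linearization or approximation of the original problem is avoided here. Explicit expressions for the residuals can be obtained by expressing equations (5.69) and (5.70) using block matrices:

$$\begin{bmatrix} \mathbf{P}_{xx} & \mathbf{P}_{xy} \\ \mathbf{P}_{yx} & \mathbf{P}_{yy} \end{bmatrix} \begin{bmatrix} \mathbf{v}_x \\ \mathbf{v}_y \end{bmatrix} = \begin{bmatrix} a \mathbf{k} \\ b \mathbf{k} \end{bmatrix}. \tag{5.75}$$

The residual vectors can be computed by

$$\begin{bmatrix} \mathbf{v}_x \\ \mathbf{v}_y \end{bmatrix} = \begin{bmatrix} \mathbf{P}_{xx} & \mathbf{P}_{xy} \\ \mathbf{P}_{yx} & \mathbf{P}_{yy} \end{bmatrix}^{-1} \begin{bmatrix} a \mathbf{k} \\ b \mathbf{k} \end{bmatrix} = \begin{bmatrix} \mathbf{Q}_{xx} & \mathbf{Q}_{xy} \\ \mathbf{Q}_{yx} & \mathbf{Q}_{yy} \end{bmatrix} \begin{bmatrix} a \mathbf{k} \\ b \mathbf{k} \end{bmatrix}, \tag{5.76}$$

which yields

$$\mathbf{v}_x = (a \mathbf{Q}_{xx} + b \mathbf{Q}_{xy}) \mathbf{k} \tag{5.77}$$

and

$$\mathbf{v}_y = (a \mathbf{Q}_{yx} + b \mathbf{Q}_{yy}) \mathbf{k}. \tag{5.78}$$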


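Equivalently, the residual vectors can be computed by using the properties of inverting block matrices, as it has been discussed in (Snow 2012, p. 22). Moreover, introducing the expressions for $\mathbf{v}_x$ and $\mathbf{v}_y$ into equation (5.71) yields

$$\left( a^2 \mathbf{Q}_{xx} + b^2 \mathbf{Q}_{yy} + a b \, \mathbf{Q}_{xy} + a b \, \mathbf{Q}_{yx} \right) \mathbf{k} = - (a \mathbf{x}_c + b \mathbf{y}_c + c \, \mathbf{e}). \tag{5.79}$$

Introducing appropriate approximate values $a^0$ and $b^0$ in the left hand side of the last equation allows us to build the auxiliary matrix

$$\mathbf{W} = (a^0)^2 \mathbf{Q}_{xx} + (b^0)^2 \mathbf{Q}_{yy} + a^0 b^0 \mathbf{Q}_{xy} + a^0 b^0 \mathbf{Q}_{yx} \tag{5.80}$$

and write equation (5.79) as

$$\mathbf{W} \mathbf{k} = - (a \mathbf{x}_c + b \mathbf{y}_c + c \, \mathbf{e}). \tag{5.81}$$

In case of regular cofactor matrices, matrix $\mathbf{W}$ is also regular and invertible (the case of singular cofactor matrices in $\mathbf{W}$ is discussed in the next subsection). This leads to the vector of Lagrange multipliers

$$\mathbf{k} = - \mathbf{W}^{-1} (a \mathbf{x}_c + b \mathbf{y}_c + c \, \mathbf{e}). \tag{5.82}$$

Furthermore, substituting vector $\mathbf{k}$ into the normal equation (5.74) results in

$$\begin{aligned}
& \mathbf{e}^T \mathbf{k} = 0 \\
\Rightarrow\; & - \mathbf{e}^T \mathbf{W}^{-1} (a \mathbf{x}_c + b \mathbf{y}_c + c \, \mathbf{e}) = 0 \\
\Rightarrow\; & c = - a \left[ \left( \mathbf{e}^T \mathbf{W}^{-1} \mathbf{e} \right)^{-1} \mathbf{e}^T \mathbf{W}^{-1} \mathbf{x}_c \right] - b \left[ \left( \mathbf{e}^T \mathbf{W}^{-1} \mathbf{e} \right)^{-1} \mathbf{e}^T \mathbf{W}^{-1} \mathbf{y}_c \right].
\end{aligned} \tag{5.83}$$

The solution for the unknown line parameters $a$ and $b$ can be obtained by analyzing further equations (5.72) and (5.73). Taking into account the solution for the vector of Lagrange multipliers from equation (5.82) and for the residual vectors from (5.77) and (5.78) yields the reduced system of equations

$$- \mathbf{x}_c^T \mathbf{W}^{-1} (a \mathbf{x}_c + b \mathbf{y}_c + c \, \mathbf{e}) + \mathbf{k}^T (a \mathbf{Q}_{xx} \mathbf{k} + b \mathbf{Q}_{xy} \mathbf{k}) = 0 \tag{5.84}$$

and

$$- \mathbf{y}_c^T \mathbf{W}^{-1} (a \mathbf{x}_c + b \mathbf{y}_c + c \, \mathbf{e}) + \mathbf{k}^T (a \mathbf{Q}_{xy} \mathbf{k} + b \mathbf{Q}_{yy} \mathbf{k}) = 0. \tag{5.85}$$

Substituting also parameter $c$ from (5.83), the last two equations can be equivalently written as

$$a f_1 + b f_2 = 0 \tag{5.86}$$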


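and

$$b f_3 + a f_2 = 0, \tag{5.87}$$

with the respective quantities being

$$\begin{aligned}
f_1 &= \mathbf{k}^T \mathbf{Q}_{xx} \mathbf{k} - \mathbf{x}_c^T \mathbf{W}^{-1} \mathbf{x}_c + \mathbf{x}_c^T \mathbf{W}^{-1} \mathbf{e} \left( \mathbf{e}^T \mathbf{W}^{-1} \mathbf{e} \right)^{-1} \mathbf{e}^T \mathbf{W}^{-1} \mathbf{x}_c, \\
f_2 &= \mathbf{k}^T \mathbf{Q}_{xy} \mathbf{k} - \mathbf{x}_c^T \mathbf{W}^{-1} \mathbf{y}_c + \mathbf{y}_c^T \mathbf{W}^{-1} \mathbf{e} \left( \mathbf{e}^T \mathbf{W}^{-1} \mathbf{e} \right)^{-1} \mathbf{e}^T \mathbf{W}^{-1} \mathbf{x}_c, \\
f_3 &= \mathbf{k}^T \mathbf{Q}_{yy} \mathbf{k} - \mathbf{y}_c^T \mathbf{W}^{-1} \mathbf{y}_c + \mathbf{y}_c^T \mathbf{W}^{-1} \mathbf{e} \left( \mathbf{e}^T \mathbf{W}^{-1} \mathbf{e} \right)^{-1} \mathbf{e}^T \mathbf{W}^{-1} \mathbf{y}_c.
\end{aligned} \tag{5.88}$$

Equations (5.86) and (5.87) form a system of pseudo-linear equations with two unknown parameters (i.e. the line parameters $a$ and $b$). The term “pseudo-linear” comes from the fact that the quantities $f_1$, $f_2$ and $f_3$ contain matrix $\mathbf{W}$, which has been built using the approximations $a^0$ and $b^0$. Furthermore, a restriction of the problem to $b = 1$ leads to the solution for the remaining unknown parameter

$$a = - \frac{f_2}{f_1}. \tag{5.89}$$

An iterative procedure for estimating the unknown line parameters can be found in Algorithm 2. A similar iterative solution for this weighting case has been presented in (York 1968).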

Algorithm 2 Least squares fitting of a straight line in 2D with general weights and correlations

1: Choose approximate value for $a^0$.

2: Define parameter $b = 1$.

3: Set threshold $\varepsilon$ for the break-off condition of the iteration process.

4: Set parameter $d_a = |\hat{a} - a^0| = \infty$, for entering the iteration process.

5: while $d_a > \varepsilon$ do

6: Compute the auxiliary matrix $\mathbf{W}$ and the vector of Lagrange multipliers $\mathbf{k}$.

7: Estimate parameter $\hat{c}$.

8: Compute the coefficients $f_1$ and $f_2$.

9: Estimate parameter $\hat{a}$.

10: Compute parameter $d_a = |\hat{a} - a^0|$.

11: Update the approximated parameter with the estimated one ($a^0 = \hat{a}$).

12: end while

13: return $\hat{a}$ and $\hat{c}$, with $b = 1$.
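A possible Python sketch of Algorithm 2 for regular cofactor matrices is given below. The function name, the default threshold and the iteration cap are assumptions of this example; linear systems with W are solved directly instead of forming the inverse explicitly, which is a numerical choice made here and not part of the derivation.

```python
import numpy as np

def fit_line_correlated(x, y, Qxx, Qyy, Qxy, a0, eps=1e-12, max_iter=100):
    """Iterative fit of a*x + y + c = 0 (b = 1) with correlated coordinates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    e = np.ones_like(x)

    def solve_with_W(a):
        W = a**2 * Qxx + Qyy + a * (Qxy + Qxy.T)         # eq. (5.80), b0 = 1
        Wx, Wy, We = (np.linalg.solve(W, v) for v in (x, y, e))
        s = e @ We                                       # e^T W^-1 e
        c = -(a * (e @ Wx) + (e @ Wy)) / s               # eq. (5.83)
        return Wx, Wy, We, s, c

    a = float(a0)
    for _ in range(max_iter):                            # guard against stalling
        Wx, Wy, We, s, c = solve_with_W(a)
        k = -(a * Wx + Wy + c * We)                      # eq. (5.82)
        f1 = k @ Qxx @ k - x @ Wx + (x @ We) * (e @ Wx) / s
        f2 = k @ Qxy @ k - x @ Wy + (y @ We) * (e @ Wx) / s   # eq. (5.88)
        a_new = -f2 / f1                                 # eq. (5.89)
        da = abs(a_new - a)
        a = a_new                                        # update a0
        if da <= eps:
            break
    c = solve_with_W(a)[4]                               # c consistent with final a
    return a, c
```

For diagonal `Qxx` and `Qyy` and a zero `Qxy`, the expressions reduce to the scalar quantities of weighting case 3.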

5.2.4.2 Solution for singular cofactor matrices

Postulating regular cofactor matrices in equation (5.81) permitted the inversion of matrix $\mathbf{W}$. This led to the reduction of the normal equations and the solution for the unknown line parameters. However, if the given cofactor matrices are singular, a solution for the vector of Lagrange multipliers cannot be obtained using equation (5.82) whenever matrix $\mathbf{W}$ is no longer invertible. An iterative solution is nevertheless possible for the case of singular cofactor matrices, following a rather different procedure.


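Taking into account equation (5.81), together with the normal equations (5.72)-(5.74), the following system of equations emerges:

$$\begin{aligned}
\mathbf{W} \mathbf{k} &= - (a \mathbf{x}_c + b \mathbf{y}_c + c \, \mathbf{e}), \\
\mathbf{k}^T (\mathbf{x}_c + \mathbf{v}_x) &= 0, \\
\mathbf{k}^T (\mathbf{y}_c + \mathbf{v}_y) &= 0, \\
\mathbf{k}^T \mathbf{e} &= 0.
\end{aligned} \tag{5.90}$$

If the chosen restriction between the unknown parameters is $b = 1$, then the developed normal equations can be written as⁴

$$\begin{aligned}
\mathbf{W} \mathbf{k} &= - (a \mathbf{x}_c + \mathbf{y}_c + c \, \mathbf{e}), \\
\mathbf{k}^T (\mathbf{x}_c + \mathbf{v}_x) &= 0, \\
\mathbf{k}^T \mathbf{e} &= 0.
\end{aligned} \tag{5.91}$$

This equation system can be expressed in a block matrix form, after introducing approximate values $\mathbf{v}_x^0$ for the residual vector:

$$\begin{bmatrix} \mathbf{W} & \mathbf{x}_c & \mathbf{e} \\ (\mathbf{x}_c + \mathbf{v}_x^0)^T & 0 & 0 \\ \mathbf{e}^T & 0 & 0 \end{bmatrix} \begin{bmatrix} \mathbf{k} \\ a \\ c \end{bmatrix} = \begin{bmatrix} -\mathbf{y}_c \\ 0 \\ 0 \end{bmatrix}. \tag{5.92}$$

This can be equivalently written as

$$\mathbf{N} \begin{bmatrix} \mathbf{k} \\ \mathbf{X} \end{bmatrix} = \mathbf{n}, \tag{5.93}$$

with matrices

$$\mathbf{N} = \begin{bmatrix} \mathbf{W} & \mathbf{x}_c & \mathbf{e} \\ (\mathbf{x}_c + \mathbf{v}_x^0)^T & 0 & 0 \\ \mathbf{e}^T & 0 & 0 \end{bmatrix}, \quad \mathbf{n} = \begin{bmatrix} -\mathbf{y}_c \\ 0 \\ 0 \end{bmatrix} \tag{5.94}$$

and the vector of unknown parameters

$$\mathbf{X} = \begin{bmatrix} a \\ c \end{bmatrix}. \tag{5.95}$$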

⁴ The third equation, $\mathbf{k}^T (\mathbf{y}_c + \mathbf{v}_y) = 0$, is not taken into account as we set $b$ as a fixed parameter.


A least squares solution for this adjustment problem can be computed by

$$\begin{bmatrix} \hat{\mathbf{k}} \\ \hat{\mathbf{X}} \end{bmatrix} = \mathbf{N}^{-1} \mathbf{n}, \tag{5.96}$$

without applying any linearization to the original problem.

Solution with a symmetric normal matrix N

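The developed normal matrix $\mathbf{N}$ from equation (5.94) is nonsymmetric. An equivalent solution of the problem that uses a symmetric matrix $\mathbf{N}$ can be obtained by adding the term $a\mathbf{v}_x$ to both sides of equation (5.81)⁵. This leads to the equation system

\[
\begin{aligned}
\mathbf{W}\mathbf{k} + a(\mathbf{x}_c + \mathbf{v}_x) + c\,\mathbf{e} &= -\mathbf{y}_c + a\mathbf{v}_x,\\
\mathbf{k}^T(\mathbf{x}_c + \mathbf{v}_x) &= 0,\\
\mathbf{k}^T\mathbf{e} &= 0
\end{aligned}
\tag{5.97}
\]

or written in block matrix form

\[
\begin{bmatrix} \mathbf{W} & \mathbf{A}\\ \mathbf{A}^T & \mathbf{0} \end{bmatrix}
\begin{bmatrix} \mathbf{k}\\ \mathbf{X} \end{bmatrix}
=
\begin{bmatrix} \mathbf{w}\\ \mathbf{0} \end{bmatrix},
\tag{5.98}
\]

with matrix

\[
\mathbf{A} = \begin{bmatrix} \mathbf{x}_c + \mathbf{v}_x^0, & \mathbf{e} \end{bmatrix}
\tag{5.99}
\]

and vector

\[
\mathbf{w} = -\mathbf{y}_c + a^0\mathbf{v}_x^0.
\tag{5.100}
\]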

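It is worth noticing the similarity of matrix $\mathbf{A}$ and vector $\mathbf{w}$ with those from the GHM, as explained in section 3.1.2, although no linearization has been applied to the functional model. Therefore, a solution of the adjustment problem can be obtained by equation (5.96), after introducing

\[
\mathbf{N} = \begin{bmatrix} \mathbf{W} & \mathbf{A}\\ \mathbf{A}^T & \mathbf{0} \end{bmatrix},
\qquad
\mathbf{n} = \begin{bmatrix} \mathbf{w}\\ \mathbf{0} \end{bmatrix},
\tag{5.101}
\]

where $\mathbf{N}$ is in this case symmetric.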

⁵ A similar procedure for deriving a symmetric normal matrix $\mathbf{N}$ is already known in the literature dealing with WTLS algorithms. An example can be found in (Snow 2012, pp. 25-26).

The inversion of matrix $\mathbf{N}$ depends on the rank deficiency of matrix $\mathbf{W}$, which is not invertible as long as the cofactor matrices of the adjustment problem are singular. An elegant way to ensure that a unique solution exists, even when singular cofactor matrices must be employed, is the fulfilment of the Neitzel-Schaffrin (NS) criterion that was proposed by Neitzel and Schaffrin (2016). As a linearization of the problem has been avoided in the presented solution strategy, a similar criterion is presented in the following that will ensure a unique solution for the unknown parameters. Starting with the equation system from equation (5.98), with

- rank of $\mathbf{W} \le n$, with $n$ = number of condition equations;
- rank of $\mathbf{A} = m$, with $m$ = number of unknown parameters;
- redundancy: rd = n − m;

while the rank of matrix $\mathbf{W}$ will be smaller than $n$ in cases of singular cofactor matrices. Similar to the NS criterion for the GHM, a unique solution will exist if the rank of the augmented matrix $[\mathbf{W}\,|\,\mathbf{A}]$ is equal to the number of condition equations $n$ of the problem (which is equal to the rank of matrix $\mathbf{B}$ in the GHM). A criterion that would ensure a unique solution of the problem can be described in this case by

\[
\operatorname{rank}\left([\mathbf{W}\,|\,\mathbf{A}]\right) = n.
\tag{5.102}
\]
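The rank criterion (5.102) can be checked numerically before the system (5.98) is solved. The following is a minimal sketch, assuming NumPy; the function name `ns_like_criterion_holds` and the example matrices are hypothetical and only illustrate the test.

```python
import numpy as np

def ns_like_criterion_holds(W, A, tol=None):
    """Check rank([W | A]) == n, with n the number of condition equations."""
    n = W.shape[0]
    augmented = np.hstack([W, A])          # augmented matrix [W | A]
    return bool(np.linalg.matrix_rank(augmented, tol=tol) == n)

# Hypothetical example: W is singular (rank 2), but [W | A] has full row rank 3.
W_ok = np.diag([1.0, 1.0, 0.0])
A_ok = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])

# Hypothetical counter-example: W has rank 1, so [W | A] can reach rank 3 at most.
W_bad = np.diag([1.0, 0.0, 0.0, 0.0])
A_bad = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]])
```

The numerical rank depends on the tolerance passed to `np.linalg.matrix_rank`, so for nearly singular cofactor matrices the outcome of the check should be interpreted with care.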

An iterative procedure is presented in Algorithm 3 for the special case of fitting straight lines in 2D with singular cofactor matrices. It must be pointed out that this is a general procedure and can also be used to derive a solution for the previously examined weighting cases for fitting a straight line in 2D. However, this iteration process involves more complicated normal equations and should therefore be preferred only when a singular cofactor matrix is given.

Algorithm 3 Least squares fitting of a straight line in 2D with singular cofactor matrices
1: Choose approximate values for $a^0$, $\mathbf{v}_x^0$ and $\mathbf{v}_y^0$.
2: Define parameter $b = 1$.
3: Set threshold $\varepsilon$ for the break-off condition of the iteration process.
4: Set parameter $d_a = |\hat{a} - a^0| = \infty$, for entering the iteration process.
5: while $d_a > \varepsilon$ do
6: Compute matrices $\mathbf{W}$, $\mathbf{A}$ and vector $\mathbf{w}$.
7: Build matrix $\mathbf{N}$ and vector $\mathbf{n}$.
8: Estimate the vector of unknowns $[\hat{\mathbf{k}}^T, \hat{\mathbf{X}}^T]^T$.
9: Compute the residual vectors $\mathbf{v}_x$ and $\mathbf{v}_y$.
10: Compute parameter $d_a = |\hat{a} - a^0|$.
11: Update the approximate values with the estimated ones, with $a^0 = \hat{a}$, $\mathbf{v}_x^0 = \mathbf{v}_x$ and $\mathbf{v}_y^0 = \mathbf{v}_y$.
12: end while
13: return $\hat{a}$ and $\hat{c}$, with $b = 1$.
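The steps of Algorithm 3 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `fit_line_singular` is hypothetical, and the expressions used for $\mathbf{W}$ and for the residual vectors are assumed to be the 2D analogues of equations (5.183) and (5.179)-(5.180), since equations (5.81)-(5.82) are not reproduced in this part of the text.

```python
import numpy as np

def fit_line_singular(x, y, Qxx, Qyy, Qxy, a0, eps=1e-12, max_iter=100):
    """Iterate the symmetric system (5.98) for the line a*x + y + c = 0 (b = 1)."""
    n = x.size
    e = np.ones(n)
    a = float(a0)
    vx = np.zeros(n)                       # approximate residuals v_x^0
    for _ in range(max_iter):
        # Assumed 2D analogue of (5.183), evaluated with b = 1
        W = a * a * Qxx + a * (Qxy + Qxy.T) + Qyy
        A = np.column_stack([x + vx, e])   # (5.99)
        w = -y + a * vx                    # (5.100)
        N = np.block([[W, A], [A.T, np.zeros((2, 2))]])   # (5.101)
        rhs = np.concatenate([w, np.zeros(2)])
        sol = np.linalg.solve(N, rhs)      # (5.96), without forming N^-1
        k, a_new, c = sol[:n], sol[n], sol[n + 1]
        # Assumed 2D analogues of (5.179)-(5.180)
        vx = (a_new * Qxx + Qxy) @ k
        vy = (a_new * Qxy.T + Qyy) @ k
        da = abs(a_new - a)
        a = a_new
        if da <= eps:
            break
    return a, c, vx, vy

# Hypothetical data: x is error-free and the last y value is error-free as well,
# so the cofactor matrices (and W) are singular.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])
Qxx = np.zeros((5, 5))
Qxy = np.zeros((5, 5))
Qyy = np.diag([1.0, 1.0, 1.0, 1.0, 0.0])
a_hat, c_hat, vx_hat, vy_hat = fit_line_singular(x, y, Qxx, Qyy, Qxy, a0=-2.0)
```

In this example the last point has zero variance in both coordinates, so the estimated line is forced through it, which is the behaviour one would expect from a singular stochastic model.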

5.3. Fitting of a plane in 3D 113

5.3 Fitting of a plane in 3D

The problem of fitting a plane to an observed point cloud in 3D is investigated here, with the observed

coordinates being obtained with different precisions. The general form of a plane in 3D has been already

presented in subsection 4.4:

\[
a x + b y + c z + d = 0,
\]

with $x$, $y$ and $z$ denoting the 3D coordinates of a point that lies in the plane; $a$, $b$, $c$ and $d$ are the unknown plane parameters. Random errors will influence the observed quantities, thus individual residuals can be introduced in the functional model, leading to the condition equations

\[
a(x_i + v_{x_i}) + b(y_i + v_{y_i}) + c(z_i + v_{z_i}) + d = 0.
\tag{5.103}
\]

An “optimal” solution for the unknown plane parameters is possible by minimizing the sum of weighted squared residuals. The least squares solution of the best plane is investigated for four different weighting cases:

1. Same precision $\sigma_x$ for the coordinates in $x$ direction, $\sigma_y$ in $y$ direction and $\sigma_z$ in $z$ direction.
2. Individual precision for each point: $\sigma_{x_i} = \sigma_{y_i} = \sigma_{z_i}\ \forall i$.
3. Individual precision for each coordinate.
4. Individual precision and correlations between the measured 3D coordinates.

5.3.1 Weighting case 1 - Equally weighted observations in each direction

The fitting of a plane to a 3D point cloud is examined in this subsection, with the measured point coordinates being observed with the same precision in each direction. The estimation of the unknown plane parameters is based on the minimization of the sum of weighted squared residuals

\[
\sum_{i=1}^{n} \left( p_x v_{x_i}^2 + p_y v_{y_i}^2 + p_z v_{z_i}^2 \right) \to \min,
\tag{5.104}
\]

with the constant weights $p_x$, $p_y$ and $p_z$ being computed by

\[
p_x = \frac{1}{\sigma_x^2}, \quad p_y = \frac{1}{\sigma_y^2} \quad\text{and}\quad p_z = \frac{1}{\sigma_z^2}.
\tag{5.105}
\]

Following the same solution strategy as in subsection 5.2.1, the observed 3D coordinates are multiplied beforehand with the square roots of the respective weights, leading to the scaled coordinates

\[
x_i^s = x_i\sqrt{p_x}, \quad y_i^s = y_i\sqrt{p_y} \quad\text{and}\quad z_i^s = z_i\sqrt{p_z}
\tag{5.106}
\]

and the respective residuals

\[
v_{x_i}^s = v_{x_i}\sqrt{p_x}, \quad v_{y_i}^s = v_{y_i}\sqrt{p_y} \quad\text{and}\quad v_{z_i}^s = v_{z_i}\sqrt{p_z}.
\tag{5.107}
\]

Substituting the scaled coordinates and their residuals into the condition equations (5.103) yields

\[
a\frac{1}{\sqrt{p_x}}\left(x_i^s + v_{x_i}^s\right) + b\frac{1}{\sqrt{p_y}}\left(y_i^s + v_{y_i}^s\right) + c\frac{1}{\sqrt{p_z}}\left(z_i^s + v_{z_i}^s\right) + d = 0.
\tag{5.108}
\]

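Furthermore, introducing the auxiliary scaled plane parameters

\[
a^s = a\frac{1}{\sqrt{p_x}}, \quad
b^s = b\frac{1}{\sqrt{p_y}}, \quad
c^s = c\frac{1}{\sqrt{p_z}}, \quad
d^s = d,
\tag{5.109}
\]

in equation (5.108), results in the functional model

\[
a^s\left(x_i^s + v_{x_i}^s\right) + b^s\left(y_i^s + v_{y_i}^s\right) + c^s\left(z_i^s + v_{z_i}^s\right) + d^s = 0.
\tag{5.110}
\]

Similar to the procedure of subsection 4.4.1, a “meaningful” restriction between the unknown plane parameters is

\[
(a^s)^2 + (b^s)^2 + (c^s)^2 = 1.
\tag{5.111}
\]

The least squares solution for the unknown plane parameters can be derived by minimizing the sum of squared scaled residuals

\[
\sum_{i=1}^{n} \left[ (v_{x_i}^s)^2 + (v_{y_i}^s)^2 + (v_{z_i}^s)^2 \right] \to \min.
\tag{5.112}
\]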

Direct least squares solution in a scaled coordinate system

Utilizing the scaled 3D coordinates, the original problem is transformed to a problem with equal weights. Thus, the sum of squared residuals is equal to the sum of squared orthogonal distances

\[
\Omega(v_{x_i}^s, v_{y_i}^s, v_{z_i}^s) = \sum_{i=1}^{n} \left[ (v_{x_i}^s)^2 + (v_{y_i}^s)^2 + (v_{z_i}^s)^2 \right] = \sum_{i=1}^{n} D_i^2,
\tag{5.113}
\]

with the distances of the measured points to the requested plane

\[
D_i = \frac{a^s x_i^s + b^s y_i^s + c^s z_i^s + d^s}{\sqrt{(a^s)^2 + (b^s)^2 + (c^s)^2}}.
\tag{5.114}
\]

Taking into account the selected restriction from equation (5.111), the expressions for the orthogonal distances become

\[
D_i = a^s x_i^s + b^s y_i^s + c^s z_i^s + d^s.
\tag{5.115}
\]

A least squares solution for the unknown plane parameters can be derived in this case directly, following the

same procedure as the one presented in subsection 4.4.


Computation of the plane parameters in the original coordinate system

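Introducing the estimated plane parameters $\hat{a}^s$, $\hat{b}^s$, $\hat{c}^s$ and $\hat{d}^s$ into equation (5.109) yields the plane parameters in the original coordinate system

\[
\hat{a} = \hat{a}^s\sqrt{p_x}, \quad
\hat{b} = \hat{b}^s\sqrt{p_y}, \quad
\hat{c} = \hat{c}^s\sqrt{p_z}, \quad
\hat{d} = \hat{d}^s,
\tag{5.116}
\]

however, being restricted to

\[
(a^s)^2 + (b^s)^2 + (c^s)^2 = \left(\frac{a}{\sqrt{p_x}}\right)^2 + \left(\frac{b}{\sqrt{p_y}}\right)^2 + \left(\frac{c}{\sqrt{p_z}}\right)^2 = 1.
\tag{5.117}
\]

Therefore, a least squares solution that is restricted to $a^2 + b^2 + c^2 = 1$ can be computed by

\[
\begin{aligned}
\hat{a} &= \frac{\hat{a}^s\sqrt{p_x}}{\sqrt{(\hat{a}^s\sqrt{p_x})^2 + (\hat{b}^s\sqrt{p_y})^2 + (\hat{c}^s\sqrt{p_z})^2}},\\
\hat{b} &= \frac{\hat{b}^s\sqrt{p_y}}{\sqrt{(\hat{a}^s\sqrt{p_x})^2 + (\hat{b}^s\sqrt{p_y})^2 + (\hat{c}^s\sqrt{p_z})^2}},\\
\hat{c} &= \frac{\hat{c}^s\sqrt{p_z}}{\sqrt{(\hat{a}^s\sqrt{p_x})^2 + (\hat{b}^s\sqrt{p_y})^2 + (\hat{c}^s\sqrt{p_z})^2}},\\
\hat{d} &= \frac{\hat{d}^s}{\sqrt{(\hat{a}^s\sqrt{p_x})^2 + (\hat{b}^s\sqrt{p_y})^2 + (\hat{c}^s\sqrt{p_z})^2}}.
\end{aligned}
\tag{5.118}
\]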
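The scaling, direct solution and back-transformation of this weighting case can be sketched as follows. The sketch assumes that the direct solution of subsection 4.4 is equivalent to taking the eigenvector of the smallest eigenvalue of the centred scatter matrix of the scaled coordinates; the function name `fit_plane_direction_weights` and the test data are hypothetical.

```python
import numpy as np

def fit_plane_direction_weights(x, y, z, sigma_x, sigma_y, sigma_z):
    """Plane fit with one precision per coordinate direction, see (5.105)-(5.118)."""
    p = np.array([1.0 / sigma_x**2, 1.0 / sigma_y**2, 1.0 / sigma_z**2])  # (5.105)
    sqrt_p = np.sqrt(p)
    scaled = np.column_stack([x, y, z]) * sqrt_p       # scaled coordinates (5.106)
    mean = scaled.mean(axis=0)
    centred = scaled - mean
    # Assumed direct solution: eigenvector of the smallest eigenvalue
    # (np.linalg.eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(centred.T @ centred)
    normal_s = vecs[:, 0]                              # (a^s, b^s, c^s), unit length
    d_s = -normal_s @ mean
    abc = normal_s * sqrt_p                            # back-transformation (5.116)
    norm = np.linalg.norm(abc)
    return abc / norm, d_s / norm                      # renormalisation (5.118)

# Hypothetical error-free points on the plane x + 2y - 2z + 3 = 0.
rng = np.random.default_rng(0)
x = rng.uniform(-5.0, 5.0, 20)
y = rng.uniform(-5.0, 5.0, 20)
z = (x + 2.0 * y + 3.0) / 2.0
normal, d = fit_plane_direction_weights(x, y, z, 0.01, 0.02, 0.05)
```

The sign of the returned normal vector is arbitrary, as is usual for eigenvector-based solutions; both signs describe the same plane.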

5.3.2 Weighting case 2 - Individually weighted points in 3D

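In the second weighting scenario for fitting a plane in 3D, every point has been observed with individual precision:

\[
\sigma_{x_i} = \sigma_{y_i} = \sigma_{z_i}\ \forall i
\quad\Rightarrow\quad
p_{x_i} = p_{y_i} = p_{z_i} = p_i\ \forall i.
\tag{5.119}
\]

Also in this case the ratio between the weights of each point is constant:

\[
\frac{p_{x_i}}{p_{y_i}} = \text{constant}, \quad \frac{p_{y_i}}{p_{z_i}} = \text{constant} \quad\text{and}\quad \frac{p_{x_i}}{p_{z_i}} = \text{constant}.
\tag{5.120}
\]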


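A least squares solution for the unknown plane parameters can be found by minimizing

\[
\Omega(v_{x_i}, v_{y_i}, v_{z_i}) = \sum_{i=1}^{n} \left( p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 + p_{z_i} v_{z_i}^2 \right),
\tag{5.121}
\]

which after taking into account the stochastic model of equation (5.119), becomes

\[
\Omega(v_{x_i}, v_{y_i}, v_{z_i}) = \sum_{i=1}^{n} p_i \left( v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2 \right) \to \min.
\tag{5.122}
\]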

Direct weighted least squares solution

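The orthogonal distances of the measured points to the requested plane are

\[
D_i = \frac{a x_i + b y_i + c z_i + d}{\sqrt{a^2 + b^2 + c^2}}
\tag{5.123}
\]

and after selecting the restriction

\[
a^2 + b^2 + c^2 = 1,
\tag{5.124}
\]

can be equivalently written as

\[
D_i = a x_i + b y_i + c z_i + d.
\tag{5.125}
\]

The sum of weighted squared residuals of this adjustment problem is equal to the sum of weighted squared orthogonal distances

\[
\sum_{i=1}^{n} p_i D_i^2 = \sum_{i=1}^{n} p_i \left( v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2 \right).
\tag{5.126}
\]

Thus, the objective function is

\[
\Omega(a, b, c, d) = \sum_{i=1}^{n} p_i \left( v_{x_i}^2 + v_{y_i}^2 + v_{z_i}^2 \right) = \sum_{i=1}^{n} p_i D_i^2 = \sum_{i=1}^{n} p_i \left( a x_i + b y_i + c z_i + d \right)^2.
\tag{5.127}
\]

To obtain a solution for the unknown plane parameters which minimizes equation (5.127) under the chosen restriction, the Lagrangian

\[
K(a, b, c, d, k) = \sum_{i=1}^{n} p_i \left( a x_i + b y_i + c z_i + d \right)^2 - k\left( a^2 + b^2 + c^2 - 1 \right),
\tag{5.128}
\]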

can be built. The normal equation system for this problem is

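\[
\frac{\partial K}{\partial a} = 2\left[ a\left( \sum_{i=1}^{n} p_i x_i^2 - k \right) + b \sum_{i=1}^{n} p_i y_i x_i + c \sum_{i=1}^{n} p_i x_i z_i + d \sum_{i=1}^{n} p_i x_i \right] = 0,
\tag{5.129}
\]

\[
\frac{\partial K}{\partial b} = 2\left[ a \sum_{i=1}^{n} p_i y_i x_i + b\left( \sum_{i=1}^{n} p_i y_i^2 - k \right) + c \sum_{i=1}^{n} p_i y_i z_i + d \sum_{i=1}^{n} p_i y_i \right] = 0,
\tag{5.130}
\]

\[
\frac{\partial K}{\partial c} = 2\left[ a \sum_{i=1}^{n} p_i x_i z_i + b \sum_{i=1}^{n} p_i y_i z_i + c\left( \sum_{i=1}^{n} p_i z_i^2 - k \right) + d \sum_{i=1}^{n} p_i z_i \right] = 0,
\tag{5.131}
\]

\[
\frac{\partial K}{\partial d} = 2\left[ a \sum_{i=1}^{n} p_i x_i + b \sum_{i=1}^{n} p_i y_i + c \sum_{i=1}^{n} p_i z_i + d \sum_{i=1}^{n} p_i \right] = 0,
\tag{5.132}
\]

\[
\frac{\partial K}{\partial k} = -\left( a^2 + b^2 + c^2 - 1 \right) = 0.
\tag{5.133}
\]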

Equation (5.132) can be rearranged to

\[
d = -\frac{1}{\sum_{i=1}^{n} p_i} \left( a \sum_{i=1}^{n} p_i x_i + b \sum_{i=1}^{n} p_i y_i + c \sum_{i=1}^{n} p_i z_i \right).
\tag{5.134}
\]

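The expression for $d$ can be subsequently introduced into equations (5.129) - (5.131). This yields the reduced normal equations

\[
\begin{aligned}
& a\left[ \sum_{i=1}^{n} p_i x_i^2 - \frac{1}{\sum_{i=1}^{n} p_i}\left( \sum_{i=1}^{n} p_i x_i \right)^2 - k \right]
+ b\left[ \sum_{i=1}^{n} p_i x_i y_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i y_i \right] \\
& \quad + c\left[ \sum_{i=1}^{n} p_i x_i z_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i z_i \right] = 0,
\end{aligned}
\tag{5.135}
\]

\[
\begin{aligned}
& a\left[ \sum_{i=1}^{n} p_i x_i y_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i y_i \right]
+ b\left[ \sum_{i=1}^{n} p_i y_i^2 - \frac{1}{\sum_{i=1}^{n} p_i}\left( \sum_{i=1}^{n} p_i y_i \right)^2 - k \right] \\
& \quad + c\left[ \sum_{i=1}^{n} p_i y_i z_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i z_i \right] = 0
\end{aligned}
\tag{5.136}
\]

and

\[
\begin{aligned}
& a\left[ \sum_{i=1}^{n} p_i x_i z_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i z_i \right]
+ b\left[ \sum_{i=1}^{n} p_i y_i z_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i z_i \right] \\
& \quad + c\left[ \sum_{i=1}^{n} p_i z_i^2 - \frac{1}{\sum_{i=1}^{n} p_i}\left( \sum_{i=1}^{n} p_i z_i \right)^2 - k \right] = 0,
\end{aligned}
\tag{5.137}
\]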

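which represent a homogeneous system of equations. A nontrivial solution is possible by setting the determinant equal to zero:

\[
\begin{vmatrix}
(f_1 - k) & g_1 & g_2\\
g_1 & (f_2 - k) & g_3\\
g_2 & g_3 & (f_3 - k)
\end{vmatrix} = 0,
\tag{5.138}
\]

with the respective quantities being

\[
\begin{aligned}
f_1 &= \sum_{i=1}^{n} p_i x_i^2 - \frac{1}{\sum_{i=1}^{n} p_i}\left( \sum_{i=1}^{n} p_i x_i \right)^2, &
f_2 &= \sum_{i=1}^{n} p_i y_i^2 - \frac{1}{\sum_{i=1}^{n} p_i}\left( \sum_{i=1}^{n} p_i y_i \right)^2, \\
f_3 &= \sum_{i=1}^{n} p_i z_i^2 - \frac{1}{\sum_{i=1}^{n} p_i}\left( \sum_{i=1}^{n} p_i z_i \right)^2, &
g_1 &= \sum_{i=1}^{n} p_i x_i y_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i y_i, \\
g_2 &= \sum_{i=1}^{n} p_i x_i z_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i z_i, &
g_3 &= \sum_{i=1}^{n} p_i y_i z_i - \frac{1}{\sum_{i=1}^{n} p_i} \sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i z_i.
\end{aligned}
\tag{5.139}
\]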

Equation (5.138) is a cubic characteristic equation with three solutions for $k$. The unknown plane parameters can be estimated either by substituting the smallest solution $k_{\min}$ into equations (5.135)-(5.137) or by solving an eigenvalue problem.
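The eigenvalue route mentioned above can be sketched with NumPy: the coefficients of (5.139) are assembled into a symmetric 3x3 matrix, and the eigenvector of its smallest eigenvalue is taken as $(a, b, c)$. The function name `fit_plane_point_weights` and the sample data are hypothetical; the parameter $d$ is computed from the rearranged normal equation (5.132).

```python
import numpy as np

def fit_plane_point_weights(x, y, z, p):
    """Direct plane fit with one weight p_i per point, following (5.138)-(5.139)."""
    P = p.sum()
    sx, sy, sz = p @ x, p @ y, p @ z
    f1 = p @ (x * x) - sx * sx / P
    f2 = p @ (y * y) - sy * sy / P
    f3 = p @ (z * z) - sz * sz / P
    g1 = p @ (x * y) - sx * sy / P
    g2 = p @ (x * z) - sx * sz / P
    g3 = p @ (y * z) - sy * sz / P
    M = np.array([[f1, g1, g2],
                  [g1, f2, g3],
                  [g2, g3, f3]])
    k_all, vecs = np.linalg.eigh(M)        # ascending eigenvalues, unit eigenvectors
    a, b, c = vecs[:, 0]                   # eigenvector belonging to k_min
    d = -(a * sx + b * sy + c * sz) / P    # from (5.132)
    return a, b, c, d, k_all[0]

# Hypothetical noisy points close to the plane z = 0.3 x - 0.5 y + 2.
rng = np.random.default_rng(2)
x = rng.uniform(-5.0, 5.0, 40)
y = rng.uniform(-5.0, 5.0, 40)
z = 0.3 * x - 0.5 * y + 2.0 + rng.normal(0.0, 0.01, 40)
p = rng.uniform(0.5, 2.0, 40)
a, b, c, d, k_min = fit_plane_point_weights(x, y, z, p)
omega = np.sum(p * (a * x + b * y + c * z + d) ** 2)   # objective (5.127)
```

Because the eigenvector has unit length, the value of the objective function (5.127) at the solution equals $k_{\min}$, which gives a simple consistency check for an implementation.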

5.3.3 Weighting case 3 - Individually weighted 3D coordinates

For the third weighting case of fitting a plane to a 3D point cloud, the observed 3D coordinates of each point have been obtained with individual precisions. The objective function under minimization can be expressed in this case as

\[
\Omega(v_{x_i}, v_{y_i}, v_{z_i}) = \sum_{i=1}^{n} \left( p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 + p_{z_i} v_{z_i}^2 \right).
\tag{5.140}
\]


Iterative least squares solution without linearization

A least squares solution can be computed for this nonlinear adjustment by minimizing the Lagrange function

\[
K(a, b, c, d, v_{x_i}, v_{y_i}, v_{z_i}, k_i) = \Omega(v_{x_i}, v_{y_i}, v_{z_i}) - 2\sum_{i=1}^{n} k_i \left[ a(x_i + v_{x_i}) + b(y_i + v_{y_i}) + c(z_i + v_{z_i}) + d \right].
\tag{5.141}
\]

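Avoiding any kind of linearization of the problem, a differentiation of function $K$ with respect to all unknown parameters and setting the partial derivatives equal to zero yields the system of normal equations

\[
\frac{\partial K}{\partial v_{x_i}} = 2 p_{x_i} v_{x_i} - 2 a k_i = 0 \quad\Rightarrow\quad v_{x_i} = \frac{a k_i}{p_{x_i}},
\tag{5.142}
\]

\[
\frac{\partial K}{\partial v_{y_i}} = 2 p_{y_i} v_{y_i} - 2 b k_i = 0 \quad\Rightarrow\quad v_{y_i} = \frac{b k_i}{p_{y_i}},
\tag{5.143}
\]

\[
\frac{\partial K}{\partial v_{z_i}} = 2 p_{z_i} v_{z_i} - 2 c k_i = 0 \quad\Rightarrow\quad v_{z_i} = \frac{c k_i}{p_{z_i}},
\tag{5.144}
\]

\[
\frac{\partial K}{\partial k_i} = -2\left[ a(x_i + v_{x_i}) + b(y_i + v_{y_i}) + c(z_i + v_{z_i}) + d \right] = 0,
\tag{5.145}
\]

\[
\frac{\partial K}{\partial a} = -2\sum_{i=1}^{n} k_i (x_i + v_{x_i}) = 0,
\tag{5.146}
\]

\[
\frac{\partial K}{\partial b} = -2\sum_{i=1}^{n} k_i (y_i + v_{y_i}) = 0,
\tag{5.147}
\]

\[
\frac{\partial K}{\partial c} = -2\sum_{i=1}^{n} k_i (z_i + v_{z_i}) = 0,
\tag{5.148}
\]

\[
\frac{\partial K}{\partial d} = -2\sum_{i=1}^{n} k_i = 0.
\tag{5.149}
\]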

Equations (5.142)-(5.149) represent a nonlinear system of $4n + 4$ equations. The residuals from (5.142)-(5.144) are introduced into equation (5.145), which yields the expression for the Lagrange multipliers

\[
k_i = w_i \left( a x_i + b y_i + c z_i + d \right),
\tag{5.150}
\]

with the auxiliary weighting factors

\[
w_i = -\left( \frac{a^2}{p_{x_i}} + \frac{b^2}{p_{y_i}} + \frac{c^2}{p_{z_i}} \right)^{-1}.
\tag{5.151}
\]

Introducing $k_i$ into (5.149) returns

\[
d = -\frac{a \sum_{i=1}^{n} w_i x_i + b \sum_{i=1}^{n} w_i y_i + c \sum_{i=1}^{n} w_i z_i}{\sum_{i=1}^{n} w_i}.
\tag{5.152}
\]

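Substituting the residuals $v_{x_i}$, $v_{y_i}$, $v_{z_i}$ and the Lagrange multipliers $k_i$ into equations (5.146), (5.147) and (5.148), results in the system of reduced equations

\[
a \sum_{i=1}^{n} \frac{k_i^2}{p_{x_i}} = -\sum_{i=1}^{n} w_i \left( a x_i + b y_i + c z_i + d \right) x_i,
\tag{5.153}
\]

\[
b \sum_{i=1}^{n} \frac{k_i^2}{p_{y_i}} = -\sum_{i=1}^{n} w_i \left( a x_i + b y_i + c z_i + d \right) y_i
\tag{5.154}
\]

and

\[
c \sum_{i=1}^{n} \frac{k_i^2}{p_{z_i}} = -\sum_{i=1}^{n} w_i \left( a x_i + b y_i + c z_i + d \right) z_i.
\tag{5.155}
\]

Taking into account parameter $d$ yields

\[
a f_1 = b f_2 + c f_3,
\tag{5.156}
\]

\[
b f_4 = a f_2 + c f_5
\tag{5.157}
\]

and

\[
c f_6 = a f_3 + b f_5,
\tag{5.158}
\]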

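with the quantities

\[
\begin{aligned}
f_1 &= -\sum_{i=1}^{n} \frac{k_i^2}{p_{x_i}} - \sum_{i=1}^{n} w_i x_i^2 + \frac{1}{\sum_{i=1}^{n} w_i}\left( \sum_{i=1}^{n} w_i x_i \right)^2,\\
f_2 &= \sum_{i=1}^{n} \left( w_i x_i y_i \right) - \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i x_i \sum_{i=1}^{n} w_i y_i,\\
f_3 &= \sum_{i=1}^{n} \left( w_i x_i z_i \right) - \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i x_i \sum_{i=1}^{n} w_i z_i,\\
f_4 &= -\sum_{i=1}^{n} \frac{k_i^2}{p_{y_i}} - \sum_{i=1}^{n} w_i y_i^2 + \frac{1}{\sum_{i=1}^{n} w_i}\left( \sum_{i=1}^{n} w_i y_i \right)^2,\\
f_5 &= \sum_{i=1}^{n} \left( w_i y_i z_i \right) - \frac{1}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i y_i \sum_{i=1}^{n} w_i z_i,\\
f_6 &= -\sum_{i=1}^{n} \frac{k_i^2}{p_{z_i}} - \sum_{i=1}^{n} w_i z_i^2 + \frac{1}{\sum_{i=1}^{n} w_i}\left( \sum_{i=1}^{n} w_i z_i \right)^2.
\end{aligned}
\tag{5.159}
\]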

At this stage a meaningful restriction between the unknown parameters can be selected. For the sake of convenience the restriction $c = 1$ is chosen. Therefore, solving equation (5.157) for $b$ and introducing it in (5.156) gives

\[
\hat{a} = \left( f_1 - \frac{f_2^2}{f_4} \right)^{-1} \left( \frac{f_2 f_5}{f_4} + f_3 \right)
\tag{5.160}
\]

and

\[
\hat{b} = \frac{\hat{a} f_2 + f_5}{f_4}.
\tag{5.161}
\]

Equations (5.160) and (5.161) become pseudo-linear after approximating functions $f_1, f_2, \ldots, f_6$. Therefore, initial values for $a^0$ and $b^0$ are necessary for computing the auxiliary weighting factors $w_i$, the Lagrange multipliers $k_i$, as well as functions $f_1, f_2, \ldots, f_6$. The estimated plane parameters can be utilized as new starting values in each iteration step, until a break-off condition is met. Based on the presented procedure, Algorithm 4 has been developed for estimating the weighted least squares solution of fitting a plane to a 3D point cloud.


Algorithm 4 Least squares fitting of a plane to points in 3D with general weights
1: Choose approximate values for $a^0$, $b^0$.
2: Define parameter $c = 1$.
3: Set threshold $\varepsilon$ for the break-off condition of the iteration process.
4: Set parameter $d_a = |\hat{a} - a^0| = \infty$ and $d_b = |\hat{b} - b^0| = \infty$, for entering the iteration process.
5: while $d_a > \varepsilon$ or $d_b > \varepsilon$ do
6: Compute the auxiliary weighting factors $w_i$, estimate $\hat{d}$ and compute the Lagrange multipliers $k_i$.
7: Compute the coefficients $f_1, f_2, \ldots, f_5$.
8: Estimate parameters $\hat{a}$ and $\hat{b}$.
9: Compute parameter $d_a = |\hat{a} - a^0|$ and $d_b = |\hat{b} - b^0|$.
10: Update the approximate values with the estimated ones ($a^0 = \hat{a}$ and $b^0 = \hat{b}$).
11: end while
12: return $\hat{a}$, $\hat{b}$ and $\hat{d}$, with $c = 1$.
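A compact NumPy sketch of the iteration in Algorithm 4 is given below. It is an illustration rather than a reference implementation: the function name `fit_plane_general_weights` is hypothetical, and the signs of the coefficients $f_1, \ldots, f_5$ are an assumption of this sketch, obtained by substituting (5.152) directly into (5.153)-(5.155) with the negative weighting factors of (5.151), so that the relations (5.156)-(5.157) hold.

```python
import numpy as np

def fit_plane_general_weights(x, y, z, px, py, pz, a0, b0, eps=1e-12, max_iter=200):
    """Pseudo-linear iteration for a*x + b*y + z + d = 0 (c fixed to 1)."""
    def weights_and_d(a, b):
        w = -1.0 / (a * a / px + b * b / py + 1.0 / pz)        # (5.151), c = 1
        d = -(a * (w @ x) + b * (w @ y) + (w @ z)) / w.sum()   # (5.152)
        return w, d

    a, b = float(a0), float(b0)
    for _ in range(max_iter):
        w, d = weights_and_d(a, b)
        sw = w.sum()
        k = w * (a * x + b * y + z + d)                        # (5.150)
        # Coefficients chosen so that a*f1 = b*f2 + f3 and b*f4 = a*f2 + f5
        f1 = -np.sum(k * k / px) - w @ (x * x) + (w @ x) ** 2 / sw
        f2 = w @ (x * y) - (w @ x) * (w @ y) / sw
        f3 = w @ (x * z) - (w @ x) * (w @ z) / sw
        f4 = -np.sum(k * k / py) - w @ (y * y) + (w @ y) ** 2 / sw
        f5 = w @ (y * z) - (w @ y) * (w @ z) / sw
        a_new = (f2 * f5 / f4 + f3) / (f1 - f2 * f2 / f4)      # (5.160)
        b_new = (a_new * f2 + f5) / f4                         # (5.161)
        da, db = abs(a_new - a), abs(b_new - b)
        a, b = a_new, b_new
        if da <= eps and db <= eps:
            break
    _, d = weights_and_d(a, b)             # d consistent with the final a, b
    return a, b, d

# Hypothetical data with equal weights, so that the result can be compared
# with the eigenvector solution of the equally weighted problem.
rng = np.random.default_rng(1)
n = 60
x = rng.uniform(-5.0, 5.0, n) + rng.normal(0.0, 0.01, n)
y = rng.uniform(-5.0, 5.0, n) + rng.normal(0.0, 0.01, n)
z = 0.3 * x - 0.5 * y + 2.0 + rng.normal(0.0, 0.01, n)
ones = np.ones(n)
a_hat, b_hat, d_hat = fit_plane_general_weights(x, y, z, ones, ones, ones, -0.25, 0.45)
```

With unit weights the fixed point of this iteration should coincide with the orthogonal-distance plane, rescaled to $c = 1$; with individual weights no such closed-form comparison is available and the convergence behaviour depends on the starting values.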

5.3.4 Weighting case 4 - Individually weighted and correlated 3D coordinates

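For the fourth weighted adjustment problem under investigation, correlations are introduced between the measurements. Therefore, the cofactor matrix of the 3D point coordinates is given or obtained from a previous adjustment and is expressed by

\[
\mathbf{Q}_{LL} =
\begin{bmatrix}
\mathbf{Q}_{xx} & \mathbf{Q}_{xy} & \mathbf{Q}_{xz}\\
\mathbf{Q}_{yx} & \mathbf{Q}_{yy} & \mathbf{Q}_{yz}\\
\mathbf{Q}_{zx} & \mathbf{Q}_{zy} & \mathbf{Q}_{zz}
\end{bmatrix},
\quad\text{with } \mathbf{Q}_{xy} = \mathbf{Q}_{yx}^T,\ \mathbf{Q}_{xz} = \mathbf{Q}_{zx}^T \text{ and } \mathbf{Q}_{yz} = \mathbf{Q}_{zy}^T.
\tag{5.162}
\]

$\mathbf{Q}_{xx}$, $\mathbf{Q}_{yy}$ and $\mathbf{Q}_{zz}$ represent the cofactor matrices for the $x$, $y$ and $z$ coordinates, respectively. Matrices $\mathbf{Q}_{xy}$, $\mathbf{Q}_{xz}$, $\mathbf{Q}_{yz}$, $\mathbf{Q}_{yx}$, $\mathbf{Q}_{zx}$ and $\mathbf{Q}_{zy}$ hold the correlations between the 3D point coordinates. The respective weight matrices are

\[
\mathbf{P} = \mathbf{Q}_{LL}^{-1} =
\begin{bmatrix}
\mathbf{P}_{xx} & \mathbf{P}_{xy} & \mathbf{P}_{xz}\\
\mathbf{P}_{yx} & \mathbf{P}_{yy} & \mathbf{P}_{yz}\\
\mathbf{P}_{zx} & \mathbf{P}_{zy} & \mathbf{P}_{zz}
\end{bmatrix}.
\tag{5.163}
\]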

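The nonlinear condition equations (5.103) can be expressed equivalently in vector notation by

\[
a(\mathbf{x}_c + \mathbf{v}_x) + b(\mathbf{y}_c + \mathbf{v}_y) + c(\mathbf{z}_c + \mathbf{v}_z) + d\,\mathbf{e} = \mathbf{0}.
\tag{5.164}
\]

Vectors $\mathbf{x}_c$, $\mathbf{y}_c$ and $\mathbf{z}_c$ include the 3D point coordinates

\[
\mathbf{x}_c = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}, \quad
\mathbf{y}_c = \begin{bmatrix} y_1\\ y_2\\ \vdots\\ y_n \end{bmatrix} \quad\text{and}\quad
\mathbf{z}_c = \begin{bmatrix} z_1\\ z_2\\ \vdots\\ z_n \end{bmatrix}
\tag{5.165}
\]

and the residual vectors $\mathbf{v}_x$, $\mathbf{v}_y$ and $\mathbf{v}_z$ list the corresponding residuals

\[
\mathbf{v}_x = \begin{bmatrix} v_{x_1}\\ v_{x_2}\\ \vdots\\ v_{x_n} \end{bmatrix}, \quad
\mathbf{v}_y = \begin{bmatrix} v_{y_1}\\ v_{y_2}\\ \vdots\\ v_{y_n} \end{bmatrix} \quad\text{and}\quad
\mathbf{v}_z = \begin{bmatrix} v_{z_1}\\ v_{z_2}\\ \vdots\\ v_{z_n} \end{bmatrix}.
\tag{5.166}
\]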

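$\mathbf{e}$ is a vector of ones with length equal to the number of measured points. A solution for this weighted least squares problem can be derived by minimizing the objective function

\[
\Omega(\mathbf{v}_x, \mathbf{v}_y, \mathbf{v}_z) = \mathbf{v}_x^T \mathbf{P}_{xx} \mathbf{v}_x + \mathbf{v}_y^T \mathbf{P}_{yy} \mathbf{v}_y + \mathbf{v}_z^T \mathbf{P}_{zz} \mathbf{v}_z + 2\,\mathbf{v}_x^T \mathbf{P}_{xy} \mathbf{v}_y + 2\,\mathbf{v}_x^T \mathbf{P}_{xz} \mathbf{v}_z + 2\,\mathbf{v}_y^T \mathbf{P}_{yz} \mathbf{v}_z.
\tag{5.167}
\]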

Iterative least squares solution without linearization

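The objective function (5.167) can be combined with the nonlinear condition equations (5.164) to build the Lagrangian

\[
K(a, b, c, d, \mathbf{v}_x, \mathbf{v}_y, \mathbf{v}_z, \mathbf{k}) = \Omega(\mathbf{v}_x, \mathbf{v}_y, \mathbf{v}_z) - 2\mathbf{k}^T \left[ a(\mathbf{x}_c + \mathbf{v}_x) + b(\mathbf{y}_c + \mathbf{v}_y) + c(\mathbf{z}_c + \mathbf{v}_z) + d\,\mathbf{e} \right].
\tag{5.168}
\]

$\mathbf{k}$ is the vector of Lagrange multipliers. The resulting normal equations are

\[
\frac{\partial K}{\partial \mathbf{v}_x^T} = 2\left( \mathbf{P}_{xx}\mathbf{v}_x + \mathbf{P}_{xy}\mathbf{v}_y + \mathbf{P}_{xz}\mathbf{v}_z - a\mathbf{k} \right) = \mathbf{0},
\tag{5.169}
\]

\[
\frac{\partial K}{\partial \mathbf{v}_y^T} = 2\left( \mathbf{P}_{yy}\mathbf{v}_y + \mathbf{P}_{yx}\mathbf{v}_x + \mathbf{P}_{yz}\mathbf{v}_z - b\mathbf{k} \right) = \mathbf{0},
\tag{5.170}
\]

\[
\frac{\partial K}{\partial \mathbf{v}_z^T} = 2\left( \mathbf{P}_{zz}\mathbf{v}_z + \mathbf{P}_{zx}\mathbf{v}_x + \mathbf{P}_{zy}\mathbf{v}_y - c\mathbf{k} \right) = \mathbf{0},
\tag{5.171}
\]

\[
\frac{\partial K}{\partial \mathbf{k}^T} = -2\left[ a(\mathbf{x}_c + \mathbf{v}_x) + b(\mathbf{y}_c + \mathbf{v}_y) + c(\mathbf{z}_c + \mathbf{v}_z) + d\,\mathbf{e} \right] = \mathbf{0},
\tag{5.172}
\]

\[
\frac{\partial K}{\partial a} = -2\mathbf{k}^T(\mathbf{x}_c + \mathbf{v}_x) = 0,
\tag{5.173}
\]

\[
\frac{\partial K}{\partial b} = -2\mathbf{k}^T(\mathbf{y}_c + \mathbf{v}_y) = 0,
\tag{5.174}
\]

\[
\frac{\partial K}{\partial c} = -2\mathbf{k}^T(\mathbf{z}_c + \mathbf{v}_z) = 0,
\tag{5.175}
\]

\[
\frac{\partial K}{\partial d} = -2\mathbf{k}^T\mathbf{e} = 0.
\tag{5.176}
\]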


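A solution for the unknown plane parameters can be obtained by analyzing the derived normal equations. Expressing equations (5.169)-(5.171) with block matrices

\[
\begin{bmatrix}
\mathbf{P}_{xx} & \mathbf{P}_{xy} & \mathbf{P}_{xz}\\
\mathbf{P}_{yx} & \mathbf{P}_{yy} & \mathbf{P}_{yz}\\
\mathbf{P}_{zx} & \mathbf{P}_{zy} & \mathbf{P}_{zz}
\end{bmatrix}
\begin{bmatrix} \mathbf{v}_x\\ \mathbf{v}_y\\ \mathbf{v}_z \end{bmatrix}
=
\begin{bmatrix} a\mathbf{k}\\ b\mathbf{k}\\ c\mathbf{k} \end{bmatrix},
\tag{5.177}
\]

it is possible to derive the residual vectors

\[
\begin{bmatrix} \mathbf{v}_x\\ \mathbf{v}_y\\ \mathbf{v}_z \end{bmatrix}
=
\begin{bmatrix}
\mathbf{P}_{xx} & \mathbf{P}_{xy} & \mathbf{P}_{xz}\\
\mathbf{P}_{yx} & \mathbf{P}_{yy} & \mathbf{P}_{yz}\\
\mathbf{P}_{zx} & \mathbf{P}_{zy} & \mathbf{P}_{zz}
\end{bmatrix}^{-1}
\begin{bmatrix} a\mathbf{k}\\ b\mathbf{k}\\ c\mathbf{k} \end{bmatrix}
=
\begin{bmatrix}
\mathbf{Q}_{xx} & \mathbf{Q}_{xy} & \mathbf{Q}_{xz}\\
\mathbf{Q}_{yx} & \mathbf{Q}_{yy} & \mathbf{Q}_{yz}\\
\mathbf{Q}_{zx} & \mathbf{Q}_{zy} & \mathbf{Q}_{zz}
\end{bmatrix}
\begin{bmatrix} a\mathbf{k}\\ b\mathbf{k}\\ c\mathbf{k} \end{bmatrix},
\tag{5.178}
\]

or equivalently

\[
\mathbf{v}_x = (a\mathbf{Q}_{xx} + b\mathbf{Q}_{xy} + c\mathbf{Q}_{xz})\,\mathbf{k},
\tag{5.179}
\]

\[
\mathbf{v}_y = (a\mathbf{Q}_{yx} + b\mathbf{Q}_{yy} + c\mathbf{Q}_{yz})\,\mathbf{k},
\tag{5.180}
\]

and

\[
\mathbf{v}_z = (a\mathbf{Q}_{zx} + b\mathbf{Q}_{zy} + c\mathbf{Q}_{zz})\,\mathbf{k}.
\tag{5.181}
\]

Substituting further $\mathbf{v}_x$, $\mathbf{v}_y$ and $\mathbf{v}_z$ into equation (5.172) gives

\[
\mathbf{W}\mathbf{k} = -(a\mathbf{x}_c + b\mathbf{y}_c + c\mathbf{z}_c + d\,\mathbf{e}).
\tag{5.182}
\]

The auxiliary matrix

\[
\mathbf{W} = (a^0)^2\mathbf{Q}_{xx} + (b^0)^2\mathbf{Q}_{yy} + (c^0)^2\mathbf{Q}_{zz} + a^0 b^0(\mathbf{Q}_{xy} + \mathbf{Q}_{yx}) + a^0 c^0(\mathbf{Q}_{xz} + \mathbf{Q}_{zx}) + b^0 c^0(\mathbf{Q}_{yz} + \mathbf{Q}_{zy})
\tag{5.183}
\]

can be constructed after introducing approximate values for parameters $a^0$, $b^0$ and $c^0$. In case of regular cofactor matrices, matrix $\mathbf{W}$ is also regular and invertible. Thus, a solution for the vector of Lagrange multipliers can be obtained by

\[
\mathbf{k} = -\mathbf{W}^{-1}(a\mathbf{x}_c + b\mathbf{y}_c + c\mathbf{z}_c + d\,\mathbf{e}).
\tag{5.184}
\]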


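Introducing vector $\mathbf{k}$ into the normal equation (5.176) yields the expression

\[
\begin{aligned}
& \mathbf{e}^T\mathbf{k} = 0\\
\Rightarrow\ & -\mathbf{e}^T\mathbf{W}^{-1}(a\mathbf{x}_c + b\mathbf{y}_c + c\mathbf{z}_c + d\,\mathbf{e}) = 0\\
\Rightarrow\ & d = -a\left[ (\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{x}_c \right] - b\left[ (\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{y}_c \right] - c\left[ (\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{z}_c \right].
\end{aligned}
\tag{5.185}
\]

The solution for the unknown plane parameters $a$, $b$ and $c$ can be computed by analyzing further the normal equations (5.173)-(5.175). Taking into account vector $\mathbf{k}$, as well as the residual vectors $\mathbf{v}_x$, $\mathbf{v}_y$ and $\mathbf{v}_z$, yields the reduced system of equations

\[
-\mathbf{x}_c^T\mathbf{W}^{-1}(a\mathbf{x}_c + b\mathbf{y}_c + c\mathbf{z}_c + d\,\mathbf{e}) + \mathbf{k}^T(a\mathbf{Q}_{xx} + b\mathbf{Q}_{xy} + c\mathbf{Q}_{xz})\,\mathbf{k} = 0,
\tag{5.186}
\]

\[
-\mathbf{y}_c^T\mathbf{W}^{-1}(a\mathbf{x}_c + b\mathbf{y}_c + c\mathbf{z}_c + d\,\mathbf{e}) + \mathbf{k}^T(a\mathbf{Q}_{yx} + b\mathbf{Q}_{yy} + c\mathbf{Q}_{yz})\,\mathbf{k} = 0,
\tag{5.187}
\]

and

\[
-\mathbf{z}_c^T\mathbf{W}^{-1}(a\mathbf{x}_c + b\mathbf{y}_c + c\mathbf{z}_c + d\,\mathbf{e}) + \mathbf{k}^T(a\mathbf{Q}_{zx} + b\mathbf{Q}_{zy} + c\mathbf{Q}_{zz})\,\mathbf{k} = 0.
\tag{5.188}
\]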

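Introducing parameter $d$ from (5.185) into these three equations results in

\[
a f_1 + b f_2 + c f_3 = 0,
\tag{5.189}
\]

\[
b f_4 + a f_2 + c f_5 = 0
\tag{5.190}
\]

and

\[
c f_6 + a f_3 + b f_5 = 0,
\tag{5.191}
\]

with the defined parameters

\[
\begin{aligned}
f_1 &= \mathbf{k}^T\mathbf{Q}_{xx}\mathbf{k} - \mathbf{x}_c^T\mathbf{W}^{-1}\mathbf{x}_c + \mathbf{x}_c^T\mathbf{W}^{-1}\mathbf{e}\,(\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{x}_c,\\
f_2 &= \mathbf{k}^T\mathbf{Q}_{xy}\mathbf{k} - \mathbf{x}_c^T\mathbf{W}^{-1}\mathbf{y}_c + \mathbf{y}_c^T\mathbf{W}^{-1}\mathbf{e}\,(\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{x}_c,\\
f_3 &= \mathbf{k}^T\mathbf{Q}_{xz}\mathbf{k} - \mathbf{x}_c^T\mathbf{W}^{-1}\mathbf{z}_c + \mathbf{z}_c^T\mathbf{W}^{-1}\mathbf{e}\,(\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{x}_c,\\
f_4 &= \mathbf{k}^T\mathbf{Q}_{yy}\mathbf{k} - \mathbf{y}_c^T\mathbf{W}^{-1}\mathbf{y}_c + \mathbf{y}_c^T\mathbf{W}^{-1}\mathbf{e}\,(\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{y}_c,\\
f_5 &= \mathbf{k}^T\mathbf{Q}_{yz}\mathbf{k} - \mathbf{y}_c^T\mathbf{W}^{-1}\mathbf{z}_c + \mathbf{z}_c^T\mathbf{W}^{-1}\mathbf{e}\,(\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{y}_c,\\
f_6 &= \mathbf{k}^T\mathbf{Q}_{zz}\mathbf{k} - \mathbf{z}_c^T\mathbf{W}^{-1}\mathbf{z}_c + \mathbf{z}_c^T\mathbf{W}^{-1}\mathbf{e}\,(\mathbf{e}^T\mathbf{W}^{-1}\mathbf{e})^{-1}\mathbf{e}^T\mathbf{W}^{-1}\mathbf{z}_c.
\end{aligned}
\tag{5.192}
\]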


Equations (5.189) - (5.191) form a system of pseudo-linear equations with three unknown parameters. A restriction between the plane parameters can be chosen at this point. Selecting the restriction of $c = 1$, the remaining unknown plane parameters are

\[
\hat{a} = \frac{f_2 f_5 - f_3 f_4}{f_1 f_4 - f_2^2}
\tag{5.193}
\]

and

\[
\hat{b} = -\frac{\hat{a} f_2 + f_5}{f_4}.
\tag{5.194}
\]

An iterative procedure for estimating the unknown plane parameters, based on the presented approach, can be found in Algorithm 5.

Algorithm 5 Least squares fitting of a plane in 3D with general weights and correlations
1: Choose approximate values for $a^0$ and $b^0$.
2: Define parameter $c = 1$.
3: Set threshold $\varepsilon$ for the break-off condition of the iteration process.
4: Set parameters $d_a = |\hat{a} - a^0| = \infty$ and $d_b = |\hat{b} - b^0| = \infty$, for entering the iteration process.
5: while $d_a > \varepsilon$ or $d_b > \varepsilon$ do
6: Compute the auxiliary matrix $\mathbf{W}$, and the vector of Lagrange multipliers $\mathbf{k}$.
7: Estimate parameter $\hat{d}$.
8: Compute the coefficients $f_1, f_2, \ldots, f_5$.
9: Estimate parameters $\hat{a}$ and $\hat{b}$.
10: Compute parameters $d_a = |\hat{a} - a^0|$ and $d_b = |\hat{b} - b^0|$.
11: Update the approximate values with the estimated ones, $a^0 = \hat{a}$ and $b^0 = \hat{b}$.
12: end while
13: return $\hat{a}$, $\hat{b}$ and $\hat{d}$, with $c = 1$.
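The iteration of Algorithm 5 can be sketched with NumPy as shown below. The function name `fit_plane_correlated`, the layout of the full cofactor matrix `Q` (blocks ordered as x, y, z) and the test data are assumptions of this sketch. Since $\mathbf{k}$ in (5.184) depends on $d$, the sketch evaluates (5.185) before (5.184) within each iteration.

```python
import numpy as np

def fit_plane_correlated(xc, yc, zc, Q, a0, b0, eps=1e-12, max_iter=200):
    """Pseudo-linear iteration for a*x + b*y + z + d = 0 with a full, regular Q."""
    n = xc.size
    e = np.ones(n)
    Qxx, Qxy, Qxz = Q[:n, :n], Q[:n, n:2 * n], Q[:n, 2 * n:]
    Qyy, Qyz, Qzz = Q[n:2 * n, n:2 * n], Q[n:2 * n, 2 * n:], Q[2 * n:, 2 * n:]
    L = np.column_stack([xc, yc, zc, e])

    def multipliers(a, b):
        W = (a * a * Qxx + b * b * Qyy + Qzz + a * b * (Qxy + Qxy.T)
             + a * (Qxz + Qxz.T) + b * (Qyz + Qyz.T))            # (5.183), c = 1
        G = np.linalg.solve(W, L)      # columns: W^-1 x_c, W^-1 y_c, W^-1 z_c, W^-1 e
        s = e @ G[:, 3]
        d = -(a * (e @ G[:, 0]) + b * (e @ G[:, 1]) + e @ G[:, 2]) / s   # (5.185)
        k = -(a * G[:, 0] + b * G[:, 1] + G[:, 2] + d * G[:, 3])         # (5.184)
        return G, s, d, k

    a, b = float(a0), float(b0)
    for _ in range(max_iter):
        G, s, d, k = multipliers(a, b)
        ex, ey, ez = e @ G[:, 0], e @ G[:, 1], e @ G[:, 2]
        f1 = k @ Qxx @ k - xc @ G[:, 0] + ex * ex / s            # (5.192)
        f2 = k @ Qxy @ k - xc @ G[:, 1] + ey * ex / s
        f3 = k @ Qxz @ k - xc @ G[:, 2] + ez * ex / s
        f4 = k @ Qyy @ k - yc @ G[:, 1] + ey * ey / s
        f5 = k @ Qyz @ k - yc @ G[:, 2] + ez * ey / s
        a_new = (f2 * f5 - f3 * f4) / (f1 * f4 - f2 * f2)        # (5.193)
        b_new = -(a_new * f2 + f5) / f4                          # (5.194)
        da, db = abs(a_new - a), abs(b_new - b)
        a, b = a_new, b_new
        if da <= eps and db <= eps:
            break
    _, _, d, _ = multipliers(a, b)     # d consistent with the final a, b
    return a, b, d

# Hypothetical data with Q = I, so that the result can be compared with the
# eigenvector solution of the equally weighted, uncorrelated problem.
rng = np.random.default_rng(3)
n = 40
xc = rng.uniform(-5.0, 5.0, n) + rng.normal(0.0, 0.01, n)
yc = rng.uniform(-5.0, 5.0, n) + rng.normal(0.0, 0.01, n)
zc = 0.3 * xc - 0.5 * yc + 2.0 + rng.normal(0.0, 0.01, n)
a_hat, b_hat, d_hat = fit_plane_correlated(xc, yc, zc, np.eye(3 * n), -0.25, 0.45)
```

The linear systems with $\mathbf{W}$ are solved with `np.linalg.solve` instead of forming $\mathbf{W}^{-1}$ explicitly, which is a numerical design choice of the sketch and not part of the derivation.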

Solution for singular cofactor matrices

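An iterative solution for the case of singular cofactor matrices is possible also for this adjustment problem, following the same procedure that has been presented in subsection 5.2.4.2. Therefore, putting together equation (5.182) with the normal equations (5.173)-(5.176), results in the system of nonlinear equations

\[
\begin{aligned}
\mathbf{W}\mathbf{k} &= -(a\mathbf{x}_c + b\mathbf{y}_c + c\mathbf{z}_c + d\,\mathbf{e}),\\
\mathbf{k}^T(\mathbf{x}_c + \mathbf{v}_x) &= 0,\\
\mathbf{k}^T(\mathbf{y}_c + \mathbf{v}_y) &= 0,\\
\mathbf{k}^T(\mathbf{z}_c + \mathbf{v}_z) &= 0,\\
\mathbf{k}^T\mathbf{e} &= 0.
\end{aligned}
\tag{5.195}
\]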

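Furthermore, selecting $c = 1$ as a meaningful restriction between the unknown parameters and introducing approximate values for the residual vectors $\mathbf{v}_x^0$ and $\mathbf{v}_y^0$, the last equation system can be expressed by⁶

\[
\begin{bmatrix}
\mathbf{W} & \mathbf{x}_c & \mathbf{y}_c & \mathbf{e}\\
(\mathbf{x}_c + \mathbf{v}_x^0)^T & 0 & 0 & 0\\
(\mathbf{y}_c + \mathbf{v}_y^0)^T & 0 & 0 & 0\\
\mathbf{e}^T & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} \mathbf{k}\\ a\\ b\\ d \end{bmatrix}
=
\begin{bmatrix} -\mathbf{z}_c\\ 0\\ 0\\ 0 \end{bmatrix}.
\tag{5.196}
\]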

This can be equivalently written as

\[
\mathbf{N}\begin{bmatrix} \mathbf{k}\\ \mathbf{X} \end{bmatrix} = \mathbf{n},
\tag{5.197}
\]

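after introducing matrices

\[
\mathbf{N} =
\begin{bmatrix}
\mathbf{W} & \mathbf{x}_c & \mathbf{y}_c & \mathbf{e}\\
(\mathbf{x}_c + \mathbf{v}_x^0)^T & 0 & 0 & 0\\
(\mathbf{y}_c + \mathbf{v}_y^0)^T & 0 & 0 & 0\\
\mathbf{e}^T & 0 & 0 & 0
\end{bmatrix},
\qquad
\mathbf{n} = \begin{bmatrix} -\mathbf{z}_c\\ 0\\ 0\\ 0 \end{bmatrix}
\tag{5.198}
\]

and the vector of unknown parameters

\[
\mathbf{X} = \begin{bmatrix} a\\ b\\ d \end{bmatrix}.
\tag{5.199}
\]

A solution of this adjustment problem can be derived by

\[
\begin{bmatrix} \hat{\mathbf{k}}\\ \hat{\mathbf{X}} \end{bmatrix} = \mathbf{N}^{-1}\mathbf{n}.
\tag{5.200}
\]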

Solution with a symmetric normal matrix N

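Similarly to the procedure of subsection 5.2.4.2, an equivalent solution of the problem using a symmetric matrix $\mathbf{N}$ can be obtained by adding the term $a\mathbf{v}_x + b\mathbf{v}_y$ to both sides of (5.182). Therefore, the equation system (5.196) can be expressed as

\[
\begin{bmatrix} \mathbf{W} & \mathbf{A}\\ \mathbf{A}^T & \mathbf{0} \end{bmatrix}
\begin{bmatrix} \mathbf{k}\\ \mathbf{X} \end{bmatrix}
=
\begin{bmatrix} \mathbf{w}\\ \mathbf{0} \end{bmatrix},
\tag{5.201}
\]

with matrix

\[
\mathbf{A} = \begin{bmatrix} \mathbf{x}_c + \mathbf{v}_x^0, & \mathbf{y}_c + \mathbf{v}_y^0, & \mathbf{e} \end{bmatrix}
\tag{5.202}
\]

and vector

\[
\mathbf{w} = -\mathbf{z}_c + a^0\mathbf{v}_x^0 + b^0\mathbf{v}_y^0.
\tag{5.203}
\]

A solution of the adjustment problem can be obtained by equation (5.200), after introducing

\[
\mathbf{N} = \begin{bmatrix} \mathbf{W} & \mathbf{A}\\ \mathbf{A}^T & \mathbf{0} \end{bmatrix},
\qquad
\mathbf{n} = \begin{bmatrix} \mathbf{w}\\ \mathbf{0} \end{bmatrix},
\tag{5.204}
\]

with $\mathbf{N}$ being symmetric.

⁶ The fourth equation, $\mathbf{k}^T(\mathbf{z}_c + \mathbf{v}_z) = 0$, is not taken into account as parameter $c$ is treated as known.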

The inversion of matrix $\mathbf{N}$ depends on the rank deficiency of matrix $\mathbf{W}$, which arises when a singular cofactor matrix is employed. Similar to the adjustment problem from subsection 5.2.4.2, the presented criterion (5.102) will ensure the existence of a unique solution.

An iterative procedure is presented in Algorithm 6 for estimating the plane parameters when singular cofactor matrices are given.

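Algorithm 6 Least squares fitting of a plane in 3D with singular cofactor matrices
1: Choose approximate values for $a^0$, $b^0$, $\mathbf{v}_x^0$, $\mathbf{v}_y^0$ and $\mathbf{v}_z^0$.
2: Define parameter $c = 1$.
3: Set threshold $\varepsilon$ for the break-off condition of the iteration process.
4: Define parameters $d_a = |\hat{a} - a^0| = \infty$ and $d_b = |\hat{b} - b^0| = \infty$, for entering the iteration process.
5: while $d_a > \varepsilon$ or $d_b > \varepsilon$ do
6: Compute matrices $\mathbf{W}$, $\mathbf{A}$ and vector $\mathbf{w}$.
7: Build matrix $\mathbf{N}$ and vector $\mathbf{n}$.
8: Estimate the unknown vector $[\hat{\mathbf{k}}^T, \hat{\mathbf{X}}^T]^T$.
9: Compute the residual vectors $\mathbf{v}_x$, $\mathbf{v}_y$ and $\mathbf{v}_z$.
10: Compute parameters $d_a = |\hat{a} - a^0|$ and $d_b = |\hat{b} - b^0|$.
11: Update the approximate values with the estimated ones, with $a^0 = \hat{a}$, $b^0 = \hat{b}$, $\mathbf{v}_x^0 = \mathbf{v}_x$, $\mathbf{v}_y^0 = \mathbf{v}_y$ and $\mathbf{v}_z^0 = \mathbf{v}_z$.
12: end while
13: return $\hat{a}$, $\hat{b}$ and $\hat{d}$, with $c = 1$.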
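Algorithm 6 can be sketched in NumPy along the same lines as the regular case. The function name `fit_plane_singular`, the block ordering of the full cofactor matrix `Q` (x, y, z) and the example data are assumptions of this sketch; the residual update uses equations (5.179)-(5.181) with $c = 1$.

```python
import numpy as np

def fit_plane_singular(xc, yc, zc, Q, a0, b0, eps=1e-12, max_iter=100):
    """Iterate the symmetric system (5.201) for a*x + b*y + z + d = 0 (c = 1)."""
    n = xc.size
    e = np.ones(n)
    Qxx, Qxy, Qxz = Q[:n, :n], Q[:n, n:2 * n], Q[:n, 2 * n:]
    Qyy, Qyz, Qzz = Q[n:2 * n, n:2 * n], Q[n:2 * n, 2 * n:], Q[2 * n:, 2 * n:]
    a, b = float(a0), float(b0)
    vx, vy, vz = np.zeros(n), np.zeros(n), np.zeros(n)    # approximate residuals
    for _ in range(max_iter):
        W = (a * a * Qxx + b * b * Qyy + Qzz + a * b * (Qxy + Qxy.T)
             + a * (Qxz + Qxz.T) + b * (Qyz + Qyz.T))     # (5.183), c = 1
        A = np.column_stack([xc + vx, yc + vy, e])        # (5.202)
        w = -zc + a * vx + b * vy                         # (5.203)
        N = np.block([[W, A], [A.T, np.zeros((3, 3))]])   # (5.204)
        sol = np.linalg.solve(N, np.concatenate([w, np.zeros(3)]))   # (5.200)
        k = sol[:n]
        a_new, b_new, d = sol[n], sol[n + 1], sol[n + 2]
        vx = (a_new * Qxx + b_new * Qxy + Qxz) @ k        # (5.179)
        vy = (a_new * Qxy.T + b_new * Qyy + Qyz) @ k      # (5.180)
        vz = (a_new * Qxz.T + b_new * Qyz.T + Qzz) @ k    # (5.181)
        da, db = abs(a_new - a), abs(b_new - b)
        a, b = a_new, b_new
        if da <= eps and db <= eps:
            break
    return a, b, d, vx, vy, vz

# Hypothetical data: x and y are error-free and the last point is entirely
# error-free, so both the cofactor matrix and W are singular.
rng = np.random.default_rng(4)
n = 12
xc = rng.uniform(-5.0, 5.0, n)
yc = rng.uniform(-5.0, 5.0, n)
zc = 0.3 * xc - 0.5 * yc + 2.0 + rng.normal(0.0, 0.05, n)
Q = np.zeros((3 * n, 3 * n))
Q[2 * n:, 2 * n:] = np.diag(np.r_[np.ones(n - 1), 0.0])
a_hat, b_hat, d_hat, vx_hat, vy_hat, vz_hat = fit_plane_singular(xc, yc, zc, Q, -0.3, 0.5)
```

In this example the estimated plane has to pass through the error-free point, so its condition equation is fulfilled without any residual.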

5.4 2D similarity transformation of coordinates

This subsection deals with the least squares solution of the 2D similarity transformation of coordinates, where homologous points in two coordinate systems have been measured with different precisions. This problem has been treated partly in (Teunissen, 1985, p. 148 ff.) as “symmetric Helmert transformation”. A direct solution was obtained in terms of an eigenvalue problem, when cofactor matrices with special block diagonal structure were applied to the coordinates of the points in the two coordinate systems. Additionally, Marx (2017) derived direct solutions for “point-wise” and proportional weight matrices in the two coordinate systems.

5.4. 2D similarity transformation of coordinates 129

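The functional model of this problem has been already presented in subsection 4.5, by the system of equations

\[
\begin{aligned}
X_i &= \xi_1 x_i - \xi_2 y_i + t_x,\\
Y_i &= \xi_2 x_i + \xi_1 y_i + t_y.
\end{aligned}
\]

Assuming that all point coordinates are measured quantities, the necessary residuals are introduced in the functional model, resulting in the system of nonlinear condition equations

\[
\begin{aligned}
X_i + v_{X_i} - \xi_1(x_i + v_{x_i}) + \xi_2(y_i + v_{y_i}) - t_x &= 0,\\
Y_i + v_{Y_i} - \xi_2(x_i + v_{x_i}) - \xi_1(y_i + v_{y_i}) - t_y &= 0,
\end{aligned}
\tag{5.205}
\]

with $i = 1, \ldots, n$ indexing the $n$ observed homologous points. This functional model can be expressed equivalently by the overparameterized system

\[
\begin{aligned}
\gamma(X_i + v_{X_i}) + \alpha(x_i + v_{x_i}) - \beta(y_i + v_{y_i}) + t_x &= 0,\\
\gamma(Y_i + v_{Y_i}) + \beta(x_i + v_{x_i}) + \alpha(y_i + v_{y_i}) + t_y &= 0,
\end{aligned}
\tag{5.206}
\]

with

\[
\xi_1 = -\frac{\alpha}{\gamma} \quad\text{and}\quad \xi_2 = -\frac{\beta}{\gamma}.
\tag{5.207}
\]

Following the procedure of subsection 4.5.1, a constraint can be chosen as

\[
\alpha^2 + \beta^2 + \gamma^2 = 1.
\tag{5.208}
\]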

The least squares criterion is utilized for an “optimal” estimate of the unknown transformation parameters, by minimizing the sum of weighted squared residuals

\[
\sum_{i=1}^{n} \left( p_{X_i} v_{X_i}^2 + p_{Y_i} v_{Y_i}^2 + p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 \right) \to \min,
\tag{5.209}
\]

where $p_{X_i}$, $p_{Y_i}$ are the weights influencing the residuals of the coordinates in the target system and $p_{x_i}$, $p_{y_i}$ are the weights in the source system. This adjustment problem is investigated in the next sections for four different weighting cases:

1. Same precision $\sigma_X = \sigma_Y$ for the coordinates in $X$ and $Y$ direction of the points in the target coordinate system. Same precision $\sigma_x = \sigma_y$ for the coordinates in the directions of $x$ and $y$ of the points in the source coordinate system.
2. Individual precision for each pair of homologous points in the two systems: $\sigma_{x_i} = \sigma_{y_i} = \sigma_{X_i} = \sigma_{Y_i}\ \forall i$.
3. Individual precision for each coordinate.
4. Individual precisions and correlations between the measured 2D coordinates in each system.


5.4.1 Weighting case 1 - Equally weighted observations in each coordinate system

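For the first weighting case the measured points in the target system have been observed with the same precision $\sigma_X = \sigma_Y$ and the points in the source system with $\sigma_x = \sigma_y$. Thus, the objective function under minimization becomes

\[
\Omega(v_{X_i}, v_{Y_i}, v_{x_i}, v_{y_i}) = \sum_{i=1}^{n} \left[ p_{XY}\left( v_{X_i}^2 + v_{Y_i}^2 \right) + p_{xy}\left( v_{x_i}^2 + v_{y_i}^2 \right) \right] \to \min,
\tag{5.210}
\]

with

\[
\begin{aligned}
\frac{1}{\sigma_X^2} = \frac{1}{\sigma_Y^2} &= p_{XY} = \text{constant}\ \forall i,\\
\frac{1}{\sigma_x^2} = \frac{1}{\sigma_y^2} &= p_{xy} = \text{constant}\ \forall i.
\end{aligned}
\tag{5.211}
\]

From a geometric perspective the postulated weights can be seen as a homogeneous scale in each coordinate system, according to the respective weight. For obtaining a direct solution, the coordinates are transformed linearly by multiplying them with the square roots of the respective weights

\[
X_i^s = X_i\sqrt{p_{XY}}, \quad Y_i^s = Y_i\sqrt{p_{XY}} \quad\text{and}\quad x_i^s = x_i\sqrt{p_{xy}}, \quad y_i^s = y_i\sqrt{p_{xy}}.
\tag{5.212}
\]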

The scaled coordinates $X_i^s$, $Y_i^s$ from the target coordinate system and $x_i^s$, $y_i^s$ from the source coordinate system can be used to derive the requested transformation parameters with equally weighted observations. In this line of thinking the residuals are

\[
v_{X_i}^s = v_{X_i}\sqrt{p_{XY}}, \quad v_{Y_i}^s = v_{Y_i}\sqrt{p_{XY}} \quad\text{and}\quad v_{x_i}^s = v_{x_i}\sqrt{p_{xy}}, \quad v_{y_i}^s = v_{y_i}\sqrt{p_{xy}}.
\tag{5.213}
\]

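Substituting the scaled coordinates and their residuals from equations (5.212) and (5.213) into the condition equations (5.206) yields

\[
\begin{aligned}
\gamma\frac{1}{\sqrt{p_{XY}}}\left( X_i^s + v_{X_i}^s \right) + \alpha\frac{1}{\sqrt{p_{xy}}}\left( x_i^s + v_{x_i}^s \right) - \beta\frac{1}{\sqrt{p_{xy}}}\left( y_i^s + v_{y_i}^s \right) + t_x &= 0,\\
\gamma\frac{1}{\sqrt{p_{XY}}}\left( Y_i^s + v_{Y_i}^s \right) + \beta\frac{1}{\sqrt{p_{xy}}}\left( x_i^s + v_{x_i}^s \right) + \alpha\frac{1}{\sqrt{p_{xy}}}\left( y_i^s + v_{y_i}^s \right) + t_y &= 0.
\end{aligned}
\tag{5.214}
\]

Introducing the scaled transformation parameters

\[
\alpha^s = \alpha\frac{1}{\sqrt{p_{xy}}}, \quad
\beta^s = \beta\frac{1}{\sqrt{p_{xy}}}, \quad
\gamma^s = \gamma\frac{1}{\sqrt{p_{XY}}}, \quad
t_x^s = t_x, \quad
t_y^s = t_y,
\tag{5.215}
\]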

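into equation (5.214), results in

\[
\begin{aligned}
\gamma^s\left( X_i^s + v_{X_i}^s \right) + \alpha^s\left( x_i^s + v_{x_i}^s \right) - \beta^s\left( y_i^s + v_{y_i}^s \right) + t_x^s &= 0,\\
\gamma^s\left( Y_i^s + v_{Y_i}^s \right) + \beta^s\left( x_i^s + v_{x_i}^s \right) + \alpha^s\left( y_i^s + v_{y_i}^s \right) + t_y^s &= 0.
\end{aligned}
\tag{5.216}
\]

A meaningful constraint between the transformation parameters is

\[
(\alpha^s)^2 + (\beta^s)^2 + (\gamma^s)^2 = 1
\tag{5.217}
\]

and an “optimal” solution is possible by minimizing the sum of scaled squared residuals

\[
\sum_{i=1}^{n} \left[ (v_{X_i}^s)^2 + (v_{Y_i}^s)^2 + (v_{x_i}^s)^2 + (v_{y_i}^s)^2 \right] \to \min.
\tag{5.218}
\]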

Direct least squares solution

The scaling of the measured 2D coordinates leads to the transformation of the original problem into a problem with equal weights. This means that the sum of squared residuals is equal to the sum of squared Euclidean distances between the points in the target system and the transformed points from the source system:

\[
\sum_{i=1}^{n} \left[ (v_{X_i}^s)^2 + (v_{Y_i}^s)^2 + (v_{x_i}^s)^2 + (v_{y_i}^s)^2 \right] = \sum_{i=1}^{n} D_i^2 \to \min.
\tag{5.219}
\]

The squared distances between the homologous points are

\[
D_i^2 = \left( \gamma^s X_i^s + \alpha^s x_i^s - \beta^s y_i^s + t_x^s \right)^2 + \left( \gamma^s Y_i^s + \beta^s x_i^s + \alpha^s y_i^s + t_y^s \right)^2.
\tag{5.220}
\]

A least squares solution for the transformation parameters can be obtained directly, following the procedure of subsection 4.5.

Computation of the transformation parameters in the original coordinate system

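The original transformation parameters can be computed by substituting the estimated scaled transformation parameters into equation (5.215):

\[
\hat{\alpha} = \hat{\alpha}^s\sqrt{p_{xy}}, \quad
\hat{\beta} = \hat{\beta}^s\sqrt{p_{xy}}, \quad
\hat{\gamma} = \hat{\gamma}^s\sqrt{p_{XY}}, \quad
\hat{t}_x = \hat{t}_x^s, \quad
\hat{t}_y = \hat{t}_y^s.
\tag{5.221}
\]

However, the developed solution is restricted to

\[
(\alpha^s)^2 + (\beta^s)^2 + (\gamma^s)^2 = \left( \alpha\frac{1}{\sqrt{p_{xy}}} \right)^2 + \left( \beta\frac{1}{\sqrt{p_{xy}}} \right)^2 + \left( \gamma\frac{1}{\sqrt{p_{XY}}} \right)^2 = 1.
\tag{5.222}
\]

The least squares solution which restricts the transformation parameters to $\alpha^2 + \beta^2 + \gamma^2 = 1$ can be found by

\[
\begin{aligned}
\hat{\alpha} &= \frac{\hat{\alpha}^s\sqrt{p_{xy}}}{\sqrt{(\hat{\alpha}^s\sqrt{p_{xy}})^2 + (\hat{\beta}^s\sqrt{p_{xy}})^2 + (\hat{\gamma}^s\sqrt{p_{XY}})^2}},\\
\hat{\beta} &= \frac{\hat{\beta}^s\sqrt{p_{xy}}}{\sqrt{(\hat{\alpha}^s\sqrt{p_{xy}})^2 + (\hat{\beta}^s\sqrt{p_{xy}})^2 + (\hat{\gamma}^s\sqrt{p_{XY}})^2}},\\
\hat{\gamma} &= \frac{\hat{\gamma}^s\sqrt{p_{XY}}}{\sqrt{(\hat{\alpha}^s\sqrt{p_{xy}})^2 + (\hat{\beta}^s\sqrt{p_{xy}})^2 + (\hat{\gamma}^s\sqrt{p_{XY}})^2}},\\
\hat{t}_x &= \frac{\hat{t}_x^s}{\sqrt{(\hat{\alpha}^s\sqrt{p_{xy}})^2 + (\hat{\beta}^s\sqrt{p_{xy}})^2 + (\hat{\gamma}^s\sqrt{p_{XY}})^2}},\\
\hat{t}_y &= \frac{\hat{t}_y^s}{\sqrt{(\hat{\alpha}^s\sqrt{p_{xy}})^2 + (\hat{\beta}^s\sqrt{p_{xy}})^2 + (\hat{\gamma}^s\sqrt{p_{XY}})^2}}.
\end{aligned}
\tag{5.223}
\]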

5.4.2 Weighting case 2 - Individual weight for each pair of homologous points in both systems

In the second weighting case for the 2D similarity transformation the homologous points in the source and target system have been measured with individual precisions

\[
\sigma_{X_i} = \sigma_{Y_i} = \sigma_{x_i} = \sigma_{y_i}\ \forall i,
\tag{5.224}
\]

with the respective weights

\[
p_{X_i} = p_{Y_i} = p_{x_i} = p_{y_i} = p_i.
\tag{5.225}
\]

A least squares solution involves the minimization of the objective function (5.209), which taking into account the postulated weights becomes

\[
\sum_{i=1}^{n} \left( p_{X_i} v_{X_i}^2 + p_{Y_i} v_{Y_i}^2 + p_{x_i} v_{x_i}^2 + p_{y_i} v_{y_i}^2 \right) = \sum_{i=1}^{n} p_i \left( v_{X_i}^2 + v_{Y_i}^2 + v_{x_i}^2 + v_{y_i}^2 \right) \to \min.
\tag{5.226}
\]

Direct weighted least squares solution

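It has been already shown in section 4.5 that the sum of squared residuals of this problem is equal to the sum of squared Euclidean distances

\[
\sum_{i=1}^{n} D_i^2 = \sum_{i=1}^{n} \left[ \left( \gamma X_i + \alpha x_i - \beta y_i + t_x \right)^2 + \left( \gamma Y_i + \beta x_i + \alpha y_i + t_y \right)^2 \right].
\tag{5.227}
\]

Since all four coordinates of a pair of homologous points share the same weight $p_i$, it is also true that

\[
\sum_{i=1}^{n} p_i \left( v_{X_i}^2 + v_{Y_i}^2 + v_{x_i}^2 + v_{y_i}^2 \right) = \sum_{i=1}^{n} p_i D_i^2.
\tag{5.228}
\]

Therefore, the objective function under minimization can be written as

\[
\Omega(\alpha, \beta, \gamma, t_x, t_y) = \sum_{i=1}^{n} p_i D_i^2 = \sum_{i=1}^{n} p_i \left[ \left( \gamma X_i + \alpha x_i - \beta y_i + t_x \right)^2 + \left( \gamma Y_i + \beta x_i + \alpha y_i + t_y \right)^2 \right].
\tag{5.229}
\]

A meaningful restriction for the unknown parameters is chosen also for this weighting case as

\[
\alpha^2 + \beta^2 + \gamma^2 = 1.
\tag{5.230}
\]

We attempt to obtain the least squares solution for the unknown transformation parameters that minimizes equation (5.229) under the chosen restriction. Thus, the Lagrangian can be written as

\[
K(\alpha, \beta, \gamma, t_x, t_y, k) = \Omega(\alpha, \beta, \gamma, t_x, t_y) - k\left( \alpha^2 + \beta^2 + \gamma^2 - 1 \right),
\tag{5.231}
\]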

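with the normal equations

\[
\frac{\partial K}{\partial \alpha} = 2\left[ \alpha\left( \sum_{i=1}^{n} p_i x_i^2 + \sum_{i=1}^{n} p_i y_i^2 - k \right) + \gamma\left( \sum_{i=1}^{n} p_i x_i X_i + \sum_{i=1}^{n} p_i y_i Y_i \right) + t_x \sum_{i=1}^{n} p_i x_i + t_y \sum_{i=1}^{n} p_i y_i \right] = 0,
\tag{5.232}
\]

\[
\frac{\partial K}{\partial \beta} = 2\left[ \beta\left( \sum_{i=1}^{n} p_i x_i^2 + \sum_{i=1}^{n} p_i y_i^2 - k \right) + \gamma\left( \sum_{i=1}^{n} p_i x_i Y_i - \sum_{i=1}^{n} p_i y_i X_i \right) - t_x \sum_{i=1}^{n} p_i y_i + t_y \sum_{i=1}^{n} p_i x_i \right] = 0,
\tag{5.233}
\]

\[
\begin{aligned}
\frac{\partial K}{\partial \gamma} = 2\Bigg[ & \alpha\left( \sum_{i=1}^{n} p_i x_i X_i + \sum_{i=1}^{n} p_i y_i Y_i \right) + \beta\left( \sum_{i=1}^{n} p_i x_i Y_i - \sum_{i=1}^{n} p_i y_i X_i \right) + \gamma\left( \sum_{i=1}^{n} p_i X_i^2 + \sum_{i=1}^{n} p_i Y_i^2 - k \right) \\
& + t_x \sum_{i=1}^{n} p_i X_i + t_y \sum_{i=1}^{n} p_i Y_i \Bigg] = 0,
\end{aligned}
\tag{5.234}
\]

\[
\frac{\partial K}{\partial t_x} = 2\left( t_x \sum_{i=1}^{n} p_i + \alpha \sum_{i=1}^{n} p_i x_i - \beta \sum_{i=1}^{n} p_i y_i + \gamma \sum_{i=1}^{n} p_i X_i \right) = 0,
\tag{5.235}
\]

\[
\frac{\partial K}{\partial t_y} = 2\left( t_y \sum_{i=1}^{n} p_i + \alpha \sum_{i=1}^{n} p_i y_i + \beta \sum_{i=1}^{n} p_i x_i + \gamma \sum_{i=1}^{n} p_i Y_i \right) = 0
\tag{5.236}
\]

and

\[
\frac{\partial K}{\partial k} = -\left( \alpha^2 + \beta^2 + \gamma^2 - 1 \right) = 0.
\tag{5.237}
\]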

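A solution for the translation parameters can be derived by rearranging equations (5.235) and (5.236), which yields

\[ t_x = -\alpha \frac{\sum_{i=1}^{n} p_i x_i}{\sum_{i=1}^{n} p_i} + \beta \frac{\sum_{i=1}^{n} p_i y_i}{\sum_{i=1}^{n} p_i} - \gamma \frac{\sum_{i=1}^{n} p_i X_i}{\sum_{i=1}^{n} p_i} \tag{5.238} \]

and

\[ t_y = -\alpha \frac{\sum_{i=1}^{n} p_i y_i}{\sum_{i=1}^{n} p_i} - \beta \frac{\sum_{i=1}^{n} p_i x_i}{\sum_{i=1}^{n} p_i} - \gamma \frac{\sum_{i=1}^{n} p_i Y_i}{\sum_{i=1}^{n} p_i}. \tag{5.239} \]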

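Substituting $t_x$ and $t_y$ into equations (5.232) - (5.234) results in the system of reduced normal equations

\[ \alpha\left( \sum_{i=1}^{n} p_i x_i^2 + \sum_{i=1}^{n} p_i y_i^2 - \frac{\left(\sum_{i=1}^{n} p_i x_i\right)^2}{\sum_{i=1}^{n} p_i} - \frac{\left(\sum_{i=1}^{n} p_i y_i\right)^2}{\sum_{i=1}^{n} p_i} - k \right) + \gamma\left( \sum_{i=1}^{n} p_i x_i X_i + \sum_{i=1}^{n} p_i y_i Y_i - \frac{\sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i X_i}{\sum_{i=1}^{n} p_i} - \frac{\sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i Y_i}{\sum_{i=1}^{n} p_i} \right) = 0, \tag{5.240} \]

\[ \beta\left( \sum_{i=1}^{n} p_i x_i^2 + \sum_{i=1}^{n} p_i y_i^2 - \frac{\left(\sum_{i=1}^{n} p_i x_i\right)^2}{\sum_{i=1}^{n} p_i} - \frac{\left(\sum_{i=1}^{n} p_i y_i\right)^2}{\sum_{i=1}^{n} p_i} - k \right) + \gamma\left( \sum_{i=1}^{n} p_i x_i Y_i - \sum_{i=1}^{n} p_i y_i X_i + \frac{\sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i X_i}{\sum_{i=1}^{n} p_i} - \frac{\sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i Y_i}{\sum_{i=1}^{n} p_i} \right) = 0, \tag{5.241} \]

and

\[ \alpha\left( \sum_{i=1}^{n} p_i x_i X_i + \sum_{i=1}^{n} p_i y_i Y_i - \frac{\sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i X_i}{\sum_{i=1}^{n} p_i} - \frac{\sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i Y_i}{\sum_{i=1}^{n} p_i} \right) + \beta\left( \sum_{i=1}^{n} p_i x_i Y_i - \sum_{i=1}^{n} p_i y_i X_i + \frac{\sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i X_i}{\sum_{i=1}^{n} p_i} - \frac{\sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i Y_i}{\sum_{i=1}^{n} p_i} \right) + \gamma\left( \sum_{i=1}^{n} p_i X_i^2 + \sum_{i=1}^{n} p_i Y_i^2 - \frac{\left(\sum_{i=1}^{n} p_i X_i\right)^2}{\sum_{i=1}^{n} p_i} - \frac{\left(\sum_{i=1}^{n} p_i Y_i\right)^2}{\sum_{i=1}^{n} p_i} - k \right) = 0. \tag{5.242} \]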

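The Lagrange multiplier $k$ can be calculated by solving

\[ \begin{vmatrix} (v_1 - k) & w_1 & w_2 \\ w_1 & (v_1 - k) & w_3 \\ w_2 & w_3 & (v_2 - k) \end{vmatrix} = 0, \tag{5.243} \]

which is a cubic characteristic equation with the unknown parameter $k$.

The respective elements are

\[ \begin{aligned}
v_1 &= \sum_{i=1}^{n} p_i x_i^2 + \sum_{i=1}^{n} p_i y_i^2 - \frac{\left(\sum_{i=1}^{n} p_i x_i\right)^2}{\sum_{i=1}^{n} p_i} - \frac{\left(\sum_{i=1}^{n} p_i y_i\right)^2}{\sum_{i=1}^{n} p_i}, \\
v_2 &= \sum_{i=1}^{n} p_i X_i^2 + \sum_{i=1}^{n} p_i Y_i^2 - \frac{\left(\sum_{i=1}^{n} p_i X_i\right)^2}{\sum_{i=1}^{n} p_i} - \frac{\left(\sum_{i=1}^{n} p_i Y_i\right)^2}{\sum_{i=1}^{n} p_i}, \\
w_1 &= 0, \\
w_2 &= \sum_{i=1}^{n} p_i x_i X_i + \sum_{i=1}^{n} p_i y_i Y_i - \frac{\sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i X_i}{\sum_{i=1}^{n} p_i} - \frac{\sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i Y_i}{\sum_{i=1}^{n} p_i}, \\
w_3 &= \sum_{i=1}^{n} p_i x_i Y_i - \sum_{i=1}^{n} p_i y_i X_i + \frac{\sum_{i=1}^{n} p_i y_i \sum_{i=1}^{n} p_i X_i}{\sum_{i=1}^{n} p_i} - \frac{\sum_{i=1}^{n} p_i x_i \sum_{i=1}^{n} p_i Y_i}{\sum_{i=1}^{n} p_i}.
\end{aligned} \tag{5.244} \]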


The transformation parameters $\alpha$, $\beta$ and $\gamma$ can be estimated either by substituting the parameter $k_{\min}$ into the reduced normal equations (5.240) - (5.242), or by transforming these equations and solving an eigenvalue problem.
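The eigenvalue route can be sketched in a few lines. The following is a minimal illustration, not the author's implementation: it assumes NumPy, assembles the symmetric 3x3 matrix whose determinant appears in (5.243) from the elements (5.244), takes the eigenvector of the smallest eigenvalue as (α, β, γ), and recovers the translations from (5.238) and (5.239). The function name `direct_similarity_2d` and its return layout are hypothetical.

```python
import numpy as np

def direct_similarity_2d(x, y, X, Y, p):
    """Direct weighted solution with one weight p_i per pair of homologous points."""
    x, y, X, Y, p = (np.asarray(a, dtype=float) for a in (x, y, X, Y, p))
    sp = p.sum()
    sx, sy, sX, sY = p @ x, p @ y, p @ X, p @ Y  # weighted coordinate sums

    # Elements of the determinant (5.243) as listed in (5.244); w1 = 0.
    v1 = p @ (x * x) + p @ (y * y) - sx**2 / sp - sy**2 / sp
    v2 = p @ (X * X) + p @ (Y * Y) - sX**2 / sp - sY**2 / sp
    w2 = p @ (x * X) + p @ (y * Y) - sx * sX / sp - sy * sY / sp
    w3 = p @ (x * Y) - p @ (y * X) + sy * sX / sp - sx * sY / sp

    M = np.array([[v1, 0.0, w2],
                  [0.0, v1, w3],
                  [w2, w3, v2]])

    # eigh returns eigenvalues in ascending order with unit-length eigenvectors,
    # so the first column satisfies alpha^2 + beta^2 + gamma^2 = 1.
    eigvals, eigvecs = np.linalg.eigh(M)
    k_min = eigvals[0]
    alpha, beta, gamma = eigvecs[:, 0]

    # Translations from (5.238) and (5.239).
    t_x = (-alpha * sx + beta * sy - gamma * sX) / sp
    t_y = (-alpha * sy - beta * sx - gamma * sY) / sp
    return alpha, beta, gamma, t_x, t_y, k_min
```

Since the model reads γX + αx − βy + t_x = 0, the usual similarity parameters follow as ratios, for example ξ1 = −α/γ and ξ2 = −β/γ, which are unaffected by the sign ambiguity of the eigenvector.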

5.4.3 Weighting case 3 - Individually weighted coordinates

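In the third investigated weighting case for the 2D similarity transformation, the coordinates of the points have been measured with different precisions, leading to individual weights for the residuals of each coordinate. The variances of the observed coordinates in the source system can be stored in the variance-covariance matrix

\[ \boldsymbol{\Sigma}_{LL_1} = \begin{bmatrix} \boldsymbol{\Sigma}_{xx} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_{yy} \end{bmatrix} = \mathbf{Q}_{LL_1}, \quad \text{for } \sigma_0^2 = 1, \tag{5.245} \]

with the sub-matrices

\[ \boldsymbol{\Sigma}_{xx} = \begin{bmatrix} \sigma_{x_1}^2 & & & 0 \\ & \sigma_{x_2}^2 & & \\ & & \ddots & \\ 0 & & & \sigma_{x_n}^2 \end{bmatrix} \quad \text{and} \quad \boldsymbol{\Sigma}_{yy} = \begin{bmatrix} \sigma_{y_1}^2 & & & 0 \\ & \sigma_{y_2}^2 & & \\ & & \ddots & \\ 0 & & & \sigma_{y_n}^2 \end{bmatrix}. \tag{5.246} \]

Selecting the variance of the unit weight equal to one, the cofactor matrix of the coordinates in the source system is equal to the variance-covariance matrix ($\mathbf{Q}_{LL_1} = \boldsymbol{\Sigma}_{LL_1}$) and the respective weight matrix is

\[ \mathbf{P}_1 = \mathbf{Q}_{LL_1}^{-1} = \begin{bmatrix} \mathbf{P}_{xx} & \mathbf{0} \\ \mathbf{0} & \mathbf{P}_{yy} \end{bmatrix}, \tag{5.247} \]

with the weight sub-matrices being expressed by

\[ \mathbf{P}_{xx} = \begin{bmatrix} p_{x_1} & & & 0 \\ & p_{x_2} & & \\ & & \ddots & \\ 0 & & & p_{x_n} \end{bmatrix} \quad \text{and} \quad \mathbf{P}_{yy} = \begin{bmatrix} p_{y_1} & & & 0 \\ & p_{y_2} & & \\ & & \ddots & \\ 0 & & & p_{y_n} \end{bmatrix}. \tag{5.248} \]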

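Analogously, the variance-covariance matrix of the observed coordinates in the target system can be written as

\[ \boldsymbol{\Sigma}_{LL_2} = \begin{bmatrix} \boldsymbol{\Sigma}_{XX} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_{YY} \end{bmatrix} = \mathbf{Q}_{LL_2}, \quad \text{for } \sigma_0^2 = 1, \tag{5.249} \]

with

\[ \boldsymbol{\Sigma}_{XX} = \begin{bmatrix} \sigma_{X_1}^2 & & & 0 \\ & \sigma_{X_2}^2 & & \\ & & \ddots & \\ 0 & & & \sigma_{X_n}^2 \end{bmatrix} \quad \text{and} \quad \boldsymbol{\Sigma}_{YY} = \begin{bmatrix} \sigma_{Y_1}^2 & & & 0 \\ & \sigma_{Y_2}^2 & & \\ & & \ddots & \\ 0 & & & \sigma_{Y_n}^2 \end{bmatrix}. \tag{5.250} \]

Thus, the weight matrix for the coordinates in the target system is

\[ \mathbf{P}_2 = \mathbf{Q}_{LL_2}^{-1} = \begin{bmatrix} \mathbf{P}_{XX} & \mathbf{0} \\ \mathbf{0} & \mathbf{P}_{YY} \end{bmatrix}, \tag{5.251} \]

with the weight sub-matrices

\[ \mathbf{P}_{XX} = \begin{bmatrix} p_{X_1} & & & 0 \\ & p_{X_2} & & \\ & & \ddots & \\ 0 & & & p_{X_n} \end{bmatrix} \quad \text{and} \quad \mathbf{P}_{YY} = \begin{bmatrix} p_{Y_1} & & & 0 \\ & p_{Y_2} & & \\ & & \ddots & \\ 0 & & & p_{Y_n} \end{bmatrix}. \tag{5.252} \]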

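The nonlinear condition equations (5.205) can be equivalently written in vector notation as

\[ \begin{aligned}
\mathbf{X}_c + \mathbf{v}_X - \xi_1(\mathbf{x}_c + \mathbf{v}_x) + \xi_2(\mathbf{y}_c + \mathbf{v}_y) - t_x\mathbf{e} &= \mathbf{0}, \\
\mathbf{Y}_c + \mathbf{v}_Y - \xi_2(\mathbf{x}_c + \mathbf{v}_x) - \xi_1(\mathbf{y}_c + \mathbf{v}_y) - t_y\mathbf{e} &= \mathbf{0},
\end{aligned} \tag{5.253} \]

with vectors $\mathbf{X}_c$, $\mathbf{Y}_c$ listing the coordinates of the points in the target system and $\mathbf{x}_c$, $\mathbf{y}_c$ the coordinates in the source system:

\[ \mathbf{X}_c = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}, \quad \mathbf{Y}_c = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad \mathbf{x}_c = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \mathbf{y}_c = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \tag{5.254} \]

while vectors $\mathbf{v}_X$, $\mathbf{v}_Y$, $\mathbf{v}_x$ and $\mathbf{v}_y$ contain the residuals of the corresponding coordinates

\[ \mathbf{v}_X = \begin{bmatrix} v_{X_1} \\ v_{X_2} \\ \vdots \\ v_{X_n} \end{bmatrix}, \quad \mathbf{v}_Y = \begin{bmatrix} v_{Y_1} \\ v_{Y_2} \\ \vdots \\ v_{Y_n} \end{bmatrix}, \quad \mathbf{v}_x = \begin{bmatrix} v_{x_1} \\ v_{x_2} \\ \vdots \\ v_{x_n} \end{bmatrix}, \quad \mathbf{v}_y = \begin{bmatrix} v_{y_1} \\ v_{y_2} \\ \vdots \\ v_{y_n} \end{bmatrix}. \tag{5.255} \]

$\mathbf{e}$ is a vector of ones whose length equals the number of homologous points. A solution based on the least squares principle can be obtained by minimizing the objective function (5.209), written in matrix notation as

\[ \Omega(\mathbf{v}_X, \mathbf{v}_Y, \mathbf{v}_x, \mathbf{v}_y) = \mathbf{v}_X^T\mathbf{P}_{XX}\mathbf{v}_X + \mathbf{v}_Y^T\mathbf{P}_{YY}\mathbf{v}_Y + \mathbf{v}_x^T\mathbf{P}_{xx}\mathbf{v}_x + \mathbf{v}_y^T\mathbf{P}_{yy}\mathbf{v}_y. \tag{5.256} \]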


Iterative least squares solution without linearization

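Utilizing the nonlinear functional model of equation (5.253) and the objective function (5.256), a least squares solution can be derived by minimizing the Lagrange function

\[ \begin{aligned}
K(\xi_1, \xi_2, t_x, t_y, \mathbf{v}_X, \mathbf{v}_Y, \mathbf{v}_x, \mathbf{v}_y, \mathbf{k}_1, \mathbf{k}_2) = {} & \Omega(\mathbf{v}_X, \mathbf{v}_Y, \mathbf{v}_x, \mathbf{v}_y) \\
& - 2\mathbf{k}_1^T\left[ -(\mathbf{X}_c + \mathbf{v}_X) + \xi_1(\mathbf{x}_c + \mathbf{v}_x) - \xi_2(\mathbf{y}_c + \mathbf{v}_y) + t_x\mathbf{e} \right] \\
& - 2\mathbf{k}_2^T\left[ -(\mathbf{Y}_c + \mathbf{v}_Y) + \xi_2(\mathbf{x}_c + \mathbf{v}_x) + \xi_1(\mathbf{y}_c + \mathbf{v}_y) + t_y\mathbf{e} \right],
\end{aligned} \tag{5.257} \]

with $\mathbf{k}_1$ and $\mathbf{k}_2$ denoting the vectors of Lagrange multipliers. A linearization of the problem is avoided here as well. Differentiating the Lagrangian with respect to all unknowns and setting the partial derivatives to zero yields

\[ \frac{\partial K}{\partial \mathbf{v}_X^T} = 2\,(\mathbf{P}_{XX}\mathbf{v}_X + \mathbf{k}_1) = \mathbf{0}, \tag{5.258} \]

\[ \frac{\partial K}{\partial \mathbf{v}_Y^T} = 2\,(\mathbf{P}_{YY}\mathbf{v}_Y + \mathbf{k}_2) = \mathbf{0}, \tag{5.259} \]

\[ \frac{\partial K}{\partial \mathbf{v}_x^T} = 2\,(\mathbf{P}_{xx}\mathbf{v}_x - \xi_1\mathbf{k}_1 - \xi_2\mathbf{k}_2) = \mathbf{0}, \tag{5.260} \]

\[ \frac{\partial K}{\partial \mathbf{v}_y^T} = 2\,(\mathbf{P}_{yy}\mathbf{v}_y + \xi_2\mathbf{k}_1 - \xi_1\mathbf{k}_2) = \mathbf{0}, \tag{5.261} \]

\[ \frac{\partial K}{\partial \mathbf{k}_1^T} = -2\left[ -(\mathbf{X}_c + \mathbf{v}_X) + \xi_1(\mathbf{x}_c + \mathbf{v}_x) - \xi_2(\mathbf{y}_c + \mathbf{v}_y) + t_x\mathbf{e} \right] = \mathbf{0}, \tag{5.262} \]

\[ \frac{\partial K}{\partial \mathbf{k}_2^T} = -2\left[ -(\mathbf{Y}_c + \mathbf{v}_Y) + \xi_2(\mathbf{x}_c + \mathbf{v}_x) + \xi_1(\mathbf{y}_c + \mathbf{v}_y) + t_y\mathbf{e} \right] = \mathbf{0}, \tag{5.263} \]

\[ \frac{\partial K}{\partial \xi_1} = -2\left[ \mathbf{k}_1^T(\mathbf{x}_c + \mathbf{v}_x) + \mathbf{k}_2^T(\mathbf{y}_c + \mathbf{v}_y) \right] = 0, \tag{5.264} \]

\[ \frac{\partial K}{\partial \xi_2} = -2\left[ -\mathbf{k}_1^T(\mathbf{y}_c + \mathbf{v}_y) + \mathbf{k}_2^T(\mathbf{x}_c + \mathbf{v}_x) \right] = 0, \tag{5.265} \]

\[ \frac{\partial K}{\partial t_x} = -2\,\mathbf{k}_1^T\mathbf{e} = 0 \tag{5.266} \]

and

\[ \frac{\partial K}{\partial t_y} = -2\,\mathbf{k}_2^T\mathbf{e} = 0. \tag{5.267} \]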

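Equations (5.258)-(5.267) represent a nonlinear system of $6n + 4$ equations. Substituting the residuals from (5.258)-(5.261) into equations (5.262) and (5.263) yields

\[ \left( \xi_1^2\mathbf{Q}_{xx} + \xi_2^2\mathbf{Q}_{yy} + \mathbf{Q}_{XX} \right)\mathbf{k}_1 + \left( \xi_1\xi_2\mathbf{Q}_{xx} - \xi_1\xi_2\mathbf{Q}_{yy} \right)\mathbf{k}_2 = -(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) \tag{5.268} \]

and

\[ \left( \xi_1\xi_2\mathbf{Q}_{xx} - \xi_1\xi_2\mathbf{Q}_{yy} \right)\mathbf{k}_1 + \left( \xi_2^2\mathbf{Q}_{xx} + \xi_1^2\mathbf{Q}_{yy} + \mathbf{Q}_{YY} \right)\mathbf{k}_2 = -(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c). \tag{5.269} \]

Introducing approximate values for the unknown transformation parameters only on the left-hand side of the last two equations, it is possible to write

\[ \mathbf{W}_1\mathbf{k}_1 + \mathbf{W}_2\mathbf{k}_2 = -(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) \tag{5.270} \]

and

\[ \mathbf{W}_2\mathbf{k}_1 + \mathbf{W}_3\mathbf{k}_2 = -(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c), \tag{5.271} \]

with the auxiliary matrices defined as

\[ \begin{aligned}
\mathbf{W}_1 &= (\xi_1^0)^2\mathbf{Q}_{xx} + (\xi_2^0)^2\mathbf{Q}_{yy} + \mathbf{Q}_{XX}, \\
\mathbf{W}_2 &= \xi_1^0\xi_2^0\mathbf{Q}_{xx} - \xi_1^0\xi_2^0\mathbf{Q}_{yy}, \\
\mathbf{W}_3 &= (\xi_2^0)^2\mathbf{Q}_{xx} + (\xi_1^0)^2\mathbf{Q}_{yy} + \mathbf{Q}_{YY}.
\end{aligned} \tag{5.272} \]

Since the introduced cofactor matrices from equations (5.245) and (5.249) are diagonal and therefore regular, the matrices $\mathbf{W}_1$, $\mathbf{W}_2$ and $\mathbf{W}_3$ are also regular and invertible. In this case a solution for the vectors of Lagrange multipliers can be found by

\[ \mathbf{k}_1 = \mathbf{W}_5(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c) - \mathbf{W}_4(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) \tag{5.273} \]

and

\[ \mathbf{k}_2 = \mathbf{W}_5(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) - \mathbf{W}_6(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c). \tag{5.274} \]

The respective matrices are

\[ \begin{aligned}
\mathbf{W}_4 &= \left( \mathbf{W}_1 - \mathbf{W}_2\mathbf{W}_3^{-1}\mathbf{W}_2 \right)^{-1}, \\
\mathbf{W}_5 &= \left( \mathbf{W}_1 - \mathbf{W}_2\mathbf{W}_3^{-1}\mathbf{W}_2 \right)^{-1}\mathbf{W}_2\mathbf{W}_3^{-1}, \\
\mathbf{W}_6 &= \left( \mathbf{W}_3 - \mathbf{W}_2\mathbf{W}_1^{-1}\mathbf{W}_2 \right)^{-1}.
\end{aligned} \tag{5.275} \]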


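Inserting the expressions for $\mathbf{k}_1$ and $\mathbf{k}_2$ into equations (5.266) and (5.267) gives the solution for the translation parameters

\[ t_x\,\mathbf{e}^T\mathbf{W}_4\mathbf{e} = t_y\,\mathbf{e}^T\mathbf{W}_5\mathbf{e} + \mathbf{e}^T\mathbf{W}_5(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c - \mathbf{Y}_c) - \mathbf{e}^T\mathbf{W}_4(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c - \mathbf{X}_c), \tag{5.276} \]

\[ t_y\,\mathbf{e}^T\mathbf{W}_6\mathbf{e} = t_x\,\mathbf{e}^T\mathbf{W}_5\mathbf{e} + \mathbf{e}^T\mathbf{W}_5(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c - \mathbf{X}_c) - \mathbf{e}^T\mathbf{W}_6(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c - \mathbf{Y}_c). \tag{5.277} \]

Estimates for the unknown transformation parameters $\xi_1$ and $\xi_2$ can be computed by substituting the derived residual vectors from equations (5.258) - (5.261), the vectors of Lagrange multipliers $\mathbf{k}_1$, $\mathbf{k}_2$ from (5.273) and (5.274), as well as the translation parameters (5.276) and (5.277) into the normal equations (5.264) and (5.265). This results in the reduced system of equations

\[ \xi_1 f_1 + \xi_2 f_2 + f_3 = 0 \tag{5.278} \]

and

\[ \xi_2 f_4 + \xi_1 f_2 + f_5 = 0, \tag{5.279} \]

with the respective quantities

\[ \begin{aligned}
f_1 &= \mathbf{k}_1^T\mathbf{Q}_{xx}\mathbf{k}_1 + \mathbf{k}_2^T\mathbf{Q}_{yy}\mathbf{k}_2 + \mathbf{x}_c^T\mathbf{W}_5\mathbf{y}_c - \mathbf{x}_c^T\mathbf{W}_4\mathbf{x}_c + \mathbf{y}_c^T\mathbf{W}_5\mathbf{x}_c - \mathbf{y}_c^T\mathbf{W}_6\mathbf{y}_c, \\
f_2 &= \mathbf{k}_1^T\mathbf{Q}_{xx}\mathbf{k}_2 - \mathbf{k}_2^T\mathbf{Q}_{yy}\mathbf{k}_1 + \mathbf{x}_c^T\mathbf{W}_5\mathbf{x}_c + \mathbf{x}_c^T\mathbf{W}_4\mathbf{y}_c - \mathbf{y}_c^T\mathbf{W}_5\mathbf{y}_c - \mathbf{y}_c^T\mathbf{W}_6\mathbf{x}_c, \\
f_3 &= \left( \mathbf{y}_c^T\mathbf{W}_5 - \mathbf{x}_c^T\mathbf{W}_4 \right)(t_x\mathbf{e} - \mathbf{X}_c) + \left( \mathbf{x}_c^T\mathbf{W}_5 - \mathbf{y}_c^T\mathbf{W}_6 \right)(t_y\mathbf{e} - \mathbf{Y}_c), \\
f_4 &= \mathbf{k}_2^T\mathbf{Q}_{xx}\mathbf{k}_2 + \mathbf{k}_1^T\mathbf{Q}_{yy}\mathbf{k}_1 - \mathbf{x}_c^T\mathbf{W}_5\mathbf{y}_c - \mathbf{x}_c^T\mathbf{W}_6\mathbf{x}_c - \mathbf{y}_c^T\mathbf{W}_4\mathbf{y}_c - \mathbf{y}_c^T\mathbf{W}_5\mathbf{x}_c, \\
f_5 &= \left( \mathbf{x}_c^T\mathbf{W}_5 + \mathbf{y}_c^T\mathbf{W}_4 \right)(t_x\mathbf{e} - \mathbf{X}_c) - \left( \mathbf{x}_c^T\mathbf{W}_6 + \mathbf{y}_c^T\mathbf{W}_5 \right)(t_y\mathbf{e} - \mathbf{Y}_c).
\end{aligned} \tag{5.280} \]

Finally, solving (5.279) for $\xi_2$ and introducing it in (5.278) leads to the least squares estimates

\[ \hat{\xi}_1 = \left( f_1 f_4 - f_2^2 \right)^{-1}\left( f_2 f_5 - f_3 f_4 \right) \tag{5.281} \]

and

\[ \hat{\xi}_2 = -\hat{\xi}_1\frac{f_2}{f_4} - \frac{f_5}{f_4}. \tag{5.282} \]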

An iterative solution is possible by choosing meaningful approximate values for the unknown transformation parameters. Thus, the last two equations become pseudo-linear, with the functions $f_1, f_2, \ldots, f_5$ being approximated. An iterative procedure for the weighted least squares solution of the 2D similarity transformation of coordinates can be found in Algorithm 7.


Algorithm 7 Least squares 2D similarity transformation of coordinates with general weights
1: Choose approximate values for $\xi_1^0$ and $\xi_2^0$.
2: Set threshold $\epsilon$ for the break-off condition of the iteration process.
3: Set parameters $d\xi_1 = |\hat{\xi}_1 - \xi_1^0| = \infty$ and $d\xi_2 = |\hat{\xi}_2 - \xi_2^0| = \infty$, for entering the iteration process.
4: while $d\xi_1 = |\hat{\xi}_1 - \xi_1^0| > \epsilon$ or $d\xi_2 = |\hat{\xi}_2 - \xi_2^0| > \epsilon$ do
5:   Compute the auxiliary matrices $\mathbf{W}_1, \mathbf{W}_2, \ldots, \mathbf{W}_6$.
6:   Estimate the translation parameters $\hat{t}_x$ and $\hat{t}_y$.
7:   Compute the vectors of Lagrange multipliers $\mathbf{k}_1$ and $\mathbf{k}_2$.
8:   Compute the coefficients $f_1, f_2, \ldots, f_5$.
9:   Estimate parameters $\hat{\xi}_1$ and $\hat{\xi}_2$.
10:  Compute parameters $d\xi_1 = |\hat{\xi}_1 - \xi_1^0|$ and $d\xi_2 = |\hat{\xi}_2 - \xi_2^0|$.
11:  Update the approximate values with the estimated ones ($\xi_1^0 = \hat{\xi}_1$, $\xi_2^0 = \hat{\xi}_2$).
12: end while
13: return $\hat{\xi}_1$, $\hat{\xi}_2$, $\hat{t}_x$ and $\hat{t}_y$.
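The steps of Algorithm 7 can be sketched in code as follows. This is only an illustrative reading of equations (5.272)-(5.282), assuming NumPy and diagonal cofactor matrices stored as 1D arrays of their diagonals (so all matrix products reduce to element-wise operations). The function name, the argument names `qx`, `qy`, `qX`, `qY` and the `max_iter` safeguard are hypothetical additions, and `W6` is computed as the Schur complement inverse of the 2x2 block system.

```python
import numpy as np

def similarity_2d_iterative(x, y, X, Y, qx, qy, qX, qY,
                            xi1, xi2, eps=1e-12, max_iter=500):
    """Pseudo-linear iteration for uncorrelated coordinates (diagonal cofactors)."""
    x, y, X, Y, qx, qy, qX, qY = (np.asarray(a, dtype=float)
                                  for a in (x, y, X, Y, qx, qy, qX, qY))
    for _ in range(max_iter):
        # Step 5: auxiliary matrices (5.272) and (5.275), stored as diagonals.
        W1 = xi1**2 * qx + xi2**2 * qy + qX
        W2 = xi1 * xi2 * (qx - qy)
        W3 = xi2**2 * qx + xi1**2 * qy + qY
        W4 = 1.0 / (W1 - W2**2 / W3)
        W5 = W4 * W2 / W3
        W6 = 1.0 / (W3 - W2**2 / W1)

        # Step 6: translations from the 2x2 linear system (5.276)-(5.277).
        r1 = xi1 * x - xi2 * y - X   # misclosures without translation
        r2 = xi2 * x + xi1 * y - Y
        A = np.array([[W4.sum(), -W5.sum()],
                      [-W5.sum(), W6.sum()]])
        b = np.array([W5 @ r2 - W4 @ r1,
                      W5 @ r1 - W6 @ r2])
        tx, ty = np.linalg.solve(A, b)

        # Step 7: Lagrange multipliers (5.273)-(5.274).
        k1 = W5 * (r2 + ty) - W4 * (r1 + tx)
        k2 = W5 * (r1 + tx) - W6 * (r2 + ty)

        # Step 8: coefficients (5.280).
        cX, cY = tx - X, ty - Y
        f1 = (k1 @ (qx * k1) + k2 @ (qy * k2) + x @ (W5 * y) - x @ (W4 * x)
              + y @ (W5 * x) - y @ (W6 * y))
        f2 = (k1 @ (qx * k2) - k2 @ (qy * k1) + x @ (W5 * x) + x @ (W4 * y)
              - y @ (W5 * y) - y @ (W6 * x))
        f3 = (y * W5 - x * W4) @ cX + (x * W5 - y * W6) @ cY
        f4 = (k2 @ (qx * k2) + k1 @ (qy * k1) - x @ (W5 * y) - x @ (W6 * x)
              - y @ (W4 * y) - y @ (W5 * x))
        f5 = (x * W5 + y * W4) @ cX - (x * W6 + y * W5) @ cY

        # Step 9: new estimates (5.281)-(5.282).
        new1 = (f2 * f5 - f3 * f4) / (f1 * f4 - f2**2)
        new2 = -(new1 * f2 + f5) / f4

        # Steps 10-11: convergence measures and update of the approximations.
        d1, d2 = abs(new1 - xi1), abs(new2 - xi2)
        xi1, xi2 = new1, new2
        if d1 <= eps and d2 <= eps:
            break
    return xi1, xi2, tx, ty
```

The returned translations stem from the last pass through step 6, so they correspond to the approximate values used in that pass; at convergence the difference is below the chosen threshold.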

5.4.4 Weighting case 4 - Individually weighted and correlated coordinates in each coordinate system

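In the last investigated weighting case for the 2D similarity transformation, correlations are introduced between the measured coordinates of the points in each coordinate system. Therefore, two cofactor matrices are given. The first is related to the coordinates in the source system

\[ \mathbf{Q}_{LL_1} = \begin{bmatrix} \mathbf{Q}_{xx} & \mathbf{Q}_{xy} \\ \mathbf{Q}_{yx} & \mathbf{Q}_{yy} \end{bmatrix}, \quad \text{with } \mathbf{Q}_{xy} = \mathbf{Q}_{yx}^T, \tag{5.283} \]

while the second concerns the coordinates in the target system

\[ \mathbf{Q}_{LL_2} = \begin{bmatrix} \mathbf{Q}_{XX} & \mathbf{Q}_{XY} \\ \mathbf{Q}_{YX} & \mathbf{Q}_{YY} \end{bmatrix}, \quad \text{with } \mathbf{Q}_{XY} = \mathbf{Q}_{YX}^T. \tag{5.284} \]

The respective weights of this problem can be computed by

\[ \mathbf{P}_1 = \mathbf{Q}_{LL_1}^{-1} = \begin{bmatrix} \mathbf{P}_{xx} & \mathbf{P}_{xy} \\ \mathbf{P}_{yx} & \mathbf{P}_{yy} \end{bmatrix} \tag{5.285} \]

and

\[ \mathbf{P}_2 = \mathbf{Q}_{LL_2}^{-1} = \begin{bmatrix} \mathbf{P}_{XX} & \mathbf{P}_{XY} \\ \mathbf{P}_{YX} & \mathbf{P}_{YY} \end{bmatrix}. \tag{5.286} \]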


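Taking into account the stochastic model described above, the objective function (5.256) can be extended to

\[ \Omega(\mathbf{v}_X, \mathbf{v}_Y, \mathbf{v}_x, \mathbf{v}_y) = \mathbf{v}_X^T\mathbf{P}_{XX}\mathbf{v}_X + \mathbf{v}_Y^T\mathbf{P}_{YY}\mathbf{v}_Y + \mathbf{v}_x^T\mathbf{P}_{xx}\mathbf{v}_x + \mathbf{v}_y^T\mathbf{P}_{yy}\mathbf{v}_y + 2\,\mathbf{v}_X^T\mathbf{P}_{XY}\mathbf{v}_Y + 2\,\mathbf{v}_x^T\mathbf{P}_{xy}\mathbf{v}_y. \tag{5.287} \]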

Iterative least squares solution without linearization

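The developed objective function can be combined with the nonlinear condition equations (5.253) to form the Lagrange function

\[ \begin{aligned}
K(\xi_1, \xi_2, t_x, t_y, \mathbf{v}_X, \mathbf{v}_Y, \mathbf{v}_x, \mathbf{v}_y, \mathbf{k}_1, \mathbf{k}_2) = {} & \Omega(\mathbf{v}_X, \mathbf{v}_Y, \mathbf{v}_x, \mathbf{v}_y) \\
& - 2\mathbf{k}_1^T\left[ -(\mathbf{X}_c + \mathbf{v}_X) + \xi_1(\mathbf{x}_c + \mathbf{v}_x) - \xi_2(\mathbf{y}_c + \mathbf{v}_y) + t_x\mathbf{e} \right] \\
& - 2\mathbf{k}_2^T\left[ -(\mathbf{Y}_c + \mathbf{v}_Y) + \xi_2(\mathbf{x}_c + \mathbf{v}_x) + \xi_1(\mathbf{y}_c + \mathbf{v}_y) + t_y\mathbf{e} \right],
\end{aligned} \tag{5.288} \]

with $\mathbf{k}_1$ and $\mathbf{k}_2$ denoting vectors of Lagrange multipliers. Differentiating the Lagrangian with respect to all unknowns and setting the result to zero yields the system of normal equations

\[ \frac{\partial K}{\partial \mathbf{v}_X^T} = 2\,(\mathbf{P}_{XX}\mathbf{v}_X + \mathbf{P}_{XY}\mathbf{v}_Y + \mathbf{k}_1) = \mathbf{0}, \tag{5.289} \]

\[ \frac{\partial K}{\partial \mathbf{v}_Y^T} = 2\,(\mathbf{P}_{YY}\mathbf{v}_Y + \mathbf{P}_{YX}\mathbf{v}_X + \mathbf{k}_2) = \mathbf{0}, \tag{5.290} \]

\[ \frac{\partial K}{\partial \mathbf{v}_x^T} = 2\,(\mathbf{P}_{xx}\mathbf{v}_x + \mathbf{P}_{xy}\mathbf{v}_y - \xi_1\mathbf{k}_1 - \xi_2\mathbf{k}_2) = \mathbf{0}, \tag{5.291} \]

\[ \frac{\partial K}{\partial \mathbf{v}_y^T} = 2\,(\mathbf{P}_{yy}\mathbf{v}_y + \mathbf{P}_{yx}\mathbf{v}_x + \xi_2\mathbf{k}_1 - \xi_1\mathbf{k}_2) = \mathbf{0}, \tag{5.292} \]

\[ \frac{\partial K}{\partial \mathbf{k}_1^T} = -2\left[ -(\mathbf{X}_c + \mathbf{v}_X) + \xi_1(\mathbf{x}_c + \mathbf{v}_x) - \xi_2(\mathbf{y}_c + \mathbf{v}_y) + t_x\mathbf{e} \right] = \mathbf{0}, \tag{5.293} \]

\[ \frac{\partial K}{\partial \mathbf{k}_2^T} = -2\left[ -(\mathbf{Y}_c + \mathbf{v}_Y) + \xi_2(\mathbf{x}_c + \mathbf{v}_x) + \xi_1(\mathbf{y}_c + \mathbf{v}_y) + t_y\mathbf{e} \right] = \mathbf{0}, \tag{5.294} \]

\[ \frac{\partial K}{\partial \xi_1} = -2\left[ \mathbf{k}_1^T(\mathbf{x}_c + \mathbf{v}_x) + \mathbf{k}_2^T(\mathbf{y}_c + \mathbf{v}_y) \right] = 0, \tag{5.295} \]

\[ \frac{\partial K}{\partial \xi_2} = -2\left[ -\mathbf{k}_1^T(\mathbf{y}_c + \mathbf{v}_y) + \mathbf{k}_2^T(\mathbf{x}_c + \mathbf{v}_x) \right] = 0, \tag{5.296} \]

\[ \frac{\partial K}{\partial t_x} = -2\,\mathbf{k}_1^T\mathbf{e} = 0 \tag{5.297} \]

and

\[ \frac{\partial K}{\partial t_y} = -2\,\mathbf{k}_2^T\mathbf{e} = 0. \tag{5.298} \]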

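The first two of the developed normal equations can be expressed using block matrices:

\[ \begin{bmatrix} \mathbf{P}_{XX} & \mathbf{P}_{XY} \\ \mathbf{P}_{YX} & \mathbf{P}_{YY} \end{bmatrix} \begin{bmatrix} \mathbf{v}_X \\ \mathbf{v}_Y \end{bmatrix} = -\begin{bmatrix} \mathbf{k}_1 \\ \mathbf{k}_2 \end{bmatrix}. \tag{5.299} \]

Thus, a solution for the residual vectors in the target system can be computed by

\[ \begin{bmatrix} \mathbf{v}_X \\ \mathbf{v}_Y \end{bmatrix} = -\begin{bmatrix} \mathbf{P}_{XX} & \mathbf{P}_{XY} \\ \mathbf{P}_{YX} & \mathbf{P}_{YY} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{k}_1 \\ \mathbf{k}_2 \end{bmatrix} = -\begin{bmatrix} \mathbf{Q}_{XX} & \mathbf{Q}_{XY} \\ \mathbf{Q}_{YX} & \mathbf{Q}_{YY} \end{bmatrix} \begin{bmatrix} \mathbf{k}_1 \\ \mathbf{k}_2 \end{bmatrix}, \tag{5.300} \]

or equivalently by

\[ \mathbf{v}_X = -\mathbf{Q}_{XX}\mathbf{k}_1 - \mathbf{Q}_{XY}\mathbf{k}_2 \tag{5.301} \]

and

\[ \mathbf{v}_Y = -\mathbf{Q}_{YX}\mathbf{k}_1 - \mathbf{Q}_{YY}\mathbf{k}_2. \tag{5.302} \]

In the same manner, explicit expressions for the residual vectors of the coordinates in the source system can be obtained by utilizing equations (5.291) and (5.292), which yields

\[ \mathbf{v}_x = \xi_1(\mathbf{Q}_{xx}\mathbf{k}_1 + \mathbf{Q}_{xy}\mathbf{k}_2) + \xi_2(-\mathbf{Q}_{xy}\mathbf{k}_1 + \mathbf{Q}_{xx}\mathbf{k}_2) \tag{5.303} \]

and

\[ \mathbf{v}_y = \xi_1(\mathbf{Q}_{yx}\mathbf{k}_1 + \mathbf{Q}_{yy}\mathbf{k}_2) + \xi_2(-\mathbf{Q}_{yy}\mathbf{k}_1 + \mathbf{Q}_{yx}\mathbf{k}_2). \tag{5.304} \]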

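For a solution of $\mathbf{k}_1$ and $\mathbf{k}_2$, all residual vectors are introduced into equations (5.293) and (5.294), resulting in

\[ \left( \xi_1^2\mathbf{Q}_{xx} - \xi_1\xi_2\mathbf{Q}_{xy} - \xi_1\xi_2\mathbf{Q}_{yx} + \xi_2^2\mathbf{Q}_{yy} + \mathbf{Q}_{XX} \right)\mathbf{k}_1 + \left( \xi_1\xi_2\mathbf{Q}_{xx} + \xi_1^2\mathbf{Q}_{xy} - \xi_2^2\mathbf{Q}_{yx} - \xi_1\xi_2\mathbf{Q}_{yy} + \mathbf{Q}_{XY} \right)\mathbf{k}_2 = -(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) \tag{5.305} \]

and

\[ \left( \xi_1\xi_2\mathbf{Q}_{xx} + \xi_1^2\mathbf{Q}_{yx} - \xi_2^2\mathbf{Q}_{xy} - \xi_1\xi_2\mathbf{Q}_{yy} + \mathbf{Q}_{YX} \right)\mathbf{k}_1 + \left( \xi_2^2\mathbf{Q}_{xx} + \xi_1\xi_2\mathbf{Q}_{xy} + \xi_1\xi_2\mathbf{Q}_{yx} + \xi_1^2\mathbf{Q}_{yy} + \mathbf{Q}_{YY} \right)\mathbf{k}_2 = -(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c). \tag{5.306} \]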

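Introducing approximate values for the unknown transformation parameters only on the left-hand side of the last two equations, it is possible to rewrite them as

\[ \mathbf{W}_1\mathbf{k}_1 + \mathbf{W}_2\mathbf{k}_2 = -(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) \tag{5.307} \]

and

\[ \mathbf{W}_3\mathbf{k}_2 + \mathbf{W}_4\mathbf{k}_1 = -(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c), \tag{5.308} \]

with the auxiliary matrices

\[ \begin{aligned}
\mathbf{W}_1 &= (\xi_1^0)^2\mathbf{Q}_{xx} - \xi_1^0\xi_2^0\mathbf{Q}_{xy} - \xi_1^0\xi_2^0\mathbf{Q}_{yx} + (\xi_2^0)^2\mathbf{Q}_{yy} + \mathbf{Q}_{XX}, \\
\mathbf{W}_2 &= \xi_1^0\xi_2^0\mathbf{Q}_{xx} + (\xi_1^0)^2\mathbf{Q}_{xy} - (\xi_2^0)^2\mathbf{Q}_{yx} - \xi_1^0\xi_2^0\mathbf{Q}_{yy} + \mathbf{Q}_{XY}, \\
\mathbf{W}_3 &= (\xi_2^0)^2\mathbf{Q}_{xx} + \xi_1^0\xi_2^0\mathbf{Q}_{xy} + \xi_1^0\xi_2^0\mathbf{Q}_{yx} + (\xi_1^0)^2\mathbf{Q}_{yy} + \mathbf{Q}_{YY}, \\
\mathbf{W}_4 &= \xi_1^0\xi_2^0\mathbf{Q}_{xx} + (\xi_1^0)^2\mathbf{Q}_{yx} - (\xi_2^0)^2\mathbf{Q}_{xy} - \xi_1^0\xi_2^0\mathbf{Q}_{yy} + \mathbf{Q}_{YX}.
\end{aligned} \tag{5.309} \]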

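If the cofactor matrices are regular, then matrices $\mathbf{W}_1$, $\mathbf{W}_2$, $\mathbf{W}_3$ and $\mathbf{W}_4$ are also regular and invertible. Consequently, a solution for the vectors of Lagrange multipliers can be found by

\[ \mathbf{k}_1 = \mathbf{W}_5(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c) - \mathbf{W}_6(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) \tag{5.310} \]

and

\[ \mathbf{k}_2 = \mathbf{W}_7(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c) - \mathbf{W}_8(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c), \tag{5.311} \]

after introducing the matrices

\[ \begin{aligned}
\mathbf{W}_5 &= \left( \mathbf{W}_1 - \mathbf{W}_2\mathbf{W}_3^{-1}\mathbf{W}_4 \right)^{-1}\mathbf{W}_2\mathbf{W}_3^{-1}, &
\mathbf{W}_6 &= \left( \mathbf{W}_1 - \mathbf{W}_2\mathbf{W}_3^{-1}\mathbf{W}_4 \right)^{-1}, \\
\mathbf{W}_7 &= \left( \mathbf{W}_3 - \mathbf{W}_4\mathbf{W}_1^{-1}\mathbf{W}_2 \right)^{-1}\mathbf{W}_4\mathbf{W}_1^{-1}, &
\mathbf{W}_8 &= \left( \mathbf{W}_3 - \mathbf{W}_4\mathbf{W}_1^{-1}\mathbf{W}_2 \right)^{-1}.
\end{aligned} \tag{5.312} \]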
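The matrices in (5.312) are the blocks (up to sign) of the inverse of the 2x2 block system formed by (5.307) and (5.308). The following sketch, which assumes NumPy and uses a randomly generated, well-conditioned symmetric test matrix that is not taken from the thesis, checks this relationship numerically. The helper name `multiplier_matrices` is hypothetical.

```python
import numpy as np

def multiplier_matrices(W1, W2, W3, W4):
    """Return W5, W6, W7, W8 as defined in (5.312)."""
    W1_inv = np.linalg.inv(W1)
    W3_inv = np.linalg.inv(W3)
    W6 = np.linalg.inv(W1 - W2 @ W3_inv @ W4)  # inverse Schur complement of W3
    W8 = np.linalg.inv(W3 - W4 @ W1_inv @ W2)  # inverse Schur complement of W1
    W5 = W6 @ W2 @ W3_inv
    W7 = W8 @ W4 @ W1_inv
    return W5, W6, W7, W8

# Synthetic positive definite block matrix [[W1, W2], [W4, W3]] for the check.
rng = np.random.default_rng(0)
n = 4
B = rng.normal(size=(2 * n, 2 * n))
W = B @ B.T + 2 * n * np.eye(2 * n)
W1, W2, W4, W3 = W[:n, :n], W[:n, n:], W[n:, :n], W[n:, n:]

W5, W6, W7, W8 = multiplier_matrices(W1, W2, W3, W4)

# Block inverse implied by (5.310) and (5.311): k = -W^{-1} r.
W_inv = np.block([[W6, -W5],
                  [-W7, W8]])
```

With this block inverse, the multipliers follow from the right-hand sides of (5.307) and (5.308) without forming the full inverse explicitly.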


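Substituting the derived vectors of Lagrange multipliers $\mathbf{k}_1$ and $\mathbf{k}_2$ into the normal equations (5.297) and (5.298) yields the translation parameters

\[ t_x\,\mathbf{e}^T\mathbf{W}_6\mathbf{e} = t_y\,\mathbf{e}^T\mathbf{W}_5\mathbf{e} + \mathbf{e}^T\mathbf{W}_5(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c - \mathbf{Y}_c) - \mathbf{e}^T\mathbf{W}_6(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c - \mathbf{X}_c) \tag{5.313} \]

and

\[ t_y\,\mathbf{e}^T\mathbf{W}_8\mathbf{e} = t_x\,\mathbf{e}^T\mathbf{W}_7\mathbf{e} + \mathbf{e}^T\mathbf{W}_7(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c - \mathbf{X}_c) - \mathbf{e}^T\mathbf{W}_8(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c - \mathbf{Y}_c). \tag{5.314} \]

Furthermore, estimates for the transformation parameters $\xi_1$ and $\xi_2$ can be computed by substituting the residual vectors, the vectors of Lagrange multipliers, as well as the translation parameters into equations (5.295) and (5.296). This results in the reduced normal equations

\[ \xi_1 f_1 + \xi_2 f_2 + f_3 = 0 \tag{5.315} \]

and

\[ \xi_2 f_5 + \xi_1 f_4 + f_6 = 0, \tag{5.316} \]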

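with the respective quantities

\[ \begin{aligned}
f_1 &= \mathbf{k}_1^T\mathbf{Q}_{xx}\mathbf{k}_1 + \mathbf{k}_1^T\mathbf{Q}_{xy}\mathbf{k}_2 + \mathbf{k}_2^T\mathbf{Q}_{yx}\mathbf{k}_1 + \mathbf{k}_2^T\mathbf{Q}_{yy}\mathbf{k}_2 + \mathbf{x}_c^T\mathbf{W}_5\mathbf{y}_c - \mathbf{x}_c^T\mathbf{W}_6\mathbf{x}_c + \mathbf{y}_c^T\mathbf{W}_7\mathbf{x}_c - \mathbf{y}_c^T\mathbf{W}_8\mathbf{y}_c, \\
f_2 &= \mathbf{k}_1^T\mathbf{Q}_{xx}\mathbf{k}_2 - \mathbf{k}_1^T\mathbf{Q}_{xy}\mathbf{k}_1 + \mathbf{k}_2^T\mathbf{Q}_{yx}\mathbf{k}_2 - \mathbf{k}_2^T\mathbf{Q}_{yy}\mathbf{k}_1 + \mathbf{x}_c^T\mathbf{W}_5\mathbf{x}_c + \mathbf{x}_c^T\mathbf{W}_6\mathbf{y}_c - \mathbf{y}_c^T\mathbf{W}_7\mathbf{y}_c - \mathbf{y}_c^T\mathbf{W}_8\mathbf{x}_c, \\
f_3 &= \left( \mathbf{y}_c^T\mathbf{W}_7 - \mathbf{x}_c^T\mathbf{W}_6 \right)(t_x\mathbf{e} - \mathbf{X}_c) + \left( \mathbf{x}_c^T\mathbf{W}_5 - \mathbf{y}_c^T\mathbf{W}_8 \right)(t_y\mathbf{e} - \mathbf{Y}_c), \\
f_4 &= \mathbf{k}_1^T\mathbf{Q}_{yx}\mathbf{k}_1 + \mathbf{k}_1^T\mathbf{Q}_{yy}\mathbf{k}_2 - \mathbf{k}_2^T\mathbf{Q}_{xx}\mathbf{k}_1 - \mathbf{k}_2^T\mathbf{Q}_{xy}\mathbf{k}_2 + \mathbf{y}_c^T\mathbf{W}_5\mathbf{y}_c - \mathbf{y}_c^T\mathbf{W}_6\mathbf{x}_c - \mathbf{x}_c^T\mathbf{W}_7\mathbf{x}_c + \mathbf{x}_c^T\mathbf{W}_8\mathbf{y}_c, \\
f_5 &= \mathbf{k}_1^T\mathbf{Q}_{yx}\mathbf{k}_2 - \mathbf{k}_1^T\mathbf{Q}_{yy}\mathbf{k}_1 - \mathbf{k}_2^T\mathbf{Q}_{xx}\mathbf{k}_2 + \mathbf{k}_2^T\mathbf{Q}_{xy}\mathbf{k}_1 + \mathbf{y}_c^T\mathbf{W}_5\mathbf{x}_c + \mathbf{y}_c^T\mathbf{W}_6\mathbf{y}_c + \mathbf{x}_c^T\mathbf{W}_7\mathbf{y}_c + \mathbf{x}_c^T\mathbf{W}_8\mathbf{x}_c, \\
f_6 &= \left( -\mathbf{y}_c^T\mathbf{W}_6 - \mathbf{x}_c^T\mathbf{W}_7 \right)(t_x\mathbf{e} - \mathbf{X}_c) + \left( \mathbf{y}_c^T\mathbf{W}_5 + \mathbf{x}_c^T\mathbf{W}_8 \right)(t_y\mathbf{e} - \mathbf{Y}_c).
\end{aligned} \tag{5.317} \]

Finally, solving equation (5.316) for $\xi_2$ and introducing it in (5.315) leads to the least squares estimates

\[ \hat{\xi}_1 = \left( f_1 f_5 - f_2 f_4 \right)^{-1}\left( f_2 f_6 - f_3 f_5 \right) \tag{5.318} \]

and

\[ \hat{\xi}_2 = -\hat{\xi}_1\frac{f_4}{f_5} - \frac{f_6}{f_5}. \tag{5.319} \]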


Choosing appropriate approximate values for the unknown transformation parameters, an iterative process can give the solution to this adjustment problem. Equations (5.318) and (5.319) are pseudo-linear, with the auxiliary functions $f_1, f_2, \ldots, f_6$ being approximated in each iteration step. Algorithm 8 summarizes the presented iterative least squares solution for the 2D similarity transformation of coordinates with correlated observations.

Algorithm 8 Least squares 2D similarity transformation of coordinates with correlated observations
1: Choose approximate values for $\xi_1^0$ and $\xi_2^0$.
2: Set threshold $\epsilon$ for the break-off condition of the iteration process.
3: Set parameters $d\xi_1 = |\hat{\xi}_1 - \xi_1^0| = \infty$ and $d\xi_2 = |\hat{\xi}_2 - \xi_2^0| = \infty$, for entering the iteration process.
4: while $d\xi_1 = |\hat{\xi}_1 - \xi_1^0| > \epsilon$ or $d\xi_2 = |\hat{\xi}_2 - \xi_2^0| > \epsilon$ do
5:   Compute the auxiliary matrices $\mathbf{W}_1, \mathbf{W}_2, \ldots, \mathbf{W}_8$.
6:   Estimate the translation parameters $\hat{t}_x$ and $\hat{t}_y$.
7:   Compute the vectors of Lagrange multipliers $\mathbf{k}_1$ and $\mathbf{k}_2$.
8:   Compute the coefficients $f_1, f_2, \ldots, f_6$.
9:   Estimate parameters $\hat{\xi}_1$ and $\hat{\xi}_2$.
10:  Compute parameters $d\xi_1 = |\hat{\xi}_1 - \xi_1^0|$ and $d\xi_2 = |\hat{\xi}_2 - \xi_2^0|$.
11:  Update the approximate values with the estimated ones ($\xi_1^0 = \hat{\xi}_1$, $\xi_2^0 = \hat{\xi}_2$).
12: end while
13: return $\hat{\xi}_1$, $\hat{\xi}_2$, $\hat{t}_x$ and $\hat{t}_y$.

Solution for singular cofactor matrices

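An iterative least squares solution is also possible for the case of singular cofactor matrices, following the same procedure as in the previous application cases. Grouping equations (5.307) and (5.308) together with the normal equations (5.295)-(5.298) results in the system of equations

\[ \begin{aligned}
\mathbf{W}_1\mathbf{k}_1 + \mathbf{W}_2\mathbf{k}_2 &= -(\xi_1\mathbf{x}_c - \xi_2\mathbf{y}_c + t_x\mathbf{e} - \mathbf{X}_c), \\
\mathbf{W}_3\mathbf{k}_2 + \mathbf{W}_4\mathbf{k}_1 &= -(\xi_2\mathbf{x}_c + \xi_1\mathbf{y}_c + t_y\mathbf{e} - \mathbf{Y}_c), \\
\mathbf{k}_1^T(\mathbf{x}_c + \mathbf{v}_x) + \mathbf{k}_2^T(\mathbf{y}_c + \mathbf{v}_y) &= 0, \\
\mathbf{k}_1^T(\mathbf{y}_c + \mathbf{v}_y) - \mathbf{k}_2^T(\mathbf{x}_c + \mathbf{v}_x) &= 0, \\
\mathbf{k}_1^T\mathbf{e} &= 0, \\
\mathbf{k}_2^T\mathbf{e} &= 0.
\end{aligned} \tag{5.320} \]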


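Introducing approximate values for the residual vectors $\mathbf{v}_x^0$, $\mathbf{v}_y^0$, $\mathbf{v}_X^0$ and $\mathbf{v}_Y^0$, the developed equation system can be expressed by

\[ \begin{bmatrix}
\mathbf{W}_1 & \mathbf{W}_2 & \mathbf{x}_c & -\mathbf{y}_c & \mathbf{e} & \mathbf{0} \\
\mathbf{W}_4 & \mathbf{W}_3 & \mathbf{y}_c & \mathbf{x}_c & \mathbf{0} & \mathbf{e} \\
(\mathbf{x}_c + \mathbf{v}_x^0)^T & (\mathbf{y}_c + \mathbf{v}_y^0)^T & 0 & 0 & 0 & 0 \\
(\mathbf{y}_c + \mathbf{v}_y^0)^T & -(\mathbf{x}_c + \mathbf{v}_x^0)^T & 0 & 0 & 0 & 0 \\
\mathbf{e}^T & \mathbf{0}^T & 0 & 0 & 0 & 0 \\
\mathbf{0}^T & \mathbf{e}^T & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} \mathbf{k}_1 \\ \mathbf{k}_2 \\ \xi_1 \\ \xi_2 \\ t_x \\ t_y \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}_c \\ \mathbf{Y}_c \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \tag{5.321} \]

which can be equivalently formulated as

\[ \mathbf{N} \begin{bmatrix} \mathbf{k} \\ \mathbf{X} \end{bmatrix} = \mathbf{n}. \tag{5.322} \]

The introduced matrices are

\[ \mathbf{N} = \begin{bmatrix}
\mathbf{W}_1 & \mathbf{W}_2 & \mathbf{x}_c & -\mathbf{y}_c & \mathbf{e} & \mathbf{0} \\
\mathbf{W}_4 & \mathbf{W}_3 & \mathbf{y}_c & \mathbf{x}_c & \mathbf{0} & \mathbf{e} \\
(\mathbf{x}_c + \mathbf{v}_x^0)^T & (\mathbf{y}_c + \mathbf{v}_y^0)^T & 0 & 0 & 0 & 0 \\
(\mathbf{y}_c + \mathbf{v}_y^0)^T & -(\mathbf{x}_c + \mathbf{v}_x^0)^T & 0 & 0 & 0 & 0 \\
\mathbf{e}^T & \mathbf{0}^T & 0 & 0 & 0 & 0 \\
\mathbf{0}^T & \mathbf{e}^T & 0 & 0 & 0 & 0
\end{bmatrix}, \quad
\mathbf{n} = \begin{bmatrix} \mathbf{X}_c \\ \mathbf{Y}_c \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \tag{5.323} \]

with the vector of unknown transformation parameters

\[ \mathbf{X} = \begin{bmatrix} \xi_1 \\ \xi_2 \\ t_x \\ t_y \end{bmatrix}. \tag{5.324} \]

The least squares solution of this adjustment problem is

\[ \begin{bmatrix} \hat{\mathbf{k}} \\ \hat{\mathbf{X}} \end{bmatrix} = \mathbf{N}^{-1}\mathbf{n}. \tag{5.325} \]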

The estimated parameters for the 2D similarity transformation can be utilized as new approximations and

the procedure is repeated until the necessary predefined condition is met.

Solution with a symmetric normal matrix N

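Similarly to the cases of subsections 5.2.4.2 and 5.3.4, a symmetric matrix $\mathbf{N}$ can be obtained by adding the terms $\xi_1\mathbf{v}_x - \xi_2\mathbf{v}_y$ and $\xi_2\mathbf{v}_x + \xi_1\mathbf{v}_y$ to both sides of equations (5.307) and (5.308), respectively. In this way, the equation system (5.321) becomes

\[ \begin{bmatrix} \mathbf{W} & \mathbf{A} \\ \mathbf{A}^T & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{k} \\ \mathbf{X} \end{bmatrix} = \begin{bmatrix} \mathbf{w} \\ \mathbf{0} \end{bmatrix}, \tag{5.326} \]

with the relevant matrices being expressed for this problem as

\[ \mathbf{W} = \begin{bmatrix} \mathbf{W}_1 & \mathbf{W}_2 \\ \mathbf{W}_4 & \mathbf{W}_3 \end{bmatrix}, \tag{5.327} \]

\[ \mathbf{A} = \begin{bmatrix} \mathbf{x}_c + \mathbf{v}_x^0 & -(\mathbf{y}_c + \mathbf{v}_y^0) & \mathbf{e} & \mathbf{0} \\ \mathbf{y}_c + \mathbf{v}_y^0 & \mathbf{x}_c + \mathbf{v}_x^0 & \mathbf{0} & \mathbf{e} \end{bmatrix} \tag{5.328} \]

and

\[ \mathbf{w} = \begin{bmatrix} \mathbf{X}_c + \xi_1^0\mathbf{v}_x^0 - \xi_2^0\mathbf{v}_y^0 \\ \mathbf{Y}_c + \xi_2^0\mathbf{v}_x^0 + \xi_1^0\mathbf{v}_y^0 \end{bmatrix}. \tag{5.329} \]

A solution of this adjustment problem can be computed by equation (5.325), after introducing

\[ \mathbf{N} = \begin{bmatrix} \mathbf{W} & \mathbf{A} \\ \mathbf{A}^T & \mathbf{0} \end{bmatrix}, \quad \mathbf{n} = \begin{bmatrix} \mathbf{w} \\ \mathbf{0} \end{bmatrix}, \tag{5.330} \]

with $\mathbf{N}$ being symmetric.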

The rank deficiency of matrix $\mathbf{W}$, and accordingly of the matrices $\mathbf{W}_1$, $\mathbf{W}_2$, $\mathbf{W}_3$ and $\mathbf{W}_4$, depends on the cofactor matrices of the problem and is important for the inversion of matrix $\mathbf{N}$. As in the adjustment cases from subsections 5.2.4.2 and 5.3.4, the presented criterion (5.102) ensures the existence of a unique solution. The developed iterative procedure for obtaining a weighted least squares solution for the 2D transformation parameters when singular cofactor matrices are given is presented in Algorithm 9.

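Algorithm 9 Least squares 2D similarity transformation of coordinates with singular cofactor matrices
1: Choose approximate values for $\xi_1^0$, $\xi_2^0$, $\mathbf{v}_x^0$, $\mathbf{v}_y^0$ and $\mathbf{v}_X^0$, $\mathbf{v}_Y^0$.
2: Set threshold $\epsilon$ for the break-off condition of the iteration process.
3: Define parameters $d\xi_1 = |\hat{\xi}_1 - \xi_1^0| = \infty$ and $d\xi_2 = |\hat{\xi}_2 - \xi_2^0| = \infty$, for entering the iteration process.
4: while $d\xi_1 = |\hat{\xi}_1 - \xi_1^0| > \epsilon$ or $d\xi_2 = |\hat{\xi}_2 - \xi_2^0| > \epsilon$ do
5:   Compute matrices $\mathbf{W}$, $\mathbf{A}$ and vector $\mathbf{w}$.
6:   Build matrix $\mathbf{N}$ and vector $\mathbf{n}$.
7:   Estimate the vector of unknown parameters $[\hat{\mathbf{k}}^T \; \hat{\mathbf{X}}^T]^T$.
8:   Compute the residual vectors $\mathbf{v}_x$, $\mathbf{v}_y$ and $\mathbf{v}_X$, $\mathbf{v}_Y$.
9:   Compute parameters $d\xi_1 = |\hat{\xi}_1 - \xi_1^0|$ and $d\xi_2 = |\hat{\xi}_2 - \xi_2^0|$.
10:  Update the approximate values with the estimated ones, with $\xi_1^0 = \hat{\xi}_1$, $\xi_2^0 = \hat{\xi}_2$, $\mathbf{v}_x^0 = \mathbf{v}_x$, $\mathbf{v}_y^0 = \mathbf{v}_y$, $\mathbf{v}_X^0 = \mathbf{v}_X$ and $\mathbf{v}_Y^0 = \mathbf{v}_Y$.
11: end while
12: return $\hat{\xi}_1$, $\hat{\xi}_2$, $\hat{t}_x$ and $\hat{t}_y$.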


5.5 Discussion of weighted nonlinear least squares solutions

In this chapter, three adjustment cases that belong to a class of nonlinear least squares problems with a direct solution have been examined: the fitting of a straight line in 2D, the fitting of a plane in 3D and the 2D similarity transformation of coordinates. Further, four weighting scenarios that often occur in practice were considered for each adjustment problem: constant weights for the coordinates in each direction, individual weights for the coordinates of each point, an individual weight for each coordinate, and individually weighted and correlated coordinates. A thorough analysis of these weighted least squares problems has shown that in certain cases a direct solution is still possible. Otherwise, an iterative solution can always be developed without performing any kind of linearization of the problem.

A direct solution has been proven to be always possible for the first two weighting cases of the discussed class of least squares problems. The unknown parameters have been estimated by parametrizing the mathematical model appropriately and minimizing a clearly defined Lagrange function. This led to a system of normal equations that has a nontrivial solution only if its determinant is equal to zero; equivalently, the solution can be obtained by solving an eigenvalue problem. Therefore, two novel systematic approaches have been established for the direct solution of the first two weighting cases, based on the same solution strategy of (Malissiovas et al. 2016).

For the last two weighting cases, it has been demonstrated why a direct least squares solution is not possible. Subsequently, iterative algorithms have been presented that are applicable in all cases without linearizing the original problem. The general idea of the established iterative systematic approach is based on the minimization of a Lagrange function and the solution of a reduced system of normal equations, following (Petrović et al. 1983). In addition, a simple extension to this systematic approach has been demonstrated for the solution of these adjustment problems when singular cofactor matrices are given. The developed iterative algorithms produce the WTLS solution and can be compared to those presented by Fang (2011) and Snow (2012).


6 Numerical Investigations

The developed methodologies and implemented algorithms of the previous chapters are tested here on two application examples: fitting a straight line to measured points in 2D and estimating the 2D similarity transformation parameters between two measured groups of homologous points. Different weighting cases are examined for each adjustment problem. Moreover, the presented solutions for the examples in this chapter have been tested and found to be numerically equal to the least squares solution within the GHM.

6.1 Fitting of a straight line in 2D

This section illustrates the least squares solution for fitting a straight line in 2D. The dataset of the measured point coordinates is listed in Table 6.1. It originates from the work of Pearson (1901) and has since been utilized by many authors. Snow (2012, p. 67) noted that York (1966) introduced unusual weights for the observed coordinates and solved the problem iteratively. Different algorithms for estimating the least squares solution of the same dataset using York's stochastic model have been presented at least by Neri et al. (1989), Schaffrin and Wieser (2008), Shen et al. (2011) and Amiri-Simkooei and Jazaeri (2012). Snow (2012) also introduced correlations between the measured coordinates and solved the problem for regular and singular cofactor matrices.

Table 6.1: Example dataset of measured points in 2D.

Point No.   x-coord. [m]   y-coord. [m]
1           0              5.9
2           0.9            5.4
3           1.8            4.4
4           2.6            4.6
5           3.3            3.5
6           4.4            3.7
7           5.2            2.8
8           6.1            2.8
9           6.5            2.4
10          7.4            1.5


The measured coordinates are affected by random errors in both the x and y directions. Least squares solutions for fitting a straight line to the points are presented below for several different weighting cases.

Equally weighted coordinates

For the first weighting case the measured coordinates are uncorrelated and have been obtained with equal

precision, i.e. equal weights:

\[ p_{x_i} = p_{y_i} = 1. \tag{6.1} \]

For solving the nonlinear problem of fitting a straight line to the ten points of the presented dataset, Pearson (1901) proposed a direct approach using a functional model equivalent to equation (4.32). The least squares solution for the slope of the requested line was found in that article to be $\hat{\beta} = -0.546$. A solution has also been obtained utilizing the algorithm of Neitzel and Petrovic (2008). The results from the GHM are listed in Table 6.2.

Table 6.2: Solution within the GHM using the algorithm of Neitzel and Petrovic (2008).

Estimated parameter                      GHM solution
$\hat{\beta}$ (slope $\hat{a}$)          -0.545561197521
$\hat{\gamma}$ (y-intercept $\hat{b}$)    5.7840437745301

Further, a direct least squares solution was obtained for the unknown line parameters $a$, $b$ and $c$, following the developed methodology of section 4.2.1. The Lagrange function of equation (4.11) leads to a homogeneous system of equations with one unknown parameter $k$ that can be estimated by solving

\[ \begin{vmatrix} (56.396 - k) & -30.43 \\ -30.43 & (17.22 - k) \end{vmatrix} = 0. \tag{6.2} \]

This yields a quadratic equation with the solutions $k_{\min} = 0.618572759437049$ and $k_{\max} = 72.997427240563$. The results for the line parameters can be found in Table 6.3. Parameters $\hat{a}$, $\hat{b}$ and $\hat{c}$ were utilized further to compute parameters $\hat{\beta}$ and $\hat{\gamma}$.

Table 6.3: Direct least squares solution (section 4.2.1).

Estimated parameter                       Least squares solution
$\hat{a}$                                 -0.4789242860482
$\hat{b}$                                 -0.8778562115935
$\hat{c}$                                  5.0775587555999

Computed parameter
$\hat{\beta} = -\hat{a}/\hat{b}$          -0.545561197521
$\hat{\gamma} = -\hat{c}/\hat{b}$          5.7840437745301
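The numbers in equation (6.2) and Table 6.3 can be reproduced with a short script. The sketch below assumes NumPy and interprets the matrix in (6.2) as the scatter matrix of the centred coordinates from Table 6.1; the function name `fit_line_direct` is a hypothetical label, and the eigenvector sign returned by the library may differ from the tabulated one without affecting the slope and intercept.

```python
import numpy as np

# Dataset of Table 6.1.
x = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])

def fit_line_direct(x, y):
    """Equally weighted direct fit of a*x + b*y + c = 0 with a^2 + b^2 = 1."""
    xm, ym = x.mean(), y.mean()
    dx, dy = x - xm, y - ym
    # Scatter matrix of the centred coordinates, cf. the determinant (6.2).
    S = np.array([[dx @ dx, dx @ dy],
                  [dx @ dy, dy @ dy]])
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    k_min = eigvals[0]
    a, b = eigvecs[:, 0]                  # normal vector of the line
    c = -(a * xm + b * ym)                # line passes through the centroid
    return a, b, c, k_min

a, b, c, k_min = fit_line_direct(x, y)
slope, intercept = -a / b, -c / b
```

Under these assumptions, `k_min`, `slope` and `intercept` should agree with $k_{\min}$, $\hat{\beta}$ and $\hat{\gamma}$ given above up to rounding.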


An estimate for the unknown line parameters was also derived using the TLS approach. The determinant of the generalized eigenvalue problem was built following the procedure for the eigenvalue/eigenvector decomposition of matrix $\mathbf{G}$ (equation 4.43). This results in the characteristic equation of the eigenvalues (a quadratic equation) with the solutions $\lambda_{\min} = 0.618572759437049$ and $\lambda_{\max} = 72.997427240563$. The TLS solution for the unknown line parameters is presented in Table 6.4.

Table 6.4: TLS solution (section 4.2.2).

Estimated parameter   TLS solution
$\hat{\beta}$         -0.545561197521
$\hat{\gamma}$         5.7840437745301

The developed direct least squares solution is, as expected, identical to the TLS solution. Both solutions are numerically consistent with the result of the linearized GHM. The requested straight line is depicted in Figure 6.1, together with the measured points and their estimated residuals.

Figure 6.1: Fitting a straight line to points in 2D, with observed x and y coordinates of equal precision.


Equally weighted coordinates in each direction

In this weighting case the measured coordinates are uncorrelated and have been obtained with the same

precision in each direction. The postulated weights are

\[ p_{x_i} = 0.5 \quad \text{and} \quad p_{y_i} = 1.5. \tag{6.3} \]

A direct solution can be derived in this case from the developed methodology of section 5.2.1. The results

are presented in Table 6.5.

Table 6.5: Direct least squares solution (section 5.2.1).

Estimated parameter                       Direct least squares solution
$\hat{a}$                                 -0.4832580303705
$\hat{b}$                                 -0.8754779700726
$\hat{c}$                                  5.0853141652839

Computed parameter
$\hat{\beta} = -\hat{a}/\hat{b}$          -0.5519933646422
$\hat{\gamma} = -\hat{c}/\hat{b}$          5.8086146529331

For comparison reasons the least squares solution has been computed again with the algorithm of (Neitzel

and Petrovic 2008), taking into account the current stochastic model. Both the developed direct solution

and the one within the GHM are numerically identical. The estimated straight line is depicted in Figure 6.2.

Figure 6.2: Fitting a straight line to points in 2D, with observed x and y coordinates and individual constant weights $p_x$, $p_y$ for each coordinate axis.

Individually weighted points

For this weighting example the measured coordinates are uncorrelated and have been obtained with individual precision for each measured point. The postulated weights are listed in Table 6.6.

Table 6.6: Individual weights for each point.

Point No.   $p_x = 1/\sigma_x^2$   $p_y = 1/\sigma_y^2$
1           1                      1
2           1.2                    1.2
3           0.8                    0.8
4           1.1                    1.1
5           0.9                    0.9
6           1.15                   1.15
7           1                      1
8           0.93                   0.93
9           1.25                   1.25
10          1.13                   1.13


A direct weighted least squares solution has been obtained following the procedure of section 5.2.2. The

results for the estimated line parameters can be found in Table 6.7.

Table 6.7: Direct least squares solution (section 5.2.2).

Estimated parameter                       Direct least squares solution
$\hat{a}$                                 -0.4824660036697
$\hat{b}$                                 -0.875914696362
$\hat{c}$                                  5.1014648040614

Computed parameter
$\hat{\beta} = -\hat{a}/\hat{b}$          -0.5508139156399
$\hat{\gamma} = -\hat{c}/\hat{b}$          5.8241571071355

The computed straight line is shown in Figure 6.3.

Figure 6.3: Fitting a straight line to points in 2D, with individual weight for the coordinates of each point.

Individually weighted 2D coordinates

The stochastic model of York (1966) has been adopted for this adjustment example. The measured coordinates are uncorrelated and have been obtained with individual precision. The postulated weights are listed in Table 6.8.

Table 6.8: Individual weights for each coordinate.

Point No.   $p_x = 1/\sigma_x^2$   $p_y = 1/\sigma_y^2$
1           1000                   1
2           1000                   1.8
3           500                    4
4           800                    8
5           200                    20
6           80                     20
7           60                     70
8           20                     70
9           1.8                    100
10          1                      500

An iterative weighted least squares solution has been derived from the developed approach of section 5.2.3,

by employing Algorithm 1. The results are listed in Table 6.9.

Table 6.9: Iterative least squares solution using Algorithm 1 (section 5.2.3).

Estimated parameter                       Iterative least squares solution
$\hat{a}$                                  0.4805334074462
$\hat{c}$                                 -5.4799102240329

Computed parameter
$\hat{\beta} = -\hat{a}/\hat{b}$          -0.4805334074462
$\hat{\gamma} = -\hat{c}/\hat{b}$          5.4799102240329

The presented solution has been found to be numerically identical to the iterative least squares solutions of York (1966) and Neri et al. (1989), the pseudoquadratic algorithm from (Petrović et al. 1983) and the WTLS algorithm of Schaffrin and Wieser (2008). The estimated line is depicted in Figure 6.4.
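As an independent cross-check of Table 6.9, the same dataset and weights can be processed with a York-type fixed-point iteration for uncorrelated coordinates. This is explicitly a different, swapped-in technique and not Algorithm 1 of this thesis; the sketch assumes NumPy, and the function name, tolerance and iteration cap are hypothetical choices.

```python
import numpy as np

# Coordinates of Table 6.1 and weights of Table 6.8.
x = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
y = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
wx = np.array([1000.0, 1000.0, 500.0, 800.0, 200.0, 80.0, 60.0, 20.0, 1.8, 1.0])
wy = np.array([1.0, 1.8, 4.0, 8.0, 20.0, 20.0, 70.0, 70.0, 100.0, 500.0])

def fit_line_york(x, y, wx, wy, tol=1e-14, max_iter=200):
    """Fit y = slope * x + intercept with errors in both coordinates."""
    slope = np.polyfit(x, y, 1)[0]  # ordinary least squares slope as start value
    for _ in range(max_iter):
        # Effective weight of each point for the current slope.
        W = wx * wy / (wx + slope**2 * wy)
        xm, ym = (W @ x) / W.sum(), (W @ y) / W.sum()
        U, V = x - xm, y - ym
        beta = W * (U / wy + slope * V / wx)
        new_slope = ((W * beta) @ V) / ((W * beta) @ U)
        converged = abs(new_slope - slope) < tol
        slope = new_slope
        if converged:
            break
    # Intercept from the weighted centroid belonging to the final slope.
    W = wx * wy / (wx + slope**2 * wy)
    xm, ym = (W @ x) / W.sum(), (W @ y) / W.sum()
    return slope, ym - slope * xm

slope, intercept = fit_line_york(x, y, wx, wy)
```

Under these assumptions the returned slope and intercept should agree with $\hat{\beta}$ and $\hat{\gamma}$ in Table 6.9 up to rounding.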

Figure 6.4: Fitting a straight line to points in 2D, with individual weight for each measured coordinate.

Individually weighted and correlated 2D coordinates

In addition to the stochastic model of Table 6.8, correlations between the measured coordinates of each point

are postulated in this example. This is for instance the case when polar coordinates of points have been

originally measured, while their Cartesian coordinates are utilized in the adjustment, together with their

stochastic properties from a linear error propagation. For comparison reasons, the necessary correlations

between the point coordinates have been taken directly from the numerical investigations of (Snow 2012,

pp. 68-70) (i.e. the case of a regular cofactor matrix) and are listed in Table 6.10.


Table 6.10: Individual weights for each coordinate and correlations for each point.

Point No.   $p_x = 1/\sigma_x^2$   $p_y = 1/\sigma_y^2$   $\rho_{xy}$
1           1000                   1                      -0.165956
2           1000                   1.8                     0.440649
3           500                    4                      -0.999771
4           800                    8                      -0.395335
5           200                    20                     -0.706488
6           80                     20                     -0.815323
7           60                     70                     -0.627480
8           20                     70                     -0.308879
9           1.8                    100                    -0.206465
10          1                      500                     0.077633

The variance-covariance matrix for this adjustment problem can be computed by

ΣLL ="Σxx Σxy

Σyx Σyy #,(6.4)

with

$$\Sigma_{xx} = \operatorname{diag}\left(\sigma_{x_1}^2,\ \sigma_{x_2}^2,\ \ldots,\ \sigma_{x_{10}}^2\right), \tag{6.5}$$

$$\Sigma_{yy} = \operatorname{diag}\left(\sigma_{y_1}^2,\ \sigma_{y_2}^2,\ \ldots,\ \sigma_{y_{10}}^2\right) \tag{6.6}$$

and


$$\Sigma_{xy} = \Sigma_{yx} = \operatorname{diag}\left(\sigma_{x_1 y_1},\ \sigma_{x_2 y_2},\ \ldots,\ \sigma_{x_{10} y_{10}}\right). \tag{6.7}$$

The individual covariances between the coordinates of each point are

$$\sigma_{xy} = \rho_{xy}\,\sigma_x\,\sigma_y. \tag{6.8}$$

Setting the variance of the universal weight equal to one ($\sigma_0^2 = 1$), the cofactor matrix of the observations is

$$Q_{LL} = \begin{bmatrix} Q_{xx} & Q_{xy} \\ Q_{yx} & Q_{yy} \end{bmatrix} = \sigma_0^2\,\Sigma_{LL} = \Sigma_{LL}. \tag{6.9}$$

For a regular cofactor matrix it is possible to compute the weight matrix

$$P = Q_{LL}^{-1}. \tag{6.10}$$
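The construction of the stochastic model from Table 6.10 can be illustrated with a short sketch. Because correlations exist only between the two coordinates of the same point, the cofactor matrix consists of ten independent 2×2 blocks, so the weight matrix can be obtained block by block. The helper names (`cofactor_block`, `invert_2x2`, `assemble`) are hypothetical and not taken from the thesis.

```python
import math

# Rows of Table 6.10: (p_x, p_y, rho_xy) for points 1 to 10.
table_6_10 = [
    (1000.0, 1.0, -0.165956), (1000.0, 1.8, 0.440649), (500.0, 4.0, -0.999771),
    (800.0, 8.0, -0.395335), (200.0, 20.0, -0.706488), (80.0, 20.0, -0.815323),
    (60.0, 70.0, -0.627480), (20.0, 70.0, -0.308879), (1.8, 100.0, -0.206465),
    (1.0, 500.0, 0.077633),
]


def cofactor_block(p_x, p_y, rho):
    """2x2 cofactor block of one point, with sigma_0^2 = 1."""
    var_x, var_y = 1.0 / p_x, 1.0 / p_y
    cov_xy = rho * math.sqrt(var_x) * math.sqrt(var_y)  # equation (6.8)
    return [[var_x, cov_xy], [cov_xy, var_y]]


def invert_2x2(q):
    """Inverse of a regular symmetric 2x2 block."""
    det = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    if det <= 0.0:
        raise ValueError("block is singular; equation (6.10) is not applicable")
    return [[q[1][1] / det, -q[0][1] / det], [-q[1][0] / det, q[0][0] / det]]


def assemble(blocks):
    """Place 2x2 blocks into a 2n x 2n matrix ordered (x_1..x_n, y_1..y_n)."""
    n = len(blocks)
    full = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, b in enumerate(blocks):
        full[i][i], full[i][n + i] = b[0][0], b[0][1]
        full[n + i][i], full[n + i][n + i] = b[1][0], b[1][1]
    return full


q_blocks = [cofactor_block(*row) for row in table_6_10]
Q_LL = assemble(q_blocks)                           # equations (6.4) to (6.9)
P = assemble([invert_2x2(b) for b in q_blocks])     # equation (6.10)
print("Q_LL is", len(Q_LL), "x", len(Q_LL[0]))
```

Point 3 has a correlation coefficient very close to -1, so its block is nearly singular and the corresponding weights become very large; the block-wise inversion keeps this effect confined to that point.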

A least squares solution for the unknown line parameters can be estimated iteratively, utilizing Algorithm 2

from the developed approach of section 5.2.4.1. The results are presented in Table 6.11.

Table 6.11: Iterative least squares solution using Algorithm 2 (section 5.2.4.1).

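| Estimated parameter | Iterative least squares solution |
|---|---|
| $\hat{a}$ | 0.4592286797279 |
| $\hat{c}$ | -5.357272562041 |
| **Computed parameter** | |
| $\hat{\beta} = -\hat{a}/\hat{b}$ | -0.4592286797279 |
| $\hat{\gamma} = -\hat{c}/\hat{b}$ | 5.357272562041 |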

The results from the proposed approach have been compared with the solution from the WTLS algorithm presented in (Snow 2012, p. 72) and found to be numerically identical. The estimated straight line is depicted in Figure 6.5.


Figure 6.5: Fitting a straight line to points in 2D, with individually weighted and correlated coordinates for

each point.

Solution with a singular cofactor matrix

A singular variance-covariance matrix is postulated in this example for the measured point coordinates

including correlations between the observations. This is the case when, for example, the 2D Cartesian

coordinates of the points have been obtained by a least squares adjustment of a free network and their

stochastic properties by a linearized error propagation. The same stochastic model as in (Snow 2012, pp.

71-72) is utilized here, for the case of a singular cofactor matrix which satisfies the NS criterion, in order

to compare the results of the proposed approach. The necessary variances and covariances can be found

in Appendix A.1 and lead to a singular matrix W with rank deficiency equal to 2. However, applying the

developed criterion from equation (5.102) results in

$$\operatorname{rank}([W \mid A]) = 10 = n, \tag{6.11}$$

which ensures that a unique solution still exists, with

- rank of W = 8 < n, with n = number of condition equations;

- rank of A = 2 = m, with m = number of unknown parameters;

- redundancy: $r_d = n - m = 10 - 2 = 8$.


Here it can be seen that the rank of matrix W is smaller than the number of condition equations n, which is caused by the rank deficiency of the introduced cofactor matrices.
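The rank test of equation (6.11) can be sketched numerically as follows. The matrices below are small toy values chosen only for illustration; they are not the matrices W and A of this example, whose definitions are given with equation (5.102). The function `matrix_rank` is a hypothetical helper based on Gaussian elimination with a fixed tolerance, which is adequate for a sketch but not a substitute for an SVD-based rank decision on real data.

```python
def matrix_rank(rows, tol=1e-10):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    a = [[float(value) for value in row] for row in rows]
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Largest remaining entry in this column serves as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= tol:
            continue  # column depends on the previous ones
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


def criterion_fulfilled(W, A):
    """Check rank([W | A]) = n for an n x n matrix W and an n x m matrix A."""
    augmented = [w_row + a_row for w_row, a_row in zip(W, A)]
    return matrix_rank(augmented) == len(W)


# Toy values (hypothetical): W is singular with rank 2 < n = 3.
W = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
A_good = [[1.0], [1.0], [1.0]]   # fills the missing direction of W
A_bad = [[1.0], [1.0], [0.0]]    # leaves the third row of [W | A] empty

print(matrix_rank(W), criterion_fulfilled(W, A_good), criterion_fulfilled(W, A_bad))
```

In the toy case the singular matrix W alone has rank 2, but the augmented matrix reaches full row rank only when A contributes the direction that W lacks.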

A least squares solution for the unknown line parameters has been estimated iteratively, using Algorithm 3

from section 5.2.4.2. The results are shown in Table 6.12.

Table 6.12: Iterative least squares solution using Algorithm 3 (section 5.2.4.2).

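| Estimated parameter | Iterative least squares solution |
|---|---|
| $\hat{a}$ | 0.4931726182468 |
| $\hat{c}$ | -5.54275204298 |
| **Computed parameter** | |
| $\hat{\beta} = -\hat{a}/\hat{b}$ | -0.4931726182468 |
| $\hat{\gamma} = -\hat{c}/\hat{b}$ | 5.54275204298 |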

The solution for the line parameters is numerically equal to the WTLS solution from (Snow 2012, p. 72)

and the solution within the GHM for the case of a singular cofactor matrix. The computed straight line is

depicted in Figure 6.6.


Figure 6.6: Fitting a straight line to points in 2D, with a singular cofactor matrix.


6.2 2D similarity transformation of coordinates

The least squares solution of the 2D similarity transformation is presented in this subsection following the

developed methodologies and algorithms of chapters 4 and 5. The coordinates of four homologous points

have been measured in two coordinate systems, the target XY and the source xy system and are listed in

Table 6.13.

Table 6.13: Example dataset for the 2D similarity transformation

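| Point No. $i$ | Target system $X_i$ [m] | Target system $Y_i$ [m] | Source system $x_i$ [m] | Source system $y_i$ [m] |
|---|---|---|---|---|
| 1 | -117.478 | 0 | 17.856 | 144.794 |
| 2 | 117.472 | 0 | 252.637 | 154.448 |
| 3 | 0.015 | -117.41 | 140.089 | 32.326 |
| 4 | -0.014 | 117.451 | 130.40 | 267.027 |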

This dataset originates from (Mikhail et al. 2001, pp. 397-402) and has been utilized in the past at least by

Felus and Schaffrin (2005), Neitzel (2010) and Malissiovas et al. (2016).

Equally weighted coordinates

In the first weighting case of this numerical example the coordinates of the homologous points are equally weighted and uncorrelated, i.e.

$$p_{X_i} = p_{Y_i} = p_{x_i} = p_{y_i} = 1. \tag{6.12}$$

A GHM has been employed by Neitzel (2010) for deriving the least squares solution for the unknown

transformation parameters between the coordinate systems. The results from the GHM are presented in

Table 6.14.

Table 6.14: Results from Neitzel (2010).

| Estimated parameter | GHM solution |
|---|---|
| Parameter $\hat{\xi}_1$ | 0.99900748077781 |
| Parameter $\hat{\xi}_2$ | -0.04109806319405 |
| Scale factor $\hat{\mu}$ | 0.99985248784424 |
| Rotational angle $\hat{\phi}$ | -2° 21′ 20.72″ |
| Translation parameter $\hat{t}_x$ | -141.2628 m |
| Translation parameter $\hat{t}_y$ | -143.9316 m |

The developed direct least squares solution for the 2D similarity transformation of section 4.5.1 is presented in Table 6.15. For comparison reasons, the estimates for $\alpha$, $\beta$ and $\gamma$ can be substituted into equation (4.110) to derive the parameters $\xi_1$ and $\xi_2$. Furthermore, the rotational angle $\phi$ as well as the scale factor $\mu$ can be computed by substituting $\hat{\xi}_1$ and $\hat{\xi}_2$ into equation (4.102). Thus, the rotational angle is

$$\hat{\phi} = \arctan\left(\frac{\hat{\xi}_2}{\hat{\xi}_1}\right) \tag{6.13}$$

and the scale factor

$$\hat{\mu} = \frac{\hat{\xi}_1}{\cos\hat{\phi}}. \tag{6.14}$$

Similarly, the translation parameters $t_x$ and $t_y$ can be derived from equation (4.106).
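Equations (6.13) and (6.14) can be evaluated directly with the GHM estimates of Table 6.14, as in the following minimal sketch. The helper `to_dms`, which converts radians to degrees, minutes and seconds, is hypothetical and only serves to present the angle in the same form as the tables.

```python
import math

# Estimates taken from Table 6.14.
xi1 = 0.99900748077781
xi2 = -0.04109806319405

phi = math.atan(xi2 / xi1)   # equation (6.13), in radians
mu = xi1 / math.cos(phi)     # equation (6.14)


def to_dms(angle_rad):
    """Split an angle into (sign, degrees, minutes, seconds)."""
    sign = -1 if angle_rad < 0 else 1
    total_seconds = abs(math.degrees(angle_rad)) * 3600.0
    degrees = int(total_seconds // 3600)
    minutes = int((total_seconds % 3600) // 60)
    seconds = total_seconds % 60
    return sign, degrees, minutes, seconds


sign, d, m, s = to_dms(phi)
print(f"mu  = {mu:.14f}")
print(f"phi = {'-' if sign < 0 else ''}{d} deg {m} min {s:.4f} sec")
```

The scale factor obtained this way equals $\sqrt{\hat{\xi}_1^2 + \hat{\xi}_2^2}$, which offers a simple cross-check of the two formulas.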

Table 6.15: Direct least squares solution for the 2D similarity transformation

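| Estimated parameter | Direct least squares solution |
|---|---|
| $\hat{\alpha}$ | -0.4832580303705 |
| $\hat{\beta}$ | -0.8754779700726 |
| $\hat{\gamma}$ | 5.0853141652839 |
| **Computed parameter** | |
| $\hat{\xi}_1 = -\hat{\alpha}/\hat{\gamma}$ | 0.99900748077781 |
| $\hat{\xi}_2 = -\hat{\beta}/\hat{\gamma}$ | -0.04109806319405 |
| $\hat{\mu}$ | 0.99985248784424 |
| $\hat{\phi}$ | -2° 21′ 20.723943558″ |
| $\hat{t}_x$ | -141.2627900259449 |
| $\hat{t}_y$ | -143.9316426333377 |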

In addition, the TLS solution of section 4.5.2 has been computed and found to be identical to the solution

from the proposed direct least squares approach. Both coincide numerically with the results from the GHM, as can be seen from Tables 6.14 and 6.15.
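As a plausibility check of the estimated parameters, the source coordinates of Table 6.13 can be transformed and compared with the measured target coordinates. The functional model used in the sketch, $X = \xi_1 x - \xi_2 y + t_x$ and $Y = \xi_2 x + \xi_1 y + t_y$, is an assumption about the form of equation (4.102), which is not repeated in this section; it is consistent with the sign of the rotation angle in Table 6.14. The remaining discrepancies are not the least squares residuals, because the adjustment distributes corrections over the coordinates of both systems.

```python
# Table 6.13: (X, Y) in the target system and (x, y) in the source system, in metres.
points = [
    (-117.478, 0.0, 17.856, 144.794),
    (117.472, 0.0, 252.637, 154.448),
    (0.015, -117.41, 140.089, 32.326),
    (-0.014, 117.451, 130.40, 267.027),
]

# Estimates from Table 6.14 (translations as printed there, in metres).
xi1, xi2 = 0.99900748077781, -0.04109806319405
tx, ty = -141.2628, -143.9316


def to_target(x, y):
    """ASSUMED model: X = xi1*x - xi2*y + tx, Y = xi2*x + xi1*y + ty."""
    return xi1 * x - xi2 * y + tx, xi2 * x + xi1 * y + ty


discrepancies = []
for X, Y, x, y in points:
    X_t, Y_t = to_target(x, y)
    discrepancies.append((X - X_t, Y - Y_t))

for i, (dX, dY) in enumerate(discrepancies, start=1):
    print(f"point {i}: dX = {dX:+.4f} m, dY = {dY:+.4f} m")
```

All discrepancies stay at the level of a few centimetres, which is what one would expect for a dataset of this kind if the assumed model form is the one used in the adjustment.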

Equally weighted coordinates in each coordinate system

Equal weights between the coordinates of each coordinate system are assumed for the second weighting case,

with

$$p_{XY_i} = 1.1 \quad \text{and} \quad p_{xy_i} = 0.9. \tag{6.15}$$

A direct weighted least squares solution has been derived here, following the developed methodology of

section 5.4.1. The results are presented in Table 6.16.


Table 6.16: Direct least squares solution (section 5.4.1)

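| Estimated parameter | Direct least squares solution |
|---|---|
| $\hat{\alpha}$ | -0.7064570685103 |
| $\hat{\beta}$ | 0.0290628626954 |
| $\hat{\gamma}$ | 0.7071589357166 |
| **Computed parameter** | |
| $\hat{\xi}_1 = -\hat{\alpha}/\hat{\gamma}$ | 0.9990074830836 |
| $\hat{\xi}_2 = -\hat{\beta}/\hat{\gamma}$ | -0.0410980632889 |
| $\hat{\mu}$ | 0.99985249015199 |
| $\hat{\phi}$ | -2° 21′ 20.723943556″ |
| $\hat{t}_x$ | -141.2627903519875 |
| $\hat{t}_y$ | -143.9316429655667 |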

Equally weighted homologous points in both systems

For this weighting example the coordinates of the homologous points have been measured with individual

precisions. The weights of this stochastic model are listed in Table 6.17.

Table 6.17: Individual weights for homologous points in both systems.

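| Point No. $i$ | Target: $p_{X_i} = 1/\sigma_X^2$ | Target: $p_{Y_i} = 1/\sigma_Y^2$ | Source: $p_{x_i} = 1/\sigma_x^2$ | Source: $p_{y_i} = 1/\sigma_y^2$ |
|---|---|---|---|---|
| 1 | 0.9 | 0.9 | 0.9 | 0.9 |
| 2 | 1.05 | 1.05 | 1.05 | 1.05 |
| 3 | 0.85 | 0.85 | 0.85 | 0.85 |
| 4 | 1.3 | 1.3 | 1.3 | 1.3 |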

A direct weighted least squares solution has been estimated for this example following the solution strategy

of section 5.4.2. Table 6.18 contains the results.


Table 6.18: Direct least squares solution (section 5.4.2)

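| Estimated parameter | Direct least squares solution |
|---|---|
| $\hat{\alpha}$ | -0.7064686408794 |
| $\hat{\beta}$ | 0.0290681956735 |
| $\hat{\gamma}$ | 0.7071471554453 |
| **Computed parameter** | |
| $\hat{\xi}_1 = -\hat{\alpha}/\hat{\gamma}$ | 0.9990404902845 |
| $\hat{\xi}_2 = -\hat{\beta}/\hat{\gamma}$ | -0.0411062894755 |
| $\hat{\mu}$ | 0.99988580761122 |
| $\hat{\phi}$ | -2° 21′ 22.139597905″ |
| $\hat{t}_x$ | -141.2687384001714 |
| $\hat{t}_y$ | -143.9337541051444 |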

Individually weighted coordinates

Individual precision for each measured coordinate has been postulated in this case. The introduced weights

of this stochastic model are listed in Table 6.19.

Table 6.19: Individual weight for each coordinate.

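| Point No. $i$ | Target: $p_{X_i} = 1/\sigma_X^2$ | Target: $p_{Y_i} = 1/\sigma_Y^2$ | Source: $p_{x_i} = 1/\sigma_x^2$ | Source: $p_{y_i} = 1/\sigma_y^2$ |
|---|---|---|---|---|
| 1 | 0.95 | 1.2 | 1 | 0.8 |
| 2 | 1 | 0.75 | 1.15 | 0.9 |
| 3 | 0.95 | 1.3 | 0.7 | 0.85 |
| 4 | 1 | 0.85 | 1.2 | 1 |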

An iterative weighted least squares solution for the unknown transformation parameters has been computed

by using Algorithm 7 from section 5.4.3. The results are presented in Table 6.20.


Table 6.20: Iterative least squares solution (section 5.4.3)

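| Estimated parameter | Iterative least squares solution |
|---|---|
| $\hat{\xi}_1$ | 0.99899700973504 |
| $\hat{\xi}_2$ | -0.041113617539 |
| $\hat{t}_x$ | -141.2636321979249 |
| $\hat{t}_y$ | -143.9334627584676 |
| **Computed parameter** | |
| $\hat{\mu}$ | 0.99984266512622 |
| $\hat{\phi}$ | -2° 21′ 24.018841465″ |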

Individually weighted and correlated coordinates in each coordinate system

In this adjustment example correlations are present only between the two coordinates of each point, in both coordinate systems. The correlation coefficients have been generated randomly, as in (Snow 2012) for the example of fitting a straight line in 2D. Here the randn function of GNU Octave (version 4.0.0) has been used. The introduced weights, together with the correlation coefficients, are listed in Table 6.21 for the coordinates in the target system and in Table 6.22 for the coordinates in the source system.

Table 6.21: Weights and correlations for the coordinates of the points in the target system.

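| Point No. $i$ | Target: $p_{X_i} = 1/\sigma_X^2$ | Target: $p_{Y_i} = 1/\sigma_Y^2$ | $\rho_{XY_i}$ |
|---|---|---|---|
| 1 | 0.95 | 1.2 | 0.2 |
| 2 | 1 | 0.75 | -0.3 |
| 3 | 0.95 | 1.3 | -0.5 |
| 4 | 1 | 0.85 | 0.2 |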

Table 6.22: Weights and correlations for the coordinates of the points in the source system.

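| Point No. $i$ | Source: $p_{x_i} = 1/\sigma_x^2$ | Source: $p_{y_i} = 1/\sigma_y^2$ | $\rho_{xy_i}$ |
|---|---|---|---|
| 1 | 1 | 0.8 | 0.2 |
| 2 | 1.15 | 0.9 | -0.4 |
| 3 | 0.7 | 0.85 | -0.3 |
| 4 | 1.2 | 1 | 0.2 |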

An iterative weighted least squares solution was derived for this adjustment problem utilizing Algorithm 8

from section 5.4.4. The results are presented in Table 6.23.


Table 6.23: Iterative least squares solution (section 5.4.4)

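| Estimated parameter | Iterative least squares solution |
|---|---|
| $\hat{\xi}_1$ | 0.9990276226893 |
| $\hat{\xi}_2$ | -0.0411159727 |
| $\hat{t}_x$ | -141.26679907 |
| $\hat{t}_y$ | -143.93227343 |
| **Computed parameter** | |
| $\hat{\mu}$ | 0.99987334903342 |
| $\hat{\phi}$ | -2° 21′ 24.244598358″ |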

Solution with a singular cofactor matrix

A singular cofactor matrix is postulated in the last weighting example of the 2D similarity transformation.

For the sake of comparison, the dataset from the numerical example in (Snow 2012, pp. 77-82) and (Neitzel

and Schaffrin 2017) is adopted. The observed coordinates are listed in Table 6.24.

Table 6.24: Example dataset from Neitzel and Schaffrin (2016).

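| Point No. $i$ | Target system $X_i$ [m] | Target system $Y_i$ [m] | Source system $x_i$ [m] | Source system $y_i$ [m] |
|---|---|---|---|---|
| 1 | 400.0040 | 100.0072 | 453.8001 | 137.6099 |
| 2 | 500.0019 | 299.9994 | 521.2865 | 350.7972 |
| 3 | 399.9925 | 399.9933 | 406.8728 | 433.9247 |
| 4 | 100.0059 | 400.0022 | 110.5545 | 386.9880 |
| 5 | 99.9956 | 99.9978 | 157.4861 | 90.6802 |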

The coordinates of the homologous points in the two coordinate systems are the outcome of a free network adjustment. An approximate solution for the variances and covariances of these coordinates has also been computed using a linearized error propagation. As explained by Neitzel and Schaffrin (2017), the resulting cofactor matrices are fully populated (i.e. the off-diagonal elements are not zero) and rank deficient, but still fulfil the NS criterion. The introduced singular cofactor matrices for this numerical example can be found in Appendix A.2 and lead to a singular matrix W with rank deficiency equal to 2. For this numerical example, too, the developed criterion from equation (5.102) leads to

$$\operatorname{rank}([W \mid A]) = 10 = n, \tag{6.16}$$

which ensures that a unique solution for the unknown parameters exists, with

- rank of W = 8 < n, with n = number of condition equations;

- rank of A = 4 = m, with m = number of unknown parameters;

- redundancy: $r_d = n - m = 10 - 4 = 6$.

An iterative least squares solution has been computed for this adjustment case using Algorithm 9 from

section 5.4.4. Table 6.25 contains the resulting estimates for the transformation parameters.

Table 6.25: Iterative least squares solution (section 5.4.4)

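| Estimated parameter | Iterative least squares solution |
|---|---|
| $\hat{\xi}_1$ | 0.9876550155542 |
| $\hat{\xi}_2$ | -0.1564292113176 |
| $\hat{t}_x$ | -69.726354301821 |
| $\hat{t}_y$ | 35.0782153796499 |
| **Computed parameter** | |
| $\hat{\mu}$ | 0.99996626338233 |
| $\hat{\phi}$ | -9° 0′ 0.004803006″ |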

The developed iterative procedure provides exactly the same numerical result as the least squares solution within the GHM presented by Neitzel and Schaffrin (2017) and the WTLS solution from (Snow 2012, p. 80).


7 Conclusion and outlook

7.1 Conclusion

The fundamental principles of adjustment calculus and the method of least squares have been briefly discussed in chapter 2, setting the basis for the methodological developments of this thesis. A review of related works in adjustment calculus has shown that the mathematical modelling of the measurement results embodies the most fundamental parts of every adjustment problem (i.e. the functional and stochastic model). Only a correct mathematical model can lead to meaningful estimates for the unknown parameters and the residuals of each problem. For the purposes of this research an unambiguous definition of linear and nonlinear least squares problems has been given in chapter 2. Additionally, the evaluation of the adjustment results has been discussed in terms of precision and reliability. Various approaches that are common in the geodetic literature were presented for the computation and correct interpretation of the stochastic parameters of the estimated unknowns.

The main subject of this thesis is the least squares solution of a class of nonlinear adjustment problems.

For this reason, two solution strategies were highlighted in chapter 3. The first is related to the traditional

approaches that are commonly used in geodesy and two famous adjustment models, namely the GMM and

the GHM. Based on the Gauss-Newton approach, a least squares solution can be derived iteratively and by

linearizing the nonlinear functional model. The second strategy includes the most modern algorithms that

have been developed in the last decades by the mathematical/statistical community. In the TLS literature

the discussed problems are often expressed within the EIV model and various algorithmic approaches have

been presented for a solution. Depending on the stochastic model, the presented TLS solutions are direct

and cover the cases of equally weighted and uncorrelated measurements, while the WTLS solutions are

iterative and can usually deal with more general weighting cases. Chapter 3 comes to the conclusion that TLS and WTLS provide the (weighted) least squares solution of the discussed class of problems. Hence, it is in agreement with the views of Petrović (2003) and Neitzel (2010) that TLS is not a new method but a

special case of the least squares method.

In the fourth chapter of this thesis, the solution of four individual nonlinear adjustment problems is discussed:

the fitting of a straight line in 2D and 3D, the fitting of a plane in 3D and the 2D similarity transformation of

coordinates. A mathematical relationship between direct least squares and TLS solutions has been presented

that was based on the publication of Malissiovas et al. (2016). This shows that TLS produces the least

squares solution of a problem using SVD, while the exact solution can also be achieved by following the

standard procedure for the least squares solution of a problem, i.e. by minimizing the sum of squared

residuals. Additionally, a new solution strategy has been established for the direct least squares solution of

the investigated class of problems. The developments of this chapter give an overview of those adjustment problems that can be transformed into an eigenvalue problem and thus clarify in which cases a solution from TLS is possible. This chapter demonstrates that TLS is an algorithmic approach for obtaining the least squares solution and not a method.
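The equivalence summarized here can be illustrated for the simplest case, the fit of a straight line to equally weighted and uncorrelated 2D points. The sketch below computes the slope once as the root of a quadratic equation obtained by minimizing the sum of squared orthogonal distances, and once from the eigenvector of the smallest eigenvalue of the 2×2 matrix of centred moments, which is the eigenvalue counterpart of an SVD-based TLS solution. It is a generic textbook illustration and not a transcription of the derivations in chapter 4; the function names are hypothetical, and the coordinates are the classical Pearson data, used here only as example input.

```python
import math

# Example input (ASSUMPTION: classical Pearson data, equal weights for all coordinates).
x = [0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4]
y = [5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5]


def centred_moments(x, y):
    """Centroid and centred second moments of the points."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy


def slope_from_quadratic(sxx, syy, sxy):
    """Minimizing root of sxy*b^2 + (sxx - syy)*b - sxy = 0 (requires sxy != 0)."""
    root = math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)
    return ((syy - sxx) + root) / (2.0 * sxy)


def slope_from_eigenvector(sxx, syy, sxy):
    """Slope from the eigenvector of the smallest eigenvalue (requires sxy != 0)."""
    lam_min = 0.5 * ((sxx + syy) - math.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2))
    n1, n2 = sxy, lam_min - sxx   # normal vector of the line
    return -n1 / n2


mx, my, sxx, syy, sxy = centred_moments(x, y)
b_direct = slope_from_quadratic(sxx, syy, sxy)
b_eigen = slope_from_eigenvector(sxx, syy, sxy)
intercept = my - b_direct * mx    # the line passes through the centroid
print(b_direct, b_eigen, intercept)
```

Both routes return the same slope up to rounding, which is the point made above: the eigenvalue (or SVD) route is one way of computing the least squares solution rather than a different estimation principle.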

The findings in chapter 5 provide an overview of possible weighted least squares solutions for the discussed

class of nonlinear adjustment problems, i.e. the WTLS solution for the mathematical/statistical community.

Different weighting scenarios and correlations between the observations were postulated for each problem.

It has been shown that for certain weighting cases a direct solution still exists. For these cases two novel

direct approaches have been proposed. Further, general weight matrices have been examined including also

cofactor matrices with correlations between the measurements. New algorithms have been developed and

presented for the iterative weighted least squares solution of this class of problems without linearizing the

original problem in any step of the procedure. In addition, singular cofactor matrices that still fulfill the

criterion of Neitzel and Schaffrin (2016) have been taken into account. The presented algorithms can handle

the latter stochastic model without the need for any special treatment of the adjustment problem.

The implemented algorithms are based on the established solution strategies for obtaining a (weighted) least

squares solution for the investigated class of nonlinear problems. The direct approaches can be employed

in engineering tasks for which efficiency is important, i.e. no need for starting values for the unknown

parameters or iterations to obtain a minimum solution for the objective function. For instance, straight

lines or planes can be fitted directly to measured 3D point clouds from laser scanners, or the similarity

transformation parameters between several sets of homologous points can be estimated, if such a stochastic

model is postulated that can lead to a direct solution. In case of more general weight matrices or taking

correlations between the measurements into account in the stochastic model, a weighted least squares solution

is provided by the presented iterative algorithms.

7.2 Outlook

Complex adjustment problems can be further tackled in future research using the knowledge that was

acquired so far. Thus, all the developed algorithms can be possibly extended in order to provide elegant

solutions for other adjustment cases such as:

• Variance Component Estimation (VCE) of individual groups of observations,

• Tikhonov regularization of ill-posed problems.

Moreover, the discussed (weighted) least squares solutions can be considered as “optimal” only if the random

errors that influence the measurements are normally distributed. In the presence of outliers (or blunders),

however, other adjustment methods may be preferred that are more robust in the sense that the solution is not distorted by a small number of outliers. Thus, the developed algorithms and solution strategies can

be extended to become more robust against outliers. Future developments can be based on the following

objectives:

- Robust estimators have been studied for decades by geodesists, mathematicians and statisticians. These estimators can be divided into two main groups: the M-estimators (based on the maximum likelihood method) with a breakdown point of 5 to 10%, and estimators with a higher breakdown point, for example the L1-norm, which can reach a breakdown point of 50% of blunders. It is worth mentioning the rigorous solution via linear programming (Dantzig 1963, Dantzig and Thapa 2006), for example using the simplex algorithm presented in (Dantzig 1949) or (Barrodale and Roberts 1974). However, the minimization of the L1-norm with linear programming has, interestingly, been neglected in the geodetic literature with some exceptions, for example (Fuchs 1980) or (Fuchs 1982). It is still not known whether such algorithms can be developed for a robust solution of adjustment problems within the EIV model.

- An approximate solution by minimizing an $L_p$-norm could be obtained by an iterative procedure of reweighted least squares, which is almost exclusively used by geodesists. Some first examples can be found in the algorithm of Schlossmacher (1973) and the contributions of Krarup et al. (1980) and Somogyi and Závoti (1993). In fact, reweighted least squares can fail to provide a correct solution in some cases, as has already been pointed out by Neitzel (2004) for the case of $L_1$. A thorough analysis of the robustness of various $L_p$-norms has been presented by Marx (2013), who noticed that $L_p$ for $1.2 < p < 1.5$ may be less resistant to outliers than $L_1$ and proposed $L_{1.05}$ as an alternative solution. New robust algorithms can be implemented for the detection of blunders, based on the combination of a reweighted approach and the direct solutions of weighted nonlinear least squares from this thesis.

- Most of the mentioned procedures fail to provide a resistant solution when leverage points¹ exist in the dataset. Rigorous methods for the identification of erroneous data can be global optimization methods based on systematic or stochastic search, as demonstrated by the study of Marx (2015) for the detection of blunders by means of a Monte Carlo simulation. Combinatorial approaches can also be employed when leverage points are present. Examples of combinatorial approaches include the maximum subsample (MSS) method developed theoretically by Neitzel (2004, p. 109) and employed successfully by Neitzel and Marx (2007), Wujanz et al. (2016) and Wujanz (2016) for the detection of deformations using laser scanning data. The employment of such methods for the implementation of modern and robust algorithms is of great interest.
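The idea of iteratively reweighted least squares mentioned in the second objective can be sketched for the simplest conceivable case, the estimation of a single location parameter under the L1-norm. This is a deliberately minimal illustration with made-up observations and a hypothetical function name; it is not an EIV problem and says nothing about the cases in which reweighting fails. Each iteration solves an ordinary weighted least squares problem with the weights $1/|v_i|$ computed from the residuals of the previous iteration, and a small constant guards against division by zero.

```python
def irls_l1_location(values, iterations=100, eps=1e-12):
    """Approximate the L1-norm estimate of a location parameter by reweighting."""
    estimate = sum(values) / len(values)   # start from the least squares solution (mean)
    for _ in range(iterations):
        # Weights from the previous residuals; eps avoids division by zero.
        weights = [1.0 / max(abs(v - estimate), eps) for v in values]
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate


# Made-up observations with one gross error (hypothetical values).
observations = [1.0, 2.0, 3.0, 4.0, 100.0]
mean_value = sum(observations) / len(observations)
robust_value = irls_l1_location(observations)
print(mean_value, robust_value)
```

For these observations the arithmetic mean is pulled towards the gross error, whereas the reweighted estimate settles at the median of the sample.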

¹ See e.g. (Everitt and Skrondal 2010) for a definition.


Appendices


A Stochastic models for the numerical investigations

A.1 Singular cofactor matrix for fitting a straight line in 2D

A singular cofactor matrix is presented in this appendix for the application example of fitting a straight

line in 2D. It was computed by utilizing the correlation coefficients from (Snow 2012, pp. 94-95), together

with the necessary precisions of the measured coordinates from Table 6.8. The variance-covariance matrix

for this adjustment problem is

$$Q_{LL} = \begin{bmatrix} Q_{xx} & Q_{xy} \\ Q_{yx} & Q_{yy} \end{bmatrix}, \tag{A.1}$$

with the sub-matrices:

$Q_{xx}$ = (10 × 10; columns 1 to 5 are listed first, followed by columns 6 to 10)







0.001 −0.00010729581873004 −0.000184367806311119 −0.000162605765335955 −0.000396422204114424 ···

−0.00010729581873004 0.001 −0.000209334092987561 −0.00018594935307068 −0.000458502562871592 ···

−0.000184367806311119 −0.000209334092987561 0.002 −0.000326944096945427 −0.000819451184602405 ···

−0.000162605765335955 −0.00018594935307068 −0.000326944096945427 0.00125 −0.000762997332323675 ···

−0.000396422204114424 −0.000458502562871592 −0.000819451184602405 −0.000762997332323675 0.005 ···

−0.0002625284166187 −0.000318111572725898 −0.00060532385276285 −0.000608691217742683 −0.00177606225648547 ···

−1.30268774106986e−06 −6.44923418774174e−05 −0.000275369953484654 −0.000452568996672073 −0.00192208514323818 ···

0.00055638919876439 0.000525619226802249 0.0006397035186675 0.000228426867755593 −0.000754067056197251 ···

0.00130709043198385 0.00133828286919678 0.00195079375824933 0.00127603026024317 0.00137539801300953 ···

0.00375954407416456 0.00404574061392281 0.00646160268527631 0.00507413806293484 0.00993112604965441 ···

−0.0002625284166187 −1.30268774106986e−06 0.00055638919876439 0.00130709043198385 0.00375954407416456

−0.000318111572725898 −6.44923418774174e−05 0.000525619226802249 0.00133828286919678 0.00404574061392281

−0.00060532385276285 −0.000275369953484654 0.0006397035186675 0.00195079375824933 0.00646160268527631

−0.000608691217742684 −0.000452568996672073 0.000228426867755593 0.00127603026024317 0.00507413806293484

−0.00177606225648547 −0.00192208514323818 −0.000754067056197251 0.00137539801300953 0.00993112604965441

0.0125 −0.00356447023778097 −0.0041685755760135 −0.00394653521323417 −0.000250424830498049

−0.00356447023778097 0.0166666666666667 −0.0159730654839819 −0.0211377773455841 −0.0296837502900246

−0.0041685755760135 −0.0159730654839819 0.05 −0.0415061974365333 −0.069564033429064

−0.00394653521323417 −0.0211377773455841 −0.0415061974365333 0.555555555555556 −0.114601731317248

−0.000250424830498049 −0.0296837502900246 −0.069564033429064 −0.114601731317248 1







,(A.2)

$Q_{yy}$ = (10 × 10; columns 1 to 5 are listed first, followed by columns 6 to 10)







1−0.0799735814606215 −0.06518386303754 −0.0514204579136473 −0.0396422204114424 ···

−0.0799735814606215 0.555555555555556 −0.0551643771471813 −0.0438286828378432 −0.0341747632812917 ···

−0.06518386303754 −0.0551643771471813 0.25 −0.0365534612806129 −0.0289719744741855 ···

−0.0514204579136473 −0.0438286828378432 −0.0365534612806129 0.125 −0.0241280941877523 ···

−0.0396422204114424 −0.0341747632812917 −0.0289719744741855 −0.0241280941877523 0.05 ···

−0.0166037549406539 −0.0149959233498933 −0.0135354528317981 −0.0121738243548537 −0.0112328039935045 ···

−3.8138791846177e−05 −0.00140733827808684 −0.00285034981466199 −0.00418997473652902 −0.00562728909449856 ···

0.00940469397241429 0.00662217980206098 0.00382295973501637 0.0012209929672505 −0.00127460596179143 ···

0.005545515048479 0.00423202202024702 0.002926190637374 0.00171197424195029 0.000583531957097723 ···

0.00531679821802292 0.00426458505408133 0.00323080134263815 0.00226922352718828 0.0014044733149058 ···


−0.0166037549406539 −3.8138791846177e−05 0.00940469397241429 0.005545515048479 0.00531679821802292

−0.0149959233498933 −0.00140733827808684 0.00662217980206098 0.00423202202024702 0.00426458505408133

−0.0135354528317981 −0.00285034981466199 0.00382295973501637 0.002926190637374 0.00323080134263815

−0.0121738243548537 −0.00418997473652902 0.0012209929672505 0.00171197424195029 0.00226922352718828

−0.0112328039935045 −0.00562728909449856 −0.00127460596179143 0.000583531957097723 0.0014044733149058

0.05 −0.00660011638235733 −0.00445639474180467 −0.00105896652148659 −2.23986777699e−05

−0.00660011638235733 0.0142857142857143 −0.00790461742025186 −0.00262556138650223 −0.00122902402483588

−0.00445639474180467 −0.00790461742025186 0.0142857142857143 −0.00297656367844959 −0.00166289845870223

−0.00105896652148659 −0.00262556138650223 −0.00297656367844959 0.01 −0.000687610387903489

−2.23986777699e−05 −0.00122902402483588 −0.00166289845870223 −0.000687610387903489 0.002







,(A.3)

and

$Q_{xy}$ = (10 × 10; columns 1 to 5 are listed first, followed by columns 6 to 10)







0.0316227766016838 −0.0025289867005658 −0.00206129473887088 −0.00162605765335955 −0.00125359708006575 ···

−0.00339299170599481 0.0235702260395516 −0.00234042630964225 −0.0018594935307068 −0.00144991241169878 ···

−0.00583022195151901 −0.00493405188950133 0.0223606797749979 −0.00326944096945427 −0.0025913321746667 ···

−0.00514204579136473 −0.00438286828378432 −0.00365534612806129 0.0125 −0.00241280941877523 ···

−0.0125359708006575 −0.0108070090465971 −0.00916174276506853 −0.00762997332323675 0.0158113883008419 ···

−0.00830187747032693 −0.00749796167494667 −0.00676772641589903 −0.00608691217742684 −0.00561640199675225 ···

−4.11946034176041e−05 −0.00152009907587077 −0.00307872967476321 −0.00452568996672073 −0.00607816690940393 ···

0.0175945713361161 0.0123889639864633 0.00715210276593168 0.00228426867755593 −0.0023845694060815 ···

0.0413338287288236 0.0315436297318278 0.021810537267639 0.0127603026024317 0.00434939041038 ···

0.11888722238149 0.0953590207675548 0.072242914239365 0.0507413806293484 0.0314049780471384 ···

−0.0005250568332374 −1.20605449440977e−06 0.000297402536496859 0.000175364583519327 0.000168131922284769

−0.000636223145451796 −5.97083063915161e−05 0.000280955294656434 0.000179549488118848 0.000180931020641263

−0.0012106477055257 −0.000254943037809526 0.000341935913709648 0.000261726447211668 0.00028897165695746

−0.00121738243548537 −0.000418997473652902 0.00012209929672505 0.000171197424195029 0.000226922352718828

−0.00355212451297094 −0.0017795050590842 −0.000403065795849046 0.000184529007192446 0.000444133458802924

0.025 −0.00330005819117866 −0.00222819737090234 0.000529483260743293 −1.119933888495e−05

−0.00712894047556193 0.0154303349962092 −0.00853796263679499 −0.00283593042227886 −0.00132749766951248

−0.008337151152027 −0.0147881850800537 0.0267261241912424 −0.00556864073733696 −0.0031109981507291

−0.00789307042646833 −0.0195697791310586 −0.0221859957479004 0.074535599249993 −0.00512514523129067

−0.000500849660996098 −0.0274818126551341 −0.0371835399333781 −0.015375435693872 0.0447213595499958







,(A.4)

with $Q_{xy} = Q_{yx}^T$.

A.2 Singular cofactor matrix for the 2D similarity transformation

In the last investigated weighting case for the 2D similarity transformation a stochastic model with a singular cofactor matrix is postulated. This matrix was first introduced in (Snow 2012, pp. 96-97) and (Neitzel and Schaffrin 2017). However, it is reordered here so that it fits the formulation of the stochastic model in this work (see section 5.4.4). Therefore, two cofactor matrices are given. The first matrix is related to the coordinates in the source system

$$Q_{LL_1} = \begin{bmatrix} Q_{xx} & Q_{xy} \\ Q_{yx} & Q_{yy} \end{bmatrix}, \tag{A.5}$$

with the relevant sub-matrices:


$$Q_{xx} = \begin{bmatrix}
36.370281457026799 & -10.717856095227498 & -8.652980417908310 & -3.008722473688215 & -13.990722470202799 \\
-10.717856095227498 & 31.714850184591498 & -9.754412805230141 & -7.170133519153175 & -4.072447764980719 \\
-8.652980417908310 & -9.754412805230141 & 29.490003562291800 & -9.264714951570749 & -1.817895387582575 \\
-3.008722473688215 & -7.170133519153175 & -9.264714951570749 & 26.031496206858098 & -6.587925262445999 \\
-13.990722470202799 & -4.072447764980719 & -1.817895387582575 & -6.587925262445999 & 26.468990885212101
\end{bmatrix} \cdot 10^{-6}\ [\mathrm{m}^2], \tag{A.6}$$

$$Q_{yy} = \begin{bmatrix}
29.082186568548597 & -12.471413471703498 & -6.363232761238930 & -2.835610062891940 & -7.411930272714205 \\
-12.471413471703498 & 30.958038019578300 & -17.720699070083000 & -0.556828023714815 & -0.209097454076965 \\
-6.363232761238930 & -17.720699070083000 & 38.734832754164593 & -10.531447467909800 & -4.119453454932779 \\
-2.835610062891940 & -0.556828023714815 & -10.531447467909800 & 32.063142073587599 & -18.139256519071097 \\
-7.411930272714205 & -0.209097454076965 & -4.119453454932779 & -18.139256519071097 & 29.879737700795101
\end{bmatrix} \cdot 10^{-6}\ [\mathrm{m}^2], \tag{A.7}$$

$$Q_{xy} = Q_{yx}^T = \begin{bmatrix}
-5.470847531049374 & -4.151968395749720 & -6.309575380973525 & 5.061541661693445 & 10.870849646079201 \\
5.848221379655184 & 3.083310192383325 & -3.437514270957704 & -5.891999249139355 & 0.397981948058545 \\
4.362573210487724 & 3.061227368299260 & 6.203628967652264 & -10.232648328119298 & -3.394781218319955 \\
-0.456888271105760 & 5.116298153814835 & 2.979330360651054 & -1.237264484444125 & -6.401475758915990 \\
-4.283058787987756 & -7.108867318747670 & 0.564130323627930 & 12.300370400009299 & -1.472574616901790
\end{bmatrix} \cdot 10^{-6}\ [\mathrm{m}^2]. \tag{A.8}$$

The second cofactor matrix concerns the coordinates in the target system

$$Q_{LL_2} = \begin{bmatrix} Q_{XX} & Q_{XY} \\ Q_{YX} & Q_{YY} \end{bmatrix}, \tag{A.9}$$

with

$$Q_{XX} = \begin{bmatrix}
6.794487571800590 & -2.067492652515560 & -1.752371987000400 & -0.451547525907080 & -2.523075406377545 \\
-2.067492652515560 & 6.429318039548140 & -1.970512300406600 & -1.403742194865920 & -0.987570891760060 \\
-1.752371987000400 & -1.970512300406600 & 6.229399700497529 & -2.051290948290685 & -0.455224464799860 \\
-0.451547525907080 & -1.403742194865920 & -2.051290948290685 & 5.079972517721780 & -1.173391848658100 \\
-2.523075406377545 & -0.987570891760060 & -0.455224464799860 & -1.173391848658100 & 5.139262611595570
\end{bmatrix} \cdot 10^{-6}\ [\mathrm{m}^2], \tag{A.10}$$

\[
Q_{YY} =
\begin{bmatrix}
6.094989272208480 & -2.499242822649745 & -1.204799645049460 & -0.699300070526570 & -1.691646733982720 \\
-2.499242822649745 & 5.912760395723919 & -3.440142089389230 & -0.117912645153740 & 0.144537161468785 \\
-1.204799645049460 & -3.440142089389230 & 7.206462860853270 & -1.847407593285470 & -0.714113533129125 \\
-0.699300070526570 & -0.117912645153740 & -1.847407593285470 & 6.360602565647050 & -3.695982256681265 \\
-1.691646733982720 & 0.144537161468785 & -0.714113533129125 & -3.695982256681265 & 5.957205362324320
\end{bmatrix}
\cdot 10^{-6}\ [\mathrm{m}^2], \tag{A.11}
\]

\[
Q_{XY} = Q_{YX}^{T} =
\begin{bmatrix}
-1.246373797420340 & -0.879050061336250 & -1.163579541268595 & 0.979701730455350 & 2.309301669569825 \\
1.090128341653400 & 0.554510615436640 & -0.917558590359990 & -0.955341882018735 & 0.228261515288690 \\
0.938081379301460 & 0.362129743332930 & 1.443374807628205 & -2.018715860691035 & -0.724870069571550 \\
-0.106812374044770 & 1.212589988419975 & 0.582948103761090 & -0.048153307431655 & -1.640572410704635 \\
-0.675023549489755 & -1.250180285853295 & 0.054815220239305 & 2.042509319686065 & -0.172120704582330
\end{bmatrix}
\cdot 10^{-6}\ [\mathrm{m}^2]. \tag{A.12}
\]
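
The block structure in (A.9) can be illustrated with a short sketch. It is not part of the original computations: the function names `assemble_block_cofactor` and `is_symmetric` are hypothetical, and the 2x2 blocks are toy placeholders rather than the 5x5 matrices listed above. It assumes square blocks stored as nested lists and uses the relation Q_YX = Q_XY^T stated with (A.12).

```python
def assemble_block_cofactor(q_xx, q_yy, q_xy):
    """Build [[Q_XX, Q_XY], [Q_YX, Q_YY]] with Q_YX = Q_XY^T (square blocks)."""
    n = len(q_xx)
    # Q_YX is obtained as the transpose of Q_XY.
    q_yx = [[q_xy[j][i] for j in range(n)] for i in range(n)]
    top = [q_xx[i] + q_xy[i] for i in range(n)]        # rows [Q_XX | Q_XY]
    bottom = [q_yx[i] + q_yy[i] for i in range(n)]     # rows [Q_YX | Q_YY]
    return top + bottom


def is_symmetric(q, tol=1e-18):
    """Check q[i][j] == q[j][i] up to an absolute tolerance."""
    size = len(q)
    return all(abs(q[i][j] - q[j][i]) <= tol
               for i in range(size) for j in range(i + 1, size))


# Toy 2x2 blocks in m^2 (illustrative placeholders, NOT the thesis values).
scale = 1e-6
q_xx = [[6.0 * scale, -2.0 * scale], [-2.0 * scale, 6.0 * scale]]
q_yy = [[5.0 * scale, -1.0 * scale], [-1.0 * scale, 5.0 * scale]]
q_xy = [[-1.0 * scale, 0.5 * scale], [1.0 * scale, 0.25 * scale]]

q_ll = assemble_block_cofactor(q_xx, q_yy, q_xy)
print(len(q_ll), is_symmetric(q_ll))
```

With the full 5x5 blocks from (A.10) to (A.12) the same routine would return a 10x10 matrix. The symmetry check is a cheap guard against transcription errors when the values are typed in by hand.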


Bibliography

Abatzoglou T., Mendel J. and Harada G., 1991. The constrained Total Least-Squares technique and its

application to harmonic superresolution. Trans. Signal Process., 39, 1070–1087.

Adcock R., 1878. A problem in least squares. The Analyst, 5, 53–54.

Alkhatib H., 2007. On Monte Carlo methods with applications to the current satellite gravity missions. PhD dissertation, Landwirtschaftliche Fakultät, Universität Bonn.

Alkhatib H. and Schuh W., 2007. Integration of the Monte Carlo covariance estimation strategy into tailored

solution procedures for large-scale least squares problems. Journal of Geodesy, 81, 53–66.

Amiri-Simkooei A. and Jazaeri S., 2012. Weighted total least squares formulated by standard least squares

theory. Journal of Geodetic Science, 2, 113–124.

Barrodale I. and Roberts F., 1974. Algorithm 478: Solution of an overdetermined system of equations in the L1 norm. Communications of the ACM, 17, 319–320.

Bickel P.J. and Ritov Y., 1987. Efficient estimation in the errors in variables model. Ann. Stat., 15, 2,

513–540.

Bjerhammar A., 1973. Theory of errors and generalized matrix inverses. Elsevier, Amsterdam, London,

New York.

Björck A., 2015. Numerical Methods in Matrix Computations, Texts in Applied Mathematics 59. Springer International Publishing.

Bronshtein I., Semendyayev K., Musiol G. and Muehlig H., 2005. Handbook of Mathematics. Springer,

Berlin Heidelberg New York, fifth edition.

Cross P., 1994. Advanced least squares applied to position fixing. Working paper no. 6. North East London

Polytechnic, Department of Land Surveying.

Dantzig G., 1949. Programming of Independent Activities II. Mathematical Model, Econometrica, 17,

200–211.


Dantzig G., 1963. Linear programming and extensions. Princeton university press.

Dantzig G. and Thapa M., 2006. Linear programming 2: Theory and extensions. Springer Science & Business

Media.

Dekking F., Kraaikamp C., Lopuhaä H. and Meester L., 2005. A Modern Introduction to Probability and Statistics: Understanding why and how. Springer Science & Business Media.

Deming W., 1931. The application of least squares. The London, Edinburgh, and Dublin Philosophical

Magazine and Journal of Science, 11(68), 146–158.

Deming W., 1934. On the application of least squares.-II. The London, Edinburgh, and Dublin Philosophical

Magazine and Journal of Science, 7(114), 804–829.

Deming W., 1964. Statistical adjustment of data. Dover publications, Inc., New York.

Drixler E., 1993. Analyse der Form und Lage von Objekten im Raum, volume C(409). Deutsche Geodätische Kommission bei der Bayerischen Akademie der Wissenschaften.

Everitt B.S. and Skrondal A., 2010. The Cambridge dictionary of statistics. Cambridge university press,

fourth edition.

Fang X., 2011. Weighted Total Least Squares Solutions for Applications in Geodesy. PhD dissertation, Dept.

of Geodesy and Geoinformatics, Leibniz University Hannover, Germany.

Felus Y. and Burtch R., 2009. On symmetrical three-dimensional datum conversion. GPS solutions, 13, 1,

65–74.

Felus Y. and Schaffrin B., 2005. Performing similarity transformations using the errors-in-variables-model.

Proceedings of the ASPRS Meeting, Washington DC, 35, 751–762.

Fuchs H., 1980. Untersuchungen zur Ausgleichung durch Minimieren der Absolutsumme der Verbesserungen.

PhD dissertation, TU Graz.

Fuchs H., 1982. Contributions to the Adjustment by Minimizing the Sum of Absolute Residuals. Manuscripta

Geodaetica, 7, 3, 151–207.

Gauss C., 1809. Theoria motus corporum coelestium in sectionibus conicis solem ambientium. F. Perthes and I. H. Besser, Hamburg.

Gauss C., 1823. Theoria combinationis observationum erroribus minimis obnoxiae. Henricus Dieterich,

Gottingae.

Ghilani C.D., 2010. Adjustment computations: spatial data analysis. John Wiley and Sons, Hoboken, NJ,

USA, fifth edition.


Golub G. and Van Loan C., 1980. An analysis of the total least squares problem. SIAM Journal on Numerical

Analysis, 17, 6, 883–893.

Golub G. and Van Loan C., 1989. Matrix Computation. The Johns Hopkins University Press, Baltimore,

Maryland, second edition.

Golub G. and Van Loan C., 1996. Matrix Computation. The Johns Hopkins University Press, Baltimore,

Maryland, third edition.

Gonin R., 1989. Nonlinear Lp-Norm Estimation. CRC Press.

Groen P., 1996. An introduction to total least squares. Nieuw Archief voor Wiskunde, 14, 237–253.

Hampel F., 1980. Robuste Schätzungen: Ein anwendungsorientierter Überblick. Biometrica, 22, 3–21.

Helmert F., 1872. Die Ausgleichungsrechnung nach der Methode der kleinsten Quadrate. B.G. Teubner, Leipzig, Berlin.

Helmert F., 1924. Die Ausgleichungsrechnung nach der Methode der kleinsten Quadrate (Mit Anwendungen auf die Geodäsie, die Physik und die Theorie der Messinstrumente). B.G. Teubner, Leipzig, Berlin, third edition.

Huber P., 1964. Robust estimation of a location parameter. Annals of mathematical statistics, 35, 73–101.

Jäger R., Müller T., Saler H. and Schwäble R., 2005. Klassische und robuste Ausgleichungsverfahren. Ein Leitfaden für Ausbildung und Praxis von Geodäten und Geoinformatikern. Herbert Wichmann Verlag, Heidelberg.

Jovičić D., Lapaine M. and Petrović S., 1982. Prilagodjavanje pravca skupu točaka prostora (Fitting a straight line to a set of points in space, in Croatian). Geodetski list, 36(59), 260–266.

Julier S. and Uhlmann J., 1996. A general method for approximating nonlinear transformations of probability

distributions. Technical Report, RRG, Department of Engineering Science, University of Oxford.

Julier S. and Uhlmann J., 2000. A new method for the nonlinear transformation of means and covariances

in filters and estimator. IEEE T Automat Contr, 45, 477–478.

Julier S., Uhlmann J. and Whyte H.D., 1995. A new approach for filtering nonlinear systems. Proceedings

of the 1995 American Control Conference, IEEE, New York, 3, 1628–1632.

Kampmann G. and Renner B., 2004. Vergleich verschiedener Methoden zur Bestimmung ausgleichender Ebenen und Geraden. Allgemeine Vermessungs-Nachrichten, 2, 56–67.

Koch K.R. and Pope A.J., 1969. Least Squares Adjustment with Zero Variances. zfv - Zeitschrift für Geodäsie, Geoinformation und Landmanagement, 10, 390–393.


Krakiwsky E.J., 1975. A synthesis of recent advances in the method of least squares. Department of Geodesy

and Geomatics Engineering, Lecture notes 42, University of New Brunswick.

Krarup T., Juhl J. and Kubik K., 1980. Götterdämmerung over Least Squares Adjustment. 14th congress of the international society of photogrammetry, Hamburg, B3, 369–378.

Kupferer S., 2004. Verschiedene Ansätze zur Schätzung einer ausgleichenden Raumgeraden. Allgemeine Vermessungs-Nachrichten, 5, 162–170.

Lawson C. and Hanson R., 1974. Solving Least Squares Problems. SIAM, Philadelphia.

Lenzmann L. and Lenzmann E., 2004. Strenge Auswertung des nichtlinearen Gauss-Helmert-Modells. AVN, 111, 68–73.

Linkwitz K., 1960. Über die Systematik verschiedener Formen der Ausgleichungsrechnung. zfv - Zeitschrift für Vermessungswesen, 5, 6, 7–10.

Linkwitz K., 1976. Über einige Ausgleichungsprobleme und ihre Lösung mit Hilfe Matrizen-Eigenwerten.

Linnik Y., 1961. Method of Least Squares and Principles of the Theory of Observations (Translated from the Russian by Regina C. Elandt, Ph.D.). Pergamon Press, Oxford, London, New York, Paris.

Lösler M., Bähr H. and Ulrich T., 2016. Verfahren zur Transformation von Parametern und Unsicherheiten bei nichtlinearen Zusammenhängen. Photogrammetrie Laserscanning Optische 3D-Messtechnik. Beiträge der Oldenburger 3D-Tage 2016. Wichmann, 274–285.

Madsen K., Nielsen H. and Tingleff O., 2004. Methods for non-linear least squares problems. Informatics and Mathematical Modelling, Technical University of Denmark.

Mahboub V., 2012. On structured weighted total least squares for geodetic transformations. Journal of

Geodesy, 86(5), 359–367.

Malissiovas G., Neitzel F. and Petrovic S., 2016. Götterdämmerung over total least squares. Journal of Geodetic Science, 6(1), 43–60.

Markovsky I. and Van Huffel S., 2007. Overview of total least-squares methods. Signal Processing, 87,

2283–2302.

Marx C., 2013. On resistant Lp-norm estimation by means of iteratively reweighted least squares. Journal

of Applied Geodesy, 7, 43–60.

Marx C., 2015. Outlier detection by means of Monte Carlo estimation including resistant scale estimation. Journal of Applied Geodesy, 9(2), 123–142.


Marx C., 2017. A weighted adjustment of a similarity transformation between two point sets containing

errors. Journal of Geodetic Science, 7, 105–112.

Meissl P., 1982. Least squares adjustment: a modern approach. Mitteilungen der geodätischen Institute der Technischen Universität Graz, Folge 43. Hochschülerschaft an der Technischen Universität Graz, Ges.m.b.H.

Merriman M., 1877. Elements of the method of Least squares adjustment. Cambridge: Printed by C.J. Clay, M.A. at the university press.

Mihajlovic D. and Cvijetinovic Z., 2016. Weighted coordinate transformation formulated by standard least-squares theory. Survey Review, 0, 1–18.

Mikhail E., Bethel J. and McGlone C., 2001. Introduction to Modern Photogrammetry. Wiley: New York

Chichester.

Mikhail E.M. and Ackermann F., 1976. Observations and Least Squares. Thomas Y. Crowell Company, Inc.

Montgomery D.C. and Runger G.C., 2010. Applied statistics and probability for engineers. John Wiley &

Sons.

Neitzel F., 2004. Identifizierung konsistenter Datengruppen am Beispiel der Kongruenzuntersuchung geodätischer Netze, volume C(565). Deutsche Geodätische Kommission bei der Bayerischen Akademie der Wissenschaften.

Neitzel F., 2010. Generalisation of total least squares on example of unweighted and weighted similarity

transformation. Journal of Geodesy, 84(12), 751–762.

Neitzel F. and Marx C., 2007. Deformationsanalyse und regionale Anpassung eines historischen Geodatenbestandes. Entwicklerforum Geoinformationstechnik 2007, 243–255.

Neitzel F. and Petrovic S., 2008. Total Least Squares (TLS) im Kontext der Ausgleichung nach kleinsten Quadraten am Beispiel der ausgleichenden Geraden. zfv - Zeitschrift für Geodäsie, Geoinformation und Landmanagement, 133, 141–148.

Neitzel F. and Schaffrin B., 2016. On the Gauss-Helmert model with a singular dispersion matrix where BQ

is of smaller rank than B. Journal of Computational and Applied Mathematics, 291, 458–467.

Neitzel F. and Schaffrin B., 2017. Adjusting a 2D Helmert transformation within a Gauss-Helmert model

with a singular dispersion matrix where BQ is of smaller rank than B. Acta Geodaetica et Geophysica,

Montanistica Hungarica, 52, 479–496.

Neri F., Saitta G. and Chiofalo S., 1989. An accurate and straightforward approach to line regression

analysis of error-affected experimental data. Journal of physics E: scientific instruments, 22, 4, 215–217.


Niemeier W., 2008. Ausgleichungsrechnung. Walter de Gruyter, New York, second edition.

Pasioti A., 2015. Investigation of non-linear least squares problems using the example of circle fitting (Master’s Thesis). Technische Universität Berlin, Institute of Geodesy and Geoinformation Science.

Pearson K., 1901. On Lines and Planes of Closest Fit to Systems of Points in Space. The London, Edinburgh,

and Dublin Philosophical Magazine and Journal of Science, 2, 11, 559–572.

Perović G., 2005. Least Squares (MONOGRAPH). Faculty of Civil Engineering, University of Belgrade, first edition.

Petrovic S., 2003. Parameterschätzung für unvollständige funktionale Modelle in der Geodäsie, volume C(563). Habilitation, Deutsche Geodätische Kommission bei der Bayerischen Akademie der Wissenschaften.

Petrović S., Lapaine M., Jovičić D. and Žarinac Frančula B., 1983. Prilagodjavanje pravca (Fitting of a straight line, in Croatian). In: Proceedings of the 5th international symposium “Computer at the university”, Cavtat, 529–535.

Pope A., 1972. Some pitfalls to be avoided in the iterative adjustment of nonlinear problems. In: Proceedings

of the 38th Annual Meeting of the American Society of Photogrammetry, Washington, DC, 449–477.

Pope A.J., 1974. Two approaches to Nonlinear Least Squares Adjustments. The Canadian Surveyor, 28, 5,

663–669.

Reinking J., 2008. Total Least Squares? zfv - Zeitschrift für Geodäsie, Geoinformation und Landmanagement, 6, 384–389.

Schaffrin B., 2006. A note on Constrained Total Least-Squares estimation. Linear Algebra and its Applications, 417, 1, 245–258.

Schaffrin B., 2007. Connecting the Dots: The Straight-Line Case Revisited. zfv - Zeitschrift für Geodäsie, Geoinformation und Landmanagement, 132, 385–394.

Schaffrin B., Lee I., Felus Y. and Choi Y., 2006. Total least-squares (TLS) for geodetic straight-line and plane adjustment. Bollettino di geodesia e scienze affini, 65, 3, 141–168.

Schaffrin B., Neitzel F., Uzun S. and Mahboub V., 2012. Modifying Cadzow’s algorithm to generate the optimal TLS-solution for the structured EIV-model of a similarity transformation. Journal of Geodetic Science, 2, 2, 98–106.

Schaffrin B. and Snow K., 2014. The case of the Homogeneous Errors-In-Variables Model. Journal of

Geodetic Science, 4, 1, 166–173.


Schaffrin B. and Wieser A., 2008. On weighted total least-squares adjustment for linear regression. Journal

of Geodesy, 82, 7, 415–421.

Schlossmacher E., 1973. An iterative technique for absolute deviations curve fitting. Journal of the American

Statistical Association, 68, 344, 857–859.

Shen Y., Li B. and Chen Y., 2011. An iterative solution of weighted total least-squares adjustment. Journal

of Geodesy, 85, 4, 229–238.

Snow K., 2012. Topics in Total Least-Squares within the Errors-In-Variables Model: Singular Cofactor

Matrices and Prior Information. PhD Dissertation, the Ohio State University.

Snow K. and Schaffrin B., 2016. Line fitting in Euclidean 3D-space. Studia Geophysica et Geodaetica, 60, 2, 210–227.

Somogyi J. and Závoti J., 1993. Robust estimation with iteratively reweighted least-squares method. Acta Geodaetica et Geophysica, Montanistica Hungarica, 28, 413–420.

Späth H., 2004. Zur numerischen Berechnung der Trägheitsgeraden und der Trägheitsebene. Allgemeine Vermessungs-Nachrichten, 7, 273–275.

Stigler S., 1981. Gauss and the invention of least squares. The Annals of Statistics, 9, 3, 465–474.

Taylor J., 1982. An Introduction to Error Analysis. University science books, Sausalito, California, second

edition.

Teunissen P., 1985. The geometry of geodetic inverse linear mapping and nonlinear adjustment. Netherlands

Geodetic Commission, Publications on Geodesy, New Series, Vol. 8, No. , Delft.

Teunissen P., 1990. Nonlinear least squares. Manuscripta Geodaetica, 15, 137–150.

Teunissen P. and Knickmeyer E., 1988. Nonlinearity and least squares. CISM Journal ACSGC, 42, 4,

321–330.

Van Huffel S., 2004. Total Least Squares and Errors-in-Variables Modeling: Bridging the Gap between

Statistics, Computational Mathematics and Engineering. COMPSTAT Proceedings in Computational

Statistics, 17, 539–555.

Van Huffel S. and Vandewalle J., 1989. Algebraic connections between the least squares and total least

squares problems. Numerische Mathematik, 55, 4, 431–449.

Van Huffel S. and Vandewalle J., 1991. The Total Least Squares Problem, computational aspects and analysis.

SIAM, Philadelphia.


Wells D. and Krakiwsky E.J., 1971. The method of least squares. Department of Geodesy and Geomatics

Engineering, Lecture notes 18, University of New Brunswick.

Williamson J.H., 1968. Least-squares fitting of a straight line. Canadian Journal of Physics, 46, 16, 1845–1847.

Wolf H., 1978. Das geodätische Gauß-Helmert-Modell und seine Eigenschaften. zfv - Zeitschrift für Vermessungswesen, 41–43.

Wolf H., 1979. Singuläre Kovarianzen im Gauss-Helmert-Modell. zfv - Zeitschrift für Vermessungswesen, 10, 390–393.

Wujanz D., 2016. Terrestrial laser scanning for geodetic deformation monitoring, volume C(775). Deutsche Geodätische Kommission bei der Bayerischen Akademie der Wissenschaften.

Wujanz D., Krueger D. and Neitzel F., 2016. Identification of stable areas in unreferenced laser scans for

deformation measurement. The Photogrammetric Record, 31, 155, 261–280.

York D., 1966. Least-squares fitting of a straight line. Canadian Journal of Physics, 44, 5, 1079–1086.

York D., 1968. Least-squares fitting of a straight line with correlated errors. Earth and planetary science

letters, 5, 320–324.