Unconstrained optimization problems are defined as follows:
\begin{equation} \text{minimize}\quad f(x) \tag{1} \label{eq:1} \end{equation}
where $f: \mathbf{R}^n \rightarrow \mathbf{R}$ is convex and twice continuously differentiable (which implies that $\mathbf{dom}\enspace f$ is open). We assume the problem is solvable, and denote the optimal value $\inf_xf(x)=f(x^\ast)$ as $p^\ast$. Since $f$ is differentiable and convex, a necessary and sufficient condition for a point $x^\ast$ to be optimal is
\begin{equation} \nabla f(x^\ast)=0. \tag{2} \label{eq:2} \end{equation}
Thus, solving the unconstrained minimization problem \eqref{eq:1} is the same as finding a solution of \eqref{eq:2}, which is a set of $n$ equations in the $n$ variables $x_1, \dots, x_n$. Usually, the problem must be solved by an iterative algorithm. By this we mean an algorithm that computes a sequence of points $x^{(0)}, x^{(1)}, \dots \in \mathbf{dom}\enspace f$ with $f(x^{(k)})\rightarrow p^\ast$ as $k\rightarrow\infty$. The algorithm is terminated when $f(x^{(k)}) - p^\ast \le \epsilon$, where $\epsilon>0$ is some specified tolerance.
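As a rough illustration (not from the text above), here is a minimal gradient-descent sketch of such an iterative scheme in Python; the example function, step size, and surrogate stopping test are all illustrative assumptions:

```python
import numpy as np

def gradient_descent(f, grad, x0, step=0.1, eps=1e-8, max_iter=10_000):
    """Generate x^(0), x^(1), ... iteratively.

    In practice p* is unknown, so a surrogate criterion such as
    ||grad f(x)||_2 <= eps replaces the ideal test f(x^(k)) - p* <= eps.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        x = x - step * g
    return x

# Example: f(x) = ||x||_2^2, with minimizer x* = 0 and p* = 0.
x_star = gradient_descent(lambda x: x @ x, lambda x: 2 * x, x0=[1.0, -2.0])
```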
Initial point and sublevel set
The starting point $x^{(0)}$ must lie in $\mathbf{dom}\enspace f$, and in addition the sublevel set
$$ S = \{x \in \mathbf{dom}\enspace f \mid f(x) \le f(x^{(0)})\} $$
must be closed. This condition is satisfied for all $x^{(0)}\in\mathbf{dom}\enspace f$ if the function $f$ is closed, i.e., all its sublevel sets are closed.
Note: 1) Continuous functions with $\mathbf{dom}\enspace f=\mathbf{R}^n$ are closed; 2) Another important class of closed functions consists of continuous functions with open domains, for which $f(x)$ tends to infinity as $x$ approaches $\mathbf{bd}\enspace\mathbf{dom}\enspace f$.
Examples
Quadratic minimization and least-squares
The general convex quadratic minimization problem has the form
$$ \text{minimize}\quad \frac12 x^TPx + q^Tx + r, $$
where $P\in\mathbf{S}_+^n$, $q\in\mathbf{R}^n$, and $r\in\mathbf{R}$. This problem can be solved via the optimality condition $Px+q=0$, which is a set of linear equations.
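A quick numerical sketch of this (with an arbitrary positive definite `P` chosen purely for illustration):

```python
import numpy as np

# Illustrative data; P must be symmetric positive semidefinite for convexity.
P = np.array([[2.0, 0.5], [0.5, 1.0]])
q = np.array([1.0, -1.0])

# The optimality condition Px + q = 0 is just a linear system.
x_opt = np.linalg.solve(P, -q)
```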
One special case of the quadratic minimization problem that arises very frequently is the least-squares problem
$$ \text{minimize}\quad \lVert Ax-b \rVert_2^2 = x^T(A^TA)x - 2(A^Tb)^Tx + b^Tb. $$
The optimality conditions
$$ A^TAx = A^Tb $$
are called the normal equations of the least-squares problem.
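A small sketch with made-up data, solving the normal equations directly and comparing against numpy's least-squares routine:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))   # illustrative data
b = rng.standard_normal(20)

# Solve the normal equations A^T A x = A^T b directly...
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# ...or, more stably, via a least-squares routine (QR/SVD based).
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

assert np.allclose(x_ne, x_ls)
```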
Unconstrained geometric programming
A geometric program in convex form gives the unconstrained problem
$$ \text{minimize}\quad f(x) = \log\left(\sum_{i=1}^m \exp(a_i^Tx + b_i)\right), $$
whose optimality condition $\nabla f(x^\ast)=0$ has no analytical solution in general, so the problem must be solved iteratively.
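A minimal sketch of such an iterative solve, using a generic quasi-Newton method; the data are illustrative, and the rows of `A` are centered so that $0\in\mathbf{conv}\{a_i\}$, keeping $f$ bounded below:

```python
import numpy as np
from scipy.special import logsumexp
from scipy.optimize import minimize

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 3))   # rows are the a_i (illustrative data)
A -= A.mean(axis=0)                # center so 0 is in conv{a_i}: f bounded below
b = rng.standard_normal(10)

# f(x) = log(sum_i exp(a_i^T x + b_i)); logsumexp evaluates it stably.
f = lambda x: logsumexp(A @ x + b)

res = minimize(f, x0=np.zeros(3))  # iterative (BFGS) minimization
```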
Analytic center of linear inequalities
Consider the problem
$$ \text{minimize}\quad f(x) = -\sum_{i=1}^m \log(b_i - a_i^Tx), $$
where the domain of $f$ is the open set
$$ \mathbf{dom}\enspace f = \{x \mid a_i^Tx < b_i,\ i=1,\dots,m\}. $$
The optimal point is called the analytic center of the inequalities $a_i^Tx \le b_i$.
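A sketch of one way to compute the analytic center, via Newton's method with a backtracking rule that keeps iterates strictly feasible (illustrative only; a full implementation would also enforce sufficient decrease, e.g., an Armijo condition):

```python
import numpy as np

def analytic_center(A, b, x, tol=1e-10, max_iter=50):
    """Newton's method for f(x) = -sum_i log(b_i - a_i^T x), started at a
    strictly feasible x (Ax < b; A assumed to have full column rank)."""
    for _ in range(max_iter):
        d = b - A @ x                        # slacks, all positive on dom f
        grad = A.T @ (1.0 / d)
        hess = A.T @ np.diag(1.0 / d**2) @ A
        step = np.linalg.solve(hess, -grad)
        if -grad @ step / 2.0 <= tol:        # half the squared Newton decrement
            return x
        t = 1.0
        while np.any(b - A @ (x + t * step) <= 0):
            t *= 0.5                         # halve until strictly feasible
        x = x + t * step
    return x

# Example: the analytic center of the box -1 <= x_i <= 1 is the origin.
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.ones(4)
xc = analytic_center(A, b, x=np.array([0.5, -0.3]))
```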
Analytic center of a linear matrix inequality
A closely related problem is
$$ \text{minimize}\quad f(x) = \log\det F(x)^{-1}, $$
where $F: \mathbf{R}^n\rightarrow\mathbf{S}^p$ is affine, i.e., $F(x) = F_0 + x_1F_1 + \cdots + x_nF_n$ with $F_i\in\mathbf{S}^p$. Here the domain of $f$ is
$$ \mathbf{dom}\enspace f = \{x \mid F(x) \succ 0\}. $$
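A sketch of evaluating this objective and its gradient, $\partial f/\partial x_i = -\mathbf{tr}(F(x)^{-1}F_i)$ (the function name and setup are illustrative):

```python
import numpy as np

def f_and_grad(x, F0, Fs):
    """Objective f(x) = log det F(x)^{-1} = -log det F(x) and its gradient,
    for F(x) = F0 + sum_i x_i * Fs[i] with all matrices symmetric."""
    F = F0 + sum(xi * Fi for xi, Fi in zip(x, Fs))
    # slogdet is the stable way to evaluate log det; on dom f the sign is +1
    sign, logdet = np.linalg.slogdet(F)
    assert sign > 0, "x is outside dom f (F(x) is not positive definite)"
    Finv = np.linalg.inv(F)
    grad = np.array([-np.trace(Finv @ Fi) for Fi in Fs])
    return -logdet, grad
```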
Strong convexity and implications
We assume the objective function is strongly convex on $S$, which means that there exists an $m>0$ such that
\begin{equation} \nabla^2f(x) \succeq mI \tag{3} \label{eq:3} \end{equation}
for all $x\in S$. For $x, y \in S$ we have
$$ f(y) = f(x) + \nabla f(x)^T(y-x) + \frac12(y-x)^T\nabla^2f(z)(y-x) $$
for some $z$ on the line segment $[x, y]$. By the strong convexity assumption \eqref{eq:3}, the last term on the righthand side is at least $(m/2)\lVert y-x\rVert^2_2$, so we have the inequality
\begin{equation} f(y) \ge f(x) + \nabla f(x)^T(y-x) + \frac{m}2\lVert y-x \rVert_2^2 \tag{4} \label{eq:4} \end{equation}
for all $x$ and $y$ in $S$.
Bound $f(x)-p^\ast$ in terms of $\lVert \nabla f(x) \rVert_2$
Setting the gradient of the righthand side of \eqref{eq:4} with respect to $y$ equal to zero, we find that $\tilde y = x-(1/m)\nabla f(x)$ minimizes the righthand side. Therefore we have
$$ f(y) \ge f(x) + \nabla f(x)^T(\tilde y-x) + \frac m2\lVert \tilde y-x \rVert_2^2 = f(x) - \frac1{2m}\lVert \nabla f(x) \rVert_2^2. $$
Since this holds for any $y\in S$, we have
\begin{equation} p^\ast \ge f(x) - \frac1{2m}\lVert \nabla f(x)\rVert_2^2. \tag{5} \label{eq:5} \end{equation}
This inequality shows that if the gradient is small at a point, then the point is nearly optimal: if $\lVert \nabla f(x) \rVert_2 \le (2m\epsilon)^{1/2}$, then $f(x)-p^\ast\le\epsilon$, which justifies using a small gradient norm as a stopping criterion.
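A sketch of \eqref{eq:5} used as a stopping test, on a quadratic where $m$ (the smallest eigenvalue of $P$) and $p^\ast$ are computable for verification; the data and step size are illustrative:

```python
import numpy as np

P = np.array([[3.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, -2.0])
m, M = np.linalg.eigvalsh(P)[[0, -1]]   # extreme eigenvalues of the Hessian
f = lambda x: 0.5 * x @ P @ x + q @ x
p_star = f(np.linalg.solve(P, -q))      # known optimal value, for checking

eps = 1e-6
x = np.array([1.0, 1.0])
while np.linalg.norm(P @ x + q) ** 2 > 2 * m * eps:  # by (5), exit => gap <= eps
    x = x - (1 / M) * (P @ x + q)                    # gradient step, size 1/M

assert f(x) - p_star <= eps
```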
Bound $\lVert x-x^\ast\rVert_2^2$ in terms of $\lVert \nabla f(x) \rVert_2$
Apply \eqref{eq:4} with $y=x^\ast$ to obtain
$$ p^\ast = f(x^\ast) \ge f(x) + \nabla f(x)^T(x^\ast-x) + \frac m2\lVert x^\ast-x \rVert_2^2 \ge f(x) - \lVert \nabla f(x) \rVert_2 \lVert x^\ast-x \rVert_2 + \frac m2\lVert x^\ast-x \rVert_2^2, $$
where we use the Cauchy-Schwarz inequality in the second inequality. Since $p^\ast \le f(x)$, we must have
$$ -\lVert \nabla f(x) \rVert_2 \lVert x^\ast-x \rVert_2 + \frac m2\lVert x^\ast-x \rVert_2^2 \le 0. $$
Therefore, we have
\begin{equation} \lVert x - x^\ast \rVert_2 \le \frac2m\lVert \nabla f(x) \rVert_2. \tag{6}\label{eq:6} \end{equation}
Uniqueness of the optimal point $x^\ast$
If there were two optimal points $x^\ast_1, x^\ast_2$, then according to \eqref{eq:6},
$$ \lVert x_1^\ast - x_2^\ast \rVert_2 \le \frac2m\lVert \nabla f(x_1^\ast) \rVert_2 = 0. $$
Hence $x_1^\ast = x_2^\ast$, i.e., the optimal point $x^\ast$ is unique.
Upper bound on $\nabla^2f(x)$
Since the sublevel set $S$ is bounded (a consequence of strong convexity) and $\nabla^2 f$ is continuous, there exists a constant $M$ such that
$$ \nabla^2 f(x) \preceq MI $$
for all $x \in S$. This upper bound on the Hessian implies, for any $x, y \in S$,
$$ f(y) \le f(x) + \nabla f(x)^T(y-x) + \frac M2\lVert y-x \rVert_2^2; $$
minimizing each side over $y$ yields
$$ p^\ast \le f(x) - \frac1{2M}\lVert \nabla f(x) \rVert_2^2. $$
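Together with \eqref{eq:5} this sandwiches $p^\ast$ between two gradient-based bounds. An illustrative check on a quadratic, whose $m$ and $M$ are the extreme eigenvalues of $P$:

```python
import numpy as np

P = np.array([[3.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, -2.0])
m, M = np.linalg.eigvalsh(P)[[0, -1]]

f = lambda x: 0.5 * x @ P @ x + q @ x
p_star = f(np.linalg.solve(P, -q))      # exact optimal value

x = np.array([0.5, -1.5])               # arbitrary test point
g2 = np.linalg.norm(P @ x + q) ** 2
assert f(x) - g2 / (2 * m) <= p_star <= f(x) - g2 / (2 * M)
```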
Condition number of sublevel sets
The ratio $\kappa=M/m$ is an upper bound on the condition number of the matrix $\nabla^2 f(x)$, i.e., the ratio of its largest eigenvalue to its smallest eigenvalue.
We define the width of a convex set $C \subseteq \mathbf{R}^n$, in the direction $q$, where $\lVert q \rVert_2 = 1$, as
$$ W(C, q) = \sup_{z\in C} q^Tz - \inf_{z\in C} q^Tz. $$
The minimum width and maximum width of $C$ are given by
$$ W_{\min} = \inf_{\lVert q \rVert_2=1} W(C, q), \qquad W_{\max} = \sup_{\lVert q \rVert_2=1} W(C, q). $$
The condition number of the convex set $C$ is defined as
$$ \mathbf{cond}(C) = \frac{W_{\max}^2}{W_{\min}^2}, $$
i.e., the square of the ratio of its maximum width to its minimum width.
Suppose $f$ satisfies $mI \preceq \nabla^2 f(x) \preceq MI$ for all $x\in S$. The condition number of the $\alpha$-sublevel set $C_\alpha=\{x \mid f(x) \le \alpha\}$, where $p^\ast < \alpha \le f(x^{(0)})$, is bounded by
$$ \mathbf{cond}(C_\alpha) \le \frac Mm. $$
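A quadratic makes this bound concrete (an illustrative case where it holds with equality):

```python
import numpy as np

# For f(x) = (1/2) x^T P x the Hessian is P everywhere, so we may take m, M
# to be the extreme eigenvalues of P. Each sublevel set is an ellipsoid with
# semi-axes proportional to 1/sqrt(lambda_i), and cond(C_alpha) = M/m exactly.
P = np.diag([10.0, 1.0])
eig = np.linalg.eigvalsh(P)
kappa = eig[-1] / eig[0]   # M/m = 10: axis ratio sqrt(10), cond = 10
```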
The strong convexity constants
It must be kept in mind that the constants $m$ and $M$ are known only in rare cases, so they cannot be used in a practical stopping criterion; they are conceptually useful, however, since they establish that an algorithm converges even when the resulting bounds cannot be evaluated.