Sunday, September 22, 2019

Bound-constrained Optimization by Variable Mapping

This is a note written many years ago, when I was working on a curve-fitting problem.  Parameter fitting with bound constraints was not supported in the numerical libraries available at the time.  Many people approached the bound-constrained problem by adding a penalty term, which distorts the original target function.  Therefore, the ideas below are still worth a post, in my opinion.

Introduction


Consider the optimization problem of finding a minimum of a function $f(x), x \in R^n$, subject to bounds $a \le x \le b$, i.e.:

$$\begin{equation} x^* = \arg\min_x f(x), \quad x \in [a,b]. \end{equation}$$

Although solutions have been described previously [1-2], here we propose an alternative, conceptually simpler approach: optimize an equivalent function $f(y), y \in R^n$, where $y$ is unconstrained ($-\infty \lt y \lt \infty$).  By mapping the constrained variable $x$ into the unconstrained variable $y$ via $x = \mathcal{M}(y)$, the solution $y^* = \arg\min_y{f(y)}$ can be found using any unconstrained optimization solver; $y^*$ is then mapped back to $x^* = \mathcal{M}(y^*)$.

Variable Mapping


The case where $a$ is $-\infty$ and $b$ is $\infty$ is trivial, i.e., the problem is already unconstrained.  Therefore, we only consider mapping functions for the following three non-trivial cases:

Case A: given constraint $a \le x \le b$, define $y$ so that
$$\begin{equation} x =  \frac{e^y-e^{-y}}{e^y+e^{-y} }\frac{b-a}{2}+\frac{b+a}{2}. \end{equation} $$
As $x$ increases from $a$ to $b$, $y$ increases from $-\infty$ to $\infty$ monotonically.


Case B: given constraint $-\infty \lt x \le b$, define $y$ so that
$$\begin{equation} x = b-e^{-y}. \end{equation}$$
As $x$ increases from $-\infty$ to $b$, $y$ increases from $-\infty$ to $\infty$ monotonically.


Case C: given constraint $a \le x \lt \infty$, define $y$ so that
$$\begin{equation} x = a+e^{y}. \end{equation}$$
As $x$ increases from $a$ to $\infty$, $y$ increases from $-\infty$ to $\infty$ monotonically.

With the above mapping functions, the minimization of $f(x)$ with bound constraints is equivalent to the minimization of $f(y)$ without any constraint (the bounds themselves are attained only in the limit $y \to \pm\infty$).
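In code, the three mappings are one-liners.  Here is a minimal sketch in Python (the function names are arbitrary); note that the first factor in Equation 2 is simply $\tanh y$:

```python
import numpy as np

def map_a(y, a, b):
    # Case A (Equation 2): y in (-inf, inf) -> x in (a, b);
    # (e^y - e^-y)/(e^y + e^-y) is tanh(y)
    return np.tanh(y) * (b - a) / 2 + (b + a) / 2

def map_b(y, b):
    # Case B (Equation 3): y in (-inf, inf) -> x < b
    return b - np.exp(-y)

def map_c(y, a):
    # Case C (Equation 4): y in (-inf, inf) -> x > a
    return a + np.exp(y)
```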

Example A

Figure 1. $f(x) = -x^2$

Let's find the $x$ that minimizes $f(x) = -x^2$, with the catch that $x$ can only be in $[1,2]$.  We know the answer should be 2.  However, if you directly solve this problem numerically without constraints, the answer will be $x = \infty$, outside the domain box.

Using Equation 2, we can define $y$ as:

$$\begin{equation} x = \frac{1}{2}\frac{e^y-e^{-y}}{e^y+e^{-y}} + \frac{3}{2}. \end{equation}$$

We then find the $y^*$ that minimizes $f(y)$:

$$\begin{equation} y^* = \arg\min_y - \left( \frac{1}{2}\frac{e^y-e^{-y}}{e^y+e^{-y}} + \frac{3}{2} \right)^2. \end{equation}$$

A numerical solver applied to the above problem returns a very large value of $y^*$; in the limit, $y^* = \infty$.

Therefore, the corresponding solution according to Equation 5 is $x^* = 2$, satisfying the constraint.
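To make this concrete, here is a minimal sketch of Example A using scipy.optimize.minimize (the starting point $y_0 = 0$ and the default solver are arbitrary choices):

```python
import numpy as np
from scipy.optimize import minimize

def m(y):
    # Equation 5: unconstrained y -> x in (1, 2)
    return np.tanh(y) / 2 + 1.5

# minimize f(M(y)) = -M(y)^2 over unconstrained y
res = minimize(lambda y: -m(y[0]) ** 2, x0=[0.0])
print(m(res.x[0]))  # ~2.0: the solver drives y toward +inf
```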

Example B


This example is what motivated this study.  If it is not understandable to you, do not worry.  A common problem in biomedical research is to characterize the potency of a compound in a biological assay by determining the parameters of a logistic regression formula.  We take $n$ measured data points $(c_i, r_i), i = 1, \ldots, n$, where $r_i$ is the assay activity measured when the compound is dosed at concentration $c_i$.  The compound is characterized by four parameters, which are determined by minimizing the least-squares error between the experimental data points and their corresponding theoretical predictions as follows:

$$\begin{equation}
\min_{x_1,x_2,x_3,x_4} \sum_{i=1}^{n} \left( x_1+\frac{x_2-x_1}{1+{(\frac{c_i}{x_3})}^{x_4}}-r_i \right)^2 . \end{equation}$$

To ensure the parameters $\textbf{x}$ are biologically meaningful, $x_1,x_2,x_3,x_4$ are subject to individual constraints, for instance, $0 \le x_1 \le 0.2$, $0.8 \le x_2 \le 1.2$, $0 \lt x_3$, and $0.3 \le x_4 \le 3.0$.  Parameters $x_1, x_2, x_3$, and $x_4$ are also known as the bottom, top, $IC_{50}$, and Hill slope of the dose-response formula (see [3] for details).  A negative $IC_{50}$ would be biologically meaningless.

To solve the above minimization problem, we optimize a similar unconstrained function of $y$:

$$\begin{equation} \min_{y_1,y_2,y_3,y_4} \sum_{i=1}^{n} \left( \mathcal{M}_1(y_1)+\frac{\mathcal{M}_2(y_2)-\mathcal{M}_1(y_1)}{1+{(\frac{c_i}{\mathcal{M}_3(y_3)})}^{\mathcal{M}_4(y_4)}}-r_i \right)^2,  \\
\textbf{y} \in (-\infty, \infty), \end{equation}$$
with mappings (based on Equations 2 and 4):


$$\begin{equation} \begin{aligned}
x_1 &= \mathcal{M}_1(y_1) = 0.2 \frac{e^{y_1}}{e^{y_1}+e^{-y_1}} \\
x_2 &= \mathcal{M}_2(y_2) = 1+0.2 \frac{e^{y_2}-e^{-y_2}}{e^{y_2}+e^{-y_2}} \\
x_3 &= \mathcal{M}_3(y_3) = e^{y_3} \\
x_4 &= \mathcal{M}_4(y_4) = 1.65+ 1.35 \frac{e^{y_4}-e^{-y_4}}{e^{y_4}+e^{-y_4}}. \end{aligned} \end{equation}$$

We use any numerical solver to find $\textbf{y}^*$ in Equation 8, then obtain $\textbf{x}^*$ using Equation 9.  All constraints on $\textbf{x}$ are automatically satisfied.
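Here is a minimal sketch of this fit using scipy.optimize.least_squares; the data points below are made up purely for illustration:

```python
import numpy as np
from scipy.optimize import least_squares

def map_params(y):
    # Equation 9: unconstrained y -> constrained x
    y1, y2, y3, y4 = y
    x1 = 0.2 * np.exp(y1) / (np.exp(y1) + np.exp(-y1))  # 0 < x1 < 0.2
    x2 = 1.0 + 0.2 * np.tanh(y2)                        # 0.8 < x2 < 1.2
    x3 = np.exp(y3)                                     # x3 > 0
    x4 = 1.65 + 1.35 * np.tanh(y4)                      # 0.3 < x4 < 3.0
    return x1, x2, x3, x4

def residuals(y, c, r):
    # residuals of the dose-response model (Equation 8)
    x1, x2, x3, x4 = map_params(y)
    return x1 + (x2 - x1) / (1.0 + (c / x3) ** x4) - r

# made-up dose-response data (c_i, r_i) for illustration only
c = np.array([1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0])
r = np.array([0.98, 0.95, 0.80, 0.45, 0.15, 0.05])

res = least_squares(residuals, x0=np.zeros(4), args=(c, r))
print(map_params(res.x))  # x* automatically satisfies all bounds
```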

Discussion


The variable mapping approach introduced here appears conceptually simpler than those described previously.  By mapping the constrained variable space $x$ into the unconstrained space $y$, we can utilize nearly any well-studied general optimization algorithm to solve the unconstrained problem.  The mapping functions introduced here are generally well behaved and are not expected to significantly increase the complexity of the original optimization problem.  However, better mapping functions most likely exist.  If the general optimization algorithm of interest requires the derivatives of $f(y)$ with respect to $y$, such derivatives are easily computed by the chain rule.  Taking, for instance, the Case A mapping described in Equation 2:

$$\begin{equation} \begin{aligned}
\frac{\partial{f(\textbf{y})}}{\partial{y_i}} &= \frac{\partial{f(\textbf{x})}}{\partial{x_i}} \frac{d \mathcal{M}_i(y_i)}{d y_i} \\
&= \frac{2(b-a)}{(e^{y_i}+e^{-y_i})^2} \frac{\partial{f(\textbf{x})}}{\partial{x_i}}. \end{aligned} \end{equation}$$
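The chain-rule factor above is easy to verify numerically.  A quick finite-difference check of the Case A mapping derivative (with arbitrary $a$, $b$, and test point):

```python
import numpy as np

a, b = 1.0, 2.0

def m(y):
    # Case A mapping (Equation 2)
    return np.tanh(y) * (b - a) / 2 + (b + a) / 2

def dm_dy(y):
    # analytic derivative from the chain-rule expression above
    return 2 * (b - a) / (np.exp(y) + np.exp(-y)) ** 2

y, h = 0.7, 1e-6
fd = (m(y + h) - m(y - h)) / (2 * h)
print(dm_dy(y), fd)  # the two values agree to ~1e-10
```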

References


  1. Powell MJD. The BOBYQA algorithm for bound constrained optimization without derivatives. DAMTP 2009/NA06.
  2. Kraft D. Algorithm 733: TOMP – Fortran modules for optimal control calculations. ACM Transactions on Mathematical Software (1994) 20:262-281.
  3. https://en.wikipedia.org/wiki/Dose%E2%80%93response_relationship