{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Huber regression"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## Standard regression"
]
},
{
"cell_type": "markdown",
"source": [
"In this example we do Huber regression in CVXPY.\n",
"We start by discussing standard regression.\n",
"In a regression problem we are given data $(x_i,y_i)\\in {\\bf R}^n \\times {\\bf R}$, $i=1,\\ldots, m$,\n",
"and fit a linear (affine) model\n",
"\n",
"$$\\hat y_i = \\beta^T x_i - v,$$\n",
"\n",
"where $\\beta \\in {\\bf R}^n$ and $v \\in {\\bf R}$.\n",
"\n",
"The residuals are $r_i = \\hat y_i - y_i$.\n",
"In standard (least-squares) regression we choose $\\beta,v$ to minimize $\\|r\\|_2^2 = \\sum_i r_i^2$.\n",
"For this choice of $\\beta,v$ the mean of the optimal residuals is zero.\n",
"\n",
"A simple variant is to add (Tikhonov) regularization, meaning we solve the optimization problem\n",
"\n",
"$$\\begin{array}{ll}\n",
"\\mbox{minimize} & \\|r\\|_2^2 + \\lambda \\|\\beta \\|_2^2,\n",
"\\end{array}$$\n",
"\n",
"where $\\lambda>0$."
],
"metadata": {
"collapsed": false
}
},
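{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"As a quick illustration (separate from the example below), the regularized problem above can be sketched in CVXPY; the data `A`, `b` and the weight `lam` here are made up for demonstration:\n",
"\n",
"```python\n",
"import cvxpy as cp\n",
"import numpy as np\n",
"\n",
"np.random.seed(0)\n",
"m, n = 20, 5\n",
"A = np.random.randn(m, n)  # regressors x_i stacked as rows\n",
"b = np.random.randn(m, 1)  # measurements y_i\n",
"lam = 0.1  # regularization weight lambda > 0\n",
"\n",
"beta = cp.Variable((n, 1))\n",
"v = cp.Variable()\n",
"r = A @ beta - v - b  # residuals r_i = beta^T x_i - v - y_i\n",
"prob = cp.Problem(cp.Minimize(cp.sum_squares(r) + lam * cp.sum_squares(beta)))\n",
"prob.solve()\n",
"```"
]
},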
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## Robust (Huber) regression\n",
"\n",
"A more sophisticated variant is to replace the square function with the *Huber function*\n",
"\n",
"$$\n",
"\\phi(u) = \\left\\{ \\begin{array}{ll} u^2 & |u|\\leq M\\\\\n",
"2M|u| - M^2 & |u|>M\n",
"\\end{array}\\right.\n",
"$$\n",
"\n",
"where $M>0$ is the Huber threshold.\n",
"The image below shows the square function on the left and the Huber function on the right.\n",
"\n",
"![title](huber_vs_square.png)\n",
"\n",
"Huber regression agrees with standard (least-squares) regression for small residuals, but penalizes large\n",
"residuals only linearly, so a few outliers have limited influence on the fit."
]
},
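{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"As a sanity check on the definition, the Huber function is easy to evaluate directly with NumPy (`huber_fn` below is our own helper for illustration, not the CVXPY atom used later):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def huber_fn(u, M=1.0):\n",
"    # Quadratic for |u| <= M, linear (slope 2M) beyond the threshold M.\n",
"    return np.where(np.abs(u) <= M, u**2, 2 * M * np.abs(u) - M**2)\n",
"\n",
"u = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])\n",
"print(huber_fn(u))  # quadratic near 0, only linear growth for |u| > M\n",
"```"
]
},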
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## Example\n",
"\n",
"In the following code we do a numerical example of Huber regression.\n",
"We generate $m=450$ measurements with $n=300$ regressors.\n",
"We randomly choose $\\beta^\\mathrm{true}$ and $x_i \\sim \\mathcal N(0,I)$.\n",
"We set $y_i = (\\beta^\\mathrm{true})^Tx_i + \\epsilon_i$, where $\\epsilon_i \\sim\n",
"\\mathcal N(0,1)$.\n",
"Then with probability $p$ we replace $y_i$ with $-y_i$.\n",
"\n",
"The data has fraction $p$ of (non-obvious) wrong measurements.\n",
"The distributions of \"good\" and \"bad\" $y_i$ are the same.\n",
"\n",
"Our goal is to recover $\\beta^\\mathrm{true} \\in {\\bf R}^n$ from the measurements $y\\in {\\bf R}^m$.\n",
"We compare three approaches: standard regression, Huber regression, and \"prescient\" regression, where we know which measurements had their sign flipped.\n",
"\n",
"We generate $50$ problem instances, with $p$ varying from $0$ to $0.15$, and plot the relative error in reconstructing $\\beta^\\mathrm{true}$ for the three approaches.\n",
"Notice that in the range $p \\in [0,0.08]$, Huber regression matches prescient regression. Standard regression, by contrast, fails even for very small $p$."
]
},
{
"cell_type": "code",
"execution_count": 3,
"outputs": [],
"source": [
"# Generate data for Huber regression.\n",
"import numpy as np\n",
"\n",
"np.random.seed(1)\n",
"n = 300\n",
"SAMPLES = int(1.5 * n)  # m = 450 measurements\n",
"beta_true = 5 * np.random.normal(size=(n, 1))\n",
"X = np.random.randn(n, SAMPLES)  # columns are the regressors x_i\n",
"v = np.random.normal(size=(SAMPLES, 1))  # noise epsilon_i (not the offset v from the model above)"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "code",
"execution_count": 4,
"outputs": [],
"source": [
"# Generate data for different values of p.\n",
"# Solve the resulting problems.\n",
"# WARNING this script takes a few minutes to run.\n",
"import cvxpy as cp\n",
"\n",
"TESTS = 50\n",
"lsq_data = np.zeros(TESTS)\n",
"huber_data = np.zeros(TESTS)\n",
"prescient_data = np.zeros(TESTS)\n",
"p_vals = np.linspace(0, 0.15, num=TESTS)\n",
"for idx, p in enumerate(p_vals):\n",
" # Generate the sign changes.\n",
" factor = 2 * np.random.binomial(1, 1 - p, size=(SAMPLES, 1)) - 1\n",
" Y = factor * X.T.dot(beta_true) + v\n",
"\n",
" # Form and solve a standard regression problem.\n",
" beta = cp.Variable((n, 1))\n",
" fit = cp.norm(beta - beta_true) / cp.norm(beta_true)  # relative reconstruction error; .value updates after each solve\n",
" cost = cp.norm(X.T @ beta - Y)\n",
" prob = cp.Problem(cp.Minimize(cost))\n",
" prob.solve()\n",
" lsq_data[idx] = fit.value\n",
"\n",
" # Form and solve a prescient regression problem,\n",
" # i.e., where the sign changes are known.\n",
" cost = cp.norm(cp.multiply(factor, X.T @ beta) - Y)\n",
" cp.Problem(cp.Minimize(cost)).solve()\n",
" prescient_data[idx] = fit.value\n",
"\n",
" # Form and solve the Huber regression problem.\n",
" cost = cp.sum(cp.huber(X.T @ beta - Y, 1))  # Huber threshold M = 1\n",
" cp.Problem(cp.Minimize(cost)).solve()\n",
" huber_data[idx] = fit.value"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "code",
"execution_count": 5,
"outputs": [
{
"data": {
"text/plain": "