LASSO

We wish to recover a sparse vector \(x \in \mathbf{R}^n\) from measurements \(y \in \mathbf{R}^m\). Our measurement model tells us that

\[ y = Ax + v, \]

where \(A \in \mathbf{R}^{m \times n}\) is a known matrix and \(v \in \mathbf{R}^m\) is unknown measurement error. For our demonstration, the entries of \(v\) are sampled independently from a normal distribution with mean zero and standard deviation \(\sigma\) (by default, \(\sigma = 1\)).

We can first try to recover \(x\) by solving the optimization problem

\[\begin{split} \begin{array}{ll} \mbox{minimize} & \|Ax - y\|^2_2 + \gamma \|x\|^2_2.\\ \end{array} \end{split}\]

This problem is called ridge regression.
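
If you have not written an objective like this in CVXPY before, here is a minimal sketch on a tiny made-up instance. The names `A_demo`, `y_demo`, and `x_demo` are hypothetical and unrelated to the data generated below.

import numpy as np
import cvxpy as cp

# Tiny hypothetical instance, only to illustrate the objective.
rng = np.random.default_rng(0)
A_demo = rng.standard_normal((5, 10))
y_demo = rng.standard_normal(5)
x_demo = cp.Variable(10)

gamma_demo = 1.0  # regularization weight
ridge_obj = cp.sum_squares(A_demo @ x_demo - y_demo) + gamma_demo * cp.sum_squares(x_demo)
cp.Problem(cp.Minimize(ridge_obj)).solve()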

The code below defines \(n\), \(m\), \(A\), the true \(x\), and \(y\). Use CVXPY to estimate \(x\) from \(y\) by ridge regression. Try several values of \(\gamma\), and use the plotting code to compare the estimated \(x\) with the true \(x\).

A more effective approach is to solve the LASSO problem

\[\begin{split} \begin{array}{ll} \mbox{minimize} & \|Ax - y\|^2_2 + \gamma \|x\|_1.\\ \end{array} \end{split}\]

How many measurements \(m\) are needed to find an accurate \(x\) with ridge regression? How about with the LASSO?
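
As a sketch only, the LASSO objective differs from ridge just in the penalty term. Reusing the hypothetical demo names from the ridge sketch above:

# The LASSO objective swaps the squared l2 penalty for an l1 penalty.
lasso_obj = cp.sum_squares(A_demo @ x_demo - y_demo) + gamma_demo * cp.norm1(x_demo)
cp.Problem(cp.Minimize(lasso_obj)).solve()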

# Ridge regression vs. LASSO to estimate sparse x.
import numpy as np
import scipy.linalg as la
import scipy.sparse as sp
import cvxpy as cp

np.random.seed(1)

n = 400
m = 200
# true_x is sparse: roughly 10% of its entries are nonzero.
true_x = 100 * sp.rand(n, 1, 0.1).toarray().flatten()
A = np.random.randn(m, n)
sigma = 1.0
v = np.random.normal(0, sigma, m)
y = A @ true_x + v


x = cp.Variable(n)
gamma = None  # set me! Initialize to 1.
ridge_loss = None  # set me
ridge = cp.Problem(cp.Minimize(ridge_loss))
ridge.solve(solver='ECOS')
x_ridge = x.value

lasso_loss = None  # set me
lasso = cp.Problem(cp.Minimize(lasso_loss))
lasso.solve(solver='ECOS')
x_lasso = x.value

import matplotlib.pyplot as plt
%matplotlib inline
plt.semilogy(range(n), np.sort(np.abs(true_x - x_ridge)), label="ridge errors")
plt.semilogy(range(n), np.sort(np.abs(true_x - x_lasso)), label="lasso errors")
plt.legend()
plt.show()
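
Before moving on, here is one possible way to fill in the placeholders above; this is a sketch, not the only valid answer. Making `gamma` a `cp.Parameter` is convenient because the same problem can then be re-solved for many values of \(\gamma\) without being rebuilt, which is handy for the cross-validation code below.

# One possible completion of the placeholders (a sketch).
gamma = cp.Parameter(nonneg=True)
gamma.value = 1.0
ridge_loss = cp.sum_squares(A @ x - y) + gamma * cp.sum_squares(x)
lasso_loss = cp.sum_squares(A @ x - y) + gamma * cp.norm1(x)
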
# Managing parallelism for cross-validation.
import time
import dask

def get_x(gamma_val):
    # Set the regularization parameter gamma to gamma_val, solve the
    # problem with the ECOS solver, and return the optimal "x".
    # One possible implementation is sketched at the end of this section.
    return None

num_gamma = 30
gamma_values = np.logspace(-4, 2, num_gamma)

tic = time.time()
xs_loop = [get_x(val) for val in gamma_values]
toc = time.time()
t_loop = toc - tic

tic = time.time()
dasklist = [dask.delayed(get_x)(val) for val in gamma_values]
xs_dask = dask.compute(*dasklist, scheduler='processes')
toc = time.time()
t_dask = toc - tic
print(f'Time using a plain Python loop:\n\t{t_loop:.2f} s')
print(f'Time using Dask parallelism:\n\t{t_dask:.2f} s')

sol_diffs = [la.norm(xs_loop[i] - xs_dask[i]) for i in range(num_gamma)]
print(f'Maximum discrepancy between loop and Dask solutions:\n\t{max(sol_diffs)}')

sol_errors = [la.norm(xs_loop[i] - true_x) for i in range(num_gamma)]
# Estimation error as a function of the regularization parameter.
plt.semilogx(gamma_values, sol_errors)
plt.xlabel('gamma')
plt.ylabel('distance to true x')
plt.show()
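
Finally, one possible implementation of `get_x`, assuming `gamma` was created as a `cp.Parameter` as in the sketch above:

def get_x(gamma_val):
    # Update the parameter value and re-solve the already-constructed
    # problem; only the numerical solve is repeated.
    gamma.value = gamma_val
    lasso.solve(solver='ECOS')
    return x.value

Note that with the 'processes' scheduler, Dask serializes the function (and whatever it closes over) to worker processes, so each worker solves its own copy of the problem.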