Partitioned Least Squares

Linear least squares is one of the most widely used regression methods among scientists in many fields. The simplicity of the model allows this method to be used when data is scarce and it is usually appealing to practitioners that need to gather some insight into the problem by inspecting the values of the learnt parameters. PartitionedLS is a variant of the linear least squares model allowing practitioners to partition the input features into groups of variables that they require to contribute similarly to the final result.

The model

The Partitioned Least Squares model is formally defined as:

\[\begin{gather*} \text{minimize}_{\mathbf{\alpha}, \mathbf{\beta}} \| \mathbf{X} \times (\mathbf{P} \circ \mathbf{\alpha}) \times \mathbf{\beta} - \mathbf{y} \|_2^2 \\ \begin{aligned} \quad s.t.\quad &\mathbf{\alpha} \succeq 0\\ &\mathbf{P}^T \times \mathbf{\alpha} = \mathbf{1}. \end{aligned} \end{gather*}\]

where:

  • $\mathbf{X}$ is $N \times M$ data matrix;
  • $\mathbf{P}$ is a user-defined partition matrix having $K$ columns (one for each element of the partition), $M$ rows, and containing $1$ in $P_{i,j}$ if the $i$-th attribute belongs to the $j$-th partition and $0$ otherwise;
  • $\mathbf{\beta}$ is a vector weighting the importance of each set of attributes in the partition;
  • $\mathbf{\alpha}$ is a vector weighting the importance of each attribute within one of the sets in the partition. Note that the constraints imply that for each set in the partition the weights of the corresponding $\alpha$ variables are all positive and sum to $1$.

The PartitionedLS problem is non-convex and NP-complete. The library provides two algorithms to solve the problem anyway: an iterative algorithm based on the Alternating Least Squares approach and an optimal algorithm that guarantees requiring however exponential time in the cardinality of the partition (i.e., it is mainly useful when $K$ is small).

More details can be found in the paper Partitioned Least Squares.

To install this library

Just add it as a dependency to your Julia environment. Launch julia from the main directory of your project and enter the following commands:

# Opens the package manager REPL
]

# Activate you local environment (can be skipped if you want to install the library globally)
activate .

# Adds the library to the environment
add PartitionedLS

To use this library

You will need a matrix P describing the partitioning of your variables, e.g.:

P = [[1 0]; 
     [1 0]; 
     [0 1]]

specifies that the first and the second variable belongs to the first partition, while the third variable belongs to the second.

You have then the choice to use either the standard interface or the MLJ interface.

Standard interface

The standard interface defines a fit function for each of the implemented algorithms. The function returns a tuple containing:

  • a PartLSFitResult object containing the model and the parameters found by the algorithm;
  • nothing (this is mandated by the MLJ interface, but it is not used in this case).
  • a NamedTuple containing some additional information.

A complete example:


using PartitionedLS

X = [[1. 2. 3.]; 
     [3. 3. 4.]; 
     [8. 1. 3.]; 
     [5. 3. 1.]]

y = [1.; 
     1.; 
     2.; 
     3.]

P = [[1 0]; 
     [1 0]; 
     [0 1]]


# fit using the optimal algorithm 
result = fit(Opt, X, y, P, η = 0.0)


# Make predictions on the given data matrix. The function works
# with results returned by anyone of the solvers.
predict(result[1], X)

MLJ interface

The MLJ interface is a allows you to use the library in a more MLJ-like fashion. The interface is defined by the PartLS model, which can be used in the MLJ framework. The model can be used in the same way as any other MLJ model.

A complete example:

using MLJ
using PartitionedLS

X = [[1. 2. 3.]; 
     [3. 3. 4.]; 
     [8. 1. 3.]; 
     [5. 3. 1.]]

y = [1.;
     1.;
     2.;
     3.]

P = [[1 0]; 
     [1 0]; 
     [0 1]]

# Define the model

model = PartLS(P=P, Optimizer=Opt, η=0.0)

# Fit the model
mach = machine(model, X, y)
fit!(mach)

# Make predictions
predict(mach, X)

API Documentation

PartitionedLS.PartLSType
mutable struct PartLS <: Deterministic

The PartLS struct represents a partitioned least squares model. Fields are:

  • Optimizer: the optimization algorithm to use. It can be Opt, Alt or BnB.
  • P: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element P_{k, i} = 1 if feature i belongs to partition k.
  • η: the regularization parameter. It controls the strength of the regularization.
  • ϵ: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the Alt algorithm.
  • T: the maximum number of iterations. It is used to determine when to stop the Alt optimization algorithm has converged. Only used by the Alt algorithm.
  • rng: the random number generator to use.
    • If nothing, the global random number generator rand is used.
    • If an integer, the global number generator rand is used after seeding it with the given integer.
    • If an object of type AbstractRNG, the given random number generator is used.

Example

model = PartLS(P=P, Optimizer=Alt, rng=123)
source
PartitionedLS.PartLSFitResultType
struct PartLSFitResult

The PartLSFitResult struct represents the solution of the partitioned least squares problem. It contains the values of the α and β variables, the intercept t and the partition matrix P.

Fields

  • α::Vector{Float64}: The values of the α variables. For each partition $k$, it holds the values of the α variables are such that $\sum_{i \in P_k} \alpha_{k} = 1$.
  • β::Vector{Float64}: The values of the β variables. For each partition $k$, $\beta_k$ is the coefficient that multiplies the features in the k-th partition.
  • t::Float64: The intercept term of the model.
  • P::Matrix{Int64}: The partition matrix. It is a binary matrix where each row corresponds to a partition and each column corresponds to a feature. The element $P_{k, i} = 1$ if feature $i$ belongs to partition $k$.
source
MLJModelInterface.fitFunction
fit(
    ::Type{Alt},
    X::Matrix{Float64},
    y::Vector{Float64},
    P::Matrix{Int64};
    η,
    ϵ,
    T,
    nnlsalg,
    rng
) -> Tuple{PartLSFitResult, Nothing, NamedTuple{(:opt,), <:Tuple{Any}}}

Fits a PartitionedLS model by alternating the optimization of the α and β variables. This version uses an optimization strategy based on non-negative-least-squaes solvers. This formulation is faster and more numerically stable with respect to fit(Alt, ...)`.

Arguments

  • X: $N × M$ matrix describing the examples
  • y: $N$ vector with the output values for each example
  • P: $M × K$ matrix specifying how to partition the $M$ attributes into $K$ subsets. $P_{m,k}$ should be 1 if attribute number $m$ belongs to partition $k$.
  • η: regularization factor, higher values implies more regularized solutions. Default is 0.0.
  • T: number of alternating loops to be performed. Default is 100.
  • ϵ: minimum relative improvement in the objective function before stopping the optimization. Default is 1e-6
  • nnlsalg: specific flavour of nnls algorithm to be used, possible values are :pivot, :nnls, :fnnls. Default is :nnls

Result

A Tuple with the following fields:

  1. a PartLSFitResult object containing the fitted model
  2. a nothing object
  3. a NamedTuple with a field opt containing the optimal value of the objective function
source
fit(
    ::Type{Opt},
    X::Matrix{Float64},
    y::Vector{Float64},
    P::Matrix{Int64};
    η,
    nnlsalg,
    returnAllSolutions
) -> Tuple{PartLSFitResult, Nothing, Any}

Fits a PartialLS Regression model to the given data and resturns the learnt model (see the Result section). It uses a coplete enumeration strategy which is exponential in K, but guarantees to find the optimal solution.

Arguments

  • X: $N × M$ matrix describing the examples
  • y: $N$ vector with the output values for each example
  • P: $M × K$ matrix specifying how to partition the $M$ attributes into $K$ subsets. $P_{m,k}$ should be 1 if attribute number $m$ belongs to

partition $k$.

  • η: regularization factor, higher values implies more regularized solutions (default: 0.0)
  • returnAllSolutions: if true an additional output is appended to the resulting tuple containing all solutions found during the algorithm.
  • nnlsalg: the kind of nnls algorithm to be used during solving. Possible values are :pivot, :nnls, :fnnls (default: :nnls)

Example

X = rand(100, 10)
y = rand(100)
P = [1 0 0; 0 1 0; 0 0 1; 1 1 0; 0 1 1]
result = fit(Opt, X, y, P)
source
fit(
    ::Type{BnB},
    X::Matrix{Float64},
    y::Vector{Float64},
    P::Matrix{Int64};
    η,
    nnlsalg
) -> Tuple{PartLSFitResult, Nothing, NamedTuple{(:opt, :nopen), <:Tuple{Any, Int64}}}

Implements the Branch and Bound algorithm to fit a Partitioned Least Squres model.

Arguments

  • X: $N × M$ matrix describing the examples
  • y: $N$ vector with the output values for each example
  • P: $M × K$ matrix specifying how to partition the $M$ attributes into $K$ subsets. $P_{m,k}$ should be 1 if attribute number $m$ belongs to

partition $k$.

  • η: regularization factor, higher values implies more regularized solutions (default: 0.0)
  • nnlsalg: the kind of nnls algorithm to be used during solving. Possible values are :pivot, :nnls, :fnnls (default: :nnls)

Result

A tuple with the following fields:

  1. a PartLSFitResult object containing the fitted model
  2. a nothing object
  3. a NamedTuple with fields:
    • opt containing the optimal value of the objective function
    • nopen containing the number of open nodes in the branch and bound tree
source
fit(
    m::PartLS,
    verbosity,
    X,
    y
) -> Tuple{PartLSFitResult, Nothing, Any}

Fits a PartitionedLS Regression model to the given data and resturns the learnt model (see the Result section). It conforms to the MLJ interface.

Arguments

  • m: A PartLS model to fit
  • verbosity: the verbosity level
  • X: the data matrix
  • y: the target vector
source
MLJModelInterface.predictFunction
predict(
    α::Vector{Float64},
    β::Vector{Float64},
    t::Float64,
    P::Matrix{Int64},
    X::Matrix{Float64}
) -> Vector{Float64}

Result

the prediction for the partitioned least squares problem with solution α, β, t over the dataset X and partition matrix P

source
predict(
    model::PartLSFitResult,
    X::Matrix{Float64}
) -> Vector{Float64}

Make predictions for the datataset X using the PartialLS model model.

Arguments

  • model: a PartLSFitResult
  • X: the data containing the examples for which the predictions are sought

Return

the predictions of the given model on examples in X.

source
PartitionedLS.homogeneousCoordsFunction

Rewrites X and P in homogeneous coordinates. The result is a tuple (Xo, Po) where Xo is the homogeneous version of X and Po is the homogeneous version of P.

Arguments

  • X: the data matrix
  • P: the partition matrix

Return

  • Xo: the homogeneous version of X
  • Po: the homogeneous version of P
source
PartitionedLS.regularizeProblemFunction

Adds regularization terms to the problem. The regularization terms are added to the objective function as a sum of squares of the α variables. The regularization parameter η controls the strength of the regularization.

Arguments

  • X: the data matrix
  • y: the target vector
  • P: the partition matrix
  • η: the regularization parameter

Return

  • Xn: the new data matrix
  • yn: the new target vector

Main idea

K new rows are added to the data matrix X, row $k \in \{1 \dots K\}$ is a vector of zeros except for the components that corresponds to features belonging to the k-th partition, which is set to sqrt(η). The target vector y is extended with K zeros.

The point of this change is that when the objective function is evaluated as $math \|Xw - y\|^2$, the new part of the matrix contributes to the loss with a factor of $η \sum \|w_i\|^2$ . This is equivalent to adding a regularization term to the objective function.

source