Class OWLQN

  • All Implemented Interfaces:
    Minimizer

    public class OWLQN
    extends AbstractMinimizer
    Java translation of the C++ code of "Orthant-Wise Limited-memory Quasi-Newton Optimizer for L1-regularized Objectives" (see http://research.microsoft.com/).

    The Orthant-Wise Limited-memory Quasi-Newton algorithm (OWL-QN) is a numerical optimization procedure for finding the optimum of an objective of the form {smooth function} plus {L1-norm of the parameters}. It has been used for training log-linear models (such as logistic regression) with L1-regularization. The algorithm is described in "Scalable training of L1-regularized log-linear models" by Galen Andrew and Jianfeng Gao.

    The OWL-QN algorithm minimizes functions of the form

    f(w) = loss(w) + C |w|_1

    where loss is an arbitrary differentiable convex loss function, and |w|_1 is the L1 norm of the weight (parameter) vector. It is based on the L-BFGS Quasi-Newton algorithm, with modifications to deal with the fact that the L1 norm is not differentiable. The algorithm is very fast, and capable of scaling efficiently to problems with millions of parameters.
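    As an illustration of the objective above and of one way the non-differentiable L1 term can be handled, the following sketch evaluates f(w) and computes the pseudo-gradient described in the Andrew and Gao paper. It is not taken from this class: the names L1ObjectiveSketch, objective and pseudoGradient are hypothetical, and plain double[] arrays stand in for the library's vector type.

```java
import java.util.function.ToDoubleFunction;

// Illustrative sketch only; none of these names belong to the OWLQN class.
class L1ObjectiveSketch {

  // f(w) = loss(w) + C * |w|_1
  static double objective(ToDoubleFunction<double[]> loss, double[] w, double c) {
    double l1 = 0.0;
    for (double wi : w) {
      l1 += Math.abs(wi);
    }
    return loss.applyAsDouble(w) + c * l1;
  }

  // Pseudo-gradient of f at w, given the gradient of the smooth loss at w.
  static double[] pseudoGradient(double[] w, double[] lossGrad, double c) {
    double[] pg = new double[w.length];
    for (int i = 0; i < w.length; i++) {
      if (w[i] > 0) {
        pg[i] = lossGrad[i] + c; // |w_i| has derivative +1 here
      } else if (w[i] < 0) {
        pg[i] = lossGrad[i] - c; // |w_i| has derivative -1 here
      } else if (lossGrad[i] + c < 0) {
        pg[i] = lossGrad[i] + c; // right derivative is negative: increasing w_i lowers f
      } else if (lossGrad[i] - c > 0) {
        pg[i] = lossGrad[i] - c; // left derivative is positive: decreasing w_i lowers f
      } else {
        pg[i] = 0.0; // zero is locally optimal for this coordinate
      }
    }
    return pg;
  }
}
```

    A coordinate that sits at exactly zero only receives a non-zero pseudo-gradient when the loss gradient outweighs the penalty C, which is why L1-regularized solutions tend to be sparse.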

    This is a straightforward translation that uses the author's math library (de.jungblut.math).

    Author:
    thomas.jungblut
    • Constructor Summary

      Constructors 
      Constructor Description
      OWLQN()  
    • Method Summary

      Modifier and Type Method Description
      OWLQN doGradChecks()
      When enabled, checks the gradient on every iteration and prints whether it agrees with the numerical gradient.
      de.jungblut.math.DoubleVector minimize(CostFunction f, de.jungblut.math.DoubleVector theta, int maxIterations, boolean verbose)
      Minimizes the given cost function, starting from the parameters theta.
      static de.jungblut.math.DoubleVector minimizeFunction(CostFunction f, de.jungblut.math.DoubleVector theta, int maxIterations, boolean verbose)
      Minimizes the given cost function with L-BFGS.
      OWLQN setL1Weight(double l1weight)
      This implementation also supports L1 weight adjustment (without the cost function knowing about it).
      OWLQN setM(int m)
      The number of directions and gradients to keep; this is the "limited" part of L-BFGS.
      OWLQN setTolerance(double tol)
      The stopping tolerance, evaluated over a window of five iterations.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • OWLQN

        public OWLQN()
    • Method Detail

      • minimize

        public de.jungblut.math.DoubleVector minimize(CostFunction f,
                                                      de.jungblut.math.DoubleVector theta,
                                                      int maxIterations,
                                                      boolean verbose)
        Description copied from interface: Minimizer
        Minimizes the given cost function, starting from the parameters theta.
        Parameters:
        f - the cost function to minimize.
        theta - the starting parameters.
        maxIterations - the number of iterations to perform.
        verbose - if true, progress is printed.
        Returns:
        the optimized theta parameters.
      • doGradChecks

        public OWLQN doGradChecks()
        When enabled, checks the gradient on every iteration and prints whether it agrees with the numerical gradient.
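        A gradient check of this kind is commonly done with central differences. The sketch below shows the idea under that assumption; it does not reproduce this class's internal check, and the names GradientCheckSketch, numericalGradient and maxAbsDifference are hypothetical.

```java
import java.util.function.ToDoubleFunction;

// Illustrative sketch only; not the check performed inside OWLQN.
class GradientCheckSketch {

  // Central-difference approximation of the gradient of f at x.
  static double[] numericalGradient(ToDoubleFunction<double[]> f, double[] x, double eps) {
    double[] grad = new double[x.length];
    double[] probe = x.clone(); // leave the caller's array untouched
    for (int i = 0; i < x.length; i++) {
      double original = probe[i];
      probe[i] = original + eps;
      double plus = f.applyAsDouble(probe);
      probe[i] = original - eps;
      double minus = f.applyAsDouble(probe);
      probe[i] = original; // restore before moving to the next coordinate
      grad[i] = (plus - minus) / (2.0 * eps);
    }
    return grad;
  }

  // Largest absolute per-coordinate difference between two gradients.
  static double maxAbsDifference(double[] analytic, double[] numerical) {
    double max = 0.0;
    for (int i = 0; i < analytic.length; i++) {
      max = Math.max(max, Math.abs(analytic[i] - numerical[i]));
    }
    return max;
  }
}
```

        Each check costs two extra cost evaluations per parameter, so it is best reserved for debugging a new cost function on a small problem.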
      • setM

        public OWLQN setM(int m)
        The number of directions and gradients to keep; this is the "limited" part of L-BFGS. It defaults to 10.
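        To show what keeping only m directions and gradients means, here is a minimal sketch of the standard L-BFGS two-loop recursion over a bounded history. It is a textbook formulation and is not claimed to match this class's code; LimitedMemorySketch and its methods are hypothetical names, and double[] arrays stand in for the library's vector type.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only; a textbook L-BFGS history, not the OWLQN internals.
class LimitedMemorySketch {
  private final int m;
  private final Deque<double[]> sList = new ArrayDeque<>(); // parameter differences
  private final Deque<double[]> yList = new ArrayDeque<>(); // gradient differences

  LimitedMemorySketch(int m) {
    if (m < 1) throw new IllegalArgumentException("m must be at least 1");
    this.m = m;
  }

  // Stores a new (s, y) pair, evicting the oldest pair once m pairs are held.
  void update(double[] s, double[] y) {
    if (sList.size() == m) {
      sList.removeFirst();
      yList.removeFirst();
    }
    sList.addLast(s);
    yList.addLast(y);
  }

  int size() {
    return sList.size();
  }

  // Two-loop recursion: returns -H * gradient, H being the implicit inverse Hessian.
  double[] direction(double[] gradient) {
    int k = sList.size();
    double[] q = gradient.clone();
    if (k > 0) {
      double[][] s = sList.toArray(new double[0][]); // oldest to newest
      double[][] y = yList.toArray(new double[0][]);
      double[] alpha = new double[k];
      double[] rho = new double[k];
      for (int i = k - 1; i >= 0; i--) { // newest to oldest
        rho[i] = 1.0 / dot(y[i], s[i]);
        alpha[i] = rho[i] * dot(s[i], q);
        addScaled(q, -alpha[i], y[i]);
      }
      double gamma = dot(s[k - 1], y[k - 1]) / dot(y[k - 1], y[k - 1]);
      for (int j = 0; j < q.length; j++) q[j] *= gamma; // initial Hessian scaling
      for (int i = 0; i < k; i++) { // oldest to newest
        double beta = rho[i] * dot(y[i], q);
        addScaled(q, alpha[i] - beta, s[i]);
      }
    }
    for (int j = 0; j < q.length; j++) q[j] = -q[j];
    return q;
  }

  private static double dot(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
    return sum;
  }

  private static void addScaled(double[] target, double factor, double[] v) {
    for (int i = 0; i < target.length; i++) target[i] += factor * v[i];
  }
}
```

        Memory use grows linearly with m (two vectors per stored pair), so a larger m buys a better curvature estimate at the cost of more memory and more work per iteration.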
      • setL1Weight

        public OWLQN setL1Weight(double l1weight)
        This implementation also supports L1 weight adjustment (without the cost function knowing about it). This is turned off by default.
      • setTolerance

        public OWLQN setTolerance(double tol)
        The stopping tolerance, evaluated over a window of five iterations. This defaults to 1e-4.
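        The exact stopping rule is not spelled out here. One plausible reading, sketched below purely as an assumption, is to stop once the average relative improvement of the cost over the last five iterations drops below the tolerance. WindowedToleranceSketch and shouldStop are hypothetical names, not part of this class.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only; the actual criterion used by OWLQN may differ.
class WindowedToleranceSketch {
  private final Deque<Double> window = new ArrayDeque<>();
  private final int windowSize;
  private final double tol;

  WindowedToleranceSketch(int windowSize, double tol) {
    this.windowSize = windowSize;
    this.tol = tol;
  }

  // Records the cost of the current iteration and reports whether to stop.
  boolean shouldStop(double cost) {
    boolean stop = false;
    if (window.size() == windowSize) {
      double oldest = window.removeFirst();
      // Average improvement per iteration across the window.
      double averageImprovement = (oldest - cost) / windowSize;
      // Relative to the current cost; the max() guards against division by zero.
      double relative = averageImprovement / Math.max(Math.abs(cost), Double.MIN_NORMAL);
      stop = relative < tol;
    }
    window.addLast(cost);
    return stop;
  }
}
```

        Under this reading the check never fires during the first five iterations, because the window has to fill before an average can be formed.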
      • minimizeFunction

        public static de.jungblut.math.DoubleVector minimizeFunction(CostFunction f,
                                                                     de.jungblut.math.DoubleVector theta,
                                                                     int maxIterations,
                                                                     boolean verbose)
        Minimizes the given cost function with L-BFGS.
        Parameters:
        f - the cost function to minimize.
        theta - the initial weights.
        maxIterations - the maximum number of iterations.
        verbose - true if progress output should be printed.
        Returns:
        the optimized set of parameters for the cost function.