Class FeatureValue

java.lang.Object
com.linkedin.feathr.common.FeatureValue
All Implemented Interfaces:
com.esotericsoftware.kryo.KryoSerializable, Serializable

public class FeatureValue extends Object implements Serializable, com.esotericsoftware.kryo.KryoSerializable
Describes the basic representation of a feature value in Feathr.

This is the OLD FeatureValue API. Feathr is migrating to a new FeatureValue API defined at com.linkedin.feathr.common.value.FeatureValue. This implementation of the old API is a wrapper around the new API, with a lot of glue code to maintain compatibility.

SERIALIZATION: For use with the Spark RDD API, this class is serializable both by Java Object Serialization and by Kryo. Because the serialized representation is based on NTV (name-term-value), serialization will NOT work for tensor features having rank greater than 1.

MUTABILITY: For legacy compatibility, instances of this class are mutable under the following conditions:
1. When constructed with the zero-argument constructor
2. When constructed with the Map<String,Float> constructor
3. When constructed specifically as a term vector using any of the createStringTermVector factory methods
4. When deserialized with Kryo or Java Object Serialization

When mutable, instances may be mutated by:
1. Using modifier methods on the Map objects returned by .getValue or .getAsTermVector
2. Using modifier methods on the ._value public member field
3. Using the .put method

All mutability support in FeatureValue is DEPRECATED.

NOTE: This class will be deprecated soon in favor of com.linkedin.feathr.common.value.FeatureValue.
  • Field Details

  • Constructor Details

  • Method Details

    • put

      @Deprecated public void put(String key, Float value)
      Deprecated.
      mutability in FeatureValue is deprecated; see class-level Javadoc
      Modify a term-vector FeatureValue by adding a term-value mapping. Mutability is only allowed under certain conditions, described in the class-level Javadoc.
      Throws:
      RuntimeException - if not mutable
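      The conditional mutability described for put can be illustrated with a minimal sketch. TermVectorSketch and its factory methods are hypothetical stand-ins written for this example, not part of Feathr; the sketch only shows the general pattern of a put that throws a RuntimeException when the instance was built through an immutable path.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; this is NOT the Feathr FeatureValue class.
class TermVectorSketch {
    private final Map<String, Float> terms;
    private final boolean mutable;

    private TermVectorSketch(Map<String, Float> terms, boolean mutable) {
        this.terms = terms;
        this.mutable = mutable;
    }

    // Stands in for a construction path that yields a mutable instance.
    static TermVectorSketch mutableOf(Map<String, Float> input) {
        return new TermVectorSketch(new HashMap<>(input), true);
    }

    // Stands in for a construction path that yields an immutable instance.
    // The defensive copy is wrapped so the returned map is read-only too.
    static TermVectorSketch immutableOf(Map<String, Float> input) {
        Map<String, Float> copy = new HashMap<>(input);
        return new TermVectorSketch(Collections.unmodifiableMap(copy), false);
    }

    // Adds a term-value mapping, or fails if this instance is not mutable.
    void put(String key, Float value) {
        if (!mutable) {
            throw new RuntimeException("this instance is not mutable");
        }
        terms.put(key, value);
    }

    // Returns the live map for mutable instances, a read-only view otherwise.
    Map<String, Float> getAsTermVector() {
        return terms;
    }
}
```

      In this sketch, callers that need to change a value built through the immutable path would create a new instance instead, which is the direction the deprecation notice points toward.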
    • isEmpty

      public boolean isEmpty()
      Checks whether the FeatureValue is empty. Returns true if it is empty, false otherwise. Emptiness is defined as follows:
      1. For tensor-based features, empty means an empty tensor.
      2. For legacy term-vector-based features, empty means an empty vector.
      Note: Because a Boolean NTV feature with the value false is equivalent to an empty map, a false Boolean feature is also considered empty.
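      The note about Boolean features can be made concrete with a small sketch. The encoding below (true as a single entry keyed by the empty string with value 1.0, false as the empty map) is an assumption made for illustration; this page only states that false is equivalent to an empty map. NtvBooleanSketch is a hypothetical helper, not a Feathr class.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper for illustration; not part of Feathr.
class NtvBooleanSketch {
    // Assumed encoding: true is one entry ("" -> 1.0f), false is the empty map.
    static Map<String, Float> encode(boolean value) {
        if (value) {
            return Collections.singletonMap("", 1.0f);
        }
        return Collections.emptyMap();
    }

    // Under this encoding, emptiness of the term vector is emptiness of the map,
    // so an encoded false is reported as empty.
    static boolean isEmpty(Map<String, Float> termVector) {
        return termVector.isEmpty();
    }
}
```

      A consequence worth keeping in mind is that, in the term-vector view, a false Boolean feature and a feature with no terms at all cannot be told apart by an emptiness check.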
    • size

      @Deprecated public int size()
      Deprecated.
      Prefer to use getNumTerms() which has a more descriptive name
      Throws:
      RuntimeException - if this instance stores a tensor of rank greater than 1
    • getValue

      @Deprecated public Map<String,Float> getValue()
      Deprecated.
      Prefer to use getAsTermVector() which has a more descriptive name
    • getNumTerms

      @Deprecated public int getNumTerms()
      Deprecated.
      Get the number of terms in the feature when interpreted as a term vector.
        • For a numeric feature, this will always be 1
        • For a categorical feature, this will always be 1
        • For a categorical set feature, this will be the number of terms in the set
        • For any vector feature, it will be the number of explicitly present elements in the vector
      Returns:
      the number of terms in this feature
      Throws:
      RuntimeException - if this instance stores a tensor of rank greater than 1
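      The term counts listed for getNumTerms follow naturally if each feature kind is viewed as a String-to-Float map. The sketch below assumes one plausible set of encodings (a numeric value under a single empty-string term, a categorical term mapped to 1.0, a categorical set with one entry per term); these encodings and the NumTermsSketch class are assumptions for illustration, not confirmed by this page.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Feathr.
class NumTermsSketch {
    // Assumed encoding: a numeric feature is one entry keyed by the empty string.
    static Map<String, Float> numeric(float value) {
        Map<String, Float> terms = new LinkedHashMap<>();
        terms.put("", value);
        return terms;
    }

    // Assumed encoding: a categorical feature is one entry, term -> 1.0f.
    static Map<String, Float> categorical(CharSequence term) {
        Map<String, Float> terms = new LinkedHashMap<>();
        terms.put(term.toString(), 1.0f);
        return terms;
    }

    // Assumed encoding: a categorical set has one entry per distinct term.
    static Map<String, Float> categoricalSet(Collection<? extends CharSequence> input) {
        Map<String, Float> terms = new LinkedHashMap<>();
        for (CharSequence term : input) {
            terms.put(term.toString(), 1.0f);
        }
        return terms;
    }

    // The number of terms is the number of entries in the map view.
    static int numTerms(Map<String, Float> termVector) {
        return termVector.size();
    }
}
```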
    • getFeatureType

      public FeatureType getFeatureType()
    • getAsTypedTensor

      public TypedTensor getAsTypedTensor()
      Returns:
      a Quince TypedTensor representation of this feature value
    • getAsTensorData

      public TensorData getAsTensorData()
      Returns:
      a Quince TensorData representation of this feature value
    • getAsTermVector

      public Map<String,Float> getAsTermVector()
      Gets the feature in an NTV term-vector representation.
      Returns:
      Map from String term names to Float values
      Throws:
      RuntimeException - if the feature is a tensor of rank greater than 1, then this feature cannot be represented as an NTV term vector
    • getAsNumeric

      public Float getAsNumeric()
      Gets the feature value as a number.
      Returns:
      the numeric value
      Throws:
      RuntimeException - if this is not a valid numeric feature
    • getAsString

      @Deprecated public String getAsString()
      Deprecated.
      use getAsCategorical() instead
      Gets a CATEGORICAL feature value represented as a string.
      Returns:
      string value
    • getAsCategorical

      public String getAsCategorical()
      Returns the single categorical value associated with the feature. Note: this method returns a String even if the feature was created using createCategorical(Number), because categorical features are meant to contain only discrete values, which are represented as Strings.
      Returns:
      the categorical term as a String
      Throws:
      RuntimeException - if this is not a valid categorical feature
    • getAsBoolean

      public Boolean getAsBoolean()
      Gets the feature value as a boolean.
      Returns:
      the boolean value
      Throws:
      RuntimeException - if this is not a valid boolean feature
    • createNumeric

      public static FeatureValue createNumeric(Number num)
      Creates a numeric feature.
      Parameters:
      num - Number
      Returns:
      FeatureValue
    • createCategorical

      @Deprecated public static FeatureValue createCategorical(Number term)
      Deprecated.
      Creates a categorical feature from a numeric term that is within some precision of a whole number. The term is stored as a String. Only whole-number terms are accepted because categorical features are meant to contain discrete values. A common use case for a numeric term is to represent some entity ID (e.g. y, x)
      Parameters:
      term - the numeric term, which will be stored as a String term
      Returns:
      a new instance of FeatureValue
      Throws:
      RuntimeException - if term is not within some precision of a whole number
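      The "within some precision of a whole number" rule can be sketched as a rounding check. The tolerance EPSILON and the class WholeNumberTermSketch are hypothetical; the actual precision used by Feathr is not stated on this page.

```java
// Hypothetical helper for illustration; not part of Feathr.
class WholeNumberTermSketch {
    // Assumed tolerance; the real precision is not documented here.
    static final double EPSILON = 1e-6;

    // Converts a numeric term to its String form, rejecting values that are
    // not close enough to a whole number (NaN and infinities are rejected too).
    static String toTerm(Number number) {
        double value = number.doubleValue();
        long rounded = Math.round(value);
        if (!(Math.abs(value - rounded) <= EPSILON)) {
            throw new RuntimeException(number + " is not within " + EPSILON + " of a whole number");
        }
        return Long.toString(rounded);
    }
}
```

      Under these assumptions, toTerm(3.0f) and toTerm(3) both yield the String "3", while toTerm(1.5f) throws.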
    • createCategorical

      public static FeatureValue createCategorical(CharSequence term)
      Creates a categorical feature from a string term
      Parameters:
      term - CharSequence
      Returns:
      a new instance of FeatureValue
    • createCategorical

      public static FeatureValue createCategorical(Character term)
      Creates a categorical feature from a character term
      Parameters:
      term - Character
      Returns:
      a new instance of FeatureValue
    • createDenseVector

      public static <T extends Number> FeatureValue createDenseVector(List<T> vec)
      Creates a dense vector feature from a List of numbers
      Type Parameters:
      T - the type of Number
      Parameters:
      vec - List
      Returns:
      a new instance of FeatureValue
    • createDenseVector

      public static <T extends Number> FeatureValue createDenseVector(T[] vec)
      Creates a dense vector feature from an array of numbers
      Type Parameters:
      T - the type of Number
      Parameters:
      vec - array of numbers
      Returns:
      a new instance of FeatureValue
    • createStringTermVector

      public static <K extends CharSequence, V extends Number> FeatureValue createStringTermVector(Map<K,V> inputMap)
      Creates a term vector feature (aka sparse vector) using strings for the terms
      Type Parameters:
      K - the string term type
      V - the numeric value type
      Parameters:
      inputMap - Map from a string term to numeric values
      Returns:
      a new instance of FeatureValue
    • createNumericTermVector

      public static <K extends Number, V extends Number> FeatureValue createNumericTermVector(Map<K,V> inputMap)
      Creates a term vector feature (aka sparse vector) from a numeric-to-numeric map
      Type Parameters:
      K - the numeric term type
      V - the numeric value type
      Parameters:
      inputMap - Map from a numeric term (within some precision of a whole number) to a numeric value
      Returns:
      a new instance of FeatureValue
      Throws:
      RuntimeException - if an input numeric key is not within some precision of a whole number (e.g. 1.5f)
      RuntimeException - if multiple input numeric keys map to the same whole number (e.g. 1.00f and 1.0000d)
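      The two failure cases listed for createNumericTermVector can be sketched as a conversion from numeric keys to String terms with a collision check. NumericTermVectorSketch and its EPSILON tolerance are hypothetical names chosen for this example; the sketch is not the Feathr implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Feathr.
class NumericTermVectorSketch {
    // Assumed tolerance; the real precision is not documented here.
    static final double EPSILON = 1e-6;

    // Converts numeric keys to whole-number String terms.
    // Fails on non-whole keys and on distinct keys that collapse to the same term.
    static Map<String, Float> toStringTerms(Map<? extends Number, ? extends Number> input) {
        Map<String, Float> result = new LinkedHashMap<>();
        for (Map.Entry<? extends Number, ? extends Number> entry : input.entrySet()) {
            double key = entry.getKey().doubleValue();
            long rounded = Math.round(key);
            if (!(Math.abs(key - rounded) <= EPSILON)) {
                throw new RuntimeException("key " + entry.getKey() + " is not a whole number");
            }
            String term = Long.toString(rounded);
            Float previous = result.put(term, entry.getValue().floatValue());
            if (previous != null) {
                throw new RuntimeException("multiple keys map to term " + term);
            }
        }
        return result;
    }
}
```

      The collision case arises because a Map<Number, Number> treats Float 1.0f and Double 1.0 as different keys, yet both collapse to the term "1".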
    • createStringTermVector

      public static <K extends CharSequence, V extends Number> FeatureValue createStringTermVector(Collection<Map<K,V>> termValues)
      Creates a term vector feature (AKA sparse vector) from multiple term vectors with string terms
      Type Parameters:
      K - the string term type
      V - the numeric value type
      Parameters:
      termValues - Collection of Maps from string terms to numeric values
      Returns:
      a new instance of FeatureValue
      Throws:
      RuntimeException - if an input numeric key is not within some precision of a whole number
      RuntimeException - if multiple input numeric keys map to the same whole number (e.g. 1.00f and 1.0000d)
    • createNumericTermVector

      public static <K extends Number, V extends Number> FeatureValue createNumericTermVector(Collection<Map<K,V>> termValues)
      Creates a term vector feature (aka sparse vector) by merging multiple numeric-to-numeric maps
      Type Parameters:
      K - the numeric term type
      V - the numeric value type
      Parameters:
      termValues - Collection of Maps from a numeric term to a numeric value
      Returns:
      a new instance of FeatureValue
      Throws:
      RuntimeException - if input numeric keys are not within some precision of a whole number (e.g. 1.5f)
      RuntimeException - if multiple input numeric keys map to the same whole number (e.g. 1.00f and 1.0000d)
    • createNumericCategoricalSet

      public static <T extends Number> FeatureValue createNumericCategoricalSet(Collection<T> terms)
      Creates a categorical set feature from a collection of Numbers that are whole numbers
      Type Parameters:
      T - the numeric type
      Parameters:
      terms - Collection of Number terms
      Returns:
      a new instance of FeatureValue
      Throws:
      RuntimeException - if input numeric elements are not within some precision of a whole number (e.g. 1.5f)
    • createStringCategoricalSet

      public static <T extends CharSequence> FeatureValue createStringCategoricalSet(Collection<T> terms)
      Creates a categorical set feature from string terms
      Type Parameters:
      T - the string term type
      Parameters:
      terms - Collection of CharSequence terms
      Returns:
      a new instance of FeatureValue
    • createCharacterCategoricalSet

      public static FeatureValue createCharacterCategoricalSet(Collection<Character> characters)
      Creates a categorical set feature from character terms
      Parameters:
      characters - Collection of Character
      Returns:
      a new instance of FeatureValue
    • createBoolean

      public static FeatureValue createBoolean(Boolean b)
      Creates a boolean feature.
      Parameters:
      b - Boolean
      Returns:
      a new instance of FeatureValue
    • createTensor

      public static FeatureValue createTensor(Object value, TensorType type)
      Creates a (tensor) FeatureValue based on the provided TensorType.
      Parameters:
      value - The data used to create the tensor.
      type - The corresponding type needed to create the tensor.
      Throws:
      IllegalArgumentException - if the data type is not supported, or if the requested tensor has 2 or more dimensions.
    • createTensor

      public static FeatureValue createTensor(TensorData tensorData)
      Creates a FeatureValue from a TensorData (adding a TensorType inferred from the primitives of the TensorData). NOTE that the TensorType will not have any shape, so it may NOT be reused for constructing new dense tensors from arrays and lists.
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • write

      public void write(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Output output)
      Specified by:
      write in interface com.esotericsoftware.kryo.KryoSerializable
    • read

      public void read(com.esotericsoftware.kryo.Kryo kryo, com.esotericsoftware.kryo.io.Input input)
      Specified by:
      read in interface com.esotericsoftware.kryo.KryoSerializable
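      The class description states that the serialized representation is based on NTV, which is why tensors of rank greater than 1 cannot be serialized. As a rough illustration of that idea, the sketch below round-trips a flat String-to-Float term vector through Java Object Serialization using only the standard library. NtvSerializationSketch is a hypothetical helper; it does not reproduce the write and read methods of this class or use Kryo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

// Hypothetical helper for illustration; not part of Feathr.
class NtvSerializationSketch {
    // Serializes a flat term vector to bytes and reads it back.
    // A flat map has one term per entry, so it can only carry rank-0 or rank-1 data.
    @SuppressWarnings("unchecked")
    static HashMap<String, Float> roundTrip(HashMap<String, Float> termVector) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(termVector);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HashMap<String, Float>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

      In this sketch the deserialized map is an ordinary mutable HashMap, which loosely mirrors the class-level remark that deserialized instances are mutable.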