Package com.linkedin.feathr.common.util
Class MvelContextUDFs
java.lang.Object
com.linkedin.feathr.common.util.MvelContextUDFs
MVEL is an open-source expression language and runtime that makes it easy to write concise statements that operate
on structured data objects (such as Avro records), among other things.
This class contains all the UDFs used in MVEL expressions, both online and offline.
-
Nested Class Summary
Nested Classes -
Method Summary
Modifier and Type | Method | Description
static boolean | and(boolean left, boolean right) |
static Double | cast_double(Object input) | Cast the input to double.
static Float | cast_float(Object input) | Cast the input to float.
static Integer | cast_int | Cast the input to Integer.
static String | concat | Concatenate two strings into one.
static Float | cosineSimilarity(Object obj1, Object obj2) |
static int | dayofmonth(Object input) |
static int | dayofweek |
static Collection<Object> | distinct(Collection<Object> collection) |
static Double | dotProduct(Object obj1, Object obj2) | Returns a standard dotProduct of two vector objects.
 | extract_term_value_from_array(ArrayList<org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema> array, String termFieldName, String valueFieldName) |
 | extract_term_value_from_array(ArrayList<org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema> array, String termFieldName, String valueFieldName, String filterExpr) |
static Collection<Object> | flatten(Collection<? extends Collection<Object>> collection) |
static String | get_data_type(Object input) | Get the class type of the input object. WARNING: This is only used for debug for users.
static Collection<String> | getTopKTerms(Object item, int k) | Return an ordered list of terms, based on descending order of corresponding values.
static String | getTopTerm(Object item) | Get the term with the highest value (duplicates are resolved randomly).
static int | hourofday |
static boolean | if_else(boolean input, boolean first, boolean second) |
static Double | if_else |
static Float | if_else |
static Integer | if_else |
static String | if_else | Ternary operator.
static boolean | isNonZero | Return true if a feature has at least 1 term with value not zero.
static boolean | isnotnull |
static boolean | isnull |
static boolean | isPresent | Return true if a feature variable is not null.
static boolean | not(boolean input) |
static boolean | or(boolean left, boolean right) |
static void | registerUDFs(Class<?> clazz, org.mvel2.ParserConfiguration parserConfig) |
static long | time_duration(Object startTime, Object endTime, String outputGranularity) |
static Boolean | toBoolean | Converts an object to a boolean.
static Object | toCategorical(Object item) | Convert input to categorical feature. Example inputs that can be converted: Map("a" -> 2.0) returns Map("a" -> 2.0); Map("" -> 2.5) returns ("2.5", 1.0).
static String | toLowerCase(String input) | Convert input to lower case string.
static Object | toNumeric | Convert input to numeric value. Example inputs that can be converted: Map("" -> 2.0) returns 2.0; Map("876" -> 1.0) returns 876.
static String | toUpperCase(String input) | Convert input to upper case string.
-
Method Details
-
registerUDFs
-
get_data_type
Get the class type of the input object. WARNING: This is only used for debug for users.
Returns:
- Type in String form
-
cast_double
Cast the input to double. If the input is null, null is returned. If it is a string, the method tries to parse it. Number types undergo standard conversion. Other types are coerced to double.
-
cast_float
Cast the input to float. If the input is null, null is returned. If it is a string, the method tries to parse it. Number types undergo standard conversion. Other types are coerced to float.
-
cast_int
Cast the input to Integer. If the input is null, null is returned. If it is a string, the method tries to parse it. Number types undergo standard conversion. Other types are coerced to Integer.
-
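The null / string / number rules shared by the cast_* methods could be sketched roughly as below. This is an illustration only, not the code in MvelContextUDFs: the class name CastSketch and method name castDouble are hypothetical, and because the source does not say how "other types" are coerced or what happens when a string cannot be parsed, the sketch assumes that parsing the toString() form is acceptable and lets NumberFormatException propagate.

```java
// Hypothetical sketch of the casting rules described above;
// NOT the actual MvelContextUDFs implementation.
final class CastSketch {
    private CastSketch() {}

    static Double castDouble(Object input) {
        if (input == null) {
            return null; // null in, null out
        }
        if (input instanceof Number) {
            // Standard numeric conversion for Integer, Long, Float, ...
            return ((Number) input).doubleValue();
        }
        // Strings are parsed. For any other type this sketch ASSUMES that
        // parsing the toString() form is an acceptable coercion.
        return Double.parseDouble(input.toString().trim());
    }

    public static void main(String[] args) {
        System.out.println(castDouble("3.5"));
        System.out.println(castDouble(7));
        System.out.println(castDouble(null));
    }
}
```

The float and Integer variants would presumably follow the same shape with a different target type.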
and
public static boolean and(boolean left, boolean right) -
or
public static boolean or(boolean left, boolean right) -
not
public static boolean not(boolean input) -
isnull
-
isnotnull
-
concat
Concatenate two strings into one. -
if_else
Ternary operator. If input evaluates to true, first is returned; otherwise second is returned.
-
if_else
-
if_else
-
if_else
-
if_else
public static boolean if_else(boolean input, boolean first, boolean second) -
isNonZero
Return true if a feature has at least one term with a nonzero value. To check whether a feature is null, use isPresent().
-
isPresent
Return true if a feature variable is not null.
-
toBoolean
Converts an object to a boolean.
-
toNumeric
Convert input to numeric value. Example inputs that can be converted:
- Map("" -> 2.0) returns 2.0
- Map("876" -> 1.0) returns 876
-
toCategorical
Convert input to categorical feature. Example inputs that can be converted:
- Map("a" -> 2.0) returns Map("a" -> 2.0)
- Map("" -> 2.5) returns ("2.5", 1.0)
-
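The toNumeric and toCategorical examples above can be reproduced with a small sketch. It assumes the feature arrives as a Map<String, Float> term vector, whereas the real methods take and return Object. The class name TermConversionSketch is hypothetical, and behavior for inputs not covered by the documented examples (for instance a multi-entry map passed to toNumeric) is an assumption of this sketch.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch reproducing the documented examples only;
// NOT the actual MvelContextUDFs implementation.
final class TermConversionSketch {
    private TermConversionSketch() {}

    // Map("" -> 2.0) gives 2.0; Map("876" -> 1.0) gives 876.
    static float toNumeric(Map<String, Float> feature) {
        if (feature.size() != 1) {
            // ASSUMPTION: only single-entry maps are convertible.
            throw new IllegalArgumentException("expected exactly one term");
        }
        Map.Entry<String, Float> e = feature.entrySet().iterator().next();
        // An empty term means the value carries the number;
        // otherwise the term itself is parsed as the number.
        return e.getKey().isEmpty() ? e.getValue() : Float.parseFloat(e.getKey());
    }

    // Map("a" -> 2.0) is returned as is; Map("" -> 2.5) gives ("2.5", 1.0).
    static Map<String, Float> toCategorical(Map<String, Float> feature) {
        if (feature.size() == 1 && feature.containsKey("")) {
            // The numeric value becomes the term, with weight 1.0.
            return Collections.singletonMap(String.valueOf(feature.get("")), 1.0f);
        }
        return feature;
    }

    public static void main(String[] args) {
        System.out.println(toNumeric(Collections.singletonMap("876", 1.0f)));
        System.out.println(toCategorical(Collections.singletonMap("", 2.5f)));
    }
}
```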
getTerms
-
getTopKTerms
Return an ordered list of terms, based on descending order of corresponding values.
Parameters:
- item - Object that can be converted to a Map of String to Float
- k - integer; if k < 0, selection is done in reverse order, e.g. -1 selects from the bottom first
Returns:
- List of strings, ordered
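One way to picture the ordering and the negative-k rule is the sketch below. It assumes the item has already been converted to a Map<String, Float>, and it reads "reverse order selection" as "take the |k| lowest-valued terms, lowest first"; that reading, the class name TopKTermsSketch, and the tie order are assumptions rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of top-k term selection;
// NOT the actual MvelContextUDFs implementation.
final class TopKTermsSketch {
    private TopKTermsSketch() {}

    static List<String> topKTerms(Map<String, Float> termValues, int k) {
        Comparator<Map.Entry<String, Float>> byValue = Map.Entry.comparingByValue();
        List<Map.Entry<String, Float>> entries = new ArrayList<>(termValues.entrySet());
        // k >= 0: highest values first; k < 0 (ASSUMED meaning): lowest values first.
        entries.sort(k < 0 ? byValue : byValue.reversed());

        // Take at most |k| entries; the long cast avoids overflow for Integer.MIN_VALUE.
        int limit = (int) Math.min(Math.abs((long) k), entries.size());
        List<String> terms = new ArrayList<>(limit);
        for (Map.Entry<String, Float> e : entries.subList(0, limit)) {
            terms.add(e.getKey());
        }
        return terms;
    }

    public static void main(String[] args) {
        Map<String, Float> m = Map.of("a", 1.0f, "b", 3.0f, "c", 2.0f);
        System.out.println(topKTerms(m, 2));
        System.out.println(topKTerms(m, -1));
    }
}
```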
-
getTopTerm
Get the term with the highest value (duplicates are resolved randomly).
Parameters:
- item - Object that can be converted to a Map of String to Float
Returns:
- String
-
distinct
-
flatten
-
cosineSimilarity
-
dotProduct
Returns a standard dotProduct of two vector objects. Use cosineSimilarity(Object, Object) for normalized dot-product.
-
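The relationship between the two methods (cosine similarity as a dot product normalized by the vector lengths) can be sketched as follows. The source does not say which vector representations are accepted, so this sketch assumes sparse term vectors of type Map<String, Float>. It also returns double from both methods for simplicity and returns 0 when either vector has zero length; both choices, and the class name VectorSimilaritySketch, are assumptions.

```java
import java.util.Map;

// Hypothetical sketch over sparse term vectors;
// NOT the actual MvelContextUDFs implementation.
final class VectorSimilaritySketch {
    private VectorSimilaritySketch() {}

    // Sum of products over the terms present in both vectors.
    static double dotProduct(Map<String, Float> a, Map<String, Float> b) {
        double sum = 0.0;
        for (Map.Entry<String, Float> e : a.entrySet()) {
            Float other = b.get(e.getKey());
            if (other != null) {
                sum += e.getValue().doubleValue() * other;
            }
        }
        return sum;
    }

    // Dot product divided by the product of the two Euclidean norms.
    static double cosineSimilarity(Map<String, Float> a, Map<String, Float> b) {
        double norms = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
        // ASSUMPTION: a zero-length vector yields similarity 0 rather than NaN.
        return norms == 0.0 ? 0.0 : dotProduct(a, b) / norms;
    }

    public static void main(String[] args) {
        Map<String, Float> a = Map.of("x", 3.0f, "y", 4.0f);
        Map<String, Float> b = Map.of("x", 1.0f);
        System.out.println(dotProduct(a, b));
        System.out.println(cosineSimilarity(a, b));
    }
}
```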
toLowerCase
Convert input to lower case string.
Parameters:
- input - input string
Returns:
- lower case input
-
toUpperCase
Convert input to upper case string.
Parameters:
- input - input string
Returns:
- upper case input
-
time_duration
-
dayofweek
-
dayofmonth
-
hourofday
-
extract_term_value_from_array
-
extract_term_value_from_array
-