Package com.linkedin.feathr.common
Class FeatureDependencyGraph
java.lang.Object
com.linkedin.feathr.common.FeatureDependencyGraph
A dependency graph for feature anchors and feature derivations.
Purpose 1: Given a list of features' dependencies and which features are anchored, build a graph that can determine
which features are reachable and can resolve features' transitive dependencies on demand.
Purpose 2: Given a list of features with entity key bindings (a.k.a. key tags), provide a fully expanded list of all
transitive dependencies including entity key bindings. E.g. if feature2 depends on feature1, and the
request was for [ (a1):feature2, (a2):feature2 ], then the expanded dependencies would be:
[ (a1):feature2, (a2):feature2, (a1):feature1, (a2):feature1]
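The expansion described in Purpose 2 can be sketched as follows. This is a minimal, self-contained illustration, not the class's actual implementation: `KeyTagExpansionSketch`, `Tagged`, and `expand` are hypothetical names, and the sketch assumes that every dependency simply inherits the key tag of the feature that requires it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class KeyTagExpansionSketch {
    // Hypothetical stand-in for a key-tagged feature such as (a1):feature2.
    static final class Tagged {
        final String keyTag;
        final String feature;

        Tagged(String keyTag, String feature) {
            this.keyTag = keyTag;
            this.feature = feature;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Tagged
                && ((Tagged) o).keyTag.equals(keyTag)
                && ((Tagged) o).feature.equals(feature);
        }

        @Override
        public int hashCode() {
            return Objects.hash(keyTag, feature);
        }

        @Override
        public String toString() {
            return "(" + keyTag + "):" + feature;
        }
    }

    // Breadth-first expansion: the request comes first, followed by every
    // transitive dependency, each carrying the key tag of its requester.
    // Assumption: key tags are passed through unchanged.
    static List<Tagged> expand(Map<String, List<String>> dependencies, List<Tagged> request) {
        Set<Tagged> seen = new LinkedHashSet<>(request);
        Deque<Tagged> queue = new ArrayDeque<>(request);
        while (!queue.isEmpty()) {
            Tagged current = queue.removeFirst();
            List<String> inputs =
                dependencies.getOrDefault(current.feature, Collections.<String>emptyList());
            for (String input : inputs) {
                Tagged tagged = new Tagged(current.keyTag, input);
                if (seen.add(tagged)) {
                    queue.addLast(tagged);
                }
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps =
            Collections.singletonMap("feature2", Collections.singletonList("feature1"));
        List<Tagged> request =
            Arrays.asList(new Tagged("a1", "feature2"), new Tagged("a2", "feature2"));
        // Prints [(a1):feature2, (a2):feature2, (a1):feature1, (a2):feature1]
        System.out.println(expand(deps, request));
    }
}
```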
-
Constructor Summary
Constructors
Constructor: FeatureDependencyGraph(Map<String, Set<ErasedEntityTaggedFeature>> dependencyFeatures, Collection<String> anchoredFeatures)
Description: Constructs a FeatureDependencyGraph for a given map of features' dependency relationships and a list of which features are anchored.
-
Method Summary
Modifier and Type, Method: Description
List<Set<TaggedFeatureName>> getComputationPipeline(Collection<TaggedFeatureName> requestedFeatures): Return a computation pipeline (a collection of stages that can be computed simultaneously) for the requested features by examining their dependencies and grouping them based on their depth.
getOrderedPlanForFeatureUrns: Returns an ordered list of features including the requested features and their dependencies that represents their execution order.
List<TaggedFeatureName> getOrderedPlanForRequest(Collection<TaggedFeatureName> request): Deprecated.
getPlan(Collection<String> features): Construct a plan for procuring a group of features.
boolean isDeclared(String feature): Returns whether a given feature name is present in the dependency graph.
boolean isReachable(String feature): Deprecated.
FeatureDependencyGraph.Pair<Boolean,String> isReachableWithErrorMessage(String feature): Returns whether a given feature is reachable.
String toString()
-
Constructor Details
-
FeatureDependencyGraph
public FeatureDependencyGraph(Map<String, Set<ErasedEntityTaggedFeature>> dependencyFeatures, Collection<String> anchoredFeatures)
Constructs a FeatureDependencyGraph for a given map of features' dependency relationships and a list of which features are anchored.
Parameters:
dependencyFeatures - Map of derived feature names to their sets of required inputs (described as TaggedFeatureNames)
anchoredFeatures - List of anchored feature names
-
-
Method Details
-
isDeclared
Returns whether a given feature name is present in the dependency graph.
Parameters:
feature -
Returns:
whether a given feature name is present in the dependency graph
-
isReachable
Deprecated.
Returns whether a given feature is reachable.
Parameters:
feature -
Returns:
Boolean indicating whether the feature is reachable
-
isReachableWithErrorMessage
public com.linkedin.feathr.common.FeatureDependencyGraph.Pair<Boolean,String> isReachableWithErrorMessage(String feature)
Returns whether a given feature is reachable.
Parameters:
feature -
Returns:
A Pair of Boolean and String. The Boolean indicates whether the feature is reachable, and the String holds the error message if it is not reachable.
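One plausible reading of "reachable" can be sketched as below. This is an assumption for illustration, not the documented behavior of the class: the sketch treats a feature as reachable when it is anchored, or when it is derived and all of its inputs are reachable. `ReachabilitySketch` is a hypothetical name, and `Map.Entry<Boolean, String>` stands in for the nested `Pair<Boolean,String>` type.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {
    // Returns (true, "") if reachable, otherwise (false, explanation).
    static Map.Entry<Boolean, String> isReachable(
            String feature, Map<String, Set<String>> dependencies, Set<String> anchored) {
        return visit(feature, dependencies, anchored, new HashSet<String>());
    }

    private static Map.Entry<Boolean, String> visit(
            String feature, Map<String, Set<String>> dependencies,
            Set<String> anchored, Set<String> inProgress) {
        // Assumption: an anchored feature is always reachable.
        if (anchored.contains(feature)) {
            return new SimpleImmutableEntry<>(true, "");
        }
        Set<String> inputs = dependencies.get(feature);
        if (inputs == null) {
            return new SimpleImmutableEntry<>(
                false, feature + " is neither anchored nor derived");
        }
        // Guard against cycles so the recursion always terminates.
        if (!inProgress.add(feature)) {
            return new SimpleImmutableEntry<>(
                false, feature + " is part of a dependency cycle");
        }
        try {
            // Assumption: a derived feature needs every one of its inputs.
            for (String input : inputs) {
                Map.Entry<Boolean, String> result =
                    visit(input, dependencies, anchored, inProgress);
                if (!result.getKey()) {
                    return new SimpleImmutableEntry<>(
                        false, feature + " requires " + input + ", but " + result.getValue());
                }
            }
            return new SimpleImmutableEntry<>(true, "");
        } finally {
            inProgress.remove(feature);
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = new HashMap<>();
        deps.put("ok", Collections.singleton("anchoredA"));
        deps.put("broken", new HashSet<>(Arrays.asList("anchoredA", "missing")));
        Set<String> anchored = Collections.singleton("anchoredA");

        System.out.println(isReachable("ok", deps, anchored).getKey());
        System.out.println(isReachable("broken", deps, anchored).getValue());
    }
}
```

Returning the message alongside the flag lets a caller report why a feature cannot be computed instead of only that it cannot.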
-
getPlan
Construct a plan for procuring a group of features. For a given group of features, what are all the features (transitive dependencies) that need to be procured in order to derive them? This function returns a complete, ordered list of features sufficient to derive the given group of features when evaluated in order.
Parameters:
features -
Returns:
Ordered list of feature names that can be resolved in sequence to produce a superset of the given group of features. The returned list is NOT just a re-ordering of the input features, and may contain other features that weren't specifically requested but are required as dependencies. The returned list will always contain all of the features provided in the input.
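An ordering with these properties can be produced by a depth-first, dependencies-first traversal. The sketch below is an illustration under assumptions, not the method's actual implementation: `PlanSketch` and `plan` are hypothetical names, and the dependency map (B needs A, C needs B, D needs A and C) is made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PlanSketch {
    // Returns the requested features plus all transitive dependencies,
    // ordered so that every feature appears after its inputs.
    static List<String> plan(Map<String, List<String>> dependencies,
                             Collection<String> requested) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String feature : requested) {
            visit(feature, dependencies, ordered, new HashSet<String>());
        }
        return new ArrayList<>(ordered);
    }

    private static void visit(String feature, Map<String, List<String>> dependencies,
                              Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(feature)) {
            return; // already planned
        }
        if (!inProgress.add(feature)) {
            throw new IllegalStateException("Dependency cycle at " + feature);
        }
        List<String> inputs =
            dependencies.getOrDefault(feature, Collections.<String>emptyList());
        for (String input : inputs) {
            visit(input, dependencies, ordered, inProgress);
        }
        inProgress.remove(feature);
        ordered.add(feature); // post-order: inputs were added first
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("B", Arrays.asList("A"));
        deps.put("C", Arrays.asList("B"));
        deps.put("D", Arrays.asList("A", "C"));
        // Prints [A, B, C, D]: a superset of the request, in resolvable order.
        System.out.println(plan(deps, Arrays.asList("D")));
    }
}
```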
-
getOrderedPlanForRequest
@Deprecated public List<TaggedFeatureName> getOrderedPlanForRequest(Collection<TaggedFeatureName> request)
Deprecated.
-
getOrderedPlanForFeatureUrns
Returns an ordered list of features, comprising the requested features and their dependencies, that represents their execution order. For example, if the feature dependencies are A->B, B->C and (A,C) -> D, then one possible execution order is: A, B, C, D
-
getComputationPipeline
public List<Set<TaggedFeatureName>> getComputationPipeline(Collection<TaggedFeatureName> requestedFeatures)
Returns a computation pipeline (an ordered collection of stages, each containing features that can be computed simultaneously) for the requested features by examining their dependencies and grouping them by depth. Within each stage, every feature has its direct dependencies resolved by earlier stages, so all features in the stage can be computed simultaneously.
For example, suppose the feature dependencies are as follows and features E and F are requested:
A -> C
B -> D
(C,D) -> E
F
Then the result is returned as an ordered list:
[[A,B,F],[C,D],[E]]
which represents
stage 1: A, B, F
stage 2: C, D
stage 3: E
Disclaimer: the pipeline-based approach provides one way to optimize the execution of features and their dependencies, but it is by no means optimal. In the example above, if A is very slow compared to B and F, then the computation of D is blocked until A is ready, even though D depends only on B. The optimal solution would compute each feature in isolation, all the way from its root dependencies.
Returns:
an ordered list of feature sets representing the stages of feature execution
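The depth-based grouping can be sketched as follows. This is a minimal illustration under assumptions, not the method's actual implementation: `PipelineSketch` and `pipeline` are hypothetical names, plain strings replace `TaggedFeatureName`, and depth is taken to be 0 for a feature with no inputs and otherwise one more than the deepest input. The sketch also assumes the dependency map is acyclic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PipelineSketch {
    // Groups the requested features and their transitive dependencies into
    // stages; stage i holds every feature whose depth is i.
    static List<Set<String>> pipeline(Map<String, List<String>> dependencies,
                                      Collection<String> requested) {
        Map<String, Integer> depth = new HashMap<>();
        for (String feature : requested) {
            depthOf(feature, dependencies, depth);
        }
        List<Set<String>> stages = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : depth.entrySet()) {
            while (stages.size() <= entry.getValue()) {
                stages.add(new TreeSet<String>()); // sorted for stable output
            }
            stages.get(entry.getValue()).add(entry.getKey());
        }
        return stages;
    }

    // Memoized depth computation. Assumption: no dependency cycles, since a
    // cycle would make this recursion unbounded.
    private static int depthOf(String feature, Map<String, List<String>> dependencies,
                               Map<String, Integer> depth) {
        Integer known = depth.get(feature);
        if (known != null) {
            return known;
        }
        int result = 0;
        List<String> inputs =
            dependencies.getOrDefault(feature, Collections.<String>emptyList());
        for (String input : inputs) {
            result = Math.max(result, depthOf(input, dependencies, depth) + 1);
        }
        depth.put(feature, result);
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("C", Arrays.asList("A"));
        deps.put("D", Arrays.asList("B"));
        deps.put("E", Arrays.asList("C", "D"));
        // Prints [[A, B, F], [C, D], [E]]
        System.out.println(pipeline(deps, Arrays.asList("E", "F")));
    }
}
```

Because a feature's depth exceeds that of each of its inputs, every input lands in an earlier stage, which is the property that lets all features within one stage run concurrently.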
-
toString
-