Packages

  • package spark

    CatBoost is a machine learning algorithm that uses gradient boosting on decision trees.

    Overview

    This package provides classes that implement interfaces from the Apache Spark Machine Learning Library (MLlib).

    For binary and multiclass classification problems, use CatBoostClassifier; for regression, use CatBoostRegressor.

    These classes implement the usual fit method of org.apache.spark.ml.Predictor, which accepts a single org.apache.spark.sql.DataFrame for training. You can also use another fit method that accepts additional datasets for computing evaluation metrics and for overfitting detection, similarly to CatBoost's other APIs.

    This package also contains the Pool class, which is CatBoost's abstraction of a dataset. It holds additional information compared to a plain org.apache.spark.sql.DataFrame.

    You can also create a Pool with quantized features before training by calling the quantize method. This is useful if the dataset is used for training multiple times and the quantization parameters do not change. A pre-quantized Pool caches the quantized feature data, so the feature quantization step is not re-run at the start of each training.
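    The reuse pattern described here might look like the following sketch. It assumes that quantize can be called without arguments and returns a new Pool; the object and method names (QuantizeOnceSketch, trainSeveral) are hypothetical and exist only for illustration.

```scala
import ai.catboost.spark.{CatBoostClassificationModel, CatBoostClassifier, Pool}

object QuantizeOnceSketch {
  // Quantizes the pool a single time, then trains one model per iteration count
  // on the same pre-quantized pool.
  def trainSeveral(trainPool: Pool, iterationCounts: Seq[Int]): Seq[CatBoostClassificationModel] = {
    // Assumption: a no-argument quantize() call uses default quantization parameters.
    val quantizedPool = trainPool.quantize()
    iterationCounts.map { n =>
      new CatBoostClassifier().setIterations(n).fit(quantizedPool)
    }
  }
}
```

    Every training run in the loop shares the same quantization parameters, which is the situation where pre-quantizing is described as useful.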

    Detailed documentation is available at https://catboost.ai/docs/

    Definition Classes
    catboost
  • package params
    Definition Classes
    spark
  • CatBoostClassificationModel
  • CatBoostClassifier
  • CatBoostPredictorTrait
  • CatBoostRegressionModel
  • CatBoostRegressor
  • Pool

class CatBoostClassifier extends ProbabilisticClassifier[Vector, CatBoostClassifier, CatBoostClassificationModel] with CatBoostPredictorTrait[CatBoostClassifier, CatBoostClassificationModel] with ClassifierTrainingParamsTrait

Estimator that trains a CatBoostClassificationModel.

The default optimized loss function depends on various conditions:

  • Logloss — The label column has only two different values or the targetBorder parameter is specified.
  • MultiClass — The label column has more than two different values and the targetBorder parameter is not specified.
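The selection rule above can be restated as a small pure function. This is only a sketch of the documented conditions, not CatBoost's actual implementation; DefaultLossSketch and defaultLossFunction are hypothetical names, and the case of fewer than two label values with no targetBorder is left undefined because the description does not cover it.

```scala
// Hypothetical helper (not part of the CatBoost API) that restates the rule above.
object DefaultLossSketch {
  // distinctLabelCount: number of different values in the label column.
  // targetBorder: Some(value) if the targetBorder parameter is specified, None otherwise.
  def defaultLossFunction(distinctLabelCount: Long, targetBorder: Option[Float]): Option[String] =
    if (targetBorder.isDefined || distinctLabelCount == 2L) Some("Logloss")
    else if (distinctLabelCount > 2L) Some("MultiClass")
    else None // not covered by the description above
}
```

For example, a label column with three distinct values and no targetBorder maps to MultiClass, while the same column with targetBorder specified maps to Logloss.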

Examples

Binary classification.

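import ai.catboost.spark.{CatBoostClassifier, Pool}
import org.apache.spark.ml.linalg.{SQLDataTypes, Vectors}
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Local Spark session for the example.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("ClassifierTest")
  .getOrCreate()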

val srcDataSchema = Seq(
  StructField("features", SQLDataTypes.VectorType),
  StructField("label", StringType)
)

val trainData = Seq(
  Row(Vectors.dense(0.1, 0.2, 0.11), "0"),
  Row(Vectors.dense(0.97, 0.82, 0.33), "1"),
  Row(Vectors.dense(0.13, 0.22, 0.23), "1"),
  Row(Vectors.dense(0.8, 0.62, 0.0), "0")
)

val trainDf = spark.createDataFrame(spark.sparkContext.parallelize(trainData), StructType(srcDataSchema))
val trainPool = new Pool(trainDf)

val evalData = Seq(
  Row(Vectors.dense(0.22, 0.33, 0.9), "1"),
  Row(Vectors.dense(0.11, 0.1, 0.21), "0"),
  Row(Vectors.dense(0.77, 0.0, 0.0), "1")
)

val evalDf = spark.createDataFrame(spark.sparkContext.parallelize(evalData), StructType(srcDataSchema))
val evalPool = new Pool(evalDf)

val classifier = new CatBoostClassifier
val model = classifier.fit(trainPool, Array[Pool](evalPool))
val predictions = model.transform(evalPool.data)
predictions.show()

Multiclass classification.

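import ai.catboost.spark.{CatBoostClassifier, Pool}
import org.apache.spark.ml.linalg.{SQLDataTypes, Vectors}
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Local Spark session for the example.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("ClassifierTest")
  .getOrCreate()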

val srcDataSchema = Seq(
  StructField("features", SQLDataTypes.VectorType),
  StructField("label", StringType)
)

val trainData = Seq(
  Row(Vectors.dense(0.1, 0.2, 0.11), "1"),
  Row(Vectors.dense(0.97, 0.82, 0.33), "2"),
  Row(Vectors.dense(0.13, 0.22, 0.23), "1"),
  Row(Vectors.dense(0.8, 0.62, 0.0), "0")
)

val trainDf = spark.createDataFrame(spark.sparkContext.parallelize(trainData), StructType(srcDataSchema))
val trainPool = new Pool(trainDf)

val evalData = Seq(
  Row(Vectors.dense(0.22, 0.33, 0.9), "2"),
  Row(Vectors.dense(0.11, 0.1, 0.21), "0"),
  Row(Vectors.dense(0.77, 0.0, 0.0), "1")
)

val evalDf = spark.createDataFrame(spark.sparkContext.parallelize(evalData), StructType(srcDataSchema))
val evalPool = new Pool(evalDf)

val classifier = new CatBoostClassifier
val model = classifier.fit(trainPool, Array[Pool](evalPool))
val predictions = model.transform(evalPool.data)
predictions.show()

Serialization

Supports standard Spark MLlib serialization. Data can be saved to a distributed filesystem such as HDFS or to local files.

Examples

Save:

val classifier = new CatBoostClassifier().setIterations(100)
val path = "/home/user/catboost_classifiers/classifier0"
classifier.write.save(path)

Load:

val path = "/home/user/catboost_classifiers/classifier0"
val classifier = CatBoostClassifier.load(path)
val trainPool : Pool = ... init Pool ...
val model = classifier.fit(trainPool)
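
As a sketch, the save and load steps can be combined into one round trip that checks a parameter survives serialization. It relies on the standard Spark MLWriter overwrite() call and on CatBoostClassifier.load as used above; the object name, the temporary directory, and the local session settings are assumptions made for illustration.

```scala
import java.nio.file.Files

import ai.catboost.spark.CatBoostClassifier
import org.apache.spark.sql.SparkSession

object SaveLoadRoundTripSketch {
  // Saves a configured estimator to a temporary local path and loads it back.
  def roundTrip(): CatBoostClassifier = {
    // MLlib writers need an active SparkSession; a local one is enough for a sketch.
    SparkSession.builder().master("local[*]").appName("SaveLoadSketch").getOrCreate()

    val path = Files.createTempDirectory("catboost_classifiers").resolve("classifier0").toString
    val classifier = new CatBoostClassifier().setIterations(100)

    // overwrite() avoids a failure if the target path already exists.
    classifier.write.overwrite().save(path)
    CatBoostClassifier.load(path)
  }
}
```

The loaded estimator should report the same iterations value that was set before saving, which can be checked with getIterations.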

Linear Supertypes
ClassifierTrainingParamsTrait, TrainingParamsTrait, QuantizationParamsTrait, ThreadCountParams, IgnoredFeaturesParams, CatBoostPredictorTrait[CatBoostClassifier, CatBoostClassificationModel], DefaultParamsWritable, MLWritable, DatasetParamsTrait, HasWeightCol, ProbabilisticClassifier[Vector, CatBoostClassifier, CatBoostClassificationModel], ProbabilisticClassifierParams, HasThresholds, HasProbabilityCol, Classifier[Vector, CatBoostClassifier, CatBoostClassificationModel], ClassifierParams, HasRawPredictionCol, Predictor[Vector, CatBoostClassifier, CatBoostClassificationModel], PredictorParams, HasPredictionCol, HasFeaturesCol, HasLabelCol, Estimator[CatBoostClassificationModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new CatBoostClassifier()
  2. new CatBoostClassifier(uid: String)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. final val allowConstLabel: BooleanParam
    Definition Classes
    TrainingParamsTrait
  6. final val allowWritingFiles: BooleanParam
    Definition Classes
    TrainingParamsTrait
  7. final val approxOnFullHistory: BooleanParam
    Definition Classes
    TrainingParamsTrait
  8. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  9. final val autoClassWeights: EnumParam[EAutoClassWeightsType]
  10. final val baggingTemperature: FloatParam
    Definition Classes
    TrainingParamsTrait
  11. final val bestModelMinTrees: IntParam
    Definition Classes
    TrainingParamsTrait
  12. final val bootstrapType: EnumParam[EBootstrapType]
    Definition Classes
    TrainingParamsTrait
  13. final val borderCount: IntParam
    Definition Classes
    QuantizationParamsTrait
  14. final val classNames: StringArrayParam
  15. final val classWeightsList: DoubleArrayParam
  16. final val classWeightsMap: OrderedStringMapParam[Float]
  17. final val classesCount: IntParam
  18. final def clear(param: Param[_]): CatBoostClassifier.this.type
    Definition Classes
    Params
  19. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  20. def copy(extra: ParamMap): CatBoostClassifier
    Definition Classes
    CatBoostClassifier → Predictor → Estimator → PipelineStage → Params
  21. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  22. def createModel(nativeModel: TFullModel): CatBoostClassificationModel
    Attributes
    protected
    Definition Classes
    CatBoostClassifier → CatBoostPredictorTrait
  23. final val customMetric: StringArrayParam
    Definition Classes
    TrainingParamsTrait
  24. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  25. final val depth: IntParam
    Definition Classes
    TrainingParamsTrait
  26. final val diffusionTemperature: FloatParam
    Definition Classes
    TrainingParamsTrait
  27. final val earlyStoppingRounds: IntParam
    Definition Classes
    TrainingParamsTrait
  28. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  29. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  30. final val evalMetric: Param[String]
    Definition Classes
    TrainingParamsTrait
  31. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  32. def explainParams(): String
    Definition Classes
    Params
  33. def extractLabeledPoints(dataset: Dataset[_], numClasses: Int): RDD[LabeledPoint]
    Attributes
    protected
    Definition Classes
    Classifier
  34. def extractLabeledPoints(dataset: Dataset[_]): RDD[LabeledPoint]
    Attributes
    protected
    Definition Classes
    Predictor
  35. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  36. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  37. final val featureBorderType: EnumParam[EBorderSelectionType]
    Definition Classes
    QuantizationParamsTrait
  38. final val featureWeightsList: DoubleArrayParam
    Definition Classes
    TrainingParamsTrait
  39. final val featureWeightsMap: OrderedStringMapParam[Float]
    Definition Classes
    TrainingParamsTrait
  40. final val featuresCol: Param[String]
    Definition Classes
    HasFeaturesCol
  41. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  42. final val firstFeatureUsePenaltiesList: DoubleArrayParam
    Definition Classes
    TrainingParamsTrait
  43. final val firstFeatureUsePenaltiesMap: OrderedStringMapParam[Float]
    Definition Classes
    TrainingParamsTrait
  44. def fit(trainPool: Pool, evalPools: Array[Pool] = Array[Pool]()): CatBoostClassificationModel

    Additional variant of the fit method that accepts CatBoost's Pools and allows specifying additional datasets for computing evaluation metrics and for overfitting detection, similarly to CatBoost's other APIs.

    trainPool

    The input training dataset.

    evalPools

    The validation datasets used for the following processes:

    • overfitting detector
    • best iteration selection
    • monitoring metrics' changes
    returns

    trained model

    Definition Classes
    CatBoostPredictorTrait
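
    A minimal sketch of this fit variant with one validation pool follows. The setters used are listed among the members on this page; the parameter values, the object name, and the helper names are illustrative assumptions, not recommended settings.

```scala
import ai.catboost.spark.{CatBoostClassificationModel, CatBoostClassifier, Pool}

object FitWithValidationSketch {
  // Illustrative settings only; the values are assumptions, not recommended defaults.
  def configuredClassifier(): CatBoostClassifier =
    new CatBoostClassifier()
      .setIterations(200)
      .setLearningRate(0.1f)
      .setEarlyStoppingRounds(20)

  // Trains on trainPool and passes evalPool as the single validation dataset.
  def trainWithValidation(trainPool: Pool, evalPool: Pool): CatBoostClassificationModel =
    configuredClassifier().fit(trainPool, Array[Pool](evalPool))
}
```

    Passing the validation pool through evalPools is what makes the overfitting detector and the metric monitoring described above applicable; with the default empty array, training uses only trainPool.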
  45. def fit(dataset: Dataset[_]): CatBoostClassificationModel
    Definition Classes
    Predictor → Estimator
  46. def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[CatBoostClassificationModel]
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  47. def fit(dataset: Dataset[_], paramMap: ParamMap): CatBoostClassificationModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  48. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): CatBoostClassificationModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  49. final val foldLenMultiplier: FloatParam
    Definition Classes
    TrainingParamsTrait
  50. final val foldPermutationBlock: IntParam
    Definition Classes
    TrainingParamsTrait
  51. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  52. final def getAllowConstLabel: Boolean
    Definition Classes
    TrainingParamsTrait
  53. final def getAllowWritingFiles: Boolean
    Definition Classes
    TrainingParamsTrait
  54. final def getApproxOnFullHistory: Boolean
    Definition Classes
    TrainingParamsTrait
  55. final def getAutoClassWeights: EAutoClassWeightsType
  56. final def getBaggingTemperature: Float
    Definition Classes
    TrainingParamsTrait
  57. final def getBestModelMinTrees: Int
    Definition Classes
    TrainingParamsTrait
  58. final def getBootstrapType: EBootstrapType
    Definition Classes
    TrainingParamsTrait
  59. final def getBorderCount: Int
    Definition Classes
    QuantizationParamsTrait
  60. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  61. final def getClassNames: Array[String]
  62. final def getClassWeightsList: Array[Double]
  63. final def getClassWeightsMap: LinkedHashMap[String, Float]
  64. final def getClassesCount: Int
  65. final def getCustomMetric: Array[String]
    Definition Classes
    TrainingParamsTrait
  66. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  67. final def getDepth: Int
    Definition Classes
    TrainingParamsTrait
  68. final def getDiffusionTemperature: Float
    Definition Classes
    TrainingParamsTrait
  69. final def getEarlyStoppingRounds: Int
    Definition Classes
    TrainingParamsTrait
  70. final def getEvalMetric: String
    Definition Classes
    TrainingParamsTrait
  71. final def getFeatureBorderType: EBorderSelectionType
    Definition Classes
    QuantizationParamsTrait
  72. final def getFeatureWeightsList: Array[Double]
    Definition Classes
    TrainingParamsTrait
  73. final def getFeatureWeightsMap: LinkedHashMap[String, Float]
    Definition Classes
    TrainingParamsTrait
  74. final def getFeaturesCol: String
    Definition Classes
    HasFeaturesCol
  75. final def getFirstFeatureUsePenaltiesList: Array[Double]
    Definition Classes
    TrainingParamsTrait
  76. final def getFirstFeatureUsePenaltiesMap: LinkedHashMap[String, Float]
    Definition Classes
    TrainingParamsTrait
  77. final def getFoldLenMultiplier: Float
    Definition Classes
    TrainingParamsTrait
  78. final def getFoldPermutationBlock: Int
    Definition Classes
    TrainingParamsTrait
  79. final def getHasTime: Boolean
    Definition Classes
    TrainingParamsTrait
  80. final def getIgnoredFeaturesIndices: Array[Int]
    Definition Classes
    IgnoredFeaturesParams
  81. final def getIgnoredFeaturesNames: Array[String]
    Definition Classes
    IgnoredFeaturesParams
  82. final def getInputBorders: String
    Definition Classes
    QuantizationParamsTrait
  83. final def getIterations: Int
    Definition Classes
    TrainingParamsTrait
  84. final def getL2LeafReg: Float
    Definition Classes
    TrainingParamsTrait
  85. final def getLabelCol: String
    Definition Classes
    HasLabelCol
  86. final def getLeafEstimationBacktracking: ELeavesEstimationStepBacktracking
    Definition Classes
    TrainingParamsTrait
  87. final def getLeafEstimationIterations: Int
    Definition Classes
    TrainingParamsTrait
  88. final def getLeafEstimationMethod: ELeavesEstimation
    Definition Classes
    TrainingParamsTrait
  89. final def getLearningRate: Float
    Definition Classes
    TrainingParamsTrait
  90. final def getLoggingLevel: ELoggingLevel
    Definition Classes
    TrainingParamsTrait
  91. final def getLossFunction: String
    Definition Classes
    TrainingParamsTrait
  92. final def getMetricPeriod: Int
    Definition Classes
    TrainingParamsTrait
  93. final def getModelShrinkMode: EModelShrinkMode
    Definition Classes
    TrainingParamsTrait
  94. final def getModelShrinkRate: Float
    Definition Classes
    TrainingParamsTrait
  95. final def getMvsReg: Float
    Definition Classes
    TrainingParamsTrait
  96. final def getNanMode: ENanMode
    Definition Classes
    QuantizationParamsTrait
  97. def getNumClasses(dataset: Dataset[_], maxNumClasses: Int): Int
    Attributes
    protected
    Definition Classes
    Classifier
  98. final def getOdPval: Float
    Definition Classes
    TrainingParamsTrait
  99. final def getOdType: EOverfittingDetectorType
    Definition Classes
    TrainingParamsTrait
  100. final def getOdWait: Int
    Definition Classes
    TrainingParamsTrait
  101. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  102. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  103. final def getPenaltiesCoefficient: Float
    Definition Classes
    TrainingParamsTrait
  104. final def getPerFloatFeatureQuantizaton: Array[String]
    Definition Classes
    QuantizationParamsTrait
  105. final def getPerObjectFeaturePenaltiesList: Array[Double]
    Definition Classes
    TrainingParamsTrait
  106. final def getPerObjectFeaturePenaltiesMap: LinkedHashMap[String, Float]
    Definition Classes
    TrainingParamsTrait
  107. final def getPredictionCol: String
    Definition Classes
    HasPredictionCol
  108. final def getProbabilityCol: String
    Definition Classes
    HasProbabilityCol
  109. final def getRandomSeed: Int
    Definition Classes
    TrainingParamsTrait
  110. final def getRandomStrength: Float
    Definition Classes
    TrainingParamsTrait
  111. final def getRawPredictionCol: String
    Definition Classes
    HasRawPredictionCol
  112. final def getRsm: Float
    Definition Classes
    TrainingParamsTrait
  113. final def getSamplingFrequency: ESamplingFrequency
    Definition Classes
    TrainingParamsTrait
  114. final def getSamplingUnit: ESamplingUnit
    Definition Classes
    TrainingParamsTrait
  115. final def getSaveSnapshot: Boolean
    Definition Classes
    TrainingParamsTrait
  116. final def getScalePosWeight: Float
  117. final def getScoreFunction: EScoreFunction
    Definition Classes
    TrainingParamsTrait
  118. final def getSnapshotFile: String
    Definition Classes
    TrainingParamsTrait
  119. final def getSnapshotInterval: Duration
    Definition Classes
    TrainingParamsTrait
  120. final def getSparkPartitionCount: Int
    Definition Classes
    TrainingParamsTrait
  121. final def getSubsample: Float
    Definition Classes
    TrainingParamsTrait
  122. final def getTargetBorder: Float
  123. final def getThreadCount: Int
    Definition Classes
    ThreadCountParams
  124. def getThresholds: Array[Double]
    Definition Classes
    HasThresholds
  125. final def getTrainDir: String
    Definition Classes
    TrainingParamsTrait
  126. final def getUseBestModel: Boolean
    Definition Classes
    TrainingParamsTrait
  127. final def getWeightCol: String
    Definition Classes
    HasWeightCol
  128. final def getWorkerInitializationTimeout: Duration
    Definition Classes
    TrainingParamsTrait
  129. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  130. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  131. final val hasTime: BooleanParam
    Definition Classes
    TrainingParamsTrait
  132. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  133. final val ignoredFeaturesIndices: IntArrayParam
    Definition Classes
    IgnoredFeaturesParams
  134. final val ignoredFeaturesNames: StringArrayParam
    Definition Classes
    IgnoredFeaturesParams
  135. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  136. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  137. final val inputBorders: Param[String]
    Definition Classes
    QuantizationParamsTrait
  138. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  139. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  140. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  141. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  142. final val iterations: IntParam
    Definition Classes
    TrainingParamsTrait
  143. final val l2LeafReg: FloatParam
    Definition Classes
    TrainingParamsTrait
  144. final val labelCol: Param[String]
    Definition Classes
    HasLabelCol
  145. final val leafEstimationBacktracking: EnumParam[ELeavesEstimationStepBacktracking]
    Definition Classes
    TrainingParamsTrait
  146. final val leafEstimationIterations: IntParam
    Definition Classes
    TrainingParamsTrait
  147. final val leafEstimationMethod: EnumParam[ELeavesEstimation]
    Definition Classes
    TrainingParamsTrait
  148. final val learningRate: FloatParam
    Definition Classes
    TrainingParamsTrait
  149. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  150. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  151. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  152. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  153. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  154. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  155. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  156. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  157. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  158. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  159. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  160. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  161. final val loggingLevel: EnumParam[ELoggingLevel]
    Definition Classes
    TrainingParamsTrait
  162. final val lossFunction: Param[String]
    Definition Classes
    TrainingParamsTrait
  163. final val metricPeriod: IntParam
    Definition Classes
    TrainingParamsTrait
  164. final val modelShrinkMode: EnumParam[EModelShrinkMode]
    Definition Classes
    TrainingParamsTrait
  165. final val modelShrinkRate: FloatParam
    Definition Classes
    TrainingParamsTrait
  166. final val mvsReg: FloatParam
    Definition Classes
    TrainingParamsTrait
  167. final val nanMode: EnumParam[ENanMode]
    Definition Classes
    QuantizationParamsTrait
  168. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  169. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  170. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  171. final val odPval: FloatParam
    Definition Classes
    TrainingParamsTrait
  172. final val odType: EnumParam[EOverfittingDetectorType]
    Definition Classes
    TrainingParamsTrait
  173. final val odWait: IntParam
    Definition Classes
    TrainingParamsTrait
  174. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  175. final val penaltiesCoefficient: FloatParam
    Definition Classes
    TrainingParamsTrait
  176. final val perFloatFeatureQuantizaton: StringArrayParam
    Definition Classes
    QuantizationParamsTrait
  177. final val perObjectFeaturePenaltiesList: DoubleArrayParam
    Definition Classes
    TrainingParamsTrait
  178. final val perObjectFeaturePenaltiesMap: OrderedStringMapParam[Float]
    Definition Classes
    TrainingParamsTrait
  179. final val predictionCol: Param[String]
    Definition Classes
    HasPredictionCol
  180. def preprocessBeforeTraining(quantizedTrainPool: Pool, quantizedEvalPools: Array[Pool]): (Pool, Array[Pool], JObject)

    Override in descendants if necessary.

    returns

    (preprocessedTrainPool, preprocessedEvalPools, catBoostJsonParams)

    Attributes
    protected
    Definition Classes
    CatBoostClassifier → CatBoostPredictorTrait
  181. final val probabilityCol: Param[String]
    Definition Classes
    HasProbabilityCol
  182. final val randomSeed: IntParam
    Definition Classes
    TrainingParamsTrait
  183. final val randomStrength: FloatParam
    Definition Classes
    TrainingParamsTrait
  184. final val rawPredictionCol: Param[String]
    Definition Classes
    HasRawPredictionCol
  185. final val rsm: FloatParam
    Definition Classes
    TrainingParamsTrait
  186. final val samplingFrequency: EnumParam[ESamplingFrequency]
    Definition Classes
    TrainingParamsTrait
  187. final val samplingUnit: EnumParam[ESamplingUnit]
    Definition Classes
    TrainingParamsTrait
  188. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  189. final val saveSnapshot: BooleanParam
    Definition Classes
    TrainingParamsTrait
  190. final val scalePosWeight: FloatParam
  191. final val scoreFunction: EnumParam[EScoreFunction]
    Definition Classes
    TrainingParamsTrait
  192. final def set(paramPair: ParamPair[_]): CatBoostClassifier.this.type
    Attributes
    protected
    Definition Classes
    Params
  193. final def set(param: String, value: Any): CatBoostClassifier.this.type
    Attributes
    protected
    Definition Classes
    Params
  194. final def set[T](param: Param[T], value: T): CatBoostClassifier.this.type
    Definition Classes
    Params
  195. final def setAllowConstLabel(value: Boolean): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  196. final def setAllowWritingFiles(value: Boolean): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  197. final def setApproxOnFullHistory(value: Boolean): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  198. final def setAutoClassWeights(value: EAutoClassWeightsType): CatBoostClassifier.this.type
  199. final def setBaggingTemperature(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  200. final def setBestModelMinTrees(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  201. final def setBootstrapType(value: EBootstrapType): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  202. final def setBorderCount(value: Int): CatBoostClassifier.this.type
    Definition Classes
    QuantizationParamsTrait
  203. final def setClassNames(value: Array[String]): CatBoostClassifier.this.type
  204. final def setClassWeightsList(value: Array[Double]): CatBoostClassifier.this.type
  205. final def setClassWeightsMap(value: LinkedHashMap[String, Float]): CatBoostClassifier.this.type
  206. final def setClassesCount(value: Int): CatBoostClassifier.this.type
  207. final def setCustomMetric(value: Array[String]): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  208. final def setDefault(paramPairs: ParamPair[_]*): CatBoostClassifier.this.type
    Attributes
    protected
    Definition Classes
    Params
  209. final def setDefault[T](param: Param[T], value: T): CatBoostClassifier.this.type
    Attributes
    protected
    Definition Classes
    Params
  210. final def setDepth(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  211. final def setDiffusionTemperature(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  212. final def setEarlyStoppingRounds(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  213. final def setEvalMetric(value: String): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  214. final def setFeatureBorderType(value: EBorderSelectionType): CatBoostClassifier.this.type
    Definition Classes
    QuantizationParamsTrait
  215. final def setFeatureWeightsList(value: Array[Double]): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  216. final def setFeatureWeightsMap(value: LinkedHashMap[String, Float]): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  217. def setFeaturesCol(value: String): CatBoostClassifier
    Definition Classes
    Predictor
  218. final def setFirstFeatureUsePenaltiesList(value: Array[Double]): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  219. final def setFirstFeatureUsePenaltiesMap(value: LinkedHashMap[String, Float]): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  220. final def setFoldLenMultiplier(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  221. final def setFoldPermutationBlock(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  222. final def setHasTime(value: Boolean): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  223. final def setIgnoredFeaturesIndices(value: Array[Int]): CatBoostClassifier.this.type
    Definition Classes
    IgnoredFeaturesParams
  224. final def setIgnoredFeaturesNames(value: Array[String]): CatBoostClassifier.this.type
    Definition Classes
    IgnoredFeaturesParams
  225. final def setInputBorders(value: String): CatBoostClassifier.this.type
    Definition Classes
    QuantizationParamsTrait
  226. final def setIterations(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  227. final def setL2LeafReg(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  228. def setLabelCol(value: String): CatBoostClassifier
    Definition Classes
    Predictor
  229. final def setLeafEstimationBacktracking(value: ELeavesEstimationStepBacktracking): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  230. final def setLeafEstimationIterations(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  231. final def setLeafEstimationMethod(value: ELeavesEstimation): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  232. final def setLearningRate(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  233. final def setLoggingLevel(value: ELoggingLevel): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  234. final def setLossFunction(value: String): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  235. final def setMetricPeriod(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  236. final def setModelShrinkMode(value: EModelShrinkMode): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  237. final def setModelShrinkRate(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  238. final def setMvsReg(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  239. final def setNanMode(value: ENanMode): CatBoostClassifier.this.type
    Definition Classes
    QuantizationParamsTrait
  240. final def setOdPval(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  241. final def setOdType(value: EOverfittingDetectorType): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  242. final def setOdWait(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  243. final def setPenaltiesCoefficient(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  244. final def setPerFloatFeatureQuantizaton(value: Array[String]): CatBoostClassifier.this.type
    Definition Classes
    QuantizationParamsTrait
  245. final def setPerObjectFeaturePenaltiesList(value: Array[Double]): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  246. final def setPerObjectFeaturePenaltiesMap(value: LinkedHashMap[String, Float]): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  247. def setPredictionCol(value: String): CatBoostClassifier
    Definition Classes
    Predictor
  248. def setProbabilityCol(value: String): CatBoostClassifier
    Definition Classes
    ProbabilisticClassifier
  249. final def setRandomSeed(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  250. final def setRandomStrength(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  251. def setRawPredictionCol(value: String): CatBoostClassifier
    Definition Classes
    Classifier
  252. final def setRsm(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  253. final def setSamplingFrequency(value: ESamplingFrequency): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  254. final def setSamplingUnit(value: ESamplingUnit): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  255. final def setSaveSnapshot(value: Boolean): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  256. final def setScalePosWeight(value: Float): CatBoostClassifier.this.type
  257. final def setScoreFunction(value: EScoreFunction): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  258. final def setSnapshotFile(value: String): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  259. final def setSnapshotInterval(value: Duration): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  260. final def setSparkPartitionCount(value: Int): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  261. final def setSubsample(value: Float): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  262. final def setTargetBorder(value: Float): CatBoostClassifier.this.type
  263. final def setThreadCount(value: Int): CatBoostClassifier.this.type
    Definition Classes
    ThreadCountParams
  264. def setThresholds(value: Array[Double]): CatBoostClassifier
    Definition Classes
    ProbabilisticClassifier
  265. final def setTrainDir(value: String): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  266. final def setUseBestModel(value: Boolean): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  267. final def setWorkerInitializationTimeout(value: Duration): CatBoostClassifier.this.type
    Definition Classes
    TrainingParamsTrait
  268. final val snapshotFile: Param[String]
    Definition Classes
    TrainingParamsTrait
  269. final val snapshotInterval: DurationParam
    Definition Classes
    TrainingParamsTrait
  270. final val sparkPartitionCount: IntParam
    Definition Classes
    TrainingParamsTrait
  271. final val subsample: FloatParam
    Definition Classes
    TrainingParamsTrait
  272. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  273. final val targetBorder: FloatParam
  274. final val threadCount: IntParam
    Definition Classes
    ThreadCountParams
  275. final val thresholds: DoubleArrayParam
    Definition Classes
    HasThresholds
  276. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  277. def train(dataset: Dataset[_]): CatBoostClassificationModel
    Attributes
    protected
    Definition Classes
    CatBoostPredictorTrait → Predictor
  278. final val trainDir: Param[String]
    Definition Classes
    TrainingParamsTrait
  279. def transformSchema(schema: StructType): StructType
    Definition Classes
    Predictor → PipelineStage
  280. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  281. val uid: String
    Definition Classes
    CatBoostClassifier → Identifiable
  282. final val useBestModel: BooleanParam
    Definition Classes
    TrainingParamsTrait
  283. def validateAndTransformSchema(schema: StructType, fitting: Boolean, featuresDataType: DataType): StructType
    Attributes
    protected
    Definition Classes
    ProbabilisticClassifierParams → ClassifierParams → PredictorParams
  284. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  285. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  286. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  287. final val weightCol: Param[String]
    Definition Classes
    HasWeightCol
  288. final val workerInitializationTimeout: DurationParam
    Definition Classes
    TrainingParamsTrait
  289. def write: MLWriter
    Definition Classes
    DefaultParamsWritable → MLWritable
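
The setters listed above return the classifier itself, so they can be chained. The following is a minimal configuration sketch, not an excerpt from the CatBoost documentation. It assumes the catboost-spark artifact is on the classpath and that CatBoostClassifier can be created with a no-argument constructor. The object name, the parameter values, the "Logloss" loss name, and the "label" column name are illustrative assumptions.

```scala
import ai.catboost.spark.CatBoostClassifier

// Hypothetical wrapper object, used only to make the sketch compilable.
object ClassifierConfigSketch {
  // Assumption: a no-argument constructor is available.
  val classifier: CatBoostClassifier = new CatBoostClassifier()
    .setIterations(200)          // Int
    .setLearningRate(0.05f)      // Float, hence the `f` suffix
    .setLossFunction("Logloss")  // String; illustrative loss name
    .setRandomSeed(42)           // Int
    .setThreadCount(4)           // Int
    .setLabelCol("label")        // defined in Predictor; hypothetical column name
}
```

Setters declared with a Float parameter need a float literal such as 0.05f, because Scala does not narrow a Double literal to Float implicitly.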

Inherited from TrainingParamsTrait

Inherited from QuantizationParamsTrait

Inherited from ThreadCountParams

Inherited from IgnoredFeaturesParams

Inherited from DefaultParamsWritable

Inherited from MLWritable

Inherited from DatasetParamsTrait

Inherited from HasWeightCol

Inherited from ProbabilisticClassifier[Vector, CatBoostClassifier, CatBoostClassificationModel]

Inherited from ProbabilisticClassifierParams

Inherited from HasThresholds

Inherited from HasProbabilityCol

Inherited from Classifier[Vector, CatBoostClassifier, CatBoostClassificationModel]

Inherited from ClassifierParams

Inherited from HasRawPredictionCol

Inherited from Predictor[Vector, CatBoostClassifier, CatBoostClassificationModel]

Inherited from PredictorParams

Inherited from HasPredictionCol

Inherited from HasFeaturesCol

Inherited from HasLabelCol

Inherited from Estimator[CatBoostClassificationModel]

Inherited from PipelineStage

Inherited from Logging

Inherited from Params

Inherited from Serializable

Inherited from Serializable

Inherited from Identifiable

Inherited from AnyRef

Inherited from Any
