sklearn.multiclass.OutputCodeClassifier

class sklearn.multiclass.OutputCodeClassifier(estimator, *, code_size=1.5, random_state=None, n_jobs=None)

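[source]

(Error-Correcting) Output-Code multiclass strategy

Output-code based strategies represent each class with a binary code (an array of 0s and 1s). At fitting time, one binary classifier is fitted per bit in the code book. At prediction time, the classifiers are used to project new points into the class space, and the class whose code is closest to each projected point is chosen. The main advantage of these strategies is that the user controls the number of classifiers, either to compress the model (0 < code_size < 1) or to make the model more robust to errors (code_size > 1). See the documentation for more details.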
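The fit and predict steps described above can be sketched by hand. This is a minimal illustration, not the library's implementation: the code book below is hand-picked (the class itself draws its code book using `random_state`), and the iris data, `LogisticRegression` as the per-bit learner, squared Euclidean distance, and the helper name `predict_ecoc` are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# Hand-picked code book (assumption): one row per class, one column per bit.
code_book = np.array([
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 0],
])

# Fit: one binary classifier per bit, trained on that bit of each sample's class code.
class_index = np.searchsorted(classes, y)
bit_classifiers = []
for bit in range(code_book.shape[1]):
    bit_targets = code_book[class_index, bit]
    bit_classifiers.append(LogisticRegression(max_iter=1000).fit(X, bit_targets))


def predict_ecoc(X_new):
    """Project samples into code space and return the class with the nearest code."""
    projected = np.column_stack(
        [clf.predict_proba(X_new)[:, 1] for clf in bit_classifiers]
    )
    # Squared Euclidean distance from every sample to every class code.
    distances = ((projected[:, None, :] - code_book[None, :, :]) ** 2).sum(axis=2)
    return classes[distances.argmin(axis=1)]


predictions = predict_ecoc(X)
training_accuracy = float((predictions == y).mean())
```

With six bits for three classes the codes are redundant, which is the error-correcting case: a single bit classifier can be wrong without changing which code is nearest.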

Read more in the User Guide.

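Parameter	Description
estimator	estimator object An estimator object implementing fit and one of decision_function or predict_proba.
code_size	float Percentage of the number of classes to be used to create the code book. A number between 0 and 1 requires fewer classifiers than one-vs-the-rest. A number greater than 1 requires more classifiers than one-vs-the-rest.
random_state	int, RandomState instance or None, optional, default: None The generator used to initialize the code book. Pass an int for reproducible output across multiple function calls. See the Glossary.
n_jobs	int or None, optional (default=None) The number of jobs to use for the computation. `None` means 1 unless in a `joblib.parallel_backend` context. `-1` means using all processors. See the Glossary for more details.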

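Attribute	Description
estimators_	list of `int(n_classes * code_size)` estimators Estimators used for predictions.
classes_	numpy array of shape [n_classes] Array containing the labels.
code_book_	numpy array of shape [n_classes, int(n_classes * code_size)] Binary array containing the code of each class.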
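The relationship between `code_size` and the fitted attributes can be checked directly. The sketch below assumes the three-class iris data and `LogisticRegression` as the base estimator; both are illustrative choices, and the dictionary name `shapes` is made up for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)  # 3 classes

shapes = {}
for code_size in (0.7, 1.5, 3.0):
    clf = OutputCodeClassifier(
        estimator=LogisticRegression(max_iter=1000),
        code_size=code_size,
        random_state=0,
    ).fit(X, y)
    # Number of fitted binary classifiers and shape of the code book.
    shapes[code_size] = (len(clf.estimators_), clf.code_book_.shape)
```

With three classes, a `code_size` below 1 fits fewer binary classifiers than there are classes, while larger values fit more; in each case the count is `int(n_classes * code_size)`.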

References

[1] “Solving multiclass learning problems via error-correcting output codes”, Dietterich T., Bakiri G., Journal of Artificial Intelligence Research 2, 1995.

[2] “The error coding method and PICTs”, James G., Hastie T., Journal of Computational and Graphical Statistics 7, 1998.

[3] “The Elements of Statistical Learning”, Hastie T., Tibshirani R., Friedman J., page 606 (second edition), 2008.

Examples

>>> from sklearn.multiclass import OutputCodeClassifier
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=100, n_features=4,
...                            n_informative=2, n_redundant=0,
...                            random_state=0, shuffle=False)
>>> clf = OutputCodeClassifier(
...     estimator=RandomForestClassifier(random_state=0),
...     random_state=0).fit(X, y)
>>> clf.predict([[0, 0, 0, 0]])
array([1])

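Method	Description
`fit`(X, y)	Fit the underlying estimators.
`get_params`([deep])	Get the parameters of this estimator.
`predict`(X)	Predict multi-class targets using the underlying estimators.
`score`(X, y[, sample_weight])	Return the mean accuracy on the given test data and labels.
`set_params`(**params)	Set the parameters of this estimator.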

__init__(estimator, *, code_size=1.5, random_state=None, n_jobs=None)

[source]

Initialize self. See help(type(self)) for the accurate signature.

fit(X, y)

[source]

Fit the underlying estimators.

Parameter	Description
X	(sparse) array-like of shape (n_samples, n_features) Data.
y	numpy array of shape [n_samples] Multi-class targets.

Returns
self

get_params(deep=True)

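[source]

Get the parameters of this estimator.

Parameter	Description
deep	bool, default=True If True, return the parameters of this estimator and of contained sub-objects that are estimators.

Returns	Description
params	mapping of string to any Parameter names mapped to their values.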
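The effect of `deep` can be seen on a small, unfitted instance. In this sketch the base estimator `LogisticRegression(C=0.5)` is an arbitrary choice for illustration; nested parameters appear under keys of the form `estimator__<parameter>`.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

clf = OutputCodeClassifier(estimator=LogisticRegression(C=0.5), code_size=2.0)

# Only the parameters of the meta-estimator itself.
shallow = clf.get_params(deep=False)

# Also includes the parameters of the wrapped estimator, e.g. "estimator__C".
deep = clf.get_params(deep=True)
```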

predict(X)

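[source]

Predict multi-class targets using the underlying estimators.

Parameter	Description
X	(sparse) array-like of shape (n_samples, n_features) Data.

Returns	Description
y	numpy array of shape [n_samples] Predicted multi-class targets.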

score(X, y, sample_weight=None)

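[source]

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric because it requires that the entire label set of each sample be predicted correctly.

Parameter	Description
X	array-like of shape (n_samples, n_features) Test samples.
y	array-like of shape (n_samples,) or (n_samples, n_outputs) True labels for X.
sample_weight	array-like of shape (n_samples,), default=None Sample weights.

Returns	Description
score	float Mean accuracy of self.predict(X) with respect to y.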
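As a quick check of what `score` returns, the sketch below compares it with `accuracy_score` applied to the predictions. The iris data, the train/test split, and the `LogisticRegression` base estimator are assumptions chosen only to make the example self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = OutputCodeClassifier(
    estimator=LogisticRegression(max_iter=1000),
    code_size=2.0,
    random_state=0,
).fit(X_train, y_train)

# score(X, y) is the fraction of test samples whose predicted label matches y.
mean_accuracy = clf.score(X_test, y_test)
manual_accuracy = accuracy_score(y_test, clf.predict(X_test))
```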

set_params(**params)

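[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that each component of a nested object can be updated.

Parameter	Description
**params	dict Estimator parameters.

Returns	Description
self	object Estimator instance.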
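The nested `<component>__<parameter>` form can be illustrated as follows. Here the component is the `estimator` parameter of the class, and `LogisticRegression` with its `C` parameter is an assumed base estimator used only for the example.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

clf = OutputCodeClassifier(estimator=LogisticRegression(C=1.0), code_size=1.5)

# Update a parameter of the meta-estimator and one of the wrapped estimator
# in a single call; set_params returns the estimator itself.
returned = clf.set_params(code_size=2.0, estimator__C=10.0)
```

Because the call returns the same instance, it can be chained with `fit`, and the same key format is what tools such as grid search use to address nested parameters.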