sklearn.decomposition.SparsePCA

class sklearn.decomposition.SparsePCA(n_components=None, *, alpha=1, ridge_alpha=0.01, max_iter=1000, tol=1e-08, method='lars', n_jobs=None, U_init=None, V_init=None, verbose=False, random_state=None, normalize_components='deprecated')

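[source]

Sparse Principal Components Analysis (SparsePCA)

Finds the set of sparse components that can optimally reconstruct the data. The amount of sparseness is controlled by the coefficient of the L1 penalty, given by the parameter alpha.

Read more in the User Guide.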
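As a rough illustration of how alpha influences sparsity, the sketch below fits the same data with two alpha values and reports the fraction of exactly-zero loadings. The dataset and the alpha values (0.1 and 1.0) are arbitrary choices for demonstration, not recommendations; the exact fractions depend on the data and the scikit-learn version.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.decomposition import SparsePCA

# Synthetic data, same generator as the example further down this page.
X, _ = make_friedman1(n_samples=200, n_features=30, random_state=0)

zero_fractions = {}
for alpha in (0.1, 1.0):  # illustrative values only
    model = SparsePCA(n_components=5, alpha=alpha, random_state=0).fit(X)
    # Fraction of entries in components_ that are exactly zero.
    zero_fractions[alpha] = float(np.mean(model.components_ == 0))
    print(f"alpha={alpha}: fraction of zero loadings = {zero_fractions[alpha]:.3f}")
```

On this data the larger alpha should yield at least as many zero loadings as the smaller one, though the optimization is non-convex and this is not a strict guarantee in general.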

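Parameter	Description
n_components	int, default=None. Number of sparse components to extract.
alpha	float, default=1. Sparsity controlling parameter. Higher values lead to sparser components.
ridge_alpha	float, default=0.01. Amount of ridge shrinkage to apply in order to improve conditioning when calling the transform method.
max_iter	int, default=1000. Maximum number of iterations to perform.
tol	float, default=1e-08. Tolerance for the stopping condition.
method	{'lars', 'cd'}, default='lars'. lars: uses the least angle regression method to solve the lasso problem (linear_model.lars_path). cd: uses the coordinate descent method to compute the Lasso solution (linear_model.Lasso). Lars will be faster if the estimated components are sparse.
n_jobs	int or None, optional (default=None). Number of parallel jobs to run. `None` means 1 unless in a `joblib.parallel_backend` context. `-1` means using all processors. See the Glossary for more details.
U_init	array of shape (n_samples, n_components). Initial values for the loadings for warm restart scenarios.
V_init	array of shape (n_components, n_features). Initial values for the components for warm restart scenarios.
verbose	int. Controls the verbosity; the higher, the more messages. Defaults to 0.
random_state	int, RandomState instance, default=None. Used during dictionary learning. Pass an int for reproducible results across multiple function calls. See the Glossary.
normalize_components	'deprecated'. This parameter does not have any effect. The components are always normalized. New in version 0.20. Deprecated since version 0.22: `normalize_components` is deprecated in 0.22 and will be removed in 0.24.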
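To make the method parameter concrete, here is a minimal sketch that fits the same data with both solvers and prints the wall-clock time and iteration count of each. The dataset is an arbitrary synthetic one, and timings on such a small problem are noisy, so treat the output as a way to compare on your own data rather than as a benchmark.

```python
import time

from sklearn.datasets import make_friedman1
from sklearn.decomposition import SparsePCA

X, _ = make_friedman1(n_samples=200, n_features=30, random_state=0)

fitted = {}
for method in ("lars", "cd"):
    start = time.perf_counter()
    # Same settings for both runs; only the lasso solver differs.
    fitted[method] = SparsePCA(n_components=5, method=method, random_state=0).fit(X)
    elapsed = time.perf_counter() - start
    print(f"method={method!r}: {elapsed:.3f} s, n_iter_={fitted[method].n_iter_}")
```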

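Attribute	Description
components_	array, [n_components, n_features]. Sparse components extracted from the data.
error_	array. Vector of errors at each iteration.
n_components_	int. Estimated number of components. New in version 0.23.
n_iter_	int. Number of iterations run.
mean_	array, shape (n_features,). Per-feature empirical mean, estimated from the training set. Equal to `X.mean(axis=0)`.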
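The attributes listed above can be inspected after fitting. The following sketch, on an arbitrary synthetic dataset, prints each of them and checks the stated relationship between mean_ and the training data.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.decomposition import SparsePCA

X, _ = make_friedman1(n_samples=200, n_features=30, random_state=0)
model = SparsePCA(n_components=5, random_state=0).fit(X)

print(model.components_.shape)   # (n_components, n_features)
print(model.n_components_)       # number of components actually used
print(model.n_iter_)             # iterations run before stopping
print(len(model.error_))         # one error value per recorded iteration

# mean_ is documented as the per-feature mean of the training data.
mean_matches = bool(np.allclose(model.mean_, X.mean(axis=0)))
print(mean_matches)
```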

See also

PCA

MiniBatchSparsePCA

DictionaryLearning

Examples

>>> import numpy as np
>>> from sklearn.datasets import make_friedman1
>>> from sklearn.decomposition import SparsePCA
>>> X, _ = make_friedman1(n_samples=200, n_features=30, random_state=0)
>>> transformer = SparsePCA(n_components=5, random_state=0)
>>> transformer.fit(X)
SparsePCA(...)
>>> X_transformed = transformer.transform(X)
>>> X_transformed.shape
(200, 5)
>>> # most values in the components_ are zero (sparsity)
>>> np.mean(transformer.components_ == 0)
0.9666...

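Methods

Method	Description
`fit`(X[, y])	Fit the model from data in X.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_params`([deep])	Get parameters for this estimator.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X)	Least squares projection of the data onto the sparse components.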

__init__(n_components=None, *, alpha=1, ridge_alpha=0.01, max_iter=1000, tol=1e-08, method='lars', n_jobs=None, U_init=None, V_init=None, verbose=False, random_state=None, normalize_components='deprecated')

[source]

Initialize self. See help(type(self)) for accurate signature.

fit(X, y=None)

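[source]

Fit the model from data in X.

Parameter	Description
X	array-like, shape (n_samples, n_features). Training vector, where n_samples is the number of samples and n_features is the number of features.
y	Ignored

Returns	Description
self	object. Returns the instance itself.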

fit_transform(X, y=None, **fit_params)

[source]

Fit to data, then transform it.

Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameter	Description
X	{array-like, sparse matrix, dataframe} of shape (n_samples, n_features)
y	ndarray of shape (n_samples,), default=None. Target values.
**fit_params	dict. Additional fit parameters.

Returns	Description
X_new	ndarray of shape (n_samples, n_features_new). Transformed array.
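A short sketch of how fit_transform relates to calling fit and transform separately. It assumes two estimators built with the same random_state behave identically on the same data, which is what the random_state description suggests; the dataset is an arbitrary synthetic one.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.decomposition import SparsePCA

X, _ = make_friedman1(n_samples=200, n_features=30, random_state=0)

# One call: fit the model and return the transformed training data.
one_step = SparsePCA(n_components=5, random_state=0).fit_transform(X)

# Two calls: fit first, then transform the same data.
two_step = SparsePCA(n_components=5, random_state=0).fit(X).transform(X)

print(one_step.shape)
same_result = bool(np.allclose(one_step, two_step))
print(same_result)
```

The two-step form is the one to use when the fitted estimator must later transform new data with the same number of features.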

get_params(deep=True)

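[source]

Get parameters for this estimator.

Parameter	Description
deep	bool, default=True. If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns	Description
params	mapping of string to any. Parameter names mapped to their values.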

set_params(**params)

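[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it is possible to update each component of a nested object.

Parameter	Description
**params	dict. Estimator parameters.

Returns	Description
self	object. Estimator instance.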
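The <component>__<parameter> form can be sketched with a small pipeline. The pipeline itself (a StandardScaler followed by SparsePCA) is a hypothetical setup chosen for illustration; make_pipeline names each step after its lowercased class name, so the SparsePCA step is addressed as sparsepca.

```python
from sklearn.decomposition import SparsePCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical pipeline used only to illustrate nested parameter names.
pipe = make_pipeline(StandardScaler(), SparsePCA(n_components=5, random_state=0))

# Update parameters of the nested SparsePCA step through the pipeline.
pipe.set_params(sparsepca__alpha=2.0, sparsepca__n_components=3)

# With deep=True, get_params also returns the nested parameters.
params = pipe.get_params(deep=True)
print(params["sparsepca__alpha"], params["sparsepca__n_components"])

# On a standalone estimator the plain parameter names are used.
est = SparsePCA().set_params(alpha=0.5)
print(est.get_params()["alpha"])
```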

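transform(X)

[source]

Least squares projection of the data onto the sparse components.

To avoid instability issues in case the system is under-determined, regularization can be applied (ridge regression) via the ridge_alpha parameter.

Note that the orthogonality of Sparse PCA components is not enforced as it is in PCA, hence one cannot use a simple linear projection.

Parameter	Description
X	array of shape (n_samples, n_features). Test data to be transformed; must have the same number of features as the data used to train the model.

Returns	Description
X_new	array, shape (n_samples, n_components). Transformed data.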
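The ridge-regularized projection described for transform can be sketched by hand. This sketch assumes the data are centered with mean_ and that the codes U solve the ridge problem U = Xc V^T (V V^T + ridge_alpha I)^-1, where V is components_; that formula is my reading of the description, not something this page states explicitly, so the code prints the discrepancy against transform instead of taking agreement for granted.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.decomposition import SparsePCA

X, _ = make_friedman1(n_samples=200, n_features=30, random_state=0)
model = SparsePCA(n_components=5, ridge_alpha=0.01, random_state=0).fit(X)

V = model.components_              # shape (n_components, n_features)
Xc = X - model.mean_               # assumed centering step

# Regularized Gram matrix of the components. Without orthogonality,
# V @ V.T is not guaranteed to be the identity, so it has to be inverted.
gram = V @ V.T + model.ridge_alpha * np.eye(V.shape[0])

# Solve gram @ U.T = V @ Xc.T for the codes U (n_samples, n_components).
manual = np.linalg.solve(gram, V @ Xc.T).T

max_abs_diff = float(np.max(np.abs(manual - model.transform(X))))
print(manual.shape, max_abs_diff)
```

If the components were orthonormal, the Gram matrix would reduce to a scaled identity and the projection to a plain matrix product, which is the simple linear projection that PCA can use.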