
A Survey of Nonnegative Matrix Factorization Methods


What is the SVD?

$$A = U \Sigma V^T = \sum_{i=1}^{r} \sigma_i u_i v_i^T$$

The terms are ordered by decreasing importance: σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0.
Data Matrix

A_{m×n} with rank r. Examples:
• term-by-document matrix
• pixel intensity-by-image matrix
• gene-by-DNA microarray matrix
• feature-by-item matrix
• user-by-purchase matrix
• terrorist-by-action matrix
Other Low-Rank Approximations

• QR decomposition
• any URV^T factorization
• Semidiscrete decomposition (SDD): A_k = X_k D_k Y_k^T, where D_k is diagonal and the elements of X_k, Y_k ∈ {−1, 0, 1}
• CUR factorization
Low-Rank Approximation

Given the SVD A = U Σ V^T = Σ_{i=1}^r σ_i u_i v_i^T, use the truncated sum

$$A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^T$$

in place of A.
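A minimal numpy sketch of this truncation (the random matrix A and the choice k = 10 below are placeholders for a real data matrix and truncation rank):

import numpy as np

# Toy stand-in for a real data matrix (e.g., term-by-document).
rng = np.random.default_rng(0)
A = rng.random((100, 50))
k = 10  # truncation rank

# Full SVD; numpy returns the singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k approximation A_k = sum_{i=1}^k sigma_i u_i v_i^T.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.linalg.norm(A - A_k, "fro"))  # approximation error in Frobenius norm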


SVD Rank Reduction

Amy Langville
langvillea@ (C. of Charleston, Mathematics Dept.)
NISS NMF Workshop, February 23–24, 2007
Outline

• Two Factorizations
  — Singular Value Decomposition
  — Nonnegative Matrix Factorization
• Why factor anyway?
• Computing the NMF
  — Early Algorithms
  — Recent Algorithms
• Extensions of NMF
Text Mining Applications

• Data compression
• Find similar terms: polysemes are broken across several basis vectors w_i
• Score documents against a query q via cosines with the columns of W_k H_k:

$$0 \le \cos\theta_j = \frac{q^T (W_k H_k) e_j}{\|q\|_2 \, \|(W_k H_k) e_j\|_2} \le 1$$

Since q, W_k, and H_k are all nonnegative, every cosine lies between 0 and 1.
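As a sketch of this scoring (the query q and the factors W and H below are made-up stand-ins, not from the MED example):

import numpy as np

def cosine_scores(q, W, H):
    """Cosine between query q and each column of W @ H.

    With q, W, H all nonnegative, every score lies in [0, 1].
    """
    A_k = W @ H                     # reconstructed m x n matrix
    num = q @ A_k                   # q^T (W_k H_k) e_j for each document j
    den = np.linalg.norm(q) * np.linalg.norm(A_k, axis=0)
    return num / den

# Made-up example: 4 terms, 3 basis vectors, 5 documents.
rng = np.random.default_rng(1)
W = rng.random((4, 3))
H = rng.random((3, 5))
q = np.array([1.0, 0.0, 1.0, 0.0])  # query mentioning terms 1 and 3

print(cosine_scores(q, W, H))  # one nonnegative score per document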
SVD vs. NMF

SVD:  A_k = U_k Σ_k V_k^T   (A nonneg; U_k mixed, Σ_k nonneg, V_k^T mixed)
NMF:  A_k = W_k H_k         (A nonneg; W_k nonneg, H_k nonneg)
Interpretation with NMF

• columns of W are the underlying basis vectors, i.e., each of the n columns of A can be built from k columns of W
• columns of H give the weights associated with each basis vector:

$$A_k e_1 = W_k H_{*1} = w_1 h_{11} + w_2 h_{21} + \cdots + w_k h_{k1}$$

• W_k, H_k ≥ 0 ⇒ immediate interpretation
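A small numeric check of this column-wise reading, with random nonnegative factors chosen only for illustration:

import numpy as np

rng = np.random.default_rng(2)
W = rng.random((6, 3))  # 3 nonnegative basis vectors (columns)
H = rng.random((3, 4))  # nonnegative weights, one column per data column
A_k = W @ H

# Column 1 of A_k is the nonnegative combination h_11 w_1 + h_21 w_2 + h_31 w_3.
col1 = H[0, 0] * W[:, 0] + H[1, 0] * W[:, 1] + H[2, 0] * W[:, 2]

print(np.allclose(A_k[:, 0], col1))  # True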
(Figure: reconstructed images from the rank-100 approximation, k = 100.)
Text Mining: MED dataset (k = 10)

(Figure: highest weighted terms in basis vectors W_{*1} and W_{*2}, plotted as term vs. weight.)
Why use Low-Rank Approximation?

• Data compression and storage when k << r
• Remove noise and uncertainty
  ⇒ improved performance on the data mining task of retrieval (e.g., find similar items)
  ⇒ improved performance on the data mining task of clustering
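As a rough storage illustration (the numbers are chosen here for concreteness, not taken from the talk): storing A ∈ R^{m×n} directly takes mn values, while the truncated factors U_k, Σ_k, V_k take about k(m + n + 1). With m = n = 10,000 and k = 100, that is 10^8 values versus roughly 2 × 10^6, about a 50-fold saving.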
(Figure: highest weighted terms in basis vectors W_{*5} and W_{*6}; the top terms of one vector are children, child, autistic, speech, group, early, visual, anxiety, emotional, autism.)

(Figure: scree plot of the singular values σ_i, decaying from about 8 toward 0 over indices 0–120, with the truncation point k = 28 marked.)
(Figure: highest weighted terms in another basis vector: kidney, marrow, dna, cells, nephrectomy, unilateral, lymphocytes, bone, thymidine, rats.)
Text Mining

• Find similar documents
• Cluster documents

Clustering with the NMF

Clustering terms: use the rows of W_{m×k}, e.g.

            cl.1   cl.2   ...   cl.k
  term1      .9     0     ...    .3
  term2      .1     .8    ...    .2
   ...      ...    ...    ...   ...
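A hard cluster assignment then just takes the largest weight in each term's row; a minimal sketch with a made-up W mirroring the table above:

import numpy as np

# Hypothetical term-by-cluster weight matrix (rows = terms, columns = clusters).
W = np.array([
    [0.9, 0.0, 0.3],  # term1
    [0.1, 0.8, 0.2],  # term2
    [0.2, 0.1, 0.7],  # term3
])

# Assign each term to the cluster holding its largest weight.
labels = np.argmax(W, axis=1)
print(labels)  # [0 1 2]: term1 -> cluster 1, term2 -> cluster 2, term3 -> cluster 3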
BUT all of these factorizations create basis vectors that are mixed in sign, and negative elements make interpretation difficult.

⇒ Nonnegative Matrix Factorization
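One of the early NMF algorithms is the Lee–Seung multiplicative update scheme for min ||A − WH||_F with W, H ≥ 0; a minimal sketch (not necessarily the variant covered in the talk; the iteration count and the small epsilon guard are arbitrary choices, and no convergence test is done):

import numpy as np

def nmf_multiplicative(A, k, n_iter=200, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for A ~ W @ H with W, H >= 0."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Elementwise updates; every factor is nonnegative, so W, H stay >= 0.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Made-up nonnegative data matrix.
rng = np.random.default_rng(3)
A = rng.random((30, 20))
W, H = nmf_multiplicative(A, k=5)
print(np.linalg.norm(A - W @ H, "fro"))  # residual after the updates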
Properties of the SVD

• basis vectors u_i and v_i are orthogonal
• u_{ij}, v_{ij} are mixed in sign: even for a nonneg A, in A_k = U_k Σ_k V_k^T the factors U_k and V_k^T are mixed
• U, V are dense
• uniqueness: while there are many SVD algorithms, they all create the same (truncated) factorization
• optimality: of all rank-k approximations, A_k is optimal,

$$\|A - A_k\|_F = \min_{\mathrm{rank}(B) \le k} \|A - B\|_F$$
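A quick numerical check of this identity, using the standard fact that the Frobenius error of A_k equals the norm of the discarded singular values (the matrix here is a random placeholder):

import numpy as np

rng = np.random.default_rng(4)
A = rng.random((40, 25))
k = 5

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# ||A - A_k||_F = sqrt(sigma_{k+1}^2 + ... + sigma_r^2), the smallest
# error achievable by any matrix B with rank(B) <= k.
err = np.linalg.norm(A - A_k, "fro")
print(np.isclose(err, np.sqrt(np.sum(s[k:] ** 2))))  # True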
Summary of Truncated SVD

Strengths
• using A_k in place of A gives improved performance
• noise reduction isolates the essential components of the matrix
• best rank-k approximation
• A_k is unique

Weaknesses
• storage: U_k and V_k are usually completely dense
• interpretation of the basis vectors is difficult due to mixed signs
• a good truncation point k is hard to determine
• orthogonality restriction