
Bayesian Statistical Analysis

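Bayesian Statistics and Mathematical Modeling
季春霖 (Ji Chunlin), 深圳光启高等理工研究院 (Kuang-Chi Institute of Advanced Technology, Shenzhen)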
What is Statistics?


To learn the whole from a small part
The mathematical science of uncertainties
Quotes

True logic of this world is in the calculus of probabilities. --- J. C. Maxwell
What we see is the solution to a computational problem, our brains compute the most likely causes from the photon absorptions within our eyes. --- H. Helmholtz
Statistical Inference

Inference (推断)


To infer: “To conclude based on fact and/or premise”
Everyday: Make inferences about things unseen based on the observed
Chinese Restaurant Process

Statistics in Economics and marketing



Statistics in technology developments


Kalman filtering and dynamic systems
Signal processing and military applications
Quality control and industrial engineering
VLSI chip design
Planning of experiments
Risk analysis
…
Role of Statistics



Traditional role 1: population statistics, survey sampling, economic statistics
Traditional role 2: experimental designs in agriculture and industry
Traditional role 3: evaluation of procedures
Posterior ∝ Likelihood × Prior
(the unknown is the model; the conditioning quantity is the observations / training data)
Priors for the model parameters

Prior over class assignments

Class assignments are Multinomial, so we will choose a conjugate Dirichlet prior. This allows us to specify a priori how likely we think each class will be.
Class distributions are multivariate Normal, so we will choose conjugate Normal-Inverse-Wishart priors. These let us specify a priori where and how broad we think each mixture density should be.
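The Dirichlet-Multinomial conjugacy can be illustrated with a minimal Python sketch. It draws class proportions from a Dirichlet prior and then updates that prior with observed class counts. The prior parameters `alpha` and the counts are made-up illustration values, and `sample_dirichlet`, `posterior_alpha` and `dirichlet_mean` are hypothetical helper names, not something taken from the slides.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def posterior_alpha(alpha, counts):
    """Conjugate update: Dirichlet(alpha) prior plus Multinomial counts gives Dirichlet(alpha + counts)."""
    return [a + n for a, n in zip(alpha, counts)]

def dirichlet_mean(alpha):
    """Mean of Dirichlet(alpha): each parameter divided by the sum of all parameters."""
    total = sum(alpha)
    return [a / total for a in alpha]

rng = random.Random(0)
alpha = [1.0, 1.0, 1.0, 1.0]   # assumed symmetric prior over four classes
counts = [6, 2, 1, 1]          # assumed (made-up) number of observations per class

prior_draw = sample_dirichlet(alpha, rng)   # one plausible set of class proportions a priori
post = posterior_alpha(alpha, counts)       # parameters of the posterior Dirichlet
print(prior_draw)
print(dirichlet_mean(post))
```

Larger values in `alpha` encode a stronger prior belief about the class proportions. With all entries equal to 1 the prior is uniform over the probability simplex.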


calculating significance levels (under a model)
deriving asymptotic distributions (of something)
simulation study for comparing methods
Information age: Deriving new and powerful procedures.

Prior over class distribution parameters

But what about the infinite part?


Properly parameterized, a posterior formed from a Multinomial-Dirichlet conjugate pair is well behaved as the number of hidden classes approaches infinity. This results in a model with an infinite number of hidden causes, but one in which only a finite number are causal with respect to our finite dataset. The Chinese Restaurant Process is one process that generates samples from such a model.
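One common way to make "properly parameterized" concrete is a symmetric Dirichlet prior with parameter α/K on K classes. This is stated here as an assumption, not as the slides' own derivation. All symbols are introduced for this sketch: c_i is the class of observation i, n_k is the number of the other n − 1 observations in class k, and α is a concentration parameter. Integrating out the class proportions gives:

```latex
P(c_i = k \mid c_{-i}) = \frac{n_k + \alpha/K}{n - 1 + \alpha}
\;\xrightarrow{K \to \infty}\;
\frac{n_k}{n - 1 + \alpha} \quad (n_k > 0),
\qquad
P(c_i = \text{new class} \mid c_{-i})
\;\xrightarrow{K \to \infty}\;
\frac{\alpha}{n - 1 + \alpha}.
```

The second limit sums the probability over all currently empty classes. Together the two limits are the seating rule of the Chinese Restaurant Process.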
Deriving Statistical Procedures
Likelihood method
Need an explicit model: p(data | θ).
Most often θ has two components, and one is only interested in one of them.
Missing data problem: p(y_obs, y_mis | θ).
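For the missing data problem, the likelihood of what was actually observed is obtained by integrating the missing values out of the complete-data model. A sketch, writing θ for the model parameter (an assumed symbol):

```latex
p(y_{\text{obs}} \mid \theta) = \int p(y_{\text{obs}}, y_{\text{mis}} \mid \theta)\, dy_{\text{mis}}
```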
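$$p(\theta \mid y) = \frac{p(\text{data} \mid \theta)\, p(\theta)}{\int p(\text{data} \mid \theta)\, p(\theta)\, d\theta} \propto p(\text{data} \mid \theta)\, p(\theta)$$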
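The normalizing integral in the denominator often has no closed form, but for a one-dimensional parameter it can be approximated on a grid. The following is a minimal sketch, assuming a Bernoulli model with a flat prior on θ and made-up data (7 successes, 3 failures). `grid_posterior` is a hypothetical helper name.

```python
def grid_posterior(successes, failures, n_grid=1001):
    """Approximate p(theta | data) on an even grid over [0, 1] (Bernoulli model, flat prior)."""
    thetas = [i / (n_grid - 1) for i in range(n_grid)]
    step = thetas[1] - thetas[0]
    prior = 1.0  # flat prior density on [0, 1]
    # Unnormalized posterior: likelihood times prior at each grid point.
    unnorm = [(t ** successes) * ((1.0 - t) ** failures) * prior for t in thetas]
    # Trapezoid rule for the normalizing integral (the evidence).
    evidence = step * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    density = [u / evidence for u in unnorm]
    return thetas, density, evidence

thetas, density, evidence = grid_posterior(successes=7, failures=3)
step = thetas[1] - thetas[0]
# Posterior mean as a Riemann sum (the integrand is zero at both endpoints here).
post_mean = step * sum(t * d for t, d in zip(thetas, density))
print(evidence, post_mean)
```

For this toy case the exact evidence is B(8, 4) = 1/1320 and the exact posterior mean is 8/12, so the grid values can be checked against them. Grid approximation stops being practical as the dimension of θ grows, which is one reason sampling methods are used instead.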
Bayesian Statistics

Bayes' formula: $$P(\Theta \mid x) = \frac{P(x \mid \Theta)\, P(\Theta)}{P(x)} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$
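A small worked example of Bayes' formula with two competing hypotheses may help. The numbers are invented for illustration: a treatment is assumed equally likely a priori to be effective or ineffective, and a positive trial outcome x is assumed to have probability 4/5 under the first hypothesis and 3/10 under the second. `bayes_posterior` is a hypothetical helper name.

```python
from fractions import Fraction

def bayes_posterior(prior, likelihood):
    """Return (P(h | x) for every hypothesis h, P(x)) from P(h) and P(x | h)."""
    joint = {h: prior[h] * likelihood[h] for h in prior}   # likelihood times prior
    evidence = sum(joint.values())                         # P(x), the normalizing constant
    posterior = {h: joint[h] / evidence for h in joint}
    return posterior, evidence

# Assumed (made-up) prior and likelihood values.
prior = {"effective": Fraction(1, 2), "ineffective": Fraction(1, 2)}
likelihood = {"effective": Fraction(4, 5), "ineffective": Fraction(3, 10)}

posterior, evidence = bayes_posterior(prior, likelihood)
print(evidence)                 # 11/20
print(posterior["effective"])   # 8/11
```

Exact fractions are used so the arithmetic can be followed by hand: the evidence is 2/5 + 3/20 = 11/20, and the posterior for "effective" is (2/5) / (11/20) = 8/11.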
Statistical inference



Sample mean to estimate (what?)
Linear regression – “estimating” the slope
Whether a certain drug/treatment is effective.
What is the true signal?
Who will win the election?
Who will win the World Cup (statistical prediction)?
Class probabilities: .25 .25 .25 .25
Generate observation according to class model
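This two-step generative process (pick a class, then generate an observation from that class's model) can be sketched in Python as follows. The four equal class probabilities come from the text. The one-dimensional Normal class models, with their means and standard deviations, are assumptions chosen for illustration, as the slides describe multivariate Normal class distributions. `generate_observation` is a hypothetical helper name.

```python
import random

def generate_observation(weights, means, stddevs, rng):
    """Pick a class from the class probabilities, then draw from that class's Normal model."""
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    x = rng.gauss(means[k], stddevs[k])
    return k, x

rng = random.Random(1)
weights = [0.25, 0.25, 0.25, 0.25]   # class probabilities
means = [-6.0, -2.0, 2.0, 6.0]       # assumed class means (illustration only)
stddevs = [0.5, 0.5, 0.5, 0.5]       # assumed class standard deviations

samples = [generate_observation(weights, means, stddevs, rng) for _ in range(2000)]
class_counts = [sum(1 for k, _ in samples if k == c) for c in range(len(weights))]
print(class_counts)
```

In the inference problem only the observations `x` are seen. The class labels `k` are the hidden quantities to be recovered.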
Bayesian Modeling

Estimate a posterior distribution over models
Provides a principled way to encode prior beliefs about the form of the solution
Posterior distribution represented by samples
Will enable us to estimate how many hidden classes there are
(Figure: customers numbered 1 to 11 seated at tables.)
Infinitely many tables
First customer sits at the first table. Remaining customers seat themselves randomly.
Exchangeable distribution (Aldous, 1985; Pitman, 2002)
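The seating scheme can be simulated with a short Python sketch. It uses the standard Chinese Restaurant Process rule, which the text above summarizes only as "randomly": a customer joins an occupied table with probability proportional to the number of people already there, or opens a new table with probability proportional to a concentration parameter. The parameter name `alpha`, its value, and the function name are assumptions made for this illustration.

```python
import random

def chinese_restaurant_process(n_customers, alpha, rng):
    """Seat customers one at a time; return table indices (0-based) and table sizes."""
    table_sizes = []   # table_sizes[k] = number of customers at table k
    seating = []       # seating[i] = table chosen by customer i
    for _ in range(n_customers):
        # Occupied table k has weight table_sizes[k]; a new table has weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if k == len(table_sizes):
            table_sizes.append(1)    # open a new table
        else:
            table_sizes[k] += 1      # join an occupied table
        seating.append(k)
    return seating, table_sizes

rng = random.Random(42)
seating, table_sizes = chinese_restaurant_process(n_customers=11, alpha=1.0, rng=rng)
print(seating)
print(table_sizes)
```

The first customer always opens the first table, because the only available weight is `alpha`. Although infinitely many tables are available, a finite number of customers can only ever occupy finitely many of them.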

Statistical Inference

Facts are the data
Premise carried by a probability model
Conclusions on unknowns
Example of inferences

Daily life: too many. Name one or two yourself!
Bayesian method
Describe quantities of scientific interest by an appropriate (joint) probability distribution.
Let the law of probability work its way out.
/2009/08/06/technology/06stats.html?_r=1&emc=eta1
Why do we need statisticians?

Traditional Line and New Challenges

“Typical” statistical problems more demanding than before.
New technologies generate new data and new opportunities (e.g., engineering/computer problems; bioinformatics; data mining)
Statistics: an all-encompassing field

Statistics in scientific fields


Biology – genetics and molecular biology (bioinformatics)
Medical research – epidemiology, clinical trials etc.
Chemistry and physics – molecular structures
Astrophysics – analyzing stars and galaxies
Social sciences
Psychology
Computer science
Econometrics
Hedge fund strategies: data mining