
Video Big Data: Technology Trends and Practice


Surprising results are happening
The intersection of deep and reinforcement learning continues
If you aren't using batch normalization you should
Source: Brad Neuberg, 10 Deep Learning Trends at NIPS 2015, 2015.12
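The batch normalization trend listed above can be illustrated with a minimal sketch. The slides give no implementation, so the function name `batch_norm`, the parameters `gamma`, `beta`, and `eps`, and the sample activations are all assumptions chosen for illustration. The sketch normalizes one feature over a mini-batch to roughly zero mean and unit variance, then applies a scale and shift.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature's activations over a mini-batch (illustrative only)."""
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance over the mini-batch.
    var = sum((x - mean) ** 2 for x in batch) / n
    # Standardize, then apply the scale (gamma) and shift (beta).
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical activations for a single feature across a batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
print(normalized)
```

In a real network `gamma` and `beta` would be learned per feature, and running statistics would replace the batch statistics at inference time; both are omitted here for brevity.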
Source: Bidirectional Long-Short Term Memory for Video Description, 2016.06
Deep learning technology trends
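Basic network data
Beyond the program content itself, online video data also carries personal identity information, location information, contact lists, and application preferences, and possibly account information.
Physical space: the real world
Information space: the network world
Source: Wu Hequan, Development and Application of Video Big Data, March 2016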
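Routine processing of video data
Image recognition, image captioning, reference frames, image localization, image generation, speech-to-text, subtitle separation, OCR, feature transformation, spatio-temporal information, file information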
Models compared: S2VT-unidirectional, S2VT-bidirectional, S2VT-BiLSTM reinforced, Joint-LSTM unidirectional, Joint-LSTM bidirectional, Joint-BiLSTM reinforced
Example captions (Uni = unidirectional, Bi = bidirectional, Re = reinforced):
Example 1
  Uni: a person is cutting a potato.
  Bi: a person is slicing a potato.
  Re: a person is peeling a potato.
Example 2
  Uni: a man is playing.
  Bi: a man is playing a piano.
  Re: a man is playing the piano.
Architectures more complex and sophisticated
All the cool kids are using LSTMs
Attention models are showing up
Neural Turing Machines remain interesting
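[Figure: bidirectional LSTM video description architecture. Labels: CNN (VGG); Forward Pass with Forward Units (FU); Backward Pass with Backward Units (BU); Merge Units (MU); Sentence Units (SU); Visual Model; Language Model. Example output sequence: <BOS> a man is riding …]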
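Model                        METEOR (%)
S2VT-unidirectional          28.7
S2VT-bidirectional           28.6
S2VT-BiLSTM reinforced       29.5
Joint-LSTM unidirectional    29.5
Joint-LSTM bidirectional     29.9
Joint-BiLSTM reinforced      30.3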
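Further example captions (Uni = unidirectional, Bi = bidirectional, Re = reinforced):
Riding example
  Uni: a man is riding a bike.
  Bi: a man is riding on a motorcycle.
  Re: a man is riding a motorcycle.
Dancing example
  Uni: a man is playing on a stage.
  Bi: a group of a man is dancing.
  Re: a group of people are dancing.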
Deep learning for video analysis
Source: Translating Videos to Natural Language Using Deep Recurrent Neural Networks, 2015.04
Legend: FU = Forward Unit, MU = Merge Unit, SU = Sentence Unit
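The forward pass, backward pass, and merge step named in the figure labels can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the method from the cited paper: simple tanh recurrent units stand in for LSTM cells, each frame is reduced to one hypothetical scalar feature, the weights `w_in` and `w_rec` are arbitrary, and the merge is plain per-frame pairing of the two states.

```python
import math

def run_pass(frames, w_in, w_rec):
    """Run a simple tanh recurrent unit over the frames; return one state per frame."""
    state, states = 0.0, []
    for x in frames:
        state = math.tanh(w_in * x + w_rec * state)
        states.append(state)
    return states

def bidirectional_merge(frames, w_in=0.5, w_rec=0.3):
    """Pair forward and backward states per frame (assumed merge rule)."""
    # Forward units: process frames from first to last.
    forward = run_pass(frames, w_in, w_rec)
    # Backward units: process frames from last to first, then realign to frame order.
    backward = run_pass(frames[::-1], w_in, w_rec)[::-1]
    # Merge units: here simply a (forward, backward) pair for each frame.
    return list(zip(forward, backward))

# Hypothetical scalar features, one per video frame.
frame_features = [0.2, -0.1, 0.7, 0.4]
merged = bidirectional_merge(frame_features)
for i, (f, b) in enumerate(merged):
    print(f"frame {i}: forward={f:.4f} backward={b:.4f}")
```

Each merged entry sees context from both earlier and later frames, which is the motivation for a bidirectional visual model; a sentence-generating language model would then consume these merged states.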
Contents
01 Overview of technology development
02 General processing framework
03 Practice in detail
04 Live Q&A
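Classification of video resources in the ternary space
Ternary data
Network level
  Self-media data: social networks, blogs, microblogs
  Log data: search engines, online shopping, financial payments
  Rich media data: text, audio and video, pictures, photos
  Basic network data: signaling, positioning, billing
Physical level: sensors, barcodes, QR codes
Social level: government, enterprises and public institutions, print media
Social space: human cognition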
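Title, keywords, user name, IP address
Video content mining
Video association analysis
Capture date, video duration, running time, surveillance camera ID
Source: Microsoft, HAL, NBA, Karen's, etc.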
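The metadata fields listed above (title, keywords, user name, IP address, capture date, duration, camera ID) could be held in a simple record for association analysis. The sketch below is only an assumption about how such a record might look: the class `VideoRecord`, the function `related`, the rule "shared keyword or same camera", and all sample values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class VideoRecord:
    """Hypothetical container for a subset of the metadata fields on the slide."""
    title: str
    keywords: list
    user_name: str
    ip_address: str
    capture_date: date
    duration_s: float
    camera_id: str

def related(a, b):
    """Assumed association rule: two videos are related if they share a keyword or a camera."""
    shared_keywords = set(a.keywords) & set(b.keywords)
    return bool(shared_keywords) or a.camera_id == b.camera_id

# Sample records with made-up values (192.0.2.0/24 is a documentation-only IP range).
v1 = VideoRecord("Gate A morning", ["gate", "traffic"], "ops01", "192.0.2.10",
                 date(2016, 3, 1), 120.0, "CAM-001")
v2 = VideoRecord("Gate A evening", ["gate", "crowd"], "ops02", "192.0.2.11",
                 date(2016, 3, 1), 95.5, "CAM-002")
v3 = VideoRecord("Parking lot", ["parking"], "ops01", "192.0.2.10",
                 date(2016, 3, 2), 300.0, "CAM-003")

print(related(v1, v2))  # share the keyword "gate"
print(related(v1, v3))  # no shared keyword, different cameras
```

Real association analysis would likely combine such metadata with features mined from the video content itself; this sketch covers only the metadata side.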
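Deep learning for image captioning
Machine learning algorithms developed by engineers at Google and Microsoft can automatically generate image captions that accurately describe the content of an image.
Source: Google, 2014; Microsoft, 2015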
Table 2: Comparing with several state-of-the-art models (reported in percentage, higher is better).
Model                                   METEOR
LSTM                                    26.9
S2VT [23]
  -RGB (VGG)                            29.2
  -RGB (VGG)+Flow (AlexNet)             29.8
LSTM-E (VGG) [15]                       29.5
LSTM-E (C3D) [15]                       29.9
Yao et al. [32]                         29.6
Joint-LSTM unidirectional (ours)        29.5
Joint-BiLSTM reinforced (ours)          30.3
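As a quick check on the Table 2 figures, the sketch below ranks the reported METEOR scores and computes the gap between the best model and the plain LSTM baseline. The scores are copied from the table; the variable names and the choice of LSTM as the baseline for the gap are assumptions made for illustration.

```python
# METEOR scores (%) as reported in Table 2 of the slide.
meteor = {
    "LSTM": 26.9,
    "S2VT [23], RGB (VGG)": 29.2,
    "S2VT [23], RGB (VGG)+Flow (AlexNet)": 29.8,
    "LSTM-E (VGG) [15]": 29.5,
    "LSTM-E (C3D) [15]": 29.9,
    "Yao et al. [32]": 29.6,
    "Joint-LSTM unidirectional (ours)": 29.5,
    "Joint-BiLSTM reinforced (ours)": 30.3,
}

# Highest-scoring model and its absolute gain over the plain LSTM baseline.
best = max(meteor, key=meteor.get)
gain = round(meteor[best] - meteor["LSTM"], 1)

# Print the models from highest to lowest METEOR.
for name, score in sorted(meteor.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:40s} {score:5.1f}")
print(f"Best: {best} (+{gain} points over LSTM)")
```

The gain is an absolute difference in METEOR percentage points, not a relative improvement.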
Deep learning for CV and NLP are cross-hybridizing each other
Neural network research and productionisation go hand in hand
Symbolic differentiation is becoming even more important