当前位置:文档之家› 大数据经济学

大数据经济学


什么是大数据
Data come with less structure. The information available about a consumer may include her entire shopping history. With this information, it is possible to
火爆,社会对“数据科学家”的需求异常高涨,他们的薪
酬也随之水涨船高。 但“数据科学家”的供给却非常有限,为此,一些北美 高校开始设立数据科学本科和硕士学位项目。
引 言
大数据如此受欢迎,还应归功于奥巴马的总统竞选;他 们通过数据科学家对大量数据的分析,获取了募捐和广告 方面的优势
数据科学家成功的预测了奥巴马竞选连任。
什么是大数据
This is a lot of data. What’s exactly new about it? Data is now available faster, has greater coverage and scope and includes new types of observations and
interesting challenge for an econometric research.
大数据的使用:预测
什么是大数据
K is a lot smaller than N. When data arrive in its raw form of digital recording of a sequence of events, with no further structure, there are a huge
什么是大数据
Data is available in large scale A major change for economists is the scale of the modern datasets. Before, we worked with data with hundreds or
thousands observations and few variables. With small samples,
statistical power was an important issue; the omitted variable bias was also a concern. Now datasets with tens of millions of observations and huge number of variables are common. Statistical power is no longer an issue.
was split by products or product categories. Nowadays,
scanner data makes it possible to track individual purchases and item sales, capture the exact time at which they occur and the purchase histories of individuals, and use electronic inventory data to link purchases to spec大数据
Example 1. Internet retailers observe not just this information, but can trace individual’s behavior around the sale, including his or her initial search queries, items
在实证分析中,我们深知“其他因素”都在变,但因缺 乏数据,不得不忽略他们。 以前,经济活动的数据记录下来的很少,有的大都是总 体数据。今天,计算机技术和英特网改变了录在案。当你在淘宝上逛街的时候,每一项游览活动和每 一项购买都记录在案。当你在网上阅读、看录像、聊天或
create an almost unlimited set of individual characteristics.
While this is very powerful, it is also challenging. We are familiar with “rectangular” form of data with N observations and K variables。
者查看你的个人金融状况是,你的行为都记录在案。
短信、微信、推特、手机、超市的摄像机和取付款机、 银行的提款机,道路口和各种场合的摄像镜头等各种电子 通讯设备都留下了数据的脚印。 大数据是通过各种手段记录下来的数据;它有可能是实
什么是大数据
例一,Consider the data collected by retail stores. Few decades ago, stores might have collected data on daily sales, and it would have been considered high quality if the data
spending or credit history
什么是大数据
Example 2. There has been a parallel evolution in business activity. As firms have moved their day to day operations to computers then online, it has become
possible to compile rich datasets of sales contacts, hiring
practices, and physical shipments of goods. Increasingly, there are also electronic records of collaborative work efforts, personnel evaluations and productivity measures. Same story can be told about the public sector
大数据背景下的经济分析
艾春荣 (中国人民大学、美国佛罗里达大学)
目录
引言 什么是大数据 大数据的使用 大数据下的经济和政策分析 大数据的挑战 数据科学家应具备的条件
引 言
自从美国奥巴马总统将大数据列为美国科技发展战略以 来,大数据受到社会各界和媒体的高度关注 几年之前人们没有听说的“数据科学家”突然变得异常
viewed and discarded, recommendations and promotions
that were shown and subsequent product or seller review. In principle, these data could be linked to demographics, advertising exposure, social media activity, offline
location data records where people have been. Social network
data captures personal connections. Most economists believe that social connections play an important role in job search, in shaping consumer preferences and in the transmission of information.
微软的数据科学家成功地预测世界杯比赛结果,击败了
所有其他的预测,包括IBM 的数据科学家。
数据科学家在中国也有巨大需求。
什么是大数据
经济理论一般高度简化,假设“其他因素不变”。在实 际中,“其他因素”是变化的。 如果“其他因素”变化 聊,理论结果还有效吗?这就是所谓的比较分析,经济理
论很少做,也很难做。
什么是大数据
Data is available on novel types of variables. Much of the data now being recorded is on activities that previously were very difficult to observe. Email or geo-
measurements that previously were not available.
A key aspect of such modern datasets is that they have
much less or more structure than the traditional datasets
as in panel data or linked by time. But individuals in a social
network may be connected in highly complex ways. Indeed the point of econometric modeling may be to uncover exactly what are the key features of this dependence structure. Developing methods that are suited to these settings is an
什么是大数据
Data is available in real time The ability to capture and process large amount of data in real time is crucial for many business applications, but has not
相关主题