PASS Summit 2011 Day Three Keynote
Speaker: David J. DeWitt, Technical Fellow, Data and Storage Platform Division
Date: Friday, October 14 - Keynote at 8:15am
Big Data – What is the Big Deal?
Are you ready for the exploding world of big data? Do you know the difference between Hive and Pig? Do you know why MapReduce is being taught in many universities rather than SQL? If not, pay attention because this talk will help get you started in understanding this new world. While sometimes the Hadoop toolkit (which includes HDFS, MapReduce, Hive, Pig, and Sqoop) is used as an alternative to relational database systems such as SQL Server, more frequently customers are using it as a complementary tool. Sometimes it may be used as an ETL tool or to perform an initial analysis of a freshly acquired data set to determine whether or not it is worth loading into the data warehouse, and sometimes to process massive data sets that are too big to even contemplate loading into all but the very largest data warehouses. In addition to covering the basics of the various parts of the Hadoop stack, this talk will discuss the strengths and weakness of the Hadoop approach compared to that provided by relational database systems and explores how the two technologies can be used productively in conjunction with one another.