Introduction to Big Data

Thursday, July 2, 2020 - 9:00am to 12:00pm

VPGE is partnering with University IT to bring virtual, professional instructor-led technology trainings to graduate students and postdoctoral scholars. Take advantage of these free virtual classes to learn and practice applying these technologies to your research, career planning, summer project or internship - or just to your life!  Add these trainings to your resume or CV to help demonstrate your transferable skills.

This class introduces the history and background of Big Data. Get an introduction to working with Big Data ecosystem technologies, including HDFS, MapReduce, Hive, Pig, machine learning, and more.

After this course, you will be able to:

  • Understand the history and background of Big Data and Hadoop
  • Describe the Big Data landscape including examples of real-world big data problems
  • Explain the 5 V’s of Big Data (volume, velocity, variety, veracity, and value)
  • Understand the foundational principles that have made Big Data so successful.
  • Explain ecosystem components such as HDFS, MapReduce, Sqoop, Flume, Hive, Pig, Mahout (Machine Learning), R Connector, Ambari, ZooKeeper, Oozie, and NoSQL stores such as HBase.
  • Understand the industry offerings for Big Data in the cloud and on premises, such as Cloudera, Hortonworks, MapR, Amazon EMR, and Microsoft Azure HDInsight.
  • Understand the impact and value of Apache Spark in the Big Data Ecosystem.
  • Understand the Apache Spark architecture and the libraries that support use cases such as streaming, machine and deep learning, and graph processing (GraphX).

In advance of each session, UIT Tech Training will provide you with a Zoom link to your class, along with any required class materials.

