Course Description |
:
This course is an introduction to the concepts of "Big Data" and "data analysis". It provides an
introduction to one of the most common frameworks, Hadoop, that has made big data analysis
easier and more accessible. By the end of this course, students are expected to do the following. First, describe the
Big Data landscape, including examples of real-world big data problems and the three key
sources of Big Data: people, organizations, and sensors. Second, explain the V’s of Big Data
(volume, velocity, variety, veracity, valence, and value) and why each impacts data collection,
monitoring, storage, analysis, and reporting. Third, gain practical experience with some
commonly used tools and techniques for (big) data processing. Fourth, know the basics of
distributed file systems, databases, and computing. Fifth, gain practical data processing
skills with the MapReduce framework and Apache Hadoop. |