Section outline

    • Course Description

Big Data involves storing, processing, analyzing, and making sense of huge volumes of data arriving in many formats and from many sources. This course is designed to teach students the fundamentals of Big Data and Big Data analysis tools, and how these tools can be used to extract value from Big Data.

      Course Objectives

      • Understand fundamental principles of Big Data.

      • Understand the key components of the computing environment for Big Data including hardware, software, distributed systems, and analytical tools.

      • Understand and discuss the application of data mining methodologies, algorithms, and enabling technologies on Big Data to deliver extraordinary results and value.

      Learning Outcomes

      Upon successful completion of this course, students should be able to:

      • Describe Big Data and its characteristics.

• Demonstrate the ability to work with Big Data using the main Big Data tools (Hadoop and Spark).

• Describe how Big Data can be resourcefully used in a corporate environment.

• Effectively apply predictive analytics to Big Data.

• Design and implement a prototypical Big Data analytics solution to address a decision-making situation facing an organization of their choice.

    • Opened: Wednesday, 4 March 2026, 12:00 AM
      Due: Sunday, 8 March 2026, 11:59 PM

You are required to watch the video in the link below and prepare a 5-slide PowerPoint presentation with an embedded live video of you presenting it, not exceeding 5 minutes.

Your task is to focus on the key topics we have covered so far and relate them to the linked video.

      Submission is Friday, 6th March 2026 at 23:59hrs

    • Opened: Wednesday, 25 March 2026, 12:00 AM
      Due: Thursday, 2 April 2026, 3:00 PM

You are required to research and summarize the major Big Data Ecosystems, Technologies and Platforms.

In addition, include a comparative study of at least 5 major Big Data Ecosystems, Technologies and Platforms.

Submission is 28th March 2026 at 23:59hrs

    • Opened: Wednesday, 25 March 2026, 12:00 AM
      Due: Thursday, 2 April 2026, 11:59 PM

It is assumed you have a PC or laptop (running Windows or macOS) with a minimum of 8 GB of RAM, 512 GB or more of hard disk storage, and an Intel Core i5 processor or a comparable processor from another manufacturer.

The GOAL is to prepare a Linux (Ubuntu) environment for Hadoop installation.

a) Install Oracle VirtualBox on your Windows/macOS machine.

b) Then install the latest Ubuntu OS inside Oracle VirtualBox. The user account on your Ubuntu OS MUST be your given name.

c) Prepare the environment and confirm that it is ready for Hadoop installation.
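One possible way to carry out this readiness check is sketched below, assuming Ubuntu's standard apt repositories and OpenJDK 11 with OpenSSH (both common Hadoop prerequisites); adapt package names and versions to your setup.

```shell
# Hadoop needs Java and passwordless SSH to localhost.
# Install the prerequisites (package names assume Ubuntu/apt).
sudo apt update
sudo apt install -y openjdk-11-jdk openssh-server

# Confirm Java is visible to the shell.
java -version

# Set up passwordless SSH to localhost for the Hadoop daemons.
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# This should log in and exit without prompting for a password.
ssh localhost exit
```

If the final `ssh localhost exit` succeeds without a password prompt and `java -version` prints a version string, the environment is ready for a single-node Hadoop installation.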

d) Install Hadoop on your Ubuntu system under your user account. Once done, verify that your Hadoop installation is working properly and healthy.
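A minimal health check for a single-node setup might look like the following, assuming Hadoop has been unpacked and JAVA_HOME/HADOOP_HOME are already configured (paths and configuration are left to your own installation):

```shell
# Format the NameNode once, before the very first start.
hdfs namenode -format

# Start the HDFS and YARN daemons.
start-dfs.sh
start-yarn.sh

# jps should list NameNode, DataNode, SecondaryNameNode,
# ResourceManager and NodeManager if everything came up.
jps

# A cluster health summary: live DataNodes, capacity, block counts.
hdfs dfsadmin -report
```

Seeing all five daemons in `jps` output and a live DataNode in the `dfsadmin -report` summary is a reasonable indication that the installation is healthy.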

e) If healthy, find the "NYC Taxi (Small Subset) ~366 MB (compressed)" dataset, load it into your Hadoop environment, and identify appropriate tools to extract insights of your choice from it while it is stored in the Hadoop environment.
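Loading the dataset into HDFS could be sketched as follows; the local filename `nyc_taxi_small.csv.gz` is an assumption — substitute whatever name your downloaded copy has.

```shell
# Create a home for the dataset in HDFS and upload the file.
hdfs dfs -mkdir -p /user/$(whoami)/nyc_taxi
hdfs dfs -put nyc_taxi_small.csv.gz /user/$(whoami)/nyc_taxi/

# Confirm the file landed in HDFS and check its size.
hdfs dfs -ls -h /user/$(whoami)/nyc_taxi

# A first simple "insight": count the records without copying the
# file back to local disk (streams it out of HDFS and decompresses).
hdfs dfs -cat /user/$(whoami)/nyc_taxi/nyc_taxi_small.csv.gz | gunzip | wc -l
```

From here, richer insights can be pulled with tools that read directly from HDFS, such as Hive, Pig, or Spark.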

      Note:

• You MUST document all the processes you went through at each stage, including screenshots with explanations.
• In your documentation, include the hurdles or challenges you experienced during the installations and how you overcame them.

Let's get our hands a bit dirty today.

      Submission is 30th March 2026 at 23:59hrs