Data Engineering

This course focuses on designing and developing tools and applications that use big data to drive and improve business processes.

The course introduces data engineering tools and concepts for building applications that store, process, and organize large amounts of information in a meaningful way. Topics include Extract, Transform, Load (ETL) processes, data-intensive applications, and design considerations for building scalable, reliable, and maintainable big data software.
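As a rough illustration of an Extract, Transform, Load process, the sketch below uses only the Python standard library. The sample data, the `orders` table, and the function names are hypothetical and chosen for illustration; a production pipeline would typically read from files, APIs, or cloud storage and load into a managed database.

```python
import csv
import io
import sqlite3

# Hypothetical raw input (assumption); one row has a malformed amount.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,not_a_number
3,alice,4.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with invalid amounts and normalize types."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip malformed records
        clean.append((int(row["order_id"]), row["customer"].title(), amount))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order."""
    load(transform(extract(text)), conn)


# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
run_etl(RAW_CSV, conn)
total = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
).fetchall()
```

Keeping each stage in its own function makes it possible to test and replace the stages independently, which is one common way to keep such a pipeline maintainable.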

The course covers tools essential to data engineering, including Python, relational and non-relational databases, distributed processing with Hadoop and Spark, and cloud services such as AWS. Learners design and develop a cloud-based application that automates data analysis.

The course also covers technologies essential to data science development, including Python, Amazon SageMaker, Amazon S3, AWS Glue, and Linux, and outlines how to use them to design and develop software that analyzes data quickly, simply, and effectively.

By the end of this course, participants will be able to:

  • Describe a variety of activities related to data science.
  • Design and develop software that extracts, processes, and analyzes data.
  • Design and develop data-intensive solutions to business problems.
  • Design and develop non-relational databases to support data science activities.
  • Use scripting languages such as Python to merge, process, and analyze data from multiple sources.
  • Leverage AWS and Linux to implement IT infrastructure to support data science activities.
  • Design and develop an end-to-end software solution that predicts the outcome of an event using data science and machine learning.
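To give a sense of what merging and analyzing data from multiple sources can look like in Python, here is a minimal sketch using only the standard library. The two inputs (a CSV of customers and a JSON list of orders), their field names, and the `merge_sources` function are assumptions made for illustration, not course material.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical source 1: customer records in CSV form.
CUSTOMERS_CSV = """customer_id,name
1,Alice
2,Bob
"""

# Hypothetical source 2: order records in JSON form.
ORDERS_JSON = (
    '[{"customer_id": 1, "amount": 20.0},'
    ' {"customer_id": 1, "amount": 5.0},'
    ' {"customer_id": 2, "amount": 7.5}]'
)


def merge_sources(customers_csv, orders_json):
    """Join orders to customers on customer_id and total the amounts."""
    # Build a lookup table from customer id to name.
    names = {
        int(row["customer_id"]): row["name"]
        for row in csv.DictReader(io.StringIO(customers_csv))
    }
    # Sum order amounts per customer id.
    totals = defaultdict(float)
    for order in json.loads(orders_json):
        totals[order["customer_id"]] += order["amount"]
    # Keep only orders whose customer id appears in the CSV source.
    return {names[cid]: total for cid, total in totals.items() if cid in names}


summary = merge_sources(CUSTOMERS_CSV, ORDERS_JSON)
```

With larger datasets, the same join and aggregation would more likely be done with a data-frame library or a distributed engine such as Spark.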