Overview
Kickstart your career in the high-growth field of data architecture. In this program, you’ll learn in-demand skills like data modeling, database design, and enterprise data management to build scalable and secure data systems.
Data architects design, implement, and manage data systems, support analytics, ensure compliance, and drive modernization.
You’ll gain a strong foundation in data engineering, SQL, and relational databases (RDBMS), while building essential technical skills such as Linux commands, shell scripting, and database administration (DBA), along with hands-on experience that prepares you to work with modern data infrastructure.
You’ll also explore data warehousing, NoSQL databases, and ETL workflows with tools such as Airflow and Kafka, and work with big data processing using Spark and Hadoop. The program covers data integration, governance, security, privacy, and compliance, so you’re ready to tackle complex data challenges.
When you complete this full program, you’ll have a portfolio of projects and a Professional Certificate from IBM to showcase your expertise. You’ll also earn an IBM digital badge and will gain access to career resources to help you in your job search, including mock interviews and resume support.
Syllabus
- Course 1: Introduction to Data Engineering
- Course 2: Introduction to Relational Databases (RDBMS)
- Course 3: SQL: A Practical Introduction for Querying Databases
- Course 4: Hands-on Introduction to Linux Commands and Shell Scripting
- Course 5: Relational Database Administration (DBA)
- Course 6: Data Warehouse Fundamentals
- Course 7: Introduction to NoSQL Databases
- Course 8: ETL and Data Pipelines with Shell, Airflow and Kafka
- Course 9: Introduction to Big Data with Spark and Hadoop
- Course 10: Data Integration, Data Storage, & Data Migration
- Course 11: Data Privacy, Security, Governance, Risk and Compliance
- Course 12: Enterprise Data Architecture and Operations
- Course 13: Data Architect Capstone Project
Courses
- Introduction to Data Engineering
Start your journey in one of the fastest growing professions today with this beginner-friendly Data Engineering course! You will be introduced to the core concepts, processes, and tools you need to gain a foundational knowledge of data engineering. You will begin this course by understanding what data engineering is, as well as the roles that Data Engineers, Data Scientists, and Data Analysts play in this exciting field. Next you will learn about the data engineering ecosystem, the different types of data structures, file formats, sources of data, and the languages data professionals use in their day-to-day tasks. You will become familiar with the components of a data platform and gain an understanding of several different types of data repositories such as Relational (RDBMS) and NoSQL databases, Data Warehouses, Data Marts, Data Lakes and Data Lakehouses. You’ll then learn about Big Data processing tools like Apache Hadoop and Spark. You will also become familiar with ETL, ELT, Data Pipelines and Data Integration. This course provides you with an understanding of a typical Data Engineering lifecycle which includes architecting data platforms, designing data stores, and gathering, importing, wrangling, querying, and analyzing data. You will also learn about security, governance, and compliance. You will learn about career opportunities in the field of Data Engineering and the different paths that you can take to build your skills as a Data Engineer. You will hear from several experienced Data Engineers, sharing their insights and advice. By the end of this course, you will also have completed several hands-on labs and worked with a relational database, loaded data into the database, and performed some basic querying operations.
- Introduction to Relational Databases (RDBMS)
Are you ready to dive into the world of data engineering? In this beginner-level course, you will gain a solid understanding of how data is stored, processed, and accessed in relational databases (RDBMSes). You will work with different types of databases that are appropriate for various data processing requirements. You will begin this course by being introduced to relational database concepts, as well as several industry standard relational databases, including IBM Db2, MySQL, and PostgreSQL. Next, you’ll use RDBMS tools used by professionals, such as phpMyAdmin and pgAdmin, for creating and maintaining relational databases. You will also use the command line and SQL statements to create and manage tables. This course incorporates hands-on, practical exercises to help you demonstrate your learning. You will work with real databases and explore real-world datasets. You will create database instances and populate them with tables and data. At the end of this course, you will complete a final assignment where you will apply your accumulated knowledge from this course and demonstrate that you have the skills to: design a database for a specific analytics requirement, normalize tables, create tables and views in the database, load and access data. No prior knowledge of databases or programming is required. Anyone can audit this course at no charge. If you choose to take this course and earn the Coursera course certificate, you can also earn an IBM digital badge upon successful completion of the course.
- Introduction to NoSQL Databases
Get started with NoSQL Databases with this beginner-friendly introductory course! This course will provide technical, hands-on knowledge of NoSQL databases and Database-as-a-Service (DaaS) offerings. With the advent of Big Data and agile development methodologies, NoSQL databases have gained a lot of relevance in the database landscape. Their main advantage is the ability to handle scalability and flexibility issues modern applications raise. You will start this course by learning the history and the basics of NoSQL databases (document, key-value, column, and graph) and discover their key characteristics and benefits. You will learn about the four categories of NoSQL databases and how they differ. You’ll also explore the differences between the ACID and BASE consistency models, the pros and cons of distributed systems, and when to use RDBMS and NoSQL. You will also learn about vector databases, an emerging class of databases popular in AI. Next, you will explore the architecture and features of several implementations of NoSQL databases, namely MongoDB, Cassandra, and IBM Cloudant. You will learn about the common tasks that they each perform and their key and defining characteristics. You will then get hands-on experience using those NoSQL databases to perform standard database management tasks, such as creating and replicating databases, loading and querying data, modifying database permissions, indexing and aggregating data, and sharding (or partitioning) data. At the end of this course, you will complete a final project where you will apply all your knowledge of the course content to a specific scenario and work with several NoSQL databases. This course suits anyone wanting to expand their Data Management and Information Technology skill set.
- ETL and Data Pipelines with Shell, Airflow and Kafka
Delve into the two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The other contrasting approach is the Extract, Load, and Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting/calling application. In this course, you will learn about the different tools and techniques that are used with ETL and data pipelines. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both. You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and for loading data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline some of the multiple methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure. By the end of this course, you will also know how to use Apache Airflow to build data pipelines and understand the advantages of this approach. You will also learn how to use Apache Kafka to build streaming pipelines, as well as the core components of Kafka, which include brokers, topics, partitions, replications, producers, and consumers. Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module.
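As a rough illustration of the ETL versus ELT distinction described above, the following minimal Python sketch uses plain lists as stand-ins for a warehouse and a data lake. The sample records, the `transform` function, and the `etl` and `elt` helpers are all hypothetical names chosen for this example, not course material.

```python
# Hypothetical raw records, as they might arrive from a source system.
RAW = [
    {"city": " boston ", "temp_f": "68"},
    {"city": "AUSTIN", "temp_f": "95"},
]


def transform(record):
    """Clean one raw record: tidy the city name and convert Fahrenheit to Celsius."""
    temp_c = (float(record["temp_f"]) - 32) * 5 / 9
    return {"city": record["city"].strip().title(), "temp_c": round(temp_c, 1)}


def etl(source):
    """ETL: transform first, then load only the cleaned rows into the 'warehouse'."""
    warehouse = [transform(r) for r in source]
    return warehouse


def elt(source):
    """ELT: load raw rows into the 'lake' as-is; transform on demand at query time."""
    lake = list(source)

    def query():
        return [transform(r) for r in lake]

    return lake, query


if __name__ == "__main__":
    print(etl(RAW))
    lake, query = elt(RAW)
    print(lake)     # still raw
    print(query())  # transformed only when requested
```

In this sketch both paths yield the same cleaned rows. The difference is when the transformation runs and whether the raw data is retained in the destination.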
- Relational Database Administration (DBA)
Get started with Relational Database Administration and Database Management in this self-paced course! This course begins with an introduction to database management; you will learn about the Database Management Lifecycle, the roles of a Database Administrator (DBA), and database storage. You will then discover some of the activities, techniques, and best practices for managing a database. You will also learn about database optimization, including updating statistics, slow queries, types of indexes, and index creation and usage. You will learn about configuring and upgrading database server software and related products. You’ll also learn about database security: how to implement user authentication, assign roles, and assign object-level permissions. You will also gain an understanding of how to perform backup and restore procedures in case of system failures. You will learn how to optimize databases for performance, monitor databases, collect diagnostic data, and access error information to help you resolve issues that may occur. Many of these tasks are repetitive, so you will learn how to schedule maintenance activities and regular diagnostic tests and send automated messages about the success or failure of a task. The course includes both video-based lectures and hands-on labs to practice and apply what you learn. This course ends with a final project where you will assume the role of a database administrator and complete a number of database administration tasks across many different databases.
- Introduction to Big Data with Spark and Hadoop
This self-paced IBM course will teach you all about big data! You will become familiar with the characteristics of big data and its application in big data analytics. You will also gain hands-on experience with big data processing tools like Apache Hadoop and Apache Spark. Bernard Marr defines big data as the digital trace that we are generating in this digital era. You will start the course by understanding what big data is and exploring how insights from big data can be harnessed for a variety of use cases. You’ll also explore how big data uses technologies like parallel processing, scaling, and data parallelism. Next, you will learn about Hadoop, an open-source framework for the distributed processing of large data sets, and its ecosystem. You will discover important applications that go hand in hand with Hadoop, like the Hadoop Distributed File System (HDFS), MapReduce, and HBase. You will become familiar with Hive, data warehouse software that provides an SQL-like interface to efficiently query and manipulate large data sets. You’ll then gain insights into Apache Spark, an open-source processing engine that provides users with new ways to store and use big data. In this course, you will discover how to leverage Spark to deliver reliable insights. The course provides an overview of the platform, going into the components that make up Apache Spark. You’ll learn about DataFrames, perform basic DataFrame operations, and work with SparkSQL. You’ll also explore how Spark processes and monitors the requests your application submits and how you can track work using the Spark Application UI. This course has several hands-on labs to help you apply and practice the concepts you learn. You will complete Hadoop and Spark labs using various tools and technologies, including Docker, Kubernetes, Python, and Jupyter Notebooks.
- Hands-on Introduction to Linux Commands and Shell Scripting
This course provides a practical understanding of common Linux / UNIX shell commands. In this beginner-friendly course, you will learn about Linux basics, shell commands, and Bash shell scripting. You will begin this course with an introduction to Linux and explore the Linux architecture. You will interact with the Linux Terminal, execute commands, navigate directories, edit files, as well as install and update software. Next, you’ll become familiar with commonly used Linux commands. You will work with general purpose commands like id, date, uname, ps, top, echo, man; directory management commands such as pwd, cd, mkdir, rmdir, find, df; file management commands like cat, wget, more, head, tail, cp, mv, touch, tar, zip, unzip; the access control command chmod; text processing commands like wc, grep, tr; as well as networking commands such as hostname, ping, ifconfig and curl. You will then move on to learning the basics of shell scripting to automate a variety of tasks. You’ll create simple to more advanced shell scripts that involve Metacharacters, Quoting, Variables, Command substitution, I/O Redirection, Pipes & Filters, and Command line arguments. You will also schedule cron jobs using crontab. The course includes both video-based lectures and hands-on labs to practice and apply what you learn. You will have no-charge access to a virtual Linux server that you can access through your web browser, so you don't need to download and install anything to complete the labs. You’ll end this course with a final project as well as a final exam. In the final project you will demonstrate your knowledge of course concepts by performing your own Extract, Transform, and Load (ETL) process and creating a scheduled backup script.
This course is ideal for data engineers, data scientists, software developers, and cloud practitioners who want to get familiar with frequently used commands on Linux, macOS and other Unix-like operating systems as well as get started with creating shell scripts.
- SQL: A Practical Introduction for Querying Databases
Much of the world's data lives in databases. SQL (or Structured Query Language) is a powerful programming language that is used for communicating with and manipulating data in databases. A working knowledge of databases and SQL is a must for anyone who wants to start a career in Data Engineering, Data Warehousing, Data Analytics, Data Science or Business Intelligence. The purpose of this course is to help you learn and apply foundational and intermediate knowledge of the SQL language, and become familiar with many relational database (RDBMS) concepts along the way. You will start with performing basic Create, Read, Update and Delete (CRUD) operations using CREATE, SELECT, INSERT, UPDATE and DELETE statements. You will then learn to filter, order, sort, and aggregate data. You will work with functions, perform sub-selects and nested queries, as well as JOIN data in multiple tables. You will also work with VIEWS and transactions, and create stored procedures. The emphasis in this course is on hands-on, practical learning. As such, you will work with real database systems, real tools, and real-world datasets. You will create a database instance in the cloud. Through a series of hands-on labs, you will practice building and running SQL queries. At the end of the course you will apply and demonstrate your skills with a final project. The SQL skills you learn in this course will be applicable to a variety of RDBMSes such as MySQL, PostgreSQL, IBM Db2, Oracle, SQL Server and others. No prior knowledge of databases, SQL or programming is required; however, some basic data literacy is beneficial.
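To give a feel for the CRUD and aggregation statements mentioned above, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `employees` table, its columns, and the sample rows are assumptions invented for this illustration; the course itself works with other database systems.

```python
import sqlite3


def run_crud_demo():
    """Run CREATE, INSERT, UPDATE, DELETE, and an aggregating SELECT on a toy table."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()

    # Create: define a hypothetical employees table.
    cur.execute(
        "CREATE TABLE employees ("
        "id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
    )

    # Insert: add a few made-up rows using parameterized queries.
    cur.executemany(
        "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
        [
            ("Ana", "data", 90000),
            ("Ben", "data", 80000),
            ("Raj", "ops", 70000),
            ("Mei", "ops", 60000),
        ],
    )

    # Update and Delete: modify one row and remove another.
    cur.execute("UPDATE employees SET salary = salary + 5000 WHERE name = ?", ("Ana",))
    cur.execute("DELETE FROM employees WHERE name = ?", ("Raj",))

    # Read: aggregate per department, sorted by department name.
    cur.execute(
        "SELECT dept, COUNT(*), AVG(salary) FROM employees "
        "GROUP BY dept ORDER BY dept"
    )
    rows = cur.fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for dept, headcount, avg_salary in run_crud_demo():
        print(dept, headcount, avg_salary)
```

The same statements should carry over to other relational databases with little change, although data types and placeholder syntax vary between systems.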
- Data Warehouse Fundamentals
Whether you’re an aspiring data engineer, data architect, business analyst, or data scientist, strong data warehousing skills are a must. With the hands-on experience and competencies you gain in this course, your resume will catch the eye of employers and power up your career opportunities. A data warehouse centralizes and organizes data from disparate sources into a single repository, making it easier for data professionals to access, clean, and analyze integrated data efficiently. This course teaches you how to design, deploy, load, manage, and query data warehouses, data marts, and data lakes. You’ll dive into designing, modeling, and implementing data warehouses, and explore data warehousing architectures like star and snowflake schemas. You’ll master techniques for populating data warehouses through ETL and ELT processes, and hone your skills in verifying and querying data and in using concepts like cubes, rollups, and materialized views/tables. Additionally, you’ll gain valuable practical experience working on hands-on labs, where you’ll apply your knowledge to real data warehousing tasks. You’ll work with repositories like PostgreSQL and IBM Db2, and complete a project that you can refer to in interviews.
- Data Integration, Data Storage, & Data Migration
Data integration, data storage, and data migration are core skills for data professionals. With data management projected to grow by 140% by 2030 (IoT Analytics), these skills are in hot demand! As part of the IBM Data Manager Professional Certificate, this Data Integration, Data Storage, and Data Migration Strategies course gives aspiring data managers the essential skills employers are looking for. During this course, you’ll learn best practices and processes in three key areas: data integration, storage, and migration. You’ll investigate data integration and automate data aggregation from disparate sources into a single view to make it useful for analysis. You’ll explore data storage methods and processes to ensure your data is organized. Plus, you’ll learn data migration processes businesses use to upgrade their legacy systems and infrastructure with minimal disruption to other business operations. If you’re looking to enhance your resume with the essential skills a data manager needs to catch an employer’s eye, enroll today and boost your career opportunities in just three weeks!
- Data Privacy, Security, Governance, Risk and Compliance
Whether you’re an aspiring data engineer, data architect, business analyst, or data scientist, a strong foundation in data security is crucial for any data management professional. By gaining practical experience and comprehensive knowledge, you will strengthen your resume and unlock exciting career opportunities in the dynamic field of data security and protection. This course provides insight into data privacy, security, and governance. You will learn to implement effective strategies for protecting sensitive information while identifying and mitigating various cyberthreats. You’ll explore encryption techniques, access control management, and incident response planning. The course also covers data architecture and governance, where you will develop frameworks for maintaining data quality and integrity. Additionally, you will learn about risk management and compliance, focusing on essential regulations to safeguard data and foster organizational trust. You will reinforce the concepts learned in the course through hands-on labs, activities, and case studies. To enhance your resume with the essential skills for a data-oriented career and attract the attention of potential employers, enroll today!
- Enterprise Data Architecture and Operations
In today’s data-driven world, the ability to design and manage effective Enterprise Data Architecture (EDA) is a highly sought-after skill for career opportunities in data strategy, architecture, and management. Whether you want to advance in your current role as a data engineer or database administrator or transition to a specialized position like a data architect or enterprise data strategist, this course provides the essential knowledge and skills to succeed in the in-demand field of EDA. Gain foundational knowledge of EDA, exploring its core components, including data models and industry-standard frameworks such as TOGAF and the Zachman Framework. This will enable you to design architecture that meets specific business requirements. Practical insights into ETL processes, DataOps practices, and lifecycle management provide a strong foundation for managing data operations. Through hands-on labs and projects, you will gain skills in designing, optimizing, and managing enterprise data architectures, ensuring efficient storage, processing, migration, and governance. With a mix of instructional lectures, activities, and labs, you’ll acquire the expertise needed to excel in enterprise data architecture.
- Data Architect Capstone Project
Gain practical, real-world experience in data architecture through this hands-on capstone project course, developing skills highly valued by employers. During this course, you’ll apply all that you’ve learned throughout the Data Architecture Professional Certificate. As you work through the course, you’ll evaluate, design, migrate, and integrate enterprise data systems through a case study. In the capstone project, you will assess the current data architectures of two organizations, highlighting their strengths and identifying areas for improvement. Based on this analysis, you will design and implement a unified and efficient architecture for the newly merged entity, aligning with business goals. The project includes working with both RDBMS and NoSQL databases and developing ETL pipelines to ensure smooth data integration and flow. Additionally, you will create a data governance plan that addresses regulatory compliance and outlines strategies for data protection. Overall, this real-world inspired scenario will give you plenty to talk about in interviews, from implementing an architecture to managing a system transition. If you’re keen to add the practical experience employers look for to your portfolio, enroll today!
Taught by
Aije Egwaikhide, Jeff Grossman, Lavanya Thiruvali Sunderarajan, Muhammad Yahya, Priya Kapoor, Ramesh Sannareddy, Rav Ahuja, Romeo Kienzler, Sabrina Spillner, Sam Prokopchuk, Sandip Saha Joy, SkillUp, Steve Ryan and Yan Luo