Northeastern University

Data Management for Analytics Part 1

Northeastern University via Coursera

Overview

This course introduces the fundamental concepts and emerging technologies in database design, database modeling, and database systems. It balances theory and practice and covers the entity relationship (ER) model, the UML model, the relational model, and relational databases. By the end of Part 1, you will have a foundational understanding of the theory and applications of database management in support of data analytics, data mining, machine learning, and artificial intelligence.

Syllabus

  • Fundamental Concepts of Database Management
    • In this module, we will introduce the fundamental concepts of database management, review applications of database technology, and define key concepts. We will also contrast the file-based approach to data management with the database approach. Finally, we will examine the elements of a database system and the advantages of database design.
  • Architecture and Categorization of Database Management Systems (DBMSs)
    • In this module, we take a quick look at what is under the hood of a database management system. We will examine the key components of DBMS architecture and how these components work together for data storage, processing, and management. We also check how DBMSs can be categorized based on data models, degree of simultaneous access, architecture, and usage.
  • Conceptual Data Modeling, Part 1
    • In this module, we first review the database design process from conceptual and logical to physical database design and elaborate on the data requirements of a business process. We then introduce the Entity Relationship (ER) model for conceptual data modeling. The fundamental building blocks of the ER model include entity types, attribute types, and relationship types. We discuss attribute type details such as domains, key attribute types, simple versus composite attribute types, single-valued versus multi-valued attribute types, and derived attribute types. For relationship types, we also examine the degree and roles, cardinalities, weak entity types, and ternary relationship types. Various examples are included for clarification.
  • Conceptual Data Modeling, Part 2
    • In this module, we will learn three additional semantic data modeling concepts: specialization/generalization, categorization, and aggregation. These concepts enhance and extend the ER model discussed in the previous module. We will also introduce an alternative conceptual model: the Unified Modeling Language (UML) class diagram. The UML is a modeling language that assists in the specification, visualization, construction, and documentation of artifacts of a software system. Besides class diagrams, the UML offers use case diagrams, sequence diagrams, package diagrams, deployment diagrams, etc. Here we use the UML for conceptual data modeling.
  • Organizational Aspects of Data Management
    • In this module, we focus on some organizational aspects of data management, including the DBMS catalog, the roles of metadata, and metadata modeling. We also discuss data quality, data governance, and different roles in data management. Data management entails proper management of data and the corresponding data definitions, or metadata. Its objective is to ensure that (meta-)data is of good quality, and thus a key resource, for data analytics tasks and effective and efficient managerial decision-making. By the end of this module, you will understand what proper management of data and metadata involves.
  • Relational Model
    • As discussed in the previous modules, designing a database takes multiple steps. Once the conceptual data model is finalized, the database designer maps it to a logical data model during the logical design step. Unlike the conceptual data model, the logical data model is tied to the data model used by the implementation DBMS environment; in other words, a logical data model is intended for a specific type of DBMS. Since the top ten DBMSs in use are usually dominated by relational DBMSs such as Oracle, MySQL (open-source), and Microsoft SQL Server, we will focus on the relational model, which can be used as a logical data model for relational DBMSs. The relational model is a formal data model with a sound mathematical foundation, based on set theory and first-order predicate logic. Unlike the ER and EER models, it has no standard graphical representation, which makes it unsuitable as a conceptual data model. Given its solid theoretical underpinning, it is commonly adopted to build both logical and internal data models. In this module, we define the relational model as it is used as a logical and/or internal data model for relational DBMSs such as Oracle and Microsoft SQL Server. Different types of keys are defined, and their roles are specified along with relational constraints. The mapping of a conceptual ER model to a relational model is explained in detail, including the mapping of entity types, binary one-to-one relationship types, binary one-to-many relationship types, binary many-to-many relationship types, unary relationship types, n-ary relationship types, multi-valued attribute types, and weak entity types.
  • Normalization of Relational Model and Mapping of the EER Model to Relational Model
    • This module first presents an overview of the insertion, deletion, and update anomalies in an unnormalized relational model and discusses informal normalization guidelines. Two key concepts used in the normal forms are defined and examined: functional dependency and prime attribute type, along with various special cases of functional dependency, including full versus partial, transitive, trivial, and multivalued dependencies. The process and the formal procedures for the normalization of a relational model are discussed in detail via the first normal form (1NF), the second normal form (2NF), the third normal form (3NF), the Boyce-Codd normal form (BCNF), and the fourth normal form (4NF).
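
To make the normalization ideas in the last module concrete, here is a minimal Python sketch, not taken from the course materials, of how functional dependencies can be used to find candidate keys and to spot the partial dependencies that violate 2NF. The relation `ENROLLMENT`, its attribute names, and the function names are all hypothetical, chosen only for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left-hand side, we gain the right-hand side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Return the minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # a superset of a known key is not minimal
            if closure(candidate, fds) == set(relation):
                keys.append(candidate)
    return keys

def partial_dependencies(relation, fds):
    """List (key_part, attribute) pairs where a non-prime attribute
    depends on a proper subset of a candidate key (a 2NF violation)."""
    keys = candidate_keys(relation, fds)
    prime = set().union(*keys)  # attributes that belong to some candidate key
    violations = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                determined = closure(part, fds) - set(part) - prime
                for attr in sorted(determined):
                    violations.append((part, attr))
    return violations

# Hypothetical example relation and functional dependencies (assumptions).
ENROLLMENT = {"student_id", "course_id", "student_name", "grade"}
FDS = [
    (("student_id",), ("student_name",)),
    (("student_id", "course_id"), ("grade",)),
]

if __name__ == "__main__":
    print(candidate_keys(ENROLLMENT, FDS))
    print(partial_dependencies(ENROLLMENT, FDS))
```

In this assumed example the only candidate key is the pair `student_id`, `course_id`, and `student_name` depends on `student_id` alone, so the relation is not in 2NF. Moving `student_name` into a separate relation keyed by `student_id` would remove that partial dependency. The brute-force key search is exponential in the number of attributes and is meant only for small teaching examples.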

Taught by

Xuemin Jin
