Show Menu
Cheatography

DataBase modelling Cheat Sheet by

Insights to how we can most effectively design our databases

Data Models

Conceptual
Highes­t-l­evel, Important entities and the relati­onships among them.
No attribute is specified.
No primary key is specified.
Logical
Data in as much detail as possible, without regard to how they will be physical implem­ented in the database
{{Steps:}}
1, Specify primary keys for all entities.
2, Find the relati­onships between different entities.
3, Find all attributes for each entity.
4. Resolve many-t­o-many relati­ons­hips.
separate the two entities and create two one-to­-many (1:n) relati­onships between them with a third intersect entity.
5, Normal­iza­tion.
Physical
How the model will be built in the database.
A physical database model shows all table struct­ures, including column name, column data type, column constr­aints, primary key, foreign key, and relati­onships between tables.
Convert entities into tables.
Convert relati­onships into foreign keys.
Convert attributes into columns.
Modify the physical data model based on physical constr­aints / requir­ements.

Partit­ioning Tables

Use Case:
As our data scales, querie­s/u­pdates may not always fit into memory
Horizontal Partit­ioning:
Splitting a table by rows
Pros:
Indices of heavil­y-used partitions fit in memory; Supports OLAP and OLTP
Cons:
Partit­ioning existing table can be a hassle; Some constr­aints can not be set

Horizontal partit­ioning

ATE TABLE sales (
...
timestamp DATE NOT NULL
)
PARTITION BY RANGE (timestamp);
CREATE TABLE sales_2019_q1 PARTITION OF sales
FOR VALUES FROM ('2019-01-01') TO ('2019-03-31');
...
CREATE TABLE sales_2019_q4 PARTITION OF sales
FOR VALUES FROM ('2019-09-01') TO ('2019-12-31');
CREATE INDEX ON sales ('timestamp')

Sharding

sharding implies the data is spread across multiple computers while partit­ioning does not
Partit­ioning is about grouping subsets of data within a single database instance
Sharding and partit­ioning are both about breaking up a large data set into smaller subsets.
 

Models: Diagra­mitic Overview

Models: Tablular Overivew

SSAS

Has a three-tier archit­ecture
RDBMS: The data from different sources like Excel, Database, Text, others can be pulled with the help of ETL tool into the RDBMS.
SSAS: Aggregate data from RDBMS is pushed into SSAS cubes by using analysis services projects. The SSAS cubes will create an analysis database, and once the analysis database is ready, it can be used for many purposes.
Client: Clients can access data using Dashbo­ards, Scorec­ards, Portals etc.
A cube is a basic unit of storage
It is a collection of data which has been aggregated to allow queries to return data quickly.
The MOLAP is made of data cube which contains of measures and dimensions
Dimension Table
contains dimens­ions, or charac­ter­istics (who, what, where), of a fact.
 
Joined to fact table with a foreign key
 
de-nor­malized tables.
A fact table is the most important table in a dimens­ional model.
A Fact Table contains Measur­eme­nts­/fact and Foreign key to the dimension table. For example, payroll operat­ions.
Two SSAS model types
Multi-­Dim­ens­ional Data Model
consists of a data cube. It is a group of operations which allows you to query the value of cells by using cube and dimension members as coordi­nates.
 
rules which decide the way that measure values are rolled up within hierar­chies
Tabular modeling organizes data into related tables.
The table doesn’t designate as “dimen­sions” or “facts” and develo­pment time is less with tabular because of all related tables able to serve both roles.
 

No SQL databases

Key - Value Stores
Document
Columnar
Graph

OLAP

On-Line Analytical Proces­sing.
OLAP is Multid­ime­nsional
Providing ability to analyze metrics in different dimensions such as time, geography, gender, product, etc
For example, sales for the company are up. What region is most respon­sible for this increase? Which store in this region is most respon­sible for the increase?
For example, sales for the company are up:
What region is most respon­sible for this increase?
Which store in this region is most respon­sible for the increase?

SSRS

Microsoft SQL Server has three types of SQL Services
Microsoft SQL Server Integr­ation services which integrate data from different sources.
Microsoft SQL Server Analytical service which helps for the analysis of the data
Microsoft SQL Server Reporting service allows for generating a visual report of the data.
SSRS data sources
Retrieve data from managed, OLE ODBC, and DB connec­tions
Display data in a variety of formats which includes tabular, free-form, and charts
Reporting Life Cylce
Authoring: In this phase, the report author defines the layout and syntax of the data. The tools used in this process are the SQL Server Develo­pment Studio and SSRS tool.
Manage­ment: This phase involves managing a published report which is mostly part of the websites. In this stage, you need to consider access control over report execution.
Delivery: In this phase, you need to understand when the reports need to be delivered to the customer base. Delivery can be on-demand or pre-de­fined schedule. You can also add an automation feature of subscr­iption which creates reports and sends to the customer automa­tically
               
 

Comments

No comments yet. Add yours below!

Add a Comment

Your Comment

Please enter your name.

    Please enter your email address

      Please enter your Comment.

          Related Cheat Sheets

          SQL Server Cheat Sheet
          Essential MySQL Cheat Sheet
          DataBase Concepts Cheat Sheet

          More Cheat Sheets by datamansam

          Jupyter Notebook Keyboard Shortcuts
          Numpy Crib Cheat Sheet
          DataBase Concepts Cheat Sheet