DBMS

Database

Almost always more than one database are used, but it is import to understand which database is best to perform what particular task. There are different functionalities and features involved while designing a system, and our focus remains to use the best suited database for a particular use case

Database is a collection of related data which represents some aspect of the real world

DBMS, Database Management System

A software for storing and retrieving users' data
DBMS Provides security and removes redundancy
End-Users, Application Programmers, and Database Administrators are users who access a DBMS
DBMS serves an efficient handler to balance the needs of multiple applications using the same data

Type of DBMS:

Hierarchical DBMS

In a Hierarchical database, model data is organised in a tree-like structure
Data is Stored Hierarchically (top down or bottom up) format
Data is represented using a parent-child relationship
In Hierarchical DBMS parent may have many children, but children have only one parent

Network Model

The network database model allows each child to have multiple parents
It helps you to address the need to model more complex relationships like as the orders/parts many-to-many relationship
In this model, entities are organised in a graph which can be accessed through several paths.

Relational model

Relational DBMS is the most widely used DBMS model because it is one of the easiest
This model is based on normalising data in the rows and columns of the tables
Relational model stored in fixed structures and manipulated using SQL

Object-Oriented Model

In Object-oriented Model data stored in the form of objects
The structure which is called classes which display data within it
It defines a database as a collection of objects which stores both data members values and operations

SQL vs NoSQL Database

SQL stands for Structured Query Language, also know as relational database, while NoSQL is known as non-relational database.

Relational databases enforces ACID, hence we have schema based transaction oriented data stores. It's proven and suitable for 99% of the real world applications. We can practically do anything with relational databases.
However, there are limitations on speed and scaling when it comes to massive high availability data stores. So we bargain availability for speed and scaling purpose and therefore end-up using NoSQL (non relational database)

For example, Google and Amazon have terabytes of data stored in big data centres. Querying and inserting is not performance efficient in these scenarios because of the blocking/schema/transaction nature of the RDBMs. That's the reason they have implemented their own databases (actually, key-value stores) for massive performance gain and scalability.

Database Selection: Questions to be Answered

How much data do you expect to store when the application is mature?
How many users do you expect to handle simultaneously at peak load?
What availability, scalability, latency, throughput, and data consistency does your application need?

Latency: Response time for single request
Throughput: Number of transactions per sec

How often will your database schema change?

The term "schema" refers to the organisation of data as a blueprint of how the database is constructed (divided into database tables in the case of relational databases)

What is the geographic distribution of your user population?
What is the natural “shape” of your data?
Does your application need online transaction processing (OLTP), analytic queries (OLAP), or both?
What ratio of reads to writes do you expect in production?
Do you need geographic queries and/or full-text queries?
What are your preferred programming languages?
Do you have a budget? If so, will it cover licenses and support contracts?
Are there legal restrictions on your data storage?

Choosing a Database

No database is perfect, serving all the purposes; there are multiple types of database, and every database comes with its own advantages and disadvantages.

In general, CAP (Consistency, Availability and Partition-Tolerance) theorem is used as a deciding factor for choosing a database.

CAP

Consistency: Every read receives most recent write or an error
Availability: Every request receives a (non-error) response, without guarantee that it contains the most recent write
Partition-Tolerance: The system continues to operate an arbitrary number of messages being dropped (or delayed) by the network between the nodes

No database at present satisfies all the CAP properties; every database satisfies just two of the three properties. Therefore decision to chose a database comes from the requirements and use-case.

If everything written is complicated for you, here is another way to look at it:

Atomicity: Operations are classified into atoms, and that whole atom execute or none of it, not possible to execute a part of it (for example bank transaction), relational database to be used for the purpose
Scaling (Vertical or Horizontal): NoSQL is preferred for horizontal scaling, for vertical scaling SQL is fine
Speed: if speed is a priority over ACID, need to use NoSQL (document-based is preferred)

This flow diagram will further help in the understanding, which database to chose:

Reference for Further Reading

Blog | Why of AIWhy of AI

PreviousDatabase NextSQL

Last updated 11 months ago

Was this helpful?