Tags: python, java, mysql, database, sql


How do you make a good database design, and why should you care about it? Database design is an essential skill for a software engineer, and interviewers often ask a few questions about it. There are many database design principles and many definitions of them; you can search Google for more details. Here I'll describe them simply, based on my experience.

After reading this article, you will understand the following:

  • What is a good database design? Why should we create one? How do we make one?
  • Design process

  • Define and use some rules

  • Normalization Rules

  • Integrity Rules

  • Column Indexing

  • Some notes and advice when we design a database


Database Design Overview

First, what is database design?

“Database Design is the organization of data according to a database model. The designer determines what data must be stored and how the data elements interrelate.” Source: wikipedia.org

Database design is part of the design process when we develop software. Before designing the database, we have to settle the high-level software architecture (N-tier, microservices, …). Database design is a very important low-level step. The design is usually produced by a senior software engineer or software architect with a lot of experience in the IT field.

In a medium or large system, we usually choose and combine several databases to achieve our goals. When we need transactions and relationships: MySQL, PostgreSQL, or SQL Server. When we need to store flexible data: MongoDB (unstructured data). For caching: Redis (key-value, sorted set, list, …). For full-text search: Elasticsearch, and so on.

Depending on your project, you should choose and combine databases appropriately and wisely. There is no best database, only an appropriate one. We should take advantage of each database's strengths and know its limits and issues. In this article, I'll only write about one DBMS (Database Management System): MySQL, because designing for it is more complex than for NoSQL databases such as MongoDB, Redis, and so on.

In some projects, the senior software engineer or solution architect may ask for a class diagram and an ERD (Entity Relationship Diagram). What is the difference between a class diagram and an ERD?

  • Class diagrams represent the main objects or building blocks of the system. They show the relationship of one class to another and also represent the attributes of the system.

  • An ERD, however, is closer to the database itself, in the form of tables. It doesn't show individual relationships but relationship sets and entity sets. It shows the type of information that needs to be stored in the database.

In my opinion, we should make an ERD and skip the class diagram unless we have a special reason. This depends on your project.

What is a good database design?

A properly designed database gives you access to up-to-date, accurate information. Because a correct design is essential to achieving your goals in working with a database, investing the time required to learn the principles of good design makes sense.

Key points of good database design:


[Figure: 4 key points of good database design.]

Design Process

These guidelines help you make the right decisions. In my opinion, the design process includes the following steps:

[Figure: Design Process]

  • Step 1: Define the purpose of the database based on business requirements. Example: you want to build a system that displays Tokyo 2020 Olympics information (news, results, live matches, and so on) for the Summer Olympic Games Tokyo 2020 and the Summer Paralympic Games Tokyo 2020.

  • Step 2: Find and organize the required information. Example: Summer Olympic Games Tokyo 2020: https://odf.olympictech.org/2020-Tokyo/tokyo_2020_OG.htm and Summer Paralympic Games Tokyo 2020: https://odf.olympictech.org/2020-Tokyo/tokyo_2020_PG.htm

  • Step 3: Define and use some rules: naming conventions (e.g., lowercase), required fields in every table (id, created_at, and updated_at), and so on.

  • Step 4: Divide the information into tables and specify the primary keys. Example: games_competition_group, games_competition, games_event, games_event_phase, games_unit (match), and so on.

  • Step 5: Determine the relationships among tables: one-to-one, one-to-many, many-to-many.

  • Step 6: Refine and normalize your design: analyze it for errors. Create the tables and add a few records of sample data. See if you can get the results you want from your tables. Make adjustments: add or remove columns if needed, create a new table for optional data using a one-to-one relationship, split a large table into two smaller tables, and so on.

  • Step 7: Add indexes: single-column or multi-column.
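
Steps 4 to 7 can be tried out quickly before committing to a schema. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the tables games_athlete and games_event_athlete and all column names are hypothetical, and the created_at/updated_at fields are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Step 4: one table per kind of information, each with a primary key.
CREATE TABLE games_competition (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Step 5, one-to-many: one competition has many events.
CREATE TABLE games_event (
    id             INTEGER PRIMARY KEY,
    competition_id INTEGER NOT NULL,
    name           TEXT NOT NULL
);
-- Step 5, many-to-many: athletes and events are linked by a junction table.
CREATE TABLE games_athlete (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE games_event_athlete (
    event_id   INTEGER NOT NULL,
    athlete_id INTEGER NOT NULL,
    PRIMARY KEY (event_id, athlete_id)
);
-- Step 7: index the column used to look up events by competition.
CREATE INDEX competition_id_idx ON games_event (competition_id);

-- Step 6: a few sample rows to check that the design answers our questions.
INSERT INTO games_competition (id, name) VALUES (1, 'Olympic Games Tokyo 2020');
INSERT INTO games_event (id, competition_id, name) VALUES
    (1, 1, 'Men''s 100m'), (2, 1, 'Women''s Marathon');
INSERT INTO games_athlete (id, name) VALUES (1, 'Athlete A'), (2, 'Athlete B');
INSERT INTO games_event_athlete (event_id, athlete_id) VALUES (1, 1), (1, 2), (2, 2);
""")

def events_of_athlete(athlete_id):
    """Return the names of the events an athlete takes part in, sorted by name."""
    rows = conn.execute(
        """SELECT e.name
           FROM games_event e
           JOIN games_event_athlete ea ON ea.event_id = e.id
           WHERE ea.athlete_id = ?
           ORDER BY e.name""",
        (athlete_id,),
    )
    return [name for (name,) in rows]

print(events_of_athlete(2))
```

If a question like "which events does this athlete take part in?" is awkward to answer from the sample data, that is a sign to go back to step 6 and adjust the tables.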


Define and use some rules

In my experience, we should define and follow some rules when we design a database. Every member of the team has to abide by them. Here are my team's rules:

  • Engine and charset: choose the MySQL InnoDB engine and the utf8 or utf8mb4 charset. Prefer utf8mb4: MySQL's utf8 stores at most 3 bytes per character and cannot hold 4-byte characters such as emoji.

  • Naming conventions: all names are in snake case (lowercase). Example: games_event, games_result, games_event_phase, and so on.

  • Required fields: every table has an id field (PRIMARY KEY and AUTO_INCREMENT), a created_at field, and an updated_at field, except in some special cases.

  • All tables have the same prefix. Example: games_, sports_, or an empty prefix.

  • All table names are either singular or plural, consistently. Example: games_person or games_people, games_textblock or games_textblocks.

  • All columns: add a comment to every column, and always set a default for integer fields (e.g., 0) and varchar fields (e.g., ‘’), consistently across the whole system.


  • Id field: PRIMARY KEY, type int(10) unsigned (range 0 to 4294967295), not signed int(11) (range -2147483648 to 2147483647).

  • Boolean field: choose the tinyint(1) type and give the column a descriptive name such as `display`. Example: `display` tinyint(1) unsigned NOT NULL DEFAULT ‘1’ COMMENT ‘`0`: hidden/false, `1`: visible/true’

  • Status field: choose the tinyint type. For example, tinyint(2).

  • Int field: choose mediumint, int, or bigint according to the largest value you expect to store; use bigint if you know you will work with very large numbers, such as view counts on YouTube. Note that the number in parentheses is only a display width and does not limit the stored range: int(3) and int(5) both hold the full int range. To save space, choose a smaller type instead; for example, smallint unsigned (maximum 65535) is more than enough for an athlete's weight in kg. Ref: https://dev.mysql.com/doc/refman/5.7/en/integer-types.html

  • Text field: choose varchar(255) by default. For longer text, choose by size: tinytext (255 bytes), text (64 KB, 65,535 bytes), mediumtext (16 MB, 16,777,215 bytes), or longtext (4 GB, 4,294,967,295 bytes). These limits are in bytes, so with a multi-byte charset the number of characters that fit is smaller. Ref: https://chartio.com/resources/tutorials/understanding-strorage-sizes-for-mysql-text-data-types/. BLOBs are an alternative storage type that shares the same naming and capacity scheme as TEXT. However, BLOBs are binary strings with no character set, so they are compared and sorted by the numeric values of their bytes, while TEXT values are compared and sorted as character strings. This difference matters when sorting. BLOBs are used to store data files such as images, videos, and executables.

  • Index naming: single-column format: column_name_idx; multi-column format: column_a_column_b_idx. For 3 or more columns, choose an appropriate shorter name.

  • Unique index naming: single-column format: column_name_unique; multi-column format: column_a_column_b_unique. For 3 or more columns, choose an appropriate shorter name.

  • Relationships: to avoid eager/lazy loading problems, we do not use @ManyToOne, @ManyToMany, or @OneToMany in our code. Instead, we store the foreign key as a plain integer field.

  • Metadata: with MySQL <= 5.6, store JSON in a string column; with MySQL >= 5.7, use the JSON data type, whose values can be indexed through generated columns. Ref: https://dev.mysql.com/doc/refman/5.7/en/json.html

  • Dynamic columns/attributes: for flexible data, use a table with the columns id, foreign_key, type (long, text, integer), property/attribute, and value. Alternatively, consider the Entity-Attribute-Value (EAV) model. Ref: https://inviqa.com/blog/understanding-eav-data-model-and-when-use-it

  • Partition data: split the data across multiple tables if you have a large amount of it.

  • NULL value: NULL is not a data type, so it is not recognized as an “int” or a “date”. Arithmetic operations involving NULL always return NULL (69 + NULL = NULL). NULL is simply a placeholder for data that does not exist (missing or inapplicable information). Test for it only with IS NULL or IS NOT NULL, never with = or <>.
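
The NULL rule is easy to check for yourself. The following sketch uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs without a server); the cut-down games_team table and the scalar helper are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; both follow the SQL rule that
# arithmetic and '=' comparisons involving NULL yield NULL.
conn = sqlite3.connect(":memory:")

def scalar(sql):
    """Run a single-value query and return that value (NULL becomes None)."""
    return conn.execute(sql).fetchone()[0]

null_sum = scalar("SELECT 69 + NULL")       # arithmetic with NULL
null_equals = scalar("SELECT NULL = NULL")  # '=' cannot detect NULL
is_null = scalar("SELECT NULL IS NULL")     # IS NULL can

conn.execute("CREATE TABLE games_team (id INTEGER PRIMARY KEY, address TEXT)")
conn.executemany(
    "INSERT INTO games_team (address) VALUES (?)",
    [("Tokyo",), (None,)],
)
# '= NULL' matches no rows; 'IS NULL' finds the row with a missing address.
rows_with_equals = conn.execute(
    "SELECT id FROM games_team WHERE address = NULL").fetchall()
rows_with_is_null = conn.execute(
    "SELECT id FROM games_team WHERE address IS NULL").fetchall()

print(null_sum, null_equals, is_null)
print(rows_with_equals, rows_with_is_null)
```

A query written with = NULL does not fail; it silently returns nothing, which is why the rule is worth enforcing in code review.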

Here is an example:


CREATE TABLE `games_team` (
        `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
        `cp_team_id` int(10) unsigned NOT NULL COMMENT 'Content provider team ID',
        -- assumed definition; this column is required by cp_stadium_id_idx below
        `cp_stadium_id` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Content provider stadium ID',
        `name` varchar(64) NOT NULL COMMENT 'Team name',
        `address` varchar(64) DEFAULT NULL COMMENT 'Team official address',
        `phone_number` varchar(45) DEFAULT NULL COMMENT 'Team official telephone number',
        `comment` text COMMENT 'Comment about the team',
        `website_url` varchar(255) DEFAULT NULL COMMENT 'Website URL',
        `created_at` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Creation time',
        `updated_at` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Last update time',
        PRIMARY KEY (`id`),
        UNIQUE KEY `cp_team_id_unique` (`cp_team_id`),
        KEY `cp_stadium_id_idx` (`cp_stadium_id`)
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
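
Choosing between tinyint, smallint, mediumint, int, and bigint comes down to how many bytes each type stores. As a rough sketch (the helper names int_range and smallest_unsigned_type are made up for illustration; the byte sizes are the ones listed in the MySQL integer types reference), the ranges can be computed like this:

```python
def int_range(num_bytes, signed):
    """Return (min, max) for an integer stored in num_bytes bytes."""
    bits = 8 * num_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

# Storage sizes of the MySQL integer types, in bytes, smallest first.
MYSQL_INT_BYTES = {"tinyint": 1, "smallint": 2, "mediumint": 3, "int": 4, "bigint": 8}

def smallest_unsigned_type(max_value):
    """Pick the smallest unsigned MySQL integer type that can hold max_value."""
    for name, size in MYSQL_INT_BYTES.items():
        if max_value <= int_range(size, signed=False)[1]:
            return name
    raise ValueError("value does not fit in bigint unsigned")

print(int_range(4, signed=True))     # signed int
print(int_range(4, signed=False))    # unsigned int, as used for id above
print(smallest_unsigned_type(300))   # e.g. a weight in kg
```

The display width in a definition such as int(10) plays no part in this calculation; only the type name and the signed/unsigned attribute decide the range.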

Normalization Rules

Apply the so-called normalization rules to check whether your database is structurally correct and optimal.


  • First Normal Form (1NF): known as the atomic rule. Every column holds a single, indivisible value; move repeating values into a separate table with a one-to-many relationship.

  • Second Normal Form (2NF): the table is in 1NF and every non-key column is fully dependent on the primary key.

  • Third Normal Form (3NF): the table is in 2NF and the non-key columns are independent of each other. Example: in a product table, price and discount are independent columns; a column derived from both (such as a final price) would break 3NF.

Note: There are higher normal forms beyond 3NF, but in my opinion the three rules above are enough. Sometimes we can break these rules. For example, you may need to save metadata in a JSON column (this violates 1NF), or you may want high performance in a reporting feature and add columns that store results that take a long time to calculate (this violates 3NF).
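
To make the first normal form concrete, here is a small sketch of moving a multi-valued column into a one-to-many table. It runs on Python's built-in sqlite3 module rather than MySQL, and the tables team_unnormalized and games_team_phone, along with the sample phone numbers, are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Not 1NF: several phone numbers packed into one column.
CREATE TABLE team_unnormalized (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    phone_numbers TEXT NOT NULL
);
INSERT INTO team_unnormalized VALUES (1, 'Team A', '111-1111,222-2222');

-- 1NF: one atomic value per column, one-to-many from team to phone.
CREATE TABLE games_team (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE games_team_phone (
    id           INTEGER PRIMARY KEY,
    team_id      INTEGER NOT NULL,
    phone_number TEXT NOT NULL
);
""")

def normalize():
    """Split each comma-separated phone list into one row per phone number."""
    teams = conn.execute(
        "SELECT id, name, phone_numbers FROM team_unnormalized").fetchall()
    for team_id, name, phone_list in teams:
        conn.execute(
            "INSERT INTO games_team (id, name) VALUES (?, ?)", (team_id, name))
        conn.executemany(
            "INSERT INTO games_team_phone (team_id, phone_number) VALUES (?, ?)",
            [(team_id, phone.strip()) for phone in phone_list.split(",")],
        )

normalize()
phones = [phone for (phone,) in conn.execute(
    "SELECT phone_number FROM games_team_phone WHERE team_id = 1 ORDER BY id")]
print(phones)
```

After the split, each phone number can be searched, indexed, or updated on its own, which is not possible while the values share one string.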

Integrity Rules

You should also apply the integrity rules to check the integrity of your design:


  • Entity Integrity Rule: the primary key can't contain NULL; otherwise it can't uniquely identify the row. This also applies to every column of a multi-column primary key.

  • Referential Integrity Rule: each foreign key value must match a primary key value in the referenced (parent) table. Most RDBMSs can be set up to perform this check and ensure referential integrity, but for medium and large projects I highly recommend doing it manually.

  • Business Logic Rule: besides the two general integrity rules above, there can be integrity (validation) rules that come from the business logic. Example: a competition code, unit code, or phone number must have the correct format before it is inserted into a table. These checks can be implemented as validation rules (for a specific column) or in program logic.
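
The referential and business logic rules can be sketched as follows. This is an illustration under stated assumptions, not a recommended setup: it lets the database enforce the foreign key (using Python's built-in sqlite3 module instead of MySQL), and the competition code format of two letters followed by four digits is purely hypothetical.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE games_competition (
    id   INTEGER PRIMARY KEY,
    code TEXT NOT NULL UNIQUE
);
CREATE TABLE games_event (
    id             INTEGER PRIMARY KEY,
    competition_id INTEGER NOT NULL REFERENCES games_competition (id),
    name           TEXT NOT NULL
);
INSERT INTO games_competition (id, code) VALUES (1, 'OG2020');
""")

def try_insert_event(competition_id, name):
    """Insert an event; return False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO games_event (competition_id, name) VALUES (?, ?)",
                (competition_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Business logic rule: hypothetical code format, two capital letters + four digits.
COMPETITION_CODE = re.compile(r"[A-Z]{2}\d{4}")

def is_valid_competition_code(code):
    """Check the code format in program logic before inserting it."""
    return COMPETITION_CODE.fullmatch(code) is not None

print(try_insert_event(1, "Men's 100m"))   # parent row exists
print(try_insert_event(99, "Orphan"))      # no competition with id 99
print(is_valid_competition_code("OG2020"), is_valid_competition_code("og-2020"))
```

If the check is done manually instead, as recommended above, the application has to run the equivalent lookup on the parent table before every insert and update.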

Column Indexing

You can create an index on selected columns to speed up data searching and retrieval. An index is a structured file that speeds up reads but may slow down writes: the index has to be updated whenever a record changes, which is the overhead associated with using indexes.

An index can be defined on a single column or on a set of columns. You can build more than one index on a table. Most RDBMSs build an index on the primary key automatically.

Note: The EXPLAIN statement provides information about how MySQL executes statements. Ref: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html
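
The effect of an index on a query plan can be observed directly. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3 module as a rough analogue of MySQL's EXPLAIN; the output format differs from MySQL's, and the cut-down games_team table and the query_plan helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games_team (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO games_team (name) VALUES (?)",
    [(f"team_{i}",) for i in range(1000)],
)

QUERY = "SELECT id, name FROM games_team WHERE name = ?"

def query_plan(sql, params):
    """Return SQLite's plan for a query as one string (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[-1] for row in rows)

# Without an index, the whole table is scanned.
plan_without_index = query_plan(QUERY, ("team_42",))

# Index named with the column_name_idx convention from the rules above.
conn.execute("CREATE INDEX name_idx ON games_team (name)")
plan_with_index = query_plan(QUERY, ("team_42",))

print(plan_without_index)
print(plan_with_index)
```

The first plan reports a full table scan, while the second one mentions name_idx; in MySQL, the key column of the EXPLAIN output gives the same kind of information.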

Some notes and advice

  • Learn the design steps above and apply them to your project as appropriate.

  • Applying normalization and integrity rules is also very important. Don't forget them when you design!

Translated from: https://medium.com/swlh/how-to-make-a-good-database-design-584cd3cab5c5


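Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.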


