ERD in the above, there are 2 that confusion can lead to an error occurs, namely:
1. On the relationship between the entity and the project does not have a clear cardinality.
2. There is a ternary relationship have on the relationship.
Based on the above confusion, the ERD can be improved:
Mapping of ERD image above:
1 NF
First normal form (1NF or Minimal Form) is a normal form used in database normalization. A relational database table that adheres to 1NF is one that meets a certain minimum set of criteria. These criteria are basically concerned with ensuring that the table is a faithful representation of a relation and that it is free of repeating groups.
Second Normal form (NF 2) is:
Meets 1 of NF (normal first).
Non-key attributes must depend on the function key / primery key.
So that the second normal form for each table / file must be specified attribute keys.
3NF/Third Normal Form
Definition of Third Normal Form (NF 3) is:
Meet the 2 forms of NF (both normal).
Non-key attributes do not have a transitive dependency on the key / key primery.
Same as in the second normal form, the changes only on the table pegawai only.
Database design is the process of producing a detailed data model of a database. This logical data model contains all the needed logical and physical design choices and physical storage parameters needed to generate a design in a Data Definition Language, which can then be used to create a database. A fully attributed data model contains detailed attributes for each entity.
The term database design can be used to describe many different parts of the design of an overall database system. Principally, and most correctly, it can be thought of as the logical design of the base data structures used to store the data. In the relational model these are the tables and views. In an object database the entities and relationships map directly to object classes and named relationships. However, the term database design could also be used to apply to the overall process of designing, not just the base data structures, but also the forms and queries used as part of the overall database application within the database management system (DBMS).
Database normalization
Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.
Normalization done by because to Optimalisasi Structure of is tables, Improving speed, Eliminating same data inclusion, More efficient in storage media use, Lessening redundans, Avoiding anomali ( insertion anomalies, deletion anomalies, update the anomalies) and improved Data integrity.
A tables told by goodness of if fulfilling 3 criterion :
· If there is decomposition of is tables of, hence decomposition have to be secured ( Lossless-Join Decomposition
· The looking after of functional depending at the (time) of data change ( Dependency Preservation).
· Do not impinge the Normal Boyce-Code of Form ( BCNF)
If third criterion ( BCNF) cannot be fullfiled, hence at least the tables do not impinge the Normal Form of third phase ( 3rd Normal of Form / 3NF )
functional dependency
functional dependency (FD) is a constraint between two sets of attributes in a relation from a database.
Given a relation R, a set of attributes X in R is said to functionally determine another attribute Y, also in R, (written X → Y) if and only if each X value is associated with precisely one Y value. Customarily we call X the determinant set and Y the dependent attribute. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding value of the Y attribute. For the purposes of simplicity, given that X and Y are sets of attributes in R, X → Y denotes that X functionally determines each of the members of Y - in this case Y is known as the dependent set. Thus, a candidate key is a minimal set of attributes that functionally determine all of the attributes in a relation.
(Note: the "function" being discussed in "functional dependency" is the function of identification.)
A functional dependency FD:X\to Y is called trivial if Y is a subset of X.
The determination of functional dependencies is an important part of designing databases in the relational model, and in database normalization and denormalization. The functional dependencies, along with the attribute domains, are selected so as to generate constraints that would exclude as much data inappropriate to the user domain from the system as possible.
For example, suppose one is designing a system to track vehicles and the capacity of their engines. Each vehicle has a unique vehicle identification number (VIN). One would write VIN → EngineCapacity because it would be inappropriate for a vehicle's engine to have more than one capacity. (Assuming, in this case, that vehicles only have one engine.) However, EngineCapacity → VIN, is incorrect because there could be many vehicles with the same engine capacity.
This functional dependency may suggest that the attribute EngineCapacity be placed in a relation with candidate key VIN. However, that may not always be appropriate. For example, if that functional dependency occurs as a result of the transitive functional dependencies
then that would not result in a normalized relation.
Trivial functional dependency
A trivial functional dependency is a functional dependency of an attribute on a superset of itself. {Employee ID, Employee Address} → {Employee Address} is trivial, as is {Employee Address} → {Employee Address}.
Full functional dependency
An attribute is fully functionally dependent on a set of attributes X if it is
· functionally dependent on X, and
· not functionally dependent on any proper subset of X. {Employee Address} has a functional dependency on {Employee ID, Skill}, but not a full functional dependency, because it is also dependent on {Employee ID}.
Transitive dependency
A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and Y→Z.
Multivalued dependency
A multivalued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows.
Join dependency
A table T is subject to a join dependency if T can always be recreated by joining multiple tables each having a subset of the attributes of T.
Superkey
A superkey is an attribute or set of attributes that uniquely identifies rows within a table; in other words, two distinct rows are always guaranteed to have distinct superkeys. {Employee ID, Employee Address, Skill} would be a superkey for the "Employees' Skills" table; {Employee ID, Skill} would also be a superkey.
Candidate key
A candidate key is a minimal superkey, that is, a superkey for which we can say that no proper subset of it is also a superkey. {Employee Id, Skill} would be a candidate key for the "Employees' Skills" table.
Non-prime attribute
A non-prime attribute is an attribute that does not occur in any candidate key. Employee Address would be a non-prime attribute in the "Employees' Skills" table.
Primary key
Most DBMSs require a table to be defined as having a single unique key, rather than a number of possible unique keys. A primary keyis a key which the database designer has designated for this purpose.
First Normal Form (1NF)
First Normal Form (1NF) is now generally considered part of the formal definition of a relation. Historically, 1NF was intended to disallow multi-valued attributes. 1NF dictates that the domains (allowable values) of attributes must include only atomic (simple, indivisible) values and that any given value of an instance of an attribute must be a single value from the domain of that attribute. In short, a given cell of a column in a table can contain only one value.
The following table violates 1NF because the second row contains more than one value in the COLOR column:
To ensure a table is in 1NF, one simply needs to decompose grouped attributes into separate rows (or in some case, tables). The
following representation of the above data is in 1NF (and others):
Second Normal Form
Second Normal Form (2NF) is based on the concept of "full" functional dependency. A functional dependency, X ® Y, is a full functional dependency if removal of any attribute from X means that the dependency does not hold anymore. For example, Given a table that tracks hours (HOURS) a given employee (SSN) devotes to a given project (PROJNUM), we note that HOURS is functionally dependent on the combination of SSN and PROJNUM because a given employee can work on more than one project. Removal of either SSN or PROJNUM from the functional dependency results in an incorrect relationship. For example, if we were to remove SSN from the previous functional dependency, we are left with HOURS and PROJNUM, but we don’t know which SSN worked those HOURS! Thus, we say SSN, PROJNUM is a full functional dependency of
HOURS.
A table is in 2NF if that table is in 1NF and every non-prime (is not involved in a primary key of the table) attribute in that table is fully functionally dependent on the primary key of the table.
For example, the following table is in 1NF but is not in 2NF because PNAME and PLOCATION are dependent on only part of the primary key (PROJNUM and SSN). Likewise, ENAME is also only dependent on SSN. (Primary keys will be denoted in this document by underlining the included columns.) Employee_Project_Table
To correct this schema, we need to create additional tables and decompose the partial dependencies into these new tables, as in figure 4.
(Please note that, for simplicity, these diagrams will not denote the resulting referential integrity, such as foreign keys, that would need to be added to these decomposed schemas.)
Employee_Table
Third Normal Form
Third Normal Form (3NF) is based on the concept of "transitive dependency". A transitive dependency can be loosely defined as a dependency that does not involve the primary key. For example, in the table below, we see that while all elements have functional dependencies on the key, SSN, there also exist other, transitive, dependencies. Namely, DEPTNAME is dependent on DEPTNUM.
Employee_Department_Table
A table is in 3NF if it is in 2NF and no non-key attributes are dependent on other non-key attributes.
We can decompose the above table into 3NF by creating a second table for department. Thus, the following structure is in 3NF:
Employee_Table
Department_Table
There is a subtle difference between 2NF and 3NF. In 2NF, we were concerned about non-key fields being dependent on subsets of the key. In 3NF, we are concerned about non-key fields being dependent on other non-key fields. Another way to say this has been nicely summarized as: any non-key field must be "… Dependent on the key, the whole key, and nothing but the key."
Elementary Key Normal Form
Elementary Key Normal Form (EKNF) is a subtle enhancement on 3NF (by definition, EKNF tables are also in 3NF) that most often occurs when there is more than one unique composite key (more than one column) which overlap (one or more columns are involved in both keys) in a table 3. Such cases can cause redundant information in the overlapping column(s). For example, in the following table, let’s assume that a subject title (SUBJECTTITLE) is also a unique identifier for a given subject in the following table:
Enrollment_Table
The primary key of the above table is the combination of STUDENTNUM and SUBJECTCODE. However, we can also see a (non-primary) uniqueness constraint (alternate key) that should span the STUDENTNUM and SUBJECTTITLE columns as well. The above schema could result in update and deletion anomalies because values of both SUBJECTCODE and SUBJECTTITLE tend to be repeated for a given subject. The following schema is a decomposition of the above table in order to satisfy EKNF:
Subject_Table
Enrollment_Table
For reasons that will become obvious in the following section, ensuring a table is in EKNF is usually skipped, as most designers will move directly on to Boyce-Codd Normal Form after ensuring that a schema is in 3NF. Thus, EKNF is included here only for reasons of historical accuracy and completeness.
Boyce-Codd Normal Form
Like EKNF, the only time a table is in 3NF but is not in Boyce-Codd Normal form (BCNF) is when the table contains two or more candidate keys that overlap. Beyond that, there is only a subtle difference between EKNF and BCNF, which I will outline below. Consider the same example we used to illustrate EKNF, but we now add a column (GRADE) to denote a student’s grade received in the course. (Further, for illustrative simplicity, let’s assume that a student can only take a course once.)
Enrollment_Grade_Table
We see here that the GRADE column is dependent only on a given enrollment pair and that the keys are now elementary 3 (which satisfies EKNF). However, SUBJECTTITLE is dependent on SUBJECTCODE. Since the key of the table is STUDENTNUM and SUBJECTCODE, we decompose this structure into the following two tables which satisfy BCNF:
One may note that this would also have happened to solve our EKNF problem in the previous section. For that very reason, most designers seldom worry about EKNF and move straight on to BCNF.
Fourth Normal Form
The final normal forms are concerned with multi-valued facts. We can also note that they are concerned with composite keys, as they tend to minimize the number of fields involved in a composite key. A table is in Fourth Normal form if it is in BCNF and all functional dependencies are "single valued". Another way to state this is to say that a table cannot contain two or more independent "multivalued" 4 facts. By "independent", we mean to say that there is no direct connection between the two (or more) multivalued facts. This vague definition is better handled by example. In the following table (in BCNF, since it is entirely composed of attributes involved in the key), we record people (NAME), instruments they play (INSTRUMENT), and music styles (MUSICSTYLE) they play.
We see that redundancy occurs because a given person (NAME) can play more than one INSTRUMENT and play more than one MUSICSTYLE (the fact that ‘Hallock’ plays the ‘Piano’ is repeated, as is the fact that he plays the ‘Blues’ and ‘Classical’). Further, this table seems to suggest a link between instruments and music styles. Can ‘Hallock’ play ‘Blues’ with a French Horn 5? (Yes, and if you know Hallock, you know he plays the blues with anything, including spoons!)
In other words, we see that there are two independent multi-valued facts in the above table. The first is that a person (NAME) can play more than one INSTRUMENT while the second is that a person (NAME) can play more than one MUSICSTYLE. These facts are independent because these two facts have no bearing on each other. Decomposing this table into two tables (below) solves the problem.
Figure 12: The table in Figure 11 has been decomposed into a 4NF schema.
One should note that 4NF only applies to tables with three or more attributes (it eliminates overlapping multi-valued dependencies, which, by definition, require three or more attributes) and only when all attributes compose the primary key of the table.
Fifth Normal Form and Project Join Normal Form
Cases where a table is in 4NF but is not in Fifth Normal Form (5NF) are extremely rare. Further, Project Join Normal Form (PJNF) is a slightly stronger (although this is debated) case of 5NF, and in virtually all cases it can be treated as an equivalent. Therefore, PJNF is included here for completeness. As in 4NF, 5NF considerations apply only to tables with three or more attributes, all of which comprise the primary key. The formal definition of 5NF and PJNF requires that we must first define a "projection". A projection of a table is a subset of the total number of columns with no duplicate rows. For example, the following table:
has the following projections:
The training table in Figure 13 may or may not be in 5NF depending on the business rules. Say we have to enforce the rule:
An EMPLOYEE trains a CLASSTYPE for a COMPANY if and only if an EMPLOYEE trains a CLASSTYPE, the EMPLOYEE trains for a COMPANY, and the COMPANY the EMPLOYEE trains for makes a tool that implements the CLASSTYPE the EMPLOYEE trains.
If we enforce the above rule the table in Figure 13 is not in 5NF and must be reduced to three tables represented by the above projections of the original table.
To achieve 5NF, one checks all-key tables for decompositions whose joins result in the same information. A cautionary note, however, is that such decompositions can lead to a loss of constraint knowledge. For example, in the above case, we need to create database code to handle the specified rule between an EMPLOYEE, the CLASSTYPEs they train, and the COMPANY who makes the tool that implements the CLASSTYPE.
The root concept behind 4NF, 5NF, and PJNF is that the tables not in these normal forms can be derived from simpler, more fundamental relationships. Further, 5NF does not differ from 4NF unless there are other rules (symmetric constraints) that dictate correct data population Lastly, 5NF differs from 4NF in that the fact combinations we are concerned with are no longer independent from each other (due to the semantic constraints).
A database is a structured collection of records or data that is stored in a computer system. The structure is achieved by organizing the data according to a database model. The model in most common use today is the relational model. Other models such as the hierarchical model and the network model use a more explicit representation of relationships.
DBMS ( Database Management System )
A database management system (DBMS) is computer software that manages databases. DBMSes may use any of a variety of database models, such as the network model or relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way.
BIT, BYTE, FIELD
Bit
A bit is a binary digit, taking a logical value of either "1" or "0" (also referred to as "true" or "false" respectively). Binary digits are a basic unit of information storage and communication in digital computing and digital information theory. Information theory also often uses the natural digit, called either a nit or a nat. Quantum computing user qubits; single piece of quantum information encoded on a two level quantum system and hence having the potential to exist in superposition of "true" and "false".
Byte
A byte is a basic unit of measurement of information storage in computer science In many computer architecture it is a unit of memory addressing . There is no standard but a byte most often consists of eight bits.
A byte is an ordered collection of bits, with each bit denoting a single binary value of 1 or 0. The byte most often consists of 8 bits in modern systems; however, the size of a byte can vary and is generally determined by the underlying computer operating system or hardware. Historically, byte size was determined by the number of bits required to represent a single character from a Western character set. Its size was generally determined by the number of possible characters in the supported character set and was chosen to be a divisor of the computer's word zize. Historically bytes have ranged from five to twelve bits.
Field
Field is a set of byte-byte similar, in the database used the term attribute
Attribute/field
Attribute or Field is a characteristic from entity, which preparing detailed explanation about that's entity.
A relation could have atribute too.
Example of Attribute :
Student : NIM, Name, Sex, Address
Car : Plat Number, Color, CC
Book : ID, title, author
Type of Attribute
Single vs multivalue
Single > can only be filled at most one value
Multivalue > can be filled with more than one value with the same type of
Atomic vs composition
Atomic > can’t be divided into the attributes of smaller
composition > is a combination of several attributes of a smaller
Derived Attribute
attribute value can be derived from other attribute values, for example: age of the attributes generated from the date of birth.
Null Value Attribute
Attributes that have no value to a record
Mandatory Value Attribute
Attributes must have values
Record / Tuple
Record is a data line in an relation. Consist of attributes where there attribute can interaction to completely information a entity / relation.
Entity or File
Entity is a collection of same kind of record and having same element, same attribute however different in their data value
type of Entity:
in processing application, file can be categorized likes:
- mains file
- transaction file
- report file
- history file
- protector file
- work file
Domain
Domain is collection of values which enabled to stay in one or more attribute. Every attribute in a relational database is defined like a domain
Key of element data
Key is the element of record which used to find the record when accessing or can also used to identify every entity / record / row
Type of key :
superkey
A superkey is defined in the relational model of database organization as a set of attributes of a relation variable (relvar) for which it holds that in all relations assigned to that variable there are no two distinct tuples (rows) that have the same values for the attributes in this set. Equivalently a superkey can also be defined as a set of attributes of a relvar upon which all attributes of the relvar are functionally dependent.
Candidate key
In the relational model, a candidate key of a relvar (relation variable) is a set of attributes of that relvar such that
1. at all times it holds in the relation assigned to that variable that there are no two distinct tuples with the same values for these attributes and
2. there is not a proper subset of this set of attributes for which (1) holds.
Since a superkey is defined as a set of attributes for which (1) holds, we can also define a candidate key as a minimal superkey, i.e. a superkey of which no proper subset can also be called as a candidate key.
The importance of candidate keys is that they tell us how we can identify individual tuples in a relation. As such they are one of the most important types of database constraint that should be specified when designing a database schema. Since a relation is a set (no duplicate elements), it holds that every relation will have at least one candidate key (because the entire heading is always a superkey). Since in some RDBMSs tables may also represent multisets (which strictly means these DBMSs are not relational), it is an important design rule to specify explicitly at least one candidate key for each relation. For practical reasons RDBMSs usually require that for each relation one of its candidate keys is declared as the primary key, which means that it is considered as the preferred way to identify individual tuples. Foreign keys, for example, are usually required to reference such a primary key and not any of the other candidate keys.
Primary Key
One of the key attributes of the candidate can be selected / specified a
primary key with the three following criteria:
1. Key is more natural to use as reference
2. Key is more simple
3. Key is guaranteed unique
Alternate Key
is an attribute of the candidate key is not selected to be primary key.
Foreign Key
is any attribute that points to the primary key in another table. Foreign key will be going on a relationship that has kardinalitas one to many or many to many. Foreign key is usually always put on the table that point to many.
External Key
is a lexical attribute (or set of lexical attributes) that values are always identify an object instance.
ERD (ENTITY RELATIONSHIP DIAGRAM)
Data models are tools used in analysis to describe the data requirements and assumptions in the system from a top-down perspective. They also set the stage for the design of databases later on in the SDLC.
There are three basic elements in ER models:
Entities are the "things" about which we seek information.
Attributes are the data we collect about the entities.
Relationships provide the structure needed to draw information from multiple entities.
Generally, ERD's look like this:
ELEMENTS OF THE ERD
Entity
In the ER Diagram Entity is described with the form of a rectangle. entity is something that exists in the real system and the abstract where the data stored or where there are data.
Relationship
ER diagram on the relationship can be described with a lozenge. Relationship is a natural relationship that occurs between entities. In general, the name given to the verb base making it easier to do readings relationships.
Relationship Degree
is the number of entities participating in a relationship. Degree which is often used in the ERD.
Attribute
is the nature or characteristics of each entity and relationship
Kardinalitas
tupel indicates the maximum number that can be berelasi with entities on the other entity
Relationship Degree
Unary Relationship is a relationship model between entity coming from same entity set.
Binary Relationship is a relationship model between 2 entity.
Ternary Relationship is a relationship between instance from 3 type of entitas unilaterally.
Cardinality
Cardinality indicates the maximum number of tables that can be relation with the entity on the other entity
Type of Cardinality :
One to One:
Level one to one relationship with the one stated in the entity's first event, only had one relationship with one incident in which the two entities and vice versa.
One to Many or Many to One:
Level one to many relationship is the same as the one to many depending on the direction from which the relationship is viewed. For one incident in the first entity can have many relationships with the incident on the second entity, if the one incident in which two entities can have only one incident hubugan with the first entity.
Many To Many:
if any incident occurs in an entity has many relationships with other entities in the incident.
NOTATION (E-R DIAGRAM)
Symbolic notation in the ER diagram is:
1. Rectangle represent the collective entity
2. Circle represent the attributes
3. Rhomb represent collective relationships
4. Line as the set of relations between the Association and the collective entity with entity attributes
Reference:
ER Ngurah Agus Sanjaya. Slide Part 5 - Database Dan Er-Diagram.