Terms – Data Integrity and Normalization

Data Integrity and Normalization: Explain the following terms

  1. Synonym
  2. Homonym
  3. Redundancy
  4. Mutual Exclusiveness of Data

1.       Synonym

A synonym is a type of problem that exists in relation. A synonym is created when two different names are used for the same information or attribute. The name of attribute must be same if it exists in two or more relations.

The following example displays two relations with synonym problem:

ITEM

The above relations are inter-related. The ITEM relation contains information about the items that are supplied by the supplier. The SUPPLER relation has an attribute Supplier_ID. The ITEM relation is referring to Supplier_ID with Supplier_Code that is wrong. It must also use Supplier_ID in order to refer to Supplier_ID attribute of SUPPLIER relation.

2.       Homonym

A homonym is a type of problem that exists in relations. A homonym is created when same name is used for two different attributes.

The following example displays two relations with Homonym problem:

DUSTOMER                                        SUPPLIER

Company_Name                              Company_Name

The attribute Company_Name is appearing in both relations. It may create confusion. The solution is that unique attribute names must be used in all relations to avoid confusion.

3.       Redundancy

Redundancy means duplication of data in multiple files. It is a type of problem that exists in relations. It is created when the same information is unnecessarily stored in two different ways of forms.

The following example displays a relation with redundancy problem:

EMPLOYEE

Date_of_Birth

Age

The relation contains two attributes. The first attribute stores the date of birth of an employee. The second attribute stores the age of an employee. The age can be calculated by using the date of birth. It means that Age attribute is not necessary. It is creating redundancy in the relation and must be dropped from the relation.

4.       Mutual Exclusiveness of Data

The data does not have overlapping information is known as mutually exclusive data. The mutual exclusiveness of data creates problem in some cases. It creates problem for the attributes whose values can be specified as “Yes/No” form. Sometimes, two or more such attributes cannot be true or false at the same time for one entity.

The following example displays a relation with this problem:

EMPLOYEE

Married

Single

This above two attributes cannot be true or false at the same. The problem occurs if “Yes” is selected in both attributes. The problem can be solved by using a larger categorical attribute. The above relation can have an attribute “MARITAL_STATUS”. The possible values in this attribute can be “M” and “S” where “M” indicates “Married” and “S” indicates “Single”.

EMPLOYEE

Marital_status