Referential Integrity in MySQL
By Ian Gilfillan
This tutorial briefly explains the concepts of referential integrity, and looks at how MySQL enforces them with its definition of foreign keys. You should be using a stable version of MySQL 4 to follow all the examples, although some examples may work with earlier versions running InnoDB tables.
What is referential integrity?
Simply put, referential integrity means that when a record in a table refers to a corresponding record in another table, that corresponding record will exist. Look at the following:
There are 2 customers in the customer table, but 3 customer_id’s in the customer sales table. Assuming the two tables are linked with the customer_id field, you can tell that Nhlanhla has an amount of 23, and Anton 81. However, there is no corresponding name for customer_id 3. Foreign key relationships are described as parent/child relationships (customer being the parent, and customer_sales the child), and the record is said to be orphaned when its parent is no longer in existence.
A database in this sort of condition is referred to as having poor referential integrity (there are other kinds of integrity problems too). This is not necessarily a serious problem - one of the primary systems I work uses MyISAM tables, and has loads of orphans: article blurbs and article bodies not linked to any articles, but these don’t do much harm besides prickle my aesthetic sensibility, and we’ve never needed to fix this. However, it is not good design, and can sometimes lead to problems, so you should avoid a situation like this where possible.
In the past, the MySQL DBMS could not enforce this, and the responsibility passed to the code to do so. But this wasn’t good enough for serious systems, and one of the most frequently requested features in later versions of MySQL was that of foreign keys, enabling MySQL data to maintain referential integrity. A foreign key is simply a field in one table that corresponds to a primary key in another table. In the example above, customer_id would be the primary key in the customer table, uniquely identifying each record, and transaction_id would be the same in the customer_sales table. In the customer_sales table, the customer_id field could be an example of a foreign key, referring to its namesake in the customer table. A transaction should not exist without an associated customer. The code that generated these tables is clearly buggy!
Defining Foreign Keys in MySQL
Strictly speaking, for a field to be a foreign key, it needs to be defined as such in the database definition. You can ‘define’ a foreign key in any MySQL table type (including the default MyISAM table type), but they do not actually do anything - they are only used to enforce referential integrity in InnoDB tables.
In order to create a foreign key, you need the following:
- Both tables need to be InnoDB tables.
- To use the syntax FOREIGN KEY(fk_fieldname) REFERENCES table_name (fieldname)
- The field being declared a foreign key needs to be declared as an index in the table definition
Here is how you would define the two tables above with a foreign key:
CREATE TABLE customer ( customer_id INT NOT NULL, name VARCHAR(30), PRIMARY KEY (customer_id) ) TYPE = INNODB; CREATE TABLE customer_sales ( transaction_id INT NOT NULL, amount INT, customer_id INT NOT NULL, PRIMARY KEY(transaction_id), INDEX (customer_id), FOREIGN KEY (customer_id) REFERENCES customer (customer_id) ) TYPE = INNODB;
If you get the rather unhelpful error message:
ERROR 1005: Can't create table './test/customer_sales.frm' (errno: 150)
then check your foreign key definitions carefully - something is wrong with the definition. Common causes are a table not being of type InnoDB, a missing index on the same field (customer_id), or attempting to set a field to NULL when it cannot be (see the ON DELETE SET NULL clause below).
Referential integrity can be compromised in three situations: when creating a new record, deleting a record or updating a record. The FOREIGN KEY (transaction_id) REFERENCES customer (customer_id) clause ensures that when a new record is created in the customer_sales table, it must have a corresponding record in the customer table. After creating the above tables, insert the following data, which we will use to demonstrate some of the concepts:
mysql> INSERT INTO customer VALUES(1,'Nhlanhla'),(2,'Anton'); Query OK, 2 rows affected (0.00 sec) mysql> INSERT INTO customer_sales VALUES(1,23,1),(3,81,2); Query OK, 2 rows affected (0.00 sec)
Now insert the third record, referring to the non-existent customer 3:
mysql> INSERT INTO customer_sales VALUES(2,39,3); ERROR 1216: Cannot add or update a child row: a foreign key constraint fails
You cannot add the record, as customer_id 3 does not exist. The constraint has ensured your data keeps its integrity! However, what happens when we delete a record? Let’s add a customer 3, then add the customer_sales record again, after which we delete the 3rd customer:
mysql> INSERT INTO customer VALUES(3,'Malvin'); Query OK, 1 row affected (0.00 sec) mysql> INSERT INTO customer_sales VALUES(2,39,3); Query OK, 1 row affected (0.01 sec) mysql> DELETE FROM customer WHERE customer_id=3; ERROR 1217: Cannot delete or update a parent row: a foreign key constraint fails
So the constraint holds, and we would need to first delete the record from the customer_sales table. There is a way we could have allowed the delete to go ahead, which we will look at shortly, but first we will need to drop and recreate the index.