Sounds very complicated doesn’t it? What exactly is an entity and how can they have relationships? What use could that possibly be to anyone? Well let’s take it one step at a time and begin with what an entity relationship diagram is, shall we? An entity relationship diagram, or ERD, is a diagram in a flow chart type style that shows how different things like customers, orders, products, etc, are connected to each and the types of interactions they have together.
ERD’s are generally used for designing a data base of some description. However, before you can really start putting one together there’s a few things you have to decide first, chiefly figuring what the entities are. An entity is usually considered to be anything that has a noun. So for example in this list of potential entities, what are the correct ones?
Owner, dog and cat are the actual entities here because they have to be things that can have attributes that describe them. So for example, dog would have attributes like breed, age, colour, attitude, etc. The owner could have attributes such as name, age, address, number of pets, career, etc. Attributes are generally considered to be the adjectives that describe an entity.
Understanding an Entity
Now that we have that out of the way you may be wondering what does this have to do with databases? Probably better to just show you an example of an ERD before we get in to the details. Have a look at the below diagram and see if it makes any sense to you.
It probably seems more complicated than it is but lets talk through it anyway. This is the ERD for a company that wants to set up a database for details about it’s employees and departments as well as any appointments made with clients. The entities are; job description, employee, department, appointment, report, attachment, client and address.
Let’s take the employee entity to start with. Bear in mind that each employee will be entered in to the database following the format of this entity, so in this case, each of them will have to have their first name, last name, home phone number, job description and department information available for the database administrator. These are pretty self explanatory but you probably noticed the employee ID attribute as well. This is what’s called a primary key and is vital for any database. This is an ID that is unique for each individual entity in the database and is used to reference a specific person or object. So if an employee quits, the database administrator just gets their ID number and deletes it from the database, instead of manually deleting all of their details from every system they have used. You’ve probably been given a staff/student ID number in the past and now you know why!
Did you notice anything about the job description and department attributes? They are what’s called foreign keys. These are not necessary for every single entity as they reference another entity that the original entity may be a part of. Let’s say there are four departments in this company; sales, accounts, admin and marketing. The department entity has the primary key “departmentID” so each of the departments could be identified as sales:1, accounts:2, admin: 3, marketing: 4.
In this case the foreign key simply states that the employee with, let’s say, an ID of 10 (the primary key) , works in department 3 (foreign key saying they work in the admin department) and has a job description (which would be stored in the job description entity with it’s own ID), let’s say 15 for IT administrator. So from this info we can gather a lot of data fairly quickly without having to go to too much trouble.
Connecting Lines and Cardinality
But how does this relate to the rest of the diagram and what are those lines and circles connecting each entity? Those lines, or connecting lines as they are so creatively called, represent the type of relationship each entity has to one another. So in this diagram we can see that “department” is connected to “employee”, which makes sense, but is not connected to “address” as there is no logical connection between the two of them. You can see at the end of each connecting line there is a combination of vertical lines, three lines that fan out, known as crow’s feet, as well as circles.
These symbols refer to the cardinality of the connection between entities, or “how many instances of an entity relate to one instance of another entity”(1). So in plain old English that means that the “client” entity can have one address and that the address entity can have many clients. This makes sense when you consider that a client would have at least one work address, such as their office or factory, but that same address could hold many of this company’s clients.
Now those vertical lines and circles come in to effect. The vertical line denotes that the side of the relationship it is on is mandatory and the circle denotes the other side is optional. So for the full relationship between job description and employee it would read as the following; an employee has to have at least one job description, shown as the following:
As well as this a job description can have many employees, for example a sales agent job description can be applied to many employees, denoted as the following:
Makes sense right? So if we are to take the whole diagram and read it in such a way we would get the following:
- An employee has to have a job description but a job description can have many employees.
- An employee has to have a department and a department has to have many employees.
- An employee can have many appointments and an appointment has to have at least one employee.
- An appointment can have a report but a report has to part of an appointment.
- A report can have many attachments but an attachment has to be part of a report.
- An appointment has to have at least one client but a client can have many appointments.
- A client has to have an address but an address can have many clients.
Something else that needs to be consider when putting together an ERD is if there are any many-to-many relationships. Databases are not fans to many-to-many relationships as they can cause issue when running queries later on. Consider the following:
While it might make perfect sense to say that a project can have many employees working on it and an employee can be part of many projects this would cause problems in the database. How are we to know who is doing what project? We would need multiple primary keys in each entity which would cause a lot of unnecessary confusion. A lot of database won’t even allow you to store a many-to-many relationship to avoid this. According to Michael Blaha of DataVersity (2)many to many relationships “permit multiple records for a combination of (two entities), which was not the intent of (original) relationship“.
So how do we get around this? It’s quite straight forward really, we simply add another entity to hold the details of both the entities, such as this:
By taking this approach we can easily look up details of who is doing what project without causing issue with the database.
Actions within an ERD
Sometimes you might come across an ERD that contains diamond shapes on the connecting line between entities. These are called actions and they are used to “show how two entities share information in the database” (1). I think that this diagram quite accurately sums up the whole concept:
They are used more as a way for us to to understand the exact process that is taking place. If you were to look at this example without the actions it would be quite clear that a writer creates a novel and a consumer buys it but for a more complex database these actions may not be so obvious.
So there you have it, ERD’s summed up in about thirteen hundred words. As you can see these are pretty useful tools for organizing and understanding your data and saves you having to consider all of this when filling out your database. They cover the who, what, when and where of your database but the why is most likely because it is your job, or else you just really like databases! You can imagine that these diagrams can get very large and complicated very quickly but you now have the tools to get a general understanding of what is what regarding ERDs. I’ll leave you with this beast of a diagram, see if you can get your head around it because I’m done…enjoy!
1: Entity Relationship Diagram – Common ERD Symbols and Notations . 2017. Entity Relationship Diagram – Common ERD Symbols and Notations . [ONLINE] Available at: https://www.smartdraw.com/entity-relationship-diagram/. [Accessed 13 March 2017].
2: DATAVERSITY. 2017. More Database Design Errors – Confusion with Many-to-Many Relationships. [ONLINE] Available at: http://www.dataversity.net/more-database-design-errors-confusion-with-many-to-many-relationships/. [Accessed 14 March 2017].