Joining Data from Multiple Tables: Link Types Explained

Relational databases store data efficiently across many tables related to each other by common fields or columns. Reporting on relational databases requires some knowledge about those relationships and columns in order to link tables together in a way that will give you accurate results. Here are a few examples of 1:many relationships:

•    1 Sales Professional is authorized to sell many products 
•    1 Company has many contacts in different departments
•    1 police or fire call has many units responding

There are several tools available for reporting on relational databases.  Regardless of the reporting tool, you need to first understand the way the tables are set up and what common field connects or joins the information in those tables together.  Once you know that, you can decide what style of link or join will meet your needs.

For the purpose of explaining, I will describe two tables the “Main | Left” table and the “Many | Right” table and have designated examples of these in scenarios below.

1tomanyrelationship.png

Inner Join

Inner Joins are often the default setting in the reporting tool.  This returns records from both tables only if they have a match in each table.  Inner joins are designed for occasions where you only care to see data where there are matches present in both tables.  Because it is default, it is the most commonly used, but it can cause miscounts if records are unexpectedly left out of the results when no match is found.  

InnerJoin.png

Outer Joins are designed for occasions when you need to see data, even if it does not match anything in the other table.  We are given the option of the Left Outer Join, the Right Outer Join and the Full Outer Join.

Left Outer Join

The Left Outer Join displays all records from the left main table and only the matching records from the right many table. This is the safest method for joining two tables where full record counts are required, even if data is missing, and for this reason, it is also very commonly used. 

LeftOuterJoin.png

Right Outer Join

The Right Outer Join displays all records from the right table and only the matching records from the left table.  This style of linking is not as common.  In most relational databases, there isn’t much opportunity for a Many table to have a record without an associated Main table record.  For this reason, right outer join can be used to identify potential problems in the database.

RightOuterJoin.png

Full Outer Join

The Full Outer Join will show all records from both tables regardless of a match. This style of linking is also not as common.  I have seen this used primarily in HR/Finance for error checking and reconciliation.  For example, it’s sometimes necessary to identify instances where there is a payroll record without an employee 401k contribution or a employee 401k contribution without an accompanying payroll record.  This form of join lets you quickly identify problem records and make corrections.

FullOuterJoin.png

In Summary

When in a hurry, the safest option is always the left outer join where the left table is the main table and the right table is the many table.   This will ensure that you have all of your main records present plus the associated matching records from the many table.  When you need to find errors, try a full outer join to see where matches are missing.  

There’s no harm in linking different ways and looking at results until you find the right join strategy for your tables.  The most important ingredient of any report is data familiarity, and experimenting with link types can help you learn more about the way your data is setup.