Introduction
One of the key issues when it comes to data analysis is the accuracy of your data or rather the lack thereof. Most of our time is spent with cleaning the data before we can actually glean any insight from it.
Rather than cleaning data after it has been collected, one should aim at getting the correct data in the first place.
Enter left, data validation at the point of data entry. In some cases this means the application has to check that the correct data format has been used. A good example here is dates. In other cases, where only a defined set of values can be entered it is possible to provide a list of these values to choose from. A more sophisticated version of this setup is where multiple values need to be defined but each value is dependent on the previous one. An example here could look at cars with Volkswagen > Golf > Gold TDI 1.6 litres being one example.
When you work with Excel as a data entry tool, you are probably familiar with the fact that it provides pretty robust data validation features. In this article I want to show you a way that allows you to implement dynamic dependent data validation lists.
OK. Having said that there are two potential caveats though:
- the approach is based on the new dynamic array formulas UNIQUE and FILTER that are only available in Microsoft 365.
- In order to work properly when it comes to using this method in a data entry table it requires one line of VBA.
Here is a screenshot of the final worksheet as a guide
Step 1 – The Reference table
The first step is obviously the creation of your reference table. This is the table that contains the hierarchical data that will be used to create the data vaildation lists to use.
For the purpose of this article I have created a list based on four company divisions, each of which has four departments and each department includes 4 sections. Make sure you format the reference data as a table as this will ensure later that everything is fully dynamic. What I mean by that is that if you add additional reference data everything will continue to work without any further modifications.
Step 2 – Create the dynamic ranges that will become the source of the data validation
As I said this article provides a solution to dependent drop down lists based on the new dynamic array functions Unique and Filter. One key feature of these functions is that they automatically spill down, ie. they fill the column with all the values that are returned by the function. Unfortunately these functions cannot be used directly inside the data validation data source field. Interestingly though you can reference the spilled array range using the hashtag notation (eg I2#) where the initial cell reference denotes the starting cell and the hash tag indicates that this is a spilled range.
Simple dependent data validation
Lets start with the simple drop down validation scenario. The screencast below indicates what I mean by that.
In order to create these three drop down lists we need three functions to create the different validation lists. Since we can’t use these functions directly in the data validation definition we need to evaluate them in the spreadsheet and reference the result of that in the data validation definition.
Setting up validation for Divisions
For the division list we simply need a unique list of Division derived from the first column of our Reference table. I have named this table ReferenceTable for convenience and the formula to use is simply
=UNIQUE(ReferenceTable[Division])
Entering this formula in a single cell will result in what is called a spilled range, i.e the unique values from the Division column will be spilled down from where the formula is entered.
Once we have this list available we can use the spilled range as the source for our first data validation.
As it is a spilled range the length of the range could be variable. For instance if we add additional divisions to our reference table we would have more than four rows. Therefore there is a special notation for spilled ranges. You simply reference the first cell followed by a hash key
=E4#
Setting up validation for Departments
Now that we have our first level for the data validation we can move on to the validation for the department.
The principle is the same but this time we need to use the FILTER function. As the name suggests this function filters a list, table or range based on some filter criteria. The list we need is obviously the Department column from the Reference table. And the filter is based on the Divison column and is conditional on the value we slected for the division.
There is only one caveat. There are several duplicates for each department because of the fact that each department can have multiple sections. In order to get a unique list we just wrap the FILTER function in another UNIQUE function.
The final function therefore looks like this
UNIQUE(FILTER(ReferenceTable[Department],ReferenceTable[Division] = E16,""))
If you look carefully you will see that after the Filter criteria there is another option we can specify. This option defines what is shown if the Filter returns no values. In this case we simply return an empty string.
After setting up this formula we just follow the same steps as we did for the Division validation to define the Data Validation Source. Only differnece being is that we entered this formula in cell F4 and therefore the spilled range needs to refer to F4#.
Setting up validation for Sections
The steps we follow to establish the Section Validaton List is very similar to how we establish the Department Validation List. The only difference is that we have two criteria to evaluate – Division and Department, and both have to be true. These two filter criteria are simple logical tests that can either evaluate to true or false. We can exploit the fact that Excel saves these values as 0 (false) and 1 (true) internally. Therefore we simply have to multiply the two criteria, which will result in true (1) only if both criteria evaluate to true (1) but otherwise in false (0).
And here is the final function:
=UNIQUE(FILTER(ReferenceTable[Section],(ReferenceTable[Division]=E16) *(ReferenceTable[Department]=F16),""))
Dependent Dropdown validation in data entry table
Now that we know how to set up simple dependent data validation, lets look at how to do it in a table.
Obviously we are going to use the same formulas as before. The challenge this time though is the fact that the cells that we need to reference for the filter criteria change with each row. This means we cannot hard code them as we did before. Excel has a handy formula for this situation – OFFSET. At a minimum the function takes three parameter – a cell reference, a relative row offset and a relative column offset.
So if we want to kow what the value of the cell is that is one column to the left and is on the same row from the cell "D9" the offset function would look like this:
=OFFSET(D9,0,-2)
This is where I got a bit unstuck. Although offset is the perfect function for this scenario, the cell reference in the offset function still needs to change when another cell is selected. Unfortunately functions only recalculate when there is a change in the data but not when you simply select another cell – kind of makes sense though. So how do we address this challenge. This is where we need the one line of VBA code. In VBA it is very simple to do. You simply add the following function to the ThisWorkbook Excel object in the VBA editor (Developer tab > Visual Basic).
Private Sub Workbook_SheetSelectionChange(ByVal Sh As Object, ByVal Target As Range)
Worksheets("Dynamic dropdown").Range("B1").Value = "''" & ActiveSheet.Name & "'!" & ActiveCell.Address
End Sub
The result of adding this function is that every time you change your cell selection the cell address will be recorded in cell B1 of this sheet.
This now give us a fixed cell that provides the information of the current cell for the offset function. At the moment the value given in cell B1 is a text value, that we need to convert into an actual cell reference. For this purpose we have the INDIRECT function. The final offset function therefore reads as follows
= OFFSET(INDIRECT(B1),0,-1,1,1)
where the INDIRECT function provides the actual reference for the active cell (=selected cell).
Setting up validation in the data entry table for Divisions
After having talked about OFFSET and INDIRECT the first formula that we use is actually exactly the same as for the simple drop down validation function. Since the top level values are not dependent on anything we also just need the unique list of available divisions – Ergo nothing changes.
Setting up validation in the data entry table for Departments
As for the departments there is a single dependency. When we click in a department field, The value for the division is in the cell left of the current active cell. The respective drop down validation function can be written as
=UNIQUE(FILTER(ReferenceTable[Department],ReferenceTable[Division] = OFFSET(INDIRECT(B1),0,-1,1,1), "No Divison selected"))
All we need to do is to replace the hard coded cell reference in the original function with the appropriate OFFSET function referencing the cell before.
Setting up validation in the data entry table for Sections
The approach is the same for the data validation function for the sections. Here we have two dependencies. The first one referencing the division (2 cells to the left) and the department (one cell to the left).
As before we simply replace the hard coded references used in the function for the simple dependent drop down validation function with the appropriate OFFSET functions.
=UNIQUE(FILTER(ReferenceTable[Section],(ReferenceTable[Division]=OFFSET(INDIRECT(B1),0,-2,1,1))*(ReferenceTable[Department]=OFFSET(INDIRECT(B1),0,-1,1,1)),""))
Setting up the data validation source
Now that we have updated the formulas that we need to return the necessary values for the data validation we need to set up the actual data validation in our table. For each column in the table we follow the same steps as for the simple dropdown data validation and add the same data validation sources we used for the simple data validation. The good news is that as we expand the data entry table the data validation dropdown lists will be available automatically.
Conclusion
I hope you enjoyed this article on how to set up dependent dropdown data validation lists for a data entry table. Using the new dynamic array formulas UNIQUE and FILTER in conjunction with the old trusted friends of OFFSET and INDIRECT this is actually a rather simple task. The only fly in the ointment is the VBA code (but only because some people hate VBA – as far as I am concerned it doesn’t really pose a problem). If you can think of a way to reference the active cell without VBA let me know.
In this example I had the reference table as well as the formulas defining the data validation lists on the same sheet as the data entry table using data validation. Obviously you can have them on different sheets. The only change that you would need to make is that the data validation sources when setting up the actual data validation in the cell would need to reference the spilled ranges on the other sheet.
Share Your Thoughts