What is the Data Dictionary control?
Strike Graph's default control language is: "A data dictionary is maintained to detect errors based on the edit check configurations within the database."
Customize this language to reflect the specific process your organization has defined. For example, if your data dictionary contains additional content or is used in a different way, update the control language to reflect that. Your stated control description must accurately reflect how your organization implements this control.
Who’s involved with this control?
Typical control owner: Data Privacy Officer or CISO
Typical parties involved: Database Administrator
How often should I perform this control?
Typical frequency: Continuously
Which framework is this for?
GDPR
SOC 2 (with Privacy or Processing Integrity)
ISO 27701
Why is this control important?
“A data dictionary is used to communicate the structure and content of data and provides meaningful descriptions for individually named data objects.” (USGS.gov) In plain English, a data dictionary is a list of all data elements in a data set. It includes each data element's name and description, its properties (such as data type, size, and nullability), and other metadata that is helpful for decision making.
Maintaining an inventory of all data elements collected and processed by your organization allows you to apply appropriate protections. It is fundamental to your compliance with data privacy laws and regulations as you will need to know what data elements you are collecting from data subjects in order to ensure that the data elements remain valid.
In addition, knowing where sensitive or confidential data sits allows you to apply appropriate information security controls and will assist when a data subject requests an output of their information.
How do I demonstrate this control?
Provide a system-generated extract of all in-scope data elements used in the system. This is typically a list taken directly from the database that shows each field, its edit check, and a brief description of the data in the field. You may need to manually add the data classification category, data owner, and any other relevant metadata.
For example, if you collect a data subject’s name, zip code, and phone number, then you want to show the description of these fields in your database. The name field may have a requirement to be a string of text with no symbols allowed, the zip code may be defined as a string of 5 numbers, and the phone number may be defined as a string of 10 numbers (with or without a dash). Ideally, your data dictionary will also include a data classification category for each data element, a data owner, and any other relevant metadata.
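The edit checks described above can be sketched in Python. This is a minimal illustration, not a prescribed implementation: the field names, the `EDIT_CHECKS` mapping, and the `failed_checks` helper are hypothetical. The name pattern reads "no symbols allowed" as letters and spaces only, which is an assumption, and real names may need a more permissive rule.

```python
import re

# Hypothetical edit checks mirroring the three example fields above.
EDIT_CHECKS = {
    # Assumption: "no symbols" means letters and spaces only.
    "name": re.compile(r"[A-Za-z][A-Za-z ]*"),
    # A string of exactly 5 digits.
    "zip_code": re.compile(r"\d{5}"),
    # 10 digits, with or without dashes between the groups.
    "phone_number": re.compile(r"\d{3}-?\d{3}-?\d{4}"),
}


def failed_checks(record: dict[str, str]) -> list[str]:
    """Return the names of fields whose values fail their edit check."""
    return [
        field
        for field, pattern in EDIT_CHECKS.items()
        if not pattern.fullmatch(record.get(field, ""))
    ]
```

A record that passes every check returns an empty list. A missing field is treated as an empty string, so it fails its check.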
Your extract may look like this:
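As a hedged illustration, the Python sketch below builds such an extract from a database schema. It assumes a SQLite database and a hypothetical `data_subjects` table. The `MANUAL_METADATA` mapping, its descriptions, classifications, and owners, and the `build_data_dictionary` helper are all made up for this example. Other database engines expose column metadata differently, for instance through `information_schema` views.

```python
import sqlite3

# Hypothetical metadata added by hand: (description, classification, owner).
MANUAL_METADATA = {
    "name": ("Data subject's full name", "PII", "Data Privacy Officer"),
    "zip_code": ("5-digit postal code", "PII", "Data Privacy Officer"),
    "phone_number": ("10-digit phone number", "PII", "Data Privacy Officer"),
}


def build_data_dictionary(conn: sqlite3.Connection, table: str) -> list[dict]:
    """List each column of `table` with its type, nullability, and manual metadata."""
    # pragma_table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    columns = conn.execute(
        "SELECT * FROM pragma_table_info(?)", (table,)
    ).fetchall()
    dictionary = []
    for _cid, name, col_type, notnull, _default, _pk in columns:
        description, classification, owner = MANUAL_METADATA.get(
            name, ("", "Unclassified", "Unassigned")
        )
        dictionary.append({
            "field": name,
            "data_type": col_type,
            "nullable": not notnull,
            "description": description,
            "classification": classification,
            "owner": owner,
        })
    return dictionary


# Demo against an in-memory database with the example fields.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data_subjects (
        name TEXT NOT NULL,
        zip_code TEXT CHECK (length(zip_code) = 5),
        phone_number TEXT
    )
    """
)
for entry in build_data_dictionary(conn, "data_subjects"):
    print(entry)
```

Note that `pragma_table_info` does not return `CHECK` constraints, so in this sketch the edit check for each field would still need to be taken from the table's schema definition or documented manually.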
Do I need to define every data identifier?
No. Focus only on the data elements that contain PII. Create this list now and maintain it going forward.
