databases store many years of history, some columns or cells will not have
values for all time periods because the data was not applicable or did not exist
during certain time periods. If spikes appear on trend analysis graphs, the meta
data repository should be consulted to determine the applicability of that
particular piece of data.
Timeliness: Business people will want to know when the source data was last
updated and which of the versions of the operational systems were used for the
update. Not all operational systems run daily or on the same day of the month.
One operational system may run on the last calendar day of the month while
another may run on the last business day of the month. Some operational
systems do not "close out the month" until they complete an adjustment run
four to ten days after the last calendar day of the month.
Physical Characteristics
Origin (source): Since BI target databases only store existing operational data
(internally generated and externally purchased), the origin or source for each
data element should be documented. One column in the BI target database can
be populated with data elements from multiple sources. For example, the
column Account Balance in the Account table could be populated from the data
element Demand Deposit Account Balance in the Checking Account source
database and from the data element Time Deposit Account Daily Balance in the
Savings Account Transaction file. Conversely, one source data element can feed
multiple columns in the BI target database. For example, the data element
Type Code may be used for two purposes in the operational system. The data
values "A", "B", and "C" of Type Code may be used to populate the column
Customer Type Code in the Customer table, and the data values "N", "O", and
"P" of the same Type Code may be used to populate the column Product Type
Code in the Product table.
Physical location: Several meta data components (e.g., tables, columns,
dataset names) should describe where the data resides in the BI decision-
support environment.
Transformation: Very few data elements can be moved from source to target
without any type of transformation. At a minimum, the data type and length
may have to change, or single-character codes may have to be translated into
multi-character mnemonics. In the worst case, lengthy business rules may
require more complicated transformations involving editing, filtering,
combining, separating, or translating data values.
Derivation: This component stores the calculation for derived columns. While
derived columns are customarily not stored in operational systems, it is the
norm to store them in BI target databases.
Aggregation and summarization: Similar to derivation, aggregation and
summarization rules should be stored as meta data.
Volume and growth: The size and growth of BI target databases are often
enormous. Therefore, projected as well as actual volumes should be
documented as meta data in terms of the number of rows and the percentage of
expected growth.