What is Metadata
Metadata is simply defined as data about data: data that is used to describe
other data. For example, the index of a book serves as metadata for the
contents of the book. In other words, metadata is summarized data that leads
us to the detailed data.
In terms of a data warehouse, we can define metadata as follows.
· Metadata is a road map to the data warehouse.
· Metadata in a data warehouse defines the warehouse objects.
· Metadata acts as a directory. This directory helps the decision support
system locate the contents of the data warehouse.
Note: In a data warehouse, we create metadata for the data names and
definitions of the given warehouse. In addition, metadata is created for
time-stamping any extracted data and for recording the source of the
extracted data.
Categories of Metadata
Metadata can be broadly divided into three categories:
1. Business Metadata - This metadata holds data ownership information, business
definitions, and changing policies.
2. Technical Metadata - This metadata includes database system names, table and
column names and sizes, data types, and allowed values. It also includes
structural information such as primary and foreign key attributes and indices.
3. Operational Metadata - This metadata includes currency of data and data
lineage. Currency of data indicates whether the data is active, archived, or
purged. Lineage of data is the history of the data as it was migrated and the
transformations applied to it.
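The three categories can be pictured as three groups of fields attached to one warehouse table. The following is a minimal sketch, not a standard schema: every class name, field name, and sample value (such as `fact_sales` or `warehouse_db`) is a hypothetical choice made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class Currency(Enum):
    """Currency of data: whether it is active, archived, or purged."""
    ACTIVE = "active"
    ARCHIVED = "archived"
    PURGED = "purged"


@dataclass
class BusinessMetadata:
    owner: str                 # data ownership information
    business_definition: str   # what the data means to the business
    changing_policy: str       # how changes to the data are governed


@dataclass
class ColumnInfo:
    name: str
    data_type: str
    size: Optional[int] = None
    allowed_values: Tuple[str, ...] = ()


@dataclass
class TechnicalMetadata:
    database_system: str
    table_name: str
    columns: List[ColumnInfo]
    primary_key: Tuple[str, ...] = ()
    foreign_keys: Dict[str, str] = field(default_factory=dict)
    indices: Tuple[str, ...] = ()


@dataclass
class OperationalMetadata:
    currency: Currency
    lineage: List[str] = field(default_factory=list)  # migration and transformation history


@dataclass
class TableMetadata:
    business: BusinessMetadata
    technical: TechnicalMetadata
    operational: OperationalMetadata


# Hypothetical example entry for a sales fact table.
sales = TableMetadata(
    business=BusinessMetadata(
        owner="Sales Operations",
        business_definition="One row per completed order line",
        changing_policy="Changes require data steward approval",
    ),
    technical=TechnicalMetadata(
        database_system="warehouse_db",
        table_name="fact_sales",
        columns=[
            ColumnInfo("order_id", "INTEGER"),
            ColumnInfo("status", "VARCHAR", size=10, allowed_values=("OPEN", "CLOSED")),
        ],
        primary_key=("order_id",),
    ),
    operational=OperationalMetadata(
        currency=Currency.ACTIVE,
        lineage=["extracted from orders system", "amounts converted to USD"],
    ),
)
```

Keeping the three groups in separate classes mirrors the categories above: business users would read the first, database tools the second, and load or archive jobs the third.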
Role of Metadata
Metadata plays a very important role in a data warehouse, even though that
role differs from the role of the warehouse data itself. The various roles of
metadata are listed below.
1. Metadata acts as a directory.
2. The directory helps the decision support system locate the contents of the
data warehouse.
3. Metadata helps the decision support system map data when it is transformed
from the operational environment to the data warehouse environment.
4. Metadata helps in summarization between current detailed data and highly
summarized data.
5. Metadata also helps in summarization between lightly summarized data and
highly summarized data.
6. Metadata is used by query tools.
7. Metadata is used by reporting tools.
8. Metadata is used by extraction and cleansing tools.
9. Metadata is used by transformation tools.
10. Metadata also plays an important role in loading functions.
[Diagram: the role of metadata in a data warehouse]
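To make the "directory" role concrete, here is a small sketch of how a query tool might consult metadata to find where data lives before building a query. The `DIRECTORY` contents, the schema and table names, and the functions `locate` and `build_query` are all hypothetical; real tools keep this information in their own catalogs.

```python
# Hypothetical directory: logical data set name -> physical location and level of detail.
DIRECTORY = {
    "order_lines": {"schema": "core", "table": "fact_order_line", "level": "current detail"},
    "daily_sales": {"schema": "mart_sales", "table": "agg_sales_day", "level": "lightly summarized"},
    "monthly_sales": {"schema": "mart_sales", "table": "agg_sales_month", "level": "highly summarized"},
}


def locate(name):
    """Return the qualified table name recorded in the directory for a data set."""
    entry = DIRECTORY.get(name)
    if entry is None:
        raise KeyError(f"no metadata entry for {name!r}")
    return f"{entry['schema']}.{entry['table']}"


def build_query(name, columns):
    """Build a simple SELECT statement using the location found in the directory."""
    return f"SELECT {', '.join(columns)} FROM {locate(name)}"


query = build_query("monthly_sales", ["month", "total_amount"])
```

The caller only names the logical data set; the physical schema and table come from the metadata, so tables can be moved or renamed by updating the directory rather than every report.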
Metadata Repository
The metadata repository is an integral part of a data warehouse system. It
contains the following metadata:
1. Definition of the data warehouse - This includes a description of the
structure of the data warehouse. The description is defined by schema, views,
hierarchies, derived data definitions, and data mart locations and contents.
2. Business Metadata - This metadata holds data ownership information, business
definitions, and changing policies.
3. Operational Metadata - This metadata includes currency of data and data
lineage. Currency of data indicates whether the data is active, archived, or
purged. Lineage of data is the history of the data as it was migrated and the
transformations applied to it.
4. Data for mapping from the operational environment to the data warehouse -
This metadata includes source databases and their contents, data extraction,
data partitioning, cleaning, transformation rules, and data refresh and purging
rules.
5. Algorithms for summarization - This includes dimension algorithms and data
on granularity, aggregation, summarizing, etc.
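Items 4 and 5 describe metadata that drives processing rather than just describing data. The sketch below illustrates the idea under simple assumptions: `MAPPING` stands in for mapping and transformation rules, `SUMMARY` stands in for summarization metadata, and the field names and sample rows are invented for the example.

```python
from collections import defaultdict

# Hypothetical mapping metadata: target column -> (source field, transformation rule).
MAPPING = {
    "customer_id": ("cust_no", int),
    "country": ("ctry", str.upper),
    "amount_usd": ("amt", float),
}

# Hypothetical summarization metadata: granularity, measure, and aggregation.
SUMMARY = {"group_by": "country", "measure": "amount_usd", "aggregate": sum}


def transform(record):
    """Apply the mapping rules to one operational record."""
    return {target: rule(record[source]) for target, (source, rule) in MAPPING.items()}


def summarize(rows):
    """Aggregate warehouse rows as described by the summarization metadata."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[SUMMARY["group_by"]]].append(row[SUMMARY["measure"]])
    return {key: SUMMARY["aggregate"](values) for key, values in groups.items()}


# Sample operational records, invented for illustration.
operational_rows = [
    {"cust_no": "17", "ctry": "de", "amt": "10.5"},
    {"cust_no": "42", "ctry": "de", "amt": "4.5"},
    {"cust_no": "99", "ctry": "fr", "amt": "20"},
]

warehouse_rows = [transform(r) for r in operational_rows]
totals = summarize(warehouse_rows)
```

Because the rules live in data structures rather than in the functions, changing a source field name or the level of aggregation means editing the metadata, not the code that applies it.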
Challenges for Metadata Management
The importance of metadata cannot be overstated. Metadata helps drive the
accuracy of reports, validates data transformations, and ensures the accuracy
of calculations. Metadata also enforces consistent definitions of business
terms for business end users. Despite all these uses, managing metadata
presents challenges. Some of these challenges are discussed below.
1. In a big organization, metadata is scattered across the organization. It is
spread across spreadsheets, databases, and applications.
2. Metadata may be present in text files or multimedia files. To use this data
in an information management solution, it needs to be correctly defined.
3. There are no industry-wide accepted standards. Data management solution
vendors have a narrow focus.
4. There is no easy and accepted method of passing metadata.