mmx metadata framework
...the DNA of your data
MMX metadata framework is a lightweight implementation of the OMG Meta-Object Facility (MOF) built on relational database technology. The MMX framework is based on three general concepts:
Metamodel | MMX Metamodel provides a storage mechanism for various knowledge models. The data model underlying the metadata framework is more abstract in nature than metadata models in general. The model consists of only a few abstract entities... see more.
Access layer | Object oriented methods can be exploited using inheritance to derive the whole data access layer from a small set of primitives created in SQL. MMX Metadata Framework provides several diverse methods of data access to fulfill different requirements... see more.
Generic transformation | Many of the relationships between objects in a metadata model are too complex to be described through simple static relations. Instead, a universal data transformation concept is used, enabling the definition of transformations, mappings and transitions of any complexity... see more.

Mapping UML class diagrams to MMX metamodel

November 21, 2008 22:23 by marx

An obvious choice of tool for designing metamodels for the MMX M2 level (MMX/M2) is the UML class diagram. A class diagram mixes elements concerned with both data structures and behaviour, and we are only interested in the data structure aspect. The important elements to consider when mapping a UML class diagram to MMX/M2 are: classes, interfaces, objects, attributes, annotations, associations, generalizations, enumerations and data types. Here's how those elements are mapped to the constructs of the MMX Metamodel:

classes A UML class is implemented as an instance of MD_OBJECT_TYPE. A mandatory name column contains the name of the class, and an indicator column denotes whether the class is abstract or concrete.
interfaces An interface is implemented in exactly the same way as an abstract class.
objects A UML object is an instance of a class. Objects are implemented as instances of MD_OBJECT that get their object types supplied by MD_OBJECT_TYPE. The relationships between MD_OBJECT and MD_OBJECT_TYPE are essential for the consistency of the MMX model and are enforced by the referential integrity facilities of the underlying database.
attributes A UML class attribute is implemented as an instance of MD_PROPERTY_TYPE. Each row in this table is related to the owner class of the attribute, and to the domain class of the attribute (a data type or an enumeration). If a default value is provided, it is stored in the default value column.
packages The package element is currently not mapped to the MMX metadata model, as it provides no additional benefit in this context. A metamodel is always assumed to belong to a single package with a single namespace.
annotations Comments and notes are stored in a text column of the class diagram element instance they belong to.
associations A UML association between two classes is implemented as an instance of MD_RELATION_TYPE. Each row in this table has two relations with MD_OBJECT_TYPE, one for each end of the association, and a name made up of both role names (all mandatory columns). An association type column indicates whether the row denotes an association, an aggregation or a composition, with a null value denoting an association. Note that relationships in the MMX metamodel are directional by design.
aggregations An aggregation relationship is implemented exactly as an association, with the association type of aggregation ('A').
compositions A composition relationship is implemented exactly as an association, with the association type of composition ('C').
multiplicity Multiplicity of an association is stored in multiplicity type column of MD_RELATION_TYPE and takes a value from the predefined set of multiplicity types. The following notation is used:
  0..1 (optional, zero or one)       'Z'
  1 (one, or an exact number n)      '1' ('n')
  0..* or * (zero, one or more)      '*'
  1..* (at least one)                'P'
generalization Generalization has a very special role in the MMX metadata model architecture as the mechanism for maintaining class hierarchies. Inheritance is implemented as a parent-child relationship on MD_OBJECT_TYPE, realizing superclass and subclass relations between classes. Note that MD_PROPERTY_TYPE and MD_RELATION_TYPE also have this relationship and can therefore form hierarchies of their own. Multiple inheritance is not permitted due to the single-parent nature of the parent-child relationship.
enumerations Enumerations are implemented as instances of MD_OBJECT_TYPE inherited from an abstract domain class. Enumeration literals are stored as instances of MD_OBJECT related to one particular MD_OBJECT_TYPE.
data types Like enumerations, UML data types are instances of MD_OBJECT_TYPE inherited from an abstract data type class. Unlike enumerations, data types do not have an implicit set of possible values.
constraints UML constraints are technically just informal annotations to the model that have to be taken care of during system implementation. MMX Metamodel does not support constraints in any formal way so they are left for an application to handle. 
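
To make the mapping more tangible, here is a minimal sketch of the four tables in Python with SQLite. The table names come from the text above; every column name, the sample classes (Party, Person, Organization, Gender) and the helper functions are assumptions made up for illustration and are not the actual MMX schema.

```python
import sqlite3

# All column names are illustrative assumptions, not the actual MMX schema.
DDL = """
CREATE TABLE md_object_type (
    id          INTEGER PRIMARY KEY,
    parent_id   INTEGER REFERENCES md_object_type(id),  -- generalization
    name        TEXT NOT NULL,
    is_abstract INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE md_object (
    id      INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL REFERENCES md_object_type(id),
    name    TEXT NOT NULL
);
CREATE TABLE md_property_type (
    id            INTEGER PRIMARY KEY,
    owner_id      INTEGER NOT NULL REFERENCES md_object_type(id),
    domain_id     INTEGER NOT NULL REFERENCES md_object_type(id),
    name          TEXT NOT NULL,
    default_value TEXT
);
CREATE TABLE md_relation_type (
    id                INTEGER PRIMARY KEY,
    from_id           INTEGER NOT NULL REFERENCES md_object_type(id),
    to_id             INTEGER NOT NULL REFERENCES md_object_type(id),
    name              TEXT NOT NULL,
    association_type  TEXT CHECK (association_type IN ('A', 'C')),  -- NULL: association
    multiplicity_type TEXT
);
"""


def build_demo():
    """Create an in-memory repository holding a tiny example metamodel."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
    conn.executescript(DDL)
    conn.executemany(
        "INSERT INTO md_object_type (id, parent_id, name, is_abstract) VALUES (?, ?, ?, ?)",
        [
            (1, None, "DataType", 1),   # abstract data type class
            (2, 1, "String", 0),        # data type
            (3, None, "Domain", 1),     # abstract domain class for enumerations
            (4, 3, "Gender", 0),        # enumeration
            (5, None, "Party", 1),      # abstract class
            (6, 5, "Person", 0),        # concrete subclass
            (7, 5, "Organization", 0),  # concrete subclass
        ],
    )
    # Enumeration literals are MD_OBJECT rows of the enumeration's type.
    conn.executemany(
        "INSERT INTO md_object (id, type_id, name) VALUES (?, ?, ?)",
        [(1, 4, "female"), (2, 4, "male")],
    )
    # Attributes: Person.name : String, Person.gender : Gender.
    conn.executemany(
        "INSERT INTO md_property_type (id, owner_id, domain_id, name) VALUES (?, ?, ?, ?)",
        [(1, 6, 2, "name"), (2, 6, 4, "gender")],
    )
    # Aggregation from Organization to Person with multiplicity 0..*.
    conn.execute(
        "INSERT INTO md_relation_type VALUES (1, 7, 6, 'employer_employee', 'A', '*')"
    )
    conn.commit()
    return conn


def ancestors(conn, type_name):
    """Return the class and its superclasses, nearest first."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM md_object_type WHERE name = ?
            UNION ALL
            SELECT t.id, t.parent_id, t.name, c.depth + 1
            FROM md_object_type t JOIN chain c ON t.id = c.parent_id
        )
        SELECT name FROM chain ORDER BY depth
        """,
        (type_name,),
    ).fetchall()
    return [name for (name,) in rows]


def literals(conn, enum_name):
    """Return the literals of an enumeration, sorted by name."""
    rows = conn.execute(
        "SELECT o.name FROM md_object o JOIN md_object_type t ON t.id = o.type_id "
        "WHERE t.name = ? ORDER BY o.name",
        (enum_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

With this in place, `ancestors(build_demo(), "Person")` walks the single-parent chain up to Party, and inserting an MD_OBJECT row with an unknown type is rejected by the database, which is the referential integrity behaviour described under objects.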

 

Not all UML class diagram elements and features (e.g. those designed for code generation) are relevant in the scope of metamodeling, and these are therefore not considered here. As an example, the visibility property of class attributes is of no concern in a metamodel context. Similarly, not all MMX/M2 features are required for mapping UML class diagrams.
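
As a small illustration of the multiplicity notation listed earlier, the following Python sketch turns UML bounds into a multiplicity type code. The function name is hypothetical, and the handling of an exact number n (stored here as its decimal digits) is an assumption about what 'n' means in that notation.

```python
from typing import Optional


def multiplicity_code(lower: int, upper: Optional[int]) -> str:
    """Map UML multiplicity bounds to a multiplicity type code.

    upper=None stands for the unbounded '*'.
    """
    if lower < 0 or (upper is not None and upper < max(lower, 1)):
        raise ValueError(f"invalid bounds: {lower}..{upper}")
    if upper is None:
        if lower == 0:
            return "*"  # 0..* or *
        if lower == 1:
            return "P"  # 1..*
    elif (lower, upper) == (0, 1):
        return "Z"  # 0..1
    elif lower == upper:
        return str(lower)  # assumption: an exact count n is stored as its digits
    raise ValueError(f"no predefined multiplicity type for {lower}..{upper}")
```

Bounds outside the predefined set, such as 2..5, raise an error instead of being silently coerced to the nearest code.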



Capturing RDBMS Metadata

November 12, 2008 12:57 by marx

The structure of a relational database is (normally) described in the data dictionary. Fortunately, SQL-99 has provided us with a standard mechanism (INFORMATION_SCHEMA) to access this information. Unfortunately, INFORMATION_SCHEMA does not cover everything required to load and synchronize the volatile data dictionary information with the metadata repository.

Transferring metadata into the repository is a three-step process, involving extraction from the source database, transformation to a suitable format and, finally, loading into the repository. XML is used as the ‘transport medium’ between a source database and the database housing the MMX repository instance. A serialized file as a transfer mechanism has several advantages over direct database connections between two servers, namely:

  • transparency: no need for ‘drilling holes’ into layers of firewalls, DBAs and corporate rules;
  • asynchronicity: data extraction and loading need not take place simultaneously;
  • platform independence: not reliant on details of specific database and OS platforms.

Metadata is extracted from a data dictionary by executing SQL queries against INFORMATION_SCHEMA views (joined with some additional technical metadata). The queries are designed to output their results as serialized, well-formed XML, and have to be customized for each database platform according to the XML support and technical metadata storage that the platform provides.

Metadata tags originating from a data dictionary are then translated to fit the Relational Database Metamodel in MMX. This transformation is controlled by an XSLT 2.0 stylesheet that provides all the tags and classifiers needed for load processing. A single stylesheet is sufficient to handle the transformation of data dictionary information from an arbitrary relational database.

The transformed XML file contains all the information, including the necessary meta-metadata, to be loaded into the metadata repository. Loading employs the ‘XML shredding’ technique and is implemented as an SQL stored procedure. One procedure can process all metadata originating from a data dictionary; however, due to differences in XML support between database servers, the procedure is unique to each database platform.
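
The shape of the three steps can be sketched in a few lines of Python. This is only a stand-in under stated assumptions: the real extraction runs as platform-specific SQL, the real transformation is an XSLT 2.0 stylesheet and the real load is a stored procedure. Here a hard-coded list of rows plays the part of the INFORMATION_SCHEMA query result, a plain tag-renaming dictionary replaces the stylesheet, and all element and tag names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, shaped like the result of a query against
# INFORMATION_SCHEMA.COLUMNS (only a few of its columns are shown).
ROWS = [
    {"table_catalog": "sales", "table_schema": "dbo", "table_name": "customer",
     "column_name": "id", "data_type": "int"},
    {"table_catalog": "sales", "table_schema": "dbo", "table_name": "customer",
     "column_name": "name", "data_type": "varchar"},
]

# Hypothetical tag translation; in MMX this is the job of the XSLT stylesheet.
TAG_MAP = {
    "table_catalog": "catalog",
    "table_schema": "schema",
    "table_name": "table",
    "column_name": "name",
    "data_type": "type",
}


def extract(rows):
    """Step 1: serialize data dictionary rows as well-formed XML."""
    root = ET.Element("columns")
    for row in rows:
        column = ET.SubElement(root, "column")
        for key, value in row.items():
            ET.SubElement(column, key).text = value
    return ET.tostring(root, encoding="unicode")


def transform(xml_text):
    """Step 2: translate source tags to the tags expected by the loader."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = TAG_MAP.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


def shred(xml_text):
    """Step 3: 'shred' the XML back into flat records ready for loading."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in column}
        for column in root.findall("column")
    ]
```

Calling `shred(transform(extract(ROWS)))` yields one flat record per column. Because each step only reads and writes a string, the steps can run at different times and on different machines, which is the asynchronicity advantage mentioned above.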

INFORMATION_SCHEMA views identify data objects by the combination of catalog_name, schema_name and object_name. This uniform hierarchy is used to construct a standard URI to reference the original database object. The reference URI takes the form:

rdb://<database>/<catalog_name>/<schema_name>/<object_name>/<additionalelement(s)>

where <database> is either an IP address or a name uniquely identifying the database server; <catalog_name>, <schema_name> and <object_name> identify a database object hierarchy and are provided by an INFORMATION_SCHEMA view; <additionalelement(s)> is required when <object_name> is not the last level of objects requiring identification, e.g. <column_name> is required for column metadata.
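
A reference URI of this form could be assembled as in the sketch below. The function name and the sample values are hypothetical, and percent-encoding each path segment is an assumption added here so that an identifier containing a slash or a space cannot break the hierarchy; the original scheme does not say how such names are handled.

```python
from urllib.parse import quote


def rdb_uri(database, catalog_name, schema_name, object_name, *additional):
    """Build rdb://<database>/<catalog>/<schema>/<object>[/<additional>...]."""
    segments = [catalog_name, schema_name, object_name, *additional]
    # Assumption: segments are percent-encoded, with "/" encoded as well.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"rdb://{database}/{path}"
```

For a column, the column name is passed as the additional element: `rdb_uri("dbserver01", "sales", "dbo", "customer", "id")` gives `rdb://dbserver01/sales/dbo/customer/id`.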

If the source metadata is provided in a form other than a direct extract from a database, e.g. a text file or spreadsheet, customised processing is required to convert the data definitions to well-formed XML.
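
For a comma-separated text file, such customised processing might look like the following Python sketch. The function name, the element names and the expected CSV header are all assumptions; the header fields are used directly as XML tag names, so they must already be valid XML names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(text):
    """Convert CSV data definitions (with a header row) to well-formed XML."""
    reader = csv.DictReader(io.StringIO(text))
    root = ET.Element("columns")
    for record in reader:
        column = ET.SubElement(root, "column")
        for field, value in record.items():
            # Missing trailing values come back as None; store them as empty.
            ET.SubElement(column, field.strip()).text = (value or "").strip()
    # ElementTree escapes characters such as & and < in the text content.
    return ET.tostring(root, encoding="unicode")
```

The output has the same shape as a direct extract, so from this point on the file can go through the same transformation and load steps.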



Relational Database Metamodel

November 2, 2008 12:50 by marx
Probably the best and most comprehensive metamodel covering all aspects and details of a relational database instance is the one found in the Eclipse Project docs. The model is based on SQL-92, is MOF2 compliant and is “capable of representing all non-syntactic aspects of SQL92 Data Definition Language (at the ‘intermediate’ conformance level)”. The model has been submitted by IBM as part of the forthcoming OMG IMM specification. IMM (Information Management Metamodel) is the new moniker for CWM (Common Warehouse Metamodel), as the word ‘Warehouse’ was not digestible for some, and it replaces the long overdue CWM 2.0.

MMX Framework provides a full implementation of the model as defined in the OMG specs. It includes 80 classes (both abstract and concrete). One open issue is the form of the URI reference used to link back to a relational database object. There is no common agreement on this; in fact, the W3C RDB2RDF incubator group (W3C RDB2RDF XG) is just about to release a recommendation that should cover this topic as well.