Hetero-Homogeneous Data Warehouses (HH-DW)

Project start
January 2010
Project website
http://hh-dw.dke.uni-linz.ac.at/
Project type
PhD Thesis Project (Christoph Schütz)

Short description
Recent research in data warehousing has focused on heterogeneous schemas. Data warehouse schemas may be heterogeneous with respect to dimension hierarchies and facts: aggregation levels may exist only in specific sub-dimensions, and measures may exist only in specific sub-cubes. For example, in the geography dimension of a cube, level city rolls up to level country; Austrian cities, however, have an additional level bundesland. Current data warehouse modeling approaches are well suited to representing homogeneous schemas, but their capability to represent heterogeneities is rather limited. For example, the Dimensional Fact Model (DFM) introduced by Golfarelli et al. is widely used and well accepted for conceptual data warehouse modeling, but it has no support for heterogeneous facts and provides only limited semantics for heterogeneous dimension hierarchies. More recent modeling approaches allow heterogeneities to be introduced into the data warehouse schema, but these heterogeneities give rise to new problems. The hetero-homogeneous modeling approach combines the clarity of a homogeneous approach with more flexibility than previous approaches provide. Hetero-homogeneous dimension hierarchies and cubes are homogeneous with respect to a common schema shared by all sub-dimensions and sub-cubes. They are heterogeneous in that individual sub-dimensions and sub-cubes may have additional levels and attributes, facts and measures.
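The geography example can be made concrete with a small sketch. It is an illustration only, not the project's prototype (which runs on Oracle): the names COMMON_LEVELS, SUB_DIMENSION_LEVELS, MEMBERS, roll_up and conforms_to_common_schema, as well as the sample members, are assumptions introduced here. The sketch shows a common schema (city rolls up to country) that every sub-dimension shares, and an Austrian sub-dimension that adds the level bundesland.

```python
# Levels shared by every sub-dimension, ordered from finest to coarsest.
COMMON_LEVELS = ["city", "country"]

# Hypothetical sub-dimensions; Austria adds "bundesland" between city and country.
SUB_DIMENSION_LEVELS = {
    "Austria": ["city", "bundesland", "country"],
    "Australia": ["city", "country"],
}

# member -> (level, parent member). Illustrative sample data only.
MEMBERS = {
    "Linz": ("city", "Oberösterreich"),
    "Oberösterreich": ("bundesland", "Austria"),
    "Austria": ("country", None),
    "Brisbane": ("city", "Australia"),
    "Australia": ("country", None),
}


def roll_up(member, target_level):
    """Follow parent links until a member at target_level is reached.

    Returns None when the level does not exist on this member's path,
    which is how the heterogeneity shows up to a caller.
    """
    current = member
    while current is not None:
        level, parent = MEMBERS[current]
        if level == target_level:
            return current
        current = parent
    return None


def conforms_to_common_schema(levels):
    """True if the levels contain all common levels in the same relative order."""
    kept = [level for level in levels if level in COMMON_LEVELS]
    return kept == COMMON_LEVELS


if __name__ == "__main__":
    print(roll_up("Linz", "country"))         # homogeneous part: works for every city
    print(roll_up("Linz", "bundesland"))      # extra level, available for Austrian cities
    print(roll_up("Brisbane", "bundesland"))  # level absent in this sub-dimension
```

Queries that use only the common levels behave the same for every city, while a roll-up to bundesland is meaningful only inside the Austrian sub-dimension; the sketch signals the missing level by returning None.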

Project team
Christoph Schütz (DKE)

Publications
C. Schütz, M. Schrefl, B. Neumayr, D. Sierninger:
Incremental Integration of Data Warehouses: The Hetero-Homogeneous Approach
In: Proceedings of the ACM Fourteenth International Workshop on Data Warehousing and OLAP (DOLAP 2011), Glasgow, Scotland, U.K., October 28, 2011, ISBN 978-1-4503-0963-9, pp. 25-30, 2011.
B. Neumayr, M. Schrefl, B. Thalheim:
Hetero-Homogeneous Hierarchies in Data Warehouses
In: Proceedings of the Seventh Asia-Pacific Conference on Conceptual Modelling (APCCM 2010), Brisbane, Australia, January 18-21, 2010. Conferences in Research and Practice in Information Technology, Vol. 110. Sebastian Link and Aditya K. Ghose (Eds.), Springer Verlag, Lecture Notes in Computer Science (LNCS), Vol. 6520, ISBN 978-3-642-17505-3, pp. 61-70, Publication received Best Paper and Best Student Paper Award, 2010.
C. Schütz:
Extending data warehouses with hetero-homogeneous dimension hierarchies and cubes: A proof-of-concept prototype in Oracle
(Master's Thesis, 2010)
Diploma thesis, supervisor: o. Univ.-Prof. Dr. Michael Schrefl, under the guidance of Mag. Bernd Neumayr, carried out at Universität Linz, Institut für Wirtschaftsinformatik - Data & Knowledge Engineering, February 2010.
D. Sierninger:
Integration von Data Marts in ein globales Data Warehouse mit hetero-homogenem Schema
(Master's Thesis, 2011)
Diploma thesis, supervisor: o. Univ.-Prof. Dr. Michael Schrefl, under the guidance of Mag. Christoph Schütz, carried out at Universität Linz, Institut für Wirtschaftsinformatik - Data & Knowledge Engineering, June 2011.

Prototype
We have developed a proof-of-concept prototype for building, managing, and querying hetero-homogeneous data warehouses. The prototype runs hetero-homogeneous data warehouses on Oracle Database 11g. It is distributed under the GNU General Public License and comes without any warranty. The prototype is still in development. You may download the current version of the prototype here.