© 2003 by British Computer Society
Representation of Web Data in a Web Warehouse
1 Centre for Advanced Information Systems, School of Computer Engineering, Nanyang Technological University, Singapore 639798. Email: assourav@ntu.edu.sg, nwkng@ntu.edu.sg
2 Department of Computer Science, University of Missouri-Rolla, Rolla, MO 65409, USA. Email: madrias@umr.edu
We believe that managing Web data effectively requires a data warehouse of Web data, i.e. a Web warehouse. In this paper, we focus on how to represent and store relevant hyperlinked Web documents in a Web warehouse called WHOWEDA (WareHouse Of WEb DAta) for further querying and manipulation. We present a simple and general model for representing the metadata, structure and content of Web documents and hyperlinks in WHOWEDA. Web documents and hyperlinks are represented by node and link objects, respectively. These are first-class objects in our data model, WHOM (WareHouse Object Model), which is designed to represent and manipulate Web data in the warehouse. An important feature of the model is that it represents metadata, content and structure as trees: node and link metadata trees, and node and link data trees.
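As an informal illustration of the tree-based representation summarised above, the following Python sketch pairs each node object and each link object with a metadata tree and a data tree. It is a minimal sketch under our own assumptions, not the WHOM definition: the class names (`TreeNode`, `NodeObject`, `LinkObject`), the vertex labels (`url`, `title`, `source_url`, `target_url`) and the example URLs are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    """One labelled vertex of a metadata or data tree (hypothetical layout)."""
    label: str
    value: Optional[str] = None
    children: List["TreeNode"] = field(default_factory=list)

    def add(self, label: str, value: Optional[str] = None) -> "TreeNode":
        """Attach a new child vertex and return it."""
        child = TreeNode(label, value)
        self.children.append(child)
        return child

    def find(self, label: str) -> Optional["TreeNode"]:
        """Depth-first search for the first vertex carrying `label`."""
        if self.label == label:
            return self
        for child in self.children:
            hit = child.find(label)
            if hit is not None:
                return hit
        return None


@dataclass
class NodeObject:
    """A Web document: one tree for its metadata, one for its content and structure."""
    metadata_tree: TreeNode
    data_tree: TreeNode


@dataclass
class LinkObject:
    """A hyperlink: one tree for its metadata, one for its content and structure."""
    metadata_tree: TreeNode
    data_tree: TreeNode


def build_example():
    """Build one illustrative node object and one link object (assumed values)."""
    node_meta = TreeNode("node")
    node_meta.add("url", "http://example.org/index.html")
    node_meta.add("title", "Example page")

    node_data = TreeNode("html")
    body = node_data.add("body")
    body.add("p", "Welcome")

    link_meta = TreeNode("link")
    link_meta.add("source_url", "http://example.org/index.html")
    link_meta.add("target_url", "http://example.org/products.html")

    # The link data tree holds the anchor element and its text.
    link_data = TreeNode("a", "Products")

    return NodeObject(node_meta, node_data), LinkObject(link_meta, link_data)


if __name__ == "__main__":
    node, link = build_example()
    print(node.metadata_tree.find("title").value)
    print(link.metadata_tree.find("target_url").value)
```

Keeping metadata and data in separate trees of the same shape means that one traversal routine (here `find`) serves both documents and hyperlinks, which is one plausible reason for a uniform tree representation.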