[H-GEN] ITEE MPhil confirmation seminar: Andrae Muys, 02.00PM, Tue 20 Feb 2007

Andrae Muys andrae at netymon.com
Sun Feb 18 22:43:02 EST 2007


I tried to send this last week, but it seems to have got lost.

Begin forwarded message:

> From: Guido Governatori <guido at itee.uq.edu.au>
> Date: 13 February 2007 8:00:01 AM
> To: seminar-announce at itee.uq.edu.au
> Subject: [seminar-announce] ITEE MPhil confirmation seminar: Andrae  
> Muys, 02.00PM, Tue 20 Feb 2007
>
> Building an Enterprise Scale Database for RDF Data
>
> Speaker: Andrae Muys, ITEE
> When: 02.00PM, Tue 20 Feb 2007
> Venue: 78-420
> Host: David Carrington
> Abstract:
>
>   The large scale management of semistructured data is a problem of
>   increasing relevance. A number of otherwise unrelated fields are
>   facing an explosion in the amount of information being generated and
>   requiring management. These include such diverse areas as genomics
>   and biotech, knowledge representation, citation management, network
>   traffic analysis, as well as traditional heterogeneous database and
>   enterprise information integration.
>
>   The Resource Description Framework (RDF) is a suite of technology
>   standards produced by the W3C that was originally designed to
>   support internet metadata, primarily in the guise of the Semantic
>   Web. However, at the core of RDF is a datamodel that promises to
>   provide an approach to solving the general problem of managing
>   semistructured data.
>
>   The Mulgara Project (http://www.mulgara.org) is an OpenSource
>   database implementing this datamodel. Its primary focus has been on
>   the application of RDF to large scale semistructured data
>   management. With the ability to scale to 1 billion statements, its
>   current version is amongst the best-scaling implementations
>   available. We propose to investigate modifications to Mulgara's
>   storage layer that would support a two-orders-of-magnitude increase
>   in its scalability, to approximately 100 billion statements. If
>   successful, this would help alleviate many of the data management
>   problems mentioned above. The paper first provides a formal
>   definition of semistructured data in terms of vocabulary and
>   semantics; specifically the underlying assumptions of the relational
>   model that complicate the management of semistructured data. We then
>   examine how these assumptions interact with relational
>   normalisation, and how this provides a rationale for RDF as a model
>   for semistructured data. After introducing the current design and
>   functionality of Mulgara, we then introduce the design of a new
>   store layer based on a combination of functional programming
>   techniques and traditional approaches to efficient external memory
>   datastructures. This combination is shown to provide substantially
>   simplified implementations of critical features required for large
>   scale data management, including lock-free multiversion
>   concurrency, live backup and restore, federation, and replication.
>
> Biography:
> (biography unavailable)
> Type: MPhil confirmation
> Contact:
> David Carrington, seminar host (davec at itee.uq.edu.au)
> or Guido Governatori, ITEE seminar co-ordinator,  
> (guido at itee.uq.edu.au)
> ITEE seminar web page: http://www.itee.uq.edu.au/~seminar
> --------------------------------------------------
>

-- 
Andrae Muys
andrae at netymon.com
Principal Mulgara Consultant
Netymon Pty Ltd





