Tutorials
Tutorial T1: Storage and Retrieval of XML Data using Relational Databases
Surajit Chaudhuri (Microsoft Research) and Kyuseok Shim (KAIST)
Tuesday 11th September, 11:00-12:30, Aula Magna
Abstract:
XML is the dominant standard for exchanging and querying information on the Web. Efficient storage and retrieval of native XML data in a scalable manner is becoming increasingly important. All major vendors of relational database systems have undertaken initiatives to build in support for XML in their platforms. Our tutorial discusses the following core challenges in such an approach: (1) storage of XML data in relational systems, (2) query capabilities on stored XML, and (3) publishing of existing relational data in XML.
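To make the first challenge concrete, the following is a minimal sketch of "shredding" an XML document into flat relational tuples, using only the Python standard library. The document, table shapes, and column layout are illustrative assumptions, not any vendor's actual mapping scheme.

```python
# Shred a small XML document into flat rows that a relational
# back end could store and index (an edge-table-style mapping).
import xml.etree.ElementTree as ET

doc = """<orders>
  <order id="1"><item>book</item><item>pen</item></order>
  <order id="2"><item>lamp</item></order>
</orders>"""

def shred(xml_text):
    """Flatten each <order> and its <item> children into tuples:
    orders(order_id) and items(order_id, position, value)."""
    root = ET.fromstring(xml_text)
    orders, items = [], []
    for order in root.findall("order"):
        oid = int(order.get("id"))
        orders.append((oid,))
        for pos, item in enumerate(order.findall("item")):
            items.append((oid, pos, item.text))
    return orders, items

orders, items = shred(doc)
print(orders)  # [(1,), (2,)]
print(items)   # [(1, 0, 'book'), (1, 1, 'pen'), (2, 0, 'lamp')]
```

A real mapping must also preserve document order and handle recursion and mixed content, which is where much of the design difficulty discussed in the tutorial lies.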
About the authors:
Surajit Chaudhuri is a senior researcher and manager of the Data Management, Exploration and Mining group at Microsoft Research. His research interests include self-tuning database systems, data warehousing, data mining on relational platforms, and the integration of relational, text, and semi-structured information access in the context of XML. Prior to joining Microsoft Research, Surajit was a member of the technical staff at Hewlett-Packard Laboratories in Palo Alto. He received his B.Tech. from the Indian Institute of Technology, Kharagpur, India, and his Ph.D. from Stanford University, USA.
Kyuseok Shim is an assistant professor at the Korea Advanced Institute of Science and Technology (KAIST) in Korea. Before joining KAIST, he was a member of technical staff at Bell Laboratories and one of the key contributors to its Serendip Data Mining project. Before that, he worked on the Quest Data Mining project at the IBM Almaden Research Center. He received his B.E. degree in Electrical Engineering from Seoul National University, and his M.S. and Ph.D. degrees in Computer Science from the University of Maryland, College Park. He has been working in the areas of databases, data mining, and XML.
Tutorial T2: Data Management for Pervasive Computing
Mitch Cherniack (Brandeis University), Michael Franklin (University of California, Berkeley), and Stan Zdonik (Brown University)
First part: Tuesday 11th September, 14:30-16:00, Aula Magna
Second part: Tuesday 11th September, 16:30-18:00, Aula Magna
Abstract:
Pervasive computing is quickly moving from vision to reality. The combination of global wireless and wired connectivity along with increasingly small and powerful devices enables a wide array of new applications that will change the nature of computing. Beyond new devices and communications mechanisms, pervasive computing requires data management. In this 3-hour tutorial we will: (1) survey the pervasive computing landscape, with an emphasis on requirements for data management; (2) describe key data management technologies, including: data filtering and dissemination, event management, device synchronization, sensor data processing, and user profile management; and (3) outline open problems and areas for future research.
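One of the listed technologies, profile-driven data filtering and dissemination, can be sketched in a few lines: each client registers a predicate over message attributes, and the server pushes only matching items. The `Profile` class, attribute names, and matching rule below are invented for illustration, not taken from any particular system.

```python
# Toy profile-driven dissemination: clients register interests,
# and the server routes each incoming message to matching clients.

class Profile:
    """A client profile: interesting topics plus a minimum priority."""
    def __init__(self, client, topics, min_priority=0):
        self.client = client
        self.topics = set(topics)
        self.min_priority = min_priority

    def matches(self, msg):
        return (msg["topic"] in self.topics
                and msg["priority"] >= self.min_priority)

def disseminate(profiles, msg):
    """Return the clients whose profiles match an incoming message."""
    return [p.client for p in profiles if p.matches(msg)]

profiles = [
    Profile("alice", ["traffic", "weather"]),
    Profile("bob", ["stocks"], min_priority=2),
]
print(disseminate(profiles, {"topic": "weather", "priority": 1}))  # ['alice']
print(disseminate(profiles, {"topic": "stocks", "priority": 1}))   # []
```

At pervasive-computing scale the interesting problems are indexing millions of such predicates and evaluating them against high-rate data streams, which simple per-profile iteration like this does not address.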
About the authors:
Mitch Cherniack is an Assistant Professor at Brandeis University; Michael Franklin is an Associate Professor at the University of California, Berkeley; and Stan Zdonik is a Professor at Brown University. Together they have worked extensively in the areas of query processing, data broadcast and dissemination, caching, and distributed data management. They currently lead the NSF-ITR-sponsored "Data Centers" project, investigating profile-driven data management for pervasive computing environments.
Tutorial T3: Caching Technologies for Web Applications
C. Mohan (IBM Almaden Research Center)
Wednesday 12th September, 09:00-11:00, Aula Magna
Abstract:
The emergence of the Web has transformed the execution environment of transactional, server-side applications. Three- and four-tier application environments involving browser-based clients and Web/application/database servers are the norm these days. The generation and distribution of multimedia content has also increased dramatically. Attaining good end-to-end performance under these circumstances requires the exploitation of caching technologies. Caching is being deployed at different stages in the software and hardware hierarchies, and work is in progress to design caching standards. In this tutorial, I will provide an introduction to different caching technologies and their support in different products and specialized systems/vendors. I will also discuss the tradeoffs involved with different caching granularities and cache deployment points.
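The basic mechanism underlying most of these deployment points can be illustrated with a minimal time-to-live (TTL) cache, as might sit in front of generated pages or page fragments. The `get`/`put` interface and the URL key below are generic assumptions for the sketch, not any product's API.

```python
# A minimal TTL cache: entries expire after a fixed lifetime,
# trading freshness for reduced load on the origin server.
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry timestamp)

    def put(self, key, value):
        self.store[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None          # miss: never cached
        value, expiry = entry
        if time.time() > expiry:
            del self.store[key]  # stale: evict and report a miss
            return None
        return value

cache = TTLCache(ttl_seconds=60)
cache.put("/products/42", "<html>...</html>")
print(cache.get("/products/42"))  # '<html>...</html>' while fresh
```

The tradeoffs the tutorial covers begin exactly where this sketch stops: choosing the granularity of the cached unit (page, fragment, or query result) and keeping entries consistent with updates to the underlying database rather than relying on expiry alone.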
About the author:
Dr. C. Mohan joined IBM Almaden Research Center in 1981. He was named an IBM Fellow in 1997 for being recognized worldwide as a leading innovator in transaction management. He received the 1996 ACM SIGMOD Innovations Award. From IBM, he has received 1 Corporate and 8 Outstanding Innovation/Technical Achievement Awards. He is an IBM Master Inventor with 33 patents. Mohan's research results are implemented in numerous IBM and non-IBM systems like DB2, MQSeries, Lotus Domino and S/390 Parallel Sysplex. He is the primary inventor of the ARIES family of recovery and locking methods, and the industry-standard Presumed Abort commit protocol. At VLDB'99, he was honored with the 10 Year Best Paper Award for the widespread commercial and research impact of the ARIES algorithms. He is an editor of VLDB Journal, and Journal of Distributed and Parallel Databases. Currently, Mohan is a member of the IBM Application Integration Middleware (AIM) Architecture Board and is working on database caching in the context of WebSphere and DB2.
Tutorial T4: Approximate Query Processing: Taming the Terabytes
Minos Garofalakis and Phillip B. Gibbons (Bell Laboratories)
Wednesday 12th September, 15:00-16:30, Aula Magna
Abstract:
Approximate query processing has recently emerged as a viable solution for dealing with the data volume and query complexity of modern decision-support applications. The strong incentive for high-quality approximate query answers has spurred a flurry of research activity, as well as some modest enhancements to commercial systems. This tutorial aims to provide a comprehensive overview of the key research results and commercial developments surrounding approximate query processing. In addition to the systematic coverage of research results in the area, our discussion will focus on: (1) a comparative evaluation of the various proposed data-reduction mechanisms; (2) the suitability and scope of related approximate query processing techniques; (3) the state of the art in commercial database servers; and (4) advanced techniques and promising research directions.
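The simplest data-reduction mechanism the tutorial compares, uniform sampling, already conveys the core idea: answer an aggregate over a sample and scale up by the inverse sampling rate. The synthetic table and 1% rate below are assumptions for the sketch.

```python
# Sampling-based approximate aggregation: estimate a selective
# COUNT from a 1% uniform sample instead of scanning the table.
import random

random.seed(0)  # fixed seed so the sketch is reproducible
table = [{"region": "EU" if i % 4 == 0 else "US", "sales": i}
         for i in range(100_000)]

def approx_count(table, predicate, sample_rate=0.01):
    sample = random.sample(table, int(len(table) * sample_rate))
    hits = sum(1 for row in sample if predicate(row))
    return hits / sample_rate  # scale up to a full-table estimate

exact = sum(1 for row in table if row["region"] == "EU")
estimate = approx_count(table, lambda r: r["region"] == "EU")
print(exact)            # 25000
print(round(estimate))  # typically close to 25000
```

More sophisticated synopses (histograms, wavelets, sketches) aim to give tighter error guarantees than plain sampling for the same space budget, which is precisely the comparison the tutorial takes up.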
About the authors:
Minos Garofalakis is a Member of Technical Staff at the Information Sciences Research Center of Bell Laboratories, Lucent Technologies. He joined Bell Labs in 1998, after completing a Ph.D. in computer science at the University of Wisconsin-Madison. His current research interests lie in the areas of data reduction and mining, approximate query processing, data warehousing, and network management. He has served as a program committee member for several international conferences in the database area (including ACM SIGMOD, VLDB, and IEEE ICDE).
Phillip B. Gibbons is also a Member of Technical Staff at the Information Sciences Research Center of Bell Laboratories, Lucent Technologies. He joined Bell Labs in 1990, after completing a Ph.D. in computer science at the University of California at Berkeley. His current research interests include massive data sets, query processing and optimization, approximate query answering, data mining, and parallel computing. He has served on over a dozen program committees for international conferences, and is on the editorial board for the IEEE Transactions on Parallel and Distributed Systems.
Tutorial T5: Information Management for Genome Level Bioinformatics
Norman Paton and Carole Goble (University of Manchester)
First part: Thursday 13th September, 09:00-10:30, Aula Magna
Second part: Thursday 13th September, 11:00-12:30, Aula Magna
Abstract:
Genomics is an experimental, knowledge-based discipline central to biology and biomedical research. The recent sequencing of a number of complete genomes is moving the discipline from a first-generation sequencing phase to a second-generation functional one. In both of these phases, novel experimental methods are generating substantial, complex information resources. The nature of the data, and the tasks associated with it, will put significant pressure on current information management systems, and thus genome level bioinformatics can be seen as presenting important challenges to database researchers. The aim of this tutorial is to make the challenges presented by genome level bioinformatics familiar to the wider database research community. This will be done by introducing some of the relevant biology, describing some of the challenges associated with the modeling and management of genomic data, and pointing out some of the open issues in genome information management.
About the authors:
Norman Paton is a Professor of Computer Science at the University of Manchester. His research has mostly been on object databases, including active databases, deductive object-oriented databases, user interfaces to databases and spatial databases. In bioinformatics, he has considerable experience in the development of information management systems for managing, integrating and analysing biological data.
Carole Goble is a Professor of Computer Science at the University of Manchester. Her research interests are centred around the accessibility of information, particularly the use of terminological and ontological services for the representation and classification of metadata. In Bioinformatics, she has worked on several projects, in particular focusing on the development and use of ontologies for information integration and annotation.
Tutorial T6: Managing Business Processes via Workflow Technology
Frank Leymann (IBM)
First part: Friday 14th September, 09:00-10:30, Aula Magna
Second part: Friday 14th September, 11:00-12:30, Aula Magna
Abstract:
We will review the evolution of business process management technology. The fundamental modelling and runtime capabilities of workflow systems will be presented, and architectural aspects will be discussed based on IBM MQSeries Workflow. The link to extended transactions will be drawn, and the structure of applications exploiting workflow technology will be sketched. The relevance of workflow technology to Web Services will become clear.
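The split between modelling and runtime mentioned above can be sketched in miniature: a process model is a graph of activities joined by control connectors, and a navigator walks it at runtime. The activity names, graph encoding, and branching rule below are invented for illustration and do not reflect MQSeries Workflow's actual model.

```python
# A toy process model (activities -> successor activities) and a
# navigator that executes it, resolving branches via a callback.

process = {
    "receive_order": ["check_credit"],
    "check_credit":  ["ship", "reject"],  # two outgoing control connectors
    "ship": [],
    "reject": [],
}

def navigate(process, start, choose):
    """Run the process from `start`; `choose` resolves branch decisions."""
    trace, current = [], start
    while current is not None:
        trace.append(current)
        successors = process[current]
        current = choose(current, successors) if successors else None
    return trace

# Transition condition: ship only when credit is approved.
approved = True
route = lambda act, succs: (succs[0]
                            if act != "check_credit" or approved
                            else succs[1])
print(navigate(process, "receive_order", route))
# ['receive_order', 'check_credit', 'ship']
```

Production workflow systems add what this sketch omits: persistent process state, parallel paths, staff resolution, and transactional guarantees across the navigator and the invoked applications.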
About the author:
Frank Leymann is an IBM Distinguished Engineer and a member of the IBM Academy of Technology. He studied Mathematics, Physics, and Astronomy, receiving an M.Sc. (1982) and a Ph.D. (1984), both in Mathematics, from the University of Bochum, Germany. With IBM since 1984, Frank began as a system programmer and database programmer, and has worked on various database technology projects (libraries, universal relations, OO- and OR-DBMSs, mining, repositories, tools) and on workflow systems. He is in charge of IBM's workflow technology, and for the past year he has also been working on Web Services technology.