We are pleased to announce that our Keynote speakers will be:
- Pete Beckman on Looking toward Exascale Computing
- Carl Kesselman on Virtual Organizations By the Rules
- Tony McGregor on Case Studies in Computer Network Measurement
- Max Ott on Global Experimental Testbeds for Studying Future Internet Technologies
Recently, Argonne National Laboratory installed a half-petaflop Blue Gene/P system, the world’s fastest open science supercomputer. With 163,840 cores, the machine is beginning to provide insight into how we might build future platforms as we scale toward exascale computing. There are many challenges, including the dramatic shift to multicore, the cost of electric power, and the need for robust fault management. In this talk I will focus on the architecture and system software challenges we face as we continue to attack ever-larger computational problems.
Pete Beckman is a recognized global expert in high-end computing systems. During the past 20 years, he has designed and built software and architectures for large-scale parallel and distributed computing systems.
After receiving his Ph.D. degree in computer science from Indiana University, he helped found the university’s Extreme Computing Laboratory, which focused on parallel languages, portable run-time systems, and collaboration technology. In 1997 Pete joined the Advanced Computing Laboratory at Los Alamos National Laboratory, where he founded the ACL’s Linux cluster team and launched the Extreme Linux series of workshops and activities that helped catalyze the high-performance Linux computing cluster community.
Pete also has been a leader within industry. In 2000 he founded a Turbolinux-sponsored research laboratory in Santa Fe that developed the world’s first dynamic provisioning system for cloud computing and HPC clusters. The following year, Pete became Vice President of Turbolinux’s worldwide engineering efforts, managing development offices in Japan, China, Korea, and Slovenia.
Pete joined Argonne National Laboratory in 2002. As Director of Engineering, and later as Chief Architect for the TeraGrid, he designed and deployed the world's most powerful Grid computing system for linking production HPC computing centers for the National Science Foundation. After the TeraGrid became fully operational, Pete started a research team focusing on petascale high-performance software systems, wireless sensor networks, Linux, and the SPRUCE system to provide urgent computing for critical, time-sensitive decision support. In 2008 he became the Project Director for the Argonne Leadership Computing Facility, which is home to the world’s fastest open science supercomputer. He also leads Argonne’s exascale computing strategic initiative and explores system software and programming models for exascale computing.
Increasingly, collaborative activities in science are using the concept of virtual organization as an organizing principle. One benefit of viewing these collaborations from an organizational perspective is that there is a long history of studying how organizations can be structured to function effectively. Many of these organizational principles have been reflected in the design of enterprise architectures and the use of service oriented architecture concepts as an implementation vehicle for capturing these organizational constructs.
One approach to meeting organizational requirements in systems architecture has been to express organizational structure in terms of business roles, business processes, and business rules. To date, however, this type of analysis and the associated infrastructure tools have not been applied in any consistent way to the concept of virtual organizations and their associated scientific applications. In my talk, I will explore these established approaches to business IT systems and their applicability to the virtual organizations that are being created to support scientific endeavors. As an example, I will describe how data management policies for a virtual organization can be expressed as business rules and implemented via existing business rules engines.
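As a rough illustration of the idea (a sketch, not anything from the talk itself), a data management policy can be modelled as declarative condition/action rules evaluated by a small rule engine. The rule names and metadata fields below are invented for the example:

```python
# Minimal sketch of a business-rules approach to VO data management.
# Rule names and metadata fields are hypothetical illustrations.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]   # tests a data item's metadata
    action: Callable[[dict], None]      # fires when the condition holds

@dataclass
class RuleEngine:
    rules: list = field(default_factory=list)

    def apply(self, item: dict) -> list:
        """Run every matching rule against one data item; return the names fired."""
        fired = []
        for rule in self.rules:
            if rule.condition(item):
                rule.action(item)
                fired.append(rule.name)
        return fired

# Example policy: data older than a year is archived, and items tagged
# "clinical" may only be replicated to an approved site.
engine = RuleEngine()
engine.rules.append(Rule(
    "archive-stale-data",
    condition=lambda item: item["age_days"] > 365,
    action=lambda item: item.update(storage_tier="archive"),
))
engine.rules.append(Rule(
    "restrict-clinical-replicas",
    condition=lambda item: "clinical" in item["tags"],
    action=lambda item: item.update(replica_sites=["approved-site-A"]),
))

item = {"age_days": 400, "tags": {"clinical"}, "storage_tier": "online"}
print(engine.apply(item))  # both rules fire for this item
```

A production rules engine adds priorities, conflict resolution, and a declarative rule language, but the pattern is the same: policy lives in the rules, not in application code.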
Carl Kesselman is a Professor in the departments of Industrial and Systems Engineering and Computer Science in the School of Engineering at the University of Southern California. Dr. Kesselman is also a Fellow in the Information Sciences Institute, where he is the co-director of the Medical Information Systems Division. He received a Ph.D. in Computer Science from the University of California, Los Angeles, a Master of Science degree in Electrical Engineering from the University of Southern California, and Bachelor’s degrees in Electrical Engineering and Computer Science from the University at Buffalo. Dr. Kesselman also serves as Chief Scientist of Univa Corporation, a company he founded with Globus co-founders Ian Foster and Steve Tuecke.
Dr. Kesselman’s current research interests are focused on the applications of distributed computing technology to Medical Informatics. In addition, he is interested in all aspects of Grid computing, including basic infrastructure, security, resource management, high-level services and Grid applications. He is the author of many significant papers in the field. Together with Dr. Ian Foster, he initiated the Globus Project™, one of the leading Grid research projects. The Globus project has developed the Globus Toolkit®, the de facto standard for Grid computing.
Dr. Kesselman received the 1997 Global Information Infrastructure Next Generation Internet award, the 2002 R&D 100 award, the 2002 R&D Editors’ Choice award, the Federal Laboratory Consortium (FLC) Award for Excellence in Technology Transfer and the 2002 Ada Lovelace Medal from the British Computing Society for significant contributions to information technology. Along with his colleagues Ian Foster and Steve Tuecke, he was named one of the top 10 innovators of 2002 by InfoWorld Magazine. In 2003, he and Dr. Foster were named by MIT Technology Review as the creators of one of the “10 technologies that will change the world.” He was recognized in 2007, along with Dr. Stephan Eberich, with an Internet2 IDEA award and Computerworld’s Horizon award. In 2006 Dr. Kesselman received an Honorary Doctorate from the University of Amsterdam.
A network lies at the heart of many modern computer systems, including most distributed and parallel systems. A good understanding of the behaviour and performance of the network is, therefore, a prerequisite for understanding the behaviour and performance of many computer systems. Understanding network behaviour and performance is challenging because networks are among mankind’s largest and most complex creations, and because most networks, including the Internet, are not designed with network measurement in mind. As a consequence, much network measurement relies on inference from data that does not directly describe the item of interest. Measurement also often requires a widely spread physical infrastructure, for example a network of monitoring machines. This is expensive both in terms of the hardware required and the resources needed to deploy and maintain the systems.
Network measurement systems can be catalogued under three main headings: active measurement, where data is injected into the network and the response is measured; passive measurement where characteristics of the data flowing on a link are measured; and control flow monitoring, where network management data is collected and analysed.
An orthogonal catalogue of network measurement, arranged by the motivation for measurement, is also possible. The major motivations for measurement are: curiosity-based research intended to record and understand the behaviour of the network; predicting the future behaviour of the network; monitoring the network for potential failure; understanding and improving performance; discovery and diagnosis of current faults; and discovery of undesirable user behaviour (primarily so-called “lawful intercept”). Success in these goals is important for the health of the network and the applications, organisations and societies that use it. In the extreme case, the failure of some networks, probably including the Internet, can cause human fatalities. The categories in this catalogue are not distinct; some projects meet the needs of more than one category, so the active/passive distinction is more commonly used.
CAIDA’s Skitter project is an example of an active measurement project. Skitter measures the topology of the Internet using a technique similar to traceroute. The challenge for Skitter is to understand the topology not from a point-to-point perspective (as is the case with traceroute) but from a global “macroscopic” perspective. To approximate a global view, CAIDA operates a network of about 30 probing machines at different Internet locations. Each probes approximately one million Internet addresses. The probe packets are addressed to a destination but have a limited lifetime, so most packets expire before they reach the destination. When a packet expires, an error message is returned to the source. These error messages, and their timing, provide information about the path that data takes from the probing machine to the destination. The challenges for Skitter include: deploying and maintaining the probing machines; the discovery, maintenance and decay of a suitable set of probe destinations; maintaining global cooperation; and presentation of the results in a form that has meaning for human observers. Over the ten years to February 2008, the Skitter project collected 4 TB of compressed data.
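The merging step behind the “macroscopic” view can be sketched in a few lines: each probe run yields a hop-by-hop path, and paths toward many destinations are folded into one link graph. This is a simplified illustration with invented addresses, not Skitter’s actual code; real probes must also cope with missing hops, load-balanced paths, and routers with multiple addresses:

```python
# Sketch: fold per-destination probe paths into a "macroscopic" link graph.
# Each path is the hop sequence that TTL-limited probes revealed.

from collections import defaultdict

def build_topology(paths):
    """Merge hop sequences into an adjacency map of observed links."""
    links = defaultdict(set)
    for path in paths:
        for hop, next_hop in zip(path, path[1:]):
            links[hop].add(next_hop)
    return links

# Paths from one monitor toward three destinations (addresses invented).
paths = [
    ["10.0.0.1", "192.0.2.1", "198.51.100.7"],
    ["10.0.0.1", "192.0.2.1", "203.0.113.9"],
    ["10.0.0.1", "192.0.2.5", "203.0.113.9"],
]
topology = build_topology(paths)
print(sorted(topology["10.0.0.1"]))  # the first hop fans out to two routers
```

Scaling this merge to about 30 monitors times a million destinations each is what turns a traceroute-style tool into a topology-mapping project.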
The WAND WITS passive trace archive is an example of a passive measurement project. The WITS archive contains information about the packets that flow on the networks it monitors. These traces are collected using a passive tap which records, but does not modify, the data flowing on the link. The archive includes traces from commercial and academic networks. The challenges for passive measurement include accurately capturing the traffic, including the time at which it was observed, and protecting the privacy of the network’s users while permitting useful analysis. The large volumes of data also present challenges, including management and storage, making the data available to the research community, and analysis of the data. The WITS traces are collected using DAG passive measurement cards from Endace. These cards are designed to capture every packet’s header, even under full load. They use an externally synchronised clock (often a GPS receiver) to time-stamp packets in hardware as they arrive. The challenges presented by the size of passive traces mean that the availability of long traces is particularly important. The WITS traffic archive includes a very long trace set from the University of Waikato’s Internet connection. With the exception of a few outages, this trace set includes the TCP/IP header of every packet that has entered or left the university since December 2003. Packets carry a GPS-based time stamp. The WITS trace archive is about 8 TB of compressed data.
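The kind of analysis such header traces support can be illustrated with a toy example: computing per-second traffic volume from (timestamp, packet-length) records. The records below are invented; real traces use hardware timestamps and binary capture formats rather than Python tuples:

```python
# Toy analysis over a passive header trace: bytes observed per one-second bin.
# Records are (timestamp_seconds, packet_length_bytes); all values invented.

from collections import Counter

def bytes_per_second(records):
    """Sum packet lengths into one-second bins keyed by integer timestamp."""
    bins = Counter()
    for ts, length in records:
        bins[int(ts)] += length
    return dict(bins)

trace = [(0.10, 1500), (0.55, 40), (1.20, 1500), (1.90, 576), (2.05, 40)]
print(bytes_per_second(trace))  # {0: 1540, 1: 2076, 2: 40}
```

The same binning idea, applied to years of continuous capture, is what makes long trace sets like Waikato’s valuable: trends only become visible when the record is both fine-grained and long.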
The Oregon Route Views project is an example of a control flow measurement project. Route Views collects routing information from the Internet. To do this, a routing session is established between a router at the University of Oregon and other routers “at interesting places” around the Internet. Currently there are about 40 of these peers. The Route Views routers accept routes from these other routers but never offer them any routes or route any traffic to them. In this respect Route Views is passive, although it uses the network to exchange the data it collects. Because the Route Views routers get routes from each of the routers they peer with, they normally see several paths to each destination. The initial purpose of Route Views was to allow network providers to see how their routing appeared from other points in the network; however, the multiple viewpoints it provides for routing data have supported many academic projects and papers. A daily snapshot of the routing system as seen by Route Views is stored. Each snapshot is about 7 MB compressed (about 400 MB raw), with 18 months of data online totalling about 4 GB.
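The “several paths per destination” property can be sketched simply: routes learned from each peer are kept side by side rather than collapsed into a single best path, so one prefix is visible from several vantage points. Peer names, prefixes, and AS paths here are invented for illustration:

```python
# Sketch of a Route Views-style collector: keep every peer's route to a
# prefix instead of selecting one best route. All values are invented.

from collections import defaultdict

class RouteCollector:
    def __init__(self):
        # prefix -> {peer: AS path}, preserving every vantage point
        self.routes = defaultdict(dict)

    def learn(self, peer, prefix, as_path):
        """Record the AS path one peer announced for a prefix."""
        self.routes[prefix][peer] = as_path

    def views(self, prefix):
        """All AS paths to a prefix, one per peer that announced it."""
        return self.routes[prefix]

rv = RouteCollector()
rv.learn("peer-A", "192.0.2.0/24", [64500, 64510, 64520])
rv.learn("peer-B", "192.0.2.0/24", [64501, 64520])
print(len(rv.views("192.0.2.0/24")))  # the prefix is seen from two viewpoints
```

An ordinary router would pick one of these paths and discard the rest; keeping all of them is exactly what makes the collected data useful for research.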
This talk will cover the motivations and challenges for network measurement as illustrated by some of the major network measurement projects that have been undertaken in recent years.
Associate Professor Tony McGregor is a member, and former chairperson, of the Department of Computer Science at The University of Waikato. He teaches operating systems and computer networks. His research interests lie in the computer networks area, especially in network performance measurement and simulation. He was the chief architect and project leader for the NLANR (National Laboratory for Applied Network Research) AMP (Active Measurement Project), which deployed 150 network measurement devices around the world to measure the evolution and performance of the advanced research and education networks. Tony also leads the WAND centre for network innovation. Through the centre he has been involved with many industry-based research projects, including work for Alcatel, NZ Telecom, and ihug. The centre has also been the source of a number of startup companies, including Endace, one of New Zealand’s fastest growing technology exporters.
Over the last 30 years the Internet has become a crucial backbone of our social as well as economic life. It has grown dramatically not only in size, but also in complexity, as it has taken on many additional and crucial functions (such as mobility) not envisioned by its original design.
As the patches and cracks in the current architecture become more and more apparent, the calls for more radical “clean-slate” architectures for a next-generation Internet are increasing as well.
Many of us believe that to effectively conduct fundamental research in new networking and communication paradigms we need an experimentally driven approach to validate those ideas in real-world-like settings. This requires robust and relevant large-scale experimentation facilities, as well as the necessary culture, methods and tools to produce scientifically sound and repeatable results.
In my talk I will provide an overview of some current activities, especially under the GENI (US) and FIRE (EU) initiatives. I will also outline the requirements for the planned large-scale experimental facilities, as well as the challenges for the research community in taking full advantage of these resources.
Dr Ott has been the Research Group Leader of the Networked Systems Theme at NICTA since January 2006. His group is active in various international testbed activities, such as FIRE in Europe and GENI in the US.
Before joining NICTA he co-founded Semandex, a pioneering provider of Content-Based Networks, a new generation of Enterprise Information Integration and Knowledge Management systems. He also holds a Research Professor appointment at WINLAB, Rutgers University, where he is responsible for the software architecture of the ORBIT testbed.
He holds a Ph.D. in electrical engineering from the University of Tokyo, Japan, and an M.S. from the Technical University of Vienna, Austria.
He has worked on many different things over his career, including networking, operating systems, software development, any weird computer language, video processing, user interface design, web & semantic web technologies, starting a company, writing a business plan, changing diapers, Lego & toy trains, and kayaking.
Last modified: Monday, 02-Feb-2009 14:27:35 NZDT