Distributed Terascale Facility (DTF)

Primary Sponsor: National Science Foundation
Deadline: 4/19/2001

Program Solicitation NSF 01-51
DIRECTORATE FOR COMPUTER AND INFORMATION SCIENCE AND ENGINEERING
DIVISION OF ADVANCED COMPUTATIONAL INFRASTRUCTURE AND RESEARCH
FULL PROPOSAL DEADLINE(S): April 19, 2001

Synopsis of Program:

In FY 2001 NSF seeks to open a pathway to future computing, communications, and information environments by creating a very large-scale system that is part of the rapidly expanding computational Grid [1]. NSF will establish an advanced, multi-site "distributed facility" connected by ultra high-speed networking that will lead to breakthroughs and enhance the capabilities of U.S. researchers in all areas of computational, computer, and information science and engineering. This environment will include at least one single-site computing system capable of five or more teraflops (peak) performance. However, since modern scientific and engineering research requires more than just computational capability, this terascale computer system will be embedded within an overall system that also provides sophisticated data handling and interaction with remote sites.

This distributed facility will include substantial support for accessing, analyzing, processing, transmitting, and visualizing multi-terabyte data collections of current and future interest to the U.S. research community. This will require the DTF to have terabytes to petabytes of online and archival storage available for user access and multi-gigabit per second network connectivity. The DTF will be fully coordinated with the resources and activities of the existing PACI partnerships. Special consideration will be given to qualified proposals that utilize newer generation processors and other High Performance Computing equipment.

Full exploitation of this new computational environment will be enabled by fundamental computer science research on new algorithms, data structures, system software, information mining and visualization techniques, and collaborative environments for data exploration and analysis.

[1] "The word 'grid' is chosen by analogy with the electric power grid, which provides pervasive access to power and, like the computer and a small number of other advances, has had a dramatic impact on human capabilities and society. We believe that by providing pervasive, dependable, consistent and inexpensive access to advanced computational capabilities, databases, sensors, and people, computational grids will have a similar transforming effect, allowing new classes of applications to emerge." From the Preface to The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, Inc. (1999), edited by Ian Foster and Carl Kesselman.

I. INTRODUCTION

NSF has a long history of support for high-performance computing and networking, beginning with the Supercomputer Centers program established in 1985 and the NSFnet. In 1998 the Partnerships for Advanced Computational Infrastructure (PACI) program replaced the Supercomputer Centers program. The PACI program added emphasis on the coupling of computational and computer science in order to exploit more effectively the emerging capabilities of scalable parallel systems, high performance networking and high bandwidth, large capacity mass storage systems, in addition to putting in place a formal education, outreach and training program. Due to the increased computational capability that is now available, computational science is currently experiencing a revolution in its ability to solve research problems.
The recent demonstration of computers with speeds of a teraflop or more (10^12 floating point operations per second) has directed attention to important fundamental science and engineering problems which are not amenable to solution with current systems, but would be accessible to terascale-range computation. The President's Information Technology Advisory Committee (PITAC) final report, Information Technology Research: Investing in Our Future, released on Feb. 24, 1999, states: "If the United States is to continue as the world leader in basic research, its scientists and engineers must have access to the most powerful computers. Therefore, the Committee recommends that the Federal government continue to provide these computing systems to the research community through major, shared-facility centers. To increase long-term, fundamental research across all science and engineering disciplines, the first priority should be to increase the computing capacity of the centers that can best serve the entire research community."

On August 3, 2000 the National Science Board (NSB), NSF's governing body, authorized the three-year award for a Terascale Computing System to the Pittsburgh Supercomputing Center following a national competition. This system will begin operation in February 2001, and the full system is anticipated to reach peak performance by October 2001.

While a terascale computing system will satisfy the current needs of a large number of scientists and engineers doing modeling and simulation, modern scientific and engineering research requires more than just computational capability. Investments in large scale research instrumentation being made in such diverse fields as astronomy, biology, earthquake engineering, environmental science, geosciences, gravitational science, and high energy physics will not yield their full returns unless corresponding investments are made in the infrastructure needed for data analysis.
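To give a sense of scale for the terascale figures quoted above, the short Python sketch below converts a peak rate in teraflops into floating point operations per second and estimates the ideal run time of a workload. The workload size (10^18 operations) and the function name are hypothetical choices made for illustration, not figures from this solicitation. Sustained performance on real applications is typically well below peak, so the result should be read as a best case rather than a prediction.

```python
# Illustrative arithmetic only. The workload size below is a hypothetical
# example and does not come from the solicitation.

TERA = 10**12  # one teraflop = 10^12 floating point operations per second


def ideal_runtime_seconds(total_operations: float, peak_teraflops: float) -> float:
    """Return the run time in seconds if the machine sustained its peak rate."""
    if peak_teraflops <= 0:
        raise ValueError("peak_teraflops must be positive")
    return total_operations / (peak_teraflops * TERA)


if __name__ == "__main__":
    workload = 1e18  # hypothetical job: 10^18 floating point operations
    for peak in (1.0, 5.0):
        hours = ideal_runtime_seconds(workload, peak) / 3600
        print(f"{peak:.0f} teraflops peak: about {hours:.1f} hours at best")
```

Under these assumptions, moving from a one teraflop system to a five teraflop system shortens the best-case run time by a factor of five.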
Terascale computing systems and large-scale scientific instruments and sensors are now routinely creating multi-terabyte data archives. All the researchers involved encounter similar problems, since computed, observed, and experimental data all require data manipulation and storage, visualization, data mining and interpretation. The rapidly increasing rate at which data are being generated, and the distance between their point of generation and those who need access to the information they contain, are problems that must be faced.

The concept of a computational grid, as enunciated in the book The Grid: Blueprint for a New Computing Infrastructure (http://www.mkp.com/grids/) and at other forums, provides a means of meeting these needs. Briefly, the Grid is the sum of networking, computing, and data storage technologies needed to create a seamless, balanced, integrated computational and collaborative environment. An unprecedented opportunity exists to take advantage of the emerging Grid technologies to create a national infrastructure that also includes digital libraries from observations, web-based portals to a large suite of computational resources, and support for remote use of scientific instruments.

In the original PACI solicitation, NSF anticipated this need by stating "The emergence of scaleable parallel systems, high performance networking and high bandwidth, large-capacity mass storage systems creates the opportunity for a national infrastructure consisting of a number of geographically distributed sites strongly coupled to high-end computational resources and to each other via high-speed communication networks."
The possibility now presents itself to take further advantage of current capabilities to couple distributed resources such as computing systems, data repositories, and visualization systems more tightly, by creating a system that is part of the national computational grid and that offers capabilities beyond those available at any single site, e.g., data acquisition, control of remote instruments, and computational steering.

In order to fully exploit such systems, researchers will require new algorithms, data structures, advanced system software, distributed access to very large data archives, sophisticated information mining and visualization techniques, and collaborative environments for data exploration and analysis. Important research advances will be required in every aspect of high-performance computing, communications and information processing, necessitating a long-term, sustained, and coordinated research program.

Given the greater complexity of highly parallel computer architectures, high-speed networks, petabyte data archives, and visualization and teleimmersion systems, a balanced approach to the deployment of a distributed facility that can effectively integrate a terascale system is even more critical. The concept of a computational facility as an isolated entity, an island where everything necessary for the solution and analysis of problems exists, has been slowly changing due to the PACI partnerships and the emergence of the Grid. This program will establish an advanced, balanced, multi-site "distributed facility" connected by ultra high-speed networks that will significantly enhance the capabilities of U.S. researchers in all areas of computational, computer and information science and engineering. This facility will in turn open a pathway to future, even larger scale, computing and information environments.

II. PROGRAM DESCRIPTION

The purpose of this solicitation is to continue NSF's role of serving the needs of the high-end computational and computer science research community in both simulation and data-intensive computation. It will provide the major infrastructure needed to lead to breakthroughs and enable further advances across all fields supported by NSF. It will also take advantage of and contribute to the emergence of the Grid as another means of providing computational capability, sophisticated data handling and interaction with remote sites.

With this in mind, NSF seeks to fund an advanced "distributed facility" that will demonstrate both single-site and "Grid enabled" capabilities for both simulation and data exploration beyond what is available at current PACI sites. The goal is to deliver production quality service from the distributed facility. The need for a distributed facility also acknowledges that the expertise for the various components of this new type of facility does not necessarily all reside in the same location, and that collaborative efforts can enable great synergies.

The resources one might expect to find in the distributed facility include, but are not limited to, the examples given below.

- one computing system capable of five or more teraflops (peak) performance located at a single site
- another large, but not necessarily comparably configured, system at another site, coupled with the first to test large-scale distributed computing across the DTF and other resources
- a networked system optimized to support the use of data stored at one site by a major computational resource at a geographically distant site
- visualization facilities allowing data residing at possibly more than one site to be viewed on a system remote from the data
- a distributed storage system allowing data to be stored at various sites on the Grid from a single computational resource
- ultra high-speed network connections that will enable computational resources to access unique scientific instruments directly for experiments in on-line control and data collection

The proposed distributed facility will add to the already existing capabilities provided by NSF and form the foundation of a distributed computational infrastructure that will meet the growing demands for modeling and simulation as well as anticipate the current and future needs of the scientific and engineering communities dealing with exceptionally large, data-intensive information management applications. The distributed facility will include substantial support for accessing, analyzing, processing, transmitting, and visualizing multi-terabyte data collections of current and future interest to the U.S. research community. This will require the DTF to have terabytes to petabytes of online and archival storage available for user access and multi-gigabit per second network connectivity. Fundamental computer science research, such as the FY00 ITR GriPhyN project, will be necessary to achieve truly effective use of this national resource.

It is the expectation of the NSF that this system will be fully coordinated with the resources and activities of the existing PACI partnerships, such as, but not limited to, user support and consulting. As such, the proposed facility will be managed for the use of the national community in cooperation with the Division of Advanced Computational Infrastructure and Research at NSF.
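The pairing of multi-terabyte data collections with multi-gigabit per second connectivity can be made concrete with a rough transfer-time estimate. The Python sketch below is an illustration only: the data set size (10 terabytes), the link rates, the function name, and the assumption that the full line rate is usable are all hypothetical choices, not values from this solicitation.

```python
# Rough transfer-time estimate under stated assumptions: decimal units
# (1 terabyte = 10^12 bytes, 1 gigabit = 10^9 bits) and a link that
# delivers its full nominal rate with no protocol overhead.


def transfer_time_hours(size_terabytes: float, link_gigabits_per_s: float) -> float:
    """Ideal time in hours to move a data set over a link at full line rate."""
    if link_gigabits_per_s <= 0:
        raise ValueError("link rate must be positive")
    size_bits = size_terabytes * 1e12 * 8  # terabytes -> bytes -> bits
    seconds = size_bits / (link_gigabits_per_s * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    dataset_tb = 10.0  # hypothetical 10 terabyte collection
    for rate in (1.0, 2.5, 10.0):  # hypothetical link rates in gigabits/s
        hours = transfer_time_hours(dataset_tb, rate)
        print(f"{dataset_tb:.0f} TB over {rate} Gb/s: about {hours:.1f} hours")
```

Even at these idealized rates, moving a multi-terabyte collection takes hours, which suggests why remote access to data and remote visualization appear alongside raw computing capability in the list of example resources.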
This multi-site distributed facility will be integrated into the nationwide Grid and will supplement the capabilities that are available through the PACI partnerships. The goal of this solicitation is to achieve the most computational infrastructure for the broadest scientific and engineering community within the funds available.

Since the proposed distributed facility is not simply a terascale computing platform, the requirements of this solicitation can only be met by a collaborative proposal involving two or more geographically distinct sites. Achieving the goals set forth in this solicitation will require a joint development effort between individual sites, multiple vendors, the PACI partnerships, computer scientists, software engineers, and other possible collaborators. The problems to be resolved in implementing a system of this scale are complex. Having collaborative expertise available to address the multitude of problems that are likely to occur will enhance the chances for success.

It is anticipated that only a portion of the $45 million need be spent on a five or more teraflop computing system located at a single site, and that the successful DTF should be a balanced system. It is also anticipated that existing equipment at the collaborating sites involved in a successful proposal may be made available to, or upgraded for integration into, the proposed distributed facility. It is therefore important to include substantial documentation on existing resources, and detailed plans on how such resources might be utilized as part of the overall project.

VIII. CONTACTS FOR ADDITIONAL INFORMATION

General inquiries regarding the Distributed Terascale Facility should be made to:

* Richard L. Hilderbrandt, Program Director, CISE/ACIR, Advanced Computational Infrastructure and Research, Rm. 1122, telephone: (703)292-7093, e-mail: rhilderb@nsf.gov.
* Richard Hirsh, Deputy Division Director, CISE/ACIR, Advanced Computational Infrastructure and Research, Rm. 1122, telephone: (703)292-8970, e-mail: rhirsh@nsf.gov.
* Robert Borchers, Division Director, CISE/ACIR, Advanced Computational Infrastructure and Research, Rm. 1122, telephone: (703)292-8970, e-mail: rborcher@nsf.gov.

For questions related to the use of FastLane, contact:

* Richard L. Hilderbrandt, Program Director, CISE/ACIR, Advanced Computational Infrastructure and Research, Rm. 1122, telephone: (703)292-7093, e-mail: rhilderb@nsf.gov.

IX. OTHER PROGRAMS OF INTEREST

The NSF Guide to Programs is a compilation of funding for research and education in science, mathematics, and engineering. The NSF Guide to Programs is available electronically at http://www.nsf.gov/cgi-bin/getpub?gp. General descriptions of NSF programs, research areas, and eligibility information for proposal submission are provided in each chapter.

Many NSF programs offer announcements or solicitations concerning specific proposal requirements. To obtain additional information about these requirements, contact the appropriate NSF program offices. Any changes in NSF's fiscal year programs occurring after press time for the Guide to Programs will be announced in the NSF E-Bulletin, which is updated daily on the NSF web site at http://www.nsf.gov/home/ebulletin, and in individual program announcements/solicitations. Subscribers can also sign up for NSF's Custom News Service (http://www.nsf.gov/home/cns/start.htm) to be notified of new funding opportunities that become available.

ABOUT THE NATIONAL SCIENCE FOUNDATION

The National Science Foundation (NSF) funds research and education in most fields of science and engineering. Awardees are wholly responsible for conducting their project activities and preparing the results for publication. Thus, the Foundation does not assume responsibility for such findings or their interpretation. NSF welcomes proposals from all qualified scientists, engineers and educators.
The Foundation strongly encourages women, minorities and persons with disabilities to compete fully in its programs. In accordance with Federal statutes, regulations and NSF policies, no person on grounds of race, color, age, sex, national origin or disability shall be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity receiving financial assistance from NSF (unless otherwise specified in the eligibility requirements for a particular program).

Facilitation Awards for Scientists and Engineers with Disabilities (FASED) provide funding for special assistance or equipment to enable persons with disabilities (investigators and other staff, including student research assistants) to work on NSF-supported projects. See the program announcement/solicitation for further information.

The National Science Foundation has Telephonic Device for the Deaf (TDD) and Federal Information Relay Service (FIRS) capabilities that enable individuals with hearing impairments to communicate with the Foundation about NSF programs, employment or general information. TDD may be accessed at (703) 292-5090, FIRS at 1-800-877-8339.

The National Science Foundation is committed to making all of the information we publish easy to understand. If you have a suggestion about how to improve the clarity of this document or other NSF-published materials, please contact us at plainlanguage@nsf.gov.

PRIVACY ACT AND PUBLIC BURDEN STATEMENTS

The information requested on proposal forms and project reports is solicited under the authority of the National Science Foundation Act of 1950, as amended. The information on proposal forms will be used in connection with the selection of qualified proposals; project reports submitted by awardees will be used for program evaluation and reporting within the Executive Branch and to Congress.
The information requested may be disclosed to qualified reviewers and staff assistants as part of the proposal review process; to applicant institutions/grantees to provide or obtain data regarding the proposal review process, award decisions, or the administration of awards; to government contractors, experts, volunteers and researchers and educators as necessary to complete assigned work; to other government agencies needing information as part of the review process or in order to coordinate programs; and to another Federal agency, court or party in a court or Federal administrative proceeding if the government is a party. Information about Principal Investigators may be added to the Reviewer file and used to select potential candidates to serve as peer reviewers or advisory committee members. See Systems of Records, NSF-50, "Principal Investigator/Proposal File and Associated Records," 63 Federal Register 267 (January 5, 1998), and NSF-51, "Reviewer/Proposal File and Associated Records," 63 Federal Register 268 (January 5, 1998). Submission of the information is voluntary. Failure to provide full and complete information, however, may reduce the possibility of receiving an award. Pursuant to 5 CFR 1320.5(b), an agency may not conduct or sponsor, and a person is not required to respond to an information collection unless it displays a valid OMB control number. The OMB control number for this collection is 3145-0058. Public reporting burden for this collection of information is estimated to average 120 hours per response, including the time for reviewing instructions. 
Send comments regarding this burden estimate and any other aspect of this collection of information, including suggestions for reducing this burden, to: Suzanne Plimpton, Reports Clearance Officer, Information Dissemination Branch, Division of Administrative Services, National Science Foundation, Arlington, VA 2230, or to Office of Information and Regulatory Affairs of OMB, Attention: Desk Officer for National Science Foundation (3145-0058), 725 17th Street, N.W. Room 10235, Washington, D.C. 20503. NSF 01-51 OMB control number: 3145-0058.