SURA Cyberinfrastructure Workshop:

Grid Application Planning & Implementation

January 5, 6 & 7, 2005
Georgia State University
Atlanta, GA

This event gave attendees an opportunity to extend their understanding of the various definitions of "grid" and of grid-based applications, gain insight into the broad range of applications that benefit from grids, share recommendations and roadmaps for successful grid application deployments, ask grid experts about specific project needs, and meet others interested in collaboration and peer support for deploying grids today. The workshop was targeted at a broad audience interested in moving forward with grid building and application deployment.

Georgia State University Continuing Education Credits were available to those who attended the full workshop and completed the evaluation form.


Wednesday, January 5

5:30 - 7:00 p.m. Reception
Keynote presentation
Edward Seidel, Director, Center for Computation & Technology
Louisiana State University
Abstract
Bio

Thursday, January 6

8:15 - 8:30 a.m. Opening
Welcome from SURA
Gary Crane, Director of IT Initiatives, SURA
Welcome from Georgia State University
JL Albert, Associate Provost of Information Technology and CIO, Georgia State University

8:30 - 10:00 a.m. Grid Overview: Concepts & Technologies
8:30 - 9:00 a.m. Grid Definitions & Perspectives
Mary Fran Yafchak, IT Program Coordinator
SURA
Abstract
Bio
9:00 - 9:45 a.m. Promises and Challenges of Grid and Cluster Computing
Heinz J. Schwarz
Sun Microsystems
Abstract
Bio
9:45 - 10:00 a.m. Moderated views from the audience

10:00 - 10:30 a.m. Break

10:30 a.m. - 12:00 p.m. "Parade of Grids"
10:30 - 11:00 a.m. Jefferson Lab: Experimental and Theoretical Physics Grids
Andy Kowalski
Jefferson Lab
Abstract
Bio
11:00 - 11:30 a.m. GridLab Application Areas
Gabrielle Allen, Associate Professor, Center for Computation & Technology
Louisiana State University
Abstract
Bio
11:30 a.m. - 12:00 p.m. SCOOP (SURA Coastal Ocean Observing & Prediction) v1.0
Sandra Redman, SCOOP IT team, and Research Scientist
University of Alabama in Huntsville
Abstract
Bios

12:00 - 1:00 p.m. Lunch

1:00 - 2:00 p.m. "Parade of Grids" (continued)
1:00 - 1:30 p.m. MCNC Grid Initiatives
Phillip Emer
Chuck Kesler
MCNC Grid Computing and Networking Services
Abstract
Bios
1:30 - 2:00 p.m. SURA NMI Utility Grid: Sharing Resources, Sharing Results
Art Vandenberg, Director, Advanced Campus Services
Georgia State University
Abstract
Bio

2:00 - 2:40 p.m. How to "Grid-enable" an Application  
2:00 - 2:20 p.m. Focus Study #1: Mining on the Grid with ADaM
Sandra Redman, Research Scientist
University of Alabama in Huntsville
Abstract
Bio
2:20 - 2:40 p.m. Focus Study #2: Using the AT Grid for Genomics Research at the University of Florida - Big Biology meets Obvious Opportunity
Bill Farmerie, Scientific Director, ICBR Genomics Group
University of Florida
Abstract
Bio

2:40 - 3:10 p.m. Break

3:10 - 4:00 p.m. How to "Grid-enable" an Application (continued)  
3:10 - 3:30 p.m. Focus Study #3: Using Resource Virtualization Techniques to Grid-enable Coupled Coastal Ocean Models
Renato Figueiredo, Assistant Professor, Advanced Computing & Information Systems
University of Florida
Abstract
Bio
3:30 - 3:50 p.m. Focus Study #4: Parallel Algorithm for Multiple Genome Alignment Using Multiple Clusters
Art Vandenberg, Nova Ahmed, Yi Pan
Georgia State University
Abstract
Bios
3:50 - 4:00 p.m. Moderated discussion and Q&A

4:00 - 5:30 p.m. "Ask A Grid Expert" Panel & Interaction

Moderator: Edward Seidel, Director, CCT, Louisiana State University
Panelists:
* Gabrielle Allen, Associate Professor & Asst. Director, CCT, Louisiana State University
* Jay Boisseau, Texas Advanced Computing Center (TACC), UT Austin
* Brian Bonenfant, Business Development Manager, SGI
* Dr. Martin Maldonado, Sr. Technical Architect, Grid Computing, Universities & Research, IBM
* Rick Schlichting, Director, Software Systems Research, AT&T Labs
* Heinz Joerg Schwarz, Senior Program Manager, External Research Office, Sun Microsystems
* Jikku Venkat, CTO, United Devices


Friday, January 7

8:00 - 10:00 a.m. Topical Break-outs (top three selected by attendees)
1. Getting Started Building a Grid/Different Grid Technologies - Facilitator: Phil Emer, MCNC
2. Application Tool Sets - Facilitator: Gabrielle Allen, Louisiana State University
3. Authn/Authz & Security/Policies, Standards, Technologies - Co-facilitators: Art Vandenberg, Victor Bolet, Georgia State University

10:00 - 10:30 a.m. Break

10:30 a.m. - 12:00 p.m. Further Context for Implementation  
10:30 - 11:00 a.m. Designing a New Network Architecture for U.S. Research & Education
Steve Corbató, Director, Network Initiatives
Internet2
Abstract
Bio
11:00 - 11:30 a.m. Supporting Grid Environments
Leigh Grundhoefer
iVDGL/Indiana University
Abstract
Bio
11:30 a.m. - 12:00 p.m. Directions & Opportunities for Funding
Sue Fratkin, SURA IT consultant
Fratkin Associates
Abstract
Bio

12:00 - 12:10 p.m. Adjourn

Presentation Abstracts & Speaker Bios

Keynote: Enabling Science and Engineering Applications on the Grid
Edward Seidel

Abstract
The Grid has the potential to fundamentally change the way science and engineering are done. The aggregate power of the computing resources, file systems, and sensors connected by networks, that is, of the Grid, exceeds that of any single resource by orders of magnitude. At the same time, our ability to access and analyze distributed data, or to carry out computations of the scale and level of detail required, for example, to study the Universe or simulate a rocket engine, is severely constrained by available computing power. This keynote will discuss some modern computing applications, including simulations of colliding black holes, and show how they are driving the development of Grid computing technology. Applications are already being developed that are not only aware of their needs, but also of the resources available to them on the Grid. With new technologies being deployed, they will be able to adapt themselves automatically to respond to their changing needs, to spawn off tasks on other resources, and to adapt to the changing characteristics of the Grid including machine and network loads and availability. The presentation will highlight some of the emerging technologies that enable development of such applications, a number of innovative scenarios for computing on the Grid enabled by these technologies, and show how close these are to being a reality.

Speaker Bio
Ed Seidel is a physicist recognized worldwide for his work on numerical relativity, black holes, and high-performance computing. He earned his Ph.D. from Yale University in 1988. Seidel worked at the University of Illinois and led the National Center for Supercomputing Applications Numerical Relativity group for a number of years. He was a professor at the Max-Planck-Institut fuer Gravitationsphysik (Albert-Einstein-Institute) in Golm, Germany from 1996 - 2003. Seidel is currently serving as the director of the Center for Computation & Technology at Louisiana State University and Floating Point Systems Professor of Physics and Computer Science.

******************

Grid Definitions & Perspectives
Mary Fran Yafchak

Abstract
This presentation will highlight a variety of definitions and concepts surrounding the term "grid" as applied to this emerging technology. Answers to questions of "what is a grid?" and "why use a grid" should be explored within this larger context - to plan for and progress within particular deployments while remaining aware of and open to the fluidity within the current grid evolution. The presentation will also serve as a guide to participation in the activities that follow within the workshop and how they can be leveraged to inform and support specific projects and applications.

Speaker Bio
Mary Fran Yafchak is the IT Program Coordinator for SURA (Southeastern Universities Research Association) where she works to further the development of regional information technology collaborations and their synergy with relevant national and international developments. She is a member of ViDe (Video Development Initiative) and past co-chair and current member of the Internet2 Digital Video working group. Mary Fran is currently completing three years of program development and management for the NMI (NSF Middleware Initiative) Testbed Program, part of an NSF-funded partnership with Internet2, EDUCAUSE, and the GRIDS Center.

In current and past roles, Mary Fran has enabled and supported diverse initiatives related to the development and dissemination of advanced network technologies. These include the NYSERNet Video-IP Project, the NYSERNet Multipoint Conferencing Service trial, integration of Internet resources within the K-12 community, and facilitation of university-based advanced application support teams. She also spent three years as an Internet trainer with the NYSERNet Information Technology Education Center (NITEC). Mary Fran holds a B.S. in Secondary Education/English from SUNY Oswego and an M.S. in Information Resource Management from Syracuse University.

******************

Promises and Challenges of Grid and Cluster Computing
Heinz Joerg Schwarz

Abstract
This talk reviews trends in supercomputing and gives an overview of the challenges associated with cluster computing and of ways to address these challenges with design choices. Since June of 2005, more than 50% of the systems listed in the Top500 list are clusters, and the trend continues with more and more organizations around the world building small and large, general purpose and very specialized clusters.

Clusters have gained popularity in both academia and industry because they provide a relatively cheap way to aggregate compute and data processing power. While there is certainly an incentive to use clusters, there are also challenges to making their use successful. The purpose of the talk is to uncover those challenges and demonstrate different ways to address them with design choices. Some of the major design criteria that have to be considered in a cluster architecture are discussed in a vendor-neutral way at an introductory level. Specifically, Space/Power/Cooling, Interconnects, File-Systems and SW-Provisioning & Management Tools are reviewed. CPU Architectures and Parallel File systems are discussed with respect to these criteria. The talk also covers a brief working definition of clusters versus grids and briefly describes some grid infrastructure and services elements.

Speaker Bio
Heinz Joerg Schwarz [Joerg is pronounced "York"] is currently Senior Program Manager in the External Research Office (ERO) of Sun Microsystems, Inc. Laboratories.

In his current role, Joerg is responsible for identifying opportunities for, and implementing, collaborations between Sun researchers and engineers and the academic community. This involves a broad spectrum of IT areas and legal aspects, mainly with regard to intellectual property. His own research area involves information systems for data- and numerical-intensive computing. He advises Sun's senior management on strategy and policy for HPTC. He has published several papers on infrastructure for collaborative computing environments and has given keynote talks at numerous scientific conferences on Cluster, Grid and High Performance Computing.

Prior to his current assignment, Joerg was responsible for Sun's global business development in education and academic research. In 1999, shortly after accepting his assignment in Sun's global HQ, Joerg and his group started building a community of "Centers of Excellence". The Sun COE community pioneered the implementation of computational clusters and data grids in combination with large memory computers for key scientific applications. Today, many of the COEs are also focusing on the management of large datasets and collaborative, remote visualization. Joerg serves as chairman of the technical advisory and certification board of the NSF COE for high performance computing in Maui, Hawaii and University advisory boards.

Joerg has 15 years of experience in the IT Industry, 10 years as a consultant and manager in the Education, Research, and Healthcare industry. During his career at Sun Microsystems and Bull GmbH, he also designed and implemented IT projects in retail, pharma and government. Joerg studied biology and history at the University of Cologne and holds a graduate degree in economics (with emphasis in data processing and international business) from the University of Applied Science, Cologne. He is currently finishing a Master's degree in Information Systems at the University of San Francisco.

******************

Jefferson Lab: Experimental and Theoretical Physics Grids
Andy Kowalski

Abstract
An increasing data volume for experimental physics and a growing user community have led both High Energy and Nuclear Physics collaborations to adopt grid technology for simulation and analysis. Jefferson Lab has adopted a web services grid approach, where key services are stateless web services. As part of the SRM (Storage Resource Manager) collaboration, JLab has participated in two cycles of specification and software development of this component, and is using version 2 of the SRM in production analysis. The Lab's theory community is collaborating in the development of the International Lattice Data Grid, which is to be a grid-of-grids loosely coupled by standards for a Meta Data Catalog and Replica Catalog. Current experiences and near-term evolution of the experimental and theory grids will be presented.

Speaker Bio
Andy Kowalski obtained a B.S. in Computer Science from Old Dominion University in 1992. From 1992 to 1995, he worked on various DOT and DOD contracts designing IP-based networks and deploying command and control systems. In 1995, Andy joined Jefferson Lab where he has been responsible for the design and deployment of the mass storage system, compute farm, and the underlying network in support of the experimental physics program. In 2001, he started investigating data grids and how they could be used by the experimental program at Jefferson Lab. As a member of the Particle Physics Data Grid (PPDG) collaboration, he has worked with the SRM collaboration to help define the SRM specification as a standard service interface to the mass storage systems used by data grids. Recently, Andy became the Deputy Computer Center Director for Jefferson Lab.

******************

GridLab Application Areas
Gabrielle Allen

Abstract
The European GridLab project is building tools to provide application developers easy access to Grid technologies. The central piece of the GridLab software is the Grid Application Toolkit, which provides an application oriented API to a range of Grid capabilities. In this talk, we describe the motivation and design of the Grid Application Toolkit and show how it is being adopted by different communities to provide capabilities such as job migration, data management, and notification.

Speaker Bio
Gabrielle Allen is Associate Professor of Computer Science at Louisiana State University, and Assistant Director for Computing Applications at the Center for Computation & Technology. Gabrielle is a PI for several large Grid projects, including GridLab, UCoMS, and GridChem, and is the lead of the Cactus project. She obtained her PhD in Astrophysics at Cardiff University in 1993 after obtaining the Certificate of Advanced Study in Applied Mathematics and Theoretical Physics from Cambridge University in 1989, and a B.S. in Mathematics from Nottingham University in 1988. Her research interests include grid and high performance computing, computational science and numerical relativity.

******************

SCOOP (SURA Coastal Ocean Observing & Prediction) v1.0
Sandra Redman

Abstract
Our nation stands on the verge of creating a national system for observing and predicting the myriad events that occur in America's vital coastal waters. The Southeastern Universities Research Association (SURA) Coastal Ocean Observing and Prediction (SCOOP) program is working towards this end by integrating regional ocean observing systems and developing a prototype distributed laboratory based on grid technology for coastal research and operations at the national level. This system will impact use, stewardship and management of our coastal regions, and will allow us to protect them from a host of man-made and natural hazards, including pathogens, toxins, biohazards, and storms.

In the coming year, SCOOP will implement key elements of a distributed system for assessing and predicting environmental response to extreme events in the eastern U.S. coastal zone, from Canada to Mexico. The program will focus on storm surge, wind waves and surface currents, with special attention to predicting and visualizing phenomena that cause damage and inundation of coastal regions during severe storms and hurricanes. The researchers' goals are: (1) to measure, understand and predict environmental conditions, (2) to provide R&D support for operational agencies including NOAA, the U.S. Navy, and others, and (3) to include outreach and education components that assure relevance of their observing activities. This presentation will provide an overview of SCOOP and a discussion of the implementation plans for developing the grid-based SCOOP system.

Mining on the Grid with ADaM
Sandra Redman

Abstract
This presentation will highlight next-generation data mining applications using grid technology. The UAH Information Technology and Systems Center's Algorithm Development and Mining (ADaM) system mines large scientific data sets for geophysical phenomena detection and feature extraction. The ADaM toolkit consists of interoperable components that can be linked together in a variety of ways to aid researchers in defining and performing data mining operations on scientific and engineering data. ADaM is also available as a grid-based application, which provides researchers with the capability to use geographically distributed system resources in new ways to achieve greater results than previously possible.

As a member of the Environmental Hydrology Application Technology Team, ITSC researchers developed data mining algorithms using ADaM for the Modeling Environment for Atmospheric Discovery (MEAD) Expedition, part of the National Center for Supercomputing Applications (NCSA) TeraGrid Alliance program. The large-scale, distributed computational and storage infrastructure of a grid, with core middleware and services, makes it an ideal platform for mining large volumes of computationally intensive satellite imagery as well as other large volumes of observational data such as those from Doppler radar. ADaM is also being used to generate custom mining applications which accommodate the real-time and dynamically adaptive nature of mesoscale problems for the Linked Environments for Atmospheric Discovery (LEAD), a large NSF Information Technology Research (ITR) program to develop an integrated, scalable framework for use in accessing, preparing, assimilating, predicting, managing, mining/analyzing, and displaying a broad array of meteorological and related information, independent of format and physical location. LEAD's grid of large-scale computing resources, real-time weather data, models and analysis tools will allow scientists to develop on-demand hazardous weather detection systems that can recognize and react to changing weather conditions. In addition, ADaM has been used successfully as an NSF Middleware Initiative (NMI) Testbed Program application test

Speaker Bio
Sandra Redman is a Research Scientist in the Information Technology and Systems Center (ITSC) at the University of Alabama in Huntsville (UAH). Sandra is responsible for high-performance networking, grid computing, security and video technologies research programs at UAH. She was Co-Investigator for an NSF-funded project for high-performance network connectivity and is a technical resource for UAH Internet2 programs, including the successful collaboration with other research universities within the state and with representatives of the Alabama Research and Education Network (AREN) to establish the Gulf Central GigaPoP (GCG), a large point of presence for high-performance networks in the state of Alabama. In addition to leading the NSF Middleware Initiative Testbed Site program at UAH, she is also leading several other related efforts, including a grid-based streaming video project for International Space Station (ISS) downlink video, and the ISS Space Development and Operations Grid (SpaceDOG), a NASA program for performing remote payload operations in a grid environment. Sandra serves as lead for the NSF-funded Linked Environments for Atmospheric Discovery (LEAD) Grid and Web Services Testbed team, tasked with developing and deploying the LEAD testbed environment. She also serves on the National Space Science and Technology Center (NSSTC) IT Advisory Committee, tasked with developing and implementing IT and security policies.

******************

MCNC Grid Initiatives
Phil Emer, Chuck Kesler

Abstract
MCNC Grid Computing & Networking Services is a non-profit organization committed to advancing education, innovation and economic development throughout North Carolina by delivering next-generation information technology services. MCNC Grid Computing & Networking Services develops, tests and deploys grid computing and advanced networking solutions. Through its North Carolina Research and Education Network (NCREN), MCNC provides high-speed Internet, video, audio, data and computing services to universities and other institutions. NCREN is the backbone for the North Carolina Statewide Grid and future technology growth. Phil Emer and Chuck Kesler will provide a summary of North Carolina grid test and integration activities led by MCNC and its collaborators. The talk will cover selected technical solution details as well as policy and other non-technical aspects of our testing and deployment experiences.

Speaker Bio
With over 15 years as an IT professional, Chuck Kesler brings an extensive information systems background to his role as MCNC's director of Grid and Data Center Services. Since joining MCNC in 2001, Chuck has provided technical architecture and project management for MCNC's grid computing and hosting initiatives. His activities have included spearheading the deployment of the North Carolina BioGrid Testbed and leading a collaborative grid infrastructure working group that includes representatives from North Carolina's university community.

Before joining MCNC, Kesler was Director of Information Technology for Carolina BroadBand, an emerging telecommunications service provider in the Carolinas. There he was responsible for leading the group that designed the data center infrastructure to support the company's internal computing needs and ISP service offerings. His team also deployed and supported all corporate IT systems, including desktops, e-mail, file servers and the company's intranet.

In addition, he served in several technical management roles at Interpath, a leading application service provider (ASP). His experience there included a key role in the design and build-out of a 9,000 square foot server farm. Working with cross-functional teams, he helped deliver hosted solutions to customers in a variety of industries, including e-commerce, financial, healthcare and pharmaceuticals. He also led the group that established the company's information security program, which involved policy development, strengthening firewall and intrusion detection systems, and successfully implementing incident response procedures.

Much of Chuck's early career was spent at North Carolina State University, where he helped build and support the Eos/Unity distributed computing environments used by more than 30,000 students, faculty and staff. As these systems were early adopters of technologies such as Kerberos and AFS, Kesler gained in-depth exposure to delivering sophisticated network-based services in a large scale, highly technical university community.

Speaker Bio
Phil Emer has spent more than 15 years working at the intersections of networking, research and academia. He is currently a senior member of the technical staff in MCNC's Advanced Technologies Group, and the program director of the grid technology evaluation center.

Prior to joining MCNC, Emer spent six years in the Office of Information Technology at N.C. State University reporting to the Vice Provost for IT. During his tenure at NCSU, he directed research and development activities in networking and managed the campus data and video network engineering and operations group. As the Director of R&D activities, he worked closely with network researchers in the National Science Foundation-supported Center for Advanced Computing and Communications at NCSU and at Duke University in the deployment of testbed and measurement infrastructures. As a founding member of the North Carolina Networking Initiative (NCNI), Emer participated in numerous regional initiatives supporting networking research and the deployment of emerging networking technologies. He also managed NCSU connectivity to and campus support for the vBNS and Abilene research networks.

His career began at IBM where he held engineering positions in the federal systems and network hardware divisions. While at IBM, Emer developed software specifications and reference implementations for large-scale, real-time communications and display systems, served on international teams developing advanced network protocols and interfaces, and authored international standards and specifications for Local Area Networking (LAN) and Asynchronous Transfer Mode (ATM) protocols. He also served as Vice President of Data Network Engineering for privately held Carolina Broadband, a competitive broadband service provider based in the Carolinas.

Emer has held various leadership positions including: Chairman, University of North Carolina Network Infrastructure Committee; Adjunct Professor and Visiting Lecturer, Department of Electrical and Computer Engineering, NCSU; NCSU Computer Networking Curriculum Committee; Technical Director, NCSU Multimedia Laboratory; and President, Southeast Region ATM Interest Group.

******************

SURA NMI Utility Grid: Sharing Resources, Sharing Results
Art Vandenberg

Abstract
An overall goal of the NSF Middleware Initiative (NMI) is to foster and build sustainable cyberinfrastructure for education and research collaboration. Institutions participating in the NMI Integration Testbed program (http://www.nsf-middleware.org/testbed) have been evaluating the utility and usability of a cooperative grid environment. The SURA NMI Testbed Grid is a multi-institutional effort to deploy Grid technology for true inter-institutional (vs. project-specific) sharing of network-based resources. The ultimate objective is to provide a distributed, heterogeneous, and dynamic environment in which faculty researchers and collaborators, including students, can form "virtual organizations" using various grid components. The current iteration of the SURA NMI Testbed Grid provides a live mechanism to explore grid capabilities, identify and develop new technology and services, and advance the integration of grid infrastructure into the campus enterprise infrastructure.

Parallel Algorithm for Multiple Genome Alignment Using Multiple Clusters
Art Vandenberg, Nova Ahmed, Yi Pan

Abstract
The multiple genome sequence alignment problem falls in the domain of problems that can be parallelized to address large sequence lengths. Although the computation requires some communication, proper distribution can reduce the overall problem to a set of independent tasks whose results are merged. This presentation describes work carried out to adapt a parallel algorithm to a cluster environment, and the use of SURA NMI Testbed Grid resources to evaluate performance relative to a traditional cluster environment. An interesting result is that using multiple clusters improves performance over a single-cluster environment, which points to potential additional research using multiple grid-enabled clusters.
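
The idea of reducing a problem to independent tasks whose results are merged can be illustrated with a minimal Python sketch. It is not the presenters' algorithm: it assumes a simple stand-in workload (scoring every pair of sequences with a basic global alignment score), and the function names, scoring values, and the thread pool standing in for cluster nodes are all hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch style) alignment score, keeping two rows only."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def score_all_pairs(sequences, workers=4):
    """Score each pair as an independent task, then merge into one table.

    `sequences` maps a name to a sequence string. The thread pool is only a
    stand-in for separate cluster nodes (an assumption of this sketch).
    """
    pairs = list(combinations(sorted(sequences), 2))

    def task(pair):
        first, second = pair
        return alignment_score(sequences[first], sequences[second])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(task, pairs))  # tasks share no state
    return dict(zip(pairs, scores))          # merge step
```

Because no task depends on another, the same task list could in principle be divided among several clusters and the partial tables merged afterwards; on a single machine, Python threads mainly illustrate the structure rather than deliver a speedup.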

Speaker Bio
Art Vandenberg has a Master's degree in Information & Computer Sciences from Georgia Institute of Technology, where he spent 15 years providing research support, database administration, applications development, and project management. Since 1997 he has been at Georgia State University with Information Systems & Technology. As Director of Advanced Campus Services he evaluates and implements middleware infrastructure and supports research computing. His current activities at Georgia State include deploying directory and grid middleware and collaborating with faculty researchers on infrastructure for high performance computing and grids. Art is working with several SURA (Southeastern Universities Research Association) sites to deploy regional grid resources. He is Co-PI on a National Science Foundation Information Technology Research grant investigating a unique approach to resolving metadata heterogeneity.

Speaker Bio
Dr. Yi Pan received his B.Eng. and M.Eng. degrees in Computer Engineering from Tsinghua University, China, in 1982 and 1984, respectively, and his Ph.D. degree in Computer Science from the University of Pittsburgh, USA, in 1991. Currently, he is a Professor in the Department of Computer Science at Georgia State University.

Dr. Pan's research interests include parallel and distributed computing, optical networks, wireless networks, and bioinformatics. Dr. Pan has published more than 80 journal papers with 27 papers published in various IEEE journals. In addition, he has published over 90 papers in refereed conferences (including IPDPS, ICPP, ICDCS, INFOCOM, and GLOBECOM).

His pioneering work on computing using reconfigurable optical buses has inspired extensive subsequent work by many researchers, and his research results have been cited by more than 100 researchers worldwide. His recent research has been supported by NSF, NIH, NSFC, AFOSR, AFRL, JSPS, IISF and the states of Georgia and Ohio. Dr. Pan has served as an editor-in-chief or editorial board member for eight journals, including three IEEE Transactions, and as a guest editor for seven special issues. He has organized several international conferences and workshops and has also served as a program committee member for several major international conferences such as INFOCOM, GLOBECOM, ICC, IPDPS, and ICPP.

Speaker Bio
Nova Ahmed is a Ph.D. student in Computer Science at Georgia State University. Her Master's Thesis investigated a Parallel Algorithm for Memory Efficient Pairwise and Multiple Genome Alignment in Distributed Environments. As a graduate research assistant with the NMI Integration Testbed Program, she adapted a genome alignment algorithm from a shared memory system to run in a cluster and grid environment of the SURA NMI Testbed Grid. Currently, as a graduate research assistant with Georgia State's Advanced Campus Services, Nova Ahmed is providing distributed and parallel algorithm support for research faculty in the two areas of focus initiatives: Brains & Behavior, and Molecular Basis of Disease.

******************

Using the AT Grid for Genomics Research at the University of Florida - Big Biology meets Obvious Opportunity
Bill Farmerie

Abstract
The Genomics Group of the Interdisciplinary Center for Biotechnology Research (ICBR) is dedicated to developing and operating centralized facilities for large-scale DNA sequencing, gene transcription analysis, and computational manipulation of gene sequence and gene expression information. The genome of an organism represents all of the genes that are inherited from generation to generation. The transcriptome represents all of the genes that an organism is actively using at any particular moment. One frequent objective of many large-scale DNA sequencing projects is partial characterization of the transcriptome of novel organisms. This process usually begins with partial nucleotide sequence determination of clones randomly chosen from a cDNA library. Each cDNA clone is a copy of a messenger RNA (mRNA) molecule. Each mRNA molecule represents a single gene that the cell is actively using to enable, or catalyze, one or another biological function. Partial cDNA sequences (ca. 500 nt) are known as expressed sequence tags or ESTs for short. While a large collection of EST sequences is valuable, almost all downstream objectives require further computational analysis placing each EST sequence, where possible, into a biologically meaningful context.

The path to gene-related information discovery typically begins with searches of the NCBI and other public gene sequence databases using the BLAST search engine (1). The BLAST algorithm is relatively efficient at finding possible sequence similarity matches between a newly determined EST sequence and previously known gene sequences. We infer that the function of the gene represented by the EST is likely to be that of its closest known homolog. Finally, from the collective identity of the EST sequences represented in the entire transcriptome, biologists find clues about how a living organism uses its genetic complement to drive biological functions.

While BLAST searches are very useful, we frequently observe that for as many as one third or more of the EST sequences, BLAST does not find statistically significant homologs among known, characterized genes. This leaves us with a very significant gap in our knowledge of a part of the transcriptome and forces us to look for alternative methods of gene identification. One very powerful alternative is the HMMER (http://hmmer.wustl.edu/) search engine. HMMER uses Hidden Markov Models to discover homology relationships between genetic sequences and is an effective way of characterizing a significant fraction of the EST sequences that fail identification using BLAST searches. Unfortunately, HMMER searches require approximately ten times more computational processing than do BLAST searches. This processing penalty makes using HMMER for large numbers of queries prohibitively time-consuming in a situation where we are constrained by limited computational resources. One very attractive solution to this dilemma is grid computing.

The Academic Technologies Grid at the University of Florida (http://www.at.ufl.edu/grid) ties together 500 desktop computers located across several student-oriented computer laboratories. The AT Grid is implemented using the United Devices Grid MP platform (http://www.ud.com) and executes a grid-enabled form of HMMER. Using the AT Grid, we are able to complete large-scale gene sequence homology searches on a reasonable time scale, which clearly benefits gene discovery research at the University of Florida.
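
As a rough illustration of why this kind of search suits a desktop grid: each query sequence can be searched independently of the others, so a batch of ESTs can be cut into self-contained work units and handed to separate machines. The Python sketch below shows only that partitioning step. It is not the Grid MP API and not the grid-enabled HMMER used on the AT Grid; the function names `parse_fasta` and `make_work_units`, the FASTA input, and the unit size are all assumptions made for illustration.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (header, sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    parts: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(parts)))
            header = line[1:]
            parts = []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def make_work_units(records: List[Record], unit_size: int) -> List[List[Record]]:
    """Split records into consecutive chunks of at most unit_size queries.

    Each chunk is a hypothetical work unit that one desktop could
    search on its own, with no communication between units.
    """
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    return [records[i:i + unit_size] for i in range(0, len(records), unit_size)]


if __name__ == "__main__":
    # Five toy EST-like queries (made-up sequences).
    sample = "\n".join(f">est{i}\nACGTACGT\nTTGA" for i in range(5))
    units = make_work_units(parse_fasta(sample), unit_size=2)
    for n, unit in enumerate(units):
        print(n, [header for header, _ in unit])
```

Because the units share no state, the time to finish a batch is governed mainly by the number of machines available and the slowest unit, which is what makes a pool of otherwise idle laboratory desktops attractive for this workload.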

1. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410.

Speaker Bio
Bill Farmerie is the Director of the Genomics Group of the Interdisciplinary Center for Biotechnology Research at the University of Florida. The Genomics Group provides large-scale DNA sequencing and bioinformatics services, primarily in support of faculty-initiated research projects at the University of Florida. From the beginning, this group recognized that a successful genomics program must integrate data generation with downstream data analysis and began building both sequence production and computational capacities in parallel. Bill received his Ph.D. in Biomedical Sciences from the University of Tennessee. He has been at the University of Florida since 1989 and Director of the Genomics Group since its inception in 1998.

******************

Using Resource Virtualization Techniques to Grid-enable Coupled Coastal Ocean Models
Renato Figueiredo

Abstract
This talk will address how resource virtualization techniques facilitate the Grid-enabling of unmodified applications. The talk will focus on a case study of Grid-enabling an application from the coastal ocean modeling domain that couples a curvilinear-grid hydrodynamics 3D model (CH3D), a wave model (SWAN), and a circulation model (ADCIRC). The talk will describe how the virtualization of machines enables isolation and multiplexing of unmodified executable binary applications that require different operating systems and execution environments; how the virtualization of distributed file systems enables seamless data sharing among models; and how the virtualization of networks facilitates the management of wide-area Grid resources. In this context, the talk will also give an overview of a middleware solution - In-VIGO - that leverages such virtualization techniques to enable the creation of dynamic pools of virtual resources that can be aggregated on demand for application-centric Grid computing sessions.

Speaker Bio
Renato Figueiredo received his Ph.D. degree in Computer Engineering from Purdue University in 2001. He is currently an Assistant Professor in the Advanced Computing and Information Systems (ACIS) Laboratory at the Department of Electrical and Computer Engineering, University of Florida. Prior to joining the University of Florida he was an assistant professor at Northwestern University. His research interests include distributed systems, virtual machines, file systems, networks and computer architecture.

******************

Designing a New Networking Environment for U.S. Research & Education
Steve Corbató

Abstract
Fueled in part by opportunities emerging from the Internet/telecom crash earlier this decade and new requirements emanating from the computational science disciplines, the research universities in the United States - in collaboration with their global peers - are fundamentally redesigning the models of high performance networking and related services. This talk will describe the evolution of advanced networking by reviewing the Internet2 Abilene IPv4/IPv6 network, the critical role played by available dark fiber assets and emerging regional optical networks (RONs), and finally, national and international optical networking efforts such as National LambdaRail (NLR) and the Global Lambda Integrated Facility (GLIF).

Speaker Bio
Steve Corbató is the Director of Network Initiatives and, on an interim basis, co-leads the Technology Direction and Development department at Internet2. In this role, he has overall responsibility for Internet2's advanced technology initiatives in networking, performance, middleware, and security. In previous roles at Internet2, he led the Abilene Network through its 10-Gbps upgrade in 2001-02. In 2003, he spearheaded the formation of FiberCo, a dark fiber holding and assignment vehicle to support regional and national optical networking initiatives. Steve is also responsible for Internet2's relationship with both The Quilt and National LambdaRail.

While remaining an Internet2 employee, he currently is a visiting fellow in the Center for High Performance Computing at the University of Utah in Salt Lake City.

Prior to joining Internet2 in June 2000, Dr. Corbató was the technical lead for the Pacific Northwest Gigapop and manager of network engineering at the University of Washington, Seattle. He remains an affiliate faculty member in the Department of Computer Science and Engineering there.

Steve's background is in experimental astrophysics. He earned his B.A. cum laude from Rice University and his Ph.D. from the University of Pennsylvania. He later was a member of the research faculty at the University of Utah.

******************

Supporting Grid Environments
Leigh Grundhoefer

Abstract
Allowing applications to run easily in a grid environment presents new challenges for resource providers and professional support organizations. This presentation will share experiences derived from the Grid Operations Center at Indiana University, which is supporting a production grid environment for the Grid3 project and the International Virtual Data Grid Laboratory.

Speaker Bio
Leigh Grundhoefer received a Bachelor of Science in Applied Computer Technology in 1985 from Indiana State University in Terre Haute, Indiana. She has held positions as a Software Design Engineer and Unix Systems Administrator since that time. Leigh spent five years working in the telecommunication industry before taking a position with the Indiana University Computer Science Department in 1993. In 1995, she moved from the Computer Science Department to University Computing Services, now known as University Information and Technology Services.

At UITS, Leigh has held several positions. She managed a group of junior Unix support professionals; administered supercomputer systems for the SCAAMP project; developed and deployed the High Performance Storage System (HPSS) infrastructure; and provided technical and developmental support to the Research and Technical Services group, focusing primarily on the installation of IBM RS/6000 SP systems. She is currently working with the High Energy Physics group to create an international virtual data grid laboratory.

******************

Directions & Opportunities for Funding
Sue Fratkin

Abstract
While the Federal budget is expected to be very tight in the coming years, funding to support "grid" research and applications continues in FY 2005 and into the foreseeable future. Cost sharing, collaboration and Cyberinfrastructure have become important terms, and Agencies other than NSF have been making grants in this area. This session will explore current project funding and examine potential sources of funds, both from Federal Agencies and from other sources.

Speaker Bio
Sue Fratkin is a public policy analyst concentrating on technology and telecommunications issues, particularly as they affect the higher education community. Realizing the impact that the growth in technology would have on this community, Sue founded Fratkin Associates in 1991. She regularly interacts with White House, Congressional and Federal Agency personnel and reports on governmental hearings, programs and publications, analyzing relevant legislation and regulations.

Fratkin Associates' clients include several Washington-based higher education and IT focused associations. Sue also serves as the Washington liaison for the Coalition for Academic Scientific Computation, an alliance of forty-one academic supercomputing centers in twenty-eight states.

As the author of numerous articles on public policy and telecommunications and technology policy pertaining to the higher education community, Sue participates as a panelist at national conferences and has served as a reviewer for Federal education technology programs.

Topic attachments
Attachment                        Size      Date                 Who                   Comment
Farmerie_SURA20050106.ppt         1181.5 K  15 Sep 2008 - 17:50  SaravanarajDuraisamy
Fraktin-SURAGrid-1-07-05.ppt      31.0 K    15 Sep 2008 - 17:51  SaravanarajDuraisamy
GSUmultiGenomeGridJan2005.ppt     589.5 K   15 Sep 2008 - 17:51  SaravanarajDuraisamy
GridLab-SURAGrid_Jan05.ppt        5843.0 K  15 Sep 2008 - 17:51  SaravanarajDuraisamy
InVigo-suraworkshop.pdf           642.9 K   15 Sep 2008 - 17:49  SaravanarajDuraisamy
JLabGridWork.ppt                  613.5 K   15 Sep 2008 - 17:51  SaravanarajDuraisamy
NMIutilityGrid11Jan2005.ppt       3094.0 K  15 Sep 2008 - 17:52  SaravanarajDuraisamy
SCOOP-final-SURAGrid.ppt          645.5 K   15 Sep 2008 - 17:52  SaravanarajDuraisamy
Schwarz-SURAGridJan05.pdf         1340.4 K  15 Sep 2008 - 17:37  SaravanarajDuraisamy  Promises and Challenges of Grid and Cluster Computing
UAH-DataMining-SURAGridJan05.ppt  3288.5 K  15 Sep 2008 - 17:52  SaravanarajDuraisamy
YafchakforSURAGridJan05.ppt       201.0 K   15 Sep 2008 - 17:32  SaravanarajDuraisamy  Grid Definitions & Perspectives
application-tool-breakout.doc     25.5 K    15 Sep 2008 - 17:52  SaravanarajDuraisamy
grid-building-breakout.doc        89.0 K    15 Sep 2008 - 17:53  SaravanarajDuraisamy
grid-security-breakout.doc        53.5 K    15 Sep 2008 - 17:53  SaravanarajDuraisamy
grundhoefer-SURAGridJan05.ppt     1570.5 K  15 Sep 2008 - 17:51  SaravanarajDuraisamy
panelXscript6Jan2005.pdf          101.4 K   15 Sep 2008 - 17:49  SaravanarajDuraisamy
sura-atl-corbato-07-jan-2005.pdf  1129.1 K  15 Sep 2008 - 17:49  SaravanarajDuraisamy
sura-mcnc-jan05.ppt               1049.5 K  15 Sep 2008 - 17:52  SaravanarajDuraisamy
Topic revision: r1 - 15 Sep 2008 - 18:03:17 - SaravanarajDuraisamy
 