Connections



Data Resources

Data Resources Summary

DAKS
Other Data Sources
Existing Scientific Research Portals
More Projects


Data and Knowledge Systems (DAKS)
(http://daks.sdsc.edu/)

Divisions

Knowledge and Information Discovery (http://scirad.sdsc.edu/datatech/skidl.html) (description)

Knowledge-based Information Systems (http://kbis.sdsc.edu/) (description)

Spatial Information Systems (http://daks.sdsc.edu/sp/index.html) (description)

Advanced Query Processing (http://daks.sdsc.edu/aqp/index.html) (description)

Data Grids Technologies (http://daks.sdsc.edu/dg/index.html) (description)

Advanced Database Projects (http://daks.sdsc.edu/adp/index.html) (description)

Geoinformatics (http://daks.sdsc.edu/gl/index.html) (description)

SDSC Visualization (http://vis.sdsc.edu/) (description)

Sustainable Archives and Library Technologies (http://daks.sdsc.edu/salt/index.html) (description)

Major Initiatives

BIRN

Impressive site in general
Good survey of user preferences from 2002
No education resources yet (under construction)
Sophisticated data appropriate primarily to undergrad pre-med and graduate/med students

 

GEON

Includes GEON Portal, myGEON
Emphasis on data ontologies, workflows
Rocky Mountain Testbed
Mid-Atlantic Appalachian Orogen Testbed
AType workflow
SynSEIS, web-based seismic analysis tool
Earth History Testbed (CHRONOS, PGAP Search)
CHRONOS contains the Foram database, Foram Guide, and Foram Atlas
North American Volcanic database (NAVDAT)
Chronostratigraphic database

GEON is using ADN—the metadata standard developed by DLESE—and, in particular, the education-related metadata fields. This provides built-in “hooks” for the future, for creating educational modules with GEON resources, since the metadata fields would already be in place.

 

Information Integration Testbed (I2T)/Geogrid

Information mediation across heterogeneous spatial and survey data sources, and WSDL/SOAP web services integration. Demos link broken. No apparent data resources. Builds on MIX. Digital Government, SWB-related.

 

NARA, Data and Publications (Government site)

Persistent archiving.

SDSC site is extremely outdated, with broken links. NARA home is very comprehensive. Includes an education site, "The Digital Classroom". Lots of good resources.

Science Environment for Ecological Knowledge (SEEK)

The Science Environment for Ecological Knowledge (SEEK) is a five-year initiative designed to create cyberinfrastructure for ecological, environmental, and biodiversity research and to educate the ecological community about ecoinformatics. SEEK participants are building an integrated data grid (EcoGrid) for accessing a wide variety of ecological and biodiversity data, and analytical tools (Kepler) for efficiently utilizing these data stores to advance ecological and biodiversity science. An intelligent middleware system (SMS) will facilitate integration and synthesis of data and models within these systems.

Other projects include:

Sparrow
Growl

Additional resources:

SEEK has established an Ecological Metadata Language (EML).

Morpho: Morpho allows ecologists to create metadata (i.e., describe their data in a standardized format) and to create a catalog of data and metadata that they can query, edit, and view. It also provides the means to access network servers in order to query, view, and retrieve all relevant public ecological data. Hosted by the Knowledge Network for Biocomplexity. Sponsored by the National Center for Ecological Analysis and Synthesis, the Long Term Ecological Research Network, Texas Tech, and SDSC.

 

SRB

Distributed data management tool.

 
 

Projects

Project Descriptions

Knowledge and Information Discovery (http://scirad.sdsc.edu/datatech/skidl.html)

Projects

LOOKING - Laboratory for Ocean Observatory Knowledge INtegration Grid

Plone-based portal to federate ocean observatories into an integrated knowledge grid. No data.

Health Monitoring of Civil Infrastructure

Bridge sensornet technology, interactive Bridge Testbeds (account-based portal), structural engineering. Excellent interactive interface to gauge monitoring data, with graphs and tables. Very fast graphing.

Collaborative Lake Metabolism Project

Allows downloading tabular data in .csv format from one sensor station, but very slowly. Site also includes near-real-time data graphs (the past week's worth of data) of air temperature, water temperature, relative humidity, dissolved oxygen, wind speed, and precipitation for six instrumented buoys at multiple global locations. No tabular data option for these data.

ROADNet - Real-time Observatories, Applications, and Data management Network

ROADNet real-time data resources. Lots of data, very little coherence for the uninitiated. Mostly graphs. Lots of broken links or dead ends ("no data for this time range"). The time series data option takes the user to a waveform page with no explanation of the nature of the data. Presumably it is seismic. Units are either dimensionless or nm/sec.

Also lists collaborators' data pages:

Great vision. Mostly seismic sensor data currently, but with plans to broaden to all types of environmental monitoring sensors. Lists Kepler as a related project. Users include Watch the Water, San Diego Coastal Ocean Observing System, and the Geophysics Branch, Naval Air Warfare Center Weapons Division.

CLEANER - Collaborative Large-scale Engineering Analysis Network for Environmental Research

Nationwide effort headed by NACSE to monitor heavily human-impacted environments. No real data yet. Ilya Zaslavsky is SDSC's representative (see abstract).

NEON - National Ecological Observatory Network

Not exactly sure what NEON does. Lots of committees generating reports, some workshops, a smattering of outreach documents. No data, no option for the public to set up accounts. This is NSF's vision of NEON:

NEON is envisioned as “a continental scale research instrument consisting of geographically distributed infrastructure, networked via state-of-the-art communications. Cutting-edge lab and field instrumentation, site-based experimental infrastructure, natural history archive facilities and/or computational, analytical and modeling capabilities, linked via a computational network will comprise NEON. NEON will transform ecological research by enabling studies on major environmental challenges at regional to continental scales. Scientists and engineers will use NEON to conduct real-time ecological studies spanning all levels of biological organization and temporal and geographical scales. NSF disciplinary and multi-disciplinary programs will support NEON research projects and educational activities. Data from standard measurements made using NEON will be publicly available.” (NSF04549, 2004)

There is no indication on the NEON web site what this infrastructure is, nor how it is actually being used, if at all.

National Science Digital Library (NSDL) Text Categorization Testbed

Spatio-temporal Analysis of 911 Call Stream Data

Part of NSF's Digital Government Research Program. From the site: Our approach consists in doing spatiotemporal analysis to uncover meaningful patterns in 9-1-1 call data, correlating them to external State-wide and local events. Of particular interest is determining unusual spatiotemporal trends, or "signatures", in call data and associating them with external events such as wildfires and earthquakes. Finding such signatures will help provide advance warning to Public Safety Answering Points (PSAP's), fire responders, and other emergency service personnel. For this, we are using an array of advanced data management, data analysis, data mining, and statistical techniques.
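
The site does not say how its "signatures" are computed. Purely as an illustration of what an unusual spatiotemporal trend could look like in practice, the sketch below bins calls into grid cells and hours, then flags any bin whose count is far above that cell's counts in the other hours. The function name, the input layout (hour index, latitude, longitude), the 0.1-degree cell size, and the threshold are all assumptions made for this example, not details of the project.

```python
from collections import Counter
from statistics import mean, pstdev


def call_signatures(calls, cell_deg=0.1, z_threshold=3.0):
    """Flag (cell, hour) bins with unusually high call counts.

    calls: iterable of (hour, lat, lon); hour is an integer hour index.
    Returns a list of (cell, hour, count, z), highest z first.
    All names and defaults here are hypothetical.
    """
    calls = list(calls)
    if not calls:
        return []

    # Count calls per (grid cell, hour) bin.
    counts = Counter()
    for hour, lat, lon in calls:
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        counts[(cell, hour)] += 1

    first = min(h for h, _, _ in calls)
    last = max(h for h, _, _ in calls)
    hours = range(first, last + 1)
    cells = {cell for cell, _ in counts}

    flagged = []
    for cell in cells:
        # Hours with no calls count as zero (Counter returns 0 for missing keys).
        series = [counts[(cell, h)] for h in hours]
        if len(series) < 3:
            continue  # too little history to form a baseline
        for i, h in enumerate(hours):
            # Leave-one-out baseline: compare this hour to all other hours.
            others = series[:i] + series[i + 1:]
            base = mean(others)
            spread = max(pstdev(others), 1.0)  # floor avoids division by zero
            z = (series[i] - base) / spread
            if z >= z_threshold:
                flagged.append((cell, h, series[i], z))
    return sorted(flagged, key=lambda row: -row[3])
```

Calling `call_signatures(records)` on a list of such tuples returns the flagged bins, which could then be compared by hand against dates of known events such as wildfires. A real analysis would also need to account for daily and weekly call cycles, which this sketch ignores.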

I2G Web Services for Ecoinformatics (broken link)

PRAGMA Ecoinformatics

PRAGMA is an organization of Pacific Rim institutions dedicated to collaborative HPC. No real data; it mainly supports infrastructure.

PRAGMA 2005

Downloads for KID group

AGU 2004 Poster: Hyperspectral Landcover Classification Service

(PDF, 11MB) (PNG, 2MB)

SKIDLkit data mining toolkit

SKIDL Web Services Software Stack

I2G Web Services Infrastructure and Sensor-based Lake Monitoring and Analysis Poster (PDF, 4MB) (PNG, 2MB)

 

Knowledge-based Information Systems (http://kbis.sdsc.edu/)

Members of the KBIS Lab conduct research in Data & Knowledge Engineering to support the management and integration of scientific data. Scientific data management often requires a combination of traditional database technology and knowledge representation techniques. Main areas of research include: data modeling, knowledge representation, and query processing for model-based mediation (a.k.a. semantic mediation), scientific databases and workflows, and knowledge-based digital libraries and archives. On several of these, KBIS works closely together with colleagues from other DAKS labs at SDSC, from NCMIR and BIRN-CC, and from the CSE Department, all located at U.C. San Diego.

The KBIS lab is led by Bertram Ludäscher.

People in the Lab.

Projects

GEON

SEEK

BIRN-CC

SciDAC/SDM

ROADNet

NARA, NHPRC

Digital Libraries

 

Spatial Information Systems (http://daks.sdsc.edu/sp/index.html)

The Spatial Information Systems Lab conducts research and develops technologies and infrastructure that enable users to access, integrate and manage spatial information. Application domains range from Earth science, environmental and demographic analysis, to neuroscience and medical records. Working in collaboration with other DAKS Labs and UCSD programs (such as NCMIR, SIO, USP), we support spatial information processing and Web mapping in a variety of projects. Our main research focus is spatial and survey data integration, new online technologies such as WSDL/SOAP and web services, and XML-based web mapping. The Lab is led by Ilya Zaslavsky.

Projects

NPACI Neuroscience Thrust and BIRN-CC

SIOExplorer

Superfund Basic Research Program

I2T/Geogrid

Technologies

Spatial Analysis and GIS

Spatial Data Wrapping and Mediation

Web-based interactive mapping

XML-based vector graphics

Web services for spatial and survey data processing

 

Advanced Query Processing (http://daks.sdsc.edu/aqp/index.html)

The Advanced Query Processing Lab develops domain-specific data models and query processing techniques for a range of scientific problem areas. Research focuses on developing expressive data types for scientific data, semantic query evaluation techniques, and exploring the new paradigms for user-steerable query processing required for today's advanced scientific problems. The AQP Lab actively contributes to the semantic and spatial information integration research conducted by other labs in SDSC's DAKS program. The lab is led by Amarnath Gupta.

People In the Lab.

Projects

NPACI Neuroscience Thrust

BIRN

SCIDAL

QUBIC

NSF-ITR on Gene Regulatory Networks

I2T/Geogrid

Technologies

XML Based

Wrapper Engines

RDBMS Wrapper

Information Integration

KB Integration (With KBI Lab)

Spatial Data Mediation

Data Models

Geometric Data

Graph/Network Models

Scalar/Vector Field Models

Data Grids Technologies (http://daks.sdsc.edu/dg/index.html)

More Information on the Projects can be found at http://www.npaci.edu/DICE/SRB/

Advanced Database Projects (http://daks.sdsc.edu/adp/index.html)

The Advanced Database Projects Lab provides data services to advance science. The research focuses on making data available to researchers via traditional methods and APIs which allow simple storage and retrieval of data regardless of type, size, and physical location. ADPL provides infrastructure for data mining, data warehousing, and query processing. The lab is led by David L. Archbell.

Projects

GEON

BIRN

PDB

GAPP - MOCA

EOL

TeraGrid

UCSD Cancer Center

Civil ITR

KGI ITR

WIISARD

Geoinformatics (http://daks.sdsc.edu/gl/index.html)

The Geoinformatics Lab is led by Dogan Seber.

 

SDSC Visualization (http://vis.sdsc.edu/)

 

Sustainable Archives and Library Technologies (http://daks.sdsc.edu/salt/index.html)

The Sustainable Archives and Library Technologies lab is headed by Richard Marciano.


Other Data Sources

Network for Earthquake Engineering Simulation (NEES)

Includes NEESit, IT support from SDSC. Education is still under construction. No useful data currently. No apparent timeline for it. Most menu links are inactive, TBD. National network of institutions, some activity.

LTER Spatial Data WorkBench

Site has some problems with respect to available data. The "download data" option requires the SRB's client browser inQ, but when I download and run with the specified parameters there is no server available. Most major links in the left menu are broken (matrix.sdsc.edu). The "browse metadata" link displays a list of 24 sites, but only one is a link, the Sevilleta LTER site, with some imagery for land cover classification over multi-year periods, but sparse, scattered, usefulness unclear.

RCSB Protein Data Bank

Hosted and supported by the Research Collaboratory for Structural Bioinformatics. Very comprehensive site for molecular biology. Includes an education page.

National Virtual Observatory

Well-established already with educational resources.
Education and Outreach
The DataScope data interface is a bit slow and not designed with a teacher in mind.
Survey of K-12 teachers showed they want:

Lesson plans targeted to specific student age groups
More interactive lessons
Data to use in the classroom

The Visible Human Embryo

No actual data in the usual sense, but very well-designed interface for education.

The Encyclopedia of Life

Outstanding but somewhat advanced resource for biologists. No clear education component. Assumes advanced knowledge of the subject. Sequence data, graphs, images, and 3D models using SVG Viewer available.

The Biology Student WorkBench

Excellent educational resource in general for molecular biology, proteomics, and genomics education.

The National Center for Microscopy and Imaging Research

The Cell-Centered Database

 

 


Existing Scientific Research Portals

Cactus Portal (numerical astrophysics)

Children's Hospital Bioinformatics Portal

UCSD Telescience Portal (remote microscopy)

GEONGRID Portal

Chronos Portal

Biology WorkBench

Protein DataBank

 


More Projects
(Borrowed from the SDSC Projects page)

SDSC serves as a critical IT partner to large scale projects in life sciences, geosciences, engineering and other disciplines. SDSC researchers collaborate on projects ranging from cellular signaling to earthquake effects to preserving large and irreplaceable data. Listed below are SDSC projects and the cyberinfrastructure areas they involve.

Alliance for Cellular Signaling (AfCS) - The AfCS-Nature Signaling Gateway is a comprehensive and up-to-the-minute resource for anyone interested in signal transduction.

Biodi - Advanced computational approaches to environmental and biodiversity information.

The Biology Workbench and NEXT Generation Biology Workbench (Swami) - Web-based tools that allow biologists to search many popular protein and nucleic acid sequence databases.

Biomedical Informatics Research Network (BIRN) - An initiative that fosters distributed collaborations in biomedical science by utilizing information technology innovations.

BorderSafe Integrated Feasibility Experiment - Addresses the interagency data-sharing and analysis challenges brought to light in the post-9/11 era.

CHRONOS - Works with the Earth science community to develop a dynamic, interactive and time-calibrated network of databases and visualization and analytical methodologies for sedimentary geology and paleobiology.

Cooperative Association for Internet Data Analysis (CAIDA) - Provides tools and analyses promoting the engineering and maintenance of a robust, scalable global Internet infrastructure.

Cyberinfrastructure for Phylogenetic Research (CIPRes) - An open collaboration working to enable large-scale phylogenetic reconstructions on a scale that will enable analyses of huge datasets containing hundreds of thousands of biomolecular sequences.

Data and Knowledge Systems (DAKS) - Creates data and knowledge cyberinfrastructure for scalable, end-to-end data management and knowledge discovery pipelines in data-intensive scientific computing.

The Encyclopedia of Life (EOL) - A collaborative global project designed to catalog the complete proteome of every living species in a flexible reference system.

GAMESS Web Portal - A tool that allows users to access files, create batch scripts, and track jobs running on SDSC systems: all through a simple web interface.

GEOsciences Network (GEON) - Develops prototype cyberinfrastructure for the earth sciences, based on close collaboration among IT and earth science researchers.

High Performance Wireless Research and Education Network (HPWREN) - Creates, demonstrates, and evaluates a non-commercial, prototype, high-performance, wide-area, wireless network in San Diego, Riverside, and Imperial counties.

Kepler Workflow tool - Produces an open-source scientific workflow system that allows scientists to design scientific workflows and execute them efficiently using emerging Grid-based approaches to distributed computation.

LIPID Metabolites and Pathways Strategy (Lipid Maps) - A project to produce a detailed understanding of the structure and function of lipids – cellular fats and oils implicated in a wide range of diseases, including heart disease, stroke, cancer, diabetes and Alzheimer’s disease.

National Center for Microscopy and Imaging Research (NCMIR) - Develops state-of-the-art 3D imaging and analysis technologies to help biomedical researchers understand biological structure and function relationships in cells and tissues.

NEESit - A service-focused organization that operates and supports the extensive IT infrastructure for the Network for Earthquake Engineering Simulation (NEES) Collaboratory.

National Laboratory for Advanced Data Research (NLADR) - A collaborative research and development activity between the National Center for Supercomputing Applications (NCSA) and the San Diego Supercomputer Center (SDSC) in advanced data technologies.

National Laboratory for Applied Network Research (NLANR) - Provides technical, engineering, and traffic analysis support of National Science Foundation high performance connections sites and HPNSP (high-performance network service providers).

The Notebook Project - Provides users with rich interfaces to the next generation of online data and analytical services, coupled with the convenience of a personal local database.

OptIPuter - An envisioned infrastructure that will tightly couple computational resources over parallel optical networks using the IP communication mechanism.

Pacific Rim Applications and Grid Middleware Assembly (PRAGMA) - An open organization in which Pacific Rim institutions collaborate to develop grid-enabled applications and deploy infrastructure to allow data, computing, and other resource sharing.

Protein Data Bank (PDB) - The single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data.

Protein Kinase Resource (PKR) - A collaborative project of protein kinase researchers and computational biologists working to create a database integrating molecular and cellular information.

Resurgence - Provides a general workflow infrastructure for computational chemistry that allows high-throughput calculations distributed on a computational grid.

Rocks - An open-source solution for managing Linux-based clusters.

San Diego Network Access Point (SD-NAP) - A neutral network traffic exchange facility that is intended to provide a location for local data network service providers to exchange Internet traffic.

SDSC Storage Resource Broker (SRB) - Client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets.

Science Environment for Ecological Knowledge (SEEK) - A five-year initiative designed to create cyberinfrastructure for ecological, environmental, and biodiversity research and to educate the ecological community about ecoinformatics.

Seamounts - Designed to gather information on species found in seamount habitats, and to provide a freely-available online resource for accessing and downloading these data.

SIO Explorer - The Scripps Institution of Oceanography, SDSC, and the UCSD Libraries are collaborating to create a modern Oceanography Digital Library that will enable inquiry-driven learning for both scientists and "K through gray" users.

Southern California Earthquake Center (SCEC) - Gathers and integrates new information about earthquakes in Southern California into a comprehensive and predictive understanding of earthquake phenomena.

UCSD Superfund Basic Research Program - Implements modern scientific approaches to identify and characterize genomic stress responses elicited by waterborne pollutants found at Superfund sites.

The TeraGrid - A multi-year effort to build and deploy the world's largest, most comprehensive, distributed infrastructure for open scientific research.

Tools and Data Resources in Support of Structural Genomics - Systematic Protein Annotation and Modeling (SPAM) is a multi-institutional initiative to make better use of target sequences and structures.


Grid Development Group - Part of the Technology Research & Development division of SDSC whose mission is to develop grid middleware tools and technology.

Sciences Research and Development Division (SCIRAD) - The Sciences R&D Division at SDSC supports the cyberinfrastructure needs of all kinds of researchers across the disciplines of the natural sciences.

Visualization at SDSC - Focuses on data visualization and includes the integration of specialized display hardware from the gaming industry into the standard set of visualization tools and special programs that provide capabilities for rendering 3-D scenes, data mining and rendering textures.


 
