Data Science Solutions

SDSC offers complete data science solutions across a breadth of specialties through training, service contracts, and joint research collaborations.

Big Data Benchmarking

SDSC keeps close tabs on the latest advances in compute hardware, memory, storage, and networking, along with the latest techniques to manage data and computation. We benchmark systems to look deeply into how they work, where they run into bottlenecks, and how to improve their performance.

Data Distribution Platforms

In response to dramatic growth in the acquisition of data critical to scientific progress and knowledge, SDSC's Advanced Cyberinfrastructure Development (ACID) Lab leverages Big Data technologies and platforms to provide efficient access to invaluable data, along with High-Performance Computing systems that meet the processing demands of generating key insights and promoting scientific breakthroughs.

Data & Information Virtualization

Data integration can be challenging enough, but responding to the complexities of integration across sources hosted outside of one’s environment can be daunting. SDSC experts are versed in best-of-breed techniques for integrating virtual data sources along with local data sets. Many of our data scientists are domain experts who can suggest additional datasets for increased data value.

Data Integration

SDSC excels at integrating data, especially big or “messy” data. We understand that data arrives in many states, from very trusted to unmonitored crowdsourced data. Some data is generated for different uses and may not have factored in the metadata needs of your current project. Researchers may have images, text and streaming data to integrate. SDSC experts specialize in all aspects of data integration, including ontology building.

Data Modeling, Design & Optimization

Whether one is new to managing data or has a current project that could benefit from optimization and better design, SDSC experts are eager to help. We specialize in building effective, flexible data schemas and architectures based on modeling with test data. We can generate test data, especially for projects where testing with live data is suboptimal, e.g. data governed by compliance regimes. SDSC can also monitor current projects over time and suggest incremental improvements for continuous optimization.

Graph Analytics

Graph Analytics is a rapidly developing area of research in which a combination of graph-theoretic, statistical, and database techniques is applied to model, store, retrieve, and perform analyses on graph-structured data. These techniques enable researchers to understand the structure of a network and how it changes under different conditions, find paths between pairs of entities that satisfy different constraints, identify clusters or closely interacting subgroups inside a graph, or find subgraphs that are similar to a given pattern.
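
As a small illustration of two of the tasks named above, the following Python sketch finds a fewest-edges path between two entities and groups nodes into connected components, which serve here only as a very simple stand-in for cluster detection. The graph, the node names, and the function names are hypothetical and chosen for illustration; this is not a description of SDSC's tooling, and real analyses would typically rely on a dedicated graph library or database.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search over an adjacency dict.

    Returns a path with the fewest edges as a list of nodes,
    or None if target is unreachable from source.
    """
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

def connected_components(graph):
    """Group the nodes of an undirected graph into connected components."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components

# Hypothetical undirected graph: every edge is listed in both directions.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
    "x": ["y"],
    "y": ["x"],
}

if __name__ == "__main__":
    print(shortest_path(GRAPH, "a", "d"))
    print(connected_components(GRAPH))
```

Constraints on a path (for example, allowed edge types or a maximum length) could be added by filtering neighbors inside the search loop; similarity search over subgraphs needs considerably more machinery and is outside the scope of this sketch.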

Research Data Services

Leveraging extensive experience built over years of collaborating with data scientists to create scalable data science platforms, SDSC Research Data Services offers a complete suite of data science services, from infrastructure hosting and complex data storage, to FAIR (Findable, Accessible, Interoperable, Reusable) data consulting, all of which enables data scientists to focus on their science.

Spatial Data

SDSC's Spatial Information Systems Lab conducts research and develops technologies and infrastructure that enable users to access, integrate, and manage spatial information. Application domains range from hydrology and environmental sciences to neuroscience.

Statistics, Machine Learning, & Predictive Analytics

Gathering data is easy. In fact, it’s so easy that it’s exceeding our capacity to validate, analyze, visualize, store, and curate it. Yet many of our critical scientific problems can only be solved by harnessing this data. SDSC's Predictive Analytics Center of Excellence (PACE) nurtures a rich collaborative learning environment to cultivate a national community of data scientists that embodies innovation through diversity of thought. Predicting future trends and behaviors – from the epic to the everyday – allows for proactive, knowledge-driven decisions.

Time Series & Streaming Data

SDSC data science experts specialize in storing, integrating, and analyzing all data types, including time series and streaming data. SDSC has aided researchers in areas such as eHealth, climate sensing, and Smart Cities. These data sources pose additional challenges because of their large volumes, and SDSC excels at computing at scale.