Institute for Physical Artificial Intelligence
Rosen Center for Advanced Computing
Purdue AI-Ready Research Datasets
Birck Nanotechnology Center

Experimental results from metrology, thermal processing, deposition, lithography, and etching equipment. Data is being used to create digital twins and optimize processing parameters.

Zhihong Chen

Digital Farm

Field pictures and measurements of crops. AI models provide reconstructed 3D images and yield predictions.

Ignacio Ciampitti

Knowledge from Documents and Courses

Knowledge maps from documents shared by experts, used to guide exploration of related concepts, find associated resources, and more.

Daniel Mejia

nanoHUB Results Database

FAIR repository with results from online simulations on nanoHUB. 1.6+ million entries from 700+ tools contributed by experts.

Daniel Mejia

Exploring New Materials

AI-driven calculations of elastic constants of carbides for applications at extreme conditions.

Alejandro Strachan

Computing Cluster Usage and Failure

Publicly available operational data from High Performance Computing (HPC) systems enables research in critical areas like system dependability and resource optimization.

Saurabh Bagchi

Protein Shapes with 3D Zernike Descriptors

Protein structure analysis using 3D Zernike descriptors for comparative modeling and bioinformatics.

Daisuke Kihara

AgMIP Data Aggregator Tool

By combining results from a dozen different crop models, the GGCMI group provides robust, fine-scale estimates of climate impacts worldwide, using CMIP6 climate models.

Carol Song

Common Hosted Datasets

Centralized repositories of commonly used datasets for geosciences, life sciences, and climate modeling. Includes AI training datasets hosted on Purdue HPC, such as crawl data, road and traffic imagery, and indoor-space data.

Preston Smith

Emerging Manufacturing Collaboration Center (EMC2) Dataset

The EMC2 dataset contains real-time operating data collected from the building energy system of the Emerging Manufacturing Collaboration Center (EMC2), an industrial facility. It integrates high-resolution local operating parameters, weather data from an adjacent meteorological station, and grid information from relevant utility sources. Together, these multi-source data streams provide a comprehensive foundation for energy analytics.

Ming Qu

Bindley Center Flow Cytometry Data

Data from flow cytometry instruments enabling cellular analysis for life sciences research.

Bartek Rajwa