31. COMPUTATIONAL DATABASE
Department: Computer Science & Engineering
Faculty Advisor(s):
Scott B. Baden
Primary Student
Name: Alden P King
Email: apking@ucsd.edu
Phone: 858-534-9916
Grad Year: 2012
Abstract
Numerical simulations of technologically important phenomena can generate large datasets, and extracting knowledge from these voluminous datasets is a technical challenge. Consider the common case where simulation data is stored at the points of a regularly spaced mesh in space and time. Most scientists use ad hoc methods for their analysis. Some application domains are able to use relational databases for their large datasets, but the relational model is not appropriate for many application classes. Analysis algorithms in scientific computing typically compute aggregation and stencil operations. While relational databases work well with aggregates, they are poorly suited to range queries, especially in more than one dimension. Scientific data is therefore usually stored in flat files with an easily calculated offset for each point. This essentially requires scientists to handle their own data serialization instead of relying on specialized (and optimized) software. We want to provide to scientific computing what the relational database has provided to businesses for so many years. We present a computational database that manages on-disk storage for user-defined data types and executes user-defined functions and queries over those types. Furthermore, we look at automating the optimization of the analysis algorithms for multiple compute targets, including both the host CPU and expansion cards such as GPUs.
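As a rough illustration of the flat-file offsets and the two operation classes mentioned in the abstract, the following Python sketch may help. It is not the system described here: the row-major layout, the 8-byte record size, and the names flat_offset, aggregate_sum, and stencil_5pt are all assumptions made for the example.

```python
def flat_offset(index, shape, record_size=8):
    """Byte offset of one mesh point in a flat file.

    Assumes a row-major layout (last axis varies fastest) and a fixed
    record size in bytes; both are illustrative assumptions.
    """
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    linear = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("mesh index out of range")
        linear = linear * n + i  # Horner-style accumulation of the linear index
    return linear * record_size


def aggregate_sum(grid):
    """Aggregation: reduce every point of a 2-D grid to a single value."""
    return sum(sum(row) for row in grid)


def stencil_5pt(grid, i, j):
    """Stencil: combine an interior point with its four nearest neighbours.

    This is the standard 5-point discrete Laplacian with unit spacing.
    """
    return (grid[i - 1][j] + grid[i + 1][j]
            + grid[i][j - 1] + grid[i][j + 1]
            - 4 * grid[i][j])


# Hypothetical 3x3 slice of a mesh holding f(i, j) = i^2 + j^2.
grid = [[i * i + j * j for j in range(3)] for i in range(3)]
offset = flat_offset((1, 2, 3), (4, 5, 6))  # point (t=1, y=2, x=3)
total = aggregate_sum(grid)
laplacian = stencil_5pt(grid, 1, 1)
```

The aggregate touches every point once in any order, whereas the stencil needs neighbours along each axis. In a flat file those neighbours sit at widely separated offsets, which is one reason multidimensional range access is awkward to express by hand.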