Current Projects

I am working on the following projects, in collaboration with other professors at UTCS:

  • [Data+Distributed Systems]: A Scalable and Efficient Cloud Database over a CXL Pod (with Prof. Witchel)
  • [Data+Programming Languages]: An Optimization Framework for Data-Intensive User-Defined Programs (with Prof. Dillig)
  • [Data+Machine Learning]: A Unified Execution Engine for Efficient Large-Scale Multi-Modal Data Analysis

Past Projects

My research during my PhD and postdoc focused on supporting user-centered data analytics applications at scale by reshaping modern data analytics stacks. The major projects I worked on include:

  • FormS: a Python library that efficiently translates spreadsheet formulas to SQL queries
  • Smash: a string distance metric that captures acronyms, abbreviations, and typos together
  • Transactional Panorama: a conceptual framework for user perception in analytical visual interfaces
  • Taco: efficient and compact spreadsheet formula graphs
  • Modin: a scalable dataframe system
  • Lux: a visualization recommendation library that helps data scientists explore data easily in dataframe workflows
  • CrocodileDB: a new database architecture that exploits time slackness to enable new resource-efficient query execution (video)
  • ACC: a high-performance main-memory database that adaptively chooses and mixes multiple concurrency control protocols