HYDRA Database Regenerator



Database Systems Lab
Indian Institute of Science




About HYDRA
Welcome to the HYDRA Database Regenerator software developed at the Database Systems Lab, Indian Institute of Science. HYDRA is an easy-to-use graphical tool for the automated regeneration of client data processing environments at vendor sites. It is written entirely in Java and is operational on the PostgreSQL database engine.

A core requirement of database engine testing is the ability to create synthetic versions of the customer's data warehouse at the vendor site. HYDRA is a workload-dependent database regenerator that takes a declarative approach to data regeneration to ensure volumetric similarity, a crucial aspect of statistical fidelity. First, it uses an optimized linear programming (LP) formulation based on a novel region-partitioning approach. This spatial strategy drastically reduces the LP complexity, enabling HYDRA to handle query workloads on which contemporary techniques fail. Second, HYDRA incorporates deterministic post-LP processing algorithms that provide high efficiency and improved accuracy. Third, HYDRA introduces the concept of dynamic regeneration by constructing a minuscule database summary that can regenerate databases of arbitrary size on the fly during query execution, while obeying volumetric specifications derived from the query workload. Finally, a detailed experimental evaluation on standard OLAP benchmarks demonstrates that HYDRA can efficiently and dynamically regenerate large warehouses that accurately mimic the desired statistical characteristics.
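
To make the region-partitioning and summary ideas more concrete, here is a minimal one-dimensional sketch in Python. It is an illustration only and not HYDRA's actual algorithm or code: the function names (`partition_regions`, `constraint_rows`, `generate`), the toy workload, and the hand-picked feasible counts are all assumptions, and the LP solving step itself is left out. Each range predicate is assumed to carry a target row count (its volumetric constraint); the domain is cut at every predicate boundary so that one LP variable per region suffices, and tuples are then produced lazily from a small (region, count) summary.

```python
def partition_regions(domain, predicates):
    """Cut the half-open domain [lo, hi) at every predicate boundary."""
    lo, hi = domain
    cuts = {lo, hi}
    for p_lo, p_hi in predicates:
        cuts.update((max(lo, p_lo), min(hi, p_hi)))
    points = sorted(cuts)
    return list(zip(points, points[1:]))


def constraint_rows(regions, predicates):
    """One 0/1 row per predicate: 1 where the region lies fully inside it."""
    return [
        [1 if p_lo <= r_lo and r_hi <= p_hi else 0 for r_lo, r_hi in regions]
        for p_lo, p_hi in predicates
    ]


def generate(summary):
    """Lazily yield attribute values from (region, count) pairs."""
    for (r_lo, _r_hi), count in summary:
        for _ in range(count):
            # Every value in a region satisfies the same set of predicates,
            # so the region's lower bound is used as a representative.
            yield r_lo


# Hypothetical workload: two range predicates [lo, hi) on one attribute,
# each with the number of rows it should select.
domain = (0, 100)
predicates = [(10, 50), (30, 80)]
cardinalities = [400, 500]

regions = partition_regions(domain, predicates)
rows = constraint_rows(regions, predicates)

# Assumed LP solution (picked by hand, not computed): one count per region.
counts = [100, 150, 250, 250, 250]
summary = list(zip(regions, counts))

# Check the summary against the LP equalities, then against generated data.
for row, target in zip(rows, cardinalities):
    assert sum(a * x for a, x in zip(row, counts)) == target
for (p_lo, p_hi), target in zip(predicates, cardinalities):
    assert sum(1 for v in generate(summary) if p_lo <= v < p_hi) == target
```

In this toy setting the summary holds five (region, count) pairs regardless of how many rows are generated, which is the property that lets a database of any size be produced on demand rather than stored.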

Download
Hydra
Technical Report pdf

Poster pdf

Hydra Codebase: zip Readme: text
PiGen
Technical Report pdf

PiGen Codebase: zip Readme: text

Publications
Scalable and Dynamic Regeneration of Big Data Volumes
Anupam Sanghi, Raghav Sood, Jayant Haritsa, and Srikanta Tirthapura
Proc. of 21st International Conference on Extending Database Technology (EDBT), Vienna, Austria, March 2018

HYDRA: A Dynamic Big Data Regenerator (demo)
Anupam Sanghi, Raghav Sood, Dharmendra Singh, Jayant Haritsa, and Srikanta Tirthapura
Proc. of 44th International Conference on Very Large Data Bases (VLDB), Rio de Janeiro, Brazil, August 2018
published as
PVLDB Journal, 11(12), August 2018, pgs. 1974-1977

Towards Generating HiFi Databases
Anupam Sanghi, Rajkumar S., and Jayant Haritsa
Proc. of 26th International Conference on Database Systems for Advanced Applications (DASFAA), Taipei, Taiwan, April 2021

Projection Compliant Database Generation
Anupam Sanghi, Shadab Ahmed, and Jayant Haritsa
Proc. of 48th International Conference on Very Large Data Bases (VLDB), Sydney, Australia, September 2022
published as
PVLDB Journal, 15(5), January 2022

Videos
Demo: Video

Contact
Email: haritsa [AT] iisc [dot] ac [dot] in

Primary Hydra Contributors (in chronological order of participation)

  • Jayant Haritsa (Project Lead)
  • Anupam Sanghi (PhD, CSA, IISc)
  • Raghav Sood (ME, CSA, IISc)
  • Dharmendra Singh (M. Tech, CSA, IISc)
  • Rajkumar S. (M. Tech Research, CSA, IISc)
  • Shadab Ahmed (M. Tech, CSA, IISc)
  • Tarun K. Patel (M. Tech, CSA, IISc)
  • Subhodeep Maji (M. Tech, CDS, IISc)
  • Prashik K. Rawale (M. Tech, CSA, IISc)
  • Manish Jayswal (M. Tech, CDS, IISc)