DSL Sponsored Research



  • Computational Genomics
      Sponsor: Dept. of Biotechnology, Govt. of India

      Investigators: M. Bansal (MBU), S. Visweswara (MBU), N. Balakrishnan (SERC), J. Haritsa

      Summary: The goal of this project is to gain insights into biological processes through computational techniques. Firstly, the vast amount of genomics data will be analyzed with a variety of tools for establishing the sequence-structure-function relationships in proteins and DNA. Secondly, new mathematical and computational tools will be developed to gain new insights from the genomics data. Thirdly, the results of the investigations will be stored in custom-designed databases that can be accessed by scientists through public domain platforms. Finally, system architectures that are specifically tuned for computational genomics applications will be developed.

      Status: Completed

      Duration: June 2002 - May 2007

  • Database Engines for Sequences
      Sponsor: Dept. of Science and Technology, Govt. of India

      Investigator: J. Haritsa

      Summary: Biological sequence data is not well supported by current database technology, especially with regard to index structures, data storage, query optimization, query operators and preservation of privacy. Due to these shortcomings, biologists are forced to store their sequences in flat files, resulting in long running times for their queries. This will only worsen in the future, given the exponential increase in sequence data size and in the number of queries. In this project, we aim to tackle these shortcomings and design a sequence-friendly database engine. Overall, the goal is to ensure that the standard biological queries of today, which typically take a couple of hours to process, can instead be handled in a much shorter time. This will require addressing the problem in a holistic manner across all components of the database system and will involve algorithmic, structural, architectural and mathematical innovations.

      Status: Completed

      Duration: January 2003 - December 2007

  • OSHADHI: Design and Implementation of a Bio-diversity Database Management System
      Sponsor: Dept. of Biotechnology, Govt. of India

      Investigators: J. Haritsa, M. Gadgil (CES), V. Nanjundiah (CES)

      Summary: Conserving the biodiversity of India's large number of plant species has become essential, given the rapid growth in the number of plant species that strengthen the country's genetic base through their economic importance. Several organizations undertaking conservation measures, such as identifying species and monitoring climatic conditions, are finding it necessary to have efficient and natural access to a variety of biodiversity data. The goal of the OSHADHI project is to adapt state-of-the-art database technology to the biodiversity domain and develop a comprehensive database management system for the biodiversity community. The technology inputs that will be used include object-oriented modeling, hierarchical access methods, spatial access techniques, extensible database systems, and client-server architectures.

      Status: Completed

      Duration: September 1998 - March 2001

  • Design and Analysis of Database Mining Algorithms
      Sponsor: HITACHI Ltd., Japan

      Investigator: J. Haritsa

      Summary: The problem addressed by this project is to design and analyze sampling-based algorithms for database mining. In particular, we wish to clarify, using statistical methods, the relationship between knowledge extracted from full-scale data and that extracted from sampled data. The advantage of sampling is that inferences about an entire population can be made based on characteristics exhibited by a representative subset of the population. This is achieved, however, at some cost in the accuracy of the results. We will attempt to quantify the performance versus accuracy tradeoff explicitly, and thereby determine the sample sizes needed to achieve the level of accuracy desired by the user. We will also investigate the possibility of deriving theoretical bounds on the accuracy of the sampling algorithms. Apart from theoretical results, we will also attempt to develop incremental data mining algorithms, wherein sampling is used as a first-cut technique for narrowing down the search space, and fine-grained techniques are then used to evaluate the reduced search space over larger data sets.

      Status: Completed

      Duration: December 1997 - May 1998
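
      The tradeoff between sample size and accuracy described in the summary above can be illustrated with a small sketch. It is not the project's algorithm: it assumes the quantity being mined is the support (fraction of transactions containing an itemset), that transactions are sampled uniformly at random with replacement, and it uses the standard Hoeffding bound. The function names sample_size and estimate_support are hypothetical.

```python
import math
import random


def sample_size(epsilon: float, delta: float) -> int:
    """Smallest n satisfying the Hoeffding bound n >= ln(2/delta) / (2 * epsilon^2).

    With n uniform random samples, the estimated support differs from the
    true support by more than epsilon with probability at most delta.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def estimate_support(transactions, itemset, n, seed=0):
    """Estimate the support of `itemset` from n transactions drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    items = frozenset(itemset)
    hits = 0
    for _ in range(n):
        transaction = rng.choice(transactions)
        if items <= set(transaction):  # itemset is contained in the transaction
            hits += 1
    return hits / n
```

      Under these assumptions, estimating a support value to within 0.01 with 95% confidence (epsilon = 0.01, delta = 0.05) requires 18445 sampled transactions, independent of the size of the full database.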

  • MINTO: A Software Tool for Mining Manufacturing Databases
      Sponsor: Dept. of Science and Technology, Govt. of India

      Investigator: J. Haritsa

      Summary: The goal of database mining is to discover information in an organization's historical databases that can be used to improve its business decisions. Developing efficient algorithms for mining has become an active area of research in the database community in the last few years. However, all the current research prototypes are aimed at commercial retail data, not manufacturing data, which is far more complex, larger in size and different in nature. The problem addressed by this project is to develop a database mining package for manufacturing data. This mining software tool, called MINTO, will be customized for manufacturing databases. In particular, we wish to include constructs that support both tabular and complex data; devise sampling techniques that allow for pattern generation without scanning the entire database, and quantify the error introduced by such sampling; and devise parallel mining algorithms that speed up the analysis process. Finally, we wish to develop a graphical user interface that facilitates usage of the tool.

      Status: Completed

      Duration: January 1997 - March 1999

  • MIDAS: A Database Design for Flexible Manufacturing Systems
      Sponsor: Dept. of Science and Technology, Govt. of India

      Investigators: J. Haritsa and V. Rajaraman

      Summary: Flexible manufacturing systems (FMS) cater to recent manufacturing trends such as continuous variability in product mix, frequent design changes and just-in-time inventory control. In order to achieve the required degree of flexibility, FMS need real-time access to information about the plant organization and operation. The MIDAS project aims to develop an object-oriented database system for use in flexible manufacturing environments. The system will support complex objects, active mechanisms, decision support, and embedded control.

      Status: Completed

      Duration: February 1995 - January 1997

  • A Distributed OO Simulation Testbed for Manufacturing Systems
      Sponsor: Dept. of Science and Technology, Govt. of India

      Investigators: M. Jacob and J. Haritsa

      Summary: Simulation has become the primary method of studying modern manufacturing systems involving computer control. It is increasingly used not only when designing a new manufacturing system, but also in real-time or operational environments for decision support. Traditional simulation languages have been used for this purpose, but they do not provide the flexibility required for detailed simulation of complex systems. Object-oriented simulation environments are viewed as a good solution to this problem. Our aim in this project, therefore, is to develop an object-oriented simulation testbed for manufacturing systems on a distributed computing platform.

      Status: Completed

      Duration: December 1994 - November 1996

  • DIAS: An Object-Oriented Database for Interconnect Analysis
      Sponsor: Texas Instruments India Pvt. Ltd.

      Investigator: J. Haritsa

      Summary: Due to drastic reductions in feature sizes of VLSI chips, device interconnect parasitics have begun to have a significant adverse impact on chip performance. Therefore, the electrical characteristics of these parasitics have to be taken into account in the design of IC chips. This involves processing data that is both large in size and complex in nature, calling for effective data management. The DIAS project aims to develop an object-oriented database system for addressing the parasitic data management problem.

      Status: Completed

      Duration: July 1995 - December 1996