Collaborative Research: Support Efficient Similarity Searches for Multidimensional Non-ordered Discrete Data Spaces

Supported by the National Science Foundation Research Grants
                     and The University of Michigan Research Grants (OVPR and UMD)
 

Principal Investigator at UM-D

Dr. Qiang Zhu
Department of Computer and Information Science,
The University of Michigan, Dearborn, MI, USA
qzhu@umich.edu
(NSF Grant for UM-D: IIS-0414594)
 

Principal Investigator at MSU

Dr. Sakti Pramanik
Department of Computer Science and Engineering,
Michigan State University, East Lansing, MI, USA
pramanik@msu.edu
(NSF Grant for MSU: IIS-0414576)
 

Graduate Students

Changqing Chen
Dashiell Matthews Kolbe
Hyun-Jeong Seok
Alok Watve
Qiang Xue

Undergraduate Students

Chad Klochko
Rachel Radziszewski

High School Student

David Kong

Alumni

Gang Qian

Project Overview

This collaborative research project conducted jointly by the Michigan State University and the University of Michigan at Dearborn is to investigate the issues and techniques for supporting efficient similarity searches in multidimensional Non-ordered Discrete Data Spaces (NDDS). Similarity searches in NDDSs are becoming increasingly important for application areas such as genome sequence databases. Efficient similarity searches require robust indexing techniques. Unfortunately, existing indexing methods are either not suitable for an NDDS (e.g., the R*-tree) or too generic to provide good performance for an NDDS (e.g., metric trees). The main goal of this project is to study the fundamental properties of NDDSs and develop indexing methods exploiting these properties to support efficient similarity searches in NDDSs. We introduce a set of essential geometric concepts for an NDDS based on some similar concepts for a traditional (ordered) Continuous Data Spaces (CDS). Using them, we explore and compare promising data-partitioning (DP) based and space-partitioning (SP) based indexing techniques for NDDSs, including index tree structures, building strategies, search algorithms and performance models. We also study other related issues including supporting various types of queries, adopting different distance measures, indexing hybrid data spaces with mixed ordered and non-ordered dimensions, developing efficient bulk loading techniques, and utilizing effective compression schemes. This research will provide contemporary database indexing techniques to solve relevant issues in the emerging fields such as bioinformatics, biometrics and E-commerce.

Project References

Project Web Sites

  • Web site at MSU
  • Web site at UM-D
  • Other Related Links

  • Database Research Group at UM-D

  • Number of visitors to this page is: Counter Image since 05/01/2005