Effective indexing and searching with dimensionality reduction in high-dimensional space

Seungdo Jeong, Sang-Wook Kim, Byung Uk Choi

Research output: Contribution to journalArticle

2 Citations (Scopus)

Abstract

In multimedia information retrieval, multimedia data are represented as vectors in high-dimensional space. Performance of many indexing methods maintaining these vectors dramatically degrades with increasing dimensionality, which is known as the dimensionality curse. To resolve this dimensionality curse, dimensionality reduction methods which map vectors in high-dimensional space into those in low-dimensional space before the data are indexed have been proposed. This paper addresses indexing and searching issues for low-dimensional data vectors which are reduced by a dimensionality reduction method. This method maps high-dimensional vectors into low-dimensional ones by using their norms and approximated angles. However, in indexing such data using the R-tree, which is a traditional multi-dimensional index structure, there several problems appear. In this paper, we first identify the problems associated with indexing and searching of low-dimensional vectors and then propose an approach to solve these problems. We formally prove that the proposed approach does not incur false dismissal in searching. Finally, we verify the superiority of the proposed approach via extensive experiments with synthetic and real-life data sets.

Original languageEnglish
Pages (from-to)291-302
Number of pages12
JournalComputer Systems Science and Engineering
Volume31
Issue number4
StatePublished - 2016 Jul 1

Fingerprint

Dimensionality Reduction
Indexing
High-dimensional
Dimensionality
Reduction Method
Multimedia
R-tree
Information retrieval
Information Retrieval
Resolve
Verify
Norm
Angle
Experiment
Experiments

Keywords

  • Dimensionality reduction
  • High-dimensional indexing
  • Multimedia information retrieval
  • Query processing

Cite this

@article{661b5dec0f774d4e960e77754ea1ea19,
title = "Effective indexing and searching with dimensionality reduction in high-dimensional space",
abstract = "In multimedia information retrieval, multimedia data are represented as vectors in high-dimensional space. Performance of many indexing methods maintaining these vectors dramatically degrades with increasing dimensionality, which is known as the dimensionality curse. To resolve this dimensionality curse, dimensionality reduction methods which map vectors in high-dimensional space into those in low-dimensional space before the data are indexed have been proposed. This paper addresses indexing and searching issues for low-dimensional data vectors which are reduced by a dimensionality reduction method. This method maps high-dimensional vectors into low-dimensional ones by using their norms and approximated angles. However, in indexing such data using the R∗-tree, which is a traditional multi-dimensional index structure, there several problems appear. In this paper, we first identify the problems associated with indexing and searching of low-dimensional vectors and then propose an approach to solve these problems. We formally prove that the proposed approach does not incur false dismissal in searching. Finally, we verify the superiority of the proposed approach via extensive experiments with synthetic and real-life data sets.",
keywords = "Dimensionality reduction, High-dimensional indexing, Multimedia information retrieval, Query processing",
author = "Seungdo Jeong and Sang-Wook Kim and Choi, {Byung Uk}",
year = "2016",
month = "7",
day = "1",
language = "English",
volume = "31",
pages = "291--302",
journal = "Computer Systems Science and Engineering",
issn = "0267-6192",
number = "4",

}

Effective indexing and searching with dimensionality reduction in high-dimensional space. / Jeong, Seungdo; Kim, Sang-Wook; Choi, Byung Uk.

In: Computer Systems Science and Engineering, Vol. 31, No. 4, 01.07.2016, p. 291-302.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Effective indexing and searching with dimensionality reduction in high-dimensional space

AU - Jeong, Seungdo

AU - Kim, Sang-Wook

AU - Choi, Byung Uk

PY - 2016/7/1

Y1 - 2016/7/1

N2 - In multimedia information retrieval, multimedia data are represented as vectors in high-dimensional space. Performance of many indexing methods maintaining these vectors dramatically degrades with increasing dimensionality, which is known as the dimensionality curse. To resolve this dimensionality curse, dimensionality reduction methods which map vectors in high-dimensional space into those in low-dimensional space before the data are indexed have been proposed. This paper addresses indexing and searching issues for low-dimensional data vectors which are reduced by a dimensionality reduction method. This method maps high-dimensional vectors into low-dimensional ones by using their norms and approximated angles. However, in indexing such data using the R∗-tree, which is a traditional multi-dimensional index structure, there several problems appear. In this paper, we first identify the problems associated with indexing and searching of low-dimensional vectors and then propose an approach to solve these problems. We formally prove that the proposed approach does not incur false dismissal in searching. Finally, we verify the superiority of the proposed approach via extensive experiments with synthetic and real-life data sets.

AB - In multimedia information retrieval, multimedia data are represented as vectors in high-dimensional space. Performance of many indexing methods maintaining these vectors dramatically degrades with increasing dimensionality, which is known as the dimensionality curse. To resolve this dimensionality curse, dimensionality reduction methods which map vectors in high-dimensional space into those in low-dimensional space before the data are indexed have been proposed. This paper addresses indexing and searching issues for low-dimensional data vectors which are reduced by a dimensionality reduction method. This method maps high-dimensional vectors into low-dimensional ones by using their norms and approximated angles. However, in indexing such data using the R∗-tree, which is a traditional multi-dimensional index structure, there several problems appear. In this paper, we first identify the problems associated with indexing and searching of low-dimensional vectors and then propose an approach to solve these problems. We formally prove that the proposed approach does not incur false dismissal in searching. Finally, we verify the superiority of the proposed approach via extensive experiments with synthetic and real-life data sets.

KW - Dimensionality reduction

KW - High-dimensional indexing

KW - Multimedia information retrieval

KW - Query processing

UR - http://www.scopus.com/inward/record.url?scp=84991688392&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84991688392

VL - 31

SP - 291

EP - 302

JO - Computer Systems Science and Engineering

JF - Computer Systems Science and Engineering

SN - 0267-6192

IS - 4

ER -