Photo Colourbox

Algorithms and Data Structures for Grammar-Compressed Strings

PhD Defence by Patrick Hagge Cording, Tuesday 3 February at 13:00, Technical University of Denmark, Building 101A, Room 01 – 1. floor

Abstract

Textual databases for e.g. biological or web-data are growing rapidly, and it is often only feasible to store the data in compressed form. However, compressing the data comes at a price. Traditional algorithms for e.g. pattern matching requires all data to be decompressed - a computationally demanding task. In this thesis we design data structures for accessing and searching compressed data efficiently.
Our results can be divided into two categories. In the first category we study problems related to pattern matching. In particular, we present new algorithms for counting and comparing substrings, and a new algorithm for finding all occurrences of a pattern in which we may insert gaps. In the other category we deal with accessing and decompressing parts of the compressed string. We show how to quickly access a single character of the compressed string, and present a data structure that supports fast decompression of substrings from prespecified positions.

Supervisors:
Associate Professors Philip Bille and Inge Li Gørtz, DTU Compute

Examiners:

  • Professor Paul August Fischer, DTU Compute 
  • Professor Roberto Grossi, University of Pisa 
  • Research Professor Gadi M. Landau, Haifa University

Chairman of the defence:
Associate Professor Anders Bjorholm Dahl, DTU Compute

All are most welcome.

Tidspunkt

tir 03 feb 15
13:00

Arrangør

DTU Compute

Kontaktperson

Hvor

Technical University of Denmark, Building 101A, Room 01 – 1. floor