Document Image Representations for Automatic Thumbnail Creation

Dr. Kathrin Berkner

Ricoh Research Center
Menlo Park, CA

Tuesday, February 25th, 12:30 PM, ENS 637

berkner@congo.crc.ricoh.com


Abstract

More and more paper documents are stored in electronic form. In general, this involves scanning a document and archiving it in a file system or database. As the number of digitally available documents steadily increases, retrieval becomes an important problem. Besides searching by metadata such as creation time or file location, small low-resolution representations of the original images, so-called thumbnails, are often displayed to help the user visually find a desired document. A major problem with current thumbnails is that the image content becomes unrecognizable; for example, text becomes unreadable.

This talk presents a document representation that combines JPEG 2000-based image analysis with OCR-based text analysis to derive resolution-sensitive representations of documents. Using these representations, it is possible to create thumbnails that display recognizable content at every thumbnail size.

Biography

Kathrin Berkner received her PhD in mathematics from the University of Bremen, Germany, in 1996 and joined Ricoh Innovations in 1998 after being a postdoctoral researcher at Rice University. She is a Senior Research Scientist in the Color Image Processing group at the California Research Center of Ricoh Innovations. Her interests cover the theory and applications of various types of wavelet transforms, use of the JPEG 2000 standard for applications beyond compression, and new image representations.


A list of Wireless Networking and Communications Seminars is available from the ECE department Web pages under "Seminars". The Web address for the Wireless Networking and Communications Seminars is http://signal.ece.utexas.edu/seminars