Implementing Image Processing Algorithms on FPGAs

Instructor: Donald Bailey, Massey University

Motivation

The application of FPGAs to image processing is a rapidly growing research area, given recent increases in the power and size of FPGAs. The potential speed gains make it an attractive topic, although there are many challenges in implementing working algorithms on FPGAs. Most newcomers consider simply porting an existing software algorithm to an FPGA implementation. Unfortunately, this generally gives disappointing results. This tutorial aims to guide those wishing to use FPGAs to accelerate image processing algorithms past some of the pitfalls, and to provide a range of techniques that result in an implementation that is efficient both computationally and in terms of resource requirements.

Brief description

FPGAs are increasingly being used as an implementation platform for real-time image processing applications because their structure is able to exploit spatial and temporal parallelism. There are typically four stages involved in designing an FPGA-based image processing system: problem specification, algorithm development, architecture selection, and system implementation. For each stage, the key challenges are identified, and the differences between developing software-based and hardware-based solutions are highlighted. Simply porting an algorithm onto an FPGA often gives disappointing results, because most image processing algorithms have been optimised for a serial processor. It is therefore necessary to transform the algorithm so that it efficiently exploits its inherent parallelism. The process is illustrated with examples and case studies.

Objectives

To show how an image processing algorithm may be modified for efficient implementation on an FPGA. As a result of attending the tutorial, delegates will:

Content

The tutorial is targeted at those who want an introduction to the advantages, and also some of the problems, of mapping image processing algorithms onto FPGAs. The examples will focus mainly on low- to intermediate-level image processing operations and applications.

The tutorial will discuss parallelism in the context of image processing, and how it may be exploited to accelerate image processing algorithms. An overview will be given of the architecture of modern FPGAs, and how this may be exploited for implementing image processing systems.

The image processing design process will be described in detail, and the differences between software-based and hardware-based design highlighted. This includes a discussion of the algorithm development process, and of the structure of typical image processing algorithms and its implications for FPGA-based implementation. Architecture selection is considered at both the system level and the computational level. While most low-level operations can be mapped efficiently to hardware, many intermediate- and high-level operations use hardware inefficiently, so the algorithm must be partitioned appropriately between hardware and software. Techniques for overcoming timing, bandwidth, and resource constraints will also be described.

The reasons why simply porting software-based algorithms often gives disappointing results are discussed. The solution requires the algorithm to be transformed to make effective use of parallelism. The transformation process requires analysing the algorithm to identify inherent parallelism, selecting and designing an appropriate parallel computation architecture, and mapping the algorithm onto that architecture.
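As a rough illustration of this kind of transformation, the following Python sketch contrasts two formulations of a 3x3 box filter. It is not taken from the tutorial material. The filter, the function names, and the zero-valued border handling are all assumptions chosen for the example. The first version reads nine pixels from a frame buffer for every output, as typical serial software does. The second models a streamed architecture in which each pixel arrives once in raster order, and two row buffers plus a 3x3 window of registers supply the neighbourhood.

```python
def box3x3_random_access(img):
    """Software style: every output pixel re-reads nine pixels from memory."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out


def box3x3_streamed(img):
    """Hardware style: one pixel in per step, neighbourhood held locally."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    # row_buf[0] holds the previous row, row_buf[1] the row before that.
    row_buf = [[0] * w, [0] * w]
    # window[r][c]: r = 0 is the oldest row, c = 2 is the newest column.
    window = [[0] * 3 for _ in range(3)]
    for y in range(h):
        for x in range(w):
            pixel = img[y][x]  # the only access to the input image
            # New window column: rows y-2, y-1 and y at column x.
            column = [row_buf[1][x], row_buf[0][x], pixel]
            for r in range(3):
                # Shift the window one column left and append the new column.
                window[r][0], window[r][1], window[r][2] = (
                    window[r][1], window[r][2], column[r])
            # Age the row buffers at this column position.
            row_buf[1][x] = row_buf[0][x]
            row_buf[0][x] = pixel
            # The window is fully valid once three rows and columns are seen;
            # its centre lags the incoming pixel by one row and one column.
            if y >= 2 and x >= 2:
                out[y - 1][x - 1] = sum(sum(row) for row in window)
    return out
```

Both functions produce the same result, but the streamed version touches each input pixel exactly once, which is the property that lets a hardware implementation process one pixel per clock cycle without a frame buffer. In real hardware the window shift, row buffer update, and summation would all occur in parallel rather than in sequence as modelled here.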

These processes will be illustrated with practical examples and case studies.