Remember the good ol’ days? (Oh no, I can feel a history lesson coming on.) One of the key aspects of a device’s suitability for VLSI manufacture was its regularity and replicability. To a large degree this principle still holds true today. That is why large memories are always the testbed for the next-generation IC technology. This month, we’ve been involved in the design of a chip that holds true to that original VLSI idiom, and we present it here for your delectation. Interestingly enough, it’s a kind of memory chip.
As part of an image-recognition chip-set, the Image Processing Cache Register Array (IPCRA) buffers the data flow between the read phase from each of the system framestores and the receipt of that data by one of the programmable image processor modules (pIP). In many image processing (IP) algorithms, the update value for a single pixel depends on the values of a large number of framestored pixels, all of which must be read from the framestore before the algorithm can execute. The cache between the framestore and the image processor is thus a read cache, buffering read pixels in an image processing-generic manner. The success of this approach relies on the fact that a lot of IP algorithms are window based. In a system modelling context, the pIPs and the IPCRAs are virtual devices. The final pIP devices will not necessarily be application-specific processors; they could be function-dedicated ICs. For system development, a VHDL simulation is used to prove the performance of the algorithms being executed by the pIPs.
Figure 1. Simplified system diagram showing the data flow and the architecture of the storage support devices (FrameStore and IPCRA) around the pIP chips.
The requirements of each pIP (as far as the IPCRAs are concerned) can be different. In the final system, this will likely result in a different ASIC for each IPCRA instance. To prove the algorithms used in the system simulation, the IPCRA model is parameterisable so that it can handle the different window sizes used by the different pIPs; the simulations must show performance differences between multiple algorithms running simultaneously. Put simply, in terms of VHDL model development, the (virtual) IPCRA instances differ only in their storage size; the data rate in and out of the framestores is intended to be fixed across the system. Conceptually, the IPCRA chips adapt the data rate in and out of the pIPs to allow a common framestore design.
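The entity declaration for the model is not reproduced on this page, so the following is only a sketch of what a parameterisable interface of this kind could look like. The names ipcra_sketch, ipcra_sketch_types, pixel_type, pixel_vector and window_type are assumptions made for illustration; the real model takes its pixel types from the vfp.bus_class package, which is not shown here. The generics image_width and window, and the ports clock, pixel_in and window_pixel, are the names the architecture listing relies on.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical support package (stands in for vfp.bus_class, which is not shown).
package ipcra_sketch_types is
  subtype pixel_type is std_ulogic_vector(7 downto 0);          -- one 8-bit pixel
  type pixel_vector is array (natural range <>) of pixel_type;  -- a row of pixels
  type window_type is record
    width : positive;  -- window size along an image line
    depth : positive;  -- number of image lines spanned by the window
  end record;
end package ipcra_sketch_types;

library ieee;
use ieee.std_logic_1164.all;
use work.ipcra_sketch_types.all;

-- Hypothetical entity: only the generics change from one IPCRA instance to the next.
entity ipcra_sketch is
  generic (
    image_width : positive    := 16;                       -- pixels per image line
    window      : window_type := (width => 3, depth => 3)  -- 3x3 window
  );
  port (
    clock        : in  std_ulogic;
    pixel_in     : in  pixel_type;
    -- one output per window position, so the port width follows the generics
    window_pixel : out pixel_vector(0 to (window.width * window.depth) - 1)
  );
end entity ipcra_sketch;
```

With an interface along these lines, an instance for a different pIP would differ only in its generic map, for example image_width => 512 and window => (width => 9, depth => 9).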
Figure 2. Internal IPCRA architecture (known as “the Snake”) for a simple 16-pixel image line width and 3x3 window, showing the window storage elements and the data flow of each pixel through the IPCRA chip.
In order to create a parameterisable model of a regular device (yes, it sounds like a contradiction, doesn’t it?), it’s no surprise that the VHDL model of the IPCRA makes use of generics and generate statements. Generics and generates often go together in VHDL. Beyond this, an interesting aspect of the code is the use of a declaration in the generate block that implements the window_pixel outputs. Note that this is a VHDL’93 feature. Inside the generate block, the window_pixel output selection from the Snake is accomplished using processes. Usually, one expects to see component instances inside generate blocks; however, using a generate block to implement arrays of processes is quite legal in VHDL.
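As a stripped-down illustration of those two features, here is a minimal sketch of a generate statement that has its own declarative part and contains a process rather than a component instance. It is not taken from the IPCRA model: the entity tap_array_sketch, its generics n_taps and stride, and its ports chain and taps are all invented for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: picks every stride-th element out of a chain.
entity tap_array_sketch is
  generic (
    n_taps : positive := 4;
    stride : positive := 2
  );
  port (
    chain : in  std_ulogic_vector(0 to (n_taps * stride) - 1);
    taps  : out std_ulogic_vector(0 to n_taps - 1)
  );
end entity tap_array_sketch;

architecture rtl of tap_array_sketch is
begin
  tap_gen : for i in 0 to n_taps - 1 generate
    -- generate declarative part (VHDL'93): evaluated once per value of i
    constant src : natural := i * stride;
  begin
    -- generate statement part: one process is elaborated for each value of i,
    -- and each process drives only its own element of taps
    pick : process (chain)
    begin
      taps(i) <= chain(src);
    end process pick;
  end generate tap_gen;
end architecture rtl;
```

Note the begin that separates the declarations from the concurrent statements; VHDL’93 requires it whenever a generate statement has a declarative part.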
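So the highlights of this month’s model are generics, generate statements, a declaration inside a generate block (VHDL’93 only) and processes inside generate blocks.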
At a lower level, the power consumed by large IPCRA chips (512 pixels per line and 9x9 windows) is a problem. Such chips can have 150k gates, all toggling at 75 MHz. The Snake is thus designed as two parallel chains, each half the length of a single chain, one running off the positive edge of the clock, the other off the negative edge. This gives an effective reduction in power of about 50%. This is reflected in the VHDL code, where the two chains are separate processes. This low-level aspect of the design also accounts for the complexity of the generate block used to implement the window_pixel output selection.
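To make the sizing of the two chains concrete, the arithmetic behind the chain_base, chain_extra and chain_length constants in the listing can be worked through by hand. The symbols W (image_width), w (window.width), d (window.depth), N and L are shorthand introduced here, not names from the model, and the division is VHDL integer division.

```latex
\begin{align*}
N &= W\,(d - 1) + w
  && \text{pixels the Snake must hold} \\
L &= \left\lfloor \tfrac{N}{2} \right\rfloor + (N \bmod 2)
   = \left\lceil \tfrac{N}{2} \right\rceil
  && \text{cells in each chain (chain\_length)} \\[6pt]
W = 16,\ w = d = 3:\quad
N &= 16 \cdot 2 + 3 = 35, & L &= 17 + 1 = 18 \\
W = 512,\ w = d = 9:\quad
N &= 512 \cdot 8 + 9 = 4105, & L &= 2052 + 1 = 2053
\end{align*}
```

The first case reproduces the values given in the listing’s comments (17, 1 and 18, with each chain indexed 0 to 17). The second applies the same formulas to the large chip mentioned above, giving 2053 eight-bit cells in each chain.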
You are welcome to use the source code we provide but you must keep the copyright notice with the code (see the Acknowledgements page for more details).
-- +-----------------------------+
-- | Library: image_processing   |
-- | designer : Tim Pagden       |
-- | opened: 07 Jul 1996         |
-- +-----------------------------+

-- Architectures:
--   07.07.96 RTL

library vfp;

architecture RTL of ipcra_generic is
  use vfp.bus_class.all;
  -- 16; 3,3
  constant chain_base   : integer := ((image_width * (window.depth - 1)) + window.width) / 2;     -- 17
  constant chain_extra  : integer := ((image_width * (window.depth - 1)) + window.width) mod 2;   -- 1
  constant chain_length : integer := chain_base + chain_extra;                                    -- 18
  signal pos_chain : ulogic_8_vector(0 to chain_length-1); -- 0 to 17
  signal neg_chain : ulogic_8_vector(0 to chain_length-1); -- 0 to 17
begin

  chain_pos : process (clock)
  begin
    if clock'event and clock='1' then
      pos_chain(0) <= pixel_in;
      for i in 1 to chain_length-1 loop
        pos_chain(i) <= pos_chain(i-1);
      end loop;
    end if;
  end process;

  chain_neg : process (clock)
  begin
    if clock'event and clock='0' then
      neg_chain(0) <= pixel_in;
      for i in 1 to chain_length-1 loop
        neg_chain(i) <= neg_chain(i-1);
      end loop;
    end if;
  end process;

  window_depth: for j in 0 to window.depth-1 generate
    window_width: for i in 0 to window.width-1 generate
      -- generate declarative part, '93 only
      constant cells_per_edge : integer := (image_width / 2); -- 8
    begin -- required in VHDL'93 when a generate has a declarative part
      -- generate statement part
      select_pos_neg : process (clock,
                                neg_chain((cells_per_edge * j) + i/2),
                                pos_chain((cells_per_edge * j) + i/2))
      begin
        if (i mod 2) = 0 then
          if clock = '0' then
            window_pixel((j * window.width) + i) <= neg_chain((cells_per_edge * j) + i/2);
          else
            window_pixel((j * window.width) + i) <= pos_chain((cells_per_edge * j) + i/2);
          end if;
        else
          if clock = '0' then
            window_pixel((j * window.width) + i) <= pos_chain((cells_per_edge * j) + i/2);
          else
            window_pixel((j * window.width) + i) <= neg_chain((cells_per_edge * j) + i/2);
          end if;
        end if;
      end process;
    end generate;
  end generate;

end RTL;
Footnote: Yes, we included pictures this month. This was only done as an aid to understanding the design. Previous Models of the Month have been conceptually simple so didn’t really need to be diagrammed. If you disagree, and think pictures are worth having, vote now by clicking here.
Copyright 1995-1996 Doulos
This page was last updated 7th July 1996.
We welcome your e-mail comments. Please contact us at: webmaster@doulos.co.uk