{{Multiple issues|
{{Advert|date=January 2021}}
{{Essay-like|date=January 2021}}
{{More citations needed|date=January 2021}}
}}
The '''asynchronous array of simple processors''' ('''AsAP''') architecture comprises a 2-D array of reduced-complexity programmable processors with small [[scratchpad memories]] interconnected by a reconfigurable [[mesh network]]. AsAP was developed by researchers in the VLSI Computation Laboratory (VCL) at the [[University of California, Davis]] and achieves high performance and energy efficiency while using a relatively small circuit area. The first-generation chip was presented in 2006.<ref>{{Cite journal|last1=Yu|first1=Zhiyi|last2=Meeuwsen|first2=Michael J.|last3=Apperson|first3=Ryan W.|last4=Sattari|first4=Omar|last5=Lai|first5=Michael|last6=Webb|first6=Jeremy W.|last7=Work|first7=Eric W.|last8=Truong|first8=Dean|last9=Mohsenin|first9=Tinoosh|last10=Baas|first10=Bevan M.|date=March 2008|title=AsAP: An Asynchronous Array of Simple Processors|url=https://ieeexplore.ieee.org/document/4456790|journal=IEEE Journal of Solid-State Circuits|volume=43|issue=3|pages=695–705|doi=10.1109/JSSC.2007.916616|bibcode=2008IJSSC..43..695Y |s2cid=14523656 |issn=0018-9200|url-access=subscription}}</ref>

AsAP processors are clocked in a [[globally asynchronous locally synchronous]] (GALS) fashion and are well suited for implementation in future fabrication technologies. Each processor's oscillator fully halts within 9 cycles when there is no work to do, leaving only leakage power, and restarts at full speed in less than one cycle once work is available. The chip requires no [[crystal oscillator]]s, [[phase-locked loop]]s, [[delay-locked loop]]s, global [[clock signal]], or any other global frequency or phase-related signals.
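
The halt and restart behaviour described above can be illustrated with a toy cycle-level model. This is only a sketch under stated assumptions: the class and function names are hypothetical, the 9-cycle halt threshold is taken from the text, and the sub-cycle restart is modelled as immediate. It does not describe the actual oscillator circuit.

```python
from dataclasses import dataclass

HALT_AFTER_IDLE_CYCLES = 9  # from the text: the oscillator fully halts in 9 cycles


@dataclass
class LocalOscillator:
    """Toy model of one processor's local clock (illustrative names only)."""
    running: bool = True
    idle_cycles: int = 0   # consecutive cycles with no work while running
    active_ticks: int = 0  # cycles in which the clock actually toggled

    def step(self, has_work: bool) -> None:
        """Advance the model by one nominal clock period."""
        if has_work:
            # Assumption: restart takes less than one cycle, modelled as immediate.
            self.running = True
            self.idle_cycles = 0
        if self.running:
            self.active_ticks += 1
            if not has_work:
                self.idle_cycles += 1
                if self.idle_cycles >= HALT_AFTER_IDLE_CYCLES:
                    self.running = False  # halted: only leakage remains


def simulate(work_pattern):
    """Run the oscillator over a sequence of booleans (True = work available)."""
    osc = LocalOscillator()
    for has_work in work_pattern:
        osc.step(has_work)
    return osc
```

In this model, a processor that sees 5 busy cycles, 20 idle cycles, and 3 busy cycles toggles its clock for only 17 of the 28 periods; the remaining 11 are spent halted.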

The multi-processor architecture makes use of task-level parallelism in many complex [[Digital signal processor|digital signal processor (DSP)]] applications, and also computes many large tasks using [[Granularity (parallel computing)#Fine-grained parallelism|fine-grained]] parallelism.

==Key features==
[[Image:Processor.jpg|thumb|right|300px|Block diagrams of a single AsAP processor and the 6x6 AsAP 1.0 chip]]
Four key features of AsAP are:
* Chip multi-processor (CMP) architecture designed to achieve high performance and low power for many DSP applications.
* Small memories and a simple architecture in each processor to achieve high [[Efficient energy use|energy efficiency]].
* Globally asynchronous locally synchronous (GALS) clocking, which simplifies the [[clock distribution network|clock design]], makes the array much easier to scale, and can be used to further [[low-power electronics|reduce power dissipation]].
* Inter-processor communication over a nearest-neighbor network, which avoids long global wires and improves scalability both to large arrays and to advanced fabrication technologies. Each processor can receive data from any two neighbors and send data to any combination of its four neighbors.
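
The nearest-neighbor constraint in the last item can be sketched as a small configuration check. This is a minimal illustration, not the chip's real configuration format: <code>LinkConfig</code>, <code>neighbor</code>, and <code>validate</code> are hypothetical names, and treating off-array directions as invalid is a simplifying assumption (it ignores any chip I/O at the array edges).

```python
from dataclasses import dataclass

# Row/column offsets of the four mesh neighbors.
DIRECTIONS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}
MAX_INPUTS = 2  # from the text: receive from any two neighbors


@dataclass(frozen=True)
class LinkConfig:
    """Hypothetical per-processor link settings."""
    inputs: frozenset   # directions this processor listens to (at most two)
    outputs: frozenset  # directions this processor drives (any subset of four)


def neighbor(rows, cols, r, c, direction):
    """Return the (row, col) of the neighbor in `direction`, or None at an edge."""
    dr, dc = DIRECTIONS[direction]
    nr, nc = r + dr, c + dc
    return (nr, nc) if 0 <= nr < rows and 0 <= nc < cols else None


def validate(rows, cols, r, c, cfg):
    """Check one processor's configuration against the stated constraints."""
    if len(cfg.inputs) > MAX_INPUTS:
        return False  # more than two input neighbors selected
    for d in cfg.inputs | cfg.outputs:
        if d not in DIRECTIONS:
            return False  # unknown direction label
        if neighbor(rows, cols, r, c, d) is None:
            return False  # assumption: no link beyond the array edge
    return True
```

For a 6x6 array, <code>validate(6, 6, 2, 2, LinkConfig(frozenset("NW"), frozenset("NSEW")))</code> accepts a processor that listens to its north and west neighbors and sends to all four, while a configuration with three inputs is rejected.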

==AsAP 1 chip: 36 processors==
[[Image:DiePhoto.jpg|thumb|right|175px|Die photograph of the first generation 36-processor AsAP chip]]
A chip containing 36 (6x6) programmable processors was taped out in May 2005 in 0.18 μm CMOS using a synthesized standard cell technology and is fully functional. Processors on the chip operate at clock rates from 520 MHz to 540 MHz at 1.8 V, and each processor dissipates 32 mW on average while executing applications at 475 MHz.

Most processors run at clock rates over 600 MHz at 2.0 V, which places AsAP among the highest-clock-rate fabricated processors (programmable or non-programmable) known to have been designed at a university; it is the second highest reported in published research papers.

At 0.9 V, the average application power per processor is 2.4 mW at 116 MHz. Each processor occupies 0.66 mm².

==AsAP 2 chip: 167 processors==
[[Image:Asap2.diephoto.300x327.touchedup.jpg|thumb|right|175px|Die photograph of the second generation 167-processor AsAP 2 chip]]
A second-generation [[65 nm process|65 nm]] CMOS design contains 167 processors with dedicated [[fast Fourier transform]] (FFT), [[Viterbi decoder]], and video [[motion estimation]] processors; 16 KB shared memories; and long-distance inter-processor interconnect. The programmable processors can individually and dynamically [[Dynamic voltage scaling|change their supply voltage]] and [[Dynamic frequency scaling|clock frequency]]. The chip is fully functional. Processors operate at up to 1.2 GHz at 1.3 V, which is believed to be the highest clock rate of any fabricated processor designed at a university. At 1.2 V, they operate at 1.07 GHz and 47 mW when 100% active. At 0.675 V, they operate at 66 MHz and 608 μW when 100% active. At this lowest-voltage operating point, the energy per operation is about 9.2 pJ, so 1 trillion [[multiply–accumulate operation|MAC]] or [[arithmetic logic unit]] (ALU) operations per second would dissipate only 9.2 watts. Due to the chip's [[Multiple instruction, multiple data|MIMD]] architecture and fine-grain clock oscillator stalling, this energy efficiency per operation is almost perfectly constant across widely varying workloads, which is not the case for many architectures.
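
The 9.2-watt figure can be checked from the per-processor numbers quoted above. The following is a back-of-the-envelope sketch that assumes one MAC or ALU operation per processor per clock cycle at 100% activity; the function names are illustrative.

```python
def energy_per_op_pj(power_w, freq_hz):
    """Energy per operation in picojoules, assuming one operation per cycle."""
    return power_w / freq_hz * 1e12


def power_for_rate_w(power_w, freq_hz, ops_per_sec):
    """Power needed to sustain `ops_per_sec` at a given per-processor operating point."""
    return power_w / freq_hz * ops_per_sec


# Per-processor operating points quoted in the text: (power in W, clock in Hz).
OPERATING_POINTS = {
    "1.2 V": (47e-3, 1.07e9),
    "0.675 V": (608e-6, 66e6),
}

if __name__ == "__main__":
    for label, (power_w, freq_hz) in OPERATING_POINTS.items():
        pj = energy_per_op_pj(power_w, freq_hz)
        watts = power_for_rate_w(power_w, freq_hz, 1e12)
        print(f"{label}: {pj:.1f} pJ/op, {watts:.1f} W for 1e12 ops/s")
```

At 0.675 V this gives roughly 9.2 pJ per operation, matching the 9.2 W quoted for 1 trillion operations per second; at 1.2 V the same calculation gives roughly 44 pJ per operation.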

==Applications==
Many DSP and general-purpose tasks have been coded for AsAP. Mapped tasks include: filters, [[convolutional coders]], interleavers, sorting, square root, [[CORDIC]] sin/cos/arcsin/arccos, [[matrix multiplication]], pseudo random number generators, [[fast Fourier transform]]s (FFTs) of lengths 32–1024, a complete k=7 [[Viterbi decoder]], a [[JPEG]] encoder, a complete fully compliant baseband processor for an [[802.11|IEEE 802.11a/g]] wireless LAN transmitter and receiver, and a complete [[CAVLC]] compression block for an [[H.264/MPEG-4 AVC|H.264]] encoder. Blocks plug directly together with no required modifications. Power, throughput, and area results are typically many times better than those of existing programmable DSP processors.

The architecture cleanly separates programming from inter-processor timing, which is handled entirely by hardware. A [[C (programming language)|C]] compiler and automatic mapping tool have also been developed to further simplify programming.

==See also==
* [[Manycore processor]]
* [[Multi-core processor]]
* [[Multiple instruction, multiple data|MIMD]]
* [[Parallel computing]]
* [[Transputer]]

==References==
{{Reflist}}

* {{cite journal
|last=Truong
|first=Dean
|author2=Wayne H. Cheng
|author3=Tinoosh Mohsenin
|author4=Zhiyi Yu
|author5=Anthony T. Jacobson
|author6=Gouri Landge
|author7=Michael J. Meeuwsen
|author8=Anh T. Tran
|author9=Zhibin Xiao
|author10=Eric W. Work
|author11=Jeremy W. Webb
|author12=Paul V. Mejia
|author13=Bevan M. Baas
|title=A 167-Processor Computational Platform in 65 nm CMOS
|journal=IEEE Journal of Solid-State Circuits
|volume=44
|issue=4
|date=April 2009
|page=1130
|doi=10.1109/JSSC.2009.2013772
|bibcode=2009IJSSC..44.1130T
|s2cid=11502057
|url=http://web.ece.ucdavis.edu/vcl/pubs/2009.04.JSSC/
|url-status=dead
|archive-url=https://web.archive.org/web/20150621030532/http://web.ece.ucdavis.edu/vcl/pubs/2009.04.JSSC/
|archive-date=2015-06-21
|url-access=subscription
}}
* {{cite conference
|last=Truong
|first=Dean
|author2=Cheng, Wayne
|author3=Mohsenin, Tinoosh
|author4=Yu, Zhiyi
|author5=Jacobson, Toney
|author6=Landge, Gouri
|author7=Meeuwsen, Michael
|author8=Watnik, Christine
|author9=Mejia, Paul
|author10=Tran, Anh
|author11=Webb, Jeremy
|author12=Work, Eric
|author13=Xiao, Zhibin
|author14=Baas, Bevan M.
|title=A 167-processor 65 nm Computational Platform with Per-Processor Dynamic Supply Voltage and Dynamic Clock Frequency Scaling
|book-title=In Proceedings of the IEEE Symposium on VLSI Circuits, 2008
|place=Honolulu, HI
|date=June 2008
|pages=22–23
|url=http://web.ece.ucdavis.edu/vcl/pubs/2008.06.symp.vlsi/
|url-status=dead
|archive-url=https://web.archive.org/web/20141225104338/http://web.ece.ucdavis.edu/vcl/pubs/2008.06.symp.vlsi/
|archive-date=2014-12-25
}}
* {{cite journal
|last=Baas
|first=Bevan
|author2=Yu, Zhiyi
|author3=Meeuwsen, Michael
|author4=Sattari, Omar
|author5=Apperson, Ryan
|author6=Work, Eric
|author7=Webb, Jeremy
|author8=Lai, Michael
|author9=Mohsenin, Tinoosh
|author10=Truong, Dean
|author11=Cheung, Jason
|title=AsAP: A Fine-Grained Many-Core Platform for DSP Applications
|journal=IEEE Micro
|volume=27
|issue=2
|date=March–April 2007
|pages=34–45
|doi=10.1109/MM.2007.29
|s2cid=18443228
|url=http://web.ece.ucdavis.edu/vcl/pubs/2007.07.ieee.micro/
|url-status=dead
|archive-url=https://web.archive.org/web/20150625145934/http://web.ece.ucdavis.edu/vcl/pubs/2007.07.ieee.micro/
|archive-date=2015-06-25
|url-access=subscription
}}
* {{cite conference |last=Baas |first=Bevan |author2=Yu, Zhiyi |author3=Meeuwsen, Michael |author4=Sattari, Omar |author5=Apperson, Ryan |author6=Work, Eric |author7=Webb, Jeremy |author8=Lai, Michael |author9=Gurman, Daniel |author10=Chen, Chi |author11=Cheung, Jason |author12=Truong, Dean |author13=Mohsenin, Tinoosh |title=Hardware and Applications of AsAP: An Asynchronous Array of Simple Processors |book-title=In Proceedings of the IEEE HotChips Symposium on High-Performance Chips, (HotChips 2006) |place=Stanford |date=August 2006 |url=http://www.hotchips.org/archives/hc18 |conference= |access-date=2007-09-27 |archive-date=2014-02-28 |archive-url=https://web.archive.org/web/20140228050730/http://www.hotchips.org/archives/hc18/ |url-status=dead }}
* {{cite conference
|last=Yu
|first=Zhiyi
|author2=Meeuwsen, Michael
|author3=Apperson, Ryan
|author4=Sattari, Omar
|author5=Lai, Michael
|author6=Webb, Jeremy
|author7=Work, Eric
|author8=Mohsenin, Tinoosh
|author9=Singh, Mandeep
|author10=Baas, Bevan M.
|title=An Asynchronous Array of Simple Processors for DSP Applications
|book-title=In Proceedings of the IEEE International Solid-State Circuits Conference, (ISSCC '06)
|location=San Francisco, CA
|date=February 2006
|pages=428–429, 663
|url=http://web.ece.ucdavis.edu/vcl/pubs/2006.02/
|url-status=dead
|archive-url=https://web.archive.org/web/20141225092457/http://web.ece.ucdavis.edu/vcl/pubs/2006.02/
|archive-date=2014-12-25
}}

==External links==
* [http://vcl.ece.ucdavis.edu/ VLSI Computation Lab, UC Davis]
* [http://vcl.ece.ucdavis.edu/asap/ Asynchronous Array of Simple Processors (AsAP) project]
* [http://www.eetimes.com/document.asp?doc_id=1242912 EETimes article describing AsAP]

[[Category:Manycore processors]]
[[Category:Digital signal processors]]
[[Category:Parallel computing]]
