<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://debianws.lexgopc.com/wiki143/index.php?action=history&amp;feed=atom&amp;title=Massively_parallel_processor_array</id>
	<title>Massively parallel processor array - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://debianws.lexgopc.com/wiki143/index.php?action=history&amp;feed=atom&amp;title=Massively_parallel_processor_array"/>
	<link rel="alternate" type="text/html" href="http://debianws.lexgopc.com/wiki143/index.php?title=Massively_parallel_processor_array&amp;action=history"/>
	<updated>2026-04-22T19:02:34Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.1</generator>
	<entry>
		<id>http://debianws.lexgopc.com/wiki143/index.php?title=Massively_parallel_processor_array&amp;diff=6810664&amp;oldid=prev</id>
		<title>imported&gt;Rofraja: Replaced 1 bare URLs by {{Cite web}}; Replaced &quot;Archived copy&quot; by actual titles</title>
		<link rel="alternate" type="text/html" href="http://debianws.lexgopc.com/wiki143/index.php?title=Massively_parallel_processor_array&amp;diff=6810664&amp;oldid=prev"/>
		<updated>2025-06-29T18:50:14Z</updated>

		<summary type="html">&lt;p&gt;Replaced 1 bare URLs by {{Cite web}}; Replaced &amp;quot;Archived copy&amp;quot; by actual titles&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{Short description|Type of integrated circuit}}&lt;br /&gt;
A &amp;#039;&amp;#039;&amp;#039;massively parallel processor array&amp;#039;&amp;#039;&amp;#039;, also known as a &amp;#039;&amp;#039;&amp;#039;multi-purpose processor array&amp;#039;&amp;#039;&amp;#039; (&amp;#039;&amp;#039;&amp;#039;MPPA&amp;#039;&amp;#039;&amp;#039;), is a type of [[integrated circuit]] with a [[massively parallel]] array of hundreds or thousands of [[Central processing unit|CPU]]s and [[Random-access memory|RAM]] banks. These processors pass work to one another through a [[Reconfigurability|reconfigurable]] interconnect of [[Channel (communications)|channels]]. By harnessing a large number of processors working in parallel, an MPPA chip can accomplish more demanding tasks than conventional chips. MPPAs are based on a parallel software [[programming model]] for developing high-performance [[embedded system]] applications.&lt;br /&gt;
&lt;br /&gt;
==Architecture==&lt;br /&gt;
&lt;br /&gt;
An MPPA is a [[Multiple instruction, multiple data|MIMD]] (multiple instruction streams, multiple data streams) architecture, with [[distributed memory]] accessed locally, not shared globally. Each processor is strictly encapsulated, accessing only its own code and memory. Point-to-point communication between processors is realized directly in the configurable interconnect.&amp;lt;ref&amp;gt;Mike Butts, &amp;quot;Synchronization through Communication in a Massively Parallel Processor Array&amp;quot;, IEEE Micro, vol. 27, no. 5, September/October 2007, [[IEEE Computer Society]]&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The MPPA&amp;#039;s massive parallelism and its distributed-memory MIMD architecture distinguish it from [[Multi-core (computing)|multicore]] and [[Manycore processor|manycore]] architectures, which have fewer processors and an [[symmetric multiprocessing|SMP]] or other [[Shared memory architecture|shared memory]] architecture, and are mainly intended for general-purpose computing. It is also distinguished from [[GPGPU]]s with [[Single instruction, multiple data|SIMD]] architectures, used for [[High-performance computing|HPC]] applications.&amp;lt;ref&amp;gt;Mike Butts, &amp;quot;Multicore and Massively Parallel Platforms and Moore&amp;#039;s Law Scalability&amp;quot;, Proceedings of the Embedded Systems Conference - Silicon Valley, April 2008&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Programming==&lt;br /&gt;
&lt;br /&gt;
An MPPA application is developed by expressing it as a hierarchical [[block diagram]] or [[workflow]], whose basic objects run in parallel, each on its own processor. Likewise, large data objects may be broken up and distributed into local memories with parallel access. Objects communicate over a parallel structure of dedicated channels. The objective is to maximize aggregate throughput while minimizing local latency, optimizing performance and efficiency. An MPPA&amp;#039;s [[model of computation]] is similar to a [[Kahn process network]] or [[communicating sequential processes]] (CSP).&amp;lt;ref&amp;gt;Mike Butts, Brad Budlong, Paul Wasson, Ed White, &amp;quot;Reconfigurable Work Farms on a Massively Parallel Processor Array&amp;quot;, Proceedings of [[FCCM]], April 2008, [[IEEE Computer Society]]&amp;lt;/ref&amp;gt;&lt;br /&gt;
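The channel-based model described above can be sketched in ordinary software. The following minimal Python example is illustrative only (it is not any MPPA vendor's actual API): threads stand in for the encapsulated processors, and bounded FIFO queues stand in for the dedicated point-to-point channels, in the style of a Kahn process network.

```python
# Sketch of an MPPA-style process network: each "processor" is an isolated
# worker with private state, communicating only over dedicated point-to-point
# channels (bounded FIFO queues). Illustrative only, not a vendor API.
import threading
import queue

def source(out_ch, items):
    """Produce a stream of work items, then a sentinel to signal completion."""
    for item in items:
        out_ch.put(item)
    out_ch.put(None)

def stage(in_ch, out_ch, fn):
    """Encapsulated worker: reads only its input channel, writes only its output."""
    while True:
        item = in_ch.get()
        if item is None:
            out_ch.put(None)  # forward the sentinel downstream
            return
        out_ch.put(fn(item))

def sink(in_ch, results):
    """Collect the final output stream."""
    while True:
        item = in_ch.get()
        if item is None:
            return
        results.append(item)

# Wire a three-stage pipeline: square each value, then add one.
a, b, c = queue.Queue(4), queue.Queue(4), queue.Queue(4)
results = []
workers = [
    threading.Thread(target=source, args=(a, range(5))),
    threading.Thread(target=stage, args=(a, b, lambda x: x * x)),
    threading.Thread(target=stage, args=(b, c, lambda x: x + 1)),
    threading.Thread(target=sink, args=(c, results)),
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(results)  # [1, 2, 5, 10, 17]
```

Because each stage owns its state and blocks only on its own channels, throughput comes from all stages running concurrently, which mirrors how an MPPA maximizes aggregate throughput over the interconnect.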
&lt;br /&gt;
==Applications==&lt;br /&gt;
&lt;br /&gt;
MPPAs are used in high-performance [[embedded system]]s and [[hardware acceleration]] of [[desktop computer]] and [[Server (computing)|server]] applications, such as [[video compression]],&amp;lt;ref&amp;gt;Laurent Bonetto, &amp;quot;Massively parallel processing arrays (MPPAs) for embedded HD video and imaging (Part 1)&amp;quot;, Video/Imaging DesignLine, May 16, 2008 http://www.eetimes.com/document.asp?doc_id=1273823&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;Laurent Bonetto, &amp;quot;Massively parallel processing arrays (MPPAs) for embedded HD video and imaging (Part 2)&amp;quot;, Video/Imaging DesignLine, July 18, 2008 http://www.eetimes.com/document.asp?doc_id=1273830&amp;lt;/ref&amp;gt; [[image processing]],&amp;lt;ref&amp;gt;Paul Chen, &amp;quot;Multimode sensor processing using Massively Parallel Processor Arrays (MPPAs)&amp;quot;, Programmable Logic DesignLine, March 18, 2008 http://www.pldesignline.com/howto/206904379&amp;lt;/ref&amp;gt; [[medical imaging]], [[network processing]], [[software-defined radio]] and other compute-intensive streaming media applications, which otherwise would use [[FPGA]], [[digital signal processor|DSP]] and/or [[Application-specific integrated circuit|ASIC]] chips.&lt;br /&gt;
&lt;br /&gt;
==Examples==&lt;br /&gt;
&lt;br /&gt;
Commercially developed MPPAs include designs from [[Ambric]], [[PicoChip]], [[Intel]],&amp;lt;ref&amp;gt;Vangal, Sriram R., Jason Howard, Gregory Ruhl, Saurabh Dighe, Howard Wilson, James Tschanz, David Finan et al. &amp;quot;An 80-tile sub-100-w teraflops processor in 65-nm cmos.&amp;quot; Solid-State Circuits, IEEE Journal of 43, no. 1 (2008): 29-41.&amp;lt;/ref&amp;gt; [[IntellaSys]], [[GreenArrays]], [[ASOCS]], [[Tilera]], [[Kalray]], [[Coherent Logix]], [[Tabula (company)|Tabula]], and [[Adapteva]]. The [[Aspex (Ericsson)|Aspex]] Linedancer differs in that it was a massively wide &amp;#039;&amp;#039;SIMD&amp;#039;&amp;#039; array rather than an MPPA. Strictly speaking, it could qualify as [[Single Instruction Multiple Threads|SIMT]], since each of its 4096 3,000-gate cores has its own Content-Addressable Memory.&amp;lt;ref&amp;gt;{{Cite book|chapter-url=https://link.springer.com/chapter/10.1007/978-94-009-0643-3_39|doi = 10.1007/978-94-009-0643-3_39|chapter = Artificial Neural Network on a Massively Parallel Associative Architecture|title = International Neural Network Conference|year = 1990|last1 = Krikelis|first1 = A.|page = 673|isbn = 978-0-7923-0831-7}}&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;{{Cite web| title=Effective Monte Carlo simulation on System-V massively parallel associative string processing architecture | url=https://core.ac.uk/download/pdf/25268094.pdf | archive-url=https://web.archive.org/web/20210606003056/https://core.ac.uk/download/pdf/25268094.pdf | archive-date=2021-06-06}}&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Fabricated MPPAs developed in universities include: 36-core&amp;lt;ref&amp;gt;Yu, Zhiyi, Michael Meeuwsen, Ryan Apperson, Omar Sattari, Michael Lai, Jeremy Webb, Eric Work, Tinoosh Mohsenin, Mandeep Singh, and Bevan Baas. &amp;quot;An asynchronous array of simple processors for DSP applications.&amp;quot; In IEEE International Solid-State Circuits Conference,(ISSCC’06), vol. 49, pp. 428-429. 2006&amp;lt;/ref&amp;gt; and 167-core&amp;lt;ref&amp;gt;Truong, Dean, Wayne Cheng, Tinoosh Mohsenin, Zhiyi Yu, Toney Jacobson, Gouri Landge, Michael Meeuwsen et al. &amp;quot;A 167-processor 65 nm computational platform with per-processor dynamic supply voltage and dynamic clock frequency scaling.&amp;quot; In Symposium on VLSI Circuits, pp. 22-23. 2008&amp;lt;/ref&amp;gt; [[Asynchronous Array of Simple Processors|Asynchronous Array of Simple Processors (AsAP)]] arrays from the [[University of California, Davis]], 16-core RAW&amp;lt;ref&amp;gt;Michael Bedford Taylor, Jason Kim, Jason Miller, David Wentzlaff, Fae Ghodrat, Ben Greenwald, Henry Hoffmann, Paul Johnson, Walter Lee, Arvind Saraf, Nathan Shnidman, Volker Strumpen, Saman Amarasinghe, and Anant Agarwal, &amp;quot;A 16-issue multiple-program-counter microprocessor with point-to-point scalar operand network,&amp;quot; Proceedings of the IEEE International Solid-State Circuits Conference, February 2003&amp;lt;/ref&amp;gt; from [[MIT]], and 16-core&amp;lt;ref&amp;gt;Yu, Zhiyi, Kaidi You, Ruijin Xiao, Heng Quan, Peng Ou, Yan Ying, Haofan Yang, and Xiaoyang Zeng. &amp;quot;An 800MHz 320mW 16-core processor with message-passing and shared-memory inter-core communication mechanisms.&amp;quot; In Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, pp. 64-66. IEEE, 2012.&amp;lt;/ref&amp;gt; and 24-core&amp;lt;ref&amp;gt;Ou, Peng, Jiajie Zhang, Heng Quan, Yi Li, Maofei He, Zheng Yu, Xueqiu Yu et al. 
&amp;quot;A 65nm 39GOPS/W 24-core processor with 11&amp;amp;nbsp;Tb/s/W packet-controlled circuit-switched double-layer network-on-chip and heterogeneous execution array.&amp;quot; In Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International, pp. 56-57. IEEE, 2013.&amp;lt;/ref&amp;gt; arrays from [[Fudan University]].&lt;br /&gt;
&lt;br /&gt;
The Chinese [[Sunway (processor)|Sunway]] project developed its own 260-core [[SW26010]] manycore chip for the [[TaihuLight]] supercomputer, which as of 2016 was the world&amp;#039;s fastest supercomputer.&amp;lt;ref name=dongarra2016&amp;gt;{{Cite web|url=http://www.netlib.org/utk/people/JackDongarra/PAPERS/sunway-report-2016.pdf|title=Report on the Sunway TaihuLight System|last=Dongarra|first=Jack|date=June 20, 2016|website=www.netlib.org|access-date=June 20, 2016}}&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;{{Cite journal| last1 = Fu| first1 = Haohuan| last2 = Liao| first2 = Junfeng| last3 = Yang| first3 = Jinzhe| last4 = Wang| first4 = Lanning| last5 = Song| first5 = Zhenya| last6 = Huang| first6 = Xiaomeng| last7 = Yang| first7 = Chao| last8 = Xue| first8 = Wei| last9 = Liu| first9 = Fangfang| last10 = Qiao| first10 = Fangli| last11 = Zhao| first11 = Wei| last12 = Yin| first12 = Xunqiang| last13 = Hou| first13 = Chaofeng| last14 = Zhang| first14 = Chenglong| last15 = Ge| first15 = Wei| last16 = Zhang| first16 = Jian| last17 = Wang| first17 = Yangang| last18 = Zhou| first18 = Chunbo| last19 = Yang| first19 = Guangwen|display-authors=3|date=2016|title=The Sunway TaihuLight Supercomputer: System and Applications|journal=Sci. China Inf. Sci.| volume = 59| issue = 7|doi=10.1007/s11432-016-5588-7| doi-access = free}}&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Anton 3 processors, designed by [[D. E. Shaw Research]] for [[molecular dynamics]] simulations, contain an array of 576 processors arranged as a 12×24 grid of tiles, each tile holding a pair of cores; a routed network links these tiles together and extends off-chip to other nodes in a full system.&amp;lt;ref&amp;gt;{{Cite book |last1=Shaw |first1=David E. |last2=Adams |first2=Peter J. |last3=Azaria |first3=Asaph |last4=Bank |first4=Joseph A. |last5=Batson |first5=Brannon |last6=Bell |first6=Alistair |last7=Bergdorf |first7=Michael |last8=Bhatt |first8=Jhanvi |last9=Butts |first9=J. Adam |last10=Correia |first10=Timothy |last11=Dirks |first11=Robert M. |last12=Dror |first12=Ron O. |last13=Eastwood |first13=Michael P. |last14=Edwards |first14=Bruce |last15=Even |first15=Amos |title=Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis |chapter=Anton 3 |date=2021-11-14 |language=en |location=St. Louis Missouri |publisher=ACM |pages=1–11 |doi=10.1145/3458817.3487397 |isbn=978-1-4503-8442-1|s2cid=239036976 |doi-access=free }}&amp;lt;/ref&amp;gt;&amp;lt;ref&amp;gt;{{Cite book |last1=Adams |first1=Peter J. |last2=Batson |first2=Brannon |last3=Bell |first3=Alistair |last4=Bhatt |first4=Jhanvi |last5=Butts |first5=J. Adam |last6=Correia |first6=Timothy |last7=Edwards |first7=Bruce |last8=Feldmann |first8=Peter |last9=Fenton |first9=Christopher H. |last10=Forte |first10=Anthony |last11=Gagliardo |first11=Joseph |last12=Gill |first12=Gennette |last13=Gorlatova |first13=Maria |last14=Greskamp |first14=Brian |last15=Grossman |first15=J.P. |title=2021 IEEE Hot Chips 33 Symposium (HCS) |chapter=The ΛNTON 3 ASIC: A Fire-Breathing Monster for Molecular Dynamics Simulations |date=2021-08-22 |chapter-url=https://ieeexplore.ieee.org/document/9567084 |location=Palo Alto, CA, USA |publisher=IEEE |pages=1–22 |doi=10.1109/HCS52781.2021.9567084 |isbn=978-1-6654-1397-8|s2cid=239039245 }}&amp;lt;/ref&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==See also==&lt;br /&gt;
* [[Manycore processor]]&lt;br /&gt;
* [[AI accelerator]]&lt;br /&gt;
* [[Asynchronous array of simple processors]]&lt;br /&gt;
* [[SW26010]]&lt;br /&gt;
* [[Array processor]]&lt;br /&gt;
&lt;br /&gt;
==References==&lt;br /&gt;
{{Reflist}}&lt;br /&gt;
&lt;br /&gt;
[[Category:Manycore processors]]&lt;br /&gt;
[[Category:Parallel computing]]&lt;/div&gt;</summary>
		<author><name>imported&gt;Rofraja</name></author>
	</entry>
</feed>