Designing a Novel Reversible Systolic Array Using QCA

Many efforts have been made to design nano-based devices. One such device is Quantum Cellular Automata (QCA). Because of the astonishing growth of VLSI circuit designs at larger scales and the need for feature size reduction, there is an increasing need to design complicated control systems using nano-based devices. Moreover, because QCA devices are critically sensitive to temperature, complicated systems built from them should be designed reversibly. This article proposes a novel architecture for QCA circuits, intended for complicated control systems based on systolic arrays, with high throughput and minimal power dissipation.


1-Introduction
FET-based devices have been in production since the 1970s and have improved enormously. However, FETs now suffer from serious effects that make further scaling difficult, owing to the 0.1 µm limit on gate length. Solutions that reduce feature size by other means are more advantageous than fighting those effects [1]. On the one hand, quantum-mechanical devices hold the promise of faster speeds and greatly reduced feature size [2]; on the other hand, cellular automata offer many advantages, such as scalability, simplicity of implementation and inherently parallel computation [3].
Hence, one of the most common alternatives is Quantum Cellular Automata (QCA), proposed by Lent. QCA is a new computing paradigm that uses a CA architecture in which each cell consists of a central quantum dot and four neighboring dots occupied by two electrons. A strongly polarized ground state is produced by the combination of the Coulomb repulsion between the two electrons and the discreteness of electronic charge. The QCA approach permits ultra-fast operation with ultra-low power dissipation, and avoids the interconnect delays, resistive and capacitive effects, and limited densities associated with conventional architectures [4].
In most common fabrication approaches, QCA devices operate at low temperatures. Below a critical temperature no errors occur; above it, errors inevitably accumulate and corrupt the results [4]. In addition, Landauer's principle states that erasing one bit of information during computation dissipates at least kT ln 2 joules [5]. It has been shown that computing devices performing logical functions that do not have a single-valued inverse require a minimum heat generation per machine cycle, typically of the order of kT for each irreversible function, where k is Boltzmann's constant and T is the temperature [6].
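To give a sense of scale for the kT ln 2 bound, the following minimal sketch evaluates it numerically. The function name `landauer_limit` and the three sample temperatures are illustrative assumptions and are not taken from the cited works.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def landauer_limit(temperature_kelvin: float) -> float:
    """Return the minimum energy (in joules) dissipated when one bit is erased at temperature T."""
    if temperature_kelvin < 0:
        raise ValueError("absolute temperature cannot be negative")
    return BOLTZMANN_K * temperature_kelvin * math.log(2)


# Sample temperatures chosen only for illustration (assumed, not from the paper).
for t in (4.0, 77.0, 300.0):
    print(f"T = {t:6.1f} K -> kT ln 2 = {landauer_limit(t):.3e} J")
```

The bound grows linearly with T, so lowering the operating temperature lowers the minimum cost of each irreversible bit erasure.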
A feasible solution is reversible computing at the logic level, which establishes a one-to-one, onto mapping between the input and output states of a circuit. As a result, dissipation can be avoided if the computation is carried out with no loss of information [7].
Because of its pipelined nature, QCA technology is not by itself well suited to the design of complicated control systems. Systolic arrays, first proposed by Kung and Leiserson [8], exploit pipelining, parallelism and simple local control, and at the same time enhance the performance of the whole computing machine [9].
Processing elements (PEs) are the building blocks of such an array; each PE executes only one operation, as in a SIMD machine. Data flows through the entire systolic array by being pumped in and out rhythmically. Based on the data flow, there are two categories. In the first, known as accumulator-based, results reside in the processing element; an example is a systolic matrix multiplier. In the second, known as adder-based, results flow from one PE to the next; an example is a Galois Field (GF) multiplier [10].
This paper introduces novel reversible QCA circuits with ultra-low power dissipation and no information loss, suitable for complicated control systems and based on an add-multiply systolic array design with potentially high throughput.
The paper is organized as follows. Section 2 presents background information on QCA and simple designs, reversible gate schemes and systolic array architectures. Section 3 introduces related work on reversible QCA circuits, QCA systolic array circuits and reversible systolic array designs. Section 4 presents the proposed processing element design in QCA and its extension, together with a comparison table and simulation results. Finally, the conclusion and future work are stated.

2-1-QCA Basics
A QCA cell consists of four quantum dots arranged in a square and two excess electrons that can occupy those dots and repel each other electrostatically. The cell shown in Figure 1a has two stable states when it is charged with two excess electrons. These two diagonal states represent logic 0 and logic 1 (Figure 1b). The primary device for designing logic circuits in QCA, such as [11], is the majority gate, which comprises five standard cells: three input cells, one voter cell and one output cell. The majority gate (Figure 1c) can operate as an AND gate or an OR gate when one of its inputs is fixed to a constant. If the constant is 0, the gate performs AND on the other two inputs; if it is 1, the gate performs OR. A NOT gate can also be designed simply, as shown in Figure 1d [12].
A QCA circuit is partitioned into four adjacent clocking zones along one dimension, whose phases are known as Switch, Hold, Release and Relax. As shown in Figure 1e, during the Switch phase a cell is polarized under the influence of its neighbouring cells and takes on a binary logic value. In the Hold phase, the inter-dot barriers prevent the electrons from switching, so the cell retains its polarization. In the Release phase the barriers are lowered and the polarization is lost. In the Relax phase there is no barrier and the cell does not influence its neighbours [13]. Furthermore, [14] reports a QCA design optimization methodology based on majority gates, in which the design is implemented directly on this fundamental gate instead of being optimized for AND-OR gates. Designs can be laid out in one layer or in multiple layers, as reported in [15].
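The AND and OR behaviour of the majority gate can be checked at the logic level with a short sketch. This models only the Boolean function, not the cell layout or clocking, and the function names are hypothetical.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: the output is 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def qca_and(a: int, b: int) -> int:
    """AND obtained by fixing one majority-gate input to the constant 0."""
    return majority(a, b, 0)


def qca_or(a: int, b: int) -> int:
    """OR obtained by fixing one majority-gate input to the constant 1."""
    return majority(a, b, 1)


# Exhaustively compare against the ordinary Boolean operators.
for a, b in product((0, 1), repeat=2):
    assert qca_and(a, b) == (a & b)
    assert qca_or(a, b) == (a | b)
print("majority gate reproduces AND and OR for all input pairs")
```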

2-2-Reversible Design
A reversible logic function is a one-to-one, onto mapping (bijection) between inputs and outputs: each input pattern is mapped to a unique output pattern, and each output pattern has a unique input pattern mapped to it. Hence there is no information loss in reversible computing, and the inputs of a reversible gate can be restored from its outputs. By this definition, traditional logic functions such as AND and OR are not reversible, because more than one input state is mapped to the same output state.
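The bijection condition can be tested exhaustively for small gates. The sketch below is a minimal illustration: the helper `is_reversible` and the two-bit example gate (a, b) -> (a, a XOR b), commonly called a controlled-NOT, are assumptions introduced here and do not come from the cited designs.

```python
from itertools import product


def is_reversible(gate, n_inputs: int) -> bool:
    """Return True if `gate` maps the 2**n input patterns one-to-one onto n-bit output patterns."""
    inputs = list(product((0, 1), repeat=n_inputs))
    outputs = [tuple(gate(*bits)) for bits in inputs]
    same_width = all(len(out) == n_inputs for out in outputs)
    all_distinct = len(set(outputs)) == len(inputs)
    return same_width and all_distinct


def and_gate(a, b):
    return (a & b,)          # two inputs collapse to one output bit


def or_gate(a, b):
    return (a | b,)          # likewise loses information


def xor_pair(a, b):
    return (a, a ^ b)        # illustrative reversible gate (assumed example)


print(is_reversible(and_gate, 2))   # False
print(is_reversible(or_gate, 2))    # False
print(is_reversible(xor_pair, 2))   # True
```

AND and OR fail because four input patterns are squeezed into two output values, which is exactly the information loss described above.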

2-3-Systolic Array Design
The strength of a systolic system lies in the flow of data between PEs, each of which performs a single operation, so that the array acts as a special-purpose machine (for example, for matrix multiplication), although there have been many efforts to design programmable PEs.
Systolic arrays differ in their placement methods and topographies. According to [18], the one-dimensional and two-dimensional arrays shown in Figure 3a include the linear systolic array, applied to matrix-vector multiplication and one-dimensional convolution (FIR filtering); the orthogonal systolic array, for mappings of the matrix-matrix multiplication algorithm; the hexagonal systolic array, also for the matrix-matrix multiplication algorithm; and the triangular systolic array, for Gaussian elimination and other decomposition algorithms [19]. Figure 3b shows the Kung cell as a processing element, together with the hexagonal systolic array [10].

3-1-Reversible QCA
Several reversible QCA circuits have been designed to date. In general, these designs fall into two groups: reversible circuits and reversible gates. Reversible circuit designs include the binary incrementer [20], BCD adder and subtractor [21], reversible flip-flops [22] and a reversible ALU [23]. At the gate level, reversible QCA designs include the Fredkin gate, introduced in [24], whose QCA design from [17] is shown in Figure 4, and the Toffoli gate, reported in [16], whose design history from its introduction to the present is shown in Figure 5.

3-2-QCA Systolic Design
In keeping with the SIMD nature of systolic arrays introduced in Section 2-3, QCA systolic designs are presented in [9] as case studies: a matrix multiplier (accumulator-based systolic array) and a Galois field multiplier (adder-based systolic array). The design of a QCA systolic architecture proceeds in a sequence of steps, starting from the logical formula of a single instruction and ending with the extended QCA design.

3-2-1-Case Study: Matrix Multiplier
Following the design steps of Section 3-2, the QCA systolic design of the matrix multiplier is depicted in Figure 6: the logical formula of a single instruction (Figure 6a), the data graph scheme (Figure 6b), the PE type and design, which is accumulator-based (Figure 6c), the majority-based schematic design of a single instruction (Figure 6d) and finally the 2 × 2 matrix QCA design (Figure 6e). For the Galois field multiplier, Figure 7 shows the logical formula of a single instruction (Figure 7a), the data graph scheme and multiplier algorithm (Figure 7b), the PE type, which is adder-based, the majority-based schematic design of a single instruction (Figure 7c) and the 2 × 2 matrix QCA design (Figure 7d).

3-3-Reversible Systolic Array Design
One of the most notable methods, described in [10], for implementing an m-ary Galois function with a reversible systolic array is based on an add-multiply PE design, for which the reversibility of m-ary Galois logic is demonstrated.
The inner and outer designs of the PE are shown in Figures 8a and 8b, respectively. The PE performs its operations in Galois logic GF(2) in a reversible manner. Figure 8c shows a 2D reversible systolic array architecture over GF(2), built by interconnecting GF(2) Toffoli gates. Inspection of the GF(2) Toffoli-based processing element (TPE) therefore shows that the reversibility of the whole systolic system can be satisfied.
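As a logic-level illustration, the sketch below uses the standard definition of the Toffoli gate over GF(2), (a, b, c) -> (a, b, c XOR (a AND b)), and checks that it is a bijection and its own inverse. It is a functional model only, under the assumption that the PE of [10] follows this standard definition; it does not reproduce the internal design of Figure 8.

```python
from itertools import product


def toffoli(a: int, b: int, c: int) -> tuple:
    """Standard GF(2) Toffoli gate: pass a and b through, add the product a*b to c (mod 2)."""
    return a, b, c ^ (a & b)


patterns = list(product((0, 1), repeat=3))
images = [toffoli(*p) for p in patterns]

# The eight outputs are a permutation of the eight inputs, so the gate is a bijection.
assert sorted(images) == patterns

# Applying the gate twice restores every input, so the gate is its own inverse.
assert all(toffoli(*toffoli(*p)) == p for p in patterns)

# With c acting as an incoming partial sum, the third output is an add-multiply result.
print(toffoli(1, 1, 0))  # (1, 1, 1)
print(toffoli(1, 1, 1))  # (1, 1, 0)
```

Because the third output equals c + a·b in GF(2), the same gate serves as an add-multiply step in which the partial sum flows from one PE to the next.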

4-Proposed Design
In light of the issues discussed above, Table 2 shows the advantages of the architectures introduced so far, which are combined to solve these problems in a novel architecture that offers high throughput, low power dissipation, no information loss and no heat generation, is based on nanostructures, is well suited to complicated systems and is capable of pipelining. Moreover, "the investigation of realizing the introduced quantum systolic arrays using the quantum dot technology" was already stated as future work in [10].

4-1-Processing Element Design
The design of a reversible systolic system using QCA depends considerably on the selection of the TPE. Since the logical function of the Toffoli gate over GF(2) corresponds to the logic of an adder-based processing element, a QCA-based Toffoli processing element (TPE) can be used to design a systolic system. According to the power dissipation analysis reported in [27], the Toffoli gate has the lowest calculated heat generation among reversible gates at all operating temperatures. The vital factors in selecting a suitable TPE are: no unstable NOT gate, no output that cannot be branched, minimum cell count, minimum area, proper functionality, an output signal that matches the Toffoli truth table, no rotated cells and a minimum number of clock phases to produce the output. The TPE selected in this article, shown in Figure 9, is therefore based on the Toffoli gate reported in [26], with a proper output signal and all the important factors mentioned above.

4-2-Design of the proposed 2 × 2 Galois Field Matrix Multiplier
Following the design steps of a QCA systolic architecture given in Section 3-2, the proposed 2 × 2 multiplier over GF(2), built from the proposed TPE, is shown in Figure 10. It can easily be extended to higher dimensions while preserving all its advantages, such as the lowest cell count and no heat generation.
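The function computed by such a multiplier can be sketched at the logic level as follows. This is a behavioural model under stated assumptions, not the QCA layout of Figure 10: each Toffoli step stands in for one adder-based PE, the partial sum is passed from one step to the next, and the names `toffoli` and `gf2_matmul` as well as the sample matrices are hypothetical.

```python
def toffoli(a: int, b: int, c: int) -> tuple:
    """Standard GF(2) Toffoli gate: (a, b, c) -> (a, b, c XOR (a AND b))."""
    return a, b, c ^ (a & b)


def gf2_matmul(A, B):
    """Multiply two square 0/1 matrices over GF(2), one Toffoli step per add-multiply."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            partial = 0  # partial sum entering the first PE of this chain
            for k in range(n):
                # Adder-based flow: the updated partial sum moves on to the next PE.
                _, _, partial = toffoli(A[i][k], B[k][j], partial)
            C[i][j] = partial
    return C


# Sample 2 x 2 operands chosen only for illustration.
A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]
print(gf2_matmul(A, B))  # [[0, 1], [1, 1]]
```

The loop bounds depend only on the matrix size, so the same model covers larger dimensions without change.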

5-Conclusion
This article has proposed a reversible systolic array design using QCA, based on binary GF(2) and on an adder-based TPE selected as the most optimized Toffoli gate with respect to power dissipation and minimum output clocking. The design can be used in complicated control systems: as a pipelined systolic array architecture it offers increased computational capability; as a nano-based device it offers greatly reduced feature size and no power dissipation; and as a reversible gate design it offers no heat generation and no information loss. Possible future work includes extending the Toffoli gate design to an accumulator-based PE to produce a matrix multiplier, and extending the QCA systolic array design to multidimensional topographies.

Figure 1. (a) A QCA cell, (b) the two logic states, from left to right: 0 and 1, (c) a majority gate QCA design, (d) a robust NOT gate QCA design, (e) the four-phase clocking scheme

Figure 3. (a) Types of systolic array topographies: linear, rectangular, hexagonal, triangular, (b) Kung cell as a PE in a hexagonal systolic array

 2 × 2
Logical Formula of single instruction  Data Graph scheme  PE type and design according to the necessities in propagating output (Adder Based) or memorizing it (Accumulator based) Majority based schematic design of single instruction Matrix QCA Design  Extended QCA Designs

Figure 6. (a) Single instruction function, (b) data graph, (c) selected PE, (d) the schematic based on the majority gate, (e) design of the 2 × 2 matrix multiplier based on a QCA systolic array

3-2-2-Case Study: Galois Field Multiplier
GF(2^m) is a finite field with 2^m elements in which addition and multiplication are performed modulo an irreducible polynomial of degree m over GF(2). In GF(2), which consists only of the elements 0 and 1, addition and multiplication of any pair of field elements reduce to XOR and AND, respectively.
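The correspondence between GF(2) arithmetic and the XOR and AND operations can be confirmed exhaustively. The helper names in this short check are hypothetical.

```python
from itertools import product


def gf2_add(a: int, b: int) -> int:
    """Addition in GF(2): ordinary integer addition reduced modulo 2."""
    return (a + b) % 2


def gf2_mul(a: int, b: int) -> int:
    """Multiplication in GF(2): ordinary integer multiplication reduced modulo 2."""
    return (a * b) % 2


# Compare with the bitwise operators on every pair of field elements.
for a, b in product((0, 1), repeat=2):
    assert gf2_add(a, b) == a ^ b
    assert gf2_mul(a, b) == a & b
print("GF(2) addition is XOR and GF(2) multiplication is AND")
```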

Figure 7. (a) Single instruction function, (b) data graph, (c) selected PE, (d) the schematic based on the majority gate, (e) design of the 2 × 2 Galois Field multiplier based on a QCA systolic array

Figure 9. Design of the reversible TPE and its stable output

Figure 10. Design of the proposed 2 × 2 Galois Field matrix multiplier