# Low-Power and Compact Analog-to-Digital Converter Using Spintronic Racetrack Memory Devices

Qing Dong, Student Member, IEEE, Kaiyuan Yang, Student Member, IEEE, Laura Fick, Student Member, IEEE, David Fick, Member, IEEE, David Blaauw, Fellow, IEEE, and Dennis Sylvester, Fellow, IEEE

Abstract: Current-induced domain wall (DW) motion in spintronic racetrack memory promises energy-efficient analog computation using compact magnetic nanowires. This paper explores the feasibility of an analog-to-digital converter (ADC) based on current-induced DW motion and introduces an n-bit ADC using n racetrack magnetic nanowires. With each magnetic nanowire having a different configuration granularity, an n-bit binary or Gray code is generated simultaneously. The proposed ADC structure achieves 21 fJ/conversion-step at 20 MHz with an area of about 10 $\mu$m<sup>2</sup>. The racetrack ADC is suitable for applications requiring dense ADC arrays, such as image sensors. This paper describes one ultrahigh-speed digital pixel sensor imaging system benefiting from the racetrack ADC.

Index Terms: Analog-to-digital converter (ADC), domain wall (DW), emerging circuits and devices, image sensor, racetrack memory, spin-transfer torque, spintronic.

### I. INTRODUCTION

Low-power and compact data converters are an essential part of sensor nodes as the link between the sensor and data processing. Also, in high-speed massively parallel sensors such as imagers, each photodiode includes a moderate-accuracy but compact analog-to-digital converter (ADC) for parallel data conversion [1]. CMOS implementations of such data converters face two challenges. The first is the difficulty of integrating ADCs with sensors in every pixel or channel due to the large area of analog circuits. This is exacerbated by poor scaling of analog circuits in CMOS due to process variation in advanced technologies [2]. The second is the high static power of analog data converters [3].
As a result, most high-speed image sensors use only column-parallel ADCs in their sensor array to balance area/power and performance [4], [5]. However, image sensors with much higher frame rates are in demand for emerging imaging applications such as integral machine vision, time-of-flight imaging, and 3-D high-definition television.

Manuscript received June 4, 2016; revised August 24, 2016; accepted October 10, 2016. Date of publication December 2, 2016; date of current version February 22, 2017. This work was supported by STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA. (Corresponding author: Q. Dong.) The authors are with the University of Michigan, Ann Arbor, MI 48105 USA (e-mail: qingdong@umich.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TVLSI.2016.2622224

Recently, a number of new materials and novel devices have been proposed to replace MOS transistors in specific applications. The discovery of current-induced domain wall (DW) motion has driven the invention of several spintronic devices that hold promise for nonvolatility, high endurance, high density, and low power [6], [7]. With perpendicular magnetic anisotropy (PMA) in CoFeB/MgO structures, multiple magnetic domains separated by DWs can be maintained in one nanowire for multibit nonvolatile memory [8]–[11]. A four-terminal device, mCell, using DW switching was proposed for logic computation [12]. DW neurons have also been reported as suitable for current comparators in SAR ADCs [3], [13], [14]. However, the DW neuron ADC does not fully leverage spintronic devices, as most parts (including the DAC) are still implemented in CMOS. This paper presents a novel spintronic-based data converter that leverages the nonvolatility, low power, and high density of spintronic devices.
We propose an n-bit racetrack ADC structure using n magnetic nanowires with a different configuration granularity for each bit [15]. The current-steering DW motion can convert n bits of binary or Gray code in parallel. Exploiting the nonvolatility of racetrack memory, the converted data are stored intrinsically, eliminating the need for additional memory cells and saving the time spent on writing data. As most components are spintronic devices, the design achieves compact area, scalability, low static power, and no leakage. Compared to a conventional low-power CMOS SAR ADC, the proposed racetrack ADC can achieve a 1000× smaller area with a comparable energy-efficiency figure of merit (FOM). We also describe the potential application of the racetrack ADC to a high-speed imaging system using the 8-bit racetrack ADC as an in-pixel ADC. Results indicate that the frame rate is increased by 50× compared to a CMOS digital pixel sensor (DPS) while retaining high fill factors as in analog pixel sensors (APSs).

Section II presents the model we developed to describe the relationship between steering current and DW motion velocity in racetrack memory to explore the potential of data conversion with DW motion. Section III shows the basic circuit structure and operations of the proposed ADC. Section IV analyzes the influence of variation, noise, stability, and reliability on the converter. Section V demonstrates simulation results of the proposed design and analyzes the tradeoffs among area, power, performance, and scaling of the ADC. Section VI describes the high-speed imaging system with the proposed racetrack ADC. The paper is concluded in Section VII.

### II. RACETRACK MEMORY DEVICES

The racetrack memory device is a magnetic nanowire comprising multiple magnetic domains separated by DWs [8]–[11]. A single data bit is stored as the local spin polarity within the magnetic strip at a given position. DWs can be shifted along the magnetic strip by an induced horizontal charge current. Fig. 1 shows the structure of a PMA racetrack memory consisting of one magnetic nanowire and two MTJ heads as the read and write ports. Given a current pulse I<sub>write</sub> on the write MTJ, the magnetic domain beneath that MTJ in the magnetic nanowire will be nucleated through spin-transfer torque. At the same time, the horizontal shift current I<sub>shift</sub> can move the data along the magnetic strip. By alternately asserting write current and shift current, the racetrack can store a sequence of DWs. Previous works have explored the potential of building hundreds of DWs in one magnetic nanowire [8], [11]. With such a high density, the area efficiency can be as high as 1 F<sup>2</sup>/bit [10], providing much higher density than other nonvolatile memory technologies. The polarization of the magnetic domain beneath the read MTJ can be detected by sensing the resistance, which is affected through the tunnel magnetoresistance effect. Reported MTJ read access times for a megabyte-scale array are as fast as 4 ns [16]. Moreover, this device can be implemented above CMOS transistors in the backend process, reducing the total area and interconnection delay.

Fig. 1. Structure of a racetrack memory device consisting of a magnetic nanowire and two MTJ heads as the read and write ports. The racetrack nanowire is manufactured on top of MOSFETs, avoiding planar area overhead.

Fig. 2. (a) Threshold current density decreases with reduced cross-sectional area [3], [17]–[19]. (b) Once current density exceeds the threshold, DW motion velocity linearly increases with higher current density based on the compact model in [10].

Spin-dependent electron scattering can cause the charge current through the magnetic nanowire to be spin polarized. When a spin-polarized electron crosses a DW, its spin polarization will rotate 180° from one magnetic domain to the other.
To maintain total spin-angular momentum, the change in the current spin polarization will be transferred to the local magnetization and create a spin torque, causing the DW to move [8]. The DW moves along the flow of spin-polarized electrons, which is opposite to the direction of the charge current.

Both theoretical and experimental studies [3], [17]–[19] have shown that the threshold charge current density for DW motion in a PMA nanowire depends on the nanowire cross-sectional area. According to the adiabatic spin-transfer torque model, the threshold current density decreases with reduced width and thickness as shown in Fig. 2(a) [3], [17]–[19]. When the driving current exceeds the threshold current, the DW moves along the nanowire. Higher driving current can generate higher DW motion velocity.

Fig. 3. A 3-bit racetrack converter consists of three magnetic nanowires. After fabrication, each nanowire is configured with a different DW granularity by current injection and represents an individual bit. During data conversion, the current under test flows through the nanowire, and all DWs move together. The moving distance is linearly proportional to the current under test. After conversion, the converter needs to be reset for the next cycle.

Using the compact model in [10], DW velocity can be described as

$$v = \frac{\beta \mu P}{\alpha e M_s} (J - J_{\text{th}}) \tag{1}$$

where $\beta$ is the nonadiabatic coefficient, $\mu$ is the Bohr magneton, $P$ is the spin polarization percentage of the tunnel current, $\alpha$ is the damping constant, $e$ is the elementary charge, $M_s$ is the demagnetization field, $J$ is the current density, and $J_{\text{th}}$ is the threshold current density. Velocity $v$ can be increased with a higher current density [Fig. 2(b)]. Within a certain current range, the relationship between $v$ and $J$ is quite linear [6], [7], [13]. Given this linear characteristic, current-induced DW motion is suitable for analog computation and data conversion in particular.
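As a rough numerical illustration of Eq. (1), the sketch below evaluates the linear velocity model above and below threshold. The function names and every material parameter value (`beta`, `p`, `alpha`, `m_s`) are placeholders chosen for illustration only; they are assumptions, not values taken from this paper or from [10]. The threshold of 6.7 MA/cm² is likewise used only as an example operating point.

```python
# Illustrative sketch of Eq. (1); all parameter values below are assumptions.
MU_B = 9.274e-24       # Bohr magneton (J/T)
E_CHARGE = 1.602e-19   # elementary charge (C)


def lumped_mobility(beta, p, alpha, m_s):
    """Return the prefactor beta*mu*P / (alpha*e*M_s) of Eq. (1), in m^3/C."""
    return beta * MU_B * p / (alpha * E_CHARGE * m_s)


def dw_velocity(j, j_th, k):
    """DW velocity (m/s) for current density j (A/m^2).

    Below or at the threshold j_th the wall does not move; above it the
    velocity grows linearly with (j - j_th), as in Eq. (1).
    """
    return k * (j - j_th) if j > j_th else 0.0


def ma_per_cm2(x):
    """Convert MA/cm^2 to A/m^2 (1 MA/cm^2 = 1e10 A/m^2)."""
    return x * 1e10


if __name__ == "__main__":
    # Hypothetical material parameters, for illustration only.
    k = lumped_mobility(beta=0.02, p=0.7, alpha=0.02, m_s=1e6)
    j_th = ma_per_cm2(6.7)
    for j in (5.0, 6.7, 20.0, 38.0):
        print(f"J = {j:5.1f} MA/cm^2 -> v = {dw_velocity(ma_per_cm2(j), j_th, k):.2f} m/s")
```

The only property the converter relies on is the linearity above threshold: doubling the excess current density $(J - J_{\text{th}})$ doubles the velocity, whatever the actual prefactor turns out to be for a given material stack.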
### III. PROPOSED RACETRACK CONVERTER

### A. Overview of Racetrack Converter Operation

1) Configuration: Fig. 3 shows the structure of the proposed racetrack converter with 3 bits as an example. An n-bit converter requires n nanowires (Fig. 4). Each nanowire will have $2^n$ DWs, one read MTJ, and one write MTJ. Each nanowire will be configured differently such that each generates a single bit, from LSB to MSB. Then, the polarization of the magnetic domains beneath the n read MTJs represents the digitized value from 0 to $2^n - 1$. This configuration is done only once after fabrication, using the write MTJ port. A positive current pulse on the write MTJ makes the magnetic domain beneath that MTJ spin-polarized in the downward direction, representing data 0, and a negative current pulse generates an upward spin-polarized magnetic domain (data 1). With a sequence of alternating write and shift current pulses, the corresponding data can be stored on the nanowires one by one. Altogether, such a design can store 256 × 8 bits on 8 magnetic nanowires to form an 8-bit racetrack data converter. According to [8] and [20], the DW write pulse is about 10 ns using $1.2 \times 10^8$ A/cm<sup>2</sup> vertical current. Therefore, the write energy is less than 1 pJ/bit [21]. As we only write once, the write power and latency are not important in this application.

Fig. 4. Data conversion scheme for an *n*-bit racetrack converter.

2) Data Conversion: As shown in Fig. 3, the input current under measurement flows through the nanowire in the opposite direction of the reset current. In this case, all the DWs move right simultaneously. As the current under measurement has the same value for each nanowire, the DWs in different nanowires move at the same velocity. After a fixed time T, the DWs will stop.
The distance X that DWs move can be expressed as

$$X = v T = \frac{\beta \mu P}{\alpha e M_s} \left( \frac{I}{\text{Area}} - J_{\text{th}} \right) T. \tag{2}$$

The distance X is linearly proportional to the current I or the time T, which makes racetrack nanowires promising for both current-to-digital and time-to-digital converters.

3) Read: The polarization of the magnetic domain beneath the read MTJs stores the digitized value of the distance a DW has moved (ranging from 0 to $2^n - 1$). As the read MTJ head is upward spin-polarized, the MTJ resistance with a downward spin-polarized nanowire domain could be 2–3× higher than that with an upward polarized nanowire domain. Therefore, by sensing the resistance of the read MTJ head above a given nanowire, a 0 or 1 state can be defined. Using sense amplifiers, the data can be read out as a digital value. In a more simplified design, write, reset, and read can be performed with a single universal MTJ in the position of the read MTJ in Fig. 3. The writing pulse in that case is applied to the single MTJ to configure the nanowires. With the input shift current flowing in the opposite direction, the data are left-shifted one by one, rather than right. Reset detection is then performed by sensing all the 0s as the beginning point using the universal MTJ.

Fig. 5. Racetrack converters function similarly to a combination of data converter and nonvolatile memory. Interface circuits to CMOS must provide current to the converter and sense the current at its output.

4) Reset: After conversion, the write MTJ will subsequently perform reset point detection. Horizontal reset current flows through the nanowire to cause DW motion. When all the DWs move back to their original position, the write MTJ resistances undergo their resistance transitions, which are detected by sense amplifiers. Once the resistance change is sensed, the shift current is cut off, signifying the completion of the reset phase.
Sensing current is much smaller than the writing current and the threshold current for DW motion, and therefore does not induce any change in the nanowire. If the latency of the sense amplifier is too long, over-reset or under-reset might happen. To address this problem, a two-step reset should be used: 1) employing high current to quickly shift all DWs back with sense amplifiers coarsely detecting all 0s and 2) using small current to move each nanowire slowly with each sense amplifier finely verifying the reset point. After the first step, most of the nanowires will be reset to the original position, but some might be one bit ahead or behind. As each nanowire has its own sense amplifier and fine reset control, the second step can carefully move each nanowire back to its origin separately. The whole verification process can take less than 10 ns, which is only 20% of the whole cycle time (50 ns). Both the global coarse reset control and the local fine reset control are simple logic gates (Fig. 9).

### B. Racetrack ADC

Most spintronic devices are suitable for current-mode computation because their characteristics have a direct mathematical relationship with current. The operation of mLogic [12], all-spin logic [22], and the DW neuron [3], [13], [14] are all based on current. Our proposed racetrack converter is fully compatible with these current-mode spintronic logic devices. By combining it with these other approaches, more complex current-steering mixed-signal systems can be implemented.

Fig. 6. (a) Schematic of 4T all-pMOS V-I converter and (b) simulation results of its Iout-Vin characteristics.

However, most CMOS modules remain voltage based. To realize integration, interfaces between CMOS and the racetrack converter are needed. As shown in Fig. 5, the racetrack converter includes the functionality of both a data converter and nonvolatile memory. The interface circuits with CMOS need to provide current at the input of the converter and sense current at its output.
Thus, the interface circuits mainly include sense amplifiers and a voltage-current (V-I) converter for the ADC.

For the ADC, the front-end interface should provide a current linearly dependent on the input voltage. Fig. 6(a) shows a 4T all-pMOS V-I converter with the racetrack nanowire as the load. T0, T1, and T2 in the first stage make up an attenuator (an amplifier with gain less than 1). The output voltage of the attenuator, V0, linearly follows the change in the input voltage Vin with opposite phase. The range of V0 is smaller than that of Vin, forcing T3 to operate in the velocity saturation region. As the electrical characteristics of a racetrack nanowire mimic a resistor, the current through the nanowire also changes linearly with input voltage. Moreover, the lower limit of the current range is not 0, as T3 operates in the velocity saturation region across the full input voltage range. We can design this lowest current to compensate for the threshold current of DW motion [23]. Fig. 6(b) shows the simulation results of the V-I converter's Iout-Vin characteristics. The transconductance of this 4T V-I converter is quite linear, with an adjusted R-square value of 0.99996. Furthermore, this V-I converter is built using only pMOS devices, which increases its tolerance to process corners. In addition, as T3 acts as a source follower, the current is insensitive to the ground voltage of T3. Therefore, VSSH can be raised to achieve lower power.

Fig. 7. (a) Midpoint meta-stability problem. (b) Solution with Gray coding. (c) Schematic of current sense amplifier. (d) Sense amplifier current offset simulation results.

The racetrack nanowire itself is a type of nonvolatile memory. After conversion, data can be stored immediately in the nonvolatile racetrack memory without area and timing overhead. With traditional CMOS current sense amplifiers [Fig. 7(c)], stored data can be accessed. There is one situation to carefully consider. As shown in Fig. 7(a), when a DW moves to the midpoint beneath the read MTJ, the resistance difference between the read MTJ and the reference MTJ (with average resistance value) becomes very small. This may cause meta-stability in the sense amplifier and induce errors. Further, if most bits are approaching their flipping point, the error becomes significant. To avoid multibit flipping, we propose using Gray code instead of binary code to ensure that only one bit changes at a time. Fig. 7(d) shows the current offset distribution of the CMOS current sense amplifier from a 10k-sample Monte Carlo simulation. The standard deviation of the current sense amplifier offset is 0.7 $\mu$A, much smaller than the sensing current range (from 20 to 50 $\mu$A).

Moreover, we exploit a self-reference sensing scheme to narrow the meta-stability region. As shown in Fig. 8(a), an additional (reference) MTJ is placed next to the read MTJ. This additional MTJ serves as a reference with the opposite phase to the read MTJ. Take the LSB Gray code nanowire as an example. If the reference MTJ is placed a 2-unit distance away from the read MTJ on top of the same nanowire, the DW polarity beneath the reference MTJ will always be complementary to that of the read MTJ. In the LSB Gray code nanowire, bit 2 stays complementary to bit 0 (2 − 2) and bit 4 (2 + 2). If the read MTJ resistance is high, that of the reference MTJ is low. When the resistance of the read MTJ approaches its middle value, the resistance of the reference MTJ similarly approaches its middle value from the opposite direction. As illustrated in Fig. 8(b), this technique shortens the meta-stability window of the sense amplifier by 50%.

Fig. 8. (a) Self-reference sensing using the LSB Gray code nanowire as an example. If the reference MTJ is placed a 2-unit distance away from the read MTJ, the DW polarity beneath the reference MTJ will always be complementary to that of the read MTJ. (b) Sensing dead zone can be narrowed by 2× using self-reference sensing.
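The Gray-code configuration and the self-reference spacing can be checked with a short sketch. It builds the bit pattern stored along each nanowire of a 3-bit converter and tests whether a reference MTJ placed a given number of units away always sees the complementary bit. The helper names are hypothetical, and the check treats each pattern as cyclic (wrapping around at the ends of the nanowire), which is an assumption made here for simplicity rather than something the text specifies.

```python
def gray_code(i):
    """Return the reflected binary Gray code of integer i."""
    return i ^ (i >> 1)


def nanowire_patterns(n_bits):
    """patterns[b][k] is bit b of the Gray code for value k.

    Row b is the sequence of domain polarities configured along the
    nanowire that produces output bit b (b = 0 is the LSB).
    """
    return [
        [(gray_code(k) >> b) & 1 for k in range(2 ** n_bits)]
        for b in range(n_bits)
    ]


def is_complementary(pattern, offset):
    """True if every domain differs from the one `offset` units away.

    The pattern is treated as cyclic (an assumption of this sketch).
    """
    n = len(pattern)
    return all(pattern[k] != pattern[(k + offset) % n] for k in range(n))


if __name__ == "__main__":
    patterns = nanowire_patterns(3)
    for b, row in enumerate(patterns):
        print(f"bit {b}: {row}")
    print("LSB, offset 2:", is_complementary(patterns[0], 2))
    print("bit 1, offset 4:", is_complementary(patterns[1], 4))
    print("bit 2, offset 4:", is_complementary(patterns[2], 4))
```

Under these assumptions the LSB pattern is complementary at a 2-unit spacing and the two higher bits at a 4-unit spacing, and consecutive Gray codes differ in exactly one bit, which is the property that avoids multibit flipping at a midpoint.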
For the other bits in 3-bit Gray code, a 4-unit distance will keep results always complementary. Neither Gray coding nor self-reference sensing induces area or delay overheads.

Fig. 9 shows the block diagram of an 8-bit racetrack ADC including a shared V-I converter, eight racetrack nanowires, and eight sense amplifiers. Write, shift, and reset switches are also shown in the figure. To minimize the mismatch among these nanowires, an offset compensation circuit is included for each nanowire. The main idea of the method is to add a tunable resistor connected in series with each nanowire. The simplest implementation of this tunable resistor is ratioed linear-region transistors connected in parallel. The number of connected on-state transistors can be tuned digitally according to the threshold current mismatch of the nanowire. Calibration is required to determine the value of the tuning bits.

### IV. UNCERTAINTY ANALYSIS AND DISCUSSION

### A. Process Variation

Both CMOS process variation and racetrack nanowire variation will influence data conversion accuracy. For the CMOS part of the design, the 4T V-I converter is tolerant to systematic variation due to its all-pMOS implementation. In particular, systematic variation creates only a minor offset to the V-I conversion curve with little impact on slope and linearity. The offset can be canceled using a simple N-well bias compensation method. However, the V-I converter remains sensitive to random variations induced by random dopant fluctuation and line edge roughness (LER). We upsize the pMOS transistors to alleviate the influence of these random variations. In addition, random variation also leads to sense amplifier offset. Device sizing and/or autozero calibration techniques [24] can be employed to enhance mismatch tolerance.

For the racetrack nanowire itself, the major sources of variation include: 1) MTJ layer area; 2) tunneling oxide thickness; and 3) cross-sectional area of the nanowire.
Both 1) and 2) affect MTJ resistance [25] and may lead to read failure. The proposed self-reference sensing alleviates this influence. MTJ area can also impact the dynamic spin-polarization characteristic during configuration; this can be ameliorated by extending the write time and performing verification after configuration to ensure successful spin polarization of each domain. The nanowire cross-sectional area can affect the threshold current density for DWs to move and shift the $v$-versus-$J$ curve of the DWs. LER-induced random variation of the cross-sectional area is potentially the most severe variation for the proposed racetrack converter. Fortunately, similar to threshold voltage mismatch in CMOS devices, the threshold current mismatch arising from random cross-sectional area variation can be alleviated by variation-aware circuit design techniques (current offset compensation methods) or postsilicon calibration techniques. We propose one mismatch compensation method, as shown in Fig. 9.

This paper focuses on the basic structure of a racetrack nanowire-based converter architecture; however, advanced design techniques commonly used in CMOS converters, such as time-interleaving [26], can also be applied to improve its conversion accuracy or performance. Moreover, thanks to the extremely small area, an additional bit can be included to compensate for potential accuracy losses arising from variation, because the racetrack ADC requires only one extra nanowire and one extra sense amplifier per added bit. For an 8-bit ADC, for example, the ninth bit costs only about 12.5% more area and power, and the resulting 9-bit racetrack ADC can realize the 8-bit function with better accuracy than a plain 8-bit design. By contrast, adding one bit to a CMOS ADC may cost about 100% more area, because the CMOS ADC area has a quadratic relationship with the number of bits.

### B. Noise

The proposed racetrack converter works in current mode during data conversion. Compared with a conventional voltage-mode CMOS converter, current-mode computation suffers less from noise [27]. Moreover, during conversion a constant current flows through the nanowires without any switching. Therefore, the proposed racetrack converter is immune to switching-related noise, making thermal noise the dominant noise source during data conversion. Both the V-I converter and the racetrack nanowires contribute to thermal noise. Spintronic devices generate much less thermal noise than MOS transistors because of their smaller resistance [28]. Based on previous analysis, the total integrated thermal noise current of the converter could potentially be three orders of magnitude smaller than the input current. The simulated noise standard deviation (both thermal and flicker) of the V-I converter and nanowire together is $\sim 0.42$ mV.

Fig. 9. Eight-bit ADC block diagram and offset compensation method.

### C. Stability and Reliability

Using PMA magnetic material, the current-driven motion of a DW is not sensitive to pinning, local magnetic fields, or temperature [6], [8]. A 10-year retention at 150 °C can be achieved, and endurances above $10^{10}$ cycles have been reported with $10^9$ A/cm<sup>2</sup> write current density in 90 nm technology, which demonstrates high reliability [7], [19], [29], [30]. Thanks to the low resistance of the nanowire ($\sim$10 k$\Omega$ for 3.5 nm $\times$ 30 nm $\times$ 16 $\mu$m [12]), the joule heating power is less than 10 $\mu$W, comparable to CMOS. This combination of high endurance and excellent retention indicates that the technology is a good match for high-sampling-rate nonvolatile data conversion. Moreover, as converted data will be sent to a processing module immediately after conversion, there are no concerns with thermal stability-related retention.

### V. SIMULATION RESULTS AND ANALYSIS

We build compact Verilog-A models of the MTJ and racetrack nanowire based on published experimental data [6], [7], [9], [10]. The dimensions of each nanowire are designed to be 3.5 nm $\times$ 30 nm $\times$ 16 $\mu$m, compatible with a 32 nm technology. Co-simulation with CMOS circuits (a commercial 32 nm SOI technology) is performed with a SPICE simulator. Fig. 10 shows the layout design of the 8-bit ADC. The CMOS circuits and nanowires are stacked and their areas are matched.

Fig. 10. Layout of the 8-bit ADC with 10 $\mu$m<sup>2</sup> total area.

Fig. 11. Simulated data conversion of the ADC.

Fig. 11 shows data conversion simulation results of an 8-bit racetrack ADC. The ADC input voltage range is [0, 0.9 V] with a 3.52 mV LSB. The ADC has a variable input voltage with a fixed sampling time for the DWs to move. The V-I converter changes the ADC input voltage to a current for data conversion. The current range is 7–40 $\mu$A (6.7–38 MA/cm²) for this 32 nm technology. The V-I converter can tune the output current range to guarantee that the DW velocity increases linearly with higher current density. In state-of-the-art experimental results, the current density needed to move the DW by 250 nm within 2 ns is 18 MA/cm² in 90 nm technology [7], [29]. Moreover, [7] experimentally shows a quite linear relationship between DW velocity and current density over the range from 18 to 50 MA/cm².

Fig. 12. (a) Power and area. (b) Breakdown of each ADC component. Racetrack nanowires consume the most power, though area overhead can be ameliorated by placement above MOSFETs.

TABLE I
COMPARISON WITH RECENT 8-bit LOW-POWER CMOS ADCS WITH COMPARABLE SAMPLING RATES

| | CMOS ADC [36] | CMOS ADC [37] | Racetrack ADC |
|--------------------------|-------|--------|---------|
| Technology (nm) | 90 | 40 | 32 |
| Sample Rate (MHz) | 10 | 20 | 20 |
| Resolution (b) | 8 | 8 | 8 |
| Power (µW) | 26.3 | 84.9 | 96.5 |
| FOM (fJ/conversion-step) | 12 | 19.2 | 21 |
| SNDR (dB) | 48.50 | 44.90 | 46.21 |
| ENOB | 7.70 | 7.17 | 7.38 |
| Area (mm²) | 0.021 | 0.0153 | 0.00001 |

As the threshold current density decreases with reduced cross-sectional area [17]–[19], a threshold current density below 6.7 MA/cm² is achievable for 32 nm technology, as mentioned in [3] and [12]–[14]. Also, good linearity over the current density range from 6.7 to 38 MA/cm² in 32 nm technology should be achievable. The minimum DW velocity at 6.7 MA/cm² will be 1 m/s, with which a one-bit code (30 nm) can be shifted within one 20 MHz cycle.

Fig. 12 shows the power and area breakdown of each component. Among the three parts, the racetrack nanowires dominate power consumption, taking more than half of the total power. The nanowires operate continuously during conversion and possess resistor-like electrical characteristics. Sense amplifiers only operate for a short time (less than 1 ns) after conversion and remain in a low-power standby mode during conversion. Compared with the sense amplifiers, the V-I converter consumes more power yet less area. By raising VSSH, V-I converter power can be lowered. As racetrack nanowires can be placed on top of the MOS devices, they do not induce extra area overhead if carefully designed.

Table I shows the characteristics of a racetrack ADC in 32 nm technology. At 20 MHz, the total power is 96.5 $\mu$W. Furthermore, the area is only 10 $\mu$m<sup>2</sup>, which is three orders of magnitude smaller than state-of-the-art CMOS ultralow-power SAR ADCs with comparable FOMs ($\sim$20 fJ/conversion-step).

Racetrack ADC power consumption is input dependent. As shown in Fig. 13(a), higher input voltages generate higher currents through the nanowires and V-I converters, consuming higher power. The average power also increases with a higher sample rate of the ADC [Fig. 13(b)].
However, because of the uncertain DW velocity linearity at ultrahigh currents, we consider a modest operating frequency of 20 MHz to ensure reliability. With a wider linear region of DW velocity, higher sampling rates can be achieved. Another advantage of the proposed racetrack ADC is that the total area and power increase linearly with resolution rather than exponentially [Fig. 13(c)]. Adding one bit requires only one additional magnetic nanowire and one sense amplifier.

Racetrack ADCs also benefit from the significant scalability of spintronic devices. With technology scaling [31], [32], the total nanowire length can be shortened, which will lower the velocity required to achieve the same sample rate. The cross-sectional area will also shrink, lowering the threshold current density of the DW. As DW motion is based on the current density, a smaller cross-sectional area also translates to a smaller current needed to achieve the same velocity. Therefore, to realize a constant sampling rate, the average power reduces cubically with scaling [Fig. 13(d)].

There are four major factors affecting the ADC nonlinearity: 1) nonlinearity of the DW velocity versus current density; 2) mismatch of the nanowires; 3) sense amplifier offset; and 4) nonlinearity of the V-I converter. For 1), the nonlinearity of the DW velocity versus current density can deteriorate the DNL. However, it is hard to estimate this nonlinearity accurately from the sparse published experimental data. With a compact model, it is ideally linear. For 2), nanowire mismatch will also shift the DNL. According to [33] and [34], the standard deviation of DW mismatch can be approximately 5%. Fortunately, with the proposed mismatch-compensation method (Fig. 9), the nanowire mismatch can be reduced to less than 2%, affecting DNL with a 2% variance. For 3), the sense amplifier offset will shift each code randomly. Using Monte Carlo simulation, the standard deviation of the current offset is 0.7 $\mu$A.
With 0.1 V across the read MTJ and the reference MTJ, the current under sensing ranges from 20 to 50 $\mu$A. Thanks to our self-reference technique, the effective sigma offset is 0.35 $\mu$A, which can shift DNL by 1.2% LSB. For 4), according to the simulated results, the nonlinearity of the V-I converter will contribute a 5 nA shift on average, which is about 4% LSB. In total, the standard deviation of the DNL shift is 7.2% LSB. We also performed noise simulation of both the V-I converter and the nanowires (modeled as resistors). The standard deviation of the noise (both thermal and flicker) is 0.42 mV, which is 12% LSB. Using the method in [35], the SNDR is 46.21 dB with 12% LSB noise and 7.2% DNL shift. The ENOB is 7.38.

Fig. 13. (a) Relationship between power and input voltage. (b) Average power increases with higher sample rate. (c) Total area and power increase linearly with more bits. (d) Average power reduces cubically with technology scaling.

### VI. HIGH-SPEED IMAGE SENSOR WITH RACETRACK ADCs

High-speed imaging systems employing in-pixel ADCs, also known as DPSs, have several advantages over the widely used conventional APS architecture with column-wise ADCs, including much higher speed, better scalability, and less noise (read-related column fixed-pattern noise and column readout noise). In particular, frame rate can be improved by more than $10 \times$ over an APS with column-based ADCs. With in-pixel ADCs, only digital data are read out, which is faster and consumes lower power than reading analog values followed by conversion. However, the major bottleneck limiting the application of CMOS DPSs is the large pixel size and low fill factor due to the area overhead of the ADC and memory. Reference [1] reported a high-speed DPS with a per-pixel single-slope moderate-accuracy ADC and 8-bit 3T DRAM. The dynamic range and frame rate are greatly enhanced with the DPS architecture, yet the pixel area is unreasonably large and the fill factor low [4], [5].
The proposed racetrack ADC combines an ADC and nonvolatile memory in an extremely compact area, making it well suited to the DPS image architecture, which commonly relies on a CMOS moderate-accuracy ADC and separate memory. Fig. 14 shows the DPS block diagram with the proposed racetrack ADC in each pixel. Unlike a CMOS single-slope ADC, which requires an analog ramp voltage from a peripheral DAC and write-in data for memory, a racetrack ADC requires only CLK from a peripheral block during conversion, simplifying the required peripheral circuits and alleviating noise and I-R drop (Fig. 15).

The system structure is similar to a typical nonvolatile memory bank. Sense amplifiers are placed at the bottom of the array (Fig. 14) and shared by the array. Therefore, their area is not included in the pixel cell. Both read and configuration operations are performed row by row, as in a nonvolatile memory. During read, when one row is accessed, its read MTJs are connected to the shared sense amplifiers to read out the data. After one row has been read out, the address changes to activate the next row. After the whole array has been read, all racetrack nanowires need to be reset for the next conversion. Configuration is also performed row by row. When one row is activated, the BLs are connected to each write MTJ's header through pMOS switches. Write current flows through the BLs to the write MTJ and flips the polarity beneath it, and then shift current flows through the WL with pMOS switches. Conversion is done inside every pixel at the same time. Both the V-I converter and the access transistors can be implemented with only pMOS devices, further minimizing the required area because large N-P well spacing is not needed.

Fig. 14. DPS block diagram implemented with racetrack ADC.

Fig. 15.
Digital pixel cell comparison. Unlike CMOS single-slope ADCs, the racetrack ADC does not require analog ramp voltage and write-in data. Readout can be done with shared sense amplifiers like memory readout.

Moreover, the racetrack nanowires can be placed on top of the pMOS transistors. In this way, the nanowires do not induce extra area overhead. Fig. 16 illustrates the layout implementation of $2\times3$ pixels. The nanowire is somewhat long but very narrow; hence, $1\times3$ pixels can be arranged together to match the nanowire length, with $3\times8$ nanowires placed above the access pMOS transistors. In this arrangement, area is still dictated by the transistors rather than the nanowires. Thus, the fill factor is significantly improved over CMOS alone.

Table II compares CMOS APS and DPS image sensors with a racetrack DPS image sensor. Sensor fill factor matches CMOS APS and is about $3\times$ better than CMOS DPS. Frame rate is improved by $50\times$ with lower power consumption. The proposed racetrack ADC is very promising for this high-speed imaging application.

Fig. 16. Layout implementation of $2 \times 3$ pixels. Racetrack nanowires can be placed on top of the access transistors.

TABLE II
COMPARISON BETWEEN CMOS APS, CMOS DPS, AND RACETRACK DPS

| | CMOS APS [4] | CMOS DPS [1] | Racetrack DPS |
|-----------------------|-----------------|-----------------|------------------|
| Sensor Fill Factor | 40% | 15% | 42% |
| Frame Rate (frames/s) | 3 500 | 10 000 | >500 000 |
| ADC Resolution (b) | 12 | 8 | 8 |
| ADC Conv. Time (ns) | 500 | 25 | 50 |
| Energy Per Frame (μJ) | 280 | 5 | 0.04 [16] |

### VII. CONCLUSION

We analyze the potential of current-induced DW motion for data converter applications and present an ADC design scheme based on racetrack magnetic nanowires. The 8-bit ADC can achieve 21 fJ/conversion-step, while the area is less than 10 $\mu$m<sup>2</sup>, which is 1000× smaller than state-of-the-art CMOS ADCs with similar energy efficiency.
The results indicate that racetrack converters hold promise for future low-power small-area applications requiring multiple ADCs. A high-speed racetrack ADC-based DPS image sensor system is also proposed to show one potential application of this design.

### REFERENCES

- [1] S. Kleinfelder, S. Lim, X. Liu, and A. El Gamal, "A 10000 frames/s CMOS digital pixel sensor," *IEEE J. Solid-State Circuits*, vol. 36, no. 12, pp. 2049–2059, Dec. 2001.
- [2] X. Li, B. Taylor, Y. Chien, and L. T. Pileggi, "Adaptive post-silicon tuning for analog circuits: Concept, analysis and optimization," in *Proc. IEEE ICCAD*, Nov. 2007, pp. 450–457.
- [3] M. Sharad, D. Fan, and K. Roy, "Low power and compact mixed-mode signal processing hardware using spin-neurons," in *Proc. IEEE ISQED*, Mar. 2013, pp. 189–195.
- [4] M. Furuta, Y. Nishikawa, T. Inoue, and S. Kawahito, "A high-speed, high-sensitivity digital CMOS image sensor with a global shutter and 12-bit column-parallel cyclic A/D converters," *IEEE J. Solid-State Circuits*, vol. 42, no. 4, pp. 766–774, Apr. 2007.
- [5] S. Lim, J. Lee, D. Kim, and G. Han, "A high-speed CMOS image sensor with column-parallel two-step single-slope ADCs," *IEEE Trans. Electron Devices*, vol. 56, no. 3, pp. 393–398, Mar. 2009.
- [6] D. Chiba *et al.*, "Control of multiple magnetic domain walls by current in a Co/Ni nano-wire," *Appl. Phys. Exp.*, vol. 3, no. 7, p. 073004, 2010.
- [7] S. Fukami *et al.*, "High-speed and reliable domain wall motion device: Material design for embedded memory and logic application," in *Proc. Symp. VLSI Technol.*, 2012, pp. 61–62.
- [8] L. Thomas *et al.*, "Racetrack memory: A high-performance, low-cost, non-volatile memory based on magnetic domain walls," in *Proc. IEEE IEDM*, Dec. 2011, pp. 24.2.1–24.2.4.
- [9] A. J. Annunziata *et al.*, "Racetrack memory cell array with integrated magnetic tunnel junction readout," in *Proc. IEEE IEDM*, Dec. 2011, pp. 24.3.1–24.3.4.
- [10] Y. Zhang, W. S. Zhao, D. Ravelosona, J.-O. Klein, J. V. Kim, and C. Chappert, "Perpendicular-magnetic-anisotropy CoFeB racetrack memory," *J. Appl. Phys.*, vol. 111, no. 9, p. 093925, 2012.
- [11] M. Sharad, C. Augustine, G. Panagopoulos, and K. Roy, "Spin-based neuron model with domain-wall magnets as synapse," *IEEE Trans. Nanotechnol.*, vol. 11, no. 4, pp. 843–853, Jul. 2012.
- [12] M. Sharad, D. Fan, and K. Roy, "Ultra low power associative computing with spin neurons and resistive crossbar memory," in *Proc. ACM/IEEE DAC*, May/Jun. 2013, pp. 1–6.
- [13] Q. Dong, K. Yang, L. Fick, D. Fick, D. Blaauw, and D. Sylvester, "Racetrack converter: A low power and compact data converter using racetrack spintronic devices," in *Proc. IEEE ISCAS*, May 2015, pp. 585–588.
- [14] H. Noguchi *et al.*, "A 250-MHz 256 b-I/O 1-Mb STT-MRAM with advanced perpendicular MTJ based dual cell for nonvolatile magnetic caches to reduce active power of processors," in *Proc. Symp. VLSI Circuits*, Jun. 2013, pp. C108–C109.
- [15] N. Ben-Romdhane, W. S. Zhao, Y. Zhang, J.-O. Klein, Z. R. Wang, and D. Ravelosona, "Design and analysis of racetrack memory based on magnetic domain wall motion in nanowires," in *Proc. IEEE/ACM NANOARCH*, Jul. 2014, pp. 71–76.
- [16] D. Morris, D. Bromberg, J.-G. Zhu, and L. Pileggi, "mLogic: Ultra-low voltage non-volatile logic circuits using STT-MTJ devices," in *Proc. ACM/IEEE DAC*, Jun. 2012, pp. 486–491.
- [17] S. Fukami, Y. Nakatani, T. Suzuki, K. Nagahara, N. Ohshima, and N. Ishiwata, "Relation between critical current of domain wall motion and wire dimension in perpendicularly magnetized Co/Ni nanowires," *Appl. Phys. Lett.*, vol. 95, no. 23, p. 232504, 2009.
- [18] A. Yamaguchi, K. Yano, H. Tanigawa, S. Kasai, and T. Ono, "Reduction of threshold current density for current-driven domain wall motion using shape control," *Jpn. J. Appl. Phys.*, vol. 45, no. 5A, pp. 3850–3853, 2006.
- [19] S. Fukami *et al.*, "Low-current perpendicular domain wall motion cell for scalable high-speed MRAM," in *Proc. Symp. VLSI Technol.*, 2009, pp. 230–231.
- [20] C. Augustine *et al.*, "Numerical analysis of domain wall propagation for dense memory arrays," in *Proc. IEEE IEDM*, Dec. 2011, pp. 17.6.1–17.6.4.
- [21] S. Fukami, M. Yamanouchi, S. Ikeda, and H. Ohno, "Domain wall motion device for nonvolatile memory and logic—Size dependence of device properties," *IEEE Trans. Magn.*, vol. 50, no. 11, Nov. 2014, Art. no. 3401006.
- [22] B. B. Aein, D. Datta, S. Salahuddin, and S. Datta, "Proposal for an all-spin logic device with built-in memory," *Nature Nanotechnol.*, vol. 5, pp. 266–270, Feb. 2010.
- [23] K. Shamsi, Y. Bi, Y. Jin, P.-E. Gaillardon, M. Niemier, and X. S. Hu, "Reliable and high performance STT-MRAM architectures based on controllable-polarity devices," in *Proc. IEEE ICCD*, Oct. 2015, pp. 343–350.
- [24] B. Giridhar, N. Pinckney, D. Sylvester, and D. Blaauw, "A reconfigurable sense amplifier with auto-zero calibration and pre-amplification in 28 nm CMOS," in *Proc. IEEE ISSCC*, Feb. 2014, pp. 242–243.
- [25] Y. Zhang, X. Wang, and Y. Chen, "STT-RAM cell design optimization for persistent and non-persistent error rate reduction: A statistical design view," in *Proc. IEEE ICCAD*, Nov. 2011, pp. 471–477.
- [26] W. C. Black and D. A. Hodges, "Time interleaved converter arrays," *IEEE J. Solid-State Circuits*, vol. 15, no. 6, pp. 1022–1029, Dec. 1980.
- [27] J. M. Musicer and J. Rabaey, "MOS current mode logic for low power, low noise CORDIC computation in mixed-signal environments," in *Proc. ACM ISLPED*, 2000, pp. 102–107.
- [28] M. J. Hall, V. Gruev, and R. D. Chamberlain, "Noise analysis of a current-mode read circuit for sensing magnetic tunnel junction resistance," in *Proc. IEEE ISCAS*, May 2011, pp. 1816–1819.
- [29] R. Nebashi *et al.*, "A content addressable memory using magnetic domain wall motion cells," in *Proc. Symp. VLSI Circuits*, 2011, pp. 300–301.
- [30] T. Suzuki *et al.*, "Low-current domain wall motion MRAM with perpendicularly magnetized CoFeB/MgO magnetic tunnel junction and underlying hard magnets," in *Proc. Symp. VLSI Technol.*, 2013, pp. T138–T139.
- [31] S. Fukami *et al.*, "Scalability prospect of three-terminal magnetic domain-wall motion device," *IEEE Trans. Magn.*, vol. 48, no. 7, pp. 2152–2157, Jul. 2012.
- [32] S. Fukami *et al.*, "20-nm magnetic domain wall motion memory with ultralow-power operation," in *Proc. IEEE IEDM*, Dec. 2013, pp. 3.5.1–3.5.4.
- [33] S. Motaman and S. Ghosh, "Adaptive write and shift current modulation for process variation tolerance in domain wall caches," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 24, no. 3, pp. 944–953, Mar. 2016.
- [34] R. Dorrance, F. Ren, Y. Toriyama, A. A. Hafez, C.-K. K. Yang, and D. Markovic, "Scalability and design-space analysis of a 1T-1MTJ memory cell for STT-RAMs," *IEEE Trans. Electron Devices*, vol. 59, no. 4, pp. 878–887, Apr. 2012.
- [35] J. A. Fredenburg and M. P. Flynn, "Statistical analysis of ENOB and yield in binary weighted ADCs and DACS with random element mismatch," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 59, no. 7, pp. 1396–1408, Jul. 2012.
- [36] P. Harpe, C. Zhou, X. Wang, G. Dolmans, and H. de Groot, "A 12 fJ/conversion-step 8 bit 10 MS/s asynchronous SAR ADC for low energy radios," in *Proc. IEEE ESSCIRC*, Sep. 2010, pp. 214–217.
- [37] K. Yoshioka, A. Shikata, R. Sekimoto, T. Kuroda, and H. Ishikuro, "An 8 bit 0.35–0.8 V 0.5–30 MS/s 2 bit/step SAR ADC with wide range threshold configuring comparator," in *Proc. IEEE ESSCIRC*, Sep. 2012, pp. 381–384.

**Qing Dong** (S'14) received the B.S. and M.S. degrees in microelectronics from Fudan University, Shanghai, China, in 2010 and 2013, respectively. He is currently pursuing the Ph.D. degree with the University of Michigan, Ann Arbor, MI, USA.
His current research interests include memory circuits design, and monitoring circuits design for process variation and BTI. Mr. Dong was a recipient of the Best Paper Award at the 2012 IEEE International Conference on Solid-State and Integrated Circuit Technology, the 2015 IEEE International Symposium on Circuits and Systems, and the 2016 IEEE Symposium on Security and Privacy.

**Kaiyuan Yang** (S'13) received the B.S. degree in electronics engineering from Tsinghua University, Beijing, China, in 2012, and the M.S. degree in electrical engineering from the University of Michigan, Ann Arbor, MI, USA, in 2014, where he is currently pursuing the Ph.D. degree. His current research interests include low power digital and mixed-signal circuit design, and hardware security. Mr. Yang was a recipient of the Best Paper Award at the 2015 IEEE International Symposium on Circuits and Systems and the 2016 IEEE Symposium on Security and Privacy.

**Laura Fick** (S'11) received the B.S. degree in electrical engineering from the University of Maryland, College Park, MD, USA, in 2011, and the M.S. degree in electrical engineering from the University of Michigan, Ann Arbor, MI, USA, in 2013, where she is currently pursuing the Ph.D. degree on a National Science Foundation Graduate Research Fellowship. Her current research interests include neuromorphic VLSI design, low-power circuit design, and high-performance mixed-signal design.

**David Fick** (S'08–M'10) received the B.S.E. degree in computer engineering, the M.S.E. degree in computer science and engineering, and the Ph.D. degree in computer science and engineering from the University of Michigan, Ann Arbor, MI, USA, in 2006, 2009, and 2012, respectively. He is currently the Chief Technical Officer of Isocline Engineering, an ASIC Research Corporation, Austin, TX, USA. He has authored over 25 papers and holds six patents.
His current research interests include neuromorphic computing, high-performance mixed-signal computation, and 3-D integrated circuits.

**David Blaauw** (M'94–SM'07–F'12) received the B.S. degree in physics and computer science from Duke University, Durham, NC, USA, in 1986, and the Ph.D. degree in computer science from the University of Illinois at Urbana–Champaign, Champaign, IL, USA, in 1991. He was with Motorola, Inc., Austin, TX, USA, where he was the Manager of the High Performance Design Technology group. Since 2001, he has been with the faculty at the University of Michigan, Ann Arbor, MI, USA, where he is currently a Professor. He has authored over 500 papers and holds 50 patents. His current research interests include VLSI design, including near-threshold and subthreshold design for ultralow-power millimeter-scale sensor nodes. Dr. Blaauw was the Technical Program Chair and General Chair for the International Symposium on Low Power Electronics and Design. He was also the Technical Program Co-Chair of the ACM/IEEE Design Automation Conference and a member of the ISSCC Technical Program Committee.

**Dennis Sylvester** (S'95–M'00–SM'04–F'11) received the Ph.D. degree in electrical engineering from the University of California at Berkeley, Berkeley, CA, USA, in 1999. He has held research staff positions with the Advanced Technology Group, Synopsys, Mountain View, CA, USA, the Hewlett-Packard Laboratories, Palo Alto, CA, USA, and visiting professorships at the National University of Singapore, Singapore, and Nanyang Technological University, Singapore. He is currently a Professor of Electrical Engineering and Computer Science with the University of Michigan, Ann Arbor, MI, USA, and the Director of the Michigan Integrated Circuits Laboratory, and has supervised a group of 10 faculty and more than 70 graduate students.
He is the Co-Founder of Ambiq Micro, Austin, TX, USA, a fabless semiconductor company developing ultralow-power mixed-signal solutions for compact wireless devices. He has authored or co-authored over 375 articles along with one book and several book chapters. He holds 20 U.S. patents. His current research interests include the design of millimeter-scale computing systems and energy-efficient near-threshold computing. Dr. Sylvester served on the Executive Committee of the ACM/IEEE Design Automation Conference and serves on the Technical Program Committee of the IEEE International Solid-State Circuits Conference. He also serves as a Consultant and Technical Advisory Board Member for electronic design automation and semiconductor firms in his research areas. He was a recipient of the NSF CAREER Award, the Beatrice Winner Award at ISSCC, an IBM Faculty Award, an SRC Inventor Recognition Award, and eight Best Paper Awards and Nominations. He is also a recipient of the ACM SIGDA Outstanding New Faculty Award and the University of Michigan Henry Russel Award for distinguished scholarship. His dissertation was awarded the David J. Sakrison Memorial Prize as the most Outstanding Research in the UC-Berkeley. He has served as an Associate Editor of the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN and the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS, and a Guest Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II.