# Reference Oversampling PLL Achieving -256-dB FoM and -78-dBc Reference Spur

Ji-Hwan Seol, *Member, IEEE*, Kyojin Choo, *Member, IEEE*, David Blaauw, *Fellow, IEEE*, Dennis Sylvester, *Fellow, IEEE*, and Taekwang Jang, *Senior Member, IEEE*

Abstract—This article presents a low-jitter, low-power, low-reference-spur LC oscillator-based reference oversampling digital phase-locked loop (OSPLL). The proposed reference oversampling architecture simultaneously offers low in-band phase noise, wide bandwidth, and low spur. In addition, this article proposes an LC digitally controlled oscillator (DCO) for the proposed OSPLL that achieves fast frequency updates and fine frequency resolution, while its varactor switching timing is set optimally for low jitter using the proposed DCO tuning pulse timing control scheme. The proposed OSPLL was fabricated in a 28-nm CMOS process. The integrated rms jitter of the PLL was measured at 67.1 fs for an output frequency of 4 GHz. The in-band phase noise of the PLL was -129.2 and -132.5 dBc/Hz at 1- and 5-MHz offset frequencies, respectively. The measured reference spur of the PLL was -78.1 dBc. Total PLL power consumption was 5.2 mW, resulting in a PLL jitter-power FoM of -256.3 dB, while occupying an area of 0.17 mm<sup>2</sup>.

*Index Terms*—Digitally controlled oscillator (DCO), digital PLL, frequency synthesizer, reference oversampling PLL (OSPLL), reference spur.

## I. INTRODUCTION

GENERATION of high-purity clock sources is becoming more crucial in today's communication systems. With the advent of advanced communication systems such as 5G wireless radios and ultrahigh-speed wireline transceivers, the required clock jitter is now below 100 fs [1]–[3], [51]. For example, the integrated jitter requirement of the local oscillator (LO) in 5G transceivers can be as low as 90 fs to maintain an acceptable error vector magnitude (EVM) [1], [2]. The front-end ADCs in ultrahigh-speed wireline transceivers demand clock jitter less than 100 fs to avoid signal-to-noise ratio (SNR) degradation [3].

However, it is hard to generate a clock with such low jitter while satisfying other requirements such as power and spur, due to conflicting trade-offs [4]–[9]. A wide bandwidth is beneficial since it provides larger suppression of the oscillator phase noise, achieving better power efficiency [50]. However, to achieve wide bandwidth, it is critical that the in-band phase noise from the loop components is low, which can incur large power consumption [9]. High reference spur [4]–[6] and stability issues [7], [8] are further concerns that make it difficult to achieve wide bandwidth. Therefore, the bandwidth of a PLL is typically set much lower than the stability limit of $F_{\text{REF}}/10$, increasing the power consumption of the oscillator and degrading overall PLL power efficiency [43], [48]–[50].

Kyojin Choo, David Blaauw, and Dennis Sylvester are with the Department of Electrical and Computer Engineering, University of Michigan, Ann Arbor, MI 48109 USA.

Taekwang Jang is with the Integrated Systems Laboratory, Swiss Federal Institute of Technology, 8052 Zürich, Switzerland (e-mail: tkjang@iis.ee.ethz.ch).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/JSSC.2021.3089930.

Digital Object Identifier 10.1109/JSSC.2021.3089930

There have been considerable efforts to reduce the jitter of frequency synthesizers over the years [10]–[21], [40]–[42]. Charge pump PLLs (CPPLLs) are widely used due to their robustness. However, a large charge pump current is required to achieve low in-band phase noise [9]. Also, the reference spur induced by charge pump nonidealities limits the PLL bandwidth [43], degrading its power efficiency [40]–[42]. Sub-sampling PLLs can attain low in-band phase noise by sampling the voltage-controlled oscillator (VCO) output with the reference clock [10]–[15]. A large phase detector (PD) gain can be obtained by utilizing the high VCO output slope, which in turn greatly suppresses the noise from the loop components. However, the sampling operation can disturb the VCO output, resulting in a high reference spur [14]. Also, the narrow capture range of the sub-sampling PD makes the loop susceptible to disturbance; therefore, an additional monitoring block is required [15]. Injection-locked clock multipliers (ILCMs) offer excellent VCO noise suppression by directly injecting the reference clock into the VCO [16]–[18]. Thanks to the instantaneous phase correction, the noise suppression bandwidth can be greater than that of conventional PLLs while low in-band phase noise is achieved in a power-efficient manner. However, due to the direct injection of the reference clock into the VCO, ILCMs are susceptible to a large reference spur, necessitating complex calibration [18]. Bang-bang PLLs (BBPLLs) are promising since they offer excellent power efficiency by making use of a low-power 1-bit time-to-digital converter (TDC) as their phase detection block [19]–[21]. However, large quantization noise from the bang-bang phase detector (BBPD) can limit their in-band phase noise level.

One way to avoid the conflicting trade-offs and improve the performance of PLLs is to increase the reference frequency. A higher reference frequency can lower the in-band phase noise by reducing the charge pump noise [22], [26] or TDC quantization noise [23]. Also, the PLL bandwidth can be increased without stability or spur concerns [4], [26].

0018-9200 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Manuscript received January 30, 2021; revised April 25, 2021 and May 24, 2021; accepted June 6, 2021. Date of publication June 25, 2021; date of current version September 24, 2021. This article was approved by Guest Editor Wei-Zen Chen. (*Corresponding author: Ji-Hwan Seol.*)

Ji-Hwan Seol is with the Department of Electrical and Computer Engineering, University of Michigan, Ann Arbor, MI 48109 USA, and also with Samsung Electronics, Hwasung 445-330, South Korea (e-mail: jhseol@umich.edu).



Fig. 1. (a) Operation of a conventional PLL. (b) Operation of the reference oversampling PLL. (c) Comparison between a conventional PLL and the OSPLL.

Consequently, reference boosting schemes, such as a reference doubler [24] or a quadrupler [25], [26], have become popular as a way of generating high reference frequencies without using expensive high-frequency crystal oscillators. In [24] and [25], a delay line with an XOR gate is used, while Megawer *et al.* [26] employed comparators with multiple voltage references to boost the frequency of the input reference clock. However, both approaches rely on the absolute accuracy of the delay or voltage references to avoid a period error of the output clock, which can lead to a reference spur. Therefore, previous reference boosting schemes require complex calibration logic, making it hard to achieve a multiplication factor greater than four.

In this article, we present a reference oversampling PLL (OSPLL) [28] whose loop operates at the output frequency $F_{OUT}$, which effectively suppresses the noise from the loop components by oversampling, thereby achieving low in-band phase noise. The low in-band phase noise, in turn, enables wide bandwidth and better oscillator noise filtering, which is achieved with fewer stability issues and a low reference spur because the loop operates at $F_{OUT}$ rather than $F_{REF}$. This article also presents an LC digitally controlled oscillator (DCO) for the OSPLL with fine frequency resolution and fast switching capability. Furthermore, a timing control scheme is proposed to optimally position the varactor switching pulse for low phase noise. In combination with the low in-band phase noise and the wide bandwidth, the implemented PLL achieved 67.1-fs integrated rms jitter with 5.2-mW power consumption at 4 GHz, which leads to a -256.3-dB FoM. The measured reference spur of the PLL was -78.1 dBc.
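As a quick arithmetic check of the headline figure, the jitter-power FoM can be recomputed from the reported jitter and power. The sketch below assumes the commonly used definition FoM = 10 log10[(σ<sub>t</sub>/1 s)<sup>2</sup> · (P/1 mW)], which this text does not state explicitly; the function name `jitter_power_fom_db` is illustrative only.

```python
import math


def jitter_power_fom_db(rms_jitter_s: float, power_w: float) -> float:
    """Jitter-power FoM in dB, assuming FoM = 10*log10((sigma / 1 s)^2 * (P / 1 mW))."""
    return 10.0 * math.log10(rms_jitter_s ** 2 * (power_w / 1e-3))


# Values reported in the text: 67.1-fs rms jitter and 5.2-mW total power.
fom = jitter_power_fom_db(67.1e-15, 5.2e-3)
print(f"FoM = {fom:.1f} dB")
```

Under that assumed definition, the reported 67.1 fs and 5.2 mW reproduce the quoted FoM to within rounding.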

This article is organized as follows. Section II provides the principle and basic operation of the OSPLL. Section III discusses the design of the LC-DCO used in the proposed PLL. Section IV describes the design and implementation detail of each building block. The measurement results are presented in Section V, and conclusions are then drawn in Section VI.

### II. PRINCIPLE OF THE REFERENCE OSPLL

To better explain the principle of the proposed OSPLL, we compare the OSPLL with a conventional PLL whose loop operates at every reference cycle $T_{\text{REF}}$, as shown in Fig. 1(a). The reference clock of the PLL is generated from a crystal oscillator (XO). Since the XO output (REF) is a sinusoidal waveform, it is first converted to a square wave by a reference buffer (BUF). The output of the reference buffer $CK_{REF}$ is then applied to a PD. The PD compares the positive edges of $CK_{REF}$ to those from the feedback clock, FB, which is the divided oscillator output clock. The PD operates at every $T_{REF}$ cycle, which is the period of $CK_{REF}$ and FB. The PD output is then applied to the loop filter (LF), and the LF output LF<sub>OUT</sub> drives the oscillator, suppressing the oscillator noise. In this case, the reference and PD noise are amplified by the frequency division ratio (*N*). Also, the reference buffer can induce large noise and/or power when it converts the sinusoidal reference, which has a slow slope, to a square wave [10].

Fig. 1(b) shows the conceptual diagram of the OSPLL [27], [28]. Without a reference buffer, the sinusoidal reference REF is directly applied to the sampling switch input. The switch samples its input at the falling edge of the oscillator output FB, which is the same as OUT, forming a PD. The PD of the OSPLL is based on sampling circuits whose output voltage is proportional to their input phase error, with the gain defined by the slope of the reference clock $dV_{\text{REF}}/dt$ [29]. Because the PD samples the phase error at every $T_{OUT}$, the loop now operates at the $F_{OUT}$ rate, and noise from the loop components is effectively suppressed by oversampling. This is illustrated in Fig. 1(b), where the black dots represent the sampling points, and the phase error of FB is converted to the sampled voltage $V_{\rm SMPL}$. Note that the different dc values at the sampling points are handled using N interleaved PDs implemented with the ac coupling scheme, which will be explained later in this section. The extracted phase error at every FB edge then drives the LF, and its output LF<sub>OUT</sub> suppresses oscillator noise. In other words, the OSPLL loop operates at $F_{OUT}$, not $F_{\text{REF}}$, simultaneously enabling low in-band phase noise, wide bandwidth, and low spur as shown in Fig. 1(c). In addition, since the OSPLL directly samples the sinusoidal reference from the XO, no reference buffer is needed, improving power efficiency and output phase noise.

Fig. 2. (a) Simplified block diagram of the OSPLL. (b) Conceptual diagram of the sinusoidal reference and the RSPD sampling points.

Fig. 3. (a) RSPD with the ac coupling. (b) Operation of the RSPD with ac coupling scheme. (c) Pivot RSPD. (d) Operation of the Pivot RSPD. (e) Assignment of the pivot RSPD and ac coupled RSPDs according to the RSPD index.

Fig. 2(a) shows the detailed structure of the proposed OSPLL. *N* reference sampling PDs (RSPDs) directly sample the input reference sinewave REF from the XO. $\Phi\langle 1:N\rangle$ denotes the time-interleaved sampling clocks generated from the DCO output (OUT) so that the effective sampling period of the RSPDs is the DCO period $T_{\text{DCO}}$ while each RSPD operates at $T_{\text{REF}}$, relaxing the speed requirements. The RSPD sampling clock $\Phi\langle 1:N\rangle$ is generated from the multiphase generator, driven by OUT. Each RSPD operating at $\Phi\langle i \rangle$ receives $\Phi$SMPL$\langle i \rangle$ and $\Phi$COMP$\langle i \rangle$ to control the sampling switch and the comparator, respectively, in the corresponding RSPD$\langle i \rangle$. At the falling edge of $\Phi$SMPL$\langle i \rangle$, REF is sampled on a sampling capacitor $C_{S}$. Assuming there is no noise in OUT, each RSPD samples the same point of the sinewave reference REF in every reference cycle, marked with a blue dot with the corresponding RSPD index as shown in Fig. 2(b). Jitter in OUT will then cause a change of the sampled voltage, which represents the phase error. Therefore, the nominal dc value of $V_{\text{SMPL}}$ is constant for a given RSPD, and the voltage fluctuation due to the DCO jitter is superimposed on the dc voltage. The sampled voltage $V_{\text{SMPL}}$ is then quantized by a comparator at the rising edge of $\Phi$COMP$\langle i \rangle$ to produce the 1-bit PD output PDOUT$\langle i \rangle$, which indicates whether the phase of OUT leads or lags the phase of REF. An ac coupling capacitor is placed between the sampling capacitor and the comparator to remove the dc voltage of $V_{\text{SMPL}}$. PDOUT$\langle 1:N \rangle$ is then converted to the DCO control word (DCW) by the digital loop filter (DLF) and fed to the DCO, forming a bang-bang PLL (BBPLL) operating at every DCO cycle.

Since each RSPD samples a different point of the sinusoidal reference REF, the sampled voltages of the RSPDs have different nominal dc values, which raises several issues. In order for the comparator to accurately quantize the phase error, the sampled voltage must be compared with a reference voltage representing the nominal voltage of the sampling point. As each RSPD samples a different point of the sinusoidal reference clock, *N* voltage references that accurately represent the sampling points are needed, which is impractical. The different common mode levels of each comparator also make it difficult to optimize the comparator. The ac-coupling scheme shown in Fig. 3(a) solves this issue by removing the dc of the sampled voltage, so that a single reference voltage can be used for every RSPD. The dc level of $V_{\text{SMPL}}$ is blocked by $C_C$ and only the ac fluctuation caused by the output jitter is passed to the comparator input, which is biased at $V_{\text{CM}}$ using a large resistor. Note that the other comparator input, V-, is also biased at $V_{\text{CM}}$ through the same type of resistor. Therefore, the comparator can operate at its optimal common mode voltage, $V_{\text{CM}}$. This ac coupling operation is illustrated in Fig. 3(b). The DCO jitter is converted to $V_{\text{SMPL}}$ and then its dc level is shifted to around $V_{\text{CM}}$ through ac coupling.

Since only ac fluctuations of the phase error can be detected through the coupling capacitor, an RSPD with ac coupling cannot provide absolute dc phase locking. Therefore, one of the *N* RSPDs, RSPD$\langle 1 \rangle$, is designed as a pivot RSPD, without the ac coupling capacitor, to achieve absolute dc phase locking as shown in Fig. 3(c). The comparator reference voltage of the pivot RSPD is set to $V_{\rm CM}$, which is also the positive zero-crossing point of REF, so that it provides the pivot position of the phase locking at the REF zero crossing, while the sampling points of the other RSPDs are defined relative to the pivot position as shown in Fig. 3(e). Fig. 3(d) shows the operation of the pivot RSPD, where the sampled voltage $V_{\rm SMPL}$ is directly compared with $V_{\rm CM}$ to provide the dc phase-locking point.



Fig. 4. Conceptual diagram showing that the static multiphase error, denoted as $\Delta t_a$, $\Delta t_b$, $\Delta t_c$, does not affect the comparator operation.



Fig. 5. Noise source of the RSPD.

As the ac coupling scheme effectively blocks the dc variation of the sampled voltage, the proposed RSPD is insensitive to the multiphase error among the sampling clocks $\Phi\langle 1:N \rangle$ and to the distortion of the input sinewave reference. The sampling clocks, $\Phi\langle 1:N \rangle$, are generated from the multiphase generator and distributed to each RSPD. Mismatch in the multiphase generator and the wire delay can create multiphase error among $\Phi\langle 1:N \rangle$. This static phase error can move the sampling point slightly, resulting in a dc voltage shift at $V_{\rm SMPL}$ as shown in Fig. 4. However, the dc voltage shift in $V_{\text{SMPL}}$ is blocked by the ac coupling capacitor and does not affect the 1/0 probability at the comparator output. Likewise, the distortion of the sine reference is blocked by the ac coupling. Therefore, the multiphase error and the reference distortion do not create a reference spur at the PLL output, which is verified by simulation.

The intrinsic noise of an RSPD is composed of sampling noise  $(V_{N,SMPL})$  and comparator noise  $(V_{N,COMP})$  as shown in Fig. 5. The noise power generated by the two noise sources is identical for every RSPD. However, the input jitter-to-sampled voltage gain, represented by  $\Delta V_{SMPL}/\Delta t$  at RSPD $\langle i \rangle$ , depends on the slope (SLOPE $\langle i \rangle$ ) of the sinusoidal reference where the RSPD $\langle i \rangle$  samples, which can be expressed as follows:

$$\begin{aligned}
\Delta V_{\text{SMPL}} &= \text{SLOPE}\langle i \rangle \cdot \Delta t \\
&= \left.\frac{d}{dt}\left[A_{\text{REF}} \sin(\omega_{\text{REF}} t)\right]\right|_{t = \frac{i-1}{N} T_{\text{REF}}} \cdot \Delta t \\
&= A_{\text{REF}}\, \omega_{\text{REF}} \cos\left(2\pi \frac{i-1}{N}\right) \cdot \Delta t
\end{aligned} \tag{1}$$

where  $A_{\text{REF}}$  is the amplitude of the input sinusoidal reference. The main implication of (1) is that due to the different slopes of the sinusoidal reference that each RSPD samples, the gain from the timing jitter  $\Delta t$  to the sampled voltage differs for each RSPD, as does the SNR of each RSPD. For example, near the zero-crossing point of the reference clock, the input slope is steeper, thus the gain from  $\Delta t$  to  $\Delta V_{\text{SMPL}}$  is higher as shown in Fig. 6(a). This high gain suppresses the subsequent noises  $V_{N,\text{SMPL}}$  and  $V_{N,\text{COMP}}$ , providing excellent  $\Delta t$  detection. On the other hand, when the sampling points are near the peak, the gain is small, so the RSPD output is mostly dominated by  $V_{N,\text{SMPL}}$  and  $V_{N,\text{COMP}}$  rather than  $\Delta t$ .
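To make the slope dependence in (1) concrete, the following sketch evaluates SLOPE⟨i⟩ for N = 20 and a 200-MHz reference, and converts a timing error into a sampled-voltage change. The reference amplitude `A_REF = 0.4` V and the 100-fs timing error are hypothetical values chosen only for illustration; neither is given in the text.

```python
import math


def rspd_slopes(n_rspd: int, a_ref_v: float, f_ref_hz: float) -> list[float]:
    """SLOPE<i> in V/s for i = 1..N, evaluated from (1)."""
    w_ref = 2.0 * math.pi * f_ref_hz
    return [a_ref_v * w_ref * math.cos(2.0 * math.pi * (i - 1) / n_rspd)
            for i in range(1, n_rspd + 1)]


A_REF = 0.4      # hypothetical reference amplitude in volts (not from the text)
DT = 100e-15     # hypothetical timing error of 100 fs

slopes = rspd_slopes(20, A_REF, 200e6)
for i in (1, 4, 6):  # zero crossing, intermediate point, peak of REF
    dv_uv = slopes[i - 1] * DT * 1e6
    print(f"RSPD<{i}>: slope = {slopes[i - 1]:.3e} V/s, dV = {dv_uv:.2f} uV")
```

With these assumed numbers, RSPD⟨1⟩ at the zero crossing sees the full slope, while RSPD⟨6⟩ at the peak of REF sees essentially none, so its output carries almost no timing information.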

For optimal noise performance of the PLL, it is beneficial to emphasize the RSPDs that sample steeper points of REF over the RSPDs that sample REF near its peak points. For this, we set a different proportional path gain PGAIN$\langle 1:N \rangle$ for each RSPD, so that the gain of each RSPD is proportional to the slope at its sampling point. The resulting gain profile is in quadrature with the reference clock, REF, as shown in Fig. 6(a). Note that it is even possible to remove the RSPDs near the peak/bottom, which have near-zero PGAIN values, to save area and power. Fig. 6(b) shows the simplified structure of the proposed DLF, where the RSPD output PDOUT$\langle 1:N \rangle$ is multiplied by the corresponding proportional path gain PGAIN$\langle 1:N \rangle$, realizing the gain profile in the proportional path output UP/DN$\langle 1:N \rangle$. Note that for the integral path, only one of the PDOUTs, PDOUT$\langle 1 \rangle$, is used to lower the operation speed and save power.

A discrete time domain model [30], shown in Fig. 7, is adopted to simulate the PLL noise and optimize its design parameters. The periodically time-variant parameters such as SLOPE[i] and PGAIN[i] are modeled as shown in Fig. 7, while the index *i* is calculated as $MOD_N[k] + 1$. $MOD_N$ is a modulo-N operation, and k is the index of the timestamps of the simulation, which increases by 1 at every $T_{\text{DCO}}$. In other words, the entire model operates in a single DCO clock domain. The reference sampling operation is modeled with a gain block with the gain of SLOPE[i] as derived in (1), and the sampling noise is added at its output. The comparator model is composed of a 1-bit quantizer and a latency block, $Z^{-D}$, that models the comparator evaluation time (D = 6 in the implementation, where D represents the number of DCO cycles). Also, the input-referred noise of the comparator $V_{N,\text{COMP}}[k]$ is added to the input of the quantizer. The DLF consists of a proportional path and an integral path. The proportional path is modeled with the periodically varying gain PGAIN[i], which is proportional to SLOPE[i] with the amplitude of $\beta$. The integral path sub-samples its input when $MOD_N[k] = 0$ because the integral path only utilizes PDOUT$\langle 1 \rangle$. The sampled output is then applied to the integrator and the integral gain $\alpha$. The output of the DLF is followed by the DCO, which is modeled as an integrator with a gain of $K_{DCO}$ and the DCO noise $t_{N,DCO}[k]$ added at the output. The DCO output is fed back to the input, forming a PLL loop.
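The structure of this model can be sketched in a few lines of Python. The sketch below is a simplified, normalized illustration and not the authors' simulator: only N = 20 and D = 6 come from the text, while every other parameter (`beta`, `alpha`, the noise levels, the frequency offset `f_off`, the initial error `t0`) is an invented value in arbitrary time units. The sampling and comparator noise are lumped into one Gaussian term, and $K_{DCO}$ is folded into `beta` and `alpha`.

```python
import math
import random
from collections import deque


def simulate_ospll(n_rspd=20, steps=20000, latency=6, beta=1.0, alpha=0.002,
                   sigma_dco=0.5, sigma_v=0.2, f_off=0.05, t0=50.0, seed=1):
    """Return the DCO timing error per DCO cycle (arbitrary time units)."""
    rng = random.Random(seed)
    # Normalized SLOPE[i]; zero-based index 0 corresponds to RSPD<1>.
    slope = [math.cos(2.0 * math.pi * i / n_rspd) for i in range(n_rspd)]
    # Z^-D latency: entries are (gain shape, 1-bit decision, from pivot RSPD<1>).
    pipe = deque([(0.0, 0.0, False)] * latency)
    t, integ, history = t0, 0.0, []
    for k in range(steps):
        i = k % n_rspd                                # MOD_N[k]
        v = slope[i] * t + rng.gauss(0.0, sigma_v)    # sampler gain plus lumped noise
        pd = 1.0 if v > 0.0 else -1.0                 # 1-bit quantizer
        pipe.append((slope[i], pd, i == 0))
        gain, decision, is_pivot = pipe.popleft()     # decision made `latency` cycles ago
        prop = beta * gain * decision                 # PGAIN[i] = beta * SLOPE[i]
        if is_pivot:
            integ += alpha * decision                 # integral path uses PDOUT<1> only
        # DCO as an integrator: drift, noise, and corrections accumulate in t.
        t += f_off + rng.gauss(0.0, sigma_dco) - prop - integ
        history.append(t)
    return history


hist = simulate_ospll()
tail = hist[len(hist) // 2:]
rms = math.sqrt(sum(x * x for x in tail) / len(tail))
print(f"steady-state rms timing error = {rms:.2f} (arbitrary units)")
```

Because the gain applied to each decision carries the sign of the slope at which it was sampled, decisions taken on the falling half of the sinewave still push the timing error toward zero. Sweeping `latency` or replacing the cosine shape with another profile gives a rough feel for the trade-offs discussed in the text, though the absolute numbers from this sketch have no physical meaning.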

Fig. 6. (a) Different sampling gains due to the different reference slope. (b) Simplified structure of the proportional path with the gain profile and the integral path.

Fig. 7. Discrete time domain model of the proposed OSPLL.

Fig. 8. Simulated phase noise plot of the proposed OSPLL (N = 20) and the conventional BBPLL (N = 1) at 4-GHz $F_{OUT}$ and 200-MHz $F_{REF}$.

Fig. 8 shows the simulated output phase noise spectrum of the proposed OSPLL model. The number of the RSPDs (N) in the OSPLL is 20 and the reference frequency is 200 MHz, resulting in a 4-GHz output frequency. In order to show the advantage of the proposed OSPLL architecture, a conventional BBPLL is also simulated and its phase noise spectrum is plotted in Fig. 8. Compared with the BBPLL, the OSPLL achieved 10.1 dB of in-band phase noise suppression; the in-band phase noise of the OSPLL and the conventional BBPLL are -132.9 and -122.8 dBc/Hz, respectively. The total integrated rms jitter of the OSPLL and BBPLL are 65 and 122.3 fs, respectively.

Fig. 9. (a) Simulated noise amplification due to sinusoidal reference when cosine PGAIN profile is used. (b) Tested PGAIN profile when N = 20. (c) Simulated noise amplification compared with flat profile when N = 20.

In the sinusoidal reference, there exist several low-slope sampling points at which the RSPD output is dominated by the noise of the loop components. As a result, the in-band phase noise is amplified compared with the case in which all sampling points have the highest slope of the sinewave reference. This in turn lowers the in-band phase noise improvement from oversampling from $10\log N$ to $10\log N - A_N$, where $A_N$ is the noise amplification. Fig. 9(a) shows the simulated in-band phase noise amplification for different multiplication ratios N. The noise amplification is 3 dB regardless of N when $N > 2$. This also agrees with the simulation results in Fig. 8, where the improvement from oversampling when N = 20 is $10\log N - A_N = 13 \text{ dB} - 3 \text{ dB} = 10 \text{ dB}$. Note that when N = 1 or 2, the noise amplification is 0 dB since all sampling points have the same highest slope magnitude.
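One way to see why the penalty is about 3 dB and independent of N is a linearized estimate, offered here as an assumption rather than as the simulation behind Fig. 9. If every RSPD contributes the same noise power and its output is weighted in proportion to its slope, the usable signal power scales with the mean squared normalized slope:

$$\frac{1}{N}\sum_{i=1}^{N}\cos^{2}\left(2\pi\frac{i-1}{N}\right) = \frac{1}{2} \quad (N > 2), \qquad A_N \approx 10\log_{10}(2) \approx 3.0\ \text{dB}$$

For N = 1 and N = 2, every sampling point has unit slope magnitude, the mean equals 1, and the same estimate gives 0 dB.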

Fig. 10. Simulated PLL Jitter according to the loop latency.

Fig. 11. Conceptual operation of the LC-DCO for the OSPLL.

The effect of the time-varying proportional gain (PGAIN) on the noise amplification is verified by comparing OSPLLs operating with different PGAIN profiles, as shown in Fig. 9(b). The flat profile has the same PGAIN value for all RSPD indexes. The cosine profile is the one adopted in this design, which is in quadrature with the input sinewave reference. The clipped cosine and sawtooth profiles represent steeper and more gradual slopes compared with the cosine profile. Fig. 9(c) shows the simulated in-band phase noise amplification when N = 20. The noise amplification of the flat and cosine profiles was 4 and 3 dB, respectively, showing that a 1-dB in-band phase noise improvement is achieved by using the time-varying cosine PGAIN profile. The clipped cosine and sawtooth profiles show 3.07- and 3.08-dB noise amplification, which is slightly worse than the cosine profile. This implies that as long as the profile emphasizes the high-slope sampling points and deemphasizes the low-slope sampling points, the improvement is similar.
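The flat and cosine figures can be roughly reproduced with a linearized SNR estimate. This is a simplification introduced here, not the time-domain simulation used for Fig. 9(c): each RSPD is assumed to add the same noise power, outputs are combined linearly with weights `w`, and the flat profile is assumed to use unit weights whose sign follows the slope polarity. The helper name `noise_amplification_db` is illustrative.

```python
import math


def noise_amplification_db(weights, slopes):
    """Linearized SNR loss relative to N equally weighted detectors at maximum slope."""
    n = len(slopes)
    signal = sum(w * s for w, s in zip(weights, slopes)) ** 2
    noise = sum(w * w for w in weights)
    return 10.0 * math.log10(n * noise / signal)


N = 20
slopes = [math.cos(2.0 * math.pi * i / N) for i in range(N)]  # normalized SLOPE

cosine_profile = slopes                                       # PGAIN proportional to slope
flat_profile = [math.copysign(1.0, s) for s in slopes]        # equal magnitude, slope polarity

a_cos = noise_amplification_db(cosine_profile, slopes)
a_flat = noise_amplification_db(flat_profile, slopes)
print(f"cosine: {a_cos:.2f} dB, flat: {a_flat:.2f} dB")
```

Under these assumptions the estimate evaluates to about 3.0 dB for the cosine profile and about 4.0 dB for the flat profile, in line with the simulated values quoted above. Since a bang-bang loop is nonlinear, the agreement should be read as a plausibility check only.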

In addition to the noise of each building block, one important parameter of the OSPLL for jitter minimization is the latency of the loop, which is dominated by the comparator evaluation time modeled as $Z^{-D}$ in Fig. 7. As the optimal loop bandwidth for the proposed OSPLL is high (~20 MHz) due to its low in-band phase noise level, the minimization of the latency is critical for low jitter [19]. Fig. 10 shows the simulated PLL jitter for different latency values; the jitter increases at a rate of about 0.9 fs per 250 ps of added latency. In this work, the latency is designed to be six DCO cycles (1.5 ns) as a result of a trade-off between the latency and the comparator input transistor size. Note that even though the loop latency spans multiple effective loop operating cycles ($T_{DCO}$), stability and limit-cycle issues are less of a concern since the loop time constant ($1/F_{BW}$) is much larger than the loop latency [45], [46].
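The numbers in this paragraph can be put side by side with a short calculation. The sketch below uses the values from the text (4-GHz output, D = 6, roughly 20-MHz bandwidth) and the standard result that a pure delay τ adds a phase lag of 360°·f·τ at frequency f; evaluating that lag at the loop bandwidth is an illustrative choice made here, not a stability analysis from the article.

```python
F_DCO_HZ = 4e9       # output frequency from the text
D_CYCLES = 6         # comparator latency in DCO cycles from the text
F_BW_HZ = 20e6       # approximate optimal loop bandwidth from the text

latency_s = D_CYCLES / F_DCO_HZ                 # loop latency
time_constant_s = 1.0 / F_BW_HZ                 # loop time constant as defined in the text
phase_lag_deg = 360.0 * F_BW_HZ * latency_s     # lag of a pure delay at F_BW

print(f"latency = {latency_s * 1e9:.2f} ns")
print(f"1/F_BW = {time_constant_s * 1e9:.1f} ns "
      f"({time_constant_s / latency_s:.1f}x the latency)")
print(f"delay-induced phase lag at F_BW = {phase_lag_deg:.1f} deg")
```

The latency is a small fraction of the loop time constant, which is consistent with the statement that the six-cycle delay has a limited effect on stability.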

## III. DCO DESIGN FOR REFERENCE OSPLL

Fig. 12. (a) Conceptual operation of the unit varactor. (b) Simulated $V_{\text{BOT}}$ when $V_{\text{STEP}}$ is swept from 0 to 0.6 V. (c) Simulated $F_{\text{STEP}}$ when $V_{\text{STEP}}$ is swept from 0 to 0.6 V. (d) Detailed structure of the unit cell of the proportional DAC.

The design of an LC-DCO for the OSPLL introduces several requirements that differ from those of a conventional PLL. Fig. 11 shows the conceptual diagram of an LC-DCO for the OSPLL. Phase detection is performed in every DCO cycle, as is frequency control. However, due to the $F_{\text{DCO}}$-rate frequency control, it is challenging to support the fine frequency tuning step that is needed for the LC-DCO. According to simulation, the required frequency step of the proportional path is less than 10 kHz, which requires a sub-10-aF capacitance change. In conventional PLLs, where the loop operates once per reference cycle, this fine LC-DCO frequency step can be easily implemented using a delta-sigma modulator (DSM) clocked at $F_{\text{DCO}}$ or a lower frequency [19]–[21]. However, this scheme cannot be adopted for the proportional path of the OSPLL, whose update rate is already equal to $F_{DCO}$, leaving no room for DSM oversampling. In addition, the DCO needs to support the proportional path gain profile, which is proportional to the slope of the sampled reference point as explained in Section II. Therefore, the DCO should have the ability to set a different gain for each RSPD output as seen in Fig. 11.
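The link between a 10-kHz step and an attofarad-scale capacitance change follows from the LC resonance relation $f = 1/(2\pi\sqrt{LC})$, which for small changes gives $\Delta f / f \approx -\Delta C / (2C)$. The sketch below applies that approximation; the tank capacitance of 1 pF is a hypothetical value chosen for illustration, since the text does not give $C_{\text{TANK}}$.

```python
def delta_c_for_step(f0_hz: float, df_hz: float, c_tank_f: float) -> float:
    """Magnitude of the capacitance change needed for a small frequency step.

    Uses the small-signal relation |df / f0| = |dC| / (2 * C) of an LC tank.
    """
    return 2.0 * c_tank_f * df_hz / f0_hz


C_TANK = 1e-12   # hypothetical total tank capacitance (1 pF), not from the text

dc = delta_c_for_step(f0_hz=4e9, df_hz=10e3, c_tank_f=C_TANK)
print(f"required capacitance change = {dc * 1e18:.1f} aF")
```

With the assumed 1-pF tank, a 10-kHz step at 4 GHz corresponds to a change of a few attofarads, consistent with the sub-10-aF requirement stated above.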

Fig. 12 shows the structure of the unit cell of the proposed proportional DAC used in the LC-DCO to meet the aforementioned requirements of the OSPLL. Fig. 12(a) shows the principle of the unit cell operation. The unit cell consists of two NMOS varactors whose gates are connected to the  $V_{\text{DCOP}/N}$  while the bottom nodes of the varactors  $V_{\text{BOT}}$  are connected to a driver whose supply voltage is  $V_{\text{STEP}}$ . This driver generates a pulse with an amplitude of  $V_{\text{STEP}}$ , which drives the varactor bottom node for one DCO cycle.

Fig. 13. (a) Structure of the proportional DAC. (b) Structure of the sub-DAC of the proportional DAC. (c) Conceptual operation of the proportional DAC.

This short pulse enables fast frequency switching so that the DCO frequency can be briefly increased by $F_{\text{STEP}}$ for one DCO cycle. The size of the frequency step, $F_{\text{STEP}}$, is controlled by $V_{\text{STEP}}$. By setting $V_{\text{STEP}}$ sufficiently small, we can achieve fine frequency resolution of less than 10 kHz. Fig. 12(b) shows the simulated waveform of the $V_{BOT}$ node of a unit varactor where $V_{STEP}$ is swept from 0 to 0.6 V. Fig. 12(c) plots the corresponding $F_{STEP}$, which shows that $F_{STEP}$ can be continuously tuned from 0 to 100 kHz. Note that the current load from $V_{STEP}$ is less than 10 $\mu$A, which can be supplied from a low power LDO. Fig. 12(d) shows the detailed schematic of the proposed unit cell which consists of two sets of the unit varactor and its drivers. Transmission gate type drivers are used to provide fast $V_{BOT}$ rising/falling slope across the $V_{STEP}$ tuning range (0–0.6 V), while delay matched pre-drivers drive the transmission gate type drivers.

Fig. 14. Conceptual diagram that describes the phase disturbance due to the varactor switching.

Fig. 13(a) shows the structure of the proportional DAC (PDAC). To realize the proportional path gain profile, the PDAC is constructed with *N* sub-DACs, whose gains can be controlled separately. A sub-DAC is assigned to the output of the corresponding RSPD$\langle i \rangle$. Fig. 13(b) shows the sub-DAC, which consists of seven unit cells to provide gain programmability as previously described. A 3-bit PGAIN signal controls the number of unit cells that are enabled when UP/DN toggles. This gives each sub-DAC eight-level programmability with 3-bit binary coding. PGAIN$\langle i \rangle$ for each sub-DAC is predetermined during the design phase to provide a profile in quadrature with the sinusoidal reference. Fig. 13(c) shows the detailed operation of the proportional DAC. The DCO frequency switches according to the UP/DN pulses, and the size of the frequency step is controlled by the PGAIN value of each sub-DAC.
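As an illustration of how a quadrature profile might map onto 3-bit sub-DAC codes, the sketch below quantizes the magnitude of the normalized slope to the eight levels 0 to 7 (the number of enabled unit cells). The rounding rule and the use of the slope magnitude are assumptions made here; the text states only that the codes are fixed at design time, and does not list the actual values or describe how slope polarity is handled.

```python
import math


def pgain_codes(n_rspd: int, n_unit_cells: int = 7) -> list[int]:
    """Hypothetical PGAIN<i> codes: enabled unit cells, proportional to |cos| slope."""
    return [round(n_unit_cells * abs(math.cos(2.0 * math.pi * i / n_rspd)))
            for i in range(n_rspd)]


codes = pgain_codes(20)
print(codes)
```

In this hypothetical mapping, the sub-DACs whose RSPDs sample at the peak and bottom of REF receive code 0, which mirrors the earlier remark that those RSPDs could be removed.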

When the UP/DN pulses toggle at every DCO cycle, capacitive coupling from the varactor switching injects deterministic noise to the DCO, which may lead to an increase in PLL output jitter. Fig. 14 illustrates varactor switching, in which the bottom node of the varactor  $V_{\text{BOT}}$  rises by  $V_{\text{STEP}}$  to increase DCO frequency. This voltage transition injects charge into the DCO tank due to charge feedthrough, causing a voltage fluctuation  $\Delta V$  at the DCO waveform  $V_{\text{DCOP/N}}$ .  $\Delta V$ is determined by the capacitance divider ratio, expressed as follows:

$$\Delta V = \frac{C_{\text{VAR}}}{C_{\text{VAR}} + C_{\text{TANK}}} V_{\text{STEP}} \tag{2}$$

where  $C_{\text{TANK}}$  is the total tank capacitance of the LC-DCO, and  $C_{\text{VAR}}$  is the capacitance between  $V_{\text{BOT}}$  of a varactor and the  $C_{\text{TANK}}$ , which includes the varactor gate channel capacitance, the gate-to-drain/source overlap capacitance, and the parasitic capacitances due to layout. This voltage fluctuation can cause a phase disturbance of the DCO waveform by directly shifting its zero-crossing, resulting in jitter increase. To minimize such a side effect, we need to: 1) minimize the  $\Delta V$  upon varactor switching and 2) prevent the created voltage fluctuation from disturbing the zero-crossing of the DCO waveform.

The magnitude of $\Delta V$ depends on the varactor capacitance $C_{\text{VAR}}$ as in (2), which varies according to the varactor switching location with respect to the DCO waveform as shown in Fig. 15(a) and (b). As an example, the gate capacitance is small when $V_{\text{DCO}}$ is low because the varactor is in depletion mode. Therefore, it is preferable to switch $V_{\text{BOT}}$ at the valley point of $V_{\text{DCO}}$. On the other hand, varactor switching at the peak point of $V_{\text{DCO}}$ should be avoided as the large varactor capacitance results in a large $\Delta V$.
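The capacitive-divider relation in (2) is easy to evaluate numerically. In the sketch below, $V_{\text{STEP}}$ = 0.3 V matches the simulation condition quoted for Fig. 15, whereas the tank capacitance and the two varactor capacitances (valley and peak of $V_{\text{DCO}}$) are hypothetical values picked only to show the trend.

```python
def delta_v(c_var_f: float, c_tank_f: float, v_step_v: float) -> float:
    """Voltage disturbance on the tank from (2): capacitive divider times V_STEP."""
    return c_var_f / (c_var_f + c_tank_f) * v_step_v


C_TANK = 1e-12          # hypothetical tank capacitance (1 pF)
C_VAR_VALLEY = 0.5e-15  # hypothetical varactor capacitance in depletion (valley)
C_VAR_PEAK = 1.5e-15    # hypothetical varactor capacitance at the peak
V_STEP = 0.3            # V, as in the Fig. 15 simulation condition

dv_valley = delta_v(C_VAR_VALLEY, C_TANK, V_STEP)
dv_peak = delta_v(C_VAR_PEAK, C_TANK, V_STEP)
print(f"valley: {dv_valley * 1e3:.3f} mV, peak: {dv_peak * 1e3:.3f} mV")
```

Because $C_{\text{VAR}}$ is much smaller than $C_{\text{TANK}}$, $\Delta V$ scales almost linearly with $C_{\text{VAR}}$, so the lower capacitance at the valley translates directly into a smaller disturbance.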

 $\Delta V$  then causes DCO output phase disturbance by altering the zero-crossing point of the waveform, and the amount of the phase disturbance depends on the location of the varactor switching with respect to the DCO sinusoid, as described by the impulse sensitivity function (ISF) [31]. As shown in Fig. 15(c), if  $\Delta V$  occurs near the peak/valley of  $V_{DCO}$ , the translation from  $\Delta V$  to phase disturbance is minimized. However, the phase disturbance due to  $\Delta V$  is maximum near the  $V_{DCO}$  zero crossings.



Fig. 15. Simulated phase disturbance due to the varactor switching (simulation condition: TT/25 °C/ $V_{\text{STEP}} = 0.3$  V/PGAIN = 111). (a) DCO waveform. (b) Voltage disturbance due to the varactor switching according to the switching position. (c) Normalized ISF of the DCO. (d) Phase disturbance of the DCO according to the varactor switching position.



Fig. 16. (a) Proposed DCO tuning pulse timing control scheme. (b) Timing diagram.

Therefore, to minimize the phase disturbance, the varactor needs to be switched at the valley point of  $V_{\text{DCO}}$  where both  $\Delta V$  and the ISF can be minimized. Near the zero crossing is not the optimal point due to the high ISF. The peak of  $V_{\text{DCO}}$  is also not the optimal point due to the high  $C_{\text{VAR}}$ . Fig. 15 illustrates the simulation results of  $\Delta V$ , normalized ISF, and phase disturbance, clearly showing that the valley of  $V_{\text{DCO}}$  exhibits minimum phase disturbance due to the varactor switching.
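
The argument above can be illustrated with a small numeric sketch. It is not the simulation behind Fig. 15: the varactor C-V curve, the amplitude, the tank capacitance, and the form $\Delta V \propto V_{\text{STEP}} \cdot C_{\text{VAR}}/C_{\text{TANK}}$ are all assumptions made for illustration, and the ISF is taken as that of an ideal sinusoidal LC oscillator (zero at the peak and valley, maximum at the zero crossings). Because the ideal ISF is exactly zero at both the peak and the valley, the sketch compares the worst case within a small timing window around each point.

```python
import math

def c_var(v_dco, c_min=1.0, c_max=3.0, slope=0.15):
    """Toy varactor C-V curve in fF (assumed shape): small in depletion, large at high V."""
    return c_min + (c_max - c_min) / (1.0 + math.exp(-v_dco / slope))

def phase_disturbance(theta, amp=0.5, v_step=0.3, c_tank=1000.0):
    """Relative phase disturbance for a switching event at oscillation phase theta."""
    v_dco = amp * math.cos(theta)             # theta = 0 is the peak, pi is the valley
    delta_v = v_step * c_var(v_dco) / c_tank  # assumed form of the voltage step
    isf = -math.sin(theta)                    # ISF of an ideal sinusoidal LC oscillator
    return abs(delta_v * isf)

def worst_case(center, half_window=0.3, points=61):
    """Largest disturbance when the switching instant lands within +/- half_window rad."""
    thetas = (center - half_window + 2 * half_window * i / (points - 1)
              for i in range(points))
    return max(phase_disturbance(t) for t in thetas)

peak = worst_case(0.0)
zero_crossing = worst_case(math.pi / 2)
valley = worst_case(math.pi)
print(f"peak={peak:.2e}  zero-crossing={zero_crossing:.2e}  valley={valley:.2e}")
```

With these assumed parameters, the valley window gives the smallest worst-case disturbance, the peak window is larger because of the larger $C_{\text{VAR}}$, and the zero-crossing window is the largest because of the high ISF.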

To ensure the optimal timing for varactor switching at the DCO, the time at which the UP/DN pulses reach the DCO should be controlled. Fig. 16 shows the control scheme for the DCO tuning pulse, which generates UP/DN pulses with a programmable delay line to position them at the optimal varactor switching timing. Similar to [32], a digitally controlled delay line is used to control the timing of the UP/DN pulses. Instead of using N different delay lines for N UP/DN pulses, which incurs area overhead, we used a single shared delay line as shown in Fig. 16(a). The DCO output is first divided by two before it is applied to the delay line to lower the power consumption of the delay line. The delay line output, CLK\_DLY, then drives the N pulse generator units, PULSE\_GEN$\langle 1:N \rangle$, where CLK\_DLY is multiplied by MASK$\langle 1:N \rangle$ to select the corresponding pulse from CLK\_DLY and generate UP/DN$\langle 1:N \rangle$, as shown in Fig. 16(b). Note that in the PULSE\_GEN$\langle i \rangle$ units with odd index, CLK\_DLY is inverted before it is multiplied by MASK$\langle i \rangle$ to maintain the same UP/DN polarity. In Fig. 16(b), $t_{\text{DLY}}$ is the delay of the delay line, which is controlled with a 5-bit control word to set the timing of the UP/DN pulses. To cover the optimal range shown in Fig. 15(d) across PVT variation, the delay range of the delay line is larger than $0.5 \times T_{\text{DCO}}$. The delay code is set during testing for optimal jitter, and the measurement results are discussed in Section V.

Fig. 17. Overall structure of the proposed OSPLL.
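
As a quick arithmetic companion to the delay-line description, the sketch below computes the DCO period at 4 GHz and the step that a 5-bit code would need if its 31 steps spanned exactly $0.5 \times T_{\text{DCO}}$. The text only states that the range is larger than $0.5 \times T_{\text{DCO}}$, so the step computed here is a lower bound under that assumption and not the implemented value; the variable names are illustrative.

```python
F_DCO_HZ = 4e9     # output frequency reported for the PLL
DELAY_BITS = 5     # width of the delay control word

t_dco = 1.0 / F_DCO_HZ                 # DCO period in seconds
n_steps = 2 ** DELAY_BITS - 1          # number of steps between code 0 and code 31
min_range = 0.5 * t_dco                # range the delay line must exceed
min_lsb = min_range / n_steps          # smallest uniform step that reaches that range
phase_per_lsb_deg = 360.0 * min_lsb / t_dco

print(f"T_DCO = {t_dco * 1e12:.0f} ps")
print(f"minimum LSB = {min_lsb * 1e12:.2f} ps ({phase_per_lsb_deg:.1f} deg of the DCO cycle)")
```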

## IV. PLL DESIGN AND IMPLEMENTATION

### A. Overall Architecture

Fig. 17 shows the overall architecture of the proposed PLL. Twenty RSPDs are used in this design, so that the PLL can multiply the reference frequency by an integer N of 20 or less. The multiphase generator is clocked by OUT and generates the interleaved clocks $\Phi$SMPL$\langle 1:20 \rangle$ and $\Phi$COMP$\langle 1:20 \rangle$, which drive the RSPDs. The outputs of the RSPDs, PDOUT$\langle 1:20 \rangle$, are applied to the DLF, which is composed of a proportional and an integral path. The proportional path contains a DCO pulse generator and a delay line whose input is a divided-by-two clock of the DCO output. The integral path is used for the type-II operation to provide better DCO flicker noise suppression and a zero phase offset. For the integral path, only one of the RSPD outputs, PDOUT$\langle 1 \rangle$, is used to reduce its operation speed and power consumption. The 11 LSBs of the integrator are applied to the DSM to obtain fine frequency resolution. The design also includes a startup path for wide and robust initial locking. When the PLL is turned on, the startup path using a conventional bang-bang phase and frequency detector (BBPFD) achieves initial phase and frequency lock. Then, the startup path is disabled, and the reference oversampling path takes over the loop. The startup path shares the integral path with the reference oversampling path to save area, whereas the output of the BBPFD directly drives the DCO for the proportional path.

Fig. 18. Detailed schematic of the RSPD with its noise source.

### B. Reference Sampling Phase Detector

Fig. 18 shows the detailed schematic of the RSPD. An NMOS transistor driven by bootstrapped  $\Phi$ SMPL is used as a sampling switch (SW). The sampling capacitor  $C_S$ , the ac coupling capacitor  $C_C$ , and the dummy capacitor  $C_D$  are implemented with MOM capacitors.

It is important to optimize the noise and power of the RSPD, since the in-band noise of the PLL is dominated by the RSPD and its power takes up a large portion of the total PLL power. The RSPD noise mainly comprises comparator noise ( $V_{N,COMP}$ ) and kT/C<sub>S</sub> sampling noise ( $V_{N,SMPL}$ ). The reduction of  $V_{N,COMP}$  requires more comparator power, while a larger area is needed to reduce  $V_{N,SMPL}$ . In this design, we first set the noise of the comparator given the PLL total power budget, with the goal of achieving optimal PLL jitter-power efficiency [33]. We then decide the size of the sampling capacitor so that the sampling noise does not dominate the overall RSPD noise. According to the behavioral simulation, the required input-referred noise of the comparator is 100  $\mu$V.

Next, given the comparator noise, we determine the size of capacitors  $C_S$  and  $C_C$ . Maximizing the sampling capacitor  $C_S$  is advantageous for reducing the sampling noise. At the same time, the coupling capacitor  $C_C$  should also be large to minimize the signal attenuation due to the parasitic capacitance  $C_P$  at  $V_+$  node, which can be represented as follows:

$$\Delta V_{+} = \frac{C_{C}}{C_{C} + C_{P}} \Delta V_{\text{SMPL}}.$$
(3)

Therefore, given a fixed total capacitance, $C_{\text{TOT}}$, there exists an optimal $C_S$ to $C_C$ ratio. Fig. 19(a) shows the simulated PLL jitter when $C_S$ is increased and $C_C$ is decreased for a fixed $C_{\text{TOT}}$. When $C_S$ is small, the sampling noise dominates the comparator noise. Hence, increasing $C_S$ can improve the PLL jitter in this region. However, as $C_S$ becomes larger, the signal attenuation effect due to a smaller $C_C$ starts to degrade the PLL jitter. Therefore, the optimal ratio between $C_S$ and $C_C$ can be determined for minimum PLL jitter. Fig. 19(b) shows the PLL jitter with different $C_{\text{TOT}}$ values, where the optimal $C_S/C_C$ ratio is used for each $C_{\text{TOT}}$. As shown in the figure, the PLL jitter improvement diminishes as $C_{\text{TOT}}$ is increased, and eventually saturates as the comparator noise starts to dominate the sampling noise. Thus, we set $C_{\text{TOT}}$ to 1.4 pF, the point beyond which increasing $C_{\text{TOT}}$ gives negligible jitter improvement (<0.1 fs per 100 fF) for the added area. The capacitances of $C_S$ and $C_C$ are set to 1 pF and 400 fF, respectively, yielding the optimal $C_S/C_C$ ratio. Note that a matching capacitor, $C_D$, which is equivalent to the series capacitance of $C_S$ and $C_C$ ($\approx$286 fF), is attached to the negative terminal of the comparator to minimize the comparator offset induced by the impedance mismatch.

Fig. 19. (a) Simulated PLL jitter according to  $C_S$  with  $C_{\text{TOT}} = 1.4$  pF. (b) Simulated PLL jitter according to  $C_{\text{TOT}}$ . Optimal  $C_S/C_C$  ratio is used for each point, which was obtained using the method in (a).

Fig. 20. Simulated PLL phase noise plot with different HF corner of the ac coupling path.
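
To make the capacitor sizing concrete, the following sketch evaluates three quantities for the chosen values: the kT/C noise of the 1-pF sampling capacitor, the series capacitance of $C_S$ and $C_C$ that $C_D$ is meant to match, and the attenuation factor of (3). The temperature of 300 K and the parasitic capacitance `C_P_ASSUMED` are assumptions for illustration; the text does not give $C_P$. Treating the sampled noise as a plain kT/C<sub>S</sub> term is also a simplification.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def ktc_noise_vrms(c_farad, temp_k=300.0):
    """RMS sampled thermal noise of a capacitor, sqrt(kT/C)."""
    return math.sqrt(K_B * temp_k / c_farad)

def series_capacitance(c1, c2):
    """Equivalent capacitance of two capacitors in series."""
    return c1 * c2 / (c1 + c2)

def coupling_attenuation(c_c, c_p):
    """Signal attenuation at the V+ node, C_C / (C_C + C_P), as in (3)."""
    return c_c / (c_c + c_p)

C_S = 1e-12            # sampling capacitor from the text
C_C = 400e-15          # coupling capacitor from the text
C_P_ASSUMED = 40e-15   # hypothetical parasitic at V+, not given in the text

v_n_smpl = ktc_noise_vrms(C_S)
c_series = series_capacitance(C_S, C_C)
atten = coupling_attenuation(C_C, C_P_ASSUMED)

print(f"kT/C noise of C_S: {v_n_smpl * 1e6:.1f} uV rms")
print(f"series(C_S, C_C):  {c_series * 1e15:.1f} fF")
print(f"attenuation at V+: {atten:.3f} (with assumed C_P)")
```

Under these assumptions, the sampling noise of a 1-pF capacitor sits below the 100-$\mu$V comparator noise target, which is in line with the stated goal that the sampling noise should not dominate the RSPD noise.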

The bias resistor  $R_B$  sets the high pass corner of the ac coupling path. The corner frequency should be sufficiently low; otherwise, the ac coupling path will block the low frequency component of the RSPD input phase error information, degrading the phase noise. As shown in Fig. 20, the low frequency phase noise increases if the high pass corner frequency is high ( $\approx 100 \text{ kHz}$ ). Note that the phase noise flattens again around 10 kHz due to the pivot RSPD.



Fig. 21. (a) Detailed structure of the comparator in an RSPD. (b) Early reset scheme and its operation.

In this design, the high pass corner is set to a sufficiently low value (below 1 kHz) using a PMOS transistor biased in the sub-threshold region to avoid phase noise degradation at low-offset frequency. Note that  $V_{\text{BIAS}}$  for the resistor is generated from a diode connected replica transistor with current source load to reduce the resistance variation with PVT. Since  $V_{\pm}$  node is biased with high resistance, the input transistors of the comparator are designed with thick gate transistors to minimize the gate leakage.
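
As a first-order check of why such a high bias resistance is needed, the sketch below treats the ac coupling path as a single-pole RC high-pass formed by $R_B$ and $C_C$. This model is an assumption (it ignores $C_P$ and the loading of $C_S$), and the function name is illustrative.

```python
import math

def bias_resistance_for_corner(f_corner_hz, c_farad):
    """Resistance giving a first-order high-pass corner f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * f_corner_hz * c_farad)

C_C = 400e-15  # coupling capacitor from the text

r_1khz = bias_resistance_for_corner(1e3, C_C)
r_100khz = bias_resistance_for_corner(100e3, C_C)

print(f"R_B for a 1-kHz corner:   {r_1khz / 1e6:.0f} Mohm")
print(f"R_B for a 100-kHz corner: {r_100khz / 1e6:.2f} Mohm")
```

In this simplified model, a corner below 1 kHz calls for a resistance of several hundred megohms, which is consistent with the use of a sub-threshold PMOS device rather than a passive resistor.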

### C. Comparator of the RSPD

The comparator dominates the noise and power of an RSPD, making its optimization critical. Fig. 21(a) shows the schematic of the two-stage dynamic comparator adopted in this design [34]. It consists of a dynamic preamplifier followed by a regenerative latch. The comparator is in the reset phase when CK is low, pre-charging the INT\_P/N nodes to VDD. As CK goes high, the input transistors M1 and M2 start to discharge INT\_P/N with a differential current in proportion to the input voltage difference, providing a voltage gain. When the voltages reach the latch threshold of the second stage, the latch outputs OP/ON are fully regenerated to the rail-to-rail digital level by the positive feedback. Note that the clock gating transistors are placed at the drain side of M1 and M2, instead of their source side, to reduce kickback when CK toggles.

Comparator noise is mainly determined by pre-amplifier noise since the gain of the first stage pre-amplifier suppresses the noise from the second stage latch. Adding capacitor $C_{\text{COMP}}$ at the integration node (INT\_P/N) reduces the noise level by narrowing the noise bandwidth [35]. In this design, a 430-fF $C_{\text{COMP}}$ was used to meet the target input-referred comparator noise of 100 $\mu$V. Adding $C_{\text{COMP}}$ inevitably increases comparator power consumption, since $C_{\text{COMP}}$ is fully charged to VDD and then discharged to VSS in every cycle, consuming $2 \times f_{\text{REF}} \times C_{\text{COMP}} \times V_{\text{DD}}^2$. To reduce comparator power consumption while maintaining its noise performance, we adopted the early reset scheme [36] as shown in Fig. 21(b). The early reset scheme can reduce the energy dissipated in the capacitor by stopping capacitor discharge once comparator evaluation is finished. As shown in Fig. 21(b), the NOR gate NR resets the S-R latch when either OP or ON is asserted, thereby stopping the discharge of the INT\_P/N node at the $V_{\text{STOP}}$ level. As a result, power consumption is reduced to $2 \times f_{\text{REF}} \times C_{\text{COMP}} \times (V_{\text{DD}}^2 - V_{\text{STOP}}^2)$. Note that the signal EARLY\_RST\_EN is added to turn the early reset scheme on and off for test purposes. The measurement result confirmed a 25% power saving without phase noise degradation when the early reset scheme is turned on.
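
The two power expressions above can be evaluated numerically. The sketch below takes them at face value and assumes a 0.9-V supply, which is not stated in this section; it covers only the $C_{\text{COMP}}$ charging term of one comparator, not the full comparator power. The stop level used in the example is hypothetical.

```python
import math

F_REF = 200e6      # reference frequency from the measurement section
C_COMP = 430e-15   # integration-node capacitor from the text
V_DD = 0.9         # assumed supply voltage, not given in this section

def ccomp_power(v_stop=0.0):
    """C_COMP charging power per comparator, using the expressions in the text."""
    return 2.0 * F_REF * C_COMP * (V_DD ** 2 - v_stop ** 2)

def v_stop_for_saving(fraction):
    """Stop level at which the C_COMP term is reduced by the given fraction."""
    return V_DD * math.sqrt(fraction)

p_full = ccomp_power()
v_stop_example = v_stop_for_saving(0.25)   # hypothetical operating point
p_early = ccomp_power(v_stop_example)

print(f"without early reset: {p_full * 1e6:.1f} uW")
print(f"with V_STOP = {v_stop_example:.2f} V: {p_early * 1e6:.1f} uW")
```

The 25% figure in this example is applied to the $C_{\text{COMP}}$ term alone, so the resulting stop level should not be read as the value used on the chip.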

In addition to its noise, the offset of the comparator should be minimized to ensure proper operation of the RSPD. If the offset is large, the output of the comparator can be dominated by the offset rather than the input phase error. In this case, the comparator output can be skewed to either one or zero, generating a spur at the PLL output and degrading phase noise. To remove comparator offset, we adopted continuous background offset cancellation using a charge pump as shown in Fig. 21(a) [37]. The charge pump adjusts  $V_{OC}$  based on the comparator output.  $V_{OC}$  is then applied to M3, forming a feedback loop so that offset is removed and equal probability of 1s and 0s is ensured at the output. Note that it is important to match the charge pump UP and DN currents to prevent the reference spur. The UP/DN mismatch will alter the 1/0 probability of the comparator output and create the reference spur. In this work, the UP/DN mismatch is suppressed to less than 2%, using the charge pump mismatch reduction technique [44]. We verified with simulation that the 2% charge pump mismatch creates less than -100-dBc reference spur.

It is important that the bandwidth of the offset cancellation loop is sufficiently low. Otherwise, the loop attenuates the low frequency component of the PD input and degrades the PLL phase noise performance. We set the bandwidth of the offset cancellation loop lower than 1 kHz, similar to the high pass corner of the ac coupling path. The charge pump is biased with a small current to realize such a bandwidth.
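
The behavior of the background offset cancellation can be sketched with a simple behavioral model: a 1-bit comparator with a static offset, a noisy zero-mean input, and a charge pump that moves $V_{OC}$ by a fixed step in the direction that opposes the comparator decision. All numeric values, the function name, and the way $V_{OC}$ enters the comparison are assumptions for illustration; this is not the transistor-level loop of Fig. 21(a), and the loop bandwidth is not modeled.

```python
import random

def simulate_offset_cancellation(offset=5e-3, sigma=1e-3, step=20e-6,
                                 n_samples=20000, seed=0):
    """Return the final V_OC and the fraction of 1s over the second half of the run."""
    rng = random.Random(seed)       # fixed seed keeps the run repeatable
    v_oc = 0.0
    ones = 0
    tail_start = n_samples // 2
    for k in range(n_samples):
        v_in = rng.gauss(0.0, sigma)              # zero-mean input (phase error plus noise)
        out = 1 if v_in + offset - v_oc > 0 else 0
        v_oc += step if out else -step            # charge pump UP or DN with equal steps
        if k >= tail_start:
            ones += out
    return v_oc, ones / (n_samples - tail_start)

v_oc_final, ones_fraction = simulate_offset_cancellation()
print(f"V_OC settles near {v_oc_final * 1e3:.2f} mV, fraction of 1s = {ones_fraction:.3f}")
```

In this model, $V_{OC}$ converges toward the comparator offset and the output shows close to equal numbers of 1s and 0s. Making the UP and DN steps unequal would shift that balance, which is the mechanism the text identifies as a source of reference spur.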

### D. Multiphase Generator

The multiphase generator produces interleaved clocks to drive the RSPDs. Fig. 22(a) shows the structure of the multiphase generator, which is based on a ring counter design [38]. Twenty unit cells form a closed loop in which each unit cell is composed of clock gating logic, a flip-flop, a mux, and control signal generation logic. The DCO output clock is applied to all of the unit cells as a common clock and a bypass mux is included in each cell to provide programmability of the frequency division ratio [27]. Initially, half of the unit cells are set to one while the other half are set to zero. Then the initialized signals circulate the loop when CLK toggles, generating interleaved clocks as shown in Fig. 22(b).



Fig. 22. (a) Proposed multiphase generator. (b) Operation of the multiphase generator. (c) Timing diagram of each unit cell.



Fig. 23. Overall structure of the DCO.

To reduce the power consumption of the multiphase generator, the clock input CLK\_G for the FF is gated so that CLK\_G toggles only when the input and output of the unit cell are different, as shown in Fig. 22(c). The simulation result shows that the gating logic saves 20% power. The unit cell also contains a control signal generator to generate  $\Phi$ SMPL $\langle i \rangle$  and  $\Phi$ COMP $\langle i \rangle$  from  $\Phi i$  as shown in Fig. 22(c).
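
The ring counter operation and the clock gating condition can be illustrated with a short behavioral model. It assumes the simplest case of 20 cells with no bypass mux active, and it only counts how many flip-flops receive a clock edge under the gating rule (input differs from output). That count is not a power estimate and is not meant to reproduce the 20% saving quoted above; the function name is illustrative.

```python
def simulate_ring_counter(n_cells=20, n_cycles=40):
    """Circulate a half-ones, half-zeros pattern; return the states and gated clock count."""
    state = [1] * (n_cells // 2) + [0] * (n_cells - n_cells // 2)
    history = []
    gated_clock_events = 0
    for _ in range(n_cycles):
        history.append(list(state))
        # Each cell takes the output of the previous cell; cell 0 wraps around.
        next_state = [state[i - 1] for i in range(n_cells)]
        # With gating, a cell is clocked only when its input differs from its output.
        gated_clock_events += sum(1 for old, new in zip(state, next_state) if old != new)
        state = next_state
    return history, gated_clock_events

history, gated = simulate_ring_counter()
ungated = 20 * 40  # every flip-flop clocked on every DCO cycle
print(f"clocked flip-flops over 40 cycles: gated={gated}, ungated={ungated}")
```

Each cell output repeats every 20 DCO cycles and neighboring cells are offset by one DCO cycle, which corresponds to the interleaved phases in Fig. 22(b). Only the two cells at the boundaries of the circulating pattern change state in any given cycle.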

### E. DCO

Fig. 23 shows the detailed structure of the DCO. We used digitally controlled resistors instead of a current source to set the DCO current, which mitigates flicker noise upconversion. The DAC of the DCO consists of a proportional DAC, an integral DAC, and a PVT DAC. The proportional DAC is driven by UP/DN$\langle 1:20 \rangle$ generated from the DCO pulse generator. The integral DAC is composed of a DSM DAC, a fine DAC, and a coarse DAC. The 25-bit integral path output is divided into three groups, each of which controls one of these DACs. The LSB part (bits 0–10) is applied to the DSM for the fine frequency step. The middle part (bits 11–16) is converted to thermometer code and drives the fine DAC, whereas the MSB part (bits 17–24) is applied to the coarse DAC. The DSM and the fine DAC use the varactor-based unit cell, which is the same as the one used in the proportional DAC, whereas the coarse DAC uses a MOM capacitor as its unit cell. The PVT DAC sets the coarse frequency of the DCO using an 8-bit binary control code. An ac coupled inverter with resistive feedback is used to convert the DCO waveform to a rail-to-rail digital output, which drives the multiphase generator and a divide-by-4 circuit driving the DSM logic.

Fig. 24. Die photograph.

Fig. 25. Measured phase noise spectrum of the proposed OSPLL at 4 GHz.
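
The bit allocation of the integral path can be written out as a small sketch. The field boundaries follow the text (bits 0 to 10, 11 to 16, and 17 to 24); the function name and the choice of a 63-element thermometer code for the 6-bit fine field are assumptions for illustration.

```python
def split_integral_word(word):
    """Split a 25-bit integral-path word into DSM, fine (thermometer), and coarse fields."""
    if not 0 <= word < 2 ** 25:
        raise ValueError("integral word must fit in 25 bits")
    dsm = word & 0x7FF              # bits 0-10, input to the DSM
    fine = (word >> 11) & 0x3F      # bits 11-16, fine DAC code
    coarse = (word >> 17) & 0xFF    # bits 17-24, coarse DAC code
    # Thermometer code: the first `fine` unit cells are on, the rest are off.
    fine_thermometer = [1 if i < fine else 0 for i in range(63)]
    return dsm, fine_thermometer, coarse

example_word = (5 << 17) | (3 << 11) | 7
dsm, fine_thermometer, coarse = split_integral_word(example_word)
print(dsm, sum(fine_thermometer), coarse)
```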

## V. MEASUREMENT RESULTS

The proposed OSPLL is fabricated in 28-nm CMOS. Fig. 24 shows the die photograph and the PLL core occupies 0.17 mm<sup>2</sup>. To prevent the unwanted noise coupling to the sinusoidal reference, the reference distribution line is carefully shielded with the ground. Also, we used the thick and wide metal layer for the reference line to reduce the resistance and minimize the sinewave attenuation. The PLL generates 4-GHz output frequency using 200-MHz reference input, while consuming 5.2 mW. The reference sinusoidal clock is generated from an external crystal oscillator (Sprinter model from Wenzel Associates). A peak-to-peak 1.2-V reference signal is applied to the chip to avoid any reliability issues of the sampling switches.

Fig. 25 shows the measured phase noise spectrum of the PLL and the free-running DCO, measured with a Keysight E5052B signal source analyzer. The in-band phase noise is -129.2 dBc/Hz at 1-MHz offset and -132.5 dBc/Hz at 5-MHz offset. The integrated rms jitter of the PLL is 67.1 fs over an integration range from 10 kHz to 100 MHz. The PLL achieved a jitter-power FoM of -256.3 dB. The DCO phase noise in free-running mode at 4 GHz was -113.6 dBc/Hz at 1-MHz offset.

Fig. 26. Measured output spectrum of the proposed OSPLL.
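
The reported FoM can be reproduced from the measured jitter and power with the jitter-power figure of merit of [33], $\text{FoM} = 10\log_{10}\left[(\sigma_t / 1\,\text{s})^2 \cdot (P / 1\,\text{mW})\right]$. The short sketch below evaluates it for the measured values; the function name is illustrative.

```python
import math

def pll_jitter_power_fom_db(jitter_rms_s, power_w):
    """Jitter-power FoM in dB: 10*log10((sigma / 1 s)^2 * (P / 1 mW))."""
    return 10.0 * math.log10(jitter_rms_s ** 2 * (power_w / 1e-3))

fom_db = pll_jitter_power_fom_db(67.1e-15, 5.2e-3)
print(f"FoM = {fom_db:.1f} dB")  # prints "FoM = -256.3 dB"
```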

Fig. 25 also shows the measured phase noise spectrum of the conventional BBPLL mode for comparison. Note that for a fair comparison, both the OSPLL and BBPLL modes are measured with the optimal bandwidth, which gives the lowest jitter. The in-band phase noise improvement of the OSPLL was larger than 20 dB compared with the BBPLL, where the in-band phase noises of the OSPLL and the BBPLL were -130 and -105 dBc/Hz, respectively. The integrated rms jitter was improved from 430 fs (BBPLL) to 67 fs (OSPLL), a greater than  $5 \times$  improvement.

The measured jitter performance of the BBPLL (430 fs) is worse than the simulated jitter of the conventional BBPLL in Fig. 8 (122.3 fs). We verified with simulation that this is due to the phase noise contribution from the inverter-based reference buffer of the BBPLL, which is necessary to convert the sinusoidal reference from the crystal oscillator to the rectangular pulse for the BBPFD. Since the input of the reference buffer is a sinusoidal clock with slow slope, the noise generated from the first-stage inverter is large [10]. The inverter chain reference buffer used in this design consumes 100  $\mu$ W. Increased power could lower the noise contribution, but at the expense of higher overall power [21]. The proposed OSPLL avoids this issue since the OSPLL directly samples the sinusoidal reference from the crystal oscillator so that it does not need a reference buffer.

Fig. 26 shows the output spectrum of the proposed PLL measured with a Keysight EXA N9010B spectrum analyzer. The PLL achieved a reference spur of -78.1 dBc.

Fig. 27 shows the measured PLL output jitter according to the delay code of the proportional path timing control scheme that controls the varactor switching timing of the DCO proportional path. Across all 5-bit delay codes, the measured PLL output jitter changes from 67 to 82 fs, demonstrating the importance of the varactor switching timing control. The optimal code can be set around 11 for minimum output jitter. The range of the delay line is larger than  $0.5 \times T_{DCO}$  to cover the optimal region given PVT variation.



Fig. 27. Measured output jitter of the PLL with different delay codes of the DCO tuning pulse control scheme.



Fig. 28. Measured power breakdown of the PLL.



Fig. 29. Measured PLL jitter across the supply voltages.

Fig. 28 shows the measured power consumption of each building block. The RSPD, DCO, DCO buffer, multiphase generator and digital logic consume 1.5, 1.7, 0.6, 1.0, and 0.4 mW, respectively, resulting in 5.2 mW of total power. Note that three RSPDs near the peak and three RSPDs near the bottom are turned off to save power.

Fig. 29 shows the variation of the PLL output jitter across the supply voltages from 0.85 to 0.95 V. The jitter variation was less than 5%. Fig. 30 shows the jitter and reference spur measurement of five different chips. Note that the same supply, bias, and control codes are used for all chips. The jitter performance varied from 67.1 to 72.2 fs, while the spur performance measurement results range from -79.3 to -75.0 dBc.

Table I compares the performance of the proposed PLL with recent low-jitter integer-N PLLs. Thanks to the oversampling, the proposed PLL achieved a low in-band phase noise of -130 dBc/Hz. The measured bandwidth was 20 MHz, much larger than that of state-of-the-art PLLs. By combining the low in-band phase noise and the wide bandwidth, the proposed PLL achieved a sub-100-fs jitter of 67.1 fs<sub>rms</sub> while maintaining excellent power efficiency, yielding a state-of-the-art jitter-power FoM of -256.3 dB. The proposed PLL demonstrated the best performance in terms of reference spur compared with the other PLLs. The performance comparison of the proposed PLL and recently published PLLs in terms of reference spur versus PLL FoM is shown in Fig. 31. The proposed OSPLL achieved state-of-the-art performance in jitter, power, and spur among recently published work.

|                                            | This Work       | Ravinuthula [41], RFIC 16 | Turker [40], ISSCC 18    | Sharkia [11], ISSCC 18   | Kim [13], ISSCC 19        | Zhang [12], ISSCC 19      | Sharma [29], ISSCC 18 | Mercandelli [39], ISSCC 20 | Elkholy [17], ISSCC 16 |
|--------------------------------------------|-----------------|---------------------------|--------------------------|--------------------------|---------------------------|---------------------------|-----------------------|----------------------------|------------------------|
| Type                                       | OSPLL, Type-II  | Charge-pump PLL, Type-II  | Charge-pump PLL, Type-II | Sub-sampling PLL, Type-I | Sub-sampling PLL, Type-II | Sub-sampling PLL, Type-II | Sampling PLL, Type-I  | Sampling PLL, Type-I       | ILCM, Type-I           |
| Process (nm)                               | 28              | 65                        | 16                       | 65                       | 65                        | 40                        | 65                    | 28                         | 65                     |
| Reference (MHz)                            | 200             | 500                       | 500                      | 100                      | 100                       | 200                       | 50                    | 500                        | 125                    |
| F<sub>BW</sub> (MHz)                       | 20              | 0.2                       | 3                        | 6                        | 2                         | 4                         | 2                     | 7                          | 1                      |
| $F_{BW}/F_{REF}$                           | 1/10            | 1/2500                    | 1/166                    | 1/17                     | 1/50                      | 1/50                      | 1/25                  | 1/71                       | 1/125                  |
| F<sub>OUT</sub> (GHz)                      | 4               | 8                         | 12.5                     | 5                        | 3.8                       | 14                        | 2.55                  | 12.5                       | 8                      |
| Int. Jitter (fs)                           | 67.1 (69.3\*)   | 110                       | 53.6                     | 185.3                    | 72                        | 56.4                      | 110                   | 51.7                       | 104.7                  |
| Int. Range (Hz)                            | 10K-100M        | 10K-20M                   | 10K-10M                  | 10K-50M                  | 1K-30M                    | 1K-100M                   | 10K-100M              | 1K-100M                    | 10K-30M                |
| In-band PN (dBc/Hz, normalized to 4 GHz)   | -130            | -104                      | -127.1                   | -122.3                   | -123.6                    | N/A                       | -121.3                | -129                       | -115.6                 |
| Ref. Spur (dBc)                            | -78.1 (-77.5\*) | N/A                       | -75.5                    | -64.1                    | -75                       | -64.6                     | -63                   | -73.5                      | -43                    |
| Power (mW)                                 | 5.2             | 140                       | 45                       | 1.1                      | 19.1                      | 7.2                       | 3.7                   | 18                         | 2.7                    |
| FOM (dB)                                   | -256.3          | -237.7                    | -246.8                   | -254                     | -250.1                    | -256.4                    | -253.5                | -253.2                     | -255.4                 |
| Area (mm<sup>2</sup>)                      | 0.17            | 1.48                      | 0.35                     | 0.01                     | 0.21                      | 0.23                      | 0.36                  | 0.16                       | 0.27                   |

TABLE I Comparison With Recent Low-Jitter PLLs

\*Averaged across 5 chips

Fig. 30. Measured PLL jitter and reference spur of five chips.

Fig. 31. Reference spur versus PLL FoM comparison plot of prior arts.

## VI. CONCLUSION

In this article, we presented an ultralow-jitter LC-DCO-based reference oversampling PLL. The reference oversampling technique effectively boosts the reference frequency to the output frequency, lowering in-band phase noise and achieving a wide bandwidth at the same time. The ac coupling technique in the RSPD removes the effect of the offset and timing mismatch of the RSPD, enabling low reference spur. Since the sinewave reference from the crystal oscillator is directly utilized, the noisy and power-hungry reference buffer in conventional PLLs can be removed. The varactor-switching noise of the LC-DCO is minimized by the proposed DCO tuning pulse timing control scheme, so the optimal PLL jitter performance is maintained. The proposed PLL, fabricated in 28-nm CMOS, achieved 67.1-fs<sub>rms</sub> measured jitter performance and 5.2 mW of power consumption at 4 GHz, resulting in -256.3-dB PLL FoM, while maintaining -78.1-dBc reference spur.

## REFERENCES

- [1] W. El-Halwagy, A. Nag, P. Hisayasu, F. Aryanfar, P. Mousavi, and M. Hossain, "A 28-GHz quadrature fractional-N frequency synthesizer for 5G transceivers with less than 100-fs jitter based on cascaded PLL architecture," *IEEE Trans. Microw. Theory Techn.*, vol. 65, no. 2, pp. 396–413, Feb. 2017.
- [2] W. Wu et al., "A 28-nm 75-fsrms analog fractional-N sampling PLL with a highly linear DTC incorporating background DTC gain calibration and reference clock duty cycle correction," *IEEE J. Solid-State Circuits*, vol. 54, no. 5, pp. 1254–1265, May 2019.
- [3] L. Kull et al., "A 24–72-GS/s 8-b time-interleaved SAR ADC with 2.0–3.3-pJ/Conversion and >30 dB SNDR at Nyquist in 14-nm CMOS FinFET," *IEEE J. Solid-State Circuits*, vol. 53, no. 12, pp. 3508–3516, Dec. 2018.

- [4] K. J. Wang, A. Swaminathan, and I. Galton, "Spurious tone suppression techniques applied to a wide-bandwidth 2.4 GHz fractional-N PLL," *IEEE J. Solid-State Circuits*, vol. 43, no. 12, pp. 2787–2797, Dec. 2008.
- [5] T. Wu, P. K. Hanumolu, K. Mayaram, and U. K. Moon, "Method for constant loop bandwidth in LC-VCO PLL frequency synthesizers," *IEEE J. Solid-State Circuits*, vol. 44, no. 2, pp. 427–434, Feb. 2009.
- [6] C.-T. Ko, T.-K. Kuan, R.-P. Shen, and C.-H. Chang, "A 7-nm FinFET CMOS PLL with 388-fs jitter and -80-dBc reference spur featuring a track-and-hold charge pump and automatic loop gain control," *IEEE J. Solid-State Circuits*, vol. 55, no. 4, pp. 1043–1050, Apr. 2020.
- [7] F. M. Gardner, Phaselock Techniques. Hoboken, NJ, USA: Wiley, 2005.
- [8] A. Elkholy, D. Coombs, R. K. Nandwana, A. Elmallah, and P. K. Hanumolu, "A 2.5–5.75-GHz ring-based injection-locked clock multiplier with background-calibrated reference frequency doubler," *IEEE J. Solid-State Circuits*, vol. 54, no. 7, pp. 2049–2058, Jul. 2019.
- [9] S. Levantino, G. Marzin, C. Samori, and A. L. Lacaita, "A wideband fractional-N PLL with suppressed charge-pump noise and automatic loop filter calibration," *IEEE J. Solid-State Circuits*, vol. 48, no. 10, pp. 2419–2429, Oct. 2013.
- [10] X. Gao, E. A. M. Klumperink, M. Bohsali, and B. Nauta, "A low noise sub-sampling PLL in which divider noise is eliminated and PD/CP noise is not multiplied by N<sup>2</sup>," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3253–3263, Dec. 2009.
- [11] A. Sharkia, S. Mirabbasi, and S. Shekhar, "A 0.01 mm<sup>2</sup> 4.6-to-5.6GHz sub-sampling type-I frequency synthesizer with -254dB FOM," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 256–258.
- [12] Z. Zhang, G. Zhu, and C. P. Yue, "A 0.65 V 12-to-16 GHz sub-sampling PLL with 56.4 fs<sub>rms</sub> integrated jitter and -256.4 dB FoM," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 488–489.
- [13] J. Kim et al., "16.2 A 76fsrms jitter and -40dBc integrated-phase-noise 28-to-31GHz frequency synthesizer based on digital sub-sampling PLL using optimally spaced voltage comparators and background loop-gain optimization," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 258–259.
- [14] X. Gao, E. A. M. Klumperink, G. Socci, M. Bohsali, and B. Nauta, "Spur reduction techniques for phase-locked loops exploiting a subsampling phase detector," *IEEE J. Solid-State Circuits*, vol. 45, no. 9, pp. 1809–1821, Sep. 2010.
- [15] Y. Lim *et al.*, "17.8 A 170MHz-lock-in-range and -253dB-FoM<sub>jitter</sub> 12-to-14.5GHz subsampling PLL with a 150μW frequency-disturbance-correcting loop using a low-power unevenly spaced edge generator," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 280–282.
- [16] S. Choi et al., "153 fs<sub>rms</sub>-integrated-jitter and 114-multiplication factor PVT-robust 22.8 GHz ring-LC-hybrid injection-locked clock multiplier," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2018, pp. 185–186.
- [17] A. Elkholy, A. Elmallah, M. Elzeftawi, K. Chang, and P. K. Hanumolu, "10.6 A 6.75-to-8.25GHz, 250fsrms-integrated-jitter 3.25 mW rapid on/off PVT-insensitive fractional-N injection-locked clock multiplier in 65nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Jan. 2016, pp. 192–193.
- [18] A. Elkholy, M. Talegaonkar, T. Anand, and P. Kumar Hanumolu, "Design and analysis of low-power high-frequency robust sub-harmonic injection-locked clock multipliers," *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 3160–3174, Dec. 2015.
- [19] T.-K. Kuan and S.-I. Liu, "A bang bang phase-locked loop using automatic loop gain control and loop latency reduction techniques," *IEEE J. Solid-State Circuits*, vol. 51, no. 4, pp. 821–831, Apr. 2016.
- [20] L. Bertulessi, L. Grimaldi, D. Cherniak, C. Samori, and S. Levantino, "A low-phase-noise digital bang-bang PLL with fast lock over a wide lock range," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 252–254.
- [21] A. Santiccioli *et al.*, "A 66-fs-rms jitter 12.8-to-15.2-GHz fractional-N bang-bang PLL with digital frequency-error recovery for fast locking," *IEEE J. Solid-State Circuits*, vol. 55, no. 12, pp. 3349–3361, Dec. 2020.
- [22] D. Park and S. Cho, "A 14.2 mW 2.55-to-3 GHz cascaded PLL with reference injection and 800 MHz delta-sigma modulator in 0.13  $\mu$ m CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 2989–2998, Dec. 2012.
- [23] R. B. Staszewski, D. Leipold, C.-M. Hung, and P. T. Balsara, "TDC-based frequency synthesizer for wireless applications," in *Proc. IEEE Radio Freq. Integr. Circuits (RFIC) Syst. Dig. Papers*, Jun. 2004, pp. 215–218.

- [24] X. Gao et al., "A 28 nm CMOS digital fractional-N PLL with -245.5 dB FOM and a frequency tripler for 802.11abgn/ac radio," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2015, pp. 1–3.
- [25] F. Song, Y. Zhao, B. Wu, L. Tang, L. Lin, and B. Razavi, "16.5 A fractional-N synthesizer with 110fsrms jitter and a reference quadrupler for wideband 802.11ax," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 264–265.
- [26] K. M. Megawer, A. Elkholy, M. G. Ahmed, A. Elmallah, and P. K. Hanumolu, "Design of crystal-oscillator frequency quadrupler for low-jitter clock multipliers," *IEEE J. Solid-State Circuits*, vol. 54, no. 1, pp. 65–74, Jan. 2019.
- [27] J.-H. Seol, D. Sylvester, D. Blaauw, and T. Jang, "A reference oversampling digital phase-locked loop with -240 dB FOM and -80 dBc reference spur," in *Proc. Symp. VLSI Circuits*, Jun. 2019, pp. C160–C161.
- [28] J.-H. Seol, K. Choo, D. Blaauw, D. Sylvester, and T. Jang, "A 67fsrms jitter, -130 dBc/Hz in-band phase noise, -256-dB FoM reference oversampling digital PLL with proportional path timing control," *IEEE Solid-State Circuits Lett.*, vol. 3, pp. 430–433, 2020.
- [29] J. Sharma and H. Krishnaswamy, "A dividerless reference-sampling RF PLL with -253.5dB jitter FOM and <-67dBc reference spurs," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 258–260.
- [30] I. L. Syllaios, R. B. Staszewski, and P. T. Balsara, "Time-domain modeling of an RF all-digital PLL," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 55, no. 6, pp. 601–605, Jun. 2008.
- [31] A. Hajimiri and T. H. Lee, "A general theory of phase noise in electrical oscillators," *IEEE J. Solid-State Circuits*, vol. 33, no. 2, pp. 179–194, Feb. 1998.
- [32] R. B. Staszewski, D. Leipold, K. Muhammad, and P. T. Balsara, "Digitally controlled oscillator (DCO)-based architecture for RF frequency synthesis in a deep-submicrometer CMOS process," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 50, no. 11, pp. 815–828, Nov. 2003.
- [33] X. Gao, E. A. M. Klumperink, P. F. J. Geraedts, and B. Nauta, "Jitter analysis and a benchmarking figure-of-merit for phase-locked loops," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 56, no. 2, pp. 117–121, Feb. 2009.
- [34] D. Paik, M. Miyahara, and A. Matsuzawa, "An analysis on a dynamic amplifier and calibration methods for a pseudo-differential dynamic comparator," *IEICE Trans. Fundamentals Electron., Commun. Comput. Sci.*, vol. E95-A, no. 2, pp. 456–470, 2012.
- [35] M. van Elzakker, E. van Tuijl, P. Geraedts, D. Schinkel, E. A. M. Klumperink, and B. Nauta, "A 10-bit charge redistribution ADC consuming 1.9 μW at 1 MS/s," *IEEE J. Solid-State Circuits*, vol. 45, no. 5, pp. 1007–1015, May 2010.
- [36] P.-H.-P. Wang *et al.*, "A near-zero-power wake-up receiver achieving -69-dBm sensitivity," *IEEE J. Solid-State Circuits*, vol. 53, no. 6, pp. 1640–1652, Jun. 2018.
- [37] M. Miyahara, Y. Asada, D. Paik, and A. Matsuzawa, "A low-noise self-calibrating dynamic comparator for high-speed ADCs," in *Proc. IEEE Asian Solid-State Circuits Conf.*, Nov. 2008, pp. 269–272.
- [38] Z. Ru, N. A. Moseley, E. Klumperink, and B. Nauta, "Digitally enhanced software-defined radio receiver robust to out-of-band interference," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3359–3375, Dec. 2009.
- [39] M. Mercandelli *et al.*, "17.5 A 12.5GHz fractional-N type-I sampling PLL achieving 58fs integrated jitter," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 274–276.
- [40] D. Turker *et al.*, "A 7.4-to-14 GHz PLL with 54 fsrms jitter in 16 nm FinFET for integrated RF-data-converter SoCs," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 378–379.
- [41] V. Ravinuthula and S. Finocchiaro, "A low power high performance PLL with temperature compensated VCO in 65nm CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, May 2016, pp. 31–34.
- [42] S. Ek *et al.*, "A 28-nm FD-SOI 115-fs jitter PLL-based LO system for 24–30-GHz sliding-IF 5G transceivers," *IEEE J. Solid-State Circuits*, vol. 53, no. 7, pp. 1988–2000, Jul. 2018.
- [43] L. Kong and B. Razavi, "A 2.4 GHz 4 mW integer-N inductorless RF synthesizer," *IEEE J. Solid-State Circuits*, vol. 51, no. 3, pp. 626–635, Mar. 2016.
- [44] J.-S. Lee, M.-S. Keel, S.-I. Lim, and S. Kim, "Charge pump with perfect current matching characteristics in phase-locked loops," *Electron. Lett.*, vol. 36, no. 23, pp. 1907–1908, Nov. 2000.
- [45] Z. Xu, M. Miyahara, K. Okada, and A. Matsuzawa, "A 3.6 GHz low-noise fractional-N digital PLL using SAR-ADC-based TDC," *IEEE J. Solid-State Circuits*, vol. 51, no. 10, pp. 2345–2356, Oct. 2016.

- [46] M. Zanuso, D. Tasca, S. Levantino, A. Donadel, C. Samori, and A. L. Lacaita, "Noise analysis and minimization in bang-bang digital PLLs," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 56, no. 11, pp. 835–839, Nov. 2009.
- [47] H. Xu and A. A. Abidi, "Design methodology for phase-locked loops using binary (bang-bang) phase detectors," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 64, no. 7, pp. 1637–1650, Jul. 2017.
- [48] S. S. Nagam and P. R. Kinget, "A low-jitter ring-oscillator phase-locked loop using feedforward noise cancellation with a sub-sampling phase detector," *IEEE J. Solid-State Circuits*, vol. 53, no. 3, pp. 703–714, Mar. 2018.
- [49] I. Galton, "Delta-sigma fractional-N phase-locked loops," in *Phase Locking in High-Performance Systems*. New York, NY, USA: Wiley, 2003, pp. 23–33.
- [50] A. Elkholy *et al.*, "A 3.7 mW low-noise wide-bandwidth 4.5 GHz digital fractional-N PLL using time-amplifier-based TDC," *IEEE J. Solid-State Circuits*, vol. 50, no. 4, pp. 867–880, Apr. 2015.
- [51] B. Razavi, "Jitter-power trade-offs in PLLs," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 68, no. 4, pp. 1381–1387, Apr. 2021.



He joined Samsung Electronics DRAM Design Team, Hwasung, South Korea, in 2012, where he contributed to the development of mobile DRAMs, including LPDDR2, LPDDR3, LPDDR4, and LPDDR5. His research interests include frequency synthesizers, deep learning hardware, memory systems, and ultralow power system design.



**Kyojin Choo** (Member, IEEE) received the B.S. and M.S. degrees in electrical engineering from Seoul National University, Seoul, South Korea, in 2007 and 2009, respectively, and the Ph.D. degree from the University of Michigan, Ann Arbor, MI, USA, in 2018.

He is currently a Post-Doctoral Research Fellow at the University of Michigan, Ann Arbor. From 2009 to 2013, he was with the Image Sensor Development Team, Samsung Electronics, Hwasung, South Korea, where he designed signal readout chains for mobile/DSLR image sensors. During his Ph.D., he interned with Apple, Cupertino, CA, USA, and consulted for several companies, including Sony Electronics, San Jose, CA, USA. He holds 17 U.S. patents. His research interests include charge-domain circuits, sensor interfaces, energy converters, high-speed links/timing generators, and millimeter-scale integrated systems.



**David Blaauw** (Fellow, IEEE) received the B.S. degree in physics and computer science from Duke University, Durham, NC, USA, in 1986, and the Ph.D. degree in computer science from the University of Illinois at Urbana–Champaign, Champaign, IL, USA, in 1991.

Until August 2001, he worked for Motorola, Inc., Austin, TX, USA, where he was the Manager of the High Performance Design Technology Group and won the Motorola Innovation Award. Since August 2001, he has been on the faculty of the University of Michigan, Ann Arbor, MI, USA, where he is the Kensall D. Wise Collegiate Professor of EECS. He is the Director of the Michigan Integrated Circuits Laboratory. He has published over 600 articles and holds 65 patents. He has researched ultralow-power wireless sensors using subthreshold operation and low-power analog circuit techniques for millimeter systems. This research was recognized by the MIT Technology Review as "one of the year's most significant innovations." His research group introduced so-called near-threshold computing, which has become a common concept in semiconductor design. Most recently, he has pursued research in cognitive computing using analog, in-memory neural networks for edge devices and genomics for precision health.

Dr. Blaauw has received numerous best paper awards. He was the General Chair of the IEEE International Symposium on Low Power and a member of the IEEE International Solid-State Circuits Conference (ISSCC) Analog Program Subcommittee. He received the 2016 SIA-SRC Faculty Award for lifetime research contributions to the U.S. semiconductor industry.



**Dennis Sylvester** (Fellow, IEEE) received the Ph.D. degree in electrical engineering from UC-Berkeley, Berkeley, CA, USA, in 1999.

He is the Edward S. Davidson Collegiate Professor of electrical and computer engineering at the University of Michigan, Ann Arbor, MI, USA. His research has been commercialized via three major venture capital funded startup companies: Ambiq Micro, Cubeworks, and Mythic. His main research interests are in the design of miniaturized ultralow power microsystems, touching on analog, mixed-signal, and digital circuits. He has published over 500 articles and holds more than 50 U.S. patents in these areas.

Dr. Sylvester has received 14 best paper awards and nominations and was named a Top Contributing Author at ISSCC and the most prolific author at the IEEE Symposium on VLSI Circuits. He is currently a member of the Administrative Committee for the IEEE Solid-State Circuits Society, an Associate Editor for the IEEE Journal of Solid-State Circuits, and was an IEEE Solid-State Circuits Society Distinguished Lecturer for 2016–2017. He held research staff positions at Synopsys, Mountain View, CA, USA, and Hewlett-Packard Laboratories, Palo Alto, CA, USA, as well as visiting professorships at the National University of Singapore, Singapore, and Nanyang Technological University, Singapore.



**Taekwang Jang** (Senior Member, IEEE) received the B.S. and M.S. degrees in electrical engineering from KAIST, Daejeon, South Korea, in 2006 and 2008, respectively, and the Ph.D. degree from the University of Michigan, Ann Arbor, MI, USA, in 2017. His dissertation was titled "Circuit and System Designs for Millimeter-Scale IoT and Wireless Neural Recording."

From 2008 to 2013, he worked at Samsung Electronics Company Ltd., Yongin, South Korea, focusing on mixed-signal circuit design, including analog and all-digital phase-locked loops for communication systems and mobile processors. After working as a Post-Doctoral Research Fellow at the University of Michigan, he joined ETH Zürich, Zürich, Switzerland, in 2018, as an Assistant Professor and is leading the Energy-Efficient Circuits and IoT Systems Group.

Dr. Jang is a member of the Competence Center for Rehabilitation Engineering and Science and the Chair of the IEEE Solid-State Circuits Society, Switzerland Chapter, at ETH Zürich. He was a co-recipient of the IEEE Transactions on Circuits and Systems 2009 Guillemin-Cauer Best Paper Award. His research interests include ultralow power systems, biomedical circuits, frequency synthesizers, and data converters.