<table>
<thead>
<tr>
<th><strong>Title</strong></th>
<th>Behavior-level Analysis of a Successive Stochastic Approximation Analog-to-Digital Conversion System for Multi-channel Biomedical Data Acquisition</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Author(s)</strong></td>
<td>TANI, Sadahiro; MATSUOKA, Toshimasa; HIRAI, Yusaku; KURATA, Toshifumi; TATSUMI, Keiji; ASANO, Tomohiro; UEDA, Masayuki; KAMATA, Takatsugu</td>
</tr>
<tr>
<td><strong>Citation</strong></td>
<td>IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences. E100-A(10) P.2073-P.2085</td>
</tr>
<tr>
<td><strong>Issue Date</strong></td>
<td>2017-10-01</td>
</tr>
<tr>
<td><strong>Text Version</strong></td>
<td>publisher</td>
</tr>
<tr>
<td><strong>URL</strong></td>
<td><a href="http://hdl.handle.net/11094/65064">http://hdl.handle.net/11094/65064</a></td>
</tr>
<tr>
<td><strong>DOI</strong></td>
<td></td>
</tr>
<tr>
<td><strong>rights</strong></td>
<td>Copyright © 2017 The Institute of Electronics, Information and Communication Engineers</td>
</tr>
<tr>
<td><strong>Note</strong></td>
<td></td>
</tr>
</tbody>
</table>
Behavior-Level Analysis of a Successive Stochastic Approximation Analog-to-Digital Conversion System for Multi-Channel Biomedical Data Acquisition

Sadahiro TANI*, Member, Toshimasa MATSUOKA**, Senior Member, Yusaku HIRAI†, Toshifumi KURATA†, Keiji TATSUMI†, Tomohiro ASANO†, Masayuki UEDA††, Nonmembers, and Takatsugu KAMATA†††, Member

SUMMARY In the present paper, we propose a novel high-resolution analog-to-digital converter (ADC) for low-power biomedical analog front-ends, which we call the successive stochastic approximation ADC. The proposed ADC uses a stochastic flash ADC (SF-ADC) to realize a digitally controlled variable-threshold comparator in a successive-approximation-register ADC (SAR-ADC), which can correct errors originating from the internal digital-to-analog converter in the SAR-ADC. For the residual error after SAR-ADC operation, which can be smaller than thermal noise, the SF-ADC uses the statistical characteristics of noise to achieve high resolution. The SF-ADC output for the residual signal is combined with the SAR-ADC output to obtain high-precision output data using the supervised machine learning method.

key words: SAR-ADC, DAC error calibration, stochastic A/D conversion, mismatch, machine learning

1. Introduction

Due to rising health concerns, the need for wearable high-precision biomedical sensors that can operate on battery power for long periods is rapidly increasing. Integration of biomedical sensor nodes for monitoring various biopotential signals (e.g., electrocardiograms (ECGs), electroencephalograms (EEGs), or electromyograms (EMGs)), temperature, and pressure is also highly attractive in practice because a small, low-power, low-cost device can be realized. One way to realize this integration is a flexible and reconfigurable circuit design platform that takes into consideration time-to-market pressure and design complexity [1], [2]. Such a platform can help to realize low-power high-precision analog-to-digital converters (ADCs) that are suitable for use in a flexible and reconfigurable multi-channel biomedical data acquisition analog front-end (AFE), as shown in Fig. 1. The digital signal processing block filters, decimates, and demultiplexes according to the target signals. Although ΔΣ ADCs are popular in biomedical sensor applications [1], they are not suitable for multi-channel AFEs having fast scan rates. In the present study, an oversampled successive-approximation-register ADC (SAR-ADC) [3] is used to reduce noise and errors within the signal bandwidth. The oversampling ratio (OSR) depends on the number of channels used. In addition, considering low-voltage operation (≤ 0.5 V) for a low-power AFE, a small gain for low-noise amplifiers (LNAs) is preferred in order to reduce nonlinear distortion. Such a situation requires a sufficient dynamic range for the ADC, even under a low supply voltage. There is a trade-off between the power consumption and the precision of the ADC [4].

The SAR-ADC shown in Fig. 2 [5] converts continuous analog data to digital output via binary search through all possible quantization levels from $D_{n-1}$ of the most significant bit (MSB) to $D_0$ of the least significant bit (LSB). At each step of the search, an internal digital-to-analog converter (DAC) and comparators are used repeatedly. Typically, an internal DAC in a SAR-ADC is composed of arrayed capacitors with capacitances weighted by powers of two. The precision of the entire SAR-ADC is significantly affected by the precision of the DAC. However, a high-precision DAC is difficult to realize using small capacitors with a large mismatch. Therefore, error correction techniques are important for realizing high-resolution SAR-ADCs.
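The binary search described above can be sketched in a few lines. The following Python snippet is only a minimal illustration: it assumes an ideal single-ended DAC with range [0, v_ref) and an offset-free comparator, whereas the DAC discussed in this paper is differential and non-ideal. The function name `sar_convert` and its parameters are hypothetical.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Ideal n-bit SAR conversion by binary search from the MSB to the LSB.

    Assumption: single-ended input in [0, v_ref), ideal DAC and comparator.
    """
    code = 0
    for i in reversed(range(n_bits)):
        trial = code | (1 << i)                   # tentatively set bit i
        v_dac = v_ref * trial / (1 << n_bits)     # ideal DAC output for the trial code
        if v_in >= v_dac:                         # comparator decision
            code = trial                          # keep the bit
    return code


if __name__ == "__main__":
    print(sar_convert(0.3, 0.5, 12))
```

Each of the n steps reuses the same DAC and comparator, which is why any DAC error directly affects every bit decision.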

The present paper demonstrates the feasibility of a novel low-voltage (≤ 0.5 V) high-resolution ADC system for low-power AFEs, which we call the successive stochastic approximation ADC (SSA-ADC). Although the proposed system is based on the SAR-ADC architecture, it has two key features. One is the use of a stochastic flash ADC (SF-ADC) with statistical device mismatches [6]–[9], which can operate as a comparator with a digitally controlled threshold [10]. This is useful for reducing the influence of comparator offset in the SAR-ADC operation. In addition, the SF-ADC originates from the principle of stochastic resonance, which is used to detect small signals with noise [11]–[14]. By applying the statistical noise characteristics of the SF-ADC to additionally quantize the residual signal in the last SAR-ADC operation, which can be smaller than the thermal noise, signal resolution enhancement can be expected. The other feature of the SSA-ADC is error correction using a machine learning method, which can realize a more complicated error correction function than a simple digital-domain calibration technique [5]. It corrects errors originating from capacitor mismatch in the internal capacitor DAC and from non-ideality of the SF-ADC used for the additional quantization. This technique, which was developed based on previous studies on the SAR-ADC [15], [16], can combine the SF-ADC output for the residual signal with the SAR-ADC output to obtain high-precision output data.

2. Stochastic Flash ADC

As shown in Fig. 3, the SF-ADC comprises an array of $N$ comparators and a ones adder. The ones adder outputs a binary code, which corresponds to the number of comparators outputting ‘high’. The SF-ADC is designed based on stochastic resonance, where a weak signal can be enhanced by adding white noise or by employing the input-referred (IR) offset distribution of a comparator ensemble [11]–[14]. Although the offset voltage $\Delta_{\text{off}}$ of an individual comparator cannot be predicted, statistical values such as the standard deviation and the mean can be calculated. For a sufficiently large number of comparators, the comparator offset follows a Gaussian distribution. Therefore, the probability $P(V_{\text{in}})$ that one comparator outputs one (high) can be expressed as follows [8], [9], [14]:

$$P(V_{\text{in}}) = 1 - \frac{1}{2} \text{erfc} \left( \frac{V_{\text{in}}}{\sqrt{2} \sigma_{\text{off}}} \right) \approx \frac{n_H}{N}, \quad (1)$$

where \(\text{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} \exp(-z^2)dz\) is the complementary error function, $V_{\text{in}}$ is the input voltage, $\sigma_{\text{off}}$ is the standard deviation of the offset voltages, and the mean of the IR offset voltages is assumed to be zero. Moreover, $n_H$ is the number of comparators outputting high and corresponds to the output of the ones adder in Fig. 3. The SF-ADC determines its digital output according to $n_H$. Based on these characteristics, the SF-ADC can be used as a digitally controlled variable-threshold comparator (DCVTC) by comparing $n_H$ with the digital threshold $D_{\text{th}}$, which quantizes $n_H$ to a 1-bit code, as shown in Fig. 4. This operation mode of the SF-ADC is used in the SAR-ADC in the present study (SAR-ADC mode). The concept of the DCVTC was also used in a previous study on a multi-bit ΔΣ ADC [10].
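As an informal check of Eq. (1) and of the DCVTC idea, the following Python sketch models the SF-ADC as $N$ comparators with fixed Gaussian offsets. It is a behavioral toy model under stated assumptions (zero-mean offsets, no time-varying noise, a comparator outputs high when the input exceeds its own offset); the function names and the seed are hypothetical, and the values $N = 511$ and $\sigma_{\text{off}} = 3$ mV are borrowed from the simulation conditions given later in the paper.

```python
import math
import random


def comparator_probability(v_in, sigma_off):
    """Eq. (1): probability that a single comparator outputs high."""
    return 1.0 - 0.5 * math.erfc(v_in / (math.sqrt(2.0) * sigma_off))


def make_offsets(n_comparators, sigma_off, seed=0):
    """Draw one fixed Gaussian offset per comparator (assumed zero mean)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma_off) for _ in range(n_comparators)]


def ones_adder(v_in, offsets):
    """n_H: number of comparators whose output is high."""
    return sum(1 for off in offsets if v_in > off)


def dcvtc(v_in, offsets, d_th):
    """1-bit decision obtained by comparing n_H with the digital threshold D_th."""
    return 1 if ones_adder(v_in, offsets) > d_th else 0


N = 511
SIGMA_OFF = 3e-3
offsets = make_offsets(N, SIGMA_OFF)

for v_in in (-3e-3, 0.0, 3e-3):
    n_h = ones_adder(v_in, offsets)
    # Measured ratio n_H/N next to the prediction of Eq. (1)
    print(v_in, n_h / N, comparator_probability(v_in, SIGMA_OFF))
```

Changing `d_th` shifts the input voltage at which `dcvtc` flips, which is the sense in which the threshold is digitally controlled.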

The SF-ADC can also use statistical noise to additionally quantize the residual error in the last SAR-ADC operation (SF-ADC mode), which can be buried under time-varying noise, such as thermal and flicker noise. This technique can enhance the signal resolution so as to detect even signals that are buried under noise. The details are described in Sect. 3.

Figure 5 shows an example of small-signal detection using stochastic resonance. Even when the signal itself does not exceed the threshold in a comparator, adequately superimposing noise on the signal can generate comparator output according to the signal with a certain probability. If the noise can be assumed to follow a Gaussian distribution, the signal can be restored [14].
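Under the Gaussian assumption, restoring the signal amounts to inverting Eq. (1). The Python sketch below illustrates this with the standard-library `statistics.NormalDist`; it is not the encoding used in the SSA-ADC (which relies on machine learning, Sect. 5), and the function name `restore_input` and the clamping of the ratio away from 0 and 1 are assumptions made only for this illustration.

```python
from statistics import NormalDist


def restore_input(n_h, n_comparators, sigma_off):
    """Estimate V_in from n_H by inverting Eq. (1) for zero-mean Gaussian offsets.

    The ratio n_H/N is clamped to stay strictly inside (0, 1), because the
    inverse Gaussian CDF is undefined at the end points (illustrative choice).
    """
    half_step = 0.5 / n_comparators
    p = min(max(n_h / n_comparators, half_step), 1.0 - half_step)
    return sigma_off * NormalDist().inv_cdf(p)


if __name__ == "__main__":
    # Half of the comparators high corresponds to an input near zero.
    print(restore_input(255.5, 511, 3e-3))
```

The clamping shows why the usable input range of such a converter is limited to a few $\sigma_{\text{off}}$: near saturation, a single count changes the estimate by a large amount.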

Table 1 shows types of noise produced inside the SSA-ADC and their characteristics. Each comparator generates noise independently. From Fig. 3, the IR noise for each comparator can be merged with its offset as an instantaneous comparator offset. In other words, the comparator offset is equivalent to the DC IR noise [14]. Neither the IR noise nor the offset has a correlation in ensemble statistics. Thus, the standard deviation of the instantaneous comparator offset ($\sigma_{\text{off,eff}}$) can easily be estimated as follows:

$$\sigma_{\text{off,eff}} = \sqrt{\sigma_{\text{off}}^2 + \sigma_n^2}, \quad (2)$$

where $\sigma_n$ is the standard deviation of IR comparator noise.

In order to use stochastic resonance, the probability of the comparator response to the noise-superimposed signal must be observed at higher resolution. There are two methods by which to enhance the resolution, i.e., increasing the number of comparators and oversampling. Based on previous studies [14], [17], the resolution is inversely proportional to $\sqrt{N N_S}$, where $N_S$ is the oversampling ratio for the SF-ADC. In the present study, a small value of $\sigma_{\text{off,eff}}$ is considered for detecting the noise-level signal, which requires an offset cancellation technique in circuit design. The area of the comparator must therefore be large compared to that of the standard-cell-like comparator [18]. This limits the number of comparators $N$ considering the total occupied area. Unlike the linearization of the piecewise inverse Gaussian approximation [18], error correction and encoding using supervised machine learning, as described in Sect. 5, relaxes the assumption of a Gaussian comparator offset, which is useful for reducing $N$. In the SSA-ADC, an oversampling technique is also used to quantize the residual signal in the last SAR-ADC operation with high resolution, where the SF-ADC block samples the same input $N_S$ times, and the SF-ADC output is averaged in the time domain. As shown in Table 1, this oversampling reduces the influence of DAC and buffer noise shared in an ensemble of comparators in the SF-ADC block, which is not reduced by ensemble statistics [14].
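The effect of time-domain averaging on noise that is shared by all comparators can be illustrated with a small Monte Carlo experiment. The Python sketch below is a toy model, not the authors' MATLAB simulation: it assumes fixed Gaussian offsets, independent Gaussian noise per comparator, and one shared Gaussian noise sample per SF-ADC sampling. The noise values (120 µV comparator noise, 50 µV buffer IR noise, 3 mV offset) are taken from the simulation conditions given later in the paper; the function names, seed, and trial count are hypothetical.

```python
import random
import statistics


def sf_adc_average(v_in, offsets, n_s, sigma_n, sigma_shared, rng):
    """Average n_H over n_s samplings of the same input."""
    total = 0
    for _ in range(n_s):
        shared = rng.gauss(0.0, sigma_shared)   # common to all comparators
        total += sum(
            1 for off in offsets
            if v_in + shared + rng.gauss(0.0, sigma_n) > off
        )
    return total / n_s


def spread(n_s, trials=60, seed=1):
    """Standard deviation of the averaged SF-ADC output for a zero input."""
    rng = random.Random(seed)
    offsets = [rng.gauss(0.0, 3e-3) for _ in range(511)]
    outputs = [
        sf_adc_average(0.0, offsets, n_s, 120e-6, 50e-6, rng)
        for _ in range(trials)
    ]
    return statistics.pstdev(outputs)


if __name__ == "__main__":
    print("N_S = 1:", spread(1))
    print("N_S = 8:", spread(8))
```

In this model the output spread for $N_S = 8$ is clearly smaller than for $N_S = 1$, because the shared noise sample differs between samplings and is averaged out in time, whereas adding comparators alone would not reduce it.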

### 3. System Configuration

The system configuration of the proposed SSA-ADC system is shown in Fig. 6. It has two modes for A/D conversion and a test mode for DCVTC calibration described in Sect. 4. In the SAR-ADC mode for MSB-side conversion (solid lines in Fig. 6), the SF-ADC operates as a DCVTC. Some errors associated with the internal DAC in this mode are corrected by controlling the digital threshold $D_{\text{th}}$ dynamically at each conversion step, where the controlling data are stored in the register according to a foreground test. In the SF-ADC mode for LSB-side conversion (double lines in Fig. 6), the SF-ADC quantizes the residual error in the last SAR-ADC operation. Oversampling and averaging for high resolution can be implemented with a simple cumulative adder. The machine learning function combines the output $D_U$ from the SAR-ADC mode and the output $D_L$ from the SF-ADC mode and generates the total output $D_{\text{out}}$ with calibration parameters obtained during the learning mode. The bit configurations of $D_U$, $D_L$, and $D_{\text{out}}$ are expressed in Fig. 7 and the following equations:

$$D_U = (D_{U,N_U-1}, \ldots, D_{U,0}),$$

$$D_L = (D_{L,N_L-1}, \ldots, D_{L,0}), \quad (3)$$

$$D_{\text{out}} = (D_{\text{out},N_{\text{out}}-1}, \ldots, D_{\text{out},0}).$$

Considering $N_S$-times oversampling in the SF-ADC mode (oversampling only inside the SF-ADC block, in addition to the oversampling described in Sect. 1), the conversion for a sampling point requires at least $N_U + N_S + 1$ steps. In the present study, in order to achieve the target resolution ($\approx 1/\sqrt{N N_S}$ [14], [17]) under limited conversion speed and occupation area, $N = 511$, $N_S = 8$, and $N_L = \log_2((N + 1)N_S) = 12$ are used. In addition, $N_U = n = 12$, $N_{L2} = 6$, and $N_{\text{out}} = 18$ are set. The addition of $N_{L1} = 4$ bits can correct the residual error in the SAR-ADC mode. For digital re-quantization to reduce non-linearity in the SF-ADC mode, $N_{L3} = N_L - N_{L1} - N_{L2} = 2$ bits are used as the fractional part.
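The bit budget stated above can be checked with a few lines of arithmetic. The Python sketch below only restates the numbers in the text; reading the 4-bit field as $N_{L1}$ (so that $N_{L3} = N_L - N_{L1} - N_{L2} = 2$ holds) and taking $N_{\text{out}} = N_U + N_{L2}$ are interpretations inferred from the surrounding relations, and the variable names are chosen for illustration.

```python
import math

N = 511      # number of comparators in the SF-ADC
N_S = 8      # oversampling ratio inside the SF-ADC block
N_U = 12     # bits resolved in the SAR-ADC mode
N_L1 = 4     # assumed: bits of D_L overlapping the SAR-ADC range
N_L2 = 6     # bits of D_L that extend the output word

N_L = int(math.log2((N + 1) * N_S))   # width of the averaged SF-ADC output
N_L3 = N_L - N_L1 - N_L2              # fractional bits
N_out = N_U + N_L2                    # total output word length
min_steps = N_U + N_S + 1             # minimum steps per sampling point

print(N_L, N_L3, N_out, min_steps)
```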

Both a foreground test to determine optimum digital thresholds ($V_{\text{in,test}}, D_{\text{in,test}}$, and $D_{\text{test}}$ in Fig. 6) and machine learning to obtain calibration parameters ($V_{\text{in,ideal}}$ and $D_{\text{in,ideal}}$ in Fig. 6) are repeated until the conversion error is reduced below the allowable level. Note that the comparator number $N$ is smaller than that in a previous report (2,047 in [18]) thanks to the oversampling and supervised machine learning techniques.

Details of some calibration techniques for the SSA-ADC system are described in the following sections.

### 4. DAC Error Correction

The internal capacitor DAC in the SAR-ADC mode of the SSA-ADC has a differential configuration and is implemented using capacitors and MOS switches, as shown in Fig. 8 [5]. A buffer is used to drive the input capacitance of the SF-ADC. The sample-and-hold function is also embedded in this circuit. In sampling, the switches connect to the input terminals ($V_{\text{in,p}}, V_{\text{in,n}}$), as indicated by the dashed lines. In successive comparisons, the switches connect to $V_{\text{REF},p}$ ($D_i = 1$) or $V_{\text{REF},n}$ ($D_i = 0$) according to the capacitor DAC input $D_i$ ($i = 0, \ldots, n - 1$), as indicated by the solid lines. The differential input voltage of the comparator (the SF-ADC block in the SSA-ADC) corresponds to the difference between the sampled value and the analog value generated from the digital input $D_i$.

The positive-side (negative-side) capacitors \( C_{i,p(n)} \) are weighted by powers of two with errors, as expressed by

\[
C_{i,p(n)} = \begin{cases}
2^i C_u \left( 1 + \varepsilon_i + s_{p(n)} \frac{\Delta \varepsilon_i}{2} \right), & i \leq d \\
2^{i-d} C_u \left( 1 + \varepsilon_i + s_{p(n)} \frac{\Delta \varepsilon_i}{2} \right), & i > d,
\end{cases} \quad (4)
\]

where \( s_p = 1 \), \( s_n = -1 \), \( C_u \) is the unit capacitance, \( \varepsilon_i \) is the deviation from the ideal value of \((C_{i,p} + C_{i,n})/2\), and \( \Delta \varepsilon_i \) is the difference between \( C_{i,p} \) and \( C_{i,n} \). Both \( \varepsilon_i \) and \( \Delta \varepsilon_i \) originate from capacitance mismatch. Split capacitor \( C_{C,p(n)} \), which can scale \( C_{i,p} \) and \( C_{i,n} \) to reduce the capacitance area, has a capacitance of \( C_u \) with some errors [19]. Here, \( C_{p1,p(n)} \) and \( C_{p2,p(n)} \) are parasitic capacitances, such as wiring capacitances. This section describes the DAC error from the capacitance mismatch and parasitic capacitances. The parasitic capacitance is assumed to be independent of applied voltage.

In the present study, focusing on charge conservation at the floating node of the capacitances in sampling and successive comparison, \( V_{\text{DAC},p} \) and \( V_{\text{DAC},n} \) in Fig. 8 were calculated for SAR-ADC numerical simulation. Considering the first-order terms of small errors, such as \( \varepsilon_i \) and \( \Delta \varepsilon_i \), \( V_{\text{DAC}} (= V_{\text{DAC},p} - V_{\text{DAC},n}) \) in successive comparison is expressed as follows:

\[
\begin{aligned}
V_{\text{DAC}} \approx \sum_{i=0}^{n-1} 2^{i-d} C_u &\left[ 1 + \varepsilon_i + \beta u \left( d - i + \frac{1}{2} \right) \right] \times \left[ (2D_i - 1)V_{\text{REF}} - V_{\text{in}} \right] \\
&+ \left( \Delta \varepsilon_i + \Delta \alpha \right) + \Delta \beta \left( d - i + \frac{1}{2} \right) \times (V_{\text{CM}} + \Delta V_{\text{in,REF}}) \\
&+ \alpha C_{p,eff,0} \left( \varepsilon_{p,eff} + \frac{\Delta \alpha}{\alpha} \right) V_{\text{CM}},
\end{aligned} \quad (5)
\]

where \( u(x) (u(x) = 0 (x < 0), 1 (x \geq 0)) \) is the unit step function, and \( V_{\text{REF}} = V_{\text{REF},p} - V_{\text{REF},n} \). The remaining parameters are given by

\[
\Delta V_{\text{in,REF}} = \frac{V_{\text{REF},p} - V_{\text{REF},n} - V_{\text{in},p} - V_{\text{in},n}}{2}, \quad (6)
\]

\[
\alpha + s_{p(n)} \frac{\Delta \alpha}{2} = C_{i1,p(n)} + C_{C,p(n)}, \quad (7)
\]

\[
\beta + s_{p(n)} \frac{\Delta \beta}{2} = C_{i1,p(n)} C_{i2,p(n)} - 1, \quad (8)
\]

\[
C_{i1,p(n)} = \sum_{i=0}^{d} C_{i,p(n)} + C_{p2,p(n)}, \quad (9)
\]


In the above analysis, capacitance mismatches between the positive and negative sides influence the effective comparator offset for each successive comparison, which corresponds to the terms related to $V_{CM}$ and $\Delta V_{in,REF}$ on the right-hand side of Eq. (5). This causes incorrect DAC code selection and a residual error in the last SAR-ADC operation that exceeds the SF-ADC input range. In addition, the nonlinearity of the capacitance (capacitor voltage coefficient) must be considered for high DAC linearity in the practical case [20], [21]. This dynamically influences capacitance mismatches between both sides according to $V_{in}$ and $D_i$.

In order to overcome such complicated capacitance mismatches, dynamic compensation for the effective comparator offset is proposed in the present study, assuming that the error of capacitor DAC $\Delta V_{DAC} = V_{DAC} - V_{DAC,ideal}$ depends on capacitance selection according to the digital input $D_{in} = (D_{n-1}, \ldots, D_1, D_0)$. The dependence of the DAC error on the digital input $D_{in}$ can be conceptually expressed as

$$\Delta V_{DAC}(D_{in}) = E_{off} + \sum_{i=0}^{n-1} E_i(D_i), \quad (14)$$

where $E_{off}$ is the error that is independent of the input code, and $E_i$ ($i = 0, \ldots, n - 1$) is the error that depends only on the input code $D_i$. The system configuration shown in Fig. 6 can correct errors expressed in Eq. (14) by determining the optimal digital threshold $D_{th}$ of the SF-ADC for each conversion step. A multiplexer inserted at the input terminal of the DAC replaces the input with test data $D_{in,test}$ in a test mode (DCVTC calibration mode indicated by dotted lines in Fig. 6), where $D_{in,test}$ and $V_{in,test}$ are obtained from external components as $D_{in}$ and $V_{in}$, respectively. Setting $V_{in,test} = V_{DAC,ideal}(D_{in,test})$ according to $D_{in,test}$ generates the DAC error $\Delta V_{DAC}$ at the input terminal of the comparator. Since the SF-ADC operates as a DCVTC in this system, the SF-ADC output corresponds to the optimal digital threshold $D_{th}$ for a given $D_{in,test}$. By using each value of the determined $D_{th}$ dynamically in the SAR-ADC mode, the DAC error can be reduced. Since the error factors from $\varepsilon_i$ on the right-hand side of Eq. (5) can be easily corrected via machine learning, as described later herein, the target factors for DAC error correction are the capacitance mismatches between the positive and negative sides; correcting them prevents incorrect DAC code selection. As a natural outcome, this correction process can reduce the offset in the comparator function realized by the SF-ADC in the SAR-ADC mode. The required precision for $V_{in,test}$ is related to the SF-ADC input range. Although the precision of $V_{in,test}$ influences $D_{th}$, error correction by machine learning can recover the precision in the final ADC output.

However, the number of optimal values of $D_{th}$ for all cases of $D_i$ ($i = 1, \ldots, n - 1$) is $2^{n-1}$. Note that the LSB $D_0$ is 0 for the first to the $(n - 1)$-th conversion steps and is 1 in the last step. For example, in the case of $n = 12$, the number of optimal $D_{th}$ values is 2,048. Considering the registers needed to store them and the evaluation tests needed to determine them, this is not practical. Therefore, an efficient technique is required for generating the optimal $D_{th}$ dynamically according to $D_{in}$.

**Fig. 8** $n$-bit capacitor DAC with an SF-ADC block.

The parameter definitions continued from Eq. (9) are as follows:

\[
C_{i2,p(n)} = \sum_{i=d+1}^{n-1} C_{i,p(n)} + C_{p1,p(n)}, \quad (10)
\]

\[
C_{tot,p(n)} = C_{i1,p(n)} + C_{i2,p(n)}, \quad (11)
\]

\[
C_{p,eff,0} \left(1 + s_{p(n)} \frac{\varepsilon_{p,eff}}{2}\right) = C_{p1,p(n)} + \frac{C_{i2,p(n)}C_{C,p(n)}}{C_{i1,p(n)} + C_{C,p(n)}}, \quad (12)
\]

\[
V_{DAC,ideal} = \sum_{i=0}^{n-1} \frac{2^i}{2^n - 1} (2D_i - 1)V_{REF} - V_{in}. \quad (13)
\]

A technique to efficiently generate the optimal \( D_{th} \) is proposed based on Eq. (14). This technique uses \( E_i(1) - E_i(0) \), which is detected more easily than \( E_i(1) \) and \( E_i(0) \) individually. First, \( \Delta V_{DAC}(D_{in,\text{std}}) \) is detected using the standard test input code \( D_{in,\text{std}} \). Although the standard test input code can be arbitrarily selected, \( D_{in,\text{std}} \) is defined as the all-zero code in the present study in order to simplify the control circuit, i.e., \( D_i = 0 \) (\( i = 0, \ldots, n-1 \)). In this case, \( \Delta V_{DAC}(D_{in,\text{std}}) \) is expressed as follows:

\[
\Delta V_{DAC}(D_{in,\text{std}}) = E_{\text{off}} + \sum_{i=0}^{n-1} E_i(0). \quad (15)
\]

Next, the specific test input code \( D_{in,j} \) (\( j = 0, \ldots, n-1 \)), which is defined as the \( D_{in} \) with \( D_i = \delta_{i,j} \) (Kronecker’s delta: \( \delta_{i,j} = 1 \) for \( i = j \) and \( 0 \) for \( i \neq j \)), is considered. As an example, test input codes for \( n = 12 \) are shown in Table 2. The DAC error for \( D_{in,j} \), \( \Delta V_{DAC}(D_{in,j}) \), is expressed as

\[
\Delta V_{DAC}(D_{in,j}) = \Delta V_{DAC}(D_{in,\text{std}}) + E_{j}(1) - E_{j}(0). \quad (16)
\]

As described above, the error related to the \( j \)-th bit can be obtained and stored in registers as the difference from that for the standard test input code \( D_{in,\text{std}} \).

The dynamic threshold calculator shown in Fig. 6 generates the optimal \( D_{th} \) in the following manner. First, \( D_{th} \) is set to the value corresponding to \( \Delta V_{DAC}(D_{in,\text{std}}) \). Then, for each bit with \( D_j = 1 \), \( E_j(1) - E_j(0) \) is added to \( D_{th} \); for bits with \( D_j = 0 \), nothing is added. In this way, for all of the \( 2^{n-1} \) DAC input codes in the successive comparison, the DAC error correction can be executed with only one standard datum \( \Delta V_{DAC}(D_{in,\text{std}}) \) and the differences \( E_i(1) - E_i(0) \). These data are stored in the register as configuration data and are read to generate the optimal \( D_{th} \) dynamically at each conversion step in the SAR-ADC mode, as shown in Fig. 6.
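The procedure of Eqs. (14)–(16) can be sketched as follows. This Python snippet is an illustration under the additive error model of Eq. (14) only; the per-bit error table `E`, the constant `E_OFF`, and the function names are hypothetical, and thresholds are treated as plain integers rather than actual SF-ADC codes.

```python
# Hypothetical additive error model of Eq. (14): E[i] = (E_i(0), E_i(1))
E_OFF = 3
E = [(1, -2), (0, 4), (-1, 1), (2, 2), (0, -3), (5, 1)]
N_BITS = len(E)


def model_error(code):
    """DAC error for a full input code according to Eq. (14)."""
    return E_OFF + sum(E[i][(code >> i) & 1] for i in range(N_BITS))


def calibrate(measure, n_bits):
    """Foreground test: one all-zero code (Eq. (15)) and n one-hot codes (Eq. (16))."""
    base = measure(0)
    diffs = [measure(1 << j) - base for j in range(n_bits)]
    return base, diffs


def dynamic_threshold(code, base, diffs):
    """Start from the standard value and add E_j(1) - E_j(0) for every set bit."""
    d_th = base
    for j, diff in enumerate(diffs):
        if (code >> j) & 1:
            d_th += diff
    return d_th


base, diffs = calibrate(model_error, N_BITS)
print(base, diffs)
print(dynamic_threshold(0b101001, base, diffs), model_error(0b101001))
```

Only `n_bits + 1` values are stored in this sketch, yet the reconstructed threshold matches the modeled error for every input code, provided that the error is additive per bit as Eq. (14) assumes.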

This dynamic digital threshold configuration can ensure that the residual error in the last conversion step of the SAR-ADC mode is within the input range of the SF-ADC block. Although the precision of the voltage source for \( V_{in,\text{test}} \) influences \( D_{th} \), error correction by machine learning can recover the precision in the final ADC output, as shown later in Sect. 6. The required voltage source accuracy is related to the SF-ADC input range. Note that the DAC error correction is robust to variation of the standard deviation of the instantaneous comparator offset \( \sigma_{\text{off, eff}} \) given in Eq. (2).

### 5. Encoding and Error Correction by Machine Learning

In the SF-ADC mode, an output code \( D_L \) from the SF-ADC block through the averaging circuit expresses the probability corresponding to the residual error in the SAR-ADC mode. However, this code does not have the same scale as in the SAR-ADC mode because the SF-ADC uses comparator offset including IR noise as a reference, as described in Sect. 2. Hence, encoding \( D_L \) to binary code with the same scale as \( D_{U} \) obtained in the SAR-ADC mode is required in order to combine \( D_U \) and \( D_L \) codes for the total ADC code \( D_{\text{out}} \). This encoding is the re-quantization that quantizes the output \( D_L \) from the averaging circuit in the digital domain. As an output range of the re-quantization can be determined after production, a programmable encoder is needed. The programmable encoder uses encoding parameters, i.e., quantization levels, that can be determined and registered after production. This encoding for re-quantization can improve the SNDR characteristics of the SF-ADC [22].

In a previous study [10], encoding of the SF-ADC output was used to correct the error originating from the feedback DAC in a multi-bit \( \Delta \Sigma \) ADC. Similarly, the encoding in the SSA-ADC can also correct the error from the SAR-ADC mode, which corresponds primarily to the term related to \( (2D_i - 1)V_{\text{REF}} - V_{\text{in}} \) in Eq. (5). Here, \( N_L \) bits of \( D_L \) in Fig. 7 are prepared for this purpose. Note that the DAC error correction described in Sect. 4 primarily targets capacitance mismatches between the positive and negative sides, although the target classification is not strict.

Thus, the SSA-ADC uses encoding with error correction for the SF-ADC characteristics and the non-ideality of the capacitor DAC, which remain even when using the DAC error correction described in Sect. 4. The key issue is how to determine parameters required in the encoding. Once these parameters are obtained, the encoding itself can be easily performed. In order to determine these parameters, machine learning, which is a kind of supervised learning based on error correction in a SAR-ADC [15], is introduced in the present study. Good parameter sets can predict calibrated data with small generalization errors (errors for unlearned data) by learning a small number of obtained data.

In learning mode (Learning Enable = 1 in Fig. 6), an analog voltage \( V_{in,\text{ideal}} \) corresponding to an ideal ADC output code \( D_{\text{out,ideal}} \) is input to the ADC. The voltage \( V_{in,\text{ideal}} \) must have a precision that matches the target ADC resolution. As a result, the MSB-side output \( D_U \) and the LSB-side output \( D_L \) are obtained for the ideal code \( D_{\text{out,ideal}} \). By repeating a similar process, an index set \( T \) for labels indicating a sufficient number of training data can be prepared. The \( k \)-th training data \( D^{(k)} = (D_U^{(k)}, D_L^{(k)}) \), \( k \in T \), has the same bit configuration as in Eq. (3):

\[
D_U^{(k)} = (D_{U,N_U-1}^{(k)}, \ldots, D_{U,0}^{(k)}),
\]

\[
D_L^{(k)} = (D_{L,N_L-1}^{(k)}, \ldots, D_{L,0}^{(k)}). \quad (17)
\]
The tuning parameters \( \mathbf{w} \) in the present study were calculated through incremental learning. Appendix A shows the derivation of Eq. (18). In this approach, only a set of parameters \( \mathbf{e} \) and \( \mathbf{f} \) related to \( D_U \) from the SAR-ADC mode has a continuous degree of freedom. A set of parameters \( \mathbf{h} \) and \( g \) related to \( D_L \) from the SF-ADC mode can be determined by focusing on monotonic re-quantization characteristics for the residual error in the SAR-ADC mode. The computation for the minimization problem is based on matrix operations, as described in Appendix B. The advantage of this approach is its suitability for incremental learning. As shown in Fig. A.1, from the stored matrices (\( \mathbf{H}_{T,F-1} \) and \( \mathbf{S}_{T,F-1} \)) and additional training data sets, the renewed parameters \( \mathbf{w} \) are obtained as their mean \( \mu_T \) and variance-covariance matrix \( \Sigma_T \). The size of the stored matrices depends only on the bit configuration \((N_U, N_L, N_3)\) and not on the number of training data sets, which is also useful for practical application.

Although a multiple-impedance method based on expectation maximization was proposed in order to supplement the circuit calibration and to guide the verification process with the information obtained through the monitoring process, this method requires die-level process monitoring circuits [24]. The calibration techniques used in the present study do not require die-level process monitoring. In addition, compared to a simple digital-domain calibration technique [5], the proposed technique can realize more complicated error correction.

6. Behavior-Level Simulation Results

In order to verify the effectiveness of the proposed method, behavior-level simulations were carried out using MATLAB, where the full-scale voltage is \( V_{FS} = 0.5 \text{ V} \) and the final precision of the AD conversion is 18 bits (1 LSB is approximately 1.91 \( \mu \text{V} \)). Although the present study considers the multi-channel AFE shown in Fig. 1, we herein focus on one-channel operation with the highest conversion speed. The oversampling ratio of the SSA-ADC is \( OSR = 500 \). In the present study, \( n = 12, d = 5, \) and \( V_{REF} = V_{CM} = 0.25 \text{ V} \) in Fig. 8. Since the present study was conducted prior to transistor-level simulation, parasitic capacitances were not considered in the simulation. The capacitance elements in Fig. 8 are assumed to be implemented with parallel connection of unit capacitances \( C_u \). The mismatches of unit capacitances are considered to be \( \delta C_u / C_u = 0.003 \), where \( \delta C_u \) is the standard deviation of capacitances \( C_u \). This value is smaller than that used in a previous study on a 15-bit SAR-ADC [15] because of the larger unit capacitance in the present study (\( C_u = 1.2 \) pF). Although the dynamic digital threshold technique is used in this simulation, the non-linearity of capacitance was not considered because it depends on the details of the fabrication process [20]. The input offset standard deviation of the comparators in the SF-ADC is set to 3 mV, based on the assumption that offset calibration has been implemented. In the present study, the number of comparators is set to \( N = 511 \). The capacitor mismatch, comparator offset, and white noise are assumed to follow Gaussian distributions in the present study. The parameters used in the present study are listed in Table 3.

**Table 3** Simulation conditions and parameters for the 18-bit ADC.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Sampling Frequency</td>
<td>500 kHz</td>
</tr>
<tr>
<td>Signal Bandwidth</td>
<td>500 Hz</td>
</tr>
<tr>
<td>Input Signal Frequency</td>
<td>164.99 Hz</td>
</tr>
<tr>
<td>Input Differential Amplitude</td>
<td>0.5 V</td>
</tr>
<tr>
<td>Number of Comparators \( N \)</td>
<td>511</td>
</tr>
<tr>
<td>SF-ADC Oversampling Ratio \( N_S \)</td>
<td>8</td>
</tr>
<tr>
<td>Full Scale Range</td>
<td>0.5 V</td>
</tr>
<tr>
<td>Buffer Gain</td>
<td>2</td>
</tr>
<tr>
<td>Unit Capacitance \( C_u \)</td>
<td>1.2 pF</td>
</tr>
<tr>
<td>Unit Capacitance Mismatch \( \delta C_u / C_u \)</td>
<td>0.003</td>
</tr>
<tr>
<td>Comparator Input Offset \( \sigma_{\text{off}} \)</td>
<td>3 mV (standard deviation)</td>
</tr>
<tr>
<td>Comparator IR Noise</td>
<td>120 \( \mu \text{V}_{\text{rms}} \)</td>
</tr>
<tr>
<td>Buffer Output-referred Noise</td>
<td>100 \( \mu \text{V}_{\text{rms}} \)</td>
</tr>
<tr>
<td>Comparator Input Dead Zone</td>
<td>\( \pm 60\ \mu \text{V} \)</td>
</tr>
<tr>
<td>Number of Training Data Sets</td>
<td>14,000</td>
</tr>
</tbody>
</table>

The kT/C thermal noise generated at sampling was neglected because it is estimated to be 7.3 \( \mu \text{V}_{\text{rms}} \), which is much smaller than the buffer IR noise (50 \( \mu \text{V}_{\text{rms}} \)). The influence of this thermal noise in the learning mode on the tuning parameters \( w \) was relaxed by Bayesian linear regression [15], [23], as described in Appendix B. Moreover, the OSR oversampling and low-pass filtering described later herein can reduce the influence of thermal noise. Flicker noise in the comparators was not considered in the present study because of a lack of practical transistor-level design, as described above.

In order to determine the parameters of \( w \), initial learning with 2,000 training data sets was carried out, followed by incremental learning steps. The number of training data sets per incremental learning step was 800. Two-step selection of training data was adopted for 800 regions of \( (D_U, D_L) \), as described in detail in Appendix B. The first step selects 10 sets of training data per region at random, and the second step extracts one set of training data per region based on error statistics, resulting in 800 sets of training data for one incremental learning iteration. The default number of incremental learning steps was 15, which means learning a total of 14,000 sets of training data (only 5.3% of 18-bit output code patterns).
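The training-set bookkeeping and the two-step selection can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function `two_step_select`, the dictionary layout of a candidate, and the rule of keeping the candidate with the largest absolute error are assumptions (the text only states that the second step is based on error statistics).

```python
import random

INITIAL_SETS = 2_000       # training data sets used for initial learning
SETS_PER_STEP = 800        # one set per (D_U, D_L) region and step
INCREMENTAL_STEPS = 15     # default number of incremental learning steps
OUTPUT_BITS = 18

total_sets = INITIAL_SETS + SETS_PER_STEP * INCREMENTAL_STEPS
coverage = total_sets / 2 ** OUTPUT_BITS   # fraction of all 18-bit codes


def two_step_select(candidates_by_region, per_region=10, rng=None):
    """Hypothetical two-step selection of one training set per region.

    candidates_by_region maps a region id to a list of candidate dicts,
    each carrying an "error" entry (assumed data layout).
    """
    rng = rng or random.Random(0)
    selected = []
    for candidates in candidates_by_region.values():
        # Step 1: draw up to `per_region` candidates at random.
        subset = rng.sample(candidates, min(per_region, len(candidates)))
        # Step 2 (assumed rule): keep the candidate with the largest error.
        selected.append(max(subset, key=lambda c: abs(c["error"])))
    return selected


print(total_sets, f"{coverage:.1%}")
```

With 800 regions, one call to `two_step_select` returns the 800 sets used in a single incremental learning iteration.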

In order to clarify the generalization error for the proposed SSA-ADC, transient simulation for full-range ramp analog input was carried out, and the uniformity of the occurrence probability was investigated. Figure 9 shows the distribution of \( D_U \), which is the 12-bit output in the SAR-ADC mode, and that of the upper 12-bit codes of \( D_{\text{out}} \) corrected according to the machine learning results. Calibration by machine learning can significantly improve linearity. Figure 10 shows the distribution of \( D_L \), which is the 12-bit output in the SF-ADC mode, and that of the lower 6-bit codes of \( D_{\text{out}} \) obtained by machine learning. The original distribution of \( D_L \) is composed of two Gaussian distributions, the means of which correspond to typical residual errors at the last conversion of the SAR-ADC mode. By machine learning, the linearity of the lower 6-bit codes of \( D_{\text{out}} \) can be significantly improved. In the simulation of both training and evaluating the proposed ADC, the white noise described above is taken into account. Therefore, there are inevitable errors even in reproduction of the same learned analog input. Fortunately, these errors can be significantly reduced by low-pass filtering, as demonstrated later herein.

![Fig. 9](image-url) Distribution of \( D_U \) (output in the SAR-ADC mode) before machine learning (ML) and upper 12-bit of corrected code \( D_{\text{out}} \) after ML for a full-range ramp analog input. (The sums of occurrence probabilities for 64 continuous \( D_U \) codes are plotted for clarity.)

![Fig. 10](image-url) Distributions of (a) \( D_L \) (output in the SF-ADC mode) before machine learning (ML) with fitted Gaussian curves and (b) corrected lower 6-bit output code after ML for a full-range ramp analog input.

Figure 11 shows the conversion error from the ideal code for the last 800 training data, which were obtained in the learning mode. Using Eq. (26), the error is defined as

$$e_W = \frac{1}{|T|} \sum_{k \in T} E_W(D^{(k)}),$$  \hspace{1cm} (29)

where $|T|$ is the number of training data sets. Machine learning reduces the conversion error to approximately $\pm 30$ LSB, which corresponds to approximately the IR noise level of the buffer ($\pm 50 \mu V_{\text{rms}}$). In order to evaluate the effect of the dynamic digital threshold technique, the same simulation was carried out with and without this technique. As shown in Fig. 11, the dynamic digital threshold technique has little effect. This is attributed to the significant error compensation by the machine learning. However, the dynamic digital threshold technique is still useful to ensure that the residual error in the last conversion step of the SAR-ADC mode is within the input range of the SF-ADC block.

The output code density test [25] was carried out for a full-range single-tone signal input (frequency: 164.99 Hz). In order to obtain static characteristics with 99% confidence at 0.2-bit accuracy, the number of sampling points was set to $10^4$. Figures 12(a) and 12(b) show the differential nonlinearity (DNL) error in the output code calibrated by machine learning. Figure 12(c) shows the integral nonlinearity (INL) error of the calibrated output code, which is calculated using the end-point method. The large INL error originates from the reduced output range of the capacitor DAC with $C_{C,p(n)} = C_u$ shown in Fig. 8. When the code range is limited from $2^6$ to $2^{18}-2^6-1$, which corresponds to 0.4998 V for $V_{FS} = 0.5$ V, the INL can be improved, as shown in Fig. 12(d).
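The voltage figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that $V_{FS} = 0.5$ V is spread uniformly over all $2^{18}$ output codes; this reading is inferred from the limited-range figure of 0.4998 V and is not stated explicitly in the text.

```python
V_FS = 0.5     # full-scale range in volts (value from the text)
BITS = 18      # output resolution of the SSA-ADC

# Assumed: V_FS is divided uniformly over all 2**BITS codes.
lsb = V_FS / 2 ** BITS

# Conversion error of about +/-30 LSB expressed in volts.
error_30lsb = 30 * lsb

# Code range limited from 2**6 to 2**18 - 2**6 - 1 (inclusive).
lo_code = 2 ** 6
hi_code = 2 ** BITS - 2 ** 6 - 1
limited_range = (hi_code - lo_code + 1) * lsb

print(f"LSB           = {lsb * 1e6:.3f} uV")
print(f"30 LSB        = {error_30lsb * 1e6:.1f} uV")
print(f"limited range = {limited_range:.4f} V")
```

Under this assumption, 30 LSB is a little under 60 $\mu$V, the same order as the 50 $\mu V_{\text{rms}}$ buffer IR noise, and the limited code range spans about 0.4998 V.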

![Fig. 11](image1.png) ![Fig. 12](image2.png)

**Fig. 11** Conversion error (a) without and (b) with the dynamic digital threshold technique for the last 800 sets of training data after incremental machine learning (total number of training data sets: 14,000).

**Fig. 12** (a) DNL error, (b) distribution of DNL error, (c) INL error, and (d) corrected INL error of the output code calibrated after incremental machine learning (total number of training data sets: 14,000).

In order to investigate the signal-to-noise and distortion ratio (SNDR) in the proposed SSA-ADC, transient simulation for full-range single-tone signal input (frequency: 164.99 Hz) was carried out with a sampling frequency of 500 kHz, which is oversampled ADC operation, as described in Sect. 1. Figure 13 shows FFT spectra of the ADC outputs corrected by machine learning with and without low-pass filtering (bandwidth: 500 Hz). As a part of the digital signal processing block shown in Fig. 1, the sixth-order Butterworth low-pass filter (LPF) was used for the filtering of the ADC outputs. This is because noise and error reduction by signal bandwidth elimination have a significant effect. The simulation yields an effective number of bits (ENOB) of approximately 17.0 bits for a bandwidth of 500 Hz (OSR = 500). If \( N_S \) is doubled (\( N_S = 16 \)) in order to enhance the ENOB, a similar simulation achieved an ENOB of 17.5 bits. Similarly, ENOB values of 18.0 and 18.4 bits are predicted for \( N_S = 32 \) and 64, respectively. Based on the stochastic signal detection [14], [17], doubling \( N_S \) can enhance ENOB by approximately 0.5 bits. Usually, the digital processing block shown in Fig. 1 decimates after filtering according to the signal bandwidth, i.e., down-sampling is performed for efficient data transfer. The signal down-sampled by 128 was also verified, and no significant degradation was observed.
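The oversampling ratio and the 0.5-bit-per-doubling rule quoted above can be reproduced with a short calculation. This is a sketch under stated assumptions: the reference point (17.0 bits at $N_S = 8$) is inferred from the statement that $N_S = 16$ is the doubled value, and the helper names are hypothetical. The SNDR-to-ENOB conversion is the standard sine-wave relation.

```python
import math


def oversampling_ratio(f_sample_hz, bandwidth_hz):
    """OSR relative to the Nyquist rate of the signal band."""
    return f_sample_hz / (2.0 * bandwidth_hz)


def enob_from_sndr(sndr_db):
    """Standard conversion for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02


def predicted_enob(n_s, enob_ref=17.0, n_s_ref=8):
    """Averaging n_s independent noisy results lowers the noise power by
    n_s, i.e. about 3 dB (0.5 bit) per doubling. The reference values
    are assumptions inferred from the text."""
    return enob_ref + 0.5 * math.log2(n_s / n_s_ref)


print(oversampling_ratio(500e3, 500.0))
for n_s in (8, 16, 32, 64):
    print(n_s, round(predicted_enob(n_s), 2))
```

The simple rule gives 17.5 and 18.0 bits for $N_S = 16$ and 32, matching the quoted values, and 18.5 bits for $N_S = 64$, slightly above the 18.4 bits reported, which suggests that the gain begins to saturate.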

Figure 14 shows the learning curve for the ENOB and the error obtained in the above simulations. The figure reveals that more than 15 incremental learning steps provide little further improvement. In other words, a relatively small amount of training data is sufficient for generalization.

7. Conclusions

In the present paper, we proposed an SSA-ADC for a novel low-power high-resolution AFE. In SAR-ADC mode for MSB-side bits, an SF-ADC block is used for the dynamic digital threshold technique. The SF-ADC block calibrates some error originating from the internal capacitor DAC to ensure that the residual error in the last conversion step of this mode is within the input range of the SF-ADC block. Configuration data for the dynamic digital threshold technique were prepared in advance using a minimal amount of test input data and were used to generate the optimal threshold dynamically for each conversion in the SAR-ADC mode. These operations minimize the number of operations and the register size needed to store configuration data. For the residual error after SAR-ADC operation, which can be smaller than thermal noise, the SF-ADC uses the statistical characteristics of noise to obtain LSB-side bits. The SF-ADC output for the residual signal is combined with the SAR-ADC output to obtain high-precision ADC output data using a machine learning technique based on Bayesian linear regression. This technique compensates for capacitance mismatch and SF-ADC non-linearity. Behavior-level simulation results for an 18-bit SSA-ADC revealed its effectiveness.

Since the present feasibility study only provides behavior-level predictions, extensive future research on practical circuit implementation is necessary. However, we expect that the present study will lead to new solutions to serious problems of ADCs in advanced device technologies.

Acknowledgments

The present study was supported by the Adaptable and Seamless Technology Transfer Program through Target-driven R&D (A-STEP) of the Japan Science and Technology Agency (JST). The authors would also like to thank Proassic, Ltd. for providing technical support.

References


Appendix A: Derivation of Eq. (18)

Based on physical phenomena in the SAR-ADC mode, which are expressed by Eq. (5), error correction for the output in the SAR-ADC mode is considered. At the last conversion step of the SAR-ADC mode, all digital inputs $D_i$ are determined. Since $V_{DAC} \approx 0$ at this time, $V_{in}$ $(= V_{in,p} - V_{in,n})$ can be given by

$$V_{in} \approx \frac{2^{d+1}}{2^n - 1 + \sum_{i=0}^{n-1} e_i + (2^{d+1} - 1)\beta} \times \left[ \sum_{i=0}^{n-d} \left( 1 + e_i + \beta u \left( d - i + \frac{1}{2} \right) \right) D_i V_{REF} - \left( e_i + \beta u \left( d - i + \frac{1}{2} \right) \right) \frac{V_{REF}}{2} + \frac{1}{2} \left( \Delta e_i + \frac{\Delta \alpha}{\alpha} + \Delta \beta u \left( d - i + \frac{1}{2} \right) \right) \right] \times (V_{CM} + \Delta V_{in,REF}) - \frac{2^n - 1}{2^{d+1}} V_{REF} + \frac{C_{p,eff,0}}{C_u} \left( F_{p,eff} + \frac{\Delta \alpha}{\alpha} \right) V_{CM}.$$  \hspace{1cm} (A-1)

In the ideal case without capacitance errors or parasitic capacitances, $\alpha = 2^{d+1}/(2^n - 1)$, $\beta = 0$, and input voltage $V_{in}$ can be given by

$$V_{in} \approx \left( \frac{2}{2^n - 1} \sum_{i=0}^{n-1} 2^i D_i - 1 \right) V_{REF}.$$  \hspace{1cm} (A-2)

The factor of $2/(2^n - 1)$ is different from the ideal factor $1/2^{n-1}$, which originates from split capacitance $C_{C,p(n)}$ with unit capacitance $C_u$ used from the viewpoint of fabrication error [19]. In the present study, this gain error can also be taken into account in error correction.

In order to calibrate the ADC, only ideal output code for experimentally obtained output code $D_{in}$ can be used as training code $D_{j,i}$. As such, Eq. (A-1) is modified to obtain a simple model, as follows:

$$V_{in} \approx \left( \frac{1}{2^{n-1}} \sum_{i=0}^{n-1} 2^i D_{j,i} - 1 \right) V_{REF} \approx \left( \frac{1}{2^{n-1}} \sum_{i=0}^{n-1} 2^i D_{in,i} \left( 1 + e_i + \sum_{j=i+1}^{n-1} f_{j,i} D_{in,j} \right) - 1 \right) V_{REF},$$  \hspace{1cm} (A-3)

where $e_i$ is related to the coefficient of $D_i$ in Eq. (A-1), and $f_{j,i}$ introduces other errors into the coefficient of $D_{in,j}$. In successive comparisons from the MSB code, errors generated in the upper bit selection influence errors in the lower bit. Thus, only a set of $f_{j,i}$ ($i < j$) is used. In order to reduce the number of learned parameters, higher-order terms in $D_{in,j}$ were neglected. This simple model is used in the correction function of Eq. (18).
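The simplified model can be illustrated with a short function that evaluates the right-hand side of Eq. (A-3) for a measured code. This is only a sketch: the function name `corrected_vin`, the list-based layout of `e` and `f`, and the example values are assumptions, and the sum is normalised by the ideal factor $1/2^{n-1}$ quoted above.

```python
def corrected_vin(bits, e, f, v_ref=1.0):
    """Evaluate the simplified correction model for one measured code.

    bits[i] : measured bit i (index 0 is the LSB), each 0 or 1
    e[i]    : weight error of bit i
    f[j][i] : cross term, used only for j > i (an upper bit j
              influencing the weight of a lower bit i)
    """
    n = len(bits)
    acc = 0.0
    for i in range(n):
        # Influence of the already decided upper bits on bit i.
        cross = sum(f[j][i] * bits[j] for j in range(i + 1, n))
        acc += (2 ** i) * bits[i] * (1.0 + e[i] + cross)
    # Assumed normalisation: ideal factor 1 / 2**(n - 1).
    return (acc / 2 ** (n - 1) - 1.0) * v_ref


n = 4
no_error = [0.0] * n
no_cross = [[0.0] * n for _ in range(n)]
print(corrected_vin([0, 0, 0, 0], no_error, no_cross))
print(corrected_vin([1, 1, 1, 1], no_error, no_cross))
```

With all error terms set to zero, the function reduces to the ideal transfer curve, running from $-V_{REF}$ for the all-zero code to just below $+V_{REF}$ for the all-one code. Because `f` is read only for $j > i$, the number of learned parameters is kept small, as described above.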

The second term of the correction function given by Eq. (18) originated from the non-linearity characteristics of the SF-ADC. In other words, each bit of the thermometer code of the SF-ADC output has a different weight. Moreover, $h$ corresponds to the array of these weights. The average of the $N_{L3}$ bits of the fractional part is taken into account as the third term in the correction function given by Eq. (18).

Appendix B: Minimization Problem Using Bayesian Linear Regression

The minimization problem expressed in Eq. (28) can be solved easily. By extending the previous study [15], the optimal values of the tuning parameters ($w^*$) for the correction function expressed by Eq. (18) can be obtained analytically as follows:

$$w^* = (\Phi_T^{\mathrm{T}} \Phi_T + I_c)^{-1} \Phi_T^{\mathrm{T}} d_T,$$  \hspace{1cm} (A-4)

where
$$\Phi(D_U, D_L) = \Phi(D_{U(0)}, \ldots, D_{U(N_U-1)}, D_{U(N_U)}, \ldots, D_{U(N_L-1)}, D_{U(N_L)}, \ldots, D_{U(N_U-1)}),$$

where $\Phi$ is the training data set in the ensemble

$$I_c = \text{diag} \{c_1, \ldots, c_1, c_3, \ldots, c_3, c_2, \ldots, c_2, 0\}$$

$D_L = D_{U(D)}, \ldots, D_{U(D)}$ in Eq. (20) is assumed to follow a Gaussian distribution.

The posterior probability distribution of $\beta_{out}$ determines the variance of the corrected data. The posterior probability of $w$ in the case of obtaining the training data ensemble $D_T$, $p(w|D_T)$, is most likely given by

$$p(w|D_T) = N(w|\mu_T, S_T),$$

where $\Phi_T$ has $\phi(D_U^{(k)}, D_L^{(k)})^T$ ($k \in T$) as its rows, as shown in Eq. (A-6), and the column vector $d_T$ has $D_{out(D)}^{(k)} - D_{U(D)}^{(k)}$ ($k \in T$) as its elements, as shown in Eq. (A-8). In this case, the expected probability distribution of $D_{out}$ for a given set of $D_U$ and $D_L$ is given by

$$p(D_{out|D}) = N(D_{out|D}|\mu_{out|D}, S_{out|D}),$$

where $\mu$ and $\sigma^2$ are given by Eqs. (A-12) and (A-11), respectively, for $T = U_{0 \leq k < T_k}$. As shown in Fig. A-1, the calculations for them use the stored matrices ($\mu_{T,-1}$ and $S_{T,-1}$) and additional training data sets [15]. The size of the stored matrices depends only on the bit configuration ($N_U, N_L, N_{I3}$).

As a training data selection method, the two-step selection shown in Fig. A-2 is uniformly applied to training data with a large error. Note that $D_U$ and $D_L$ cannot be assigned directly due to noise and errors. By using the approximation $D_{out} = h_{U|D}(D_U, D_L)$, adequate incremental training data sets can be obtained [16]. For the sake of simplicity, other selection techniques proposed in [16] are not used in the present study.
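The regularised least-squares solution of the form $w^* = (\Phi_T^{\mathrm{T}} \Phi_T + I_c)^{-1} \Phi_T^{\mathrm{T}} d_T$ and its incremental use can be sketched in plain Python. This is a simplified illustration, not the authors' procedure: the class `IncrementalRegression` and the helper `solve` are hypothetical, and the sketch stores the accumulated sums $\Phi^{\mathrm{T}}\Phi + I_c$ and $\Phi^{\mathrm{T}} d$ rather than the posterior mean and covariance themselves, which is an assumption made for brevity.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


class IncrementalRegression:
    """Accumulates A = Phi^T Phi + I_c and b = Phi^T d batch by batch."""

    def __init__(self, reg_diag):
        n = len(reg_diag)
        # Start from the diagonal regularisation matrix I_c.
        self.a = [[reg_diag[i] if i == j else 0.0 for j in range(n)]
                  for i in range(n)]
        self.b = [0.0] * n

    def add_batch(self, phi_rows, targets):
        # Each training set contributes phi * phi^T to A and phi * d to b.
        for phi, d in zip(phi_rows, targets):
            for i, p_i in enumerate(phi):
                self.b[i] += p_i * d
                for j, p_j in enumerate(phi):
                    self.a[i][j] += p_i * p_j

    def weights(self):
        return solve(self.a, self.b)


# Toy data (made up for illustration): d = 2*x + 1 with features [x, 1].
rows = [[float(x), 1.0] for x in range(4)]
targets = [2.0 * x + 1.0 for x in range(4)]

model = IncrementalRegression(reg_diag=[1e-9, 0.0])
model.add_batch(rows[:2], targets[:2])   # initial learning
model.add_batch(rows[2:], targets[2:])   # one incremental step
print(model.weights())
```

Because the stored sums have a fixed size set by the number of tuning parameters, adding a new batch never requires revisiting earlier training data, which mirrors the observation that the stored matrices depend only on the bit configuration.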
Sadahiro Tani received B.S. and M.S. degrees in electronic engineering from Osaka University, Osaka, Japan, in 1979 and 1981, respectively. In 2004 he received a Ph.D. degree in information systems engineering from Osaka University. During 1981–2012, he worked for Sharp Corporation, Nara, Japan. During 2014–2016, he was a project researcher at Osaka University. His current research includes CMOS mixed signal circuits. Dr. Tani is a member of the IEICE and the IEEE.

Toshimasa Matsuoka received B.S., M.S., and Ph.D. degrees in electronic engineering from Osaka University, Osaka, Japan, in 1989, 1991, and 1996, respectively. During 1991–1998, he worked at the Central Research Laboratories, Sharp Corporation, Nara, Japan, where he was engaged in research and development related to deep-submicron CMOS devices and ultra-thin gate oxides. Since 1999, he has been working at Osaka University, where he is an Associate Professor. His current research includes CMOS analog/RF circuits and device modeling. Dr. Matsuoka is a member of the Japan Society of Applied Physics, the IEICE, and the IEEJ.

Yusaku Hirai received B.S. and M.S. degrees in electronic engineering from Osaka University, Osaka, Japan, in 2013 and 2015, respectively. He is now working for THine Electronics Inc., where he is developing high-speed serial interfaces.

Toshifumi Kurata received an M.S. degree in electric engineering from Osaka University, Osaka, Japan, in 2015. He is now working for Yamaha Corporation.

Keiji Tatsumi received a B.S. degree in engineering from Kyoto University, Kyoto, Japan in 1993, and an M.S. degree in information science from Nara Institute of Science and Technology, Nara, Japan in 1995. In 2006, he received a Ph.D. degree in informatics from Kyoto University. Since 1998, he has been working at Osaka University, where he is an Associate Professor. His current research includes meta-heuristics for global optimization and machine learning. Dr. Tatsumi is a member of the Operations Research Society of Japan, the ISCIIE, and the SICE.

Tomohiro Asano received B.S. and M.S. degrees in electronic engineering from Osaka University, Osaka, Japan, in 2014 and 2016, respectively.

Masayuki Ueda received a B.S. degree in physics from Konan University, Hyogo, Japan in 1987. During 1987–2014, he worked for Renesas System Design Co., Ltd. (formerly NEC IC Microcomputer Systems, Ltd.), Japan, where he was engaged mainly in the design of analog circuits. Since 2014, he has been working at SPChange, LLC, Japan.

Takatsugu Kamata received a B.S. degree in electrical engineering from Tokyo City University in 1987. During 1987–2001, he worked for Motorola Inc., where he was engaged in the design of RF/IF ICs for cellular phones. During 2001–2010, he worked for RISStream Corp., where he carried out research and development related to TV tuner ICs. In 2010, he received a Ph.D. degree from Osaka University, Japan. Currently he is working for SPChange, LLC. Since 2014, he has been also a Special Appointed Associate Professor in Osaka University. His current research includes an implantable medical device, wireless transceiver and contact-less power circuit/system. Dr. Kamata is a member of the IEICE and IEEE.