CIRCUIT ANALYSIS AND OPTIMISATION
Lecture Notes by Tadej Tuma, December 11, 2024
University of Ljubljana, Slovenia, Faculty of Electrical Engineering

The cataloguing record of the publication (CIP) was prepared by the National and University Library in Ljubljana.
COBISS.SI-ID 217853443
ISBN 978-961-243-474-8 (PDF)
URL: https://fides.fe.uni-lj.si/~tuma/Circuit_Analysis_and_Optimisation.pdf

Copyright © 2024 Založba FE. All rights reserved. Reproduction of this work (including photocopying), in whole or in part, without the prior permission of Založba FE is prohibited.

Title: Circuit Analysis and Optimisation
Author: Tadej Tuma
Publisher: Založba FE, Ljubljana
Issued by: Fakulteta za elektrotehniko, Ljubljana
Editor: prof. dr. Sašo Tomažič
Place and year of publication: Ljubljana, 2024
1st electronic edition

Contents

1 Introduction
  1.1 What is Electronic Design Automation (EDA)?
  1.2 The SPICE pedigree from UC Berkeley
  1.3 The Prerequisite Quiz
  1.4 A Short Refresher
2 Linear Resistive Networks
  2.1 Circuit Tableau
  2.2 Basic Nodal Equations
  2.3 Modified Nodal Equations
3 Solving Linear Equations Sets
  3.1 Sparse Matrices
  3.2 LU Decomposition
  3.3 Successful Pivoting Techniques
  3.4 Numerical Error Control
4 Introducing Nonlinear Devices
  4.1 Fixed Point Iteration
  4.2 Newton–Raphson Algorithm
  4.3 Convergence Detection
  4.4 Automatic Convergence Helpers
5 Frequency-Domain Analysis
  5.1 Small Signal (AC) Analysis
  5.2 Small Signal (NOISE) Analysis
6 Time-Domain (TRAN) Analysis
  6.1 Backward Euler Integration
  6.2 Trapezoidal Integration

1 Introduction

This course attempts to explain the mechanisms of SPICE OPUS (http://www.spiceopus.si/) in formal terms. You can certainly use a circuit simulator without understanding its mathematical background, just as you can drive a car without knowing what is going on under the hood. This may sound like a good excuse to skip this course, in which case many phenomena will simply have to go unexplained. For instance, why does SPICE only compute nodal voltages and currents through independent voltage sources? Why does a simple operating point analysis sometimes take longer than a complex transient analysis? What exactly is the meaning of all those cryptic warnings and error messages? Where do convergence problems come from and how can one avoid them? How accurate and reliable are the obtained simulation results?

1.1 What is Electronic Design Automation (EDA)?

Electronic design automation, also referred to as electronic computer-aided design (ECAD), is a category of software tools for designing electronic systems such as integrated circuits and printed circuit boards.
The tools work together in a design flow that enables engineers to design and analyze entire semiconductor chips. The central EDA component is always a circuit simulator like SPICE OPUS. As a matter of fact, in the 1960s and 70s the IC industry was a rising star on the high-tech horizon. It quickly became obvious that designing ICs by trial and error was just too tedious and expensive. Predicting circuit behavior with mathematical models of circuit components and digital computers seemed to be a far better approach. 1981 marks the beginning of EDA as an industry.

Today, semiconductor chips can have millions of components operating on a nanometer scale. One nanometer is 10⁻⁹ meters, or a row of 10 hydrogen atoms. As a circuit designer, how are you going to test your circuit of millions of tiny transistors? There is no way of connecting a probe to some internal node to verify a voltage. All the testing must be done at the EDA software level! Obviously the numerical results must be extremely reliable, which is difficult enough. Moreover, manufacturing processes can't ensure constant device parameters like capacitances, resistances, voltage gains, etc. On the contrary, all parameters deviate widely from their nominal values. EDA software must be capable of predicting the circuit behavior down to the last detail, including manufacturing parameter deviations.

The modern chip fabrication process basically has three actors. Everything starts with the customer (investor) coming up with specifications for a new chip. Then a chip design team is contracted, which has qualified and highly experienced manpower as well as all the necessary EDA tools. The design house must also have a close working relationship with the foundry that will eventually produce the chip, because the EDA software critically relies on accurate models of the fabrication process. The fabrication flow can be seen in figure 1.

Figure 1: Chip fabrication flow (customer specifications and the foundry's device model library as inputs, chip design with EDA software, tape-out, chip fabrication at the foundry, testing, and the feedback loop when the specs are not met).

Once the designers have all the input data, they use EDA tools to come up with the chip's blueprint. In this step, the design team totally relies on software simulation. The final output is traditionally called tape-out, because it used to be delivered on magnetic tapes. At this point things become really expensive, since the tape-out data is given to the foundry, which starts the physical implementation. The final chip is then cycled back to the customer, who compares the performance against the initial specification. If everything adds up, mass production starts, otherwise the blaming game begins.

This process is widely known as 'fabless manufacturing', because the physical production is outsourced to a highly specialized manufacturing plant or foundry. There are many chip design companies out there, but only a handful of fabs that can produce state-of-the-art chips, the most famous being TSMC (Taiwan Semiconductor Manufacturing Company). Things become really weird when it comes to the production technology of these high-tech fabs: worldwide, there is only one company supplying cutting-edge photolithography technology, namely the Dutch ASML holding!

Now let us take a closer look at the process of chip design. It basically consists of an electrical and a physical design cycle, as in figure 2.
Figure 2: Chip design flow (inputs: chip specifications and device model library; electrical design loop: set topology and parameters, set up the simulation scenario, run simulations with SPICE OPUS or a similar simulator, compare against specs; physical design loop: place and route, extract parasitics; output: tape-out).

Obviously everything starts with the specifications of what the chip should actually do and the properties of the building blocks that the fab is offering. The designer's initial step is to come up with a suitable circuit topology and to set the respective device parameters. Next the circuit needs to be put into many different testbench configurations and a number of analyses have to be specified. Usually this step is called the simulation scenario, and it includes the circuit as well as all testing specifications. Now the actual simulation is run on an analog circuit simulator (like SPICE OPUS). The results are compared against the given specifications, and they most certainly won't match the first time around. Based on the type of mismatch, the designer must now cycle back and adjust the circuit parameters and maybe even the topology. This loop is executed until the specs are roughly met.

Once the dimensions of the devices (transistors) are roughly known, their physical placement on the wafer must be planned and they must be connected according to the given circuit topology. This is done with highly specialized place and route tools. The result is the final manufacturing information (tape-out). Unfortunately, the placement and the metallic connections introduce parasitic resistances, inductances and, most notably, capacitances to the circuit under test. These have to be added to the circuit and the electrical design loop has to be run again. Obviously we have a bit of a chicken-and-egg situation. So both loops in figure 2 must be run alternately many times until the design finally meets the specs. Remember, so far all this is on a simulation level!

Only now is the tape-out sent to the foundry for manufacturing, and this is when the expenses really start to mount. After production the final product arrives at the customer (figure 1) for measurements. This is the moment of truth. Can mass production begin or is another design cycle necessary? If the chip performance is insufficient, something has gone wrong. Either the design team has inadequately set up the simulation scenario, or the device models from the foundry are not correct, or the EDA simulation tools are off. In any case another manufacturing cycle is necessary, so the customer loses money and valuable time-to-market. Who is to blame?

1.2 The SPICE pedigree from UC Berkeley

The EDA tools are relatively easy to eliminate from the blaming game. There is a huge variety of circuit simulators out there, ranging from free academic tools to very expensive commercial ones. They all go back to a common ancestor, the SPICE project from UC Berkeley, which started in 1971 and ended in 1993 with the publicly released source code SPICE 3F5. All contemporary circuit simulators – including SPICE OPUS – are based on this internal numerical apparatus. If used properly, most circuit simulators will give you surprisingly similar results. However, close enough is not good enough if expensive redesign cycles as in figure 1 are at stake. Actually everything revolves around the extremely complex device model library at the top of figure 2. You see, the results are only as good as the process models supplied by the fab.
So this is the holy grail. The fab must come up with a cutting edge manufacturing process and it must supply an accurate device library for a given EDA simulator. Up to a point the fab must guarantee that their devices will behave as predicted by the simulations. And there’s the rub! The fab can’t possibly check their model library on every analog circuit simulator out there. They usually just check two simulators, Spectre (Cadence Design Systems) and HSPICE (Synopsis), thereby making these two the golden standard! Should the design cycle in figure 1 be based on any simulator but Spectre or HSPICE the fab will promptly wash it’s hands off any prototypes not matching the specs. This puts a lot of technical and legal responsibility on Cadence and Synopsis resulting in very expensive simulators. A single annual license for Spectre is typically in the same price range as the annual salary of the senior designer using it. Many design companies keep their costs down by using free simulators like SPICE OPUS in early stages of the design, but no design company dares to tape-out anything that has not be thoroughly checked on a standard simulator. Even if not the golden standard, SPICE OPUS is valuable in early stages of contemporary chip design as well as in eduction and research. 7 1.3 The Prerequisite Quiz The lectures in this course are based on the assumption that you have a solid mathematical background and sufficient analog circuit knowledge. You should be able to answer the following questions. 1. What are the rules for matrix multiplications (additions)? 2. Can you explain the matrix determinant? 3. What are eigenvalues and eigenvectors? 4. How do you find the inverse of a matrix? 5. What is a matrix permutation? 6. How are systems of equation represented in matrix form? 7. Can you explain Gaussian elimination steps? 8. What is a singular matrix? 9. What does Kirchhoff’s current (voltage) law state? 10. How is a circuit graph defined? 11. How are nodal equation formulated? 12. What are Bode’s theorems? 13. How do you test circuits for stability? 14. What is a positive (negative) feedback? 15. What are the types of MOSFETs? 16. What are the basic MOSFET orientations? 17. What are the sources of noise? 8 1.4 A Short Refresher This is a rapid rundown of basic matrix math. You might take a quick look, just to make sure you’re up to speed. However, if your math needs brushing up ask Google for more details. So here it goes. Matrix An×m is a two dimensional array of arbitrary elements ai,j with n rows and m columns   a a . . . a 1,1 1,2 1,m  2,1 2,2 2,m  a a . . . a A  n × m = .  .. .. . . ..   .  . . .  an,1 an,2 . . . an,m n×m Matrices are usually marked in bold font, while element positions are given as subscripts. For clarity reasons the comma in the subscript may be omitted. Where applicable, the subscripts as well as the matrix dimensions are dropped altogether. Square matrices are special since n = m . For instance, the identity or unity matrix   1 0 0 0 I4×4 =   . 0   0 1 0 0   0 1 0 0 0 0 1 4×4 Then, there is the diagonal matrix with its strange sister, the anti-diagonal matrix     d 0 0 0 0 0 0 d 11 14 D =   , Da =   . 0  22   23  0 d 0 0 0 0 d 0  33   32  0 d 0 0 d 0 0 0 0 0 d44 d41 0 0 0 Diagonal elements ai,i in general are also called pivots. Therefore, a diagonal matrix consists of nonzero pivots only. 
Next come the lower triangular and upper triangular matrices     l 0 0 0 u u u u 11 11 12 13 14  21 22   22 23 24 l l 0 0 0 u u u L =   , U =   .  31 32 33   33 34 l l l 0 0 0 u u l 41 l42 l43 l44 0 0 0 u44 As soon as matrix An ×m has only one row n = 1, it is called a row vector. Consequently, a matrix with m = 1 is called a column vector. Obviously, a matrix with only one row and one column is just another scalar A1×1 = [a11] = a. There are even some exotic matrices with at least one zero dimension n = 0 or m = 0. They are called empty matrices, not to be confused with zero matrices An×m = 0, having zero elements ai,j = 0 rather than zero dimensions. ˜˜˜ 9 There is a lot of fun stuff you can do with matrices. For example, you can transpose a matrix by exchanging its rows and columns     T a a . . . a a a . . . a 1,1 1,2 1,m 1,1 2,1 n,1 AT  n =  × m =  .  .. .. . . ..    .. .. . . ..  .  2,1 2,2 2,m   1,2 2,2 n,2  a a . . . a a a . . . a     . . . . . . . an,1 an,2 . . . an,m a1,m a2,m . . . an,m n × m m×n As expected, by transposing the transposition, you get your original matrix back ( T T T A ) = A . A symmetric matrix satisfies A = A . How about adding two matrices? As long as they have the same dimensions, you just add individual elements An×m+Bn×m = Cn×m where ci,j = ai,j +bi,j. As usual, matrix additions are commutative, so A+B = B+A. By the way, the transposition of matrix additions is distributive, hence T T T A + B = ( A + B ) . Matrix multiplication, on the other hand, is a bit more tricky. By defini-tion, the number of the multiplier’s columns must match the number of the multiplicand’s rows. The product will inherit its row count from the multiplier and its column count from the multiplicand. Therefore, An ×l · Bl×m = Cn×m, where each element of C is the scalar product of a row vector from A and a corresponding column vector from B, namely l X ci,j = ai,kbk,j. k=1 Matrix multiplication is obviously not a commutative operation AB 6= BA. In order to explicitly distinguish the multiplier from the multiplicand, you often hear that B is either pre-multiplied or post-multiplied by A. A properly sized identity matrix is a neutral multiplier IA = A as well as multiplicand AI = A. Matrix multiplication satisfies the rules of associativity (AB)C = A(BC) and distributivity (A + B)C = AC + BC. A square matrix A is called invertible or non-singular if there exists a matrix B such that AB = BA = I, in which case B −1 T −1 = A . A matrix satisfying A = A is called orthogonal. One more fun fact. Multiplying a matrix by a scalar x · A means multiplying each element x · ai,j. On the other hand, adding a scalar to a matrix x + A is mathematically not defined. ˜˜˜ Apart from additions and multiplications, there are three elementary matrix row and column operations. If you want to manipulate rows in A, you pre-multiply it by a respective operator F, mapping F · A = f r(A). You can do the same to columns by post-multiplication A · F = fc (A). Multiplication, that is multiplying all elements of a row or column by a con- stant. In order to multiply row number 2 in matrix A4×2 by x, you pre-multiply it by a diagonal matrix D4×4 with d2,2 = x and all other pivots equal to one       1 0 0 0 a a a a 11 12 11 12    21 22  21 22 0 x 0 0 a a xa xa       = .    31 32  31 32  0 0 1 0 a a a a 0 0 0 1 a41 a42 a41 a42 10 If you want to erase a row, you just set x = 0. This also works for columns. Suppose you want to multiply column 1 by y. 
Use a post-multiplication by a correctly sized diagonal matrix     a a ya a 11 12 11 12  21 22  21 22 a y 0 ya a a     = .  31 32  31 32 a a 0 1 ya a a41 a42 ya41 a42 Addition, that is adding one row or column to another. Let us say you want to add row number 4 to row number 2 in the same matrix as above. All you need is to pre-multiply it by an operator matrix F, which is equal to a unit matrix with an additional 1 in column 4 of row 2, f 2,4 = 1       1 0 0 0 a a a a 11 12 11 12    21 22  21 41 22 42 0 1 0 1 a a a + a a + a       = .    31 32  31 32  0 0 1 0 a a a a 0 0 0 1 a41 a42 a41 a42 The same goes for column additions, where you use F in post-multiplication. Permutation, that is interchanging two rows or columns of a matrix. To exchange rows, the operator F must be a unit matrix with respectively interchanged rows. So in order to switch rows 3 and 4 in A4 you pre-multiply it by       1 0 0 0 a a a a 11 12 11 12    21 22  21 22 0 1 0 0 a a a a       = .    31 32  41 42 0 0 0 1 a a a a 0 0 1 0 a41 a42 a31 a32 Again, a post-multiplication will perform a respective column permuta-tion. ˜˜˜ If you think the above is just a boring theory, think again! By cleverly combining multiple column and row operators, a whole new world opens up. Remember the Gaussian elimination algorithm? It goes somewhat like this: ‘... to eliminate one element below the pivot, multiply the pivot row by the element to be eliminated and divide it by the pivot and then subtract the pivot row from the one below, and then do this for all rows below the pivot row ...’. Just like a lawyer talking. With matrix operators all this becomes very elegant. Just one single pre-multiplication is needed to eliminate all elements below a11       1 0 0 0 a a a a 11 12 11 12  21   11 a −1 −1 − a 1 0 0 a21 a22   22 21 11 0 a − a a a12 =   31 11   31 a −  −1     −1  . a 0 1 0 a a32  0 a32 − a31a a 11 12 − −1 −1 a 41 41 a 0 0 1 a a 42 0 a − a aa 11 42 41 11 12 There are countless examples of beautiful matrix transformations based op-erator pre- and post-multiplications, leading to efficient computer algorithms. However, this section is just a refresher, so we really have to stop here. 11 2 Linear Resistive Networks We do have to start with some 101-course basics, but we will move very fast from there. Let us observe an arbitrary network consisting of branches numbered 1, 2, . . . , b with two terminals each, interconnected through n+1 nodes numbered 0, 1, . . . , n. We know that voltages cannot be considered absolute, so we need to name ex-actly one reference node, also referred to as the ground node. Let the ground node be the one numbered 0. So we have n nodal voltages across each re-spective node to the ground node v n1 , vn2 , . . . , vnn. Each of the b branches is conducting a branch current ib , ib , . . . , i , which is flowing from the respective 1 2 b b positive branch terminal t + to the negative one t−. We also formulate voltages across each branch vb , v , that is, from the respective positive branch 1 b , . . . , v 2 b b terminal t+ to the negative t−. We have barely started and we already have a lot of indices in our expres-sions, so let us simplify the notation by introducing vectors. We will refer to column vectors vn, vb, and ib rather than their individual components,       v n1 v b1 i b1  n2 v v  b2  ib2 v  n =  , v  , i =  .. b =   ..    b . (1)   ..  .    .   . 
 vn vb i n bbb Next, we need to determine the node numbers of the positive and negative terminals for each branch. The configuration of the branches is also called the circuit topology and is usually given as a full incidence matrix Af with n + 1 rows corresponding to nodes and b columns corresponding to branches,   a a . . . a 0,1 0,2 0,b  1,1 1,2 1,b a a . . . a A  f =  . (2)  .. .. . . ..  .  . . .  an,1 an,2 . . . an,b The positive terminal of each branch is marked by +1 in the respective row, while the negative terminal is denoted by −1. All other elements in Af have zero values. So branch number k in figure 3 with its positive terminal at node p and its negative terminal at node m would have zero elements in the k-th column of Af except for ap,k = +1 and am,k = −1. Each column in Af thus has exactly two nonzero elements and each row has one nonzero element for each branch connected to it. v vb nk i v p b knm t+ t− Figure 3: Arbitrary branch in the network. Note that, by definition, a positive branch current ib is flowing from t to t k +− through the branch regardless of the branch type. This seems a bit unnatural when the branch contains an independent voltage source – we expect supply 12 voltage sources to drive supply current through the network, rather than the other way around. This is why you normally see negative supply currents in SPICE OPUS output listings. Notation Dimension Description n + 1 − number of nodes + one reference node (ground) b − number of branches v n n × 1 column vector of nodal voltages v b b × 1 column vector of branch voltages ib b × 1 column vector of branch currents Af n + 1 × b full incidence matrix of the network A n × b reduced incidence matrix of the network Table 1: The vectors and matrices used in the mathematical description of the circuit topology. Besides the full incidence matrix Af , we also need the reduced incidence matrix A, which actually is Af minus the first row, describing the connections to the ground node. By reducing the matrix in this way, we are not losing any information because we know that each column in A with a single nonlinear element has to be connected to ground. In fact, by reducing Af to A we are removing redundant information. We summarize the notation we have so far in table 2. At this point we are ready to write down Kirchhoff’s voltage law applied to each branch, ATv − v = 0, (3) n b and its twin, Kirchhoff’s current law applied to each node, Aib = 0. (4) In a very formal but elegant way, equation (3) is actually saying that the sum of voltage drops along the loop from ground to positive branch terminal, negative branch terminal and back to ground is equal to zero. This holds for all branches. Equation (4) is stating that the sum of all branch currents entering or leaving each node equals zero. Let us now focus on the devices in the branches, where the functional de-pendency between the branch voltages vb and the respective branch currents ib is determined. Because we are discussing only linear resistive networks in this section, we can state the branch equations as a linear function of the branch variables, Yvb + Zib − e = 0. (5) Y and Z are b-dimensional square matrices containing admittance and im-pedance. In most cases these matrices will have nonzero elements only on their respective diagonal, as most branch relations involve only their own branch volt-age and current. However, sometimes we need extra-diagonal nonzero elements to accommodate different kinds of controlled sources. 
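Since the incidence matrix does all the topological bookkeeping, it is worth seeing how little code it takes to build one. The sketch below is our own illustration (Python with NumPy, not part of the original notes): it assembles the full and reduced incidence matrices from a list of branch terminal pairs and evaluates Kirchhoff's voltage law (3) numerically.

    import numpy as np

    # Sketch (ours, not from the notes): build the full and reduced incidence
    # matrices from a branch list; each branch is (positive node, negative node)
    # and node 0 is the ground/reference node.
    def incidence(branches, num_nodes_with_ground):
        Af = np.zeros((num_nodes_with_ground, len(branches)))
        for k, (p, m) in enumerate(branches):
            Af[p, k] = +1.0      # positive terminal of branch k
            Af[m, k] = -1.0      # negative terminal of branch k
        return Af, Af[1:, :]     # A is Af with the ground row removed

    # A small made-up topology: 3 nodes (node 0 = ground) and 4 branches.
    branches = [(1, 0), (1, 2), (2, 0), (0, 2)]
    Af, A = incidence(branches, 3)

    vn = np.array([1.0, 2.0])    # arbitrary nodal voltages
    vb = A.T @ vn                # Kirchhoff's voltage law (3): vb = A^T vn
    print(Af.sum(axis=0))        # every column of Af holds one +1 and one -1
    print(vb)                    # branch voltages follow from topology alone;
                                 # KCL (4) constrains any valid ib by A @ ib = 0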
We also need to consider independent voltage and current sources, as they will obviously cause zero rows in Z and Y, respectively. The additions to the notation are listed in table 2 13 Notation Dimension Description e b × 1 column vector of excitation Y b × b admittance coefficient matrix Z b × b impedance coefficient matrix Table 2: The vector and the matrices used in the mathematical description of the circuit’s branches. Before continuing we will demonstrate, that the expression (5) really can accommodate all eight basic linear elements. Let us start with a 2kΩ resistance in branch 1, a 20mS conductance in branch 2, a 3V independent voltage source in branch 3 and a 15mA independent current source in branch 4           − 1 0 0 0 v b b 1 2kΩ 0 0 0 i 0 1    b     b    2 0 20mS 0 0 v 0 − 1 0 0 i 0 +     2     =   . (6) 0  0 1 0   v b 0 3   0 0 0   i b 3   3V  0 0 0 0 vb 0 0 0 1 ib 15mA 4 4 If you do the multiplications and additions in (6) you will promptly end up with vb = 2kΩ · i = 20mS· 1 b , i 1 bv 2b , v 2b = 3V and i = 15mA. 3 b 4 Likewise, all four types of controlled sources can be included, predictably causing extra diagonal nonzero elements in Y and Z           − 1 0 50 0 v b b 1 0 0 0 0 i 0 1    b     b    2 0 0 0 0 v 0 − 1 0 400 i 0 +    2      =   . (7) 0  0 − 1 0   v b   60 0 0 0   i b   0  3 3 0 2 0 0 vb4 0 0 0 1 ib4 0 Performing the operations in (7) is getting us a voltage-controlled voltage source vb1 = 50·vb3 in branch 1, a current-controlled current source ib2 = 400·ib4 in branch 2, a current-controlled voltage source vb = 60 i in branch 3 and a 3 · b 1 voltage-controlled current source ib = 2 v 4 ·b in branch 4. 2 Maybe you have noticed some inconsistency with regard to units of mea- surement in expression (5). In example (6) we have used the correct standard SI units, which doesn’t fully agree with Y being a pure admittance matrix and Z being a pure impedance matrix as suggested in table 2. The same goes for vector e, it contains a wild mix of units. Even worse, the units in example (7) are completely missing. The correct expression for the current-controlled voltage source would be really vb = 60Ω · i , but no engineer would ever call 3 b 1 the controlling coefficient a resistance. So it really depends on who you are. All the above expressions are written from an electrical engineering point of view. A mathematician couldn’t care less about units, he would only see variables with numerical values and so would a programmer. A physicist, on the other hand would be perfectly OK with a controlling coefficient in Ωs. He might even call it a transimpedance. The important lesson here is, whatever results you get out of SPICE OPUS are numerically correct, but the corresponding units of measurement are your responsibility. 14 2.1 Circuit Tableau Equation (5) can be viewed as an extension to the well-known Ohm’s law, so we can set up a complete equation set for any linear resistive network. To this end we simply combine (3), (4), and (5) in matrix notation,  T      A − I 0 v 0 n    b    0 0 A v − 0 = 0. (8) 0 Y Z ib e The equation set ( T 8 ) is also known as the circuit tableau, where A denotes the transposition of the reduced incidence matrix and I is a diagonal unity matrix. In principle, we could solve (8) directly and obtain all circuit variables. The equation set, however, is very large as it involves 2b + n equations with 2b + n unknowns. 
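To make the size and the block structure of (8) concrete, here is a small sketch (ours, again NumPy) that assembles the tableau from A, Y, Z and e; nothing more than careful block placement is involved.

    import numpy as np

    # Sketch (ours): assemble the circuit tableau (8) as one block matrix for a
    # reduced incidence matrix A (n x b), branch matrices Y, Z (b x b) and
    # excitation vector e (b). The unknowns are ordered [vn, vb, ib].
    def circuit_tableau(A, Y, Z, e):
        n, b = A.shape
        M = np.block([
            [A.T,               -np.eye(b),        np.zeros((b, b))],
            [np.zeros((n, n)),   np.zeros((n, b)), A               ],
            [np.zeros((b, n)),   Y,                Z               ],
        ])
        rhs = np.concatenate([np.zeros(b), np.zeros(n), e])
        return M, rhs    # the tableau states M @ [vn, vb, ib] = rhs

For the simple example circuit that follows (n = 2, b = 4), this is exactly the 10 by 10 system (13).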
Due to the topological part, the equation set includes a large trivial portion with integer coefficients. A direct solution would thus be rather inefficient. But before we continue, it is time we illustrated the circuit tableau with a simple example. Consider the resistive circuit in figure 4. Besides the ground node, it only has two more nodes and a total of four branches. v n v 1n2 g2 i1 g3 g4 Figure 4: A simple resistive network. First we need to capture the circuit topology by setting up the reduced in-cidence matrix. To this end we need the branch numbers, the node numbers, and the branch orientation, that is, we need to decide which is the positive and the negative terminal of each branch. With some devices, like the independent current source, we have no choice, because their functionality is terminal depen-dent, but with simple resistors, we can choose any orientation we want. So let us set up the incidence matrix according to branch current directions as indicated in figure 4, − 1 1 0 0 A = . (9) 0 −1 −1 1 With the incidence matrix in place, we can immediately formulate Kirch- hoff’s voltage law using equation (3),     − 1 0 v b1   n  1 1 − 1 vv − b  2   n2 0 −     = 0, (10) 1 v vb3  0 1 vb4 as well as Kirchhoff’s current law from (4), 15   i b 1 − 1 1 0 0 i  b  2 = 0. (11) 0   − 1 − 1 1 i  b 3 ib 4 When setting up branch relations, we again have to make some choices. Simple resistors can be set up either as admittances in Y or as impedances in Z. As we will see later, the admittance form is much more efficient. Therefore, we would like to express branch currents with branch voltages. In this case the impedance matrix is Z = −I and equation (5) becomes           0 0 0 0 v b b 1 1 − 0 0 0 i − i 1 1  2   b     b    2 0 g 0 0 v 0 − 1 0 0 i 0 +     2     −   = 0. (12) 0  0 g 3 0   v b 1 0 3   0 0 −   i b 3   0  0 0 0 g4 vb 0 0 0 −1 ib 0 4 4 The complete circuit tableau (8) for our simple resistive circuit thus looks like this:       − 1 0 − 1 0 0 0 0 0 0 0 v n1 0    n    2 1 − 1 0 − 1 0 0 0 0 0 0 v 0       0 − 1 0 0 − 1 0 0 0 0 0 v 0    b1          0 1 0 0 0 − 1 0 0 0 0 v 0    b   2        0 0 0 0 0 0 − 1 1 0 0 v 0    b   3  − = 0.       0 0 0 0 0 0 0 − 1 − 1 1 v 0    b   4        0 0 0 0 0 0 − 1 0 0 0 i − i    b   1 1       0 0 0 g 0 0 0 − 1 0 0 i 0  2   b    2  3   b   3 0       0 0 0 g 0 0 0 − 1 0 i 0  0 0 0 0 0 g4 0 0 0 −1 ib 0 4 (13) As expected, we are facing a 10 by 10 equation set, which can be solved directly. However, in order to improve computational efficiency, let us try to compact this equation set. 2.2 Basic Nodal Equations With some simplifications we can considerably reduce the circuit tableau. For easier handling we simplify our notation (8) by appending the excitation vector to the coefficient matrix,  T  n v   A −I 0 0  b  v     0 0 A 0 = 0. (14)  b  i 0 Y Z e −1 Then we rearrange the order of variables as well as the order of equations. In terms of matrices, we are permuting columns and rows. Specifically, we move column 1 in equation (14) between columns 3 and 4. At the same time, we 16 exchange rows 2 and 3, thus getting   v  T  b − I 0 A 0  b  i     Y Z 0 e = 0. (15)  n  v 0 A 0 0 −1 As the right-hand side of the equation is zero, we are allowed to multiply the left-hand side with any nonsingular matrix expression as long as the dimensions of the matrix match. 
We choose a unity matrix with the addition of Y in the first column of the second row. It is not difficult to see that the inverse of such a matrix always exists, thus guaranteeing nonsingularity. After the multiplication is carried out, we get    T  b v      T  b v           0 Y Z 0 e = 0 Z YA e = 0. Y  b   T i ib  I I 0 0 −I 0 A 0 −I 0 A 0  n   n  v v 0 0 I 0 A 0 0 0 A 0 0 −1 −1 (16) By this transformation we have eliminated branch voltages vb from the equa-tion set, much as one step of a Gaussian elimination would eliminate one vari-able. The only difference is that we are working with entire submatrices instead of real numbers. Instead of the 2b + n dimensional equation set, we now have only b + n independent remaining equations,   i T b Z YA e  n  v = 0, (17) A 0 0 −1 which can be solved directly. The solution (i b and vn ) would then be sub- stituted back into (16) to obtain vb. The trick was to arrange for a zero column below the diagonal element −I in (16). If we repeat this strategy once more, we can eliminate ib as well. So this time we multiply the system by a unity matrix with the addition of −1 − AZ in the second column of the third row. Of course, we are assuming that the inverse of the impedance matrix −1 Z exists, which is not always the case, but we will return to this question later. Right now we get    T  b v         0 0 Z YA e = 0 −1  n v  − 0 T   i b I I 0 0 −I 0 A 0 AZ I 0 A 0 0 −1   v  T  b − I 0 A 0 T  ib      0 Z YA e = 0. (18) 0 −1 T −1  n v  0 −AZ YA −AZ e −1 The final reduced equation set only has n independent variables and is the well-known circuit nodal equation set, − −1 T −1 AZ YA v + AZe = 0. (19) n We can directly solve (19) for the nodal voltage vector vn. All other circuit variables can be obtained by substitution back in (18). Note that we need not 17 solve the equation sets, just evaluate the matrix expressions. As expected, the branch voltages are trivial, v T = Av . (20) b n The branch currents, on the other hand, involve the inverse of the impedance coefficient matrix, i −1 T −1 = − Z YA v + Ze. (21) b n We can postpone the discussion about the existence of −1 Z for just a little longer: fortunately, the example circuit in figure 4 does have a very convenient impedance matrix Z = −I. The inverse is trivial, so nodal equations (19) are greatly simplified, T AYAv − Ae = 0. By taking A from (9) and Y and e from n (12), we can directly formulate the nodal equation set for the example circuit, g − g v i 2 2 n1 1 − = 0. (22) −g2 g2 + g3 + g4 vn 0 2 ˜˜˜ However, in computer programs like SPICE OPUS mathematical expres-sions are not necessarily implemented directly into algorithms. In this case the computer algorithm will cut a few corners. The coefficient matrix in nodal equations is T AYA . Suppose the network is composed exclusively of admittances y i, which makes Y a diagonal matrix. We know that the incidence matrix A only has 1 and −1 nonzero elements. So postmultiplying A with Y will multiply column i of A with the corresponding y i from Y. Let us assume that the admittance y in branch i is connected to node p and m. The multiplication product (AY) will have the same dimensions as A, moreover it will have the same nonzero-element pattern. Only now all nonzero values are multiplied by the respective admittance   0     0 0   . .  p,i     p,i  . . . +1 . . . . 0 . . . + y . . .       0 0 0 y 0 0 = 0 .    i,i          . . . − 1 . . . . . −  m,i     m,i  . . 
. y . . . 0 0 n × b n×b 0   0 . b×b Thus, in the final coefficient matrix ( T AY ) A the entries are the inner prod-uct of a row of ( T AY ) with a column of A . But the columns of (AY) are the rows of T A , so the entry corresponds to the inner product of two rows of A resulting in a perfectly symmetric matrix   0 .. ..    p,i  . . . + y . . . . .     0 0 +1 0 − 1 0 =    i,p i,m      . . . − y . . . .. ..  m,i  . . 0 b×n n ×b 18   .. .. . .   . . . + y . . . − y . . .  p,p p,m  =   . (23)  ..   .. . .    . . . − y . . . + y . . .  m,p m,m    .. .. . . n×n We have arrived at a simple algorithm for building nodal equations directly from admittances. An admittance y in branch i connected to node p and m will add y to the p-th and m-th diagonal as well as −y to the p-th and m-th anti-diagonal. Sounds confusing? Let us go back to the circuit in figure 4 and its nodal equations (22). Sure enough, g2 between nodes 1 and 2 is contributing to the diagonal and anti-diagonal 1 and 2. The other conductances g3 and g4 are connected from node 2 to ground and therefore only add to the diagonal 2. The computer algorithm can be extended in a similar way to any other linear element, not just admittances. Anyway, the nodal equation set (22) is much easier to solve than the entire circuit tableau (13), but the trick is to have a simple impedance coefficient matrix Z which is easily inverted. With resistive devices in branches we always have the choice of expressing the relation in admittance or impedance form. By choosing the admittance formulation, we automatically get −1 on the respective diagonal of Z. Independent current sources will give us a −1 diagonal as well. Independent voltage sources, on the other hand, will necessarily cause a zero column and a zero row, thus rendering a singular Z without an inverse. In order to solve this kind of problem, we need to modify the nodal equations. 2.3 Modified Nodal Equations As we have seen in the previous section, any nodal approach is greatly simplified if we can provide a unity impedance coefficient matrix Z = −I. So let us separate the voltage sources that cannot be expressed using Z = −I right from the beginning by splitting the incidence matrix into two parts A = [A1A2], where A1 represents all branches containing independent voltage sources and A2 all other branches. Consequently, relational equations (5) are also split two ways, Y 0 v Z 0 i e 1 b1 1 b1 1 + − = 0. (24) 0 Y2 vb2 0 Z2 ib2 e2 Consider all independent voltage sources indexed 1 and everything else in-dexed 2. Branch equations for independent voltage sources are trivial vb1 = e1 rendering Z1 = 0 and Y1 = I. With these changes the complete circuit tableau from (14) becomes  T  n v   A −I 0 0 0 0 1 v     2 0 − I 0 0 0 AT  b1    b2 v  1 2    0 0 0 A A 0 = 0. (25)    b1  i  1   0 I 0 0 0 e  b2  i 0 0 Y2 0 Z2 e2 −1 19 Similar to the manipulations in the previous section, we permute column 1 between columns 5 and 6 and row 3 to the bottom. The equation set is then multiplied by a nonsingular matrix in order to eliminate branch voltages vb1 and vb2,     T b v   I 1 0 0 0 0 − I 0 0 0 A 0 T  b2 1 v       2 0 I 0 0 0 0 − I 0 0 A 0      b1  i    1   I 0 I 0 0 I 0 0 0 0 e = 0. 
(26)      b2  i  2   2 2 2   0 Y 0 I 0 0 Y 0 Z 0 e  n  v 0 0 0 0 I 0 0 A1 A2 0 0 −1 So far the result is not much different in comparison to the elimination step (16) in the previous section,   v  T  b1 − I 0 0 0 A 0 1 v  2    − I 0 0 A 0 0 T  b2  T   b1  i  1 1   0 0 0 0 A e = 0. (27)    T b2  i  2 2 2 2   0 0 0 Z Y A e 0 0 A1 A2 0 0 1  n  v − As Z1 = 0 we now have a missing pivot in row 3. To avoid singularity we exchange columns 3 and 4 and move row number 3 to the bottom to obtain   v  T  b1 − I 0 0 0 A 0 1 v  2    − I 0 0 A 0 0 T  b2  T   b2  i  2 2 2 2   0 0 Z 0 Y A e = 0. (28)    b1  i  2 1    0 0 A A 0 0 0 T  vn  0 0 0 A e 11 −1 We cannot eliminate the entire branch current vector anymore; hence, we settle for ib2 only,   v    T  b1 I 0 0 0 0 − I 0 0 0 A 0 1 v       2 I 0 0 0 0 − I 0 0 A 0 0 T  b2      b2  T i    2 2    2 2 0 0 I 0 0 0 0 Z 0 Y A e = 0,  2   2 2 1    v 0  −1     b1  i 0 −A Z I 0 0 0 A A 0 0 0 T  n  0 0 0 I 0 0 0 0 A e 1 1 −1 (29) to obtain   v  T  b1 − I 0 0 0 A 0 1 v     2 − I 0 0 A 0 0 T  b2  T   b2  i  2 2 2 2    0 0 Z 0 Y A e = 0. (30)  −1 T −1   i b1   1 2  2 2 2 2 2 2   0 0 0 A − A Z Y A − Z A e 0 T  n v  0 0 0 A e 11 −1 The two bottom lines actually represent an independent equation set, which is often referred to as the first modification of nodal equations. Besides the nodal 20 voltages vn , it also involves branch currents of independent voltage sources ib1. Assuming all devices, apart from independent voltage sources, may have a clean admittance form, we can set Z2 = −I, yielding T A 1 2 2 2 A Y A i A e b1 2 2 − = 0. (31) 0 T A v 1 n e 1 One might be tempted to interpret the zero below pivot A1 as an elimination of i b1, but note that A1 is not a square matrix. All unknowns are therefore independent and must be solved for simultaneously. We have arrived at an equation set of the order n + b 1, where b1 is the number of independent voltage sources in the circuit. The basic nodal equations are easily recognized on the top right-hand side of the matrix in equation (31). They are now framed by b 1 additional unknowns and supplemented with b1 equations. Normally a circuit will only have a few voltage sources, so the order of the set of equations is only insignificantly larger than the basic nodal equations (19). Looking at any default output of SPICE OPUS, you will notice a listing of all nodal voltages plus currents through independent voltage sources. So far we have covered both kinds of independent sources as well as arbitrary resistive branches. Next we need to consider controlled sources as well. Voltage-controlled current sources will have −1 on the diagonal of Z and an extra-diagonal nonzero coefficient in Y, so they would be included in Y2 in the modification of (31). Voltage-controlled voltage sources will cause an empty row in Z and an additional nonzero coefficient beside the diagonal 1 in the respective row of Y. Similar to independent voltage sources, they belong in Y1. Current-controlled current sources will have −1 on the diagonal of Z in addition to a nonzero element in the respective row. This does not cause singularity, but the inverse of Z is not that trivial anymore. Basically, they can be included in Y2. But the problem of current-controlled voltage sources still exists. They have some nonzero coefficient in the respective row of Z, but not on the diagonal. 
They may or may not cause singularities, depending on the circuit topology. To solve this problem, we would need to introduce yet another modification of the basic tableau. The procedure is not difficult, but it is rather complicated because the relational equations now need to be split four ways. In order to include all types of controlled sources, SPICE OPUS is actually based on a more complex modification than (31). However, the order of the set of equations and all its properties are heavily dominated by the basic nodal equations, so we will focus on these equations in the following section. Remember that all the procedures described next will equally apply to any modification. ˜˜˜ Before continuing, it is time for a practical example to illustrate all the theory so far. Consider the simple BJT amplifier in figure 5. It has a 12V voltage supply, a 1µA current input with a respective impedance, a voltage divider to set the transistor bias, a 2KΩ pullup resistor and a 10KΩ load resistor. There is no feedback loop and the bipolar junction transistor is operating in the classic common emitter mode. The circuit really doesn’t need more explanations. 21 2K 10K 2K 100E 12V 1µA 80E 10K Figure 5: A simple transistor amplifier. We want to see how the example will fit into (31). This chapter is about linear resistive networks, so we obviously need to replace the nonlinear transistor with a respective linear model prior to populating the equation set (31) with specific numbers. Suppose our BJT has a current gain of 100, an input resistance of 20KΩ and an output impedance of 50KΩ. A corresponding linear model depicted in figure 6. C B C i b B 20K 100ib 50K E E Figure 6: Replacing the transistor with a linear model. There is a current-controlled current source in the model going against our assumption of Z2 being a unity. Luckily, we can transform it to a voltage-controlled current source, which will fit perfectly in our modified nodal equa- tions. Since vbe i b · 20KΩ = v be , we can use instead of 100ib to control the 200 current source. Next, we replace all resistances with conductances, add labels to the circuit nodes and arrive at the network in figure 7. vn3 g1 g3 g8 0.5mS 0.1mS 0.5mS g2 e1 10mS v n v 1n v 2n4 12V i1 g4 g v 5n g6 g 27 1µA 12.5mS 50µS 200 20µS 0.1mS Figure 7: The nodal-equation-friendly network. 22 At this point we could simply set up A, Y, e and do the matrix multi- plications required by (31). That would be the correct formal way. But we would rather demonstrate the building of nodal equations directly from branch elements is discussed in (23). Armed with this knowledge we will illustrate the algorithm for composing modified nodal equations directly from the circuit network. In our case in figure 7, SPICE OPUS first parses the input netlist and stores all linear elements in an internal data structure. So it knows there are 4 nodes and 11 branches in the circuit. One branch holds a voltage source, therefore the final equation set will include 4 nodal voltages and one branch current as in       0 0 0 0 0 i b1 0    n    1 0 0 0 0 0 v 0    n    2 0       0 0 0 0 v = 0 . (32)    n3 v       0 0 0 0 0 0    0 0 0 0 0 v n 0 4 The lines in the equation set indicate the dominant basic nodal equations. As expected from (31), the modification only adds one equation and one variable. Initially all values are zero, but this is the computer data space where equa- tion (31) will gradually emerge. 
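The element-by-element walkthrough that follows is exactly what a short stamping loop automates. The sketch below is our own illustration (Python/NumPy, with made-up helper names); it sets up the all-zero system (32) and applies the first two conductance stamps according to rule (23), matching the next two steps of the text.

    import numpy as np

    # Unknown order as in (32): [ib1, vn1, vn2, vn3, vn4]; everything starts at zero.
    G = np.zeros((5, 5))
    rhs = np.zeros(5)

    def stamp_conductance(G, g, p, m):
        """Add a conductance g between nodes p and m (0 = ground) by rule (23)."""
        for a in (p, m):
            for b in (p, m):
                if a and b:                          # ground contributions are dropped
                    G[a, b] += g if a == b else -g   # node k occupies row/column k, index 0 is ib1

    stamp_conductance(G, 0.5e-3, 1, 3)   # g1 = 0.5 mS between nodes 1 and 3, giving (33)
    stamp_conductance(G, 10e-3, 1, 2)    # g2 = 10 mS between nodes 1 and 2, giving (34)
    # ... g3 to g8 are stamped the same way; the controlled source, the 1 uA
    # excitation and the extra row/column of the 12 V supply then complete (36).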
Next, the algorithm loops through all elements in the circuit and populates the matrix. Assuming the first element to go is g1. According to (23), it will contribute symmetrically to four positions       0 0 . 5mS 0 − 0 . 5mS 0 i b1 0    n    1 0 0 0 0 0 v 0    n    2 0       − 0 . 5mS 0 0 . 5mS 0 v = 0 . (33)    n    3 0       0 0 0 0 v 0 0 0 0 0 0 v n4 0 Next comes g2, adding 10mS to the diagonal 1 and 2 as well as subtracting 10mS from the anti-diagonal 1 and 2       0 10 . 5mS − 10mS − 0 . 5mS 0 i b1 0    n1    0 − 10mS 10mS 0 0 v 0       0 − 0 . 5mS 0 0 . 5mS 0 v = 0 . (34)    n2          0 0 0 0 0 v 0    n3    0 0 0 0 0 v n 0 4 After all eight conductances in the circuit, we have an almost complete basic nodal equation set       0 10 . 5mS − 10mS − 0 . 5mS 0 i 0 b 1    n    1 0 − 10mS 22 . 65mS − 0 . 1mS 0 v 0       0 − 0 . 5mS − 0 . 1mS 1 . 1mS − 0 . 5mS v = 0 . (35)    n    2       0 0 0 − 0 . 5mS 0 . 62mS v 0    n 3   0 0 0 0 0 vn 0 4 Remains to add the voltage-controlled current source. It draws current from node 4 to ground, controlled by vn . The controlling coefficient 0.005 therefore 2 23 pops up in row 4 of column 2 of the nodal equation set       0 10 . 5mS − 10mS − 0 . 5mS 0 i 1 µ A b1    n    1 0 − 10mS 22 . 65mS − 0 . 1mS 0 v 0    n    2 1       − 0 . 5mS − 0 . 1mS 1 . 1mS − 0 . 5mS v = 0 . (36)    n3 0       0 0 . 005 − 0 . 5mS 0 . 62mS v 0    0 0 0 1 0 vn 12V 4 As expected, the 1µA current source ends up on the respective right hand side, which completes the basic nodal equations. Next, the modification part is added, that is everything with index 1 in (31). In our case this only includes the independent 12V voltage supply, which is represented with a single 1 in A1, adding the additional variable ib to the 1 equations set. One more variable means we need one more equation. You will find it right at the bottom of (36), namely vn = 12V. 3 So far, the entire process of setting up modified nodal equations has been illustrated in the way a computer algorithm would do it. But, maybe a formal definition is even more explanatory       0 g + g − g − g 0 i i 1 2 2 1 b 1 1  2 2 3 4 5 3   n 1   0 − g g + g + g + g − g 0 v 0       1 − g − g g + g + g − g v = 0 .  1 3 1 3 8 8   n    2       0 0 0 . 005 − g g + g + g v 0  8 6 7 8  n 3   0 0 0 1 0 vn e 41 24 3 Solving Linear Equations Sets In this section we will look at ways to solve sets of linear equations. In general, a linear equation set may be formulated as n linear expressions with n unknowns. Assuming a square coefficient matrix G, a constant vector c, and a vector of unknowns x, a linear equation set is formulated as Gx = c. The explicit notation of individual components thus is g1,1 x1 + g1,2 x2 + ... + g1,n xn = c1 g2,1 x1 + g2,2 x2 + ... + g2,n xn = c2 .. . gn,1 x1 + gn,2 x2 + ... + gn,n xn = cn. (37) As we have established in the previous section, SPICE OPUS is actually based on a high order modification of the basic nodal equation set, but all equation set properties are heavily dominated by the basic nodal equations, so we will assume our general equation set (37) to be formulated according to (19). Assuming T Z = − I , the coefficient matrix becomes G = AYA, the constant vector c = Ae, and the unknowns x = vn. 
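Before looking at how such a system is solved efficiently, it helps to see the baseline: form G and c literally as stated above and hand them to a general dense solver. The sketch below (ours) does this for the small circuit of figure 4; the element values are made up for illustration, since figure 4 does not specify any.

    import numpy as np

    # Baseline sketch (ours): G = A Y A^T and c = A e for the circuit of figure 4,
    # then a dense library solve. The values of g2, g3, g4 and i1 are hypothetical.
    A = np.array([[-1.0,  1.0,  0.0,  0.0],      # reduced incidence matrix (9)
                  [ 0.0, -1.0, -1.0,  1.0]])
    g2, g3, g4, i1 = 1e-3, 2e-3, 5e-4, 1e-3
    Y = np.diag([0.0, g2, g3, g4])                # branch admittances as in (12)
    e = np.array([-i1, 0.0, 0.0, 0.0])            # excitation vector from (12)

    G = A @ Y @ A.T                               # nodal coefficient matrix, see (19) and (22)
    c = A @ e                                     # right-hand side
    vn = np.linalg.solve(G, c)                    # nodal voltages vn1 and vn2

The next subsections effectively replace the library call by an LU decomposition that exploits the sparsity of G.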
3.1 Sparse Matrices

A nonsingular circuit must necessarily have at least one branch more than there are nodes, b > n; moreover, real-life circuits will have something like two branches for each node, b ≈ 2n. According to (2), each row of the full incidence matrix Af will thus have an average of about four nonzero elements. Especially with large circuits, removing the ground row to obtain the reduced incidence matrix A lowers this average only insignificantly. The majority of elements in A will be zero. Matrices with mostly zero elements are referred to as sparse matrices. Just looking at the definition (5) of the admittance matrix, we observe that Y is sparse as well. So both components of the coefficient matrix G = AYA^T are sparse. This should make G sparse too.

Consider the arbitrary admittance branch y_k in figure 3. It causes only two nonzero elements a_{p,k} = +1 and a_{m,k} = -1 in the k-th column of A. It also places a single nonzero coefficient y_k on the diagonal of Y. As we know from (23), the product AY will yield only two nonzero elements in the k-th column, namely +y_k in row p and -y_k in row m. Just as the multiplication by A has propagated the admittance y_k column-wise, the subsequent multiplication by its transposition, (AY)A^T, will additionally propagate y_k row-wise. Therefore, an arbitrary admittance y_k between nodes p and m only contributes to four coefficients in G, namely to the diagonal g_{p,p} = g_{m,m} = +y_k and to the anti-diagonal g_{p,m} = g_{m,p} = -y_k.

With approximately four branch connections to each node, we have the sum of all four admittances on the diagonal in addition to four nondiagonal nonzero elements in each row and column of G. So we typically have something like five nonzero elements per row and column in G regardless of the circuit size. A circuit with 100 nodes will give us a 100 by 100 coefficient matrix G, but of the 10,000 coefficients only about 500 will be nonzero. G definitely qualifies as a sparse matrix. Consequently, a majority of coefficients in equation set (37) are equal to zero, and we can take advantage of this fact when attempting to solve (37).

3.2 LU Decomposition

One way of solving linear equation sets Gx = c is by decomposing the square coefficient matrix G into the product of a lower triangular matrix L and an upper triangular matrix U,

G = LU.    (38)

By definition, all elements above the diagonal of a lower triangular matrix are equal to zero, l_{i,j} = 0 \forall i < j. Similarly, all elements below the diagonal of an upper triangular matrix are zero, u_{i,j} = 0 \forall i > j. Consequently, the components of the product in (38) are

\begin{bmatrix} g_{1,1} & g_{1,2} & \dots & g_{1,n} \\ g_{2,1} & g_{2,2} & \dots & g_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ g_{n,1} & g_{n,2} & \dots & g_{n,n} \end{bmatrix} = \begin{bmatrix} l_{1,1} & 0 & \dots & 0 \\ l_{2,1} & l_{2,2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n,1} & l_{n,2} & \dots & l_{n,n} \end{bmatrix} \begin{bmatrix} u_{1,1} & u_{1,2} & \dots & u_{1,n} \\ 0 & u_{2,2} & \dots & u_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & u_{n,n} \end{bmatrix}.    (39)

After the multiplications we get n^2 individual equations with n^2 + n unknowns,

g_{i,j} = \sum_{k=1}^{n} l_{i,k} u_{k,j}, \quad \forall i, j = 1 \dots n.    (40)

Obviously, the decomposition is not unique, because we have more unknowns than equations. We can freely choose the values for n variables. Usually the diagonal of L is set to 1,

l_{i,i} = 1, \quad \forall i = 1 \dots n.    (41)

However, something is not right! We are trying to solve a system of n linear equations (37) by solving a system of n^2 linear equations (39). Did we make a bad deal? Not at all. The trick is in the triangularity of L and U.
Because of all the zero elements, the sum in (40) doesn’t really have to go all the way up to n. If you study (39) closely, you’ll see that the sum only needs to go to min(i, j), because as soon as k is larger than i or j, either li,k or uk,j is zero. Therefore (40) becomes min(i,j) gi,j = li,kuk,j ∀ Xi, j = 1 . . . n. (42) k=1 Let us see where this is taking us. By resolving (42) with (41) in mind, we get the following 2 n equations. g 11 = u11 g12 = u12 g13 = u13 . . . g 21 = l21u11 g22 = l21u12 + u22 g23 = l21u13 + u23 . . . g 31 = l31u11 g32 = l31u12 + l32u22 g33 = l31u13 + l32u23 + u33 . . . .. .. .. . . . . . . 26 Remember, all elements of G are given, while all components of L and U are unknowns. The first row of equations is easy. We know all u1,i since they are simply equal to the g1,i. From the first equation of the second row we can calculate l 21, since we already know u11. As luck would have it, the second equation of the second row will give us u 22, the third will yield u23 and so on. Each equation seem to hold only one unknown. But that has got nothing to do with luck. The secret lies in the right order of calculation. By proceeding from left to right and form top to bottom we can extract exactly one l or one u in each step u11 = g11 u12 = g12 u13 = g13 . . . l 1 21 21 22 22 21 12 23 23 21 13 u = g u = g − l u u = g − l u . . . 11 l 1 1 31 31 32 32 31 12 33 33 31 13 32 23 u = g l = (g − l u ) u = g − 11 22 u (l u + l u ) . . . .. .. .. . . . . . . But that is not all. If you look closely at the above equations, you will notice, that we can just as well proceed from top to bottom and then from left to right. The formal notation for resolving (42) for the individual components of L and U is j−1 ! 1 X li,j = gi,j − li,kuk,j ∀i = 2 . . . n, j = 1 . . . i − 1 (43) uj,j k=1 and i−1 u X = g − l u ∀i = 1 . . . n, j = i . . . n. (44) i,j i,j i,k k,j k=1 In fact, we could define the same thing with an elegant series of matrix operators as discussed in section 1.4, but let us not be too enthusiastic. Anyway, the expressions look complex, but if we compute them in the right order, then each equation holds only one unknown. All li,j and ui,j are computed as a difference between the corresponding element gi,j and a sum of products of previously computed li,k and uk,j, that is, elements with lower indices k < i, j. This is very convenient, because we can store the components of L and U in the original matrix G as we proceed. Note that the unit diagonal of L does not need storing. The decomposition can proceed row-wise or column-wise, but we prefer an alternating pattern between rows and columns is in figure 8. 27 u1,1 u1,1 u k,i uk,j uk,i uk,j l i,k gi,i gi,j li,k ui,i ui,j l j,k gj,i lj,k lj,i Figure 8: The LU decomposition algorithm, proceeding alternating row- and column-wise. In each step we first finish the row, mapping gi,i . . . gi,n to ui,i . . . ui,n and then the respective column g i+1,i . . . gn,i to li+1,i . . . ln,i. This pattern is very important because G is a sparse matrix and needs special treatment, as we shall see in the next section. The decomposition of G in itself does not solve our equation set directly. Two more steps are needed: a forward and a backward substitution. Due to the rule of associativity, our equation set (38) can be computed either as (LU)x = c or L(Ux) = c. By introducing a new vector of unknowns y = Ux, we first solve Ly = c for y       1 0 . . . 0 y c 1 1  2,1   2   2 l 1 . . . 0 y c       .. .. . . .. .. ..       . . . . . 
. = . (45)       ln,1 ln,2 . . . 1 yn cn As L is lower triangular, the top row only holds one unknown, namely y1. With this in mind, we can extract y 2 from row two and so on. This process is called forward substitution i−1 y i = ci − X l y ∀i = 1 . . . n. (46) i,j j j=1 Knowing y, we now solve Ux = y for x. This time the coefficient matrix is upper triangular, so we employ a so called back substitution   1 n x X = y − u x ∀i = n . . . 1, (47) i  i i,j j  u i,i j=i+1 giving us the final solution x. Let us summarize. We can solve a linear equation set Gx = c, by decom- posing the coefficient matrix G = LU, according to (43) and (44) using the algorithm in figure 8. So, the equation set becomes LUx = c. Next, we intro- duce a temporary new variable y = Ux and by substitution (46) solve Ly = c for y. Finally, y is used in the back substitution (47) to solve Ux = y for x. 28 Thus, we have lined up all the math. But at what computational cost? With a little arithmetic we can establish that the LU decomposition in expressions ( n3 2 n n 43 ) and ( 44 ) demands + − multiplications and just as many additions. 3 2 3 The forward ( n 2 n 46 ) and the backward substitution ( 47 ) each need another + 2 2 operations. So we have a total of n3 2 3 n 2n + + multiplications and additions to 3 2 3 solve the equation set 3 Gx = c . With large circuits (e.g., n > 100) the n term is predominant, so we can state that the computational effort is approximately n3 . 3 So far we have discussed the process of LU decomposition from a general point of view. However from the previous section we know that our equation set will only yield approximately five nonzero elements in each row in the coefficient matrix G. This is potentially dangerous because we have divisions by ui,i in the decomposition (43) as well as the back substitution (47). Another consideration is the total number of arithmetic operations n 3 needed for the solution. This 3 number should be much lower if we employ an efficient pivoting technique as explained in the next section. 3.3 Successful Pivoting Techniques For several reasons, the key elements in the LU decomposition are the pivots, that is, the diagonal elements u i,i and li,i,. The latter are not really a problem because we have wisely chosen unit pivots in (41). However, the pivots in U are used for divisions in (43) and (47). Obviously, all pivots need to be nonzero; moreover, the pivot values must not be too small compared to all other coefficients since this can cause fatal numerical errors, as will be seen in the following section. The way to influence the pivots is by matrix permutation. Consider the component notation of Gx = c in (37). We have n individual equations with n unknowns. The order in which we write the equations surely cannot have any effect on the results and neither can the order of the unknowns. We are allowed to change the order of rows and columns in G as long as we reflect the changes in x and c. Specifically, the order of elements in x must follow the order of columns in G. Similarly, the order of c must accompany the order of rows in G. Changing the order of rows and columns in a matrix or vector is termed permutation. So let us consider the birth of pivots (44) in figure 8 from the permutation point of view. Each pivot is computed from the respective coefficient gi,i and the sum of products of all ls and gs above and to the left of it. 
If we ensured large, nonzero diagonals gi,i in advance, we should get nonzero pivots ui,i which are large enough not to cause fatal numerical errors. In some unlucky situations, however, the sum of products may be equal or very close to gi,i, in which case we get pivots which are either zero or too small. These rare occasions cannot be foreseen in advance. Nevertheless, the first step is a permutation of G to make sure the first pivot u11 = g11 is not zero. This is not difficult as most diagonal elements are nonzero and relatively large already. Consider the fact explained in section 3.1 that all resistive branches contribute to the diagonal. Thus, only a few permutations will be required to prepare the diagonal in G for solid pivots during the LU decomposition. 29 Let us illustrate this procedure with the example circuit from figure 5. After having set up the respective modified nodal equation set, we have arrived at the coefficient matrix G in (36). At first, we will focus on the pattern of nonzero elements only. In figure 9, all nonzero elements are symbolically represented by dark gray squares and denoted by g. 0 g g g 0 g g g 0 0 0 g g g 0 g g g 0 0 g g g g g g g g g g 0 0 g g g g 0 g 0 g 0 0 0 g 0 g 0 0 0 0 Figure 9: Fixing the first pivot in the coefficient matrix (36) by exchanging columns 1 and 4. Unfortunately, the first pivot is zero, so we really need to make some permu-tations before starting the decomposition of G. By exchanging columns 1 and 4, this problem is solved. As you can see, this leaves us with the last two pivots in G being zero, which can lead to zero pivots in U. Regardless, let us start the decomposition by alternatively applying (44) row-wise and (43) column-wise as in figure 10. g g g 0 0 u u u 0 0 u u u 0 0 g g g 0 0 l g g 0 0 l u u 0 0 g g g g g l g g g g l l g g g g 0 g 0 g l 0 g 0 g l l g 0 g g 0 0 0 0 l 0 0 0 0 l l 0 0 0 Figure 10: Mapping row 1 according to (44), column 1 according to (43) and marking three future fill-ins at g42, g52, g53. Then mapping row 2 and column 2. As expected, the fist row of U and the first column of L appear, replacing their original elements from G according to the algorithm in figure 8. Notice that the pattern of nonzero elements has not changed, however three zero elements in G are marked by dark gray squares, indicating future fill-ins. A fill-in means that an initial zero element changes to nonzero, thus reducing the sparsity of the matrix. After the second mapping in figure 10, g 42 and g52 promptly become nonzero. How come, we knew this in advance? According to (43), we know that l42 = g42 − l41u12. Since the product l41u12 is nonzero, the zero element g42 is mapped to a nonzero l42. The same goes for g52. The fate of future fill-ins is already sealed when rows and columns of G are mapped to respective rows of U and columns of L. The products li,kuk,j in (44) and (43) are nonzero only if l i,k 6= 0 and uk,j 6= 0. So, with each new line of U and column of L we just check all remaining zero elements in G whether they just got a new nonzero l to the left as well as a nonzero u above. Wherever this is the case, we have a future fill-in. 30 With this in mind we can complete the decomposition in three more steps as in figure 11. u u u 0 0 u u u 0 0 u u u 0 0 l u u 0 0 l u u 0 0 l u u 0 0 l l u u u l l u u u l l u u u l l l 0 g l l l u u l l l u u l l l 0 0 l l l l 0 l l l l u Figure 11: Three more mappings with respective fill-in predictions in dark gray squares. 
Luckily, the two fill-ins at g44 and g55 have provided us with nonzero pivots u44 and u55. However, as engineers we can’t rely on luck. We need to find a way to control our fill-in! But what options do we still have? In figure 9, we have initially done a column permutation in order to secure a nonzero first pivot. But who is to say that we can’t permute rows and columns during the entire process of decomposition? After each step, we can pick any nonzero g as the next pivot. For instance, if the current pivot is gk,k, but we would rather have gi,j, we just exchange row k with row i and column k with column j. It goes without saying that by permuting rows in the remaining part of G must be accompanied by the same permutations in the emerging L as well as in vector c. Likewise, the columns of U and the variable vector x must match the movements of the columns in G. This should give us the means to predict and control our fill-in. So it’s back to square one. Again, we start with the modified nodal equation set (36), only this time around we think harder before selecting the first pivot. The algorithm is called prospecting for pivots. It’s a relatively simple brute force algorithm. We know how to predict the fill-in of any given pivot, so we just check all pivot candidates. Initially, we have 15 nonzero elements in the coefficient matrix (36). In principle, we have to permute each of them to the pivot position and check for future fill-in just like we did in figure 10. However, the permutation is not even necessary. We can just skip it and analyze the pattern of nonzero elements directly in G as demonstrated in figure 12. 0 l g g 0 0 g g l 0 0 g g g 0 0 l g g 0 0 g g l 0 0 g g g 0 u u u u u g g g l g u u u u u 0 0 g g g 0 0 u u u 0 0 g g g 0 0 0 g 0 0 0 0 l 0 0 0 0 g 0 Figure 12: Checking the future fill-in for pivot candidates g32, g44 and g31. In the example we are checking 3 of the 15 nonzero elements, namely g32, g44 and g31. In the left most case we are considering g32 for the pivot position. This would map row 3 to U and column 2 to L. Fill-ins would potentially occur 31 at any position in G with a nonzero u in its column, as well as a nonzero l in its row. These position are marked by dark gray squares. This means, that g32 would add to 8 positions, from which 4 already are nonzero. Therefore g32 as pivot would cause 4 fill-ins at g11, g15, g21 and g25. Checking g44 in the same manner results in 4 fill-ins, while g31 is yielding no fill-ins at all. By looping through all 15 nonzero elements in G you can discover several candidates with no future fill-ins. For some reason, we prefer pivots with large values. So, we go for g31 = 1.00000 because it has the largest absolute value of all the 15 candidates. Figures 13, 14 and 15 visualize all the following steps of the pivoting algorithm. 0 g g g 0 g g g g g u u u u u 0 g g g 0 0 g g g 0 0 g g g 0 g g g g g 0 g g g 0 0 g g g 0 0 0 g g g 0 0 g g g 0 0 g g g 0 0 0 g 0 0 0 0 g 0 0 0 0 g 0 Figure 13: Permuting g31 to pivot position. Mapping row 1 and column 1 and prospecting for the next pivot. The largest element without future fill-in is g54 = 1.00000. u u u u u u u u u u u u u u u 0 g 0 0 0 0 u 0 0 0 0 u 0 0 0 0 g g g 0 0 l g g 0 0 l g g 0 0 g g 0 g 0 l g 0 g 0 l 0 g g 0 g g g 0 0 l g g 0 0 l g g 0 Figure 14: Permuting g54 to pivot position. Mapping row 2 and column 2. Finding and permuting the next pivot without fill-in, which is g34 = 0.01050. 
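Each of these prospecting steps boils down to the same check on the pattern of nonzero elements. A minimal sketch is given below; representing the pattern as a boolean matrix and breaking ties between equally good candidates by the largest absolute value are my assumptions about a reasonable implementation, not the actual SPICE OPUS data structures.

import numpy as np

def fillins_for_candidate(pattern, i, j):
    """Count the fill-ins caused by choosing element (i, j) as the next pivot.
    `pattern` is the boolean nonzero pattern of the not-yet-decomposed part.
    A zero position (r, c) outside the pivot row/column fills in if its row
    already has a nonzero in column j (a future l) and its column has a
    nonzero in row i (a future u)."""
    n = pattern.shape[0]
    count = 0
    for r in range(n):
        for c in range(n):
            if r != i and c != j and not pattern[r, c]:
                if pattern[r, j] and pattern[i, c]:
                    count += 1
    return count

def prospect(G, pattern):
    """Loop over all nonzero candidates and pick the one with the fewest
    predicted fill-ins; among equals prefer the largest absolute value."""
    best_key, best_pos = None, None
    for i, j in zip(*np.nonzero(pattern)):
        key = (fillins_for_candidate(pattern, i, j), -abs(G[i, j]))
        if best_key is None or key < best_key:
            best_key, best_pos = key, (i, j)
    return best_pos, best_key[0]

# an arbitrary small sparse example (not the matrix (36) from the text)
G = np.array([[0.0, 2.0, 0.0, 1.0],
              [3.0, 1.0, 0.0, 0.0],
              [0.0, 0.5, 4.0, 0.0],
              [1.0, 0.0, 0.0, 0.2]])
pos, fills = prospect(G, G != 0)
print("best pivot candidate", pos, "causes", fills, "fill-ins")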
u u u u u u u u u u u u u u u 0 u 0 0 0 0 u 0 0 0 0 u 0 0 0 0 l u u 0 0 l u u 0 0 l u u 0 0 l 0 g g 0 l l g 0 0 l l u 0 0 l l g 0 0 l 0 g g 0 l 0 l u Figure 15: Mapping row 3 and column 3. Finding and setting the next pivot, which turns out to be g54 = 0.02265 and completing the decomposition. Our pivoting technique has paid off well. Not only have we ensured nonzero pivots ui,i, we have also greatly reduced the overall matrix fill-in. Consider the outcome of the decomposition with pivoting in figure 15 in comparison to the earlier outcome without pivoting in figure 11. Without pivoting we had 6 fill-ins 32 and no control over the pivot values. With pivoting we see no fill-in at all, plus we are getting large pivots. Admittedly, the pivoting algorithm has been demonstrated on the extremely small circuit from figure 5, which is by no means representative. In fact the circuit is so small that the sparsity of its coefficient matrix is barely recognizable. Even so, the example is illustrative. It turns out that the presented pivoting algorithm is even more efficient when applied to realistic analog circuits with up to 50 nodes. Notice, that so far we have not performed any actual calculations. All we have done in figures 13, 14 and 15 is simulating the nonzero element patterns during the decomposition. You could call it a decomposition dry-run. Its out-come is actually an optimized permutation of the original equation set Gx = c. The exact pivoting algorithm as it is implemented in SPICE OPUS is rather complex, so only the basic flowchart is presented in figure 16. Scan all nonzero coefficients g and tag the start G candidates that are large enough to become pivots. i = 1 Loop through all candidates g and U count the number i = i + 1 of fill-ins. Select the one with the least fill-in. L G Permute the no selected g to U done yes pivot position i = n ? and add all future fill-ins to G. L G Figure 16: The pivoting algorithm simulating the nonzero pattern of the de-composition. 33 Keep in mind that the pivoting algorithm is run before the actual decom-position. It simulates the decomposition fill-in only. There is no information regarding the actual coefficient values, except for the initial G values. Con-sequently, the fill-ins are not pivot candidates. Nevertheless, the algorithm performs sufficiently in the majority of real cases. The weak spot however might pop up during the subsequent decomposition if some pivot candidate is reduced too much or some fill-in becomes too large compared with its pivot. There are several advanced algorithms solving these kind of problems, but they would exceed the scope of this document. 3.4 Numerical Error Control Finally everything is set to actually solve the modified equation of our example circuit in figure 5. So, we take the equation set (36) and permute it using the quick fix for the first pivot as in figure 9, or the fill-in optimized permutation from figure 15. Either way, we just alternatively run (44) and (43) to obtain L and U. Then, we run a forward substitution (46) to get y followed by a back substitution (47) calculating our final result x. At this point, we are a bit worried about numerical errors during the entire process of LU decomposition. However, with such a small and simple circuit we surely get perfect results any way around. So we will not do the calculations just yet. In the previous section we kept stressing the fact that pivots should not be too small compared to all other coefficients. 
Now we will explain why a small pivot may cause fatal numerical errors and discuss how small is too small. Numerical error analysis is a very difficult subject in mathematics, so we will simplify things as much as possible. SPICE OPUS represents real numbers in the 64 bit double precision IEEE floating point format. This involves a 52 bit significand with an 11 bit expo-nent. Consequently, real numbers have a 10−16 relative round-off error. Our coefficients in G are based on model parameters with relative errors anywhere from 0 −16 . 5 to, say, 0 . 001. Thus, the 10 round-off error is negligible. We further know that each arithmetic operation will produce a round-off error in the 10−16 order of magnitude. So even if we had the worst case of error accumulation it would take something like 1014 operations to endanger the numerical quality of the final result. With n3 operations per LU decomposition 3 this would take a circuit of over 60,000 nodes with a nonsparse coefficient matrix. So the round-off errors remain at a safe distance at the far end of the significand. However, there is a situation where the round-off errors can completely flood the entire significand. The root of this evil is the subtraction of two floating point numbers which are very close in value to each other. Suppose that we only had a 5 digit significand. Then the simple subtraction 5 3 3 . 2356? · 10 − 5 . 2353? · 10 would yield something like 3 −1 . ????? · 10. A general rule says that if we see the exponent of a sum drop several magnitudes below the largest exponent of any summand, then the round-off error advances for just as many magnitudes in the significand. To illustrate this, let us run the decomposition on (36) with just the basic pivot fix from figure 9. In order to study the round-off error propagation we need to scale the floating point format down to our circuit. Instead of calculating with a 16 digit significand we go with only 3. Our initial equation set Gx = c 34 thus becomes       − 0 . 500 10 . 5 − 10 . 0 0 0 v n3 0 . 001    n    1 0 − . 100 − 10 . 0 22 . 7 0 0 v 0    n    2 1       . 10 − 0 . 500 − 0 . 100 1000 − 0 . 500 v = 0 . (48)       − 0 . 500 0 5 . 00 0 0 . 620 i 0    b1    1000 0 0 0 0 vn 12000 4 To make the equation as illustrative as possible, we have multiplied the equation set by the factor of 1000, which doesn’t change anything in terms of numerical error. Now, we run the decomposition rounding the significand to 3 digits after each addition and multiplication. The result Ly = c is ready for the forward substitution       1000 0 0 0 0 y 0 . 001 1    2   200 1000 0 0 0 y 0       2200 − 1980 1000 0 0 y = 0 . (49)    3         1000 920 − 300 1000 0 y 0    4   −2000000 −1830000 954000 −3180000 1000 y 5 12000 So far, everything is looking just fine. Notice that most elements have increased during the decomposition by several magnitudes. This is the direct consequence of our relatively small pivots in (48). During the decomposition all elements of L have been obtained by division with pivots, hence the large values. The backward substitution Ux = y is also facing large elements in the lower part of U       − 0 . 500 10 . 5 − 10 . 0 0 0 v n3 y 1    n1  2 0 − 12 . 0 25 . 0 0 0 v y       0 0 27 . 0 1000 − 0 . 500 v = y . (50)    n2  3       0 0 0 300 0 . 
470 i y    b1   4 0 0 0 0 1980 v n y 45 This comes as no surprise, since large values naturally propagate as G is grad-ually mapping to L and U. After performing the forward and back substitution, we finally arrive at the solution with a bit of a surprise    −6        y 1 n3 1 . 00 · 10 v 11 . 9V 12 . 000V  2    n      1 y −7 2 . 00 · 10 v 0 . 961V 1 . 0733V  3    n      2 y    −        6 = 1 . 82 · 10 , v = 0 . 463V 6 = 0 . 52683V . (51)  4    b1 y    −        7 − 2 . 81 · 10 i − 9 . 55mA − 9 . 8963mA      y 1 5 n4 1 . 20 · 10 v 6.09V 5.4288V As we can see, x turns out to be far off the correct solution on the right-hand side. Observing the computation of y 5 according to (46), we can see that the entire sum, after all round offs, equals zero. In other words y5 is completely independent of all y1 . . . y4. This can’t be good, y5 is probably completely wrong. Consequently, the subsequent back substitution is doomed, but not only because it depends on a wrongly calculated y5. It includes more exponent reductions by addition. The fate of the substitutions has been decided right at the beginning of the decomposition. The first pivot g11 = −0.500 is actually one of the smallest of 35 all nonzero elements in G. Many elements of L are divided by g11 in (43), which makes them large. These large values then propagate to U as well (44). This is why most elements have increased by several orders of magnitude during the decomposition. In itself this is not a problem. However, what goes up must come down, as the final results are independent of the pivot selections. And there are only two ways for coefficients to come down, either by multiplication (division) or by addition (subtraction). Multiplication would be numerically safe. Unfortunately, this is not the case for all coefficients. ˜˜˜ What can we do to avoid all this? The numbers are defined by the circuit at hand. We don’t want to change the circuit, so we can’t change the numbers. But we can change the order of things! Let us bake a loaf of bread. The recipe for the dough calls for 100g of starter, 250g of water, 450g of flour and 5g of salt. Since we want to keep dirty dishes to a minimum, we put a single bowl on a scale, reset the scale reading and start adding the ingredients. We pour sourdough stater into the bowl until the reading is 100g. Next we add water until the scale shows 350g and flour until we get 800g. Finally we pour salt into the bowl until the reading is 805g. Any engineers will wince at this! Why? The amount of salt we’ve actually just added is determined by subtracting the last two readings, namely 805g−800g=5g. The no go, a subtraction of two floating point numbers which are very close to each other in value! The standard digital kitchen scale is accurate down to 0.2%. So a reading of 805g could be anything between 803g and 807g, which is no problem so far. However, after the subtraction we end up with anything between 4g and 6g of salt in our bread, which is only 20% accurate. A remedy is simple. By choosing a smart weighing order of ingredients, we can easily avoid the fatal subtraction. We start with the lightest ingredient (the salt) and work our way to the heavy ones. The target readings would be 5g, 105g, 355g and 805g. This time around, the accuracy of all ingredients is better then 0.5%. So, from baking bread we have learned that order matters greatly. Back to our circuits, where things are not that obvious. 
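Before doing so, the rule about subtracting nearly equal numbers is easy to demonstrate. The helper below keeps only five significant digits, mimicking the shortened significand in the subtraction example above; the specific numbers are illustrative only.

def round_sig(x, digits=5):
    """Round x to the given number of significant digits,
    mimicking a shortened floating point significand."""
    return float(f"{x:.{digits - 1}e}")

a = 5.23567e3          # the trailing digits are already uncertain
b = 5.23534e3

# Subtract first, round later: the true difference survives.
print(round_sig(a - b))                # ~0.33

# Round to 5 digits first (as a short significand would), then subtract:
print(round_sig(a) - round_sig(b))     # ~0.4 -- only the uncertain digits are left

# The exponent of the result dropped by four orders of magnitude compared
# with the operands, so roughly four digits of accuracy were lost.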
The idea is to avoid large coefficients during the decomposition by keeping the pivots as large as possible. Let us see what happens if we use the fill-in optimized permutation from figure 15. We get a nice string of large pivots to start with       1000 1 . 10 − 0 . 500 − 0 . 100 − 0 . 500 i b1 0    n    3 0 1000 0 0 0 v 12000    n    1 0       − 0 . 500 10 . 5 − 10 . 0 0 v = 0 . 001 . (52)    n2 0       − 0 . 100 − 10 . 0 22 . 7 0 v 0    0 −0.500 0 5.00 0.620 vn 0 4 Moreover, we know there won’t be any fill-ins during the decomposition. Sure enough, the numerical values of Ly = c remain in the same order of magnitude       1000 0 0 0 0 y 0 1    2   0 1000 0 0 0 y 12000       0 − 0 . 500 1000 0 0 y = 0 . 001 . (53)    3         0 − 0 . 100 − 95 . 2 1000 0 y 0    4   0 −0.500 0 38.2 1000 y5 0 36 The same goes for the elements of Ux = y       1000 1 . 10 − 0 . 500 − 0 . 100 − 0 . 500 i y b1 1    n   2 3 0 1000 0 0 0 v y    n   3 1 0       0 10 . 5 10 . 0 0 v = y . (54)    n   2 0       0 0 13 . 1 0 v y 4 0 0 0 0 0.620 vn y 45 Next, both substitutions are run to obtain the final results           y 1 b1 0 i − 9 . 90mA − 9 . 8963mA  2    n      3 y 1 1 . 20 · 10 v 12 . 0V 12 . 000V  3    n      1 y    −        3 = 6 . 00 · 10 , v = 1 . 07V ≈ 1 . 0733V (55)  4    n2 y    −3       6 . 91 · 10 v 0 . 527V 0 . 52683V      y −3 5 n4 3 . 36 · 10 v 5.42V 5.4288V This time the results are as accurate as they can possibly be considering that we have used a 3 digit significand only. For simplicity, the involved example has been extremely downsized, which makes it unrealistic. However, the principle of round-off error propagation is realistic. The strategy of keeping pivots as large as possible by permutation actually works well in the majority of cases. However, it is not a magic wand–in some special cases large pivots can do more harm than good. There are two simulator parameters in SPICE OPUS to control the pivot selection criteria: an absolute tolerance pivtol and a relative one pivrel with respective default values 10−12 −3 and 10. The pivot candidates in figure 8 must meet both conditions |g i,i| > pivtol |g i,i| > pivrel · |gj,i| ∀j = i + 1, i + 2, . . . , n. (56) Note that the relative tolerance is checked only for elements below the pivot, because only L components are divided by the pivot (43). Although the default pivot simulator parameters can be changed, even advanced users are strongly discouraged to alter these settings, as unpredictable numeric behavior may oc-cur. 37 4 Introducing Nonlinear Devices So far we have seen how to set up a linear equation set from a circuit netlist and how to efficiently solve the equations. We have allowed only linear devices, which is a tough assumption in circuit design. Now we are ready to take the next step by introducing nonlinear devices. In general, we cannot express branch currents and branch voltages explicitly anymore. We now have to use the general expression R(v b, ib) = 0 (57) instead of (5). R(vb, ib) is a column vector of functions of branch volt-ages (vb) and branch currents (ib) describing the nonlinearity contained in each branch, R 1(vb , v , i . . . i 1 b . . . v 2 b , i b b 1 bb ) = 0 2 b R 2(vb , v . . . i ) = 1 b . . . v 2 b , i b b , i 0 1 b 2 b b .. . R b(vb , v , i 1 b . . . v 2 b , i b b . . . i 1 b ) 2 b = 0. 
b Of course, real circuits will not include devices with functional dependences of all circuit variables, so the above equations are the most general case. Also, under normal circumstances not all branches include nonlinear devices; hence, many of the above functions will actually be simple linear expressions as in (5). Nevertheless, our complete general nonlinear equation set reads  T  A v − v n b  b  Ai = 0 (58) R(vb, ib) instead of the more explicit form in (8). Although we still have a linear part, which describes the circuit topology, we cannot attempt to solve this equation set analytically. In special cases one can find direct solutions, but in general we have to resort to iterative techniques. 4.1 Fixed Point Iteration One of the most fundamental principles for solving nonlinear equations with computers is fixed point iteration. The algorithm is suitable for solving equa-tions of the form x = f (x), (59) not unlike our problem in (58). Equation (59) is solved by computing the sequence x(k+1) (k) = f ( x), where k = 0, 1, 2, . . . , m. (60) Obviously, the iteration series needs a starting point (0) x and some conver-gence criterion to determine the necessary number of iterations m. We assume that a value for (m) (m −1) m exists such that | x − x| < , for an arbitrary . The 38 last iteration (m) ( ∗) x is then accepted as the approximation of the solution x to (59). If we are to use fixed point iteration to solve our nonlinear equation set (58), which has the form g(x) = 0 instead of (59), we need to transform it slightly. The idea is to add x to both sides of the equation to obtain x = x + g(x). (61) Let us look at an example of fixed point interation. Assume the simple function 5 x ( ∗) 5 g ( x ) = − e . In this case we know the solution of ( 61 ) as x = ln. 3 3 Further assume that the starting point (0) x = 0.1 and the convergence criterion is −4 = 10. 0.7 (0) (1) f ( x ) x f (2) (3) ( x ) x 0.6 0.5 0.4 (1) f ( x) x(2) 0.3 0.2 0.1 x (0) 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Figure 17: Fixed point iteration convergence. In figure 17 you can see a straight line representing the left-hand side of (61) and the curve depicting the nonlinear right-hand side. The iteration sequence starting at 0.1 needs 21 steps for convergence, x (0) = 0.1000000 x (1) = 0.6614957 x (2) = 0.3904740 x (3) = 0.5794596 x (4) = 0.4610527 .. . x (20) = 0.5107529 x (21) = 0.5108741. The accepted approximation (21) x = 0.5108741 is not far from the solution x( ∗) 5 = ln = 0.5108256. Although this looks promising, there are some 3 problems we should consider. 39 First, there is the possibility that (61) does not have a solution. This means that there is no intersection between the two curves in figure 17. In this case any iteration is irrelevant. Second, there could be more than one solution, that is, more than one intersection of the curve and the straight line. However, knowing that our equation set is describing a correct electrical circuit with a stable solution, we can usually assume exactly one solution to the equation set. An exception would be a circuit with multiple operating points, like a Schmitt trigger or a flip-flop, where several stable and unstable solutions exist. In these cases fixed point iteration would find only one of the stable solutions, depending on the starting point (0) x. So it is the responsibility of the engineer to ensure a correct circuit and manually supply a starting point if one is needed. A more serious problem is the question of convergence itself. 
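Before examining convergence more closely, the sequence above is easy to replay. Here is a minimal sketch of plain fixed point iteration on f(x) = x + g(x) for the example g(x) = 5/3 − e^x; the iteration cap is an arbitrary safeguard added for the illustration.

import math

def fixed_point(f, x0, eps=1e-4, max_iter=100):
    """Iterate x_{k+1} = f(x_k) until two consecutive iterates differ
    by less than eps, or give up after max_iter steps."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = f(x)
        if abs(x_new - x) < eps:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

f = lambda x: x + 5.0 / 3.0 - math.exp(x)     # f(x) = x + g(x)
x_approx, steps = fixed_point(f, 0.1)
print(x_approx, steps)            # ~0.5108, reached after a couple dozen slow steps
print(math.log(5.0 / 3.0))        # exact solution ln(5/3) for comparison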
It is well known that fixed point iteration works only if the absolute derivative of the curve at the intersection is less than unity, 0 (∗) 0 ∂f (x) f (x ) < 1 where f (x) = . (62) ∂x Moreover, the speed of convergence depends on the absolute derivative at the intersection. The closer it gets to zero, the faster the convergence. In our example we have the derivative 0 0 x f ( x ) = 1 + g ( x ) = 1 − e, which at the intersection is 0 (∗) 2 f ( x ) = − . This ensures convergence, as seen in figure 17, 3 even if the iteration sequence is a little slow. Let us now consider the function x g ( x ) = 5 − e. The solution of (61) now is (∗) x = ln(5). We still have the same expression for the derivative, but the gradient at the solution now is 0 (∗) f ( x) = −4, which is well outside the conver-gence interval [ (0) − 1 . . . 1]. Even if we start very close to the solution x = 1.55, our fixed point iteration now fails to find the solution, as seen from figure 18. 3.5 3.0 2.5 2.0 (0) (1) f ( x ) x 1.5 (0) x 1.0 f (1) ( x) 0.5 0 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 Figure 18: Fixed point iteration divergence. We have a clear divergence as a consequence of the steep intersection angle. 40 In electronic circuits we expect a multitude of highly nonlinear devices, so we clearly cannot hope for the basic fixed point algorithm to converge. We need an improved algorithm. 4.2 Newton–Raphson Algorithm Let us return to transformation (61) for a moment. We had to add x to both sides of the equation in order to get the right equation form for fixed point iteration. Keep in mind that the original equation set (58) reads g(x) = 0. So before adding x we can multiply both sides by any arbitrary function k(x) without altering the solution. The only condition is that k(x) be nonzero at the solution (∗) x. In general, our equation now is x = x + k(x)g(x). (63) So we have some freedom of choice regarding k(x). We can ensure conver- gence (62) and ideally even speed up the iteration sequence. The best thing would be to achieve a zero gradient at the point of intersection. Take the right-hand side of (63), f (x) = x + k(x)g(x), and formulate the first derivative 0 0 0 0 f ( x ) = 1 + k ( x ) g ( x ) + k ( x ) g ( x ). We want f(x) to be zero at x ( ∗) (∗) 0 (∗) (∗) 0 (∗) = x . Knowing that g ( x ) = 0, we get f ( x ) = 1 + k ( x ) g ( x) = 0, from which it follows that 0 −1 k ( x ) = − g ( x ). Equation (63) now becomes x = x − . (64) 0 g(x) g (x) With a properly defined circuit there is no danger that 0 g (x) could be zero at (∗) x = x, as this would mean multiple adjacent solutions. This modified equation ensures that fixed point iterations always converge; moreover, the convergence should be very fast. Applying fixed point iteration to this transformation is known as the Newton–Raphson algorithm. Before continuing, let us see how Newton–Raphson iterations will solve our problem from the previous section, x g ( x ) = 5 − e. After the transformation we have to solve −x (0) x = x − 1 + 5 e . Assuming the starting point x = 1.1 and the usual convergence criterion −4 = 10, we only need a sequence of four steps, x (0) = 1.100000 x (1) = 1.764355 x (2) = 1.620841 x (3) = 1.609503 x (4) = 1.609438. As expected, we can see the zero gradient intersection in figure 19 guaran-teeing convergence and speeding up things nicely. But there is always a price to pay. Instead of solving the simple equation g(x) = 0 we need to compute the complex expression (64) in each iteration. 
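A minimal sketch of what that per-iteration computation looks like, replayed on the example g(x) = 5 − e^x from above, is given below; the stopping rule and the iteration cap are arbitrary choices made for the illustration.

import math

def newton_raphson(g, dg, x0, eps=1e-4, max_iter=50):
    """Iterate x_{k+1} = x_k - g(x_k)/g'(x_k), i.e. equation (64)."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = x - g(x) / dg(x)
        if abs(x_new - x) < eps:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

g  = lambda x: 5.0 - math.exp(x)
dg = lambda x: -math.exp(x)          # the derivative must be supplied as well

x_approx, steps = newton_raphson(g, dg, 1.1)
print(x_approx, steps)               # ~1.609438 in only four iterations
print(math.log(5.0))                 # exact solution ln(5)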
The expression involves the inverse of the first derivative of g(x). So we are looking at a considerable additional computational effort, but we gain shorter iteration loops compensating the extra effort. Most importantly, we now have a guarantee for convergence. Or do we? We have ensured a zero gradient, but this holds only in the close neighborhood of 41 1.9 1.8 (0) (1) f ( x ) x 1.7 1.6 (1) f ( x) 1.5 1.4 1.3 x(0) 1.2 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0 Figure 19: Newton–Raphson iteration convergence. the solution. In other words, as long as the starting point (0) x is close enough to the solution (∗) x, the convergence really is guaranteed. However, the right-hand side of (64) might not be well behaved outside the immediate neighborhood of the solution. In these cases Newton–Raphson iterations still can experience convergence problems. In real circuit simulations this is not a very likely scenario but is by no means unheard of. Circuits with strong feedback loops, for instance, tend to diverge, so it is very important to have methods which bring the iterations close enough to the region where the Newton–Raphson algorithm is convergent. Finally, let us discuss a special situation where the multiplier k(x) introduces additional solutions to 0 −1 k ( x ) g ( x ) = 0. We have selected k ( x ) = − g ( x ). So we are concerned by the fact that at some x the derivative becomes infinite. This would introduce a new and unrealistic solution. However, an infinite gradient could only be caused by a nonlinearity in (58) having zero conductance at some voltage. This is something that we can control on the circuit level, so we do not really have to worry about it here. Now we are finally ready to apply Newton–Raphson iteration (64) to our equation set (58). So far we have discussed iterative algorithms for one-dimensional problems. We need to extend them to spaces of arbitrary dimensionality. In-stead of a single real number, we now consider x an n-dimensional column vector of variables x 1, x2 . . . xn and g(x) a respective column vector of n nonlinear func-tions g1(x1, x2 . . . xn), g2(x1, x2 . . . xn) . . . gn(x1, x2 . . . xn). Our first derivative of 0 g(x) becomes a square matrix of all partial derivatives, also known as the Jacobi matrix J(x), 42  ∂g1(x) ∂g1(x) ∂g1(x)  . . . ∂x1 ∂x2 ∂xn g 0  ∂g2(x) ∂g2(x) ∂g2(x)  . . .  ∂x 1 ∂x 2 ∂x n  ( x ) = J ( x ) =   . (65) ..  .. ..  . . .   ∂g n(x) ∂gn(x) ∂g (x) n . . . ∂x1 ∂x2 ∂xn Let us substitute the general expressions for the Jacobi matrix with our circuit tableau notation, where    T  v A v − v n n b x = vb  and g(x) =  Aib  . (66) ib R(vb, ib) In general, the Jacobi matrix appears in the following form:  T T T ∂ ( A v n − v b ) ∂ ( A v n − v b ) ∂ ( Avn−v  b ) g0 ∂(Ai  ∂v n ∂vb ∂ib (x) = b) ∂(Aib) ∂(Aib)  . (67) ∂  v n ∂vb ∂ib  ∂R(vb,ib) ∂R(vb,ib) ∂R(vb,ib) ∂v n ∂vb ∂ib We can partially simplify this expression, because the topological part of the circuit tableau is linear. We get  T  A − I 0 g0(x) = 0 0 A . (68)   0 ∂R(v ,i ) ,i ) b b ∂ R ( v b b ∂vb ∂ib This expression is very similar to a linear equation set as in (8). Actu-ally, we are looking at the linear equivalent of our nonlinear circuit, where all nonlinearities are replaced by gradients at a specific operating point vb, ib. Unfortunately, we will have to compute the inverse of the Jacobi matrix if we want to use the Newton–Raphson iteration form (64). The only alternative is to solve a linear equation set instead. 
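In several dimensions that is exactly how it is done in practice: rather than inverting the Jacobi matrix, every iteration solves one linear equation set for the update step. Here is a minimal sketch on an arbitrary two-dimensional test system (not a circuit), with numpy's linear solver standing in for the LU machinery of Section 3.

import numpy as np

def newton_raphson_nd(g, jacobian, x0, eps=1e-9, max_iter=50):
    """Multidimensional Newton-Raphson: in each iteration solve
    J(x_k) * dx = -g(x_k) and set x_{k+1} = x_k + dx."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(jacobian(x), -g(x))
        x = x + dx
        if np.max(np.abs(dx)) < eps:
            return x
    raise RuntimeError("no convergence within max_iter iterations")

# arbitrary test system: g1 = x0^2 + x1 - 3, g2 = x0 - exp(x1) + 1
g = lambda x: np.array([x[0]**2 + x[1] - 3.0,
                        x[0] - np.exp(x[1]) + 1.0])
jacobian = lambda x: np.array([[2.0 * x[0], 1.0],
                               [1.0, -np.exp(x[1])]])

print(newton_raphson_nd(g, jacobian, [1.0, 1.0]))   # converges in a few steps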
Let us first apply the iterative algorithm to (64), x(k+1) (k) 0 (k) −1 (k) = x − [ g ( x )] g ( x). (69) Let us multiply both sides of this equation by 0 g(x). We have to be careful because 0 0 g ( x ) is a square matrix, so we must premultiply by g(x), g0 (k) (k+1) 0 (k) (k) (k) ( x ) x = g ( x ) x − g ( x). (70) The result looks a bit confusing, because the left-hand side now includes the variable vector from the current iteration (k) (k+1) x as well as the new one x. Keep in mind that the latter is to be computed from (70). We are actually looking at a linear equation set of the type Gx = c. The entire right-hand side is known from iteration k and so is the Jacobi matrix on the left-hand side. From (70), (66), and (68) we have  (k) T  A −  (k+1) (k)  T I 0 v n A −   (k)  T (k) I 0 v n A v n − v b    b     b   b  0 0 A v = 0 0 A v − Ai . 0 ∂R ∂R ∂R ∂R i 0 i R ∂ b b v b ∂ i b ∂ v b ∂ i b (71) 43 For simplicity, we have dropped the explicit functional dependence of R on vb, ib. For the same reason, the iteration superscripts are symbolically placed outside the matrices rather than at each variable instance. We can further resolve part of the right-hand side, so that equation (71) simplifies to  (k) T   A − (k+1) (k)   I 0 v 0 n    b    0 0 A v = 0 . (72) 0 ∂R ∂R ∂R ∂R i v + i − R ∂ b b b v b ∂ i b ∂ v b ∂ i b The algorithm of (72) resembles our well-known linear equation set in (8). The topological part is identical, but the branch equations are obviously a lin-earization at the point of the previous iteration. This can be illustrated sepa-rately as ∂ v + i = v + b b b ib − R. (73) v ∂R (k+1) ∂R (k+1) ∂R ∂R b b b b ∂ i ∂ v ∂ i All expressions above originate from the k-th iteration except the variable vectors vb and ib marked with (k + 1). The expression clearly describes a 2b dimensional plane touching the branch curves at the point where (k+1) (k) v = v b b and (k+1) (k) i = i. The plane is tilted in the same direction as the curve at the b b point of inflection. In other words, we have a tangential plane at (k) (k) R ( v , i). b b ib (mA) 30 20 R (k) (k) ( v , i) b b 10 R (k+1) (k+1) ( v , i) b b 0 -10 600 610 620 630 640 650 660 670 680 vb (mV) Figure 20: Newton–Raphson equivalent device. To illustrate this point, consider a simple diode with the branch relation R(v b, ib) = Is(e vt −14 vb − 1) − ib = 0 where I s = 10 A and vt = 26mV. In figure 20 this nonlinear relation is depicted by a curve. Substituting R in equation ( I vb (k+1) (k+1) s 73 ) with the diode expression, we get a straight line e vt · v − i = v t b b I vb ( vk+1) b vt s e ( − 1) + I s touching R ( v b , i b ) = 0 at v = vb. v t b 44 Suppose the branch voltage in iteration k is vb = 0.67V. We get a line representing the diode as a linear element for the duration of iteration k + 1. After solving the linear equation set (72), we get a solution somewhere on that line. Suppose we have obtained v b = 0.63V. The next iteration is going to be based on a linearization at that point. The iterative process will continue in this manner until at some point the iteration falls very close to the previous one for all nonlinear branches. In this case the approximation of the nonlinearity by a straight line is justified, and the result of the linear equation set is considered to be sufficiently close to the solution of the underlying nonlinear equation set. In this section we have seen that the Newton–Raphson algorithm very effi-ciently solves nonlinear circuit equations. 
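For the diode of figure 20, the linearization (73) can be written out directly: each iteration replaces the diode by its tangent at the previous branch voltage, i.e. a conductance in parallel with an independent current source. The sketch below uses the same Is and vt as above; the function and variable names are mine, not SPICE OPUS internals.

import math

I_S = 1e-14      # diode saturation current (A), as in the example
V_T = 0.026      # thermal voltage (V)

def diode_companion(v_prev):
    """Linearize i = I_S*(exp(v/V_T) - 1) around v_prev.
    Returns (g_eq, i_eq) so that, for iteration k+1, the branch is
    modelled by i = g_eq * v + i_eq, i.e. a conductance in parallel
    with an independent current source."""
    e = math.exp(v_prev / V_T)
    g_eq = I_S * e / V_T              # tangent slope di/dv at v_prev
    i_d  = I_S * (e - 1.0)            # diode current at v_prev
    i_eq = i_d - g_eq * v_prev        # shifts the tangent so that it passes
                                      # through the point (v_prev, i_d)
    return g_eq, i_eq

# one linearization step at v_b = 0.67 V, as in the text
g_eq, i_eq = diode_companion(0.67)
print(g_eq, i_eq)   # this linear companion enters the next modified nodal equation set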
Moreover, the algoritm can be applied at the device level, as illustrated in figure 21. Newton–Raphson algorithm nonlinear initial resistive variable network vector linear resistive network set up modified nodal equation set LU decomposition no final yes conver- variable variable gence? vector vector Figure 21: Newton–Raphson algorithm. Before any circuit equations are formulated, all branches containing nonlin- ear devices are transformed according to (73); that is, the nonlinear relations are replaced with respective tangents. In terms of devices, the tangent is realized by the parallel combination of an independent current source and a conductance. After replacing all nonlinearities, we obtain a linear resistive network, for which an equation set is formulated, as described in Section 2, and solved exactly, as explained in Section 3. Although we have not yet discussed dynamic devices, we can now consider the operating point and the DC sweep analysis in SPICE OPUS. In both cases we are interested in steady state circuit solutions. By definition, after all tran-sients die out, the voltages across inductors and the currents through capacitors 45 become equal to zero. Consequently, branches containing capacitances are re-placed with independent zero current sources while inductances are replaced with independent zero voltage sources. In this way, we get a nonlinear resistive network which can be solved by the Newton–Raphson algorithm, as depicted in figure 22. OP analysis nonlinear network dynamic ib = 0 v b = 0 nonlinear resistive network variable Newton–Raphson vector init=0 Figure 22: Operating point analysis. This is the program flow on SPICE OPUS when running the operating point analysis. The DC sweep is also very similar. It consists of a series of operating point calculations while one or two sources are varied. The individual steps in the sweep are theoretically independent operating point anlyses, but SPICE OPUS takes advantage of the fact that adjacent steps in the sweep should pro-duce adjacent operating point solutions. So the solution of each sweep step is wisely used as the initial variable vector in figure 22 for the next Newton– Raphson iteration. In other words, the initial iteration k = 0 in (73) is the result of the previous run. In this way the number of Newton–Raphson iterations is greatly reduced. Formally, the program flow of a DC sweep analysis is shown in figure 23. 46 OP sweep nonlinear network dynamic ib = 0 v b = 0 step source resistive nonlinear network Newton–Raphson init=previous no variable yes sweep variable vectors done? vector Figure 23: DC sweep analysis. So far we have succeeded in solving nonlinear circuits by iteratively solving a linearized transformation until the variable vectors of two consecutive iterations differ by less than a predefined value. A number of open questions still remain, like how to determine the initial Newton–Raphson iteration, how to detect convergence or the lack thereof, and what to do in cases of divergence. 4.3 Convergence Detection We have already mentioned that the Newton–Raphson algorithm is stopped if two consecutive iterations in (72) are close enough. In fact, SPICE OPUS employs a more advanced convergence detection mechanism. There are three simulator parameters involved in the convergence detection cirteria with the default values −12 −3 abstol = 10 , reltol = 10, and vntol = 10−6. SPICE OPUS actually checks not only the difference between iteration k and k + 1 but also the one before, k − 1. 
The latest iteration pair must satisfy | (k+1) (k) (k+1) (k) v − v | ≤ reltol · max( | v | , | v|) + vntol ni ni ni ni | (k+1) (k) (k+1) (k) i − i | ≤ reltol · max( | i | , | i|) + abstol. (74) b1i b1i b1i b1i All nodal voltage differences must be small enough in a relative as well as an absolute sense. Moreover, all current defined branches (31) must fulfill the same criteria. The same applies for the previous iteration pair, | (k) (k−1) (k) (k−1) v ni n reltol − v · max( | v | ≤ | , | v|) + vntol i i i n n | (k) (k−1) (k) (k−1) i − i | ≤ reltol · max( | i | , | i|) + abstol . (75) b1i b1i b1i b1i 47 After all the above criteria are met, there still is one more check left. The three points form a triangle in a multidimensional space. We require the angle at the second point to be less or equal to 90◦. This assures that the iteration series is in a sort of rotation around the solution, as illustrated in figure 24. x2 x(k−1) x(k) x(k+1) 0 0 x 1 Figure 24: Newton–Raphson convergence detection. Formally, Pythagoras theorem can be used to ensure an acute angle at (k) x q | (k+1) (k −1) (k) (k−1) (k+1) (k) 2 v − v | ≤ | v | | v −|2 n n n − v n + n v n (k+1) (k − q 1) (k) (k −1) (k+1) (k) 2 | 2 i − i | ≤ | i − i | + | i − i |. (76) b1 b1 b1 b1 b1 b1 This complex convergence detection scheme is necessary, because sometimes two adjacent iteration points can be very close to each other only to drift apart again. By defining the simulator parameter noconviter one can switch off the double convergence check, but without a very special reason the user should not change any of the simulator parameters that control convergence. Basically, there are two things that can go wrong in connection with conver-gence. Although we have chosen the very stable and robust Newton–Raphson iteration algorithm in Section 4.2, there still is no guarantee of convergence. Large numbers of transistors with their exponential characteristics, intercon-nected in a certain way, can cause havoc in any iteration algorithm. The itera-tions either oscillate instead of zooming in on the solution or they even diverge. In both cases the maximum iteration limit is exceeded itl1 = 100 with still no convergence detected. The other problem arises when at some iteration in Figure 21 the LU de-composition fails. This can happen for a number of reasons, e.g., diverging iterations can reach absurd numerical values. Whatever the reason, the alarm always goes off in the pivoting algorithm when at some step it cannot find a single coefficient that would satisfy (56). 48 But this is not the end of the Newton–Raphson iteration algorithm in SPICE OPUS; it is rather the beginning of a whole series of tricks. The convergence helpers described in the following section are among the most important quality marks of any circuit simulator. 4.4 Automatic Convergence Helpers SPICE OPUS does not give up easily. There are six tricks built into SPICE OPUS, and all of them are activated automatically. Normally the user does not have to intervene, and the average user does not even have to know about them. All six automatic convergence helpers rely on two assumptions. First, at least one solution of the circuit must exist. Obviously, no convergence helper can find a nonexisting solution. A circuit with no solution is not unheard of. Very often an erroneous netlist causes singularities; for instance, two parallel voltage sources, or an ideal transformer generating a floating subcircuit. In these cases all convergence helpers will fail. 
The second assumption is that the proximity of the solution is convergent for the Newton–Raphson algorithm. In other words, if we can find an initial vector k = 0 for (72) which is close enough to the final solution, the iteration series will converge. So, it’s really all about finding that crucial proximity. We know that (64) is guaranteeing a zero gradient 0 (∗) f ( x) = 0 at the solution, but that does not mean that f (x) can’t be very steep further away from the solution as illustrated in figure 25. y f (0) (1) ( x ) x x (2) (1) f ( x) x (0) 0 0 x Figure 25: Newton–Raphson convergence. The perpendicular gray line marks the Fixed Point Iteration convergence limit as defined by (62). Any iterations venturing outside the intersection of f (0) ( x ) and the gray line risk spiraling into divergence. The inital x is just 49 barely squeaking by. However, just a little more to the left and the goose is cooked as in figure 26. y (0) x(1) f (x ) x (2) (1) f ( x) x(0) 0 0 x Figure 26: Newton–Raphson divergence. Normally, Newton–Raphson iterations start from a zero vector. Whenever divergent iterations are detected, convergence helper algorithms are used to find a suitable starting point. Junction Voltage Limitation This is a very crude measure that prevents Newton–Raphson iterations from running amok. Especially in the beginning of an iteration series, the exponential nature of p-n junctions can cause extreme numerical values going right to the limit of floating point numbers. Sometimes the limit is even violated, causing exceptions in the mathematical package. In order to prevent these rare occasions, SPICE OPUS sets an upper limit on voltages across a p-n junction regardless of their polarity. By changing the simulator parameter 30 voltagelimit , the default limit of 10V can even be overridden. So after each new iteration in ( (k+1) 72 ), the branch voltages v containing b p-n junctions are checked. Any value above or below voltagelimit is reset to the respective limit,  (k+1) (k+1) v ; ∀ | v| ≤ (  b i bi k voltagelimit v +1) (k+1) = (77) bi bi voltagelimit − ; ∀ v < −voltagelimit  (k+1)  voltagelimit ; ∀ v n > voltagelimit. i Junction voltage limitation not only prevents numerical exceptions—in most cases it also speeds up convergence by limiting the iteration space. Normally the default value of voltagelimit does not need to be changed. 50 Damping Newton–Raphson Steps As we have mentioned, one of the things that can prevent convergence is an os- cillating iteration sequence as illustrated in figure 26. Instead of quickly zooming in on the solution, the iterations start oscillating around it. A major problem is the detection of oscillations as they are not necessarily periodic. Also note that we are looking for oscillating iterations in (72), where each vector has 2b + n components. In the worst scenario, we need to detect a stochastic limit cycle in a 2b + n dimensional space. This is not a trivial problem, and we do not want to spend more computational effort on the detection of oscillations than on solving the problem itself. Let us put aside the question of detection and look at the solution first. An oscillation can also be seen as an infinite series of iterates in the proximity of the solution. Intuitively, one would limit the iteration steps. This approach is termed the damped Newton–Raphson algorithm, whatever limiting strategy is employed. In SPICE OPUS the simulator parameter sollim (default value is 10) reg-ulates the damping of iteration steps. 
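Before turning to the damping details, the junction voltage limitation (77) itself is a one-line clamp per branch. A minimal sketch is given below, assuming the p-n junction branch voltages of the new iteration are collected in a vector; the numeric limit passed in the usage line is only a placeholder, not necessarily the simulator default.

import numpy as np

def limit_junction_voltages(v_junction, voltagelimit):
    """Clamp the freshly computed junction branch voltages of iteration k+1
    to the range [-voltagelimit, +voltagelimit], as in (77)."""
    return np.clip(v_junction, -voltagelimit, voltagelimit)

# usage: an early, runaway iteration is pulled back to the limit
v_new = np.array([0.65, 87.0, -4.2e3])
print(limit_junction_voltages(v_new, 30.0))    # placeholder limit of 30 V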
The damping algorithm is based on the same voltage and current tolerance thresholds as the convergence detection. We first define the thresholds for iteration k + 1 (note that i is the index of a component in the solution vector vn, ib1), v (k+1) (k+1) (k) = si reltol · max( | v | , | v|) + ni vntol ni i(k+1) (k+1) (k) s i reltol = · max( | i | , | i|) + abstol. (78) b1i b1i With damping turned on, the Newton–Raphson algorithm still runs exactly as defined in (72). However, before feeding the variable vector back to the netlist, all variables which have stepped over their respective threshold (78) are limited. The Newton–Raphson flow now has one more transformation in the loop, as can be seen in figure 27. Actually, the steps are limited to a fraction determined by sollim of the respective threshold. The damping of voltages is expressed in the following equation:  n i ni ni si  v  (k+1) (k+1) (k) (k+1) ; ∀ |v − v | ≤ v v(  (k+1) v k +1) ( k ) (k+1) (k) (k+1) si = n v < i n i − ; ∀ v n i − v n i − v s (79) i sollim  (k+1)  ( k ) v (k+1) (k) (k+1)  si v n i + ; ∀ v n i − v n i > v s i. sollim There are three possibilities. If the variable step was already within its convergence tolerance, nothing happens. If the new nodal voltage (k+1) v is n i lower than its predecessor (k) v ni by more than the threshold, then this step is reduced to a fraction of the threshold. An excessively long positive voltage step is limited accordingly. Similarly, all branch currents that occur in the set of modified nodal equa-tions are damped,  b s 1 b 1 b 1  i i ii (  (k+1) (k+1) (k) (k+1) i ; ∀ | i − i | ≤ i i (  k+1) i k +1) ( k ) si (k+1) (k) (k+1) = i − ; ∀ i − i < − i b 1 s (80) i b 1 i sollim b 1 i b 1 i i  (k+1)  ( k ) i (k+1) (k) (k+1)  si i + ; ∀ i − i > i s. b 1 i sollim b 1 i b 1 i i 51 Newton–Raphson algorithm nonlinear ... damping steps initial resistive variable network vector damp linear variable resistive vector network set up modified nodal equation set LU decomposition no variable final yes conver- variable gence? vector vector Figure 27: Newton–Raphson iterations with damped steps. With a default damping fraction of 10, it is clear that the damped steps will be extremely tiny. This ensures robust convergence at the expense of many more iterations. The normal iteration limit itl1 with its default value of 100 would thus be exceeded immediately, so SPICE OPUS has a multiplier simulator parameter sollimiter with the default value 10. Therefore, whenever damping is turned on, the iteration limit actually becomes itl1 · sollimiter. The damping factor sollim, which is actually defining a fraction of the threshold, would normally be greater than one. However, SPICE OPUS also accepts values smaller than one, in which case the steps are limited to more than the tolerance threshold. In this case the damped step is additionally limited by the original step. The idea is to reduce the original Newton–Raphson steps rather than enlarging them. As we shall see in the following sections, the damped Newton–Raphson al-gorithm is used when the GMIN stepping and source value stepping algorithms fail to produce a solution. This removes the need for a sophisticated oscillation detection algorithm and works well in practice. GMIN Stepping The idea behind this convergence helper is strikingly simple and efficient. SPICE OPUS adds a conductance between each node and ground. In this way conver-gence is greatly improved, at the expense of not solving the original circuit, but rather a transformed version. 
Intuitively, it is clear that any circuit which causes problems for the Newton–Raphson algorithm can be tamed by adding 52 large conductances between the nodes. Formally this can be verified. As discussed in Section 3.1, an arbitrary admittance yk between nodes p and m contributes to four coefficients in the sparse equation set (37), namely to the diagonal gp,p = +yk, gm,m = +yk and the anti-diagonal g p,m = −yk , gm,p = −yk. In our case one terminal is always the ground node, m = 0. Because the reference node is reduced from the incidence matrix, only one contribution is left on the diagonal, g p,p = +yk. By adding a huge conductance yk to each node, we get a diagonal with large terms in the equation set. However, a large and linear diagonal guarantees a solution. We have sure convergence because the linearity outweighs all nonlinear problems and the strong diagonal ensures perfect pivots for (56). The situation is illustrated in figure 28. y g gmin g > g min f (x) g = gmin g = 0 0 0 x Figure 28: Newton–Raphson GMIN stepping. With large enough conductances we always get a solution. However, this is the solution of the transformed circuit, whereas we want the solution of the orig-inal circuit. The trick is to gradually fade out the additional conductances (and that is where the name GMIN stepping comes from). In each step the conduc-tances are reduced and the Newton–Raphson algorithm is run from the previous solution. Eventually, the conductances are stepped down to zero, rendering the original circuit. This procedure, called GMIN stepping, is very effective al-though also very computationally intensive. Note that each step involves one complete Newton–Raphson run. So far we have conveyed just the general idea. Now let us be more specific. There are three simulator parameters with respective default values controlling this process in SPICE OPUS: the minimal conductance for the AC and TRAN analysis, −12 gmin = 10, the minimal conductance for DC analysis, gmindc = 10−12, and the maximal number of GMIN steps before giving up, gminsteps = 10. The simplest way would be to start with an extremely large conductance g 53 and then step it down gradually. However, it is uneconomical to start down-stepping from a needlessly high value. Therefore, SPICE OPUS starts with the default minimal conductance g min and steps it up in huge three decade increments until convergence is reached. In the second part the conductance is gradually stepped down, always starting the Newton–Raphson iteration from the previous solution. If everything works out, the conductance g eventually falls below gmin and is finally completely removed. The three steps are illustrated in figure 29. Gmin stepping nonlinear add g = gmin from network resistive each node to ground n = 1, s = 2 damp = 0 Gmin up-stepping variable vector Gmin down-stepping variable vector variable vector Gmin final step fail! Figure 29: GMIN stepping algorithm. The up-stepping part is simple. SPICE OPUS starts with g = gmin and multiples it by 1000 for each unsuccessful Newton–Raphson run, as seen in figure 30. The overall step counter n is initialized at the beginning and is limiting the total number of Newton–Raphson runs to the predefined value nmax = gminsteps. Whenever this number is reached, the entire GMIN stepping pro-cedure is considered to have failed. The up-stepping is fast and should take only a few runs. However, singular circuits are bound to fail already during the up-stepping part. Given a successful up-stepping, we have our first solution. 
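In outline, the up-stepping just described might look as follows. Here run_newton_raphson(g, x0) stands for one complete Newton–Raphson run on the circuit with the extra conductance g added from every node to ground; it is an assumed placeholder, and the structure only mirrors the flowchart of figure 30, not the actual SPICE OPUS implementation.

def gmin_up_stepping(run_newton_raphson, gmin=1e-12, gminsteps=10):
    """Raise the added conductance g in three-decade steps until one
    Newton-Raphson run converges (figure 30).  run_newton_raphson(g, x0)
    is assumed to return (success, solution)."""
    g = gmin
    for n in range(1, gminsteps + 1):       # in the full procedure this counter
        success, solution = run_newton_raphson(g, x0=None)   # is shared with the
        if success:                                          # down-stepping part
            return g, solution, n           # hand over to the down-stepping part
        g *= 1000.0                         # three decades per failed run
    raise RuntimeError("GMIN up-stepping failed: circuit is probably singular")

The down-stepping part described next would then reduce g again, restarting each Newton–Raphson run from the previous solution and adapting its step size.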
From this point on we need to step-down the conductance. This process is fragile because the individual steps always need to be close enough not to break convergence. The algorithm is depicted in figure 31. Each successful step increases the stepping fraction s, speeding up the down-stepping. However, a too large step is likely to break the convergence chain, so the algorithm has to revert to the previous conductance g = gp and accordingly reduce s. This adaptive step sizing is computationally very efficient, but can lead to extremely small step sizes, in which case Newton–Raphson damping is 54 Gmin up stepping nonlinear resistive Newton–Raphson network init=0 g = g ∗ 1000 vector variable yes success? g = g p n = n + 1 no fail! n < n no yes max Figure 30: Up-stepping part of GMIN stepping. Gmin down stepping variable g = gp/s vector damp = 1 n = n + 1 no increase s s too fail! n < nmax g small? p = g damp = 0 yes yes reduce s vector variable no no g p > g min success? yes Newton–Raphson init=previous Figure 31: Down-stepping part of GMIN stepping. turned on automatically. The fact that at some point a very small step in g is causing a loss of convergence does not necessarily mean that the Newton–Raphson algorithm is experiencing oscillating iterations! Putting the Newton–Raphson algorithm in damping mode is based on pure heuristic reasoning. It just turns out that this measure very often works. In the case of a successful down-stepping without exceeding the maximal number of Newton–Raphson iterations, we have convergence for some g < gmin. With the default value of −12 g = 10 this is almost our original circuit. Only min a very small resistive network is still superimposed on each node. So the final part of the algorithm is to completely remove g from the network and do one 55 final Newton–Raphson series starting from the last known solution, as shown in figure 32. Gmin final step variable vector g = 0 Newton–Raphson init=previous vector variable yes success? no fail! Figure 32: Final part of GMIN stepping. One could argue that this part is not necessary as the difference between having −12 g = 10 and g = 0 is irrelevant. With normal circuits this is the case, but some singularities in the circuits can prevent convergence in this very last part. For instance, suppose that there is no DC path from some node to ground. By removing the smallest of g, we are facing a dangling node with a theoretically undeterminable voltage. Looking back at the entire GMIN stepping procedure in figure 29, it is ob-vious that the default maximal Newton–Raphson count of gminsteps = 10 is rather low. In fact, the default is only meant to do a quick trial of GMIN step-ping, producing successful results only for relatively simple circuits. In difficult cases GMIN stepping will run out of steps and will report this to the user. In this situation one should first review the circuit for systematic singularities be-fore increasing gminsteps because the resulting GMIN stepping runs may take a long time to complete. 56 5 Frequency-Domain Analysis There is one more step we need to make towards simulating real circuits. We have to find a way to include dynamic devices. We will not worry about non-linear energy storing devices as they can be represented by a combination of a nonlinear controlled source and either a linear capacitor or inductor. So we are really talking about introducing simple linear capacitance and inductances into branch relations (5). 
There are two ways to do this, either in the frequency or the time domain, each having its respective advantages and disadvantages. 5.1 Small Signal (AC) Analysis Small signal response analysis (also referred to as frequency-domain analysis, AC analysis, or AC sweep), is based on the Fourier transformation of the circuit’s equations. Unfortunately, we have to assume either linear circuits or small signals. As we are primarily interested in nonlinear circuits, we will first discuss the notion of small signals. Observe any nonlinear analog circuit in the steady state. From previous sections we know how to obtain an operating point solution. Now imagine that some independent source between two nodes starts oscillating with an arbitrary frequency but with an extremely small magnitude. Unless the circuit has some discontinuity in the immediate proximity of the operating point, this tiny os-cillation will propagate linearly through the entire circuit. In other words, the circuit can be considered linear in the neighborhood of the operating point, provided that the neighborhood is small enough. With this assumption small signal response analysis needs to calculate only the ratios of magnitudes and the phase delays to completely describe any sinusoidal signal. This is done with complex arithmetic, where all resistive devices are rep-resented by admittances equal to the gradients in the operating point. These gradients are already computed as part of each Newton–Raphson iteration, as shown in figure 20. Capacitances and inductances become imaginary functions of the frequency 1 y = jωC and y = , respectively. Consequently, indepen- c l jωL dent voltage sources become short circuits, and independent current sources are replaced by open circuits. With these replacements we are facing a simple linear resistive network, but not like that described by (37). The only differences are the parametric frequency dependence and the complex arithmetic. The latter is no problem; the procedure is exactly equivalent to that described in Section 3. The frequency dependence is usually solved by sweeping ω over a relevant interval. The program flow of the small signal analysis in SPICE OPUS is summarized in figure 33. New users often have misunderstandings regarding the AC analysis. We clarify some of them here. The linearized resistive network in figure 33 needs at least one AC excita- tion. So unless we have defined at least one AC source, the result will be zero, because the right-hand side of (37) is all zero. The AC excitation magnitude and phase basically are arbitrary, because we have a linear system where the superposition theorem applies. There- 57 nonlinear bias AC analysis dynamic OP analysis vector network g linear dynamic network ac jωC turn on 1 jωL linear resistive network step set up modified frequency nodal equation set LU decomposition no vectors variable yes sweep variable done? vector Figure 33: Small signal response analysis. fore, it makes sense to chose unit magnitude and zero phase. In this way the results can be directly interpreted as ratios. Do not be disturbed by SPICE OPUS reporting 1MV at the output of an operational amplifier. Remember to interpret this as the amplification gain of a unit excitation. You can specify more than just one AC excitation source, in which case all excitation sources are swept with the same frequency. Thus, any output will be the product of the combined excitation. 
New users often have misunderstandings regarding the AC analysis. We clarify some of them here.

The linearized resistive network in figure 33 needs at least one AC excitation. So unless we have defined at least one AC source, the result will be zero, because the right-hand side of (37) is all zero.

The AC excitation magnitude and phase are basically arbitrary, because we have a linear system where the superposition theorem applies. Therefore, it makes sense to choose unit magnitude and zero phase. In this way the results can be directly interpreted as ratios. Do not be disturbed by SPICE OPUS reporting 1 MV at the output of an operational amplifier. Remember to interpret this as the amplification gain of a unit excitation.

You can specify more than just one AC excitation source, in which case all excitation sources are swept with the same frequency. Thus, any output will be the result of the combined excitation. This calls into question the purpose of the entire analysis, but in some special cases one might make use of this possibility.

The unknown x's in (37) should be considered in polar form, where the absolute values signify the magnitude and the angle represents the relative phase shift. This puts some natural limits on the phase response calculation, because the complex number space only allows angles within a 2π radian range. On the other hand, especially with modern high-gain operational amplifiers, we experience phase shifts well outside the ±180° range. The small signal analysis will still deliver correct results; however, the phase will be wrapped. For instance, ϕ = 14° should actually be interpreted as ϕ = 14° + k · 360°, where k can be any integer, including zero. As circuit designers we can guess the value of k for each specific case. Nevertheless, there is the special NUTMEG command unwrap(), which attempts to make an intelligent guess of k.

Small signal analysis results depend heavily on the operating point of the circuit. Always double-check the resulting operating point before focusing on the AC analysis data. One could say that small signal response analysis reveals the differential properties of a circuit in the neighborhood of a specific operating point. With a different bias (resulting in a different operating point), the gradients of the characteristics of nonlinear resistive devices change, and the AC analysis might yield completely different results. Although the small signal analysis yields much information on the differential behavior of the circuit, it does not provide us with a picture of the circuit's nonlinear behavior. Nevertheless, the analysis is fast and almost never results in convergence problems. Convergence problems are usually the result of the initial operating point analysis that precedes the AC analysis.
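The phase wrapping discussed above can also be undone in post-processing outside the simulator. The sketch below uses numpy's unwrap function, which plays the same role as the NUTMEG unwrap() command; the sample data is made up purely for illustration.

```python
import numpy as np

# Wrapped phase samples (radians), such as an AC sweep of a high-gain amplifier
# might produce; the values are made up for illustration.
wrapped = np.array([-2.8, -3.1, 3.0, 2.6, 2.1, -3.0, -2.9])

# np.unwrap adds multiples of 2*pi wherever consecutive samples jump by more
# than pi, recovering a continuous phase trace, i.e. the same guess of k that
# the NUTMEG unwrap() command attempts to make.
continuous = np.unwrap(wrapped)
print(np.degrees(continuous))
```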
5.2 Small Signal (NOISE) Analysis

Small signal noise analysis is another small signal analysis derivative (also in the frequency domain), which is very important and heavily used in modern chip design. The goal is to find the total noise spectrum produced by a network of individual electronic devices. The noise level in electronic circuits puts a major constraint on the dynamic range, so small signal noise analysis is crucial to modern chip design. It is based on the knowledge of each individual noise source. In theory there are three noise sources: thermal noise from every resistance, as well as shot noise and flicker noise from every semiconductor device. Noise sources can depend on the frequency (e.g., flicker noise) and on the operating point (shot noise, flicker noise).

A noise signal v(t) can be represented by its power spectrum density S^+_{vv}(f). If the signal is a node voltage, the corresponding unit for the signal is V and the unit for the power spectrum density is V²/Hz. If the signal is filtered by an ideal bandpass filter with frequency range f_1 ≤ f ≤ f_2, a new noise signal v'(t) is obtained. By integrating the power spectrum density S^+_{vv}(f) from f_1 to f_2, the mean square of v'(t) is obtained,

lim_{τ→∞} (1/(2τ)) ∫_{−τ}^{τ} |v'(t)|² dt = ∫_{f_1}^{f_2} S^+_{vv}(f) df.    (81)

Every resistor and every semiconductor device is a source of noise. Because the noise is a small signal, it is assumed that the circuit is linear within the noise magnitude around the circuit's operating point. Therefore, the circuit can be linearized and an ordinary small signal response analysis applied to obtain the noise characteristics of the linearized circuit, as in figure 34.

Figure 34: The block diagram of the noise analysis.

The noise introduced by a device is represented by one or more noise current sources in the AC small signal model of the device. The phase of such noise current sources is 0, whereas their magnitude generally depends on the frequency and the circuit's operating point.

Figure 35: Resistor noise model.

Figure 35 depicts the noise model of a resistor, where k is the Boltzmann constant, T is the absolute temperature, g = 1/R is the conductance of the resistor, and S^+_{nn}(f) = 4kTg is the current noise power spectrum density (in A²/Hz) representing the thermal noise.

Generally, there are n_noise noise current sources in the circuit. Let S^+_{nn,i}(f) denote the power spectrum density of the i-th noise source. Because individual noise sources are uncorrelated, the power spectrum density of the output noise S^+_{out,out}(f) is the sum of the power spectrum density contributions from the individual noise sources,

S^+_{out,out}(f) = Σ_{i=1}^{n_noise} |A_i(f)|² S^+_{nn,i}(f),    (82)

where A_i(f) is the transfer function from the i-th noise source to the output at frequency f.

The equivalent input noise is defined via a noiseless circuit (one in which the circuit elements generate no noise). It is the noise signal that must enter a noiseless circuit at its input in order to produce the same output noise power spectrum density as one would get with a noisy circuit and the input signal equal to zero. The equivalent input noise power spectrum density is obtained by dividing the output noise power spectrum density by the squared magnitude of the circuit's transfer function from the input to the output. This transfer function can be obtained with an ordinary small signal response analysis.
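Equation (82) translates almost directly into code. The sketch below sums two made-up uncorrelated noise contributions (a resistor's thermal noise and a generic 1/f source) into an output power spectrum density and refers it back to the input. The transfer functions, element values, and noise levels are illustrative assumptions only.

```python
import numpy as np

k = 1.380649e-23                       # Boltzmann constant (J/K)
T = 300.0                              # absolute temperature (K)
g = 1e-3                               # 1 kOhm resistor -> conductance (S)
freqs = np.logspace(1, 6, 100)         # 10 Hz ... 1 MHz

# Made-up single-pole transfer functions from each noise source to the output.
A1 = 100.0 / (1 + 1j * freqs / 1e4)    # resistor noise source to output
A2 = 10.0 / (1 + 1j * freqs / 1e5)     # flicker-noise source to output
A_in = 100.0 / (1 + 1j * freqs / 1e4)  # signal input to output

S1 = 4 * k * T * g * np.ones_like(freqs)  # thermal noise PSD in A^2/Hz (figure 35)
S2 = 1e-24 / freqs                        # illustrative 1/f noise PSD in A^2/Hz

# Equation (82): uncorrelated sources add as power spectrum densities.
S_out = np.abs(A1) ** 2 * S1 + np.abs(A2) ** 2 * S2

# Equivalent input noise: output PSD divided by |input-to-output transfer|^2.
S_in_equiv = S_out / np.abs(A_in) ** 2
```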
6 Time-Domain (TRAN) Analysis

Time-domain analysis is the most computationally intensive approach, but also the most realistic one, in the sense that real circuits actually perform in the time domain. Besides the dynamic excitation from independent sources, it is the reactances that influence the transient behavior of any circuit. In other words, we need to find a way to supplement the branch equations (5) with equations for capacitors and inductors.

6.1 Backward Euler Integration

The time-domain models of a capacitor and an inductor are given by the following equations,

i_c(t) = C dv_c(t)/dt,    v_l(t) = L di_l(t)/dt.    (83)

A direct inclusion would turn the coefficient matrices Y and Z into operators (terms of the form d/dt would appear as matrix elements) and the linear equation set (8) into a set of ordinary differential equations. In order to avoid this, we use the trick of partitioning time into small steps. The basic idea can be considered a type of finite element approach to time-domain analysis. The simplest way is to assume constant currents through capacitors and constant voltages across inductors. First, let us examine the consequences for capacitors in figure 36.

Figure 36: Backward Euler integration.

We are observing the time interval [t_k, t_{k+1}]. Inside this time interval we assume a constant capacitor current,

i(t) = i(t_{k+1}).    (84)

By substituting this expression for the current in the integral form of (83) and solving the simple integral, we get

v(t) = (1/C) ∫ i(t) dt = (1/C) ∫ i(t_{k+1}) dt = (i(t_{k+1})/C) t.    (85)

As expected, the constant current through the capacitance causes a linear voltage ramp across it. However, we are not as interested in the transient inside the time slice as in the variable increments over the time step, that is, the difference between the variable values at times t_k and t_{k+1},

v(t_{k+1}) − v(t_k) = (i(t_{k+1})/C) (t_{k+1} − t_k).    (86)

The time step is usually referred to as h = t_{k+1} − t_k. With this substitution we get the following admittance form:

i(t_{k+1}) = (C/h) v(t_{k+1}) − (C/h) v(t_k).    (87)

This actually represents an admittance g_c = C/h in parallel with a constant current source i_c = −(C/h) v(t_k). So by knowing the circuit variables at t_k, we can compute the variables in the next time step t_{k+1} by solving a resistive network where each capacitor has been replaced by its respective g_c and i_c (figure 37).

Figure 37: The model of a capacitor in time-domain analysis.

Naturally, the procedure for inductances is exactly dual. We assume a constant voltage across the inductance,

v(t) = v(t_{k+1}).    (88)

Next we solve the basic integral,

i(t) = (1/L) ∫ v(t) dt = (1/L) ∫ v(t_{k+1}) dt = (v(t_{k+1})/L) t,    (89)

and formulate the current step,

i(t_{k+1}) − i(t_k) = (v(t_{k+1})/L) (t_{k+1} − t_k).    (90)

At this point the duality ends, because we need an admittance expression in both cases, so the final time-step equation for inductors is

i(t_{k+1}) = (h/L) v(t_{k+1}) + i(t_k).    (91)

In each time step, inductors are thus replaced by the admittance g_l = h/L in parallel with the independent current source i_l = i(t_k).

Figure 38: The model of an inductor in time-domain analysis.

In summary, we are able to compute the transient response of a dynamic nonlinear circuit by successively solving resistive nonlinear networks for each individual time step. Actually, we have just reinvented the well-known backward Euler integration algorithm. The program flow is illustrated in figure 39.

Figure 39: Time-domain transient analysis.

Each step k + 1 is computed based on the previous one, k, under the assumption of constant currents through capacitances and constant voltages across inductances. But what about the first step? We need some initialization for the integration algorithm. SPICE OPUS always performs an operating point analysis prior to any integration steps. The results of the operating point analysis are then used as the initial time step k = 0, as depicted in figure 39.

Before we go into any more details, let us review basic backward Euler integration on a very simple example. Consider the circuit in figure 40 with a 1 kΩ resistor, a 10 µF capacitor, and a 5 mA step excitation at t = 0.

Figure 40: Example circuit demonstrating integration algorithms (R = 1 kΩ, C = 10 µF, i(t) = 0 for t < 0 and i(t) = 5 mA for t ≥ 0).

The case is linear and simple enough that we can guess the transient. We expect a 5 mA surge in the capacitor current i_c(t), dying out exponentially, while the voltage across the capacitor v_c(t) rises exponentially toward 5 V.
The initial operating point is trivial, as the excitation at t = 0 is zero. The selected time step is intentionally large, h = 10 ms, which is actually too large for the circuit's time constant. In this way we exaggerate the inexact behavior of the integration algorithm. The observed time interval is 50 ms, so we have five integration steps besides the initial operating point.

The resulting current through the capacitor and the calculated voltage across it are displayed in figures 41 and 42, respectively. The integration steps are clearly marked. Each figure also shows the exact exponential solution for comparison. In both figures we have explicitly drawn the backward Euler approximations between the data points, that is, the rectangular current transient and its trapezoidal voltage integral.

Figure 41: Backward Euler integration, current through the capacitor.

Let us observe the current step function in figure 41. Because of the huge time step, it is way off the correct solution. The absolutely worst part is the beginning, because the assumption that nothing much is happening inside the time slice is violated most seriously at the beginning. This was to be expected, as we have a discontinuous excitation at t = 0. Unfortunately, this is a typical scenario in real circuits. So ideally, instead of a constant integration time step h we would rather have an intelligently adapting time step.

There is another interesting observation to be made in figure 41. Considering the fact that each integration step is numerically based on the results of the previous calculations (87) and (91), one would think that the calculation errors will have a cumulative nature. In other words, calculations based on erroneous data cannot be expected to produce results with improved accuracy. However, by looking at the current steps we observe the surprising fact that toward the end of the transient the integration algorithm comes much closer to the correct solution than it started out. This is by no means a rule, but neither is it very exceptional.

Figure 42: Backward Euler integration, voltage across the capacitor.

Turning our attention to the voltage transient in figure 42, we find one more typical effect. A rough glance at the characteristics reveals that the voltage transient is more accurate than its current counterpart. If we were observing an inductor instead of the capacitor, we would find the reverse situation. The reason for this is that SPICE controls the numerical error by keeping an eye on the circuit variables that are subject to numerical integration (voltages for capacitors and currents for inductors). Other variables may exhibit larger numerical errors.

With a smaller time step h, the presented backward Euler integration could produce useful results with real circuits. In SPICE OPUS backward Euler integration is selected by setting the method simulator parameter to trap and the maxord simulator parameter to 1. Besides backward Euler integration, SPICE OPUS also provides more advanced algorithms.
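The five backward Euler steps of figures 41 and 42 are easy to retrace. The sketch below applies the companion model (87) to the circuit of figure 40; it is a plain Python re-derivation for illustration, not SPICE OPUS code, and the exact exponential solution is included only for comparison.

```python
import math

R, C, I0 = 1e3, 10e-6, 5e-3       # figure 40: 1 kOhm, 10 uF, 5 mA step at t = 0
h, steps = 10e-3, 5               # intentionally large step, 50 ms interval
G = 1.0 / R

v = 0.0                           # initial operating point: excitation is zero
for k in range(1, steps + 1):
    # Backward Euler companion model (87): the capacitor becomes g_c = C/h in
    # parallel with a current source (C/h)*v_k; solve the one-node network.
    g_c = C / h
    v_next = (I0 + g_c * v) / (G + g_c)
    i_c = g_c * (v_next - v)      # capacitor current in this step
    v = v_next
    t = k * h
    v_exact = I0 * R * (1.0 - math.exp(-t / (R * C)))
    print(f"t = {t*1e3:4.0f} ms   v = {v:5.3f} V   (exact {v_exact:5.3f} V)   "
          f"i_c = {i_c*1e3:5.3f} mA")
```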
6.2 Trapezoidal Integration

For trapezoidal integration, instead of assuming constant currents through capacitors and constant voltages across inductors within a time slice, we assume respective linear functions. Instead of having to integrate rectangles, we must now integrate a piecewise linear function, with each segment having a trapezoidal form, as seen in figure 43.

Figure 43: Trapezoidal integration.

We will examine the situation for capacitors first. The expressions will be a bit more complicated this time, so we will simplify the notation. We drop the explicit functional dependence on time and use time-step indices directly on the current and voltage symbols. So instead of i(t), i(t_k), v(t_k) we will use i, i_k, v_k, respectively. Consequently, the constant current (84) is now replaced by the linear ramp

i = ((i_{k+1} − i_k)/(t_{k+1} − t_k)) (t − t_k) + i_k.    (92)

We obtain the corresponding voltage transient by integration, resulting in a parabolic characteristic,

v = (1/C) ∫ i dt = (1/C) [ ((i_{k+1} − i_k)/(t_{k+1} − t_k)) (t²/2 − t_k t) + i_k t ].    (93)

The expression is somewhat complicated, but as we are only interested in the voltage increment, we can calculate the difference v_{k+1} − v_k between time steps t_{k+1} and t_k using (93),

v_{k+1} − v_k = (1/(2C)) (t_{k+1} − t_k)(i_{k+1} + i_k).    (94)

We now resolve the expression into admittance form and substitute h for the time step,

i_{k+1} = (2C/h) v_{k+1} − ((2C/h) v_k + i_k).    (95)

Again we arrive at an admittance g_c = 2C/h in parallel with an independent current source i_c = −(2C/h) v(t_k) − i(t_k). Comparing (95) to (87), we can see that this time the current source depends not only on the voltage from the previous step but also involves the current. This is very important. As the trapezoidal integration algorithm proceeds through the time steps, it takes more information from the signal history than backward Euler integration.

The same can be observed with inductors. First we assume a linear voltage ramp across the inductor,

v = ((v_{k+1} − v_k)/(t_{k+1} − t_k)) (t − t_k) + v_k.    (96)

After integration we get

i = (1/L) ∫ v dt = (1/L) [ ((v_{k+1} − v_k)/(t_{k+1} − t_k)) (t²/2 − t_k t) + v_k t ].    (97)

The resulting current step now is

i_{k+1} − i_k = (1/(2L)) (t_{k+1} − t_k)(v_{k+1} + v_k),    (98)

and the admittance form with the h substitution finally yields

i_{k+1} = (h/(2L)) v_{k+1} + ((h/(2L)) v_k + i_k).    (99)

Inductors are thus replaced by the admittance g_l = h/(2L) in parallel with the independent current source i_l = (h/(2L)) v(t_k) + i(t_k). In comparison with (91) we again notice the additional dependence on v(t_k).

In conclusion, we expect the assumption of piecewise linear derivatives of the reactances to give a better approximation of real transients than the simplistic step model of backward Euler integration.

Figure 44: Trapezoidal integration, current through the capacitor.

Let us repeat the five integration steps for the example in figure 40. For comparison, we have plotted the trapezoidal current steps against the correct current transient as well as the results from backward Euler integration in figure 44. Interestingly, the first step is just as badly off the correct current surge as in backward Euler integration. Actually it is even worse, because the average current in the first time slice is even smaller than that of the backward Euler step. The remaining steps, on the other hand, catch up with the correct signal nicely. This becomes even more evident by comparing the voltage responses in figure 45.
Not only is the trapezoidal algorithm with its parabolic segments smoother than the backward Euler integration curve, it also approximates the correct voltage trace much better.

Figure 45: Trapezoidal integration, voltage across the capacitor.

The discontinuous nature of the circuit excitation does not match well with trapezoidal integration. The more an integration algorithm draws information from previous time slices, the worse it reacts to discontinuous phenomena. Unfortunately, in electrical engineering discontinuous signals are quite frequent.

SPICE OPUS uses trapezoidal integration if the method simulator parameter is set to trap and the maxord simulator parameter is set to 2. By default, SPICE OPUS uses trapezoidal integration. In everyday use this integration method has proved to be the most efficient. Backward Euler integration is used only when the simulator encounters a discontinuity or when the transient analysis is started. However, for special cases the user can manually switch to backward Euler integration.
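For completeness, here is the same five-step experiment with the trapezoidal companion model (95), again as an illustrative Python re-derivation in the spirit of figures 44 and 45 rather than simulator code. Note that the average current over the first time slice comes out even lower than with backward Euler, as discussed above, while the later voltage points track the exact curve closely.

```python
import math

R, C, I0 = 1e3, 10e-6, 5e-3       # the same circuit as in figure 40
h, steps = 10e-3, 5
G = 1.0 / R
g_c = 2.0 * C / h                 # trapezoidal companion admittance from (95)

v, i = 0.0, 0.0                   # voltage and capacitor current at t = 0
for k in range(1, steps + 1):
    # Companion model (95): i_{k+1} = g_c*v_{k+1} - (g_c*v_k + i_k).
    # Nodal equation of figure 40: G*v_{k+1} + i_{k+1} = I0.
    v_next = (I0 + g_c * v + i) / (G + g_c)
    i = g_c * v_next - (g_c * v + i)
    v = v_next
    t = k * h
    v_exact = I0 * R * (1.0 - math.exp(-t / (R * C)))
    print(f"t = {t*1e3:4.0f} ms   v = {v:5.3f} V   (exact {v_exact:5.3f} V)   "
          f"i_c = {i*1e3:5.3f} mA")
```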