Feb 27, 2011

PLL - Just another control system

A PLL is just another control system, one that works in the phase domain: it tries to make the phase of the output track that of the input, hence the name phase-locked loop. Viewed this way, you can answer many fundamental questions about the usage and properties of a PLL. For instance, what happens to the step response? Why is (or isn't) a PLL stable? Why do you need a zero in a Gardner-type PLL? And so on.

So how does one go about designing a PLL, whether in IC form or discrete?
a) Get your system-level specifications. What is your reference frequency? What is your output frequency? What is the application (RF, serdes, etc.)?
b) From there you can derive your system parameters: bandwidth and constraints (power, area, etc.)
c) This gives you an idea of what architecture you need for the sub-blocks. For example, for the VCO: ring oscillator or LC, and which sub-type (quadrature, varactor-based, etc.)
d) Follow up by designing these components, doing block-level simulations, and finally system-level simulations for specifications such as power and jitter. And lots of iterations.
Simple, right :-)
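To make step (b) concrete, here is a first-pass sketch of the standard second-order charge-pump loop equations. All the component values below are made up purely for illustration; plug in your own.

```python
import math

def pll_loop_params(icp, kvco_hz_per_v, n_div, r_ohm, c_farad):
    """Small-signal parameters of a 2nd-order charge-pump PLL
    (series R-C loop filter, PFD/charge-pump gain Icp/2*pi)."""
    kvco = 2 * math.pi * kvco_hz_per_v                 # rad/s per volt
    wn = math.sqrt(icp * kvco / (2 * math.pi * n_div * c_farad))
    zeta = r_ohm * c_farad * wn / 2
    bw_hz = 2 * zeta * wn / (2 * math.pi)              # rough -3 dB BW for zeta >~ 1
    return wn, zeta, bw_hz

wn, zeta, bw = pll_loop_params(icp=50e-6, kvco_hz_per_v=500e6,
                               n_div=40, r_ohm=10e3, c_farad=100e-12)
print(f"wn = {wn/2/math.pi/1e3:.0f} kHz, zeta = {zeta:.2f}, BW ~ {bw/1e6:.2f} MHz")
```

Numbers like these are only a starting point; the block-level and system-level simulations in step (d) are where the real answers come from.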

Some good books, from beginner to advanced:
1) Phase-Locked Loops: Design, Simulation, and Applications (Best)
2) Phaselock Techniques, 3rd ed. (Gardner)

There is a host of literature on this topic and my list is by no means comprehensive.

I have been building PLLs since 1997 in various technologies and applications.

I remember one of the first PLLs I helped out on was for a 622 Mb/s SONET transceiver. This was a conventional ring-oscillator-based PLL. The loop filter was fully differential and fed a V2I (voltage-to-current converter), which in turn fed a current-starved ring oscillator. The PFD was the typical two-flop design with a delayed reset.
One of the most fundamental changes in PLLs has been the growth of LC PLLs. I remember that when inductors were first used for PLL VCOs in commercial products, they were viewed as risky. Nowadays you see them even in microprocessors, on digital substrates with massive amounts of switching noise. By my estimate this transition took about seven years.

I have had the fortune to design inductor-based PLLs in multiple generations of VDSM (very deep sub-micron) CMOS. My latest project was an LC PLL for PCIe. The usual challenges of low jitter, power, area, etc. were compounded by the PCIe requirement of tight control over the loop bandwidth. The trouble arises for a few reasons:
a) Varactor modeling and variation lead to large variation in Kvco
b) Charge pumps are not really current sources but charge sources. Due to finite rise and fall times, the "effective" current from a charge pump is not the same as an ideal current source/sink
c) Variation in the loop filter resistance
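To see why these three mechanisms hurt, note that for a well-damped second-order loop the bandwidth is roughly Icp*Kvco*R/(2*pi*N), so the errors multiply. A quick corner sweep (the nominal values and spreads below are invented, illustrative numbers, not from any real process):

```python
import itertools, math

def loop_bw_hz(icp, kvco_hz_per_v, r_ohm, n_div):
    # For a well-damped 2nd-order charge-pump PLL the loop bandwidth
    # is approximately Icp*Kvco*R/(2*pi*N), with Kvco in Hz/V
    return icp * kvco_hz_per_v * r_ohm / (2 * math.pi * n_div)

nominal = dict(icp=50e-6, kvco_hz_per_v=500e6, r_ohm=10e3, n_div=40)
# Illustrative +/- spreads for the three mechanisms above
spreads = {"icp": 0.15, "kvco_hz_per_v": 0.30, "r_ohm": 0.15}

bws = []
for signs in itertools.product([-1, 1], repeat=3):
    p = dict(nominal)
    for s, (k, d) in zip(signs, spreads.items()):
        p[k] *= 1 + s * d
    bws.append(loop_bw_hz(**p))

nom = loop_bw_hz(**nominal)
print(f"nominal {nom/1e6:.2f} MHz, corners {min(bws)/1e6:.2f} to {max(bws)/1e6:.2f} MHz")
```

Even these modest spreads give roughly a -50%/+70% bandwidth range, which is why the calibration ideas below exist.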

So some innovative solutions are needed to meet the specifications. Some ideas include:
- Methods to measure PLL bandwidth and peaking on-die and auto-calibrate out the errors
- Linearizing Kvco through AC-coupling caps
- Increasing charge pump switching speed by moving away from cascode switching
- Using schemes that don't need a loop resistance, e.g. self-biased or sample-reset filters
Also remember to use the right charge pump (high-Z) and VCO (LC) topologies to get the jitter low enough to meet your requirements.

Drop me a note if you want more ideas on the effectiveness of particular techniques for meeting your architectural specifications.

There are quite a few articles out there on PLLs. However, articles on debuggability and the practical issues with PLLs are not readily available, so I am listing a few tips:
a) Make the feedback divider faster than the VCO. At startup your VCO might see a railed control voltage and run much faster than you simulated for.
b) Shield your VCO control-voltage input. This is your most sensitive node. If you want to bring it out for observability, put a low-leakage buffer on this pin.
c) Avoid chicken-and-egg problems. When integrating the PLL with your ASIC, the ASIC may depend on a clean PLL clock to propagate certain states to the PLL boundary, whereas the PLL may need certain states from the ASIC to lock. A way out is to use a local non-PLL clock (e.g. a ring oscillator) just for propagating the states until the PLL locks.
d) Modeling the PLL in RTL should take into account any states that cause the PLL to shift frequency or lose lock. It is not feasible to model PLL locking in a full-SoC simulation; however, a basic model should include all signals that can cause the PLL to drift.
e) Model the PLL in MATLAB or similar to ensure it is always stable, ideally with a damping factor > 0.8 in the worst case. Remember to take aging into account when doing this.
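A quick way to sanity-check item (e): for the standard second-order loop, zeta = (R*C/2)*wn with wn proportional to sqrt(Icp), so zeta scales as R*sqrt(Icp) at fixed C, Kvco, and N, and an aged corner can be estimated without rerunning the full model. The degradation numbers below are invented for illustration:

```python
import math

zeta_nominal = 1.0      # from your nominal loop model
icp_aging = 0.80        # assume charge-pump current degrades 20% over life
r_aging = 0.90          # assume effective loop resistance drops 10% (made-up)

# zeta = (R*C/2)*wn and wn ~ sqrt(Icp), so zeta scales as R*sqrt(Icp)
zeta_aged = zeta_nominal * r_aging * math.sqrt(icp_aging)
print(f"worst-case zeta ~ {zeta_aged:.2f}")
```

If this lands below your 0.8 target, resize the loop filter rather than hoping the corner never ships.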

I have just scratched the surface of this very important mixed-signal component. There is a host of other nuances: for example, testing PLLs with 300 fs jitter (really!), modeling and comprehending the various forms of jitter, and so on. My next post will focus on the future of PLLs and give a flavour of how PLL techniques can be used to analyze other components (such as clock and data recovery).

Remember, debugging a PLL is very difficult since it's a feedback loop. Be thorough and diligent in all your circuit- and system-level checks. Good luck!

Productization - at last, some discussion.

One of the requirements for a successful serdes project is the ability to sell it in high volume. Unfortunately, many designers (especially IP designers) may not be aware of the methodology required to productize a serdes. I am speaking from experience, since I learnt this the hard way.

Once a design comes back from fabrication, multiple steps happen (some in parallel, others slightly staggered):
a) Wafer sort
b) System validation
c) Bench debug
d) Part characterization
e) High volume testing
f) Misc (burn-in, JTAG, margining)

Wafer sort is the process of testing at the wafer level and marking the bad dies so that they are not assembled (saving packaging costs). Quite a few folks believe this is not a valuable exercise, though my experience has been otherwise. BIST (e.g. local loopback from the Tx to the Rx) is something testable at this point, and we used it. Some DC structural tests can also be run during wafer sort. This is also the point for pre-trimming your critical analog blocks. I had designed a flow for trimming the bandgaps and voltage regulators at wafer sort. These flows get quite complicated due to chicken-and-egg problems (e.g. the PLL may need the bandgap to lock, but the bandgap may need PLL clocks for propagating control signals to it).
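A trim flow like the bandgap one above usually boils down to a search over a small trim DAC. A sketch, where `measure_mv`/`set_code` and the 5-bit code width are hypothetical stand-ins for the real tester interface:

```python
def trim_bandgap(measure_mv, set_code, target_mv=1200.0, bits=5):
    """Binary-search a bandgap trim code at wafer sort.
    Assumes the output voltage rises monotonically with the code."""
    lo, hi = 0, (1 << bits) - 1
    best_code, best_err = 0, float("inf")
    while lo <= hi:
        mid = (lo + hi) // 2
        set_code(mid)
        err = measure_mv() - target_mv
        if abs(err) < best_err:
            best_code, best_err = mid, abs(err)
        if err < 0:
            lo = mid + 1     # output too low: increase the code
        else:
            hi = mid - 1     # output too high: decrease the code
    return best_code

# Toy model of one die: 2 mV per LSB on top of an untrimmed 1173 mV
state = {"code": 0}
set_code = lambda c: state.update(code=c)
measure_mv = lambda: 1173.0 + 2.0 * state["code"]
print(trim_bandgap(measure_mv, set_code))
```

A binary search keeps tester time logarithmic in the code range, which matters when every second of sort time is costed.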

Bench debug is the process whereby the part is tested in the lab with close interaction with the designer. This typically proceeds in parallel with high-volume testing. Bugs from both of these flows are dispositioned by the design engineering teams. It is at this point that all the choices you made defining DFx (design for debug, design for manufacturing, etc.) show their value. This is one of the prime reasons for having digital control loops around your analog data path. For instance, you may want to see the control loop settle after a step change on the input, or view the control register bits on a logic analyzer. These are questions you need to think about and discuss with your validation team. And make sure the post-silicon validation folks really stress the part to wring out all the bugs.

Characterization is high-volume electrical validation of the part. It consists of two sub-steps:
a) Run the part on the tester with a more elaborate test program. This helps to set up your guardbands and is also the fastest way to get high-volume data. An example of design-related feedback would be to ensure that your signal of interest is visible on the pins that interface with the tester, with no dependency on the functioning of other parts of your SoC. For example, ensure that PLL lock is visible on the test-out pin of the part and doesn't depend upon the PLL actually locking! Sounds trivial, but you'd be surprised at the effort involved.
b) Run the part on the various platforms. This is needed for characterizing parameters that are not HVM-tested. If you're not clever, this can turn into a huge time sink, and you may not get timely data for your next part revision. For instance, the rise and fall times of the part can be measured part by part. A better way is to put in an RF switch that selects 1 out of, say, 40 lanes, with its output going to the measuring instrument. By cycling through the 40 lanes, a statistically significant sample can be measured time-effectively.
In addition, a requirement unique to serdes is the variety of platforms it must work with. For instance, it should work with short channels, long channels, connectors with different crosstalk profiles, etc. Remember, just testing with long channels doesn't guarantee good results with a short channel. Interoperability with different parts quickly makes this an intractable problem, one that requires careful thought and innovative approaches.
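The lane-cycling idea in (b) is simple enough to sketch. Here `select_lane` and `measure_rise_time_ps` are hypothetical stand-ins for the RF-switch control and the instrument query; the real bench hooks depend on your setup:

```python
# One RF switch in front of one instrument, stepping across lanes
# instead of re-probing the board part by part.
def characterize_lanes(select_lane, measure_rise_time_ps, n_lanes=40, samples=5):
    results = {}
    for lane in range(n_lanes):
        select_lane(lane)                  # program the RF switch
        meas = [measure_rise_time_ps() for _ in range(samples)]
        results[lane] = sum(meas) / samples
    return results

# Toy stand-in: every lane reads 28 ps
rises = characterize_lanes(lambda lane: None, lambda: 28.0)
print(len(rises), min(rises.values()), max(rises.values()))
```

The payoff is that one board setup yields a statistically meaningful sample across all lanes.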

High-volume testing is the ultimate validation, run on every part before it ships to customers. Some serdes standards specify requirements that make HVM easier; loopback is a prime example. However, one can go a step further. For instance, we could stress an eye using the transmit loop filter and then loop back the stressed eye. These and other such ideas can be fed back into the design phase to make a more complete test strategy.
Some questions to ask when you plan your HVM strategy:
How much yield should you shoot for?
What is the DPM (defects per million) target for the product?
These factors flow into your design targets, e.g. offsets and Monte Carlo requirements.
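The DPM target translates directly into how many sigma of margin your Monte Carlo runs must show. For a normally distributed parameter tested against a one-sided limit:

```python
import math

def dpm_one_sided(margin_sigma):
    """Escapes per million for a Gaussian parameter whose test limit
    sits margin_sigma standard deviations from the mean (one-sided)."""
    return 1e6 * 0.5 * math.erfc(margin_sigma / math.sqrt(2))

for k in (3, 4, 4.5):
    print(f"{k} sigma -> {dpm_one_sided(k):.1f} DPM")
```

So a 3-sigma design buys you only about 1350 DPM per parameter; sub-100 DPM products need 4 sigma or more of margin on every tested spec, before guardbands.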

As the reader can see, productization is not a simple task. It takes multiple product cycles before the design and validation teams become good at it. Such skills are rare, and documentation is even rarer since it's not discussed in conferences and journals. Happy debugging!

Feb 25, 2011

Equalizer - Not your average sports term

Equalizer - A goal scored in the final seconds of a pulsating soccer match, putting two great teams on an even keel
Equalizer - A critical circuit allowing a faraway viewer to see and hear that match with adequate fidelity in real time

My first brush with an equalizer was in my graduate years, when I designed and fabricated my first-ever IC. Carnegie Mellon had a disk drive center that was looking at techniques to increase the capacity of drives. As you increase the recording density, adjacent bits start to "nudge" each other. This effect, ISI (inter-symbol interference), causes you to read a 0 as a 1 or vice versa. An equalizer is a block that removes the ISI so you can recover your bits.
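The post-cursor flavour of ISI and its DFE fix can be shown in a few lines. This toy model (tap weight, noise level, and bit count all invented) smears a fraction of the previous bit onto each received sample, then subtracts it back out using the previous decision:

```python
import random

# Post-cursor ISI plus a 1-tap decision-feedback equalizer (DFE):
# the channel adds h1 times the previous bit onto the current sample,
# and the DFE cancels it using the previous *decision*.
random.seed(1)
bits = [random.choice((-1, 1)) for _ in range(2000)]
h1, sigma = 0.6, 0.3                       # post-cursor tap, noise (illustrative)
rx = [b + h1 * p + random.gauss(0, sigma)
      for b, p in zip(bits, [0] + bits[:-1])]

# Without the DFE, the slicer errs whenever ISI plus noise flips the sign
raw_errs = sum((r > 0) != (b > 0) for b, r in zip(bits, rx))

prev, dfe_errs = 0, 0
for b, r in zip(bits, rx):
    d = 1 if (r - h1 * prev) > 0 else -1   # cancel the estimated post-cursor
    dfe_errs += d != b
    prev = d
print(raw_errs, dfe_errs)                  # DFE count should be far lower
```

Because the DFE subtracts decisions rather than the noisy received signal, it cancels ISI without amplifying noise, which is exactly why it suits dense recording channels.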

I designed the backward (feedback) equalizer part of the DFE. We wanted to build something that ran at at least 50 MHz to go along with another PhD student's front-end equalizer. I used a simple current-steering mechanism as the fastest method. By working in the current domain, we could do a simple subtraction of the post-cursor bits. One of the key challenges was building the test circuitry, since we could only send in PRBS data from external instruments but needed to create distorted currents on chip. In any case, I was able to run through the whole IC fabrication flow (thanks, MOSIS) and actually got to check the performance in the lab. We published the results and could hit speeds approaching 66 MHz. Seems so ancient by today's standards. Check it out in the IEEE JSSC under my name.

Another opportunity I got was building a receive equalizer for a 100BaseTX transceiver. 100BaseT had just arrived, and my employer had built a BiCMOS transceiver that was wildly successful. Unfortunately, CMOS versions were coming along, and we needed to build a lower-power BiCMOS part in a quad form factor (four transceivers in one package). The way to lower the power was to decrease the supply voltage. This is tricky because of the 0.7 V Vbe drop. You just can't get around it. You end up building all these level shifters, emitter followers, and diff pairs for the high-speed circuits, and end up needing a larger Vss/Vee. A way around this is to use more current-folding schemes.

So the idea was to use folded-cascode structures and lower the supply voltage from 5 V to 3 V. The equalizer consisted of gm-C filters and had split high-peaking and low-peaking paths. By weighting these two paths against each other, you could realize an adaptive equalizer for any length of Cat5 cable from 0 to 120 m.
One of the nasty bugs we had was that in some parts the output of the equalizer was always stuck at 1. Of course, simulations didn't predict this. After countless sleepless days, it was discovered that the offset of one of the stages was causing the equalizer output to saturate. Finding this was such a pain: FIB the part, then put in an external voltage source to try to tune out the offset, using pico-probes and making sure nothing moved once you had hit those probe pads. Ah, the joys. In any case, a stepping fixed it. While an analog approach, this product worked very well and was again a commercial success. Even today, in the age of 32 nm and below CMOS, preceding a digital DSP equalizer/ADC combo with a small, simple analog equalizer can provide savings in power and complexity. That's my claim and I am sticking with it!

I also designed 10G transmit equalizers. The cool thing about Tx equalizers is that they are much easier to design and understand: just add a 1 UI delay and a scaled replica and you are done (for a 1-tap filter). However, as usual, there is more than meets the eye. One trend is to use a parallel approach with a look-up table instead of a serial multi-tap approach. Another decision point is the use of current-mode vs. voltage-mode drivers. So be aware of all these choices before finalizing the architecture.
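For reference, the "1 UI delay plus scaled replica" is just a 2-tap FIR; the tap values below are arbitrary, chosen only to show the de-emphasis behaviour:

```python
# Tx de-emphasis as a 2-tap FIR: y[n] = c0*x[n] - c1*x[n-1].
# Transition bits get full swing; repeated bits are attenuated,
# pre-compensating the channel's low-pass roll-off.
def tx_deemphasis(bits, c0=0.8, c1=0.2):
    out, prev = [], 0
    for x in bits:
        out.append(c0 * x - c1 * prev)
        prev = x
    return out

print(tx_deemphasis([1, 1, -1, -1, 1]))
```

Run on the pattern above, every transition bit comes out at full +/-1.0 swing while repeated bits are de-emphasized to +/-0.6, which is the whole trick.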

One of the key trends is to use more mixed-signal control loops in equalizers. As a common theme meshing with my other posts, you will see more and more mixed-signal content in future communication SoCs. For instance, instead of using analog feedback for offset compensation, one can use a finite state machine working on digital samples of the output. Such loops are much more powerful (in terms of controllability, observability, and predictability). In addition, they save power, a key requirement nowadays. Of course, the two main downsides, dither and simulation methodology, have to be accounted for. Much more can be written on this subject, but you'll have to wait for future posts where I'll talk about the merits of digital control loops and their design methodology.
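A minimal sketch of such a loop, assuming a sign-driven FSM bumping an offset-cancellation DAC. Everything here is a toy model, including the LSB-accurate comparator; it also shows the dither downside mentioned above:

```python
# Sign-sign digital offset cancellation: the FSM looks only at the
# sign of the digitized residual and bumps the offset DAC up or down.
# Once converged it dithers +/-1 LSB around the true offset.
def cancel_offset(true_offset_lsb, steps=100):
    dac = 0
    for _ in range(steps):
        residual = true_offset_lsb - dac   # what the comparator sees
        dac += 1 if residual > 0 else -1   # drive the residual toward zero
    return dac

print(cancel_offset(17))   # settles near 17, then dithers +/-1 LSB
```

Observability is the win: the DAC code is a plain register you can read out, log, and trend across parts, which no analog servo voltage gives you.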

Going forward, some other things I foresee:
a) More adaptive transmit equalizers, rather than just programmable ones. We are starting to see this, but it is not clear (at least to me) what the optimum split is between receiver and transmitter. ISSCC 2011 had interesting papers on this.
b) Of course, higher speeds and lossier channels, with some sort of optimal power management.
c) Better modeling of Tx analog imperfections in link-level modeling and their impact on BER. More understanding of non-linear effects and their inclusion in link modeling.
d) On the circuit side, meeting the required accuracy and swing levels will continue to be a challenge, requiring close coordination between spec writing and system simulations.

Comments are welcome, especially on future trends.

Feb 20, 2011

The Sound of Music

One of the "funnest" projects I did was building a direct digital synthesizer (DDS). This part takes in a digital control word and produces a sine wave output whose frequency is proportional to the control word. Apparently one of the customers was a toy manufacturer making musical toys that played classic songs. It is interesting how we engineers tend to skim over the applications of our ICs: you'd be surprised how many different ideas people come up with (especially if your product is marketed for the consumer space).
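The core of a DDS is tiny: a phase accumulator stepped by a frequency control word (FCW), whose top bits index a sine look-up table feeding the DAC. A sketch with made-up widths (the real part's accumulator and ROM sizes would differ):

```python
import math

# N-bit phase accumulator; top LUT_BITS bits index the sine ROM.
# Output frequency: fout = FCW * fclk / 2**N.
N, LUT_BITS = 24, 8
lut = [math.sin(2 * math.pi * i / 2**LUT_BITS) for i in range(2**LUT_BITS)]

def dds(fcw, n_samples):
    acc, out = 0, []
    for _ in range(n_samples):
        out.append(lut[acc >> (N - LUT_BITS)])   # top bits pick the sample
        acc = (acc + fcw) & (2**N - 1)           # phase wraps modulo 2^N
    return out

# FCW = 2^N/16 gives fclk/16: one full sine period in 16 samples
wave = dds(2**N // 16, 16)
```

Frequency resolution is fclk/2^N, which is how a toy can hit every note of a musical scale from one crystal.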

This part had digital logic, memory (a ROM to store the sine codes) as well as DACs, buffers, and amplifiers. In essence it was an SoC even before that term had been coined :-) At that time there was no AMS flow. In fact, the company did not even have a Verilog license (given its analog roots). Thankfully we did have on-screen waveform viewers and not reams of paper to look at (that was my undergrad school!).

Most of the time was spent on the current-steering DAC and the opamps (used in the low-pass filter and the drivers). The customer wanted at least 50 dB of output fidelity, which translated back into quantization-noise and DAC resolution/accuracy requirements. The rest of the time was spent making sure that SPICE converged :-) No, it wasn't that bad, but we did have clunky tools, so we had to actually know how SPICE worked. I could design and simulate the individual blocks well, but SPICE used to choke on full-chip simulation (remember, everything including the memory block was in schematic form). Then you needed multiple cycles to actually do an FFT and make sure your fidelity wasn't broken. Ugh!
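The fidelity-to-resolution translation is the standard quantization-noise formula, SQNR ≈ 6.02·N + 1.76 dB for a full-scale sine:

```python
import math

# Minimum DAC resolution for a given SQNR target (ideal quantizer,
# full-scale sine). Real DAC INL/DNL eats into this, so add margin.
target_db = 50.0
n_bits = math.ceil((target_db - 1.76) / 6.02)
print(n_bits)
```

An 8-bit ideal DAC gives only about 49.9 dB, so the 50 dB spec forces at least 9 bits before any accuracy margin is budgeted.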

It took me about five months of design and layout supervision to get it to fab. I was proud of being the sole designer on it. In hindsight I can now appreciate how much EDA tools have increased productivity and made far more complicated ICs possible. AMS, Verilog-A, database sharing: some of my favourites. More on these in a later blog post.

All the functionality and specifications were met. However, one of the clock outputs (used only for debug/test) was showing a voltage swing between 5 V and 0.5 V (instead of 5 V and 0 V). After beating up on the board designer and finding no leakage paths, I went back to my schematics and layout. To make a long story short, it turns out one of the NMOS transistors was not connected to ground but to the substrate, which ultimately got connected to ground, but via a high-impedance path.

I miss the simplicity of being the sole designer on an IC, but I don't miss the EDA tools of that time.

The Power of Simplicity

One of the first projects I did straight out of school did not make sense to me.

I and another RCG (recent college graduate) were told to build a 6 MHz video filter in a 0.6u BiCMOS process. Our manager came up with a simple scheme of Sallen-Key biquads, using a simple bipolar source follower as the opamp (well, there was some local feedback, but let's not go there). We protested; we wanted to build a fancy-schmancy fully differential folded-cascode opamp. Our manager stood firm. The product was supposed to cost only 60 cents and replace our customers' present solutions built from discrete Ls and Cs. Apparently these customers all used custom solutions with discrete bulky components, and we were going to "IC" it.

So we built this simple video filter family (there were multiple bandwidths for the different standards, e.g. NTSC, and also different gain values, etc.), consisting of a 6th-order filter made up of biquads. It had two trim steps for the bandgap reference and tempco, plus a cool equalizer with a nice BiCMOS video driver. We did such a great job that the part worked the first time, met all specs, and sampled to customers ahead of schedule. That simple IC sold more than 15 million units in its first year.
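For the curious, the unity-gain equal-R Sallen-Key low-pass (roughly what a source-follower buffer gives you) has f0 = 1/(2πR√(C1C2)) and Q = ½√(C1/C2), so a cascade falls out directly. A sketch with an arbitrary R; the pole Qs are the standard 6th-order Butterworth values, not the actual product's:

```python
import math

# Unity-gain, equal-R Sallen-Key low-pass biquad:
#   f0 = 1/(2*pi*R*sqrt(C1*C2)),  Q = 0.5*sqrt(C1/C2)
# => C1 = 2Q/(w0*R) (feedback cap), C2 = 1/(2Q*w0*R) (cap to ground)
f0, r = 6e6, 1e3                  # cutoff and an arbitrary resistor choice
w0 = 2 * math.pi * f0
for q in (0.5176, 0.7071, 1.9319):   # 6th-order Butterworth pole Qs
    c1 = 2 * q / (w0 * r)
    c2 = 1 / (2 * q * w0 * r)
    print(f"Q={q}: C1={c1*1e12:.1f} pF, C2={c2*1e12:.1f} pF")
```

With component values in the tens of picofarads, every section integrates cheaply, which is exactly the discrete-L-and-C replacement argument the manager was making.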

Interestingly, another engineer was given the task of building a filter family with tighter specs: ladder filters, a fully differential configuration, and five trim steps. The firm thought it could enter the professional video market as well. It turns out neither the value proposition nor the costing worked out, so this complicated product never made it big.

Two lessons:
a) The power of integrated circuits comes from the ability to replace bulky discrete components with a clear value proposition.
b) KISS. It doesn't have to be cutting edge, but it has to have a clear value proposition for the customer. Let's see how a DSP person responds to this comment.

Footnote: The product won best product award at EEDN in 1996.