Clock distribution issues

Hi,
We are developing a distributed clock system with a custom PCB. Our system follows the architecture described in Decawave reference document APS007 (fig. 1). The clock is generated by a clock generator (CDCM6208V1RGZ) and distributed by wire, in differential mode, to all the anchors.
On the anchors the clock is converted from differential to single-ended and connected to XTAL1, AC-coupled through a 2200 pF capacitor (XTAL2 is floating), as described in section 3.5.1 of the DW1000 datasheet.
Our scenario is this:

  • An external TAG (with clock on board) sends 1000 packets.
  • All anchors are placed at 50 cm from TAG in LOS.

In this condition, every anchor receives about 30% of the packets with a good CRC and reports sync loss events (Reed Solomon frame sync loss) for the other 70%. We have checked all power supplies and verified that they meet the requirements in section 7.2 of the DW1000 datasheet (we use a DC-DC converter). In addition, we used a frequency counter to measure the clock frequency on XTAL1, and it is exactly 38.4 MHz.
Why do we see this kind of behaviour? What can we check?

Thanks,

Gianni


Hi,
You are sending data; do you see the errors when you do not send any data, or send just a few bytes?
Have you tried TWR between some boards (no clock sync needed) to see if the boards are working fine? You could use our RX/TX example code to do so.

Also have a look at our IC user manual section 4.1.6 RX Message timestamp and its note:
Note: Due to an issue in the re-initialisation of the receiver, it is necessary to apply a receiver reset after certain receiver error or timeout events (i.e. RXPHE (PHY Header Error), RXRFSL (Reed Solomon error), RXRFTO (Frame wait timeout), etc.). This ensures that the next good frame will have correctly calculated timestamp. It is not necessary to do this in the cases of RXPTO (Preamble detection Timeout) and RXSFDTO (SFD timeout).
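
The rule in that note can be sketched as a small status check. This is only an illustration in Python, not the Decawave driver: the bit positions below are assumed from the SYS_STATUS register description and should be verified against the user manual, and `needs_rx_reset` is a hypothetical helper name.

```python
# Bit positions in the DW1000 SYS_STATUS register.
# ASSUMPTION: taken from the register description as recalled; verify
# against the user manual before relying on them.
RXPHE = 1 << 12    # PHY header error
RXRFSL = 1 << 16   # Reed Solomon frame sync loss
RXRFTO = 1 << 17   # frame wait timeout
RXPTO = 1 << 21    # preamble detection timeout (no reset needed)
RXSFDTO = 1 << 26  # SFD timeout (no reset needed)

# Events the note names as requiring a receiver reset.
RESET_REQUIRED = RXPHE | RXRFSL | RXRFTO


def needs_rx_reset(sys_status: int) -> bool:
    """Return True if any reset-requiring event named in the note is flagged."""
    return bool(sys_status & RESET_REQUIRED)
```

With these masks, a status word that only flags RXPTO or RXSFDTO does not call for a reset, while one that flags RXRFSL does.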

Regards
Leo

Hi,

Thanks for your reply. No, I do not receive any bytes when the tag is not transmitting data.
I tried TWR mode, but a lot of packets are lost and the measurement does not complete with double-sided TWR.

I confirm that my software works as indicated in your IC user manual section 4.1.6 RX Message timestamp and its note.

Hi
I take it you mean “I do not receive any bytes when tag is transmitting data” .

Either way, it could well be a hardware issue if TWR does not work.
You also have the nodes rather close to each other (0.5 m). If the power seen at the receiver is more than about -85 dBm, the receiver will potentially be saturated; see Figure 22 in the user manual.
Could you either move the nodes further apart or reduce the TX power?
Regards
leo

Gianni,

There are many reasons you could experience these problems. Distributing a clean clock signal is hard, and the DW1000 performance relies on a very stable clock. Give it a clock that varies or has noise, and it really hurts the performance.

There’s no way I can definitely determine your problem, but I can suggest ideas you can try and ways to help isolate the problem.

My lead guess is that you need to look at the TCXO guidance in the datasheet, specifically the part about needing an LDO for VBAT shown in figure 37. The reason for this is that the XTAL input circuits are powered from VBAT and that the transmit pulses on VDDPA1 and VDDPA2 can bounce VBAT if tied together and cause issues on the XTAL input. Thus, you LDO VBAT to 3.0 volts, slightly under VDDPA1/2 at 3.3 volts, and that isolates noise. An external crystal doesn’t have this problem because it isn’t ground referenced, but an external clock is ground referenced and that is why the LDO is needed for external signals and not for the crystal.

Make sure you have a good clean signal. You need an amplitude over 800 mV (about 1.5 volts seems ideal), but not too strong, exceeding about 2.5 volts starts to cause issues. You want a stable clock frequency with minimal phase noise. A possible way to check this is an oscilloscope with delayed trigger. Trigger on mid point of rising or falling edge, and then view the signal 20 ms later, say. If your scope does infinite persistence, you can see jitter in the outcome.

Here’s an example of a reasonably good clock (signal generator, 500 ms after trigger):

image
And here is an example of a bad clock under the same conditions, first near the trigger point:

image
And then after a delayed trigger to see how stable it was:

image
Only after a long delay do you see that the second clock is wavering in period. That was enough to make the system not work.

The important point is that a clock that wavers in period, even slightly, does real damage to the DW1000’s ability to work.
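
To put rough numbers on the delayed trigger test, here is a minimal Python sketch. It assumes the spread of the edge seen after the delay can be read as accumulated timing error, so that spread divided by delay gives an average fractional frequency wander over that interval. The function names and the example spread value are hypothetical, chosen only for illustration.

```python
CLOCK_HZ = 38.4e6  # reference clock frequency used in this thread


def cycles_in_delay(delay_s: float, clock_hz: float = CLOCK_HZ) -> float:
    """Number of clock periods that elapse between trigger and observation."""
    return delay_s * clock_hz


def fractional_wander(edge_spread_s: float, delay_s: float) -> float:
    """Average fractional frequency wander implied by the edge spread.

    ASSUMPTION: the observed spread is accumulated timing error over the
    delay, and the scope's own timebase contributes nothing.
    """
    return edge_spread_s / delay_s
```

For a 20 ms delay the scope is looking 768,000 clock periods after the trigger, and a hypothetical 300 ps edge spread at that point corresponds to about 0.015 ppm of wander. The scope timebase has its own jitter, so treat the result as an upper bound on the clock rather than a measurement of it.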

In addition to period jitter, amplitude noise contributes to bad clocking as well. This is voltage noise that causes the input buffer on XTAL1 to vary when it decides if the input is a 0 or a 1. So you want a clean signal there, and usually a little RC filter at the XTAL1 input can do wonders to clean that up.

Beware of ANY PLL in the clock chain before the DW1000. A frequency synthesizer is one example. Your clock distribution chip is another. All of those introduce new jitter. I’m not a big believer in the “jitter clean” features, either, as they may cause issues under certain circumstances.

When distributing clocks to multiple DW1000s, we have found that a good TCXO driving a PL133 fan out buffer works pretty well. If you need more than 3 outputs, put more PL133 chips down and tie their inputs together. We generally power it from 1.8 volts to avoid too much amplitude (which hurts EMC compliance as well).

Another tactic is to put the DW1000 in CW transmit mode and look at the carrier on a spectrum analyzer. It should be a really solid narrow central peak. Any humps or shoulders mean something is wrong with clock input or PLL loop filters.

To be sure you have a clocking problem, I’d take a known good DW1000 board (MDEK, EVK, etc) and try to use it to both transmit to your boards and to receive from your boards. This will tell you if the problem is clocking or something else. It can also tell you if the problem is only on transmit or receive. That’s a big clue: if only the transmitting node has the problem, then the VBAT LDO issue is indicated, due to VDDPA1/2 noise afflicting VBAT.

You can also hack in a crystal on your board and see if the DW1000 can now receive or transmit more reliably. This doesn’t solve your problem since you need sync’ed clocks, but it at least makes sure your board doesn’t have other issues masking themselves as clocking problems. For example, code issues could easily cause packet loss you are attributing to the clock issue.

Distributing a 38.4 MHz clock via cables has always been a tough challenge not only technically, but also if you have to pass EMC, you have this one pure spike at 38.4 MHz and that’s hard to tamp down. I would not want to have to pass FCC/CE tests with such a design. If you can at all design the system to not require a cable distributed clock, that would be the preferred solution.

Mike Ciholas, President, Ciholas, Inc
3700 Bell Road, Newburgh, IN 47630 USA
[email protected]
+1 812 962 9408


Hi Mike,

Thanks for your answer. You say:

“My lead guess is that you need to look at the TCXO guidance in the datasheet, specifically the part about needing an LDO for VBAT shown in figure 37. The reason for this is that the XTAL input circuits are powered from VBAT and that the transmit pulses on VDDPA1 and VDDPA2 can bounce VBAT if tied together and cause issues on the XTAL input. Thus, you LDO VBAT to 3.0 volts, slightly under VDDPA1/2 at 3.3 volts, and that isolates noise.”

Now power configuration for each anchor in our distribution system is this:

External 12V -> Step down -> 4V -> LDO -> 3V3 -> LDO -> 1V8

LDO_3V3 supply:

  • All LVDS to single-ended transceivers
  • Microcontroller
  • VDDDIG, VBAT, VDDIO_A, VDD_AON, VDDPA1, VDDPA2, VDDLNA

LDO_1V8 supply:

  • VDDLDOA, VDD_LDOD

Do you suggest doing this instead?

External 12V -> Step down -> 4V -> LDO_1 -> 3V3 -> LDO -> 1V8
External 12V -> Step down -> 4V -> LDO_2 -> 3V3

Thanks,

Gianni

The problem here is that VBAT and VDDPA1/2 are on the same voltage rail, namely 3V3. The VDDPA1/2 pins create current spikes when transmitting, which cause voltage dips on the 3V3 rail, which VBAT feeds into the XTAL1 input buffer, which causes excessive jitter on the clock input signal and ruins the transmit timing. So your transmit packets are being damaged by clock jitter from noise. It does not matter how 3V3 was generated, the coupling of VBAT and VDDPA1/2 is the issue.

The solution is an LDO for VBAT so it is isolated from the VDDPA1/2 created noise. This is shown in the application diagram in the datasheet:

Note U3, the 3V0 LDO in the diagram. Missing from the diagram are the appropriate input and output filter caps on the LDO (most LDOs require some minimum amount of output capacitance for stability).

Here at Ciholas, we have tried putting substantial amounts of capacitance and filtering on VBAT instead of an LDO and that did not work completely, though it got better. In our experience, the LDO is required for any external clock signal. It isn’t required for a crystal because the crystal isn’t ground referenced and “floats” with the noise.

If your board is routed as I expect, with a solid 3V3 net connecting pins, you will find it hard to retrofit an LDO to isolate VBAT. Thus it may take a board revision to fully fix this. You would want to know that a board rev will fix it, so here are some tests to help you figure this out:

Replace the TCXO with a crystal. If the boards now transmit and receive fine, then the LDO fix is likely to solve the problem.

Heap massive amounts of capacitance on VDDPA1/2 pins. If packet reliability goes up with increasing capacitance, then the LDO fix will likely work.

Reduce TX power. If packet reliability, at short range, goes up with reducing transmit power settings, then the LDO fix likely works.

Transmit to the boards from a known good transmitter (MDEK, EVK, etc). If the boards receive fine, then the LDO fix is likely to work. Then transmit from your boards to a known good receiver. If that has packet errors, this confirms it is a transmit problem and the LDO fix is likely to work.

Mike Ciholas, President, Ciholas, Inc

Hi Mike,

Thanks again for your answer.

“Replace the TCXO with a crystal. If the boards now transmit and receive fine, then the LDO fix is likely to solve the problem.”

We have tried this and everything seems to work fine.

“The problem here is that VBAT and VDDPA1/2 are on the same voltage rail, namely 3V3. The VDDPA1/2 pins create current spikes when transmitting, which cause voltage dips on the 3V3 rail, which VBAT feeds into the XTAL1 input buffer, which causes excessive jitter on the clock input signal and ruins the transmit timing. So your transmit packets are being damaged by clock jitter from noise. It does not matter how 3V3 was generated, the coupling of VBAT and VDDPA1/2 is the issue.”

Can what you describe also cause issues in RX mode? In fact, we tried sending 1000 blink messages from an MDEK to our anchors with the external clock, and we lost a lot of packets.

Gianni

This strongly suggests an issue with how the external clock is being delivered to the XTAL1 pin.

Usually, the absence of the LDO doesn’t cause receive problems, only problems in transmit. The fact you are also experiencing problems in receive suggests you may have problems beyond the LDO being absent.

Can you send scope shots of the XTAL1 input signal?

Can you post the schematic of the XTAL1 drive circuit?

Mike Ciholas, President, Ciholas, Inc

Hi Mike,

Below you can find what you requested:

ANCHOR

Power:

image

Clock distribution:

image

RF:

image

CLOCK GENERATOR

CLK_F generated by CDCM6208V1RGZ
image

Waveform acquired with a Teledyne LeCroy WaveSurfer 3010z and a Teledyne PP019 probe:

In addition, we see that the DW1000 loses PLL lock when it is in RX mode. Our impression is that this happens most frequently when the anchors are in NLOS. What do you think about that?

Thanks again,

Gianni

Your images are just readable enough for me to get an idea.

The ADN4662 has an output voltage swing of 3.3 volts (supply voltage). You are overdriving the XTAL1 input with too much amplitude, and your scope shot confirms this (even going a bit above the rails due to trace inductance). Though not documented, overdriving the XTAL1 input causes problems in the clock system (guess how we know this…). We’ve found about 1.8 volt swing is a reasonable operating point which is well above the minimum required (0.8 volts) and below the rails (3.3 volts).

Hack the board so there’s a divider on the output of U7 (say 1K/1.2K to get about 1.8 volts), then feed through a DC block cap (2.2 nF is fine) to the XTAL1 pin. See if that helps. The 1K/1.2K divider also serves as a bit of a low pass filter due to input capacitance.
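
The divider arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: the 1K resistor is taken as the series element and the 1.2K as the leg to ground, and the XTAL1 input capacitance used in the usage note is a hypothetical value, not a datasheet figure. The function names are illustrative only.

```python
import math


def divider_output(v_in: float, r_series: float, r_shunt: float) -> float:
    """Unloaded output swing of a resistive divider."""
    return v_in * r_shunt / (r_series + r_shunt)


def thevenin_resistance(r_series: float, r_shunt: float) -> float:
    """Source resistance seen looking back into the divider output."""
    return r_series * r_shunt / (r_series + r_shunt)


def rc_cutoff_hz(r_ohm: float, c_farad: float) -> float:
    """-3 dB frequency of a single-pole RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)
```

With a 3.3 V swing into 1K/1.2K, `divider_output(3.3, 1000, 1200)` gives 1.8 V and `thevenin_resistance(1000, 1200)` gives about 545 ohms. If the pin and trace together present a hypothetical 2 pF, `rc_cutoff_hz` puts the corner near 146 MHz; more capacitance lowers the corner toward the 38.4 MHz clock.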

A better final result is to find a 1.8 volt receiver chip which will produce a suitable output swing with no further circuitry other than a DC block.

Also, you still have the clock power tied to rails which will experience VDDPA1/2 problems on transmit. Ideally, the LVDS receiver is powered from a separate power supply from whatever powers VDDPA1/2, and that powers VBAT. For TCXOs, this is a 3.0 volt LDO from 3.3 volt rail. You want the clock signal to NOT bounce on transmit pulses as that slightly moves clock edge locations. When dealing with picosecond timing, that’s important.

PLL lock has nothing to do with UWB radio traffic, NLOS or not, it is just how the clock circuits are running. If PLL lock is being lost, it is either a bad clock input (likely) or bad PLL filter passives or layout (possible but usually not the case). The fact PLL lock is being lost is a strong indication there are clock problems.

Finally, when using an external clock, generally you set XTALT to zero, minimum capacitance. This reduces power usage. In this case, with a 1K/1.2K divider, the XTALT control can serve as an adjustable low pass filter element, so you can try adjusting it to see if that helps. Higher values of XTALT will lower the roll off frequency and lead to possibly more stable results, at the expense of power. Your design does not appear to be power sensitive, however.

Good luck and let us know how it turns out.

Mike Ciholas, President, Ciholas, Inc

Hi Mike,

Thanks for your suggestions. I will let you know as soon as possible! Thanks!

Gianni

Hi Mike,

We made a small step forward. Now the DW1000 doesn’t lose PLL lock, but our clock is not yet as you suggested above; we are working on it. We are following your indications step by step, and we performed the delayed trigger test. The result is this:

image

What do you think about it? Our measurement shows about 300 ps of jitter. We are working to reduce it and to reduce the signal amplitude to 1.8 V.
With the current configuration we are making more measurements to validate accuracy. Our setup is very simple: a tag is placed in front of an anchor at a fixed distance and orientation, both at a height of 30 cm above the ground. Varying the distance (from 1 m to 2.5 m) and/or the orientation, we noticed that the anchor moves from LOS to NLOS and vice versa in an unclear manner (the NLOS threshold is the one described in section 4.7 of the user manual). There are no obstacles between the anchor and the tag. Moreover, at a fixed distance and variable orientation, the time of flight changes by up to 50 Decawave ticks. All measurements are very repeatable: placing the tag in the same orientation gives the same results as before (standard deviation of about 8 Decawave ticks). To us, this is too much to be explained by the antenna delay changing with orientation, considering that we use an omnidirectional antenna.

Thanks,

Gianni

What did you change?

About 300 ps of jitter is probably acceptable, and it may be the scope and not your signal.

The 30 cm height over the ground will subject you to destructive interference from ground bounce even at the short distances you are operating over. This can be so severe as to completely lose the first path and result in an NLOS output and bad data.

Loss of first path can also happen if you cross-polarize the antennas (vertical on one, horizontal on the other, say), which explains the orientation results you are getting.

Mike Ciholas, President, Ciholas, Inc

Hi Mike,

“What did you change?”

We have just reduced the LVDS power supply from 3V3 to 3V. It is a small patch while we wait for the PCB rework.

“The 30 cm height over the ground will subject you to destructive interference from ground bounce even at the short distances you are operating over. This can be so severe as to completely lose the first path and result in an NLOS output and bad data.
Loss of first path can also happen if you cross polarize the antennas (vertical on one, horizontal on the other, say), which explains the orientation results you are getting.”

We did other tests with the anchor moved to a height of more than 2 m: the NLOS issue seems to disappear, but we still see inconsistent time of flight (variation of up to 50 Decawave ticks at fixed distance and variable orientation). Our idea is that phase noise on the reference clock signal could cause such behaviour. We are now working to reduce the phase noise and signal amplitude as you suggested. Any further suggestions are appreciated.

Gianni

Glad that helped.

Just because you aren’t losing PLL lock doesn’t mean the signal is good now, it only means it is better enough for the PLL lock to be maintained.

The NLOS issue disappearing at 2 m height pretty much confirms it was destructive interference with ground bounce.

The time of flight varying with orientation suggests you are finding an antenna null on the first path and you are reading a multipath distance instead.

All antennas have nulls, you simply can’t work around that. When the null is aligned with the first path, the other node will hear a multipath signal and timestamp that, leading to an increase in measured distance or time.

Phase noise on the reference clock is not usually the cause if the symptom depends on antenna orientation.

Mike Ciholas, President, Ciholas, Inc

Hi Mike,

Thank you so much for your support.

We are quite sure about it, but in your view what is the best test to confirm that the signal is good and the DW1000 is working fine? Or, what are the signs that tell you the DW1000 is not working correctly?

“That suggests you are finding an antenna null on the first path and you are reading a multipath distance instead.
All antennas have nulls, you simply can’t work around that. When the null is aligned with the first path, the other node will hear a multipath signal and timestamp that, leading to an increase in measured distance or time.”

We are now using a standard Taiyo Yuden antenna, AH 086M555003, as suggested by Decawave. We see variation of up to 50 ticks for small changes in orientation, no more than 10 degrees. We think (and hope) that this is too much to be caused by an antenna null.

Thanks,

Gianni

A good basic test is to put the DW1000 into CW (continuous wave) mode where it puts out a pure single frequency. Capture that on an RF spectrum analyzer with fine resolution bandwidth (say 1 Hz if your analyzer is capable of it) and examine the purity of the frequency. If the signal is a sharp peak, then you have good reference clock and PLL filters. If you see “shoulders” or a series of “fences” around the peak, then that’s an issue.

You can also sniff around the board to check out the 38.4 MHz, 124.8 MHz, 499.2 MHz frequencies as well. If 499.2 MHz is sharp and clean, but the carrier isn’t, that indicates PLL filter problems on VCOTUNE.

The CW test fails to account for the effect of transmitter pulsing during normal operations which can upset an external clock input in particular. The only true test of that is to see if you get reliable packet transmission to another node. Ultimately, that is always the final test, how well do you transmit packets to another node. Any clock jitter or instability affects packet reliability quite severely.

It all depends on where those 10 degrees of orientation change are. If that puts you into a null, and the receiving node measures a multipath signal instead of the true first path, then 10 degrees of orientation change can easily cause a change of 50 ticks.
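
For scale, the tick figures in this thread can be converted to time and one-way path length. The sketch below takes the DW1000 timestamp unit as 1/(128 × 499.2 MHz), about 15.65 ps, and assumes propagation at the speed of light in free space. The function names are illustrative only.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0

# DW1000 timestamp resolution: one tick is 1 / (128 * 499.2 MHz).
TICK_S = 1.0 / (128 * 499.2e6)


def ticks_to_seconds(ticks: float) -> float:
    """Convert Decawave timestamp ticks to seconds."""
    return ticks * TICK_S


def ticks_to_metres(ticks: float) -> float:
    """One-way path length equivalent of a time-of-flight difference in ticks."""
    return ticks_to_seconds(ticks) * SPEED_OF_LIGHT_M_S
```

A 50 tick change works out to roughly 0.78 ns, or about 23 cm of extra path, and 8 ticks of standard deviation is a little under 4 cm. A reflected path that is a couple of tens of centimetres longer than the direct one is therefore enough to account for the shift.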

Our earliest design, more than 5 years ago, used the Taiyo Yuden antenna. The performance was disappointing so we have spent time and effort designing better antennas since. We’ve also tried all the ceramic ones (Taiyo Yuden, Partron, Johanson, etc) and none of them work well in our opinion. There is a lot more to good UWB antenna design than radiation efficiency, dBi, VSWR, etc. You have to look at time domain factors as well and I don’t think the ceramics do that very well.

Designing a really good UWB antenna is very difficult.

Mike Ciholas, President, Ciholas, Inc

Hi Mike,

Thanks for your suggestions. I will let you know as soon as we have made all the fixes that you suggested.

Gianni

Hi Mike,

As you suggested, we followed your guidelines for our clock distribution system. We found a lot of critical issues in our system that we have tried to fix. So far, we have reached the following result for clock generation and continuous wave analysis:

image

The violet trace shows the clock generated by the CDCM (main source), and the yellow trace shows the clock at the input to the DW1000. We know exactly where the spurious 2.25 MHz comes from: the step-down converter, which we will change in the new HW version.

image

Here, the violet trace is the signal generated by the DW1000 in CW mode when the CDCM clock signal is applied directly to the DW1000, and the yellow trace is the signal generated when the clock is distributed by our clock distribution system (ADN4661/62). In both cases the swing is about 3V3. We did a lot of tests but did not find any evidence of improvement from reducing the swing from 3V3 to 1V8 as you suggested. What do you expect to see when reducing the clock swing amplitude?

The following picture shows a zoom of the previous picture around the carrier:

tempsnip

We notice this difference between the two signals. Do you think it is a problem?

Thanks a lot,

Gianni