Tuesday, September 13, 2011

The great XBee 57.6kbps mystery finally solved.

Ever since I started using the STM32F103 microcontroller, I've been hindered by the inability to wirelessly program the way I could with my old MSP430F2274 setup. Which sucks because that's the entire point of the wootstick (Wireless bOOTloading). The wootstick is the MCU board I've used on all my motor controllers:


The latest one, v2.0, has the STM32F103, an FTDI USB-to-UART converter, and an XBee Series 1 digital radio built on to a 3"x1" board with 2mm breakout headers. Ideally, it should:
  • Send and receive data over USB or over the XBee as a virtual COM port to a PC. Switching from USB to XBee wireless is seamless - no software modifications required on either side.
  • Be bootloader programmable through either the USB port or (with some configuration) the XBee radio. The latter option is the most useful to me, since it means the controller can be embedded in an enclosed system and still be programmed.
The USB interface works fine, but up to now, I've been able to get only half of the wireless functionality working. I can either wirelessly bootload or send and receive data over XBee, but not both. Changing from one state to the other requires reconfiguring the XBee, which is impossible if the controller is embedded in, idk, say, a scooter.

I decided I would solve this problem before moving forward on sensorless field-oriented control, or rather, that I would use it as an excuse for not making any progress for so long... I figured it would only take a day of messing around to figure out why exactly this was happening; I just hadn't actually sat down and done it up to now.

Then, as I was spending the last two weeks playing with the PCB quadrotor, I ran into almost exactly the same problem. I tried to set up the Arduino Mini on the quadrotor to be wirelessly programmable, so that I would not have to tether it to my computer every time I want to change the controller. But I found that if I set the radio to the baud rate required for the bootloader (57.6kbps, same as the STM32F103), I would get intermittent communication under radio control. Furthermore, the problem went away under USB control with the same baud rate, which suggested a problem specific to the XBee radios. And I've seen other online sources that suggest XBee trouble at 57.6kbps. So I decided to dig a little deeper.

Step one was to round up all my XBees, minus the two high-power transceivers that I have yet to get back from certain unscrupulous borrowers.
Step two was to find a real scope.
As much as I love the Tek 2445 analog scopes in MITERS and the Edgerton Center, I absolutely needed a digital scope to see what was going on at the bit level of the PC-to-XBee-to-microcontroller serial communication. So I went down to the mechatronics lab, which I now have access to since I am apparently the TA for a new course. (Have you heard?!) And there are 12 brand-new four-channel Agilent digital storage oscilloscopes with all kinds of fancy features. They can even save waveforms as .png images to a flash drive. (Something I did not find out in time for this post, as you'll soon see.)

Anyway, based on internet rumors, I suspected a problem with baud rate timing mismatch between the computer, XBee, and microcontroller. All are nominally set to 57.6kbps, but due to the limited clock frequency dividers on the XBee and the microcontroller, there is some error. First, the PC:


I zoomed way in on the first (start) bit of the transmission, something only possible with a nice set of working trigger and holdoff settings on the new scopes. The bit width is 17.34μs/b, the inverse of which is 57.67kbps. Within the error tolerance of the measurement, that's exactly correct. Next, the XBee:


A single bit on the XBee Tx pin was 17.00μs wide. This is 58.824kbps, which is what the internet suggests an XBee set to 57.6kbps actually runs at. The reason is that the XBee has a 16MHz crystal and an integer divider that produces the baud rate. If the integer divider is of the form 16MHz/(16*N), and N=17, then the exact baud rate is 58.824kbps. The next integer up, N=18, would yield 55.556kbps, which has a higher error than N=17. So the XBee uses N=17 and assumes the PC can deal with the 2.1% error in baud rate, which it can.
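The divider arithmetic is easy to check on a PC. This is a minimal sketch, assuming the XBee's rate really is 16MHz/(16*N) as the scope measurement suggests; the function names are made up for illustration and are not part of any XBee or Arduino API.

```cpp
#include <cassert>
#include <cmath>

// Baud rate produced by an integer divider of the form f_clk / (prescale * n).
double divider_baud(double f_clk_hz, int prescale, int n) {
    assert(prescale > 0 && n > 0);
    return f_clk_hz / (prescale * n);
}

// Signed error of an actual rate relative to the nominal rate, in percent.
double baud_error_pct(double actual, double nominal) {
    assert(nominal > 0.0);
    return 100.0 * (actual - nominal) / nominal;
}

// Integer N whose divider output is closest to the nominal rate.
int best_divider(double f_clk_hz, int prescale, double nominal) {
    // floor() gives the N whose rate is at or above nominal; N + 1 is just below.
    int n = static_cast<int>(std::floor(f_clk_hz / (prescale * nominal)));
    if (n < 1) return 1;
    double err_lo = std::fabs(divider_baud(f_clk_hz, prescale, n) - nominal);
    double err_hi = std::fabs(divider_baud(f_clk_hz, prescale, n + 1) - nominal);
    return (err_lo <= err_hi) ? n : n + 1;
}
```

With a 16MHz clock, a prescale of 16, and a nominal 57600, best_divider picks N=17, which runs about 2.1% fast; N=18 would run about 3.5% slow.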

Now here is where it gets interesting. The Arduino, when configured with Serial.begin(57600), showed up with a bit width of 17.50μs. (Sorry, forgot to take a picture of the scope...) This is 57.143kbps. It's consistent with an integer divider of the form 16MHz/(8*N) with N=35, and digging into the Arduino serial library source code suggests that this is in fact what happens. With an error of less than 1% from nominal, N=35 is preferred over N=34, which would make it exactly equal to the XBee rate.

But the error on the XBee is in the opposite direction from the error on the Arduino. So the total error of an XBee talking to an Arduino at 57.600kbps nominal is close to 3%. This is flirting with the maximum error tolerance of the USART, according to the ATmega328 reference manual, page 193. (Seriously, RTFM, you will not find this information on www.arduino.cc.) When you factor in the +/-1% error tolerance of the ceramic resonator used to generate the Arduino's 16MHz clock, the chance for framing errors increases even more.
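What matters to the receiver is the mismatch between the two ends, not each end's error against the nominal rate. A rough sketch of that comparison, starting from the measured bit widths (the function names are hypothetical):

```cpp
#include <cassert>

// Convert a measured bit width in microseconds to a bit rate in bits per second.
double baud_from_bit_width_us(double width_us) {
    assert(width_us > 0.0);
    return 1.0e6 / width_us;
}

// Mismatch between transmitter and receiver rates, in percent,
// taken relative to the receiver's own rate.
double mismatch_pct(double tx_baud, double rx_baud) {
    assert(rx_baud > 0.0);
    return 100.0 * (tx_baud - rx_baud) / rx_baud;
}
```

For the 17.00μs XBee bit and the 17.50μs Arduino bit, this comes out to about 2.9%, before any resonator tolerance is added on top.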

Why exactly it works in the bootloader but not during normal data transmission, I don't know. Maybe it has trouble picking up bytes that are adjacent to each other, or maybe the bootloader just has better software error checking to verify the program data is correct. In any case, I took the safe route and forced the Arduino baud rate to match the XBee baud rate exactly:

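  Serial.begin(57600);       // let the library set up the USART first
  UCSR0A |= (1 << U2X0);     // double-speed mode: 8 samples per bit
  UBRR0L = 33;               // 16MHz / [8 * (33 + 1)] = 58.824kbps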

Which, now that I look at it, should be the same as:

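  Serial.begin(57600);       // let the library set up the USART first
  UCSR0A &= ~(1 << U2X0);    // normal-speed mode: 16 samples per bit
  UBRR0L = 16;               // 16MHz / [16 * (16 + 1)] = 58.824kbps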

It's 16MHz / [16 * (16 + 1)] instead of 16MHz / [8 * (33 + 1)]. (RTFM on page 179.) The second version is preferable because normal-speed mode (U2X0 cleared) has a wider error band than double-speed mode. Either way, it forces the Arduino to have a bit rate of 58.824kbps, matching the XBee. This made the RC control much happier, and the bootloader still works. (It should be independent of the user code.)
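Both register settings can be checked against the baud rate formulas in the ATmega328 datasheet: f_osc / [16 * (UBRR + 1)] in normal mode and f_osc / [8 * (UBRR + 1)] with U2X0 set. A small host-side sketch (the function name is mine, and the clock frequency is whatever you pass in):

```cpp
#include <cassert>
#include <cstdint>

// Baud rate of an ATmega328-style asynchronous USART for a given UBRR value.
// double_speed corresponds to the U2X0 bit: 8 samples per bit instead of 16.
double avr_baud(uint32_t f_osc_hz, uint16_t ubrr, bool double_speed) {
    assert(f_osc_hz > 0);
    const uint32_t samples_per_bit = double_speed ? 8u : 16u;
    return static_cast<double>(f_osc_hz) / (samples_per_bit * (ubrr + 1u));
}
```

At 16MHz, avr_baud(16000000, 33, true) and avr_baud(16000000, 16, false) give the same rate of about 58824bps, while UBRR = 34 in double-speed mode (N=35) gives about 57143bps.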

But that was all for the Arduino. To test the clock speed deviation theory on the STM32F103, I took out DirectDrive and brushed the dust off of it:

Wasn't I going to do something with this?
I checked the USART documentation for the STM32F103 against my code only to find that the USART clock divider has significantly more resolution. Instead of an integer clock speed divider of the form 16MHz/(16*N) it is essentially 16MHz/N, and I have N=278. (For some reason, it's not quite that simple, RTFM page 768.) But the result is that it should be (and is) running at 57.55kbps. The error between this and the XBee baud rate should be within tolerance of the USART.
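The "not quite that simple" part is that the STM32 baud rate register (BRR) holds a fixed-point divider, USARTDIV, as a 12-bit mantissa and a 4-bit fraction, with baud = f_ck / (16 * USARTDIV). Since USARTDIV is BRR/16, this collapses to f_ck / BRR. A sketch of the arithmetic, assuming the 16MHz USART clock implied above (the helper names are hypothetical, not from ST's headers):

```cpp
#include <cassert>
#include <cstdint>

struct UsartDiv {
    uint16_t mantissa;  // integer part of USARTDIV (BRR bits 15:4)
    uint8_t fraction;   // sixteenths of USARTDIV (BRR bits 3:0)
};

// Split a raw BRR value into its mantissa and fraction fields.
UsartDiv split_brr(uint16_t brr) {
    return UsartDiv{static_cast<uint16_t>(brr >> 4),
                    static_cast<uint8_t>(brr & 0xF)};
}

// BRR value closest to f_ck / baud (rounded to nearest).
uint16_t brr_for_baud(uint32_t f_ck_hz, uint32_t baud) {
    assert(baud > 0);
    return static_cast<uint16_t>((f_ck_hz + baud / 2) / baud);
}

// Actual baud rate for a raw BRR value: f_ck / (16 * USARTDIV) = f_ck / BRR.
double stm32_baud(uint32_t f_ck_hz, uint16_t brr) {
    assert(brr > 0);
    return static_cast<double>(f_ck_hz) / brr;
}
```

For 16MHz and 57600, this gives BRR = 278 (mantissa 17, fraction 6, so USARTDIV = 17.375) and an actual rate of about 57.55kbps, roughly 2.2% below the XBee's 58.824kbps.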

Additionally, sending and receiving data works just fine on the STM32F103 at 57.6kbps with 8-N-1 configuration. Only when set to 8-E-1 (even parity bit, required by the bootloader) would the wireless data transmission stop working. But the bootloader worked fine on 8-E-1, with the XBees properly configured for even parity. So now I had at least narrowed it down to a problem involving the parity bit configuration.

Interestingly, I had unknowingly managed to avoid the problem over USB, set at 57.6kbps 8-N-1, because the FTDI chip would automatically reconfigure to 8-E-1 when using the bootloader, then switch back to 8-N-1 when running user code. Only because the XBee radios have a fixed parity setting did the problem seem to be a wireless-only issue.

I think the theme of this post is Read The Fucking Manual because after only about 10 minutes of doing just that, I found the answer:

It's right there on page 791.
Essentially, by setting the PCE bit, I thought I was appending a single, even parity bit to the end of the data frame, making it 8-E-1. Instead, it was replacing the last data bit with the parity bit, making it 7-E-1, and screwing up the entire data protocol. Only by changing the word length to 9-bit by setting the M bit could I get 8-E-1. I should point out that no other microcontroller that I've used does it this way. Setting the parity enable bit appends a parity bit to the data frame on the ATmega328 and the MSP430F2274.
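Put another way, on this USART the word length selected by M includes the parity bit. A minimal sketch of the bookkeeping, using the USART_CR1 bit positions from the STM32F103 reference manual (M is bit 12, PCE is bit 10, PS is bit 9); the constant and function names are made up for the example rather than taken from ST's headers:

```cpp
#include <cassert>
#include <cstdint>

// USART_CR1 bit masks (positions per the STM32F103 reference manual).
constexpr uint32_t CR1_M   = 1u << 12;  // word length: 0 = 8 bits, 1 = 9 bits
constexpr uint32_t CR1_PCE = 1u << 10;  // parity control enable
constexpr uint32_t CR1_PS  = 1u << 9;   // parity selection: 0 = even, 1 = odd

// CR1 bits for 8 data bits plus even parity (8-E-1):
// 9-bit word, parity enabled, PS left at 0 for even parity.
constexpr uint32_t CR1_8E1 = CR1_M | CR1_PCE;

// Number of actual data bits per frame for a given CR1 value.
// The parity bit, when enabled, takes the place of the top bit of the word.
int data_bits(uint32_t cr1) {
    int word_length = (cr1 & CR1_M) ? 9 : 8;
    return (cr1 & CR1_PCE) ? word_length - 1 : word_length;
}

// Even parity bit for one byte: 1 if the byte has an odd number of 1 bits,
// so that the total number of 1 bits in data plus parity is even.
int even_parity_bit(uint8_t byte) {
    int ones = 0;
    for (int i = 0; i < 8; ++i) ones += (byte >> i) & 1;
    return ones & 1;
}
```

Here data_bits(CR1_PCE) is 7, which is the accidental 7-E-1 frame, and data_bits(CR1_8E1) is 8.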

Well, with that sorted out, everything works properly now. The flash bootloader works over XBee and I can send and receive data at 57.6kbps, 8-E-1. Clearly the next thing to do is go back to working on sensorless control, or rather, make MIDI scooter play Railgun.

6 comments:

  1. Hi! Found your page looking for STM32 CAN baud. Second the motion - RTFM over and over, because there are more options and remaps and gotchas than you can shake a stick at. Makes the part amazingly useful, but complex. One small quibble re: your conclusion: Parity has *always* been defined as being within the data field, on all the comm systems I've used since I started in 1972. If you're expecting an extra bit after an 8-bit byte, then you'd better specify a 9-bit byte. The implicit data rate degradation of parity is why it became worthwhile to invent checksums and CRCs over the whole transmission, trading CPU time for message bits.

  2. Good to know. I only have a limited data set of modern microcontroller UARTs and every one I've seen up until now has had a parity enable that appends a bit to the existing data frame length. This is also true of the serial port object in all the programming languages I've used for PCs. But I can see the reason for defining it the other way, to preserve data rate.

  3. Hi, I have a problem with baud rates >9600.
    I have a setup with 2 Arduinos (SparkFun Pro Micro) and 2 XBee Series 2 radios.
    There is no communication at baud rates >9600.
    The SparkFun Pro Micro uses the ATmega32U4, so I can't use the registers above.
    How can I solve this issue?
    Any help appreciated.
    Thanks in advance

    1. I haven't used the ATmega32U4 but I think it should have similar registers. If it's using Arduino syntax you could also try manipulating the argument of Serial.begin() up or down a bit to force it to pick a different register value. For example, try Serial.begin(58825) to match the XBee.

      Make sure the XBees are not defaulted back to 9600bps as well. I've accidentally done that many times!

  4. I have an XBee AT Coordinator plugged into an Adafruit XBee Adapter board. When my program reads data coming from an Arduino, there is no issue, no hiccups. But when my Java program collects data from the XBee Adapter board, for some reason the XBee's buffer tends to freeze. I never have this problem with the Arduino. I am wondering: could what you've posted be the issue?

    1. Could be, but I don't think I would rule out other problems either. You can test it pretty easily by setting all baud rates to 58.8kbps instead of 57.6kbps. The error is sometimes okay and sometimes borderline, just depends on the exact configuration. But at least if you set everything to 58.8kbps and still have problems you can rule that out and move on to other troubleshooting steps.
