CAN J1939 device stops responding after communication timeout


I'm a higher-layer guy; I don't know, and don't want to know, much about CAN/J1939 or even about particular ECUs. I just don't like the software solution, so I'd like to ask whether the customer's requirements are legitimate.

  1. If a particular ECU doesn't receive a CAN frame within a 300 ms timeout after power-up, it stops responding to any further frames and must be power cycled. This information comes from the customer's technicians; I just have to believe it.
  2. It is possible to power up the ECU after the CAN driver thread is ready, but that would require some extra wiring by end customers.
  3. The software solutions are all bad or worse: running FreeRTOS before important checks, moving the CAN driver into code shared with other products, or starting the CAN peripheral in the bootloader and leaving it running without software control until the driver starts.
  4. The sensitive part is that our specification contains no explicit requirement to start the CAN driver within such a short time. The customer says that it's part of the J1939 specification.

Can someone confirm or disprove that J1939 allows devices to unrecoverably stop receiving after 300 ms of silence, or requires devices to start transmitting within 300 ms after power-up? Or at least point me to the parts of the J1939 standard that might cover this?

Thank you


There are 2 best solutions below

Kenny_pce (BEST ANSWER)

My colleague answered that there's no such requirement, only a vague 300 ms timeout.

Lundin

If a particular ECU doesn't receive a CAN frame within a 300 ms timeout after power-up, it stops responding to any further frames and must be power cycled. This information comes from the customer's technicians; I just have to believe it.

This of course depends entirely on what task it is performing.

Generally, an ECU (as in an automotive computer in a car, truck, etc.) is never allowed to hang or latch up. The normal course of action is for the ECU to either reset itself or revert to a fail-safe mode.

But in the case of tractors and heavy machinery, the normal safe mode is "stop everything".

It is possible to power up the ECU after the CAN driver thread is ready, but that would require some extra wiring by end customers.

I don't know what this is supposed to mean. What is "extra wiring"? Something to keep other nodes in common mode while one is rebooting? Terminating resistors? Some dirty power-up delay circuit?

The software solutions are all bad or worse: running FreeRTOS before important checks, moving the CAN driver into code shared with other products, or starting the CAN peripheral in the bootloader and leaving it running without software control until the driver starts.

Generally speaking, it's customary to initialize critical hardware such as clocks, watchdogs, prescalers, and pull resistors very early on. Initializing other hardware peripherals may or may not be critical. It's customary to do this after the CRT (the C runtime startup code) has executed, at the beginning of main(), and the order of initialization usually matters a lot.

If you have a delay longer than 300ms from power-on reset to the start of main(), something is terribly wrong with the program.
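As a rough illustration of the ordering described above, here is a host-runnable sketch. All function and step names are hypothetical stubs, not any vendor's API, and placing the CAN bring-up ahead of the slow self-test is only one possible choice; it assumes the controller can safely run before those checks finish.

```c
#include <stddef.h>

/* Hypothetical startup steps; on real hardware each would program registers. */
typedef enum {
    STEP_CLOCKS, STEP_WATCHDOG, STEP_GPIO, STEP_CAN, STEP_SELF_TEST, STEP_APP
} step_t;

#define MAX_STEPS 8u

static step_t startup_log[MAX_STEPS]; /* order in which the steps actually ran */
static size_t startup_count;

static void record(step_t s)
{
    if (startup_count < MAX_STEPS) {
        startup_log[startup_count++] = s;
    }
}

/* Stubs standing in for real drivers (assumed names). */
static void clocks_init(void)   { record(STEP_CLOCKS); }
static void watchdog_init(void) { record(STEP_WATCHDOG); }
static void gpio_init(void)     { record(STEP_GPIO); }
static void can_init(void)      { record(STEP_CAN); }
static void self_test(void)     { record(STEP_SELF_TEST); }
static void app_start(void)     { record(STEP_APP); }

/* Intended to be the first thing called from main(), after the CRT has run. */
static void system_startup(void)
{
    startup_count = 0;
    clocks_init();    /* everything else depends on a stable clock */
    watchdog_init();  /* arm early so a hung startup still resets */
    gpio_init();      /* pull resistors and pin muxing, incl. CAN pins */
    can_init();       /* bus is alive before anything slow runs */
    self_test();      /* lengthy checks happen with CAN already up */
    app_start();      /* e.g. hand over to the RTOS scheduler */
}
```

On a target, each stub would be replaced by the real driver call; the log exists only so that the order can be checked on a PC.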

The sensitive part is, that we have no explicit demand to start CAN driver within such a short time in specification. Customer says, that it's part of J1939 specification.

I haven't worked much with J1939 and I don't remember what it says specifically, but 300ms is an eternity in a real-time system! It's not a "short time".

In general, correctly designed mission-/safety-critical CAN control systems in automotive/industrial settings work like this:

  • All data is sent repeatedly at fixed intervals, regardless of whether it has changed. Commonly once every 10 ms or once every 100 ms.
  • A node which has not received new data will use the previously received data for now.
  • There is a timeout, measured from when the last valid data was received, after which the receiving node must stop using the old data and revert to a fail-safe mode. This time is often chosen relative to how fast the controlled object can move. Timeouts of some multiple of 100 ms are common.
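The three points above could be sketched roughly as follows. This is a minimal illustration, not code from J1939 or any particular stack: the type and function names are made up, timestamps are assumed to come from a free-running millisecond tick, and the fail-safe value is whatever the application defines as safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sender side: decides when a cyclic frame is due (hypothetical type). */
typedef struct {
    uint32_t period_ms;   /* e.g. 10 or 100 */
    uint32_t last_tx_ms;  /* tick of the last transmission */
} cyclic_tx_t;

/* Returns true when the frame must be sent, changed or not. */
static bool cyclic_tx_due(cyclic_tx_t *t, uint32_t now_ms)
{
    /* Unsigned subtraction stays correct across tick wrap-around. */
    if ((uint32_t)(now_ms - t->last_tx_ms) >= t->period_ms) {
        t->last_tx_ms = now_ms;
        return true;
    }
    return false;
}

/* Receiver side: monitors one cyclically received signal. */
typedef struct {
    int32_t  last_value;      /* most recent valid value */
    uint32_t last_rx_ms;      /* tick at which it arrived */
    uint32_t timeout_ms;      /* maximum age before fail-safe */
    int32_t  failsafe_value;  /* value used once data is stale */
    bool     ever_received;   /* false until the first valid frame */
} rx_monitor_t;

static void rx_monitor_init(rx_monitor_t *m, uint32_t timeout_ms,
                            int32_t failsafe_value)
{
    assert(m != NULL && timeout_ms > 0u);
    m->last_value = failsafe_value;
    m->last_rx_ms = 0u;
    m->timeout_ms = timeout_ms;
    m->failsafe_value = failsafe_value;
    m->ever_received = false;
}

/* Call for every valid frame that carries this signal. */
static void rx_monitor_on_frame(rx_monitor_t *m, int32_t value, uint32_t now_ms)
{
    m->last_value = value;
    m->last_rx_ms = now_ms;
    m->ever_received = true;
}

/* Writes the value to use; returns false when in fail-safe mode. */
static bool rx_monitor_read(const rx_monitor_t *m, uint32_t now_ms, int32_t *out)
{
    bool fresh = m->ever_received &&
                 (uint32_t)(now_ms - m->last_rx_ms) <= m->timeout_ms;
    *out = fresh ? m->last_value : m->failsafe_value;
    return fresh;
}
```

The application would call `rx_monitor_read()` in its control loop and treat a `false` result as the trigger for its fail-safe behaviour. Unlike the ECU in the question, this monitor recovers as soon as a valid frame arrives again; whether recovery should be automatic is an application decision.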

I would say that your customer's requirements are very reasonable; they're nothing out of the ordinary.