yyrkoon

SocketCAN help needed


I was wondering if anyone has experience with socketcan under Linux, and might be able to share some insight on the matter.

 

I have searched the web, and have indeed found a lot of information. The protocol we're dealing with is a proprietary, undocumented protocol based on J1939 and NMEA 2000 fast packets ( a mixed protocol, it seems ). We're fairly certain this is correct, as after a few weeks we've identified most / all PGNs, and can reassemble the data fields by hand.

 

Where my problem lies is in how to implement code to piece this all together. I've read most of the kernel's Documentation/networking/can.txt file. I have also found a few simple code examples, and have looked through candump.c from the can-utils git repo ( which is very terse reading ). Nothing so far seems to demonstrate, or document, how to implement filters so we can read frames for specific PGNs only. Perhaps I missed or skipped over something ?

 

Anyway, I was wondering if anyone else may have already gone through this learning experience and be able to share some insight on the matter. Books would be good, and links too of course. But I tend to learn best from well structured ( and commented ) code. Any information would of course be welcome.

 

Thanks in advance.

 

[EDIT]

 

I should probably also add that I am fairly newb to Linux programming, and had zero CAN experience prior to this endeavor. So, I do not mind learning, but am starting to feel "stuck"
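
On the filter question: SocketCAN raw sockets accept a list of struct can_filter entries through setsockopt() with the CAN_RAW_FILTER option, and a received frame passes when (received can_id & can_mask) == (can_id & can_mask). The following is a minimal sketch, assuming the bus follows the standard J1939 identifier layout (3 priority bits, 18 PGN bits, 8 source address bits), which this thread suggests but does not confirm. The helper names pgn_filter and set_pgn_filters, and the limit of 16 filters, are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>

/* Build a filter that matches one PGN from any source address and with
 * any priority. Assumes the J1939 layout of the 29-bit identifier. */
struct can_filter pgn_filter(uint32_t pgn)
{
    struct can_filter f;
    uint32_t pf = (pgn >> 8) & 0xFF;   /* PDU format byte */
    /* For PF < 240 the low PGN byte carries a destination address,
     * so it is left out of the comparison. */
    uint32_t pgn_mask = (pf < 240) ? 0x3FF00u : 0x3FFFFu;

    f.can_id   = ((pgn & pgn_mask) << 8) | CAN_EFF_FLAG;
    f.can_mask = (pgn_mask << 8) | CAN_EFF_FLAG;  /* extended frames only */
    return f;
}

/* Install one filter per PGN on an already bound CAN_RAW socket.
 * Returns 0 on success, -1 on error. */
int set_pgn_filters(int sock, const uint32_t *pgns, size_t count)
{
    struct can_filter filters[16];     /* arbitrary limit for this sketch */
    size_t i;

    if (count > sizeof(filters) / sizeof(filters[0]))
        return -1;
    for (i = 0; i < count; i++)
        filters[i] = pgn_filter(pgns[i]);

    return setsockopt(sock, SOL_CAN_RAW, CAN_RAW_FILTER,
                      filters, (socklen_t)(count * sizeof(filters[0])));
}
```

It would be called once after bind(), for example with an array holding 126979 and 127173. Frames that match none of the installed filters are then dropped by the kernel before they reach read().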


OK, so I've made *some* progress, but have run into what for now seems to be a minor snag. So far, I've managed to set up a VirtualBox virtual machine with Debian wheezy 7.0 i386 in order to create an isolated CAN bus dev machine. After installing all the prerequisite packages for development, I pulled in can-utils, built, and installed it. Creating a logfile from the actual beaglebone was easy enough, as was setting up a vcan interface and using canplayer to "play back" the logfile. candump then confirms that it works. So . . .

 

candump:

  vcan0  19F00302   [8]  C2 00 B8 0B 00 00 00 00
  vcan0  19F00302   [8]  C3 00 00 00 00 02 01 00
  vcan0  19F00302   [8]  C4 00 00 00 00 00 00 00
  vcan0  19F00302   [8]  C5 00 00 B8 0B 00 00 00
  vcan0  19F00302   [8]  C6 00 00 00 00 00 03 01
  vcan0  19F00302   [8]  C7 00 00 00 00 00 00 00
  vcan0  19F00302   [8]  C8 00 00 00 B8 0B 00 00
  vcan0  19F00302   [8]  C9 00 00 00 00 00 00 FF
  vcan0  19F0C503   [8]  E0 15 03 15 20 AD 00 00
  vcan0  19F0C503   [8]  E1 7C 01 00 00 10 00 00
  vcan0  19F0C503   [8]  E2 00 FF FF FF FF FC 00
  vcan0  19F0C503   [8]  E3 00 FF FF FF FF FF FF
  vcan0  19F0C503   [8]  00 15 03 03 B2 61 00 00
  vcan0  19F0C503   [8]  01 10 FF FF FF 06 00 00
  vcan0  19F0C503   [8]  02 00 FF FF FF FF FC 00
  vcan0  19F0C503   [8]  03 00 FF FF FF FF FF FF
  vcan0  19F0C502   [8]  C0 15 03 03 12 61 00 00

 My cantest app:

  Recv :  99F00302       C2 00 B8 0B 00 00 00 00
  Recv :  99F00302       C3 00 00 00 00 02 01 00
  Recv :  99F00302       C4 00 00 00 00 00 00 00
  Recv :  99F00302       C5 00 00 B8 0B 00 00 00
  Recv :  99F00302       C6 00 00 00 00 00 03 01
  Recv :  99F00302       C7 00 00 00 00 00 00 00
  Recv :  99F00302       C8 00 00 00 B8 0B 00 00
  Recv :  99F00302       C9 00 00 00 00 00 00 FF
  Recv :  99F0C503       E0 15 03 15 20 AD 00 00
  Recv :  99F0C503       E1 7C 01 00 00 10 00 00
  Recv :  99F0C503       E2 00 FF FF FF FF FC 00
  Recv :  99F0C503       E3 00 FF FF FF FF FF FF
  Recv :  99F0C503       00 15 03 03 B2 61 00 00
  Recv :  99F0C503       01 10 FF FF FF 06 00 00
  Recv :  99F0C503       02 00 FF FF FF FF FC 00
  Recv :  99F0C503       03 00 FF FF FF FF FF FF
  Recv :  99F0C502       C0 15 03 03 12 61 00 00

In most ways ( that are important ) the output is identical. However, the frame.can_id field is different. Just slightly, and only in the first byte. In our case this frame.can_id field carries an NMEA 2000 / NMEA 2000 fast packet PGN, and when read by candump its leading hex digit is always 0 or 1. Going from memory, the first byte is actually a bitfield, and that leading digit can only be 0 or 1. I'll have to go back and revisit the documentation I have. But I am not sure how my program differs from candump in this context. Perhaps it's a default SocketCAN filter ? People say that candump is a good example of how to implement SocketCAN RX in C. But I beg to differ. The code is absolutely horrendous when it comes to readability.

 

[EDIT]

 

Ah, my buddy tells me the identifier is a 29 bit field, and that the top 3 bits of the 32 bit can_id need to be masked off. Still, I've yet to figure out how candump is doing this . . .


The code for those who may be interested in a very simple socketCAN RX example:

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>
#include <net/if.h>
#include <sys/ioctl.h>

static int sock;

static int can_init(char *ifname){

    struct sockaddr_can addr;
    struct ifreq ifr;

    sock = socket(PF_CAN, SOCK_RAW, CAN_RAW);
    if (sock < 0){
        perror("Unable to create socket.");
        exit(1);
    }

    /* Zero the request, then copy at most IFNAMSIZ - 1 bytes so the
       interface name always stays NUL terminated. */
    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);

    if (ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
        perror("SIOCGIFINDEX");
        exit(1);
    }

    memset(&addr, 0, sizeof(addr));
    addr.can_family = AF_CAN;
    addr.can_ifindex = ifr.ifr_ifindex;
    if(bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0){
        perror("Unable to bind to socket");
        exit(1);
    }

    return 0;
}

static void read_can_frame(void){

    struct can_frame frame;
    int nbytes;
    int i;

    nbytes = read(sock, &frame, sizeof(struct can_frame));

    if (nbytes < 0) {
            perror("can raw socket read");
            exit(1);
    }
    /* Prints the raw can_id, flag bits (EFF/RTR/ERR) included. */
    printf("  Recv :  %03X      ", frame.can_id);

    for (i = 0; i < frame.can_dlc; i++){

        printf(" %02X", frame.data[i]);
    }

    printf("\n");
}

int main(int argc, char *argv[]) {

    can_init("vcan0");

    while(1){
        read_can_frame();
    }

    return 0;
}


Bit 31 of can_frame.can_id indicates if an 11 or 29 bit address is used. It is probably being masked by the candump program because it is not actually part of the address.

 

It would be nice if the candump program used 3 or 8 hex digits to indicate the address size in use.

 

Using a stream (socket) for a block based protocol seems like a bad idea. Much that can go wrong.


Looks like this is the code that does the printing in your candump example...

 

void sprint_long_canframe(char *buf , struct can_frame *cf, int view) {
	/* documentation see lib.h */

	int i, j, dlen, offset;
	int dlc = (cf->can_dlc > 8)? 8 : cf->can_dlc;

	if (cf->can_id & CAN_ERR_FLAG) {
		sprintf(buf, "%8X  ", cf->can_id & (CAN_ERR_MASK|CAN_ERR_FLAG));
		offset = 10;
	} else if (cf->can_id & CAN_EFF_FLAG) {
		sprintf(buf, "%8X  ", cf->can_id & CAN_EFF_MASK);
		offset = 10;
	} else {
		sprintf(buf, "%3X  ", cf->can_id & CAN_SFF_MASK);
		offset = 5;
	}

	sprintf(buf+offset, "[%d]", dlc);
	offset += 3;

.....
So it does mask off the flag bits, and also prints 3 or 8 hex digits as needed.
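
Applied to the cantest program, the same masking might look like the sketch below. CAN_EFF_FLAG, CAN_EFF_MASK and CAN_SFF_MASK come from linux/can.h; the function name format_can_id is made up for illustration. Unlike the candump excerpt it zero pads, so an ID such as 09F01D01 keeps its leading zero.

```c
#include <stdio.h>
#include <linux/can.h>

/* Write the identifier without its flag bits into buf: 8 hex digits for
 * extended (29-bit) frames, 3 for standard (11-bit) frames.
 * Returns the number of characters written, as snprintf does. */
int format_can_id(char *buf, size_t len, canid_t can_id)
{
    if (can_id & CAN_EFF_FLAG)
        return snprintf(buf, len, "%08X", can_id & CAN_EFF_MASK);
    return snprintf(buf, len, "%03X", can_id & CAN_SFF_MASK);
}
```

With this, the raw value 0x99F00302 from the cantest output formats as 19F00302, matching the candump column.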

Using a stream (socket) for a block based protocol seems like a bad idea. Much that can go wrong.

 

Assuming I understand where you're coming from on this comment . . .

 

You can thank the folks at Volkswagen for that.  Initially, SocketCAN was their "baby", but back then it was called "LLCF" ( Low Level CAN Framework ) or some such acronym. All according to what I've read over the past month or so anyway.

 

What else I've read has made complete sense as to why it was initially implemented as a network model driver versus the old way of using a character driver model. Three of the key points that stuck in my mind were performance, ease of creating a driver for new hardware, and, on the user space end, the ability to have multiple readers / writers access the same device at one time.

 

Anyway, I'm not really arguing against your point. Which is certainly valid. What I can tell you however is that so far what I've seen seems to work really well. No out of order frames / data, and the like that I've noticed yet. Supposedly the driver deals with all that, but how exactly . . . I'm not sure that I care *YET*.
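
On the stream concern: a CAN_RAW socket does not behave like a byte stream. According to the kernel's SocketCAN documentation, each read() returns one complete struct can_frame, and the documentation's own example checks for a short read as a precaution. A minimal sketch of that check follows; the function name read_one_frame is made up for illustration.

```c
#include <errno.h>
#include <unistd.h>
#include <linux/can.h>

/* Read exactly one frame from fd.
 * Returns 0 on success, -1 on a read error or an incomplete frame. */
int read_one_frame(int fd, struct can_frame *frame)
{
    ssize_t nbytes;

    do {
        nbytes = read(fd, frame, sizeof(*frame));
    } while (nbytes < 0 && errno == EINTR);   /* retry if interrupted */

    if (nbytes != (ssize_t)sizeof(*frame))
        return -1;                            /* error or short read */
    return 0;
}
```

Because whole frames are the unit of transfer, no framing or resynchronisation logic is needed at this level. Reassembly only becomes a concern for payloads that the higher level protocol spreads across several frames.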

 

External to the beaglebone + logic supply serial / CAN cape, we're connected to an AC inverter ( Schneider / Xantrex ). The idea here is to create a "black box" to read data off the inverter, MPPT charge controller, and comm box, all connected to the same CAN bus. Thus being able to graph, plot, or otherwise display the amp hours / watt hours that come into, and go out of, the system. Schneider sells such a device, and we do own one . . . But at ~$400-$500 a pop . . . it is a bit ridiculous.

 

Anyway, their protocol ( Xanbus ), from what I've read, is based off of ModBus, which I understand to be an open source initiative. This does not seem correct, as NMEA 2000 is something completely different as I understand it ( still learning . . . ). But in the case that their hardware *is* using a protocol that is based off of open source software . . . I think it is only fitting that *someone* reverse "their" technology, and give it back to the open source community. At this point my buddy has all but completely reversed the high level protocol. At least enough to recognize 90% of the PGNs, and by extension giving us the ability to decode the data. Still plenty left though for me to wrap my brain around, but I love learning.

 

Speaking of learning. Thanks much for pointing me to that code @oPossum. I was actually reading that bit last night, but was feeling tired, and went to bed. Actually, I was glancing through the whole source file, but that bit stuck out in my head as relevant.


By the way both:

 

(CAN_ERR_MASK|CAN_ERR_FLAG)

 

and

 

CAN_EFF_MASK

 

Work in my case using the %8X format. Both chop off the leading zero ( if one exists ) for the PGN, but I do not think that matters for our usage. I'll have to read through can.h though to make sure I completely understand what each does. Obviously the first is for error message frames and the second is for EFF message frames. More reading to do . . .

 

Thanks again oPossum.

 


More progress . . . a few values I'm not able to recognize / decode. Mostly the 06xxxx PGNs. My guess is that perhaps they're error frames, but I have nothing to "prove" that yet. Also it is possible my bit-manipulation is faulty . . .

 

code fragment:

typedef union{
    struct {
        unsigned int byte1 : 8;
        unsigned int byte2 : 8;
        unsigned int byte3 : 8;
        unsigned int byte4 : 8;
    }bytes;
    unsigned int field;
}pgn_field_t;

static unsigned int create_pgn(unsigned int value){
    pgn_field_t pgn_field;

    pgn_field.field = value;

    unsigned int pgn = ((pgn_field.bytes.byte4 >> 4) << 16) +
                        (pgn_field.bytes.byte3 << 8) +
                        (pgn_field.bytes.byte2);

    return pgn;
}

static unsigned int create_srcid(unsigned int value){
    pgn_field_t pgn_field;

    pgn_field.field = value;

   return pgn_field.bytes.byte1;
}

output:

 

can_id - dec_pgn - hex_pgn - src_id - Description

 19F0C402 127172 1F0C4 02
 19F00F03 126991 1F00F 03 Unit Status
 19F0C402 127172 1F0C4 02
 19F0C402 127172 1F0C4 02
 19F0C402 127172 1F0C4 02
 19F0C402 127172 1F0C4 02
 19F0C402 127172 1F0C4 02
 19F0C903 127177 1F0C9 03 MPPT Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00301 126979 1F003 01 AC Input Status
 19F00F02 126991 1F00F 02 Unit Status
 09F01D01 061469 0F01D 01
 19F0C603 127174 1F0C6 03 2nd PWR SUP Status
 19F0C603 127174 1F0C6 03 2nd PWR SUP Status
 19F0C603 127174 1F0C6 03 2nd PWR SUP Status
 19F0C501 127173 1F0C5 01 DC Source Status
 19F0C501 127173 1F0C5 01 DC Source Status
 19F0C501 127173 1F0C5 01 DC Source Status
 19F0C501 127173 1F0C5 01 DC Source Status
 19F00E02 126990 1F00E 02 Battery Status
 19F00E02 126990 1F00E 02 Battery Status
 19F00E02 126990 1F00E 02 Battery Status
 19F0BE03 127166 1F0BE 03 Charger Stats-Batt
 19F0BE03 127166 1F0BE 03 Charger Stats-Batt
 19F0BE03 127166 1F0BE 03 Charger Stats-Batt
 19F0BE03 127166 1F0BE 03 Charger Stats-Batt
 19F0BE03 127166 1F0BE 03 Charger Stats-Batt
 19F0BE03 127166 1F0BE 03 Charger Stats-Batt
 09F01D01 061469 0F01D 01
 18EA0100 125441 1EA01 00
 19F00F01 126991 1F00F 01 Unit Status
 19220001 074240 12200 01
 19F0BD02 127165 1F0BD 02 Inverter Status
 19F0C403 127172 1F0C4 03

 Decimal PGN 127172 we have decoded / have the ability to decode. Just have not got around to it yet.


    unsigned int pgn = ((pgn_field.bytes.byte4 >> 4) << 16) +
                        (pgn_field.bytes.byte3 << 8) +
                        (pgn_field.bytes.byte2);

should probably be

 

    unsigned int pgn = ((pgn_field.bytes.byte4 & 0x1F) << 16) |
                        (pgn_field.bytes.byte3 << 8) |
                        (pgn_field.bytes.byte2);
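
For comparison, here is a sketch of how the identifier splits under the standard J1939 layout, assuming this bus follows it (the thread suggests so but does not confirm it). From the top, the 29 bits are: priority (3), reserved / extended data page (1), data page (1), PDU format PF (8), PDU specific PS (8), source address (8). The PGN is the 18 bits below the priority, and when PF is below 240 the PS byte is a destination address rather than part of the PGN. Under that layout, `>> 4` on the top byte only gives the right data page bit for IDs starting with 0x19, while `& 0x1F` keeps the priority bits in the result. The struct and function names are made up for illustration.

```c
#include <stdint.h>

struct j1939_id {
    uint8_t  priority;  /* bits 28..26 */
    uint32_t pgn;       /* bits 25..8, low byte cleared when PF < 240 */
    uint8_t  dest;      /* destination when PF < 240, else 0xFF (global) */
    uint8_t  src;       /* bits 7..0 */
};

/* Split a CAN identifier using shifts and masks, so host byte order does
 * not matter. The three flag bits above bit 28 are ignored. */
struct j1939_id j1939_decode(uint32_t can_id)
{
    struct j1939_id out;
    uint32_t pf = (can_id >> 16) & 0xFF;    /* PDU format */
    uint32_t ps = (can_id >> 8) & 0xFF;     /* PDU specific */

    out.priority = (uint8_t)((can_id >> 26) & 0x07);
    out.src      = (uint8_t)(can_id & 0xFF);
    out.pgn      = (can_id >> 8) & 0x3FFFF; /* EDP, DP, PF, PS */
    out.dest     = 0xFF;

    if (pf < 240) {                         /* PDU1: PS is an address */
        out.dest = (uint8_t)ps;
        out.pgn &= 0x3FF00;
    }
    return out;
}
```

Under these assumptions 19F0C402 still decodes to PGN 127172 with priority 6, but 09F01D01 becomes PGN 127005 (0x1F01D) with priority 2 instead of 061469, and 18EA0100 becomes PGN 59904 (0xEA00, the J1939 request PGN) addressed to destination 01 from source 00.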


Thanks oPossum, however I do not think that will work. What I'm trying to do with the left most byte is shift out the right most nibble. As I believe it is not used to construct the PGN part of the field. But the left most nibble *is* needed.

 

I'd post a link to a jpg I have of our J1939 "decoder ring", but the forums seem broken in this regard. Maybe a direct link will work ?

 

https://plus.google.com/photos/106867156582775247949/albums/6154008874967495009?authkey=CPWWjpHeh9y8iAE

 

[EDIT]

 

@oPossum

 

What I was thinking however is that perhaps these and a few other frames might require a different method to break down the PGN, assuming these are even "real" PGN frames . . .

 

There are a few frames I do not understand yet. Ones that start with 18xxxx, 09xxxx, and a couple others I can not think of offhand.

 

One thing I am not doing though is filtering out SFF frames, as we do not seem to be getting any 3 digit ( 11 bit ) frame IDs - ever. I doubt this has anything to do with it, but maybe I'm wrong.


These are two such frames . . .

 18EEFF01 126719 1EEFF 01
 09F01D01 061469 0F01D 01

And here is the PGN enum I have built so far, with about 4 similar decimal PGN values . . .  Just another thought: these could be higher priority frames, and hence are somehow displayed differently. Perhaps my bit manipulation is "good enough" for lower priority frames, but not good enough( maybe dropping a needed bit? ) for the higher priority frames? IDK . . .

enum pgn_ids {
    DATE_AND_TIME_STATUS            = 129033,
    ADDRESS_CLAIMED                 = 61183,
    ISO_ADDRESS_CLAIM                = 60928,
    DC_SOURCE_CONFIG_OVER_VOLT        = 66048,
    DC_SOURCE_CONFIG_UNDER_VOLT        = 65792,
    SOFTWARE_VERSION_STATUS            = 129038,
    CHARGER_INTERNAL_TEMP            = 127010,
    AGS_STATUS                        = 126993,
    INVERTER_INTERNAL_TEMP            = 127007,
    MPPT_STATUS                        = 127177,
    SECONDARY_POWER_SUPPLY_STATUS     = 127174,
    UNIT_STATUS                     = 126991,
    AC_TRANSFER_SWITCH_STATUS        = 127167,
    BATTERY_STATUS                    = 126990,
    CHARGER_STATUS                    = 126990,
    CHARGER_STATISTIC_BATTERY        = 127166,
    STRING_CONFIG                    = 122112,
    USER_INTERFACE_STATUS            = 127004,
    INVERTER_STATUS                  = 127165,
    AC_OUTPUT_STATUS                 = 126982,
    AC_INPUT_STATUS                  = 126979,
    DC_SOURCE_STATUS                 = 127173
};


Perhaps my bit manipulation is "good enough" for lower priority frames, but not good enough( maybe dropping a needed bit? ) for the higher priority frames? IDK . . .

 

 

 

No, this can not be right. At least for "normal" frames. The PGN values are exactly what my buddy got decoding by hand. The frames are just somehow different. Both of those listed above, I believe, also have short data fields. 4 bytes each / 32 bits, and it did occur to me that one is an EFF frame . . .


I could use some advice on how to deal with the data coming from multiple frames for some PGNs.

 

First, obviously there are different frame types. Some frames have only 4 or 6 bytes per payload. Other frames, which are NMEA 2000 fast packets, can carry up to 223 bytes of data over multiple frames. In our case these NMEA 2000 fast packet messages do not seem to be larger than 88 total bytes, or 11 frames. These look like this when dumped by candump:

  can1  19F00601   [8]  40 49 03 FC 33 01 01 00
  can1  19F00601   [8]  41 00 00 00 00 00 00 00
  can1  19F00601   [8]  42 00 00 00 FF FF 00 00
  can1  19F00601   [8]  43 00 00 00 00 00 00 00
  can1  19F00601   [8]  44 02 00 00 00 00 00 00
  can1  19F00601   [8]  45 00 00 00 00 00 FF FF
  can1  19F00601   [8]  46 00 00 00 00 00 00 00
  can1  19F00601   [8]  47 00 00 03 36 9E 03 00
  can1  19F00601   [8]  48 3E 08 00 00 0C 6F 17
  can1  19F00601   [8]  49 FF FF F4 01 00 00 B2
  can1  19F00601   [8]  4A 01 00 00 0A FF FF FF

In this case, this is a complete AC output frame "set". The first byte of each frame is a sequential number we've been referring to as "line numbers", and is not part of the data field. The second byte of the first frame is the byte length of the frame set. The rest is all frame set data pertaining to the given PGN type, in this case "AC output". 0x00 most likely marks features not currently active, and 0xFF marks values that are either reserved, or not used.

 

So, what I'm having a little trouble trying to figure out is how I should piece this information together. The first byte of each frame is easy to deal with. I figured I would most probably want to store this value *somewhere* to make sure I do not get out of order frames mixed together.

 

Secondly, I figure the rest of each frame could be pieced together fairly easily. However, what I am having a hard time imagining is what data type to use, and where to put it. Also, to decrease redundancy I feel that the data type should be resizable. e.g. malloc / realloc etc. No matter how I look at this problem though, the checks needed for each case seem to be overly complex.

 

So what do you all think ? Fixed size data type for each kind of data payload, and does it really matter if I'm using up stack space for the data, or not? My program is pretty small with at most a call stack of 4 functions deep. My instincts tell me to use dynamic memory allocation, but at the same time it seems that putting data on the stack would be simpler . . . at least a little less to worry about.


Not sure what your question is. Can you decode/parse in real time, as in look for the start of frame? The next byte will tell you the number of bytes you need to pull in. Once you have all the bytes, parse the frame for the data you need. You may have to have two buffers - one for incoming data and one for parsing data. The incoming data buffer feeds into the parsing buffer.
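
The approach described above can be sketched as a small reassembler with a fixed size buffer, which sidesteps the malloc / realloc question. It assumes the usual NMEA 2000 fast packet layout, which the dumps in this thread appear to match: in byte 0 the top 3 bits are a sequence ID and the low 5 bits a frame counter, frame 0 carries the total length in byte 1 plus up to 6 data bytes, and later frames carry up to 7 data bytes. The names fp_state and fp_feed are made up for illustration, and one state would be needed per PGN and source address so interleaved messages do not mix.

```c
#include <stdint.h>
#include <string.h>

#define FP_MAX_LEN 223  /* 6 + 31 * 7: the most a 5-bit frame counter allows */

struct fp_state {
    uint8_t buf[FP_MAX_LEN];
    uint8_t total;      /* length announced in frame 0 */
    uint8_t have;       /* bytes collected so far */
    uint8_t seq;        /* sequence ID of the message in progress */
    uint8_t next;       /* frame counter expected next */
    int     active;     /* nonzero while a message is being collected */
};

/* Feed the payload of one CAN frame. Returns 1 when st->buf holds a
 * complete message of st->total bytes, 0 when more frames are needed,
 * and -1 when the frame did not fit and the partial message was dropped. */
int fp_feed(struct fp_state *st, const uint8_t *data, uint8_t dlc)
{
    const uint8_t *payload;
    uint8_t seq, idx, n;

    if (dlc < 2 || dlc > 8) {
        st->active = 0;
        return -1;
    }
    seq = data[0] >> 5;
    idx = data[0] & 0x1F;

    if (idx == 0) {                     /* first frame starts a new message */
        if (data[1] > FP_MAX_LEN) {
            st->active = 0;
            return -1;
        }
        st->total  = data[1];
        st->have   = 0;
        st->seq    = seq;
        st->active = 1;
        payload = data + 2;
        n = (uint8_t)(dlc - 2);
    } else {
        if (!st->active || seq != st->seq || idx != st->next) {
            st->active = 0;             /* out of order or wrong message */
            return -1;
        }
        payload = data + 1;
        n = (uint8_t)(dlc - 1);
    }

    if (n > st->total - st->have)       /* ignore padding in the last frame */
        n = (uint8_t)(st->total - st->have);
    memcpy(st->buf + st->have, payload, n);
    st->have = (uint8_t)(st->have + n);
    st->next = (uint8_t)(idx + 1);

    if (st->have == st->total) {
        st->active = 0;
        return 1;
    }
    return 0;
}
```

At 223 bytes plus a few counters per state, a handful of these can live on the stack or in static storage. For the AC output dump shown earlier, frame 0 announces 0x49 = 73 bytes, so only the first four data bytes of the last frame are used and the trailing FF FF FF is treated as padding.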


Maybe I should just write a couple of examples to better illustrate what I'm asking. I have a couple of ideas, one of which would be less complex, and perhaps more readable, but would require a lot of if/else blocks. The other way would probably be better, and would be done using two union / struct byte fields. Or possibly, now that I have been thinking about it a while, one array ( heap or stack ), and one union / struct field for structuring the data.

 

One thing that is bothering me however is how do I ensure that I am putting the data into a field correctly ? I have . .

 

a) the frame.can_dlc that tells me how many bytes for the frame payload

b) line numbers in the first byte of every fast packet data payload.

c) the ability to know whether the data is coming in as multiple frames or not, based on the PGN.

 

The other aspect that is troubling me is wondering how do I make this "generic" to keep duplicate code to a minimum.
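
One way to keep this generic is a lookup table keyed by PGN, so the receive loop has a single code path and each PGN only contributes a table row. The sketch below is an assumption about how that could be organised, not something taken from the protocol: the names pgn_entry, pgn_table, pgn_lookup and pgn_handler are made up, the three fast packet entries are the PGNs seen spanning several frames in the dumps in this thread, and the address claim entry is marked single frame because that is how J1939 defines it.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoder callback for one complete payload (hypothetical signature). */
typedef void (*pgn_handler)(const uint8_t *data, size_t len);

struct pgn_entry {
    uint32_t    pgn;
    const char *name;
    int         fast_packet;   /* nonzero: payload spans several frames */
    pgn_handler handler;       /* NULL until a decoder exists */
};

static const struct pgn_entry pgn_table[] = {
    { 126979, "AC Input Status",   1, NULL },
    { 126982, "AC Output Status",  1, NULL },
    { 127173, "DC Source Status",  1, NULL },
    {  60928, "ISO Address Claim", 0, NULL },
};

/* Return the table row for pgn, or NULL if the PGN is not known yet. */
const struct pgn_entry *pgn_lookup(uint32_t pgn)
{
    size_t i;

    for (i = 0; i < sizeof(pgn_table) / sizeof(pgn_table[0]); i++)
        if (pgn_table[i].pgn == pgn)
            return &pgn_table[i];
    return NULL;
}
```

The receive loop would then look up each frame's PGN, pass the payload through reassembly when fast_packet is set, and call the handler once a complete payload is available. A table also tolerates two names sharing one number, which a switch over enum values would not.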
