Tracking Protocol Specification

## Cincraft Scenario SDK Lite 1.0.1

This page describes the SDK Lite tracking protocol. It is also available as a downloadable file.

## Description

The idea behind the SDK Lite is to give our customers a much simpler way to interface with Cincraft Scenario as a data source. It does so through a consistent UDP stream in which each frame fits in a single MTU and contains all of the essential information usually needed for integration with third-party software.

The server type can be set to either UDP broadcast or UDP unicast. The server type, port and IP address can all be modified in Cincraft Scenario.

The total size of each message/packet will always be smaller than the typical 1500-byte MTU of UDP communications, and its format will conform to the specification described by the version of the SDK Lite protocol.

Further extensions are planned for the SDK Lite, including a serial implementation.

## Packet format

After a 2-character control flag (0x0a, 0x0d), the next 8 bytes are always fixed, whatever the version of the SDK Lite protocol. They define the version of the protocol that is streamed and the length of the dynamic section in bytes.

| # | Size | Type (x86_64) | Data |
|---|------|---------------|------|
| 0 | 2 bytes | `char[2]` | Control flag [0x0a, 0x0d] |
| | | | *The next section is static and has a fixed size of 8 bytes* |
| 1 | 1 byte | `uint8_t` | Checksum (two's complement) |
| 2 | 3 bytes | `uint8_t[3]` | SDK Lite version [M, m, r] (major, minor, revision) |
| 3 | 4 bytes | `uint32_t` | Length in bytes of the dynamic section |
| | | | *Dynamic section starts here* |
| 4 | 4 bytes | `uint32_t` | Packet number |
| 5 | 4 bytes | `uint32_t` | Timecode |
| 6 | 12 bytes | `float[3]` | Translation [x, y, z] |
| 7 | 12 bytes | `float[3]` | Rotation [ox, oy, oz] |
| 8 | 12 bytes | `float[3]` | Normalized focus, iris and zoom |
| 9 | 12 bytes | `float[3]` | Mapped focus, iris and zoom |
| 10 | 4 bytes | `float` | Nodal shift in meters |
| 11 | 64 bytes | `float[16]` | Lens parameters |

Packets must be checked against their checksum on arrival. This is done by computing the sum of all bytes as `int8_t`, starting with the checksum and ending with the last byte of the dynamic
section, whose length is given by field #3. A valid packet sums to zero.

The total size of a packet is 134 bytes. For a 1080p60 video signal, with one packet per frame, the required bandwidth is 60 × 134 = 8040 bytes/s, which is just under 8 KiB/s (about 64 kbit/s) of payload.

## Making sense of the SDK Lite's information

- **Packet number**: a counter of packets sent since the SDK Lite began streaming.
- **Timecode**: the SDI source's timecode.
- **Translation**: the translation vector[3], in meters.
- **Rotation**: the rotation uses a compact vector[3] Rodrigues representation:
  - the angle of rotation is given by the magnitude (norm) of the input vector;
  - the axis of rotation is given by the normalized input vector.

  OpenCV provides a convenient method to convert to and from a 3x3 rotation matrix: `void Rodrigues(InputArray src, OutputArray dst, OutputArray jacobian=noArray())`
- **Normalized FIZ**: focus, iris and zoom encoder values, normalized between 0.0 and 1.0.
- **Mapped FIZ**: world-scale values for focus (in meters), iris (in T-stops) and zoom (in mm).
- **Nodal shift**: distance from the camera sensor's center to the nodal point, in meters (as in the packet table).
- **Lens parameters**: a vector[16] with 3 distinct sections:
  - the first four values are affine parameters: normalized H focal, normalized V focal, normalized H optical center and normalized V optical center;
  - the two that follow are radial distortion: k1, k2;
  - the two that follow are radial undistortion: inv k1, inv k2.

## C++ pseudo code for packet deserialization

### Low-level read helpers

```cpp
#include <cstdint>
#include <cstring>

using byte = uint8_t;

// Each helper reads one value at `pos` and advances `pos` past it.
// `maxLen` is the exclusive end offset of the packet; -1 disables the bounds check.
// A read that would run past `maxLen` returns 0 and leaves `pos` unchanged.
uint8_t readUint8(const byte* data, uint32_t& pos, int maxLen = -1)
{
	if (maxLen > -1 && static_cast<int>(pos) >= maxLen)
	{
		return 0;
	}
	uint8_t ret = data[pos];
	pos += 1;
	return ret;
}

uint32_t readUint32(const byte* data, uint32_t& pos, int maxLen = -1)
{
	if (maxLen > -1 && static_cast<int>(pos + 3) >= maxLen)
	{
		return 0;
	}
	// Copy all four bytes; assumes a little-endian host (x86_64), matching the stream.
	uint32_t ret;
	memcpy(&ret, data + pos, sizeof(ret));
	pos += 4;
	return ret;
}

float readFloat32(const byte* data, uint32_t& pos, int maxLen = -1)
{
	if (maxLen > -1 && static_cast<int>(pos + 3) >= maxLen)
	{
		return 0;
	}
	float ret;
	uint8_t bytes[] = { data[pos], data[pos+1], data[pos+2], data[pos+3] };
	memcpy(&ret, bytes, sizeof(ret));
	pos += 4;
	return ret;
}
```

### Conversions and coordinate utilities

```cpp
#include <cmath>       // M_PI (on MSVC, define _USE_MATH_DEFINES first)
#include <Eigen/Dense>

// in degrees
float normalizedFocalLengthToFov(float focalLengthNorm)
{
	return static_cast<float>((180.0 / M_PI) * 2.0 * atan2(1.0, 2.0 * focalLengthNorm));
}

// Zeiss SDK streams cameraFromWorld
Eigen::Vector3f getPosition(float tx, float ty, float tz, float rotX, float rotY, float rotZ)
{
	Eigen::Vector3f rodriguesVec(rotX, rotY, rotZ); // read directly from SDK Lite
	Eigen::AngleAxisf angleAxis(rodriguesVec.norm(), rodriguesVec.normalized());
	Eigen::Matrix3f rotationMatrix = angleAxis.toRotationMatrix();
	Eigen::Matrix3f inverseRotation = rotationMatrix.transpose();
	Eigen::Vector3f translationVec(tx, ty, tz);
	// For a cameraFromWorld transform (R, t), the camera position in world space is -R^T * t
	Eigen::Vector3f position = -(inverseRotation * translationVec);
	return position;
}

void rodriguesToEuler(float rotX, float rotY, float rotZ, float& panOut, float& tiltOut, float& rollOut)
{
	Eigen::Vector3f rodriguesVec(rotX, rotY, rotZ); // read directly from SDK Lite
	Eigen::AngleAxisf angleAxis(rodriguesVec.norm(), rodriguesVec.normalized());
	Eigen::Matrix3f rotationMatrix = angleAxis.toRotationMatrix();
	Eigen::Matrix3f rotationMatrixInv = rotationMatrix.transpose();

	// choose the angle order as needed for the target system
	Eigen::Matrix3f zeissCS;
	zeissCS.setZero();
	zeissCS(0, 0) = 1.0;
	zeissCS(1, 1) = 1.0;
	zeissCS(2, 2) = 1.0;
	const Eigen::Matrix3f lRotation = zeissCS * rotationMatrixInv;
	Eigen::Vector3f eulerAngles = lRotation.eulerAngles(1, 0, 2);
	panOut = eulerAngles[0] * -1;
	tiltOut = eulerAngles[1];
	rollOut = eulerAngles[2];
}
```

### Packet extraction example

```cpp
void extractTrackData(char* tsName, char* rxBuffer)
{
	// Content of rxBuffer is expected to be checked before entering:
	// control flags must be valid
	// checksum must also be valid
	const byte* data = reinterpret_cast<const byte*>(rxBuffer);
	uint32_t pos = 0;

	// =========
	// header
	pos += 2; // control flag

	// ==============
	// static section
	pos += 1; // checksum
	uint8_t version_major = readUint8(data, pos);
	uint8_t version_minor = readUint8(data, pos);
	uint8_t version_revision = readUint8(data, pos);
	// Read the length first, then add pos, so the result does not depend on evaluation order
	const uint32_t dynamicLength = readUint32(data, pos);
	const uint32_t maxLen = pos + dynamicLength; // maximum length of packet

	// ==============
	// dynamic section
	// Kept as integers: a float cannot represent every 32-bit value exactly
	uint32_t packet_number = readUint32(data, pos, maxLen);
	uint32_t timecode = readUint32(data, pos, maxLen);

	float tx = readFloat32(data, pos, maxLen);
	float ty = readFloat32(data, pos, maxLen);
	float tz = readFloat32(data, pos, maxLen);
	float rotX = readFloat32(data, pos, maxLen);
	float rotY = readFloat32(data, pos, maxLen);
	float rotZ = readFloat32(data, pos, maxLen);

	// position of pinhole / projection center / non-parallax point in meters from world origin
	// Zeiss uses a right-handed Y-up coordinate system
	auto position = getPosition(tx, ty, tz, rotX, rotY, rotZ);

	// rotation in radians
	float pan, tilt, roll;
	rodriguesToEuler(rotX, rotY, rotZ, pan, tilt, roll);

	// rotation in degrees
	pan = 180.f * pan / static_cast<float>(M_PI);
	tilt = 180.f * tilt / static_cast<float>(M_PI);
	roll = 180.f * roll / static_cast<float>(M_PI);

	// normalized FIZ values in [0, 1]
	float focusNorm = readFloat32(data, pos, maxLen);
	float irisNorm = readFloat32(data, pos, maxLen);
	float zoomNorm = readFloat32(data, pos, maxLen);

	// mapped FIZ values
	float focusMapped = readFloat32(data, pos, maxLen); // meters
	float irisMapped = readFloat32(data, pos, maxLen); // T-stops
	float zoomMapped = readFloat32(data, pos, maxLen); // mm

	// distance of the sensor from pinhole / projection center / non-parallax point
	float nodalShift = readFloat32(data, pos, maxLen);

	const float hFov = normalizedFocalLengthToFov(readFloat32(data, pos, maxLen));
	const float vFov = normalizedFocalLengthToFov(readFloat32(data, pos, maxLen));

	// Zeiss values are normalized relative to image width and height
	// (0.0, 0.0) is top left and (1.0, 1.0) is bottom right
	const float hCenter = readFloat32(data, pos, maxLen);
	const float vCenter = readFloat32(data, pos, maxLen);

	// distortion values from Zeiss / CineLens are normalized along the half diagonal of the sensor
	// if values are normalized along the full diagonal, scale factors must be adjusted accordingly
	float k1 = readFloat32(data, pos, maxLen) / 4.0f;
	float k2 = readFloat32(data, pos, maxLen) / 16.0f;

	// inverse parameters
	float k1Inv = readFloat32(data, pos, maxLen) / 4.0f;
	float k2Inv = readFloat32(data, pos, maxLen) / 16.0f;
}
```

### Sample for one frame of data

```cpp
const uint8_t buf[] = {
	0x0a, 0x0d,                                                             // control flag
	0xb2,                                                                   // checksum
	0x01, 0x00, 0x01,                                                       // version 1.0.1
	0x7c, 0x00, 0x00, 0x00,                                                 // dynamic length (124)
	0xec, 0xa4, 0x00, 0x00,                                                 // packet number
	0x05, 0x47, 0x26, 0x08,                                                 // timecode
	0x1d, 0x55, 0x14, 0x3f, 0x0b, 0x6d, 0x2f, 0x40, 0xd5, 0x23, 0x32, 0x3f, // translation
	0x10, 0x8b, 0xa4, 0x3f, 0xe7, 0x31, 0x6c, 0x3c, 0x8c, 0x2f, 0x93, 0xbd, // rotation
	0x76, 0x90, 0x7e, 0x3f, 0x00, 0x00, 0x00, 0x00, 0x34, 0x28, 0x7f, 0x3f, // normalized FIZ
	0x7b, 0x66, 0x1d, 0x41, 0x33, 0x33, 0xf3, 0x3f, 0x32, 0x34, 0x6e, 0x42, // mapped FIZ
	0x93, 0xa6, 0x72, 0x3d,                                                 // nodal shift
	0xf0, 0xac, 0xc5, 0x40, 0x35, 0xd9, 0x2f, 0x41,                         // lens parameters
	0x6b, 0x66, 0x01, 0x3f, 0x1d, 0xe8, 0x05, 0x3f,
	0x97, 0xf0, 0xd3, 0x3c, 0xf9, 0x8c, 0xcd, 0x38,
	0x8b, 0x53, 0xd3, 0xbc, 0xd7, 0xc5, 0xdc, 0x3a,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff,
};
```
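
The extraction example expects the control flag and checksum to have been validated beforehand, so the following is a minimal sketch of that pre-check, based only on the rule stated in the packet format section (all bytes from the checksum to the end of the dynamic section sum to zero modulo 256). The names `loadLe32`, `checksumFor` and `isValidPacket` are hypothetical and not part of the SDK Lite, and the little-endian length field is an assumption inferred from the sample frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Layout constants taken from the packet format table.
constexpr std::size_t kControlFlagSize = 2; // 0x0a, 0x0d
constexpr std::size_t kStaticSize = 8;      // checksum + version + length
constexpr std::size_t kHeaderSize = kControlFlagSize + kStaticSize;

// Hypothetical helper: reads a little-endian uint32 regardless of host byte order.
inline std::uint32_t loadLe32(const std::uint8_t* p)
{
	return static_cast<std::uint32_t>(p[0])
	     | (static_cast<std::uint32_t>(p[1]) << 8)
	     | (static_cast<std::uint32_t>(p[2]) << 16)
	     | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: returns the checksum byte (two's complement of the sum of
// every byte after the checksum) that makes a packet of `size` bytes sum to zero.
inline std::uint8_t checksumFor(const std::uint8_t* data, std::size_t size)
{
	assert(data != nullptr && size >= kHeaderSize);
	std::uint8_t sum = 0;
	for (std::size_t i = kControlFlagSize + 1; i < size; ++i)
	{
		sum = static_cast<std::uint8_t>(sum + data[i]);
	}
	return static_cast<std::uint8_t>(0u - sum);
}

// Returns true when `data` starts with a complete packet whose control flag is
// correct and whose bytes, from the checksum onward, sum to zero modulo 256.
inline bool isValidPacket(const std::uint8_t* data, std::size_t size)
{
	assert(data != nullptr);
	if (size < kHeaderSize)
	{
		return false; // too short to hold the control flag and static section
	}
	if (data[0] != 0x0a || data[1] != 0x0d)
	{
		return false; // wrong control flag
	}
	const std::uint32_t dynamicLength = loadLe32(data + 6);
	if (dynamicLength > size - kHeaderSize)
	{
		return false; // declared dynamic section does not fit in the buffer
	}
	std::uint8_t sum = 0;
	const std::size_t end = kHeaderSize + dynamicLength;
	for (std::size_t i = kControlFlagSize; i < end; ++i)
	{
		sum = static_cast<std::uint8_t>(sum + data[i]);
	}
	return sum == 0;
}
```

Summing with wrapping `uint8_t` arithmetic is equivalent to the `int8_t` sum described in the specification, and it avoids signed overflow. Summed by hand, bytes 2 to 133 of the sample frame come to 0 modulo 256 (0xb2 for the checksum plus 0x4e for the rest), so that frame should be accepted by a check of this kind.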