The coordinate systems used by librealsense (which is where the call to rs_extrinsics comes from) are described here: https://github.com/IntelRealSense/librealsense/blob/master/doc/projection.md#extrinsic-camera-parameters
The following is an extract from that document:
Each stream of images provided by librealsense is associated with a separate 2D coordinate space, specified in pixels, with the coordinate [0,0] referring to the center of the top left pixel in the image, and [w-1,h-1] referring to the center of the bottom right pixel in an image containing exactly w columns and h rows. That is, from the perspective of the camera, the x-axis points to the right and the y-axis points down. Coordinates within this space are referred to as "pixel coordinates", and are used to index into images to find the content of particular pixels.
Each stream of images provided by librealsense is also associated with a separate 3D coordinate space, specified in meters, with the coordinate [0,0,0] referring to the center of the physical imager. Within this space, the positive x-axis points to the right, the positive y-axis points down, and the positive z-axis points forward. Coordinates within this space are referred to as "points", and are used to describe locations within 3D space that might be visible within a particular image.
Extrinsic Camera Parameters
The 3D coordinate systems of each stream may in general be distinct. For instance, it is common for depth to be generated from one or more infrared imagers, while the color stream is provided by a separate color imager. The relationship between the separate 3D coordinate systems of separate streams is described by their extrinsic parameters, contained in the rs_extrinsics struct. The basic set of assumptions is described below:
1. Imagers may be in separate locations, but are rigidly mounted on the same physical device
• The translation field contains the 3D translation between the imager's physical positions, specified in meters
2. Imagers may be oriented differently, but are rigidly mounted on the same physical device
• The rotation field contains a 3x3 orthonormal rotation matrix between the imager's physical orientations
3. All 3D coordinate systems are specified in meters
• There is no need for any sort of scaling in the transformation between two coordinate systems
4. All coordinate systems are right handed and have an orthogonal basis
• There is no need for any sort of mirroring/skewing in the transformation between two coordinate systems
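Taken together, these assumptions mean a point moves between two streams by a rigid transform, p_to = R * p_from + t, with no scaling or mirroring. Below is a minimal Python sketch of that arithmetic. It assumes the rotation is stored as nine floats in column-major order, as the linked document describes for rs_extrinsics. The helper name transform_point and the numeric values are hypothetical and only for illustration; they are not real device output.

```python
def transform_point(rotation, translation, point):
    """Apply p_to = R * p_from + t.

    rotation: 9 floats, 3x3 matrix in column-major order (assumed layout).
    translation: 3 floats, in meters.
    point: 3 floats, in meters, in the "from" coordinate system.
    """
    x, y, z = point
    return [
        rotation[0] * x + rotation[3] * y + rotation[6] * z + translation[0],
        rotation[1] * x + rotation[4] * y + rotation[7] * z + translation[1],
        rotation[2] * x + rotation[5] * y + rotation[8] * z + translation[2],
    ]


# Made-up example: a 90-degree rotation about the z-axis (column-major),
# followed by a 0.5 m shift along z.
example_rotation = [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0]
example_translation = [0.0, 0.0, 0.5]

example_result = transform_point(example_rotation, example_translation, [1.0, 0.0, 0.0])
print(example_result)
```

With these made-up values, a point one meter along x in the source frame ends up one meter along y and half a meter along z in the target frame.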
Hopefully this document will help clear up your doubts.
Hi Pedro, thank you for replying. Unfortunately, the answer is still unclear to me. The link you shared says:
The translation field contains the 3D translation between the imager's physical positions, specified in meters
However, I don't know how you define `imager's physical positions`. It doesn't say whether they are expressed in `Pixel coordinates` or `Point coordinates`.
Based on the output I got from `rs_extrinsics`, the coordinate system appears to have the positive x-axis pointing left, the positive y-axis pointing up, and the positive z-axis pointing forward. Please confirm or correct this.
According to https://github.com/IntelRealSense/librealsense/blob/master/doc/projection.md#extrinsic-camera-parameters, these are the ways pixel coordinates and point coordinates work:
[0, 0] => Top left pixel of the image.
[w-1, h-1] => Bottom right pixel of the image (in an image containing exactly w columns and h rows).
[0, 0, 0] => Center of the physical imager.
The positive x-axis points to the right, the positive y-axis points down, and the positive z-axis points forward.
Let me know if that helps, I'll be more than glad to answer any question you might have.
I was hoping you could answer my question directly. The document says `imager's physical positions`, but it's unclear which coordinate system those positions are expressed in.
Also, if we assume they are expressed in point coordinates, the output from `rs_extrinsics` doesn't make sense: the Depth/IR sensor is located 25 mm to the right of the Color sensor, yet the output shows "-0.02549983", which means 25 mm to the left. That's why I think the coordinate system used for rs_extrinsics is not `Point coordinates` (and of course not `Pixel coordinates`).
Thanks a lot for sharing this information with us. Please let us analyze it to see if we can determine what might be happening. If we are able to find anything useful, we'll make sure to share it with you in this thread.
We have an update for your case.
The value is -.025 because the reference point is the depth sensor (also known as the imager), which is to the right of the RGB sensor. The RGB sensor is to the left of the depth sensor, in the negative x direction. The coordinate system is the point coordinate system, specified in meters.
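One way to check the sign is to transform the origin of the color coordinate system with the color-to-depth extrinsics. Under the convention p_to = R * p_from + t, the origin maps to t itself, so the translation is the position of the color imager expressed in the depth coordinate system. The Python sketch below is illustrative only: it assumes an identity rotation and zero y and z offsets (only the x value -0.02549983 was reported in this thread), and the helper names transform_point and invert_extrinsics are hypothetical.

```python
def transform_point(rotation, translation, point):
    """Apply p_to = R * p_from + t, with R as 9 floats in column-major order."""
    x, y, z = point
    return [
        rotation[0] * x + rotation[3] * y + rotation[6] * z + translation[0],
        rotation[1] * x + rotation[4] * y + rotation[7] * z + translation[1],
        rotation[2] * x + rotation[5] * y + rotation[8] * z + translation[2],
    ]


def invert_extrinsics(rotation, translation):
    """Return the reverse rigid transform: R_inv = R^T, t_inv = -(R^T * t)."""
    r = rotation
    # Transposing a column-major 3x3 matrix swaps the off-diagonal pairs.
    r_inv = [r[0], r[3], r[6], r[1], r[4], r[7], r[2], r[5], r[8]]
    rotated_t = transform_point(r_inv, [0.0, 0.0, 0.0], translation)
    return r_inv, [-v for v in rotated_t]


# Assumptions: identity rotation, zero y/z offsets. Only x comes from the thread.
IDENTITY = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
color_to_depth_t = [-0.02549983, 0.0, 0.0]

# Where is the color imager's origin, seen from the depth coordinate system?
color_origin_in_depth = transform_point(IDENTITY, color_to_depth_t, [0.0, 0.0, 0.0])

# Reverse direction: where is the depth imager's origin, seen from color?
depth_to_color_r, depth_to_color_t = invert_extrinsics(IDENTITY, color_to_depth_t)
depth_origin_in_color = transform_point(depth_to_color_r, depth_to_color_t, [0.0, 0.0, 0.0])

print("color origin in depth frame:", color_origin_in_depth)
print("depth origin in color frame:", depth_origin_in_color)
```

Under these assumptions, the color imager sits at a negative x in the depth frame (to the left of the depth imager), while the depth imager sits at a positive x in the color frame (to the right of the color imager). Both statements describe the same physical layout.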
I hope this helps.
You could try, but the output is from color to depth, not from depth to color. So I believe the reference point in this case should be the color sensor.