3D Object Pose Tracking for Robotics Grasping

2258 words (9 pages) Essay in Information Technology

08/02/20 Information Technology

Disclaimer: This work has been submitted by a student.

I. INTRODUCTION

My role in this project is to build the graphic scene and handle changes in the lighting of the object. The process is repeatable: the machine learning algorithm stores precise data and drops less reliable cases, then cycles through the process again to improve the accuracy of the stored image data.

II. TARGET

Unlike humans, robots do not have a sense of touch and rely on cameras for vision. As such, computer vision is an important part of designing robots that interact with the world on their own. Our task is part of an ongoing project that aims to improve a computer's understanding of a three-dimensional environment through machine learning and computer vision, developing algorithms that help robotic arms determine an appropriate path to grasp an object. At the moment, the computer uses two stationary cameras placed at different angles facing the arm and the object, and analyzes their feeds to determine the object's position and type.

The current setup has a few problems. To begin with, lighting changes can make it difficult to identify an object. When the hand covers the object, its shadow changes the light captured by the cameras. Also, if the light intensity of the environment changes, the colors in the captured streams change. The computer expects the pixels representing the object to fall within a certain range of RGB values, and such changes can move the apparent colors out of that range. As a result, the computer fails to recognize the object it is looking for. We could expand the expected color range, but that could cause the computer to treat parts of the image that are not the object, such as the background, as part of the object. Our goal is for the computer to recognize the object at different levels of light without that issue.

Another problem arises when the robotic hand gets in the way of the cameras. Since the cameras are at fixed positions, the robotic arm blocks parts of the object when it moves in to pick it up. As a result, the apparent shape of the object on the screen changes, and the computer fails to identify it and cannot accurately determine its position and shape.

The purpose of this software is to improve a computer's ability to identify a known object in an image under different lighting conditions. The end product will be two helper functions for the computer vision project. The first will take an image of a scene and the initial position of the tracked object as input and determine the diffuse and specular terms that describe the lighting of the scene. These will be used to determine the color range that the computer should look for. The second function will then use the images, the diffuse and specular terms, and the approximate mask to determine a more accurate final mask. The primary goal is to get these functions to properly track a robotic arm; a stretch goal is to do the same with a set of objects that the arm can manipulate.
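
The color-range problem described above can be illustrated with a small sketch. It is not the project's code: the function names, the pixel values, the thresholds, and the use of a simple brightness scale factor to imitate a shadow are all assumptions chosen for illustration. It shows a fixed RGB range losing the object when the scene darkens, and a widened range picking up a background pixel instead.

```python
# Illustrative sketch only: names, pixel values and thresholds are assumptions.

def in_range(pixel, low, high):
    """Return True if every RGB channel of pixel lies within [low, high]."""
    return all(lo <= c <= hi for c, lo, hi in zip(pixel, low, high))

def color_mask(pixels, low, high):
    """Label each pixel True (object) or False (not object)."""
    return [in_range(p, low, high) for p in pixels]

def dim(pixel, factor):
    """Imitate a drop in light intensity by scaling each channel."""
    return tuple(min(255, round(c * factor)) for c in pixel)

# Two object pixels (red) followed by two background pixels.
scene = [(200, 40, 40), (210, 50, 45), (175, 60, 50), (90, 80, 70)]
low, high = (180, 20, 20), (230, 70, 70)

# Under the reference lighting the fixed range finds exactly the object.
bright_mask = color_mask(scene, low, high)

# In shadow, the same range no longer matches any pixel.
shadowed = [dim(p, 0.6) for p in scene]
shadow_mask = color_mask(shadowed, low, high)

# Widening the range recovers the object but also admits a background pixel.
wide_mask = color_mask(shadowed, (100, 20, 20), high)

print(bright_mask, shadow_mask, wide_mask)
```

The third entry of `wide_mask` is the kind of false positive that the two helper functions are meant to avoid.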

III. DETAILED PIECES

This section describes three key pieces of the project and, for each, considers alternative technologies that could solve the same problem.

A. Piece 1: Capture of the Data

To track the pose of the robotic grasp, the computer must compare the streams monitored by the cameras with data stored in memory. The first task is therefore to capture each object's data by taking images from various viewpoints to build a three-dimensional scene.

1) Image Sets: This is the method used to capture object data in this project. We intend to set up a box that keeps environmental light out and then take pictures of the object under a fixed light intensity inside the box. This keeps the recorded RGB values of each object constant. We then change the viewpoint of the camera and repeat the capture. Finally, the 3D images of all objects are stored on the computer. With this solution, we will probably need to create over 150 sets of images of every single object to give the computer enough material to form a detailed 3D scene of the object. We also need to handle lighting very carefully. Since this step creates the object data on the computer, the data must not contain wrong information. Surface color is one of the cues the computer uses to distinguish the type of object, and colors are represented by RGB values. Each of the three primary colors takes an integer value from 0 to 255. Even though red 250 and red 200 both make the object's surface appear red, the computer will not put them in a single category. Therefore, it is necessary to create images with precise and constant lighting, which leads to a constant surface color.
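
One way to check the constant-lighting requirement for an image set is to compare the object's average color across captures. The sketch below is an assumption-laden illustration, not part of the project: the function names, the toy pixel data, and the limit of 5 intensity levels per channel are all hypothetical.

```python
# Illustrative sketch only: names, data and the max_spread limit are assumptions.

def mean_color(pixels):
    """Average RGB value over a list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def channel_spread(captures):
    """Per-channel gap between the highest and lowest capture means."""
    means = [mean_color(c) for c in captures]
    return tuple(max(m[i] for m in means) - min(m[i] for m in means)
                 for i in range(3))

def lighting_is_consistent(captures, max_spread=5):
    """True if no channel mean drifts by more than max_spread levels."""
    return all(s <= max_spread for s in channel_spread(captures))

# Each capture holds the object's pixels from one viewpoint (toy data).
steady = [[(250, 10, 10), (248, 12, 10)], [(249, 11, 9), (251, 11, 11)]]

# A third capture taken under dimmer light: red drops from about 250 to 200.
drifting = steady + [[(200, 10, 10), (202, 12, 10)]]

print(lighting_is_consistent(steady), lighting_is_consistent(drifting))
```

A set that fails the check would be recaptured rather than stored, since the red 250 and red 200 captures would otherwise describe the same surface with two different colors.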

2) 3D Printing: Another approach that comes to mind is 3D printing. The previous option captures data from physical objects; this method instead constructs 3D models on the computer, so the object data is in memory from the beginning. There is no need to worry about converting the specifics of a physical object into data. A 3D printer can produce objects from the precise information given by the computer [5]. The printed objects are then used to test the grasping pose tracking algorithm. One concern is how accurate a 3D printed object will be. Compared with the original model built on the computer, a physical object may have flaws. This option therefore has an additional requirement, which is to measure the specifics of each object. For a cube, for instance, it is necessary to measure the length of each edge. Developers might set the required accuracy to 0.01 centimeter and discard objects that fall outside that range.

3) Panorama: Returning to image capture, a better camera or a more powerful device can help us obtain data from physical objects. A panoramic camera makes it possible to create a 3D scene of an object through a 360-degree panorama [4]. This solution costs much more than the first option, since a powerful panoramic camera is expensive. However, it saves the considerable time needed to photograph hundreds of object images. As with the first option, it has to work together with the lighting setup, because we need to keep the objects' surface colors constant.
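
The acceptance rule suggested for 3D printed objects can be written down in a few lines. This is a minimal sketch under stated assumptions: the function names and the sample measurements are hypothetical, and only the 0.01 centimeter tolerance and the idea of measuring every edge of a cube come from the text.

```python
# Illustrative sketch only: function names and measurements are assumptions.

def within_tolerance(measured_cm, nominal_cm, tolerance_cm=0.01):
    """True if one measured edge is within tolerance of the design value."""
    return abs(measured_cm - nominal_cm) <= tolerance_cm

def accept_cube(edge_lengths_cm, nominal_cm, tolerance_cm=0.01):
    """Accept a printed cube only if all 12 measured edges pass the check."""
    if len(edge_lengths_cm) != 12:
        raise ValueError("a cube has 12 edges")
    return all(within_tolerance(e, nominal_cm, tolerance_cm)
               for e in edge_lengths_cm)

# Measurements for two printed cubes designed with 5 cm edges (toy data).
good = [5.000, 5.004, 4.996, 5.002] * 3
bad = [5.000, 5.004, 4.996, 5.020] * 3   # three edges are 0.02 cm too long

print(accept_cube(good, 5.0), accept_cube(bad, 5.0))
```

Objects that fail the check would be discarded and reprinted, so that the physical test objects match the models held in memory.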

B. Piece 2: Color Changing

To begin with, lighting changes can make it difficult to identify an object. When the hand covers the object, its shadow changes the light captured by the cameras. Also, if the light intensity of the environment changes, the colors in the captured streams change. The computer expects the pixels representing the object to fall within a certain range of RGB values, and such changes can move the apparent colors out of that range.

1) Machine Learning: For the lighting and shading issue, our primary option is a machine learning approach. We will take key frames from the camera recordings and, using Photoshop or similar software, label the pixels as part of the object, part of the arm, or part of the background. We will feed this data into a learning algorithm and then test how well the algorithm has learned to work around the shading issues. The machine learning algorithm will take an image of a scene and the initial position of the tracked object as input and determine the diffuse and specular terms that describe the lighting of the scene. These will be used to determine the color range that the computer should look for. The diffuse term accounts for the angle between the incoming light and the surface normal. The specular term accounts for the angle between the perfect reflection direction and the eye position. In this situation, the two cameras that record the streams serve as the eyes.

2) Color Sensor: This option uses a color sensor to detect the object's surface color. A color sensor is a device that compares an object's colors with previously stored reference colors to improve color detection [1]. When the two colors are within an acceptable range of error, the sensor outputs the result. With various reference labels, the sensor can detect a color quickly even when the background differs only subtly in color. Other advantages include automatic adaptation to wavelengths, detection of tiny differences in gray value, and independence from the color of the label and the background. This option could replace the function that deals with colors.

3) Gray Scale: Since working with color is a tough task, another method is to compare only the gray levels of two images [2]. This option lets us avoid complex color computations, including discrete algorithms. The gray level is a single value represented on a single channel. For example, an image is completely black when its gray level is 0 and white at the maximum gray value. The first option uses the diffuse and specular terms to calculate the object's surface colors so that they can be compared with the reference data. In this option, the brightness of the object is compared instead, because the brighter the object is, the higher its gray value.
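
The diffuse and specular terms mentioned under the machine learning option can be made concrete with a short sketch. It assumes the standard Phong reflection model, which the text does not name; the function names and the shininess exponent of 16 are likewise assumptions. The diffuse factor is the cosine of the angle between the surface normal and the light direction, and the specular factor depends on the angle between the mirror reflection of the light and the direction toward the camera.

```python
# Illustrative sketch only: assumes a Phong-style model; names and the
# shininess value are not taken from the project.
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def reflect(light_dir, normal):
    """Mirror the direction toward the light about the surface normal."""
    d = dot(light_dir, normal)
    return tuple(2 * d * n - l for n, l in zip(normal, light_dir))

def phong_terms(normal, light_dir, view_dir, shininess=16):
    """Return (diffuse, specular) factors in [0, 1] for one surface point.

    All three directions point away from the surface.
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    diffuse = max(0.0, dot(n, l))
    if diffuse == 0.0:
        # The light is behind the surface, so there is no highlight either.
        return 0.0, 0.0
    r = reflect(l, n)
    specular = max(0.0, dot(r, v)) ** shininess
    return diffuse, specular

# Light and camera both straight above an upward-facing surface.
print(phong_terms((0, 0, 1), (0, 0, 1), (0, 0, 1)))
```

In this sketch the camera plays the role of the eye. With two cameras, the same surface point would get one diffuse value but two specular values, one per viewing direction.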

C. Piece 3: Pose Tracking

The purpose of the project, 3D object pose tracking for robotics grasping, is to successfully track the positions of the hand and the object. Tracking is improved by handling lighting changes and shadows.

1) Tracking System: The object tracking system currently used by the University's robotics team uses cameras placed beside the object to send video streams to a computer. By analyzing frames, the computer labels each pixel of each image as background, object, or robotic hand, which is how it distinguishes the object and then tracks the pose of the robotic grasp. Currently, it accounts for lighting changes by identifying a large range of colors as potentially being part of the object. This, of course, results in a lot of false positives (pixels labeled as part of the object when they actually aren't). A more advanced approach will allow the computer to adjust the expected average color for the specific lighting and use a smaller acceptable range.

2) Motion Capture: This option will cost a lot more than our current tracking system. However, the technology affords more precise position tracking of various objects, and even of human actions. Some companies, including Sega Interactive, have launched a variety of commercial motion capture devices. A tracker is attached to a key part of the moving object, and the motion capture system captures the position of the tracker. A computer then processes the captured data to reconstruct the motion in three-dimensional space. This technology is widely used in the film and game industries to capture human actions [3]. One type of sensor is the inertial navigation sensor, which measures the acceleration, azimuth, and tilt angle of the athlete's motion. This approach is not affected by environmental disturbances or occlusion. The capture accuracy is extremely high, and the sampling rate can reach 1000 times per second or higher.

3) OpenCV: OpenCV is an open source computer vision library. It is written in C++ and its main interface is also C++, but it still retains a large number of C language interfaces. The library also has interfaces for Python, Java, MATLAB/OCTAVE (version 2.5), C#, Ruby and Go. OpenCV programs are fast, stable and highly compatible. The library offers a way to move away from special-purpose solutions that depend on specific hardware in areas such as video surveillance, manufacturing control systems, and medical equipment. OpenCV focuses on real-world and real-time applications, and its execution speed is greatly improved by optimized C code. Fields such as human-computer interaction, action recognition and robotics benefit from this technology. We can implement our own OpenCV program to track the object position.
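
The idea of adjusting the expected average color for the current lighting and then using a smaller acceptable range, described under the tracking system, might look like the sketch below. It is written in plain Python rather than OpenCV so that it runs without extra packages. The reference colors, the tolerance, and the single `gain` value standing in for the project's lighting estimate are all assumptions.

```python
# Illustrative sketch only: reference colors, tolerance and the gain
# parameter are assumptions, not values from the project.

def scale_color(color, gain):
    """Expected color under a lighting gain (1.0 = reference lighting)."""
    return tuple(min(255.0, c * gain) for c in color)

def distance(a, b):
    """Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def label_pixel(pixel, references, gain, tolerance=20.0):
    """Return the closest class whose lighting-adjusted color lies within
    tolerance of the pixel; everything else is labeled background."""
    best_label, best_dist = "background", tolerance
    for label, color in references.items():
        d = distance(pixel, scale_color(color, gain))
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label

# Colors of the object and the robotic hand under the reference lighting.
references = {"object": (200, 40, 40), "hand": (60, 60, 200)}

# A frame captured at 60 percent of the reference brightness (toy data).
frame = [(120, 24, 24), (36, 36, 120), (105, 36, 30), (54, 48, 42)]
labels = [label_pixel(p, references, gain=0.6) for p in frame]

print(labels)
```

Because the expected colors move with the lighting, the tolerance can stay narrow, and the third pixel, which is close to the object color but not close enough, is kept as background.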

IV. CONCLUSION

Based on this analysis of the technologies that could be used for robotic grasping pose tracking, the machine learning algorithm is the most economical way to improve tracking accuracy. It needs no sensors or other hardware; the program itself determines which object the robotic hand has grabbed. Although the algorithm may not be precise enough at the beginning, its accuracy increases as it stores positive data in memory. Over many iterations, more reliable data is stored and the computer has more samples to learn from. Note that this technology is automatic: the computer feeds itself from its current database.

REFERENCES

[1] Colour sensors system description. https://www.sensopart.com/jdownloads/Systembeschreibungen/Colour-sensors contrast-sensors luminescence-sensors system description.pdf. (Accessed on 11/03/2018).
[2] Gray-level transformation. https://spie.org/samples/PM197.pdf. (Accessed on 11/07/2018).
[3] R. Fischer. Motion capture process and systems. https://pdfs.semanticscholar.org/e399/84b1e08f5a98e03e83f2e4d6bac3e997e0d8.pdf. (Accessed on 11/07/2018).
[4] Walkabout Worlds. Create a basic 3d model of a room. https://www.youtube.com/watch?time continue=78&v=3IAK93U2QUI, March 2017. (Accessed on 11/02/2018).
[5] Bob Yirka. A 3-d printer that can print data sets as physical objects. https://phys.org/news/2018-06-d-printer-physical.html, June 2018. (Accessed on 11/02/2018).
