OUTDOOR LOCALIZATION TECHNIQUE USING LANDMARKS TO DETERMINE POSITION AND ORIENTATION

By

SAI KRISHNA YELURI

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2003
Copyright 2003 by Sai Krishna Yeluri
I dedicate this thesis to my lord, SaiBaba, for being the light at the end of the tunnel, and especially for helping to dig me out during cave-ins.
ACKNOWLEDGMENTS

I express my sincere appreciation to my advisor, Dr. Carl Crane, for giving me the opportunity to work on this project and for providing continuous guidance. I also thank Dr. Michael Nechyba, the committee member who guided me through the concepts of image processing and its applications. I also acknowledge the efforts of my other committee member, Dr. John Schueller, whose time and expertise were greatly appreciated. I thank the Air Force Research Lab at Tyndall for their support of the ongoing research at the University of Florida. I also thank my fellow students in the Center for Intelligent Machines and Robotics for sharing their knowledge and helping me when needed. Finally, I thank my parents, grandparents, and especially God for their blessings.
TABLE OF CONTENTS

Page

ACKNOWLEDGMENTS.................................................................................................iv
LIST OF TABLES............................................................................................................vii
LIST OF FIGURES...........................................................................................................ix
ABSTRACT.......................................................................................................................xi

CHAPTER

1 INTRODUCTION........................................................................................................1

2 BACKGROUND REVIEW..........................................................................................4

    Natural Landmarks.......................................................................................................4
    Artificial Landmarks.....................................................................................................5
        Komatsu's Z-Shaped Landmark...........................................................................6
        Optical Reflectors..................................................................................................7
        Acoustic Artificial Landmarks..............................................................................8
        Line Navigation Using Artificial Landmarks........................................................9
            Thermal trail...................................................................................................9
            Odor trail......................................................................................................10
        Beacon Navigation System (TRC)......................................................................11
        Fixed Beacon System (CONAC).........................................................................12
        Imperial College Beacon Navigation System......................................................12
        Spatial Positioning System (Odyssey).................................................................13
        Laser Net Beacon Tracking System (NAMCO)..................................................13
        Denning Branch International Robotics Laser Navigation Position Sensor.......14
    Learned Landmarks....................................................................................................15
        Landmark Database.............................................................................................15
        Online Localization.............................................................................................16

3 TRILATERATION.....................................................................................................20

4 CAMERA CALIBRATION.......................................................................................30

    Definition of Camera Calibration...............................................................................31
        Intrinsic Parameters (Interior Projective Parameters).........................................31
        Extrinsic Parameters (Exterior Projective Parameters).......................................33
    Camera Calibration Techniques.................................................................................34
        Photogrammetric Calibration..............................................................................35
            Linear calibration.........................................................................................35
            Nonlinear calibration....................................................................................37
            Two step calibration technique....................................................................37
    Three Step Calibration Technique..............................................................................38
        Camera Model.....................................................................................................38
        Calibration Procedure..........................................................................................42
            Linear parameter estimation.........................................................................43
            Iterative search.............................................................................................47
        Software Description...........................................................................................54

5 FEATURE EXTRACTION........................................................................................59

    Feature Detection........................................................................................................60
        Input Description................................................................................................61
        Smoothen Input Image........................................................................................62
        Thresholding an Image........................................................................................63
        Blob Detection Algorithm...................................................................................69
    Feature Localization...................................................................................................71

6 EXPERIMENTAL METHOD....................................................................................79

7 RESULTS AND CONCLUSIONS............................................................................83

    Results.........................................................................................................................83
    Conclusions...............................................................................................................115

8 FUTURE WORK......................................................................................................120

APPENDIX

PSEUDO CODE..............................................................................................................123

    Calibration Algorithm Functions..............................................................................123
    Feature Extraction Functions....................................................................................126
    Main function for calibration and feature extraction................................................131

LIST OF REFERENCES.................................................................................................132

BIOGRAPHICAL SKETCH...........................................................................................134
LIST OF TABLES

Table                                                                                                                              Page

6-1 Camera parameters......................................................................................................80
7-1 Sony CCD camera positioned at 16 feet......................................................................85
7-2 Sony CCD camera positioned at 18 feet......................................................................86
7-3 Sony CCD camera positioned at 20 feet......................................................................87
7-4 Sony CCD camera positioned at 22 feet......................................................................88
7-5 Sony CCD camera positioned at 24 feet......................................................................89
7-6 Sony CCD camera positioned at 26 feet......................................................................90
7-7 Sony CCD camera positioned at 28 feet......................................................................91
7-8 Sony CCD camera positioned at 30 feet......................................................................92
7-9 Sony CCD camera positioned at 32 feet......................................................................93
7-10 Sony CCD camera positioned at 34 feet....................................................................94
7-11 Sony CCD camera positioned at 36 feet....................................................................95
7-12 Sony CCD camera positioned at 38 feet....................................................................96
7-13 Sony CCD camera positioned at 40 feet....................................................................97
7-14 Sony CCD camera positioned at 45 feet....................................................................98
7-15 Sony CCD camera positioned at 50 feet....................................................................99
7-16 Canon power shot camera positioned at 16 feet......................................................100
7-17 Canon power shot camera positioned at 18 feet......................................................101
7-18 Canon power shot camera positioned at 20 feet......................................................102
7-19 Canon power shot camera positioned at 22 feet......................................................103
7-20 Canon power shot camera positioned at 24 feet......................................................104
7-21 Canon power shot camera positioned at 26 feet......................................................105
7-22 Canon power shot camera positioned at 28 feet......................................................106
7-23 Canon power shot camera positioned at 30 feet......................................................107
7-24 Canon power shot camera positioned at 32 feet......................................................108
7-25 Canon power shot camera positioned at 34 feet......................................................109
7-26 Canon power shot camera positioned at 36 feet......................................................110
7-27 Canon power shot camera positioned at 38 feet......................................................111
7-28 Canon power shot camera positioned at 40 feet......................................................112
7-29 Canon power shot camera positioned at 45 feet......................................................113
7-30 Canon power shot camera positioned at 50 feet......................................................114
LIST OF FIGURES

Figure                                                                                                                            Page

1-1. Navigation test vehicle................................................................................................3
2-1. Autonomous robot for a known environment............................................................17
2-2. Z shaped landmarks located at 50 m intervals...........................................................17
2-3. The Z shaped landmark.............................................................................................18
2-4. Miniature robot equipped to follow chemical trails on the ground...........................18
2-5. Beacon tracking system (LASERNET).....................................................................19
2-6. LaserNav laser-based beacon sensor.........................................................................19
3-1. Landmark-based positioning system problem statement...........................................28
3-2. View of the Trilateration Technique.........................................................................29
4-1. Example of distortion................................................................................................55
4-2. Pinhole camera model...............................................................................................55
4-3. Steps involved in transformation from 3D to 2D image coordinates........................56
4-4. Calibration object......................................................................................................56
4-5. View of 3D and 2D data points.................................................................................57
4-6. Flow chart of calibration procedure...........................................................................57
4-7. Functions hierarchy details........................................................................................58
5-1. Example of PPM format............................................................................................73
5-2. Smoothening Example 1............................................................................................74
5-3. Smoothening Example 2............................................................................................74
5-4. Thresholding example on orange dots.......................................................................74
5-5. Thresholding example on pink dots...........................................................................64
5-6. Segmentation images.................................................................................................75
5-7. Samples drawn from a two dimensional Gaussian model.........................................76
5-9. Gaussian based threshold example............................................................................77
5-10. Gaussian based thresholding example on two colors..............................................77
5-11. Blob detection Example 1........................................................................................77
5-12. Blob detection Example 2........................................................................................78
5-13. Blob detection Example 3........................................................................................78
5-14. Flow chart of the feature detection algorithm.........................................................78
6-1. Calibration object......................................................................................................82
7-1. Blue control point's centers marked........................................................................116
7-2. Bad detection of blobs beyond 50 feet....................................................................117
7-3. Sony CCD camera (distance) vs (% relative error in total distance).......................117
7-4. Sony CCD camera (distance) vs (% relative error in orientation angle).................118
7-5. Canon power shot (distance) vs (% relative error in total distance)........................118
7-6. Canon power shot (distance) vs (% relative error in orientation angle)..................119
Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

OUTDOOR LOCALIZATION TECHNIQUE USING LANDMARKS TO DETERMINE POSITION AND ORIENTATION

By

Sai Krishna Yeluri

May 2003

Chair: Carl D. Crane III
Department: Mechanical and Aerospace Engineering

This study develops techniques for determining the position of a vehicle for local navigation with respect to artificial landmarks. We present two methods of determining the position and orientation of a remote vehicle system. The first, trilateration, uses a vision system mounted on the vehicle; the position and orientation of the vehicle are determined from the directions to three fixed target points. The second method also uses a vision system as a sensor; however, it determines the position and orientation of the vehicle using the concept of camera calibration and just one target point. We describe the camera calibration procedure and the feature extraction procedure required by the camera calibration. For the experiments we built an artificial landmark and used it to determine the position and orientation of the camera with respect to the landmark using the camera calibration technique. Basic vision techniques were used to determine the position and orientation information. We compared results obtained using the second system to
the manually obtained values. The algorithm developed was efficient, autonomous, versatile, and accurate within a certain range. The technique can work with any off-the-shelf camera. Recommendations are given for future work to improve the system's functionality.
CHAPTER 1
INTRODUCTION

Navigation of mobile robots is a broad topic, covering a large spectrum of different technologies and applications. It draws on some very ancient techniques, as well as some of the most advanced space science and engineering. The degree of autonomy now available after a machine has been programmed approaches what was once considered purely science fiction. Depending on the scale of the problem, navigation can be categorized into two techniques:

Global navigation determines one's position in absolute or map-referenced terms.

Local navigation determines one's position relative to objects, either stationary or moving, in the environment.

The technology used for global mobile robot navigation is developing rapidly. One modern system is the Global Positioning System (GPS). Since its release by the US Department of Defense, it has superseded several other systems used by the military and by civilians across the world. When differential techniques (DGPS) were added, GPS became a successful method for global reference navigation. However, the use of GPS or DGPS is hindered by periodic signal blockage due to obstructions, and when there is multipath interference from large reflective surfaces in the vicinity, other positioning techniques are required. One local navigation technique being developed uses landmark-based navigation.

Research in the area of autonomous vehicle navigation is being conducted at the Center for Intelligent Machines and Robotics (CIMAR) at the University of Florida under
the sponsorship of the Air Force Research Laboratory at Tyndall Air Force Base, Florida. Our aim was to develop technologies that provide the Air Force with an autonomous navigation capability. CIMAR developed its Navigation Test Vehicle (NTV) based on a Kawasaki Mule ATV (Figure 1-1). The NTV uses a Novatel Beeline GPS system that offers real-time kinematic position and orientation data. Currently, the NTV cannot navigate locally when GPS signals are obstructed. Our aim was to develop a technique that could be used for the local navigation of the test vehicle, using nothing but landmark-based navigation.

Chapter 2 gives a detailed description of landmark-based navigation. It also briefly surveys the landmark-based positioning systems currently available on the market, and the reasons why these systems were not used for local navigation of our test vehicle. Chapter 3 describes the initial technique developed for local navigation: trilateration, a vision-based navigation technique (a type of landmark navigation that uses only optical sensors, not dead reckoning, laser sensors, or ultrasonic sensors). The chapter presents the entire algorithm behind this technique for determining position and orientation. Although the results seemed promising, this technique was not chosen because of cost considerations.

Chapter 4 describes the technique actually chosen for landmark navigation: a vision-based technique called camera calibration, used to determine the position and orientation of the camera. Chapter 4 introduces the basic concepts of the technique and then explains the algorithm used to implement it. The camera's position and orientation are calculated relative to a target object. If this camera were
placed on the vehicle, then the vehicle's position and orientation can also be determined locally, since the transformation between the camera and the vehicle is assumed to be known. Chapter 5 explains the feature extraction process, which is part of the camera calibration technique. Chapter 6 explains the experimental procedure. Chapter 7 gives the results of the camera calibration technique, including its accuracy. Chapter 8 suggests future work to improve the technique.

Figure 1-1. Navigation test vehicle
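The camera-to-vehicle relationship is a fixed rigid-body transform, so the vehicle pose follows by composing transforms. A minimal planar sketch of this composition (our own illustration with assumed numbers, not the notation used in this thesis):

```python
import numpy as np

def pose2d(x, y, theta):
    """Homogeneous 2D transform for a pose (x, y, heading theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x],
                     [s,  c, y],
                     [0.0, 0.0, 1.0]])

# Pose of the camera in the landmark frame (what camera calibration yields),
# and the fixed pose of the camera in the vehicle frame (measured once).
T_landmark_camera = pose2d(4.0, 1.0, np.deg2rad(30))
T_vehicle_camera = pose2d(0.5, 0.0, 0.0)  # camera 0.5 m ahead of vehicle origin

# Vehicle pose in the landmark frame: compose with the inverse mount transform.
T_landmark_vehicle = T_landmark_camera @ np.linalg.inv(T_vehicle_camera)
print(T_landmark_vehicle[:2, 2])  # vehicle (x, y) in the landmark frame
```

Because the mount transform is constant, localizing the camera localizes the vehicle at no extra sensing cost.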
CHAPTER 2
BACKGROUND REVIEW

This chapter begins by describing landmarks and their classification. It also gives the reader an idea of the positioning systems available on the market today, and the reasons why these positioning systems are not used on the test vehicle.

Landmarks are distinct features that a robot can recognize from its sensory input. Landmarks can be geometric shapes (e.g., rectangles, lines, circles) and may include additional information (e.g., in the form of barcodes). Landmarks generally have a fixed and known position. Landmarks are classified into two types: natural and artificial. Natural landmarks are objects or features that are already in the environment and that have a function other than robot navigation. Artificial landmarks are specially designed objects or markers placed in the environment with the sole purpose of enabling robot navigation.

Natural Landmarks

The main problem in landmark navigation is detecting and matching the characteristic features from sensory inputs. Most computer-vision-based natural landmarks are long vertical edges (such as doors, wall junctions, and ceiling lights). A natural landmark positioning system generally has the following basic components:

A sensor (computer vision) for detecting landmarks and contrasting them against the background.

A method for matching observed features with a map of known landmarks.
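For vision sensors, the matching component is often implemented with normalized cross-correlation template matching, which tolerates affine brightness changes between the stored and observed appearance of a landmark. A minimal sketch (our own illustration; function names are not from any system described in this chapter):

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between two equal-shaped arrays.
    Invariant to affine brightness changes, which is why it suits
    matching a landmark's expected appearance against live images."""
    a = patch - patch.mean()
    b = template - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def find_landmark(image, template):
    """Slide the template over the image and return the top-left corner
    of the best-matching window (brute force; real systems restrict the
    search region)."""
    th, tw = template.shape
    best, best_rc = -2.0, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best:
                best, best_rc = score, (r, c)
    return best_rc

rng = np.random.default_rng(0)
image = rng.random((20, 20))
template = image[5:9, 7:11].copy()  # synthetic landmark appearance
print(find_landmark(image, template))  # (5, 7)
```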
One system using natural landmarks was developed in Canada as part of a project developing a sophisticated robot system called the "Autonomous Robot for a Known Environment" (ARK) (Jenkin and Milios 1993). A Cybermotion K2A+ platform serves as the carrier for a number of sensor subsystems. The ARK navigation module consists of a custom-made pan-and-tilt table, a CCD camera, and an eye-safe IR spot laser rangefinder. The navigation module is used to periodically correct the robot's accumulating odometry errors. The system uses natural landmarks (such as alphanumeric signs, semi-permanent structures, or doorways). The only criterion is that the landmark be distinguishable from the background scene by color or contrast. Figure 2-1 shows the ARK.

The ARK navigation module uses an interesting hybrid approach: the system stores landmarks by generating a three-dimensional gray-level surface from a single training image obtained from the CCD camera. Once a suitable landmark is found, the projected appearance of the landmark is computed. The expected appearance is then used in a normalized-correlation-based matching algorithm that yields the robot's relative distance and bearing with respect to that landmark. With this procedure the ARK can identify different natural landmarks and measure its position relative to them. Positioning accuracy depends on the geometry of the robot. As one can see from this discussion, the ARK navigation module can only be used in an indoor environment.

Artificial Landmarks

Detection is much easier with artificial landmarks (Abdel and Karara 1971), which are designed for optimal contrast. In addition, the exact size and shape of artificial landmarks are known in advance. Researchers have used different kinds of patterns or marks as artificial landmarks. The geometry of the method and the associated techniques
for position estimation vary accordingly. Many artificial landmark positioning systems are based on computer vision. The various types of artificial landmarks that have been developed are described below.

Komatsu's Z-Shaped Landmark

Komatsu Ltd. (Tokyo, Japan) is a manufacturer of construction machines. One of Komatsu's research projects aims at developing an unmanned dump truck (Matsuda and Yoshikawa 1989). As early as 1984, researchers at Komatsu Ltd. developed an unmanned electric car that could follow a previously taught path around the company's premises. The vehicle had two onboard computers, a directional gyrocompass, two incremental encoders on the wheels, and a metal sensor that detected special landmarks along the planned path. The accuracy of the vehicle's dead-reckoning system was approximately two percent, on paved road and during straight-line motion only. The mechanical gyrocompass was originally designed for deep-sea fishing boats, and its static direction accuracy was 1 degree. On rough terrain the vehicle's dead-reckoning error deteriorated notably. However, with the Z-shaped landmarks used in this project for periodic recalibration, the position could be corrected to an accuracy of 10 cm (4 in). The 3-meter-wide landmark was made of 50-mm-wide aluminum strips sandwiched between two rubber sheets. In order to distinguish between "legitimate" metal markings of the landmark and arbitrary metal objects, additional parallel line segments were used. The metal markers used as landmarks in this experiment are resilient to contamination even in harsh environments: water, dust, and lighting conditions do not affect the readability of the metal sensor. Figure 2-3 shows how Z landmarks are set up.
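A rough sketch of the geometry (our own illustration, not Komatsu's published formulation): the sensor crosses the first parallel strip, then the diagonal, then the second parallel strip, and the ratio of the two odometry distances between crossings locates the vehicle laterally across the landmark width.

```python
def lateral_position(l1, l2, width):
    """Estimate lateral position across a Z-shaped landmark.

    l1: odometry distance from the first strip crossing to the diagonal crossing
    l2: odometry distance from the diagonal crossing to the third strip crossing
    width: lateral width of the landmark (e.g., 3.0 m for Komatsu's strips)

    Assumes the diagonal spans the full width between the two parallel
    strips, so the fraction l1 / (l1 + l2) locates the crossing point,
    and hence the vehicle, across the landmark.
    """
    return width * l1 / (l1 + l2)

# A vehicle crossing exactly mid-landmark sees equal distances:
print(lateral_position(0.5, 0.5, 3.0))  # 1.5
```

Note that only the ratio of the two distances matters, which is why the landmark's exact position along the direction of travel need not be known for the lateral correction.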
Each Z-shaped landmark comprises three line segments. The first and third segments are parallel, and the second lies diagonally between them. During operation, a metal sensor located underneath the autonomous vehicle detects the three crossing points P1, P2, and P3. The distances L1 and L2 are measured by the incremental encoders using odometry. The lateral position error can be corrected after passing through the third crossing point P3. Note that for this correction method the exact location of the landmark along the line of travel does not have to be known. However, if the location of the landmark is known, then the vehicle's actual position at P2 can be calculated easily. The size of the Z-shaped landmark can be varied according to the expected lateral error of the vehicle. Larger landmarks can be buried under the surface of paved roads for unmanned cars; smaller landmarks can be installed under factory floor coating or under office carpet. Komatsu has developed such smaller Z-shaped landmarks for indoor robots and AGVs.

Optical Reflectors

A new mobile robot guidance system using optical reflectors was developed by Yuji Mesaki and Isao Masuda (Mesaki and Masuda 1992). The advantages of using reflectors are that they are 1) easy to identify, 2) inexpensive, 3) easy to set up, and 4) inconspicuous to humans. A CCD camera is used for finding landmarks and determining the position of the robot. An ordinary reflector design is a thin sticker stuck on the ceiling at certain places to help find the absolute position and orientation of the mobile robot. This guidance system uses these inconspicuous and inexpensive landmarks to achieve a smooth, non-jerky trajectory. The images of a ceiling landmark acquired by the CCD camera appear as bright regions. These regions are labeled. The labeled landmark coordinates are calculated using parameters like height of
ceiling h, tilt angle alpha, etc. In the next stage a combination of landmarks is recognized, and the turning direction at a point just under the landmarks is calculated. The indexed landmarks are used in motion commands. A trajectory with continuous curvature is generated, which suppresses jerky motion. The control runs with a period of 1/30 second to fit the hardware configuration; it takes about 1 second to extract and recognize the landmark information from an acquired image. To increase the robot's speed, a more complex real-time control scheme, using many landmarks and maps, must be developed. These kinds of landmarks can be used only indoors.

Acoustic Artificial Landmarks

In this application a mobile robot is equipped with sonar and is capable of locating acoustic reflectors. The acoustic landmarks are chosen from among the acoustic reflectors whose positions are independent of the mobile robot's position. By matching the two-dimensional patterns of the acoustic landmarks against newly recognized acoustic reflectors, the system can navigate without error accumulation. This work was done by Joong Hyup Ko, Seung Do Kim, and Myung Jin Chung of the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (Borenstein et al. 1996). The acoustic landmarks used are corner-type and edge-type reflectors. Acoustic reflectors are located from measurements of the range and direction to them. Localization in this method amounts to establishing the relationship between the mobile robot and the acoustic landmarks whose positions were determined with respect to a world coordinate frame. A mobile robot can localize itself in the vicinity of a reference frame, where the acoustic landmarks were acquired beforehand, as long as more than one landmark can be recognized. This technique can be used only indoors.
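The relationship described above — a robot pose tied to landmarks measured by range and direction — can be sketched in its simplest form. This is our own illustration (not the authors' formulation), assuming a single known landmark and a known heading:

```python
import math

def robot_position(landmark, measured_range, bearing, heading):
    """Recover the robot's world (x, y) from one sonar measurement.

    landmark: known world position (lx, ly) of the acoustic landmark
    measured_range: sonar range to the landmark
    bearing: direction to the landmark in the robot frame (radians)
    heading: robot heading in the world frame (radians), assumed known

    With heading unknown, at least two landmarks are needed, which is
    consistent with the requirement above that more than one landmark
    be recognized.
    """
    lx, ly = landmark
    angle = heading + bearing
    return (lx - measured_range * math.cos(angle),
            ly - measured_range * math.sin(angle))

# Robot at the origin, heading 0, sees a landmark at (3, 4) at range 5:
print(robot_position((3.0, 4.0), 5.0, math.atan2(4.0, 3.0), 0.0))
```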
Line Navigation Using Artificial Landmarks

Another type of landmark navigation that has been widely used in industry is line navigation. Line navigation can be thought of as following a continuous landmark, although in most cases the sensor used needs to be very close to the line, so the range of the vehicle is limited to the immediate vicinity of the line (Tsumura 1986). There are different implementations of line navigation, such as electromagnetic guidance, reflecting-tape guidance, and ferrite-paint guidance, which uses ferrite magnet powder painted on the floor. These techniques have been in use for many years in industrial automation tasks, and vehicles using them are generally called Automatic Guided Vehicles (AGVs). However, two recently introduced variations of the line navigation approach are of interest for mobile robots. Both techniques are based on the use of short-lived navigational markers (SLNM). Monash University has investigated two forms of SLNM: a thermal trail and an odor trail.

Thermal trail

The major contribution of Lindsay Kleeman and R. Andrew Russell (Kleeman and Russell 1993) is that they used a thermal sensor for short tracks, as opposed to computer vision, dead reckoning, or other sensors. Due to the speed and characteristics of the thermal sensor, modeling of robot dynamics, slippage, and friction is not required in this problem. To detect the infrared energy radiated by the heat trail, a pyroelectric sensor based on lithium tantalate was employed. Within the sensor the lithium tantalate is formed into a thin-plate capacitor and has a temperature-dependent spontaneous polarization. A single pyroelectric sensor is mounted at the end of a servo-controlled arm, and this system is contained inside an aluminum enclosure. Because only one sensor is used for
both measurements, there are no problems of matching the characteristics of two individual devices. A Motorola 68HC11 microcontroller coordinates the function of the sensor by controlling the position of the servo. A thermal path was laid using a 70 W halogen projector lamp attached to a tractor robot and located 30 mm above a vinyl-tiled floor. The thermal sensor was placed on the front of the robot vehicle. SLNM markers decay after use, so there is no need to remove them; however, the varying intensity of SLNMs makes it difficult to follow the path accurately. Thermal paths can be tracked up to ten minutes after being laid on a tiled floor, allowing SLNM to be of direct assistance in a number of navigation tasks. As one can see, these types of positioning systems are most effective in an indoor environment.

Odor trail

This interesting technique is based on laying down an odor trail and using an olfactory sensor to allow a mobile robot to follow the trail at a later time. The technique was described by Deveza (Deveza et al. 1994), and the experimental system was further enhanced as described by Russell (Russell 1995) at Monash University in Australia. Russell's improved system comprises a custom-built robot equipped with an odor-sensing system. The sensor system uses controlled flows of air to draw odor-laden air over a sensor crystal. The crystal is used as a sensitive balance to weigh odor molecules. The quartz crystal has a coating with a specific affinity for the target odorant; molecules of that odorant attach easily to the coating and thereby increase the total mass of the crystal. While the change of mass is extremely small, it suffices to change the resonant frequency of the crystal. A 68HC11 microprocessor is used to count the crystal's frequency, which is in the kHz region. A change of frequency is indicative of odor concentration. In Russell's
system, two such sensors are mounted at a distance of 30 mm (1-3/16 in) from each other to provide a differential signal that can be used for path tracking. A miniature robot built by Russell, shown in Figure 2-4, is equipped to follow chemical trails on the ground. For laying the odor trail, Russell used a modified felt-tip pen. The odor-laden agent is camphor, dissolved in alcohol. When applied to the floor, the alcohol evaporates quickly and leaves a 10 mm wide camphor trail. Russell measured the response time of the olfactory sensor by letting the robot cross an odor trail at angles of 90 and 20 degrees. Currently, the foremost limitation of Russell's volatile chemical navigational marker is the robot's slow speed of 6 mm/s (1/4 in/s). As one can see, these short-lived landmarks can be used only indoors.
Beacon Navigation System (TRC)
Transitions Research Corporation (TRC), Danbury, CT, has incorporated their LED-based Light Ranger into a compact, low-cost navigational referencing system for open-area autonomous platform control. The TRC Beacon Navigation System calculates vehicle position and heading at ranges up to 24 m (80 ft) within a quadrilateral area defined by four passive retroreflective beacons (Borenstein et al. 1996). A static 15-second unobstructed view of all four beacons is required for initial acquisition and setup, after which only two beacons must remain in view as the robot moves about. At this time there is no provision to periodically acquire new beacons along a continuous route, so operation is currently constrained to a single zone roughly the size of a small building (i.e., 24 x 24 m or 80 x 80 ft). Power requirements are 0.5 A at 12 VDC and 0.1 A at 5 VDC. The system costs $11,000.
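Beacon systems of this kind recover the vehicle's position from angular sightings involving beacons at known locations. The underlying geometry can be sketched as follows: a hypothetical minimal example (not TRC's actual algorithm) in which two beacons at known positions each report the bearing at which the robot is seen, and the position fix is the intersection of the two bearing rays.

```python
import math

def intersect_bearings(b1, theta1, b2, theta2):
    """Position fix from two beacons: b1, b2 are known beacon (x, y)
    positions; theta1, theta2 are the bearings (radians, measured from
    the +x axis) from each beacon toward the robot. Solves
    b1 + t1*u1 = b2 + t2*u2 for the ray parameters by Cramer's rule."""
    u1 = (math.cos(theta1), math.sin(theta1))
    u2 = (math.cos(theta2), math.sin(theta2))
    dx, dy = b2[0] - b1[0], b2[1] - b1[1]
    det = u1[0] * (-u2[1]) + u2[0] * u1[1]
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel; no unique intersection")
    t1 = (dx * (-u2[1]) + u2[0] * dy) / det
    return (b1[0] + t1 * u1[0], b1[1] + t1 * u1[1])
```

In practice a third beacon is needed to resolve heading as well as position, which is why systems such as the one above use four beacons with redundancy.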
Fixed Beacon System (CONAC)
A stationary active-beacon system that tracks an omnidirectional sensor mounted on the robot is currently being sold to allow for tracking multiple units. The basic system consists of two synchronized stationary beacons that provide bearings to the mobile sensor to establish its x-y location. A hybrid version of this approach employs two lasers in one of the beacons, where the lower laser plane is tilted from the vertical to provide coverage along the z axis for three-dimensional indoor systems (Borenstein et al. 1996). Long-range exterior position accuracy is specified as 1.3 mm, and the heading accuracy as 0.05 degrees. The nominal maximum line-of-sight distance is 250 m. This system represents the current state of the art in active beacon positioning. A basic system with one strobe and three nodes costs approximately $4,000.
Imperial College Beacon Navigation System
Premi and Besant (Premi and Besant 1987) of the Imperial College of Science and Technology, London, England, describe an AGV guidance system that incorporates a vehicle-mounted laser beam rotating in a horizontal plane that intersects three fixed-location reference sensors. The photoelectric sensors are arranged in collinear fashion with equal separation and are individually wired to a common FM transmitter via appropriate electronics, so that the time of arrival of laser energy is relayed to a companion receiver on board the vehicle. A digitally coded identifier in the data stream identifies the activated sensor that triggered the transmission, thus allowing the onboard computer to measure the separation angles. Tradeoffs include the increased cost associated with installation of power and communications lines and the need for significantly more expensive beacons. This can be a serious drawback in very large area
installations, or scenarios where multiple beacons must be incorporated to overcome line-of-sight limitations.
Spatial Positioning System (Odyssey)
Spatial Positioning Systems (SPS) of Reston, Virginia, develops and markets a high-accuracy 3-D positioning system called Odyssey. The Odyssey system was originally developed for the accurate surveying of construction sites (and for retroactive three-dimensional modeling of buildings, etc.). However, it appears that the system can be used for mobile robot operations quite easily (Borenstein et al. 1996). The Odyssey system comprises two or more stationary laser transmitters and a mobile optical receiver mounted on top of a red-and-white receiving pole. The receiver is connected to a portable data-logging device with real-time data output via an RS-232 serial interface. In its originally intended hand-held mode of operation, the surveyor holds the tip of the receiver wand at a point of interest, and the system instantly records the three-dimensional coordinates of that point. To set up the Odyssey system, two or more transmitters must be placed at precisely known locations in the environment. Alternatively, the accurate transmitter positions can be computed in a reverse calibration procedure in which the receiver is placed at four known positions. Once the transmitters are located at known positions, one or more receivers can produce data points simultaneously while being applied in the same environment. For mobile robot applications the Odyssey system is expensive at roughly $90,000, depending on system configuration.
Laser Net Beacon Tracking System (NAMCO)
The NAMCO LaserNet beacon tracking system uses retroreflective targets distributed throughout the operating area of an automated guided vehicle (AGV) in order
to measure range and angular position. A servo-controlled rotating mirror pans a near-infrared laser beam through a horizontal arc of 90 degrees at a 20 Hz update rate. When the beam sweeps across a target of known dimensions, the detector senses a return signal of finite duration. Since the targets are all the same size, the signal generated by a close target will be of longer duration than that from a distant one. Figure 2-5 shows the LaserNet beacon navigation system. The angular accuracy is 0.1 percent and the angular resolution is 0.1 degrees for the analog output; accuracy is within 0.05 percent with a resolution of 0.006 degrees when the RS-232 serial port is used. The analog output is a voltage ranging from 0 to 10 V over the range of -45 to +45 degrees, whereas the RS-232 serial port reports a proportional "count value" from 0 to 15360 over this same range. The system costs $3,400 in its basic configuration, but it has only a limited range of 15 m.
Denning Branch International Robotics Laser Navigation Position Sensor
Denning Branch International Robotics (DBIR), Pittsburgh, PA, offers a laser-based scanning beacon system that computes vehicle position and heading out to 183 m using cooperative electronic transponders, called active targets. A range of 30.5 m is achieved with simple reflectors, and an absolute bearing accuracy of 0.03 degrees is obtained. The fan-shaped beam is spread 4 degrees vertically to ensure target detection at long range while traversing irregular floor surfaces, with horizontal divergence limited to 0.017 degrees. Each target can be uniquely coded, so that the laser can distinguish up to 32 separate active or passive targets during a single scan. The eye-safe near-infrared laser generates a 1 mW output at a wavelength of 810 nanometers. Figure 2-6 shows the LaserNav laser-based sensor.
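The LaserNet principle described above, in which range is inferred from how long the sweeping beam dwells on a fixed-size target, can be sketched numerically. This is a small-angle toy model with an assumed uniform sweep rate derived from the figures quoted in the text, not NAMCO's actual formula:

```python
import math

SWEEP_DEG = 90.0   # horizontal arc swept per scan (from the text)
SCAN_HZ = 20.0     # update rate (from the text)

def range_from_duration(target_width_m, return_duration_s):
    """Estimate range to a retroreflective target of known width from the
    duration of its return signal. Small-angle model: the target subtends
    (width / range) radians, and the beam's angular rate is assumed
    uniform at sweep_angle * update_rate, so
    duration = (width / range) / angular_rate."""
    omega = math.radians(SWEEP_DEG) * SCAN_HZ   # assumed angular rate, rad/s
    return target_width_m / (omega * return_duration_s)
```

This captures the inverse relationship stated in the text: a closer target produces a longer return, so range falls as duration grows.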
One potential source of problems with this device is the relatively small vertical divergence of the beam (2 degrees). Another is erroneous beacon readings, likely due to reflections off shiny surfaces other than the passive reflectors; this problem probably affects all light-based beacon navigation systems to some degree. A further source of erroneous beacon readings is bright sunlight entering the workspace through wall openings, so this system can be used only indoors.
Learned Landmarks
In an attempt to capitalize on the benefits of both the image-based and landmark-based methods, Robert Sim and Gregory Dudek of the Center for Intelligent Robots at McGill University described a method that combines their strengths. Images are encoded as a set of visual features called landmarks. This approach uses image landmarks to perform position estimation but learns these landmarks from a preliminary traversal of the environment. They selected extrema of the edge-density distribution in each image as landmark candidates and then produced low-dimensional descriptions of the appearance of each of the observed landmarks. Only those candidates which are maximal over their neighborhood and whose density lies within a user-defined threshold are retained. The following two subsections explain the details of each stage (Betke and Gurvits 1994).
Landmark Database
Viewpoints are selected such that the camera is facing in a consistent orientation; once these images have been obtained, they are used to automatically compute a suitable set of landmarks for subsequent position estimation. The goal of this method is to grow tracked landmarks over space so that a candidate landmark can be matched to the correct target over a large portion of space.
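The candidate-selection rule described above, keeping edge-density extrema that are maximal over their neighborhood and fall inside a user-defined density band, can be sketched as follows. This is a hypothetical toy on a small 2-D density map; Sim and Dudek's actual detector operates on full images:

```python
def landmark_candidates(density, lo, hi, radius=1):
    """Pick landmark candidates from an edge-density map (2-D list of
    numbers): cells that are strictly maximal over their neighborhood and
    whose density lies within the user-defined band [lo, hi]."""
    rows, cols = len(density), len(density[0])
    out = []
    for r in range(rows):
        for c in range(cols):
            v = density[r][c]
            if not (lo <= v <= hi):
                continue                      # outside the density band
            neigh = [density[i][j]
                     for i in range(max(0, r - radius), min(rows, r + radius + 1))
                     for j in range(max(0, c - radius), min(cols, c + radius + 1))
                     if (i, j) != (r, c)]
            if all(v > n for n in neigh):     # maximal over the neighborhood
                out.append((r, c))
    return out
```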
Online Localization
Online localization is performed by matching candidate landmarks and exploiting the transformation of each landmark into a subspace defined by its corresponding tracked set. Given a set of position estimates from the set of observed landmarks in an image, a final position estimate is obtained by first detecting and removing outliers using a median filter and then taking the mean of the remaining estimates. The approach is applied to domain-specific landmarks using a subspace projection method based on an experimentally validated attention criterion, which selects maxima of the edge-density distribution. The use of discrete landmarks generated by an encoding of the landmark sub-images makes the method potentially robust against isolated changes in the environment, and it allows for post-processing of selected landmarks to apply additional criteria. Finally, the fact that the landmark representation is learned suggests that the technique can be applied to a wide range of different environments.
This chapter has presented several landmark-based navigation systems that are available in the market today, and it has explained why none of these systems was used on the test vehicle: the need is for a positioning system that can be used outdoors, and most of the systems discussed above are limited to indoor applications or are too expensive. The following chapters discuss the techniques that have been developed for the outdoor application using artificial landmarks.
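The outlier-rejection step of the online localization stage described above, median filtering followed by averaging the surviving estimates, can be sketched as follows (an illustrative toy; the deviation threshold and the data are hypothetical):

```python
import statistics

def robust_position(estimates, max_dev=1.0):
    """Fuse per-landmark (x, y) position estimates: discard estimates far
    from the coordinate-wise median, then average the survivors."""
    mx = statistics.median(e[0] for e in estimates)
    my = statistics.median(e[1] for e in estimates)
    inliers = [e for e in estimates
               if abs(e[0] - mx) <= max_dev and abs(e[1] - my) <= max_dev]
    n = len(inliers)
    return (sum(e[0] for e in inliers) / n,
            sum(e[1] for e in inliers) / n)
```

A single grossly wrong landmark match is thus prevented from dragging the mean, which is the motivation for the median filter in the method described above.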
Figure 2-1. Autonomous robot for a known environment (Courtesy of Atomic Energy of Canada Ltd.) Figure 2-2. Z-shaped landmarks located at 50 m intervals ("Where am I? Sensors and Methods for Mobile Robot Positioning," Borenstein, J., Everett, H.R., and Feng, L., Fig. 7.7, pg. 179, 1996)
Figure 2-3. The Z-shaped landmark ("Where am I? Sensors and Methods for Mobile Robot Positioning," Borenstein, J., Everett, H.R., and Feng, L., Fig. 7.9, pg. 180, 1996) Figure 2-4. Miniature robot equipped to follow chemical trails on the ground
Figure 2-5. Beacon tracking system (LASERNET) (Courtesy of Namco Controls Corp.) Figure 2-6. LaserNav laser-based beacon sensor (Courtesy of Denning Branch International Robotics)
CHAPTER 3
TRILATERATION
This chapter explains the algorithm behind the initial technique, called trilateration, that was developed to determine the position and orientation of the vehicle by measuring the direction from the vehicle to three pre-positioned landmarks; it also explains the reasons why this technique was not used. This approach is similar to previous beacon-based navigation systems in which optical reflectors, short-lived navigational markers such as odor trails or heat trails, and acoustic landmarks were used, but those systems could be used only indoors. The three landmarks, which are distinguishable from the environment, are detected by using cameras placed on the vehicle. For this analysis, the world coordinate system will be referred to as coordinate system 1, while the vehicle coordinate system is referred to as coordinate system 2. The problem statement of the algorithm can be defined as follows:
Given: ^1P_1, ^1P_2, ^1P_3, the coordinates of three target points measured in the 1st coordinate system, and α_1, β_1, α_2, β_2, α_3, β_3, the orientation angles toward the target points.
Find: ^1T_2, the 4 x 4 transformation matrix that defines the position and orientation of the 2nd coordinate system with respect to the 1st.
Thus the position and orientation of the vehicle coordinate system can be determined by measuring the direction to three fixed target points. Figure 3-1 depicts the definition of the angles α_i and β_i. The direction to a target point is defined by first
rotating a line parallel to the x axis of the vehicle coordinate system by an angle β_i about the z axis. This line is then rotated by an angle α_i about the modified y axis of the vehicle coordinate system. The coordinates of target points 1 through 3 can be written in terms of the 2nd coordinate system as
^2P_i = d_i [ cos α_i cos β_i , cos α_i sin β_i , sin α_i ]^T , i = 1, 2, 3    (3-1)
where the distances d_1, d_2, and d_3 are unknown. Let L_12, L_23, and L_31 represent the distances between targets 1 and 2, 2 and 3, and 3 and 1, respectively. These distances can be determined from the target point coordinates that are given in terms of the 1st coordinate system. Next, the following terms are defined to represent the components of the direction cosines of the vectors defined in Eq. 3-1:
S_ix = cos α_i cos β_i , S_iy = cos α_i sin β_i , S_iz = sin α_i , i = 1, 2, 3    (3-2)
Thus Eq. 3-1 can be written as
^2P_i = d_i [ S_ix , S_iy , S_iz ]^T , i = 1, 2, 3    (3-3)
The distances between the target points can be determined in terms of the coordinates of the target points measured in the 2nd coordinate system as
L_12^2 = (d_2 S_2x − d_1 S_1x)^2 + (d_2 S_2y − d_1 S_1y)^2 + (d_2 S_2z − d_1 S_1z)^2
L_23^2 = (d_3 S_3x − d_2 S_2x)^2 + (d_3 S_3y − d_2 S_2y)^2 + (d_3 S_3z − d_2 S_2z)^2
L_31^2 = (d_1 S_1x − d_3 S_3x)^2 + (d_1 S_1y − d_3 S_3y)^2 + (d_1 S_1z − d_3 S_3z)^2    (3-4)
These equations can be expanded and arranged into the form
d_2^2 + d_2 (−2 k_1 d_1) + (d_1^2 − L_12^2) = 0    (3-5)
d_3^2 + d_2^2 + d_2 d_3 (−2 k_2) − L_23^2 = 0    (3-6)
d_3^2 + d_3 (−2 k_3 d_1) + (d_1^2 − L_31^2) = 0    (3-7)
where
k_1 = S_1x S_2x + S_1y S_2y + S_1z S_2z ,
k_2 = S_2x S_3x + S_2y S_3y + S_2z S_3z ,
k_3 = S_1x S_3x + S_1y S_3y + S_1z S_3z    (3-8)
Eq. 3-5 and Eq. 3-7 are now rewritten as
d_2^2 + 2 B_1 d_2 + C_1 = 0    (3-9)
d_3^2 + 2 B_2 d_3 + C_2 = 0    (3-10)
where
B_1 = −k_1 d_1 , C_1 = d_1^2 − L_12^2 , B_2 = −k_3 d_1 , C_2 = d_1^2 − L_31^2    (3-11)
Solving Eq. 3-9 for d_2 and Eq. 3-10 for d_3 gives
d_2 = −B_1 ± √Δ_1    (3-12)
d_3 = −B_2 ± √Δ_2    (3-13)
where
Δ_1 = B_1^2 − C_1 , Δ_2 = B_2^2 − C_2    (3-14)
Subtracting the sum of Eq. 3-5 and Eq. 3-7 from Eq. 3-6 gives
d_2 d_3 (−2 k_2) + d_2 (2 k_1 d_1) + d_3 (2 k_3 d_1) + (L_12^2 + L_31^2 − L_23^2 − 2 d_1^2) = 0    (3-15)
This equation can be written as
A_3 d_2 d_3 + B_3 d_2 + C_3 d_3 + D_3 = 0    (3-16)
where
A_3 = −2 k_2 , B_3 = 2 k_1 d_1 , C_3 = 2 k_3 d_1 ,
D_3 = L_12^2 + L_31^2 − L_23^2 − 2 d_1^2    (3-17)
Substituting Eq. 3-12 and Eq. 3-13 into Eq. 3-16 and rearranging gives
A_3 √Δ_1 √Δ_2 + (A_3 B_1 B_2 − B_1 B_3 − B_2 C_3 + D_3) = (A_3 B_2 − B_3) √Δ_1 + (A_3 B_1 − C_3) √Δ_2    (3-18)
Squaring both sides of Eq. 3-18, using Eq. 3-14 to substitute for Δ_1 and Δ_2, and rearranging gives
2 (A_3 D_3 − C_3 B_3) √Δ_1 √Δ_2 = 2 B_1 (B_3 D_3 + A_3 C_3 C_2) + 2 B_2 (A_3 B_3 C_1 + C_3 D_3) − 2 B_1 B_2 (A_3 D_3 + B_3 C_3) − (D_3^2 + C_3^2 C_2 + B_3^2 C_1 + A_3^2 C_1 C_2)    (3-19)
Squaring both sides of Eq. 3-19, using Eq. 3-14 to substitute for Δ_1 and Δ_2, and rearranging gives
A_3^4 C_1^2 C_2^2 + 4 A_3^3 [B_2 B_3 C_2 C_1^2 + B_1 C_1 C_2 (C_2 C_3 − B_2 D_3)] + 2 A_3^2 [D_3^2 (−2 C_1 B_2^2 − 2 B_1^2 C_2 + C_1 C_2) + D_3 (8 B_1^2 B_2 C_3 C_2 + 8 B_2^2 B_3 C_1 B_1 + 4 C_2 C_1 B_2 C_3 + 4 C_2 C_1 B_1 B_3) + C_3^2 C_2^2 (−4 B_1^2 − 2 C_1) + C_1^2 B_3^2 (−4 B_2^2 − 2 C_2) − 12 C_2 C_1 B_1 B_3 B_2 C_3] = 0    (3-20)
Substituting Eq. 3-11 and Eq. 3-17 into Eq. 3-20 results in the following fourth-degree polynomial in the variable d_1^2:
a_4 (d_1^2)^4 + a_3 (d_1^2)^3 + a_2 (d_1^2)^2 + a_1 (d_1^2) + a_0 = 0    (3-21)
where
a_4 = −16 k_3^4 + 64 k_1 k_2 k_3^3 + (−2 k_1^2 k_2^2 − k_1^2 − k_2^2 + 1) 32 k_3^2 + (k_1^2 + k_2^2 − 1) 64 k_1 k_2 k_3 + 16 (−k_1^4 + 2 k_1^2 − k_2^4 + 2 k_2^2 − 2 k_1^2 k_2^2 − 1)    (3-22)
a_3 = (L_12^2 + 2 k_1^2 L_23^2 − L_23^2) 32 k_3^4 + (L_23^2 − 4 k_1^2 L_23^2 − L_31^2 − 3 L_12^2) 32 k_1 k_2 k_3^3 + [(4 k_1^2 k_2^2 − 4 k_1^2 + 2 k_1^4 − k_2^2 + 2) 32 L_23^2 + (2 k_1^2 k_2^2 + k_1^2 + k_2^2 − 1) 32 L_31^2 + (2 k_1^2 k_2^2 + 2 k_2^2 + k_1^2 − 2) 32 L_12^2] k_3^2 + [(k_1 k_2 − k_1 k_2^3 + k_1^3 k_2) 32 L_23^2 + (k_1 k_2 − k_1 k_2^3 − k_1^3 k_2) 96 L_31^2 + (3 k_1 k_2 − 3 k_1 k_2^3 − k_1^3 k_2) 32 L_12^2] k_3 + (−k_1^4 − k_1^2 k_2^2 + 2 k_1^2 + k_2^2) 32 L_23^2 + (k_2^4 − k_1^2 − 2 k_2^2 + k_1^2 k_2^2 + 1) 32 L_12^2 + (k_1^4 + k_2^4 + 2 k_1^2 k_2^2 − k_1^2 − 2 k_2^2 + 1) 32 L_31^2    (3-23)
a_2 = (−L_12^4 − L_23^4 + 2 L_12^2 L_23^2) 16 k_3^4 + [(L_12^2 + L_23^2) 32 L_31^2 + (L_12^4 + L_23^4 − 2 L_12^2 L_23^2) 32] k_1 k_2 k_3^3 + [(−2 k_1^2 − 2 k_2^2 + 1) 8 L_31^4 + (−4 k_1^2 k_2^2 − k_2^2 + 3) 16 L_12^2 L_31^2 + (−4 k_1^2 k_2^2 + 4 k_1^2 − 3) 16 L_23^2 L_31^2 + (−2 k_1^2 − 6 k_2^2 + 5) 8 L_12^4 + (4 k_1^2 k_2^2 + 4 k_1^2 + 4 k_2^2 − 5) 16 L_12^2 L_23^2 + (−6 k_1^2 − 2 k_2^2 + 5) 8 L_23^4] k_3^2 + [(2 k_1^2 + 2 k_2^2 − 1) 16 k_1 k_2 L_31^4 + (k_1^2 + 4 k_2^2 − 5) 32 k_1 k_2 L_12^2 L_31^2 + (−2 k_1^2 + k_2^2 + 1) 32 k_1 k_2 L_23^2 L_31^2 + (2 k_1^2 − 1) 16 k_1 k_2 L_23^4 + (k_1^2 + k_2^2 + 1) 32 k_1 k_2 L_12^2 L_23^2 + (2 k_2^2 − 1) 16 k_1 k_2 L_12^4] k_3 + [(−2 k_1^4 − 2 k_2^4 + 5 k_1^2 + 5 k_2^2 − 6 k_1^2 k_2^2 − 3) 8 L_31^4 + (−4 k_2^4 − 2 k_1^2 k_2^2 + 3 k_1^2 + 7 k_2^2 − 3) 16 L_12^2 L_31^2 + (2 k_1^4 + 4 k_1^2 k_2^2 − 5 k_1^2 − 3 k_2^2 + 3) 16 L_23^2 L_31^2 + (−2 k_2^4 + 5 k_2^2 + k_1^2 − 2 k_1^2 k_2^2 − 3) 8 L_12^4 + (k_1^2 − k_2^2 + 1) 48 L_12^2 L_23^2 + (2 k_1^4 + 5 k_1^2 + k_2^2 − k_1^2 k_2^2 − 3) 8 L_23^4]    (3-24)
a_1 = [2 k_1^2 k_2^2 − k_1 k_2 k_3 − k_1^2 − k_2^2 + 1] 8 L_31^6 + [(−4 k_1 k_2^3 k_3 + 2 k_2^2 k_3^2 + 4 k_2^4 + 5 k_1 k_2 k_3 − k_3^2 − 7 k_2^2 − 2 k_1^2 + 3) L_12^2 + (3 k_1^2 + 2 k_2^2 + k_1 k_2 k_3 − 4 k_1^2 k_2^2 + k_3^2 − 3) L_23^2] 8 L_31^4 + [(4 k_2^4 + 2 k_1^2 k_2^2 − k_1^2 − 2 k_3^2 + 5 k_1 k_2 k_3 − 4 k_1 k_2^3 k_3 − 7 k_2^2 + 3) 8 L_12^4 + (2 k_1 k_2^3 k_3 + 2 k_3^2 + 4 k_2^2 − 3 k_1 k_2 k_3 + 2 k_1^2 − 3) 16 L_12^2 L_23^2 + (2 k_1^2 k_2^2 + k_1 k_2 k_3 − 3 k_1^2 − 2 k_3^2 − k_2^2 + 3) 8 L_23^4] L_31^2 + (2 k_2^2 k_3^2 − k_1 k_2 k_3 − k_2^2 − k_3^2 + 1) 8 L_12^6 + (3 k_3^2 + k_1^2 − 4 k_2^2 k_3^2 + k_1 k_2 k_3 + 2 k_2^2 − 3) 8 L_12^4 L_23^2 + (−2 k_1^2 + k_1 k_2 k_3 − k_2^2 + 2 k_2^2 k_3^2 − 3 k_3^2 + 3) 8 L_12^2 L_23^4 + (k_1^2 + k_3^2 − k_1 k_2 k_3 − 1) 8 L_23^6    (3-25)
a_0 = L_31^8 + (−L_12^2 + L_23^2 + 2 k_2^2 L_12^2) 4 L_31^6 + (16 k_2^2 L_12^4 − 6 L_23^4 − 16 k_2^2 L_12^2 L_23^2 − 6 L_12^4 − 16 k_2^4 L_12^4 + 12 L_12^2 L_23^2) L_31^4 + (2 k_2^2 L_12^2 L_23^4 + 3 L_12^4 L_23^2 + L_23^6 − L_12^6 + 2 k_2^2 L_12^6 − 3 L_12^2 L_23^4 − 4 k_2^2 L_12^4 L_23^2) 4 L_31^2 + (4 L_12^2 L_23^6 − L_12^8 + 4 L_12^6 L_23^2 − L_23^8 − 6 L_12^4 L_23^4)    (3-26)
Corresponding values for d_2 and d_3 will now be obtained. Eq. 3-16 may be written as
(A_3 d_2 + C_3) d_3 + (B_3 d_2 + D_3) = 0    (3-27)
Multiplying this equation by d_3 gives
(A_3 d_2 + C_3) d_3^2 + (B_3 d_2 + D_3) d_3 = 0    (3-28)
Eq. 3-10, Eq. 3-27, and Eq. 3-28 may be written in matrix format as
[ 1 , 2 B_2 , C_2 ; 0 , A_3 d_2 + C_3 , B_3 d_2 + D_3 ; A_3 d_2 + C_3 , B_3 d_2 + D_3 , 0 ] [ d_3^2 ; d_3 ; 1 ] = [ 0 ; 0 ; 0 ]    (3-29)
The condition that must exist in order that a common value of d_3 will simultaneously satisfy Eq. 3-10 and Eq. 3-27 is that Eqs. 3-10, 3-27, and 3-28 be linearly dependent. Thus it is necessary that
| 1 , 2 B_2 , C_2 ; 0 , A_3 d_2 + C_3 , B_3 d_2 + D_3 ; A_3 d_2 + C_3 , B_3 d_2 + D_3 , 0 | = 0    (3-30)
Expanding Eq. 3-30 gives
E_1 d_2^2 + E_2 d_2 + E_3 = 0    (3-31)
where
E_1 = −A_3^2 C_2 − B_3^2 + 2 A_3 B_2 B_3    (3-32)
E_2 = 2 C_3 B_2 B_3 + 2 A_3 B_2 D_3 − 2 A_3 C_2 C_3 − 2 B_3 D_3    (3-33)
E_3 = 2 B_2 C_3 D_3 − C_2 C_3^2 − D_3^2    (3-34)
Multiplying Eq. 3-9 by −E_1 and adding to Eq. 3-31 gives
(E_2 − 2 B_1 E_1) d_2 + (E_3 − E_1 C_1) = 0    (3-35)
Solving for d_2 gives
d_2 = (E_1 C_1 − E_3) / (E_2 − 2 B_1 E_1)    (3-36)
Eq. 3-16 can now be used to determine the corresponding value for d_3. The next task is to determine ^1T_2, the 4 x 4 transformation matrix that defines the position and orientation of the second coordinate system measured with respect to the first, for each of the solution sets {d_1, d_2, d_3}. From Eq. 3-3, the coordinates of the three target points are known in the second coordinate system. Unit vectors that point from the first to the second target point and from the first to the third target point can be determined in the first and second coordinate systems as
^iS_12 = (^iP_2 − ^iP_1) / |^iP_2 − ^iP_1| , i = 1, 2    (3-37)
^iS_13 = (^iP_3 − ^iP_1) / |^iP_3 − ^iP_1| , i = 1, 2    (3-38)
The unit vector that is perpendicular to the plane formed by the three target points can be determined in terms of the first and second coordinate systems as
^iS_n = (^iS_12 × ^iS_13) / |^iS_12 × ^iS_13| , i = 1, 2    (3-39)
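Numerically, the distance recovery above reduces to finding the positive real roots of the quartic of Eq. 3-21 in x = d_1^2 and then back-substituting through Eqs. 3-36 and 3-16. A generic root-finding sketch follows (sign-change scan plus bisection; the thesis does not specify a solver, and the scan interval and step count here are arbitrary choices):

```python
def positive_quartic_roots(a4, a3, a2, a1, a0, x_max=100.0, steps=100000):
    """Positive real roots of a4*x^4 + a3*x^3 + a2*x^2 + a1*x + a0 = 0
    (Eq. 3-21, with x = d1^2), found by scanning (0, x_max] for sign
    changes and refining each bracket by bisection."""
    def f(x):
        return (((a4 * x + a3) * x + a2) * x + a1) * x + a0
    roots = []
    h = x_max / steps
    prev_x, prev_f = 0.0, f(0.0)
    for i in range(1, steps + 1):
        x = i * h
        fx = f(x)
        if prev_f == 0.0:
            if prev_x > 0.0:
                roots.append(prev_x)      # a grid point hit the root exactly
        elif prev_f * fx < 0:             # sign change: bisect the bracket
            lo, hi = prev_x, x
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
        prev_x, prev_f = x, fx
    return roots
```

Each root x yields d_1 = √x, and each d_1 yields one candidate solution set; the spurious sets introduced by the two squaring steps must be rejected by checking them against the original equations.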
The transformation matrix ^1T_2 can be written as
^1T_2 = [ ^1R_2 , ^1P_2orig ; 0 0 0 , 1 ]    (3-40)
where ^1R_2 is a 3 x 3 rotation matrix and ^1P_2orig represents the coordinates of the origin point of the second coordinate system as measured with respect to the first coordinate system. The rotation matrix will transform the direction vectors S_12, S_13, and S_n from the second to the first coordinate system according to the equations
^1S_12 = ^1R_2 ^2S_12    (3-41)
^1S_13 = ^1R_2 ^2S_13    (3-42)
^1S_n = ^1R_2 ^2S_n    (3-43)
These three equations can be written in matrix format as
[ ^1S_12 , ^1S_13 , ^1S_n ] = ^1R_2 [ ^2S_12 , ^2S_13 , ^2S_n ]    (3-44)
The matrix ^1R_2 can thus be determined as
^1R_2 = [ ^1S_12 , ^1S_13 , ^1S_n ] [ ^2S_12 , ^2S_13 , ^2S_n ]^-1    (3-45)
The last component of the transformation matrix ^1T_2 to be determined, i.e. ^1P_2orig, can be obtained as
^1P_2orig = ^1P_1 − ^1R_2 ^2P_1    (3-46)
By substituting the values obtained from Eq. 3-46 and Eq. 3-45 into Eq. 3-40, we obtain the 4 x 4 transformation matrix ^1T_2. It has been shown that the position and orientation of a mobile vehicle can be determined by measuring the direction from the vehicle to three pre-positioned landmarks. The approach has been programmed in the C computer language and the results verified using a CAD drawing package.
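A numerical sketch of the final pose-recovery step (Eqs. 3-37 through 3-46) is given below, written in Python rather than the author's C implementation. It builds orthonormal triads from the target points so that the basis inverse of Eq. 3-45 reduces to a transpose — a slight variant of the equations above with the same result:

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def _triad(p1, p2, p3):
    """Orthonormal frame from three points: x along P1->P2, z normal to
    the target plane (cf. Eq. 3-39), y completing a right-handed set.
    Returned as a 3x3 matrix whose columns are x | y | z."""
    x = _unit(_sub(p2, p1))
    z = _unit(_cross(x, _sub(p3, p1)))
    y = _cross(z, x)
    return [[x[i], y[i], z[i]] for i in range(3)]

def pose_from_targets(Pw, Pv):
    """Pose (1R2, 1P2orig) of the vehicle frame in the world frame from
    the three target points given in both frames. Pw and Pv are
    3-tuples of 3-D points in the world and vehicle frames."""
    A = _triad(*Pw)                      # world-frame triad
    B = _triad(*Pv)                      # vehicle-frame triad
    # R = A * B^T: B is orthonormal, so its inverse is its transpose.
    R = [[sum(A[i][k] * B[j][k] for k in range(3)) for j in range(3)]
         for i in range(3)]
    # 1P2orig = 1P1 - R * 2P1   (Eq. 3-46)
    Rp = tuple(sum(R[i][j] * Pv[0][j] for j in range(3)) for i in range(3))
    return R, _sub(Pw[0], Rp)
```

Given a known pose, generating the vehicle-frame points from the world-frame points and running them back through this function recovers the rotation and origin, which mirrors the CAD-based verification described in the text.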
This technique is useful in finding the position of the vehicle with respect to the beacons without the use of range sensors. However, there are drawbacks to this approach. It is necessary that the angular orientation data of the three targets with respect to the vehicle be obtained simultaneously. This requires either three cameras, each tracking the direction to one target point, or stopping the vehicle to obtain the data needed to calculate its position and orientation. This may or may not be practical for a particular scenario. As mentioned previously, the results of this technique have been verified using CAD software. The target points and the two coordinate systems were drawn as shown in Figure 3-1. One coordinate system is the world coordinate system (1st coordinate system) and the other the vehicle coordinate system (2nd coordinate system). The three target points were measured with respect to both of these coordinate systems in the CAD drawing, giving the values of ^1P_1, ^1P_2, ^1P_3 and ^2P_1, ^2P_2, ^2P_3. From this drawing we also calculated the distances between the targets, i.e., L_12, L_23, and L_31, as well as the orientation angles toward the target points, i.e., α_1, β_1, α_2, β_2, α_3, and β_3. The algorithm takes as input the target points in the world coordinate system, the orientation angles, and the distances between the target points; its output is the transformation matrix ^1T_2.
The algorithm was verified by calculating the values of the target points in the vehicle coordinate system, i.e., ^2P_1, ^2P_2, and ^2P_3, using the equations below and comparing the obtained values with the values previously obtained from the CAD drawings:
^2P_1 = ^2T_1 ^1P_1
^2P_2 = ^2T_1 ^1P_2
^2P_3 = ^2T_1 ^1P_3
where ^2T_1 is obtained by calculating the inverse of ^1T_2. The algorithm was also verified in another way: the distances to the target points from the vehicle coordinate system, i.e., d_1, d_2, and d_3, were measured in the CAD drawing, and these values were compared with the values obtained from Eqs. 3-21, 3-36, and 3-16. Results have been obtained for different scenarios by changing the positions of the target points in the CAD drawings and calculating the transformation matrix ^1T_2 for each scenario. Figure 3-2 shows a simulation of how the scenario would look when the vehicle approaches the three landmarks.
Figure 3-1. Landmark-based positioning system problem statement
Figure 3-2. View of the Trilateration Technique
CHAPTER 4
CAMERA CALIBRATION
This chapter deals with a technique to determine a vehicle's position and orientation using only a single landmark and a camera. The basic idea of this technique is to first determine the position of the camera with respect to the landmark; since the camera's position with respect to the vehicle is already known, it then becomes a simple mathematical problem to obtain the position of the vehicle with respect to the landmark. The limitations of the previously developed technique can be overcome by using this new approach, which is called the camera calibration technique.
Camera calibration techniques were initially developed in the field of photogrammetry for aerial imaging and surveying. At first photographic cameras were used, but recently video cameras have replaced them almost completely. Camera calibration plays an important role in application areas such as robot vision and industrial metrology. Three-dimensional vision applications, such as robot vision, require modeling of the relationship between two-dimensional images and the three-dimensional world. Camera calibration is the process that accurately models this relationship; it is a necessary step in 3D computer vision in order to extract metric information from 2D images, and it is therefore an essential part of metric image processing. Camera calibration provides a way of determining the position of an object point, which can be used in applications such as dimensional measurement of mechanical parts, automatic assembly of mechanical or electronic components, robot calibration, and trajectory analysis.
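The 2D–3D relationship that calibration models can be illustrated with a minimal pinhole-projection sketch. This is a toy with hypothetical parameter values, covering only the ideal (distortion-free) projection; the full model adds the distortion and extrinsic parameters described later in this chapter:

```python
def project(point_cam, f, sx, u0, v0):
    """Project a 3-D point (x, y, z) given in camera coordinates onto the
    image plane of an ideal pinhole camera with effective focal length f,
    horizontal scale factor sx, and principal point (u0, v0)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    u = f * sx * x / z + u0    # horizontal image coordinate
    v = f * y / z + v0         # vertical image coordinate
    return u, v
```

Calibration is the inverse problem: given many observed (u, v) projections of points with known 3-D positions, recover f, sx, (u0, v0), the distortion coefficients, and the camera pose.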
The requirements for camera calibration depend on the application. In applications such as robot guidance, the calibration procedure should be fast and automatic, but in metrology applications, precision is the more important factor. Camera calibration involves relating the optical features of a lens to the sensing device. During the last decade several techniques have been developed that are accurate, fast, and efficient. In order to understand these techniques, the definition of camera calibration will first be presented, followed by a description of the technique being used here.
Definition of Camera Calibration
The major issue of camera calibration in the context of three-dimensional machine vision is to determine the internal geometric and optical characteristics of the camera (intrinsic parameters) and the 3D position and orientation of the camera frame relative to a certain world coordinate system (extrinsic parameters). A brief description of these parameters is given below.
Intrinsic Parameters (Interior Projective Parameters)
These parameters give the relationship between camera-centric coordinates and image coordinates.
Effective focal length (f): the distance from the rear nodal plane of a lens to the focal point when the lens is focused at the infinity position. It is measured in millimeters.
Scale factor (Sx): an uncertainty parameter introduced by a variety of factors, such as a slight hardware timing mismatch between the image acquisition hardware and the camera scanning hardware. Even a one percent difference can cause three to five pixels of error for a full-resolution frame.
Principal point (u0, v0): the computer image coordinates of the origin in the image plane. It is the foot of the perpendicular from the center of projection to the image plane.
Distortion coefficients:
Radial distortion coefficients (k_1, k_2). Radial distortion is a product of cheap lenses, where the magnification is different at the edges than at the center of the lens. Because of lens shapes, radial distortion is also a function of the lens field of view: it is usually not an issue for telephoto lenses but is quite prevalent in wide-angle lenses. Although a function of the radius, the effects of the distortion are noticeable through most of the image, so discarding the affected areas would leave little of the image for processing. Due to this distortion, world points are not projected onto a plane, but rather onto a surface which can be considered to be spherical. This has the effect that straight lines are mapped to parabolas in the image. For example, the grid in Figure 4-1 shows how an undistorted image (A) is converted to a distorted image (B) by radial distortion. To compensate for this distortion, coefficients called radial distortion coefficients are used. If (u, v) are the ideal image coordinates of a point and (u', v') are the distorted image coordinates, then the displacement (d_u, d_v) due to radial distortion can be modeled using the equations
d_u = u' (k_1 r^2 + k_2 r^4 + k_3 r^6 + ...)
d_v = v' (k_1 r^2 + k_2 r^4 + k_3 r^6 + ...)    (4-1)
where r^2 = x^2 + y^2 and k_1, k_2, k_3, ... are the radial distortion coefficients. The displacement increases with distance from the center of the image. Since the higher-order terms do not contribute significantly to the radial lens distortion, only the first two radial distortion coefficients, k_1 and k_2, are considered.
Tangential distortion coefficients (p_1, p_2). The centers of curvature of the lens surfaces are not always strictly collinear. This introduces another common distortion type called
tangential distortion. To compensate for this distortion, tangential distortion coefficients are used. If (u, v) are the ideal image coordinates of a point and (u', v') are the distorted image coordinates, then the displacement (d_u, d_v) due to tangential distortion, which is at right angles to the vector from the center of the image, can be modeled using the equations
d_u = -v' (p_1 r^2 + p_2 r^4 + p_3 r^6 + ...)
d_v = u' (p_1 r^2 + p_2 r^4 + p_3 r^6 + ...)    (4-2)
where r^2 = x^2 + y^2 and p_1, p_2, p_3, ... are the tangential distortion coefficients. As with radial distortion, the displacement due to tangential distortion increases with distance from the center of the image, and its higher-order terms are insignificant, so only the first two tangential distortion coefficients (p_1, p_2) are considered.
In both cases the distortions occur only at even powers of the distance r from the principal point, and typically only the first, or perhaps the first and second, terms in the power series are retained. In calibration one attempts to recover the coefficients of these power series. Other distortion coefficients, such as the linear distortion coefficient, can be useful if the image axes are not orthogonal, but in most cases CCD arrays are almost perfect rectangles, and hence the linear distortion component is insignificant.
Extrinsic Parameters (Exterior Projective Parameters)
These are the parameters of greater importance to this application. They give the relationship between an object-centered coordinate system and a camera-centered coordinate system. The transformation from object to camera consists of a rotation and a translation and has six degrees of freedom: three for rotation and three for translation. The rotation consists of three Euler angles, yaw θ, pitch φ, and roll ψ, i.e., the rotations about the x, y, and z axes. The three rotation angles form
a 3 x 3 rotation matrix R. A translation vector T, a 3 x 1 matrix, consists of the x, y and z distances from the object coordinate system origin to the camera coordinate system origin. The orientation and position relationship between the two coordinate systems can be written as the 4 x 4 matrix

[ R        T ]
[ 0  0  0  1 ]    (4-3)

where

R = [ R11 R12 R13 ]        T = [ Tx ]
    [ R21 R22 R23 ]            [ Ty ]
    [ R31 R32 R33 ]            [ Tz ]

Camera Calibration Techniques

Camera calibration techniques should meet the criteria described below (Roger Tsai 1987).

Accurate. The techniques developed should have the potential of meeting the accuracy requirements of applications such as inspection and the assembly of mechanical parts. This in turn means that factors such as lens distortion, and perspective rather than parallel projection, should be introduced into the theoretical modeling of the imaging process.

Versatile. The technique should operate uniformly and autonomously for a wide range of optical setups and accuracy requirements.

Autonomous. The calibration procedure should not require manual interference such as initial guesses for some parameters.

Efficient. The calibration technique should allow enough potential for high-speed implementation.

Usage with off-the-shelf cameras. The technique should be compatible with off-the-shelf cameras because they are easily available.

Calibration techniques can be classified roughly into two categories: self calibration and photogrammetric calibration. Self calibration does not require any calibration object. Just by moving a camera in a static
scene, the rigidity of the scene provides two constraints on the camera's internal parameters. Therefore, if images are taken by the same camera with fixed internal parameters, these constraints allow 3-D structure to be reconstructed up to a similarity transform. While this approach is very flexible, it is not yet mature (Bougnoux 1998); because there are many parameters to estimate, reliable results cannot always be obtained. Since the results of self calibration are not accurate enough, emphasis here is focused on the photogrammetric calibration technique.

Photogrammetric Calibration

Calibration is performed by observing a calibration object whose geometry in 3-D space is known with very good precision. This calibration can be done very efficiently (Roger Tsai 1987). The calibration object usually consists of two or three planes orthogonal to each other, and these approaches require an elaborate setup. Many experimental setups have been proposed: a single view of a 3D object, or multiple views of a planar calibration tile. These objects are marked with grids, squares, discs, or circular dots of different colors. The achieved accuracy depends upon:

The accuracy of the 3D calibration object, or of the xyz translation stage used for moving a planar calibration tile.

The accuracy of the attributes of the 2D features extracted from the image.

The numerical method used for estimating the camera parameters.

The various techniques developed for photogrammetric calibration will now be discussed, followed by their advantages and disadvantages.

Linear calibration

Linear camera calibration techniques involve the determination of a perspective transformation matrix by solving linear equations. The algorithms are fast and there is no
need for nonlinear optimization. Given the 3D world coordinates of a number of points and the corresponding 2D image coordinates, the coefficients of the perspective transformation matrix can be found as the least squares solution of an over-determined system of linear equations. Given the perspective transformation matrix, the camera model parameters can then be computed if needed. Although the transformation from 3D world coordinates to 2D image coordinates is nonlinear, it becomes linear if lens distortion is ignored. Ignoring lens distortion is unacceptable when doing 3D measurement, however, and the accuracy of the final solution is very noise sensitive. In this technique the number of unknowns is usually larger than the actual number of degrees of freedom, so the actual constraints on the transformation parameters are not satisfied. Since no iterations are required, direct methods are computationally fast; the actual constraints are left out precisely to achieve a non-iterative algorithm. Consequently, in the presence of noise the intermediate solution does not satisfy the constraints, and the accuracy of the final solution is relatively poor. The techniques that fall into the category of linear calibration are as follows.

The Direct Linear Transformation (DLT) developed by Abdel-Aziz and Karara (Abdel and Karara 1971) was the most widely used because of its simplicity and quickness. The technique involves solving only linear equations. The simplifications made reduce the precision of the parameter estimates, so these methods are not suitable for applications like metrology.

Yakimovsky and Cunningham's stereo program was developed for the JPL Robotics Research Vehicle (Yakimovsky and Cunningham 1978). They used a highly linear lens and ignored distortion.
The field of view in their system was narrow; with a larger object distance, ignoring distortion would cause more error.
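The distortion series of Eqs. 4-1 and 4-2 — exactly the nonlinear terms that a linear method must drop — can be sketched as below. This is a minimal Python illustration, not the thesis code; the coefficient values are made up for demonstration.

```python
def radial_displacement(u, v, k1, k2):
    """Displacement (du, dv) from the truncated radial series of Eq. 4-1."""
    r2 = u * u + v * v
    s = k1 * r2 + k2 * r2 * r2      # k1*r^2 + k2*r^4
    return u * s, v * s

def tangential_displacement(u, v, p1, p2):
    """Displacement (du, dv) from the truncated tangential series of Eq. 4-2."""
    r2 = u * u + v * v
    s = p1 * r2 + p2 * r2 * r2
    return v * s, u * s             # at right angles: u and v swap roles

# Example: displacement grows rapidly with distance r from the image center.
k1, k2 = 0.1, 0.01                  # illustrative values only
near = radial_displacement(1.0, 0.0, k1, k2)   # r = 1
far  = radial_displacement(2.0, 0.0, k1, k2)   # r = 2
```

Doubling the radius more than quadruples the displacement, which is why the distortion cannot simply be ignored outside a small central region of the image.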
Nonlinear calibration

Because of the increasing processing power of standard workstations, the nonlinear nature of the estimation problem is not as restricting as it was years ago. Nonlinear calibration techniques use a large number of unknowns and large-scale nonlinear optimization. Although this approach compensates for lens distortion and allows easy adaptation of any complex imaging model, the iterative nature of the algorithms requires a good initial guess, which violates our criterion of autonomy. The technique is computationally more expensive, and the optimization may be unstable if the iterative procedure is not appropriately designed, especially in highly distorted environments; without a good initial guess, the procedure may end up with a bad solution. The methods developed in this category are as follows.

Faig's technique (Faig 1975) uses 17 unknowns for each image and is very computer intensive. The accuracy is excellent even though a large number of unknowns are involved. As described above, the technique requires a good initial guess.

Sobel, Lowe and Gennery (Sobel 1974) described a nonlinear calibration system in which 18 parameters must be optimized. Accuracy results were not reported. This technique was built along similar lines to Faig's.

Two step calibration technique

This technique involves a direct solution for most of the calibration parameters and an iterative solution for the remaining ones. Roger Y. Tsai (Roger Tsai 1987) originated the technique, and most calibration techniques developed afterwards were based on it. He used a radial alignment constraint to derive a linear solution for the external parameters and the focal length of the camera. An
iterative scheme is then used to estimate three parameters: the radial distortion coefficient, the effective focal length and the depth component of the translation vector. Quaternion algebra is used to extract the optimum transformation matrix. This technique takes lens distortion into consideration. So in these two-step methods the initial parameter values are computed linearly and the final values are obtained by nonlinear minimization.

Three Step Calibration Technique

The calibration technique employed in this study is based entirely on the three-step calibration technique developed by Janne Heikkila (Heikkila 2000), an extension of Roger Tsai's two-step technique described earlier. The camera model for projection and the calibration procedure used in this technique will now be discussed.

Camera Model

The camera model is based on the pinhole model with lens distortion. The pinhole camera model is based on the principle of collinearity: each point in object space is projected along a straight line through the projection center onto the image plane. Depending on the accuracy requirements, the model is typically based on either orthographic or perspective projection. Perspective projection gives an idealized mathematical framework which is actually quite accurate for high-quality camera systems. For off-the-shelf systems, the perspective projection model is often augmented with a lens distortion model. Figure 4-2 shows a pure perspective projection (pinhole) camera model with lens distortion. Let P be an arbitrary 3D object point and [X, Y, Z]^T be its coordinates, measured in millimeters, with respect to a fixed world coordinate system W. The center
of projection is at the origin O of the camera frame C. The origin of the camera coordinate system is at the projection center, at location (X0, Y0, Z0) with respect to the object coordinate system, and the z-axis of the camera frame is perpendicular to the image plane. The rotation is represented using three Euler angles ω, φ and κ that define a sequence of three elementary rotations around the x, y and z axes respectively. The rotations are performed clockwise: first around the x-axis, then around the y-axis that has already been rotated once, and finally around the z-axis that has been rotated twice during the previous stages. The image plane is parallel to the xy plane of the camera frame and is displaced a distance f (the focal length) from O along the z-axis. The z-axis is called the principal axis, and the intersection of the image plane with the principal axis is called the principal point o, whose coordinates are [u0, v0]^T. The u and v axes of the 2D image coordinate frame are thus parallel to the x and y axes, respectively. The center of the image plane is at the origin, and the center of the lens, measured in the camera frame, is at coordinates (0, 0, f), where f is the focal length of the lens. The coordinates of P in the camera frame C are [x, y, z]^T, the undistorted coordinates of P in the image plane are [u, v]^T, and the distorted coordinates are [u_d, v_d]^T. Let [u_p, v_p]^T denote the 2D coordinates (in pixels) with respect to the computer image coordinate system. Figure 4-3 shows the steps involved in the transformation from 3D world coordinates to computer image coordinates; a detailed description of each step in the overall transformation from (X, Y, Z) to (u_p, v_p) will now be presented. The rigid body transformation from the object world coordinate system (X, Y, Z) to the camera 3D coordinate system (x, y, z) is given by
[x, y, z]^T = R [X, Y, Z]^T + T    (4-4)

where R is the 3 x 3 rotation matrix and T the translation vector:

R = [ R11 R12 R13 ]        T = [ Tx ]
    [ R21 R22 R23 ]            [ Ty ]
    [ R31 R32 R33 ]            [ Tz ]

As noted above, the rotation is represented using the Euler angles ω, φ and κ, performed clockwise around the x, y and z axes in that order. The entries of the rotation matrix are

R11 = cos φ cos κ
R12 = sin ω sin φ cos κ - cos ω sin κ
R13 = cos ω sin φ cos κ + sin ω sin κ
R21 = -cos φ sin κ
R22 = sin ω sin φ sin κ + cos ω cos κ
R23 = cos ω sin φ sin κ - sin ω cos κ
R31 = -sin φ
R32 = -sin ω cos φ
R33 = cos ω cos φ

Once the values in the rotation matrix are known, the Euler angles can be calculated as

φ = sin^-1(R31), ω = atan2(R32/cos φ, R33/cos φ) and κ = atan2(-R21/cos φ, R11/cos φ)    (4-5)

where atan2(y, x) is the two-argument inverse tangent function giving an angle in the range [-π, π]. As usual in the computer vision literature, the origin of the image coordinate system is in the upper left corner of the image array. The transformation from 3D camera coordinates (x, y, z) to ideal (undistorted) image coordinates (u, v) is achieved by perspective projection with pinhole camera geometry, and this is given in Eq. 4-6.
u = f (x/z) and v = f (y/z)    (4-6)

where f is the effective focal length, one of the parameters to be calibrated. Several methods for correcting lens distortion have been developed; the most commonly used approach is to decompose the distortion into radial and tangential components. The third step, converting the undistorted image coordinates into distorted image coordinates, therefore applies the radial and tangential distortion coefficients discussed previously. The transformation is

u_d = u - D_x and v_d = v - D_y    (4-7)

where

D_x = u (k1 r^2 + k2 r^4 + k3 r^6 + ...) + 2 p1 u v + p2 (r^2 + 2 u^2)
D_y = v (k1 r^2 + k2 r^4 + k3 r^6 + ...) + p1 (r^2 + 2 v^2) + 2 p2 u v

and r^2 = u^2 + v^2. So in the third step the parameters to be calibrated are the distortion coefficients. The transformation from (u_d, v_d) in millimeters to (u_p, v_p) in pixels involves scaling from millimeters to pixels:

u_p = S_x u_d / d_x' + u_0 and v_p = v_d / d_y + v_0    (4-8)

where (u_0, v_0) is the computer image coordinate origin in the image plane (the principal point), d_x' = d_x N_cx / N_fx, d_x is the distance between adjacent sensor elements in the x direction, d_y is the distance between adjacent sensor elements in the y direction, N_cx is the number of sensor elements in the x direction, N_fx is the number of pixels in a line as sampled by the computer, and S_x is an uncertainty scale factor. So in the final step of the entire transformation, the only parameter to be calibrated is the scale factor. The objective of the explicit camera calibration procedure is thus to determine optimal values for these parameters based on image observations of a known 3D target. In self calibration the points are also included in the set of unknown parameters. However, the
calibration procedure used here is performed with a known target. Once all these parameters are known, even the reverse analysis, i.e., calculating the computer image coordinates (u_p, v_p) of a point from its 3D world coordinates, can easily be done.

Calibration Procedure

The calibration procedure described in this section was designed for circular landmarks and for small points, whose radius in this case is set to zero. The camera model considered for this calibration procedure must include the eight intrinsic parameters and six extrinsic parameters described above. If the calibration procedure uses multiple frames, the intrinsic parameters are common to all frames but the extrinsic parameters vary from frame to frame. The factors known in advance are the number of pixels in the horizontal direction, the number of pixels in the vertical direction, the effective CCD chip size in the horizontal direction (mm), the effective CCD chip size in the vertical direction (mm) and the radius of the control points (mm). Along with these, the 3D coordinates as well as the observed 2D image coordinates of the control points are known in advance. Because of the nonlinear mapping of the camera model there are no direct estimators of all the camera parameters, so an iterative search is used to estimate the parameter vector containing all the unknowns. Initial values of the camera parameters must be known in advance in order to reach the global minimum; these initial values are obtained by linear parameter estimation, ignoring the lens distortion parameters.
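The forward model of Eqs. 4-4 through 4-8 — rigid transform, pinhole projection, distortion, and pixel scaling — can be sketched as below. This is an illustrative Python sketch, not the thesis code: the rotation is composed from standard counterclockwise elementary rotations (the text's clockwise convention may flip some signs), and the camera constants (f = 25 mm, 640 x 480 pixels on an 8.8 x 6.6 mm chip) are the nominal values quoted in the numerical example later in the chapter.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def world_to_pixels(P, angles, T, f, dist, dx, dy, u0, v0, sx=1.0):
    """Eqs. 4-4 to 4-8: world point P (mm) -> pixel coordinates (up, vp)."""
    w, p, k = angles                        # rotations about x, y, z
    R = matmul(rot_z(k), matmul(rot_y(p), rot_x(w)))
    x, y, z = (sum(R[i][j] * P[j] for j in range(3)) + T[i] for i in range(3))
    u, v = f * x / z, f * y / z             # Eq. 4-6: pinhole projection
    k1, k2, p1, p2 = dist                   # Eq. 4-7: radial + tangential
    r2 = u * u + v * v
    Dx = u * (k1 * r2 + k2 * r2 * r2) + 2 * p1 * u * v + p2 * (r2 + 2 * u * u)
    Dy = v * (k1 * r2 + k2 * r2 * r2) + p1 * (r2 + 2 * v * v) + 2 * p2 * u * v
    ud, vd = u - Dx, v - Dy
    return sx * ud / dx + u0, vd / dy + v0  # Eq. 4-8: mm -> pixels

# Nominal camera: 640 x 480 pixels on an 8.8 x 6.6 mm CCD, f = 25 mm.
dx, dy = 8.8 / 640, 6.6 / 480               # 0.01375 mm per pixel
up, vp = world_to_pixels([0.0, 0.0, 0.0], (0.0, 0.0, 0.0), [0.0, 0.0, 1000.0],
                         25.0, (0, 0, 0, 0), dx, dy, 320.0, 240.0)
```

A point on the optical axis lands exactly on the principal point (320, 240); off-axis points are shifted by f·x/z scaled into pixels, then perturbed by the distortion terms.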
Linear parameter estimation

Linear parameter estimation is achieved using the Direct Linear Transformation technique (Abdel and Karara 1971), which ignores the nonlinear radial and tangential distortion components. In this initialization step the aim is to find the first, not the final, estimates of the camera parameters, so it is reasonable to use the nominal focal length, aspect ratio and image center as the intrinsic parameters. The first step in the linear transformation is the transformation from object coordinates to image coordinates, which can be represented as

[u_i w_i]
[v_i w_i] = A [X_i, Y_i, Z_i, 1]^T    (4-9)
[  w_i  ]

where

A = [ a11 a12 a13 a14 ]
    [ a21 a22 a23 a24 ]
    [ a31 a32 a33 a34 ]

is called the DLT matrix, [X_i, Y_i, Z_i] are the 3D world coordinates, [u_i w_i, v_i w_i, w_i] are the homogeneous image coordinates, and the index i runs over the control points. The parameters of A can be solved for by eliminating w_i. Expanding the matrix product of Eq. 4-9 yields

u_i w_i = a11 X_i + a12 Y_i + a13 Z_i + a14
v_i w_i = a21 X_i + a22 Y_i + a23 Z_i + a24
w_i = a31 X_i + a32 Y_i + a33 Z_i + a34    (4-10)

Substituting w_i into the first two lines of Eq. 4-10 yields two equations in 12 unknowns:

a11 X_i + a12 Y_i + a13 Z_i + a14 - a31 u_i X_i - a32 u_i Y_i - a33 u_i Z_i - a34 u_i = 0
a21 X_i + a22 Y_i + a23 Z_i + a24 - a31 v_i X_i - a32 v_i Y_i - a33 v_i Z_i - a34 v_i = 0    (4-11)
The 12 unknowns can be solved for using 6 or more points whose world coordinates (X_i, Y_i, Z_i) and corresponding image coordinates (u_i, v_i) are known. Let a be the parameter vector

a = [a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34]^T

Stacking the two equations of Eq. 4-11 for each of the N control points gives the homogeneous matrix equation (Melen 1994)

L a = 0    (4-12)

where each control point i contributes the two rows

[ X_i  Y_i  Z_i  1  0    0    0    0  -u_i X_i  -u_i Y_i  -u_i Z_i  -u_i ]
[ 0    0    0    0  X_i  Y_i  Z_i  1  -v_i X_i  -v_i Y_i  -v_i Z_i  -v_i ]    (4-13)

to the 2N x 12 matrix L. Once a normalizing constraint is imposed (see below), the same system can equivalently be written in the inhomogeneous form Ba = C    (4-14)
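The construction of L can be checked numerically: starting from a known (synthetic) DLT matrix A, project a few points with Eq. 4-9, build the two rows of Eq. 4-11 for each, and verify that L a = 0. This is an illustrative Python sketch with made-up numbers, not the thesis code.

```python
# A synthetic 3 x 4 DLT matrix (arbitrary values for demonstration).
A = [[2.0, 0.1, 0.3, 5.0],
     [0.2, 1.8, 0.4, 6.0],
     [0.01, 0.02, 1.0, 10.0]]
a = [e for row in A for e in row]          # the 12-vector of the DLT parameters

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
          (0.0, 0.0, 1.0), (1.0, 1.0, 0.0), (1.0, 2.0, 3.0)]

L = []
for X, Y, Z in points:
    # Eq. 4-9: homogeneous projection [u*w, v*w, w]^T = A [X, Y, Z, 1]^T
    uw, vw, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in A)
    u, v = uw / w, vw / w
    # Eq. 4-11: two rows per control point
    L.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
    L.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])

# Every row of L is orthogonal to the true parameter vector a.
residual = max(abs(sum(Li * ai for Li, ai in zip(row, a))) for row in L)
```

In practice a is unknown and the system is solved in the least squares sense from six or more measured correspondences, as described next.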
To avoid the trivial solution for the parameters, Abdel-Aziz and Karara used the constraint a34 = 1. The parameters a11 ... a33 can then be estimated by a least squares technique, and thereby the transformation matrix can be determined; Eq. 4-12 can also be solved using the pseudo-inverse technique. The parameters a11 ... a34 have no direct physical meaning but are composed of the intrinsic and extrinsic model parameters. The 3 x 4 matrix A is simply an estimate, denoted F', of the perspective transformation matrix F. Let R' and T' be the corresponding estimates of the rotation matrix R and the translation vector T. The perspective transformation matrix F is represented as

F = PM, where P = [ sf  0  u0  0 ]  and  M = [ R        T ]    (4-15)
                  [ 0   f  v0  0 ]           [ 0  0  0  1 ]
                  [ 0   0  1   0 ]

The values in the matrix P are assumed to be known already (the published camera parameters are used), while the values in M are the parameters to be estimated. For the initial estimates we can write

F' = PM' = [ P13 R'  P13 T' ]    (4-16)

where P13 is the matrix containing the first three columns of P. So the values of R' and T' are easily obtained as

R' = P13^-1 F13' and T' = P13^-1 F4'    (4-17)

where F13' is the matrix containing the first three columns of F' and F4' its last column. The matrix R' is normalized and orthogonalized using the singular value decomposition, because the columns of R' represent the coordinates of the axis vectors of the camera coordinate system measured with respect to the object coordinate system and as such must be orthogonal. A brief description of the SVD is given below.
Singular Value Decomposition (SVD) is the method of choice for solving most linear least squares problems (William et al. 2002). SVD methods are based on the following theorem of linear algebra: any M x N matrix A whose number of rows M is greater than or equal to its number of columns N can be written as the product of an M x N column-orthogonal matrix U, an N x N diagonal matrix W with positive or zero elements (the singular values), and the transpose of an N x N orthogonal matrix V:

A = U W V^T    (4-18)

The matrices U and V are each orthogonal in the sense that their columns are orthonormal:

Σ_i U_ik U_in = δ_kn and Σ_j V_jk V_jn = δ_kn, for 1 ≤ k, n ≤ N    (4-19)

The decomposition can also be carried out when M < N; in that case the singular values W_j for j = M+1, ..., N are all zero, the corresponding columns of U are also zero, and Eq. 4-19 then holds only for k, n ≤ M. The decomposition can always be done, no matter how singular the matrix is, and it is "almost" unique. So in our case the matrix R' is decomposed as R' = U W V^T, and the orthogonal version of R', namely R_ortho', is given by

R_ortho' = U W' V^T    (4-20)

where W' is a diagonal matrix with diagonal elements 1, 1 and |UV^T|:

W' = [ 1  0  0      ]
     [ 0  1  0      ]
     [ 0  0  |UV^T| ]
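The orthogonalization step of Eq. 4-20 can be sketched without an SVD library: for a matrix with positive singular values, U V^T equals the polar factor R'(R'^T R')^(-1/2), which can be computed with a small Jacobi diagonalization of the symmetric matrix R'^T R'. This is an illustrative pure-Python sketch under that assumption, not the thesis code, and the perturbed matrix below is made up.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv_sqrt_sym3(S):
    """S^(-1/2) for a symmetric positive definite 3x3 S, via Jacobi rotations."""
    S = [row[:] for row in S]
    V = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(50):                      # sweeps until off-diagonals vanish
        for p, q in ((0, 1), (0, 2), (1, 2)):
            if abs(S[p][q]) < 1e-15:
                continue
            t = 0.5 * math.atan2(2 * S[p][q], S[q][q] - S[p][p])
            c, s = math.cos(t), math.sin(t)
            G = [[float(i == j) for j in range(3)] for i in range(3)]
            G[p][p] = G[q][q] = c
            G[p][q], G[q][p] = s, -s
            S = matmul(transpose(G), matmul(S, G))
            V = matmul(V, G)
    D = [[(1.0 / math.sqrt(S[i][i]) if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    return matmul(V, matmul(D, transpose(V)))  # V diag(1/sqrt(lam)) V^T

def orthogonalize(R):
    """Nearest orthogonal factor R (R^T R)^(-1/2), i.e. U V^T of the SVD."""
    return matmul(R, inv_sqrt_sym3(matmul(transpose(R), R)))

# A rotation estimate corrupted by noise (made-up perturbation of identity).
R_est = [[0.99, -0.02, 0.05],
         [0.03, 1.00, -0.01],
         [-0.04, 0.02, 0.98]]
R_ortho = orthogonalize(R_est)
```

The result satisfies R_ortho^T R_ortho = I to machine precision, which is exactly the property the calibration needs before extracting Euler angles from the matrix.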
Since the values of R_ortho' and T' are now known, the initial estimates of the extrinsic parameters can be calculated. Once these estimates are known, the final estimates of the parameters can be found using the iterative search algorithm described below.

Iterative search

In this step the best estimate of the camera parameters is obtained by minimizing the residual, i.e., the weighted sum of squared differences between the model (u_i, v_i) and the observations (U_i, V_i), where i = 1, ..., N. The residuals occur because noise components are incorporated in the measurement process; the noise is modeled as Gaussian. The objective function in the Gaussian case is

F = Σ_{i=1..N} [(U_i - u_i)^2 + (V_i - v_i)^2]    (4-21)

The parameters of the forward camera model are obtained by minimizing this function. Several numerical techniques could be used, but the Levenberg-Marquardt method is used here for the optimization. A brief explanation of the method follows.

The Levenberg-Marquardt method (William et al. 2002) is a single-shot method which attempts to find the local fit-statistic minimum nearest to the starting point. Its principal advantage is that it uses information about the first and second derivatives of the fit statistic as a function of the parameter values to guess the location of the minimum. The method works well if the statistic surface is well behaved, and it provides the fastest convergence. However, without proper initial parameter values the optimization may become stuck in a local minimum and thereby cause the calibration to fail. The
method takes only a few iterations to reach the minimum of Eq. 4-21. It is a maximum neighborhood method, developed to perform an optimum interpolation between the Taylor series method and the gradient method, because both frequently run aground: the Taylor series method can produce divergence of the successive iterates, and gradient methods converge slowly after the first few iterations. The Levenberg-Marquardt method thus has the ability to converge from an initial guess which may be outside the region of convergence of other methods, and to close in on the converged values rapidly once their vicinity has been reached. It is also a popular alternative to the Gauss-Newton method for finding the minimum of a function F(x) that is a sum of squares of nonlinear functions,

F(x) = Σ_{i=1..N} [f_i(x)]^2    (4-22)

Let the Jacobian of f_i(x) be denoted J_i(x). The Levenberg-Marquardt method then searches in the direction given by the solution p_k of the equations

(J_k^T J_k + λ_k I) p_k = -J_k^T f_k    (4-23)

where λ_k are nonnegative scalars and I is the identity matrix. Using the Levenberg-Marquardt method, an optimized solution for both the extrinsic and the intrinsic parameters is found. To give a clearer understanding of how the calibration procedure works, a numerical example follows.

Consider a particular instance in which the 3D world coordinates and the corresponding 2D image coordinates of 12 control points are known. These points are represented in
matrix form as [X_3D, Y_3D, Z_3D, U_2Dx, V_2Dy], where the 3D world coordinates are in mm and the 2D image coordinates are in pixels. The numerical values of the 12 points are stored in the matrix called Data (Eq. 4-24), and a graphical representation of these points is shown in Figure 4-5. The data points are from the calibration object, whose image, shown in Figure 4-4, was taken by the Sony CCD camera. Other input information needed for the calibration:

Number of pixels in the horizontal direction in the image = 640
Number of pixels in the vertical direction in the image = 480
Effective CCD chip size in the horizontal direction = 8.8 mm
Effective CCD chip size in the vertical direction = 6.6 mm
Nominal focal length of the camera = 25 mm
Radius of the circular control points = 101.6 mm
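From the chip size and pixel counts just listed, the metric pixel pitch used in Eq. 4-8 follows directly. A quick check (illustrative Python, not the thesis code; the principal point at the image center and S_x = 1 are assumptions for the sketch):

```python
# Nominal camera data from the example above.
n_px_h, n_px_v = 640, 480        # image size in pixels
chip_h, chip_v = 8.8, 6.6        # effective CCD size in mm

dx = chip_h / n_px_h             # mm per pixel, horizontal
dy = chip_v / n_px_v             # mm per pixel, vertical

# Eq. 4-8 with Sx = 1 and dx' = dx: convert image-plane mm to pixels.
u0, v0 = n_px_h / 2, n_px_v / 2  # assume the principal point at the image center
ud, vd = 1.1, -0.55              # a sample distorted image point, in mm
up = ud / dx + u0
vp = vd / dy + v0
```

Both directions give the same pitch here (0.01375 mm), i.e., square pixels; the sample point 1.1 mm right of the principal point lands at pixel column 400.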
The initial values of the camera extrinsic parameters are obtained by the Direct Linear Transformation technique described above. The matrix L is built from the control points as in Eq. 4-12, and the system is solved by the pseudo-inverse technique to obtain the parameter vector a (Eq. 4-25). The matrix A is simply the 3 x 4 arrangement of the vector a, and the transformation matrix estimate F' is this matrix A. The matrix P is formed from the nominal camera parameters as in Eq. 4-15, and the inverse of its first three columns, P13^-1, is computed (Eq. 4-26). The initial estimates R' and T' then follow from Eq. 4-17 (Eq. 4-27). The rotation matrix above is normalized and orthogonalized by SVD; the resulting U, W and V matrices are shown below.
By singular value decomposition the matrices U, W and V are obtained (Eq. 4-28), with |UV^T| = 1.0000, so that

W' = [ 1  0  0 ]
     [ 0  1  0 ]    (4-29)
     [ 0  0  1 ]

The orthogonal version of the matrix R', namely R_ortho', computed from Eq. 4-20, is given in Eq. 4-30. From the translation vector and the orthogonalized rotation matrix, the initial extrinsic parameters are calculated to be

Initial position = [-25.2, -620.3, 8790.1, 9.9, -43.7, -12.3]

The first three values in this vector are the x, y and z distances in mm obtained from the translation vector; the next three are the orientation angles in degrees about the x, y and z axes, obtained from Eq. 4-5. This initial position, combined with the initial intrinsic parameters, gives the entire initial parameter vector shown below. The first four values of that vector are the intrinsic parameters and the remaining six are the extrinsic parameters obtained from the initial position. Among the intrinsic parameters, the first value is the scale
factor, the second value is the focal length (mm), and the third and fourth values are the principal point image coordinates (pixels). The vector calculated above is used to obtain the final parameter estimates by the iterative Levenberg-Marquardt approach discussed above:

Initial parameters = [1.0, 25.0, 320.0, 240.0, -25.2, -620.3, 8790.1, 9.9, -43.7, -12.3]^T    (4-31)

The number of iterations for this method has been limited to 100. Since the parameter values are known, the pixel image coordinates of the control points are calculated using Eqs. 4-4 through 4-8. The Jacobian matrix is calculated for the image coordinates so obtained, and the error matrix is formed as the difference between these coordinates and the original image coordinates given as input. The residual is calculated as

Residue = Error^T x Error = 7.3984 x 10^-4

The goal is to reduce this residual. The value is reduced by updating the initial parameter vector with the increment D_p obtained from (Heikkila 2000)

D_p = (J^T x Error) / (J^T x J), i.e., the solution of (J^T J) D_p = J^T x Error.
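The update loop sketched above — compute residuals, form the Jacobian, take a damped step, and re-check the residual — is the Levenberg-Marquardt iteration of Eq. 4-23. A minimal one-parameter illustration in Python (a made-up model y = exp(a·x), not the thesis code):

```python
import math

# Synthetic data generated from a "true" parameter a = 0.7.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [math.exp(0.7 * x) for x in xs]

def residuals(a):
    return [math.exp(a * x) - y for x, y in zip(xs, ys)]

def cost(a):
    return sum(r * r for r in residuals(a))

a, lam = 0.2, 1e-3                          # poor initial guess, light damping
for _ in range(100):
    r = residuals(a)
    J = [x * math.exp(a * x) for x in xs]   # dr_i/da
    g = sum(Ji * ri for Ji, ri in zip(J, r))    # J^T r
    h = sum(Ji * Ji for Ji in J)                # J^T J (a scalar here)
    step = -g / (h + lam)                       # Eq. 4-23 in one dimension
    if cost(a + step) < cost(a):
        a, lam = a + step, lam * 0.1        # accept: relax the damping
    else:
        lam *= 10.0                         # reject: damp more heavily
```

The loop recovers a ≈ 0.7. With heavy damping (large λ) the step shrinks toward a gradient step; with light damping it approaches the Gauss-Newton step, which is what gives the method its fast convergence near the solution.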
The new position is obtained by adding this increment to the old position, as shown in Eq. 4-33. The predicted image coordinates and the error matrix for the first iteration are given in Eq. 4-32, and the successive parameter estimates, ordered as before (scale factor, focal length, principal point, then the extrinsic distances and orientation angles), are given in Eq. 4-33. The process of iterating to find the final parameters is continued until the residual falls below the specified range. If the iteration does not converge to the specified residual range before exceeding the allowed number of iterations, the calibration is said to have failed. In this
particular example, the algorithm converged in 38 iterations, and the final parameters are given above in the final parameter estimate. The final parameter estimate shows the final parameters of the camera: the first four rows of the matrix contain the intrinsic parameters, and the next six rows contain the extrinsic parameters, the first three being the distances and the last three the orientation angles. The original distance between the camera and the calibration object was 28 feet, and the distance obtained from the algorithm is 28.7419 feet. The original total orientation angle measured was 45 degrees, and the value calculated by the algorithm is 45.3 degrees. The original focal length of the lens was 25 mm; the value obtained by the algorithm is 28.12 mm. The entire camera calibration algorithm is shown in Figure 4-6 in the form of a flow chart. The calibration algorithm requires the 3D world coordinates as well as the 2D image coordinates of the control points, as mentioned earlier. The 3D world coordinates are measured manually, but the 2D image coordinates are obtained with the feature extraction technique explained in detail in Chapter 5.

Software Description

The entire camera calibration algorithm developed by Heikkila (Heikkila 2000) was first written in MATLAB. The code was then converted to C so that the algorithm could be used by other image processing algorithms. The hierarchy of the functions used in the algorithm is shown in Figure 4-7 in the form of a chart, so that one can understand how the code written for the algorithm is structured. A brief description of what each function does, with its pseudocode, is given in Appendix A.
Figure 4-1. Example of distortion. A) Undistorted grid. B) Distorted grid.

Figure 4-2. Pinhole camera model.
PAGE 68
56 Figure 4-3 Steps involved in transformation from 3D to 2D image coordinates Figure 4-4. Calibration object
PAGE 69
57 Figure 4-5. View of 3D and 2D data points. Figure 4-6. Flow chart of calibration procedure
PAGE 70
58 Figure 4-7. Functions hierarchy details
CHAPTER 5
FEATURE EXTRACTION

This chapter explains the details involved in feature extraction and the approach taken to achieve it. Feature extraction is needed to obtain some of the input parameters for the camera calibration algorithm, namely the 3D world coordinates of the control points and the corresponding 2D image coordinates in pixels. Since the 3D coordinates of the control points do not change, they are measured manually once; the 2D image coordinates, however, change with the position of the camera, and feature extraction is used to obtain them.

Image features usually correspond to interesting elements of a scene and embody information about the geometric shapes of the objects present. They stand out from other parts of an object because of geometric properties or intensity values, and can therefore be extracted relatively reliably. Image features are of two types: a global image property (e.g., average gray value, image variance, area in pixels), or a region of the image with a particular measurable property (e.g., a line, a circle, a set of pixels with similar texture). So an image feature is a local, meaningful, detectable region of the image. Efficient techniques have been developed to extract image features for tasks such as object detection, pose estimation, tracking, and classification. The image features in our case are the centroids of the circular dots on the two faces of the cube. These circular
dots are a different color on each face of the cube, so the image features also include the colors of the dots. The traditional goal of a feature extractor is to characterize an object to be recognized by measurements whose values are very similar for objects in the same category and very different for objects in different categories. This leads to the idea of seeking distinguishing features that are invariant to irrelevant transformations of the input. The circular dots are therefore made in colors that are distinguishable from the rest of the surroundings. Feature extraction is concerned with the detection and localization of particular image patterns and does not usually require world knowledge; it can therefore be viewed as image processing rather than high-level vision. We first discuss feature detection.

Feature Detection

Feature detection is part of the feature extraction process and usually represents the first step of a computer vision system. The detected features may be used to concisely represent the geometric shape of an object or, more likely, for other high-level computer vision tasks such as pose estimation, camera calibration, motion estimation, 3D reconstruction, and object recognition. Detection means searching the image to locate the desired features and to specify some essential properties of the located features. For example, a line segment may be located by its centroid; its properties may include its length and orientation. The calibration object being used consists of two planes of a cube, each with 9 circular control points, so the image should have a clear view of the calibration object. The basic function of feature detection is to detect the circular dots in
the image. After detecting the circular dots, calculating their centers is the next goal of the detection process. The center, i.e., the control point, is simply the x and y distance of the dot's center from the top left corner of the image. There are 9 control points on each plane of the cube, so the output of feature detection is an 18 x 2 array containing the centers of the 18 circular dots on the two planes of the cube, measured in image coordinates. We now discuss the basic steps involved in this feature detection technique.

Input Description

The input to the feature detection algorithm is a PPM image of the calibration object. A PPM (portable pixmap) file is the lowest common denominator color image file format. A PPM file consists of two parts, a header and the image data. The header consists of at least three parts, normally delimited by carriage returns and/or linefeeds. The first "line" is a magic PPM identifier, either "P3" or "P6". The next line gives the width and height of the image as ASCII numbers. The last part of the header gives the maximum value of the color components of the pixels, which allows the format to describe more than single-byte (0-255) color values. An example of the PPM file format is shown in Figure 5-1.

A PPM file defines an image as a character array, say P[X_DIM][Y_DIM][3], where X_DIM and Y_DIM are the x and y dimensions of the image. Each pixel is stored in RGB (Red, Green, Blue) order and is 3 bytes long. The first objective is to read the PPM file and store the RGB values of each pixel in a three-dimensional array so that it can be used for further processing. A pixel at location (x, y) of an image can be accessed as shown below:
P [x] [y] [0] --> red byte (char)
P [x] [y] [1] --> green byte (char)
P [x] [y] [2] --> blue byte (char)
The pseudocode for the function written to read the PPM image file is given in the Appendix. The function is named "Read_ppm_image".

Smoothen Input Image

The next objective in the detection process is to smooth the image after it has been read. The main objective of smoothing an image is to reduce noise and to prepare the image for further processing such as segmentation. Image smoothing is the set of local pre-processing methods whose predominant use is the suppression of noise, exploiting redundancy in the image data. The methods are based on the general idea that the average is computed only from those points in the neighborhood that have properties similar to the processed point. Image smoothing blends adjacent colors in low-resolution images for smoother color transitions.

For each pixel in an image we examine a window centered at that pixel. Pixels within the window are divided into three groups based on their intensities relative to the center pixel: one group has higher intensities, one group has lower, and the third group has intensities close to that of the center pixel. A new intensity value for the center pixel is then determined by averaging the pixels in one or more groups, based on the populations of the three groups. Therefore, the number of pixels used in averaging and the configuration of those pixels may vary from point to point. This high adaptability enables the approach to preserve the sharpness of edges, corners, lines and other image features during smoothing.

A smoothing filter is initially defined as a low-pass filter whose size can be changed. The size of the filter refers to the region of the image on which the
smoothing is done. After the filter has been defined, the entire image is traversed and each pixel is assigned its new (average) value. The input to the filtering function is an unfiltered image and the output is a filtered image, ready for further processing. To give a visual sense of how a smoothing filter works, Figure 5-2 (A) shows an image of a cube with black dots on each face; after applying the smoothing filter, the smoothed image looks like Figure 5-2 (B). Another example of the output of the smoothing algorithm is shown in Figure 5-3. The pseudocode for the smoothing algorithm, which explains the basic data structures involved and the working of the algorithm, is given in the Appendix. The function is named "Filter". It is on this filtered image that the threshold algorithm is applied, converting it to a binary image containing the information necessary for calculating the centers of the control points.

Thresholding an Image

The next step in the detection process is thresholding. The main objective of thresholding is to convert an image into a binary image, i.e., an image containing only black and white pixels, so that it can easily be used for further processing. The basic idea of thresholding is to reject all pixels that are not within a threshold range: if a pixel value is within the specified range, that pixel is converted to a white pixel; if not, it is converted to a black pixel. The resulting binary image can then be used by other image processing techniques.
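The per-pixel rule just described can be sketched as follows. A hypothetical per-channel range stands in for the threshold; the image and bounds below are illustrative, not data from the experiments.

```python
def threshold_image(pixels, lo, hi):
    """Convert an RGB image (rows of (r, g, b) tuples) to a binary image:
    a pixel becomes white (1) only if every channel lies inside its band
    [lo, hi]; all other pixels are rejected, i.e., made black (0)."""
    return [[1 if all(l <= c <= h for c, l, h in zip(px, lo, hi)) else 0
             for px in row]
            for row in pixels]

# Illustrative 2x2 image: two orange-ish pixels, one dark, one blue-ish.
img = [[(240, 110, 30), (12, 12, 12)],
       [(250, 130, 10), (90, 200, 250)]]
binary = threshold_image(img, lo=(200, 80, 0), hi=(255, 160, 60))
# binary == [[1, 0], [1, 0]]
```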
The efficiency of this thresholding technique depends entirely on the threshold value(s) specified. Initially the threshold was chosen to be the average of the red, green and blue values of each pixel: if a pixel value did not match the threshold it was rejected, i.e., made black, and the remaining pixels were made white. Although this technique looks simple, it does not specify a range of values to choose from, so it proved inefficient. Two examples showing the deficiency of this technique are given in Figures 5-4 and 5-5. The goal was to convert the orange dots and the pink dots in their respective original images into white pixels and make the rest of each image black, leaving only white dots for further processing. But since only a single threshold value is specified, the image is not correctly classified into dots, as seen in Figures 5-4 (B) and 5-5 (B).

To provide the thresholding technique with a range of values to choose from, the concept of Gaussian modeling was introduced. This technique requires training data on which a Gaussian model can be built. The training data in our case are the red, green and blue pixel values of the circular dots on the two faces of the cube. This training data was obtained using the segmentation algorithm written by Dr. Michael C. Nechyba, an Assistant Professor in the Electrical and Computer Engineering Department, University of Florida. The segmentation algorithm allows the user to load an image and segment it into interesting regions. The program first splits the image into a user-specified number of uniformly distributed regions. Once this process has converged, the user can select the
specific regions from the segmented image and dump them into an output file. Each line in the output file has the form "R G B x y", i.e., the color and location of a pixel in the regions the user specified. An example illustrates how the segmentation algorithm works. The input, shown in Figure 5-6 (A), contains the two faces of the cube with different colors of dots on them. The goal is to obtain the R, G, B pixel values of the dots, which constitute the training data required to build a Gaussian model for thresholding. After the segmentation algorithm has run, the final segmentation of the image in Figure 5-6 (A) is Figure 5-6 (B). The user can then specify regions in this segmentation for which the R, G, B pixel values are needed; in this case the regions containing the dots were selected, as shown in Figure 5-6 (C). After the selection of the regions, the algorithm outputs a text file containing the training data.

Gaussian modeling is based on the assumption that the training data follow a Gaussian, or normal, distribution. A continuous univariate normal (Gaussian) density is completely specified by two parameters, its mean μ and variance σ². The probability density function for the univariate density is

p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right]    (5-1)

The general multivariate normal density in d dimensions is written as

p(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\exp\left[-\frac{1}{2}(x-\mu)^{t}\,\Sigma^{-1}(x-\mu)\right]    (5-2)
where x is a d-component vector, μ is the d-component mean vector, Σ is the d-by-d covariance matrix, and |Σ| and Σ⁻¹ are its determinant and inverse respectively; (x-μ)^t denotes the transpose of (x-μ).

The covariance of two features measures their tendency to vary together, i.e., to co-vary. The variance is the average of the squared deviation of a feature from its mean; the covariance is the average of the products of the deviations of the feature values from their means. The covariance matrix Σ is symmetric and positive semidefinite. The diagonal elements σ_ii of the covariance matrix are the variances σ_i² of the respective x_i, and the off-diagonal elements σ_ij are the covariances of x_i and x_j. If x_i and x_j are statistically independent, then σ_ij = 0. If all the off-diagonal elements are zero, p(x) reduces to the product of the univariate normal densities of the components of x. So the multivariate normal density is completely specified by the elements of the mean vector μ and the independent elements of the covariance matrix Σ.

Samples drawn from a normal population tend to fall in a single cloud or cluster, as shown in Figure 5-7. The center of the cluster is determined by the mean vector and its shape by the covariance matrix. From the equation of the multivariate density one can observe that the loci of points of constant density are hyperellipsoids on which the quadratic form (x-μ)^t Σ⁻¹ (x-μ) is constant. The principal axes of these hyperellipsoids are given by the eigenvectors of Σ, and the lengths of the axes are determined by its eigenvalues. The value

r^{2} = (x-\mu)^{t}\,\Sigma^{-1}(x-\mu)    (5-3)
is called the Mahalanobis distance from x to μ. So the contours of constant density are hyperellipsoids of constant Mahalanobis distance (Duda et al. 2000).

In our case the density function is multivariate, so the two parameters of concern are the mean and the covariance of the feature vectors. The first step in Gaussian modeling is to calculate the mean of all red pixel values in the training data; the means of the green and blue pixel values are calculated in the same way:

\mu_R = \frac{1}{N}\sum_i R_i, \quad \mu_G = \frac{1}{N}\sum_i G_i, \quad \mu_B = \frac{1}{N}\sum_i B_i    (5-4)

where N is the total number of pixels in the training data; R_i, G_i and B_i are the red, green and blue values of pixel i; and μ_R, μ_G and μ_B are their means. The mean vector of the R, G, B pixels is therefore

\mu = [\mu_R \ \mu_G \ \mu_B]^{t}    (5-5)

Then the covariance matrix of these R, G and B pixel values is calculated:

\Sigma = \begin{bmatrix} \sigma_{RR} & \sigma_{RG} & \sigma_{RB} \\ \sigma_{GR} & \sigma_{GG} & \sigma_{GB} \\ \sigma_{BR} & \sigma_{BG} & \sigma_{BB} \end{bmatrix}    (5-6)

where
\sigma_{RR} = \frac{1}{N}\sum_i (R_i - \mu_R)^2
\sigma_{GG} = \frac{1}{N}\sum_i (G_i - \mu_G)^2
\sigma_{RG} = \frac{1}{N}\sum_i (R_i - \mu_R)(G_i - \mu_G)
\sigma_{GR} = \frac{1}{N}\sum_i (G_i - \mu_G)(R_i - \mu_R)
\sigma_{RB} = \frac{1}{N}\sum_i (R_i - \mu_R)(B_i - \mu_B)
\sigma_{GB} = \frac{1}{N}\sum_i (G_i - \mu_G)(B_i - \mu_B)
\sigma_{BB} = \frac{1}{N}\sum_i (B_i - \mu_B)^2
\sigma_{BR} = \frac{1}{N}\sum_i (B_i - \mu_B)(R_i - \mu_R)
\sigma_{BG} = \frac{1}{N}\sum_i (B_i - \mu_B)(G_i - \mu_G)

After the mean and covariance matrix have been obtained, the Mahalanobis distance is calculated using Eq. 5-3, and the thresholding is based on this distance. Instead of using the average red, green and blue values of each pixel as the threshold, the Mahalanobis distance is used as the threshold: for each pixel the Mahalanobis distance is calculated, and if it lies within the specified threshold the pixel is converted to white; otherwise it is made black, i.e., pixels outside the threshold region are rejected. The result is a binary image that can be used for further image processing. The new thresholding algorithm was applied to the same example shown in Figure 5-4 (A), with the results shown in Figure 5-8 (B). As one can see in the binary image, the dots have been separated from the rest of the image, unlike in Figure 5-4 (B), where they were not. After running the algorithm on Figure 5-9 (A), the image shown in Figure 5-9 (B) was obtained. These images show a drastic improvement from the introduction of Gaussian modeling into the thresholding technique. The training data for each color (orange, magenta) were obtained from various images under varying lighting conditions to improve the robustness of the thresholding technique.

Since the cube has dots of multiple colors, a Gaussian model has to be built for every color in order to threshold them all, i.e., the mean and covariance of every color have to be calculated. To illustrate this, a cube with two different
colors of dots is shown in Figure 5-10 (A); the results shown in Figure 5-10 (B) were obtained after thresholding. The pseudocode for this approach is given in the Appendix.

Blob Detection Algorithm

This is the last and most important phase of the feature detection process. After the image has been thresholded, i.e., made into a binary image, the goal of the blob detection algorithm is to detect the white regions in this binary image and calculate the size and center of each such region (blob). This center is precisely the center of a control point in image coordinates, which is one of the inputs to the camera calibration algorithm for determining the position and orientation of the camera; the more accurately the center is calculated, the better the results of the calibration.

A blob is a region surrounded by pixels of the same value; here, a blob is a region of white pixels (red, green and blue values of 255, 255 and 255). A structure has been defined for the blob, containing the following parameters:
Label: a value assigned to each pixel
Ellipse: blob position statistics (covariance, mean, angle, major and minor axes)
RGB: the red, green and blue values of each pixel
XY: the position of each pixel with respect to the image (x, y)
N: the size of the blob, i.e., the number of pixels in the blob

Before the actual blob detection algorithm is run, memory is allocated to the structure defined above for the blobs detected in the image. The algorithm for detecting the blobs is a recursive process and works as follows. The algorithm checks whether a pixel in the binary image is a white pixel. If it is, then its properties, i.e., the red, green and blue values of that particular pixel, as well as the
x, y position of that pixel, are noted. When a white pixel is detected it is labeled, so that the recursive calls do not consider that pixel again. Once a white pixel is detected, its neighboring pixels are also checked; if they are white, their properties are noted, the pixels are added to that blob, and they are labeled as well. This process continues until the surrounding pixels of every pixel in the blob are either labeled or black. The remaining blobs in the image are detected in the same way during this recursive process. For each detected blob, the properties of all its pixels are known, and the mean and covariance can be calculated from the known x, y positions of its pixels in the image. The mean is precisely the center of the blob. The pseudocode for the function which collects data and detects blobs is given in the Appendix; the function is named "Detect_blob".

The size of a blob is simply the total number of white pixels in it, and it can be limited to a fixed value: when the number of white pixels in a blob exceeds that value, the blob is rejected, i.e., not stored in memory, even though its pixels are labeled. The number of blobs to be detected can also be set to a particular value. When a blob is detected it is placed on a stack, which helps avoid memory allocation problems. The blob thus obtained can be treated as an ellipse whose major axis, minor axis and angle can be found from the properties of the blob itself. Since the properties of the blob are known, the original PPM image can be recovered from the binary image. In order to visualize the detection of
the blob on the original PPM image, the boundary of each detected blob was marked in a different color, and the center of each blob was marked as well. As mentioned earlier, this center is simply the mean of the x and y positions of all the pixels in the ellipse. The boundary of the blob is determined from the properties of the ellipse: each boundary point is detected and its color is changed. The pseudocode for the function that does this is given in the Appendix; the function is named "Draw_Ellipse". Figures 5-11, 5-12 and 5-13 show examples of how an image looks after blob detection; these are the same images on which the thresholding was demonstrated earlier in this chapter. The boundaries of the blobs and their centers have been drawn in green and yellow so that they can be easily seen. The entire blob detection algorithm is depicted as a flow chart in Figure 5-14, from which one can easily understand how the feature detection technique works. The next section explains the remaining part of the feature extraction process.

Feature Localization

The next phase of feature extraction is feature localization. Once the features (the centers of the control points) have been found by the detection algorithm, they must be assigned to their respective 3D world coordinates. This is exactly what feature localization does: it assigns the features on the projected image to the features on the calibration object. Feature localization is needed because, if the 2D image coordinates are not assigned to the correct 3D coordinates, there would be no use in detecting the features in the image in the first place. So
it is fair to say that feature localization is as important to the feature extraction process as feature detection.

The centers of the blobs (x, y positions) obtained from feature detection arrive in random order: the algorithm reports the features of whichever blob it encounters first. Since the features are reported in this haphazard manner, it would be difficult to assign a particular 2D feature to its original 3D world coordinates. Different colors have therefore been used for the control points in each row on the two faces of the cube. One other known fact is that the x value (the x position of a control point from the top left corner of the image) always increases from left to right across the image, irrespective of the y value (the y position of the control point from the top left corner of the image). So once the features have been obtained from feature detection, the centers of the blobs are first sorted by color, giving an array of centers of a single blob color, and this array is then sorted in increasing order of the x values of the centers. The result is a sorted list of the centers of the control points in image coordinates, which can easily be assigned to their respective world coordinates.

The sorting of the x values in increasing order is done with the insertion sort algorithm, one of the simplest methods for sorting an array. It sorts by repeatedly taking the next item and inserting it into the final data structure in its proper order with respect to the items already inserted. The input to the insertion sort is small in this case, and the sort works fine for small inputs. The pseudocode for the insertion sort is given in the Appendix; the function is named "Insertion_sort".
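The insertion sort used to order one color's blob centers by their x values can be sketched as below (Python rather than the thesis's C implementation; the coordinate values are illustrative).

```python
def insertion_sort(centers):
    """Sort (x, y) blob centers in place by increasing x, repeatedly
    inserting the next item into its proper position among the items
    already sorted."""
    for i in range(1, len(centers)):
        item = centers[i]
        j = i - 1
        while j >= 0 and centers[j][0] > item[0]:
            centers[j + 1] = centers[j]  # shift larger x values right
            j -= 1
        centers[j + 1] = item
    return centers

row = [(412.0, 118.5), (96.2, 121.0), (254.7, 119.8)]
insertion_sort(row)
# row == [(96.2, 121.0), (254.7, 119.8), (412.0, 118.5)]
```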
Once the centers of the control points have been obtained and assigned to their respective object coordinates, these parameters can be used as input to the camera calibration algorithm. The two algorithms have been combined in such a way that, given an image, the blob detection algorithm finds the 2D features of the control points, and from this information the camera calibration algorithm calculates the position and orientation of the camera from which the image was taken. The combined algorithm has been tested for various positions of the camera in different lighting conditions by taking images of the calibration object containing the control points, and the accuracy and efficiency of the algorithm have been determined and are given in Chapter 6.

Figure 5-1. Example of the PPM format.
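The blob detection used to find the control-point centers (find an unlabeled white pixel, collect its white neighbors, label them, and take the mean pixel position as the center) can be sketched as follows. An explicit stack replaces the recursion of the original implementation, and the tiny binary image is illustrative.

```python
def detect_blobs(binary):
    """Find connected white (1) regions in a binary image given as a 2D
    list. Returns one dict per blob with its pixel count n and its center
    (mean x, mean y) in image coordinates."""
    h, w = len(binary), len(binary[0])
    labeled = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or labeled[y][x]:
                continue
            labeled[y][x] = True
            stack, pixels = [(x, y)], []
            while stack:  # grow the blob from its neighbors
                px, py = stack.pop()
                pixels.append((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py),
                               (px, py + 1), (px, py - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and binary[ny][nx] == 1 and not labeled[ny][nx]):
                        labeled[ny][nx] = True
                        stack.append((nx, ny))
            n = len(pixels)
            blobs.append({"n": n,
                          "center": (sum(p[0] for p in pixels) / n,
                                     sum(p[1] for p in pixels) / n)})
    return blobs

# Illustrative image: a 2x2 dot and a 1x2 dot.
img = [[0, 1, 1, 0, 0],
       [0, 1, 1, 0, 1],
       [0, 0, 0, 0, 1]]
blobs = detect_blobs(img)
# blobs[0] == {"n": 4, "center": (1.5, 0.5)}
# blobs[1] == {"n": 2, "center": (4.0, 1.5)}
```

The size cutoff and blob-count limit described in the chapter would be simple extra checks on `n` and `len(blobs)`.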
Figure 5-2. Smoothing example 1. A) Original image. B) Smoothed image.
Figure 5-3. Smoothing example 2. A) Original image. B) Smoothed image.
Figure 5-4. Thresholding example. A) Original image. B) Badly thresholded image.
Figure 5-5. Thresholding example. A) Original image. B) Badly thresholded image.
Figure 5-6. Segmentation images. A) Original image. B) Segmented image. C) Regions selected for obtaining training data.
Figure 5-7. Samples drawn from a two-dimensional Gaussian lie in a cloud centered on the mean (Pattern Classification, Richard O. Duda, Peter E. Hart and David G. Stork, Fig. 2.9, p. 35, 2001).
Figure 5-8. Gaussian-based threshold example. A) Original image. B) Thresholded image.
Figure 5-9. Gaussian-based threshold example. A) Original image. B) Thresholded image.
Figure 5-10. Gaussian-based thresholding example on two colors. A) Original image. B) Thresholded image.
Figure 5-11. Blob detection example 1. A) Original image. B) Blob-detected image.
Figure 5-12. Blob detection example 2. A) Original image. B) Blob-detected image.
Figure 5-13. Blob detection example 3. A) Original image. B) Blob-detected image.
Figure 5-14. Flow chart of the feature detection algorithm.
CHAPTER 6
EXPERIMENTAL METHOD

This chapter describes the experiments conducted to achieve the goal of determining the position and orientation of the camera relative to the target, along with the results obtained from the experimental setup, including the accuracy of the technique developed. The camera calibration technique described in Chapter 4 was tested with two experiments. First, the parameter estimates of the forward camera model were computed from real images and compared with manually measured values. Second, the image coordinates of the control points were calculated from the known parameters of the camera and the 3D world coordinates of the points, and compared with the original image coordinates for precision.

The tests were performed with real images captured using two off-the-shelf cameras. The first camera is a CCD video camera module (Sony model XC-711) and the second is a digital camera (Canon PowerShot S100); the features of both are described in Table 6-1. The images from the CCD camera were digitized using an Intel Corporation frame grabber (SAA7116), producing images of 640 by 480 pixels. The images from the Canon PowerShot S100 were in bitmap format; they were converted to PPM format using the XV software in Linux, and their size was also 640 by 480 pixels.
Table 6-1. Camera parameters

Specification            Sony XC-711 CCD        Canon PowerShot S100
Pick-up elements         768(H) x 493(V)        480(H) x 290(V)
Sensing area             8.8 mm x 6.6 mm        1/2.7-inch camera tube
White balance            Manual                 Auto / Manual
Nominal focal length     25 mm                  35 mm

The images contain a calibration object which has two perpendicular planes, each with 9 circular control points; the two perpendicular planes can be considered the two faces of a cube. Each plane was 4 by 4 feet. Three colors (blue, pink and orange) were used for the circular control points, which were spaced 14 inches apart and were 8 inches in diameter. The entire arrangement of the calibration object is shown in Figure 6-1: the top row contains blue circular control points, the middle row pink, and the bottom row orange. The reason for arranging the control points in this manner relates to the feature extraction process, and its significance was described in Chapter 5. The coordinate system of the calibration object has its origin at the top of the object, with its y axis pointing downward and its x and z axes as shown in Figure 6-1.

Actually, only 6 circular control points on each plane, i.e., the pink and orange points, are sufficient for the camera calibration algorithm to work. The blue circular points are used if the centers of the orange or pink circular points are difficult to
obtain due to varying lighting conditions. The program has been written to take the best available 6 circular control points on each plane for calibration. The blue circular control points are used for the second experiment, in which their centers are calculated in image coordinates from their known positions in the world coordinate system and the known position and orientation of the camera.

For the first experiment, the camera was placed on a tripod at a fixed position and the calibration object was moved. For each position of the object, an image was taken by the camera and the object's x, y and z distances from the camera were measured manually, as were the camera's orientation angles about the x, y and z axes. The distance between the camera and the object was varied in 2-foot increments from 16 to 40 feet, then in 5-foot increments up to 50 feet; beyond 50 feet, the calibration object was placed at random positions out to 100 feet. The angle of the camera was varied so that the image had a clear view of both planes. The position and orientation of the camera were measured manually for each position of the object, and then obtained by running the algorithm on the image from each of those positions. The results from the algorithm were then compared with the manually measured values.

The second experiment used the same images obtained in the first, but performed a reverse analysis of the algorithm: knowing the position and orientation of the camera for each position of the object from the first experiment, as well as the 3D world coordinates of the centers of the blue control points, the image coordinates of those centers could be calculated. The image coordinates thus obtained were compared with the original image
PAGE 94
82 coordinates of the blue circular control points and the error was noted. These two experiments were conducted for various positions of the calibration object by using both the cameras and the results are given below. The results of the experiments are provided in chapter 7. Figure 6-1 Calibration object
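The reverse analysis amounts to projecting a known 3D world point through the known camera pose into pixel coordinates. The following is a minimal sketch under a pinhole camera model; the function name, the intrinsics fx, fy, cx, cy, and all numeric values are hypothetical illustrations, not parameters taken from the thesis.

```python
def project_point(Xw, R, t, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates with a pinhole
    camera model.  R (3x3, row-major nested lists) and t (length 3) map
    world coordinates into the camera frame; fx, fy, cx, cy are the
    camera intrinsics."""
    # World -> camera frame: Xc = R * Xw + t
    Xc = [sum(R[i][j] * Xw[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division, then scaling into pixel coordinates
    u = fx * Xc[0] / Xc[2] + cx
    v = fy * Xc[1] / Xc[2] + cy
    return u, v

# Hypothetical example: identity orientation, camera 200 inches in
# front of the object plane, a control-point center at (10, -5, 0).
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0.0, 0.0, 200.0]
u, v = project_point([10.0, -5.0, 0.0], R, t, 800.0, 800.0, 320.0, 240.0)
# -> (360.0, 220.0)
```

Comparing (u, v) against the pixel center actually detected in the image yields the reverse-analysis error reported in the tables of Chapter 7.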
PAGE 95
CHAPTER 7
RESULTS AND CONCLUSIONS

This chapter presents the results of the experiments conducted on the camera calibration technique developed, followed by the conclusions.

Results

The results are given as tables containing the position (x, y and z distances) and the orientation (x, y and z angles) of the camera relative to the object, both measured manually and calculated by the algorithm. Each table also lists the centers of the blue control points marked on the image, their values in pixels obtained from the reverse analysis, and their original values. Tables 7-1 to 7-30 each contain 4 columns and 18 rows. The 1st row contains the image on which both the forward and the reverse analyses were performed. Rows 2 to 4 give the X, Y and Z distances in inches between the axes of the camera coordinate system and the axes of the object coordinate system. The 5th row contains the total distance between the calibration object and the camera, obtained as the square root of the sum of the squares of the X, Y and Z distances. Rows 6 to 8 give the θ, φ and ψ angles in degrees, the rotation angles of the x, y, z axes of the camera coordinate system with respect to the x, y, z axes of the calibration object coordinate system.
PAGE 96
The 9th row contains the total orientation angle obtained from the rotation matrix of the X, Y and Z orientation angles. Rows 11 to 16 give the centers of the blue control points as shown in Figure 7-1. The 2nd column contains the manually measured values of the position and orientation of the camera as well as the image coordinates of the blue control points. The manual measurements contain some inaccuracy, assumed not to exceed 1/8 inch in distance and 1 degree in orientation angle. The 3rd column contains the values obtained by running the algorithm on the image taken from the camera. The 4th column contains the relative error between the manual and the algorithm results.
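The per-table quantities described above can be sketched as follows. The Euclidean total distance and the percent relative error follow directly from the text; for the total orientation angle, the composition order of the three axis rotations (X, then Y, then Z) is an assumption, since the thesis does not state it, and the helper names are hypothetical.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def total_distance(dx, dy, dz):
    # Row 5 of each table: Euclidean norm of the three axis distances.
    return math.sqrt(dx * dx + dy * dy + dz * dz)

def total_angle(theta_deg, phi_deg, psi_deg):
    # Compose the three axis rotations (X, then Y, then Z order is an
    # assumption) and take the equivalent single-axis rotation angle
    # from the trace of the matrix: cos(a) = (trace(R) - 1) / 2.
    th, ph, ps = (math.radians(a) for a in (theta_deg, phi_deg, psi_deg))
    R = matmul(rot_z(ps), matmul(rot_y(ph), rot_x(th)))
    tr = R[0][0] + R[1][1] + R[2][2]
    return math.degrees(math.acos(max(-1.0, min(1.0, (tr - 1.0) / 2.0))))

def relative_error(manual, algo):
    # Column 4 of each table: percent error relative to the manual value.
    return abs(manual - algo) / abs(manual) * 100.0
```

With the manual measurements of Table 7-1, total_distance(1.1250, 1.6250, 191.3750) reproduces the tabulated total distance of about 191.385 inches, and total_angle(5, -45, 5) comes out close to the tabulated 45.74 degrees (up to the assumed rotation order).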
PAGE 97
Table 7-1 Sony CCD camera positioned at 16 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.1250                     1.1239             0.0982
Y distance (inches)                  1.6250                     1.6198             0.3212
Z distance (inches)                  191.3750                   191.5649           0.0994
Total distance (inches)              191.3852                   191.5750           0.0992
θ – X rotation (degrees)             5.0000                     5.0285             0.5715
φ – Y rotation (degrees)             -45.0000                   -45.0933           0.2314
ψ – Z rotation (degrees)             5.0000                     5.0657             1.3142
Total orientation angle (degrees)    -45.7412                   -45.8451           0.2271

Reverse analysis
First control point (x, y)           66, 125                    68, 126            3.0312, 0.8123
Second control point (x, y)          157, 128                   154, 127           1.9152, 0.7823
Third control point (x, y)           252, 126                   256, 129           1.5933, 2.3824
Fourth control point (x, y)          396, 111                   391, 106           1.2646, 4.5175
Fifth control point (x, y)           469, 87                    466, 83            0.6458, 4.6122
Sixth control point (x, y)           535, 70                    536, 69            0.1951, 1.4325
Average % relative distance error    191.3852                   191.5750           0.0991
Average % relative angle error       -45.7412                   -45.8451           0.2274
PAGE 98
Table 7-2 Sony CCD camera positioned at 18 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.1250                     1.1989             0.0657
Y distance (inches)                  1.6250                     1.7194             0.0581
Z distance (inches)                  216.2500                   215.9413           0.0014
Total distance (inches)              216.2590                   215.9515           0.1428
θ – X rotation (degrees)             5.0000                     5.3691             0.0738
φ – Y rotation (degrees)             -45.0000                   -45.1365           0.3029
ψ – Z rotation (degrees)             5.0000                     5.5412             0.1082
Total orientation angle (degrees)    -45.7412                   -45.8863           0.3175

Reverse analysis
First control point (x, y)           122, 184                   119, 179           2.4512, 2.7264
Second control point (x, y)          191, 187                   193, 181           1.0521, 3.2176
Third control point (x, y)           263, 189                   261, 195           0.7614, 3.1842
Fourth control point (x, y)          379, 178                   375, 176           1.0642, 1.1205
Fifth control point (x, y)           448, 163                   446, 168           0.4517, 3.0734
Sixth control point (x, y)           510, 149                   504, 154           1.1846, 3.3641
Average % relative distance error    216.2590                   215.9515           0.1428
Average % relative angle error       -45.7412                   -45.8863           0.3175
PAGE 99
Table 7-3 Sony CCD camera positioned at 20 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.1250                     1.1295             0.4124
Y distance (inches)                  1.6250                     1.5947             1.8721
Z distance (inches)                  240.8751                   240.2641           0.3245
Total distance (inches)              240.8832                   240.1186           0.3174
θ – X rotation (degrees)             5.0000                     5.01347            0.2916
φ – Y rotation (degrees)             -45.0000                   -45.1534           0.3542
ψ – Z rotation (degrees)             5.0000                     4.9574             0.8564
Total orientation angle (degrees)    -45.7412                   -45.9425           0.4119

Reverse analysis
First control point (x, y)           122, 183                   117, 179           4.1851, 2.1927
Second control point (x, y)          192, 186                   189, 186           1.5621, 0.0021
Third control point (x, y)           262, 190                   268, 187           2.2968, 1.5941
Fourth control point (x, y)          378, 176                   372, 179           1.5943, 1.7131
Fifth control point (x, y)           448, 162                   441, 156           1.5764, 3.7012
Sixth control point (x, y)           513, 150                   517, 149           0.7862, 0.6743
Average % relative distance error    240.8832                   239.1551           0.7174
Average % relative angle error       -45.7412                   -46.2589           1.1319
PAGE 100
Table 7-4 Sony CCD camera positioned at 22 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  3.6250                     3.5456             2.1903
Y distance (inches)                  5.2500                     5.0047             4.6723
Z distance (inches)                  264.3750                   265.7451           0.4992
Total distance (inches)              264.4250                   265.7738           0.5106
θ – X rotation (degrees)             2.2500                     2.2012             2.1746
φ – Y rotation (degrees)             -50.5000                   -49.7456           1.4927
ψ – Z rotation (degrees)             4.7500                     4.7023             1.0029
Total orientation angle (degrees)    -50.6615                   -50.0887           1.1305

Reverse analysis
First control point (x, y)           155, 197                   136, 189           2.2314, 4.0674
Second control point (x, y)          219, 197                   200, 186           8.6816, 5.5812
Third control point (x, y)           187, 195                   181, 189           3.2142, 3.0841
Fourth control point (x, y)          391, 183                   396, 187           1.2864, 2.1942
Fifth control point (x, y)           454, 173                   457, 179           0.6674, 3.4716
Sixth control point (x, y)           512, 166                   519, 160           1.3716, 3.6187
Average % relative distance error    264.4250                   265.7738           0.5106
Average % relative angle error       -50.6615                   -50.0887           1.1305
PAGE 101
Table 7-5 Sony CCD camera positioned at 24 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  3.6250                     3.6954             1.9412
Y distance (inches)                  5.2500                     5.1043             2.7863
Z distance (inches)                  288.1254                   288.1308           0.5415
Total distance (inches)              288.1960                   289.9071           0.6213
θ – X rotation (degrees)             2.2500                     2.2820             1.4247
φ – Y rotation (degrees)             -50.5000                   -51.0126           2.0319
ψ – Z rotation (degrees)             4.7500                     4.7736             0.5149
Total orientation angle (degrees)    -50.6615                   -51.1708           1.0054

Reverse analysis
First control point (x, y)           174, 187                   177, 183           2.1751, 2.2148
Second control point (x, y)          235, 186                   243, 179           3.3691, 3.7614
Third control point (x, y)           296, 186                   316, 192           6.8142, 3.2351
Fourth control point (x, y)          392, 177                   361, 169           7.9105, 4.5213
Fifth control point (x, y)           452, 166                   421, 159           6.8612, 4.2217
Sixth control point (x, y)           511, 160                   532, 164           4.1611, 3.7541
Average % relative distance error    288.1960                   289.9071           0.6213
Average % relative angle error       -50.6615                   -51.1708           1.0054
PAGE 102
Table 7-6 Sony CCD camera positioned at 26 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  3.6250                     3.5009             3.4251
Y distance (inches)                  5.2500                     5.3302             1.5241
Z distance (inches)                  312.6250                   315.0391           0.7512
Total distance (inches)              312.6901                   315.8555           1.0123
θ – X rotation (degrees)             2.2500                     2.2002             2.2104
φ – Y rotation (degrees)             -50.5000                   -51.8212           2.6162
ψ – Z rotation (degrees)             4.7500                     4.6085             2.9816
Total orientation angle (degrees)    -50.6615                   -51.7293           2.1073

Reverse analysis
First control point (x, y)           205, 171                   209, 163           1.1952, 4.6812
Second control point (x, y)          267, 170                   259, 178           3.0026, 4.7126
Third control point (x, y)           327, 169                   321, 164           1.8364, 2.9641
Fourth control point (x, y)          415, 165                   409, 169           1.4572, 2.3786
Fifth control point (x, y)           164, 167                   167, 160           1.8342, 4.1932
Sixth control point (x, y)           510, 165                   505, 163           1.8314, 4.1921
Average % relative distance error    312.6901                   315.8555           1.0123
Average % relative angle error       -50.6615                   -51.7293           2.1073
PAGE 103
Table 7-7 Sony CCD camera positioned at 28 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  8.5000                     8.1252             4.4368
Y distance (inches)                  10.3750                    10.6852            2.9941
Z distance (inches)                  336.8750                   346.5143           2.8614
Total distance (inches)              337.1419                   344.9032           2.3021
θ – X rotation (degrees)             4.2500                     4.1986             1.2154
φ – Y rotation (degrees)             -55.0000                   -55.9956           1.8169
ψ – Z rotation (degrees)             2.7500                     2.8321             2.9861
Total orientation angle (degrees)    -55.1102                   -56.0962           1.7891

Reverse analysis
First control point (x, y)           210, 169                   215, 166           2.5512, 1.6344
Second control point (x, y)          269, 172                   283, 179           5.1141, 3.7943
Third control point (x, y)           329, 172                   313, 165           4.7623, 3.9725
Fourth control point (x, y)          417, 164                   408, 159           2.1346, 3.3547
Fifth control point (x, y)           464, 170                   472, 166           1.8364, 2.0846
Sixth control point (x, y)           513, 168                   503, 163           1.1984, 3.2461
Average % relative distance error    337.1419                   344.9032           2.3021
Average % relative angle error       55.1102                    56.0962            1.7891
PAGE 104
Table 7-8 Sony CCD camera positioned at 30 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  8.5000                     8.7066             2.4368
Y distance (inches)                  10.3750                    9.7826             5.7121
Z distance (inches)                  360.7532                   372.6213           3.2974
Total distance (inches)              360.9992                   372.8514           3.2831
θ – X rotation (degrees)             4.2500                     4.4748             5.2978
φ – Y rotation (degrees)             -55.0000                   -53.0090           3.6254
ψ – Z rotation (degrees)             2.7500                     2.7887             1.4127
Total orientation angle (degrees)    -55.1102                   -53.1442           3.5669

Reverse analysis
First control point (x, y)           219, 169                   223, 163           1.7512, 3.6624
Second control point (x, y)          277, 167                   297, 177           3.3148, 5.7123
Third control point (x, y)           332, 165                   323, 173           2.7198, 4.9751
Fourth control point (x, y)          407, 158                   394, 154           3.1324, 2.3318
Fifth control point (x, y)           450, 151                   463, 146           2.8745, 3.6547
Sixth control point (x, y)           463, 146                   444, 151           3.6542, 4.2463
Average % relative distance error    360.9992                   372.8514           3.2831
Average % relative angle error       -55.1102                   -53.1442           3.5669
PAGE 105
Table 7-9 Sony CCD camera positioned at 32 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  8.5000                     8.2339             3.1770
Y distance (inches)                  10.3750                    9.7759             5.7468
Z distance (inches)                  384.3752                   372.2671           3.2644
Total distance (inches)              384.6089                   372.4865           3.1518
θ – X rotation (degrees)             4.2500                     4.4748             5.2914
φ – Y rotation (degrees)             -55.0000                   -57.3055           4.1348
ψ – Z rotation (degrees)             2.7500                     2.8991             5.4263
Total orientation angle (degrees)    55.1102                    57.4565            4.2574

Reverse analysis
First control point (x, y)           254, 158                   243, 150           3.9714, 5.0620
Second control point (x, y)          301, 160                   296, 163           1.6600, 1.8857
Third control point (x, y)           350, 159                   374, 152           6.0022, 4.4635
Fourth control point (x, y)          427, 153                   407, 159           4.6427, 3.9264
Fifth control point (x, y)           470, 147                   446, 152           5.0642, 3.4274
Sixth control point (x, y)           516, 139                   495, 131           4.1036, 5.7555
Average % relative distance error    384.6089                   372.4865           3.1518
Average % relative angle error       55.1102                    57.4565            4.2574
PAGE 106
Table 7-10 Sony CCD camera positioned at 34 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  15.6250                    16.4390            5.2104
Y distance (inches)                  26.5000                    27.0857            2.2978
Z distance (inches)                  408.1250                   426.9873           4.6217
Total distance (inches)              409.2828                   427.7946           4.5230
θ – X rotation (degrees)             4.2500                     4.2765             4.1875
φ – Y rotation (degrees)             -55.0000                   -52.8200           5.3674
ψ – Z rotation (degrees)             2.7500                     2.6009             5.9297
Total orientation angle (degrees)    -55.1102                   -52.9498           5.6179

Reverse analysis
First control point (x, y)           266, 157                   282, 149           5.9754, 5.1714
Second control point (x, y)          309, 157                   323, 151           4.6235, 3.8174
Third control point (x, y)           354, 155                   365, 147           3.0854, 5.1036
Fourth control point (x, y)          425, 147                   405, 151           4.7563, 2.9684
Fifth control point (x, y)           470, 142                   446, 136           5.3247, 4.4732
Sixth control point (x, y)           518, 138                   543, 130           4.9654, 6.0504
Average % relative distance error    409.2828                   427.7946           4.5230
Average % relative angle error       -55.1102                   -52.9498           5.6179
PAGE 107
Table 7-11 Sony CCD camera positioned at 36 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  15.6250                    16.8031            7.5432
Y distance (inches)                  26.5000                    25.1379            6.3574
Z distance (inches)                  432.6250                   461.2726           6.9251
Total distance (inches)              433.7174                   463.7657           6.9281
θ – X rotation (degrees)             8.3750                     8.9327             7.0164
φ – Y rotation (degrees)             -60.2500                   -56.3518           7.0684
ψ – Z rotation (degrees)             4.5000                     4.7462             7.8921
Total orientation angle (degrees)    -60.6002                   -56.81112          7.0864

Reverse analysis
First control point (x, y)           278, 149                   254, 156           8.4175, 4.5429
Second control point (x, y)          319, 150                   342, 161           7.2468, 7.5474
Third control point (x, y)           361, 148                   383, 157           6.4237, 5.8714
Fourth control point (x, y)          430, 142                   408, 154           5.1237, 8.1268
Fifth control point (x, y)           430, 137                   404, 146           5.9567, 6.7541
Sixth control point (x, y)           510, 132                   560, 141           8.5470, 7.0104
Average % relative distance error    433.7174                   463.7657           6.9281
Average % relative angle error       -60.6002                   -56.81112          7.0864
PAGE 108
Table 7-12 Sony CCD camera positioned at 38 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  15.6250                    16.9323            8.3670
Y distance (inches)                  26.5000                    24.2369            8.5469
Z distance (inches)                  456.5000                   493.2241           8.1457
Total distance (inches)              457.5354                   494.1086           8.0863
θ – X rotation (degrees)             8.3750                     8.9955             7.4367
φ – Y rotation (degrees)             -60.2500                   -54.8275           9.0001
ψ – Z rotation (degrees)             4.5000                     4.1346             8.1275
Total orientation angle (degrees)    -60.6002                   -55.3264           8.7025

Reverse analysis
First control point (x, y)           265, 149                   289, 136           9.2175, 8.7462
Second control point (x, y)          312, 146                   335, 132           7.4257, 9.3504
Third control point (x, y)           356, 142                   385, 152           8.2605, 7.4236
Fourth control point (x, y)          418, 135                   379, 147           9.1100, 9.4020
Fifth control point (x, y)           455, 131                   497, 139           9.4102, 6.2876
Sixth control point (x, y)           492, 124                   530, 134           7.8563, 8.2374
Average % relative distance error    457.5354                   494.1086           8.0863
Average % relative angle error       -60.6002                   -55.3264           8.7025
PAGE 109
Table 7-13 Sony CCD camera positioned at 40 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.6250                     1.7988             10.6980
Y distance (inches)                  6.5000                     7.18556            10.5472
Z distance (inches)                  480.3750                   537.0521           11.7986
Total distance (inches)              480.4217                   532.5052           10.8412
θ – X rotation (degrees)             8.3750                     7.4185             11.4202
φ – Y rotation (degrees)             -60.2500                   -67.8546           12.6217
ψ – Z rotation (degrees)             4.5000                     5.0295             11.5684
Total orientation angle (degrees)    -60.6002                   -67.8626           11.9842

Reverse analysis
First control point (x, y)           285, 142                   247, 162           13.1110, 13.8954
Second control point (x, y)          322, 140                   282, 161           12.4705, 14.7893
Third control point (x, y)           364, 136                   406, 114           11.4736, 15.7401
Fourth control point (x, y)          423, 131                   478, 118           13.2801, 10.0954
Fifth control point (x, y)           461, 126                   525, 145           14.0089, 15.3674
Sixth control point (x, y)           498, 122                   572, 145           14.8721, 14.2201
Average % relative distance error    480.4217                   532.5052           10.8412
Average % relative angle error       -60.6002                   -67.8626           11.9842
PAGE 110
Table 7-14 Sony CCD camera positioned at 45 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.6250                     1.9737             21.4697
Y distance (inches)                  6.5000                     4.8925             24.7314
Z distance (inches)                  540.6250                   646.1551           19.5244
Total distance (inches)              540.6665                   646.1766           19.5148
θ – X rotation (degrees)             8.3750                     6.7745             19.1101
φ – Y rotation (degrees)             -60.2500                   -71.2998           18.3472
ψ – Z rotation (degrees)             4.5000                     5.4090             20.0287
Total orientation angle (degrees)    -60.6002                   -72.0014           18.3674

Reverse analysis
First control point (x, y)           309, 131                   373, 103           20.7871, 21.1501
Second control point (x, y)          343, 130                   410, 106           19.5614, 17.8265
Third control point (x, y)           378, 130                   282, 155           25.3741, 19.3254
Fourth control point (x, y)          431, 120                   517, 141           20.0010, 17.2945
Fifth control point (x, y)           464, 134                   562, 159           21.1965, 19.2504
Sixth control point (x, y)           493, 129                   391, 156           20.7642, 21.2674
Average % relative distance error    540.6665                   646.1766           19.5148
Average % relative angle error       -60.6002                   -72.0014           18.3674
PAGE 111
Table 7-15 Sony CCD camera positioned at 50 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.6250                     2.1145             30.1278
Y distance (inches)                  6.5000                     4.3829             32.5701
Z distance (inches)                  600.6250                   788.2446           31.2374
Total distance (inches)              600.6624                   788.7596           31.4561
θ – X rotation (degrees)             8.3750                     10.8657            29.7463
φ – Y rotation (degrees)             -60.2500                   -76.2645           26.5874
ψ – Z rotation (degrees)             4.5000                     2.8467             36.7410
Total orientation angle (degrees)    -60.6002                   -76.3874           26.0514

Reverse analysis
First control point (x, y)           311, 127                   408, 179           31.2701, 41.1287
Second control point (x, y)          345, 126                   448, 86            29.8764, 31.4400
Third control point (x, y)           377, 123                   511, 72            35.5541, 41.2365
Fourth control point (x, y)          425, 118                   529, 82            24.5321, 30.1478
Fifth control point (x, y)           456, 116                   617, 81            35.4174, 30.0217
Sixth control point (x, y)           485, 112                   630, 147           29.8467, 31.0078
Average % relative distance error    600.6624                   788.7596           31.4561
Average % relative angle error       -60.6002                   -76.3874           26.0514
PAGE 112
Table 7-16 Canon power shot camera positioned at 16 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.1250                     1.1256             0.0498
Y distance (inches)                  1.6250                     1.7914             0.1024
Z distance (inches)                  191.3750                   191.1845           0.0995
Total distance (inches)              191.3852                   191.1636           0.1012
θ – X rotation (degrees)             5.0000                     5.0101             0.2027
φ – Y rotation (degrees)             -45.0000                   -45.0606           0.1347
ψ – Z rotation (degrees)             5.0000                     5.0138             0.2754
Total orientation angle (degrees)    -45.7412                   -45.8721           0.2862

Reverse analysis
First control point (x, y)           65, 123                    66, 123            1.1110, 0.47153
Second control point (x, y)          157, 127                   157, 126           0.0945, 0.8412
Third control point (x, y)           250, 125                   258, 129           1.1068, 1.4126
Fourth control point (x, y)          396, 107                   394, 105           2.4701, 1.2367
Fifth control point (x, y)           469, 85                    468, 80            0.1476, 1.3674
Sixth control point (x, y)           538, 67                    536, 64            0.1576, 0.1236
Average % relative distance error    191.3852                   191.1636           0.1012
Average % relative angle error       -45.7412                   -45.8721           0.2862
PAGE 113
Table 7-17 Canon power shot camera positioned at 18 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.1250                     1.1289             0.3452
Y distance (inches)                  1.6250                     1.6299             0.3017
Z distance (inches)                  216.2500                   216.7266           0.2204
Total distance (inches)              216.2590                   216.7216           0.2139
θ – X rotation (degrees)             5.0000                     5.01249            0.2498
φ – Y rotation (degrees)             -45.0000                   -45.1672           0.3715
ψ – Z rotation (degrees)             5.0000                     5.0209             0.4173
Total orientation angle (degrees)    -45.7412                   -45.9159           0.3821

Reverse analysis
First control point (x, y)           124, 178                   119, 174           1.1871, 2.1964
Second control point (x, y)          195, 176                   192, 176           1.5638, 0.0094
Third control point (x, y)           265, 174                   274, 171           1.2974, 1.5384
Fourth control point (x, y)          381, 168                   348, 171           2.4513, 1.7461
Fifth control point (x, y)           449, 163                   442, 166           1.5764, 2.7000
Sixth control point (x, y)           514, 156                   518, 159           0.7869, 0.6719
Average % relative distance error    216.2590                   216.7216           0.2139
Average % relative angle error       -45.7412                   -45.9159           0.3821
PAGE 114
Table 7-18 Canon power shot camera positioned at 20 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.1250                     1.1308             0.5217
Y distance (inches)                  1.6250                     1.6328             0.4792
Z distance (inches)                  240.8751                   242.0816           0.5009
Total distance (inches)              240.8832                   242.1119           0.5101
θ – X rotation (degrees)             5.0000                     5.0187             0.3738
φ – Y rotation (degrees)             -45.0000                   -45.2308           0.5129
ψ – Z rotation (degrees)             5.0000                     5.0204             0.4082
Total orientation angle (degrees)    -45.7412                   -45.9779           0.5175

Reverse analysis
First control point (x, y)           155, 198                   151, 193           2.4571, 2.7201
Second control point (x, y)          223, 199                   225, 193           1.0597, 1.5741
Third control point (x, y)           289, 197                   287, 203           0.7641, 2.1867
Fourth control point (x, y)          395, 185                   391, 183           1.0687, 1.1124
Fifth control point (x, y)           454, 174                   452, 178           0.4597, 2.0074
Sixth control point (x, y)           514, 169                   508, 175           1.1879, 1.5482
Average % relative distance error    240.8832                   242.1119           0.5101
Average % relative angle error       -45.7412                   -45.9779           0.5175
PAGE 115
Table 7-19 Canon power shot camera positioned at 22 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  3.6250                     3.6520             0.7460
Y distance (inches)                  5.2500                     5.2816             0.6027
Z distance (inches)                  264.3750                   266.5730           0.8314
Total distance (inches)              264.4250                   266.6118           0.8270
θ – X rotation (degrees)             2.2500                     2.2701             0.8954
φ – Y rotation (degrees)             -50.5000                   -51.0040           0.9981
ψ – Z rotation (degrees)             4.7500                     4.7877             0.7932
Total orientation angle (degrees)    -50.6615                   -51.1679           0.9997

Reverse analysis
First control point (x, y)           155, 197                   151, 201           2.8560, 2.1134
Second control point (x, y)          223, 196                   198, 200           1.9247, 1.2410
Third control point (x, y)           291, 195                   297, 192           2.0671, 1.4271
Fourth control point (x, y)          395, 182                   409, 179           2.6841, 1.9047
Fifth control point (x, y)           457, 175                   161, 180           2.3712, 2.8541
Sixth control point (x, y)           516, 165                   531, 168           1.8431, 2.0100
Average % relative distance error    264.4250                   266.6118           0.8270
Average % relative angle error       -50.6615                   -51.1679           0.9997
PAGE 116
Table 7-20 Canon power shot camera positioned at 24 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  3.6250                     3.5456             2.1903
Y distance (inches)                  5.2500                     5.0047             4.6721
Z distance (inches)                  288.1254                   290.7199           0.9005
Total distance (inches)              288.1960                   290.8203           0.9106
θ – X rotation (degrees)             2.2500                     2.2012             2.1746
φ – Y rotation (degrees)             -50.5000                   -49.7456           1.4927
ψ – Z rotation (degrees)             4.7500                     4.7023             1.0029
Total orientation angle (degrees)    -50.6615                   -50.0887           1.1305

Reverse analysis
First control point (x, y)           178, 197                   183, 194           2.5513, 1.6397
Second control point (x, y)          235, 196                   237, 199           2.1140, 3.7942
Third control point (x, y)           294, 184                   189, 177           1.7612, 1.9514
Fourth control point (x, y)          394, 179                   386, 173           2.1387, 3.3547
Fifth control point (x, y)           453, 168                   461, 165           1.8370, 2.0849
Sixth control point (x, y)           511, 165                   501, 161           2.1917, 2.2468
Average % relative distance error    288.1960                   290.8203           0.9106
Average % relative angle error       -50.6615                   -50.0887           1.1305
PAGE 117
Table 7-21 Canon power shot camera positioned at 26 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  3.6250                     3.7233             2.7123
Y distance (inches)                  5.2500                     5.0993             2.8741
Z distance (inches)                  312.6250                   318.6274           1.9247
Total distance (inches)              312.6901                   318.6900           1.9188
θ – X rotation (degrees)             2.2500                     2.3366             3.8541
φ – Y rotation (degrees)             -50.5000                   -51.7091           2.3945
ψ – Z rotation (degrees)             4.7500                     4.6189             1.2469
Total orientation angle (degrees)    -50.6615                   -51.8783           2.4019

Reverse analysis
First control point (x, y)           210, 168                   204, 171           2.7854, 1.9523
Second control point (x, y)          259, 171                   265, 174           2.2224, 1.7463
Third control point (x, y)           326, 168                   333, 173           2.2374, 3.1274
Fourth control point (x, y)          414, 166                   429, 170           3.8142, 2.6401
Fifth control point (x, y)           165, 158                   169, 161           2.7301, 1.6574
Sixth control point (x, y)           517, 166                   502, 170           2.9436, 2.1711
Average % relative distance error    312.6901                   318.6900           1.9188
Average % relative angle error       -50.6615                   -51.8783           2.4019
PAGE 118
Table 7-22 Canon power shot camera positioned at 28 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  8.5000                     8.7066             2.4368
Y distance (inches)                  10.3750                    9.7826             5.7121
Z distance (inches)                  336.8750                   346.2819           2.7924
Total distance (inches)              337.1419                   346.6186           2.8109
θ – X rotation (degrees)             4.2500                     4.4748             5.2978
φ – Y rotation (degrees)             -55.0000                   -53.0090           3.6254
ψ – Z rotation (degrees)             2.7500                     2.7887             1.4127
Total orientation angle (degrees)    -55.1102                   -53.1442           3.5669

Reverse analysis
First control point (x, y)           215, 171                   220, 176           2.1754, 2.1467
Second control point (x, y)          265, 169                   269, 174           3.3647, 3.7641
Third control point (x, y)           338, 170                   361, 175           4.1274, 3.2310
Fourth control point (x, y)          418, 165                   385, 158           4.9120, 4.5210
Fifth control point (x, y)           468, 163                   436, 156           3.8612, 4.2210
Sixth control point (x, y)           518, 167                   537, 153           4.1602, 3.7512
Average % relative distance error    337.1419                   346.6186           2.8109
Average % relative angle error       -55.1102                   -53.1442           3.5669
PAGE 119
Table 7-23 Canon power shot camera positioned at 30 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  8.5000                     8.2339             3.1770
Y distance (inches)                  10.3750                    9.7759             5.7468
Z distance (inches)                  360.7532                   371.8081           3.0644
Total distance (inches)              360.9992                   372.7836           3.2644
θ – X rotation (degrees)             4.2500                     4.1221             3.0142
φ – Y rotation (degrees)             -55.0000                   -57.1890           3.9842
ψ – Z rotation (degrees)             2.7500                     2.8452             3.4617
Total orientation angle (degrees)    -55.1102                   -57.2891           3.9537

Reverse analysis
First control point (x, y)           254, 158                   246, 164           3.0214, 3.9145
Second control point (x, y)          303, 157                   312, 151           3.1247, 3.8452
Third control point (x, y)           351, 160                   361, 166           2.8864, 3.7741
Fourth control point (x, y)          426, 154                   443, 156           3.9147, 4.2865
Fifth control point (x, y)           471, 145                   478, 149           4.8136, 2.1275
Sixth control point (x, y)           514, 139                   517, 145           3.9947, 4.2357
Average % relative distance error    360.9992                   372.7836           3.2644
Average % relative angle error       -55.1102                   -57.2891           3.9537
PAGE 120
Table 7-24 Canon power shot camera positioned at 32 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  8.5000                     8.7065             2.4379
Y distance (inches)                  10.3750                    9.7826             5.7124
Z distance (inches)                  384.3752                   398.9314           3.7869
Total distance (inches)              384.6089                   399.8902           3.9732
θ – X rotation (degrees)             4.2500                     4.4748             5.2471
φ – Y rotation (degrees)             -55.0000                   -57.1417           3.8941
ψ – Z rotation (degrees)             2.7500                     2.7888             1.4178
Total orientation angle (degrees)    -55.1102                   -57.1869           3.7831

Reverse analysis
First control point (x, y)           267, 156                   271, 150           1.7584, 3.6674
Second control point (x, y)          308, 156                   331, 165           3.7310, 5.7164
Third control point (x, y)           352, 153                   343, 157           2.7164, 3.9724
Fourth control point (x, y)          424, 146                   411, 143           3.2340, 2.3333
Fifth control point (x, y)           471, 140                   484, 136           2.8046, 3.1240
Sixth control point (x, y)           517, 134                   504, 139           4.0070, 4.2156
Average % relative distance error    384.6089                   399.8902           3.9732
Average % relative angle error       -55.1102                   -57.1869           3.7831
PAGE 121
Table 7-25 Canon power shot camera positioned at 34 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  15.6250                    16.3671            4.7521
Y distance (inches)                  26.5000                    25.4559            3.9425
Z distance (inches)                  408.1250                   427.4052           4.7241
Total distance (inches)              409.2828                   428.4753           4.6893
θ – X rotation (degrees)             4.2500                     4.0273             5.2417
φ – Y rotation (degrees)             -55.0000                   -57.8258           5.3179
ψ – Z rotation (degrees)             2.7500                     2.8149             2.3671
Total orientation angle (degrees)    -55.1102                   -57.9202           5.0988

Reverse analysis
First control point (x, y)           267, 155                   278, 158           4.7812, 2.1971
Second control point (x, y)          309, 157                   301, 166           2.5781, 5.9271
Third control point (x, y)           352, 154                   372, 160           4.7648, 5.7236
Fourth control point (x, y)          424, 148                   441, 154           5.7648, 4.1101
Fifth control point (x, y)           473, 140                   445, 146           4.0174, 3.9874
Sixth control point (x, y)           518, 134                   526, 129           5.8412, 4.3841
Average % relative distance error    409.2828                   428.4753           4.6893
Average % relative angle error       -55.1102                   -57.9202           5.0988
PAGE 122
Table 7-26 Canon power shot camera positioned at 36 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  15.6250                    16.7547            7.2347
Y distance (inches)                  26.5000                    24.6079            7.1486
Z distance (inches)                  432.6250                   462.2598           6.8574
Total distance (inches)              433.7174                   463.2174           6.8026
θ – X rotation (degrees)             8.3750                     8.9327             7.0164
φ – Y rotation (degrees)             -60.2500                   -64.4242           6.9281
ψ – Z rotation (degrees)             4.5000                     4.7462             7.8921
Total orientation angle (degrees)    -60.6002                   -64.7606           6.8654

Reverse analysis
First control point (x, y)           278, 149                   297, 136           7.0174, 8.5467
Second control point (x, y)          320, 149                   340, 145           6.1274, 7.6354
Third control point (x, y)           361, 150                   370, 164           9.1901, 7.2501
Fourth control point (x, y)          429, 142                   460, 137           7.1247, 8.4671
Fifth control point (x, y)           470, 134                   433, 144           7.8521, 7.3241
Sixth control point (x, y)           510, 126                   521, 139           7.9864, 7.2580
Average % relative distance error    433.7174                   463.2174           6.8026
Average % relative angle error       -60.6002                   -64.7606           6.8654
PAGE 123
Table 7-27 Canon power shot camera positioned at 38 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  15.6250                    17.1673            9.8714
Y distance (inches)                  26.5000                    24.2369            8.5427
Z distance (inches)                  456.5000                   498.2241           9.1468
Total distance (inches)              457.5354                   499.1086           9.0863
θ – X rotation (degrees)             8.3750                     8.9955             7.4128
φ – Y rotation (degrees)             -60.2500                   -54.8275           9.0094
ψ – Z rotation (degrees)             4.5000                     4.1346             8.1274
Total orientation angle (degrees)    -60.6002                   -55.3264           8.7025

Reverse analysis
First control point (x, y)           268, 148                   289, 136           9.2715, 8.7465
Second control point (x, y)          313, 147                   335, 132           7.4184, 9.2470
Third control point (x, y)           356, 144                   385, 152           8.2678, 7.2140
Fourth control point (x, y)          419, 136                   379, 147           9.1170, 9.1742
Fifth control point (x, y)           456, 129                   497, 139           9.4178, 6.2874
Sixth control point (x, y)           491, 124                   530, 134           7.8542, 8.1274
Average % relative distance error    457.5354                   499.1086           9.0863
Average % relative angle error       -60.6002                   -55.3264           8.7025
PAGE 124
Table 7-28 Canon power shot camera positioned at 40 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.6250                     1.8402             13.2461
Y distance (inches)                  6.5000                     5.6895             12.4731
Z distance (inches)                  480.3750                   547.0512           13.0801
Total distance (inches)              480.4217                   547.0839           13.8757
θ – X rotation (degrees)             8.3750                     7.4185             11.4201
φ – Y rotation (degrees)             -60.2500                   -53.0525           12.3152
ψ – Z rotation (degrees)             4.5000                     5.1295             13.9901
Total orientation angle (degrees)    -60.6002                   -53.4303           11.8315

Reverse analysis
First control point (x, y)           285, 142                   247, 162           13.1101, 13.8964
Second control point (x, y)          322, 140                   282, 161           12.4702, 14.7802
Third control point (x, y)           364, 136                   406, 114           11.4769, 15.7423
Fourth control point (x, y)          423, 131                   478, 114           13.2176, 10.0029
Fifth control point (x, y)           461, 126                   525, 145           14.0098, 15.3647
Sixth control point (x, y)           498, 122                   572, 173           14.8797, 14.2234
Average % relative distance error    480.4217                   547.0839           13.8757
Average % relative angle error       -60.6002                   -53.4303           11.8315
PAGE 125
Table 7-29 Canon power shot camera positioned at 45 feet from calibration object

Forward analysis                     Results obtained manually  Algorithm results  Relative % error
X distance (inches)                  1.6250                     2.0027             23.2467
Y distance (inches)                  6.5000                     5.0395             22.4760
Z distance (inches)                  540.6250                   663.5998           22.7468
Total distance (inches)              540.6665                   654.4217           21.0398
θ – X rotation (degrees)             8.3750                     7.4186             11.4287
φ – Y rotation (degrees)             -60.2500                   -46.8082           22.3101
ψ – Z rotation (degrees)             4.5000                     5.5795             23.9947
Total orientation angle (degrees)    -60.6002                   -47.3124           21.9269

Reverse analysis
First control point (x, y)           307, 133                   236, 178           23.1174, 33.8901
Second control point (x, y)          342, 130                   265, 149           22.4762, 24.8701
Third control point (x, y)           378, 130                   497, 81            31.4763, 35.7432
Fourth control point (x, y)          378, 127                   487, 111           23.4680, 26.4715
Fifth control point (x, y)           430, 124                   575, 148           24.0001, 30.0145
Sixth control point (x, y)           464, 118                   617, 143           28.6541, 24.8732
Average % relative distance error    540.6665                   654.4217           21.0398
Average % relative angle error       -60.6002                   -47.3124           21.9269
Table 7-30. Canon PowerShot camera positioned at 50 feet from the calibration object

Forward analysis                     Results obtained manually   Algorithm results   Relative % error
X distance (inches)                  1.6250                      1.1086              31.7824
Y distance (inches)                  6.5000                      9.0682              39.5147
Z distance (inches)                  600.6250                    782.0347            30.1278
Total distance (inches)              600.6624                    782.0881            30.2062
θ – X rotation (degrees)             8.3750                      5.2921              36.8101
φ – Y rotation (degrees)             -60.2500                    -82.2955            35.5874
ψ – Z rotation (degrees)             4.5000                      3.3057              26.5420
Total orientation angle (degrees)    -60.6002                    -82.3396            35.8735

Reverse analysis                     Results obtained manually   Algorithm results   Relative % error
First control point (x, y)           312, 127                    412, 86             32.2101, 31.9504
Second control point (x, y)          343, 126                    448, 163            30.7469, 29.4703
Third control point (x, y)           377, 123                    507, 75             34.5274, 39.2507
Fourth control point (x, y)          426, 120                    278, 157            34.5198, 31.1104
Fifth control point (x, y)           457, 117                    553, 163            29.9457, 39.0247
Sixth control point (x, y)           486, 111                    690, 154            41.8465, 39.0014

Average % relative distance error    600.6624                    782.0881            30.2062
Average % relative angle error       -60.6002                    -82.3396            35.8735
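The relative % error columns in the tables above appear to be the absolute difference between the manually measured value and the algorithm-computed value, expressed as a percentage of the manual value. A small Python sketch, using the total-distance row of Table 7-28, illustrates the computation:

```python
# Relative % error as reported in Tables 7-28 through 7-30: the absolute
# difference between the manual and algorithm values, as a percentage of the
# manual value. The inputs below are the total-distance row of Table 7-28.

def relative_pct_error(manual, algorithm):
    """Return |algorithm - manual| / |manual| * 100."""
    return abs(algorithm - manual) / abs(manual) * 100.0

# Total distance at 40 feet: 480.4217 in (manual) vs 547.0839 in (algorithm).
err = relative_pct_error(480.4217, 547.0839)   # ~13.876 %, matching the table
```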
As shown in the results above, the algorithm gave good results up to 34 feet, with a relative error of 1.5-4%. Beyond 34 feet the algorithm could not maintain this accuracy because the centers of the control points were not located precisely enough for the camera calibration to produce good results. Once the distance between the camera and the object exceeded 60 feet, the algorithm failed to detect the circular control points on the two planes at all. As Figure 7-2 shows, the algorithm failed to detect the control points beyond the 50-foot range.

Conclusions

One must note that the two planes built must be in clear view of the camera. As the images in the tables show, the algorithm was able to calculate the centers of the control points even under varying lighting conditions. This is mainly because of the training data used for the Gaussian modeling that thresholds the image.

To give a graphical view of the results, error plots are shown in Figures 7-3 to 7-6. Figures 7-3 and 7-5 plot the distance between the calibration object and the camera versus the relative % error in total distance between the manually measured and algorithm-calculated values. Similarly, Figures 7-4 and 7-6 plot that distance versus the relative % error in the total orientation angle. Results are shown for two cameras to demonstrate that the algorithm can be used with images obtained from any off-the-shelf camera once its properties are known.

As stated in the previous chapters, and as the results show, the algorithm runs efficiently and autonomously, without any user interference, and it is versatile. The algorithm is accurate up to a range of 34 feet; as the error plots show, the relative error rises above 5% beyond 34 feet. The reason for this error is that the feature extractor was not able to accurately locate the centroids of the circular dots on the two faces of the cube, and the error in the dot centers propagated into the calibration results. Given a clear view of the 12 control points on each face of the cube, the calibration algorithm was able to calculate the position and orientation of the camera up to 34 feet with a relative error of less than 5%.

The algorithm runs autonomously, i.e., without any user interaction, and it calibrates the camera within 2 seconds, so it is efficient as well. If the algorithm cannot find optimized parameter values, it notifies the user that the calibration failed. The algorithm has therefore met the calibration criteria stated in Chapter 4. Using simple vision techniques, and without any costly equipment, the parameters of the camera were calibrated in an outdoor environment.

Figure 7-1. Blue control point centers marked
Figure 7-2. Bad detection of blobs beyond 50 feet

Figure 7-3. Sony CCD camera: distance vs. % relative error in total distance
Figure 7-4. Sony CCD camera: distance vs. % relative error in orientation angle

Figure 7-5. Canon PowerShot: distance vs. % relative error in total distance
Figure 7-6. Canon PowerShot: distance vs. % relative error in orientation angle
CHAPTER 8
FUTURE WORK

The primary goal of this research was to develop a technique by which a vehicle's position and orientation could be determined locally. The technique developed finds the position and orientation of a camera with respect to a known artificial landmark in the environment; the results were shown in the previous chapter. All testing of the algorithm was done with the camera alone, and the vehicle did not come into the picture. The vehicle's position and orientation can be determined in the following way. The camera is placed on the vehicle and the calibration algorithm is run, which gives the camera's position and orientation with respect to the landmark. Since the camera is manually placed on the vehicle, its position and orientation with respect to the vehicle are also known. Combining these two quantities yields the position and orientation of the vehicle with respect to the landmark. This can be shown mathematically as follows: if Camera T Landmark is the transformation matrix between the camera coordinate system and the landmark coordinate system, which is known from the calibration technique, and Vehicle T Camera is the transformation matrix between the vehicle coordinate system and the camera coordinate system, which is also known since the camera is fixed on the vehicle, then Vehicle T Landmark, the transformation matrix between the vehicle coordinate system and the landmark coordinate system, can be determined:
Vehicle T Landmark = (Vehicle T Camera) x (Camera T Landmark)     [8.1]

Thus, the position and orientation of the vehicle coordinate system can be found from the known transformation matrix Vehicle T Landmark. The first goal of the future work should be to determine Vehicle T Landmark in this way.

The camera on which the algorithm was tested did not have auto iris or auto zoom, so each time an image of the object had to be taken, the camera's iris and zoom had to be adjusted manually to produce good images for calibration. The calibration technique works fine for cameras of this type, since their properties do not vary with time. Eventually, however, the goal is to use a camera with auto iris and auto zoom; the zooming capability would increase the range of the algorithm. The calibration algorithm has to be modified to accommodate these variations in zoom and iris. This is another goal which needs to be accomplished.

Only two planes of the calibration object were built, and testing was done on these two planes only. The other two planes need to be built so that, when the vehicle is moving around the calibration object, it always has at least two planes of the object in view for the calibration algorithm to work and determine the vehicle's position. The overall goal is to make this technique useful at run time, i.e., for the vehicle to determine its position and orientation while it is in motion. When the vehicle is moving, it should have a clear sight of at least two planes of the calibration object, and the camera on the vehicle should be able to take images of the calibration object and continuously update the position and orientation of the vehicle with respect to the object.
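Equation 8.1 is an ordinary composition of 4x4 homogeneous transformation matrices. The following Python sketch illustrates it with hypothetical numbers: both the camera mounting pose on the vehicle and the calibrated camera-to-landmark pose below are made up, so only the composition itself is meaningful here.

```python
# Sketch of Eq. 8.1: Vehicle_T_Landmark = Vehicle_T_Camera * Camera_T_Landmark.
# Both input transforms are hypothetical; the composition is the point.
import math

def mat_mul(a, b):
    """Multiply two 4x4 homogeneous transformation matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Hypothetical camera mounting on the vehicle: 90-degree yaw about z,
# offset (0.5, 0, 1.2) in the vehicle frame.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
vehicle_T_camera = [[c, -s, 0, 0.5],
                    [s,  c, 0, 0.0],
                    [0,  0, 1, 1.2],
                    [0,  0, 0, 1.0]]

# Hypothetical camera-to-landmark transform returned by the calibration.
camera_T_landmark = [[1, 0, 0, 2.0],
                     [0, 1, 0, 0.3],
                     [0, 0, 1, 6.0],
                     [0, 0, 0, 1.0]]

# Eq. 8.1: the landmark pose expressed in the vehicle coordinate system.
vehicle_T_landmark = mat_mul(vehicle_T_camera, camera_T_landmark)
```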
As discussed in Chapter 5, Gaussian modeling was used for the thresholding phase of the algorithm. The drawback of this technique is that if the circular control points carry multiple colors, a Gaussian model has to be built for each and every color, which can be a tedious process. So one of the other goals of the future work is to employ the mixture-modeling technique for the thresholding phase of the feature detection. With mixture modeling there is no need to calculate a separate Gaussian for every color: the mixture model gives the Gaussians of all the colors specified in the threshold range in one step. Once these goals have been achieved, the algorithm might be efficient enough to work in outdoor environments.
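The mixture-model idea can be sketched as follows. This is not the thesis implementation; it is a minimal Python illustration of how a single mixture density over several control-point colors could replace one Gaussian per color, with a pixel kept when its mixture likelihood exceeds a threshold. All weights, means, variances, and the threshold are hypothetical.

```python
# A hedged sketch of mixture-model thresholding: one mixture density covers
# all control-point colours, replacing a separate Gaussian per colour. The
# weights, means, variances, and threshold below are hypothetical values.
import math

def gaussian_pdf(v, mean, var):
    """Isotropic 3-D Gaussian density (diagonal covariance var * I, for brevity)."""
    d2 = sum((vi - mi) ** 2 for vi, mi in zip(v, mean))
    return math.exp(-d2 / (2 * var)) / (2 * math.pi * var) ** 1.5

# Two components: one mode for blue dots, one for green dots.
components = [(0.5, [40, 60, 180], 200.0),   # (weight, RGB mean, variance)
              (0.5, [50, 170, 60], 200.0)]

def mixture_likelihood(pixel):
    """Weighted sum of the component densities at this RGB vector."""
    return sum(w * gaussian_pdf(pixel, m, v) for w, m, v in components)

def keep(pixel, t=1e-9):
    """Threshold test: keep the pixel if the mixture likelihood exceeds t."""
    return mixture_likelihood(pixel) > t

kept = keep([42, 58, 175])        # a blue-ish pixel passes
rejected = keep([128, 128, 128])  # a grey background pixel does not
```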
APPENDIX
PSEUDO CODE

This appendix contains the pseudo code for some of the important functions used for calibrating a camera.

Calibration Algorithm Functions

ExtPar: This function calculates the initial extrinsic parameters by direct decomposition of the perspective transformation matrix. These initial parameters are later used to initialize the nonlinear estimation.
Input – Data: matrix [N x 5] containing the 3D and 2D image coordinates of the control points. Row format: [Wx Wy Wz Ix Iy]. Camera Model: type of camera being used.
Output – Pos: camera position and orientation.
Pseudo Code:
ExtPar (Camera_model, data, pos) {
Calculate the L matrix for use in the DLT method;
Perspective Transformation Matrix = Pseudo Inverse (L);
Translation vector = P13^-1 F (last column); for P13 see equation [ ];
Rotation vector = P13^-1 F (first three columns);
Singular Value Decomposition of R;
Use the equations [ ] to get the parameters;
}

Cinit: This function forms a matrix containing the control point positions and orientations, which is used for further processing by functions such as Calibration.
Input – Data: matrix that contains the 3-D coordinates of the control points (in a right-handed frame), the corresponding image observations (in the image frame, origin in the upper left corner) and the normal vectors of the object surface. Row format: [Wx, Wy, Wz, Ix, Iy, Nx, Ny, Nz]. Units: mm for control points, pixels for image points. Camera Model: type of camera used.
Output – Bs: matrix containing the 3D configuration of the control points (position, orientation).
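The L matrix mentioned in ExtPar is not written out in the pseudo code. In the standard direct linear transformation (the Abdel-Aziz and Karara formulation), each 3D-to-2D correspondence contributes two rows of a linear system in the 11 DLT parameters. A small Python sketch of that assembly follows; this is the standard DLT layout, not necessarily the thesis's exact matrix, and the control-point data is hypothetical.

```python
# Sketch of the L-matrix assembly used by the DLT step in ExtPar. Each
# correspondence (Wx, Wy, Wz) -> (Ix, Iy) gives two rows of a system in the
# 11 DLT parameters (the right-hand side stacks Ix and Iy). Standard DLT
# layout; the data below is hypothetical illustration values.

def dlt_rows(Wx, Wy, Wz, Ix, Iy):
    """Two rows of the DLT design matrix for one control point."""
    return [
        [Wx, Wy, Wz, 1, 0, 0, 0, 0, -Ix * Wx, -Ix * Wy, -Ix * Wz],
        [0, 0, 0, 0, Wx, Wy, Wz, 1, -Iy * Wx, -Iy * Wy, -Iy * Wz],
    ]

def build_L(data):
    """Stack rows for all control points (data rows: [Wx, Wy, Wz, Ix, Iy])."""
    L = []
    for Wx, Wy, Wz, Ix, Iy in data:
        L.extend(dlt_rows(Wx, Wy, Wz, Ix, Iy))
    return L

# Six correspondences give a 12 x 11 design matrix for the 11 parameters.
L = build_L([[0, 0, 0, 285, 142], [50, 0, 0, 322, 140], [0, 50, 0, 285, 100],
             [0, 0, 50, 290, 150], [50, 50, 0, 320, 98], [50, 0, 50, 330, 148]])
```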
SysImageCoor: This function calculates the synthetic image coordinates from the 3D world coordinates of the control points and the camera parameters.
Input – Camera Model: type of camera used. Bs: 3-D configuration of the control points. Pos: camera position and orientation. Par: camera intrinsic parameters.
Output – ic: synthetic image coordinates.
Pseudo Code:
SysImageCoor (Camera model, Bs, pos, par, ic) {
Calculate the perspective transformation matrix F by using Eq [ ];
The values of x and y are calculated from the Bs matrix and the F matrix;
r2 = x * x + y * y;
delta = 3 * k1 * r2 + (5 * k2 * r2) * r2; (k1, k2, p1, p2 are distortion coefficients)
xn = x * (1 + delta) + 6 * p1 * x * y - p2 * (r2 - 6 * x * x);
yn = y * (1 + delta) - p1 * (r2 - 6 * y * y) + 6 * p2 * x * y;
Q = 4 * k1 * r2 + 6 * k2 * r2 * r2 + 8 * p1 * y + 8 * p2 * x + 1;
xn = xn / Q;
yn = yn / Q;
ic(1) = NDX * S * xn / Sx + Cpx;
ic(2) = NDY * yn / Sy + Cpy;
(NDX and NDY are the numbers of pixels in the horizontal and vertical directions; Sx and Sy are the effective CCD chip sizes in the horizontal and vertical directions; Cpx and Cpy are the principal point coordinates; S is the aspect ratio.)
}

Jacob: This function calculates the partial derivatives of the image coordinates with respect to the estimated parameters, approximating them by finite differences. It is used by the optimization routine and calls SysImageCoor to compute the synthetic image coordinates.
Input – Camera_Model: type of camera being used. Bs: 3-D configuration of the control points. P: both intrinsic and extrinsic parameters.
Output – J: Jacobian matrix evaluated at P. f: synthetic image coordinates.
Pseudo Code:
Jacob (Camera Model, Bs, P, f, J) {
delta = 1.0000e-10;
f = SysImageCoor (Camera_model, Bs, P, f);
for (i = 1 to number of parameters) {
Add delta to the i-th entry of P, i.e., q = p + delta;
g = SysImageCoor (Camera_model, Bs, q, f);
J(:, i) = (g - f) / delta;
}
}

LevMarNonLinOpt: This function performs Levenberg-Marquardt nonlinear optimization. It uses the functions defined above to obtain an optimum solution to the nonlinear problem.
Input – Camera Model: type of camera being used. Bs: 3-D configuration of the control points. P0: initial parameter vector.
Output – P: final parameter vector. Res: residual (sum of squared errors). Iter: number of iterations. Er: error for each point. J: Jacobian matrix.
Pseudo Code:
LevMarNonLinOpt (Camera Model, Bs, P0, iter, res, J, P) {
Jacob (Camera Model, Bs, P, f, J);
error = y - f; (y holds the original pixel coordinates of the control points)
res = error' * error; store this value in pres;
beta = J' * error;
alpha = J' * J;
lambda = 0.001;
for (i = 1 to limit) {
Add lambda to the diagonal of alpha;
dp = alpha \ beta;
P = P + dp;
Jacob (Camera Model, Bs, P, f, J);
error = y - f;
res = error' * error;
if (abs (res - pres) < 1.0000e-004 && i > 10) break;
if (res < pres) {
lambda = 0.1 * lambda;
beta = J' * error;
alpha = J' * J;
pres = res; pp = P;
} else {
lambda = 10 * lambda;
Add lambda to the diagonal of alpha;
res = pres; P = pp;
}
}
}

Calibration: This is the main routine: it calls the function which calculates the initial parameters of the camera as well as the function which calculates the final parameters
iteratively. It also calls the function which computes the control point positions and orientations.
Input – Data: matrix containing the 3D world coordinates, 2D image coordinates and normal vectors. Points: number of control points. Camera Model: type of camera being used.
Output – Par: intrinsic camera parameters. Pos: position and orientation of the camera. Res: sum of the squared errors.
Pseudo Code:
Calibration (data, points, par, pos, res) {
Bs = Cinit (data, points);
ExtPar (Camera_model, pos, data, points);
LevMarNonLinOpt (Bs, points, initial_param, final_param, res);
}

Feature Extraction Functions

Read_ppm_file: This function reads the red, green and blue values of each pixel in a PPM file and stores them in a three-dimensional array.
Input: An image file in PPM format, its x and y dimensions, and the size of its header.
Output: A three-dimensional array containing the red, green and blue values of each pixel.
Pseudo Code:
Read_ppm_file (PPM file, 3D array, xdim, ydim, header size) {
for (i = 0 to header size)
Read the header information;
for (i = 0 to ydim)
for (j = 0 to xdim)
for (k = 0 to 3)
Read the pixel information into the three-dimensional array;
}

Define_smoothing_filter: This function defines the filter to be built for use by the Filter routine when smoothing images.
Input: Filter – a data structure with a two-dimensional array (m[][]) and the size of the filter, f.
Output: A filter f that can be used for smoothing images.
Pseudo Code:
Define_smoothing_filter (f, size) {
for (i = 0 to size)
for (j = 0 to size)
Set the value of the filter, i.e., set the value of the two-dimensional array;
}

Filter: This function uses the filter defined above and smooths the image according to the type of filter defined.
Input: A PPM image imin, a filter f.
Output: A smoothed PPM image imout.
Pseudo Code:
Filter (imin, imout, f) {
Red = 0; Green = 0; Blue = 0; White = 0;
fsize = f->fsize;
for (i = imin->ymin to imin->ymax) {
for (j = imin->xmin to imin->xmax) {
for (k = -fsize to fsize) {
for (l = -fsize to fsize) {
Red += f->m[l+fsize][k+fsize] * p[j+l][i+k][0];
Green += f->m[l+fsize][k+fsize] * p[j+l][i+k][1];
Blue += f->m[l+fsize][k+fsize] * p[j+l][i+k][2];
White += f->m[l+fsize][k+fsize];
}
}
imout->p[j][i][0] = Red / White + 0.5;
imout->p[j][i][1] = Green / White + 0.5;
imout->p[j][i][2] = Blue / White + 0.5;
}
}
}

Threshold_PPM_image: This function converts a PPM image into a binary PPM image by thresholding it against the specified threshold range.
Input: A PPM image imin to be thresholded, a threshold value t, the mean of the training data (mean), and the covariance of the training data (cov).
Output: A PPM image imout, thresholded to the range specified.
Pseudo Code:
Threshold_PPM_image (imin, imout, t, mean, cov) {
for (i = imin->ymin to imin->ymax) {
for (j = imin->xmin to imin->xmax) {
for (k = 0 to 3)
Store the R, G, B values of the pixel in a vector;
Distance = the Mahalanobis distance of this vector, computed using the mean and the covariance;
if (Distance > t)
v = 0;
else
v = 255;
imout->p[j][i][0] = v;
imout->p[j][i][1] = v;
imout->p[j][i][2] = v;
}
}
}

Check_neighbours: This function checks whether the pixels surrounding a given pixel are white and whether they have already been labeled.
Input: A PPM image imin, the x and y position of the pixel, and the label to be assigned (label).
Output: Returns a flag if the pixel has the required neighbours.
Pseudo Code:
Check_neighbours (imin, label, x, y) {
for (i = 1 to label) {
if ((x-1) >= 0 && (y-1) >= 0)
if (imin->p[x-1][y-1][0] == label) return (i);
if ((x+1) < imin->xdim && (y-1) >= 0)
if (imin->p[x+1][y-1][0] == label) return (i);
if ((x-1) >= 0 && (y+1) < imin->ydim)
if (imin->p[x-1][y+1][0] == label) return (i);
if ((x+1) < imin->xdim && (y+1) < imin->ydim)
if (imin->p[x+1][y+1][0] == label) return (i);
}
return (0);
}

Collect_Data_Blob: This function collects the data of a detected blob.
Input: A RAW-format copy of the image imraw, a PPM image imin, the label to be assigned (label), the stack size n_stack, the x and y position of each pixel, and a status variable (state).
Output: The data collected into the blob b.
Pseudo Code:
Collect_Data_Blob (imraw, imin, b, label, x, y, n_stack, status) {
n_stack += 1;
if (n_stack >= MAX_RECURSIONS) { state = 0; return (0); }
loc = (y * imraw->xdim + x) * 4;
imin->p[x][y][0] = label;
b->Red[n] = imraw->p[loc + 2];
b->Green[n] = imraw->p[loc + 1];
b->Blue[n] = imraw->p[loc];
b->x[n] = x;
b->y[n] = y;
b->ell.mean[0] += x;
b->ell.mean[1] += y;
b->ell.cov[0][0] += (x * x);
b->ell.cov[0][1] += (y * x);
b->ell.cov[1][0] = b->ell.cov[0][1];
b->ell.cov[1][1] += (y * y);
b->ell.xmin = minimum of b->ell.xmin, x;
b->ell.ymin = minimum of b->ell.ymin, y;
b->ell.xmax = maximum of b->ell.xmax, x;
b->ell.ymax = maximum of b->ell.ymax, y;
if ((x-1) >= 0 && (y-1) >= 0)
if (imin->p[x-1][y-1][0] == label)
Collect_Data_Blob (imraw, imin, b, label, status, stack, x-1, y-1);
if ((x+1) < imin->xdim && (y-1) >= 0)
if (imin->p[x+1][y-1][0] == label)
Collect_Data_Blob (imraw, imin, b, label, status, stack, x+1, y-1);
if ((x+1) < imin->xdim && (y+1) < imin->ydim)
if (imin->p[x+1][y+1][0] == label)
Collect_Data_Blob (imraw, imin, b, label, status, stack, x+1, y+1);
if ((x-1) >= 0 && (y+1) < imin->ydim)
if (imin->p[x-1][y+1][0] == label)
Collect_Data_Blob (imraw, imin, b, label, status, stack, x-1, y+1);
return (1);
}

Compute_Ellipse_Properties: This function calculates the angle of the ellipse, its mean, its covariance, and its minor and major axes.
Input: An ellipse ell, whose structure is defined below, and the number of points n.
ELLIPSE {
cov[2][2]; // covariance
mean[2];
major_axis[2];
a, b; // major and minor axes
theta; // angle of the ellipse
}
Output: The ellipse ell with its properties computed.
Pseudo Code:
Compute_Ellipse_Properties (ell, n) {
ell->mean[0] /= n;
ell->mean[1] /= n;
x = ell->mean[0]; y = ell->mean[1];
ell->cov[0][0] = n/(n-1) * (ell->cov[0][0]/n - x*x);
ell->cov[0][1] = n/(n-1) * (ell->cov[0][1]/n - x*y);
ell->cov[1][1] = n/(n-1) * (ell->cov[1][1]/n - y*y);
ell->cov[1][0] = ell->cov[0][1];
ell->theta = ellipsoidangle (ell->cov);
ell->a = sqrt (ellipsoiddistance (ell->cov));
ell->b = sqrt (ellipsoiddistance (ell->cov));
}
Detect_blob: This function uses the two functions above to detect the blobs and gather their properties.
Input: A PPM image imin, a RAW image imraw, the minimum blob size min_blob.
Output: The detected blobs b.
Pseudo Code:
Detect_blob (imin, imraw, b, min_blob) {
count = 0;
for (i = 0 to imin->ydim)
for (j = 0 to imin->xdim)
if (imin->p[j][i][0] == 255) {
flag2 = Check_neighbours (imin, j, i, label);
if (flag2 == 0) initialize the blob;
flag = 0;
while (flag == 0) {
Collect_Data_Blob (imraw, imin, j, i, label, 255, b, status, stack);
if (status == 0) remove the data from the stack;
}
if (flag2 == 0) {
if (b[count].n >= min_blob) { count++; label++; }
} else {
Collect_Data_Blob (imraw, imin, b[count].x, b[count].y, label, 255, b, status, stack);
}
if the number of blobs exceeds the specified maximum, exit the program;
Compute_Ellipse_Properties (b.ell, b.n);
}
return (count); (the number of blobs detected)
}

Draw_ellipse: This function draws the boundary of a detected blob on an image with the help of the properties of the ellipse.
Input: A PPM image im, an ellipse ell with known properties, and the boundary color given by the values of r, g, b.
Output: The PPM image im with boundaries drawn for the blobs.
Pseudo Code:
Draw_ellipse (im, ell, r, g, b) {
ct = cos (ell->theta);
st = sin (ell->theta);
for (angle = 0 to 2*pi)
xtmp = ell->a * cos (angle);
ytmp = ell->b * sin (angle);
point.x = (ct * xtmp - st * ytmp + ell->mean[0] + 0.5);
point.y = (st * xtmp + ct * ytmp + ell->mean[1] + 0.5);
Set the color for the point calculated above;
}

Insertion_sort: This is the sorting function used by the feature localization process to sort the array containing the centers of the control points.
Input: An array A containing the centers of the control points as structures, and the size of the array, N.
Output: The sorted array A.
Pseudo Code:
Insertion_Sort (A, N) {
for (i = 1 to N) {
temp = A[i];
for (j = i; j > 0 && A[j-1].x > temp.x; j--)
A[j] = A[j-1];
A[j] = temp;
}
}

Main function for calibration and feature extraction
Main () {
PPM im, tmp1, tmp2, tmp3;
RAW im_raw;
BLOB blob[Maximum_no_blobs];
thresh = default threshold;
filter_size = default filter size;
Read_ppm_file ("Image_file.ppm", im, x_dim, y_dim, n_header);
Define_smoothing_filter (filt, filter_size);
Filter (im, tmp1, filt);
mean = calculate the mean of the training data;
covariance = calculate the covariance of the training data;
Threshold_PPM_image (tmp1, tmp2, thresh, mean, covariance);
for (i = 0 to Max Blob size) { allocate memory for the blob; }
number_of_blobs = Detect_blob (im_raw, tmp2, blobs, minimum blob size);
A[] = sort the blobs by color using the Mahalanobis distance and put them in an array;
Insertion_Sort (A, number of control points in each color);
data[] = the 3D world coordinates of each control point;
Calibration (data, points, par, pos, res);
(par holds the intrinsic parameters of the camera, pos the extrinsic parameters, and res the residual error)
}
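The color test at the heart of Threshold_PPM_image can be sketched concretely. The following Python fragment is an illustration, not the thesis code: the mean, covariance, and threshold are hypothetical, and the squared Mahalanobis distance is compared against t (t = 9 corresponds to roughly three standard deviations).

```python
# Sketch of the Threshold_PPM_image decision: a pixel's RGB vector is kept
# (255) when its Mahalanobis distance to the trained colour model is within
# the threshold t, and suppressed (0) otherwise. The colour model below is a
# hypothetical "blue dot" model with a diagonal covariance; the squared
# distance is used here.

def mahalanobis_sq(v, mean, cov_inv):
    """Squared Mahalanobis distance (v - mean)' * cov_inv * (v - mean)."""
    d = [vi - mi for vi, mi in zip(v, mean)]
    return sum(d[i] * sum(cov_inv[i][j] * d[j] for j in range(3))
               for i in range(3))

mean = [40.0, 60.0, 180.0]          # hypothetical training-data mean (R, G, B)
cov_inv = [[1 / 100.0, 0, 0],       # inverse of a hypothetical diagonal
           [0, 1 / 100.0, 0],       # training covariance
           [0, 0, 1 / 400.0]]

def threshold(pixels, t=9.0):
    """Binary mask: 255 where the pixel matches the colour model, else 0."""
    return [255 if mahalanobis_sq(p, mean, cov_inv) < t else 0 for p in pixels]

mask = threshold([[42, 58, 175], [200, 30, 40]])   # blue-ish vs red-ish pixel
```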
LIST OF REFERENCES

Abdel-Aziz, Y.I. and Karara, H.M., "Direct Linear Transformation into Object Space Coordinates in Close-Range Photogrammetry," Proceedings of the Symposium on Close-Range Photogrammetry, University of Illinois at Urbana-Champaign, Urbana, pp. 1-18, 1971.

Betke, M. and Gurvits, L., "Mobile Robot Localization Using Landmarks," International Conference on Intelligent Robots and Systems, Munich, Germany, pp. 135-142, September 1994.

Borenstein, J., Everett, H.R., and Feng, L., Navigating Mobile Robots, Wellesley, MA, A. K. Peters, Ltd., 1996.

Bougnoux, S., "From Projective to Euclidean Space under Any Practical Situation, a Criticism of Self-Calibration," 6th International Conference on Computer Vision, pp. 790-796, 1998.

Deveza, R., Thiel, D., Russell, R.A., and Mackay-Sim, A., "Odour Sensing for Robot Guidance," The International Journal of Robotics Research, Vol. 13, No. 3, pp. 232-239, June 1994.

Faig, W., "Calibration of Close-Range Photogrammetry Systems: Mathematical Formulation," Photogrammetric Engineering and Remote Sensing, Vol. 41, pp. 1479-1486, 1975.

Heikkila, J., "Geometric Camera Calibration Using Circular Control Points," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 10, pp. 1066-1077, 2000.

Jenkin, M., Milios, E., Jasiobedzki, P., Bains, N., and Tran, K., "Global Navigation for ARK," Proceedings of the 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems, Yokohama, Japan, pp. 2165-2171, 1993.

Kleeman, L. and Russell, R., "Thermal Path Following Robot Vehicle: Sensor Design and Motion Control," Proceedings of the 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems, Yokohama, Japan, pp. 1319-1323, July 1993.

Matsuda, T. and Yoshikawa, E., "Z-shaped Position Correcting Landmark for AGVs," Proceedings of the 28th SICE Annual Conference, pp. 425-426, July 1989.
Mesaki, Y. and Masuda, I., "A New Mobile Robot Guidance System Using Optical Reflectors," Proceedings of the 1992 IEEE/RSJ International Conference on Intelligent Robots and Systems, Raleigh, NC, pp. 628-635, July 1992.

Premi, K.S. and Besant, C.B., "A Review of Various Vehicle Guidance Techniques That Can Be Used by Mobile Robots or AGVs," 2nd International Conference on Automated Guided Vehicle Systems, Stuttgart, Germany, 1987.

Duda, R.O., Hart, P.E., and Stork, D.G., Pattern Classification, 2nd ed., New York, John Wiley & Sons, 2001.

Russell, R.A., "A Practical Demonstration of the Application of Olfactory Sensing to Robot Navigation," Proceedings of the International Advanced Robotics Program (IARP), Sydney, Australia, pp. 35-43, May 1995.

Sobel, I., "On Calibrating Computer Controlled Cameras for Perceiving 3-D Scenes," Artificial Intelligence, Vol. 5, pp. 185-188, 1974.

Tsai, R.Y., "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses," IEEE Journal of Robotics and Automation, Vol. 3, No. 4, pp. 323-344, 1987.

Tsumura, T., "Survey of Automated Guided Vehicle in Japanese Factory," Proceedings of the IEEE International Conference on Robotics and Automation, San Francisco, CA, pp. 1329-1334, 1986.

Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P., Numerical Recipes in C, 2nd ed., Cambridge University Press, Cambridge, 2002.

Yakimovsky, Y. and Cunningham, R., "A System for Extracting Three-Dimensional Measurements from a Stereo Pair of TV Cameras," Computer Graphics and Image Processing, Vol. 7, pp. 195-210, 1978.
BIOGRAPHICAL SKETCH

Sai Krishna Yeluri was born on December 11, 1978, in Tirupati, India. He received a bachelor's degree in mechanical engineering from Sri Venkateswara University in July 2000. He joined the University of Florida in August 2000 to pursue a master's degree in mechanical engineering. Here he dabbled in the sciences of computers and electronics. Now greatly involved in the field of computer vision, he minored in electrical engineering and completed his master's degree in May 2003. From here, he plans to continue into the industrial world of computer vision.