Using Digital Video Analysis to Monitor Driver Behavior at Intersections
Final Report
November 2006
Sponsored by the Iowa Department of Transportation (CTRE Project 05-214)
Iowa State University’s Center for Transportation Research and Education is the umbrella organization for the following centers and programs: Bridge Engineering Center • Center for Weather Impacts on Mobility and Safety • Construction Management & Technology • Iowa Local Technical Assistance Program • Iowa Traffic Safety Data Service
About CTRE/ISU The mission of the Center for Transportation Research and Education (CTRE) at Iowa State University is to develop and implement innovative methods, materials, and technologies for improving transportation efficiency, safety, and reliability while improving the learning environment of students, faculty, and staff in transportation-related fields.
Technical Report Documentation Page
1. Report No.: CTRE Project 05-214
2. Government Accession No.:
3. Recipient’s Catalog No.:
4. Title and Subtitle: Using Digital Video Analysis to Monitor Driver Behavior at Intersections
5. Report Date: November 2006
6. Performing Organization Code:
7. Author(s): Derrick Parkhurst
8. Performing Organization Report No.:
USING DIGITAL VIDEO ANALYSIS TO MONITOR DRIVER BEHAVIOR AT INTERSECTIONS Final Report November 2006 Principal Investigator Derrick Parkhurst Assistant Professor, Department of Psychology The Human Computer Interaction Program Iowa State University Preparation of this report was financed in part through funds provided by the Iowa Department of Transportation through its research management agreement with the Center for Transportation Research and Education, CTRE Project 05-214.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
1. INTRODUCTION
2. DESIGN OF A VIDEO RECORDING STATION (VRS)
2.1 Design Constraints
LIST OF FIGURES
Figure 2.1. Video recording system prototype
Figure 3.1. Calibration markers placed 10 feet apart
Figure 5.1. A map showing four intersections chosen for study on Bissell Road
Figure 5.2. A collision diagram (1995–2005) of U.S. 69 and 190th and a corresponding video frame
Figure 6.36. Object identification parameter setting dialog
Figure 6.37. Identify objects in video
Figure 6.38. Review results dialog following object identification
Figure 6.39. Correct object shape in video
ACKNOWLEDGMENTS The authors would like to thank the Iowa Department of Transportation for sponsoring this research.
1. INTRODUCTION Commercially available instruments for road-side data collection take highly limited measurements, require extensive manual input, or are too expensive for widespread use. However, inexpensive computer vision techniques for digital video analysis can be applied to automate the monitoring of driver, vehicle, and pedestrian behaviors. These techniques can measure safety-related variables that cannot be easily measured using existing sensors.
Some of these constraints were that the stations should be capable of operating both during the day and at night, that the stations should be mobile, and that the stations should be weather resistant and capable of running across a wide range of temperatures. Furthermore, we focused on making the production process scalable; thus, we used only commonly available parts that could be easily assembled with minimal work wherever possible.
The most constraining aspect of the design was finding a power source capable of supporting a running time of one week that was both low cost and mobile. While a solar power solution was considered feasible and could allow the system to run for more than one week, that approach suffers from a number of limitations. First, it would unacceptably increase the complexity and price of the system.
the center of the board. It has a PCI slot for the capture card and is one of the smallest form factors available.
Memory—AMPO 512MB 240-Pin DDR2 SDRAM 533 (PC2 4200), $40.00
This is the amount and type of random access memory (RAM) necessary for this motherboard and application. Although less RAM would minimize cost, this amount allows video to be buffered in memory, which minimizes the time the hard drive must spin and reduces power consumption.
Battery Charger—BatteryMinder 12117, $50.00
This charger has a large enough capacity while also providing functions that can enhance battery life, such as desulfation and over/under charging protection. It has a visual monitor that indicates the charge state and quick-connect cords.
Battery Cases—4 Group 27 Battery Boxes, $45.00
These vented boxes are designed to provide good protection from the elements and also provide mounting straps.
possible to 10 feet. Only a single video frame is needed with the un-occluded markers for the calibration. The markers need not stay in place for the entire duration of the video. The calibration process can be completed at any time during the video recording as long as the camera remains fixed throughout. Once the calibration process is completed, a conversion factor between distance in pixels in the camera image and distance along the calibration plane in the scene can be determined.
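As an illustration, the conversion factor might be computed as in the following minimal sketch, assuming the pixel coordinates of two markers a known 10 feet apart have been identified in the calibration frame (the coordinate values shown are hypothetical):

```python
import math

# Pixel coordinates of two calibration markers in the un-occluded frame
# (hypothetical values, identified by the user during calibration).
marker_a = (212.0, 340.0)
marker_b = (389.0, 352.0)
MARKER_SPACING_FT = 10.0  # markers are placed 10 feet apart

# Conversion factor: feet per pixel along the calibration plane.
pixel_dist = math.hypot(marker_b[0] - marker_a[0], marker_b[1] - marker_a[1])
feet_per_pixel = MARKER_SPACING_FT / pixel_dist

def pixels_to_feet(d_pixels: float) -> float:
    """Convert a displacement in image pixels to feet along the scene's calibration plane."""
    return d_pixels * feet_per_pixel
```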
The automatic background selection algorithm attempts to find an image that contains no objects of interest. It can only guarantee to find images without moving objects; for example, the automatic algorithm may select scenes where vehicles are temporarily stopped. In that case, the user can run the automatic algorithm and then manually select a different background image if the automatic selection is inappropriate. Each image is converted to gray scale prior to point-wise comparison between the background image (B) and the scene image (I). The result is an intensity difference image (D).
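A minimal NumPy sketch of this comparison is shown below; the gray-scale weighting and the threshold value are assumptions, since the report does not specify them:

```python
import numpy as np

def difference_image(background_rgb: np.ndarray, scene_rgb: np.ndarray,
                     threshold: float = 30.0) -> np.ndarray:
    """Convert both images to gray scale, compute the point-wise intensity
    difference D = |I - B|, and return a binary foreground mask."""
    # Luminance-weighted gray-scale conversion (assumed weighting).
    weights = np.array([0.299, 0.587, 0.114])
    B = background_rgb.astype(float) @ weights
    I = scene_rgb.astype(float) @ weights
    D = np.abs(I - B)        # intensity difference image
    return D > threshold     # True where a moving object may be present
```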
4.2 Object Identification and Tracking
The output of the object identification stage is an object track list. Each object that has been identified across multiple frames is assigned a unique object track. Each object track is thus a list of references to the relevant object entries in the object detection list. In order to link together objects across frames, each object in each frame of the recorded digital video is compared to each object in the previous frame.
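The frame-to-frame comparison could be sketched as a greedy nearest-neighbor association, as below; the distance threshold and data layout are illustrative assumptions rather than the system's actual implementation:

```python
import math

def link_tracks(frames, max_dist=40.0):
    """frames: list of per-frame detection lists; each detection is a dict
    with a 'center' (x, y) in pixels. Assigns a 'track' id by matching each
    detection to the nearest detection in the previous frame."""
    next_id = 0
    for t, detections in enumerate(frames):
        for det in detections:
            best, best_d = None, max_dist
            if t > 0:
                for prev in frames[t - 1]:
                    d = math.dist(det["center"], prev["center"])
                    if d < best_d:
                        best, best_d = prev, d
            if best is None:
                det["track"] = next_id  # no match: start a new track
                next_id += 1
            else:
                det["track"] = best["track"]  # continue the existing track
    return frames
```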
The first step is to detect occlusions. It is assumed that each object that enters the scene also exits the scene. If this assumption holds, any object track that either ends or begins in the interior of the frame rather than at its edges can be assumed to have been occluded. Thus, the positions of the first and last objects of each object track are examined. If the first object of the track is near the image edge, the track is classified as having entered the scene.
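The entry/exit test might be expressed as in the following sketch, where the edge margin is an assumed parameter:

```python
def near_edge(point, width, height, margin=15):
    """True if an object center lies within `margin` pixels of the frame edge."""
    x, y = point
    return (x < margin or y < margin or
            x > width - margin or y > height - margin)

def classify_track(track, width, height, margin=15):
    """A track whose first or last observation lies in the frame interior
    is assumed to have been occluded at that end."""
    entered = near_edge(track[0]["center"], width, height, margin)
    exited = near_edge(track[-1]["center"], width, height, margin)
    return {"occluded_at_start": not entered, "occluded_at_end": not exited}
```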
Figure 5.1. A map showing four intersections chosen for study on Bissell Road
These intersections and camera positions were chosen in order to test the system on a variety of backgrounds. Each background represents a different type of potential source of false alarms in the object detection process. The first urban intersection has a large tree in the background. The second and third urban intersections contain a view of the road crossing the lane of interest. The last urban intersection contains a large window that reflects light from passing vehicles.
A typical velocity profile of a vehicle that comes to a complete stop at an intersection is shown in Figure 5.3. Given the standard camera–intersection configuration, most vehicles are already decelerating when entering the system’s field of view. However, it is typical for vehicles that come to a complete stop to exhibit a second stage of much more rapid deceleration as they approach the stop sign. Coming to a complete stop at any of the studied intersections is a rare event.
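Such a profile can be derived from a track by finite differencing of position, as in the sketch below; the frame rate and calibration factor are example values, not the report's:

```python
import numpy as np

FPS = 29.97             # NTSC DV frame rate (assumed)
FEET_PER_PIXEL = 0.056  # from the calibration step (example value)

def velocity_profile_mph(x_pixels: np.ndarray) -> np.ndarray:
    """Per-frame velocity (mph) from a track's x positions in pixels."""
    feet_per_frame = np.diff(x_pixels) * FEET_PER_PIXEL
    feet_per_sec = feet_per_frame * FPS
    return feet_per_sec * 3600.0 / 5280.0  # ft/s -> miles per hour

def came_to_complete_stop(v_mph: np.ndarray, tol=0.5) -> bool:
    """A complete stop registers as velocity near zero for at least one frame."""
    return bool(np.any(np.abs(v_mph) < tol))
```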
Figure 5.4. Velocity profile of a vehicle failing to stop
As can be seen in Figure 5.5, vehicles in the far lane, traveling in the opposite direction (in the negative direction), can also be tracked. Because the camera view does not contain the stop sign on the right-hand side of the intersection, only acceleration away from the intersection is tracked.
Figure 5.5. Velocity profile of a vehicle in the far lane
Figure 5.6. Average velocity histogram for all object tracks
The object tracks with an absolute average velocity of greater than 5 miles per hour are shown in Figure 5.7. It is clear that the vehicles with an average absolute velocity greater than 12 miles per hour are traveling in the far lane, given the shape of the velocity profiles. Note that the speed limit on this road is 15 miles per hour.
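Separating slow, near-lane, and far-lane tracks by average absolute velocity might look like the following sketch, using the 5 and 12 mile-per-hour cutoffs noted above:

```python
import numpy as np

def split_tracks_by_speed(track_velocities, moving_cutoff=5.0, far_lane_cutoff=12.0):
    """track_velocities: list of per-track velocity arrays (mph, signed).
    Returns index lists for (stopped/slow, near-lane, far-lane) tracks."""
    slow, near, far = [], [], []
    for i, v in enumerate(track_velocities):
        avg = abs(np.mean(v))  # absolute average velocity of the track
        if avg <= moving_cutoff:
            slow.append(i)
        elif avg <= far_lane_cutoff:
            near.append(i)
        else:
            far.append(i)
    return slow, near, far
```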
The results of the video analysis demonstrate the ability of the system to provide data that can potentially be useful for the understanding of driver behavior. The primary advantage of this approach is the ease with which measurements can be made. The system requires only a single video camera and a simple calibration procedure. Another advantage of the system is that a large amount of data can be collected relative to the effort and expense involved.
intersection. Work is also progressing on 3D visualization of the intersection. The latest development version of TDAA can be obtained using Subversion software (http://www.sourceforge.net/subversion) from our Subversion server (svn://hcvl.hci.iastate.edu/DOT/TDAA). The remainder of this section walks through the graphical dialog boxes associated with video capture and image extraction, image processing, and data visualization.
Figure 6.2. Capture options dialog
A Project file selection dialog (see Figure 6.3) will allow the selection of an existing Project file or the specification of a new Project file. All Project files must be stored within the ‘TDAA/FILES/PROJECT’ directory for correct operation of the software.
Figure 6.3. Choose project dialog
Once the project has been selected or created, the name of the project will be displayed in the ‘Current Project’ textbox, as shown in Figure 6.4 for the Project file ‘example.tdaa’.
Figure 6.4. Choose dialog—project selected
Videos can be added to the project by clicking on the Add a Video to the Current Project button. This will bring up the Video Source Panel shown in Figure 6.5.
Figure 6.5. Video source selection dialog
Video can be captured from a Digital Video camera, taken directly from a Digital Video file stored on the computer, or taken from a series of images stored on the computer.
Figure 6.6. Video source from Digital Video (DV) file selected
The Digital Video File Input panel will then appear, as shown in Figure 6.7.
Figure 6.7. Digital video file dialog
The Choose Digital Video File button will then bring up a DV file selection dialog box such as that shown in Figure 6.8.
folder by default. However, DV files located elsewhere on the computer may also be chosen. All DV files must have the ‘.dv’ extension for proper processing.
Figure 6.8. DV file selection dialog
Once a DV file has been selected, the filename and the full path will be displayed in the Digital File Input panel. As can be seen in Figure 6.9, the Frame Extraction panel will also become visible.
Figure 6.9.
Figure 6.10. Starting frame extraction
Clicking on the Check Frame Count button will query the frame extraction process for a progress update. As can be seen in Figure 6.11, the total number of extracted frames will be displayed and the last frame extracted from the video will be displayed on the right-hand side of the Frame Extraction panel. All frames will be extracted unless the Stop Processing button is pressed. Only those frames extracted prior to halting the process will be included in the Project.
To capture video from images already extracted from a Digital Video file, the From Images radio button and the Select button must be clicked, as shown in Figure 6.12.
Figure 6.12. Video source from individual images previously extracted from DV
The Load Existing Images panel will then appear, as shown in Figure 6.13.
Figure 6.13.
In order to load the images into the Project, the first file in the image sequence must be selected by clicking the Load Starting Image button. The Choose Starting File dialog, shown in Figure 6.14, will appear and allow the specification of the location and name of this file. Images may have any prefix, but must contain a numerical suffix that indicates the image sequence. The suffix must be zero padded to 5 digits or 9 digits for accurate loading. Images must be in JPEG format.
Figure 6.14.
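The naming convention can be illustrated with a short sketch that walks an image sequence from a chosen starting file; the helper below is a hypothetical illustration, not part of TDAA:

```python
from pathlib import Path

def sequence_files(start_file: str):
    """Yield successive files of a zero-padded JPEG image sequence,
    e.g. frame_00001.jpg, frame_00002.jpg, ... (5- or 9-digit suffix)."""
    start = Path(start_file)
    stem = start.stem
    # Count the trailing digits of the file stem (the numerical suffix).
    digits = len(stem) - len(stem.rstrip("0123456789"))
    if digits not in (5, 9):
        raise ValueError("suffix must be zero padded to 5 or 9 digits")
    prefix, n = stem[:-digits], int(stem[-digits:])
    while True:
        candidate = start.with_name(f"{prefix}{n:0{digits}d}{start.suffix}")
        if not candidate.exists():
            break  # end of the sequence
        yield candidate
        n += 1
```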
To capture video directly from a Digital Video camera, the From Camera radio button and the Select button must be clicked, as shown in Figure 6.16.
Figure 6.16. Video source from DV acquired directly from DV-capable camera
The Camera Setup panels will then appear and ask that the digital camera be connected, rewound, and started, as shown in Figures 6.17–6.19. Because the exact procedure required to connect a camera to the computer varies, the camera instructions should be consulted.
Figure 6.17.
Figure 6.18. Find Starting Frame panel—rewind or fast forward camera
Figure 6.19. Capture Video panel—start the video data transfer
The Digital Video Capture Progress dialog box will appear and extraction of the video will begin (see Figure 6.20). Progress can be monitored on the camera itself. Once the end of the video has been reached, the Stop button must be clicked. After a short delay, the capture program will shut down, and at that point the Exit button can be clicked to close the dialog box.
Figure 6.20. Digital video capture progress dialog
The Frame Extraction panel will appear, as shown in Figure 6.21. Individual frames from the captured video can then be extracted in the same way as described above for the From File video source (see Figures 6.10 and 6.11).
Figure 6.21.
Once at least one video has been added to the project, the video(s) in a project can be played by clicking the Play Video button shown in Figure 6.22. Note that video acquired using the From Images method described above currently cannot be played.
Figure 6.22. Playing a selected video from the Capture Options dialog
The external program playdv is executed to play the videos in the project, as shown in Figure 6.23. Video replay can be halted by closing the application.
Figure 6.23. Video replay dialog
Figure 6.24. Data analysis dialog
The first processing step is object detection. In the Object Detection panel, the detection parameters can be set by clicking the Set Options button. The Detection Options dialog box will appear and parameters can be set by entering new numbers into the textbox (Figure 6.25). By default, the minimum vehicle area and minimum object area are set to ignore pedestrians. Lower the minimum object area to detect pedestrians as well as vehicles.
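The role of these area parameters can be illustrated with a small filtering sketch; the parameter values and detection layout are illustrative assumptions:

```python
def filter_detections(detections, min_object_area=400, min_vehicle_area=2000):
    """Drop detections below the minimum object area; label the rest.
    Lowering min_object_area admits pedestrian-sized objects."""
    kept = []
    for det in detections:
        area = det["width"] * det["height"]
        if area < min_object_area:
            continue  # too small: treated as noise
        det["kind"] = "vehicle" if area >= min_vehicle_area else "pedestrian"
        kept.append(det)
    return kept
```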
Next, a background image must be selected. Clicking the Auto-Detect Background button will automatically select a background. Some manual adjustment may be necessary.
Figure 6.26. Background selection dialog
Objects will not be detected beyond the boundaries. The boundaries help eliminate false alarms and speed up the image processing. As can be seen in Figure 6.27, clicking the Upper Boundary button will bring up a crosshair. A click on the image itself will set the upper image boundary.
Figure 6.27.
Figure 6.28. Upper-most area shaded in gray
A similar process is followed for the lower boundary. The final result is shown in Figure 6.29. Only a small strip of the original image is likely to contain vehicles, and the analysis is limited to that region.
Figure 6.29.
Once an image that contains no objects of interest is found, the background can be set using the Set Background button. The text “Current Background Image” will be displayed on the image that is selected as the background image as confirmation of the selection (see Figure 6.30).
Figure 6.30. Selection of background image without vehicles
Once a background image is set, the starting image must be set using the Set Start Frame button.
Once a starting image is set, the ending image must be set using the Set End Frame button. The text “Ending Image” will be displayed on the image that is selected as the ending image as confirmation of the selection (see Figure 6.32). The ending image can be the same as the background image, or the ending image can be after the background image. However, the ending frame cannot be before the background image. Clicking the Done button will close the dialog.
Figure 6.32.
Object detection is a very time-consuming process and can take many hours depending on the length and number of videos to be processed. Once completed, the textbox will provide a summary of the results (see Figure 6.34). The results can be reviewed by pressing the Review Results button.
Figure 6.34. Object detection completed
The Review Results dialog allows inspection of each frame (see Figure 6.35). Detected objects are enclosed by a red box.
The second processing step is object track identification. In the Object Identification panel, the identification parameters can be set by clicking the Set Options button. The Identification Options dialog box will appear and parameters can be set by entering new numbers into the textbox (Figure 6.36). The default settings work well for the standard camera–intersection configuration.
Figure 6.36. Object identification parameter setting dialog
The Review Results dialog allows inspection of each frame (see Figure 6.38). Detected objects are enclosed by a red box. A green number indicating the object track is printed at the center of the box. The object track number should remain the same across multiple frames for the same object. When finished reviewing the results, click the Done button to close the dialog box.
Figure 6.38. Review results dialog following object identification
The third processing step is object track filtering.
The results can be reviewed by pressing the Review Results button in the Shape Correction panel. As can be seen in the left panel of Figure 6.40, the object box has been corrected for occlusion by the edge of the frame. In the right panel of Figure 6.40, the uncorrected center is biased to the right of the true center.
Figure 6.40. Review results dialog with and without shape correction
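One plausible form of such a correction is sketched below, under the assumption that boxes clipped by the frame edge are extended to the track's median width before the center is recomputed; the report does not detail the exact method:

```python
import statistics

def correct_clipped_boxes(track, frame_width, margin=2):
    """track: list of boxes (x, y, w, h) in pixels. Boxes clipped by the left
    or right frame edge are extended to the track's median width so the
    recomputed center is no longer biased toward the frame interior."""
    median_w = statistics.median(w for (_, _, w, _) in track)
    corrected = []
    for (x, y, w, h) in track:
        if x <= margin:                      # clipped on the left edge
            x, w = x - (median_w - w), median_w
        elif x + w >= frame_width - margin:  # clipped on the right edge
            w = median_w
        corrected.append((x, y, w, h))
    return corrected
```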
The final processing step is occlusion detection and correction.
6.3 Data Visualization
The primary dialog box for data visualization is shown in Figure 6.42. The first step is to choose an existing project file by clicking the Choose button in the Current Project Panel. Once loaded, the first video frame and the velocity profile of the first object will appear in the Object Display panel and the Graph Display panel, respectively (see Figure 6.43).
Figure 6.42. Data visualization dialog
Figure 6.43.
The tracked object can be inspected one frame at a time by sliding the Frame toolbar (see Figure 6.44). The Object Display panel and Graph Display panel will automatically update. The Object Display panel shows the position of the tracked object using a red box. The Graph Display panel plots all of the object data as a blue curve. A red dot is also plotted corresponding to the observed value for the particular frame visualized in the Object Display panel.
When multiple objects are visible in the Object Display panel, any of the objects can be selected by clicking the Select Object button. A particular object can then be selected by positioning the crosshairs on the object within the Object Display panel and clicking on the object (see Figure 6.46).
Figure 6.46. Object selection using crosshairs
A variety of plot types can be displayed on the Graph Display panel.
The plots in the Graph Display panel can show either filtered and corrected or unfiltered and uncorrected data, depending on the Use Equalized Length Data button state (see Figure 6.48).
Figure 6.48. Visualization with and without shape correction and filtering
Histograms of data can be plotted in the Graph Display panel using the options available in the Histogram Options panel (see Figure 6.49). The Area, Time, and Range slider bars set minimum values that filter out data from the histogram.
Multiple object tracks can be visualized simultaneously by clicking the Select Histogram Range button. Crosshairs will appear, and two clicks in the Graph Display panel will set the lower and upper limits of the tracks to be displayed. The Single Variable Plot panel options will be applied to plot each of the tracks, resulting in a joint visualization of vehicle tracks (see Figure 6.50).
Figure 6.50. Multiple vehicle graph based on selected histogram range
A Selected Object toolbar will appear.
Global Positioning System (GPS) coordinates can be entered to automatically obtain aerial images from TerraServer-USA or Google Maps. If the GPS coordinates and orientation of the camera are known, the vehicle positions can be plotted on the aerial images as they pass through the intersection (see Figure 6.52).
Figure 6.52. Combined camera and aerial view of a rural intersection
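Mapping tracked scene positions onto an aerial image can be sketched as a planar rotation plus offset from the camera's GPS location; every conversion detail below (heading convention, feet-per-degree approximation) is an assumption for illustration:

```python
import math

FT_PER_DEG_LAT = 364000.0  # approximate feet per degree of latitude

def scene_to_gps(x_ft, y_ft, cam_lat, cam_lon, heading_deg):
    """Map a scene position (feet along/across the calibration plane,
    relative to the camera) to approximate GPS coordinates, given the
    camera's location and heading (degrees clockwise from north)."""
    th = math.radians(heading_deg)
    # Rotate scene coordinates into north/east components.
    north = x_ft * math.cos(th) - y_ft * math.sin(th)
    east = x_ft * math.sin(th) + y_ft * math.cos(th)
    lat = cam_lat + north / FT_PER_DEG_LAT
    lon = cam_lon + east / (FT_PER_DEG_LAT * math.cos(math.radians(cam_lat)))
    return lat, lon
```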
7. CONCLUSIONS AND RECOMMENDATIONS
Commercially available instruments for road-side data collection are significantly limited.
While there is significant risk in funding the development of technology through university collaborations, there is an enormous benefit. All software and hardware designs become available for the Department of Transportation to use without the additional costs levied when purchasing technology from commercial entities. Furthermore, the technology developed can be highly customized to the needs specified by Department of Transportation employees.