Handling Pedestrians in Crosswalks Using Deep Neural Networks in the IARA Autonomous Car

08/02/20 Information Technology

Abstract:

In this paper, we present a subsystem that detects and handles pedestrians in crosswalks using a deep neural network for the Intelligent Autonomous Robotic Automobile (IARA), which relies on a camera and a LIDAR sensor. Crosswalk locations are annotated manually in IARA's map. Pedestrians are detected in images from a camera mounted on the roof of the car using a convolutional neural network (CNN). LIDAR data is then used to estimate the position of each detected pedestrian in IARA's map. If a pedestrian is inside a crosswalk area, the crosswalk is reported as busy; otherwise it is free. The decision on how to act is made by the High-Level Decision Maker subsystem, which controls the car according to the situation at the crosswalk or on the free road. The Pedestrian Handler subsystem was evaluated on IARA, which was driven for several laps around a circuit with different crosswalks. In all situations, the Pedestrian Handler dealt with pedestrians as expected, without any human intervention.

 

Introduction:

In the last few years, autonomous cars have become a popular research topic. One of the most important tasks of an autonomous car is to detect and safely handle obstacles in its environment, because safety is the primary concern in autonomous driving. Obstacles may be static or dynamic. Static obstacles do not move and do not change their position over time; examples are curbs, road delimiters and holes. Dynamic obstacles move and may interfere with the car's trajectory; examples are cars, vans, buses, trucks, motorcycles, cyclists, pedestrians and animals.

As mentioned above, detecting and handling obstacles is a crucial task for any autonomous car. In this paper, we focus on detecting and handling pedestrians in crosswalks. This task requires detecting whether any pedestrian is present, estimating the pedestrian's position, and making the car behave appropriately at the crosswalk according to the situation.

There are different approaches to pedestrian detection. Many of them rely only on camera images, which are favored for their discriminative features; the flaw of this approach is that it does not provide a precise pedestrian position. A few rely on LIDAR point clouds, which give an accurate 3D position; however, LIDAR is expensive hardware, and these approaches offer less discriminative features, which makes them difficult to tune and prone to false positives. Others rely on the fusion of images and LIDAR point clouds, which gives more accurate results. Moreover, most of these techniques were not applied to a real autonomous car, only to simulators. Also, none of them addressed the problem of detecting and handling pedestrians in crosswalks.

In this paper, we propose a subsystem that detects and handles pedestrians in crosswalks with the help of deep neural networks for IARA (Intelligent Autonomous Robotic Automobile). The technique depends on camera and LIDAR data fusion, which allows it to estimate pedestrians' 3D positions accurately. Its computational cost is also lower than that of other techniques.

The Pedestrian Handler has two phases: offline and online. In the offline phase, the positions of crosswalks are manually annotated in IARA's map. In the online phase, pedestrians are detected and handled with the help of a Convolutional Neural Network (CNN). The CNN takes a camera image as input and outputs a bounding box around each pedestrian in real time. The pedestrians' positions are then estimated in the car's map. For this, LIDAR points are transformed to the image reference, and the points that fall inside bounding boxes and hit pedestrians are selected. If a pedestrian point is inside the crosswalk area, the crosswalk is set as busy. Finally, a busy crosswalk message is published to the High-Level Decision Maker subsystem. This subsystem establishes a goal state in the path, according to the crosswalk condition, which is used to generate a trajectory that leads the vehicle properly through the crosswalk area, reducing speed or stopping at busy crosswalks.

Here, the CNN is You Only Look Once v2 (YOLOv2), a state-of-the-art object detector. YOLOv2 is trained on the ImageNet and Common Objects in Context (COCO) datasets. Compared with detectors based on R-CNN, ResNet and SSD, YOLOv2 is reported to be 2 to 10 times faster while giving better detection accuracy. Moreover, it can be run at a variety of image sizes to provide a smooth tradeoff between speed and accuracy.

To evaluate the performance of the Pedestrian Handler, IARA was driven autonomously along a circuit of about 900 m in the ring road of the main campus of the Federal University of Espírito Santo (Universidade Federal do Espírito Santo, UFES). IARA was driven for 22 laps in the test circuit and made 88 passages through crosswalks, of which 37 were through busy crosswalks. In all passages through crosswalks, the Pedestrian Handler dealt with pedestrians as expected, i.e., without any human intervention.

Related Work:

Several approaches for pedestrian detection were previously proposed in the literature. Most of them rely solely on images because of their discriminative features. Hilario et al. propose an algorithm to detect and track pedestrians using stereo vision and active contour models for autonomous vehicles. Benenson et al. present two algorithms to increase pedestrian detection speed by better handling scales in monocular images and better exploiting the depth information in stereo images, which run at 50 fps. Angelova et al. describe a pedestrian detection algorithm that cascades deep neural networks and runs in real time at 15 fps. Recently, Molchanov et al. use the YOLO deep neural network to detect pedestrians in the context of video surveillance. These solutions benefit from discriminative image features, but do not provide a precise 3D position. Most of them could be fused with stereo algorithms in order to improve 3D position estimates. However, the maximum depth encoded by images is limited by the stereo camera baseline, and cameras are more prone to noise than LIDAR.

Some approaches for pedestrian detection rely only on dense LIDAR point clouds because of their precise 3D position estimates. Premebida et al. propose techniques to exploit 2D features from LIDAR and compare different classifiers to detect pedestrians using only LIDAR data. Wang et al. describe a method that clusters 3D LIDAR points; candidate clusters are then evaluated by a Support Vector Machine (SVM) classifier. These solutions obtain a precise 3D position, but are hampered by less discriminative features, which makes them tricky to tune and prone to false positives. In addition, they rely on expensive hardware.

Some approaches for pedestrian detection rely on fused image and LIDAR point clouds in order to improve detection accuracy. Szarvas et al. propose a method that, unlike ours, first detects pedestrians in the LIDAR point cloud and then projects regions of interest (most likely occupied by pedestrians) onto the image plane, where a CNN confirms whether the given region is occupied by a pedestrian or not. Similarly to Szarvas et al., Premebida et al. present a method that detects, tracks and classifies pedestrians and vehicles in a 2D LIDAR space; projects regions of interest onto the image; and uses an AdaBoost classifier to improve classification performance. García et al. describe a method to detect vehicles and pedestrians in road environments, in which classification is performed using camera and 2D LIDAR data independently, which are later fused using a Kalman filter. Dou et al. describe a method that detects pedestrians in the image using Faster R-CNN and eliminates false positives by fusing the image with the LIDAR point cloud. Although these methods are similar to ours, most of them were not evaluated on a real vehicle, but on recorded sensor data only. Moreover, none of them addressed the problem of detecting and handling pedestrians in crosswalks.

IARA’s Architecture:

Fig. Block diagram of IARA's main software subsystems

The Intelligent Autonomous Robotic Automobile (IARA) is a robotic vehicle platform based on a Ford Escape Hybrid, which was adapted to enable electric actuation of steering, throttle and brake; provide the vehicle odometry; and supply power for computers and sensors. The actuation on the steering wheel is made by the electric power steering motor, the actuation on the throttle is made through the pedal wire and the actuation on the brake is made by a hydraulic linear actuator system connected to the brake pedal. The power is taken from the hybrid system battery and converted from 330 V DC to 120 V AC. IARA has one Velodyne HDL 32-E LIDAR; one Trimble RTK GPS; one Xsens MTi IMU; one Bumblebee XB3 stereo camera; and one Dell Precision R5500 computer with two six-core 3.4 GHz Xeon X5690 processors and one Nvidia Quadro.

  1. Mapper:

The Mapper builds an occupancy grid map that represents the obstacles in the environment. Each map cell stores the probability of the corresponding area being occupied by an obstacle. The Mapper operates in offline and online modes. In offline mode, it takes as input data from various sensors (odometer, LIDAR, IMU, etc.), recorded while the car is driven through different areas. In online mode, it takes IARA's state and LIDAR data as input and updates the values of the map cells.
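As a rough illustration of how a map cell could store and update its occupancy probability, the sketch below uses the standard log-odds Bayesian update for occupancy grids. The sensor-model probabilities (0.7 for a hit, 0.4 for a miss) and all function names are assumptions made for this example, not values or code taken from IARA.

```python
import math


def prob_to_log_odds(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def log_odds_to_prob(l: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))


def update_cell(prior: float, hit: bool,
                p_hit: float = 0.7, p_miss: float = 0.4) -> float:
    """Return the occupancy probability of one cell after one observation.

    p_hit and p_miss are assumed inverse sensor model values.
    """
    measurement = p_hit if hit else p_miss
    # Bayesian update in log-odds form; a 0.5 map prior contributes zero.
    updated = prob_to_log_odds(prior) + prob_to_log_odds(measurement)
    return log_odds_to_prob(updated)


# A cell that starts unknown (0.5) and is hit by three LIDAR returns.
cell = 0.5
for _ in range(3):
    cell = update_cell(cell, hit=True)
```

Repeated hits push the cell toward 1 and repeated misses push it toward 0, which is the behavior expected from a cell that stores an occupancy probability.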

  1. Localizer:

The Localizer estimates IARA's state relative to the map. It is initialized with the position of the car given by GPS. At each subsequent step, it uses IARA's previous state, the map and the current data to match the car's position against the offline map, using a particle filter algorithm.
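To make the particle filter idea more concrete, here is a minimal one-dimensional sketch of the predict, weight and resample cycle. It is a toy analogue under stated assumptions: the real Localizer matches map data rather than a scalar position measurement, and the noise values and function name are hypothetical.

```python
import math
import random


def particle_filter_step(particles, control, measurement,
                         motion_noise, meas_noise, rng):
    """One predict-weight-resample step for a 1D position estimate."""
    # Predict: move each particle by the control input plus motion noise.
    predicted = [p + control + rng.gauss(0.0, motion_noise) for p in particles]
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_noise) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        weights = [1.0] * len(predicted)  # fall back to uniform weights
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(0.0, 20.0) for _ in range(500)]  # unknown start
true_position = 5.0
for _ in range(10):
    true_position += 1.0  # the vehicle advances 1 m per step
    particles = particle_filter_step(particles, 1.0, true_position,
                                     motion_noise=0.1, meas_noise=0.5, rng=rng)
estimate = sum(particles) / len(particles)
```

After a few steps the particles concentrate around the true position, so their mean can be used as the state estimate.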

  1. Path Planner:

This subsystem builds a path. It receives as input IARA's current state and a Road Definition Data File (RDDF). The RDDF is composed of a sequence of IARA's states, which were stored while IARA was driven by a human driver along a path of interest.
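A simple way to picture this step is to treat the RDDF as a list of recorded states and to take the stretch that starts at the state closest to the car. The sketch below assumes a hypothetical `State` record (x, y, heading, velocity) and a fixed number of states as the planning horizon; neither is specified in the text.

```python
import math
from typing import List, NamedTuple


class State(NamedTuple):
    """Hypothetical vehicle state: position (m), heading (rad), velocity (m/s)."""
    x: float
    y: float
    theta: float
    v: float


def build_path(rddf: List[State], current: State, horizon: int) -> List[State]:
    """Return the next `horizon` recorded states, starting at the nearest one."""
    nearest = min(
        range(len(rddf)),
        key=lambda i: math.hypot(rddf[i].x - current.x, rddf[i].y - current.y),
    )
    return rddf[nearest:nearest + horizon]


# A straight recorded route with one state per metre.
rddf = [State(float(i), 0.0, 0.0, 5.0) for i in range(100)]
path = build_path(rddf, State(10.2, 0.3, 0.0, 4.0), horizon=20)
```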

  1. Pedestrian Handler:

This subsystem is the main focus of this paper. It detects pedestrians in crosswalks. It takes camera and LIDAR sensor data as input, detects pedestrians in the input image with the help of a CNN, and then estimates each pedestrian's position in the map by combining LIDAR and camera data. On that basis, it decides whether the crosswalk is busy or not, and sends a message to the Behavior Selector so that the car behaves accordingly.

  1. Behavior Selector:

This subsystem is the High-Level Decision Maker, which decides how the car should behave in different situations. It takes as input the map, IARA's current state, the path, the busy crosswalk message and the annotations on the road, such as speed bumps, crosswalks, speed barriers and traffic lights. It outputs a goal state in the path some seconds ahead of IARA's current state. The goal state is used to adjust the car's velocity according to the situation.

  1. Motion Planner:

This subsystem builds a trajectory. It takes as input the map, the current state of the car, the path and the goal state, and outputs a trajectory from the car's current state to the goal state. The trajectory is composed of a sequence of control commands, each comprising linear velocity, steering wheel angle and execution time.
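The trajectory format described here can be sketched as a list of small command records. The class and helper names below are hypothetical, and the numeric values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ControlCommand:
    linear_velocity: float  # m/s
    steering_angle: float   # rad
    execution_time: float   # s


def trajectory_duration(trajectory: List[ControlCommand]) -> float:
    """Total time needed to execute every command in sequence."""
    return sum(c.execution_time for c in trajectory)


def trajectory_length(trajectory: List[ControlCommand]) -> float:
    """Distance covered, assuming constant velocity within each command."""
    return sum(c.linear_velocity * c.execution_time for c in trajectory)


# Example: slowing down from 5 m/s to 2.5 m/s over two seconds.
trajectory = [
    ControlCommand(5.0, 0.0, 0.5),
    ControlCommand(4.0, 0.05, 0.5),
    ControlCommand(2.5, 0.0, 1.0),
]
```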

  1. Obstacle Avoider:

The Obstacle Avoider verifies the trajectory and changes it if needed, depending on the obstacles that appear in the car's path. It takes as input the map, the current state of the car and the path. It continuously checks for obstacles in the path of the car. If an obstacle is found, the linear velocity is reduced to avoid a collision.

  1. Controller:

The Controller computes the commands that will be sent directly to IARA's actuators in the steering wheel, throttle and brake, using a Proportional Integral Derivative (PID) approach. It receives as input the trajectory and odometer data. Then, it computes an error measure that accounts for how far IARA's actual steering wheel angle and velocity are from those specified in the trajectory. The error is used to compute the actuation commands that are applied directly to IARA in order to minimize this error.
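The PID idea can be illustrated with a minimal velocity controller. This is only a sketch: the gains, the time step and the toy vehicle model (acceleration equal to the control effort) are assumptions chosen so the example converges, not IARA's tuning.

```python
class PID:
    """Textbook PID controller acting on the error between target and measurement."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, target: float, measured: float, dt: float) -> float:
        error = target - measured
        self.integral += error * dt
        # No derivative term on the first call, when there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Track a 2.5 m/s target speed with a toy plant: acceleration equals the effort.
pid = PID(kp=1.0, ki=0.25, kd=0.01)
dt = 0.05
speed = 0.0
for _ in range(400):  # 20 simulated seconds
    effort = pid.step(2.5, speed, dt)
    speed += effort * dt
```

In this toy setup the error between the target and the measured speed shrinks toward zero, which is the role the error measure plays in the Controller.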

Pedestrian Handler:

  1. Crosswalk Position Estimation Based on Map:

In the offline phase, crosswalk locations are entered manually in the map. Alternatively, this subsystem could use a CNN to estimate crosswalks' world positions from images, for example using information from existing methods. IARA's map is represented by an occupancy grid map, in which each cell has a resolution of 20 cm and stores the probability of the associated area being occupied by an obstacle. Fig. shows a stretch of IARA's map. Fig. shows the pedestrian detection in the image. Here, the manually annotated position of the crosswalk is marked by a red circle, with a stop line next to it. The stop line indicates where IARA must stop if the crosswalk is busy. In the online phase, annotated crosswalk positions are queried for a crosswalk in IARA's path within a range of 100 m from IARA's state.
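The two numbers in this paragraph, the 20 cm cell resolution and the 100 m query range, can be tied together in a short sketch. The `Crosswalk` record, the path representation as (x, y) points and the 2 m matching tolerance are hypothetical choices for this example.

```python
import math
from typing import List, NamedTuple, Optional, Tuple

GRID_RESOLUTION_M = 0.2  # from the text: each map cell covers 20 cm
QUERY_RANGE_M = 100.0    # from the text: crosswalks are queried up to 100 m ahead


class Crosswalk(NamedTuple):
    """Manually annotated crosswalk: circle center (m) and radius (m)."""
    x: float
    y: float
    radius: float


def world_to_cell(x: float, y: float,
                  origin_x: float = 0.0, origin_y: float = 0.0) -> Tuple[int, int]:
    """Map a world position in metres to occupancy grid indices."""
    return (int(math.floor((x - origin_x) / GRID_RESOLUTION_M)),
            int(math.floor((y - origin_y) / GRID_RESOLUTION_M)))


def next_crosswalk(path: List[Tuple[float, float]],
                   crosswalks: List[Crosswalk],
                   tolerance_m: float = 2.0) -> Optional[Crosswalk]:
    """Return the first annotated crosswalk on the path within the query range."""
    if not path:
        return None
    travelled = 0.0
    previous = path[0]
    for point in path:
        travelled += math.hypot(point[0] - previous[0], point[1] - previous[1])
        if travelled > QUERY_RANGE_M:
            break
        for crosswalk in crosswalks:
            if math.hypot(point[0] - crosswalk.x, point[1] - crosswalk.y) <= tolerance_m:
                return crosswalk
        previous = point
    return None


path = [(float(i), 0.0) for i in range(200)]
annotations = [Crosswalk(150.0, 0.0, 3.0), Crosswalk(60.0, 0.5, 3.0)]
found = next_crosswalk(path, annotations)
```

Here the crosswalk annotated 150 m ahead is ignored because it lies beyond the 100 m range, while the one about 60 m ahead is returned.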

  1. Pedestrian Detection using CNN:

Pedestrians in crosswalks are detected by the YOLOv2 CNN. It takes as input images from the camera mounted on the roof of the car. YOLOv2 was trained with images of different sizes, from 320×320 up to 608×608 in multiples of 32, which provides a good tradeoff between speed and accuracy. Here, YOLOv2 uses the ImageNet and Common Objects in Context (COCO) datasets.
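The set of input sizes used in YOLOv2 multi-scale training follows directly from the multiple-of-32 rule, from 320 up to 608. The small sketch below enumerates them; the function names are hypothetical, and the grid size relies on YOLOv2 downsampling its input by a factor of 32.

```python
from typing import List


def yolo_v2_input_sizes(smallest: int = 320, largest: int = 608,
                        stride: int = 32) -> List[int]:
    """Square input sizes that are multiples of the network stride."""
    return list(range(smallest, largest + 1, stride))


def output_grid(input_size: int, stride: int = 32) -> int:
    """Side of the output grid for a given square input size."""
    return input_size // stride


sizes = yolo_v2_input_sizes()
grid_for_416 = output_grid(416)  # the 416x416 input gives a 13x13 grid
```

Smaller inputs run faster and larger inputs tend to be more accurate, which is the tradeoff the text refers to.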

Fig. Accuracy of Prediction of crosswalk conditions

The YOLOv2 architecture applied in this work is presented in the Fig. above. It is basically composed of several convolutional layers with padding, some max pooling layers and some specific layers. The output column presented in the figure refers to an input size of 416×416 and differs for other input sizes. YOLOv2 takes the camera image as input and outputs a bounding box around each detected pedestrian. YOLOv2 detects 80 different classes, such as boat, cat, dog, person and traffic light. Detected objects are categorized by class, and a bounding box is drawn around each object, with a different color per class.
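Since the detector reports many classes but only pedestrians matter here, a post-processing step along the following lines is one plausible way to keep the relevant boxes. The `Detection` record and the 0.5 confidence threshold are assumptions for illustration and do not reflect the actual output format of any YOLOv2 implementation.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical detector output: class label, confidence and box in pixels."""
    label: str
    confidence: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def keep_pedestrians(detections: List[Detection],
                     min_confidence: float = 0.5) -> List[Detection]:
    """Keep only confident detections of the 'person' class."""
    return [d for d in detections
            if d.label == "person" and d.confidence >= min_confidence]


detections = [
    Detection("person", 0.91, 120, 80, 170, 210),
    Detection("dog", 0.88, 300, 150, 360, 200),
    Detection("person", 0.32, 10, 90, 40, 180),
]
pedestrians = keep_pedestrians(detections)
```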

  1. Pedestrian Position Estimation using Sensor Data Fusion:

With the help of LIDAR and camera data fusion, we can estimate the position of pedestrians in the crosswalk. First, the LIDAR points inside the camera field of view are extracted from the 360° point cloud. Second, points that hit the road are discarded. For this, pairs of vertically aligned consecutive LIDAR points, taken starting from the bottom, are used to define a vector. If the inclination of the vector is less than 10 degrees, the points are considered to hit the road and are discarded. Third, the remaining points are projected onto the image. For this, points are converted from spherical coordinates to Cartesian coordinates and translated from the LIDAR reference to the camera reference. Then, points are projected to the image plane using the camera projective transformation given by Equations (1) and (2). This transformation converts Cartesian points (x, y, z) to image points (x_i, y_i) using the focal length of the camera (f_x, f_y) and the pixel size (p_x, p_y). Written in the standard pinhole form, taking x as the depth axis of the camera reference and (c_x, c_y) as the image center:

$$x_i = \frac{f_x}{p_x} \cdot \frac{y}{x} + c_x \tag{1}$$

$$y_i = -\frac{f_y}{p_y} \cdot \frac{z}{x} + c_y \tag{2}$$
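The three steps above can be sketched in a few functions. This is a minimal illustration under stated assumptions: the camera reference is taken with x pointing forward, y to the right and z up; the image center (c_x, c_y) is an assumed offset term; the LIDAR-to-camera translation is omitted; and all function names and numeric values are hypothetical.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float, float]


def spherical_to_cartesian(rng: float, azimuth: float, elevation: float) -> Point:
    """Convert a LIDAR return (range in m, angles in rad) to (x, y, z)."""
    horizontal = rng * math.cos(elevation)
    return (horizontal * math.cos(azimuth),
            horizontal * math.sin(azimuth),
            rng * math.sin(elevation))


def hits_road(lower: Point, upper: Point, max_inclination_deg: float = 10.0) -> bool:
    """True if the vector between two vertically consecutive points is nearly flat."""
    dx, dy, dz = (u - l for u, l in zip(upper, lower))
    inclination = math.degrees(math.atan2(abs(dz), math.hypot(dx, dy)))
    return inclination < max_inclination_deg


def project_to_image(point: Point,
                     focal_m: Tuple[float, float],
                     pixel_m: Tuple[float, float],
                     center_px: Tuple[float, float]) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point
    if x <= 0.0:
        return None  # the point is behind the camera
    u = (focal_m[0] / pixel_m[0]) * (y / x) + center_px[0]
    v = -(focal_m[1] / pixel_m[1]) * (z / x) + center_px[1]
    return (u, v)


# Assumed intrinsics: 4 mm focal length, 4 micrometre pixels, 640x480 image.
focal, pixel, center = (0.004, 0.004), (4e-6, 4e-6), (320.0, 240.0)
on_axis = project_to_image((10.0, 0.0, 0.0), focal, pixel, center)
off_axis = project_to_image((10.0, 1.0, 1.0), focal, pixel, center)
```

A point straight ahead lands on the image center, while a point to the right and above the optical axis lands right of and above the center, because image rows grow downward.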

Fourth, points that are inside bounding boxes (predicted by the CNN) and hit pedestrians are filtered. For this, only the points inside bounding boxes are selected. Some of these points may hit another object, and these are filtered out using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. The selected points are first clustered using DBSCAN and the largest cluster of points is taken. Finally, if the distance of any filtered point to the center of the crosswalk (the circle that marks the crosswalk effective area) is smaller than the radius of the crosswalk circle, the crosswalk is set as busy.
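The filtering and occupancy test can be sketched with a small pure-Python DBSCAN over 2D map positions. The `eps` and `min_pts` values, the point coordinates and the function names are assumptions for illustration, and the busy test treats the crosswalk as busy when any point of the largest cluster falls inside the circle.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple

Point2D = Tuple[float, float]
NOISE = -1


def dbscan(points: List[Point2D], eps: float, min_pts: int) -> List[int]:
    """Label each point with a cluster index, or NOISE."""
    labels: List[Optional[int]] = [None] * len(points)

    def neighbours(i: int) -> List[int]:
        return [j for j in range(len(points))
                if math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return [label if label is not None else NOISE for label in labels]


def largest_cluster(points: List[Point2D], labels: List[int]) -> List[Point2D]:
    """Return the points of the most populated cluster, ignoring noise."""
    counts = Counter(label for label in labels if label != NOISE)
    if not counts:
        return []
    best = counts.most_common(1)[0][0]
    return [p for p, label in zip(points, labels) if label == best]


def crosswalk_is_busy(points: List[Point2D], center: Point2D, radius: float) -> bool:
    """Busy if any point lies inside the crosswalk circle."""
    return any(math.hypot(p[0] - center[0], p[1] - center[1]) < radius
               for p in points)


# Four returns on a pedestrian plus two stray returns on a background object.
points = [(10.0, 0.0), (10.1, 0.1), (9.9, -0.1), (10.05, 0.05),
          (25.0, 3.0), (25.2, 3.1)]
labels = dbscan(points, eps=0.5, min_pts=3)
pedestrian = largest_cluster(points, labels)
busy = crosswalk_is_busy(pedestrian, center=(10.5, 0.0), radius=3.0)
```

Taking the largest cluster discards the background returns that fall inside the bounding box but do not belong to the pedestrian.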

  1. Behavior Selection at Crosswalk:

Fig. Behavior Selector’s state machine that handles pedestrians in crosswalks

In the Free Running state, annotated crosswalk positions are continuously queried for a crosswalk in the path within a range of 100 m. At a Free Crosswalk condition, the goal state is computed to reduce the vehicle's speed to 2.5 m/s in the middle of the crosswalk effective area. At a Busy Crosswalk condition, the state machine moves to the Stopping state.

In the Stopping state, the goal state is established to stop the vehicle with its front aligned with the crosswalk stop line. At a speed < 0.15 and distance < 2 condition, i.e., when the vehicle's speed decreases to under 0.15 m/s and the distance to the crosswalk stop line decreases to under 2 m, the state machine moves to the Stop state. If the pedestrian leaves the crosswalk area before the condition above is reached, the state machine returns to the Free Running state.

In the Stop state, a command is sent to the Motion Planner to completely stop the vehicle and the state machine moves directly to the Stopped state. At that point, the vehicle is considered to be stopped and the goal state remains defined to stop the vehicle with its front aligned with the crosswalk stop line. In addition, a button in the interface is enabled, which allows the human operator to command the vehicle to return to the Free Running state in case of a false detection of a pedestrian in the crosswalk.

In the Stopped state, the vehicle remains stopped and the goal state is computed using the default approach, in order to initialize the vehicle's movement faster. At a Free Crosswalk and time > 2 condition, i.e., when the crosswalk remains free for a period greater than 2 s, the state machine moves to the Initialize Movement state. Two seconds is the time that the vehicle waits for another pedestrian to enter the crosswalk area, as defined by the Brazilian Defensive Driving Manual.

In the Initialize Movement state, the goal state is established using the default approach. At a speed > 0.5 and position > NR condition, i.e., when the vehicle's speed increases to more than 0.5 m/s and the vehicle's state surpasses the point of no return (NR), the middle of the crosswalk effective area, the state machine moves to the Free Running state. If another pedestrian enters the crosswalk area before the condition above is reached, the state machine goes directly to the Stop state. In the Free Running state, when the vehicle reaches the NR on a Free Crosswalk, it assumes it has the preference and will continue moving even if a pedestrian approaches the crosswalk area. The Obstacle Avoider can stop the vehicle to avoid a collision at any time in the state machine.
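The transitions described above can be summarized as a small state machine. The sketch below encodes only the transition rules and thresholds stated in the text; the `Inputs` record, its field names and the order in which simultaneous conditions are checked are assumptions, and the operator button and Obstacle Avoider override are left out.

```python
from dataclasses import dataclass
from enum import Enum, auto


class CrosswalkState(Enum):
    FREE_RUNNING = auto()
    STOPPING = auto()
    STOP = auto()
    STOPPED = auto()
    INITIALIZE_MOVEMENT = auto()


@dataclass
class Inputs:
    crosswalk_busy: bool
    speed: float                  # m/s
    distance_to_stop_line: float  # m
    free_time: float              # s the crosswalk has been continuously free
    past_no_return: bool          # vehicle has passed the NR point


def next_state(state: CrosswalkState, i: Inputs) -> CrosswalkState:
    """Apply one transition of the crosswalk handling state machine."""
    if state is CrosswalkState.FREE_RUNNING:
        # Past the NR point the vehicle keeps moving even if a pedestrian approaches.
        if i.crosswalk_busy and not i.past_no_return:
            return CrosswalkState.STOPPING
    elif state is CrosswalkState.STOPPING:
        if i.speed < 0.15 and i.distance_to_stop_line < 2.0:
            return CrosswalkState.STOP
        if not i.crosswalk_busy:
            return CrosswalkState.FREE_RUNNING
    elif state is CrosswalkState.STOP:
        return CrosswalkState.STOPPED
    elif state is CrosswalkState.STOPPED:
        if not i.crosswalk_busy and i.free_time > 2.0:
            return CrosswalkState.INITIALIZE_MOVEMENT
    elif state is CrosswalkState.INITIALIZE_MOVEMENT:
        if i.speed > 0.5 and i.past_no_return:
            return CrosswalkState.FREE_RUNNING
        if i.crosswalk_busy:
            return CrosswalkState.STOP
    return state


# One passage through a busy crosswalk that later becomes free.
steps = [
    Inputs(True, 5.0, 30.0, 0.0, False),   # pedestrian detected
    Inputs(True, 0.1, 1.5, 0.0, False),    # almost stopped at the stop line
    Inputs(True, 0.0, 1.5, 0.0, False),    # Stop moves directly to Stopped
    Inputs(False, 0.0, 1.5, 2.5, False),   # crosswalk free for more than 2 s
    Inputs(False, 0.8, 0.0, 3.0, True),    # moving again and past NR
]
state = CrosswalkState.FREE_RUNNING
trace = []
for inputs in steps:
    state = next_state(state, inputs)
    trace.append(state)
```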

 

Experimental Methodology

To evaluate how the autonomous car behaves with the Pedestrian Handler subsystem, IARA was driven autonomously on the ring road of the Federal University of Espírito Santo (UFES). While IARA was driving, a safety driver sat in the driver's seat to take control of the car in case of emergency, and another person sat in the passenger seat to monitor all aspects of IARA.

The test circuit has a length of approximately 900 m and contains 3 different crosswalks. The first crosswalk is on a two-way street and was traversed twice in each lap. The second and third crosswalks are on one-way streets and were traversed once in each lap. Due to hardware limitations, images were resized to 416×416, which resulted in a frame rate of 8 fps.

Results

IARA was driven autonomously for 22 laps in the test circuit (about 20 km), with 88 passages through crosswalks (4 in each lap), of which 37 were through busy crosswalks. The accuracy of the predictions of crosswalk conditions was measured in terms of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). TP is the number of crosswalk conditions correctly predicted as busy, TN is the number of crosswalk conditions correctly predicted as free, and FP and FN are defined accordingly.

All 37 busy crosswalks were predicted as busy and all 51 free crosswalks were predicted as free; that is, the Pedestrian Handler could estimate crosswalk conditions with 100% accuracy. Also, in all passages through crosswalks, there was no intervention from the safety driver.
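The reported figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text (22 laps, 4 passages per lap, a circuit of roughly 900 m, 37 busy and 51 free crosswalks); the function name is hypothetical.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of crosswalk conditions predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


laps = 22
passages_per_lap = 4                        # as reported in the text
total_passages = laps * passages_per_lap    # 88 passages
distance_km = laps * 0.9                    # roughly 20 km on a 900 m circuit

tp, tn, fp, fn = 37, 51, 0, 0               # busy and free crosswalks, no errors
result = accuracy(tp, tn, fp, fn)
```

With no false positives and no false negatives, the accuracy evaluates to 1.0, matching the reported 100%.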

A video recorded during this research shows how IARA behaves in different situations in real time: https://youtu.be/RnsiwSZ_9RA. In the first situation, the Pedestrian Handler detects a pedestrian as he approaches from behind a bush. The vehicle then stops with its front aligned with the crosswalk stop line. After the crosswalk area has remained free for two seconds, the goal state is established using the default approach and the vehicle traverses the crosswalk area. The goal state is defined to stop the vehicle with its front aligned with the crosswalk stop line, and the pedestrian is detected in the image. This result corresponds to the video between 0:40 and 1:10.

The second situation has no pedestrian in the crosswalk area. The goal state is established to reduce the vehicle's speed to 2.5 m/s in the middle of the crosswalk area. Two pedestrians are detected in the image; however, they do not interfere with the vehicle's speed, since they are outside the crosswalk area. This result corresponds to the video between 1:12 and 1:25.

In the third situation, the Pedestrian Handler detects a pedestrian at a greater distance. The goal state is established to stop the vehicle with its front aligned with the crosswalk stop line, and the pedestrian is detected in the image. After two seconds, the goal state is computed using the default approach and the vehicle traverses the crosswalk. This result corresponds to the video between 1:25 and 1:50.

The fourth situation shows the Pedestrian Handler in heavy traffic. The vehicle remains stopped, waiting for a free crosswalk, for a period greater than two seconds, during which 7 pedestrians and 1 cyclist pass through the crosswalk. Moreover, some drivers do not comply with the Brazilian Traffic Code, resuming movement before the pedestrian finishes crossing. The goal state is then computed to resume the vehicle's movement on a free crosswalk. Three pedestrians and one cyclist are detected in the image. This result corresponds to the video between 1:50 and 2:30.

In the fifth situation, the vehicle stops for a pedestrian and resumes movement at a free crosswalk. Then, another pedestrian starts crossing and the vehicle stops again. In both cases, the goal state is established to stop the vehicle with its front aligned with the crosswalk stop line, and the first and third pedestrians are detected in the image. This result corresponds to the video between 2:30 and 3:05.

Conclusion:

In this paper, we proposed a subsystem to handle pedestrians in crosswalks for the IARA autonomous car, which uses a deep neural network and camera and LIDAR data fusion. In the Pedestrian Handler, pedestrians are detected in the image using the YOLOv2 CNN; pedestrians' positions in the map are estimated using fused image and LIDAR point cloud data; the crosswalk condition is predicted as busy or free; and a busy crosswalk message is published to the Behavior Selector in case of a busy crosswalk. The Behavior Selector adjusts the pose and velocity of the goal state in the path according to the crosswalk condition, which is used by the Motion Planner to produce a trajectory that drives the vehicle properly through the crosswalk area.

The Pedestrian Handler was evaluated on IARA, which was driven autonomously along a circuit in the ring road of the UFES main campus. IARA was driven for 22 laps in the circuit and made 88 passages through crosswalks, of which 37 were through busy crosswalks. In all passages through crosswalks, the Pedestrian Handler behaved properly, without any human intervention.

A direction for future work is to develop a Pedestrian Tracker subsystem, which would improve the Pedestrian Handler accuracy by inferring whether a pedestrian is leaving or entering the crosswalk area. Another direction for future work is to develop an automatic Crosswalk Detector subsystem, which would reduce manual work.

 
