<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Real time object detection for blind people. International
Journal of Advance Research in Science and
Engineering</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Real-Time Object Detection And Identification For Visually Challenged People Using Mobile Platform</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Neeraj Joshi</string-name>
          <email>nirazjoshi007@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shubham Maurya</string-name>
          <email>webdshubham@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sarika Jain</string-name>
          <email>jasarika@nitkkr.ac.in</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Institute of Technology</institution>
          ,
          <addr-line>Kurukshetra</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>7</volume>
      <issue>1</issue>
      <fpage>306</fpage>
      <lpage>316</lpage>
      <abstract>
        <p>Daily life poses many challenges for blind people, especially when they travel from one place to another, and the lack of vision exposes them to accidents. Much work has already been done on real-time object detection and recognition for visually challenged people. In layman's terms, object detection finds where an object is present, and object recognition determines what the object in the image or in the surroundings is. In this paper, we perform a theoretical as well as experimental analysis of the existing works in the form of a comparative table, in which the parameters taken for comparison are dataset, algorithm, and average precision. In addition, we identify the research gap in the existing works for visually challenged people and, to address that gap, introduce our research agenda, which describes a method that could be used to build a more feasible model, i.e., one that can work on low-computation devices such as a smartphone.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In human beings, the eye is the major sensory
organ. It helps us visualize the world around us. Without
it, one would not be able to tell the difference between
day and night, or blue and black. So, we can imagine how
difficult it is for visually challenged people to travel from
one place to another and to recognize the objects around
them. According to the WHO fact sheet (published
on 8 Oct 2020)[6], 1 billion people globally have a
vision impairment. This figure covers conditions such as
trachoma, glaucoma, uncorrected refractive error, cataract,
age-related macular degeneration, corneal opacity, and
diabetic retinopathy. To make their lives a little easier,
we can provide them with computer vision: object detection
and recognition can identify what is around them, and an
auditory device such as headphones can inform them about
their surroundings.</p>
      <p>Object recognition is a simple process
for human beings, but for computers it is not an easy
task, as it consists of a step-by-step process of
recognizing, identifying, and locating the objects in the
input with a given degree of precision. Recognition
basically consists of classification and detection. Objects
can be assigned to their respective classes by
performing three steps: feature extraction, localization,
and classification. In classification, the
algorithm recognizes the class of the object with a
degree of confidence. After classification, we know
the class to which the object
belongs. Then, in detection, we put a bounding box
around the object in the picture.</p>
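      <p>The output described above (a class, a degree of confidence, and a bounding box) can be sketched as a small data structure. This is only an illustrative sketch: the names <monospace>Detection</monospace> and <monospace>keep_confident</monospace>, the 0.5 threshold, and the sample values are assumptions made for the example and do not come from any of the reviewed systems.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str                       # predicted class, e.g. "chair"
    confidence: float                # classifier confidence in [0, 1]
    box: Tuple[int, int, int, int]   # bounding box as (x_min, y_min, x_max, y_max) in pixels


def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Return the detections whose confidence is at least `threshold`, best first."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)


# Hypothetical detector output for one image (values invented for illustration).
raw = [
    Detection("door", 0.91, (40, 10, 200, 380)),
    Detection("chair", 0.34, (220, 150, 330, 300)),
    Detection("person", 0.77, (300, 20, 420, 390)),
]

for det in keep_confident(raw):
    print(det.label, det.confidence, det.box)
```
      </preformat>
      <p>Filtering by confidence before notifying the user is one simple way to avoid announcing objects that the classifier is unsure about.</p>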
      <p>The main objective of this work is to present a
comprehensive and comparative analysis of the work
that has been done in the field of object detection for
visually challenged people, including
the algorithms that have been
used in existing systems. We divide object
detection algorithms into two categories:
region-based object detection algorithms and
regression-based object detection
algorithms. The main advantage of regression-based
algorithms over region-based algorithms is that
they run faster. We also identify the
gap in the research on the basis of the papers that we
have read, and we propose our system to fill that
research gap.</p>
      <p>After analyzing the research papers related to object
detection and identification for visually challenged
people, we found that a lot of work has already
been done in this field, but we did not find any model
suitable for low-computation devices such as a mobile
phone without an external dependency (such as a server
with a GPU). The models presented in the research
papers require either additional hardware or server
connectivity because they use algorithms that
need fast processing devices such as a GPU. Algorithms
that can run on a low-computation device would be
a better fit for this use case.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Overview</title>
      <sec id="sec-2-1">
        <p>Here, we have presented a brief overview of the
procedure and internal working of object detection
and identification systems. Figure 1 shows the various
steps involved in the process of object
detection and identification.</p>
        <p>In order to detect and identify objects, the first step
is to collect datasets for training and testing
the model. A custom dataset can be created by manually
capturing and labeling images. A number of datasets
for object detection are also available on the internet.
A few popular datasets are MS COCO (provided
by Microsoft), ImageNet, and Pascal VOC.</p>
        <p>After the dataset has been created, the next
step is to divide it into a training dataset and
a testing dataset. The step after that is to choose the
object detection algorithm. All existing object
detection algorithms can be classified into two
categories: region-based object
detection algorithms and
regression-based object detection algorithms. An
algorithm is chosen according to the system requirements, and
the system is trained with it. In the
next step, object detection is performed on the given
image. If the system detects an object using the deep neural network (DNN),
identification is then performed on that object.</p>
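        <p>The division of a dataset into training and testing parts can be sketched as follows. This is a minimal sketch under stated assumptions: the function name <monospace>split_dataset</monospace>, the 80/20 ratio, the fixed seed, and the file names are hypothetical and are not taken from the reviewed works.</p>
        <preformat>
```python
import random
from typing import List, Sequence, Tuple


def split_dataset(samples: Sequence[str], test_fraction: float = 0.2,
                  seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle `samples` reproducibly and split them into (train, test)."""
    if test_fraction >= 1.0 or 0.0 >= test_fraction:
        raise ValueError("test_fraction must lie strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # a fixed seed keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labeled image files.
images = [f"img_{i:03d}.jpg" for i in range(10)]
train, test = split_dataset(images)
print(len(train), "training images,", len(test), "testing images")
```
        </preformat>
        <p>Keeping the testing images out of training is what makes the reported precision a fair estimate of how the model behaves on unseen scenes.</p>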
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Literature Review</title>
      <sec id="sec-3-1">
        <title>3.1 Review Methodology</title>
        <p>We used Google Scholar to
search for papers related to our work. The papers
were searched using keywords such as
object detection in real-time, object recognition for
visually challenged people using smartphones, and
models for visually challenged people. From the many
research papers returned, we first selected a few
on the basis of their abstracts. We
then read the introduction and conclusion of the selected
papers, excluded some of them,
and finally kept the papers that
were closely related to our work.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2 Related Work</title>
        <p>A lot of research work has already been done on
object detection and identification. Existing works show
how the problems of visually challenged people can be
addressed. In existing works, the device used as a
visual substitute takes input from the surroundings
of the visually impaired (VI) person and extracts information about the
objects present there. The system then notifies the person about
their surroundings through an auditory device.
Existing work can be compared on many different parameters,
but here we compare
previous work on the basis of the algorithms and datasets
used in the existing systems. All the
researchers have used either a region-based algorithm or
a regression-based algorithm in their work.
Region-based algorithms are popular for their accuracy,
and regression-based algorithms are popular for their
speed. For real-time object detection, most
researchers have used regression-based algorithms. In
short, regression-based algorithms such as YOLO and
MobileNet SSD are currently more popular for object detection.
After screening a number of research
papers (the selection criteria
are discussed in the review methodology), we
selected 7 research papers for the comparative
analysis of existing work. Of these selected
papers, 4 use regression-based object
detection algorithms, 2 use region-based object
detection algorithms, and 1 uses both region-based
and regression-based object detection algorithms.</p>
        <p>Some of the existing research work related to object
detection and identification for visually challenged
people is discussed below.</p>
        <p>Prateek Agrawal et al.[1] have proposed a system
for bank cheque verification. This system verifies a
bank cheque using the following information: cheque
number, bank account number, bank branch code, legal
as well as courtesy amount, and signature. They
used the IDRBT cheque dataset and a deep learning-based
CNN, which reached an accuracy of 99.14% for handwritten digit
recognition. For signature verification,
they used a SIFT feature extractor with an SVM
(Support Vector Machine) classifier, which reached an
accuracy of 98.10%.</p>
        <p>Sandipan Chowdhury et al.[5] have proposed a
method for object detection through a webcam. In this
method, they combine Fast R-CNN
(Region-based Convolutional Neural Network) with a
Region Proposal Network (RPN): the RPN is trained end
to end to generate high-quality region proposals,
which are in turn used by Fast R-CNN for
detection. Together, these two modules form the
object detection system called Faster R-CNN. Faster
R-CNN can detect objects in real time at
high speed, although a few algorithms are
faster than Faster R-CNN.</p>
        <p>
          Cheng Qian et al. [
          <xref ref-type="bibr" rid="ref4">10</xref>
          ] have developed an indoor
wayfinding system. In this system, the YOLOv2
algorithm is used to detect indoor
objects such as doors and door handles. The advantage of
the YOLO algorithm is its high speed. In this
system, a visually challenged person is equipped with a
portable camera, a Bluetooth earpiece, and a GPU. The
system contains three main components: a deep neural
network, a camera, and an auditory device through
which the subject learns about the objects
around them. Cheng Qian et al. use a convolutional
neural network (CNN) in this model. The ConvNet
has 22 layers and is well suited to identifying items
quickly: the input image is divided into a grid in the first
layer, where bounding-box proposals are generated. The
requirement for extra hardware (a GPU and a
stereo camera) makes this model costly as well as an
extra burden for visually challenged people.
        </p>
        <p>
          Bor - Shing Lin et al.[
          <xref ref-type="bibr" rid="ref3">9</xref>
          ] have also used the YOLO
algorithm in their system. This model runs on a
smartphone and a server, and it works in two
different modes, online and offline. The online
mode is “stable” and the offline mode is “fast”. In
both modes, the smartphone
extracts features from the surroundings and the
server provides information about the direction and
distance of the objects. In the “stable” mode,
the server uses the Fast R-CNN algorithm, whereas in the
“fast” mode it uses the YOLO algorithm. Fast
R-CNN is used for object identification and
for roughly estimating the distance and position of the
object; it makes the system more accurate but
a little slower. YOLO is used in the fast
mode because it looks at the image only once to extract both
pieces of information, which makes
the system faster. In this system, object detection and
identification on a smartphone requires a dedicated
server on which the YOLO and Fast R-CNN models can run.
The dependency on the internet is a further drawback:
because system performance depends on
the connection, objects cannot be detected
accurately and quickly when there is a
connectivity problem, and VI people
cannot rely on the internet being available.
        </p>
        <p>
          F Particke et al.[
          <xref ref-type="bibr" rid="ref5">11</xref>
          ] have developed a system for
real-time object detection and localization on a
smartphone platform. In this system, F Particke et al.
use neural networks for object detection and
localization in real time. The detections of
a DNN are combined with the depth data from a depth
sensor (RGB-D camera) mounted
on a smartphone platform. The researchers chose
YOLOv2 as the
object detection algorithm. In future work, the
localization could be improved by
additionally considering spatial information in the
clustering step, and by using the X-means algorithm
instead of the K-means algorithm. The problem
with the YOLO algorithm is that it requires a GPU to run
and can detect objects only in the range of 2 m to 5 m,
so there is room for improvement.
        </p>
        <p>
          Hanen Jabnoun et al.[
          <xref ref-type="bibr" rid="ref1">7</xref>
          ] have performed object detection for
visually challenged people in video scenes. In this
system, SIFT feature extraction and the RANSAC
algorithm are used to fit the model. The SIFT feature
extraction algorithm is scale- and rotation-invariant, so
its accuracy is decent. However, the dataset used in this system is very
small, containing only 8 classes. This system will not
perform well in real-time object detection because the
algorithms it uses are too slow for
real-time use.
        </p>
        <p>Chugai Yi et al.[14] have developed a system that
searches for objects to assist blind people. This model
uses a wearable camera and several
fixed cameras. The visually challenged person
wears the camera, which is
connected to a computer system. The subject
requests a search for an object through a voice
command, and
when the system identifies the
requested object, a voice message is generated. In
this system, the SURF algorithm is used for
feature extraction. SURF provides high
accuracy in object detection and identification, but in
future work the system could be made more efficient by
using a faster object detection algorithm, since
fast detection and identification are required
for real-time use.</p>
        <p>Object detection and identification is an area that
has been widely targeted by researchers. Table 1 presents the
comparative analysis of the research work on object
detection and identification for visually challenged
people. Figure 2 presents the general
architecture of object detection and recognition using a
DNN.</p>
        <p>The architecture shows that when the system gets a
command to detect an object, it captures image data
from its surroundings in real time using the system
camera and passes that data to a trained neural
network, which predicts the features in the given image
and shows the result on the system. If an object is
recognized, the system notifies the user.</p>
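        <p>The flow of this architecture (camera frame, trained network, notification) can be sketched with stand-ins for each component. Everything named here is hypothetical: <monospace>announce_objects</monospace>, <monospace>fake_detect</monospace>, and the string "frames" only imitate a camera, a trained network, and an audio output so that the control flow can be run without any hardware or model.</p>
        <preformat>
```python
from typing import Callable, Iterable, List, Optional


def announce_objects(frames: Iterable[object],
                     detect: Callable[[object], Optional[str]],
                     notify: Callable[[str], None]) -> int:
    """Run `detect` on each frame and call `notify` for every recognized object.

    Returns the number of notifications sent.
    """
    sent = 0
    for frame in frames:
        label = detect(frame)            # stand-in for the trained network; None means nothing recognized
        if label is not None:
            notify(f"{label} ahead")     # on a phone this could be a text-to-speech call
            sent += 1
    return sent


# Stand-ins for the camera stream and the trained network (assumptions for the example).
fake_frames = ["frame-with-door", "empty-frame", "frame-with-chair"]


def fake_detect(frame: object) -> Optional[str]:
    text = str(frame)
    return text.split("-with-")[1] if "-with-" in text else None


messages: List[str] = []
count = announce_objects(fake_frames, fake_detect, messages.append)
print(count, messages)
```
        </preformat>
        <p>Passing the detector and the notifier in as functions keeps the loop independent of any particular model or audio device, so either can be replaced without touching the rest.</p>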
      </sec>
      <sec id="sec-3-3">
        <title>3.3 Experimental Overview</title>
        <p>Every algorithm has its own advantages and
disadvantages. Table 2 presents the
experimental comparison of different versions of the YOLO algorithm
(YOLOv2, YOLOv3, and YOLOv4) and EfficientDet on two
datasets, MS COCO and DOTA. The comparison uses 4 parameters:
frames per second (FPS), mean average precision (mAP),
BFLOPs (floating-point operations in
billions), and model size.</p>
        <p>YOLO and EfficientDet are both
regression-based algorithms, and regression-based
algorithms are popular for their high speed.
The results show that in the earliest
version compared, YOLOv2, speed is very high but
the mAP is 17.6, which is very low, whereas in
the newer versions of YOLO speed remains high and accuracy
is better than in the older version. For
YOLOv4, speed is 18.5 FPS and the mAP is 55.4, which
is very good. The earlier versions of YOLO perform far
fewer floating-point operations (BFLOPs)
per forward pass than the
later versions, and the larger amount of computation
is the reason for the higher accuracy
of the later versions. Because they need fewer
floating-point operations per frame, the earlier versions
of YOLO can process
more frames per second (FPS). Model size has also
increased in the newer versions of YOLO. If we compare
EfficientDet with YOLO, both are competitive
in terms of accuracy, but in terms of speed YOLO
is far better than EfficientDet.</p>
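        <p>To make the FPS figures concrete, the sketch below shows one common way to measure frames per second and to convert an FPS value into a per-frame time budget. It is an illustration only: <monospace>measure_fps</monospace>, <monospace>ms_per_frame</monospace>, and the sleeping stand-in detector are assumptions, not the procedure used to produce Table 2.</p>
        <preformat>
```python
import time
from typing import Callable, Iterable


def measure_fps(detect: Callable[[object], object], frames: Iterable[object]) -> float:
    """Return how many frames per second `detect` processed (wall-clock time)."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def ms_per_frame(fps: float) -> float:
    """Convert a frame rate into the time available for one frame, in milliseconds."""
    if 0 >= fps:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def slow_detect(frame: object) -> None:
    time.sleep(0.01)   # stand-in for a detector that needs about 10 ms per frame


fps = measure_fps(slow_detect, range(5))
print(f"measured: {fps:.1f} FPS")
print(f"18.5 FPS leaves about {ms_per_frame(18.5):.1f} ms per frame")
```
        </preformat>
        <p>At the 18.5 FPS quoted above for YOLOv4, each frame has a budget of roughly 54 ms, which is the figure a mobile implementation would have to meet to match that speed.</p>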
        <sec id="sec-3-3-1">
          <title>Real-Time Object Detection using Deep Learning: A Webcam Based Approach</title>
        </sec>
        <sec id="sec-3-3-2">
          <title>Real-Time Object Detection and Tracking Using Deep Learning and OpenCV</title>
        </sec>
        <sec id="sec-3-3-3">
          <title>Crosswalk Detection for VI Pedestrian</title>
        </sec>
        <sec id="sec-3-3-4">
          <title>Door Knob Detection for Visually Challenged People with Feedback in Real-Time</title>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>Bor-Shing Lin et al. 2017</title>
        <p>Smartphone-based assistive system
for visually challenged people</p>
      </sec>
      <sec id="sec-3-5">
        <title>F Particke et al. 2017</title>
        <p>Deep Learning for Real-Time
capable object detection and
localization on mobile platform</p>
      </sec>
      <sec id="sec-3-6">
        <title>Joseph Redmon et al. 2016</title>
        <p>Real-time Object Detection with
YOLO</p>
      </sec>
      <sec id="sec-3-7">
        <title>Chugai Yi et al. 2013</title>
        <p>Finding objects for assisting blind
people</p>
      </sec>
      <sec id="sec-3-8">
        <title>Dataset</title>
        <sec id="sec-3-8-1">
          <title>IDRBT Cheque dataset</title>
        </sec>
        <sec id="sec-3-8-2">
          <title>Custom Dataset</title>
        </sec>
        <sec id="sec-3-8-3">
          <title>COCO</title>
        </sec>
        <sec id="sec-3-8-4">
          <title>Custom Dataset</title>
        </sec>
        <sec id="sec-3-8-5">
          <title>ImageNet</title>
        </sec>
        <sec id="sec-3-8-6">
          <title>ImageNet</title>
        </sec>
        <sec id="sec-3-8-7">
          <title>Pascal VOC 2007, ImageNet Custom</title>
          <p><bold>4. Findings</bold></p>
          <p>After analyzing the existing work on object
detection and identification for visually challenged people,
we have presented a detailed comparative analysis
of the work in the form of Table 1. Most of the existing
systems use regression-based algorithms such as YOLO
and MobileNet SSD for object detection and identification.
The reason is the higher speed of YOLO and
MobileNet SSD compared with region-based object detection
algorithms. In terms of accuracy, region-based object
detection algorithms are better than regression-based
object detection algorithms, but in real-time object
detection systems speed takes priority. Region-based
algorithms are used where accuracy is the priority.</p>
        </sec>
      </sec>
      <sec id="sec-3-9">
        <title>4.1 Problems In Existing Method</title>
        <p>Most existing real-time
object detection systems use the YOLO object detection
algorithm for detecting and identifying objects because of its
fast detection and identification speed; very few systems
use R-CNN or Faster R-CNN. The problem with
YOLO is that it is less accurate than Fast R-CNN or
other region-based CNNs. YOLO can detect objects only
in the range of 2-5 m, so identifying objects outside
that range requires some other model. The need for extra hardware
is also a drawback of existing systems for visually challenged
people, because they cannot carry the extra burden with them
all the time. A comparative analysis of
the most popular object detection algorithms
suitable for real-time object detection and identification
has been provided by Ugur Alganci et al. [2] on DOTA
images. Their work shows that in almost
every case the SSD (Single Shot Detector) algorithm has the
least recall time and also high precision; the precision of
Faster R-CNN and YOLO is also
high, but their recall time is not as good as that of SSD.</p>
        <p><bold>4.2 Comparison between Region-Based
Algorithms and Regression-Based Algorithms</bold></p>
        <p>Regression-based object detection algorithms are
faster than region-based object detection algorithms
because region-based algorithms detect
objects in three phases, whereas regression-based
algorithms detect objects in a single phase. A
region-based algorithm first generates region
proposals, then extracts features, and in the
third phase performs classification and object detection. The
region-based object detection algorithm uses the
sliding window technique to generate region
proposals, then uses feature extraction algorithms (such as
SURF and SIFT) for feature extraction, and in the last
phase uses a classifier for classification.
Regression-based object detection algorithms divide the
image into a grid and assign 0 or 1 to each grid cell
according to the absence or presence of an object; this is how
they detect objects in a single
phase. The names of these algorithms also
reflect their single-phase operation: YOLO
stands for You Only Look Once and SSD stands for
Single Shot Detector. In terms of accuracy, region-based
object detection algorithms are better than
regression-based object detection algorithms. We use
region-based algorithms where accuracy
is the priority and regression-based
algorithms where speed is the priority.</p>
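        <p>The grid idea behind regression-based detectors can be illustrated with a simplified sketch that marks each grid cell with 1 if an object centre falls inside it and 0 otherwise. This is a deliberately reduced picture: real single-stage detectors predict a confidence score and box coordinates per cell rather than a hard 0 or 1, and the function name <monospace>objectness_grid</monospace>, the 7 by 7 grid, and the 448 by 448 image size are assumptions chosen for the example.</p>
        <preformat>
```python
from typing import List, Tuple


def objectness_grid(centers: List[Tuple[float, float]], width: int, height: int,
                    cells: int = 7) -> List[List[int]]:
    """Mark with 1 every grid cell that contains an object centre, and with 0 all others."""
    grid = [[0] * cells for _ in range(cells)]
    for x, y in centers:
        # Map pixel coordinates to a cell index; clamp points lying on the right or bottom edge.
        col = min(int(x * cells / width), cells - 1)
        row = min(int(y * cells / height), cells - 1)
        grid[row][col] = 1
    return grid


# Two hypothetical object centres in a 448 x 448 image.
grid = objectness_grid([(100, 200), (448, 0)], width=448, height=448)
for row in grid:
    print(row)
```
        </preformat>
        <p>Because every cell is evaluated in the same pass over the image, no separate region-proposal phase is needed, which is where the speed advantage described above comes from.</p>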
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Research Agenda</title>
      <p>We should develop a system that does not
require any extra device for object recognition for
visually challenged people. Our proposed system requires
just a single device, a smartphone, which is widely
available today. We need an
algorithm that requires fewer computational resources and
can detect and identify objects in real time with both accuracy
and speed. SSD (Single Shot Detector) and
MobileNet-based algorithms would be more appropriate for real-time object
detection and identification on low-computation
devices. Faster R-CNN and SSD have
better accuracy than YOLO, whereas YOLO is
preferable when speed is prioritized over
accuracy. Among DNNs, SSD with MobileNets
performs more efficiently: it detects
objects both accurately and quickly.</p>
      <p>SSD (Single Shot Detector) can detect multiple
objects in an image in a single shot. In contrast,
region-based neural network algorithms take three steps
to detect objects: generating region proposals,
extracting features, and
classifying each proposal. The
experimental results presented by Ugur Alganci et al. [2]
show that the Average Precision (AP) of this algorithm for
the classes car, person, and chair is
99.76%, 97.76%, and 71.07%, respectively. This improves
the correctness of object detection at the processing speed
needed for real-time detection and thus fulfills the
need for regular indoor and outdoor monitoring.</p>
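      <p>Per-class AP values such as those quoted above are usually summarized as a mean average precision (mAP) by averaging over the classes. The sketch below shows only that arithmetic; the function name <monospace>mean_average_precision</monospace> is an assumption, and the mean over these three classes is an illustration, not a figure reported in the cited work.</p>
      <preformat>
```python
from typing import Dict


def mean_average_precision(ap_by_class: Dict[str, float]) -> float:
    """Average the per-class AP values (given in percent) into a single mAP."""
    if not ap_by_class:
        raise ValueError("at least one class is required")
    return sum(ap_by_class.values()) / len(ap_by_class)


# Per-class AP values quoted in the text above.
reported_ap = {"car": 99.76, "person": 97.76, "chair": 71.07}
print(f"mAP over {len(reported_ap)} classes: {mean_average_precision(reported_ap):.2f}%")
```
      </preformat>
      <p>The average hides the spread between classes: the chair class is far below the other two, so a single mAP figure should be read together with the per-class values.</p>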
    </sec>
    <sec id="sec-5">
      <title>6. Conclusion</title>
      <sec id="sec-5-1">
        <p>In this paper, we have presented a comparative
analysis of the existing work in the
field of object detection for visually challenged people.
This analysis shows that existing systems are not well
suited to visually challenged people:
either they are not useful in practical
scenarios or they require some type of dependency. Most
of the systems use extra hardware devices, which become
an extra burden for visually challenged people. There is a
need for a system that does not feel like an extra
burden and can become a part of their life. Our
proposed system can fulfill this need. It is an Android
application for object
detection and identification that will help visually challenged people by
informing them about their surroundings,
performing indoor as well as outdoor object
detection and identification. After identifying an
object, the system will inform the user
by generating a voice message.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>7. References</title>
      <p>[1] Agrawal, P., Chaudhary, D., Madaan, V., Zabrovskiy, A., Prodan, R., Kimovski, D., &amp; Timmerer, C. (2020). Automated bank cheque verification using image processing and deep learning methods. Multimedia Tools and Applications, 1-32.</p>
      <p>[2] Alganci, U., Soydas, M., &amp; Sertel, E. (2020). Comparative research on deep learning approaches for airplane detection from very high-resolution satellite images. Remote Sensing, 12(3), 458.</p>
      <p>[3] Berriel, R. F., Rossi, F. S., de Souza, A. F., &amp; Oliveira-Santos, T. (2017). Automatic large-scale data acquisition via crowdsourcing for crosswalk classification: A deep learning approach. Computers &amp; Graphics, 68, 32-42.</p>
      <p>[4] Chandan, G., Jain, A., &amp; Jain, H. (2018, July). Real time object detection and tracking using Deep Learning and OpenCV. In 2018 International Conference on Inventive Research in Computing Applications (ICIRCA) (pp. 1305-1308). IEEE.</p>
      <p>[5] Chowdhury, S., &amp; Sinha, P. Real Time Object Detection using Deep Learning: A Webcam Based Approach.</p>
      <p>[6] https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment</p>
      <p>[12] Redmon, J., Divvala, S., Girshick, R., &amp; Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779-788).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Jabnoun</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benzarti</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Amiri</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2015</year>
          , December).
          <article-title>Object detection and identification for blind people in video scenes</article-title>
          .
          <source>In 2015 15th International Conference on Intelligent Systems Design and Applications</source>
          (ISDA) (pp.
          <fpage>363</fpage>
          -
          <lpage>367</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Ju</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>He</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          (
          <year>2019</year>
          ).
          <article-title>A simple and efficient network for small target detection</article-title>
          .
          <source>IEEE Access</source>
          ,
          <volume>7</volume>
          ,
          <fpage>85771</fpage>
          -
          <lpage>85781</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>B. S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>C. C.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Chiang</surname>
            ,
            <given-names>P. Y.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Simple smartphone-based guiding system for visually impaired people</article-title>
          .
          <source>Sensors</source>
          ,
          <volume>17</volume>
          (
          <issue>6</issue>
          ),
          <fpage>1371</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Niu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qian</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rizzo</surname>
            ,
            <given-names>J. R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hudson</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Enright</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Fang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>A wearable assistive technology for the visually impaired with door knob detection and real-time feedback for hand-to-handle manipulation</article-title>
          .
          <source>In Proceedings of the IEEE International Conference on Computer Vision</source>
          Workshops (pp.
          <fpage>1500</fpage>
          -
          <lpage>1508</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Particke</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolbenschlag</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hiller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patiño-Studencki</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Thielecke</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Deep learning for real-time capable object detection and localization on mobile platforms</article-title>
          . IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>