Difference between revisions of "Smartphone Facial Recognition"

From Psyc 40 Wiki
Revision as of 23:32, 21 October 2022

By Kenneth Wu

Note: This page is incomplete.

Facial recognition systems are computer programs that match faces against a database. Although the task is trivial for humans, computers have struggled to perform it with high accuracy until recently.[1] Deep learning through the use of convolutional neural networks currently dominates the facial recognition field.[2] However, deep learning uses much more memory, disk storage, and computational resources than traditional computer vision, which presents significant challenges for facial recognition on the limited hardware of smartphones.[3] Accordingly, smartphone manufacturers have turned to processors with dedicated neural engines for deep learning tasks[4] and to simpler, more compact models that mimic the behavior of more complex models.[3]

Model

Facial recognition systems accomplish their tasks by detecting the presence of a face, analyzing its features, and confirming the identity of the person.[5] Training data is fed into a face detection algorithm; the two most popular approaches are the Viola-Jones algorithm and convolutional neural networks.[6] The Viola-Jones algorithm was the first real-time object detection framework. It works by converting images to grayscale and looking for edges that signify the presence of human facial features.[7] While it is highly accurate at detecting well-lit, front-facing faces and requires relatively little memory, it is slower than deep-learning-based methods, including the now industry-standard convolutional neural networks (CNNs).[6]
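The grayscale-and-edges idea can be illustrated with a small sketch. It is not the full Viola-Jones detector (there is no boosting and no classifier cascade); it only shows how a single two-rectangle edge feature might be computed with an integral image, the summed-area table that Viola and Jones use to make rectangle sums cheap. The function names, the luminance weights, and the toy image are assumptions made for illustration.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of luminance values.

    The 0.299/0.587/0.114 weights are a common convention, assumed here.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def integral_image(gray):
    """Return a summed-area table with one extra leading row and column of zeros.

    ii[y][x] holds the sum of all pixels above and to the left of (y, x).
    """
    height, width = len(gray), len(gray[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def vertical_edge_feature(ii, top, left, height, width):
    """Two-rectangle feature: right half minus left half of the window.

    A large absolute value suggests a vertical light/dark edge in the window.
    """
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return right_sum - left_sum


if __name__ == "__main__":
    # Toy 4 x 4 grayscale image: dark left half, bright right half.
    gray = [[0, 0, 255, 255] for _ in range(4)]
    ii = integral_image(gray)
    print(vertical_edge_feature(ii, 0, 0, 4, 4))
```

A real detector evaluates many such features at many positions and scales and combines them with a trained classifier; the sketch stops at the feature itself.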

Convolutional neural networks are closely related to traditional artificial neural networks (ANNs). Their architecture is sparse, topographic, and feed-forward,[8] featuring an input layer and an output layer along with three types of hidden layers.[9] Unlike traditional ANNs, CNNs arrange the neurons in each layer in three dimensions (width, height, and depth), and each neuron connects only to a small subset of the preceding layer.[10] The first hidden layer type is convolutional: a filter of size n x n, whose values are learned during training, sweeps across a larger matrix at a fixed stride, and the dot product computed at each position is written to an activation map.[10] This presents a significant advantage over ANNs by greatly reducing the number of parameters that must be stored.[10] Because convolution is a linear operation, another step is needed to introduce non-linearity; the most popular is the rectified linear unit (ReLU), which replaces negative values with zero.[9] The next hidden layer type is pooling, which further reduces the size of the activation map and the computational power required.[11] The most common pooling method, max pooling, slides a window (usually 2 x 2 with a stride of 2) over the activation map and keeps only the largest value in each window for the next activation map.[11]
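The three steps described above (convolution, ReLU, and max pooling) can be sketched in plain Python. This is a minimal illustration on a single-channel matrix, not how any production framework implements these layers; the function names and the example values are assumptions, and the filter is applied without flipping, which is the usual convention in CNN libraries.

```python
def convolve2d(image, kernel, stride=1):
    """Slide an n x n kernel over the image and record each dot product."""
    n = len(kernel)
    out_h = (len(image) - n) // stride + 1
    out_w = (len(image[0]) - n) // stride + 1
    activation_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            y, x = i * stride, j * stride
            # Dot product of the kernel with the patch under it.
            row.append(sum(image[y + a][x + b] * kernel[a][b]
                           for a in range(n) for b in range(n)))
        activation_map.append(row)
    return activation_map


def relu(activation_map):
    """Replace every negative value with zero."""
    return [[max(0, value) for value in row] for row in activation_map]


def max_pool(activation_map, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    out_h = (len(activation_map) - size) // stride + 1
    out_w = (len(activation_map[0]) - size) // stride + 1
    return [[max(activation_map[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


if __name__ == "__main__":
    # Hypothetical 5 x 5 input and 2 x 2 filter, chosen only for illustration.
    image = [[1, 2, 0, 1, 3],
             [0, 1, 2, 3, 1],
             [2, 0, 1, 0, 2],
             [1, 3, 0, 2, 0],
             [0, 1, 2, 1, 1]]
    kernel = [[1, -1],
              [-1, 1]]
    pooled = max_pool(relu(convolve2d(image, kernel)))
    print(pooled)
```

With a 5 x 5 input and a 2 x 2 filter at stride 1, the convolution yields a 4 x 4 activation map, and 2 x 2 pooling with stride 2 halves each side to 2 x 2, which shows how each stage shrinks the data passed to the next layer.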

History

Applications

References

  1. Brownlee, J. (2019, July 5). A gentle introduction to deep learning for face recognition. Machine Learning Mastery. Retrieved November 14, 2022, from https://machinelearningmastery.com/introduction-to-deep-learning-for-face-recognition/
  2. Almabdy, S., & Elrefaei, L. (2019). Deep convolutional neural network-based approaches for face recognition. Applied Sciences, 9(20), 4397. https://doi.org/10.3390/app9204397
  3. Computer Vision Machine Learning Team. (2017, November). An on-device deep neural network for face detection. Apple Machine Learning Research. Retrieved November 14, 2022, from https://machinelearning.apple.com/research/face-detection#1
  4. Samsung. (2018). Exynos 9810: Mobile Processor. Samsung Semiconductor Global. Retrieved November 14, 2022, from https://semiconductor.samsung.com/processor/mobile-processor/exynos-9-series-9810/
  5. Klosowski, T. (2020, July 15). Facial recognition is everywhere. here's what we can do about it. The New York Times. Retrieved November 14, 2022, from https://www.nytimes.com/wirecutter/blog/how-facial-recognition-works/
  6. Enriquez, K. (2018, May 15). (thesis). Faster face detection using Convolutional Neural Networks & the Viola-Jones algorithm. California State University Stanislaus. Retrieved November 14, 2022, from https://www.csustan.edu/sites/default/files/groups/University%20Honors%20Program/Journals/01_enriquez.pdf.
  7. Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of Simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001. https://doi.org/10.1109/cvpr.2001.990517
  8. Gurucharan, M. (2022, July 28). Basic CNN architecture: Explaining 5 layers of Convolutional Neural Network. upGrad. Retrieved November 14, 2022, from https://www.upgrad.com/blog/basic-cnn-architecture/#:~:text=other%20advanced%20tasks.-,What%20is%20the%20architecture%20of%20CNN%3F,the%20main%20responsibility%20for%20computation.
  9. Mishra, M. (2020, August 26). Convolutional neural networks, explained. Towards Data Science. Retrieved November 14, 2022, from https://towardsdatascience.com/convolutional-neural-networks-explained-9cc5188c4939
  10. O'Shea, K., & Nash, R. (2015). An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458.
  11. Saha, S. (2018, December 15). A comprehensive guide to convolutional neural network. Towards Data Science. Retrieved November 14, 2022, from https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53