Article metrics

  • citations in SCIndeks: 0
  • citations in CrossRef: 0
  • citations in Google Scholar:
  • visits in the previous 30 days: 4
  • downloads in the previous 30 days: 3
Telfor Journal
2018, vol. 10, no. 1, pp. 44-49
article language: English
article type: unclassified
DOI: 10.5937/telfor1801044E


IP core for efficient zero-run length compression of CNN feature maps
(title not available in Serbian)
a) Univerzitet u Novom Sadu, Fakultet tehničkih nauka
b) Frobas GmbH, Forstern, Germany

e-mail: andrea.erdeljan@uns.ac.rs, bogdan.vukobratovic@gma

Project

Innovative electronic components and systems based on inorganic and organic technologies embedded in consumer goods and products (MPNTR - 32016)

Abstract

(not available in Serbian)
Convolutional Neural Networks (CNNs) are becoming a fundamental tool for machine learning. High performance and energy efficiency are of great importance for deploying CNNs in many embedded applications. Energy consumption during CNN processing is dominated by memory access, and since large networks do not fit in on-chip storage, they require expensive DRAM accesses. This paper introduces a universal Output Stream Manager (OSM) that compresses and formats data coming from a CNN accelerator in order to reduce external memory access. The OSM exploits the sparsity of the data, implements two zero-run length encoding algorithms, and can be easily reconfigured to optimize its use for different CNN layers.
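To make the zero-run length idea concrete, the minimal software sketch below packs a sparse feature-map stream into (zero_run, value) pairs and unpacks it again. The pair format, the MAX_RUN field width, and the length-aware decoder are illustrative assumptions only; they are not the OSM's actual two hardware encoding formats, which are defined in the paper itself.

    # Minimal software model of zero-run length (ZRL) encoding for a sparse
    # feature-map stream. Illustrative sketch: the pair format and MAX_RUN
    # width are assumptions, not the OSM's hardware encoding formats.
    MAX_RUN = 255  # assumed maximum zero-run count that fits in one field

    def zrl_encode(values):
        """Encode a value stream as (zero_run, value) pairs."""
        pairs, run = [], 0
        for v in values:
            if v == 0 and run < MAX_RUN:
                run += 1                  # extend the current run of zeros
            else:
                pairs.append((run, v))    # emit run length plus next value
                run = 0
        if run:                           # flush a trailing run of zeros
            pairs.append((run, 0))
        return pairs

    def zrl_decode(pairs, length):
        """Rebuild the original stream; length trims the trailing flush."""
        values = []
        for run, v in pairs:
            values.extend([0] * run)      # expand the run of zeros
            values.append(v)              # then the stored non-run value
        return values[:length]

    if __name__ == "__main__":
        stream = [0, 0, 7, 0, 0, 0, 3, 0]
        packed = zrl_encode(stream)       # [(2, 7), (3, 3), (1, 0)]
        assert zrl_decode(packed, len(stream)) == stream

In this toy example the eight-element stream is carried by three pairs; the more zeros a feature map contains, the fewer words have to be written to external memory, which is the effect the OSM exploits.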

Keywords

Convolutional Neural Networks; FPGA; computer vision; zero run-length encoding; Zynq-7000; SystemVerilog

References

Chen, Y., Krishna, T., Emer, J.S., Sze, V. (2017) Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks. IEEE Journal of Solid-State Circuits, 52(1): 127-138
Erdeljan, A., Vukobratovic, B., Struharik, R. (2017) IP core for efficient zero-run length compression of CNN feature maps. in: 2017 25th Telecommunication Forum (TELFOR), Institute of Electrical and Electronics Engineers (IEEE), pp. 1-4
Horowitz, M. Energy table for 45nm process. https://sites.google.com/site/seecproject
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H. (2017) MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR, vol. abs/1704.04861
Krizhevsky, A., Sutskever, I., Hinton, G.E. (2017) ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6): 84-90
Liu, Z., Dou, Y., Jiang, J., Xu, J., Li, S., Zhou, Y., Xu, Y. (2017) Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks. ACM Transactions on Reconfigurable Technology and Systems, 10(3): 1-23
Parashar, A., Rhu, M., Mukkara, A., Puglielli, A., Venkatesan, R., Khailany, B., Emer, J., Keckler, S.W., Dally, W.J. (2017) SCNN: An accelerator for compressed-sparse convolutional neural networks. ACM SIGARCH Computer Architecture News, 45(2): 27-40
Peemen, M., Setio, A.A.A., Mesman, B., Corporaal, H. (2013) Memory-centric accelerator design for Convolutional Neural Networks. in: 2013 IEEE 31st International Conference on Computer Design (ICCD), Institute of Electrical and Electronics Engineers (IEEE), pp. 13-19
Qiu, J., Song, S., Wang, Y., Yang, H., Wang, J., Yao, S., Guo, K., Li, B., Zhou, E., Yu, J., Tang, T., Xu, N. (2016) Going Deeper with Embedded FPGA Platform for Convolutional Neural Network. in: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '16, New York, New York, USA: Association for Computing Machinery (ACM), pp. 26-35
Rakanovic, D., Erdeljan, A., Vranjkovic, V., Vukobratovic, B., Teodorovic, P., Struharik, R. (2017) Reducing off-chip memory traffic in deep CNNs using stick buffer cache. in: 2017 25th Telecommunication Forum (TELFOR), Institute of Electrical and Electronics Engineers (IEEE), pp. 1-4
Reagen, B., Whatmough, P., Adolf, R., Rama, S., Lee, H., Lee, S.K., Hernandez-Lobato, J.M., Wei, G., Brooks, D. (2016) Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators. in: 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Institute of Electrical and Electronics Engineers (IEEE), pp. 267-278
Shen, Y., Ferdman, M., Milder, P. (2017) Escher: A CNN Accelerator with Flexible Buffering to Minimize Off-Chip Transfer. in: 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Institute of Electrical and Electronics Engineers (IEEE), pp. 93-100
Simonyan, K., Zisserman, A. (2014) Very deep convolutional networks for large-scale image recognition. CoRR, vol. abs/1409.1556
Han, S., Liu, X., Mao, H., Pu, J., Pedram, A., Horowitz, M., Dally, W.J. (2016) Deep compression and EIE: Efficient inference engine on compressed deep neural network. in: 2016 IEEE Hot Chips 28 Symposium (HCS), Institute of Electrical and Electronics Engineers (IEEE), pp. 1-6
Struharik, R., Vukobratović, B. (2017) Data transfer interface specification. FTN, Internal Report
Struharik, R., Vukobratovic, B. (2017) AIScale - A coarse grained reconfigurable CNN hardware accelerator. in: 2017 IEEE East-West Design & Test Symposium (EWDTS), Institute of Electrical and Electronics Engineers (IEEE), pp. 1-9
Suda, N., Chandra, V., Dasika, G., Mohanty, A., Ma, Y., Vrudhula, S., Seo, J., Cao, Y. (2016) Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks. in: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '16, New York, New York, USA: Association for Computing Machinery (ACM), pp. 16-25
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A. (2015) Going deeper with convolutions. in: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Institute of Electrical and Electronics Engineers (IEEE), pp. 1-9
Zhang, C., Li, P., Sun, G., Guan, Y., Xiao, B., Cong, J. (2015) Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks. in: Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '15, New York, New York, USA: Association for Computing Machinery (ACM), pp. 161-170