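The work of Shaofu Xu, a doctoral student at the center, titled "An optical tensor core architecture for neural network training based on dual-layer waveguide topology and homodyne detection," was recently accepted for publication in Chinese Optics Letters. The work was supported in part by the National Key Research and Development Program of China (2019YFB2203700) and the National Natural Science Foundation of China (61822508). To address the computing-power bottleneck that neural network training currently faces, we propose a new optical tensor core architecture that can be integrated on a two-dimensional plane. The basic dot-product unit of the architecture performs multiplication through homodyne detection of coherent optical pulses and performs addition in a simple capacitor circuit. The architecture also uses a dual-layer waveguide topology, in which crosstalk and insertion loss at waveguide crossings are kept extremely low, so the original three-dimensional free-space optical structure can be flattened onto a two-dimensional plane, which makes chip integration possible. Proof-of-principle simulations show that the training accuracy achieved with this architecture is nearly identical to that of a 64-bit computer, currently the most precise option. In addition, the ultrafast optical clock can run several times faster than electronic clock rates, which makes the architecture a highly promising way to relieve the computing-power bottleneck in neural network training.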
Abstract: We propose an optical tensor core (OTC) architecture for neural network training. The key computational components of the OTC are the arrayed optical dot-product units (DPUs). The homodyne-detection-based DPUs can conduct the essential computational work of neural network training, i.e., matrix-matrix multiplication. The dual-layer waveguide topology is adopted to feed data into these dot-product units with ultra-low insertion loss and crosstalk. Therefore, the OTC architecture allows a large-scale dot-product array and can be integrated into a photonic chip. The feasibility of the OTC and its effectiveness for neural network training are verified with numerical simulations.
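The dot-product principle described above can be illustrated with a small numerical sketch. This is not the simulation used in the paper. It assumes an idealized, noise-free model in which each vector element is encoded as the real amplitude of one coherent pulse, the two pulse trains interfere on a lossless 50:50 beam splitter, and a capacitor sums the difference photocurrent over the pulse sequence. The function names `homodyne_dot` and `otc_matmul` and the normalization factor are hypothetical and chosen only for illustration.

```python
import numpy as np


def homodyne_dot(a, b):
    """Idealized dot product of two real vectors via balanced homodyne detection.

    Assumption: a[k] and b[k] are the real field amplitudes of the k-th pulse
    in two synchronized coherent pulse trains (no noise, no loss).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Ideal 50:50 beam splitter: sum and difference output ports.
    plus = (a + b) / np.sqrt(2.0)
    minus = (a - b) / np.sqrt(2.0)
    # Each photodetector responds to intensity; the balanced pair subtracts them.
    # plus**2 - minus**2 == 2 * a * b, i.e. the per-pulse product.
    diff_current = plus**2 - minus**2
    # The capacitor accumulates charge over all pulses (the addition step).
    return 0.5 * np.sum(diff_current)


def otc_matmul(A, B):
    """Matrix product computed by an array of dot-product units, one per output."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    out = np.empty((A.shape[0], B.shape[1]))
    for i in range(A.shape[0]):
        for j in range(B.shape[1]):
            # Unit (i, j) receives row i of A and column j of B as pulse trains.
            out[i, j] = homodyne_dot(A[i, :], B[:, j])
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(3, 4))
    B = rng.normal(size=(4, 2))
    # Compare the idealized optical result with ordinary matrix multiplication.
    print(np.allclose(otc_matmul(A, B), A @ B))
```

In this idealized model the result matches ordinary matrix multiplication exactly; a more realistic study would add detector noise, finite extinction ratio, and waveguide loss, none of which are modeled here.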