[Academic Talk] Occam's Razor in neural network

Published: 2021-10-09

Time: 2021-10-22, 16:30

Title: Occam's Razor in neural network

Speaker: Associate Professor Zhiqin Xu (许志钦), Shanghai Jiao Tong University

Abstract:

I will demonstrate, from two viewpoints, that a neural network (NN) fits its training data with as simple a function as it can, which resembles an implicit Occam's Razor. First, the NN output often follows a frequency principle: during training, the network learns the low-frequency components of the data before the high-frequency ones. The frequency principle qualitatively explains various phenomena observed when NNs are used in applications. Second, when the NN weights are initialized small, they condense onto a few isolated directions, so the effective size of the NN is much smaller than its actual size. In other words, the network finds a simple representation of the training data.
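One way to make the frequency principle concrete is to compare a network's output with the target function frequency by frequency. The sketch below is only an illustration and is not taken from the talk: it trains no network, and `early_output` is a hypothetical stand-in for what a network might produce early in training if it had captured only the low-frequency part of the target. The helper names `dft_coefficient` and `relative_error` are assumptions introduced here.

```python
import cmath
import math


def dft_coefficient(samples, k):
    """Return the k-th discrete Fourier coefficient of uniformly spaced samples."""
    n = len(samples)
    return sum(
        s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)
    ) / n


def relative_error(target, output, k):
    """Relative error of `output` against `target` at frequency index k."""
    t_k = dft_coefficient(target, k)
    o_k = dft_coefficient(output, k)
    return abs(t_k - o_k) / abs(t_k)


n = 64
xs = [2 * math.pi * i / n for i in range(n)]

# Target with one low-frequency (k = 1) and one high-frequency (k = 5) component.
target = [math.sin(x) + math.sin(5 * x) for x in xs]

# Hypothetical stand-in for an early network output: only the low frequency is fitted.
early_output = [math.sin(x) for x in xs]

low_err = relative_error(target, early_output, 1)
high_err = relative_error(target, early_output, 5)
print(f"relative error at k=1: {low_err:.3e}")
print(f"relative error at k=5: {high_err:.3e}")
```

With this stand-in, the relative error is close to 0 at k = 1 and close to 1 at k = 5. In an actual experiment, one would evaluate the same two quantities on the real network output at successive training steps and check whether the low-frequency error falls first.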

Speaker biography:

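Zhiqin Xu is a tenure-track associate professor at the Institute of Natural Sciences and the School of Mathematical Sciences, Shanghai Jiao Tong University. He received his bachelor's degree from Zhiyuan College, Shanghai Jiao Tong University, in 2012 and his Ph.D. in applied mathematics from Shanghai Jiao Tong University in 2016. From 2016 to 2019 he was a postdoctoral researcher at New York University Abu Dhabi and the Courant Institute. His main research interests are machine learning and computational neuroscience. He has published papers in journals and conferences including the Journal of Machine Learning Research, NeurIPS (Spotlight), AAAI, Communications in Computational Physics, European Journal of Neuroscience, and Communications in Mathematical Sciences.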
