Catalog
  1. Recent Developments in CNNs
    1.1. AlexNet
    1.2. VGG
    1.3. GoogLeNet
  2. LeNet5
    2.1. lenet5.py
    2.2. main.py
  3. ResNet
    3.1. ResNet.py
    3.2. main.py

    Convolutional Neural Networks in Practice: LeNet5 & ResNet

    Recent Developments in CNNs

    CNN stands for Convolutional Neural Network.

    Convolutional Neural Networks (CNNs) are a class of feedforward neural networks that contain convolution operations and have a deep structure; they are one of the representative algorithms of deep learning. CNNs have representation-learning ability and can classify inputs in a shift-invariant way according to their hierarchical structure, which is why they are also called Shift-Invariant Artificial Neural Networks (SIANN).

    Research on CNNs began in the 1980s and 1990s; time-delay networks and LeNet-5 were among the earliest CNNs. In the 21st century, with the development of deep learning theory and improvements in numerical computing hardware, CNNs advanced rapidly and were applied to computer vision, natural language processing, and other fields.

    Before 2012, error rates on ImageNet were relatively high. Then, in 2012, AlexNet appeared and cut the error rate by nearly 10 percentage points, setting off a wave of deep learning.

    AlexNet

    The figure above shows the architecture of AlexNet. It consists of eight layers and extracts features with large 11*11 kernels; it was an excellent model at the time, although by today's standards that granularity is too coarse. Note that the most advanced GPUs of the day had only 3 GB of memory, so the model was split across two GPUs.

    VGG

    This is VGG, the runner-up of ILSVRC 2014. Compared with AlexNet in 2012, the error dropped by nearly another 10 percentage points. The network grew deeper again, from AlexNet's 8 layers to 11-19 layers, and the kernels became finer as well.

    GoogLeNet

    The figure above shows GoogLeNet, the 2014 champion; the capital L commemorates LeNet5, proposed by Yann LeCun in 1998. The network is 22 layers deep. At the time, people wondered whether adding more layers always makes training better. After a large number of experiments it turned out that this is not the case: more layers are not always better. So in 2015 the Chinese researcher Kaiming He proposed a network model called the residual network, ResNet.

    The innovation of this model is a concept similar to a "short circuit" in an electric circuit: layers whose effect on training is negative can effectively be skipped, so the network keeps its depth while still achieving a low error. A minimal sketch of the idea follows.

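    Concretely, a residual block computes out = ReLU(F(x) + x): it learns a residual F(x) and adds the input back through the shortcut, so a block that would hurt training can fall back to (approximately) the identity mapping. The toy block below illustrates just this idea; it is a sketch, not the full ResBlk defined later.

    import torch
    from torch import nn
    from torch.nn import functional as F

    class TinyResBlock(nn.Module):
        """Toy residual block: input and output channels match, so the shortcut is the raw input."""
        def __init__(self, ch):
            super().__init__()
            self.conv = nn.Conv2d(ch, ch, kernel_size=3, padding=1)  # preserves the spatial size

        def forward(self, x):
            return F.relu(self.conv(x) + x)  # F(x) + x, then the activation

    # quick check: the output shape equals the input shape
    print(TinyResBlock(8)(torch.randn(1, 8, 16, 16)).shape)  # torch.Size([1, 8, 16, 16])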

    Today we will mainly cover LeNet5 and ResNet.

    LeNet5

    LeNet5 was proposed in 1998 by Yann LeCun and his team, and it was successfully applied to the handwritten digit recognition problems of the time.

    The network consists of 2 convolutional layers and 3 fully connected layers, 5 layers in total. The input image size is 32*32; the short trace below shows how that shrinks before reaching the fully connected layers.

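    As a sanity check on those numbers, here is a small shape trace of the convolutional part (the channel counts follow the lenet5.py code below; both convolutions use 5*5 kernels with no padding, and each pooling halves the spatial size):

    import torch
    from torch import nn

    conv = nn.Sequential(
        nn.Conv2d(3, 6, kernel_size=5),   # [b, 3, 32, 32] -> [b, 6, 28, 28]
        nn.MaxPool2d(2, 2),               # -> [b, 6, 14, 14]
        nn.Conv2d(6, 16, kernel_size=5),  # -> [b, 16, 10, 10]
        nn.MaxPool2d(2, 2),               # -> [b, 16, 5, 5]
    )
    print(conv(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 5, 5])

    This is why the first fully connected layer below takes 16*5*5 = 400 input features.
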
    lenet5.py

    import torch
    from torch import nn


    class Lenet5(nn.Module):

        def __init__(self):
            super(Lenet5, self).__init__()
            self.conv_unit = nn.Sequential(
                # x: [b, 3, 32, 32]
                nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5, stride=1, padding=0),  # convolution layer
                nn.MaxPool2d(kernel_size=2, stride=2, padding=0),  # pooling compresses features: it shrinks height/width but keeps the channels
                nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5, stride=1, padding=0),
                nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
            )
            self.fc_unit = nn.Sequential(  # three fully connected layers
                nn.Linear(16 * 5 * 5, 120),
                nn.ReLU(),
                nn.Linear(120, 84),
                nn.ReLU(),
                nn.Linear(84, 10)
            )
            # [b, 3, 32, 32]
            tmp = torch.randn(2, 3, 32, 32)
            out = self.conv_unit(tmp)
            # [b, 16, 5, 5]
            print('conv_out:', out.shape)

        def forward(self, x):
            batchsz = x.size(0)
            # [b, 3, 32, 32] => [b, 16, 5, 5]
            x = self.conv_unit(x)
            # [b, 16, 5, 5] => [b, 16*5*5]
            x = x.view(batchsz, 16 * 5 * 5)  # flatten
            # [b, 16*5*5] => [b, 10]
            logits = self.fc_unit(x)
            return logits


    def main():
        net = Lenet5()
        tmp = torch.randn(2, 3, 32, 32)
        out = net(tmp)
        print('lenet out:', out.shape)


    if __name__ == '__main__':
        main()

    main.py

    import torch
    from torchvision import datasets
    from torchvision import transforms
    from torch.utils.data import DataLoader
    from torch import nn, optim

    from lenet5 import Lenet5


    def main():
        batchsz = 128  # choose the batch size according to your machine; avoid making it too small,
                       # because updates follow the average gradient direction and tiny batches are too noisy
        # create a 'cifar' folder, train=True, apply transforms
        cifar_train = datasets.CIFAR10('cifar', True, transform=transforms.Compose([
            transforms.Resize((32, 32)),  # resize to the required size
            transforms.ToTensor()
        ]), download=True)  # download and load the CIFAR10 training set

        cifar_train = DataLoader(cifar_train, batch_size=batchsz, shuffle=True)

        cifar_test = datasets.CIFAR10('cifar', False, transform=transforms.Compose([
            transforms.Resize((32, 32)),  # resize to the required size
            transforms.ToTensor()
        ]), download=True)

        cifar_test = DataLoader(cifar_test, batch_size=batchsz, shuffle=True)

        x, label = next(iter(cifar_train))
        print('x:', x.shape, 'label:', label.shape)

        device = torch.device('cuda')  # train on the GPU
        model = Lenet5().to(device)
        criteon = nn.CrossEntropyLoss().to(device)
        optimizer = optim.Adam(model.parameters(), lr=1e-3)
        print(model)

        for epoch in range(100):
            model.train()
            for batchidx, (x, label) in enumerate(cifar_train):
                # x: [b, 3, 32, 32]
                # label: [b]
                x, label = x.to(device), label.to(device)
                logits = model(x)
                # logits: [b, 10]
                # label: [b]
                # loss: tensor scalar
                loss = criteon(logits, label)

                # backpropagate
                optimizer.zero_grad()  # remember to zero the gradients before each step
                loss.backward()
                optimizer.step()

            print(epoch, "single_loss:", loss.item())  # loss of the last batch in this epoch

            model.eval()  # evaluate on the test set
            with torch.no_grad():  # no gradients are needed for evaluation
                # test
                total_correct = 0
                total_num = 0
                for x, label in cifar_test:
                    # [b, 3, 32, 32]
                    # [b]
                    x, label = x.to(device), label.to(device)

                    # [b, 10]
                    logits = model(x)
                    # [b]
                    pred = logits.argmax(dim=1)  # max returns the maximum value, argmax returns its index
                    # [b] vs [b] => scalar tensor
                    correct = torch.eq(pred, label).float().sum().item()
                    total_correct += correct
                    total_num += x.size(0)

                acc = total_correct / total_num
                print(epoch, 'test acc:', acc)  # average accuracy over the test set


    if __name__ == '__main__':
        main()

    You can see that the loss starts to oscillate noticeably once the epoch count passes 50 or so.

    I trained for only 100 epochs, and the accuracy reached roughly 97%.

    ResNet

    This "shortcut" operation is what lets ResNet keep its depth while maintaining accuracy. ResNet was proposed by Kaiming He at ILSVRC 2015, where it won the competition, and the paper received the CVPR Best Paper Award in 2016.

    ResNet.py

    import torch
    from torch import nn
    from torch.nn import functional as F


    class ResBlk(nn.Module):
        """
        resnet block
        """

        def __init__(self, ch_in, ch_out, stride=1):
            """
            :param ch_in: number of input channels
            :param ch_out: number of output channels
            """
            super(ResBlk, self).__init__()

            self.conv1 = nn.Conv2d(ch_in, ch_out, kernel_size=3, stride=stride, padding=1)
            self.bn1 = nn.BatchNorm2d(ch_out)  # batch normalization makes training faster and more stable
            self.conv2 = nn.Conv2d(ch_out, ch_out, kernel_size=3, stride=1, padding=1)
            self.bn2 = nn.BatchNorm2d(ch_out)

            self.extra = nn.Sequential()  # initialize self.extra as the identity
            if ch_out != ch_in:  # if the input and output channels do not match, use a 1x1 conv to align them
                # [b, ch_in, h, w] => [b, ch_out, h, w]
                self.extra = nn.Sequential(
                    nn.Conv2d(ch_in, ch_out, kernel_size=1, stride=stride),
                    nn.BatchNorm2d(ch_out)
                )

        def forward(self, x):
            """
            :param x: [b, ch, h, w]
            :return:
            """
            out = F.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))  # the activation here is optional
            # short cut.
            # extra module: [b, ch_in, h, w] => [b, ch_out, h, w]
            # element-wise add:
            out = self.extra(x) + out  # the dimensions must match for the add; this is the F(x) + x in the figure
            out = F.relu(out)

            return out


    class ResNet18(nn.Module):

        def __init__(self):
            super(ResNet18, self).__init__()

            self.conv1 = nn.Sequential(
                nn.Conv2d(3, 64, kernel_size=3, stride=3, padding=0),
                nn.BatchNorm2d(64)
            )
            # followed by 4 blocks
            # [b, 64, h, w] => [b, 128, h, w]
            self.blk1 = ResBlk(64, 128, stride=2)
            # [b, 128, h, w] => [b, 256, h, w]
            self.blk2 = ResBlk(128, 256, stride=2)
            # [b, 256, h, w] => [b, 512, h, w]
            self.blk3 = ResBlk(256, 512, stride=2)
            # [b, 512, h, w] => [b, 512, h, w]
            self.blk4 = ResBlk(512, 512, stride=2)

            self.outlayer = nn.Linear(512 * 1 * 1, 10)

        def forward(self, x):
            """
            :param x: [b, 3, h, w]
            :return:
            """
            x = F.relu(self.conv1(x))
            # [b, 64, h, w] => [b, 512, h, w] through the four blocks
            x = self.blk1(x)
            x = self.blk2(x)
            x = self.blk3(x)
            x = self.blk4(x)
            # print('after conv:', x.shape)  # [b, 512, 2, 2]
            # [b, 512, h, w] => [b, 512, 1, 1]
            x = F.adaptive_avg_pool2d(x, [1, 1])
            # print('after pool:', x.shape)
            x = x.view(x.size(0), -1)
            x = self.outlayer(x)
            return x


    def main():
        blk = ResBlk(64, 128, stride=4)
        tmp = torch.randn(2, 64, 32, 32)
        out = blk(tmp)
        print('block:', out.shape)
        x = torch.randn(2, 3, 32, 32)
        model = ResNet18()
        out = model(x)
        print('resnet:', out.shape)


    if __name__ == '__main__':
        main()
    main.py

    import torch
    from torch.utils.data import DataLoader
    from torchvision import datasets
    from torchvision import transforms
    from torch import nn, optim

    from lenet5 import Lenet5
    from ResNet import ResNet18


    def main():
        batchsz = 128

        cifar_train = datasets.CIFAR10('cifar', True, transform=transforms.Compose([
            transforms.Resize((32, 32)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],  # per-channel RGB mean
                                 std=[0.229, 0.224, 0.225])   # per-channel RGB standard deviation
        ]), download=True)
        cifar_train = DataLoader(cifar_train, batch_size=batchsz, shuffle=True)

        cifar_test = datasets.CIFAR10('cifar', False, transform=transforms.Compose([
            transforms.Resize((32, 32)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
        ]), download=True)
        cifar_test = DataLoader(cifar_test, batch_size=batchsz, shuffle=True)

        x, label = next(iter(cifar_train))
        print('x:', x.shape, 'label:', label.shape)

        device = torch.device('cuda')
        # model = Lenet5().to(device)
        model = ResNet18().to(device)

        criteon = nn.CrossEntropyLoss().to(device)
        optimizer = optim.Adam(model.parameters(), lr=1e-3)
        print(model)

        for epoch in range(1000):

            model.train()
            for batchidx, (x, label) in enumerate(cifar_train):
                # [b, 3, 32, 32]
                # [b]
                x, label = x.to(device), label.to(device)

                logits = model(x)
                # logits: [b, 10]
                # label: [b]
                # loss: tensor scalar
                loss = criteon(logits, label)

                # backprop
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

            print(epoch, 'loss:', loss.item())

            model.eval()
            with torch.no_grad():
                # test
                total_correct = 0
                total_num = 0
                for x, label in cifar_test:
                    # [b, 3, 32, 32]
                    # [b]
                    x, label = x.to(device), label.to(device)

                    # [b, 10]
                    logits = model(x)
                    # [b]
                    pred = logits.argmax(dim=1)
                    # [b] vs [b] => scalar tensor
                    correct = torch.eq(pred, label).float().sum().item()
                    total_correct += correct
                    total_num += x.size(0)
                    # print(correct)

                acc = total_correct / total_num
                print(epoch, 'test acc:', acc)


    if __name__ == '__main__':
        main()

    You can see that the main file for ResNet is almost the same as the one for LeNet5, which means one training template can be reused across different models. In practice you can also add some engineering tricks such as data augmentation; for image rotation the angle should not be too large, with -15° to 15° working well, since experiments show that larger angles hurt performance. A sketch of such an augmentation pipeline is given below.

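    For example, here is a minimal sketch of that augmentation using standard torchvision transforms (the ±15° rotation follows the text; the horizontal flip and the normalization statistics are common choices added for illustration, not prescribed above):

    from torchvision import transforms

    train_transform = transforms.Compose([
        transforms.Resize((32, 32)),
        transforms.RandomHorizontalFlip(),   # optional extra augmentation
        transforms.RandomRotation(15),       # rotate within [-15°, +15°]
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    This would replace only the training-set transform; the test set should keep the deterministic Resize/ToTensor/Normalize pipeline.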

    Author: superzhaoyang
    Link: http://yoursite.com/2020/03/15/卷积神经网络实战之Lenet5-Resnet/
    Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stated otherwise.