* Goal:
The previous post practiced backbone replacement using the most basic VGG. This post replaces the Yolov4 backbone with ResNet to see how training behaves.
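* Backbone replacement
* yaml file changes
The backbone section of the original Yolov4_L yaml file
The modified Yolov4_L_resnet yaml. number is the number of blocks; args holds the output channels (the planes value, before the ×4 bottleneck expansion), stride, groups, width per group, and whether to downsample.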
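To make the args layout concrete, here is a minimal Python sketch of how one backbone row could map onto the resLayer constructor defined later in this post. The row values are illustrative assumptions (they follow the first ResNet-50 stage), not the contents of the actual yaml file, and `describe_row` is a hypothetical helper written only for this example.

```python
# Hypothetical backbone row in the [from, number, module, args] layout.
# The values are illustrative (first ResNet-50 stage), not copied from the real yaml.
row = [-1, 3, "resLayer", [64, 1, 1, 64, True]]

# Order of the args entries as described in the text.
ARG_NAMES = ["planes", "stride", "groups", "width_per_group", "downsample"]


def describe_row(row):
    """Return the yaml row as a dict of named settings."""
    frm, number, module, args = row
    settings = dict(zip(ARG_NAMES, args))
    settings.update({"from": frm, "blocks": number, "module": module})
    # Each bottleneck block expands its output to planes * 4 channels.
    settings["out_channels"] = settings["planes"] * 4
    return settings


info = describe_row(row)
print(info)
```

Note that the first args entry is the bottleneck planes value, so the layer actually emits four times that many channels.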
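* Code changes
In yolo.py, add the following to parse_model():
resnet_n = n  # keep the block count so it can be passed to resLayer

elif m is resLayer:
    c1 = ch[f if f < 0 else f + 1]  # input channels, taken from layer f
    c2 = args[0]  # planes requested in the yaml
    args = [c1, c2, resnet_n, *args[1:]]

if m is resLayer:
    m_ = m(*args)  # resLayer builds its own n blocks, so instantiate it once
    c2 *= 4  # resBottleneck.expansion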
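The channel bookkeeping above can be checked in isolation. The following is a minimal sketch, assuming `ch` holds the input channel count at index 0 followed by each layer's output channels (which is what the `f+1` offset suggests). `reslayer_args` is a hypothetical helper written for illustration, not a function from yolo.py, and the yaml args are made-up values.

```python
EXPANSION = 4  # mirrors resBottleneck.expansion


def reslayer_args(ch, f, n, args):
    """Rewrite yaml args for resLayer and return (new_args, output_channels)."""
    c1 = ch[f if f < 0 else f + 1]  # input channels come from layer f
    c2 = args[0]                    # planes requested in the yaml
    new_args = [c1, c2, n, *args[1:]]
    return new_args, c2 * EXPANSION


# Assumption: ch[0] is the image channel count, later entries are per-layer outputs.
ch = [3, 64]

# First stage: 3 blocks, planes=64, stride=1.
new_args, c_out = reslayer_args(ch, f=-1, n=3, args=[64, 1, 1, 64, True])
ch.append(c_out)

# Second stage: 4 blocks, planes=128, stride=2; its input is the expanded output above.
new_args2, c_out2 = reslayer_args(ch, f=-1, n=4, args=[128, 2, 1, 64, True])
ch.append(c_out2)

print(new_args, new_args2, ch)
```

The point of multiplying by the expansion is that the next layer must see the expanded channel count, not the planes value written in the yaml.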
Add the following to common.py:
# Relies on the `import torch.nn as nn` already at the top of common.py.
def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3 convolution with padding, no bias."""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)


def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution, no bias."""
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)
class resBottleneck(nn.Module):
    expansion = 4  # output channels = planes * expansion

    def __init__(self, inplanes, planes, stride=1, groups=1, base_width=64,
                 dilation=1, norm_layer=None, downsample=False):
        super(resBottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Both self.conv2 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        if downsample:
            # Project the shortcut so it matches the main branch in channels and stride
            self.downsample = nn.Sequential(
                conv1x1(inplanes, planes * self.expansion, stride),
                norm_layer(planes * self.expansion),
            )
        else:
            self.downsample = None
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity  # residual connection
        out = self.relu(out)
        return out
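class resLayer(nn.Module):
    # c1: input channels, c2: planes, n: number of blocks, s: stride,
    # g: groups, w: width per group, downsample: project the shortcut of the first block
    def __init__(self, c1, c2, n=1, s=1, g=1, w=64, downsample=False):
        super(resLayer, self).__init__()
        # Only the first block changes channels or stride; downsample must be True
        # when c1 != c2 * 4 or s != 1, otherwise the residual add fails on a shape mismatch.
        blocks = [resBottleneck(inplanes=c1, planes=c2, stride=s, groups=g,
                                base_width=w, downsample=downsample)]
        for _ in range(n - 1):
            blocks.append(resBottleneck(inplanes=c2 * resBottleneck.expansion, planes=c2,
                                        stride=1, groups=g, base_width=w))
        self.layers = nn.Sequential(*blocks)

    def forward(self, x):
        return self.layers(x)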
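As a rough sanity check on model size, the parameter count of these blocks can be worked out by hand. The sketch below is an illustrative calculation, not part of the original code: `bottleneck_params` and `reslayer_params` are hypothetical helpers that count the bias-free conv weights plus the BatchNorm scale and shift terms, mirroring the layer shapes defined in resBottleneck. The example arguments follow the first ResNet-50 stage and are assumptions, not values from the yaml.

```python
def bottleneck_params(inplanes, planes, groups=1, base_width=64, downsample=False):
    """Count learnable parameters of one resBottleneck (bias-free convs + BatchNorm)."""
    expansion = 4
    width = int(planes * (base_width / 64.0)) * groups
    out = planes * expansion
    total = inplanes * width + 2 * width                 # conv1 (1x1) + bn1
    total += (width // groups) * width * 9 + 2 * width   # conv2 (3x3, grouped) + bn2
    total += width * out + 2 * out                       # conv3 (1x1) + bn3
    if downsample:
        total += inplanes * out + 2 * out                # shortcut 1x1 conv + BatchNorm
    return total


def reslayer_params(c1, c2, n=1, g=1, w=64, downsample=False):
    """Sum over the n blocks of a resLayer; only the first block may project the shortcut."""
    total = bottleneck_params(c1, c2, g, w, downsample)
    total += (n - 1) * bottleneck_params(c2 * 4, c2, g, w)
    return total


first_block = bottleneck_params(64, 64, downsample=True)
stage = reslayer_params(64, 64, n=3, downsample=True)
print(first_block)  # 75008
print(stage)        # 215808
```

Stride does not appear in the calculation because it changes the feature map size, not the number of weights.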
* Parameter count changes
Original Yolov4_S:
Original Yolov4_L:
Modified Yolov4_resnet18:
Modified Yolov4_resnet34:
Modified Yolov4_resnet50:
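* Test results
Because the COCO dataset contains too many images, for experimental convenience only its vehicle classes are used here again, names: ['motorcycle', 'car', 'bus', 'truck']. The test results are as follows: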