Yolov4 with Resnet Backbone

shihyung
Nov 5, 2020


* Goal:

The previous post swapped in the most basic VGG as the backbone for practice. This post tries replacing the Yolov4 backbone with ResNet to see what training trends result.

* Backbone replacement

* yaml file changes

The backbone section of the original Yolov4_L yaml file:

The modified Yolov4_L_resnet yaml. number is the number of blocks in the layer; args contains the output channels, stride, groups, width per group, and whether to downsample.
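To make the meaning of number and args concrete, here is a minimal sketch that unpacks one backbone row into named fields. The row layout `[from, number, module, args]` and the sample values are assumptions for illustration (the actual yaml from the post is not reproduced here), and `ResLayerSpec` and `read_res_row` are hypothetical helpers, not part of the repository.

```python
from typing import NamedTuple


class ResLayerSpec(NamedTuple):
    """Named view of one resLayer row (hypothetical helper)."""
    blocks: int           # "number": how many bottleneck blocks
    planes: int           # args[0]: channel output (before the x4 expansion)
    stride: int           # args[1]
    groups: int           # args[2]
    width_per_group: int  # args[3]
    downsample: bool      # args[4]


def read_res_row(row):
    """Unpack an assumed [from, number, module, args] row."""
    _from, number, _module, args = row
    planes, stride, groups, width, downsample = args
    return ResLayerSpec(number, planes, stride, groups, width, bool(downsample))


# Hypothetical row: 3 blocks, 64 planes, stride 1, 1 group, width 64, downsample on.
row = [-1, 3, "resLayer", [64, 1, 1, 64, True]]
spec = read_res_row(row)
print(spec)
```
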

* Code changes

In yolo.py, add the following to parse_model():

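resnet_n = n  # keep the yaml "number" (block count) for resLayer
elif m is resLayer:
    c1 = ch[f if f < 0 else f + 1]  # channels of the layer feeding this one
    c2 = args[0]                    # planes from the yaml args
    args = [c1, c2, resnet_n, *args[1:]]

if m is resLayer:
    m_ = m(*args)
    c2 *= 4  # resBottleneck.expansion: the layer outputs planes * 4 channels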
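The channel bookkeeping in those fragments can be traced without building any modules. The sketch below mirrors the three steps (look up `c1`, rebuild `args`, multiply `c2` by the expansion) in plain Python. `trace_channels`, the simplified `(from, number, args)` rows, and the 64-channel input are assumptions for illustration, and it only handles resLayer rows.

```python
EXPANSION = 4  # resBottleneck.expansion in the post


def trace_channels(rows, in_ch):
    """Return the rebuilt resLayer args and the output channels per row."""
    ch = [in_ch]  # ch[0] is the input, so layer i's output sits at ch[i + 1]
    built = []
    for f, n, args in rows:
        resnet_n = n
        c1 = ch[f if f < 0 else f + 1]
        c2 = args[0]
        built.append([c1, c2, resnet_n, *args[1:]])
        c2 *= EXPANSION  # what the next layer actually receives
        ch.append(c2)
    return built, ch[1:]


# Hypothetical rows: (from, number, [planes, stride, groups, width, downsample])
rows = [
    (-1, 3, [64, 1, 1, 64, True]),
    (-1, 4, [128, 2, 1, 64, True]),
]
built, outs = trace_channels(rows, in_ch=64)
print(built)  # [[64, 64, 3, 1, 1, 64, True], [256, 128, 4, 2, 1, 64, True]]
print(outs)   # [256, 512]
```

Note that the second row receives 256 input channels, not 64, which is why `c2` has to be multiplied by the expansion before it is recorded.
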

Add the following to common.py:

def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    # 3x3 convolution; padding equals dilation so spatial size depends only on stride
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)

def conv1x1(in_planes, out_planes, stride=1):
    # 1x1 convolution
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)

class resBottleneck(nn.Module):
    expansion = 4  # output channels = planes * expansion

    def __init__(self, inplanes, planes, stride=1, groups=1, base_width=64, dilation=1, norm_layer=None, downsample=False):
        super(resBottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        # Both self.conv2 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        if downsample:
            # Project the identity branch so it matches the main branch in channels and stride
            self.downsample = nn.Sequential(
                conv1x1(inplanes, planes * self.expansion, stride),
                nn.BatchNorm2d(planes * self.expansion),
            )
        else:
            self.downsample = None
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out

class resLayer(nn.Module):
    def __init__(self, c1, c2, n=1, s=1, g=1, w=64, downsample=False):  # ch_in, planes, block_nums, stride, groups, width_per_group, downsample
        super(resLayer, self).__init__()
        # Only the first block changes stride/channels and may need the downsample branch
        blocks = [resBottleneck(inplanes=c1, planes=c2, stride=s, groups=g, base_width=w, downsample=downsample)]
        for _ in range(n - 1):
            blocks.append(resBottleneck(inplanes=c2 * resBottleneck.expansion, planes=c2, stride=1, groups=g, base_width=w))
        self.layers = nn.Sequential(*blocks)

    def forward(self, x):
        return self.layers(x)
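As a rough cross-check on how many parameters such a layer contributes, the learnable weights can be counted by hand from the layer definitions above. This is a sketch in plain Python rather than a measurement of the actual model: `bottleneck_params` and `res_layer_params` are hypothetical helpers, and they count only convolution weights (no bias) and the BatchNorm weight and bias.

```python
def bottleneck_params(inplanes, planes, groups=1, base_width=64,
                      downsample=False, expansion=4):
    """Count learnable parameters of one bottleneck block."""
    width = int(planes * (base_width / 64.0)) * groups
    out_ch = planes * expansion
    total = inplanes * width + 2 * width                # conv1 (1x1) + bn1
    total += 9 * width * (width // groups) + 2 * width  # conv2 (3x3, grouped) + bn2
    total += width * out_ch + 2 * out_ch                # conv3 (1x1) + bn3
    if downsample:
        total += inplanes * out_ch + 2 * out_ch         # 1x1 projection + bn
    return total


def res_layer_params(c1, c2, n=1, g=1, w=64, downsample=False):
    """First block takes c1 channels; the remaining n - 1 take c2 * 4."""
    total = bottleneck_params(c1, c2, g, w, downsample)
    total += (n - 1) * bottleneck_params(c2 * 4, c2, g, w, False)
    return total


first_block = bottleneck_params(64, 64, downsample=True)
layer_total = res_layer_params(64, 64, n=3, downsample=True)
print(first_block)  # 75008
print(layer_total)  # 215808
```

Stride does not appear in the count because it changes the output resolution, not the number of weights.
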

* Parameter count changes

Original Yolov4_S:

Original Yolov4_L:

Modified Yolov4_resnet18:

Modified Yolov4_resnet34:

Modified Yolov4_resnet50:

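* Test results

Because the COCO image set is too large, for experimental convenience only the vehicle classes are used here again, names: ['motorcycle', 'car', 'bus', 'truck']. The test results are as follows: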
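For reference, restricting the dataset to those four vehicle classes could look roughly like the sketch below. It assumes YOLO-format label lines (`class x y w h`) and the usual 80-class COCO ordering, where car, motorcycle, bus, and truck are ids 2, 3, 5, and 7. `filter_label_lines` is a hypothetical helper, not the script used for the experiment.

```python
# Assumed ids in the usual 80-class COCO ordering.
COCO_IDS = {"car": 2, "motorcycle": 3, "bus": 5, "truck": 7}
NAMES = ["motorcycle", "car", "bus", "truck"]  # order from the post
REMAP = {COCO_IDS[name]: new_id for new_id, name in enumerate(NAMES)}


def filter_label_lines(lines):
    """Keep only vehicle boxes and renumber their class ids to 0..3."""
    kept = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        cls = int(parts[0])
        if cls in REMAP:
            kept.append(" ".join([str(REMAP[cls]), *parts[1:]]))
    return kept


sample = [
    "2 0.5 0.5 0.2 0.2",  # car, kept as class 1
    "0 0.1 0.1 0.1 0.1",  # person, dropped
    "7 0.3 0.3 0.1 0.1",  # truck, kept as class 3
]
filtered = filter_label_lines(sample)
print(filtered)  # ['1 0.5 0.5 0.2 0.2', '3 0.3 0.3 0.1 0.1']
```
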
