PyTorch: loading each layer's saved weight and bias parameters into specified layers

Goal: I trained and saved a model, then wrote a second model whose architecture is similar but not identical. I want to reuse the trained parameters, so I want to load each layer's saved weight and bias into the specified layers one by one. I looked at how Module loads parameters; the main code is:
 
    def load_state_dict(self, state_dict, strict=True):
        missing_keys = []
        unexpected_keys = []
        error_msgs = []

        # copy state_dict so _load_from_state_dict can modify it
        metadata = getattr(state_dict, '_metadata', None)
        state_dict = state_dict.copy()
        if metadata is not None:
            state_dict._metadata = metadata

        def load(module, prefix=''):
            local_metadata = {} if metadata is None else metadata.get(prefix[:-1], {})
            module._load_from_state_dict(
                state_dict, prefix, local_metadata, strict,
                missing_keys, unexpected_keys, error_msgs)
            for name, child in module._modules.items():
                if child is not None:
                    load(child, prefix + name + '.')

        load(self)
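For the stated goal of reusing parameters across a similar but different architecture, a common pattern built on top of load_state_dict is to filter the checkpoint down to the entries whose names and shapes match the new model before loading. A minimal sketch (not from the original post; load_matching_para is a hypothetical helper):

    def load_matching_para(model, para_dict):
        own_state = model.state_dict()
        # keep only checkpoint entries whose name and shape both match the new model
        filtered = {k: v for k, v in para_dict.items()
                    if k in own_state and v.shape == own_state[k].shape}
        own_state.update(filtered)
        model.load_state_dict(own_state)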

Loading the parameters then calls the following (only the main code is copied; some of the exception handling is omitted):
    def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,
                              missing_keys, unexpected_keys, error_msgs):
        for hook in self._load_state_dict_pre_hooks.values():
            hook(state_dict, prefix, local_metadata, strict,
                 missing_keys, unexpected_keys, error_msgs)

        local_name_params = itertools.chain(self._parameters.items(), self._buffers.items())
        local_state = {k: v.data for k, v in local_name_params if v is not None}

        for name, param in local_state.items():
            key = prefix + name
            if key in state_dict:
                input_param = state_dict[key]

                # Backward compatibility: loading 1-dim tensor from 0.3.* to version 0.4+
                if len(param.shape) == 0 and len(input_param.shape) == 1:
                    input_param = input_param[0]

                if input_param.shape != param.shape:
                    # local shape should match the one in checkpoint
                    error_msgs.append('size mismatch for {}: copying a param with shape {} from checkpoint, '
                                      'the shape in current model is {}.'
                                      .format(key, input_param.shape, param.shape))
                    continue

                if isinstance(input_param, Parameter):
                    # backwards compatibility for serialized parameters
                    input_param = input_param.data
                try:
                    param.copy_(input_param)
                except Exception:
                    error_msgs.append('While copying the parameter named "{}", '
                                      'whose dimensions in the model are {} and '
                                      'whose dimensions in the checkpoint are {}.'
                                      .format(key, param.size(), input_param.size()))
            elif strict:
                missing_keys.append(key)
I saw that the core of this is copying the data of each model parameter (a tensor), for example a convolution layer's weight, from the saved parameters, so I wrote a function following the same logic. To make testing easy, the model I load parameters into is identical to the saved model; using the official method, it runs correctly and the evaluation result is correct.
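Note that local_name_params in the code above chains self._parameters with self._buffers, so buffers such as BatchNorm's running_mean and running_var are copied along with the trainable parameters. As a minimal sketch of the copy itself (hypothetical layer names, not the original model):

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Conv2d(3, 16, 3))
    saved_state = model.state_dict()  # stands in for a loaded checkpoint

    # the same in-place copy that _load_from_state_dict performs
    model[0].weight.data.copy_(saved_state['0.weight'])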
The function I wrote is as follows:
def load_para(model, ckpt_file, load_to_cpu=True):
    map_location = (lambda storage, loc: storage) if load_to_cpu else None
    ckpt = torch.load(ckpt_file, map_location=map_location)  # the file stores the model parameters, optimizer state, etc.
    para_dict = ckpt['state_dicts'][0]  # the saved model's parameter dict

    for n, p in model.named_parameters():  # iterate over the new model's layer names and parameter objects
        ip = para_dict[n]  # look up the saved parameter with the same name
        if p.shape == ip.shape:  # check that the shapes match
            p.data.copy_(ip.data)  # copy the saved parameter into the model
        else:
            print('{} -shape {} ,{}'.format(n, p.shape, ip.shape))  # print the new and saved parameter shapes
    print(judge_equal_para(model, ckpt_file, load_to_cpu))  # check that the model parameters match the checkpoint file
After loading the parameters, I also checked whether the model's parameters are consistent with the checkpoint file:
def judge_equal_para(model, ckpt_file, load_to_cpu=True):
    map_location = (lambda storage, loc: storage) if load_to_cpu else None
    ckpt = torch.load(ckpt_file, map_location=map_location)
    para_dict = ckpt['state_dicts'][0].copy()
    judge = True
    for n, p in model.named_parameters():
        ip = para_dict[n]
        if p.shape == ip.shape:
            if (p.data != ip.data).all():  # fires only when *every* element differs
                return False
    return judge
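As an aside, `(p.data != ip.data).all()` is True only when every single element differs, so this check can print True even when most of the weights are wrong. A stricter comparison (a sketch, not part of the original post) could use torch.equal, which fails on any mismatched element:

    def strict_equal_para(model, para_dict):
        # return False as soon as any tensor differs in any element
        for n, p in model.named_parameters():
            if not torch.equal(p.data, para_dict[n].data):
                return False
        return True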
The code using the official method is as follows:
def load_para(model, ckpt_file, load_to_cpu=True):
    map_location = (lambda storage, loc: storage) if load_to_cpu else None
    ckpt = torch.load(ckpt_file, map_location=map_location)  # the file stores the model parameters, optimizer state, etc.
    para_dict = ckpt['state_dicts'][0]  # the saved model's parameter dict
    model.load_state_dict(para_dict)  # load the parameters
    print(judge_equal_para(model, ckpt_file, load_to_cpu))  # check that the model parameters match the checkpoint file
Running with the official method gives:
True
epoch[1]: r@1=0.949940, r@5=0.973778, r@10=0.985101 # higher is better

Running with my own function gives:
True
epoch[1]: r@1=0.008343, r@5=0.029797, r@10=0.045888
The evaluation function is definitely fine; I just can't see what is wrong with the parameter loading. I'd be grateful if someone could point it out.

 
Glimmer

It seems to be solved... I changed 'for n,p in model.named_parameters():' to 'for n,p in model.state_dict().items()' and it worked. Sorry for the trouble, everyone.
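This matches the official code above: _load_from_state_dict iterates over self._parameters *and* self._buffers, while named_parameters() only yields trainable parameters, so buffers such as BatchNorm's running_mean and running_var were never copied into the new model. A corrected version of the function along these lines (a sketch assuming the same checkpoint layout) would be:

    def load_para(model, ckpt_file, load_to_cpu=True):
        map_location = (lambda storage, loc: storage) if load_to_cpu else None
        ckpt = torch.load(ckpt_file, map_location=map_location)
        para_dict = ckpt['state_dicts'][0]

        # state_dict() covers parameters *and* buffers (e.g. BatchNorm running stats)
        for n, p in model.state_dict().items():
            ip = para_dict[n]
            if p.shape == ip.shape:
                p.copy_(ip)  # state_dict() values share storage with the model, so this copies in place
            else:
                print('{} -shape {} ,{}'.format(n, p.shape, ip.shape))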
