Deep Learning Notes

2021-06-23

Fine-tuning

Distillation

https://jishuin.proginn.com/p/763bfbd7cbe8

Pose estimation

HybrIK
https://www.mixamo.com/#/

Data augmentation

Deep learning summary

https://www.jiqizhixin.com/articles/2022-05-24-6

Visualization

https://www.shuzhiduo.com/A/q4zVywajdK/

Websites

https://pytorch.org/

Datasets

https://xungejiang.com/2019/07/26/pytorch-imagenet/

GPUs

V100-PCIE/V100/K80

CUDA installation

Website
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=18.04

Environment configuration

vim ~/.bashrc  # append the two export lines below to this file
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
source ~/.bashrc
nvcc -V  # verify the installation

Reference
https://zhuanlan.zhihu.com/p/198161777

PyTorch installation

Website

https://pytorch.org/get-started/previous-versions/

Usage

conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.1 -c pytorch
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch  # the Tsinghua mirror is faster

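Python

Memory leaks

pip install memory_profiler
# add the @profile decorator to the functions to inspect, then run:
python -m memory_profiler xxx.py

mprof run xxx.py
mprof plot XXX.dat
export DISPLAY="$(tmux show-env | sed -n 's/^DISPLAY=//p')"  # restore DISPLAY inside tmux so the plot window can open

top -p <PID>  # check the memory usage of a process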
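As a dependency-free cross-check of the memory_profiler workflow above, the standard library's tracemalloc (a different tool, not part of memory_profiler) can show which line allocated the most memory between two points. This is only a sketch; `build_cache` is a hypothetical function that stands in for the suspected leak.

```python
import tracemalloc


def build_cache(n):
    # Hypothetical "leaky" function: keeps n buffers of 1 KiB alive.
    return [bytearray(1024) for _ in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()
cache = build_cache(1000)  # roughly 1 MB stays referenced
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Compare the snapshots per source line; the first entry is the largest change.
top = after.compare_to(before, "lineno")[0]
print(top.size_diff, "bytes allocated at line", top.traceback[0].lineno)
```

If the reported line keeps growing across repeated calls, it is a good candidate for closer inspection with @profile.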

Multiprocess data processing

Reference: https://blog.csdn.net/qq_18254385/article/details/90401181
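A minimal sketch of the idea using the standard library's multiprocessing.Pool. The `preprocess` function and the sample data are made up for illustration; real preprocessing would go in its place.

```python
from multiprocessing import Pool


def preprocess(sample):
    # Hypothetical per-sample work: rescale a list of numbers to [0, 1].
    lo, hi = min(sample), max(sample)
    span = (hi - lo) or 1  # avoid dividing by zero for constant samples
    return [(x - lo) / span for x in sample]


if __name__ == "__main__":
    samples = [[0, 5, 10], [2, 2, 2], [1, 3]]
    with Pool(processes=2) as pool:
        # map() distributes the samples over the workers and keeps input order.
        results = pool.map(preprocess, samples)
    print(results)
```

The `if __name__ == "__main__"` guard matters on platforms that start workers by re-importing the script.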

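Data format conversion

import numpy as np
import torch

a = np.random.rand(3, 4)  # random 3x4 matrix
a = torch.from_numpy(a)  # NumPy array -> tensor
b = torch.stack((a, a), 0)  # stack along a new dim 0, giving shape (2, 3, 4)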

Image data processing

pic = cv2.resize(pic, (400, 400), interpolation=cv2.INTER_CUBIC)

Using matplotlib

Introduction

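matplotlib.pyplot.imshow
Parameters:
cmap: str or Colormap (ignored for RGB(A) data)
norm: a Normalize instance that scales the data to the 0-1 range before the colormap is applied
aspect: controls the aspect ratio of the axes
interpolation: interpolation method, e.g. nearest or bilinear; the default is read from rcParams["image.interpolation"] (nearest in Matplotlib 2.x)
alpha: transparency, from 0 (transparent) to 1 (opaque); older Matplotlib documentation notes it is ignored for RGBA input
vmin, vmax: the data range the colormap covers when norm is not given; ignored for RGB(A) data
origin: default upper, which places index [0, 0] at the upper-left corner of the axes; lower places it at the lower-left corner
The vertical axis points upward for "lower" and downward for "upper"
extent: bounding box in data coordinates (left, right, bottom, top)
filternorm: bool, optional, default: True
filterrad: float > 0, optional, default: 4.0
resample: bool, optional

References: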

https://blog.csdn.net/ztf312/article/details/102474190
https://blog.csdn.net/guduruyu/article/details/60868501
https://www.cnblogs.com/HuZihu/p/9481068.html

Tools

Using TensorBoard

tensorboard --logdir=path --port=6005
tensorboard --logdir='./'

Metrics

BCE (binary cross-entropy)
https://blog.csdn.net/tmk_01/article/details/80844260
Evaluation metrics
https://www.jianshu.com/p/b960305718f1
https://blog.csdn.net/geter_CS/article/details/79849386

https://www.cnblogs.com/jiaxin359/p/8627530.html
https://zhuanlan.zhihu.com/p/87768945

AP:
https://zhuanlan.zhihu.com/p/33372046
AUC:
https://zhuanlan.zhihu.com/p/267901426
Multi-class confusion matrix
https://blog.csdn.net/m0_38061927/article/details/77198990
Multi-class metrics
https://zhuanlan.zhihu.com/p/59862986
https://zhuanlan.zhihu.com/p/147663370
https://zhuanlan.zhihu.com/p/51125423
Libraries
http://d0evi1.com/sklearn/model_evaluation/
https://sklearn.apachecn.org/docs/master/32.html (very useful)
https://scikit-learn.org.cn/view/93.html
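For the binary case, the metrics covered by the links above come straight from the confusion-matrix counts. A minimal pure-Python sketch; `binary_metrics` is a hypothetical helper written for these notes, not a scikit-learn function, and the labels are made up.

```python
def binary_metrics(y_true, y_pred):
    # Confusion-matrix counts for the positive class (label 1).
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(binary_metrics(y_true, y_pred))
```

In practice the sklearn.metrics functions described in the linked documentation are the safer choice; the sketch only makes the definitions explicit.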

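PyTorch framework

Common snippets

Check whether the model and data are on the GPU

print(next(model.parameters()).device)  # model
print(data.device)  # data

Data format conversion

cpu_imgs.cuda()
gpu_imgs.cpu()
torch.from_numpy(imgs)  # NumPy -> CPU tensor; a GPU tensor cannot be converted to NumPy directly
cpu_imgs.numpy()
loss_output.item()  # for a single-element tensor, item() returns the Python scalar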
1.1 list to NumPy
ndarray = np.array(lst)

1.2 NumPy to list
lst = ndarray.tolist()

2.1 list to torch.Tensor
tensor = torch.Tensor(lst)  # always float32; torch.tensor(lst) infers the dtype

2.2 torch.Tensor to list
Convert to NumPy first, then to a list (tensor.tolist() also works directly)
lst = tensor.numpy().tolist()

3.1 torch.Tensor to NumPy
ndarray = tensor.numpy()
* A tensor on the GPU cannot be converted to NumPy directly; move it to the CPU first
ndarray = tensor.cpu().numpy()

3.2 NumPy to torch.Tensor
tensor = torch.from_numpy(ndarray)

Random seed

https://blog.csdn.net/weixin_43135178/article/details/118768531
https://www.cnblogs.com/xiaodai0/p/10413711.html

if seed == 0:
    torch.backends.cudnn.deterministic = True  # cuDNN returns the same convolution algorithm on every run
    torch.backends.cudnn.benchmark = False  # True lets cuDNN auto-tune convolution algorithms, which can speed up CNNs considerably but costs reproducibility. See: https://zhuanlan.zhihu.com/p/73711222
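The cuDNN flags only cover part of reproducibility; the random number generators need seeding as well. A hedged sketch of a helper that seeds everything it can find: `set_seed` is a hypothetical name, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed):
    # Seed Python's own generator.
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


set_seed(0)
first = [random.random() for _ in range(3)]
set_seed(0)
second = [random.random() for _ in range(3)]
print(first == second)
```

Data-loader workers and some GPU operations can still introduce nondeterminism, so this is a starting point rather than a guarantee.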

Saving models

https://www.daimajiaoliu.com/daima/479c201301003e4

Parallel training methods

torch.nn.DataParallel
torch.distributed
https://ptorch.com/docs/1/distributed
https://zhuanlan.zhihu.com/p/363755226
torch.multiprocessing: use torch.multiprocessing in place of the launcher
apex https://bbs.cvmart.net/articles/2672
horovod
- Install: pip install horovod
- https://zhuanlan.zhihu.com/p/264778072
- https://blog.csdn.net/weixin_26712065/article/details/108915884
- api https://horovod.readthedocs.io/en/stable/api.html
torch.cuda.amp
https://zhuanlan.zhihu.com/p/165152789
slurm (not yet investigated)

Code:
https://github.com/tczhangzhi/pytorch-distributed
https://github.com/Xianchao-Wu/pytorch-distributed/tree/master
References
https://zhuanlan.zhihu.com/p/98535650
https://zhuanlan.zhihu.com/p/105755472
https://zhuanlan.zhihu.com/p/343891349
https://blog.csdn.net/lgzlgz3102/article/details/107054314?ivk_sa=1024320u
Parameter updates
https://zhuanlan.zhihu.com/p/350501860
https://zhuanlan.zhihu.com/p/129912419
https://zhuanlan.zhihu.com/p/76638962 (important)

Framework (unorganized)

for task in tqdm(tasks):
    pool = multiprocessing.Pool(args.nProc)  # number of worker processes, usually the CPU core count
    pool.apply_async(func=prep, args=(tr_files, trDir, True))
    pool.apply_async(func=prep, args=(ts_files, tsDir, False))
    pool.close()  # close pool, no more processes added to pool
    pool.join()  # wait pool to finish, required and should be after .close()
if config.use_gpu and torch.cuda.is_available():
    config.use_gpu = True
else:
    config.use_gpu = False
# seed the CPU random number generator so that results are deterministic
torch.manual_seed(1)
model.apply(train.weights_init)
def weights_init(m):
  classname = m.__class__.__name__
  if classname.find('Conv3d') != -1:
      nn.init.kaiming_normal_(m.weight)
      m.bias.data.zero_()
checkpoint = torch.load(args.resume_ckp)
model = checkpoint['model']
# modify parameters
for name, m in model_old.named_modules():
    if any(i in name for i in shared_modules):
        if '.adap' not in name and isinstance(m, nn.Conv3d) and (m.kernel_size[0] == 3):
            store_weight3x3.append(m.weight.data)
            name3x3.append(name)
        elif '.adap' not in name and isinstance(m, nn.Conv3d) and (m.kernel_size[0] == 1):
            store_weight1x1.append(m.weight.data)
            name1x1.append(name)
        elif '.adap' not in name and isinstance(m, nn.InstanceNorm3d):
            store_weightNorm.append(m.weight.data)
            nameNorm.append(name)

  
train.train(args, tasks_archive, model)

def train(args, tasks_archive, model):
    torch.backends.cudnn.benchmark = True
    model = nn.parallel.DataParallel(model)
    n_params = sum(p.data.nelement() for p in model.parameters())  # total number of parameters
    # e.g. x = torch.randn(size=(4, 3, 5, 6))
    # x.nelement() = 360
    if config.use_gpu:
        model.cuda()  # required before creating the optimizer?
        # cudnn.benchmark = True
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=config.weight_decay)
    dice_loss = MulticlassDiceLoss()
    ce_loss = nn.CrossEntropyLoss()
    focal_loss = FocalLoss(gamma=2)
    
    from multiprocessing import Process, Queue
    # start a daemon process that fills the training queue with prepared data
    self.tr_dataPrep[proc] = Process(target=data_utils.trQueue, args=(self.config_task, task_archive['train'], self.trainQueue, self.patch_size, self.nProc, proc))
    self.tr_dataPrep[proc].daemon = True
    self.tr_dataPrep[proc].start()
    for epoch in range(start_epoch, config.max_epoch + 1):
        for step in tqdm(range(config.step_per_epoch), desc='{}: epoch{}'.format(args.trainMode, epoch)):
            batchImg = torch.from_numpy(batchImg).float()
            if config.use_gpu:
                batchImg = batchImg.cuda()
            optimizer.zero_grad()
            if config.trainMode in ["universal"]:
                output, share_map, para_map = model(batchImg)
            else:
                output = model(batchImg)

            output_softmax = F.softmax(output, dim=1)

            loss = lovasz_softmax(output_softmax, batchLabel, ignore=10) + focal_loss(output, batchLabel)

            loss.backward()
            optimizer.step()

            loss_epoch += loss.item()
            num_batch_processed += 1

        if epoch % config.save_epoch == 0:
            ckp_path = os.path.join(config.log_dir, '{}_{}_epoch{}_{}.pth.tar'.format(args.trainMode, '_'.join(args.tasks), epoch, tinies.datestr()))
            torch.save({
            # (the rest of this call is truncated in the original notes)
        loss_epoch /= num_batch_processed
        if Eval_bool:
            eval(args, tasks_archive, model, epoch, iterations-1)

        # halve the learning rate when the moving average of the training loss stops improving
        if len(trLoss_queue) == trLoss_queue.maxlen:
            if last_trLoss_ma and last_trLoss_ma - trLoss_ma < 1e-4:  # 5e-3
                lr /= 2
                for param_group in optimizer.param_groups:
                    param_group['lr'] = lr
            last_trLoss_ma = trLoss_ma
          

      

os.getcwd()  # get the current working directory

# data augmentation
from batchgenerators.augmentations.utils import create_zero_centered_coordinate_mesh, elastic_deform_coordinates
coords = create_zero_centered_coordinate_mesh(patch_size)  # 3*d*h*w
coords = elastic_deform_coordinates(coords, a, s)
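The learning-rate rule in the training loop above (halve lr once the moving average of the training loss improves by less than 1e-4) can be tried in isolation. This is a sketch under assumptions: the window size, the loss values, and the name `halve_on_plateau` are made up, and the moving average is computed as a plain mean over a deque because the notes do not show how trLoss_ma is obtained.

```python
from collections import deque


def halve_on_plateau(losses, lr, window=3, min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    queue = deque(maxlen=window)  # most recent epoch losses
    last_ma = None                # previous moving average
    history = []
    for loss in losses:
        queue.append(loss)
        if len(queue) == queue.maxlen:
            ma = sum(queue) / len(queue)
            # Halve lr when the average improved by less than min_delta.
            if last_ma is not None and last_ma - ma < min_delta:
                lr /= 2
            last_ma = ma
        history.append(lr)
    return history


history = halve_on_plateau([1.0, 0.8, 0.6, 0.5, 0.5, 0.5, 0.5], lr=0.1)
print(history)
```

With these made-up losses the rate stays at 0.1 while the average keeps dropping and is halved only once the window is flat.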

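Medical image processing

sitk_image = sitk.ReadImage(im)
orig_volume = sitk.GetArrayFromImage(sitk_image)  # mod, z, y, x
sitk_image.GetDimension()
mod_num = sitk_image.GetSize()
original_shape = orig_volume.shape
The two calls above order the axes differently: GetSize() returns (x, y, z, ...), while the NumPy array shape is (..., z, y, x)
sitk_image.GetOrigin()  # physical coordinates of the first voxel
sitk_image.GetSpacing()  # physical size of a voxel along each axis
sitk_refer.GetDirection()  # direction cosine matrix giving the orientation of the image axes
sitk.Resample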

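Key concepts

Receptive field

The receptive field of a neuron is the region of the input image that influences that neuron's output at a given layer.
Formula: RF_{N-1} = (RF_N - 1) * stride_N + kernel_N, applied from the last layer (where RF = 1) back to the input, using the stride and kernel size of layer N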
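The recursion is easy to check numerically. A small sketch, assuming plain convolution or pooling layers without dilation; the layer list is a made-up example and `receptive_field` is a hypothetical helper.

```python
def receptive_field(layers):
    """layers: (kernel, stride) pairs ordered from input to output."""
    rf = 1  # a single neuron at the output
    # Walk from the last layer back to the input.
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


# Three 3x3 layers, the middle one with stride 2.
layers = [(3, 1), (3, 2), (3, 1)]
print(receptive_field(layers))  # 9
```

Working backwards: 1 -> 3 after the last layer, 3 -> 7 through the stride-2 layer, and 7 -> 9 at the input.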

Continual learning

Continual learning (CL), incremental learning, life-long learning
Methods
- regularization methods
  - Regularization methods constrain the direction and step size of the gradients of different network parameters.
  - By adding a regularization term to the loss function as a penalty, the parameters that matter for old tasks change only slightly.
  - However, errors gradually accumulate as the task sequence grows.
  - Papers:
    - Overcoming catastrophic forgetting in neural networks, 2016
    - LwF, Learning without forgetting, 2017
    - DFWF, Continual Learning for Fake Audio Detection, 2021, audio
- dynamic network methods
  - Dynamic network methods adapt the network while learning sequential tasks. They use relatively independent modules for different tasks to avoid catastrophic forgetting. However, new modules are usually added as new tasks are learned, so the model keeps growing.
  - Progressive neural networks, 2016
  - Lifelong learning of human actions with deep neural network self-organization, 2017
- replay methods
  - Replay methods use a memory buffer to store a core set of samples from each task and replay the stored samples while learning new tasks. Unlike dynamic network methods, they do not need to expand the model and only need storage for a small number of samples.
  - The two key steps are memory update and memory retrieval.
  - Replay approaches
    - ER: random selection
      - Catastrophic forgetting, rehearsal and pseudorehearsal, 1995
    - iCaRL
      - iCaRL: Incremental classifier and representation learning, 2017
  - Rethinking experience replay: a bag of tricks for continual learning, 2020
  - Continual learning with deep generative replay, 2017
  - Brain-inspired replay for continual learning with artificial neural networks, 2020
  - Computer vision
    - Incremental few-shot object detection, 2020.
    - Modeling the background for incremental learning in semantic segmentation, 2020
  - NLP
    - Continual learning in automated audio captioning. Master’s thesis, 2021.
    - Few-shot continual learning for audio classification. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 321–325. IEEE, 2021
- Generating pseudo-samples is an alternative to replaying stored samples
- Continual learning papers:
  - CV:
    - End-to-end incremental learning, 2018
    - Residual continual learning, 2020
  - audio:
    - Continual learning in automatic speech recognition, 2020
    - Continual learning for multidialect acoustic models, 2020
  - Surveys
    - Continual lifelong learning with neural networks: A review, 2019
- Replay vs.
  - Fine-tuning: tune on the tasks one at a time, in order
    - Forgetting
  - Multi-condition learning: learn all conditions at once
    - Time-consuming
  - Model ensembling
    - Consumes storage space
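The two replay steps named in the list, memory update and memory retrieval, can be sketched with a small buffer class. This assumes random selection in the spirit of ER and uses reservoir sampling for the update, which is one common choice rather than something the notes specify; `ReplayBuffer` and its method names are hypothetical.

```python
import random


class ReplayBuffer:
    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0  # total number of samples offered so far
        self.rng = random.Random(seed)

    def update(self, sample):
        # Memory update: reservoir sampling keeps each sample seen so far
        # with equal probability capacity / seen.
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def retrieve(self, k):
        # Memory retrieval: draw up to k stored samples at random.
        return self.rng.sample(self.samples, min(k, len(self.samples)))


buffer = ReplayBuffer(capacity=5)
for sample_id in range(100):  # stand-in for a stream of old-task samples
    buffer.update(sample_id)
replayed = buffer.retrieve(3)
print(buffer.samples, replayed)
```

During training on a new task, the retrieved samples would be mixed into each batch alongside the new data.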

deepfake

audio

Competitions

ASVspoof challenge (English)
- ASVspoof2019
  - Logical access (LA)
    - 19 types
    - A13, A17, A10, A19
  - Physical access (PA)
    - 27 types
    - bbbBB, cccCC, aaaAC, abcAA
ADD challenge (Chinese)

Metrics

AvgEER
Equal error rate
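The equal error rate is the operating point where the false acceptance rate (spoofed audio accepted) equals the false rejection rate (genuine audio rejected). A rough sketch that scans the observed scores as thresholds, assuming a higher score means "more likely genuine"; the scores are invented, `equal_error_rate` is a hypothetical helper, and AvgEER is presumably the mean of such values over several test sets, which the notes do not spell out.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        # Genuine samples scoring below t are falsely rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # Spoofed samples scoring at or above t are falsely accepted.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


genuine = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine, spoof))  # 0.25
```

At threshold 0.6 one of four genuine samples is rejected and one of four spoofed samples is accepted, so both rates are 0.25.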

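Insufficient generalization

Feature extraction
- Linear frequency cepstral coefficients (LFCC)
  - Use linear filters instead of mel filters, capturing more spectral detail in the high-frequency region
  - Mel frequency cepstral coefficients (MFCC)
- Constant Q cepstral coefficients (CQCC)
  - Have variable spectro-temporal resolution and can reliably capture traces of manipulation artifacts
  - Extended CQCC (eCQCC) and constant-Q statistics-plus-principal information coefficients (CQSPIC)
- The proposed cross-Teager energy cepstral coefficients with maximization (CTECCmax) can track the maximum distortion caused by replay conditions and achieve notable performance on replayed audio.
- Long-range spectro-temporal modulation features, a 2D DCT over the log-mel spectrogram, are used to capture audio deepfake artifacts; they are named Global Modulation (Global M) features.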
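To see why linear filters keep more high-frequency detail than mel filters, compare where the filter centers fall. A sketch under assumptions: 20 filters up to 8000 Hz and the common 2595 * log10(1 + f / 700) mel formula; the helper names are hypothetical and no full LFCC or MFCC pipeline is implemented.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def center_frequencies(n_filters, f_max, scale):
    # Centers are evenly spaced either in Hz (linear) or in mel.
    if scale == "linear":
        step = f_max / (n_filters + 1)
        return [step * (i + 1) for i in range(n_filters)]
    step = hz_to_mel(f_max) / (n_filters + 1)
    return [mel_to_hz(step * (i + 1)) for i in range(n_filters)]


def count_above(centers, cutoff_hz):
    return sum(c >= cutoff_hz for c in centers)


linear = center_frequencies(20, 8000.0, "linear")
mel = center_frequencies(20, 8000.0, "mel")
print(count_above(linear, 4000.0), count_above(mel, 4000.0))
```

The linear bank places half of its filters above 4 kHz, while the mel bank places noticeably fewer there, which matches the claim in the LFCC bullet.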
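One-class classifier
- The Gaussian mixture model (GMM) is the traditional classification model
- Light CNN (LCNN)
- capsule network with the dynamic routing algorithm
- Res2Net
  - Splits the features into multiple channel groups and links them through residual-like connections to obtain multiple feature scales
end-to-end solutions
- RawNet2
- spectro-temporal graph attention network (RawGAT-ST)
- ResWavegram-Resnet (RWResnet)
training strategies
Continual learning for deepfake audio
DFWF
- Continual learning for fake audio detection
- The first continual learning method proposed for this task, Detecting Fake Without Forgetting, which lets the model learn new spoofing attacks incrementally
- Needs no data storage, but the hyperparameters in the loss function must be tuned to the data distribution, which takes considerable tuning effort
The deepfake audio detection task differs from the conventional continual learning setting: the number of tasks and classes does not grow; only the data distribution changes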

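Catastrophic forgetting - multi-condition training - continual learning
(incremental updates and continual learning)
boundary forgeries replay (BoFoRy)
Select representative fake audio samples at the class boundary for replay
Select fake audio that is classified correctly
Ensure that the selected samples stored in the buffer lie close to the class boundary and are correctly classified as fake audio.

The buffer stores only the fake class (genuine audio is more consistent across scenarios, whereas different types of fake audio differ widely)

General setting
As the number of classes grows, the boundaries move closer to the class centers
Features are easier to learn from the centers than from the boundaries
deepfake
The classes are fixed, so the boundary matters more
The genuine data distribution is concentrated; store boundary samples of the fake data

The distance between a fake audio sample's feature vector and the mean feature vector of all genuine audio is used as an approximate measure of how close the sample is to the class boundary.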
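The selection rule described above can be sketched in a few lines. This is only an illustration of the stated idea, not the BoFoRy implementation: it assumes Euclidean distance, assumes that a smaller distance to the genuine mean means closer to the boundary, and uses made-up 2-D features; `select_boundary_fakes` is a hypothetical helper.

```python
import math


def mean_vector(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_boundary_fakes(fake_feats, fake_correct, genuine_feats, buffer_size):
    """Return indices of fake samples to keep in the replay buffer."""
    center = mean_vector(genuine_feats)
    # Keep only fakes the model already classifies correctly.
    candidates = [
        (math.dist(feat, center), idx)
        for idx, (feat, ok) in enumerate(zip(fake_feats, fake_correct))
        if ok
    ]
    candidates.sort()  # closest to the genuine mean first
    return [idx for _, idx in candidates[:buffer_size]]


genuine = [[0.0, 0.0], [2.0, 0.0]]  # mean is [1.0, 0.0]
fakes = [[1.0, 1.0], [5.0, 5.0], [2.0, 0.5], [1.0, 0.2]]
correct = [True, True, True, False]  # last fake is misclassified
print(select_boundary_fakes(fakes, correct, genuine, buffer_size=2))  # [0, 2]
```

The misclassified sample is skipped even though it is the nearest one, which reflects the requirement that buffered samples be correctly classified as fake.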