An Introduction to PyTorch Built-in Models

Example code

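from torchvision import models

# Build VGG-16 and load the ImageNet-1k pretrained weights (downloaded on first use).
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
# Print the layer-by-layer module structure.
print(vgg)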

torchvision models

Classification

Model	Description
AlexNet
ConvNeXt
DenseNet
EfficientNet
EfficientNetV2
GoogLeNet
Inception V3
MaxVit
MNASNet
MobileNet V2
MobileNet V3
RegNet
ResNet
ResNeXt
ShuffleNet V2
SqueezeNet
SwinTransformer
VGG
VisionTransformer
Wide ResNet

Semantic segmentation

Model	Description
DeepLabV3
FCN
LRASPP

Object detection, instance segmentation, and person keypoint detection

Model	Description
Faster R-CNN
FCOS
RetinaNet
SSD
SSDlite

Instance segmentation

Model	Description
Mask R-CNN

Keypoint detection

Model	Description
Keypoint R-CNN

Video classification

Model	Description
Video MViT
Video ResNet
Video S3D
Video SwinTransformer

Optical flow

Model	Description
RAFT

torchaudio models

Model	Description
Conformer	Conformer architecture introduced in "Conformer: Convolution-augmented Transformer for Speech Recognition" [Gulati et al., 2020].
ConvTasNet	Conv-TasNet architecture introduced in "Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation" [Luo and Mesgarani, 2019].
DeepSpeech	DeepSpeech architecture introduced in "Deep Speech: Scaling up end-to-end speech recognition" [Hannun et al., 2014].
Emformer	Emformer architecture introduced in "Emformer: Efficient Memory Transformer Based Acoustic Model for Low Latency Streaming Speech Recognition" [Shi et al., 2021].
HDemucs	Hybrid Demucs model from "Hybrid Spectrogram and Waveform Source Separation" [Défossez, 2021].
HuBERTPretrainModel	HuBERT model used for pretraining in "HuBERT" [Hsu et al., 2021].
RNNT	Recurrent neural network transducer (RNN-T) model.
RNNTBeamSearch	Beam search decoder for RNN-T model.
SquimObjective	Speech Quality and Intelligibility Measures (SQUIM) model that predicts objective metric scores for speech enhancement (e.g., STOI, PESQ, and SI-SDR).
SquimSubjective	Speech Quality and Intelligibility Measures (SQUIM) model that predicts subjective metric scores for speech enhancement (e.g., Mean Opinion Score (MOS)).
Tacotron2	Tacotron2 model from "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions" [Shen et al., 2018] based on the implementation from Nvidia Deep Learning Examples.
Wav2Letter	Wav2Letter model architecture from "Wav2Letter: an End-to-End ConvNet-based Speech Recognition System" [Collobert et al., 2016].
Wav2Vec2Model	Acoustic model used in "wav2vec 2.0" [Baevski et al., 2020].
WaveRNN	WaveRNN model from "Efficient Neural Audio Synthesis" [Kalchbrenner et al., 2018] based on the implementation from fatchord/WaveRNN.
