[Skim] Align before Fuse
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation. Background: VLP (Vis...