看下open clip论文 Learning Transferable Visual Models From Natural Language Supervision These results suggest that the aggregate supervision accessible to modern pre-training methods within web-scale collections of text surpasses that of high-quality crowd…