a: The CSF of CLIP vision-language models with two different image ...
