Self-supervised learning (SSL) has enabled Vision Transformers (ViTs) to learn robust representations from large-scale natural image datasets, enhancing their generalization across domains. In retinal imaging, foundation models pretrained on either natural or ophthalmic data have shown promise, but the benefits of in-domain pretraining remain uncertain. To investigate this, we benchmark six SSL-pretrained ViTs on seven digital fundus image (DFI) datasets totaling 70,000 expert-annotated images for the task of moderate-to-late age-related macular degeneration (AMD) identification. Our results show that iBOT, pretrained on natural images, achieves the highest out-of-distribution generalization, with AUROCs of 0.80–0.97, outperforming domain-specific models, which achieved AUROCs of 0.78–0.96, and a baseline ViT-L with no pretraining, which achieved AUROCs of 0.68–0.91. These findings highlight the value of foundation models in improving AMD identification and challenge the assumption that in-domain pretraining is necessary. Furthermore, we release BRAMD, an open-access dataset (n=587) of DFIs with AMD labels from Brazil.
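As an illustrative aside, the evaluation protocol summarized above can be sketched in a few lines: load a pretrained ViT backbone, attach a binary AMD classification head, and score held-out fundus images with AUROC. This is a minimal sketch, not the paper's released code; the timm model name, the linear head, and the dataloader are placeholder assumptions, and the iBOT or domain-specific checkpoints from the benchmark would be loaded in place of the default weights.

```python
# Minimal sketch (assumptions noted above): an SSL-pretrained ViT backbone with a
# linear head, evaluated by AUROC for binary moderate-to-late AMD identification.
import torch
import timm
from sklearn.metrics import roc_auc_score

# Placeholder backbone; num_classes=0 makes the model return pooled features.
backbone = timm.create_model("vit_large_patch16_224", pretrained=True, num_classes=0)
head = torch.nn.Linear(backbone.num_features, 1)  # single AMD vs. no-AMD logit

@torch.no_grad()
def evaluate_auroc(loader):
    """Compute AUROC over a dataloader yielding (fundus_image_batch, amd_label_batch)."""
    backbone.eval(); head.eval()
    scores, labels = [], []
    for images, targets in loader:
        features = backbone(images)                      # [B, num_features]
        probs = torch.sigmoid(head(features).squeeze(1)) # [B] AMD probabilities
        scores.extend(probs.tolist())
        labels.extend(targets.tolist())
    return roc_auc_score(labels, scores)
```

In the out-of-distribution setting reported in the abstract, a routine like this would be run once per held-out DFI dataset after fine-tuning on a separate source dataset.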