We introduce Retrieved Sequence Augmentation (RSA) for protein representation learning without additional alignment or pre-processing. RSA links a query protein sequence to a set of sequences with similar structures or properties in a database and combines them for downstream prediction. (EMNLP 2024)
We introduce the adversarial grade school math (GSM-Plus) dataset, an extension of GSM8K augmented with various mathematical perturbations. Our experiments on 25 LLMs and 4 prompting techniques show that while LLMs exhibit different levels of math reasoning ability, their performance is far from robust. (ACL 2024)
We introduce the Bi-Modal Behavioral Alignment (BBA) prompting method, designed to maximize the potential of Domain-Specific Languages (DSLs) in augmenting complex multi-modal reasoning tasks. BBA begins by guiding large vision-language models to create separate reasoning chains for visual and DSL representations. (Findings of ACL 2024)