BERT (Bidirectional Encoder Representations from Transformers)¶
BERT is a transformer encoder stack pretrained on large text with masked language modeling (predicting masked tokens in context) and, in the original recipe, next-sentence prediction; release checkpoints are then fine-tuned on downstream tasks with task-specific heads.
Why it matters in GWAS¶
PubMed- or clinical-notes–adapted BERT-style models are common baselines for biomedical named-entity recognition, relation extraction, and document classification used to mine phenotypes, drugs, and genes from literature or EHR text; like all LLM-family tools, outputs need verification against curated databases and study-specific ontology rules.
Example usage¶
"We compared zero-shot prompts against a BERT baseline fine-tuned on 10k manually labeled GWAS abstract sentences."
Related terms¶
References¶
- Devlin J, et al. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. NAACL.
Last updated (UTC · Git history)