Embedding
Definition
AI-generated
An embedding is a learned vector (or low-rank factor), typically of fixed dimension, that represents a discrete token, span, image patch, cell, or other object so that semantically similar items lie near each other in the vector space. Modern transformers compute contextual embeddings, in which the vector for an item depends on the items around it.
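As a rough illustration of "similar items map near each other", the sketch below looks up vectors in a tiny embedding table and ranks neighbors by cosine similarity. The table `EMBEDDINGS`, its 3-dimensional values, and the helper names are hypothetical and chosen by hand; real embeddings are learned during training and have far more dimensions.

```python
import math

# Hypothetical toy embedding table: each token maps to a fixed-length vector.
# The values are made up for illustration, not learned.
EMBEDDINGS = {
    "ATG": [0.9, 0.1, 0.0],
    "ATA": [0.8, 0.2, 0.1],
    "GGC": [0.0, 0.1, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(token, table):
    """Return the other token whose vector is most similar to `token`'s."""
    query = table[token]
    others = [t for t in table if t != token]
    return max(others, key=lambda t: cosine_similarity(query, table[t]))

# "ATG" and "ATA" were given similar vectors, so they come out as neighbors.
neighbor = nearest("ATG", EMBEDDINGS)
```

Cosine similarity is only one common choice; Euclidean distance or a dot product may suit a particular model better.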
Why it matters in GWAS
Sequence and single-cell models use embeddings as inputs to downstream heads for tasks such as variant scoring or cell typing. Interpreting embedding-based scores in a GWAS setting usually requires independent association evidence, because embedding geometry can absorb confounding (for example, population structure) or batch signal.
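A minimal sketch of how an embedding might feed a scoring head, under loud assumptions: `toy_encoder` is a hypothetical stand-in for a pretrained DNA encoder (it returns a fixed pseudo-random vector per k-mer, so it is static rather than contextual), and `mlp_head` uses untrained random weights. Only the data flow is meaningful here: tokenize into k-mers, encode, mean-pool, then score.

```python
import math
import random

def kmer_tokenize(seq, k=3):
    """Split a DNA string into overlapping k-mers (stride 1)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def toy_encoder(tokens, dim=4, seed=0):
    """Hypothetical stand-in for a pretrained encoder.

    Each k-mer gets a deterministic pseudo-random vector derived from its
    text, so the same k-mer always maps to the same vector.
    """
    vectors = []
    for tok in tokens:
        rng = random.Random(f"{seed}:{tok}")
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors

def mean_pool(vectors):
    """Average token vectors element-wise into one fixed-length vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def mlp_head(x, hidden=8, seed=1):
    """One hidden ReLU layer plus a sigmoid output; weights are random, not trained."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1.0, 1.0) for _ in x] for _ in range(hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    logit = sum(w * hi for w, hi in zip(w2, h))
    return 1.0 / (1.0 + math.exp(-logit))

# Made-up sequence window, for illustration only.
tokens = kmer_tokenize("ACGTACGTAGCT", k=3)
pooled = mean_pool(toy_encoder(tokens))
score = mlp_head(pooled)  # a value in (0, 1); meaningless without training
```

Mean pooling is one simple way to get a fixed-length vector from a variable number of tokens; a real pipeline might instead use a dedicated summary token or attention pooling.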
Example usage
"The variant window was tokenized into k-mers and passed through a pretrained DNA encoder to obtain a pooled embedding for the MLP risk head."
Related terms
References
- Mikolov T, et al. (2013). Distributed representations of words and phrases and their compositionality. NeurIPS.
- Devlin J, et al. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. NAACL.