Diffusion LM
- Structured Denoising Diffusion Models in Discrete State-Spaces (NeurIPS 2021)
- Diffusion-LM Improves Controllable Text Generation (NeurIPS 2022)
- AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation (NeurIPS 2023)
- Diffusion Language Models Generation Can Be Halted Early (ACL 2023)
- Democratized Diffusion Language Model (ICLR 2024 reject)
- Likelihood-Based Diffusion Language Models (NeurIPS 2023)
- DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models (ICLR 2023)
- DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models (EMNLP 2023)
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning (ICLR 2024 reject)
- Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (ICLR 2024 reject)
- Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models (2402 arxiv)
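Most of the papers above share the same discrete corruption process, so here is a minimal, hypothetical sketch of a D3PM-style absorbing-state ("mask") forward step; the function name, linear schedule, and shapes are my own illustration, not code from any of these papers:

```python
import torch

def q_sample_absorbing(x0: torch.Tensor, t: torch.Tensor,
                       num_steps: int, mask_id: int) -> torch.Tensor:
    """Corrupt token ids x0 at timestep t: each token is independently
    replaced by the [MASK] absorbing state with probability t / num_steps
    (a linear schedule, chosen here purely for illustration)."""
    keep_prob = 1.0 - t.float() / num_steps                 # shape (batch,)
    keep = torch.rand(x0.shape) < keep_prob.unsqueeze(-1)   # broadcast to (batch, seq)
    return torch.where(keep, x0, torch.full_like(x0, mask_id))

# usage: token ids in [0, 1000), with id 1000 reserved for [MASK]
x0 = torch.randint(0, 1000, (2, 8))
xt = q_sample_absorbing(x0, t=torch.tensor([100, 900]), num_steps=1000, mask_id=1000)
```

The reverse model is trained to recover the original tokens at masked positions. The continuous-space papers above (Diffusion-LM, DiffuSeq) instead add Gaussian noise to word embeddings, but the train-to-denoise structure is the same.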
In-Context Learning (not just for LLMs)
- MetaICL: Learning to Learn In Context (NAACL 2022)
- Meta-in-context learning in large language models (NeurIPS 2023)
- Data Distributional Properties Drive Emergent In-Context Learning in Transformers (NeurIPS 2022)
- Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers (ACL 2023)
- Finding Support Examples for In-Context Learning (EMNLP 2023)
- What Makes Good Examples for Visual In-Context Learning? (NeurIPS 2023)
- Meta-learning via Language Model In-context Tuning (ACL 2022)
- Transformers as Algorithms: Generalization and Stability in In-context Learning (ICML 2023)
- Larger language models do in-context learning differently (Google Research)
- Large Language Models can Learn Rules
- Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions (ICLR 2024)
- How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes (NAACL 2024)
- An Information-Theoretic Analysis of In-Context Learning (2401 arxiv)
- Dual Operating Modes of In-Context Learning (ICLRW 2024)
- MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning (ICLR 2024)
- Towards Multimodal In-Context Learning for Vision & Language Models (2403 arxiv)
- How Well Can Transformers Emulate In-context Newton’s Method? (2403 arxiv)
- Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality (2402 arxiv)
- Where does In-context Translation Happen in Large Language Models? (ICLR 2024 reject)
- Language Models for Text Classification: Is In-Context Learning Enough? (LREC COLING 2024)
- Can large language models explore in-context? (2403 arxiv)
- In-Context Language Learning: Architectures and Algorithms (2401 arxiv)
- Training Nonlinear Transformers for Efficient In-Context Learning: A Theoretical Learning and Generalization Analysis (2402 arxiv)
- Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning (2401 arxiv)
- How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning (ACL 2024 under review)
- In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness (2402 arxiv)
- Linear Transformers are Versatile In-Context Learners (2402 arxiv)
- How do Transformers perform In-Context Autoregressive Learning? (2402 arxiv)
- Understanding In-Context Learning with a Pelican Soup Framework (ICLR 2024 reject)
- Superiority of Multi-Head Attention in In-Context Linear Regression (2401 arxiv)
- Enhancing In-context Learning via Linear Probe Calibration (AISTATS 2024)
- In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization (2402 arxiv)
- Benefits of Transformer: In-Context Learning in Linear Regression Tasks with Unstructured Data (2402 arxiv)
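Many of the theory papers above (the linear-regression, function-class, and multi-head-attention analyses) study one common synthetic setup, sketched below under my own naming: each prompt packs (x_i, y_i) demonstration pairs for a freshly drawn task vector w, and the transformer must predict the query label from context alone.

```python
import torch

def sample_linear_icl_prompt(k: int, dim: int):
    """One synthetic prompt for in-context linear regression:
    k demonstrations plus one query, labels y_i = <w, x_i> with the
    task vector w resampled per prompt (illustrative, noiseless)."""
    w = torch.randn(dim)
    xs = torch.randn(k + 1, dim)
    ys = xs @ w
    return xs, ys

xs, ys = sample_linear_icl_prompt(k=16, dim=8)
# A transformer trained over many such prompts must infer w from the k
# in-context pairs and output ys[-1] for xs[-1]; how closely its prediction
# matches least squares (or one step of gradient descent) is the question
# several of the papers above analyze.
```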
Diffusion
- Unveil Conditional Diffusion Models with Classifier-free Guidance: A Sharp Statistical Theory
- Towards a mathematical theory for consistency training in diffusion models
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
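For reference, the classifier-free guidance rule analyzed in the first paper above is just a weighted combination of the conditional and unconditional noise estimates; a one-function sketch, with variable names mine:

```python
import torch

def cfg_noise(eps_uncond: torch.Tensor, eps_cond: torch.Tensor, w: float) -> torch.Tensor:
    """Classifier-free guidance: w = 0 recovers the unconditional model,
    w = 1 the conditional one, and w > 1 over-emphasizes the condition."""
    return eps_uncond + w * (eps_cond - eps_uncond)
```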