Pre-Training and Fine-Tuning Strategy Instruction

1

Llama 3.2 90B Vision | Model | 59/100

via “instruction-tuned multimodal generation with alignment”

Meta's largest open multimodal model at 90B parameters.

Unique: Provides both base and instruction-tuned variants, so users can choose between raw model capability and aligned behavior. The torchtune framework supports custom fine-tuning on proprietary instruction datasets.

vs others: Open-weight instruction-tuned variants enable custom alignment without relying on proprietary API providers, though fine-tuning infrastructure requirements are higher than using managed APIs

2

Llama 3.2 11B Vision | Model | 59/100

via “instruction-tuned variant for aligned task performance”

Meta's multimodal 11B model with text and vision.

Unique: Instruction-tuned variant available as a separate model checkpoint, so users can choose between raw language modeling and task-optimized behavior. Users get instruction-following behavior out of the box, without running their own alignment pipeline (supervised fine-tuning or RLHF).

vs others: The instruction-tuned variant provides task alignment without users having to run RLHF themselves, while remaining smaller and faster than larger instruction-tuned models (70B+). The separate checkpoint lets users experiment with both variants without retraining.

3

Mistral Nemo | Model | 57/100

via “base and instruction-tuned model variants”

Mistral's 12B model with 128K context window.

Unique: Dual-variant release strategy provides both pre-trained base model for custom fine-tuning and instruction-tuned variant for immediate deployment, enabling flexibility for different use cases without requiring downstream alignment

vs others: More flexible than models released in a single variant, offering a choice between base and instruction-tuned checkpoints without forcing users to fine-tune or accept pre-aligned behavior

4

LLMs-from-scratch | Repository | 55/100

via “instruction fine-tuning with supervised learning on task-specific examples”

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Unique: Implements response-only loss masking by explicitly zeroing instruction token gradients, making the fine-tuning objective clear. Includes utilities to visualize which tokens contribute to loss, helping debug instruction-response boundary issues.

vs others: More transparent than HuggingFace's trainer because loss masking is explicit and modifiable; requires manual implementation of evaluation metrics unlike AutoTrain, but enables fine-grained control over training dynamics.
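The response-only loss masking mentioned in this entry can be illustrated with a small sketch. This is not the repository's code: it is a dependency-free approximation that assumes the common convention of marking ignored targets with -100, and the names `mask_instruction_labels` and `masked_cross_entropy` are hypothetical. Excluding a position from the loss means it contributes no gradient, which is the effect the entry describes.

```python
import math

IGNORE_INDEX = -100  # assumed convention: targets with this value are skipped


def mask_instruction_labels(token_ids, num_instruction_tokens):
    """Build next-token labels and ignore targets that belong to the instruction."""
    labels = list(token_ids[1:])  # input position i predicts token i + 1
    for i in range(len(labels)):
        # labels[i] is the token at original position i + 1
        if i + 1 < num_instruction_tokens:
            labels[i] = IGNORE_INDEX
    return labels


def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over positions whose label is not IGNORE_INDEX."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # instruction token: contributes nothing to the loss
        max_logit = max(row)
        log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in row))
        total += log_sum_exp - row[label]
        count += 1
    return total / count if count else 0.0
```

For a sequence `[5, 6, 7, 1, 2]` whose first three tokens are the instruction, `mask_instruction_labels` returns `[-100, -100, 1, 2]`, so only the two response tokens are scored. Printing the labels next to the tokens is a simple way to check where the instruction-response boundary falls.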

5

awesome-generative-ai-guide | Repository | 51/100

via “fine-tuning methodology and framework comparison”

A one-stop repository for generative AI research updates, interview resources, notebooks and much more!

Unique: Frames fine-tuning within a decision matrix comparing it to prompting and RAG approaches, with explicit cost-benefit analysis. Most fine-tuning guides assume fine-tuning is the right choice; this helps practitioners evaluate whether it's necessary.

vs others: More decision-oriented than framework-specific fine-tuning documentation; provides comparative analysis of when to fine-tune vs. use alternatives, whereas most resources focus on how to fine-tune assuming it's already decided.

6

ai-notes | Repository | 49/100

via “instruction tuning and RLHF technique documentation”

Notes for software engineers getting up to speed on new AI developments. Serves as a datastore for https://latent.space writing and product brainstorming, with cleaned-up canonical references under the /Resources folder.

Unique: Explicitly documents the pipeline from base model → instruction tuning → RLHF → chat model, showing how each stage builds on previous work rather than treating them as isolated techniques

vs others: More accessible than academic papers on RLHF because it contextualizes techniques within practical model development, but less detailed than specialized alignment research

7

Prompt-Engineering-Guide | Prompt | 42/100

via “fine-tuning guidance for GPT-4o and other models with prompt engineering integration”

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

Unique: Integrates fine-tuning guidance within the broader prompt engineering context, showing how fine-tuning and prompting are complementary approaches rather than alternatives

vs others: More practical than academic fine-tuning papers because it includes cost-benefit analysis; more comprehensive than vendor documentation because it compares fine-tuning with prompt engineering alternatives

8

llm-course | Model | 38/100

via “fine-tuning and preference alignment implementation”

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Unique: Provides both theoretical content (alignment algorithms, fine-tuning trade-offs) and 6 executable notebooks implementing SFT and preference alignment. Notebooks cover both efficient (LoRA) and full fine-tuning, enabling practitioners to choose based on their constraints.

vs others: More comprehensive than single-technique tutorials; more accessible than research papers because notebooks provide working code and step-by-step guidance
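The trade-off between LoRA and full fine-tuning mentioned in this entry largely comes down to how many parameters are trained. The arithmetic below is a minimal sketch, not taken from the course notebooks: it assumes a single weight matrix of shape `d_out x d_in` adapted with two low-rank factors of rank `rank`, and the function names and example sizes are illustrative.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for a rank-`rank` LoRA update W + B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in); W stays frozen.
    """
    return rank * (d_out + d_in)


def trainable_fraction(d_out, d_in, rank):
    """Share of the full matrix's parameters that LoRA actually trains."""
    return lora_params(d_out, d_in, rank) / full_finetune_params(d_out, d_in)
```

For an illustrative 4096 x 4096 projection with rank 8, full fine-tuning updates 16,777,216 parameters while the LoRA factors hold 65,536, a ratio of 1/256. Optimizer state shrinks accordingly, which is why LoRA fits on smaller hardware.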

9

Prompt Engineering Guide | Prompt | 24/100

via “fine-tuning guidance for model customization”

Guide and resources for prompt engineering.

10

OpenAI Cookbook | Repository | 22/100

via “fine-tuning workflow and evaluation patterns”

Examples and guides for using the OpenAI API.

11

Training language models to follow human instructions with human feedback (InstructGPT) | Product | 21/100

via “supervised instruction fine-tuning on diverse task examples”

* ⭐ 03/2022: [Multitask Prompted Training Enables Zero-Shot Task Generalization (T0)](https://arxiv.org/abs/2110.08207)

Unique: Combines multi-task prompting with supervised fine-tuning to enable a single model to generalize to new tasks without task-specific training. The approach uses diverse instruction types in a single training pass, leveraging task diversity as an implicit regularizer for generalization.

vs others: More sample-efficient than task-specific fine-tuning and enables zero-shot generalization, while providing better initialization for RLHF than raw base models because it establishes instruction-following patterns before preference optimization.
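The multi-task prompted setup described in this entry can be sketched as turning several task datasets into prompt/target pairs and shuffling them into one supervised training mixture. The templates, task names, and record fields below are hypothetical placeholders chosen for illustration; they are not drawn from the cited paper or from any released dataset.

```python
import random

# Hypothetical prompt templates, one per task.
TEMPLATES = {
    "sentiment": "Review: {text}\nIs this review positive or negative?",
    "summarize": "Summarize the following text:\n{text}",
}


def to_prompted_example(task, record):
    """Render one raw record as a natural-language prompt with its target."""
    return {
        "task": task,
        "prompt": TEMPLATES[task].format(text=record["text"]),
        "target": record["target"],
    }


def build_mixture(datasets, seed=0):
    """Pool all tasks into a single shuffled list for one training pass."""
    examples = [
        to_prompted_example(task, record)
        for task, records in datasets.items()
        for record in records
    ]
    random.Random(seed).shuffle(examples)  # interleave tasks deterministically
    return examples
```

Because every task shares the same prompt-to-target format, one model can be trained on the whole mixture with an ordinary supervised objective.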

12

Finetuning Large Language Models - DeepLearning.AI | Product | 19/100

via “supervised fine-tuning with instruction-following datasets”

Level: Medium

Unique: Focuses on practical instruction-following fine-tuning rather than theoretical foundations, with emphasis on dataset quality, loss computation strategies, and preventing catastrophic forgetting through careful validation

vs others: More accessible than raw PyTorch training loops while providing deeper architectural understanding than API-only fine-tuning services like OpenAI's fine-tuning endpoint

13

LLM Bootcamp - The Full Stack | Product | 19/100

via “LLM fine-tuning strategy and implementation”

Level: Medium

Unique: Provides a decision framework for fine-tuning vs alternatives (prompt engineering, RAG, model selection) with explicit cost-benefit analysis: not just 'how to fine-tune' but 'when to fine-tune.' Covers both open-source and commercial fine-tuning paths.

vs others: More strategic than Hugging Face fine-tuning docs; includes ROI analysis and trade-off guidance that helps teams avoid expensive fine-tuning mistakes.

14

CS25: Transformers United V3 - Stanford University | Product | 18/100

via “pre-training and fine-tuning strategy instruction”

Level: Medium

Unique: Frames pre-training and fine-tuning as complementary optimization problems with explicit trade-off analysis between data efficiency, computational cost, and final task performance, rather than treating fine-tuning as a simple downstream application of pre-trained weights

vs others: More comprehensive than individual model documentation, but less practical than frameworks like Hugging Face Transformers that provide reference implementations and pre-trained checkpoints

15

CS25: Transformers United V2 - Stanford University | Product | 18/100

via “transformer training and fine-tuning strategies”

Level: Medium

Unique: Connects pre-training objectives to downstream task performance, teaching how different pre-training strategies (MLM vs CLM vs contrastive) create different inductive biases, and how to select fine-tuning approaches based on compute constraints and task characteristics

vs others: More comprehensive than fine-tuning tutorials and more practical than pure training theory, providing decision frameworks for choosing between full fine-tuning, LoRA, and other parameter-efficient methods based on specific constraints
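The difference between the causal (CLM) and masked (MLM) pre-training objectives named in this entry shows up in how training examples are built from the same token sequence. The sketch below is a simplified illustration, not course material: the `[MASK]` string, the default 15% masking rate, and the function names are assumptions, and the MLM variant omits the random-replacement and keep-as-is cases that some implementations add.

```python
import random

MASK = "[MASK]"  # assumed placeholder for a hidden token


def clm_example(tokens):
    """Causal LM: each position sees only the past and predicts the next token."""
    return tokens[:-1], tokens[1:]


def mlm_example(tokens, mask_prob=0.15, seed=0):
    """Masked LM: hide a random subset and predict only the hidden tokens."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)   # scored: the model must recover this token
        else:
            inputs.append(tok)
            labels.append(None)  # not scored: token was visible
    return inputs, labels
```

In the causal case every position is a prediction target and context is left-to-right only, which suits generation. In the masked case only a fraction of positions are scored but each prediction can use context on both sides, which is the kind of inductive-bias difference the entry refers to.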

16

OpenAI Cookbook | Product

via “fine-tuning workflow guidance”

17

OpenAI Cookbook | Template

via “fine-tuning workflow with evaluation and validation”

18

Agentic | Product

via “agent training and fine-tuning pipeline”
