Date: 2024-11-22

Instruction datasets are specialized datasets designed to help a language model understand and respond to specific instructions or prompts. They provide the model with structured examples of questions, commands, or statements and the corresponding desired responses. These datasets essentially "teach" the model how to follow instructions by exposing it to various scenarios where it learns the correct patterns and formats for responses.

When providing instruction datasets to InstructLab, you can think of the model as a "student" and yourself as the "teacher." You need to provide your student with an educational reading in the form of a Markdown file, which acts as the student's source of truth when answering questions. You also need to provide example questions and answers in the form of a qna.yaml file to demonstrate how the student is expected to apply the knowledge from the reading during the test.
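
To make the teacher/student pairing concrete, here is a minimal Python sketch that bundles a reading with example questions and answers. It is an illustration only: the field names (`created_by`, `seed_examples`, `context`, `questions_and_answers`), the `build_qna` helper, and the sample text are assumptions made for this example, not a confirmed copy of the InstructLab schema, so check the schema your InstructLab version expects before writing a real qna.yaml file. The result is printed as JSON, which YAML 1.2 parsers also accept.

```python
import json

# Hypothetical reading: in practice this would be the contents of your Markdown file.
reading = (
    "# Acme widget\n"
    "The Acme widget has three settings: low, medium, and high. "
    "The default setting is medium."
)

# Hypothetical question/answer pairs whose answers can be found in the reading.
examples = [
    {"question": "How many settings does the Acme widget have?",
     "answer": "The Acme widget has three settings: low, medium, and high."},
    {"question": "What is the default setting of the Acme widget?",
     "answer": "The default setting is medium."},
]


def build_qna(context, pairs, author):
    """Assemble a qna-style dict. Field names are assumptions, not the official schema."""
    if not pairs:
        raise ValueError("at least one question/answer pair is required")
    for pair in pairs:
        # Reject empty questions or answers so every example teaches something.
        if not pair["question"].strip() or not pair["answer"].strip():
            raise ValueError("questions and answers must be non-empty")
    return {
        "created_by": author,
        "seed_examples": [
            {"context": context, "questions_and_answers": list(pairs)},
        ],
    }


qna = build_qna(reading, examples, "example-user")
print(json.dumps(qna, indent=2))
```

Keeping each answer traceable to a sentence in the reading mirrors the idea that the reading is the student's source of truth.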