Datasets from Instructions
This repository contains the code for the paper "Generating Datasets with Pretrained Language Models". The paper introduces a method called Datasets from Instructions (DINO 🦕) that enables pretrained language models to generate entire datasets from scratch.
🔧 Setup
All requirements for DINO are listed in requirements.txt. After creating and activating a new environment, you can install all required packages with pip install -r requirements.txt.
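If you prefer to script the setup, the same two steps (create an environment, then install from requirements.txt) can be sketched with Python's standard library. This is only an illustration: the environment name dino-env and the helper names pip_install_command and setup_environment are assumptions made for this example and are not part of the repository.

```python
import subprocess
import sys
import venv
from pathlib import Path


def pip_install_command(env_dir, requirements="requirements.txt"):
    """Build the `pip install -r` command for the interpreter inside env_dir."""
    # Virtual environments keep executables in "Scripts" on Windows and "bin" elsewhere.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    python = Path(env_dir) / bin_dir / "python"
    return [str(python), "-m", "pip", "install", "-r", requirements]


def setup_environment(env_dir="dino-env"):
    """Create a fresh virtual environment and install the requirements into it.

    Hypothetical helper: "dino-env" is an arbitrary name chosen for this sketch.
    """
    venv.create(env_dir, with_pip=True)  # same effect as `python3 -m venv dino-env`
    subprocess.run(pip_install_command(env_dir), check=True)
```

Calling setup_environment() from the repository root should have the same effect as running python3 -m venv dino-env and then installing requirements.txt with that environment's pip.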
💬 CLI Usage
Single Texts
To generate datasets for (single) text classification, you can use DINO as follows:
python3 dino.py 
 --output_dir  
 --task_file  
 --num_entries_per_label  
 --batch_size 1
   where