DuckDB: run SQL queries on 50,000+ datasets on the Hugging Face Hub

The Hugging Face Hub is dedicated to providing open access to datasets for everyone and giving users the tools to explore and understand them. You can find many of the datasets used to train popular large language models (LLMs) such as Falcon, Dolly, MPT, and StarCoder. There are tools for addressing fairness and bias in datasets, such as Disaggregators, and tools for previewing examples inside a dataset, such as the Dataset Viewer.

A preview of the OpenAssistant dataset with the Dataset Viewer.

We are happy to share that we recently added another
