How do I write a HuggingFace dataset to disk?
I have made my own HuggingFace dataset using a JSONL file:
Dataset({
    features: ['id', 'text'],
    num_rows: 18
})
I would like to persist the dataset to disk.
Is there a preferred way to do this, or is the only option to use a general-purpose library like joblib or pickle?

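You can save a HuggingFace dataset to disk using the save_to_disk() method, and reload it later with load_from_disk().
For example:
from datasets import load_dataset
test_dataset = load_dataset("json", data_files="test.json", split="train")
# Write the dataset to a directory; "test.hf" is only an example path.
test_dataset.save_to_disk("test.hf")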

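You can also export the dataset to other file formats using the to_* methods, such as to_csv(), to_json(), and to_parquet(). For example, this writes each split to its own JSON Lines file:
from datasets import load_dataset
dataset_dict = load_dataset("squad")
# Export each split to its own file; the file name pattern is only an example.
for split, split_dataset in dataset_dict.items():
    split_dataset.to_json(f"squad-{split}.jsonl")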
For more information look at the official Huggingface script:


