There are many dataset repositories and plenty of advice about where to find good datasets, so we've compiled the best recommendations from other data engineers here.
**Start here:** https://datasetsearch.research.google.com/
Google's dataset search engine makes finding datasets much easier than browsing a static list in a GitHub repo.
Other tips:
- Lots of local governments publish their datasets. In the U.S., many of them are cataloged at https://catalog.data.gov/dataset
- If you need additional help finding a dataset, you can ask the [r/datasets community](https://www.reddit.com/r/datasets/).
> [!Tip]
> If you can't find a suitable dataset, another option is to generate a synthetic one using a library like [Faker](https://github.com/joke2k/faker).