There are many repositories and recommendations for finding good datasets, so we've compiled the best suggestions from other data engineers here.

**Start here:** https://datasetsearch.research.google.com/

Google's dataset search engine makes it much easier to find datasets than searching through a static GitHub repo.

Other tips:
- Lots of local governments publish their datasets: https://catalog.data.gov/dataset
- If you need additional help finding a dataset, you can ask the [r/datasets community](https://www.reddit.com/r/datasets/).

> [!Tip]
> If you can't find a suitable dataset, another option is to create one using a library like [Faker](https://github.com/joke2k/faker).
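
As a rough illustration of the Faker option, here is a minimal sketch that generates a small synthetic customer table and serializes it to CSV. The schema (`customer_id`, `name`, `email`, `lifetime_value`) and the function names `generate_customers` and `to_csv` are hypothetical, chosen only for this example. The sketch assumes Faker is installed (`pip install Faker`) and falls back to plain placeholder values from the standard library if it is not.

```python
import csv
import io
import random

try:
    from faker import Faker  # third-party: pip install Faker
except ImportError:
    Faker = None  # fall back to placeholder values so the sketch still runs


def generate_customers(n, seed=42):
    """Return n synthetic customer rows (hypothetical schema, for illustration)."""
    rng = random.Random(seed)
    fake = None
    if Faker is not None:
        fake = Faker()
        fake.seed_instance(seed)  # make the generated values reproducible

    rows = []
    for customer_id in range(1, n + 1):
        if fake is not None:
            name, email = fake.name(), fake.email()
        else:
            name = f"Customer {customer_id}"
            email = f"customer{customer_id}@example.com"
        rows.append(
            {
                "customer_id": customer_id,
                "name": name,
                "email": email,
                "lifetime_value": round(rng.uniform(0, 5000), 2),
            }
        )
    return rows


def to_csv(rows):
    """Serialize a non-empty list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(generate_customers(5)))
```

Seeding both the random number generator and the Faker instance keeps the output stable between runs, which helps when the generated file feeds a practice pipeline with tests. Increase `n` to produce larger files for load or performance experiments.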