The more complex your problem and algorithms, the larger the dataset you need. Don’t limit your ideas: use the vast amount of publicly available data on the Internet to feed and train your models.
Natural language processing
Build a program to process and analyze large amounts of “natural language data” such as reviews. For instance, our Yelp crawler checks the web for the latest reviews of selected restaurants. You can also collect reviews of your favorite app from the Google Play Store.
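As a rough illustration of what "processing and analyzing" reviews can mean at its simplest, here is a minimal Python sketch that counts the most frequent terms across a handful of reviews. The sample reviews, the stopword list, and the `top_terms` helper are all made up for this example; in practice the text would come from a crawler, and a real project would likely use a dedicated NLP library instead of a regular expression.

```python
import re
from collections import Counter

# Hypothetical sample data; real reviews would come from a crawler's output.
reviews = [
    "Great pasta and friendly service.",
    "The service was slow, but the pasta was great.",
    "Terrible parking. Great dessert though!",
]

# A tiny, illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "and", "but", "was", "a", "though"}


def top_terms(texts, n=3):
    """Return the n most common non-stopword tokens across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


print(top_terms(reviews))
```

Even a simple count like this hints at what reviewers talk about most, and the more reviews you collect, the more reliable that signal becomes.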
Many of the latest technological innovations rely on image recognition. To train self-driving cars, diagnostic imaging software, or simply the face-unlock feature in our smartphones, you need a colossal number of images.