Yup, really. There’s a place at MIT where you can lounge around and eat bananas. A lot of bananas.

How many is a lot, you ask? 280,000 bananas in this academic year alone. This is a project run by the Undergraduate Association at MIT. They also place pianos around campus for folks to try out, and for those of us who prefer a more sedate outlook towards life, they have a hammocks team, who are doing exactly what you hope they would.
The bananas are free, by the way. If you happen to be on the MIT campus, you can drop in and chomp away to your heart’s content, courtesy of an MIT alum who’s also been known to, um, do other stuff besides.
Cool stuff, right?
The reason I bring this up is that a student at GIPE and I were chatting the other day about questions that her juniors were asking her. One of those questions was about how they didn’t have “enough R projects” to do. (R, for the uninitiated, is software that econ nerds like to freak out over.)
I’m always a little befuddled when students say they don’t have projects to work on, or are looking for datasets to work on. The lazy answer to give to queries such as this is something along the lines of Kaggle, or Google’s Dataset Search. There are hundreds of such data sources available online for free, and they’re one simple Google search away, so that’s one reason for my befuddlement.
But the primary source of my befuddlement is that students in possession of software, looking for a dataset, are very much a case of the cart being put in front of the horse! Software is a tool that helps you in the work you’re doing. But the approach that most students take is that they have the chops to use the software, and they don’t know what work to do.
You could always try and see if you can get an alum to buy bananas, and forecast demand for bananas!
Trend! Seasonality! Forecasting! For bananas consumed on campus.
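If you're wondering what that might look like in practice, here is a minimal sketch in Python, using nothing beyond the standard library. To be clear about what is assumed: the daily counts are synthetic numbers generated inside the code (they are not MIT's figures), the weekly cycle is my guess at what campus demand might look like, and the function names are made up for illustration. It fits a straight-line trend through weekly averages, estimates a day-of-week effect from what is left over, and adds the two back together to forecast the next week.

```python
def make_synthetic_counts(weeks=8):
    """Hypothetical daily banana counts: NOT real data, just for illustration."""
    weekday_effect = [30, 20, 10, 0, -10, -60, -50]  # Mon..Sun, invented numbers
    return [200 + 2 * day + weekday_effect[day % 7] for day in range(weeks * 7)]


def fit_trend(counts, period=7):
    """Least-squares line through weekly averages.

    Averaging over whole weeks first keeps the day-of-week pattern
    from distorting the slope. Needs at least two complete weeks.
    """
    weeks = len(counts) // period
    if weeks < 2:
        raise ValueError("need at least two complete periods of data")
    xs = [w * period + (period - 1) / 2 for w in range(weeks)]  # week midpoints
    ys = [sum(counts[w * period:(w + 1) * period]) / period for w in range(weeks)]
    mean_x = sum(xs) / weeks
    mean_y = sum(ys) / weeks
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_seasonality(counts, intercept, slope, period=7):
    """Average leftover (actual minus trend) for each position in the cycle."""
    buckets = [[] for _ in range(period)]
    for t, y in enumerate(counts):
        buckets[t % period].append(y - (intercept + slope * t))
    return [sum(b) / len(b) for b in buckets]


def forecast(counts, horizon=7, period=7):
    """Forecast the next `horizon` days as trend plus day-of-week effect."""
    intercept, slope = fit_trend(counts, period)
    seasonal = fit_seasonality(counts, intercept, slope, period)
    start = len(counts)
    return [intercept + slope * t + seasonal[t % period]
            for t in range(start, start + horizon)]


if __name__ == "__main__":
    history = make_synthetic_counts(weeks=8)
    for day, value in enumerate(forecast(history), start=1):
        print(f"day +{day}: about {value:.0f} bananas")
```

With real counts from your own banana room, you would swap `make_synthetic_counts` for whatever tally sheet you keep, and you would probably want something sturdier (Holt-Winters, say) once the data gets messy. The point of the sketch is only that the modeling is the easy part.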
I ask you: which is a cooler story to tell? A story in which you say that you downloaded a dataset from the internet and did some modeling with it…
OR
A story in which you say that you and a bunch of your friends got together and convinced your college to give you a room to stock bananas, convinced an alum to sponsor these bananas, figured out the logistics to procure, transport and store these bananas, and used a tool called R (or Python, or SAS or SPSS or whatever) to forecast demand?
The second option teaches you project management, the art of pitching a proposal, teamwork, logistics and coding. And so much more besides! It builds a story that works for the team, the institute, the community, and you use a statistical software the way it was meant to be used: as a tool that makes your life easier.
I know which story gets my vote.
You could build shared calendars, YouTube playlists using Google Sheets, demos for sampling using Google Sheets, or anything else that takes your fancy. Use Statsguru to analyze cricket stats using Python, automate the creation of book recommendation websites, or well, give bananas away for free.
But datasets for projects?
You’re limited by your imagination alone.
Very interesting! I had heard about the banana lounge, but the way the students did the analytics, forecasting and logistics is impressive!
Indeed. And their other projects are also very cool 🙂
While I have several advanced analytical tools at my disposal, I could forecast credit card transaction volume using Holt-Winters exponential smoothing in humble MS Excel. Of course, there are options for time series, linear and polynomial equations. The point being, as you rightly said, that tools are just enablers.
Good to hear from you! And yeah, I think we overrate the importance of stats models (which is a very weird thing for a prof to say, no?)