By Rahim Hashim
When I first started my PhD studies in the department, many of my fellow students found the R boot camp valuable for learning the basics of programming, but also said they wanted to learn Python specifically. Because no classes in the department provided this type of training, I reached out to Dr. Iva Greenwald and Dr. Harmen Bussemaker, who expressed enthusiastic support for creating a Python boot camp. Working with co-teachers Shanice Bailey (graduate student, Earth Observatory) and Jonathan Reeve (graduate student, Computational Literary Analysis) from the Columbia Foundations for Research Computing and Software Carpentry groups on campus, I set up a 7-week course open to anyone in the Biological Sciences department, including postdoctoral researchers and research assistants. When the invitation went out, I had no idea how many people might sign up. And I had no idea that this experimental boot camp would become a permanent course in the department.
The boot camp was billed as a training opportunity in computational problem-solving skills, designed to support novice students as well as those seeking to refresh their programming skills. The idea was to develop a foundational skill set in Python programming and build fluency in important programming packages including NumPy, Matplotlib, and more. Like many other classes during the COVID era, ours would meet via Zoom.
On the first day of the boot camp, Google Colab notebook, GitHub repo, and slides in hand, I was both excited and hesitant about leading my first virtual class, and expected 10-12 students to show up, max. Not even five minutes into class, I realized that I had badly underestimated turnout, with over 50 graduate students, postdocs, and research assistants joining our Zoom. To my delight, half the students who signed up were first- or second-year graduate students, which meant the boot camp was poised to offer them skills they could carry forward throughout their graduate careers. Having learned how to program late in my college career, I empathized deeply with the students who had expressed, in individual exchanges before class, uncertainty and doubt as to whether they would be able to learn, given past struggles trying to code. I kept this in mind on day one, focusing only on foundational principles of programming and Python-specific syntax for basic commands.
While classes were recorded, students were encouraged to make it to the Zoom meetings so they could ask questions that would benefit everyone. Shanice, Jonathan, and I administered regular feedback surveys, and heard frequently that many of the students who weren't able to make the Zoom meetings were thankful to be able to watch the recorded lectures at their own convenience, and to complete the weekly assignments independently.
Each week, we focused on one to two key topics, making sure to allocate time to practice what we learned as a group and to compare solutions with students who may have taken a different approach or run into an unfamiliar error. By weeks four and five, I could hear and feel the students' increased comfort with setting up large blocks of self-written code, and could not have been prouder when students would screen-share code that would have been incomprehensible to them just a few short weeks earlier.
In the last two weeks of the course, we spent half of each session on lecture and practice, and the other half working on final projects in small groups of two to three students, on any topic of interest to the group. The groups presented in the last class, and their final products entirely shattered my day-one expectations. Students demonstrated a wide breadth of interests and used a wide range of Python libraries. One group used a public API and the Python requests library to access data from the internet (e.g., Twitter) for lexical parsing and analyses. Another used the BeautifulSoup library to scrape the daily Congressional Record for user-generated summaries of daily Senate and House of Representatives meetings. And yet another performed rigorous statistical analyses of the Genomic Data Commons database, assessing the relationship between various risk factors (e.g., age, genetic makeup, family history) and outcomes. It was inspiring to see the students’ collective hard work culminate in fully functioning, efficient, and well-commented(!) scripts that could be used and recycled to push forward the analyses of their real lab data.
This was the first time this type of course had been offered in the department, so there was a lot to learn in terms of curriculum, logistics, and pretty much everything else, but overall, we are quite pleased with how it went this summer and are excited to improve on it in the years to come! We’re thrilled that our Department Chair, Dr. Harmen Bussemaker, supports turning this successful Python boot camp into a permanent half-course for our first-year students, starting next fall. Interested students should contact Dr. Raju Tomer, who will be administering the credited course.