Volume 95 Issue 35 | pp. 30-31
Issue Date: September 4, 2017

Perspectives: Teaching chemists to code

Providing undergraduate chemistry majors with computer programming skills can make them more efficient and effective scientists
By Charles J. Weiss
Department: Science & Technology
Keywords: Education, computer programming, big data, informatics, software, open source

Spreadsheets are a standard tool in chemistry for simple tasks such as data analysis and graphing. Chemistry students are often introduced to spreadsheets their first year of college, if not earlier, and those who continue on to do research will likely use them as a means of handling and visualizing data.

Weiss uses Jupyter Notebooks, like these examples shown, in his scientific computing undergraduate chemistry course.
Credit: Charles Weiss
An image shows an overlay of three computer screens of Jupyter notebook pages with graphs, data, and information.

Although spreadsheets have more capabilities than many people realize, they are still limited in comparison with science-specific data-analysis software applications. For example, spreadsheets are either incapable of or poorly suited to the image analysis, automated data processing, high-dimensional data analysis, complex simulations, and machine learning that modern chemical research laboratories need.

While some educators in the chemistry community have embraced advanced data-analysis software, it’s surprising that its use isn’t more common. And going beyond that, it’s surprising that computer programming has not effectively permeated the training of undergraduate chemistry majors. The American Chemical Society’s guidelines for undergraduate training do not directly address educating students on using digital tools and programming for dealing with experimental data.

In an age when chemistry classrooms have long been equipped with computers, and when research labs and companies where our students will go to work are experiencing a digital data deluge, it seems that we are leaving them ill prepared.

Software better geared to those earning chemistry degrees or conducting research is readily available. Common examples include MATLAB, Python’s SciPy stack, and GNU Octave. The last two are free, open-source packages. So why haven’t these become standard tools taught to all undergraduate chemistry majors? A key barrier to their adoption is that each of these packages requires some level of programming ability ... and computer programming is not part of the standard training for chemists ... but it should be.

Learning computer programming is an invaluable skill for chemists, as it empowers them to do more with collected data and, ultimately, to be more efficient and effective scientists. A competitive edge today doesn’t necessarily go to the person who can collect the best data but to the person who can best process and analyze the data collected. Doing so involves automating repetitive and time-consuming tasks, mining large data sets that don’t fit well in spreadsheets, and extracting information and trends too subtle or complex for people to discern without computers.

I recently chatted with a fellow chemist who had interviewed for a job at a company. The interviewer asked whether this chemist knew how to program. The company maintains an extensive database of its research results obtained over the years, and research managers want their team members to have computer programming skills so that they can access this data and use it in their ongoing research. The interviewer also pointed out that computer programming is a skill that, unfortunately, most chemists joining the company do not have.

I have seen firsthand how students light up when data analysis and computing resources are made available to them. To address the lack of programming training in traditional chemistry courses, last fall I taught the first iteration of a class called Scientific Computing for Chemists. During the course, chemistry students learned basic Python programming and how to use Jupyter Notebooks and the SciPy stack, all free and open source, to solve chemical problems (J. Chem. Educ. 2017, DOI: 10.1021/acs.jchemed.7b00078).

The importance and utility of the course content, which assumes no previous programming experience, are underscored by the postcourse survey, which revealed that three-quarters of the students had applied what they were learning to their independent research projects or other courses before the computing course had concluded. The survey also shows that the students believed they learned to perform valuable data analyses that they didn’t previously know were possible and that the course gave them a better understanding of, and more confidence in, working with digital data. I am teaching the course again this fall.

My chemistry colleagues have noticed a difference because of the course, stopping me in the hallway to comment on how excited the scientific computing students were about taking it and reporting how student coding carried over into other courses. For example, one professor noted that the students taking the advanced analytical chemistry course in the spring approached many of the problems using Python and Jupyter Notebooks. Instead of performing the calculations by hand or using spreadsheets, students immediately went for the more advanced tools.

Incorporating more data analysis into the undergraduate curriculum takes time. Chemistry departments have increasing requirements placed on them for granting undergraduate degrees and, further, have limits on the number of courses they can require or offer for a major. This situation can make it difficult to include a topic such as computer programming that educators may not view as central to chemical science.

But that is an outdated view of chemical science. Scientific computing can be introduced to students in a course as described above or through a course from a computer science department. It can also be infused into other existing chemistry courses.

Last spring, I replaced an outdated piece of software for an intermediate-level chemistry course lab activity with Python and Jupyter Notebooks. Our general chemistry radioactivity lab this fall will include students running precoded, stochastic simulations of radioactive decay to examine how randomness influences experimental outcomes. The students are being asked to modify values used in the code, but they do not need to know programming for the activity. This exercise will provide our first-year chemistry students with an early exposure to using more sophisticated tools for running simulations and visualizations.
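A precoded stochastic simulation of the kind described above might look something like the following minimal sketch. It is not the code used in the lab: the function name, the parameter names, and the default values are all assumptions chosen for illustration, and the model simply gives each surviving nucleus a fixed chance of decaying in every time step.

```python
import numpy as np


def simulate_decay(n_atoms=1000, decay_prob=0.05, n_steps=100, seed=None):
    """Return the number of undecayed nuclei after each time step.

    Hypothetical helper for illustration: every surviving nucleus decays
    independently with probability `decay_prob` during each step.
    """
    rng = np.random.default_rng(seed)
    remaining = np.empty(n_steps + 1, dtype=int)
    remaining[0] = n_atoms
    for step in range(1, n_steps + 1):
        # Draw how many of the surviving nuclei decay during this step.
        decayed = rng.binomial(remaining[step - 1], decay_prob)
        remaining[step] = remaining[step - 1] - decayed
    return remaining


# Values a student might be asked to change (assumed for this sketch).
counts_small = simulate_decay(n_atoms=50, seed=1)
counts_large = simulate_decay(n_atoms=50_000, seed=1)

# Ideal, non-random decay curve for comparison with the simulated runs.
steps = np.arange(101)
expected_fraction = (1 - 0.05) ** steps
```

Plotting `counts_small / 50` and `counts_large / 50_000` against `expected_fraction` would let students compare how closely a small and a large sample follow the ideal curve, without writing any code themselves.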

No matter how this new knowledge is incorporated into the curriculum, we should work to equip our students entering the field of chemistry with the digital skills to be efficient and effective scientists. This includes the ability to process, analyze, and visualize data using at least one programming language and to be comfortable writing scripts to automate research tasks. While it will take time and effort to implement, emphasizing programming skills will empower students and better prepare them for their careers.

Credit: Courtesy of Charles Weiss
A photo of Charles Weiss of Wabash College.

Charles J. Weiss teaches chemistry and scientific computing at Wabash College and conducts research in organometallic catalysis and bioinformatics. He taught himself to program in Python to support his teaching and research.

Chemical & Engineering News
ISSN 0009-2347
Copyright © American Chemical Society
Utsav Ranjan (September 5, 2017 2:08 PM)
Thank you very much for providing this valuable course to us. I want to learn it too. Can you help me get access to this course?
emma (September 6, 2017 9:25 AM)
All STEM majors could benefit from a course in coding, especially since all of our data-analysis software is built with code. It gives a greater connection between the chemist and the computer, bringing the field to new heights. As technology advances, most subatomic and microscopic chemistry will be dealt with through computer analysis, and thus, yes, I think a much greater emphasis on coding should be pursued.
Venkatesh (September 7, 2017 4:28 AM)
Thank you very much for your valuable suggestion. I have to learn this. Can you please provide access to attend your course?
Dr David Thomas (September 7, 2017 5:30 AM)
I'm a chemist who works alongside programmers on a daily basis and I think it's a terrible idea. Any kind of programming done by an amateur who has just attended a coding course is likely to look poor, be full of bugs and is most unlikely to have a 'help file' or manual. This makes it almost useless for the next student who comes along. Furthermore I can guarantee it will fail any sort of regulatory compliance test. Leave the job to a professional and concentrate on the most difficult task: SPECIFYING CLEARLY what you want new software to do for you. The skills needed are logic, critical thinking and communicating in good English.
Nathaniel Webber (September 11, 2017 6:22 PM)
There is a difference between educating chemists so they have the tools available to do things they want to and writing extremely complex, well vetted programs that are used by entire companies for compliance reasons. You don't have to do the latter to find programming useful. Any sort of computer science literacy would immensely help most chemists. To say otherwise is to ignore the way the winds are blowing. Not everyone has the luxury of having a team of coders at their beck and call. And that isn't required to write simple programs to automate your data analysis.

Also, amateurs are not the only ones who produce software that is full of bugs with poor documentation.
Tom (September 7, 2017 7:43 AM)
As someone who taught myself programming during my graduate career, I think this is a great idea. Most of the undergraduates that come through our lab take some class that incorporates MATLAB or Mathematica already. I have worked alongside programmers also and yes, my code is often more crude; however, they know what I am trying to accomplish and can refine it easily. The knowledge I have also allows me to know the limitations of the Python modules that are available, which helps collaboration with the software teams. Overall it is a great idea to teach students one of the popular programming languages as they will continue to be important in the field of chemistry.
Sukumar (September 7, 2017 8:31 AM)
Chemistry undergraduate students at Shiv Nadar University are required to take a programming course... in addition to computational chemistry & cheminformatics labs.
Hector (September 11, 2017 11:31 AM)
No question that this is a great idea. Chemistry and related degrees at UNAM (Mexico City) require anywhere from one to three applied programming courses. I had three, and it made me a better scientist. About 60% of my publications (https://scholar.google.com/citations?user=T1YleMUAAAAJ&hl=en) contain chemometric aspects developed through coding. Now as a professor I also teach a Scientific Python for Chemists course. We learn scripting to handle data files, implement numerical methods, signal and image processing, applications to spectroscopy and some basic molecular dynamics and Monte Carlo simulations. This empowers students greatly and increases their productivity in school jobs and graduate school. They develop logical thinking and learn to utilize the computer as a scientific tool.
Jim Passmore (September 15, 2017 3:59 PM)
Couldn't agree more that some elementary coding is useful as a chemist. A spreadsheet might be fine for a one-off manipulation of data, but in my experience, real life data files are never all the same size and format, so a second set of data requires significant re-work (if not starting over!) in a spreadsheet. With Python, I can read in a data file, filter it (e.g., with Pandas) then use any number of SciPy libraries to manipulate or extract data, whether that is to extrapolate an intercept, locate spectral peaks, curve fit to a model, or something more advanced. The real payoff is that subsequent data files take a fraction of the time to process. As an added plus, visualization is much better than the ugly plots created by that ubiquitous spreadsheet program.
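The read, filter, and analyze workflow described in this comment can be sketched roughly as follows. This is only an illustration, not the commenter's code: the synthetic two-peak "spectrum", the column names `x` and `signal`, the filter window, and the Gaussian model are all assumptions, and an in-memory CSV stands in for a real data file.

```python
import io

import numpy as np
import pandas as pd
from scipy.optimize import curve_fit
from scipy.signal import find_peaks


def gaussian(x, amp, center, width):
    """Simple Gaussian peak model (assumed line shape for this sketch)."""
    return amp * np.exp(-((x - center) ** 2) / (2 * width ** 2))


# Build a synthetic spectrum with peaks at x = 3 and x = 7 and store it
# as CSV text, standing in for a data file exported from an instrument.
x = np.linspace(0, 10, 501)
signal = gaussian(x, 1.0, 3.0, 0.2) + gaussian(x, 0.5, 7.0, 0.3)
csv_text = pd.DataFrame({"x": x, "signal": signal}).to_csv(index=False)

# Read the "file" and filter it with pandas to the region of interest.
data = pd.read_csv(io.StringIO(csv_text))
window = data[(data["x"] > 1) & (data["x"] < 9)].reset_index(drop=True)

# Locate spectral peaks taller than a chosen threshold.
peak_idx, _ = find_peaks(window["signal"].to_numpy(), height=0.1)
peak_positions = window["x"].to_numpy()[peak_idx]

# Fit the first peak to the Gaussian model.
first = window[(window["x"] > 2) & (window["x"] < 4)]
params, _ = curve_fit(
    gaussian,
    first["x"].to_numpy(),
    first["signal"].to_numpy(),
    p0=[1.0, 3.0, 0.5],
)
```

Because every step is a script rather than a set of manual spreadsheet edits, processing the next data file would mostly mean swapping the in-memory CSV for a path passed to `pd.read_csv`.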