Visualizing my first semester in Berkeley’s data science master’s program

W200 (Data Science Programming) and W201 (Research Design)

Richard Mathews II
4 min read · Apr 22, 2022

I just completed my first semester in Berkeley’s data science master’s program (MIDS), where I took W200 and W201.

  • W200 — Introduction to Data Science Programming
  • W201 — Research Design and Applications for Data and Analysis

I really enjoyed my classroom experience, especially W201, and wanted to share some neat visualizations for future/prospective students. Enjoy :)

Knowledge

In 2020, I adopted Obsidian as my personal knowledge management (PKM) system. It is a link-based note-taking system, which allows me to visualize my notes as a graph. I tag my graduate school notes (#mids/w200, for example) to make querying and graph views for my courses easier.

One of the nice features of Obsidian is node color-coding. To get an idea of how each course fits in my knowledge space, I color-coded both courses (blue for W200 and red for W201) and rendered my entire graph. The result is below. Obsidian’s graph layout takes neighbors and link force into account, so you can get a sense of how “broad” each course is from the space it spans in the graph. W201 clearly spans a larger portion of my graph than W200, which makes sense because it had a much broader scope.

My entire Obsidian knowledge graph, with my course content color-coded (W200=blue, W201=red)

Obsidian also lets you view local graphs, which represent a note’s neighborhood. I rendered the local graphs for W200 and W201 separately and set the depth to 2 so I could see the course modules and the notes they link to (shown below). I color-coded course content as blue and topic notes as gold. The local graphs give a sense of the topics covered in each course and how those topics relate to one another. The W201 graph is noticeably denser, which reflects how many topics that course covered.

My Obsidian local graph for W200 (blue nodes are course modules)
My Obsidian local graph for W201 (blue nodes are course modules and textbook chapters)

Words

One of the reasons I prefer Obsidian over other applications (Notion, Roam, etc.) is that the notes are just text files, not some proprietary format. This means I can run Python scripts over all my notes, which opens the door to any text processing task. Using some Obsidian utility functions I created, I wrote a Python script that scans my MIDS notes and generates the word cloud below. The four dominant words/phrases were “data,” “data science,” “question,” and “people.” This is no surprise because much of the focus in W201 was on asking the right question and working with people (storytelling, persuasion, etc.).

Word cloud generated using text from my notes (both W200 and W201)
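For readers who want to try something similar, here is a minimal sketch of the scanning step. It is not the script used for the figure above. It assumes the notes are Markdown files in a single vault folder and that each course note contains a tag starting with #mids/. The function names and the tiny stop-word list are hypothetical and only for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical, deliberately short stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}


def notes_with_tag(vault, tag_prefix):
    """Yield the text of every Markdown note that contains the given tag prefix."""
    for path in sorted(Path(vault).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        if tag_prefix in text:
            yield text


def word_frequencies(texts):
    """Count lowercase words across notes, skipping tags, stop words, and very short words."""
    counts = Counter()
    for text in texts:
        # Remove tags such as #mids/w200 so they do not show up as words.
        text = re.sub(r"#[\w/-]+", " ", text)
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts
```

The resulting counts can then be handed to a plotting library. For example, the wordcloud package can build a cloud from a dictionary of frequencies with WordCloud().generate_from_frequencies(counts). Note that this sketch only counts single words, so a phrase such as “data science” would need an extra bigram step.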

Time

I am an avid practitioner of the quantified-self lifestyle, and one of the variables I track daily is time. Just as people track their financial transactions and set monetary budgets, I track my time and set “temporal budgets” so that I know how I spend my time and whether I am staying within budget. Using my time data for the Spring 2022 semester, I created two time-series plots, one daily and one weekly, of the time I spent on course activities (live sessions, assignments, reading, etc.). I averaged about 17 hours per week across W200 and W201 combined, which is standard from what I have heard.

Time-series plots for the time I spent on both W200 and W201 in total
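The daily and weekly views can be derived from the same log. The sketch below is one way to do that aggregation, not the code behind the plots above. It assumes the time log is a list of (date, hours) entries, which is a hypothetical format, and it groups weeks by ISO week number.

```python
from collections import defaultdict
from datetime import date


def daily_totals(entries):
    """Sum hours per calendar day from (date, hours) entries."""
    totals = defaultdict(float)
    for day, hours in entries:
        totals[day] += hours
    return dict(sorted(totals.items()))


def weekly_totals(daily):
    """Sum daily hours per (ISO year, ISO week) pair."""
    totals = defaultdict(float)
    for day, hours in daily.items():
        iso = day.isocalendar()
        totals[(iso[0], iso[1])] += hours
    return dict(sorted(totals.items()))


def average_weekly_hours(weekly):
    """Mean hours per week over the weeks that have at least one entry."""
    return sum(weekly.values()) / len(weekly) if weekly else 0.0


# Made-up entries for illustration only; these are not my real numbers.
entries = [
    (date(2022, 1, 10), 2.0),
    (date(2022, 1, 10), 1.5),
    (date(2022, 1, 12), 3.0),
    (date(2022, 1, 17), 4.0),
]
daily = daily_totals(entries)
weekly = weekly_totals(daily)
```

Each dictionary maps directly onto a time-series plot: the keys become the x-axis and the values the y-axis. Weeks with no logged time are left out of the average in this sketch, so a semester with empty weeks would need those filled in with zeros first.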

Conclusion

I hope these visualizations help upcoming/prospective MIDS students better understand the time commitment and course content for W200 and W201. I plan to continue these visualizations throughout my MIDS journey and post them here on my Medium channel. If you want to get in touch to learn about my experience with Berkeley’s MIDS program, reach out to me on LinkedIn or through my personal website.

The code used to create my word cloud and time series visuals can be found here.

Richard Mathews II

Applied AI scientist, graduate student @ Berkeley, and biohacker. Interested in meta-learning, systems, AI, and data-driven lifestyles.