Many universities make their curriculums publicly available, listing all required courses to attain a degree. The Computer Science field is no different. Using such freely accessible resources (see MIT (English), JMU (German), and KIT (German) as a starting point), one can create a custom schedule.
This post attempts to recreate a CS Bachelor’s degree, but with online resources. Some of them are available at no charge, others will cost you a small fee. All in all, they are an inexpensive way to learn things similar to those taught at universities. A remark: With the resources I chose, I generally tried to adhere to the referenced curricula. That’s not always possible; being enrolled in university courses obviously gives you access to a wider range of lectures.
Generally, a Bachelor’s degree takes six semesters. The first semesters focus on the basics, and many courses at this stage are mandatory. After these obligations are met, you are free to choose from a pool of lectures. That’s where one can choose a track towards Software Engineering, Logic, or Machine Learning; the last of these is the focus of this post.
The first semester consists of introductory courses only. You build your foundational knowledge in mathematics, logic, algorithms and data structures (often abbreviated as ADS), and programming.
“Algorithms and data structures” focuses on, well, algorithms and data structures. The first part, algorithms, introduces you to sorting, which includes QuickSort, MergeSort, and many others. Furthermore, you’ll learn to examine the running time of such algorithms based on their input, which is done with the help of Big-O notation.
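To make the sorting part a bit more concrete, here is a minimal sketch of MergeSort in Python. It is not taken from any of the referenced courses; the function name `merge_sort` and the sample list are my own choices for illustration. The list is halved recursively and the sorted halves are merged, which is where the O(n log n) running time comes from.

```python
def merge_sort(values):
    """Return a new sorted list using top-down merge sort."""
    # A list with zero or one element is already sorted.
    if len(values) <= 1:
        return list(values)

    # Split the input in half and sort each half recursively.
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])

    # Merge the two sorted halves by always taking the smaller head element.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    # One of the halves is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))  # prints [1, 2, 5, 5, 6, 9]
```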
The second part, data structures, introduces widely used data structures.
These structures are used to store data so that one can quickly find or update the relevant information. The structures used include red-black trees, binary search trees, and graphs. Naturally, you’ll also learn how to traverse such structures. A course that teaches you such topics is the Data Structures and Algorithms Specialization, available on Coursera.
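As a rough illustration of such a structure, the following sketch builds a plain (unbalanced) binary search tree and traverses it in order. The names `Node`, `insert`, `contains`, and `in_order` are hypothetical and chosen for this example only; a red-black tree would add rebalancing on top of the same idea.

```python
class Node:
    """A single node of a binary search tree."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the tree rooted at root and return the (new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    # Equal keys are ignored, so the tree holds each key once.
    return root


def contains(root, key):
    """Walk down from the root, going left or right at each node."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


def in_order(root):
    """Visit left subtree, node, right subtree: yields keys in sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


root = None
for key in [8, 3, 10, 1, 6, 14]:
    root = insert(root, key)

print(in_order(root))      # prints [1, 3, 6, 8, 10, 14]
print(contains(root, 6))   # prints True
print(contains(root, 7))   # prints False
```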
The mathematics lecture helps you understand the basics of mathematics used for computer science. Mainly, this is about matrix calculations, derivatives, and functions. A course covering these topics is the Mathematics for ML Specialization, offered by Imperial College London. You begin with linear algebra, proceed to optimize functions, and finally learn how to compress high-dimensional data.
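To give a feeling for what "optimizing a function" with derivatives means, here is a small sketch of gradient descent with a numerically approximated derivative. It is my own toy example, not material from the specialization; the function `loss`, the learning rate, and the step count are assumptions picked so that the example converges.

```python
def derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def gradient_descent(f, start, learning_rate=0.1, steps=200):
    """Walk downhill from start by repeatedly stepping against the slope."""
    x = start
    for _ in range(steps):
        x -= learning_rate * derivative(f, x)
    return x


def loss(x):
    # Hypothetical example function with its minimum at x = 3.
    return (x - 3) ** 2


minimum = gradient_descent(loss, start=0.0)
print(minimum)  # a value very close to 3
```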
Basics of programming
This lecture wants you to get practical experience in programming. Since this post is en route to Machine Learning, it’s only logical to choose Python as the language. I like Python’s simplicity; after you have completed the Python for Everybody Specialization, you’ll know what I mean.
The lecture on logic focuses on relational algebra, deducing facts from knowledge, negation, and further concepts of mathematical proof.
Such topics may sound dull, but learning them also helps you outside of your studies. I found the Introduction to Logic course to be the best fit. As with the previous courses, there won’t always be 100% coverage. You can audit the course, which gives you free access to the resources, but not to the graded exams. That’s a fair deal!
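"Deducing facts from knowledge" can be sketched in a few lines. The following is a simple forward-chaining loop over if-then rules; it is one possible way to illustrate the idea and not necessarily how the course presents it. The rule set and the fact names are made up for this example.

```python
def forward_chain(facts, rules):
    """Derive every fact that follows from the initial facts and the rules.

    Each rule is a pair (premises, conclusion): if all premises are known,
    the conclusion becomes known as well.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True  # a new fact may enable further rules
    return known


# Hypothetical knowledge base.
rules = [
    (("rain",), "wet_ground"),
    (("wet_ground", "cold"), "ice"),
    (("ice",), "slippery"),
]

derived = forward_chain({"rain", "cold"}, rules)
print(sorted(derived))
# prints ['cold', 'ice', 'rain', 'slippery', 'wet_ground']
```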
That’s it for the first semester. The second semester continues to focus on the basics. You start with learning statistics and probability theory to deepen your mathematical understanding. With this ticked, you hear about software engineering and the basics of computers.
Statistics and probability
Working as a Data Scientist or in a similar role means working with data. And working with data includes exploration. Obviously, you can’t just look through 1000 images one by one. You don’t have to: statistics lets you summarize the data instead. Checking the average shape? Possible. The mean of a list? Possible. Math comes in handy here. The Probability and Statistics: To p or not to p? course is a good fit for this. As a bonus, the title has a sense of humour.
With our math skills set, we can advance to more computer-related topics. Software engineering teaches concepts of, well, software engineering: creating software, making it adaptable, making it fast, the basics of clean code, inheritance, and patterns. There’s much going on in this field; the Software Design and Architecture Specialization teaches you more about UML, OOP, and design.
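As a small illustration of such summaries, Python’s built-in `statistics` module already covers the basics. The list of image widths below is invented for this example; with real data you would read the values from your files first.

```python
import statistics

# Hypothetical widths (in pixels) of a handful of images.
image_widths = [640, 800, 640, 1024, 800, 640]

mean_width = statistics.mean(image_widths)      # arithmetic average
median_width = statistics.median(image_widths)  # middle value of the sorted data
mode_width = statistics.mode(image_widths)      # most frequent value
spread = statistics.stdev(image_widths)         # sample standard deviation

print(f"mean={mean_width:.1f} median={median_width} mode={mode_width}")
print(f"standard deviation={spread:.1f}")
```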
Basics of computers
This lecture covers the components of (modern) PCs. I had a hard time finding a lecture covering topics equivalent to my undergraduate studies. In the end, I settled on the Computers and the Internet course from Khan Academy. If you are looking for a book, then you can read Computing with Quantum Cats, which also goes over the history of computers.
That’s it for the second semester; the third one is already coming up. It covers signal transmission, advanced programming, and an undergraduate seminar.
The lecture on signal transmission focuses on getting digital data from A to B. But what if the transmission is leaky, and we are losing information? Or there’s background noise? Well, there are techniques like filters or encoding schemes that address this. Head over to the Digital Signal Processing Specialization to learn more.
The curricula I checked as a reference contained an undergraduate seminar as a placeholder. As long as you choose a topic from your field, CS & ML, you are fine.
At a university, you are required to cover a topic based on one or more given papers. You then write an analysis of this field and present your findings in a short discussion.
This setting is difficult to replicate with online-only courses. There are two ways around this: You can find a like-minded fellow and talk about a field of your choice, or you can go through yet another course. The latter is my preferred solution, and since this replicated degree focuses on Machine Learning, it’s only logical to follow Andrew Ng’s CS230 Deep Learning course from Stanford. This course gives an overview of the field, presented by one of its most prominent researchers.
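One of the simplest filters is a moving average, which smooths a noisy signal by replacing each sample with the mean of its neighbourhood. The sketch below is only meant to convey the idea; the function name `moving_average`, the window size, and the sample values are assumptions for this example, and real DSP courses cover far more capable filters.

```python
def moving_average(signal, window):
    """Smooth a signal by averaging each run of `window` consecutive samples."""
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    return [
        sum(signal[i:i + window]) / window
        for i in range(len(signal) - window + 1)
    ]


# Hypothetical measurement: roughly constant around 1.0, with one noise spike.
noisy = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 1.0, 1.1]
smoothed = moving_average(noisy, window=3)

print(smoothed)  # the spike of 5.0 is spread out and clearly reduced
```

Note that the output is shorter than the input, because only positions with a full window are kept.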
Lecture from a different field
This is not required; you can see it as a bonus: hearing a lecture from a different field. I recommend choosing biology to widen your knowledge. You can also choose economics, physics, chemistry, or whatever you are curious about.
Theoretical computer science
The fourth semester begins with a tough course: Theoretical computer science. I remember attending this lecture with many like-minded fellows. It was tough, and we had huge respect for the subject. It covers computability, language theory (which is not necessarily about human language), determinism, convergence, and complexity theory.
Finding an online course that covers most of this was challenging. In the end, I settled on Intro to Theoretical Computer Science and the Computability, Complexity, and Languages book. I recommend taking the course first and advancing to the book afterwards.
How does one find the shortest route between A and B? Once the problem is modelled as a graph, the shortest route can be assembled from the shortest sub-routes between A and B. But graphs are not limited to such (theoretical) problems: they are used frequently in social networks (who follows whom), molecule theory, and many more fields. The following two courses will teach you more about them: Graph Theory and Introduction to Graph Theory.
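A classic way to answer the shortest-route question is Dijkstra’s algorithm, sketched below with Python’s `heapq` module. I picked this algorithm for illustration; the courses may approach the topic differently. The small road network and the function name `shortest_path` are made up, and the sketch assumes non-negative edge weights.

```python
import heapq


def shortest_path(graph, start, goal):
    """Return (path, distance) from start to goal using Dijkstra's algorithm.

    graph maps each node to a dict of {neighbour: non-negative edge weight}.
    """
    distances = {start: 0}
    previous = {}
    visited = set()
    queue = [(0, start)]

    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue  # outdated queue entry
        visited.add(node)
        if node == goal:
            break
        for neighbour, weight in graph.get(node, {}).items():
            new_dist = dist + weight
            if new_dist < distances.get(neighbour, float("inf")):
                distances[neighbour] = new_dist
                previous[neighbour] = node
                heapq.heappush(queue, (new_dist, neighbour))

    if goal not in distances:
        return None, float("inf")  # goal is unreachable

    # Walk backwards from the goal to reconstruct the route.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, distances[goal]


# Hypothetical road network with travel times as weights.
graph = {
    "A": {"C": 2, "D": 5},
    "C": {"D": 1, "B": 7},
    "D": {"B": 3},
    "B": {},
}

print(shortest_path(graph, "A", "B"))  # prints (['A', 'C', 'D', 'B'], 6)
```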
A practical course in software
This course wants you to create something. It does not have to be something ridiculously complex. I recommend a simple script that you run on a web server. Have a look at the gallery at streamlit.io and get inspired. It’s not so much about creating a superb product, but about learning something new.
Lecture from a different field
In a previous lecture, we heard about the components that build a computer. We will extend this and shift our focus towards interacting with hardware on a lower level. Have a look at the Embedded Software and Hardware Architecture course to learn about low-level firmware. If you are keen to invent new components, then a course on describing hardware might be a good fit. In this case, check out the Hardware Description Languages for FPGA Design course.
Using a computer is easy these days. Using your toolset efficiently, that’s another story. When I started programming with PyCharm, I frequently consulted the documentation to look up key shortcuts. With them memorized, programming is way easier. Proficiency with your programs is not limited to IDEs; it extends to using the command line, using text editors, and permission management. You don’t want to spend your life figuring out tools but creating things with them. That’s the intention of The Missing Semester of Your CS Education and the Operating Systems and You: Becoming a Power User courses.
Instead of storing your data in named folders and watching the chaos grow, you can use databases as an alternative. Finding the average number of images? Finding all users that gave 5 stars? With their query languages, databases easily allow one to extract useful information. And because SQL is the de facto standard language to interact with databases, every major programming language has interfaces to interact with them. Since Python is one of them, it also provides packages to hide all the complexity. All that’s left for you is setting up the databases and preparing queries. That and more you’ll learn with IBM’s Databases and SQL for Data Science with Python course.
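To show how little code this takes, here is a minimal sketch using Python’s built-in `sqlite3` module with an in-memory database. The table `ratings`, its columns, and the sample rows are invented for this example; the IBM course may well use a different database system.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (username TEXT, stars INTEGER)")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO ratings (username, stars) VALUES (?, ?)",
    [("ana", 5), ("ben", 3), ("cleo", 5), ("dan", 4)],
)

# Finding all users that gave 5 stars.
five_star_users = [
    row[0]
    for row in conn.execute(
        "SELECT username FROM ratings WHERE stars = ? ORDER BY username", (5,)
    )
]

# Finding the average rating.
average_stars = conn.execute("SELECT AVG(stars) FROM ratings").fetchone()[0]

conn.close()

print(five_star_users)  # prints ['ana', 'cleo']
print(average_stars)    # prints 4.25
```

The `?` placeholders let the database driver insert the values safely, which is preferable to building the query string by hand.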
A course on general qualifications
In a world full of writing, reading, speaking, and presenting, it’s a smart idea to take a dedicated course. What you’ll learn is not only applicable to your studies but also helps you throughout other areas of your life. And, since universities require you to write and present a lot, it’s good to begin with the Presentation Skills: Speechwriting, Slides and Delivery Specialization and the Effective Communication: Writing, Design, and Presentation Specialization.
Your last semester features two courses and your thesis.
Data mining teaches you how to analyse and extract data from many domains. Be it text or image data, the techniques are universally applicable. The Data Mining Specialization focuses on these concepts, beginning with data visualization, handling text data, discovering patterns, and ending with clustering.
Mathematics for Data Science
There has been a lack of concrete ML courses so far. This is due to a Bachelor’s degree mainly covering the basics, with only partial freedom of choice in the last semesters. Often, a Bachelor’s degree is followed by a Master’s degree, where the concepts are deepened. This is why I have mainly restricted the courses to a broad education, which also follows the curriculums I studied. Nonetheless, to prepare you for more, the last course covers the mathematical foundations of Data Science. You might see similarities to your earlier maths courses, which can’t be prevented. After all, that’s a good thing: You get to hear the same content but from another teacher.
Bachelor thesis / ML course
Having heard lectures from a wide pool (but still restricted to the field of mathematics and computer science so far), you surely have discovered your interests. They are difficult to find early on, but with growing experience, you’ll get to know your strengths and weaknesses. With them in mind, you can design your own thesis project and present your work in a short blog post or video. Another option is to take Andrew Ng’s Machine Learning course instead, which connects what you have learned in previous courses.
Where to go next?
If you are looking for more detailed resources, you can look at this GitHub repository.