For most of my adult life, I was too cowardly to write this text, never mind posting it. I was worried about what people would think, and the repercussions on my career. Would people still take me seriously? But I’m now a whole full Professor of Computer Science at a top university, with all kinds of fancy metrics and titles to point to. Time to stop being such a pussycat.
Here’s the thing: I’ve always been terrible at math. How bad? Tell me to solve a quadratic equation, or differentiate something, and I would have no idea where to even start. I usually skip right past the equations when I read a paper because I don’t understand them. Last time I proved a theorem was approximately never.
I also always hated math. Not the abstract idea of math, but math as it actually exists. In particular, the activity of doing math, and trying to get stuff right. I hate math because I’m so bad at it, but clearly my negative feelings towards the topic are not helping me get better at math.
I almost failed maths in high school, and all my memories of math class in high school are of me staring out the window, talking to friends, writing weird stories, or programming my calculator. Anything to avoid those detestable math problems. During my undergrad, I had to take an introductory calculus class in order to take some computer science class I wanted to take. I failed the exam for that calculus class four times, and only passed on the fifth try because I realized that one of the professors was reusing his old exams with very minor changes. I learned basically nothing from that course. And not only do I not know how to differentiate anything, I also never learned things such as matrix multiplication or other parts of linear algebra that are supposed to be crucial for AI researchers like me.
Our PhD program requires my PhD students to take some theory courses that I’m pretty sure I couldn’t pass myself. I’m not even sure I could make it through our required undergrad theory courses. Some kind of computer scientist I am. The reason I could get a bachelor’s degree is that my undergrad is in Philosophy, though I did take a bunch of CS classes.
Which brings us to the question everyone asks, even though they often don’t believe my answer. The question is: how the hell can I be a successful AI researcher without knowing math? The implication is that I’m lying, or at least grossly exaggerating, because we all know that machine learning is very mathematical. It must be, because those GPUs are multiplying matrices all day. I’ll try to answer this below. Please bear with me, I’m trying to be as honest as I can here.
My first instinct is to say that mathematics is not important to the research I do. I never need to prove a theorem or even rewrite an equation. The details of how the matrices get multiplied don’t matter to me. I deal in ideas and code. Not math.
I remember when I taught myself programming using a Turbo Pascal IDE I discovered on the used computer I had bought when I was 13. As I blundered my way through the intricacies of Pascal, mostly by trial and error, I felt that a beautiful new world was opening up to me. It was hard, but I could learn it, and I had talent for it. Writing program code felt pretty much like writing natural language. And I was always good at writing. One of the things I learned about was variables. Some time after that, we were introduced to variables in school. I was excited, as here was a concept I actually knew something about! I was pleasantly surprised that I seemed to understand variables better than anyone else. But this didn’t help with the mind-numbingly boring stuff we did in maths class, all these exercises up and down the page.
In my undergrad, after two years of philosophy and psychology, I started taking computer science classes. I was naturally good at computer science. I understood the concepts and I became a cracked programmer. It was a lot of hard work but that was not a problem, because it was so fun. It was very different to studying philosophy, where I would just read the book and ace the exam. Mathematics, on the other hand, was all hard work and no understanding, and I couldn’t pass the exam at all.
In short, I was good at writing, philosophy, programming, and most aspects of computer science, and saw these subjects as intimately related. At the same time, I was terrible at math. So you may understand how I can see maths as largely unrelated to what I do.
And yet, I often use mathematical concepts when I talk about my research. Actually, when I do research as well. A recent project of ours focuses on embedding programs represented as syntax trees into a latent space that can then be searched efficiently. This involves considerations such as keeping the dimensionality of the space low enough to allow covariance matrix calculation and how to regularize the search to stay within the training distribution. That’s a bunch of mathematical terms there. And they mean something, because reasoning with them is how we got the method to work so well. But please don’t ask me to write down the equations.
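To make the dimensionality point concrete in code rather than equations, here is a rough sketch. It is not taken from the project; the function names and the example sizes are made up for illustration. The only fact it relies on is that a covariance matrix has one entry per pair of dimensions, so its size grows with the square of the dimensionality.

```python
# Illustrative sketch only: these helpers are hypothetical, not the project's code.

def covariance_matrix_entries(dim):
    """Total entries in a dim x dim covariance matrix (one per pair of dimensions)."""
    return dim * dim


def distinct_covariance_entries(dim):
    """The matrix is symmetric, so only the diagonal plus one triangle is distinct."""
    return dim * (dim + 1) // 2


if __name__ == "__main__":
    # Example sizes are assumptions chosen to show the growth, nothing more.
    for dim in (2, 128, 10_000):
        total = covariance_matrix_entries(dim)
        distinct = distinct_covariance_entries(dim)
        print(f"dim={dim}: {total} entries, {distinct} distinct")
```

Going from a hundred or so dimensions to ten thousand multiplies the work by roughly ten thousand, which is the kind of informal picture that is enough to decide to keep the space small.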
So, how do I reason with mathematical concepts if I cannot do the symbol manipulation? Mostly visually. There are these little images of these things going on in my head, like a search blob moving against a gradient in a latent space. The images are somehow incomplete and clearly misleading (it is impossible to visualize a 128-dimensional space, so you have to think of it as two-dimensional), but they are useful. But I also sometimes think of them in terms of program code, and the program code often comes out as animations, e.g. I see the program counter looping in a for-loop. It’s not clear to me how being able to do the symbol manipulation (e.g. rewriting the equation for the encoder function in some other form) would be of any help in reasoning about the algorithm. But that might just be because I don’t know how to do the symbol manipulation. If I did, maybe I would see new possibilities.
There are other uses of mathematical concepts which are possibly even fuzzier. A key skill in designing algorithms is understanding approximately how they scale in time and space. This basically boils down to figuring out what operations take time and which data takes space, and then having a mental picture of how many of them there are. Quite often, you’re counting loops. I learned the basics of doing this formally back in undergrad, but I haven’t done a formal analysis of an algorithm since. But I do loose, very informal analyses a lot when thinking and talking about algorithms. They help. But please don’t ask me to write them down.
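The "counting loops" habit can be shown in a few lines of code. This is a minimal sketch with made-up function names, not a formal analysis: it literally counts how often the loop body runs, which is the mental picture described above.

```python
# Illustrative sketch: count how many times a loop body runs as the input grows.

def steps_single_pass(n):
    """One loop over n items: the body runs n times."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def steps_all_pairs(n):
    """A loop inside a loop: the body runs once per pair, so n * n times."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"n={n}: single pass {steps_single_pass(n)}, all pairs {steps_all_pairs(n)}")
```

Making the input ten times bigger makes the single loop ten times slower and the nested one a hundred times slower. That rough ratio is usually all the informal analysis needs.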
Could it have been different? Could I have become the kind of person who was genuinely good at maths, enjoyed it, and perhaps even published papers with mathematical results of my own? Who knows. The closest I ever came to thinking I understood math was during a discrete maths course in my undergrad, which I found myself actually enjoying, although it was a lot of work. For a little while I felt like math might actually be for me. I’m not sure if this was because of the topic, as discrete maths felt discontinuous with all the continuous maths I’d learned to not learn so far. Maybe it was mainly my very inspirational teacher, Thore Husfeldt. In either case, the feeling dissipated as soon as I encountered that analysis class, the one that I failed four times.
As I write this, I keep fighting the impulse to brag about how successful a researcher I am. “Trust me, I’m a good researcher even though I don’t know math, see, I published so-and-so many papers and got so-and-so many citations and won this-and-that award.” I hate being that guy. So I’ll keep fighting that impulse. But it speaks to how deeply the impostor syndrome has taken root. Enough people have told me that I cannot possibly do what I’m doing without knowing a lot of math that I’ve somehow come to think I can’t do what I do.
If you’ve read this far, you may wonder where I’m going. Who am I writing this for, and what am I trying to say? Let’s discuss some alternatives.
I’m definitely not saying that you shouldn’t study math. If you like mathematics, go ahead and study it. It’s useful (I know) and beautiful (they say). I have a lot of respect for theoreticians and wish I could do what they do.
Another thing I don’t want to do is to blame my teachers. Maybe it was my teachers who taught me that math was boring and that I was bad at it. Maybe it was the curriculum they had to follow. Maybe it was me. Other people seemed to enjoy those same math lessons, after all. Dear teachers, thank you for trying to teach me; I don’t think you and I were good fits for each other, but that’s not your fault.
More likely, I’m writing this for those of my colleagues who are in the same boat as me, who somehow became successful computer scientists despite sucking at math. I’m like you, guys. We exist. I also write it for those of my colleagues who actually do know a lot of math, to explain how I work.
But I also write it for myself, because I genuinely don’t understand. Do I actually know a decent amount of math? I use those concepts all the time. But I certainly can’t solve any exercise problems. What does it mean to know math, anyway? I think the idea that you need to start from the basics and solve all those boring exercises to even learn about the more interesting concepts is male-cow-excrement. Or maybe that is one way of approaching mathematics, but far from the only one.
Most of all, I write for those who have been thinking of learning computer science, but are afraid to try because they don’t like math or are bad at it. You can certainly do it. You can become a very good computer scientist despite sucking at math. If anyone tells you that you can’t learn, say, machine learning because you don’t have the “mathematical fundamentals”, tell them to go to Helsinki. In the winter.
There are some strong feelings involved here, and I should perhaps stop writing now before I get more explicit. And I should post this before I go back and re-read it and start toning it down. Better post it fresh and raw, like sushi.