For most of my adult life, I was too cowardly to write this text, never mind posting it. I was worried about what people would think, and the repercussions on my career. Would people still take me seriously? But I’m now a whole full Professor of Computer Science at a top university, with all kinds of fancy metrics and titles to point to. Time to stop being such a pussycat.
Here’s the thing: I’ve always been terrible at math. How bad? Tell me to solve a quadratic equation, or differentiate something, and I would have no idea where to even start. I usually skip right past the equations when I read a paper because I don’t understand them. Last time I proved a theorem was approximately never.
I also always hated math. Not the abstract idea of math, but math as it actually exists. In particular, the activity of doing math, and trying to get stuff right. I hate math because I’m so bad at it, but clearly my negative feelings towards the topic are not helping me get better at math.
I almost failed maths in high school, and all my memories of math class in high school are of me staring out the window, talking to friends, writing weird stories, or programming my calculator. Anything to avoid those detestable math problems. During my undergrad, I had to take an introductory calculus class in order to take some computer science class I wanted to take. I failed the exam for that calculus class four times, and only passed on the fifth try because I realized that one of the professors was reusing his old exams with very minor changes. I learned basically nothing from that course. And not only do I not know how to differentiate anything, I also never learned things such as matrix multiplication or other parts of linear algebra that are supposed to be crucial for AI researchers like me.
Our PhD program requires my PhD students to take some theory courses that I’m pretty sure I couldn’t pass myself. I’m not even sure I could make it through our required undergrad theory courses. Some kind of computer scientist I am. The reason I could get a bachelor’s degree is that my undergrad is in Philosophy, though I did take a bunch of CS classes.
Which brings us to the question everyone asks, even though they often don’t believe my answer. The question is: how the hell can I be a successful AI researcher without knowing math? The implication is that I’m lying, or at least grossly exaggerating, because we all know that machine learning is very mathematical. It must be, because those GPUs are multiplying matrices all day. I’ll try to answer this below. Please bear with me, I’m trying to be as honest as I can here.
My first instinct is to say that mathematics is not important to the research I do. I never need to prove a theorem or even rewrite an equation. The details of how the matrices get multiplied don’t matter to me. I deal in ideas and code. Not math.
I remember when I taught myself programming using a Turbo Pascal IDE I discovered on the used computer I had bought when I was 13. As I blundered my way through the intricacies of Pascal, mostly by trial and error, I felt that a beautiful new world was opening up to me. It was hard, but I could learn it, and I had talent for it. Writing program code felt pretty much like writing natural language. And I was always good at writing. One of the things I learned about was variables. Some time after that, we were introduced to variables in school. I was excited, as here was a concept I actually knew something about! I was pleasantly surprised that I seemed to understand variables better than anyone else. But this didn’t help with the mind-numbingly boring stuff we did in maths class, all these exercises up and down the page.
In my undergrad, after two years of philosophy and psychology, I started taking computer science classes. I was naturally good at computer science. I understood the concepts and I became a cracked programmer. It was a lot of hard work but that was not a problem, because it was so fun. It was very different to studying philosophy, where I would just read the book and ace the exam. Mathematics, on the other hand, was all hard work and no understanding, and I couldn’t pass the exam at all.
In short, I was good at writing, philosophy, programming, and most aspects of computer science, and saw these subjects as intimately related. At the same time, I was terrible at math. So you may understand how I can see maths as largely unrelated to what I do.
And yet, I often use mathematical concepts when I talk about my research. Actually, when I do research as well. A recent project of ours focuses on embedding programs represented as syntax trees into a latent space that can then be searched efficiently. This involves considerations such as keeping the dimensionality of the space low enough to allow covariance matrix calculation and how to regularize the search to stay within the training distribution. That’s a bunch of mathematical terms there. And they mean something, because reasoning with them is how we got the method to work so well. But please don’t ask me to write down the equations.
So, how do I reason with mathematical concepts if I cannot do the symbol manipulation? Mostly visually. There are these little images of these things going on in my head, like a search blob moving against a gradient in a latent space. The images are somehow incomplete and clearly misleading–it is impossible to visualize a 128-dimensional space, so you have to think of it as two-dimensional–but they are useful. But I also sometimes think of them in terms of program code, and the program code often comes out as animations, e.g. I see the program counter looping in a for-loop. It’s not clear to me how being able to do the symbol manipulation (e.g. rewriting the equation for the encoder function in some other form) would be of any help in reasoning about the algorithm. But that might just be because I don’t know how to do the symbol manipulation. If I did, maybe I would see new possibilities.
There are other uses of mathematical concepts which are possibly even fuzzier. A key skill in designing algorithms is understanding approximately how they scale in time and space. This basically boils down to figuring out what operations take time and which data takes space, and then having a mental picture of how many of them there are. Quite often, you’re counting loops. I learned the basics of doing this formally back in undergrad, but I haven’t done a formal analysis of an algorithm since. But I do loose, very informal analyses a lot when thinking and talking about algorithms. They help. But please don’t ask me to write them down.
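As a rough illustration of what "counting loops" can look like in practice, here is a minimal Python sketch. The function names are hypothetical and invented for this example. It only shows that a single loop does work proportional to n, while a loop nested inside another does work proportional to n squared.

```python
def count_single_pass(n):
    """Count the steps taken by one loop over n items."""
    steps = 0
    for _ in range(n):
        steps += 1  # one unit of work per item
    return steps


def count_all_pairs(n):
    """Count the steps taken by a loop nested inside another loop."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1  # one unit of work per (i, j) pair
    return steps


if __name__ == "__main__":
    # Growing n by 10x grows the single loop by 10x and the nested loop by 100x.
    for n in (10, 100, 1000):
        print(n, count_single_pass(n), count_all_pairs(n))
```

The informal version of this is just the mental picture: one loop means work grows in step with the input, two nested loops mean it grows with the square of the input.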
Could it have been different? Could I have become the kind of person who was genuinely good at maths, enjoyed it, and perhaps even published papers with mathematical results of my own? Who knows. The closest I ever came to thinking I understood math was during a discrete maths course in my undergrad, which I found myself actually enjoying, although it was a lot of work. For a little while I felt like math might actually be for me. I’m not sure if this was because of the topic, as discrete maths felt discontinuous with all the continuous maths I’d learned to not learn so far. Maybe it was mainly my very inspirational teacher, Thore Husfeldt. In either case, the feeling dissipated as soon as I encountered that analysis class, the one that I failed four times.
As I write this, I keep fighting the impulse to brag about how successful a researcher I am. “Trust me, I’m a good researcher even though I don’t know math, see, I published so-and-so many papers and got so-and-so many citations and won this-and-that award.” I hate being that guy. So I’ll keep fighting that impulse. But it speaks to how deeply the impostor syndrome has taken root. Enough people have told me that I cannot possibly do what I’m doing without knowing a lot of math that I’ve somehow come to think I can’t do what I do.
If you’ve read this far, you may wonder where I’m going. Who am I writing this for, and what am I trying to say? Let’s discuss some alternatives.
I’m definitely not saying that you shouldn’t study math. If you like mathematics, go ahead and study it. It’s useful (I know) and beautiful (they say). I have a lot of respect for theoreticians and wish I could do what they do.
Another thing I don’t want to do is to blame my teachers. Maybe it was my teachers who taught me that math was boring and that I was bad at it. Maybe it was their curriculum they had to follow. Maybe it was me. Other people seemed to enjoy those same math lessons, after all. Dear teachers, thank you for trying to teach me; I don’t think you and I were good fits for each other, but that’s not your fault.
More likely, I’m writing this for those of my colleagues who are in the same boat as me, who somehow became successful computer scientists despite sucking at math. I’m like you, guys. We exist. I also write it for those of my colleagues who actually do know a lot of math, to explain how I work.
But I also write it for myself, because I genuinely don’t understand. Do I actually know a decent amount of math? I use those concepts all the time. But I certainly can’t solve any exercise problems. What does it mean to know math, anyway? I think the idea that you need to start from the basics and solve all those boring exercises to even learn about the more interesting concepts is male-cow-excrement. Or maybe that is one way of approaching mathematics, but far from the only one.
Most of all, I write for those who have been thinking of learning computer science, but are afraid to try because they don’t like math or are bad at it. You can certainly do it. You can become a very good computer scientist despite sucking at math. If anyone tells you that you can’t learn, say, machine learning because you don’t have the “mathematical fundamentals”, tell them to go to Helsinki. In the winter.
There are some strong feelings involved here, and I should perhaps stop writing now before I get more explicit. And I should post this before I go back and re-read it and start toning it down. Better post it fresh and raw, like sushi.
Interesting! I think you are at least unusual in how large the discrepancy is between your maths knowledge and computer science performance, but it might not be uncommon to have at least a bit of such a discrepancy – I do myself. My research in algorithms (currently not active, though) has involved some proving theorems, which I enjoy, but very few equations, which I suck at compared to the typical algorithms researcher. Many people in my area seem to understand the world through mathematics: they GET things by sketching out a bunch of equations, which seems to make things fall into place in their heads. For me, things fall into place when I implement them. I understand algorithms only by putting them into code, at least part of the way towards an actual program, and that’s pretty much the only way for me to understand what’s really going on. I envy the capability to use mathematics as a language for thinking, but on the other hand my angle on things has made me into a more skilled programmer than some of the mathematics types, and has helped me to some discoveries as well.
I also struggled with calculus: the continuous parts, and particularly the multidimensional stuff. I’m a very one-dimensional person when it comes to mathematical thinking, which has impaired me a bit and kept me working on one-dimensional problems (mostly string algorithms). I also enjoyed discrete mathematics much more, and found it much more useful. When I first saw the book Concrete Mathematics by Knuth et al., it felt like I could have saved about a year of struggling with mostly irrelevant maths if I had just been given this book to start with instead. It contained exactly the kind of maths that I found useful and interesting, and none of the rest. (It does connect with calculus in generating functions, but if I had got that as motivation, I think it might have had a better angle at understanding it.)
My story from school is that I hated maths in grade 1–6, when it was mainly about performing computations. I understood how to do it but it bored me out of my mind. I was one of the slowest in my class in going through and filling in answers in the maths book, and frequently had to bring it home to catch up, expanding the torment to what was supposed to be free time. But in grade 7, maths turned into being more about figuring out how to solve problems, and then I started liking it. I was good at figuring out solutions, but bad at then doing the calculations correctly, most commonly having the ideas right but the answer wrong, because I made little mistakes when working on the actual numbers.
And that was what got me hooked on programming in the first place. The computer could do the stuff I hated and sucked at – the actual computations – and let me focus on the parts I liked: figuring out the sequence of operations that would get the problem from the input to the output. That’s probably the most important aspect of maths in relation to computer science for me: it was my path into programming.
Thank you for sharing this. I have a different kind of problem where I'm really slow to understand any new math concept which others grasp quickly, but once I get it then it sticks with me for a long time. Like I never understood what matrices actually meant, but when I got the intuition that a matrix basically represents a kind of transformation that's applied to vectors to move and scale them in a certain way, everything clicked. Like I understood how that could be useful in implementing game engines to compute a rotation or any kind of movement.