PyTorch Conference Fireside Chat with Jeremy Howard and Anna Tong

Speakers:

  • Jeremy Howard - Founding Researcher, Fast.ai
  • Anna Tong - Reporter, Reuters

[00:00:32] Anna Tong: Hi everyone. Hi Jeremy. I'm told by many people here that you need no introduction, but for the two people here who don't know who you are: Jeremy, you're the founder of Fast.ai and the CEO of Answer.ai. You created the first large language model, and yours was the first outside organization to pour resources into PyTorch. Am I missing anything?

[00:00:55] Jeremy Howard: That sounds good.

[00:00:56] Anna Tong: Okay. Well, tell us what it was like back then.

[00:01:01] Jeremy Howard: Well, it wasn't like this. PyTorch was really the baby of two people, Soumith and Adam. And it came from a long line of software written in Lua, called simply Torch. At the time, Google was investing heavily in something called TensorFlow. And TensorFlow never felt very human-friendly to me. It's very focused on enterprise, on making things harder for people and easier for computers, which felt wrong to me.

[00:01:41] Jeremy Howard: So when I saw what Soumith and Adam had done with PyTorch, we spent hundreds and hundreds of hours working with it, and we were just like, this is so much more human-friendly. So we stopped everything else we were doing and went all in on PyTorch. We were the first to do so. And at the time, people were just like, "You're crazy. Everybody knows Google's going to win." And people were like, "Why would I come to your course when you're teaching some obscure, open-source, random thing?" And I said, "Because it's better. You should learn this, because it's going to win, because it's better."

[00:02:21] Jeremy Howard: And the first language model, or first large language model as you say, was ULMFiT, and that was written in PyTorch. It wasn't written on some big enterprise thing; it was written in a little Jupyter Notebook, and it smashed every state-of-the-art result that was there. Same problem: everybody told me, why are you working with language models? Nobody cares. Everybody knows that's not the future. I said, "Well, they're wrong. Transfer learning is the future. This is what's going to work. This is what's going to change everything." And it was great, because the people in those courses where we were teaching PyTorch and teaching language models, they did indeed learn the future. And now a lot of them are running the major research labs, or are the CEOs of the big hot startups. There's no point following; you've got to see where things are going and you've got to lead.

[00:03:16] Anna Tong: What was it like to create the first large language model, and how did that work?

[00:03:22] Jeremy Howard: So, I'd been telling people for years that the natural place for deep learning, the type of AI that we were teaching and researching, was in natural language. And that was extremely controversial. Everybody thought that was stupid. I got sick of trying to convince people of it, so I thought I'd just do it. And literally, I decided to do it, and by the next day I had done it. The very first thing I tried it against was basically the hardest academic benchmark, the IMDb benchmark: the longest documents, the trickiest, one that had been very heavily invested in. And my very first result was substantially better than anybody had ever done. And it was just like, holy crap. Because I'd studied philosophy back at university, and I'd studied this idea that goes back a long time: what if just being able to finish a sentence, symbolically, statistically, whatever, looks like the essence of intelligence? Can you tell the difference between real intelligence and that? I thought, no, you can't tell the difference. I'd been thinking this for 30 years. And now I had an algorithm that was doing it.
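
[Editor's note: as a rough illustration of the "finish the sentence" idea Howard describes, here is a minimal PyTorch sketch of the next-token-prediction objective behind language-model training. The deliberately tiny model and random batch of token ids are stand-ins for illustration, not ULMFiT's actual code.]

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = 10_000
# A deliberately tiny stand-in model: embed each token, then score all
# candidates for the token that comes next. Real LMs use LSTMs/transformers.
model = nn.Sequential(nn.Embedding(vocab, 128), nn.Linear(128, vocab))

tokens = torch.randint(0, vocab, (8, 64))  # a toy batch of token ids
logits = model(tokens[:, :-1])             # predictions for each next token
loss = F.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
loss.backward()  # lower loss = better at "finishing the sentence"
```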

[00:04:42] Jeremy Howard: But again, it still took a really long time, because Alec Radford at OpenAI saw that work, he told me, and he was like, "This is going to work great." And he turned that into what became GPT: again, PyTorch, Jupyter notebooks. But nobody understood that the essence of it was this transfer-learning, fine-tuning idea. And it wasn't until ChatGPT came out, which used that exact approach, the exact same three-stage approach, that everybody was like, "Oh, this is a good idea." So it takes a long time for an idea to percolate.
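
[Editor's note: the three-stage recipe referred to here, roughly: pretrain a language model on general text, fine-tune it on the target corpus, then swap in a classifier head and fine-tune on labels, can be sketched in PyTorch as below. The tiny LSTM encoder and the random stand-in data are assumptions for illustration, not the actual ULMFiT or GPT code.]

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, dim, n_classes = 10_000, 256, 2

class Encoder(nn.Module):
    """Toy stand-in for the shared backbone that all three stages reuse."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, x):
        out, _ = self.rnn(self.emb(x))
        return out  # (batch, seq, dim)

enc = Encoder()
lm_head = nn.Linear(dim, vocab)       # stages 1-2: predict the next token
clf_head = nn.Linear(dim, n_classes)  # stage 3: predict a document label

def lm_step(tokens, opt):
    """One next-token-prediction step, used for both stages 1 and 2."""
    logits = lm_head(enc(tokens[:, :-1]))
    loss = F.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()

opt = torch.optim.Adam(list(enc.parameters()) + list(lm_head.parameters()))

# Stage 1: pretrain on general text (random ids stand in for a real corpus).
for _ in range(3):
    lm_step(torch.randint(0, vocab, (8, 64)), opt)

# Stage 2: same objective, but on the target-domain corpus (e.g. IMDb text).
for _ in range(3):
    lm_step(torch.randint(0, vocab, (8, 64)), opt)

# Stage 3: keep the pretrained encoder, swap heads, train on labels.
opt = torch.optim.Adam(list(enc.parameters()) + list(clf_head.parameters()))
for _ in range(3):
    x = torch.randint(0, vocab, (8, 64))
    y = torch.randint(0, n_classes, (8,))
    logits = clf_head(enc(x).mean(dim=1))  # mean-pool over positions
    loss = F.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```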

[00:05:20] Anna Tong: Yes, for sure. So we've established that you are contrarian. So let's get into some of your current contrarian takes. So I think today we're hearing AI agents, AI agents, AI agents. Everyone is saying the future of AI is for AI to do everything for you. What do you think of this? Do you think that's a good thing or a bad thing?

[00:05:44] Jeremy Howard: I don't think that's terribly likely to be true. But regardless, we can see that there are two possibilities: either it is or it isn't. The question is, what should you do about it? If it is true, and if AI takes over everything and does all the work, then it doesn't matter what you do. You're going to be obsolete, so whatever. On the other hand, I think it's very likely that it won't be true, and people will be very much needed.

[00:06:16] Jeremy Howard: Now, what should you do then, personally? Well, if people are needed, then if you go headfirst into agents, agents, agents now, you're going to stop learning. You won't be practicing your craft of coding, of building models, of analyzing data. You'll be outsourcing it all. So you're going to be in the group of people who just stultify. And I plan to be in the group of people who use AI very carefully and thoughtfully to keep improving at their craft. So I'm using AI now to get better at my work, to learn more, to gain more skills, to practice better. So for the humans in the audience, I would say: you should be focusing on how AI can help you improve as a craftsperson. How can it help you develop your skills?

[00:07:19] Jeremy Howard: If you outsource everything, and I'm seeing this happening already, Anna, people forget how to do work. They forget that they can do work. And if the AI can't do it for them, they're just lost. And I think it's bad for the psyche as well to be outsourcing everything to AI. It's creating thousands of lines of code; you can't understand it all, you can't keep on top of it all, and it's really stressful, because then it doesn't work, or later on you have to integrate it and you don't know how, and you just get this bigger and bigger pile of debt. I've seen it weighing on people. I've seen people becoming depressed that they're no longer competent and no longer in control. So to me, the agentic approach is one where the computer is in control. The human should have agency. You're not going to go off after this and write up an article by saying, "Please, ChatGPT, write my article for me." You would be losing your craft as a writer. What's the point? So I would be very, very careful of spending too much time trying to get agents to do your work for you, because I think then you are on the path, regardless of what happens with AI, of making yourself obsolete and incapable.

[00:08:46] Anna Tong: Well, do you think it makes people faster? You know, there's this argument that we're now able to write and ship so much more product.

[00:08:55] Jeremy Howard: I'm not sure. In the short term, maybe, but probably not. The people I know who have been diving deep into AI-powered coding, in my personal experience, seem to be shipping less but creating more code. And I think the problem is that the AI code isn't very good. It isn't very well integrated. It doesn't create layers of abstraction that fit nicely together. When software engineering is going well, you should be getting faster and faster, and things should be getting easier and easier. With AI code, it's the opposite. So I see people using AI more, creating more code, but shipping less product.

[00:09:36] Jeremy Howard: Now, maybe it's a little faster over the next two months, but you're not building your skills as well. You're not learning as much. You're not building as powerful a set of abstractions to craft better and better stuff on. So on a six-month time frame, I would say almost certainly slower. On a two-year time frame, I think companies that bet too much on outsourcing to AI risk destroying themselves, because in two years' time they're going to look back and say, "Wow, in the effort to get a quick two-week result here, we destroyed our competence as an organization to create things that last."

[00:10:21] Anna Tong: So how should we be using AI? Maybe to check our work or to ask questions? How are you using it?

[00:10:27] Jeremy Howard: Yes. So Answer.ai, which is Fast.ai's parent organization now, that's what we've dedicated our mission to: figuring out how humans can use AI in a way where they have agency, and where they're empowering themselves to sustainably do high-quality work and get the answers they want. So we've actually got a new course coming up where we try to teach people a way of working with AI where you work in small, highly iterative pieces, and you try to write each little piece yourself. And you have an AI in there with you, watching, giving you tips, and you can ask it questions. And when you ask, it's written specifically not to solve the problem for you, just to guide you.

[00:11:22] Jeremy Howard: So we've created this shared environment called Solve It. It's actually based on principles from a 1945 math book by a guy called George Pólya, "How to Solve It." These ideas have been around for decades: there are principles for how to do high-quality work. So we've tried to create an environment and a way of thinking where the AI is in there with you, interacting with you and with the environment you're in. There's a similar idea from my co-founder, Eric Ries, who created "The Lean Startup" book and movement. The whole approach of the minimum viable product, of short iteration cycles, it's all the same thing. So he's writing his new book with the help of Solve It. Solve It has written none of the words, but it's in there with him in the book as he's writing it, helping him fact-check, helping him check transitions, and the more he writes with it, the more it knows what he's trying to do and what kind of help he needs. So again, he's becoming better and better as a writer, because he's practicing his craft, he's getting the feedback, and he's doing the work. He's the agent. And it's the same for me: I feel I'm a much better developer than I was two years ago, because I'm all about using AI to help me get better. I want to outrun the AI.

[00:12:55] Jeremy Howard: If in three years' time AI is a lot better than it is now, but it's still not making all of us obsolete, then which people are going to be valued? It's going to be the super great people, right? So if you're somebody who's let AI do all the work, you've basically become incompetent; you're no longer your own agent. You're not going to be one of the people who still matter. So as AI gets better, it's more and more important that your skills are growing faster than the AI's.

[00:13:31] Anna Tong: I got you. So, you've been a major open-source advocate in the past. Do you still feel this way with frontier-level AI models? Should they be open-sourced?

[00:13:43] Jeremy Howard: Yeah. I never used to be a particularly radical open-source advocate or anything. I mean, I used it a lot. I loved it. Fastmail, one of my earlier companies, a big email provider that lots of you probably use, was one of the very first to use Linux. We were using Linux at a larger scale than anybody in the world at that time, back in the late '90s. I've always felt that open source is the right way for the global community to advance our software capabilities. But now I think it's much more important: AI is now a source of power in the world, and it's becoming an increasingly important one.

[00:14:30] Jeremy Howard: There have been many times throughout human history when technology has created power, all sorts of technology, everything from the printing press through to things like effective education or writing. At every point where there's been a major new source of power, a group of people has said, "That's too dangerous for most people to have, because some people will misuse it. So only the rich and powerful should have it." And we're seeing it again now with AI. The problem is that the rich and powerful are not actually the people who can be most trusted with that power. So if we let the best AI be only in the hands of rich and powerful people, it could literally tear apart democracy at its foundations. So I feel we're at a point now where we have to reinvest in Enlightenment principles. Enlightenment principles are basically this idea: okay, there are bad people in the world, but we believe that humans overall are good, that they are a force for good, and that when there's a new power in the world, distributing it is the safe thing to do. Centralizing it is the unsafe thing to do.

[00:15:48] Jeremy Howard: So open source is the way we ensure it remains distributed. You know, PyTorch is what everything is built on today. None of this would exist if that open source didn't exist. So I think we all have to recommit to this principle: okay, some people will use this technology to do bad, but most people will use it to do good. And as a result, by using this technology to help defend against the bad guys, we'll be much better off than if we lock it away so that only the rich and powerful can use it.

[00:16:28] Anna Tong: I mean, I think AI is different though because it requires vast amounts of compute in order to train frontier-level models these days. So what should we do to ensure that we still have democratic access to AI and that small labs can still make great models?

[00:16:44] Jeremy Howard: Again, we've faced all this before as a global community. With things that require huge capital investments, like giant coal-fired power plants, or fiber-optic lines down every street for telecommunications, we figured out ways to do it. We've said, "Okay, the government needs to play a role in ensuring that everybody has some level of access. Private institutions have a role too, in taking advantage of markets and capitalism." We need to find a way, again, of not ending up with 10 different groups all spending a trillion dollars. The government needs to do some pieces of it; it needs to ensure there's some level of access, but there also needs to be some level of competition. I don't think any of this is new, right? In the early days of electricity, some countries, like China, only gave electricity to the elite. And some countries, like America, said, "We're going to put it down every road in New York and make sure everybody has access to it." And I think in the end we've found that trying to make sure everybody has access to these technologies is what actually benefits us all most.

[00:18:05] Anna Tong: Do you think it's too late? I mean, we're already in a state where just a few companies have all the access to compute.

[00:18:12] Jeremy Howard: You know, it's funny to say this given the geopolitical history, but at the moment, China is the country that's saving us from that. The best open-source models today are all Chinese. I don't find this particularly surprising, because having spent quite a bit of time in China, the system there really invests in computer science and in math, and a lot of people there believe in openness. So that's been very important, but I'd also love to see America turn things around. There are actually two companies that have stood head and shoulders above the others: one is Meta, the creators of PyTorch, and the other is NVIDIA, which in recent months has created some of the world's best models, open-source and openly licensed. So no, I don't think it's too late. I think there are some pioneering companies showing the way.

[00:19:14] Anna Tong: Great. So final question for you. What's something you're really excited about in the next year that you think will be enabled by AI or a couple of things you're excited about?

[00:19:25] Jeremy Howard: I mean, I'm honestly really excited about the work we're doing. I feel we've discovered a way to work with AI that is deeply human, deeply supportive, and the very opposite of grind and vibe coding and agents: small, iterative steps where the human and the AI are in the same canvas, working together, and the focus is on improving the human's capabilities. I'm sure a lot of people in the audience have done the Fast.ai courses in the past. We're going to have a new Fast.ai course next year, on large language models and deep learning for coders, and it'll really harness this idea of getting everybody into this environment to learn the foundations. We're going to do a similar thing for the foundations of the internet and web programming, and a similar thing for building startups. I'm really excited about this. I feel we've discovered something important here, at a really critical time, about how to work with AI in a way that supports humans rather than replaces them.

[00:20:48] Anna Tong: Well, thank you very much for your time, Jeremy.

[00:20:50] Jeremy Howard: Thank you, Anna.
