The Mayfield AI Pathfinders had their inaugural meeting this week on the Stanford campus, with the goal of building a thought leadership community for peer learning on key issues, trends, and innovations across the quickly emerging AI landscape. We think it’s important that investors and other thought leaders help bridge the gap between academics, entrepreneurs, and the actual enterprise practitioners and buyers, and see this as a great opportunity to cut through some of the noise.
- The underlying silicon systems are going to remain a huge bottleneck for AI – In the 1920s, a helium shortage crippled the airship industry, and AI is stuck in a similar spot today. A lot of money is flowing to a single vendor, and there may not be meaningful competitors in the space. Software and architecture will certainly evolve, but that won't change the hardware situation. These constraints are here to stay for at least the next 4-5 years.
- People want a car, but it has to have wheels and brakes. Enterprises still haven’t solved the fundamentals preceding AI – but now the appreciation for these fundamentals is getting featured front and center.
- The deep learning community has had the luxury of being raised like a trust fund kid – It’s important that the builders try and find other interesting things to do besides just scale. The dimensions of innovation still need to diversify, and much more is still needed at the design and architectural level.
- Innovation in software and architecture will continue to drive down costs – In some respects, the argument could be made that the hardest time has already passed. It's a bit of a chicken-and-egg problem: if you don't have enough capital you can't scale up, but if you don't show results, you won't get the capital. Thanks to GPT, business users got a chance to see the potential, so many incentives now exist to drive down costs by optimizing everything. Today, people are in a hurry, so they're trying to quickly scale up, but in 2-3 years, as enterprises start to scale their solutions, there will be giant incentives to drive costs down. This is where the real innovation is going to start happening.
- There will be a shortage of authentic data – We're already scraping most of the unique text that's out there, but in areas like audio and video there's still a huge amount of untapped data we can use. We need to figure out how to get more out of existing text, or use derivative text to bootstrap further. Nine times out of ten, the answer is not building the 10 trillion parameter LLM.
- Training can only get so far ahead of inference – they're interlocked. So, the more mature the industry becomes, the more people will start to recognize the role of inference. One of the big architectural innovations will be seeing how the relationship between the cloud and the edge evolves over time. We have constraints at the physics level that will put us in a place where the cost per useful unit of inference at the edge can be much lower than doing it in the cloud. So there's a bit of a system architecture imperative to figure out what can be moved to the edge. When you're talking about media AI, it's a lot, maybe almost all of it. As we refactor the whole system to understand how inference can be redistributed in a way that's appropriate to the end application, we'll see orders-of-magnitude benefits around cost. It will become incredibly cheap to do certain kinds of things at the edge, and that will even influence what we choose to do. There's an opportunity here besides just paying for high-end GPUs.
- There will be a datacenter of the future – People’s time-to-solution is improving, which has a direct correlation with productivity, so everyone is motivated to build a more efficient datacenter. A >10x improvement could be achieved in memory if people just decided to focus on it. So much time has been spent in computing, but the movement time of data is also a huge setback. Today there’s finally market motivation to do something about it.
- Closed Source or Open Source? You can think of it like in-sourcing vs. out-sourcing – there are pros and cons with each. If you outsource, you lose control of your destiny, IP creation, speed of updates, etc. People have data and that data can really become the differentiator. How you’ve cleaned that data is in itself IP. Customers will create their own custom filtering, which leads to certain model capabilities, and that model is then an expression of IP. Those incentives exist and will always lead to more models, whether that’s 100% rational or not.
- In the future people will be buying capabilities and not models. Evaluation is one of those high entropy areas where you’re matching something that’s not deterministic with something that is, and you need to figure out that translation. Will something do what you need within your application safely, responsibly, and without hallucinations?
What’s the Difference Between AI 1.0 vs. AI 2.0?
Back when AI first got started, a lot of people didn’t believe in deep learning. So there was this phase of trying to convince everyone it was useful. Then we hit peak hype: AI can solve everything! – and that was the new normal. But we’ll come down from that to a place where AI is just a tool in the enterprise toolbox.
Computer science has been marked by several points throughout its history where people were researching a problem and came up with an algorithmic paradigm that didn’t work. So, researchers brute forced things, which led to an initial burst of innovation that ultimately tapped out. And in the end, innovators will always try to find a vector upon which they can scale and enable new capabilities. So, you’ll see these waves of algorithmic innovation that go through an S-Curve and tap out, become a tool, and then later, another thing is built on top of that.
Neural networks were on the rise in the early 90s. Support vector machines killed that momentum, and almost all neural network research stopped around that time (from roughly 1996 to 2006). But if we look today, in 2023, the vast majority of machine learning still leverages neural networks. That context is important: the pause basically ended in 2006, and in 2023 neural networks are still going strong.
So what about the next 10 years? Things are going to get tapped out, and something else will supplant it, build upon it, and do more.
Where are the Opportunities in AI Today?
As large organizations begin to explore using AI for some of their IT and business needs, there continues to be tremendous latent demand for enterprise software serving AI and LLMs today. However, there are still significant issues around access to compute, and these constraints are here to stay for at least the next 4-5 years. Moore’s Law is beginning to slow down and these costs are going to continue rising. This will be further exacerbated by geopolitical tensions with China and others. Unfortunately, this could be a limitation that enables only the most well-funded startups and incumbents to get ahead.
However, there’s a whole world of opportunity beyond scaling. To date, we’ve invested in scaling LLMs by using the dumbest most brute force methods. The dimensions of innovation still need to diversify, and much more is still needed at the design and architectural level. And this doesn’t just mean special-purpose models, it could mean models where the huge inefficiencies start to get ironed out, or working on things that aren’t models at all. Building a bigger transformer (so to speak) is only one way to skin this cat.
Furthermore, there’s still a huge need for specific solutions that serve end applications. It’s great that LLMs are so general purpose, which has added to their speed and propagation across the industry, but we don’t need or want that generality in 90% of the apps that will get used.
So, there will still be a huge market for less costly pursuits, including:
- Domain-Specific Models – You can take a pre-trained model from LLaMA or others, and fine-tune it with high-quality proprietary data. For these models, accuracy and fluency (human-quality output that sounds like human-quality output) are the two most important axes. On larger models you wind up with diminishing returns: it's easy to get fluency to near 100%, but accuracy is much more challenging.
- Model Capabilities – There will be new capabilities in models way beyond autoregression. For example, innovation in the training itself, the way the data is presented to the world, how to construct different loss functions, etc.
- Solving for Software Inefficiencies – The discrepancy between the growth of compute and the growth of the models means that there are inefficiencies in the software stack being taken for granted (e.g. the execution of Python, or the way you consume data to train a model). We will be pushed towards a more efficient stack, whether via compilers, optimization techniques, or new approaches to incremental learning. Because we're getting to the point where this becomes a creative constraint, we're going to need to build better and more efficient platform stacks.
- Applications and Agents – The next layer of abstraction we'll care about will be agents and applications, which is a differentiable and differentiated space. Companies will need to work on creating applications that get over the hump of minimum viable usefulness. ASR systems were horrible and useless for the longest time, when they had a 15% error rate; they had to reach roughly 98% accuracy to become truly useful. We'll see these kinds of thresholds a lot, and most (if not all) LLMs are still under that line, especially for industrial use cases.
- Infrastructure – Gen AI adds a lot of dimensions that simply didn’t exist before: Fault resilience, inference optimizations, etc.
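The minimum-viable-usefulness threshold above is especially punishing for agents, because per-step errors compound across a chain of actions. A back-of-the-envelope sketch (the step counts and accuracies are illustrative assumptions, not figures from the discussion):

```python
# Back-of-the-envelope: probability a multi-step agent completes a task
# end-to-end, assuming each step succeeds independently with probability p.
# Step counts and accuracies here are illustrative assumptions.

def chain_success(p_step: float, n_steps: int) -> float:
    """End-to-end success rate of an n-step agent pipeline."""
    return p_step ** n_steps

# An 85%-accurate component feels "mostly working" in isolation...
below_line = chain_success(0.85, 10)   # ~0.20 -> useless for a 10-step task
# ...while a 98%-accurate one stays viable across the same chain.
above_line = chain_success(0.98, 10)   # ~0.82

print(f"85% per step, 10 steps: {below_line:.2f}")
print(f"98% per step, 10 steps: {above_line:.2f}")
```

This is one way to make the ASR analogy concrete: a component that looks "close enough" in isolation can still sit far below the usefulness line once it's embedded in a longer workflow.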
For founders working on companies today, it's important to think about whether or not you have a consolidated way to sell your widget across different verticals – in that case, you may have a good horizontal play. Look at the GTM motion as your guiding force and try to sit at the lowest point where the GTM motion remains the same. Don't go broad, because then you'll need dozens of different GTM motions; but don't go too niche either, because then your market size is capped – and that's basically the issue with every SaaS platform. Go to the lowest common denominator where the sales motion is the same or similar. It's hard to sustain multiple GTM motions as a startup.
Additionally, it’s really important to be able to build a product at the right quality, cost, latency, metrics, etc. In the process of developing the product, it will often force you to focus and diminish its generality based on how early conversations with customers or deployments wind up going.
What Does Early Adoption Look Like?
For large organizations, practical deployment of AI at scale is kind of the icing on the cake – the cake being your data, your infrastructure, your governance, your security, your privacy, etc. And if you don’t have everything, you can’t build an enterprise scale ML platform. You’re going to need lineage and traceability once regulations come in. It will be difficult for a startup to do something like that, but at the app layer there will be tons and tons of interesting companies coming up with new ideas.
So, IT and product teams are starting to get down to brass tacks after a year of evaluation: What does it take to deploy gen AI at scale in the enterprise? Fundamentally, the same considerations that were in place in 2017 still apply today. Things like cost, data cleanliness, etc.
For example, there are some incredibly interesting use cases in pharma where you could have a scientist ask an LLM to summarize all the experiments in the last five years on X molecule, and find out what the most common side effects are. The issue with building this out always comes down to the data. It’s a mess and it’s not complete. Enterprises still haven’t solved the fundamentals – but now the appreciation for these fundamentals is getting featured front and center. People want a car, but it has to have wheels and brakes.
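The pharma scenario above boils down to a retrieve-then-summarize pipeline, and the data-quality problem shows up immediately in the retrieval step. A toy sketch, where the experiment notes, molecule names, and scoring are all hypothetical, and the `summarize` stub stands in for an actual LLM call:

```python
import re

# Hypothetical experiment notes; in a real deployment these would come
# from the enterprise's (messy, incomplete) data stores.
experiment_notes = [
    "2021 trial, molecule X: most common side effect was mild nausea",
    "2022 trial, molecule X: common side effects included headache",
    "2020 trial, molecule Y: no adverse events observed",
]

def tokens(text: str) -> set[str]:
    """Lowercased word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, docs: list[str]) -> list[str]:
    """Naive keyword-overlap retrieval; stands in for a real vector store."""
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return [d for d in ranked if q & tokens(d)]

def summarize(docs: list[str]) -> str:
    """Stub standing in for the LLM summarization call."""
    return " | ".join(docs)

hits = retrieve("side effects of molecule X", experiment_notes)
print(summarize(hits))
```

Even in this toy, the quality of the answer is bounded by the quality and completeness of the notes being retrieved over – which is exactly the "wheels and brakes" point.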
When AI first hit the scene, everyone was talking about models and their capabilities, but that conversation is already becoming obsolete. Today, you can’t go to an enterprise company and tell them: “Look how awesome my model is.” Last year everyone was talking about “How many parameters?” (you wanted more), but now, not a single customer asks about the size of your model. In fact, if you tell them your model has tons of parameters, they will shut the door on you.
The business side of the house just isn't interested in the same things as the technologists. They want to talk about their business use case and get help, not hear about what's under the hood. What will it cost to get this in the hands of all my customer service agents? And what will the ROI be? People are thinking about putting these into production; they want an easy POC, and to figure out the ROI and total cost of ownership (TCO). There were lots of multi-million dollar deals signed with OpenAI so that companies could get boards and CEOs off their backs, but when it comes time for renewal, how many of these will actually get renewed (vs. a downsell or churn)? People care about TCO, and when you're thinking about a 5,000-person contact center and giving everyone a chatbot and a RAG implementation, that equation becomes really difficult.
In the long run, cost may push people towards open source solutions as some of the talent gaps get sorted out – companies will just need a little help using their proprietary data and getting it into open source models at a reasonable long-term cost. Today, the price to train a model of a given size drops 3-4x per year, driven primarily by software and clever algorithms. If the cost falls by 8x, you'll have way more people start working on these problems.
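To put that rate in perspective, a 3-4x annual drop compounds quickly. A rough calculation (assuming the cited trend actually holds, which is of course not guaranteed):

```python
import math

# If training cost for a fixed model size drops k-fold per year,
# how long until it has fallen by a target factor overall?
# Illustrative only; assumes the 3-4x/year trend continues.

def years_to_reduction(annual_drop: float, target_factor: float) -> float:
    """Years needed for cost to fall by target_factor at annual_drop/year."""
    return math.log(target_factor) / math.log(annual_drop)

print(years_to_reduction(3.0, 8.0))  # ~1.9 years at 3x/year
print(years_to_reduction(4.0, 8.0))  # 1.5 years at 4x/year
```

In other words, if the trend holds, the 8x threshold mentioned above is under two years away for any given model size.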
You want to evaluate technology at a high level, and if it’s transformational, you want to be investing right now. You can think about it using these three axes:
- Are there current use cases that customers are deploying? Yes
- Does the tech need to be improved? Yes
- Is the rate of improvement pretty fast? Yes
With those three considerations in play, the outlook is pretty optimistic.
Who Will Own the Models?
Since models are fundamentally building blocks towards a problem that a company is trying to solve, it may be that it makes more sense for large organizations to wait for their SaaS providers to build their own models in many cases vs. building something from scratch and then trying to compete. It will definitely depend on the use case, but many use cases won’t need a company’s proprietary data to bring value.
SaaS providers are already hard at work embedding these systems into their solution offerings. Many have gen AI departments and lots of smart people, but others are just looking for an SLM they can embed on an OEM basis.
Aside from the two obvious issues – the costs of the underlying silicon systems and the shortage of valuable talent – there are a number of challenges companies are facing today when it comes to gen AI adoption.
In some cases, it's hitting the cost-latency-performance profile; in others, it's just not accurate enough. In regulated industries, you may need more evaluation or explainability. It depends on the particular use case or segment – different things will need to be solved for.
And on the business side, companies want to go fast, but they often have trouble defining the problem they actually want to solve. They need to pick one or two high value, high impact, problems and then properly define success. Today everything is just maximum hype, people want to do this shiny new thing, but they don’t know what it’s really capable of. Evaluation is important, but so is defining the problem you’re solving.
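One concrete way to "properly define success" is to pin it down as an executable evaluation before anything ships. A minimal sketch – the `answer_fn` stub stands in for the model under test, and the cases and groundedness check are illustrative assumptions, not a real harness:

```python
# Minimal evaluation-harness sketch: success criteria as executable checks.
# answer_fn is a stub standing in for the deployed model; real harnesses
# would use far richer checks (accuracy, latency, safety, cost).

def answer_fn(question: str, context: str) -> str:
    """Stub model: echoes the first sentence of the provided context."""
    return context.split(".")[0]

def grounded(answer: str, context: str) -> bool:
    """Crude hallucination check: the answer must appear in the context."""
    return answer.strip().lower() in context.lower()

cases = [
    {"q": "What is the refund window?",
     "context": "Refunds are accepted within 30 days. Shipping is extra."},
]

results = []
for case in cases:
    ans = answer_fn(case["q"], case["context"])
    results.append(grounded(ans, case["context"]))

print(f"grounded: {sum(results)}/{len(results)}")
```

The point is less the specific checks than the discipline: if success isn't defined as something you can run, the shiny-new-thing hype has nothing to be measured against.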
Additionally, it’s not yet clear how companies are going to take these models and successfully deploy them. No one knows what the killer apps are yet. The space is so new that everyone’s excited, and it’s hard to separate the hype and excitement from real market traction. People will pay for things right now, even huge deals, but it’s not sustainable yet. Where is the real traction vs. the hype?
Still, there are many silver linings. Innovation in software and architecture will continue to drive down costs, and in some respects, the argument could be made that the hardest time has already passed. It’s a bit of a chicken-and-egg problem: if you don’t have enough capital you can’t scale up, but if you don’t show results, you won’t get the capital. Thanks to GPT, business users got a chance to see the potential, so many incentives now exist to drive down costs by optimizing everything. Today, people are in a hurry, so they’re trying to quickly scale up, but in 2-3 years, as enterprises start to scale their solutions, there will be giant incentives to drive costs down. This is where the real innovation is going to start happening.
Furthermore, there’s a lot of capital going into this, and it’s likely going to support larger and larger clusters. Public markets represent trillions of dollars compared with venture capital and IT budgets.
The Architecture – Cloud and Edge
Training can only get so far ahead of inference – they’re interlocked. So, the more mature the industry becomes, the more people will start to recognize the role of inference. One of the big architectural innovations will be seeing how the relationship between the cloud and the edge evolves over time. We have constraints at the physics level that will put us in a place where the cost per useful unit of inference at the edge can be much lower than doing it in the cloud. So there’s a bit of a system architecture imperative to figure out what can be moved to the edge. When you’re talking about media AI, it’s a lot, maybe almost all of it. As we refactor the whole system to understand how inference can be redistributed in a way that’s appropriate to the end application, we’ll see orders-of-magnitude benefits around cost. It will become incredibly cheap to do certain kinds of things at the edge, and that will even influence what we choose to do. There’s an opportunity here besides just paying for high-end GPUs.
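The edge-vs-cloud argument can be sketched as a toy cost model. Every number below is an illustrative assumption, not a measurement: cloud cost is modeled as a metered instance rate, edge cost as amortized hardware plus electricity on a device the user already owns.

```python
# Toy per-inference cost comparison: metered cloud instance vs. amortized
# edge device. All parameter values are made-up illustrative assumptions.

def cloud_cost_per_million(instance_usd_per_hr: float,
                           inferences_per_sec: float) -> float:
    seconds = 1_000_000 / inferences_per_sec
    return instance_usd_per_hr * seconds / 3600

def edge_cost_per_million(device_usd: float, device_life_infs: float,
                          watts: float, inferences_per_sec: float,
                          usd_per_kwh: float) -> float:
    amortized_hw = device_usd * 1_000_000 / device_life_infs
    seconds = 1_000_000 / inferences_per_sec
    energy = watts * seconds / 3600 / 1000 * usd_per_kwh  # kWh cost
    return amortized_hw + energy

cloud = cloud_cost_per_million(instance_usd_per_hr=4.0, inferences_per_sec=50)
edge = edge_cost_per_million(device_usd=300, device_life_infs=500_000_000,
                             watts=10, inferences_per_sec=5, usd_per_kwh=0.15)
print(f"cloud: ${cloud:.2f} / 1M inferences")
print(f"edge:  ${edge:.2f} / 1M inferences")
```

Under these (deliberately rough) assumptions the edge comes out more than an order of magnitude cheaper per inference, which is the shape of the argument above, even though the real numbers depend heavily on the workload.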
Will the Datacenter Look Different?
People’s time-to-solution is improving, which has a direct correlation with productivity, so everyone is motivated to build a more efficient datacenter. A >10x improvement could be achieved in memory if people just decided to focus on it. So much time has been spent in computing, but the movement time of data is also a huge setback. Today there’s finally market motivation to do something about it.
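The data-movement point can be made precise with a roofline-style calculation: for a dense matrix multiply, compute grows cubically while the minimum memory traffic grows only quadratically, so small problems are starved by bandwidth, not FLOPs. The machine figures below are illustrative assumptions, not any particular accelerator:

```python
# Roofline-style sketch: for an n x n fp32 matmul, compute is 2*n^3 FLOPs
# while the ideal memory traffic (read A and B, write C, each once) is
# 3*n^2*4 bytes. Machine balance numbers below are hypothetical.

def arithmetic_intensity(n: int) -> float:
    flops = 2 * n ** 3
    bytes_moved = 3 * n * n * 4   # ideal case: each matrix touches DRAM once
    return flops / bytes_moved    # simplifies to n / 6 FLOPs per byte

# Hypothetical accelerator: 100 TFLOP/s compute, 2 TB/s memory bandwidth.
machine_balance = 100e12 / 2e12   # 50 FLOPs must arrive per byte moved

for n in (64, 256, 1024):
    ai = arithmetic_intensity(n)
    bound = "memory-bound" if ai < machine_balance else "compute-bound"
    print(f"n={n}: {ai:.1f} FLOPs/byte -> {bound}")
```

On this hypothetical machine, matmuls below roughly n=300 can't feed the compute units no matter how fast they are – which is why improving memory, not just compute, is where the >10x opportunity sits.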
Additionally, liquid cooling and other solutions are gaining popularity to help reduce electricity requirements – both for cost and sustainability purposes.
Will There Be Winner-Take-All Super Large Models in the Future?
Deep learning is the most efficient way of solving large problems. It’s not winning for no reason and there isn’t an alternative that can solve the problem any cheaper today. Super large models will happen and need to happen. There are good reasons for them, just as there are good reasons for smaller models.
Open Source vs. Closed Source
We can argue about OpenAI, but most would agree that its model is very good – likely the best model out there today. In a steady state, there’s a real possibility that everything could go closed source, with large vendors taking over the enterprise market. If proprietary models keep getting better and better, we may hit a point where it gets hard for open source to catch up. It may or may not happen, but the future market could very well be 80-20 (or more!) in favor of incumbents. Furthermore, when it comes to the most cutting-edge models, the dynamics of commoditization will likely take hold and prevent pricing from getting out of hand. However, at least for the time being, customers still want choice and will likely evaluate a number of different models.
Today, open source vs. closed source will ultimately depend on the goals of each individual organization.
First, open source will be much cheaper to scale, and that will create a counterweight against closed source models unless they pull ahead dramatically in performance. People are focused on getting apps up and running today, but later on they’re going to care a lot more about cost.
Second, it externalizes the cost of development and drives innovation much faster at scale.
Third, there’s way more flexibility inherent in open source. Companies can create domain-specific models, do whatever they want with the weights, and then run them on-prem or on-device. There’s a control aspect both around the business and around privacy. And additionally, there will be a huge role for a variety of deployments involving concerns around latency. There are a lot of industrial use cases today that are interested in using open source because people want to own the destiny of their product. You can think of it like in-sourcing vs. out-sourcing – there are pros and cons with each. If you outsource, you lose control of your destiny, IP creation, speed of updates, etc. People have data and that data can really become the differentiator. How you’ve cleaned that data is in itself IP. Customers will create their own custom filtering, which leads to certain model capabilities, and that model is then an expression of IP. Those incentives exist and will always lead to more models, whether that’s 100% rational or not.
Finally, open source is not just about being a good person and giving back to the community. There will always be a draw for smart people, because from a selfish standpoint, you can contribute something and take that longitudinally alongside your career. Your name and GitHub repo is on that project. Companies will then build products and services on top of that raw material.
Open Source vs. Proprietary
There are great cases to be made about open source vs. proprietary. It comes down to what each enterprise wants. Most enterprises today don’t really want to train their own model from scratch. There’s a maturity curve at play here:
Companies start by experimenting with GPT-4, then prompt engineering an open-source model, then fine-tuning that model, then adding RAG, then pre-training, etc. Eventually, it’s possible to work your way towards building your own model.
But it’s not one-and-done: someone has to ultimately maintain these models. Talent can leave, and things can become really hard. You see problems like this all over the place in business today – for example, a team can’t move from Excel to Google Sheets because someone wrote a macro 15 years ago and left. These sorts of problems will be further exacerbated if companies start building their own models.
Safety and Bias
It’s hard today to assess the safety of a system without good standards or a good way to audit. What if the data itself is biased, for example? As models proliferate, the harms are getting more and more specialized and teams all have different ideas on where to point the flashlight. A more profound consideration is whether you can even have safety at all without bias – everyone is making different choices and there are too many explicit value judgments. Regulation will be needed to help guide companies on what they need to do.
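One reason auditing is so hard is that even the simplest fairness checks require an explicit value judgment about which metric matters. A minimal sketch of one such check, demographic parity, on made-up predictions:

```python
# Demographic parity gap on toy data: the difference in positive-outcome
# rates between two groups. The predictions and group labels are made up,
# and choosing this metric over alternatives (equalized odds, calibration,
# etc.) is itself one of the value judgments the discussion points to.

def positive_rate(preds: list[int]) -> float:
    """Fraction of predictions that are positive (e.g. loan approved)."""
    return sum(preds) / len(preds)

group_a = [1, 1, 0, 1, 0, 1, 1, 0]  # 62.5% positive
group_b = [1, 0, 0, 0, 1, 0, 0, 0]  # 25.0% positive

gap = abs(positive_rate(group_a) - positive_rate(group_b))
print(f"demographic parity gap: {gap:.3f}")
```

Even this trivial audit forces a choice of metric and a threshold for "acceptable" – exactly the kind of explicit value judgment that makes standardized safety assessment, and eventually regulation, necessary.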