Yasir Anwar, Former CTO, and Chief Data Officer for Williams Sonoma

Today we welcome Yasir Anwar, former CTO and Chief Data Officer for Williams Sonoma, to The CXO Journey to the AI Future podcast.

Yasir Anwar is an innovation enthusiast who is an Entrepreneur at heart and a Technologist at execution. He considers himself on a D.I.E.T (Disruption; Innovation; Execution; Technology Excellence). To keep the balance between solving technical issues and shaping an overarching strategy, he strives to switch roles between being an engineer, architect, and C-Level Executive as needed. He believes this keeps himself humble, realistic and truly connected with his teams.

Most recently he was the Chief Digital and Technology Officer for Williams Sonoma Inc. leading all digital, technology, and IT functions. Before this, he was the first CTO for Macy’s, where he drove the growth of e-commerce channels, and omni-channel capabilities and led the merger of the previous digital and CIO organizations under one mission and culture.

Yasir is also a Lean practitioner at scale and established the LeanLabs culture at Macy’s which grew from one LeanLabs team to 40+ teams contributing significantly to an increase in revenue, cultural agility, and innovation.

Question 1: Current Job: Could you tell us a little bit about your background and how you ended up where you are today?

This career is what I’ve always wanted. I’ve tried to work towards being both an entrepreneur and an execution person, and it has taken a lot of work to get to where I am today.

I started as a core engineer in C++, Java,…all those languages and databases. And I’ve always believed that everything I’m doing must make a business impact (whether it’s internal or customer-facing revenue development). It can’t just be a shiny object or an academic exercise.

So, I’ve tried to make an impact via tech innovation, and more importantly, always measure it. I’ve always focused on how to better devise mechanisms, culture, and frameworks in order to facilitate continuous measurement.

That has been the core thread of the career I’ve developed. In all the leadership roles I’ve had, whether at Walmart, Macy’s, or William Sonoma, making an impact was ultimately accomplished by building great teams and a great culture, while still being capable of partnering with the broader ecosystem.

Question 2: Generative AI: Of course, here in the valley AI is all the rage. San Francisco has several new startups that are AI-focused. But you’ve been through quite a few technology journeys of your own. How much priority do you think we should be putting on this?

I think generative AI is critical to the growth, evolution, and innovation of any business. If you look at it from a time series function, last year was a year where companies and businesses should have invested in exploring and understanding the art of the possible in both the near-term and the long-term.

2023 was a year of exploration for most of the people who are not producing core AI solutions. I envision that over the long-term, generative AI will drive a paradigm shift of resources, mind share, and time investment throughout companies of all sizes. Today, the majority of businesses are focused on core operations, and only a small periphery is spent on growth and innovative initiatives. I strongly believe that generative AI has the potential to alter resource investment in growth and innovation at a scale that we haven’t experienced in the past.

This could be very different from many other technologies that came before. It has a much larger impact on every aspect of a company’s operations and consumer functions. I also think that it will free up time and resources through automation and collaboration tools.

We also know that consumers are being rapidly trained with AI-powered experiences and the expectations are going to climb very soon. In a 3 to 5-year period, I do see a paradigm shift in time, resources, and mindshare deployed to emphasize growth, innovation, and newness powered by AGI.

Extra question: You shared an image with us. Could you explain the key points? I know you’re talking about this growth mindset of AGI really transforming how we think about innovation.

I see a business as two concentric circles. One is core operations and on the periphery is growth, innovation, and newness. This is backed by the data with companies traditionally growing around 6% – 10%. Only some startups have meteoric growth.

However, with the deployment of AGI, the core operations inner circle is going to shrink. Mundane tasks will be reduced and growth and innovation will increase at a very significant scale, because now the right tools will be available for it.

Question 3: Early Learnings: You make digital transformation the first stepping stone to true business transformation. And AGI is the catalyst, the enabler of that. While at William Sonoma you deployed some AGI initiatives. Do you have any early learnings?

First, I think that for every layer of technology that comes in you need a team, and you need yourself to be aware and have learned and gone through the process of building some expertise for all the previous technologies. So it’s microservices, platform data, big data, AI in general, machine learning…

Then, AGI comes as an additional layer on top of all that. You have to start by establishing a test-and-learn model with a small team approach – hopefully, if you’re reading this, your company has already started.

I also think it’s key to establish and prioritize relevant and valuable use cases, and to understand where generative AI could help. These use cases need to be quickly paired with LLM solutions vs. people just talking about what could happen. Not everything will work out, and that’s okay. You need to look at what is promising and then set a test-and-learn iterative approach to shape the product or feature your teams are working on.

Extra Question: What is ready from an AI perspective (the models and LLMs available)? And what are your relevant use cases?

For retail in general, it has been content generation and product descriptions. They have proven to be very, very productive. A chatbot helps not only by just checking your order but also some product expertise can be outsourced to that. Designing the look, shoppable images and videos, buying the look, buying the experience, searching, and recommendations improvements by applying AGI and LLM models, have all been very important aspects of the work. Also, 3D and photo-realistic imagery is becoming very common now.

On the other side, supply chain and operational efficiencies are being improved upon by predicting and optimizing shipping delivery routes (which used to be a very transactional table in the past). Now you can dynamically learn what is the best route for delivery. You can also do some returns, prediction, and accuracy, which will help you throughout your entire supply chain.

You could boost associate productivity through code co-pilots, which we tried, and collaboration tools to optimize communication. There are companies now that are talking about inside retrieval from all the documentation that’s available, and providing you with suggested actions that you can pick and choose from.

Talent acquisition and talent management have also come up as a new topic where AGI is being deployed.

Question 4: Metrics: I presume there are a bunch of use cases that you had to challenge inside of the organization. How do you rationalize the list of use cases? Did you apply some business metrics to them? How did you prioritize?

I think the easiest thing to pick is where you already have a pain point in the business. Then you try to look at it from both angles, as I said.

One is defining the problem. I think we should always stop and start with the pain removal, before diving into growth.

As a technologist, your job is to figure out what technology is ready. For example, what kind of element models are ready for this deployment versus others? You can spend cycles and months and months, and you find that the lift was only 2% which you could have done by changing the color of the button on a screen, for example, or a location.

So the metrics are important to your point, and I believe that generative website experiences are going to be an area that’s heavily impacted. So I envision that the typical search bar on the website is likely going to go away. It will be more of a generative website experience where you’ll find more relevant and personalized results. So search, recommendations, and chatbot are all going to be meshed into a more generative website experience based on the time of the day, your original location, if you are traveling, or if you’re on an iPad versus a phone.

It’s about giving the customers something very relevant in order to lift the conversion through the roof. That’s the current opportunity for AGI.

Then, the entire commerce funnel has clear KPIs at every step already established. For example, conversion at every step of the funnel: AOV, AUR, category penetration, shipping costs, etc.

Still, while deploying generative AI at each of these commerce steps, we can’t only measure the KPIs as we used to. In the past, you would usually deploy something and achieve a 3-4% conversion lift, or, revenue-per-visitor for that particular small step.

I believe the power of AGI to validate is that its impacts have to be at a much larger scale and pace. For something you were trying to do over 6 months, the way I would vet an AGI solution, in the beginning, would be: Are you shrinking my cycle from 6 months to 2 months? Or one month? Or are you talking about a 5% lift and you’re bringing an additional newness to my solution and experience that could give me anywhere from like 20 to 50% growth across that segment?

I think that’s the way you look at metrics here. And if you can even reach the halfway mark, I still think that’s a fantastic success vs. still trying to just get a 5% lift in conversion. Avoiding bias and cannibalism of other KPIs over a period of time is an important thing to watch as well.

Another important category in my mind is the analytics to track the AGI progress. I think there needs to be a shift from existing transactional reporting to more predictive trend analysis, facilitating better planning and decision-making.

Question 5: The ‘Buy versus Build’ concept: There’s a technology decision as well. You are the technology leader and in all of this effort at some level, you have to decide: Do you go with existing vendors? Do you take on some of these new vendors? Do you build your own model? What’s the ‘buy versus build’ thesis that you believe we should be thinking about? How do you go about making all of those decisions when it’s an early adopter market?

I practice a balance of this because you can’t be just a buyer person–you don’t need to be a technologist to be a buyer. However, if you’re just building, then you get too much into the weeds of proving yourself and your teams.

Everyone is new in this space because it’s all new. However, I believe that over 50% of the enterprises that have built LLMs from scratch are likely going to abandon their efforts due to cost, complexity, and technical debt. It’s not going to be manageable.

So these foundational ecosystems, like OpenAI or Azure providing a view into OpenAI, or some niche companies who have built on top of OpenAI, are credible. These provide you with a safe playground. They also provide you with learnings from their own data, and models at an unprecedented scale. You can’t compete with the data Google has, or Azure brings, etc.

That being said, you could deploy all your inference models and orchestration LLM models on top of their models to add even more value and specialized results for your business and problem space.

That’s the build versus buy combination.

You’ve got to watch it. But at the same time, I believe that if you don’t build muscle internally, then at some point you’ll just be running blind. You won’t know what you’re doing and you’ll be trapped working with all the major brands. So it’s also important that there are groups of people within the company who continue to evaluate and approach the latest and greatest open models for relevant business cases in order to validate the offerings from the large ecosystems and vendors. This ensures that you’re not missing out on something difficult on account of suffering from vendor lock-in.

Question 6: Gaps and Issues: What do you think some of the gaps, issues, or impediments are for large organizations? I know this isn’t a technology-only issue. What are some of the issues, or maybe even gaps in technology, that you believe need to be addressed?

I think the overall core talent pool is very small, and by this I mean people who know the guts of data and the guts of AGI. Tools like ChatGPT make it seem easy. The talent that says “I can get started by generating great prompts” is plenty. You don’t need to be a technologist to do that. But the core talent, to my previous point, is still very small.

However, for any large-scale enterprise, it’s critical to have some core talent that understands the guts of AI, the anatomy of these LLM models, and, more importantly, the criticality of data preparation.

So I think companies should also invest in their existing deep experts in data and platforms and get them motivated and trained on AGI. That will help integrate them with a new generation of LLM engineers. So the total product as a team will be more holistic, versus just rushing towards building LLM models.

To me, this remains the biggest challenge for most companies. And this despite all of the investments in the past decades that I’ve seen in data lakes, and data oceans or pools, whatever they’ve created. Still, your results and LLM insights are going to depend on your data. So preparing that data and the quality of that data is going to be very important.

Question 7: Responsible AI: We talked about bias and hallucination, trust, and risk governance issues. What do you think about responsible AI? And what message would you give to your senior leadership on how they should think about it?

From a business perspective, the scale of impact of AGI has to be much larger than the normal projects you would do in tech investments. And that’s the promise it should bring. Therefore, it must be vetted.

For that reason, I don’t even like to call it artificial intelligence. I’d rather call it Accelerated Intelligence, because that’s what it should deliver. If it’s not giving you accelerated intelligence to make an impact, then why even bother building all of these models?

AI should have constructive business goals and measurable KPIs and transparency. If you can, up front, define that this is the part that an LLM model is going to produce, and you can measure and provide that transparency to your business partners and leaders, I think they will come along for the journey. They will want to get trained and educated. They can even bring their own insights, and challenge you with the right insightful questions about the KPIs and measurements.

Don’t treat it like a black box: “We’re doing some research, and we’ll tell you where we find a nugget.” Rather, be upfront and defined. If there are seven failures, it’s good for the company to know that it’s not mature yet, even though our teams have tried seven times. Or you want to be able to say: “In this particular domain, it is mature, and it is giving us results, and the results are realistic.”

When it comes to the responsibility of AI, there are other aspects to consider as well. It needs to be protected from biases and corrupted training data or user feedback, which is called data poisoning or prompt injection attacks.

So it’s important to transparently define both of these issues in order to ensure correct outcomes across the various stages of development and deployment. Large vendors and ecosystems will need to focus here as well.

Another important aspect is that generative AI and its proliferation poses a big risk to organizations as it provides a significantly larger attack surface for cybersecurity issues.

It also provides a new attack vector, where it’s much harder to identify anomalies. So to me, data should be protected all the time and for any new deployment. One of the types or ways to approach this is that we should implement a safe test-to-learn approach with only necessary data provisioned in a sandbox environment. The solution runs to understand the scale of the idea and the impact and validate its safety before we roll it down to the full dataset of the organization, domain, or department.

Another idea I have is that there should be guided AI-type platforms like OpenAI. I believe there is an overarching need for these platforms given it has such a global impact. In parallel to OpenAI, this guided AI platform can govern all AGI implementations in an enterprise, map them each to their planned outcome, and then also catch signals proactively.

Even 100 implementations may cause long-term deviation from company goals.

If you implement AGI across many different departments, individual teams may be doing fine, but not aligning with top-level initiatives. You might have the goal of driving operating margins in a particular direction, and two individual teams could be polluting everything, even if the rest of the solutions are working fine.

There needs to be a parallel platform that can provide this guidance watchtower aspect, and that is not specific to any specific company. It’s a larger platform that could be deployed in enterprises and would watch out for every level of possible anomalies and deviation from a specific implementation of an LLM, but also tie it back to overall company goals, and see if any of the LLM implementations by themselves, or combined, may muddy the waters.

Yasir Anwar is an innovation enthusiast who is an Entrepreneur at heart and a Technologist at execution. #Techpreneur

Yasir is on a D.I.E.T – Disruption. Innovation.Execution.Technology excellence.

Yasir is shaping the strategic technical direction of the companies and transforming technology for the next generation of retail – the world of connected commerce. Yasir has led end-to-end business and digital transformation for companies through technology excellence, innovation, and digital prowess focused on customer experience.

Most recently he was the Chief Digital and Technology Officer for Williams Sonoma Inc. leading all digital, technology, and IT functions. Prior to this, he was the first CTO for Macy’s, where he drove the growth of e-commerce channels, and Omni capabilities and led the merger of the previous digital and CIO organizations under one mission and culture.

Yasir is also a Lean practitioner at scale and established the LeanLabs culture at Macy’s which grew from one LeanLabs team to 40+ teams contributing significantly to revenue increase, cultural agility, and innovation. Before that Yasir had many leadership roles with WalmartLabs, SamsClub.com, Walmart.com, and Wells Fargo, and executed large-scale projects across various domains like manufacturing, ERP, Auto Industry, Learning Management Systems, etc.

“If people are not laughing at your goals, your goals are not big enough.”

Dan Elron, Managing Partner, Enabled Strategy

Today we welcome Dan Elron, Managing Partner, Enabled Strategy, to The CXO Journey to the AI Future podcast.
Until recently, Dan Elron was Managing Director of Corporate Strategy, Technology, and Innovation at Accenture, where he helped drive strategy for their technology business. He has advised many key clients, working with CEOs, corporate boards, and leading academics and policymakers across the globe.

Most of his client work has been in the high-tech, telecom, and financial services industries, and more recently, the automotive industry, as it began to integrate advanced digital technologies. He was also the Information Technology Industry Advisor for the World Economic Forum. For the past decade, he has worked on the impact of artificial intelligence across large enterprises, and anticipates significant disruption during the coming decade.

Question 1: Current Job: Could you tell us a little bit about your background and how you ended up where you are today?

I’ve been in management consulting for a long time. I recently retired from Accenture after working for many years in the tech and telecom industries, as well as other large global enterprises. I eventually began working on Accenture’s own business and technology strategy, its ecosystem and its relationship with both large providers of technology as well as startups.

Now I’m working with startups as a mentor, as an investor and advisor to investors, and with large organizations that are struggling to integrate the amazing wave of technology that’s coming to us, including the topic today, which is Generative AI. But as you can imagine, that’s just the tip of the iceberg. Many other things are happening over the next 10 years, I think not just in IT, but in IT-enabled technologies, including in healthcare, biology, agriculture, etc.

So there’s a lot of interest today around leveraging technology for strategic value, and I really enjoy working with clients on that topic.

Question 2: Generative AI: You’ve had so many conversations at a strategic level with thought leaders and major corporate teams. Is there something unique about Generative AI or AI, specifically, that you think means higher priority from a leadership standpoint? How much priority do you think we should be putting on this?

I’m reminded of Yuval Harari, who wrote the book Sapiens. There’s something he said that really struck me; he said that using Gen AI is “hacking the operating system of humans.”

Language is the operating system of humans, and if we’re hacking that, even if it’s primitive hacking today, we’re impacting a lot of the information flow among humans, a lot of the transactions between humans, and eventually a lot of the emotions between humans. Ultimately, all organizations are made up of humans, and they don’t work very well if the humans don’t work well with each other and communicate with each other.

So here we are, introducing a new agent. An automated agent that speaks our language, generates our language, and can help or hinder how an organization works. So because of that, I think it’s extremely important. I don’t think anybody understands exactly where it’s going to go. We’re early in the technology’s development; every month brings something new. And we’re fascinated by what’s going on today. But it behooves us to think about the likely evolution over the next 5 or 10 years, which is very difficult to predict. It will likely extend from language to images, and from the digital to the physical world. But it is fundamentally disruptive.

So the answer to your question is, yes, I think it’s extremely important. It doesn’t necessarily mean that this is where you need to spend the most money as an enterprise, but it probably means that this is where you need to spend the most thinking as an enterprise.

Question 3: Early Learnings: How have you executed a Gen AI strategy for your team or organization and what initial learnings or initiatives can you share?

I was shocked by the early use cases even a year ago. Very basic things such as feeding support tickets into a model, reducing the speed of figuring out those problems, and reducing the level of experience required to address those problems was truly astonishing.

So very impressive gains came very quickly. And not necessarily in the areas that we predicted. That requires experimentation which I think most folks are doing right now, and it requires creativity. There are a lot of underappreciated sources of knowledge that you can feed into Gen AI.

And it’s important to note that I make a distinction between Gen AI and the rest of AI, including machine learning. I think these things integrate with each other and need to work together; we can’t just focus only on Gen AI. But when you do that you get amazing results. Not necessarily always the most strategic results in terms of market positioning, but certainly impactful in terms of speed, cost, and as I said, the level of talent required to execute a process.

So I think the early learnings here are that it can be transformational for many processes. It requires experimentation. It requires integration, as I said, with other technologies, whether it’s AI specifically, machine learning and associated technologies, or with other technologies. If you’re in biotech it means integrating with the latest learnings in science and biotech. And that’s something that we’re beginning to see.

There’s an amazing difference between the few leaders and companies that have been at this for a year or six months, and those that are just getting started today. So if you’re not doing a lot of that, you’re already behind.

The other lesson learned, and something that concerns me, is that a lot of companies when they think about Gen AI, think: “What should I do? Which department is it? Customer care, chatbots?” And they rarely think about the ecosystem as a whole.

The suppliers are going to use the same thing, and you can work much better with your suppliers. Or sometimes be disintermediated by your suppliers if you don’t think about how they’re going to use Gen AI. And the same thing will happen with other partners: supply chain partners, government partners, your customers, etc.

Many of today’s use cases I see are internally focused. We need to start looking outside. Odds are, somebody outside our ecosystem will have better ideas than we do, statistically, in terms of what to do with Gen AI.

In some cases, if leadership teams aren’t careful, companies will get disrupted. Because somebody will build a platform, and products and services will be consumed by some kind of automated agent that they design and control. And companies will have to respond.

So the early concern I have here is: you really need a holistic point of view. Everyone is looking at their competitors today, but consider the whole value chain, how you’re delivering value to customers. Who has more power? Whose power is going to be amplified by the use of Gen AI? And, unfortunately, whose power is going to be diminished because they still have too much friction? For example, we’re making information more easily available. So the competitive dynamics in many value chains are going to be disrupted by that. It’s just beginning now, and it’s going to accelerate.

Because of this the role of the Chief Strategy Officer, which maybe wasn’t that significant five years ago, is really becoming much more significant.

Question 4: Metrics: So how do you move a team to begin? You mentioned the Chief Strategy Officer. How do you create metrics that set an agenda? How will you prioritize where to deploy Gen AI solutions? What areas or use cases are at the top of the list?

Before we go there, I think the strategy to implement Gen AI and AI in general, is potentially strategic. But it’s also so important tactically, in terms of cost reduction, and process design.

The strategy has to be driven both top-down and bottom-up.

The experimentation that you and many others talk about needs to happen and often should happen bottom-up. I don’t like the idea of having the Chief AI Officer; I know that’s a little controversial. I think everybody should be a Chief AI Officer. Certainly, everyone who reports to the CEO should be the Chief AI Officer for their domain. It’s not something you can delegate. It’s the same thing that I think about innovation more broadly, by the way.

But in terms of the impacts, they’re going to be different. Therefore the metrics are going to be different between the tactical bottoms-up work and the more strategic work. So clearly, for the tactical bottom-up work, the classic measures of productivity can be extremely significant, and indeed we are seeing sometimes major productivity gains already. Certainly, we see that in software development, and also in customer service, in repair, trouble ticketing, etc., they’re very significant. I would focus less on the ROI. One reason is that the cost side of geAI is still unclear, I think.

Prices are going to come down eventually, and I wouldn’t hamper the technology by allocating too much cost to it yet. When that scales, maybe, but for now I think you want to experiment and introduce it as quickly as possible and not apply the usual business case logic which slows everything down in many, many companies. I know that’s easy to say and CFOs don’t like it. But this technology is still early and I think that’s the right thing to do.

Also, for all executives, while there’s a lot of value in taking a two-hour course, if you’re an executive affected by AI or Gen AI, there’s a lot more value in actually seeing it across your organization, working, and hopefully working well – so you need to get close to the work in the field and not treat it as a standard ROI-based effort.
In terms of the metrics, I think speed is extremely important. Because, as we know, speed is strategically valuable in just about any context.

Another thing to consider is the business process: how many steps in the business process are automated versus manual? How can that change? It’s a very basic thing. This is something we looked at in the past when we did reengineering for those of you who remember that – efficiency and effectiveness.. But that will help you to understand the ability of your organization to change by applying GenAI, agents, etc..

Also, how susceptible are you to disruption? If something that took many steps suddenly becomes much more simple with this technology, it could be that your competitors, or new entrants, can do that with a new platform and overcome the complexity that kept you successful as an incumbent.
So some of these metrics are not traditional, but because we’re early in the technology, we want to understand the strategic value.

Another set of impacts that I find very interesting addresses the work experience and training required. We all know there’s a shortage of talent and that Gen Z has different priorities and different requirements for their jobs. So what is the impact in terms of the experience, training, and job satisfaction of introducing this technology? Therefore, what does that mean in terms of your labor demand, the kind of labor, and the amount of labor that you’re going to need over the next few years? How do you design the new AI-enabled roles while maintaining what humans value?

To give an example of impact, one case I know of is in the automotive industry. The average experience required to solve a particular problem went down from the need to use an engineer with 5 to 7 years of experience to engineers with 1 or 2 years of experience because they were enabled by Gen AI. That’s extremely powerful, and raises the question of, for example, how to keep both levels of engineers engaged and learning.

In markets such as Europe, where many engineers are retiring, the use of GenAI, including automating processes via genAI agents, will be extremely helpful. So, to your question, preserving knowledge and reducing labor demand are metrics that should both be considered.

And the last metric which I don’t really think you were asking about, is risk.

We tend to attribute too much risk in some cases to AI because of all we read. What is the real risk here? I encourage companies to evaluate the risks specifically for each use case.

What’s the worst that can happen? And how often do we have problems? Obviously, it’s still early days, we’ll have problems. But are they show-stoppers, can they result in serious, systemic injury or damage, or are they things that can be managed and fixed?

That perspective should help management teams overcome the fear that in some cases prevents them from doing anything. Even in a law firm, there are many, many things you can do with Gen AI that are not going to get you in trouble when you get to a judge who doesn’t like a brief that you’ve written with genAI.

And so it will help to have a solid, simple framework for saying, say, this is high-risk, high-reward, or maybe high-reward, low-risk. That helps you with the deployment strategy and helps you measure whether or not you have a balanced portfolio of tech applications in terms of risk and reward, and to be prepared for clean-sheet disruptors who will start with genAI as a basis for designing their processes.

Extra: You came from working at a management consulting firm for years. How do you think management consulting will evolve through this?

I think because the industry is populated by people, who, as an economist called them, are ‘symbolic manipulators’ (people who work with symbols, whether the symbols are software or words on the Powerpoint), that the industry is going to be perhaps one of the most affected by this ‘hacking of language.’ And if the language is structured, such as computer code, the disruption will be much more serious.

There are still barriers to this disruption. It turns out that, you know, PowerPoint, for example, is not the best way to train a large language model. But we’re going to get there.

From my perspective (and this is an industry I spent pretty much all my career in), the industry needs to reinvent itself. The impacts on software are definitely already happening. I just saw various projects where companies are using software to develop complex systems or at least pieces of complex systems that would have cost tens of millions of dollars and now can be done much faster with less experienced talent. So it’s disruptive there.

It’s also disruptive in something that the industry does, which is to take information, aggregate it, and deliver it in a customized way to a client.

It’s an industry that really needs to rethink what it’s doing. There are huge opportunities for the industry in redesigning the ecosystems, for example, for the use of agents; in rethinking how work is organized; in enabling the human transition; in making sure the benefits are captured and risks managed. The consulting industry did that in the process reengineering days; it is time to do that again and be willing to let go of work that is now highly automatable, however hard this is.. When a new technology is really threatening, sometimes you don’t move very fast, or you focus on small projects and proofs of concept, and I think that’s very dangerous for the industry. Technology providers would love to capture the value that used to go to consultants and integrators during past tech transitions. This time it could happen much more quickly, before the industry has managed to change how it works.

Question 5: Gaps and Issues: What do you think are some of the gaps, issues, or impediments for large organizations? I know this isn’t a technology-only issue. What are some of the issues, or maybe even gaps in technology, that you believe need to be addressed?

Well, I’m going to discuss it with you as a venture capital firm. There’s a terminology we haven’t used for a while, but maybe you’ve heard of it, it’s called the ‘- ilities.’

These are attributes of technology, especially computer technology, which include scalability, manageability, integrability, security, etc. Those are often missing for Gen AI – I think we all know that. But I don’t think there’s enough funding that’s going towards these things, which may not be that sexy, but for an enterprise to use this software in a reliable, predictable way, we need a lot of investment in those.

Another gap that kind of addresses something that I mentioned earlier is risk mitigation and risk management. We know that technology is far from perfect (think: hallucinations), but there are very few tools at this point to tell us what is dangerous and what is not dangerous. Or: to help enterprises figure out where to put the technology and let it run, where to turn up the temperature on a model, which means where to let it be more creative vs. use a completely explainable model. Not everything has to be explainable, that would be terrible if we required every model to be explainable. I know a lot of companies think they should. I don’t agree with that.

There are places where you can take risks, where creativity is good, and where, frankly, you don’t care whether you know how the answer was received. You probably can tell that the answer was a good one and the risk of not explaining how you got there is acceptable. I already see how big companies are unnecessarily preventing those uses.
We are also missing good approaches to training. We know that even an hour of training on using language models has a huge impact on productivity. But what about a day of training? What about a month of training with mentors? What kind of training? We still don’t know.

Another gap is the change management. People in many organizations, up to the very top are, in many cases, frozen. They will say, “Yeah, let’s have a few pilots here and there. Let’s reconvene six months from now, and see how these pilots have gone.” In some cases it’s fine, but in many cases, it’s too slow. So what kind of change management can we implement in order to encourage departments and product groups that are susceptible to disruption, or which have an opportunity to really do much better and generate a lot more revenue? How do we encourage them to be comfortable with the technology and to move from exploration into production? One lesson seems to be that it is better to deploy the technology to teams than just to individuals.

Even example case studies, videos, and interviews/podcasts such as the ones you’re doing in this series can give confidence to organizations to move faster. And right now I think many are saying they’re moving fast, but when you really look at the rate of implementation and how close they are to getting something into at-scale production, the answer is often not very good.
You extrapolate the things that they think could get into production and you see that they’re going to be a year or two years away, which for the more basic things, or, say in the IT shop, is way too long.

Extra: You’ve helped people get through this change management obstacle many times in your nearly 30-year career at Accenture. How did you get people fired up?

Well, it is challenging. But yes, I remember even having to convince many telecom companies that broadband was a good idea. And here we are with something that probably is much more disruptive than broadband.

Fear and greed are very helpful here.

For example, if you paint the picture of an ecosystem that’s full of agents that decide who should buy what, at what price, and from whom and then look at your own process and see how many steps and how bureaucratic what you’re doing today is, this creates a degree of fear because your competition could adopt these technologies and be much better, faster, and potentially cheaper.

The more I read about these competitors to Amazon such as the Chinese retail and e-commerce players who are coming into the US, and how they’re using not necessarily Gen AI, but AI more broadly, or technology in general, and how quickly they’re moving, that should strike fear into all retailers and other companies as well, maybe even highly regulated, often inefficient, protected industries such as insurance, who may intellectually understand the situation but don’t move very fast.

About greed, I think we all know how that works. It can be very powerful with the C-suite.

The cost reductions, as we mentioned, the reduction in the talent requirements for addressing business processes, the simplification of organizations, the integration of agents, not just today’s Gen AI, but what’s coming next, etc.: when you describe those and provide real examples in a C-suite you see eyes light up. They understand that this will have a bottom-line impact on their organizations. The revenue side may be more challenging, but the examples are coming quickly here as well- for example in simple upsell situations.

The biggest concern I have around getting people fired up is that there’s so much negativity about AI right now, especially in Europe. The concerns about the risks, the amount of regulation, and the issues around hallucinations that could get companies into trouble are all valid. But I believe that’s putting many speed bumps in front of the exploration and employment of the technology when they’re not necessary – or when it is others, such as regulators, or your suppliers, who have to take the lead. You may not be able to wait for these often slow processes for what you are doing inside your business, despite the negativity.

We often don’t like where we ended up with the Internet, and as a society we should learn from that and address these issues quite forcibly, but we should not let the sins of the past prevent us from dealing with the opportunities and challenges of this new set of technologies.

So these are some of the challenges we need to overcome. These are early days, and competitive advantage often is created in the early days. New models are created by innovators; in some cases, being second or third doesn’t work, especially when you look at platforms and other business models, new ecosystems, etc. Every industry is different, but I think having a view of what might happen, having scenarios, and moving quickly when necessary, makes a lot of sense.

Question 6: What would you say to a C-level executive on how they should be thinking about going forward?

My sense in the last few months, having been in this business for a long time, is that I have never seen a technology that’s moving as quickly with the potential impact that Gen AI has. This is partially because it’s built on top of the internet and a lot of prior investments. But we’re now accelerating, and I struggle to imagine what and how quickly things can happen.

So that’s the message I would leave. It requires making efforts to learn and reflect, to really think about the medium/long term impacts and how your business and ecosystem could change, including how you make money. As I mentioned, there are lots of lessons from early adopters, from China, other industries and places.

The second thing is, don’t delegate. As I said, hopefully you and your team can be your own Chief AI officers, and you don’t need to hire one. This is something that every senior business executive needs to take seriously and think about, at least for their domain, and how that’s going to accelerate as we get into, for example, autonomous agents, as we discussed,
Yes, the change can be concerning and could well be disruptive, but I think the impact of technology on things we care about such as healthcare, the environment, or transportation is going to be dramatic over the next ten years, and hopefully very positive.

I am hopeful our children are going to benefit from this. We need to be careful, but also really be inquisitive and excited about what might come.

Until recently, Dan Elron was the Managing Director of Corporate Strategy, Technology, and Innovation at Accenture, where he helped define its technology vision and drive its strategy for its technology business, and where he advised key clients, working with CEOs, corporate boards, and leading academics and policymakers across the globe. He also helped lead Accenture’s Innovation Network and its relationships with large and emerging technology partners. Most of his client work has been in the high-tech, telecom, and financial services industries, as well as the automotive industry as it needed to integrate advanced digital technologies. He was also the Information Technology Industry Advisor for the World Economic Forum. For the past decade, he has worked on the impact of artificial intelligence on large enterprises, anticipating significant disruption during the coming decade.

Currently, Dan works on strategic topics and advises senior leaders at large US technology players, and mentors and invests in startups in the US and Europe. He teaches at the University of Virginia, where he is associate director of the Center for the Management of Information Technology, and also at INSEAD in France.

Practical AI: Early Use Cases in the Enterprise Today

Over the past 6 months we’ve hosted several CXO Insight Calls around the topic of AI. However, there’s a real need to solve a number of core issues before enterprise IT leaders can adopt and move to production. According to Forbes, 55% of business leaders say their teams are resistant towards adopting AI today. It’s not hard to see why.

Our goal with this discussion was to move beyond the high-level conversations we’ve been discussing over the last 12 months and hear from a few leading startups on how to leverage external technology to drive practical advances in AI today. What’s the reality a year and a half in?

We were joined by Rehan Jalil of Securiti.ai, who covered his perspective on leveraging xStructured data, Rob Bearden and Ram Venkatesh of Sema4.ai on agents and automation, and Vin Sharma of Vijil.ai on trust and safety…so let’s dive in.

 

Early Learnings at Mayfield

Our thesis in AI is that AI + Human = Human², and that instead of displacing today’s workforce, AI should be used to elevate human productivity. The first wave of AI, which arrived over a decade in the past, was all about the robotic automation of tasks. Today, we’ve made it to the point where we have co-pilots (or AI assistants). In the future, we believe that there will be digital coworkers with humans as assistants – unlocking employee potential to work on more meaningful tasks. We’re calling this new AI era “Cognition-as-a-Service.”

While this is very new, in the past, we’ve followed similar waves of technology. There was infrastructure-as-a-Service, followed by platform-as-a-service (focused on developers), then software-as-a-service (focused on line of business users). And in this new era we’ll have cognition-as-a-service, and the first block is going to be cognitive cloud infrastructure.

Looking at our AI stack, let’s zoom in bottoms-up:

  • Semis and infra software: Both of these will be covered by the big cloud players
  • Models: These are important and will start to become increasingly multimodal. Companies will need things like model trust, safety, and evaluation testing
  • Data: There’s going to be a huge need for data infrastructure, operations, and security
  • Middleware, Apps, and Digital Coworkers: People will build the apps and tools needed to enable digital coworkers

We had three companies join us on our call and speak to different aspects of this stack: the models, the data layer, and the middleware. In the future you’ll be hearing a lot more from us around intelligent applications and the future of agents (or digital assistants). The end game may very well be a hybrid workforce, where we humans will have these digital workers as our teammates.

We think that there will be endless possibilities over the next 15-20 years in this area and are looking forward to seeing what everyone comes up with. To kick off our call, we ran an audience poll (150+ responses) to get a sense of how optimistic vs. cautious the audience was with regards to their upcoming AI initiatives. Generally speaking, it seems everyone is looking forward to the future.

Unleashing the Power of xStructured Data Safely with Gen AI – Rehan Jalil, Securiti.ai

Today, organizations across the globe are eager to utilize their proprietary data with LLMs in order to generate value. Unfortunately, this requires the need to have proper visibility and manageability of both structured and unstructured data – particularly unstructured – which is now center stage. How do you deal with figuring out where your unstructured data lives? Or what data you have available? Or how to bring that data into new gen AI pipelines?

One approach could be a knowledge graph. Imagine: all your data in different environments, organized to include an understanding of what files live where, what kinds of sensitive information they might contain, what types of files they are, what purpose they have sitting within the organization, what the applicable regulations are around that data, and even the enterprise data processes that are operating on top of it all. This is Securiti.ai’s approach: the data command center.

Via this command center approach, it’s possible to ensure stronger data security, data governance, data privacy, and the monitoring of unstructured data. So, where does gen AI come in?

The advanced modern gen AI application is going to need to take data from anywhere, data that’s sitting across all these different systems, and combine that data with your custom models to create all the exciting new innovations on top of them – even creating new agents or assistants.

Today, organizations are trying to build their own knowledge systems or applications where people can go to ask simplifying questions. Or, they’re using these data pipelines to tune existing language models. Very occasionally, you’re even seeing a few companies training their own custom LLMs using this infrastructure – but that requires the data to be actually brought into the mix.

In the past, structured data was used for business intelligence, and the whole ecosystem around capability was required. But today, we’re seeing on a daily basis the need to understand unstructured data and catalog it, discover it, and understand what its role is within the organization. And there’s even more emphasis on things like the quality of the data, the lineage of the data, the compliance of the data access, the entitlement of the data, etc. You can almost draw a parallel of the entire stack that once existed on the structured side now needing to move to the unstructured side.

Building a typical pipeline requires real effort:

  • The most appropriate data needs to be selected – it must the right data which doesn’t violate your internal controls, entitlement issues, or the governance that you actually want to put on these pipelines
  • The data must be ingested and extracted across a variety of formats (you may have PDFs, audio files, video files, etc.)
  • The data needs to be sanitized, redacted, or anonymized as-needed in an automated fashion
  • The data may need to be converted into vectors and provided to retrieval engines alongside guardrails to ensure no model or data poisoning is occurring

…all this, before it can be married to a model…and then you run into struggles like inspecting malicious prompts. So you also need a solution around exfiltration.

Audience Question: The idea of having a centralized mechanism for understanding what data we have (and making it simple to use and re-use) has been a bit of a holy grail in the information technology space for a long time, but we’ve never been able to achieve it because classifications generally break down the larger they get. However, we’ve been able to get by by doing just enough to make data accessible when and where it needs to be accessible, in order to protect privacy and so forth. So what’s unique about this moment in time that we need a data command center? And what makes it different from past efforts in the space?

A couple of things are coming together today. First is that people want to utilize this unstructured data in a very different way than they did in the past: to power LLMs. So a new need has definitely arisen.

You could say some of the same things about structured data in the past, but now with the vast understanding of language that LLMs bring, you can classify things in a much more high-efficacy way. However, you still need the metadata to be available. If it is though, you can build out a knowledge graph – which is a literal file-level understanding of where your data is. For example, if this is my audio file, tell me where it’s sitting, who is entitled to it, and what’s inside that file.

Generative AI is helping enable the ability to ask questions from these systems across any of the entities. Additionally, with new regulations popping up all the time, it’s becoming increasingly difficult to stay on top of data management in a piecemeal fashion. This is essentially a much stronger facilitator for enterprise search – except actionable against gen AI use cases (selecting data, protecting data, sanitizing data, and sending it to your LLMs).

Audience Question: A big trend we’ve all been seeing is the use of synthetic data – what’s your perspective on that?

Synthetic data is unquestionably used to replace structured data as part of your data pipeline. If you can replace your original data with a synthetic version, particularly on the structured side of things, mathematically, you can remove the unique tie back to the individuals. That’s useful. However, it’s not useful in many other situations where you still need to clean the data. Let’s say you have an audio file and within that audio file you want to know certain things, for example, that the audio file actually has very specific censored information at a single point in time that must be removed. You may need to apply different techniques to actually remove and redact this information. So it’s a very important part of your toolkit, but only applicable in certain situations.

Audience Question: What are the best practices for keeping AI models secure? What guardrails should be in place?

There are five steps organizations need to take in order to enable gen AI safely:

  • Discover AI Models – Discover and catalog AI models in use across public clouds, SaaS applications, and private environments. Figure out what the ratings are and what the best models are for your use case
  • Assess AI Model Risks – Evaluate risks related to data and AI models from IaaS and SaaS, and classify AI models as per global regulatory compliance
  • Map Data + AI Flows – Connect models to data sources, data processing paths, vendors, potential risks, compliance obligations, and continuously monitor data flow. What data is going into your models for tuning, training, embeddings, vector creation – it all needs to be monitored
  • Implement Data + AI Controls – Establish data controls on model inputs and outputs, securing AI systems from unauthorized access or manipulation. Ensure that inbound prompts are blocked from any prompt injection attacks and that they’re inspected using an LLM firewall (these have policies defined for you to understand if something is a jailbreak, offensive content, exfiltration, etc.)
  • Comply with Confidence – Conduct assessments to comply with standards such as NIST AI RMF and generate AI ROPA reports and AI system event logs. There are already 15+ AI regulations in place today

Deploying and Managing Intelligent AI Agents – Rob Bearden and Ram Venkatesh, Sema4.ai

The future of AI is going to lie in enterprises being able to build, deploy, and manage intelligent AI agents. Generative AI as a whole is probably one of the most important enabling, and enabling is the operative word here, enterprise technologies in the last generation.

What we have to realize is that it is enabling technology, and that if we look forward, every company will need an intelligent agent to be able to capture this enabling technology and apply it within the enterprise (where it can create and capture value by taking advantage of enterprise capabilities required for managing and accessing data, and then leveraging that data through gen AI structures).

Intelligent agents will be how the enterprise interacts with their customers, their supply chain, their employees, and really even their products in the not-too-distant-future. So Sema4.ai’s focus is on how intelligent agents can get to a place where they’re delivering significant outcomes and enabling high ROI use cases at scale.

These agents are fundamentally different from LLMs or RAG applications, they’re very nuanced on how a user’s natural language queries are executed, they don’t just respond to prompts but actually provide the reasoning about and with full context of all the user’s needs (even anticipating what those needs are!). Then, they’re capable of determining the sequence of actions that they want to execute, based on short term working memory, but with an additional, longer-term repository of knowledge that comes from a runbook that Sema4.ai provides, and a gallery of use cases that reflect what human cognitive processes look like. Additionally, these agents provide the ability to leverage external tools and APIs that are truly required in order to execute complex tasks across both digital and physical environments, while leveraging both structured and unstructured data. This is going to be the unlock for massive value creation, making enterprises far more efficient.

Today, given where we are in this journey, the benefit for enterprise engagement on building, deploying, running and enabling these intelligent agents really comes down to achieving competitive advantage. But anyone can start small by an MVP success and continuing to iterate. It’s a good time to build a center of excellence around how to operationalize this muscle and create and measure the value capture and creation from it.

Example Use Case: Global Electrical Equipment Manufacturer, serving customers in 90 countries.

This company has a small team that performs a very interesting function today: monitoring a set of government websites for export compliance as the rules change around things like sanctions. Whenever they see that there’s a new update, they download the PDF, and this PDF is analyzed by their legal team. They come up with the deltas between their current policy and the government’s new policy (e.g. here are the new agencies and individuals who have been placed on the list), then they have to turn around and see if this actually impacts their business.

At this point, they need to go do some lookups in their backend systems to understand: is this a customer of ours? What kind of financial relationship do we have with them? Once they understand that, a new person (a combination of account management, sales ops, and legal) has to come up with a legal policy statement. Sometimes these could have ramifications, for example, if they’re a large customer who was associated with Russia last year. This will create a material change in revenue from that customer. So someone has to put this information together and send it all the way to the CFO’s office for them to review. These policy documents have very, very structured requirements, a template they must follow, and a review process they must go through. And at the end of all that, it’s only published internally.

This entire manual workflow can take 4-5 people a couple of weeks to get done. It’s very material and consequential for the business because if they get this wrong, not only do they have top line impact, but they also have legal and compliance impact that they could be exposed to. And then they get to do this all over again because the next update shows up every three weeks or so. This is an example of a workflow that humans are really good at, but if you take this workflow and deconstruct it, you can see that this involves having access to unstructured data that’s typically behind a paywall of some kind, being able to summarize and analyze that data according to a narrow set of guardrails established for performing this process, and then querying backend systems that are very structured themselves. Finally, the answers coming back from them must be used to create some unstructured content that goes through a document processing workflow, and then finally, you get your output.

Typically, the day in the life of someone who is doing this part of the workflow has a couple of downstream systems they’re taking action upon. The promise of agents is to convert workflows like these to short conversations. These are a few key elements that we’ve found very valuable when thinking about good initial use cases for agents:

  • Well known agent “persona” – Many of these job functions, whether in HR ops, sales ops, collectibles and receivables, etc. are all very structured jobs
  • Well known standard operating procedure – This could include manuals, handbooks, trainings, material videos, or even an exam that somebody has to take to demonstrate competency in that particular function. These are all really good materials for agents to train on
  • Well specified intents and outcomes – You need a well specified outcome that tells you whether the thing that you wanted to do actually happened (or not)
  • Auditable for completion – Having clear outcomes makes this all very auditable for completion. You will be able to say: this is how the completion happened, these are each of the steps the process actually followed, here are all the calls to the external systems, the prompts fed into the LLMs, the decisions that the LLM recommended, the actions that we actually took in the backend systems…etc. There’s a complete history and an audit trail that’s actually very relevant and important for a task like this
  • Needs enterprise context to work – This is a really key part: agents must be able to access your enterprise context in a meaningful way, whether it’s structured, unstructured, etc. Humans aren’t thinking about the data model, they’re thinking about the question they need answered and that is what you need to facilitate
  • Actions are consequential – Actions are also a part of this. It’s about being able to absorb context, apply a set of procedures to evaluate what needs to happen, but then also being able to take those actions in these downstream, external systems and publish updates so that other systems and people and other agents can consume them

These are the elements of a good use case. It brings all these kinds of attributes together and this hopefully gives you a sense of how you can go about considering which areas are amenable to this kind of cognitive or intelligent automation.

Audience Question: Do you have any real-world examples of the kinds of impact metrics you’re seeing? Whether that be efficiency, latency, consistency, quality, or something else? How are you measuring all that and what have you seen in the real world?

First, you want to define high ROI use cases that you can execute against quickly and efficiently. For example, instead of looking at point to point automation for invoice payment processing, where the current automation is either an ERP or an RPA, you want to think about how you can instead enable an end-to-end order to cash cycle where you can have many intelligent agents that are executing tens of thousands of invoices and HR engagement workflows. The key is the size and scale.

Sema4.ai today is providing customers with a runbook, or gallery, where these templates are available that you can either leverage and build upon (or you can bring your own new best practice standard operating procedure). This is all managed through the runbook with ample security, governance, and lineage standards. But this isn’t just about executing those automations, but also having the AI intelligence to remediate the changes in automation as conditions and events change or evolve. The goal is for the agents themselves to in fact help define what ROI efficiencies there are and where the most value capture lies.

Audience Question: From an interoperability standpoint, it almost sounds like we’re getting to a stage where agents are talking to agents or systems are talking to systems, and I’m very curious if you foresee this being a closed loop like, hey, we express intent our own way or we express the logic our own way, versus some sort of standardization on how it is actually expressed. Just like we have programming languages to express logic, correct? Do you foresee any sort of standardization there?

You have three really good points that you make. This is a really good perspective for us to take into account. All the parts of our cognitive architecture are natural language based, because we believe that that’s the best way to interact with agents. That’s also exactly the way we interact with humans today. So if you think of how our users interact with our agents and assistants, it’s through the messaging paradigm, through tasks like an inbox, through chat, through Slack and teams, and through APIs. This naturally leads to composability.

And so to your point, how does this language give you the flexibility of interoperability while also giving you the precision of knowing that you’re talking to the right domain agent? For example, you don’t want to ask a physics question to a travel agent. We believe that there will be a next level of higher order semantics schemas that will come into play here.When we were talking about the galleries of agents and the templates, these are all ways for us to start to publish metadata that helps us identify the agents that are appropriate for a given interaction. But we do believe that multi-agent is a composability problem, not a turn of the crank where you need to do something very different.

LLM Trust and Safety – Vin Sharma, Vijil

Today, you often see enterprises hesitate in front of the potential of new technologies, particularly open source technologies, like foundational models today vs. their deployment in production (at scale, on systems in the real world).

That has been the motivating factor for Vijil. My goal is to help infuse trust and safety into agents, as they’re being developed, rather than as an afterthought. And I agree that as this plays out over the long run, we do expect a world in which there are many different types of agents. There will certainly be agents personalized to individuals, as well as agents customized to large organizations. And most certainly they will be interacting with each other, although there will very likely be many intermediate agents that rest within the public sphere, some of which are actually quite well-behaved and normal, but there may be others that are antagonistic and hostile. We won’t be able to assume that the world in which the agents interact with one another will necessarily be a benign and friendly world.

There will almost certainly be bad actors who build bots that are designed to be harmful to other agents and to other users. And so that’s part of how we see this world at Vijil. But even looking at today, before we get to that end state, there are still a variety of risks that must be addressed specifically at the model level. So I think in large part, one of the reasons why I have a particular take on this is that we see models as the first class citizens, and we plan to build on top of the data protection mechanisms that Rehan at Securiti.ai is building. But even if you’re focused specifically on the model, there are many different risks and lines of attack facing model developers today.

If you look at the history and the literature on the subject, there’s a ton of work on many potential ways of attacking ML systems, ML models, and the data that drives them. And now it’s true for LLMs as well. We’ve looked at the full taxonomy of attacks and possible ways in which these models can be compromised. But ultimately, we see today’s approaches around the state of the art for LLM security and safety to be focused on this inbound vector of prompts and prompt injections, with the response being to block and filter them through LLM firewalls or guardrails that prevent harmful outputs from going back out to the end user.

Perhaps some amount of alignment based on reinforcement learning with human feedback can put a friendly face in front of this kind of vast LLM behemoth. But underneath the hood is a monstrous entity that has been trained across the entire internet, for better or for worse, to produce whatever it is that it produces. This isn’t an area where we see as much attention being paid fundamentally to the nature of the model itself.

And that’s why we think that the way to approach this should be two-pronged. Certainly the approach of finding vulnerabilities to attack within models and detecting propensities for harm via some kind of an evaluation or scanning mechanism is important. Our approach is a red team and a blue team that work together to find and fix vulnerabilities and other negative propensities on a continuous basis, constantly adapting to new attacks. This must be done fast enough as to where it doesn’t block the model developers and the agent developers from deploying their models into production.

The way we built processes like this into the AWS AI organization, where I led the deep learning team, was to really bake in the classic “shift left” of security into the development process for AI and ML development. And you’ve seen trends or movements like ML SecOps reflect that position. But in some ways, I think even with ML SecOps, it feels like the point of insertion is reflected in the name, and the security should go even further left into the design to the development of the model itself. And so we see this coupling of the red team and the blue team as something that you bake into the agent and the model development process. And so what we started out with was just this fundamental idea that you can’t improve what you cannot measure. So you have to first measure trustworthiness. And today trustworthiness is barely well-defined, let alone measurable.

So we started out by building metrics and putting together an evaluation framework that lets you measure the trustworthiness of an agent, RAG application, or the model specifically inside it, with the capacity to test this at scale, quickly, so that it doesn’t interrupt your development workflow. Step two is creating a holistic envelope that protects the model and agent from both inbound and outbound issues. The final result of this analysis and evaluation generates a trust report and score for that LLM or agent – similar to how credit scores work on the individual level. It has eight dimensions today, and is coupled to performance so that the model or agent’s competence is tested as well (ultimately, these are some of the trade-offs we often see developers make in practice). If you improve the robustness, for example, it’s possible that when you train the model, the robustness may improve, but its accuracy may be reduced. So, the balance of these eight dimensions must be taken into consideration while developing a high competency model.

Audience Question: Is the scoring dependent on the use case itself? Or can you have one trust score that applies across all use cases?

I don’t think there will ever be a situation where you’re done with the evaluation. That being said, I do think that individual evaluations can be extended and customized. They’re frequently use case specific, although there are some common patterns across model architectures and model types. For example, you could start with a set of common goals that attackers might have when trying to disrupt the operation of an agent or a model. So you can identify a hundred or so of these goals and they’re fairly use case agnostic. We should be testing for these types of potential attacks from any source under any condition.

But when you look at more specific use cases, perhaps a bank chatbot, the bot in this case is representing a customer service agent that has access to a backend database, and perhaps isn’t enabled to write to the database, but can certainly read from it. So you’d want to ensure that it represents the organization’s policies and standards for customer interaction, that it protects the organization’s brand identity, that it’s not recommending other brands for simple questions, and that it doesn’t disclose personally identifiable information. So there’s a bunch of things that are unique to what a bank chatbot does. That would mean that the evaluation of the bank chatbot is highly customized, but at the same time as an agent, as an LLM model, I think it’s vulnerable to a number of different attacks and you would do a broad range evaluation for that purpose.

Audience Question: How do you weigh the various elements in the model? What’s the math behind that and how is that controlled? I assume as you progress your business and get into the really large enterprises, you’d have an entire policy driven governance model so that corporate admins could skew things towards one of the eight elements from a weighting standpoint? Is that the direction you’re going?

That’s exactly right. We already do this today. We have a weighted means. So right now, the weights upon shipment are equal. But we expect our customers to adapt and modify things for their own use case valuation frameworks.

Audience Question: How do you establish trust in the trust core? Why you, and why your sources of explainability?

The report itself and the measures that we use, as well as the individual prompts that produce the particular responses, and the evaluations, are publicly available to our customers. So we’re transparent in the evaluation process. That’s not where we’re withholding IP.

The core of where we see Vijil evolving is in building blue team capabilities that fix those issues, which we think is the much harder problem. Almost every paper on adversarial machine learning today is about finding the upper bound of attack vectors that you can assault a model with. There’s much less out there about building systems that can defend the model in-depth or build in-depth defense mechanisms more specifically. So that’s what we’re focusing on. We’re happy to share the specifics of how the model is evaluated and adapt it as we go along.

 

Elvis Cernjul, Chief Information Officer, The Ubique Group

Today we welcome Elvis Cernjul, Chief Information Officer at the Ubique Group, to The CXO Journey to the AI Future podcast

Elvis is a distinguished leader in the field of information technology and operational excellence, combining a rich academic background with extensive professional experience. He holds an M.Sc. in Technology Leadership from Brown University (as of May 2024), an MBA, and a B.Sc. in Information Technology. Elvis has excelled in multiple high-level roles including Chief Information Officer, Chief Operations Officer, and Chief Information Security Officer. His career is highlighted by his skill in transforming IT infrastructures, pioneering digital marketing strategies, and leading teams in high-stakes environments. He is currently making impactful strides at The Ubique Group while continuing to drive innovation and operational excellence across the tech industry.

Question 1: First Job: Could you talk a little bit about your background and how you got to the position where you are now?

So, the Ubique Group is a collection of brands that source and sell commercial-grade office and home products through channel partners like Wayfair, Walmart, Amazon, and others, along with B2B and D2C models. Prior to my tenure here at Ubique Group, I served in a variety of roles including a stint as an Army Ranger, believe it or not, and eventually wound up as both a CIO and COO.

My background includes about 35 years of leadership experience. 25 of those are within technology with a particular focus on the retail space.

Question 2: Generative AI: Everybody’s talking about it. You’re an operator of a large organization and you probably have other things to worry about, but we in the investment world think it’s all the rage, of course, and arguably it could be. What do you think about it from a priority standpoint?

I think that for our business, and the retail space as a whole, it’s a top priority. It provides such a leap in capabilities that if it’s not embraced in some capacity you’ll fall behind your peers.

It needs to be a part of every conversation that involves the future state of the business. My roadmap here at the Ubique Group revolves around leveraging the AI capabilities of any given platform. It’s not even really a competitive advantage anymore, it’s quickly transforming into a commodity.

And as we all know, machine learning and AI have been around for a while, but ChatGPT, Copilot, and some others have ushered in the accessibility of large AI language models.

Question 3: Early learnings. I presume you’ve been experimenting with it and done some personal testing with ChatGPT. I’m sure some of your teams are using it. I’m sure bosses or business units are coming to you with ideas. So what are some of the early learnings about it?

We’re thoughtfully, cautiously, yet excitedly approaching AI through ChatGPT. Today we’re subscribed to co-pilot and we’ve been kicking the tires on Einstein through Salesforce and a variety of other solutions.

A couple of initial learnings:

First, it’s really difficult to get a peek behind the curtains on how companies that are offering AI solutions plan to use our data, and what true safeguards are in place.

Second, the results today are not always accurate, detailed, or polished.

All that being said, we’re tackling a few initiatives where AI is at the forefront of discussions, including new customer care processes, call center systems, content and copy, business intelligence… and my development team is using it quite a bit

When managing change of this magnitude within an organization, especially with AI, it’s important to position it as a complimentary service and not a replacer. So I’ve been trying to balance it with that message as part of my language to the organization.

Question 4: New metrics: If you could break down the use cases you just described, how would you think about the metrics applied to each of them? You mentioned developer teams. Are you able to realize that impact? How are you thinking about metrics?

One thing that I’ve seen gain a lot of traction is time to market on development through AI-enhanced coding.

One of the intangible ones that we’re just now playing with is bubbling up business metrics through conversational AI. So it’s asking the data to reveal things that the business may not yet know to ask.

When ChatGPT released advanced data analytics, I started taking data from FRED BLS and some other public information, and throwing it into a spreadsheet along with Google Trends word. Then, I started asking ChatGPT to come up with correlations that I specifically wouldn’t even think to ask. I was amazed at the results that it was providing.

So, as a next step, I started showcasing this functionality to other key stakeholders within the business, and that led us to adopt a budget to pursue improving our insights and data analytics hub and expanding that to a data lake with some generative AI capabilities built into it.

Question 5: One of the questions that CIOs often ask is “buy versus build.” You mentioned some investments in current tech today, including Einstein and others. So the existing providers, I’m sure, are coming to you with their AI offerings. Does that give you an opportunity to buy versus build? Is there a combination of that? Or do you choose one or another?

We’re strictly focused on leveraging what our vendors offer. So we’re not staffed to create or train our language models. It’s not necessarily a differentiator for us. So yes, we’re closely partnering with our software providers and platforms to provide that means.

Bonus: Are they giving you an opportunity to convey your use cases, and design toward your use cases, so to speak? Is it evolving that quickly? Do you see a receptivity on that side?

Yes, for sure. With our partners at Salesforce along with the data lake that we just implemented, it’s a pretty close relationship.

Question 6: Gaps: Where are the gaps you’re seeing today? Undoubtedly you see some divots in the field, things you need to address. Is that a people issue? Is that a process issue? Or is that a technology issue?

I think it’s a little bit of all of the above. It’s training, knowing how to ask the AI the right questions, and how to identify errors. There’s also staffing. There’s no position, at least in our organization, dedicated to AI. But that is something I see changing throughout the industry within the next few years.

One of the key issues, as far as gaps go, is how to best wrap guardrails around AI so that users aren’t accidentally sharing sensitive information.

Bonus: And is that just educating the team on how to best use it?

It is, yes. We’ve had a few educational opportunities for employees, and we’ll continue down that path as well.

But I also struggle with how to wrap a technological guardrail around it so that we protect our assets, such as our customer data.

Question 7: Responsible AI: How does an organization like yours manage to be responsible when it comes to leveraging AI? How do you think about it? What actions should you take as a CIO?

Responsible AI…Defining it is an interesting endeavor that for me includes data privacy and security. Especially since now, we’re a global company, we have different regulations to adhere to. This includes our level of transparency with regard to using customer information. It also includes mitigating biases.

I believe everyone in retail wants to ensure that things like product recommendations, pricing, and marketing strategies are fair and equitable across all demographics. The same type of concerns extend to the internal consumption of data. We want to avoid a learning model that amplifies and doubles down on inaccurate results, which I’ve seen.

But I’m proud that we’re championing ESG within our organization. Part of our thoughts, or at least my thoughts, about wrapping responsibility around AI, is ensuring that we’re considering the use of energy-efficient AI technologies.

Elvis Cernjul is a distinguished leader in the field of technology and operational excellence, combining a rich academic background with extensive professional experience. Holding an M.Sc. in Technology Leadership from Brown University (expected May 2024), an MBA, and a B.Sc. in Information Technology, Elvis has excelled in multiple high-level roles such as Chief Information Officer, Chief Operations Officer, and Chief Information Security Officer. His career is highlighted by his skill in transforming IT infrastructures, pioneering digital marketing strategies, and leading teams in high-stakes environments.

Elvis’s unique leadership approach is shaped by his military background as a Combat Veteran and Ranger, bringing a blend of strategic discipline and resilience to his corporate endeavors. Currently making impactful strides at The Ubique Group, Elvis continues to drive innovation and operational excellence in the tech industry.

Announcing the 2024 Mayfield Farmlink FIELD Fellows

I’m excited to announce the second cohort of the Mayfield Farmlink FIELD Fellows: student leaders dedicated to an 8-month, action-driven pipeline that educates, immerses, and enables changemakers to create an impact across different segments of the food system. We have a long tradition of philanthropy at Mayfield where we partner with community organizations addressing barriers to education, promoting diversity/equity/inclusion and providing innovative solutions to food scarcity. We are great admirers of The Farmlink Project, which is catalyzing the next generation of ambitious students to create sustainable solutions within the food space and enact innovative change.

We hosted a welcome gathering for the new Fellows during which we had a lively discussion around startups, founders, life, and AI. Here were a few of the key takeaways:

  • Be yourself in everything you do.
  • Always have a learning mindset.
  • Don’t be afraid to ask questions.
  • Make sure you have mentors.
  • We’re a people-first firm, so we’re always spending time looking at the “people” aspect of company building: What motivates someone? What are they trying to do? Are they just pitching me or are they authentic? How do they treat people around themselves?
  • Building a company is like running a marathon, not a sprint, it’s hard. You need to spend the time and have grit to create greatness.
  • The chance to fail is always high in any startup. It’s a pattern recognition business. It’s a team sport. Know what you’re best at and surround yourself with excellence with other team members and have seasoned mentors and coaches.
  • Be first principles in everything and treat people the way you want to be treated. Practice respect, empathy, honesty.
  • AI is your teammate. Don’t be afraid, it’s the same as calculators and excel macros used to be. Humans are smart and always figure out how to leverage these tools. Think of AI as another horse – a new one that runs faster, is smarter and can reason. It will always need a jockey to ride it, who will be a human!

Naveen Zutshi, CIO, Databricks

Today we welcome Naveen Zutshi, Chief Information Officer at Databricks, to The CXO Journey to the AI Future podcast. He joined Databricks in January 2022, and is responsible for their IT solutions, driving transformational programs to help the company scale its consumption-based business globally. He’s on the board of advisors for several fast-growing startups like Propelo & Torii, and more established companies including Zoom and Rubrik. He’s also an investor in many fast-growing startups and early-stage VCs.

Question 1: First Job: Could you talk a little bit about your background and how you got to the position where you are now?

I’m a computer engineer, so I love coding. However, I realized early on that maybe I was a slightly better manager. So I moved into management, mostly engineering management, for a while, but stayed technical, where I found my true calling.

In between, I’ve been at startups and in large companies. What’s been interesting is that sometimes you can take what you learned at high tech companies and leverage that in other environments such as retail. For example, I remember the work I did at Gap because it was such a huge opportunity. After all, there was a lot of legacy, and there was huge improvement to be made both on the e-commerce side as well as on the store side. In every job I’ve done, it has been amazing to work with so many smart people, which includes some of the incredible startup founders I’ve had a chance to partner with.

Question 2: Generative AI: Everybody’s talking about it. Maybe you can start by sharing your own view of how important CIOs should be prioritizing Generative AI. Is this a unique opportunity? We’ve all been through the mobile era, the social era, and the cloud era, Is this different? Could you put a little context around Generative AI?

I can tell you a little bit of my own story. Obviously, for me, the number is 22, because that’s when ChatGPT came out. I spent hours and hours asking all sorts of questions and getting some amazing answers. The initial hype was incredible, and I think, at least for me personally, I felt it was a seminal moment similar to, or maybe stronger than, any one moment that happened for mobile or the internet.

However, over the last year we’ve started to see what it can do, and right now there’s a disillusionment that we’re going through as an industry, but all the same I still feel incredibly confident that the future is bright for GenAI. In the meantime, there are a few thorny problems that we need to solve first.

Question 3: Early learnings. What are some of the early learnings when you think about Generative AI use cases there internally at Databricks? I’m sure you talk to CIOs in your role there. What are some of the early findings about how to become excellent?

The first use case that I’ve seen the most both internally as well as with our customers, has been the copilots, typically focused on software. We have implemented it for all our engineers and R&D.

We see a lot of our customers have rolled out different versions of it whether it’s Github, Copilot, Copilot X, or other versions of it. And they’ve seen various amounts of benefits, anywhere from 10-40%.

And I think there are other interesting use cases around software development. For example, test migration and test unit case creations, e.g. migration of codes from one language to another. That’s been a really good one that I’ve found for even applications like Salesforce. So, that’s starting to become a more established and more mature use case, and companies are mandating that every developer use Copilot in their daily work, which has become the new standard or the new expectation around productivity.

The second real use case I’ve been seeing is summarization. We want to drive summarization, and reduce hallucinations using RAG. And ultimately, we’re starting to see this action as intent-based. You know, where you take an intent, and have a multi-step process that actually achieves that intent. And I think that’s hopefully where the industry is moving. We’re seeing some early examples of that in the B2C space. But I’m assuming those will also translate into the B2B space.

Question 4: Gaps: So if that’s the future, where we have intent-based agents actually resulting in true productivity or workflow redesign, what are the risks in getting to that? What are some of the obstacles, issues, or maybe even gaps in the technology that you believe need to be addressed? I know this isn’t a technology-only issue. What are the headwinds overall?

Let’s start with RAG use cases: The biggest obstacle today is still reliability. You measure the reliability of answers coming back and in an enterprise setting your reliability needs to be pretty high, even close to perfect. And with GenAI use cases, you can reduce hallucinations, but it’s hard to achieve perfection. That’s one area.

The other area is unfortunately still the completeness of data. A lot of customers still talk about that. I don’t have all the data sets in a common form. For example, we work with a large air carrier. Do we have all of it in a data lake or a lake house paradigm like that large air carrier? Do we have the access control setup? So that’s another area. For instance, if I took out this unstructured data, and then ran it against a RAG model, but then you asked it a question, and I asked it a question, it should be privacy-preserving for you versus me.

And then, if you think about each summary, or each step of the process, if that reliability is very high, can I chain it together into a common model? Where can I drive content into this sequence of steps? For example, could I have it flawlessly plan an entire summer trip if I provide all the right parameters? Would it book the hotels, the activities, the flights, the meals?

I think that the validation step with humans in the loop is something that will still be required before it becomes truly autonomous. I think that we’re a ways from a truly autonomous state.

New metrics: We’ve watched the hype slow down as people don’t see the results they want. What are the necessary success metrics to prevent people from falling into that trough of disillusionment?

One thing I would say is to think about AI as a whole. What has been the success? The success has been dramatic, right? So there’s a dramatic success in revenue growth and profitability.

Let’s say you have so many use cases across AI overall that are now in production. And companies have seen revenue growth as well as profitability improvements as a result of AI. In fact, many companies are just AI-based as a result.

The same thing is going to happen with GenAI use cases too. I think the productivity benefit is the biggest one. If you look at the McKinsey report, it says 30% of all work could be automated by GenAI. I’m not sure if that is true or not yet. It’s a big number, but I would expect to see people start with some core use cases where you develop some level of success at scale.

Question 5: One of the questions that CIOs often ask is “buy versus build.” The idea here is that this is a new technology we need to learn. We’ve got people and staffing issues. But there are also existing vendors, and they’re offering AI-capable functions. Copilot, for example, as you mentioned. Do I do a little of both? Do I become an expert at building? What’s the “buy versus build” thesis that you believe we should be thinking about?

The way I think about this today is that there are two vectors: one is transitory, and the other one may be more permanent.

The first vector is thinking something like this: “Hey, if the field is still nascent, my SaaS vendors or my new startups can’t achieve scale yet, or they’re not working on the problem that I’m working on. Maybe I can experiment and build something in production. Maybe that will last for a year or two. By that time my SaaS vendors will catch up, and I will leverage their products.”

The second vector is that I have proprietary data, or have proprietary algorithms that I want to use against private data. And I don’t want to share this data with my SaaS vendor. I want to build it on a framework that is my own platform.

And I think there will be continued use of that, you know. That is Bloomberg’s use case. And it’s not just tech companies. I think non-technology companies will also have some use cases like that.

And I think on the buy side, what we’ve primarily seen is newer vendors coming into play whose GenAI capabilities we can leverage. And that has been very, very interesting. And so we’re using some buy as well and internally we’re using both buy and build.

So it’s a hybrid model and my expectation is that over time, with internal use cases, you will do more buy than build, but you will have some level of build done for sure.

Question 6: Responsible AI: This is a concept we’ve heard about in the press. There’s this idea around potential bias, hallucinations, privacy risk, as you noted before. How do you think about it? What advice would you give to a CIO?

I’ll mostly speak to B2B companies. I think B2C use cases may have additional privacy restrictions. So for me, customer data is sacred. How are we ensuring that the customer data is privacy-preserving? We’re talking to the customers before we use their data for training our examples or internal custom employee data. There is personalization, and there is privacy on that data as well. So you have this confidential and sensitive data. And in practice, you should have the right classifications on that data, so you just need to preserve those classifications back into your AI models as well. That’s one area.

The second area is policies. What kind of policies have you established? Make sure that you have really robust and clear policies established with legal and that you’ve educated your teams and your employees on those policies. There’s a lot of hype to use these tools, but you don’t want to inadvertently use them in the wrong manner.

Finally, having a good governance model in place is important as well. Depending on the use case, you may still have some hallucinations, and that could be acceptable, but in other use cases complete precision may be required. Understand what kind of use case you’re actually delivering. What kind of information do you want to create?

Bonus: Do you see an organizational role for responsible AI? And if so, on whose shoulders does it rest?

It’s hard for me to say right now. Today, legal, security, and IT are working together on this, but I’m assuming that in other companies, especially in B2C companies, that might become a specific role.

Naveen is currently the Chief Information Officer of Databricks. In this role he is responsible for Databricks’ information technology solutions, driving transformational programs to help the company scale its consumption-based business globally. Naveen is on the board of advisors for several fast-growing startups like Propelo & Torii, as well as established companies like Zoom and Rubrik. He’s also an investor in many high growth startups and early-stage VCs.

Naveen’s experience spans software development and infrastructure, from leading organizations from Fortune 500 companies to tech startups. Prior to Databricks, Naveen was CIO at Palo Alto Networks for the last six years, helping the company move to a cloud-based business, and integrated over 17 companies in that time period. Prior to Palo Alto Networks, Naveen worked as Senior Vice President at Gap Inc. where he was responsible for the company’s infrastructure, operations, and information security organizations. Before Gap, Naveen spent time at SaaS startups or high-tech companies like Cisco.

He earned a BE in Computer Engineering from Bangalore University and an MBA from the University of Arkansas.

Monica Khurana, CTO, Dodge & Cox

Today we welcome Monica Khurana, CTO at Dodge & Cox, to our “CXO Journey to the AI Future” podcast. She joined Dodge & Cox in December 2017 as a visionary who has led the industry in creating products and solutions that are leading edge and sometimes the first of their kind in their respective spaces. Monica is a seasoned executive with over twenty-one years of management, operations, product, technology, financial planning, digital marketing, and cybersecurity experience.

Question 1: Current Job: Could tell us a little bit about your role? How long have you been at Dodge & Cox?

This is my seventh year. After a stint in technology, I moved into finance back in early 2000. I had to do a lot of data analytics, both in terms of looking at market data, as well as looking at client retention and client data. At this time, I built my first performance attribution system to evaluate how investment funds are doing and what was driving their performance. This was a huge success, and I was asked to manage hundreds of billions of dollars.

I was hired here to do something similar, which was to bring new capabilities from a technology perspective, including cloud capabilities, generative AI, and many different data and analytics solutions. The goal has always been to continue improving the performance of our funds.

Question 2: Generative AI: Everybody’s talking about it. You’ve been managing technology for a number of years, how do you see Gen AI as different? And how much priority do you think we should be putting on this?

I think we’re looking at an inflection point from a technology perspective. You and I have been around for a long time, we saw the internet, saw mobile, saw crypto. I think this is another inflection point.

However, generative AI isn’t new. It’s been around for a while, but ChatGPT made it so accessible to everyone that most of us, by experimenting and trying little projects to see what capabilities we have, saw that the promise of the journey is very real.

You no longer need to be a technological person to do a lot more than what you could do earlier. So we’ve been spending time considering: What value do we want to add? Where do we want to apply it? And, what projects are we working on today that could use it? Do you apply it to looking at investment data? Looking at Alpha generation? Looking at benchmark data and other market data providers?

The second bucket would be clients. Are we retaining our clients? Are we supporting them? Are we answering their questions quickly enough? We definitely need tools around that.

The third bucket is internal productivity. This could be pre-market commentary, code testing, or reviewing disclosures. There are many different ways we can get more productive.

Right now we’re mostly just dealing with where to start and how to prioritize all of these different applications.

Question 3: Metrics: Obviously, new technology always has to be justified. How will you measure ROI on this and choose one use case over another? What sort of metrics guidance would you give other CTOs?

Prioritizing all this is very complex. How do you invest and where do you deploy your resources? There are a lot of parameters you can look into, but a good one is savings over time. So, how do you quantify those savings?

Another good metric to consider is time. Where will you redeploy time that gets saved? Is it going towards strategy? Towards driving revenue? Towards product development? So it comes down to who’s going to get the value out of it.

The second piece is around the return on investment. What will it take to build a model? What will it take to maintain it, from the computing, research, and storage perspectives? And then, when do you expect a return on investment? One year, two years, three years?
There are certainly some low hanging fruits like chatbots and helpdesk. In those use cases, we expect a faster return. Things like product will take longer.

The third piece is going to be improving the employee experience. Ideally we want employees to be able to spend less time doing mundane tasks and more time doing creative, innovative, strategic things. The hope would be to see improvement in employee satisfaction and stakeholder satisfaction over time.

So it’s a complex mix, and we have to look at all of these together to figure out how to maximize the value.

Question 4: The ‘Buy versus Build’ concept: You talked a little bit about the effort of cost analysis, because with AI (and any new technology) you’re having to both teach and learn new products, new technologies, and new services. This could even include building new teams and resources. So is there a different level of quandary around buying technology that’s prepackaged versus building your own because you have custom requirements? What’s the ‘Buy versus Build’ thesis that you believe we should be thinking about?

I think this is a complex problem. All our vendors and key strategic partners are investing in the space. And we’re trying to see where they’re going and what they’ll be able to offer.

The second component is around data. They’re all building these capabilities on the data they own and the data they have access to. So we are seeing market data providers. We’re seeing auto management systems, all of them building in different capacities.

So our goal here is to see what our vendors are going to provide. And then the question turns to our own data: What do we do with that? Should we provide it to one of our strategic partners and have them build on it? Should we be applying these third-party LLMs and doing some analysis on our own?

Additionally, there’s this whole thing around hallucination, false outputs. We’ll also need to evaluate the output coming out of our vendors and strategic partners, not just our own AI resources. At the end of the day, we’re still held accountable to our shareholders and clients.

Question 5: What are some of the issues, or maybe even gaps in technology, that you believe need to be addressed? I know this isn’t a technology-only issue. What are the headwinds?

We’re already seeing a renewed focus on data. I mean, we’ve already had data governance, and data quality in the works for a bit. I think the importance of data has become more real now on account of generative AI. Understanding your data and having it structured in the right way so that these new AI tools can leverage it is important.

The second issue is around security. Risk of data loss is real. So how do you ensure that your contracts with vendors cover and protect your data?

And then there’s a risk of hallucination as I mentioned earlier. How can you really ensure accuracy around all this? Just like cybersecurity insurance, are we going to see more insurance providers providing protection around hallucinations? There are definitely some tea leaves swirling around that.

You made a good point about the human side of things. We’re going to see more AI operationalization meetings at the management layer and the leadership layer. Everybody needs to understand this. So what are the shifts in skills that are needed? Be it in terms of education, or be it in terms of bringing about some of these tools?

Getting started is also a bit tough. You want to start small, see some success, and then expand. So how do you bring that growth mindset into all this? There’s still a lot of figuring out to be done, and the challenges related to that.

And the last piece is regulation. The EU, Biden, as well as the India Digital Act, and the Canada Act. We’re still trying to figure out where that’s all going to land.

Question 6: Responsible AI: How do you think about it?

I think that we all have to focus on it significantly both from the perspective of the biases and the data it is being trained on. How much can you rely on the quality of your outputs when trying to apply it? Recruiting is a good example of this: you have to be so careful that no bias is introduced.

The second piece also is around looking at IP infringement. The verdict is still a little unclear on where some of this will go. I mean, is training on publicly paid available data a good thing? Not a good thing? And how do we make sure we’re not negatively impacted?
I think in 2024 we’re going to see a lot more clarity starting to emerge.

Bonus: What do you think about this market opportunity? You said it’s a high priority. But let’s say you’re standing in front of a large audience – your board, your leadership team – How would you tell them to think about the AI market right now?

At the end of the day, we’re at a big inflection point. AI has democratized IT across the workforce, and that is powerful by itself. The question now comes down to how we harness this power.

I think you and I talked in the past about what we would see in the infrastructure layer, the cloud layer, and the data layer. We should expect to see gen AI embedded into every aspect of everything we do, from the consumer side, to the investment side, to the data side. I can’t imagine a facet where we would not expect to see this.

The question is more around timing. How much to expect by 2024? And how much to expect by 2027? Where regulations are heading will be important too. AI is already important today, and I don’t think there’s any going back. It’s more about how far this will take us.

Monica Khurana has a diverse work experience spanning over several industries and roles. She is the Chief Technology Officer at Dodge & Cox and serves on the board Committee for T200 (a non-profit promoting women in technology).

Monica also held leadership positions at Guardian Life Insurance-RS Investments/Victory Capital Management, Cornerstone Research, MUFG, and MNM Partners Inc. She has been in Chief Information Officer and Chief Technology Officer roles since 2007, and has been responsible for various strategic initiatives, such as integrating acquired firms, transforming technology platforms, and aligning technology with business goals. Earlier in her career, she worked at Barclays Global Investors, CareCore/Varian Medical Systems, HP, and the University of Missouri-Columbia Hospital and R&D department, where she led projects in areas such as asset management, healthcare technology, and patient care systems.

Monica holds a master’s degree in computer science from the University of Missouri-Columbia, attained from 1996 to 1999. Monica also has a master’s degree in industrial engineering from the same institution, earned from 1996 to 1998. Prior to her postgraduate studies, Monica completed her Bachelor of Engineering in Industrial Engineering (gold medalist) from the National Institute of Technology (REC Jalandhar) between 1990 and 1994.

Innovation: The State of Play and Current Best Practices

In partnership with Peter Temes, Founder & President at the Innovation in Large Organizations Institute, we recently interviewed around 60 innovation leaders focused on guiding change and adding value to their respective large organizations. This was a varied cohort including leaders in CPG (Coca-Cola), Technology (Microsoft), and Healthcare (University of Michigan Health System).

Perhaps unsurprisingly, we found a remarkable variety in the level of influence, org structures, and responsibilities of innovation leadership, and far less consistent practice, or even agreement on what best practices are, than we observed five and ten years ago.

And yet several major shifts are clear, as noted in our key findings below. We see fewer firms establishing or supporting Chief Innovation Officer titles, and more innovation practices landing in operating roles, or shared-services roles like Data and Strategy. Overall, when a Chief Innovation Officer departs a large firm, the most likely replacement is nobody.

At the same time, many more mid-level staffers have “innovation” in their job titles. In some cases, this is positive. At firms like AT&T, DFW Airport, and Accenture, innovation as a mindset and mission has been successfully established as part of many people’s jobs, rather than as a special-purpose role. In other cases, the spread of the title is more lip-service than substantial positive change.

Most commonly, the role of innovation has begun to consolidate: its core function has become the exploration and deployment of new technology towards existing operations and goals. This is especially critical in light of the boom around generative AI – many innovation teams today are heavily focused on integrating this exciting new technology across the broader organization. This is trend is likely to continue, unless the boom loses steam. We expect this phase to last at least five years, at which point there may be a cyclical swing back to business-model and more strategic innovation activities which are typically downstream of the adoption of new technology.

The Process:

We conducted deep interviews with about 30 innovation leaders in large organizations across industries whom we recognize as strongly effective in their roles, and group interviews with about 30 more senior innovation leaders. We define the head of innovation as the key executive who has most of the responsibility for the processes that identify, test, and hand-off early-state new products, new business models, and new operational processes.

Key Findings:

  1. Large organizations are less likely to have a Chief Innovation Officer or Senior Head of Innovation today than they were five years ago – and yet more innovation activity is alive and growing at lower levels of the org chart.
    • At one large US-based insurance company, after their Chief Innovation Officer departed, the Chief Digital Officer took over the function, and eventually saw his title change to Chief DIgital and Innovation Officer.
  2. Very few innovation programs are pursuing “disruptive” innovation. The vast majority of innovation leaders at large firms rely on existing functional leaders to shape the innovation agenda, and to supply or approve innovation projects.
    • One innovation leader at a large food and beverage company explained: “I don’t try to sell anything internally.”
    • A lead innovation executive at SAP shared: “We don’t have appetite for radical or step-change innovation. SAP doesn’t need it…we call [the function] innovation, but it’s more like let’s quickly assess what should be the first proof points and applications of LLMs, as an example.”
  3. Business-model innovation has dropped off the agenda for innovation leaders across industries, quite dramatically. Understanding and deploying newly maturing technologies – AI and LLMs in particular – has replaced business-model innovation as the top agenda item.
  4. Innovation functions are viewed internally as helpers and drivers, rather than challengers, in most of the large organizations we studied.
    • At Google, one innovation director explained: “My group sits within the generative AI engineering team. I report to the VP of our conversational AI technology. We explore how we can take our domain knowledge and better understand the use cases. Then how we can triangulate that to customers.”
    • At a large financial-services firm, the VP of Innovation Development shared: “We are really careful to build a culture that avoids the spotlight, and is service oriented. We don’t want to take 12 weeks for a product that’s killed because the interest beyond our team isn’t there.”
    • At SAP, one innovation leader shared: “It’s really important to me that I can always point to the senior leadership team to say they’re ultimately in charge. Otherwise, whenever I engage with VPs of Engineering, or others, I’ll have problems with their engagement.”
  5. In most cases, small budgets for core staff and some operating expenses come through annual budgeting processes, generally under a CIO’s budget (in a few cases, from the strategy function or R&D), with more extended efforts funded by business units.
    • The head of innovation at a large financial-services firm explained that beyond his core team’s funded HR cost, “getting the buy-in from all the different businesses is the recurring challenge. Any initiative in terms of innovation has to come from the businesses – they have the budget.”
  6. Senior innovation leaders are more likely to be strongly credentialed technologists, and more likely to have risen up internally, reflecting the shift toward a service-and-exploration agenda, compared to the larger points of emphasis on disruption, business-model change, and organizational change.

The Four Clusters:

We found successful leaders clustering into four main groups, as follows:

  • Innovation as a strong, centralized, strategic function: AXA, Vertiv, Top Build, KPMG, HP (prior to shift), The Coca Cola Company, UM Health System.
  • Innovation as a shared service, with a modest strategic voice: HSBC, DFW, Salesforce (new iteration), UAB Medical Center, SAP
  • Innovation as a subsidiary to sales or product groups: Bloomberg, Salesforce (prior to shift), JLL, Microsoft, Harley-Davidson, HTC
  • Innovation as a subsidiary to digital/data: Nationwide, Humana, Centene, Vertiv, HSBC

On The Record Interviews

Big Four Accounting Firm Vice Chair for Innovation

I started doing this job that didn’t previously exist: Head of Innovation. The goal was building connections, finding out who’s doing what’s new, better and more forward looking, and linking things together.

There was no charter – there’s more of that now, but not compared to a traditional operating company.

The multi-level structure of our firm is a positive when it comes to the variety of what’s being done. The leadership of our partners is very important – they understand what the account teams and service teams are doing – and as a result, are better able to better see (and elevate) what’s new. This helps make connections happen, and helps a new best practice today become a standard practice across the firm tomorrow.

The measures of success for our innovation function are dollars and cents – customer pull, retention, account growth, and the firm’s competitive status in the market.

Chris Massot, Director of CEO Office Customer Co-Innovation, Microsoft

I’m part of a small team, very collaborative, and one of many innovation teams at Microsoft.
Our job is to build a business case for new innovation that come into our company at the highest level, from conversations with important customers at the highest level. Exit criteria from our corner of the organization is agreement at the executive level (and enthusiasm that the concept is correct).

We’re not interested in one-offs (single-purpose) and are instead focused on building things that are both scalable and extensible.

We’re often looking for thought partners inside our organization who can help us measure capacity, readiness and fit for our own teams, and we find those people mostly through word of mouth, experience, and interpersonal relationships. The go-find function, and the go-share + go-support functions are very important to our work across the company. There’s still a culture here that your success comes from relationship and influence, rather than hierarchy. If you need help, go get it. Not in terms of a “survivor alliance” kind of network, but more building a network of helpful people, and creating mutual value.

Luke Mansfield, VP of Motorcycle Management, Harley-Davidson

I was previously Chief Strategy Officer, and now my title is VP for Motorcycle Management, which means that I oversee motorcycles, parts & accessories, consumer insights, and innovation. Because of that structure, we’re able to decide whether or not something is an innovation program or not.

My innovation team has created strategies around both safety and technology. As a new product group, we also decide which investments into new technologies make sense (and which don’t). There’s an annual program around new product development where ideation is assessed by my team in conjunction with our commercial colleagues. We decide where the potential value lies and what returns are most likely. As a part of this process, I collaborate with design, engineering, and other development functions within the executive team.

Our engineers are good at assessing necessary spending and timing – you give them something and they can draw you a resource curve for something new. We created a forum for decision making, the Motorcycle Review Board, which includes the CEO, CFO, Heads of Engineering, Design, Marketing, and I. Our CEO has the tie breaking vote if there’s a stalemate. But in practice, this is rare.

Experimentation doesn’t often occur without the approval of the motorcycle review board. That’s the appropriate forum to assign resources based on likely returns.

I also run a parts and accessories review board. It’s a predictable business and scales with motorcycle sales, but the teams are now thinking about more interesting products than before. ANYTHING that can be on or around a motorcycle. A much bigger addressable market.

Apparel and licensing are managed separately, almost as a standalone business, not by me.

The finance and investing team is very focused on tracking the real dimensions of investment and returns on new offerings, working in parallel to my own team, this particularly applies to the strategic recapture of investment in new technologies or bikes, fully costed, and calculating actual long-term returns.

When innovation is a stand-alone program, it’s suboptimal. It’s far more effective when integrated within product functions. I made a conscious choice to rebrand myself from an innovation specialist to a growth specialist. Whilst people disagree on innovation, everyone agrees on the need to grow.

Faisal Zanjani, Head of Open Innovation, Experimentation, and New Business Models, The Coca-Cola Company

My organization reports into the innovation team in the Chief Technology Innovation Supply Chain Officer’s organization, which is a very large role.

We have an annual budgeting process, but when something important comes along that’s valuable or a new goal, we ask for more. We run very distinct challenges in a few different verticals – quality, safety, environmental, and our Sustainability Accelerator.

Sometimes we’ll have 35 experiments underway, sometimes we’ll have 10.

An experiment could be a proof of concept, a prototype, localizing a proven concept, or globalizing proof of concept.

We do a lot of dramatically new and different things, so we’ll have long persistence before a real handoff, to help shake out some of the risk. Often, these experiments won’t be on people’s radars or balance sheets yet, and work will be passed into larger-scale R&D groups.

Most of the time when we declare success, we’re ready for commercialization. We’ll handoff to the supply chain group, to procurement, to engineering.

Philippe Duban, Head of Transformation, AXA UK Retail

I report to the CEO, and consider myself a challenger as well as an orchestrator.

My job is to deliver the firm’s strategy. I am here to deliver outcomes which are defined across AXA. My budget gives me capacity, which primarily comes down to the ability to bring people in. Within my group, I allocate capacity and funding quarterly.

We do the POC (proof-of-concept), taking a standard approach, and test for two weeks or up to two months.

If it delivers what we hope, we don’t really hand off – we are meeting as a group with key stakeholders, so we have the product owner involved from the outset. There’s really no formal hand-off – their teams are working with our teams, more on the value and application.

There are things we don’t do because of no bandwidth – or interest – in those partners. We don’t staff anything that does not have the buy-in.

CIO at a Provider of Critical Infrastructure and Services for Data Centers

I report to the CTO. His remit includes the R&D function, while the innovation function is separate from R&D. I chose to set up this org, just a couple of years ago, to not have an innovation domain in and of itself, but instead to have little birdies in all of the businesses.

Each business has embedded champions and scouts, about 20 people who are expected to give up 30% of their time for forward thinking innovation (not the problems of today). The rest of their time they’re rubbing shoulders with clients and handoff receivers. Maybe 60% of those people were already in those domains.

The business case for new products or programs is built in part with a quad chart, the total ability to leverage a technology on one axis, and commercial readiness level on the other. While I have a small budget to get things moving, the BUs have to buy in – I have to make the case and they have to buy it.

VP, Innovation Development. Large Financial Services Firm

I report to the head of digital – the SVP of Digital.

We get a budget for a team annually, and don’t necessarily know which strategic initiatives will be funded by that budget. We have an executive advisory board – not governing for us, but providing a gut-check on the projects and directions we’re inclined to follow. It’s a very senior group and includes many of the handoff receivers.

Our group works a lot with data, doing the trend analysis and strategy for our 275 agile teams that are at work building out new and innovative products.

I serve as the technology leader for the product incubation group. We’re usually working about two years prior to market launch, identifying elements of the technology that need to be further developed. We build MVPs and test with a subset of the population.

We’re really careful to build a culture that avoids the spotlight and is service-oriented. We don’t want to take 12 weeks for a product that gets killed because the interest beyond our team isn’t there.

A lot of times we put things on a shelf. We’re good about monitoring the things we were either wrong about or the market wasn’t ready for. We monitor, and in many cases we’ll pick something back up and move it forward when the market is in the right place.

We’re matrixed – my technical team is a handful of people only, about a dozen FT focused on incubation, plus contract engineering. We’ve found that having full time engineers can be difficult because we so often have to change what we’re working on. Some engineers are frustrated by that, and a new project can simply demand different expertise. We do job-swaps as well. We work with Capgemini, Accenture, and a few others. So we have a small staff but lots of leverage.

Senior Executive in a Consumer Technology Company (50B+/year)

As Chief Technology Officer, I reported to the CEO and effectively played the role of Chief Innovation Officer, in partnership with a Chief Corporate Strategy and Incubation Officer. I ran the labs, and everything that moved into incubation came out of the labs, so we did a great deal of planning together.

The transitions between labs and incubation happened seamlessly, though the business units had a number of their own incubation units and shadow lab operations which they did not necessarily want to be centrally managed.

These phases were driven by core questions: What are intermediate tech milestones? Do people want what we’re building? Is there a pain point? Can I make money on this?

Ideators worked like project or product managers during incubation. They made requests concerning the kind of talent they need next. I always advocated a core team, an auxiliary team, and then corporate resources for each new project.

You need to keep the core team super tight and super nimble. Then, every once in a while, you need an SME. You have to be very fluid about the talent, otherwise it’s too visible too early. The plan was to keep the core investment and core team minimal, SMEs on as-needed, and then the next layer of corporate resources is always problematic. Lots of approvals and lots of process is needed.

Senior Executive in Large Financial Information and Media Company

The strategy leaders in our different business units play the role of innovation leaders – we plan new products, test them, marshal resources, and launch them. I’m now directly launching a new product, in software, in my area of the business.

I report to our Global Head of Sales and Service, which still feels a little surprising. I used to be officially in the strategy function. The logic of the shift is that you, the forward-looking leader, need to own the number. The key metric is revenue. Year-one revenue. Senior venture people go in or through sales. You cannot own a P&L unless you go through sales. Our CEO likes to fashion himself as being in the sales department.

For funding you beg, borrow or steal from existing programs, at first. You grab five other guys, and work on things after hours. You try to get a guy from sales, from engineering, from tech. That’s year zero. Six people. No budget.

Then, when you have something you believe in that’s been built to the point that you can prove you can build it, and demand is documented in some meaningful way, you go to the management committee – that’s the CEO and his direct reports. They like to hear new ideas annually. Above baseline they will fund zero-stage if they love it.

We’re trying to do a better job of testing, with more iteration, and more hard data, to show market interest alongside a more structured proof-of-concept. It’s very personal – we follow the money.

Senior Executive at SAP

We work in the IT organization and 90+% of our costs are staff related. We get a certain allocation of FTE, for us we were 20 and now we’re 40. That’s a good size for an innovation group. We get the FTE and we request development budget for learning, travel, and other functionalities. We run at 5% of the overall product staff. We are not attached to any revenue generating function.

If we had revenue, we’d be more of an internal consultancy, supporting customers – not engaged in de-risking – and that’s not what we want. We have enough trust in the organization that our GM says he wants this, as a central cost.

For me, in the end, the leadership team signs off on the roadmap of anything we do. It’s really important to me that I can always point to the senior leadership team to say they are in charge. Otherwise, whenever I engage with VPs of Engineering or others, I’ll have problems with their engagement.

We’ll always invest a day or a little more on any request – that’s where the false positives are. But as soon as it’s more than a 2 days or so, then I want to check back and check up, and I can greenlight if I know it’s already aligned with strong interest.

We don’t have an appetite for radical or step-change innovation. SAP does not need that. It’s more like let’s quickly assess what should be the first proof points and applications of LLMs, as an example.

There are always ideas that come top-down. The GM or Head of Corporate Strategy points to something. If he says he wants a thought leadership piece on X, he gets it. But they don’t abuse that. It’s part of it, but they understand that this is something that needs to be carefully used.

Bottoms up, we invite everyone to pitch ideas. I do have to say that most of the time if it’s a single or individual contributor, I have not seen that much success. The biggest successes are where there are teams that already understand the value, connected to bigger waves of opportunity. They’ve been dabbling for a few weeks or months before they come to us.
Personally, my biggest metric is how many people got promoted. How many people are ready to move up? Can we help the senior leadership team alleviate their pain? Can we pick up the things that are otherwise not getting done for others?

Former Head of Innovation at Salesforce

Usually I would report to the EVP of Solutions Engineering, who reported into the Head of Sales.

We were researching what was happening in the market with live customers, and believed that there are companies with habits for good innovation. So, we’d look at quantitative and qualitative assessments of who should be a partner that we offered a trial to. Then, we ran benchmarking to pick those companies (vs. others) to see if our investments in those companies reaped higher sales, or higher customer growth.

To monetize it broadly, we’d look at these leading firms and explore what the bigger number of follower companies would be doing when they got as smart as these leaders, and we’d bring these insights, habits, and targeting data to the account teams. They would use the insights as value-add in their own relationships, and we would feed the insights to the product teams, so the product would continually contain more prompts and functions for these best practices as they emerged.

For targeted growth accounts, we’d have an innovation executive, a designer, a software person for prototyping, the solutions engineer for the account, the account executive, and someone in professional services for later sales. Those people were engaged for three months. Funding came in from the product-engineering groups inside the sales groups, based on the assessment of market opportunity driven by the analytics of customers and the next layer of who we can help level up, as customers, to be bigger customers by using us more, and by growing.
We were answering the questions of how to help Salesforce grow faster and get to opportunities sooner, that you can only get when you are closer to the customer.

Senior Executive, Director of Software Labs, Large Financial Services Organization

I was deliberately asked to set up a tech labs/innovation labs organization, with the intent to shift culture and see around the corners. The goal was to work with emerging technology, solve business problems, and have a leg up in the competitive market.

We explore opportunities that rise above the regular organizations in our company – they’re not the day-to-day work. Generative AI is currently very important, it looks like a broad platform, and it’s not connected to one organization in our firm. It touches banking, auto loans, lending, and the core financial institution. So that lands with us.

We pick and choose which projects to work on. Most of the POCs make it into production. Our hit ratio is about 90%. Once a project goes into production, we ensure the use case we aim for is working well in production, and generally stay engaged for 2-3 months into production, when we identify who we transition the work to as a complete handoff.

We are funded through the CIO’s budget, but we do separately support the building of some marketing functionality that requires new and emerging technologies – that funding comes from the marketing organization.

Managing Director Strategic Partnership Delivery and Innovation for Conversational AI, LLMs, and Generative AI, Google

My group sits within the Generative AI engineering team. I report to the VP of our Conversational AI Technology.

We explore how we can take our domain knowledge and better understand the use cases. How can we triangulate that to customers?

The outside-in motion can be done by business development or sales planning. In this case, we’re so early in identifying patterns that we can speak with a broad brush, but more often than not, the work we execute on is not patternized yet.

The reason this team was created in engineering was that there are not clear patterns yet, so it requires engineering discipline to find the initial cases and repeatability. Once you can see the patterns, you can scale it out of engineering to business development, to sales, or to strategy.

I try to limit the number of similar customer engagements for the team so that we’re not doing the same things over and over again. I look at projects with the lens of asking what’s new, unique, or different.

We immediately bring in other teams when something becomes repeatable.

Senior Innovation Portfolio Management VP, Large Global Bank

We started off under the CIO, but now we report to our global COO. That person has global ops and IT. We’re very close to our executives in terms of strategy.

We’re called innovation ventures. We have the CVC arm, and a labs team of engineers and scientists to focus on strategic projects.

To date, we’ve built a carbon trading platform, amongst other things. We work on development of generative AI applications. We also have a go-to-market with a focus on incubating ideas within the bank – reimagining the mortgage process in blockchain, for example.

I’ve never seen more excitement around technology within the bank than now. I think you’re starting to see a trend of corporations building their own models.

Getting buy-in from all the different businesses is the recurring challenge. Any initiative in terms of innovation has to come from the businesses – they have the budget.

Where to get Started with Gen AI – WINS Work: Words, Images, Numbers and Sounds

We recently had our latest CXO Insight Call: “Where to get Started with Gen AI – WINS Work: Words, Images, Numbers and Sounds,” hosting both John Sviokla, Co-Founder at GAI Insights and contributor to Harvard Business Review, and Toby Redshaw, CEO at Verus Advisory. They got a chance to lead a lively discussion around how executives can frame up short term AI objectives through a new category of work, more precise and actionable than “knowledge work” – WINS Work.

Today, executives are in the process of considering large, looming issues around AI, such as accuracy, privacy, and bias, as well as the potential impact on “knowledge workers” and even economy-wide job losses and societal risks. While this view is important, it’s difficult to translate to what it means for businesses today. A bottoms-up view is necessary to take in what’s directly ahead and identify the immediate opportunities and threats. Understanding the practical, what’s really happening in the enterprise today (not just the vendor push), is crucial to coming up with a framework of engagement.

Today, it’s starting to look like generative AI is essentially power tools for knowledge work. Natural language has unlocked this entirely new arena for people to explore. There are so many low-code ML capabilities that allow the IT organization to better assist the business. Before, you’d need to build out an enormous data pipeline and infrastructure. You’d need to have highly-paid data scientists who understand statistics and algorithms come in and write all of the code necessary to train these models. And they themselves would need enough knowledge to understand when to choose which particular machine learning technique. So much of that has been abstracted now. You can go to one of the major cloud platforms, sign up for a free account and get access.

 

So? Where is Gen AI being practically adopted today?

McDonalds
McDonald’s has kicked off “Ask Pickles” – a chatbot that will help its frontline workers get quick answers on questions around maintenance, food preparation and customer service. The bot has been trained on everything from employee training materials, to device specifications, to food preparation, and at over a hundred locations, Pickles is now also accepting drive-through orders. Since the vast majority of orders today come through the drive-through at McDonald’s (70%), this could be a huge productivity boost across their locations.

Walmart
Walmart announced at CES that they’re kicking off a new gen AI-powered search experience for iOS shoppers. It will enable customers to search by specific use cases while shopping, for example, a football watch party. This way relevant, cross-category results are generated quickly as opposed to the customer having to individually search for chips, wings, drinks and a 90-inch TV.

Volkswagen
ChatGPT is coming to vehicles, as announced at CES by a number of car manufacturers. Ask questions, get directions, use natural language instead of scrolling through endless menu screens. Volkswagen will be the first volume manufacturer to offer ChatGPT as a standard feature (kicking off in the second quarter of 2024).

Bloomberg
Bloomberg already generated a lot of buzz when they quickly announced that they were building their own language model, BloombergGPT, a 50-billion parameter language model for finance, but that hasn’t stopped them from innovating further. They just implemented AI-powered earnings call summaries on their terminals, which use AI to help analysts with their research process. The new tool enables users to decipher complex financial information and quickly extract key insights on topics addressed by corporate management teams, such as guidance, capital allocation, hiring and labor plans, the macro environment, new products, supply chain issues, and consumer demand.

Microsoft Epic
Epic and Microsoft announced a gen AI collaboration, where they’re expanding to develop and integrate generative AI into healthcare by combining the scale and power of Azure OpenAI Service1 with Epic’s industry-leading electronic health record (EHR) software. The collaboration expands the long-standing partnership, which includes enabling organizations to run Epic environments on the Microsoft Azure cloud platform.

Ideally, with innovations like these, we can reach the point where doctors don’t just have to sit behind the computer and type everything in while talking to people. The hope is for the dialogue to be picked up, recorded, and sent to payments.

In summary, we have these huge elephant companies putting this stuff not in some “Store Eight,” which Walmart just closed down, but into actual mainline capabilities, which is very different.

Early Best Practices

PWC just came out with their global CEO survey, which found that 43% of CEOs believe that generative AI will decrease labor costs in their organizations, over 70% have some AI implemented already, and 45% believe that within ten years or fewer, their current business model will not be viable, that they will have to transform themselves.

So, everyone is dipping their toes in the water, and what companies are starting to get smart about things:

  1. First, it’s crucial to put in place the proper frameworks to enable governance upfront, instead of trying to retrofit all that. Even though this sounds like common sense, it can be very difficult in practice. Outsourcing this piece to legal means you may not like what you get – so it’s important to have a pragmatic process
  2. The models and the tools of today are changing and evolving constantly. There is no one size fits all. Everything depends on the tasks a specific team is after. Change is constant and will continue to be. So part of this foundational work must include frameworks where models can be relatively easily swapped in and out – being as flexible as you can is important, in order to avoid vendor lock-in
  3. Don’t underestimate the importance of prompt engineering. Garbage in, garbage out. It’s much like any other type of query language. Each model has slightly different ways of interpreting the prompt to gain the same output, and learning the nuances matters. For instance, Anthropic’s syntax is not the same as OpenAI’s. So spinning things up like a center of excellence to understand all this can help, but at the end of the day, engineers will need to be educated, as well as any line of business employees who are helping to build low-code applications
  4. Consider whether generative AI is even the right tool for a given use case. Some traditional AI with structured data inside a traditional database can be more effective for certain tasks.
  5. Solving use cases with traditional tools can happen first, with generative AI filling in the gaps
  6. Many companies are dipping their toe into the water today via zero-risk chatbots. Companies quickly learn that the kinds of issues they considered early on are different from the issues that occur in production. It also makes it easy to identify who on your teams is willing to take some personal risk and invest time here – you can quickly find your best people who want to learn how to do this.

What’s Happening Next? WINS Work

The noise around this topic is likely not going away anytime soon, so it’s important for executives to start considering how they want to address it. If you look at the market capitalization of the top six players in AI, as of the end of October, it’s 10.1 trillion. To put that into perspective, that’s greater than the gross domestic product of Japan and Germany combined. So these big players have a ton of cash, and this is critical to them growing their businesses, because it allows them more ongoing revenue and grants them the ability to grab hold of more of the stack. This really puts Microsoft’s investment into OpenAI in perspective. They generate just under 2 billion of EBITDA a week, and if you look at the total EBITDA of the top six AI companies in 2023, it’s 428 billion. To put that into perspective as well, the Russian government collects something like 350 billion in taxes. So I mean, this is not going away. This is central to their strategy. It’s important to keep reminding people of this and get their perspective on it. So if you look at some of these startups who have raised about $600 million to build some models, what are they going to do when they run out of money? Because that’s three days of EBITDA for Microsoft.

So where should your company start with generative AI? Everyone is saying AI will transform knowledge work, but when you think about it, a lot of people technically qualify as knowledge workers. A plumber is a knowledge worker, a heart surgeon, a lawyer, but they’re all not going to be impacted the same way.

So maybe the core concepts to consider are really the manipulation, creation, or improvement of words, images, numbers and sounds – WINS work. So yes, a lawyer would be impacted, but not a heart surgeon. GSIs are another clear example of high impact: most of the work is WINS work and most of the work is digitized. There will be massive changes in the core of those businesses, whether it’s around tax, consulting, or audit.

What we’re going to see is that over the next three to five years, there will be a shift in the relationship between capital, labor and productivity. You’re already seeing it in some offshore firms. Clients are already pushing them to lower costs because they’ve started reducing headcount on account of these new tools, offshore medical coding is a good example of this. Pharma is another winner: they think they may be able to explore ten, a hundred, a thousand, or even ten thousand times as many potential molecules using this new technology.

And then you have industries who are attempting to hold out: for example, media and education. And some industries just don’t have enough digitization: think Cisco – being operationally intensive with thin margins makes it very difficult for them to take advantage of this opportunity.

So we’re definitely starting to see some pretty fundamental edits or transformational sort of themes. It feels as though a lot of what we’re hearing is copilot is being applied to traditional back office work: software development, customer support, HR, marketing, etc. And this is sort of the new table stakes. This is how companies should do HR going forward, but it’s not fundamentally transforming the business, it’s improving productivity. It’s giving a gift to the CFO. In contrast, digitized PWC data could create the consulting of the future organization at a wholly different cost model, wholly different workflow, wholly different customer experience. That’s where the leaders are looking to get to today.

Traditionally what happens is that you’ve got three things going on. First, you’ll have some small population of existing companies adopt this aggressively and force their other competitors to join in. Usually industries aren’t just moving wholesale. Over the next three to five years, the one or two leading companies will garner 60, 70, 80, or even 90% of the profitability available in that industry. That’s the first thing you’re going to see.

The second thing will be the entrance of entirely new business models, for example, a neo law firm that has completely re-imagined their processes, but that’s 3-5 years out, it will take a little while before the market realizes this new model is viable. The most expensive price for a taxi medallion in NYC was five years after Uber was founded, when they were traded for 1.2 to 1.3 million a piece, 12 months later they traded for $34,000. So, it takes a little while for the market to believe it. But once they do, there’s massive financial disruption, not market disruption, because the taxis are still all around New York City, but nobody’s putting money in them. So that’s what happens in industries. When somebody sees the new model come in, when Amazon gets to a certain size, Macy’s gets marked to market way down. Now they stay around as a zombie for a long time and they gobble up other zombies trying to stay alive, but they’re gone and no talent goes there, no capital goes there, no good technology goes there and so forth. So they just keep getting farther behind.

So finally, companies will see the new technology and think, okay, how would I design my company from scratch with this new stuff, and change my entire business model? Any company that has access to data, modern development practices and a courageous mindset can be re-imagined and progressed. If not, others from outside that industry are going to make inroads. We’re going to see better asset utilization, better customer experience, better controls, and cheaper cost structures. There will be companies that get eviscerated and drop out of the S&P 500. From the board on down, there should be some fear that if you’re not moving fast on this, you’re going to be left behind.

The unions in Hollywood were able to get a five-year moratorium using AI to write stuff. But it’s still going to come out of Bollywood, and every single other place. Unions are going to drive themselves into the ground over this issue. What they should be lobbying for is getting the studios to grant them open source access to education, training, etc. They should be going the other way. Ideally, if you’re in a union, you want to be contracted with the companies that are going to win.

Leadership

This is a situation where people who have absolutely no experience with the technology can have incredibly strong opinions about it, and that’s a really dangerous thing to have to deal with. So if companies want to be in the top quartile, boards need to be educated: What is this new gen AI wave? Assume nothing about their current understanding. Second, VPs need to be personally spending at least three hours using this stuff on something that matters to them. Doesn’t necessarily have to be the business, it could be planning a vacation, doing a personal health routine, whatever. But they need to be able to understand and use these new tools. Finally, you’re going to need a basic employee training policy – give everyone access to the tools, but also ensure that they’ve taken the training.

Training especially helps the business users: they often have no idea what to ask for. Or, what they ask for is unrealistic relative to existing tooling. In the past, they’ve never had the option of having IT applied to their business use cases. So if you’re an IT organization, waiting for people to come and tell you what they want may not be the most effective strategy. Instead, getting good at prototyping may be the answer. Leverage some of the low-code capabilities, pick a generic use case, and show the business how it works and what you’ve learned. Once you win people over, go build something in production that has actual KPIs behind it. That’s when you start to get the gears turning and get people excited about what IT is capable of. That’s when IT and the business can start to open up a partnership around business needs.

Innovation infrastructure may need an overhaul as well: this isn’t just a tech transformation, it’s a top-down leadership transformation. Organizations have lost the muscle to innovate and they need to grow that back. You don’t just need an AI champion, you need an icebreaking ship: someone needs to change company behaviors and force people to do things differently. “We may not know the distinct value yet, but we’re going to find it.” This is hard: mass innovation inside a company comes with a lot of cultural and behavioral problems, but gradual change can succeed. How can you reward people for trying things and screwing up? That sort of behavior normally gets punished at companies with monthly ops reviews on plan vs. action. So if you don’t have a culture that can broadly reward innovation, it’s going to be very, very hard to get there on AI. Many companies will be tempted to avoid building and just focus on “buying,” but it may be that commodity tools lead to commodity results, and the real outlier outcomes will come from building something yourself.

There is no success scenario 3-5 years out where you’re not good at gen AI. But it’s not just about technology, it’s about human capital management and optimization.

Risk in AI Today

There isn’t yet a clear solution today on how companies are managing risk for externally-facing AI solutions. It’s hard to know until there’s a more definitive regulatory environment. Guardrails are ultimately going to play a pretty big role. Back when the internet first came around, everyone was using it to look things up instead of going to encyclopedias on the bookshelf. But, how trustworthy is it really? What are the sources? What are the sites? Today we have Wikipedia as an assumed source of data that’s fairly reliable, but with AI, we have that added risk of “Who are you sharing your data with?” or “How is information being shared?” This is where you’ll start to see people making money off this who aren’t the gold miners, but the people who are selling shovels. For now, human-in-the-middle is the obvious choice. In five years? The world may look a lot different.

2024 IT Priorities and Adoption of Gen AI