Agentic AI: Why It’s the Next Big Shift in Tech
The player is loading ...
Agentic AI: Why It’s the Next Big Shift in Tech

Agentic AI: Why It’s the Next Big Shift in Tech
Mehrnoosh Sameki

Get featured on the show by leaving us a Voice Mail: https://bit.ly/MIPVM

👉 Full Show Notes
https://www.microsoftinnovationpodcast.com/768  

Agentic AI is transforming enterprise technology by moving beyond content generation to autonomous actions. In this episode of Copilot Show, Mehrnoosh Sameki explores the risks, guardrails, and governance frameworks needed to deploy AI agents safely and effectively.
 
🎙️ What you’ll learn 

  • How agentic AI differs from generative AI and why it matters 
  • Key risks: task misalignment, prohibited actions, sensitive data leakage 
  • Practical guardrails and evaluation strategies for AI agents 
  • How to manage agent sprawl with Microsoft Foundry Control Plane 
  • Why red teaming and observability are critical for AI safety 

Highlights 

  • “Everything that I hear at work is about agentic AI.” 
  • “Agents don’t just output text or image. They take actions.” 
  • “Task alignment and staying on task is a huge one.” 
  • “Sensitive data leakage is more and more important.” 
  • “Bad actors could overwrite those information with different techniques.” 
  • “If you don’t know how many agents are out there, huge safety risk.” 
  • “We released something called Foundry Control Plane.” 
  • “Each agent gets a unique identity to suspend, quarantine, or stop.” 
  • “You can set org-wide policies against your agents.” 
  • “Red teaming is huge for identifying the risks.” 
  • “Our AI red teaming agent gives you a scorecard of vulnerabilities.” 

🧰Mentioned 

✅Keywords 
agentic ai, generative ai, responsible ai, guardrails, observability, task misalignment, sensitive data leakage, agent hijacking, foundry control plane, entra, red teaming, ai governance 

Microsoft 365 Copilot Adoption is a Microsoft Press book for leaders and consultants. It shows how to identify high-value use cases, set guardrails, enable champions, and measure impact, so Copilot sticks. Practical frameworks, checklists, and metrics you can use this month. Get the book: https://bit.ly/CopilotAdoption

Support the show

If you want to get in touch with me, you can message me here on Linkedin.

Thanks for listening 🚀 - Mark Smith

04:36 - The Shift to Agentic AI

06:17 - New Safety Challenges

08:51 - Top Risks for Enterprises

12:18 - Managing Agent Sprawl

15:51 - Building Robust Guardrails

20:14 - Red Teaming for AI Systems

24:33 - Humans + AI: Augmentation, Not Replacement

00:00:01 Mark Smith
Welcome to the Copilot Show, where I interview Microsoft staff innovating with AI. I hope you will find this podcast educational and inspire you to do more with this great technology. Now, let's get on with the show. Welcome to the Copilot Show. Today's guest is from Boston, Massachusetts. Welcome to the show, Manush.

00:00:23 Mehrnoosh Sameki
Thank you so much for having me. It's an honor to be here.

00:00:26 Mark Smith
I'm so looking forward to this conversation because you're deep in everything AI at Microsoft at this time. We're in the month of Ignite, so exciting things are happening. But before we get started, tell me a bit about food, family, and fun. What do they mean to you?

00:00:41 Mehrnoosh Sameki
I love it. I'm originally from Iran, even though right now I live in Boston, Massachusetts. And if you know Middle Eastern, they're big foodies. constantly in search of great new spots in Boston and Cambridge and lucky that nowadays there are so many cool spots here, but also hugely fan of speakeasy bars. So we're kind of like on a hunt to explore one each time we go to a different city family.

00:01:08 Mehrnoosh Sameki
I'm here in Boston with my husband, a little dog, and my parents. Love spending time outside the hours that I'm working about AI. And they ask a lot of AI questions. So it's always fun to see different perspectives. And then for fun, I do boxing. Wow.

00:01:25 Mehrnoosh Sameki
Which keeps me grounded, really grounded.

00:01:28 Mark Smith
I bet, I bet. Interesting. that you're from Iran. And the reason I say this, I watch a bit of TikTok every now and again. And there's a lady from the UK that's recently gone, just like this week or the last couple of weeks, she's gone to Tehran and she's going across various cities. And of course, the perception often we have in the world of a country is quite different than the reality because what she is showing us is like, Massive libraries with 10s of thousands of books, malls that are bigger than any malls that I've seen in other places in the world that I've traveled to. Amazing architecture, and it's just been a massive eye-opener as she's documented her travels through Iran in this last month, which is like, you know, very present tense.

00:02:16 Mehrnoosh Sameki
I love that you said that. And even as someone who is from there, but has lived outside for the past 12 years, the progress that has been made over the past, I would say even couple of years has been incredibly heartwarming to see there. Like I was looking at a video of a street rock music, hard rock music and people dancing and, you know, just vibing with it. was fascinating to see. But yeah, thanks for calling that out. I totally agree that these two images are often like not even 10% similar, sometimes quite opposite of each other.

00:02:50 Mark Smith
Oh, unbelievably so. I mean, one of the travels my wife and I did was across Russia. We did from coast to coast of Russia over a period of time. And We had so many people tell us how unsafe it was. It is crazy that you're doing this. was in 2017 that we did the trip. And we've got to say we never felt so safe. We never felt so that one of the biggest things that we identified with wherever we went is that people have a deep sense of family wherever you go in the world.

00:03:24 Mehrnoosh Sameki
That's very true.

00:03:26 Mark Smith
And wanting to connect, but they also fear the unknown because We would be in a city, let's say we're in Vladivostok, which is on the west coast of Russia. And we would say, we're going on to this next city. And those people in the city would go, oh, it's dangerous there. You don't want to go there. You don't want to go out at night. And I would like, have you ever been there? Oh, no, we've never been there. We've just heard about it. And of course, you'd go to the next city And they were just amazing, family-friendly people again, but they'd always be the next place you're going to. And it's just like the world's funny and the stories it tells us about the unknown. I want to go to your country, your home country though, because I just, it just looks amazing.

00:04:05 Mehrnoosh Sameki
I hope one day you do and you experience it very personal and close.

00:04:09 Mark Smith
Yes, I flew over it earlier this year. I was on my way to Italy from New Zealand and went over part of it from Qatar. But that would be amazing. Anyhow, let's talk about AI and Microsoft. What's top of mind for you right now in this AI space as we draw to the end of calendar 2025? What are you seeing? What are you thinking about? What's important at the moment in what you're doing?

00:04:36 Mehrnoosh Sameki
That's a really great question because right now, everything that I hear at work is about agentic AI. And if we think about my journey with Microsoft, when I joined about six, seven years ago, I was working on traditional machine learning models when talking about classifiers, regressors, and how to make them responsible, like how to build the right fairness tools or interpretability, explainability. And then a couple of years Back then, generative AI became a thing, and we pivoted to build tools for responsible AI around more traditional, more generative AI, like retrieval augmented generation, where the model uses a knowledge base to maybe create a chatbot or something. And now it's all about agentic AI. And we were looking at some studies and some stats from last year that was saying that about like 81% of executives think that some form of agentic AI will be deployed in their organization. So the big difference with generative AI is these agents, they take actions. So they necessarily don't just output text or image or content. They actually could call a tool.  They can do operations on your behalf. They can take care of certain actions. And so the whole framework that we had in mind for Response by AI, which was more focused on the content, the safety of the content they generate, now needs to be redefined for the actions and the operations that agents do. And they're very autonomous. They can be extremely powerful. So our entire world is now about agentic AI and what Response by AI means for that.

00:06:17 Mark Smith
So if I just interpret that and how I think about it, so you're saying that when we looked at safety up to this point, we're looking at things like hallucinations, giving the wrong answer, being incorrect, but in an agentic world, it could take the wrong action. And so therefore, the safety and the guardrails that we need to be building in need to be in the context of action, not just knowledge transfer.

00:06:42 Mehrnoosh Sameki
Exactly. And if I want to give you a little bit more framing of what are some of the top of mind for folks, enterprises that I talk to when it comes to agentic AI, one is task misalignment. So these agents essentially take a task on your behalf and they want to execute. And it's very important that they stay on task, right? If they go and like suddenly call certain tools that were off limits or suddenly delete your entire database, or send a e-mail to everyone that you never wanted it. So task alignment and staying on task is a huge one. Prohibited actions, I just talked about deleting database or sending emails to everyone. So how can we make sure that they're not doing those prohibited actions? And there's some guardrails around it, but there's also observability so we can understand how they're doing certain actions and we can look at the traces. Another one is sensitive data leakage.

More and more important to really make sure that if there is any presence of, let's say, credit card info, SSN info, social security numbers, this type of uniquely identifiable personal information in the knowledge base, they're not just like pass it to any tool that might be an untrusted tool or just not leaking it to the outside world. And there are other concerns like agent hijacking. So there could be bad actors that are trying to take over your agent operations to specifically make it do actions that you never wanted it to do, even if you've been really careful about saying, these tools are not trusted, do not do these actions, do not touch this database, do not leak this information. But then bad actors could overwrite those information with different variety of different techniques. So nowadays, our focus has shifted on, sure, we have to work on content harms, but also think about sensitive data leakage, think about prohibited action, think about task misalignment, or think about agent hijacking and how should we put the right observability evaluations and guardrails around it to keep your agentic AI solutions safe?

00:08:51 Mark Smith 
Yeah. And you know, I would have normally thought, you know, why would an agent delete a database? But I read in the media a couple of months ago, when I say I read, I watched more like lit, was Replit had that exact situation happen in their org. A master database was deleted by the agent and took that absolute destructive action.

00:09:13 Mehrnoosh Sameki
And there was one customer I was talking to last week and he was telling me that in his organization, they've developed one agentic AI solution that essentially interacts with different knowledge bases and it's supposed to help. It was fortunately an internal only agentic system, but he was mentioning that he got information in the response that it clearly was coming from some confidential meetings that probably the recordings were somewhere that this agent had access to, but he was not supposed to see those. So these things are extremely top of mind for folks these days.

00:09:49 Mark Smith
Yeah. And you talked about PII and most companies would have executive assistants or support staff that would have a range of executives credit card numbers, because when it comes to booking their flights and they'll have those files and those assets with all the security codes, everything. And it's needed to do day-to-day business, but of course, you don't want that leaking into the hands of somebody else in the business that doesn't have the privilege.

00:10:17 Mehrnoosh Sameki
Yeah, absolutely. In fact, all of our records are now with hospitals, right? When you, at least here in the US, the moment that you want to register with a hospital, pretty much they ask for all the information. And so if such a thing then gets leaked by an agent, it's a disaster in the making.

00:10:34 Mark Smith
Yeah. I came from the Microsoft side of the house, which was low code. So the power platform, dynamic X365, been in the space for over 22 years. And one of the massive concerns that came out about six years ago in the Power Platform is that Microsoft changes security of the way the Power Platform run to allow anybody to build apps in the environment. And one of the concerns at the time, because it was a smart thing to do, right, because it gave power to people that was not just IT gatekeeping everything, for something that would be relatively low harm. The drawback or the flip side of it, when I went in on projects, is that you would find 3,000 apps in the environment that might've been built, used once. And now with agents coming along, there's this, once again, this kind of fear, I suppose, in some companies of agent sprawl. How do we combat that so that the agents that are built are usable, practical, and do they have an end of life at some point where they get recycled or flushed out of the system or decommissioned? How are you thinking about that?

00:11:44 Mehrnoosh Sameki
I love how you said it because one major factor is, first of all, if you don't know how many agents are out there acting on behalf of your organization, huge safety risk and gap. Number 2, they burn through money, right? Like suddenly you get a bill that is way more than what you had budgeted for. And you kind of realize that, gosh, so there are these 20, 25 agents, maybe 50, maybe 100 agents acting on my behalf. I have no idea where they are. So At Microsoft Ignite, we released something called Foundry Control Plane. There's many dimensions to what this Foundry Control Plane, but think about it as your cockpit for managing and governing not just all your agents, but also your models, your tools across your fleet. So not just talking about your, like, just say, Microsoft Foundry, sure, you can govern them too, but also the Power Platform, M365, Copilot Studio, SRE agents, the agents you're bringing from other clouds and external agents for the lack of better term. So that was our very first attempt towards this step to give you an inventory where you can take a look at all your assets in one place, but not just taking a look at it. You can also connect it to evaluation and observability to understand how well they're doing quality-wise and safety-wise, and you can schedule certain post-production evaluation or monitoring on them to kind of continue observing them. That's one aspect of it. The second aspect being we integrated that with Entra, which is our platform that assigns a unique identity to each agent. And that identity could be used to suspend, quarantine, or stop an agent. If you're believing that it's going off the script, if the evals and traces are suggesting that certain thing is really off, you can proactively get up those information from the The observability story that we have, and then put a stop on it, because now it has a unique identity and it's recognized as an entity, just like I have an employee ID, and then the third aspect of it is... This control plane enables you to set org-wide policies against your agents. And we're starting with guardrails, meaning you can say all these different model endpoints that are being used by my agents all over, they should have these certain guardrails, like content harms or protected materials, or maybe like hallucination or certain type of guardrails. guardrails always on by default. And if there is a person in your organization under that policy that is building agents that is not honoring that policy, then obviously you can, you get flagged in that dashboard and you kind of see the violation. So we are making major attempts towards that. In that experience, there's also cost and quota that you can both customize and adjust and also take a look and gives you a lot of optimization hints as how you can improve that. I'm very excited for folks to take a look at it and let us know what are the remaining concerns that we can then evolve that project together.

00:15:03 Mark Smith 
I like it. When companies think about guardrails in the agents they build, do you have any framework that they should operate under or that they should think about when considering Do they have robust guardrails in place?

00:15:17 Mehrnoosh Sameki
I love that question because it could really prompt people to think about a few different factors. Factor one is the type of the output that is being generated by that agent. Sometimes they actually spit out some text and image. In that case, we always recommend things like content harms. For example, on our side, in Foundry, we enable like guardrails for sexual content, hateful, self-harm, hate and unfairness, violence, and they can create custom categories, protected materials, if there's a copyright materials and output, that always is important. The other factor is to what extent that agent is hanging out or using with knowledge bases or meshing with knowledge bases. So suddenly groundedness becomes quite important as whether it's hallucinating or whether it's like extracting the right amount of information. Then of course, the moment things become agentic, task misalignment is a critical factor group to think about. And for that, there are different guardrails that we have like task adherence and we have a bunch of evaluators. for helping you measure the task adherence, whether the agent remained on task, to what extent, or whether the right tools were called, whether the right parameters were passed to the tools, whether the right orchestration of tools happened in an efficient way.

00:16:38 Mehrnoosh Sameki
So you kind of understand that your agent is very tight cost-wise. Obviously, prohibited actions on sensitive data leakage, if they are in scenarios like healthcare or financial scenarios where there is such data in the knowledge base or there could be disasters happening from bad actions. And then if this is an agent that is about to be released externally, hijacking XPIA or jailbreak is always top of mind. In fact, those are the ones that make it the most to social media, right? People like literally use it as their pride moment that we broke a system. The creators need to pay extra attention to that.

00:17:18 Mark Smith 
And so with Azure AI Foundry, there are these controls on content types, et cetera, at the moment. But I just wanted  I wanted to highlight, and correct me if I'm wrong, but if you're a law enforcement agency, that for example, part of the job involves dealing with explicit content, like I've been involved in a police project where a range of police operatives act like young girls to find online through social media channels. And we had to do a full like air gap network so that other police officers didn't know these personas, the credit cards associated with them, all that. So even though Microsoft have these controls, for law enforcement agencies, they do have the ability to tweak other parameters, right, because of the type of work they're in.

00:18:02 Mehrnoosh Sameki
100%. In fact, I do work with a lot of enterprises that they say, oh, for our scenario, we actually want... violence, like if they're building gaming, maybe to some extent, violence guardrails need to be, at least their severity needs to be adjusted. Or I was working with a e-commerce company and they were saying that, you know, for bachelor or bachelorette party, there might be some items that might have some sexual themes, but it's like they don't want that product to be filtered. That's part of their, you know, product catalog. So Everything, including evals and including guardrails, come with the adjustment of, number one, what is the intervention point that you want that guardrail to be applied? Is it the input, output of the AI, or is it the tool call? And then same for the evaluators that you can choose which ones are relevant to you and what is the definition of defective? Is it if it's like low violence defective or medium violence defective or high violence is defective?  So they can completely adjust it.

00:19:05 Mark Smith
How much is Microsoft providing tooling around things like red teaming? And I suppose this is a multi-part question. Do you see that in the future, more and more companies, just like you would have a, you know, your security teams monitoring things like firewalls and, you know, pen testing, all that? that more and more as we go into AI, that a red team would be potentially an integral part of any organization when they are working extensively with AI and really pushing the edges of the technology, that they should be building that skill in-house. And is Microsoft providing a lot of content in that area?

00:19:40 Mehrnoosh Sameki
So red teaming is huge for identifying the risks. And then formal evaluations, metrics, they can help you with the extent of the risk, how bad it is. because they can scale versus humans usually can scale. Originally, when we got access through partnership with OpenAI to early versions of GPT models, that's how we built trust with it. We got to know them. Like we put together a team of experts, literally from all over the company, from the legal org, from the policy org, from technical land from research. And each group was randomly assigned a particular topic. It could be like violence, hate. And then they would attack this, of course, like very internally, that was our attempt to understand the nuances of what are the top risks, what guardrails and evals we should plan. We put a lot of best practices out there for how red teaming experts in any company could operate, but we were also very conscious about the fact that not every company and every organization might have the bandwidth, the budget, and expertise to do this red teaming. So one thing that we have right now as a part of Foundry, which I'm really excited for folks to try, is our AI red teaming agent. So the way that it works is There is a team within Microsoft Security that has built an open source toolkit called Pyrit, P-Y-R-I-T. This is essentially a collection of world-known attack strategies. for actual human red teamers to say, okay, maybe I'll come up with 20 nasty adversarial queries, and then I pair it with this attack strategy. So now ask this question by flipping the word, ask this question in ASCII mode, ask this question in certain embeddings, and then see whether the model falls right. So what we did was we took the attack strategies with pirate, And then we combined it with a lot of important adversarial like seed questions. So in this case, what you come to us and you provide is your endpoint that you want us to attack. Obviously, by us to attack, you run it. So everything is in your sandbox, your subscription, we don't see anything. So you activate this AI red teaming paired with your endpoint. It starts attacking your endpoint with regards to lots of adversarial queries based on the risk categories you identify could be violence, self-harm, and then paired with attack strategies like jailbreak, flip, ASCII mode, base 64, different things. And then it gives you, at the end of the day, a scorecard of your vulnerabilities of your AI system. And so the beauty of it is, first of all, it takes away that pressure that you need to have red teaming experts, but still keeps the quality of that red teaming so it can scale well. But second of all, literally when the vulnerabilities are right in front of you, can't miss them. So you have to then go and apply the guardrails or improve prompt optimizations or maybe adjust some of your tools are not trusted. So some of the tools. And at Microsoft Ignite, we just expanded that AI red teaming agent to red teaming actual agents. So originally it was more for understanding the vulnerabilities around content harms and now it can attack your agent to understand whether it leaks sensitive information, whether it performs prohibited actions, whether it falls for jailbreak attempts or gets hijacked. So people could do it kind of both for actions and also for content harms.

00:23:11 Mark Smith
Fantastic, fantastic, exciting developments. You know, for a long time, I've believed that a lot of people don't realize that we're all cyborgs, right? We have a mobile phone that has unbelievable power in, you know, what was around 10 years ago. And I feel that with AI, we're going to the next level of that and being able to take our human skills and augment them with AI and do other great things. And we're hearing more and more now this concept of digital workers being another name for agents where people inside an organization are going to work with an orchestration of agents. How do you talk about human work and AI augmentation? What are your thoughts?

00:23:59 Mehrnoosh Sameki
So this is a very, very common conversation. You can imagine we're all having as people in the field with, like, even when you go to a bar, to a restaurant, everyone, the moment they realize you're in the AI field, they're like, you're replacing us. So my personal experience with AI so far has been, it's purely for augmentation of humans, meaning there are so many jobs that are really well suited for AI. They're very repetitive. They require less creativity. They're not as exciting. mundane and AI is going to take care of that and you will be in business of kind of being its master in a way. It's like main operator and shifting it in any way or form you want and also controlling its output. AI is going to grow our economy. There is no doubt about that. And with that growth, there are way more opportunities that are being created. So I feel like it's going to augment us really well. We will focus on more creative jobs. We're going to focus on places where there is more deep decision making, human decision making is needed. And also we are going to be a lot more in business of kind of operating AI in order for our tasks to be finished. So definitely I don't see it as something that replace human expertise, but I definitely see it as if someone is resilient to adopting AI, maybe they're going to get replaced. because I think the future is probably some more than now. I already is here and I use AI on my daily basis for my product development, but way more, it's going to be more like a cooperation of humans and AI.

00:25:38 Mark Smith
Yeah, totally agree with you there.

00:25:40 Mehrnoosh Sameki
But I understand there's fear, you know, like with any big change, even with industrial movements. But I only see, and not just because I'm in the field, because I see how thoughtful people are behind the scenes of how to put the right guardrails, evals, how to make it safe, how to give it identity. So I'm really excited that now we are in a different era that we deeply think about the impact of it on humanity. And we're really willing as technologists to shift left and bring all of this goodness to the actual development of an AI and really understand how we can help humans rather than rebel against them.

00:26:19 Mark Smith
Yeah, I like it. As in, I just had a tradesperson working at my place yesterday and they were like, are you worried about the Terminator scenario? And I said, I'm absolutely not. I just, I cannot see in anything that I'm looking at that that's a scenario. But George Orwell's 1984, I can see that as a scenario where it's humans that use it nefariously at the end of the day. And that brings me to regulations. we've had the EU AI Act. We've seen in the US, the current administration being kind of anti any type of regulation. And when I often talk about AI, I talk about it in something tangible. I liken it to electricity and saying electricity is super important. It cooks our toast, it boils a kettle, it does so much for us. But if you touch the wrong piece of electricity, you'll be dead. It can kill you, it's that powerful, but we've had, and as electricity was developed, regulations came out to make it safe so people didn't accidentally do things. And so I see regulations as a very important part of the conversation. What are your thoughts around regulation and perhaps what is Microsoft doing? At last I heard there's over... Two 100 pieces of regulation that Microsoft is monitoring globally, because you're operating in multiple geographies, but what's kind of top of mind from a regulatory point with you at the moment?

00:27:42 Mehrnoosh Sameki
There are a couple of different dimensions I can answer this, and I'll categorize that for you. Number one is, as a person who builds these tools every single day that I interact with enterprises and... customers, I see them being confused about how certain regulations are applicable to these new wave of AI systems. Even if they kind of figured out the generative AI, now there's whole agentic AI. And I feel like even though I don't think we need a lot more, like we have the right regulations in the buckets of policy, security, safety, all of that, but they're not translated yet to how with this autonomous agents. So there is so much progress that needs to happen. And within Microsoft, for example, we have the Office of Responsible AI, team of experts from legal policy. Not only they try to influence the external regulation agencies as what is the right thing to do and what is the applicable thing to do, but also they guide our own development policies like within Microsoft, what are the requirements of what we need to measure how we need to put the right things in place. So one thing that I see is there is a huge opportunities to be that translator. And at Microsoft Build, which was a couple of months ago, we formed partnerships with two companies, Credo AI and Saidot. They're specifically in the business of taking these regulations and defining what it means. What are those risks that needs to be measured based on this piece of regulation? And then Microsoft could enable the evaluators or guardrails that specifically matches to those harms and risks.

I'm also quite a lot investing on custom policies on my team, so we can upload pieces of regulations, get responses from an agent that's like, what are the different dimensions of harms, and then get recommendations on what evals and guardrails need to put in place. So that translation and the companies that are kind of contributing to that translation of those regulations to generative AI actual requirements or agentic AI requirements is hugely important right now. And I'm really glad to see their companies, including Microsoft, making progress. Within Microsoft, we have a platform.

00:29:56 Mehrnoosh Sameki 
It's called 1RAI. Every single AI solution that leaves the door or generative AI and agentic AI solution that leaves the door of Microsoft is registered there. And then there is a deployment safety board consisting of experts from all the walks of life and different backgrounds, part of this committee that oversee that release process of their response by a champs for assigned to that project. So take a look at whether the right evals are in place, the right guardrails are in place. So one thing that I'm really excited to see is how more companies are also interested to put such deployment safety boards in place. So they're a lot more proactive to even in absence of super crisp and clear translation of regulations to requirements, at least take an attempt to do their best to make sure the AI systems are secure and safe.

00:30:48 Mark Smith 
Amazing. Thank you so much, Manoj, for coming on, as in you've just shared so much. I know it's a busy time of year for you. I really appreciate it. Thank you again.

00:30:57 Mehrnoosh Sameki
Of course, it's an honor and pleasure to be here. I just wanted to finish this interview by saying that I definitely believe in the future of AI, as I mentioned, but I really welcome everyone to take a look at not just our evaluators and guardrails, but across the market. Stay in touch on LinkedIn. Let us know what trends you're seeing that maybe a company like Microsoft could better incorporate in the ecosystem. And I'm really grateful I had this time with you. Thank you.

00:31:26 Mark Smith
Hey, thanks for listening. I'm your host, Mark Smith, otherwise known as the NZ365 guy. Is there a guest you would like to see on the show from Microsoft? Please message me on LinkedIn and I'll see what I can do. Final question for you, how will you create with Copilot today? Ka kite.

Mehrnoosh Sameki Profile Photo

Mehrnoosh Sameki

Mehrnoosh Sameki currently leads the Generative AI and Agents Evaluation Science, AI Governance, AI Redteaming, and Safety Evaluation efforts within Azure AI Foundry at Microsoft, shaping how customers assess and govern AI models, generative AI applications, and agentic systems. Her team focuses on building scalable, responsible evaluation and governance capabilities that are deeply integrated into the Azure AI ecosystem, with a strong emphasis on AI safety and security.

Mehrnoosh co-founded several open-source tools in the Trustworthy AI space—Fairlearn, Error Analysis, and the Responsible AI Toolbox—and contributed to InterpretML, advancing transparency and accountability in machine learning. She is honored to be recognized on the 2024 Brilliant Women in AI Ethics list.

Beyond her role at Microsoft, she serves as a curriculum developer and lecturer with Break Through Tech, helping drive underrepresented women toward meaningful careers in data science. Mehrnoosh earned her Ph.D. in Computer Science from Boston University and served as an Adjunct Assistant Professor from 2020 to 2024.