Learning Kerv

Transforming Business with GenAI: Practical Examples and Lessons Learnt | Episode 5 | Learning Kerv

38min | 10/12/2024

Description

In this episode, we delve into the practical applications of generative AI with real-world examples from industry experts. Host Rufus Grig is joined by special guests Kyle Ansari, David Groves, and Dawn Kelso, who share their experiences in deploying AI systems that are operational and impactful.


Whether you're a tech enthusiast, a business leader, or just curious about the practical uses of AI, this episode provides valuable insights into how AI is transforming industries and solving complex problems.


Key Highlights:


  • Operational AI in Compliance:

    Understand how Kerv's product, Compliance Cloud, uses generative AI to manage and analyse alarms in a cloud-based voice recording solution for banks. Learn how AI reduced false positives and operational workload by 97%, enhancing efficiency and accuracy.

  • AI in Customer Service:

    Explore how generative AI is used to provide multilingual customer service in contact centres. Discover the challenges and solutions in implementing automated translation for various communication channels, enabling businesses to expand into new markets without the need for extensive multilingual staff.

  • Data Quality and AI:

    Delve into the importance of data quality in AI applications. Understand the critical role of accurate, relevant, and compliant data in ensuring effective AI outcomes. Learn about the challenges of managing diverse data sources and the necessity of continuous monitoring and human oversight.


  • Future of Generative AI:

    Explore the potential future applications of generative AI in compliance and other fields. Gain insights into ongoing developments and the importance of ethical considerations in AI deployment.


Tune in to this episode of The Learning Kerv to hear first-hand accounts of AI in action and gain a deeper understanding of the practicalities and benefits of generative AI in today's technological landscape.


Hosted by Ausha. See ausha.co/privacy-policy for more information.

Transcription

  • Rufus Grig

    Hello and welcome to The Learning Kerv, the podcast from Kerv that delves into the latest developments in information technology and explores how organisations can put them to work for the good of their people, their customers, society and the planet. My name is Rufus Grig and in this series, with the help of some very special guests, we've been looking at all things generative AI. The worlds of business and technology are massively excited about Gen AI, and in this series so far, all episodes of which are still available on the podcast player of your choice, we've covered the basics of machine learning and AI in general. We've lifted the hood on generative AI and foundation models themselves in particular, and looked at some of the faster growing applications of AI. And in the last episode, we even paused to consider how we use AI responsibly and securely. But in this episode, we are going to get to where the rubber really hits the road, because my guests this time round are people who have built and deployed real live AI systems in the wild that are up and running and working and being used every day. And it's a fantastic opportunity to hear their stories, understand what they've been able to achieve with AI, the problems that they've been solving, and see what lessons they've learned along the way. So let's meet them. First of all, we're joined by Kyle Ansari, who is the CTO of Kerv's Collaboration and Compliance Practice. Hello, Kyle.

  • Kyle Ansari

    Hello, Rufus. Thank you for inviting me on this week.

  • Rufus Grig

    No, brilliant to have you. Just tell us a little bit about your day job and what you do for Kerv.

  • Kyle Ansari

    Yeah. So I'm responsible for our development and engineering team in collaboration and compliance. We provide compliant voice recording solutions for trade floors, for back office, and primarily these days, with everything being cloud-based, for Teams, Zoom and anything that requires recording.

  • Rufus Grig

    Brilliant. Well, thank you, Kyle. Great to have you with us. We're also joined by David Groves, who's product development director at Kerv. Hi, David.

  • David Groves

    Hello there.

  • Rufus Grig

    And tell us a little bit about your world.

  • David Groves

    So we create products for the rest of Kerv to go and sell, implement, and generally do brilliant stuff with. And several of these are now using AI for real.

  • Rufus Grig

    Brilliant. We look forward to hearing about that as we go on. And then finally, Dawn Kelso. Dawn is Data Solutions Director of our Digital Transformation Practice, known as Kerv Digital. Hi, Dawn.

  • Dawn Kelso

    Hi.

  • Rufus Grig

    Just tell us a little bit about your background and your role day-to-day.

  • Dawn Kelso

    So, as you say, I lead Kerv's digital data practice at the moment, and that involves all things to do with data. So anything and everything, particularly in the Microsoft space, but broader than that as well. And so that's looking at intelligent data platforms, AI and all things that data can touch.

  • Rufus Grig

    Brilliant. Thank you, Dawn. So great to have you all with us. We're going to start with Kyle. And Kyle, you've got a very operational example to talk us through. I believe you're going to talk about something that you've done to support the Compliance Cloud service and how you've used generative AI to help manage it. But before we dive in, you'd better tell us what Compliance Cloud is and what problems it sets out to solve.

  • Kyle Ansari

    Yeah, so Compliance Cloud is our full cloud solution. It's fully hosted by us at Kerv to allow banks to have us record their back office, and now trade floors, in a cloud environment. This allowed us to build private instances to secure data to a bank's network, and we've effectively just become an extension of their recording estate. We went down this road because voice recording in a bank has been a very difficult thing to manage, and it normally requires quite a large team to be able to do it. There are also budget constraints in trying to get the right resources in that understand voice recording. So we put Compliance Cloud in, with the team that we have from my side and from our operational side, to take that pressure off the bank of having a dedicated team that understands voice recording, understands the compliance and regulatory side of it and the security side, and to be able to give it all to them in a solution.

  • Rufus Grig

    And so I guess you talk about voice recording, but I guess in this day and age, it's not just necessarily about voice, is it?

  • Kyle Ansari

    No, it's not. It is about text. It's about screen recording, webcam video recording. We're now going down the route of looking through emails and going through email capture, text message capture, WhatsApp capture. It's kind of any form of communication now. So it's not just voice anymore.

  • Rufus Grig

    Great. So it's clearly a very mission critical service. The banks and the regulated industries that are your customers are utterly dependent on it to keep them on the right side of the regulators. What was the problem that you guys, as the providers of the service, were setting out to solve with AI?

  • Kyle Ansari

    So for me, it was an operational challenge. Because Compliance Cloud has grown the way that it has, and we're now running around 100 servers in Azure across multiple different tenancies, you get a lot of alarms. You get a lot of false positives, and a lot of things happen in the network. Primarily, we're recording Teams, and Teams has been a very new implementation from a compliance point of view for most of our customers. So when you're seeing new types of alarms coming in, things happening Microsoft side, things happening recording side, we were getting lots of alarms. At our worst point, we were looking at three or four thousand a day of just false positive alarms coming in and hitting our operation centre.

  • Rufus Grig

    So when you say a false positive alarm, you mean that the system is telling you something's gone wrong, but actually it's not anything that you really needed to worry about.

  • Kyle Ansari

    Exactly that. And that's how it started. But the problem we were having is that because there were so many of them, it hides any potential real alarm that could come in, because you're just being flooded with data that's coming in from systems. So I set out to try and solve that and to work out a more efficient way for us to control our alarming, to control what data we're getting out from the tenancies and out from the voice recording servers, and then to use AI to analyse it for us, to reduce the manpower time we've got to look at them.

  • Rufus Grig

    So that's really interesting. Before we get into that, just... It'd be good to get a sense of just how much time it would take. So for each of those 3,000, 4,000 alarms, how long does it take a human analyst to be able to look at that and work out that it's a false positive or something they haven't got to worry about?

  • Kyle Ansari

    Yeah, so in that first couple of weeks of the first service going live, we worked it out. It was around 50 hours per 100 tickets or for 100 alarms that came in. So you can imagine it scaled very quickly to the point where we were getting completely overwhelmed by the amount of alarms coming in from the system.

  • Rufus Grig

    So even my maths can do that: it's half an hour per alarm, and half an hour times 4,000 alarms is untenable for human beings to get eyeballs on.
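
To make the scale of that workload concrete, here is the arithmetic behind the figures quoted above, written out as a quick check (the numbers are the ones mentioned in the conversation, not exact operational data):

```python
# Back-of-the-envelope check on the analyst workload described above.
alarms_per_day = 4_000          # worst-case daily false-positive volume mentioned by Kyle
hours_per_100_alarms = 50       # roughly 50 analyst-hours per 100 alarms

minutes_per_alarm = hours_per_100_alarms / 100 * 60            # 30 minutes per alarm
analyst_hours_per_day = alarms_per_day * minutes_per_alarm / 60

print(f"{minutes_per_alarm:.0f} minutes per alarm")            # 30
print(f"{analyst_hours_per_day:,.0f} analyst-hours per day")   # 2,000 -- clearly untenable
```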

  • Kyle Ansari

    Yeah, completely.

  • Rufus Grig

    So tell us what you did next. What was the idea and how'd it go?

  • Kyle Ansari

    So the idea was that we could take Azure's Cognitive AI service and we could categorize all of the alarms and train the AI service on what each alarm was. And we did this by looking at every alarm code that came out from the system, pulling the logs from our log aggregator, which has the complete set of details of what was going on at the time, and then using positive and negative reinforcement on whether the alarm was a true issue or a false positive: was there an actual problem, or is it just that the user didn't pick up the phone? So our very first version of it was just: here is every alarm we have, here are thousands of rows of log files, and here is what is good and what is bad. So we had to generate multiple calls and actually train it on the logs: this is where the pickup happens, this is where the bots join, this is how the audio stream should look, and when it doesn't look like this, there's probably a problem. That training took us just over two weeks to pull the data set together for. We had a lot of the data already from our lab systems that we could use. And then, using the unstructured training service from Azure, we managed to build this all into a private GPT model that sits inside our tenancies.
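
Kyle doesn't describe Kerv's exact data format, but as a rough illustration of the approach he outlines, labelled alarm logs can be turned into fine-tuning examples. The sketch below uses the JSONL chat format that Azure OpenAI fine-tuning accepts; the alarm codes, log text and labels are invented for the example.

```python
import json

SYSTEM_PROMPT = (
    "You are an alarm triage assistant for a compliance voice-recording platform. "
    "Given an alarm code and a log extract, reply FALSE_POSITIVE or REAL_ISSUE with a short reason."
)

# Invented examples standing in for the "what is good and what is bad" labelling Kyle describes.
labelled_alarms = [
    {
        "alarm": "REC-1042",
        "log": "bot invited to call, media negotiated, callee never answered, call torn down cleanly",
        "label": "FALSE_POSITIVE: the callee did not pick up, so there was nothing to record",
    },
    {
        "alarm": "REC-2310",
        "log": "bot joined call, audio stream dropped after 4s, no media resumed before hangup",
        "label": "REAL_ISSUE: the audio stream terminated mid-call, so the recording is incomplete",
    },
]

with open("alarm_finetune.jsonl", "w", encoding="utf-8") as f:
    for ex in labelled_alarms:
        record = {
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"Alarm {ex['alarm']}\n{ex['log']}"},
                {"role": "assistant", "content": ex["label"]},
            ]
        }
        f.write(json.dumps(record) + "\n")
```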

  • Rufus Grig

    So, for those of you who can remember our episode on models, you've got a GPT foundation model and you are fine-tuning it. Is that what you're doing with this additional data?

  • Kyle Ansari

    Yep, exactly that. It's a fine-tune that goes in purely on voice recording and compliance log files. So it gets a complete understanding of what it's looking at: every time we paste a log file into it, and every time we send one over the API to it, it can analyse it and go, yes, that does match what it should do, or no, it doesn't look right. And if it doesn't look right, it then gives you supporting detail: this is what we think the problem is and how to solve it. And it's evolved quite a bit since that first version.
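
A minimal sketch of that "send the log over the API and get a verdict back" step, assuming a fine-tuned deployment in Azure OpenAI and using the openai Python SDK; the endpoint, key and deployment name are placeholders, not Kerv's real configuration.

```python
from openai import AzureOpenAI  # pip install openai

# Placeholder connection details -- illustrative only.
client = AzureOpenAI(
    azure_endpoint="https://example-tenant.openai.azure.com",
    api_key="<api-key>",
    api_version="2024-06-01",
)

def triage_alarm(alarm_code: str, log_excerpt: str) -> str:
    """Ask the fine-tuned deployment whether an alarm looks real or like a false positive."""
    response = client.chat.completions.create(
        model="alarm-triage-finetune",  # name of the fine-tuned model deployment (hypothetical)
        temperature=0,
        messages=[
            {"role": "system", "content": "Classify the alarm as FALSE_POSITIVE or REAL_ISSUE and explain briefly."},
            {"role": "user", "content": f"Alarm {alarm_code}\n{log_excerpt}"},
        ],
    )
    return response.choices[0].message.content

print(triage_alarm("REC-2310", "bot joined call, audio stream dropped after 4s"))
```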

  • Rufus Grig

    We'll come on to what you've done to evolve it later. But in terms of, you know, that first version, what sort of success did you have when you turned it on?

  • Kyle Ansari

    We reduced our ticket interventions by 97%. We took it from 30 minutes to 60 seconds per alarm, and it filtered out massive amounts of data for us and really gave us a true picture of what was going on inside the environment. And within that first month of having it up and running, and being able to reduce all of that noise, we did actually start to pick up real issues inside the Teams infrastructure. We started to detect errors quicker than they were being notified on Azure's issues tracker, and there are a couple of tickets we got into Microsoft before they knew themselves that they had a problem in their infrastructure.

  • Rufus Grig

    Wow. So, I mean, that 30 minutes down to 60 seconds, that's not a bad ROI stat if you're looking to demonstrate that this was a worthwhile project, is it?

  • Kyle Ansari

    Yeah. Yeah.

  • Rufus Grig

    So, I mean, that sounds brilliant, but you said you then went on to evolve it. I mean, what was there to improve in that?

  • Kyle Ansari

    We started to work out that we could train it on more data that wasn't just to do with voice recording. So we took all of the error codes that Microsoft has published and put those into the training stack. So when an alarm comes in now and I'm getting a Microsoft fault code back, we can also go and look up Microsoft's fault code against ours and tie the whole piece together to go: actually, I can see that, yes, we alarmed, yes, there was a problem, but the diag code comes back because the customer side didn't connect. So we're completely crawling our infrastructure by being able to look up Microsoft's codes coming back to us. That was the next big bit of training that we did. We then kicked it up another notch: for any particular call that we were seeing that was having a potential problem, we were getting the internal call IDs back from the Teams service. And because we're using bot recording, we were then able to start looking that data up inside Microsoft's CQD and tracing the full call flow. Did the call genuinely happen? Was there an issue with someone's audio? It just gave us a level of investigation that we couldn't have previously. So it's made it even faster and more accurate as we've gone along.

  • Rufus Grig

    Wow. You've presumably been through multiple iterations. One of the things people do say about fine-tuning is that it can be expensive; it takes a lot of time to prepare the data. But from what you're saying, in your particular application it really has paid off. It's also, I suppose, not the sort of thing that the models are going to have come across in their foundational training.

  • Kyle Ansari

    No, absolutely. It was expensive. Was it as expensive as having a whole team of people go through 30 minutes a ticket? No. We've saved a massive amount by being able to use AI to go through these tickets and pull this data together for us. We're having to retrain, and I'm doing a re-fine-tune currently every six months. I do see drift in it. You do still see some hallucination responses, but I think that'll get better over time. So we are still testing it, and we're still having to stress test it and deliberately force data in there just to see that it is still working as intended, and it will get better over time.

  • Rufus Grig

    So, I mean, Kyle, you've been using this for, what, a year? Is it more than a year it's been in place now?

  • Kyle Ansari

    More than a year it's been in place now, yeah.

  • Rufus Grig

    So, I mean, I know this gives the date away, and I know that Ed and Kim, our producers, generally don't want us to say what the day is, but I think it is actually ChatGPT's second birthday today. And so this really is probably one of the earliest genuine production applications, and you've been running it for a long time. So it's really fascinating to speak to somebody who was in at it quite early on. Are you starting to see any other potentials for generative AI in compliance in general? And what are you doing about those?

  • Kyle Ansari

    Yeah, there is a whole other application we have for generative AI, and just AI in general, in compliance. There is a whole stack of development we have planned for next year, which revolves around another form that I've been working on. I can't go into too much detail just yet about what that is, just for legal and compliance reasons right now. But there is an issue that we're solving that is primarily something that used to take hours of people power to get a certain task done. And using AI in our tests, and all of the alpha work that we've been doing on it, again we're reducing it from hours' worth of work down to minutes. And it's allowing us to look at remote locations. It's allowing us to work on multiple offices at the same time. So there's just a huge improvement that we can bring in in the next year using generative AI.

  • Rufus Grig

    Brilliant. Well, look, when you get through the hush-hush secret squirrel stage of this, you will come back and tell us about it, won't you?

  • Kyle Ansari

    Absolutely. And I'd love to talk about it when I can.

  • Rufus Grig

    Brilliant. Brilliant. Thanks, Kyle. Well, look, that's been fascinating. David, if we can turn to you: you've been using generative AI to augment an existing solution too. Just give us a little bit of the background about the problem you were setting out to solve.

  • David Groves

    So Kerv is one of the biggest partners in the UK, and for that matter elsewhere, with a company called Genesys, who do contact centre systems. And they're one of the leading contact centre platforms that you can get. And like all contact centres, some people want to take voice calls, some people want to take emails, some people want to take web chats. And then you've got a whole gamut of other different media types, whether it's WhatsApp, Messenger, and the like. Well, in this particular use case, we had a customer who was using it for customer service. So it's basically an aftermarket, supporting-a-customer type thing. And they were trying to get into lots of different territories across Europe. And it's very expensive to go and hire 20 contact centre people who speak Norwegian if you want to get into Norway, and then another 20 who speak Swedish, and then another bunch of people who speak every single language you can imagine in Europe. And there are a lot of languages in Europe as well. So one of the things that we were looking to do was basically allow this customer, and for that matter others, to handle customer service in whatever language they needed to, using a single pool of people speaking whatever languages they do speak, to provide customer services initially throughout Europe, but now worldwide. So basically what we did was put automated translation in place right at the heart of the contact centre, initially to handle web chat and WhatsApp messaging and those sorts of different media types, and now email as well.

  • Rufus Grig

    Okay. I mean, translation has been a fairly high-profile application of generative AI, I suppose, and you've been deploying it in a way that delivers very real benefits into a contact centre, allowing a contact centre to access new markets, allowing people who might not necessarily speak the same language as the manufacturer or the service provider. It sounds absolutely on the money for this sort of technology. So how did it work? I mean, what sort of challenges did you hit along the way? Give us a sense of what happened.

  • David Groves

    So I think the interesting thing is that we're speaking English and we're all native English speakers, and that makes us very, very spoiled when it comes to understanding the nuances of other languages. For example, English doesn't have male and female words, unlike German, where they've got male, female and neuter words, and the way that you would say something is pretty much the same whether you happen to be male or female. But that's definitely not the case in other languages. And also you find that idioms are really hard to translate well. So if I start telling you that you made a pig's ear of something, you as an English speaker completely understand what I'm saying, but for someone who doesn't have that exposure to English, if you translate every single word into another language, it sounds like complete gobbledygook. And I remember reading a message that someone had flagged for us, actually in Spain, saying that someone had thrown the house out of their own window, which actually means treating something as if it was ridiculously expensive, I believe. That sort of idiom just does not translate terribly well. And it's important, because you're dealing with consumers who are just using their language the way they're used to, to be able to get a sense of what they actually mean across to the other person who doesn't necessarily speak that language. So we found that was a real issue.

  • Rufus Grig

    I guess if I think about it, translation services have been around for a long time. I remember trying to tell my children that they shouldn't be using Google Translate to do their French homework, you know, years and years and years ago. But I guess those original translation services were a lot more literal, almost word for word. So does generative AI manage those idioms, that natural sense of language, better than the more traditional translation services?

  • David Groves

    Yes, they're all getting better all the time. But depending on which service you use, they're not all as good as each other. So you actually do need to go and check to see whether the quality of the translation for the particular use case you've got is going to be what you need. And a lot of the differences are now around handling of idiom and also their ability to handle context. So one of the other examples is we've got one customer who's commonly talking about hand cream and doesn't really want the word cream translated as dairy cream because that's not really what they're talking about. And so the same word in English is used for both, but it's not the same word in other languages. And you need to make sure that the translation engine has got some context. That's actually quite hard to do. It's something that the large language models are doing quite well, far better than the previous generations.

  • Rufus Grig

    So is that something that you, as the developer, are taking care of in the prompt? The end user is completely oblivious to this, but you're saying: this is a conversation between a customer service agent and a customer, and they're talking about, you know, a cosmetic hand cream product, and that steers the LLM to give you a better translation. I mean, I'm sure I'm oversimplifying it, but that sort of thing.

  • David Groves

    There are several ways to do it. So you can do it in the prompt: you're providing context. The more context you provide, the more tokens you're using, and therefore the more expensive it gets. So there is a trade-off to be had between how much of this background information you want to provide every single time and how much you want to pay for it. There are other ways of doing it as well; many of these language services have got a concept of something called glossaries, which allow you to have that context about the subject that you're talking about. The thing is that if you just type something into a web chat and you simply take that and ask a translation engine to translate it, there's an awful lot of context that you're just not giving it. It's just that one sentence that you've typed, without any of the history of what's happened before, or what this company provides, or the kind of voice that they would like to use, or any of those things. And the new developments are about getting that context into the conversation to make the translation and the communication just work better.
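
To make the prompt-versus-glossary trade-off a little more tangible, here is a minimal sketch of prompt-based translation with domain context, using the hand cream example from above; the model name and wording are illustrative, not the service actually used in this deployment.

```python
from openai import OpenAI  # pip install openai; any chat-based LLM could be substituted

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Every word of this context is sent on every request, so richer context means
# more tokens and therefore more cost per translation.
DOMAIN_CONTEXT = (
    "You translate web-chat messages for a cosmetics retailer's customer service team. "
    "'Cream' always means skin cream, never dairy cream. Preserve the customer's informal tone."
)

def translate(text: str, target_language: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        temperature=0,
        messages=[
            {"role": "system", "content": DOMAIN_CONTEXT},
            {"role": "user", "content": f"Translate into {target_language}: {text}"},
        ],
    )
    return response.choices[0].message.content

print(translate("Hi, my hand cream arrived damaged. Can I get a replacement?", "German"))
```

A glossary-based service moves that per-request context into configuration instead, which is one way of keeping the token cost down.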

  • Rufus Grig

    Okay, interesting. I'm also quite interested in how the different languages perform, because one of the things that we talked about in other episodes in the series is some of the inherent biases that you get with the models, because, as you said, we speak English. A lot of the internet is in English, and therefore there are some minority languages which the language models will have had a lot less source material to train on. Do you find that things like Spanish to English have a much higher percentage of accuracy than perhaps less spoken, less common languages?

  • David Groves

    Yeah, Western European languages tend to perform better than most other languages. And probably I should add Chinese and Japanese to that list as well, because there's a lot of written material that is used to train these AIs that can translate between those languages. I think also, if you look at the resources that the people providing the translation engines use, there's a massive amount of simultaneously translated text for all of the proceedings of the EU Parliament, and that's a fantastic resource for people to go and train their language models on.

  • Rufus Grig

    And the language models don't fall asleep while trying to read it either.

  • David Groves

    Well, there is that too. I have looked at some of this stuff, and it's not riveting. But equally, you've got the stuff that is done at the UN. If you are dealing with... Well, let's take Norwegian, which actually has two different scripts. So you can write it in two different ways using two different alphabets. There will be less translation to the less commonly used script, and that will just make the quality of the translation different. And the same is true in Serbian, for example, different scripts.

  • Rufus Grig

    Just to all of our Norwegian and Serbian listeners to The Learning Kerv: we know that your languages are just as valid as everybody else's. Just one final thing, David, on how you work out which language somebody is speaking. Do you have to pre-populate that? Is automatic language detection reliable? Are there nuances that you stumble on along the way?

  • David Groves

    You can automatically detect it. As we all know, a lot of languages have got a common root. So English, German, Dutch and Danish are all Germanic languages. And the problem is that if I just simply type the word hello, which is spelled the same way in three out of those four languages, I'm not going to be able to tell which language that is. H-A-L-L-O, I think, in most of those languages, and that's also allowable in English. You need a little bit more context to be able to work out which language that actually is. I think also customers are customers, and occasionally they just swap languages halfway through their web chat. We can try and tell them not to do that. But actually,

  • Rufus Grig

    I have bilingual nieces and nephews who do that all the time. They start a sentence in one language, migrate to another one halfway through and finish up in the first one.

  • David Groves

    Yeah. So actually, you need to be able to detect what language they're actually speaking. Because if you tell an AI to translate from French to German and give it Russian, it's not going to do a very good job of it. So you actually do need a layer in front of it to go, actually, that's not French. Treat it as Russian and it will do a much, much better job of it.
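
As a rough sketch of the detection layer David describes sitting in front of the translation engine, the snippet below uses the langdetect package as one example; it is not necessarily the detector used in this deployment, and very short messages like "hallo" will still be ambiguous, as he notes.

```python
from langdetect import detect, DetectorFactory  # pip install langdetect

DetectorFactory.seed = 0  # make results deterministic for short strings

def resolve_source_language(text: str, session_language: str) -> str:
    """Detect the language actually used before handing the text to the translation engine."""
    detected = detect(text)  # returns an ISO 639-1 code such as 'fr' or 'ru'
    if detected != session_language:
        # The customer may have switched language mid-chat: translate from what was
        # actually detected rather than what the session was configured for.
        return detected
    return session_language

print(resolve_source_language("Здравствуйте, у меня проблема с заказом", session_language="fr"))  # -> 'ru'
```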

  • Rufus Grig

    Brilliant. And just give us a sense of how busy is this tool? How many words are you translating every month in this service now?

  • David Groves

    I haven't counted words, actually. We're doing about five million translations a month. So each of those will be one, two, three, five sentences up to an entire email. It's hundreds of millions of characters per month easily.

  • Rufus Grig

    But as with Kyle and his thousands of operations a day, it's a system absolutely operating at scale and really doing a lot of work. David, thank you. That's been really interesting. If we could turn to Dawn: both David and Kyle have data sources, of a kind, that they're already using. Kyle's got his log files and his alarms and his reports, and David's got every utterance at the United Nations in the last 50 years, plus the messages that the customers and the agents exchange. But, I mean, data is fundamental to making AI work properly, isn't it?

  • Dawn Kelso

    Yeah, absolutely. And there are almost two slices to it. There's the slice which David and Kyle have been talking about, where you've got operational systems where the data is sitting there and you're applying AI to it within that system. And then you've got the broader area where you want to consolidate data from different data sources, unify it together and see what you can do with that data in a separate data platform.

  • Rufus Grig

    So, I mean, what's the problem if you don't get your data right?

  • Dawn Kelso

    Your results, your outcomes, are only ever going to be as good as the data. So there's that rubbish in, rubbish out concept in the world of data, which is that if you put rubbish in, you will get rubbish out. So you can have distorted predictions. You can have bias. You know, we hear a lot about discrimination in AI; particularly when ChatGPT and other AI tools first launched into the market, they were being trained on not necessarily good quality data, and because of that they were starting to give poor outcomes, poor results, whether that was abusing the end users, which a lot of people had fun trying to get them to do, or more serious things. For example, I know the Amazon recruitment folks had issues with how they trained their model. They were using AI to speed up the recruitment process, but they trained it on good CVs, of which they had more from men applying, and more males in technological roles, which is an issue we still have. And because of that, it started to pick up bias against women and started rejecting female applicants and female CVs where people mentioned that they were in women in technology groups or whatever it may be. So you have to be really careful in terms of the quality of the data that you are using, or you can really go off on tangents.

  • Rufus Grig

    That's really interesting, because I tend to think of quality data as each individual piece of data being quality. In that Amazon example you gave, each of those individual pieces of data was fine on its own; it was all the bits that were missing that were the problem. So the quality problem is also about the extent, or the size, or the completeness of a data set?

  • Dawn Kelso

    Yeah. So there are all sorts of different chunks that make up data quality. There's the accuracy, there's data compliance: is it appropriate? Is it relevant? Is it up to date? Are you labelling and classifying the data to help your AI models work through it as well? So there are all sorts of nuances. And the best way to find these nuances is by trying it out and breaking it and getting poor results and then realising, oh, just like Amazon: they obviously weren't trying to get their recruitment AI to recruit more men, they were just trying to get it to recruit good candidates. So there are lots and lots of different things that you have to think about. And there's a real key thing around something called feature selection. Is the data that we're using contextually relevant for the outputs and the outcomes that we want to achieve? That really helps with focusing the kind of data that you're using on the outcomes that you're looking for.
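
As an illustration of the kinds of checks Dawn is describing (completeness, freshness, representation, feature selection), here is a small pandas sketch; the file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical applicant data set -- the columns are invented for illustration.
df = pd.read_csv("applicants.csv", parse_dates=["application_date"])

# Completeness: how much of each column is actually populated?
print(df.notna().mean().sort_values())

# Freshness: is the data recent enough to still be relevant?
print("Oldest record:", df["application_date"].min())

# Representation: a heavy skew (e.g. far more male CVs) is exactly the gap that
# produced the biased recruitment model described above.
print(df["gender"].value_counts(normalize=True))

# Feature selection: keep only fields that are contextually relevant to the outcome,
# and drop protected attributes rather than letting a model learn from them.
features = df.drop(columns=["gender", "name", "date_of_birth"])
```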

  • Rufus Grig

    And that, to me, really shows up the skills and experience required. Are we talking about a data scientist? Is this a data scientist's role, to make sure that you get that feature selection right?

  • Dawn Kelso

    Yeah, very much so. It can be, I mean, these days data job titles merge and change and shift, and they are continuing to merge and change and shift. But certainly, you know, data scientist is a really good place to start because that's somebody who's used to really drilling into the minutiae of data and where it's at. But equally, it's having really clear, straightforward, this is our objective, you know, this is what we want to do, a real clear use case before you get started.

  • Rufus Grig

    But it is a really specialist piece of expertise. I mean, if I think back to my early days, a data expert was somebody who could write a really complex SQL query and knew what an inner join was, but they didn't have to understand the world or the universe in which they were working. They had a very specific task: discover this, produce this report. And now we're expecting people to have a broad, macroscopic understanding of the problem they're solving and the ability to interpret how to get the data right.

  • Dawn Kelso

    Yeah, absolutely. And like with all things in data, you know, back when I was the girl writing the SQL queries, there was still always that challenge of making sure that the data was relevant and up to date, and that remains a major issue today, whether you're talking about AI or more traditional data and analytics. You need to have that review and monitor cycle. In the AI world, we talk about having a human in the loop, someone who's checking and revisiting: is this still fit for purpose? Because you can end up in this data swamp, this AI swamp, where things can self-perpetuate and get worse and worse and worse, very much so in the AI space. If you aren't making sure that that data is right, and honing and refining the results, you can really go off on an interesting tangent that may not be one that you want.

  • Rufus Grig

    Yeah, yeah. I mean, that's really interesting. I guess another aspect must be the security of the data. You know, what data am I training on? And some of the pitfalls of training on data that is private or confidential. Talk about that a little, if you would.

  • Dawn Kelso

    Yeah, absolutely. So it's making sure that the data you're using is appropriate in terms of its content. If you have PII, personally identifiable information, in there, why are you using that? Do you have a legitimate reason to be using it? And particularly when it comes to biases in terms of gender, race, et cetera, these are all really sensitive data topics and potentially things that should be stripped out. And often there is that kind of positive and negative bias that you can take out, or, if you want the positive bias, put in. But absolutely, making sure that you've got the right data in there is really key. And sensitive information, when it's exposed to an AI tool, and you may not realise you've exposed it to an AI tool, that's where the real danger can come. If you set an AI tool over your SharePoint estate, for example, you will probably find CVs that people applied with years and years ago for roles that you were advertising. That's just one random example. But we find that when we're looking at HR, there are spreadsheets, you know. That data is held all over the place, and it's not realised how accessible it actually is. When you point a tool at it, a tool that has that access switched on can go off and get all sorts of information that it really shouldn't. So there's a wider piece around the security of your data estate before you start setting any of these tools into motion.
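
A deliberately simple sketch of stripping obvious PII before content is indexed or handed to an AI tool. A production estate would normally use a dedicated PII-detection service rather than hand-rolled patterns; the regexes below are illustrative and far from exhaustive.

```python
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "uk_phone": re.compile(r"\b(?:\+44\s?|0)\d{4}\s?\d{6}\b"),
    "ni_number": re.compile(r"\b[A-CEGHJ-PR-TW-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b", re.IGNORECASE),
}

def redact(text: str) -> str:
    """Replace anything matching a known PII pattern with a labelled placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

print(redact("Contact Jane on jane.doe@example.com or 07700 900123 about her CV."))
```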

  • Rufus Grig

    That's really interesting. And I guess there's also the fact that, you know, back to my ancient history, it's not all rows of data in neatly organised tables. You've talked about documents, CVs on SharePoint and Excel spreadsheets, and probably videos and call recordings. It's all data, yeah?

  • Dawn Kelso

    Absolutely, absolutely. The world we're in now with data, you've got structured, unstructured, semi-structured data, so it's anything and everything. Should you have those CCTV recordings? What does your data retention policy look like? All of that kind of thing feeds into what you are using AI for and how you're potentially exposing yourself if you're not careful.

  • Rufus Grig

    And how do you manage that? I mean, are there principles around how you do it? You talked about labelling and classifying earlier. Whose role in an organisation is it? Is that a CISO role? Is it a privacy person's role? Is it a data person's role?

  • Dawn Kelso

    I think your InfoSec people will definitely have something to say about it. Your data protection officers will have something to say about it. In Europe, we're looking at GDPR regulations, so of course there are all sorts of key things in there about how long you should be keeping data for and what you should be using data for. So there's a responsibility there. There's something about, as I said earlier, making sure that when you're pre-processing your data, you're using the right data sets. So pointing the AI tools that you're using at specific chunks of data is really key to making sure you're getting the results, the outcomes, that you want. It varies dramatically from use case to use case. And that's where you can have some really nailed-down, beautifully clean applications and some really broad, messy ones, depending on what you're doing.

  • Rufus Grig

    Okay. Really interesting. And I guess even from a more pure data and AI point of view, you've got to pick the right algorithm. I mean, in the same way that David found some services do a better job of some translations than others, how do you go about understanding even what type of AI algorithm or model you should be using to achieve a particular outcome?

  • Dawn Kelso

    There are different algorithms, and there are different off-the-shelf models. So with things like Azure OpenAI, for example, there are language translation, image search, all sorts of tools where you can actually pick something off the shelf to use. There are things like AI Studio, which will let you test different algorithms, look at the outputs and see which is the right fit for you based on those outputs and the test cases you put through. Or, you know, the internet is full of decision trees in terms of taking you through what the options are: what are you trying to achieve? What are the most important things for you, whether it's speed, accuracy, et cetera? And you can go through all of those decision trees in order to work it out. But yes, your mileage may vary. There are a lot of options.

  • Rufus Grig

    Yeah, I'm sure. And then once you have got the thing built and deployed, I'm guessing you don't just turn your back on it and go on to the next thing. There's a role to make sure that it continues to do the right thing. Can you just explain some of those aspects?

  • Dawn Kelso

    Yeah, absolutely. So review and monitor is a real key thing. You know, when we were talking about machine learning, which was the first kind of AI that, certainly for me in the data platform world, became something real and tangible that we were using, the real key thing with machine learning was: it's learning. It's constantly learning. And you need to go back and re-evaluate it and make sure that the way that it is processing data now is still the way that you want it to process data. So, in the AI world, make sure that you're reviewing the outputs and you're monitoring them. The things that David talked about earlier in terms of how data is being translated: are we looking at that? Are we checking? Have we got a human in the loop who's going, yes, this is still doing the right thing, or, hang on a minute, this is starting to veer off and we need to give it more thought? And of course, tools are constantly evolving, so there are constantly new versions coming out. If you're using cloud-based tools, then you don't generally have an option; you get the update whether you like it or not. But there are always things that we need to review. There are improvements, there are changes. So constantly revisit what you're doing with AI tooling to make sure it is still fit for purpose and still doing the right thing. And having that human in the loop is a real key part of that. You know, you can set AI off and let it do its thing, and you can monitor it with all sorts of tools, but every so often you need a person to have a look at it and just check.

  • Rufus Grig

    Isn't it nice that there's still a role for a person after all this time? All of you, thank you. This has been brilliant. Thank you, Kyle. Thank you, David. And thank you, Dawn. It's been absolutely fascinating hearing from people who really do this stuff and really bring this AI stuff to life. If you've been interested in what we've had to say here on The Learning Kerv, then please do get in touch and tell us what you think. You can find out more about Kerv and about AI in general by visiting kerv.com. And in fact, in our final episode of this generative AI series, we're going to do something a bit different. I'm going to be joined by Will Dorrington, who was my guest right at the beginning of all of this, and we're going to look into some of the immediate future for generative AI, particularly looking at the roles of agents and agentic AI. But we're also going to keep a bit of time back to answer some of your questions. So if you have a question on AI, generative AI, copilots, hallucinations, ethics, privacy, environmental impact, or quite literally anything, then please do get in touch. You can put a comment wherever you get your podcasts and we'll pick that up. Or you can email us at hello@kerv.com. So do look out for that episode. If you subscribe and tell all your friends, hopefully you'll get it notified in the podcast player of your choice. Until then, thank you to my guests, thank you for listening, and until next time, goodbye.

Description

In this episode, we delve into the practical applications of generative AI with real-world examples from industry experts. Host Rufus Grigg is joined by special guests Kyle Ansari, David Groves, and Dawn Kelso, who share their experiences in deploying AI systems that are operational and impactful.


Whether you're a tech enthusiast, a business leader, or juat curious about the practical uses of AI, this episode provides valuable insights into how AI is transforming industries and solving complex problems.


Key Highlights:


  • Operational AI in Compliance:

    Understand how Kerv's product, Compliance Cloud, uses generative AI to manage and analyse alarms in a cloud-based voice recording solution for banks. Learn how AI reduced false positives and operational workload by 97%, enhancing efficiency and accuracy.

  • AI in Customer Service:

    Explore how generative AI is used to provide multilingual customer service in contact centres. Discover the challenges and solutions in implementing automated translation for various communication channels, enabling businesses to expand into new markets without the need for extensive multilingual staff.

  • Data Quality and AI:

    Delve into the importance of data quality in AI applications. Understand the critical role of accurate, relevant, and compliant data in ensuring effective AI outcomes. Learn about the challenges of managing diverse data sources and the necessity of continuous monitoring and human oversight.


  • Future of Generative AI:

    Explore the potential future applications of generative AI in compliance and other fields. Gain insights into ongoing developments and the importance of ethical considerations in AI deployment.


Tune in to this episode of The Learning Kerv to hear first-hand accounts of AI in action and gain a deeper understanding of the practicalities and benefits of generative AI in today's technological landscape.


Hosted by Ausha. See ausha.co/privacy-policy for more information.

Transcription

  • Rufus Grig

    Hello and welcome to The Learning Curve, the podcast from Curve that delves into the latest developments in information technology and explores how organisations can put them to work for the good of their people, their customers, society and the planet. My name is Rufus Grigg and in this series, and with the help of some very special guests, we've been looking at all things generative AI. The worlds of business and technology are massively excited about Gen AI and in this series so far, all episodes of which are still available on the podcast player of your choice, we've covered the... basics of machine learning and AI in general. We've lifted the hood on generative AI and foundation models themselves in particular, and looked at some of the faster growing applications of AI. And in the last episode, we even paused to consider how we use AI responsibly and securely. But in this episode, we are going to get to where the rubber really hits the road, because my guests this time round are people who have built and deployed real live AI systems in the wild that are up. and running and working and being used every day. And it's a fantastic opportunity to hear their stories, understand what they've been able to achieve with AI, the problems that they've been solving, and see what lessons they've learned along the way. So let's meet them. First of all, we're joined by Kyle Ansari, who is the CTO of Curve's Collaboration and Compliance Practice. Hello, Kyle.

  • Kyle Ansari

    Hello, Rufus. Thank you for inviting me on this week.

  • Rufus Grig

    No, brilliant to have you. Just tell us a little bit about your day job and what you do for Curve.

  • Kyle Ansari

    Yeah. So I'm responsible for our development and engineering team in collaboration and compliance. We provide compliant voice recording solutions for tray floors, for back office, and primarily these days, everything being cloud-based for team Zoom and anything that requires recording.

  • Rufus Grig

    Brilliant. Well, thank you, Kyle. Great to have you with us. We're also joined by David Groves, who's product development director at Curve. Hi, David.

  • David Groves

    Hello there.

  • Rufus Grig

    And tell us a little bit about your world.

  • David Groves

    So we create products. for the rest of Curve to go and sell, implement, and generally do brilliant stuff with. And several of these are now sort of using AI for real.

  • Rufus Grig

    Brilliant. We look forward to hearing about that as we go on. And then finally, Dawn Kelso. Dawn is Data Solutions Director of our Digital Transformation Practice, known as Curve Digital. Hi, Dawn.

  • Dawn Kelso

    Hi.

  • Rufus Grig

    Just tell us a little bit about your background and your role day-to-day.

  • Dawn Kelso

    So... I, as you say, lead Curves digital data practice at the moment, and that involves all things to do with data. So anything and everything, particularly in the Microsoft space, but broader than that as well. And so that's looking at intelligent data platforms, AI and all things that data can touch.

  • Rufus Grig

    Brilliant. Thank you, Dawn. So great to have you all with us. We're going to start with Kyle and Kyle, you've got a very operational example to talk us through. And I. believe you're going to talk about something that you've done to support the compliance cloud service and how you've used generative AI to help manage it. But before we dive in, you better tell us what compliance cloud is and what problems it sets out to solve.

  • Kyle Ansari

    Yeah, so compliance cloud was our full cloud solution. It's fully hosted by us at Curve to allow banks to have us record their back office and now trade floors in a cloud environment. So This allowed us to build private instances to secure data to a bank's network, and we've effectively just become an extension of their recording estate. We went down this road of being able to do that because voice recording primarily in a bank has been a very difficult thing to manage, and it requires sort of normally quite a large team to be able to do it. There's also budget constraints in trying to get the right resources in that understand voice recording. So we kind of put Compliance Cloud in. with the team that we have from my side and from our operational side to kind of take that pressure off the bank of having a dedicated team that understands voice recording, understands the compliance and regulatory side of it and security side, and to be able to give it all to them in a solution.

  • Rufus Grig

    And so I guess you talk about voice recording, but I guess in this day and age, it's not just necessarily about voice, is it?

  • Kyle Ansari

    No, it's not. It is about text. It's about screen recording. webcam sort of video recording. We're now going down to the route of looking through emails and going through sort of email capture, text message capture, WhatsApp capture. It's kind of any form of communication now. So it's not just voice anymore.

  • Rufus Grig

    Great. So it's clearly a very mission critical service. The banks and the regulated industries that are your customers are utterly dependent on it to keep them on the right side of the regulators. What was the problem that that you guys as the providers of the service were setting out to solve with AI?

  • Kyle Ansari

    So for me, it was an operational challenge. Because Compliance Cloud has grown sort of the way that it has done, and we're now running around 100 servers in Azure across multiple different tenancies, you get a lot of alarms. You get a lot of false positives, and a lot of things happen in the network. Primarily, we're recording Teams, and Teams has been a very new implementation from a compliance point for most of our customers. So when you're seeing new types of alarms coming in, things happening Microsoft side, things happening recording side, we were getting... lots of alarms. So we were up at our worst point. We were looking at three, four thousand a day of just false positive alarms coming in and hitting our operation centre.

  • Rufus Grig

    So when you say a false positive alarm, you mean that the system is telling you something's gone wrong, but actually it's not anything that you really needed to worry about.

  • Kyle Ansari

    Exactly that. And that's how it started. But the problem we were having with that is because there were so many of them, it hides. any potential real alarm that could come in because you're just being flooded with data that's coming in from systems. So I set out to try and solve that and to try and work out a more efficient way for us to control our alarming, to control what data we're getting out from the tendencies and out from the voice recording servers, and then using AI to analyze it for us to reduce the manpower time we've got to look at them.

  • Rufus Grig

    So that's really interesting. Before we get into that, just... It'd be good to get a sense of just how much time it would take. So for each of those 3,000, 4,000 alarms, how long does it take a human analyst to be able to look at that and work out that it's a false positive or something they haven't got to worry about?

  • Kyle Ansari

    Yeah, so in that first couple of weeks of the first service going live, we worked it out. It was around 50 hours per 100 tickets or for 100 alarms that came in. So you can imagine it scaled very quickly to the point where we were getting completely overwhelmed by the amount of alarms coming in from the system.

  • Rufus Grig

    So even my maths can do that's half an hour per alarm and half an hour times 4,000 alarms is untenable for human beings to get those eyeballs on that.

  • Kyle Ansari

    Yeah, completely.

  • Rufus Grig

    So tell us what you did next. What was the idea and how'd it go?

  • Kyle Ansari

    So the idea was that we could take Azure's Cognitive AI service and we could categorize all of the alarms. and train the AI service on what the alarm was. And we did this by looking at every alarm code that came out from the system, pulling the logs from our log aggregator, which has the complete set of details of what was going on at the time, and then using a positive and negative enforcement on whether the alarm was a true issue, or whether it was a false positive, or was there an actual problem, or is it just the user didn't pick up the phone? So our very first version of it was just here is every alarm we have, here is thousands of rows of log files and what is good and what is bad. So we had to generate multiple calls and actually train it like in the log. This is where the pickup happens. This is where the bots join. This is how the audio stream should look. When it doesn't look like this, there's probably a problem. So that training, it took us just over two weeks to kind of pull the data set together. We had a lot of the data already from our lab systems that we could use to do this from. And then using the unstructured training service from Azure, we managed to build this all into a private GPT model that sits inside our tenancies.

  • Rufus Grig

    So those of you who can remember our episode on models and things, you've got a GPT foundation model and you are fine tuning it. Is that what you're doing with this additional data?

  • Kyle Ansari

    Yep, exactly that. It's a fine tune that goes in. purely just on voice recording and compliant log files. So it does just get a complete understanding of what it's looking at every time we paste a log file into it and every time we send it over the API to it, it can analyze it and go, yes, that does match what it should do or no, it doesn't look right. And if it doesn't look right, it then gives you a supporting, this is what we think the problem is and how to solve it. And it's evolved quite a bit since that first version.

  • Rufus Grig

    We'll come on to what you've done to evolve it later. But in terms of, you know, That first version, what sort of success did you have when you turned it on?

  • Kyle Ansari

    We reduced our intervention of tickets by 97%. So we took it from 30 minutes to 60 seconds per alarm, and it filtered out massive amounts of data for us and kind of really gave us a true picture of what was going on inside the environment. And within that first sort of month of us having it up and running, and being able to reduce all of that noise we did actually start to pick up real issues with inside the team's infrastructure. And we started to detect errors sort of quicker than they were being notified out on sort of Azure's issues tracker. And there's a couple of tickets we got into Microsoft before they knew themselves that they had a problem in their infrastructure.

  • Rufus Grig

    Wow. So, I mean, that 60 minutes to 30 seconds, I mean, that's not a bad ROI stat if you're looking to demonstrate that this was a worthwhile project, is it?

  • Kyle Ansari

    Yeah. Yeah.

  • Rufus Grig

    So, I mean, that sounds brilliant, but you said you then went on to evolve it. I mean, what was there to improve in that?

  • Kyle Ansari

    We started to work out that we could train it on more data that wasn't just to do with voice recording. So we took all of Microsoft's published error codes and put those into the training stack. So when an alarm comes in now and I'm getting a Microsoft fault code back, we can also go and look up Microsoft's fault code against ours and tie the whole piece together to go: actually, I can see that, yes, we alarmed, yes, there was a problem, but the diag code comes back showing the customer side didn't connect. So we're completely ruling out our infrastructure by being able to look up the Microsoft codes coming back to us. That was the next big bit of training that we did. We then kicked it up another notch: for any particular call that we were seeing that had a potential problem, we were getting the internal call IDs back from the Teams service. And because we're using bot recording, we were then able to start looking that data up inside Microsoft's CQD, the Call Quality Dashboard, and tracing the full call flow. Did the call genuinely happen? Was there an issue with someone's audio? It just gave us a level of investigation that we couldn't have previously. So it's made it even faster and more accurate as we've gone along.
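
A much-simplified sketch of that correlation step is below: tying one of our own alarms to a Microsoft-side diagnostic code to decide whether the recorder infrastructure can be ruled out. The fault-code table, the alarm name and the routing rules are illustrative stand-ins, not the real published codes or the logic in the product.

```python
# Illustrative fault-code table only; in practice the meanings would come from
# Microsoft's published documentation that was folded into the training data.
MS_FAULT_CODES = {
    "480": "Callee unavailable / did not answer",
    "487": "Call cancelled by the caller before connect",
    "500": "Service-side failure in the calling infrastructure",
}

def correlate(alarm_code: str, ms_diag_code: str) -> str:
    """Tie a recorder alarm to Microsoft's diagnostic code to decide where the fault lies."""
    ms_meaning = MS_FAULT_CODES.get(ms_diag_code, "Unknown Microsoft diagnostic code")
    if ms_diag_code in {"480", "487"}:
        # Customer-side outcome: the recorder infrastructure can be ruled out.
        return f"Alarm {alarm_code}: {ms_meaning} - customer side, recorder ruled out."
    return f"Alarm {alarm_code}: {ms_meaning} - escalate, possible platform issue."

print(correlate("REC-NOAUDIO", "487"))
```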

  • Rufus Grig

    Wow. You've presumably been through multiple iterations. One of the things people do say about fine-tuning is that it can be expensive; it takes a lot of time to prepare the data. But from what you're saying, in your particular application it really has paid off. It's also, I suppose, not the sort of thing that the models are going to have come across in their foundational training.

  • Kyle Ansari

    No, absolutely. It was expensive. Was it as expensive as having a whole team of people spend 30 minutes per ticket? No. We've saved a massive amount by being able to use AI to go through these tickets and pull this data together for us. We're having to keep training it, and I'm doing a re-fine-tune currently every six months. I do see drift in it, and you do still see some hallucination responses, but I think that'll get better over time. So we are still testing it, still stress-testing it and deliberately forcing data in there just to see that it is still working as intended, and it will get better over time.

  • Rufus Grig

    So, I mean, Kyle, you've been using this for, what, a year? Is it more than a year it's been in place now?

  • Kyle Ansari

    More than a year it's been in place now, yeah.

  • Rufus Grig

    So, I mean, I think this gives the date away, and I know that Ed and Kim, our producers, generally don't want us to say what the day is, but I think it is actually ChatGPT's second birthday today. And so this really is probably one of the earliest genuine production applications, and you've been running it for a long time. So it's really fascinating to speak to somebody who was in at it quite early on. Are you starting to see any other potentials for generative AI in compliance in general? And what are you doing about those?

  • Kyle Ansari

    Yeah, there is a whole other application we have for generative AI, and AI in general, in compliance. There is a whole stack of development we have planned for next year, which revolves around another form of it that I've been working on. I can't go into too much detail just yet about what that is, for legal and compliance reasons right now. But there is an issue that we're solving that used to take hours of people power to get a certain task done, and using AI in our tests and all of the alpha work that we've been doing on it, again, we're reducing hours' worth of work down to minutes. It's allowing us to look at remote locations, and it's allowing us to work on multiple offices at the same time. So there's just a huge improvement that we can bring in over the next year using generative AI.

  • Rufus Grig

    Brilliant. Well, look, when you get through the hush-hush, secret-squirrel phase of this, you will come back and tell us about it, won't you?

  • Kyle Ansari

    Absolutely. And I'd love to talk about it when I can.

  • Rufus Grig

    Brilliant. Brilliant. Thanks, Kyle. Well, look, that's been fascinating. David, if we can turn to you, you've been using generative AI to augment an existing solution too. Just give us a little bit of the background about the problem you were setting out to solve.

  • David Groves

    So Curve is one of the biggest partners in the UK, and for that matter elsewhere, with a company called Genesys, who do contact center systems, and they're one of the leading contact center platforms that you can get. And like all contact centers, some people want to take voice calls, some people want to take emails, some people want to take web chats, and then you've got a whole gamut of other media types, whether it's WhatsApp, Messenger and the like. Well, in this particular use case, we had a customer who was using it for customer service, so basically aftermarket support for their customers. And they were trying to get into lots of different territories across Europe. It's very expensive to go and hire 20 contact center people who speak Norwegian if you want to get into Norway, and then another 20 who speak Swedish, and then another bunch of people who speak every single language you can imagine in Europe. And there are a lot of languages in Europe. So what we were looking to do was allow this customer, and for that matter others, to handle customer service in whatever language they needed, using a single pool of people speaking whatever languages they do speak, to provide customer service initially throughout Europe, but now worldwide. So basically what we did was put automated translation in place right at the heart of the contact center, initially to handle web chat, WhatsApp messaging and those sorts of media types, and now email as well.

  • Rufus Grig

    Okay. I mean, translation has been a fairly high-profile application of generative AI, I suppose, and you've been deploying it in a way that delivers very real benefits into a contact centre, allowing it to access new markets and serve people who might not necessarily speak the same language as the manufacturer or the service provider. It sounds absolutely on the money for this sort of technology. So how did it work? What sort of challenges did you hit along the way? Give us a sense of what happened.

  • David Groves

    So I think the interesting thing is that we're speaking English and we're all native English speakers, and that makes us very, very spoiled when it comes to understanding the nuances of other languages. For example, English doesn't have male and female words, unlike German, where they've got male, female and neuter words, and the way that you would say something is pretty much the same whether you happen to be male or female. But that's definitely not the case in other languages. And you also find that idioms are really hard to translate well. So if I start telling you that you made a pig's ear of something, you as an English speaker completely understand what I'm saying, but for someone who doesn't have that exposure to English, if you translate every single word into another language, it sounds like complete gobbledygook. And I remember reading a message that someone had flagged for us, actually in Spain, saying that someone had thrown the house out of their own window, which actually means treating something as if it was ridiculously expensive, I believe. That sort of idiom just does not translate terribly well. And it's important, because you're dealing with consumers who are just using their language the way they're used to, and you need to get a sense of what they actually mean across to the other person, who doesn't necessarily speak that language. So we found that was a real issue.

  • Rufus Grig

    I guess if I think about it, translation services have been around for a long time. I remember trying to tell my children that they shouldn't be using Google Translate to do their French homework, you know, years and years and years ago. But I guess those original translation services were a lot more literal, almost word for word. So does generative AI manage those idioms, that natural sense of language, better than the more traditional translation services?

  • David Groves

    Yes, they're all getting better all the time. But depending on which service you use, they're not all as good as each other. So you actually do need to go and check. to see whether the quality of the translation for the particular use case you've got is going to be what you need. And a lot of the differences are now around handling of idiom and also their ability to handle context. So one of the other examples is we've got one customer who's commonly talking about hand cream and doesn't really want the word cream translated as dairy cream because that's not really what they're talking about. And so the same word in English is used for both, but it's not the same word in other languages. And you need to make sure that the translation engine has got some context. That's actually quite hard to do. It's something that the large language models are doing quite well, far better than the previous generations.

  • Rufus Grig

    So is that something that you, I mean, you as the developer, the end user is completely oblivious to this, but you as the developer, you're taking care of this in the prompt, are you? You're saying... This is a conversation between a customer service agent and a customer and they're talking about, you know, a cosmetic hand cream product and that steers the LLM to give you a better translation. I mean, I'm sure I'm oversimplifying it, but that sort of thing.

  • David Groves

    There are several ways to do it. So you can do it in the prompt: you're providing context. The more context you provide, the more tokens you're using, and therefore the more expensive it gets. So there is a trade-off to be had between how much of this background information you want to provide every single time and how much you want to pay for it. There are other ways of doing it as well; many of the language services have got a concept of something called glossaries, which allow you to have that context about the subject that you're talking about. The thing is that if you just type something into a web chat and you simply take that and ask a translation engine to translate it, there's an awful lot of context that you're just not giving it. It's just that one sentence that you've typed, without any of the history of what's happened before, or what this company provides, or the kind of voice that they would like to use, or any of those things. And the new developments are about getting that context into the conversation to make the translation and the communication just work better.
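
As an illustration of how that context might be pushed into a prompt, here is a small sketch that folds a brand-voice note and a couple of glossary terms into a translation request; every extra line costs tokens on every call, which is exactly the trade-off David describes. The glossary entries, context wording and function name are invented, not taken from the real deployment.

```python
# Illustrative glossary and brand-context block; in practice these would come
# from the customer's configuration, and every extra line costs tokens per call.
GLOSSARY = {"cream": "skin-care cream (cosmetic), never dairy cream"}
BRAND_CONTEXT = "This is a customer-service chat for a cosmetics retailer. Keep a friendly, informal tone."

def build_translation_prompt(message: str, source_lang: str, target_lang: str) -> str:
    """Assemble a single prompt that carries domain context alongside the text to translate."""
    glossary_lines = "\n".join(f"- '{term}': {meaning}" for term, meaning in GLOSSARY.items())
    return (
        f"{BRAND_CONTEXT}\n"
        f"Glossary:\n{glossary_lines}\n\n"
        f"Translate the following {source_lang} message into {target_lang}, "
        f"preserving idioms by meaning rather than word for word:\n{message}"
    )

print(build_translation_prompt("Does this cream work on dry hands?", "English", "German"))
```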

  • Rufus Grig

    Okay, interesting. I'm also quite interested in how the different languages perform, because one of the things that we talked about in other episodes in the series is some of the inherent biases that you get with the models. Because, as you said, we speak English, and a lot of the internet is in English, there are some minority languages which the language models would have had a lot less source material to train on. Do you find that things like Spanish to English have a much higher accuracy than perhaps less commonly spoken languages?

  • David Groves

    Yeah, Western European languages tend to perform better than most other languages. And probably I should add Chinese and Japanese to that list as well, because there's a lot of written material that is used to train these AIs that can translate between those languages. I think also, if you look at the resources that the people providing the translation engines use, there's a massive amount of simultaneously translated text for all of the proceedings of the EU Parliament, and that's a fantastic resource for people to go and train their language models on.

  • Rufus Grig

    And the language models don't fall asleep while trying to read it either.

  • David Groves

    Well, there is that too. I have looked at some of this stuff, and it's not riveting. But equally, you've got the material that is produced at the UN. If you are dealing with, well, let's take Norwegian, which actually has two different written forms, Bokmål and Nynorsk, so you can write it in two different ways. There will be less translated material in the less commonly used form, and that will just make the quality of the translation different. And the same is true in Serbian, for example, which can be written in two different scripts.

  • Rufus Grig

    Just to all of our Norwegian and Serbian listeners to The Learning Curve: we know that your languages are absolutely just as valid as everybody else's. Just one final thing, David, on how you work out which language somebody is speaking. Do you have to pre-populate that? Is automatic language detection reliable? Are there nuances that you stumble on along the way?

  • David Groves

    You can automatically detect it. As we all know, languages have got a common root. So English, German, Dutch, Danish, all are Germanic languages. And the problem is that if... I just simply type the word hello, which is spelled the same way in three out of those four languages, I'm not going to be able to tell which language that is. H-A-L-L-O, I think, in most of those languages, and also that's allowable in English. You need a little bit more context to be able to work out which language that actually is. I think also customers are customers, and occasionally they just swap languages halfway through their web chat. We can try and tell them not to do that. But actually,

  • Rufus Grig

    I have bilingual nieces and nephews who do that all the time. They start a sentence in one language, migrate to another one halfway through and finish up in the first one.

  • David Groves

    Yeah. So actually, you need to be able to detect what language they're actually speaking. Because if you tell an AI to translate from French to German and give it Russian, it's not going to do a very good job of it. So you actually do need a layer in front of it to go, actually, that's not French. Treat it as Russian and it will do a much, much better job of it.
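
A minimal sketch of that detection layer, using the open-source langdetect package purely as an example, might look like the following; which detector the real service actually uses isn't stated in the conversation, and short greetings like "hallo" remain genuinely ambiguous for any detector.

```python
from langdetect import detect, DetectorFactory  # pip install langdetect

DetectorFactory.seed = 0  # make detection deterministic across runs

def route_for_translation(message: str, expected_lang: str) -> str:
    """Detect the actual language before translating, rather than trusting the session's expected one."""
    detected = detect(message)  # returns an ISO 639-1 code, e.g. 'fr', 'ru'
    if detected != expected_lang:
        # The customer has switched language mid-conversation: translate from what
        # they actually wrote, not from what the session says they speak.
        return detected
    return expected_lang

print(route_for_translation("Здравствуйте, у меня вопрос о заказе", "fr"))  # -> 'ru'
```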

  • Rufus Grig

    Brilliant. And just give us a sense of how busy is this tool? How many words are you translating every month in this service now?

  • David Groves

    I haven't counted words, actually. We're doing about five million translations a month. So each of those will be one, two, three, five sentences up to an entire email. It's hundreds of millions of characters per month easily.

  • Rufus Grig

    But as with Kyle and his thousands of operations a day, it's a system absolutely operating at scale and really doing a lot of work. David, thank you. That's been really interesting. If we could turn to Dawn: both David and Kyle have talked about the data sources they're already using. Kyle's got his log files and his alarms and his reports, and David's got every utterance at the United Nations in the last 50 years, plus the messages between the customers and the agents. But data is fundamental to making AI work properly, isn't it?

  • Dawn Kelso

    Yeah, absolutely. And there are almost two slices to it. There's the slice David and Kyle have been talking about, where you've got operational systems where the data is sitting there and you're applying AI to it within that system. And then there's the broader area where you want to consolidate data from different data sources, unify it together, and see what you can do with that data in a separate data platform.

  • Rufus Grig

    So, I mean, what's the problem if you don't get your data right?

  • Dawn Kelso

    Your results, your outcomes, are only ever going to be as good as the data. So there's that rubbish-in, rubbish-out concept in the world of data: if you put rubbish in, you will get rubbish out. So you can have distorted predictions, you can have bias. We hear a lot about discrimination in AI; particularly when ChatGPT and other AI tools first launched into the market, they were being trained on not necessarily good-quality data, and because of that they were starting to give poor outcomes, poor results, whether that was abusing the end users, which a lot of people had fun trying to get them to do, or more serious things. For example, Amazon's recruitment folks had issues with how they trained their data. They were using AI to speed up the recruitment process, but they trained it on historically good CVs, where they had more men applying and more men in technology roles, which is an issue we still have. And because of that, it started to pick up bias against women and started rejecting female applicants and female CVs where people mentioned that they were in women-in-technology groups or whatever it may be. So you have to be really careful about the quality of the data that you are using, or you can really go off on tangents.

  • Rufus Grig

    That's really interesting, because I tend to think of quality data as each individual piece of data being quality. In that Amazon example you gave, each of those individual pieces of data was fine on its own; it was all the bits that were missing that were the problem. So is the quality problem one of extent, or the size of the data set, or the completeness of the data set?

  • Dawn Kelso

    Yeah. So there are all sorts of different chunks that make up data quality. There's accuracy; there's whether the data is compliant; is it appropriate, is it relevant, is it up to date? Are you labeling and classifying the data to help your AI models work through it? So there are all sorts of nuances, and the best way to find these nuances is by trying it out, breaking it, getting poor results, and then realizing, oh, just like Amazon: they obviously weren't trying to get their recruitment AI to recruit more men, they were just trying to get it to recruit good candidates. So there are lots and lots of different things that you have to think about. And there's a real key thing around something called feature selection: is the data that we're using contextually relevant for the outputs and the outcomes that we want to achieve? That really helps with focusing the kind of data that you're using on the outcomes that you're looking for.
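
To illustrate feature selection in the narrowest possible way, the sketch below drops direct identifiers and obvious proxies for protected attributes from a toy applicant table before any model sees it. The column names are invented, and in reality dropping proxies alone does not remove bias, which is Dawn's wider point about monitoring.

```python
import pandas as pd

# Toy applicant dataset; column names are invented purely for illustration.
df = pd.DataFrame({
    "years_experience": [5, 2, 8],
    "relevant_skills_count": [7, 3, 9],
    "gender": ["F", "M", "F"],
    "womens_tech_group_member": [True, False, True],
    "email": ["a@example.com", "b@example.com", "c@example.com"],
})

# Feature selection: keep only columns that are contextually relevant to the
# outcome, and drop identifiers and obvious proxies for protected attributes.
PROTECTED_OR_PII = ["gender", "womens_tech_group_member", "email"]
features = df.drop(columns=PROTECTED_OR_PII)

print(features.columns.tolist())  # ['years_experience', 'relevant_skills_count']
```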

  • Rufus Grig

    And that to me really shows up the skills and experience needed. Are we talking about a data scientist? Is it a data scientist's role to make sure that you get that feature selection right?

  • Dawn Kelso

    Yeah, very much so. It can be, I mean, these days data job titles merge and change and shift, and they are continuing to merge and change and shift. But certainly, you know, data scientist is a really good place to start because that's somebody who's used to really drilling into the minutiae of data and where it's at. But equally, it's having really clear, straightforward, this is our objective, you know, this is what we want to do, a real clear use case before you get started.

  • Rufus Grig

    But it is a really specialist piece of expertise. I mean, if I think back to my early days, a data expert was somebody who could write a really complex SQL query and knew what an inner join was, but they didn't have to understand the world or the universe in which they were working. They had a very specific task: discover this, produce this report. And now we're expecting people to have a broad, macroscopic understanding of the problem they're solving and the ability to interpret how to get the data right.

  • Dawn Kelso

    Yeah, absolutely. And like with all things in data, you know, back when I was the girl writing the SQL queries, there was still always that challenge of making sure that the data was relevant and up to date, and that remains a major issue today, whether you're talking about AI or more traditional data and analytics. You need that kind of review-and-monitor approach. In the AI world, we talk about having a human in the loop, someone who's checking and revisiting: is this still fit for purpose? Because you can end up in this data swamp, this AI swamp, where things self-perpetuate and get worse and worse and worse, very much so in the AI space. If you aren't making sure that the data is right, and honing and refining the results, you can really go off on an interesting tangent that may not be one that you want.

  • Rufus Grig

    Yeah, yeah. I mean, that's really interesting. I guess another aspect must be the security of the data. You know, what data am I training on? And some of the pitfalls of training on data that is private or confidential. Talk about that a little, if you would.

  • Dawn Kelso

    Yeah, absolutely. So it's making sure that the data you're using is appropriate in terms of its content. If you have PII, personally identifiable information, in there, why are you using that? Do you have a legitimate reason to be using it? And particularly when it comes to biases in terms of gender, race, et cetera, these are all really sensitive data topics and potentially things that should be stripped out. Often there is that kind of positive and negative bias that you can take out, or, if you want the positive bias, put in. But absolutely, making sure you've got the right data in there is really key. And sensitive information can be exposed to an AI tool without you realising you've exposed it, and that's where the real danger can come. If you set an AI tool loose over your SharePoint estate, for example, you will probably find CVs that people submitted years and years ago for roles you were advertising. That's just one random example. But we find that when we're looking at HR there are spreadsheets, you know, so that data is held all over the place, and it's not realised how accessible it actually is. When you point a tool at it, a tool that has that access switched on can go off and get all sorts of information that it really shouldn't. So there's a wider piece around the security of your data estate before you start setting any of these tools into motion.
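
As a toy example of catching obvious PII before a document is exposed to an AI tool, the sketch below masks email addresses and UK-style phone numbers with a couple of regexes; a production scan would rely on a proper data-classification service rather than patterns like these, and the patterns themselves are deliberately simplistic.

```python
import re

# Simple, illustrative patterns only; real PII detection needs far more than this.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "uk_phone": re.compile(r"\b(?:\+44\s?|0)\d{4}\s?\d{6}\b"),
}

def redact(text: str) -> str:
    """Mask obvious PII before a document is handed to an AI tool."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

print(redact("Contact Jane on 07700 900123 or jane.doe@example.com"))
```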

  • Rufus Grig

    That's really interesting. And I guess there's also the fact that, you know, back to my ancient history, it's not all rows of data in neatly organised tables. You've talked about documents, CVs on SharePoint, Excel spreadsheets, and probably videos and call recordings. It's all data, yeah?

  • Dawn Kelso

    Absolutely, absolutely. The world we're in now with data, you've got structured, unstructured and semi-structured data, so it's anything and everything. Should you have those CCTV recordings? What does your data retention policy look like? All of that kind of thing feeds into what you are using AI for and how you're potentially exposing yourself if you're not careful.

  • Rufus Grig

    And how do you manage that? I mean, are there principles around how you do it? You talked about labeling and classifying earlier. Whose role in an organization is it? Is that a CISO role? Is it a privacy person's role? Is it a data person's role?

  • Dawn Kelso

    I think your InfoSecs will definitely have something to say about it. Your data protection officers will have something to say about it. In Europe, we're looking at GDPR regulations. So, of course, there's all sorts of key things in there about how long you should be keeping data for and what you should be using data for. So there's a responsibility there. There's something about, as I said earlier, making sure that when you're pre-processing your data, you're using the right data sets. So pointing the AI tools that you're using at specific chunks of data is really key to make sure you're getting... the results, the outcomes that you want. So it varies dramatically from use case to use case. And that's where you can have some really kind of nailed down, beautifully clean applications and some really broad, messy ones, depending on what you're doing.

  • Rufus Grig

    Okay. Really interesting. And I guess even from a more pure data-and-AI point of view, you've got to pick the right algorithm. I mean, in the same way that David found some services do a better job of some translations than others, how do you go about understanding even what type of AI algorithm or model you should be using to achieve a particular outcome?

  • Dawn Kelso

    There are different algorithms and different off-the-shelf models. So with things like Azure OpenAI, for example, there are language translation, image search, all sorts of tools where you can actually pick something off the shelf to use. There are things like AI Studio, which will let you test different algorithms, look at the outputs, and see which is the right fit for you based on those outputs and the test cases you put through. Or, you know, the internet is full of decision trees taking you through what the options are: what are you trying to achieve, what are the most important things for you, whether it's speed, accuracy, et cetera. And you can go through all of those decision trees in order to work it out. But yes, your mileage may vary. There are a lot of options.
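
The "test different options against your own cases" idea can be sketched very simply: score each candidate model on a small labelled test set and pick the best. The candidate names, test cases and scoring below are placeholders for whatever a real evaluation harness would use.

```python
# Hypothetical comparison harness: 'candidates' maps a model name to any callable
# that returns a prediction; the test cases and predictors are stand-ins.
test_cases = [("input text 1", "expected label A"), ("input text 2", "expected label B")]

def accuracy(predict, cases):
    """Fraction of test cases where the candidate's prediction matches the expected label."""
    hits = sum(1 for text, expected in cases if predict(text) == expected)
    return hits / len(cases)

candidates = {
    "model_a": lambda text: "expected label A",  # placeholder predictor
    "model_b": lambda text: "expected label B",  # placeholder predictor
}

scores = {name: accuracy(fn, test_cases) for name, fn in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> choose", best)
```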

  • Rufus Grig

    Yeah, I'm sure. And then once you have got the thing built and deployed, I'm guessing you don't just turn your back on it and move on to the next thing. There's a role to make sure that it continues to do the right thing. Can you just explain some of those aspects?

  • Dawn Kelso

    Yeah, absolutely. So review and monitor is a real key thing. When we were talking about machine learning, which was the first kind of AI that, certainly for me in the data platform world, became something real and tangible that we were using, the real key thing with machine learning was: it's learning. It's constantly learning, and you need to go back and re-evaluate it and make sure that the way it is processing data now is still the way you want it to process data. So in the AI world, make sure you're reviewing the outputs and monitoring them. The things that David talked about earlier in terms of how data is being translated: are we looking at that? Are we checking? Have we got a human in the loop who's going, yes, this is still doing the right thing, or, hang on a minute, this is starting to veer off and we need to give it more thought? And of course, tools are constantly evolving, so there are constantly new versions coming out. If you're using cloud-based tools, then you don't generally have an option: you get the update whether you like it or not. But there are always things that we need to review; there are improvements, there are changes. So constantly revisit what you're doing with AI tooling to make sure it is still fit for purpose and still doing the right thing. Having that human in the loop is a real key part of that. You can set AI off and let it do its thing, and you can monitor it with all sorts of tools, but every so often you need a person to have a look at it and just check.
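
A human-in-the-loop check can start as something as simple as the sketch below: route low-confidence outputs, plus a small random sample of everything else, to a person for review. The confidence threshold and sample rate are illustrative, not recommendations.

```python
import random

random.seed(42)  # reproducible sampling for the example

REVIEW_RATE = 0.05  # send roughly 5% of all outputs to a human reviewer (illustrative)

def needs_human_review(confidence: float) -> bool:
    """Route low-confidence outputs, plus a random sample of the rest, to a person."""
    return confidence < 0.7 or random.random() < REVIEW_RATE

outputs = [("translation ok", 0.95), ("odd idiom here", 0.55), ("routine reply", 0.90)]
review_queue = [text for text, conf in outputs if needs_human_review(conf)]
print(review_queue)  # the items a reviewer should look at
```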

  • Rufus Grig

    Isn't it nice that there's still a role for a person after all this time? All of you, thank you. This has been brilliant. Thank you, Kyle. Thank you, David. And thank you, Dawn. It's been absolutely fascinating hearing from people who really do this stuff and really bring AI to life. If you've been interested in what we've had to say here on The Learning Curve, then please do get in touch and tell us what you think. You can find out more about Curve and about AI in general by visiting curve.com. And in fact, in our final episode of this generative AI series, we're going to do something a bit different. I'm going to be joined by Will Dorrington, who was my guest right at the beginning of all of this, and we're going to look into some of the immediate future for generative AI, particularly the roles of agents and agentic AI. But we're also going to keep a bit of time back to answer some of your questions. So if you have a question on AI, generative AI, copilots, hallucinations, ethics, privacy, environmental impact, or quite literally anything, then please do get in touch. You can leave a comment wherever you get your podcasts and we'll pick that up, or you can email us on hello at curve.com. So do look out for that episode. If you subscribe and tell all your friends, hopefully you'll get it notified in the podcast player of your choice. Until then, thank you to my guests, thank you for listening, and until next time, goodbye.

Share

Embed

You may also like

Description

In this episode, we delve into the practical applications of generative AI with real-world examples from industry experts. Host Rufus Grigg is joined by special guests Kyle Ansari, David Groves, and Dawn Kelso, who share their experiences in deploying AI systems that are operational and impactful.


Whether you're a tech enthusiast, a business leader, or juat curious about the practical uses of AI, this episode provides valuable insights into how AI is transforming industries and solving complex problems.


Key Highlights:


  • Operational AI in Compliance:

    Understand how Kerv's product, Compliance Cloud, uses generative AI to manage and analyse alarms in a cloud-based voice recording solution for banks. Learn how AI reduced false positives and operational workload by 97%, enhancing efficiency and accuracy.

  • AI in Customer Service:

    Explore how generative AI is used to provide multilingual customer service in contact centres. Discover the challenges and solutions in implementing automated translation for various communication channels, enabling businesses to expand into new markets without the need for extensive multilingual staff.

  • Data Quality and AI:

    Delve into the importance of data quality in AI applications. Understand the critical role of accurate, relevant, and compliant data in ensuring effective AI outcomes. Learn about the challenges of managing diverse data sources and the necessity of continuous monitoring and human oversight.


  • Future of Generative AI:

    Explore the potential future applications of generative AI in compliance and other fields. Gain insights into ongoing developments and the importance of ethical considerations in AI deployment.


Tune in to this episode of The Learning Kerv to hear first-hand accounts of AI in action and gain a deeper understanding of the practicalities and benefits of generative AI in today's technological landscape.


Hosted by Ausha. See ausha.co/privacy-policy for more information.

Transcription

  • Rufus Grig

    Hello and welcome to The Learning Curve, the podcast from Curve that delves into the latest developments in information technology and explores how organisations can put them to work for the good of their people, their customers, society and the planet. My name is Rufus Grigg and in this series, and with the help of some very special guests, we've been looking at all things generative AI. The worlds of business and technology are massively excited about Gen AI and in this series so far, all episodes of which are still available on the podcast player of your choice, we've covered the... basics of machine learning and AI in general. We've lifted the hood on generative AI and foundation models themselves in particular, and looked at some of the faster growing applications of AI. And in the last episode, we even paused to consider how we use AI responsibly and securely. But in this episode, we are going to get to where the rubber really hits the road, because my guests this time round are people who have built and deployed real live AI systems in the wild that are up. and running and working and being used every day. And it's a fantastic opportunity to hear their stories, understand what they've been able to achieve with AI, the problems that they've been solving, and see what lessons they've learned along the way. So let's meet them. First of all, we're joined by Kyle Ansari, who is the CTO of Curve's Collaboration and Compliance Practice. Hello, Kyle.

  • Kyle Ansari

    Hello, Rufus. Thank you for inviting me on this week.

  • Rufus Grig

    No, brilliant to have you. Just tell us a little bit about your day job and what you do for Curve.

  • Kyle Ansari

    Yeah. So I'm responsible for our development and engineering team in collaboration and compliance. We provide compliant voice recording solutions for tray floors, for back office, and primarily these days, everything being cloud-based for team Zoom and anything that requires recording.

  • Rufus Grig

    Brilliant. Well, thank you, Kyle. Great to have you with us. We're also joined by David Groves, who's product development director at Curve. Hi, David.

  • David Groves

    Hello there.

  • Rufus Grig

    And tell us a little bit about your world.

  • David Groves

    So we create products. for the rest of Curve to go and sell, implement, and generally do brilliant stuff with. And several of these are now sort of using AI for real.

  • Rufus Grig

    Brilliant. We look forward to hearing about that as we go on. And then finally, Dawn Kelso. Dawn is Data Solutions Director of our Digital Transformation Practice, known as Curve Digital. Hi, Dawn.

  • Dawn Kelso

    Hi.

  • Rufus Grig

    Just tell us a little bit about your background and your role day-to-day.

  • Dawn Kelso

    So... I, as you say, lead Curves digital data practice at the moment, and that involves all things to do with data. So anything and everything, particularly in the Microsoft space, but broader than that as well. And so that's looking at intelligent data platforms, AI and all things that data can touch.

  • Rufus Grig

    Brilliant. Thank you, Dawn. So great to have you all with us. We're going to start with Kyle and Kyle, you've got a very operational example to talk us through. And I. believe you're going to talk about something that you've done to support the compliance cloud service and how you've used generative AI to help manage it. But before we dive in, you better tell us what compliance cloud is and what problems it sets out to solve.

  • Kyle Ansari

    Yeah, so compliance cloud was our full cloud solution. It's fully hosted by us at Curve to allow banks to have us record their back office and now trade floors in a cloud environment. So This allowed us to build private instances to secure data to a bank's network, and we've effectively just become an extension of their recording estate. We went down this road of being able to do that because voice recording primarily in a bank has been a very difficult thing to manage, and it requires sort of normally quite a large team to be able to do it. There's also budget constraints in trying to get the right resources in that understand voice recording. So we kind of put Compliance Cloud in. with the team that we have from my side and from our operational side to kind of take that pressure off the bank of having a dedicated team that understands voice recording, understands the compliance and regulatory side of it and security side, and to be able to give it all to them in a solution.

  • Rufus Grig

    And so I guess you talk about voice recording, but I guess in this day and age, it's not just necessarily about voice, is it?

  • Kyle Ansari

    No, it's not. It is about text. It's about screen recording. webcam sort of video recording. We're now going down to the route of looking through emails and going through sort of email capture, text message capture, WhatsApp capture. It's kind of any form of communication now. So it's not just voice anymore.

  • Rufus Grig

    Great. So it's clearly a very mission critical service. The banks and the regulated industries that are your customers are utterly dependent on it to keep them on the right side of the regulators. What was the problem that that you guys as the providers of the service were setting out to solve with AI?

  • Kyle Ansari

    So for me, it was an operational challenge. Because Compliance Cloud has grown sort of the way that it has done, and we're now running around 100 servers in Azure across multiple different tenancies, you get a lot of alarms. You get a lot of false positives, and a lot of things happen in the network. Primarily, we're recording Teams, and Teams has been a very new implementation from a compliance point for most of our customers. So when you're seeing new types of alarms coming in, things happening Microsoft side, things happening recording side, we were getting... lots of alarms. So we were up at our worst point. We were looking at three, four thousand a day of just false positive alarms coming in and hitting our operation centre.

  • Rufus Grig

    So when you say a false positive alarm, you mean that the system is telling you something's gone wrong, but actually it's not anything that you really needed to worry about.

  • Kyle Ansari

    Exactly that. And that's how it started. But the problem we were having with that is because there were so many of them, it hides. any potential real alarm that could come in because you're just being flooded with data that's coming in from systems. So I set out to try and solve that and to try and work out a more efficient way for us to control our alarming, to control what data we're getting out from the tendencies and out from the voice recording servers, and then using AI to analyze it for us to reduce the manpower time we've got to look at them.

  • Rufus Grig

    So that's really interesting. Before we get into that, just... It'd be good to get a sense of just how much time it would take. So for each of those 3,000, 4,000 alarms, how long does it take a human analyst to be able to look at that and work out that it's a false positive or something they haven't got to worry about?

  • Kyle Ansari

    Yeah, so in that first couple of weeks of the first service going live, we worked it out. It was around 50 hours per 100 tickets or for 100 alarms that came in. So you can imagine it scaled very quickly to the point where we were getting completely overwhelmed by the amount of alarms coming in from the system.

  • Rufus Grig

    So even my maths can do that's half an hour per alarm and half an hour times 4,000 alarms is untenable for human beings to get those eyeballs on that.

  • Kyle Ansari

    Yeah, completely.

  • Rufus Grig

    So tell us what you did next. What was the idea and how'd it go?

  • Kyle Ansari

    So the idea was that we could take Azure's Cognitive AI service and we could categorize all of the alarms. and train the AI service on what the alarm was. And we did this by looking at every alarm code that came out from the system, pulling the logs from our log aggregator, which has the complete set of details of what was going on at the time, and then using a positive and negative enforcement on whether the alarm was a true issue, or whether it was a false positive, or was there an actual problem, or is it just the user didn't pick up the phone? So our very first version of it was just here is every alarm we have, here is thousands of rows of log files and what is good and what is bad. So we had to generate multiple calls and actually train it like in the log. This is where the pickup happens. This is where the bots join. This is how the audio stream should look. When it doesn't look like this, there's probably a problem. So that training, it took us just over two weeks to kind of pull the data set together. We had a lot of the data already from our lab systems that we could use to do this from. And then using the unstructured training service from Azure, we managed to build this all into a private GPT model that sits inside our tenancies.

  • Rufus Grig

    So those of you who can remember our episode on models and things, you've got a GPT foundation model and you are fine tuning it. Is that what you're doing with this additional data?

  • Kyle Ansari

    Yep, exactly that. It's a fine tune that goes in. purely just on voice recording and compliant log files. So it does just get a complete understanding of what it's looking at every time we paste a log file into it and every time we send it over the API to it, it can analyze it and go, yes, that does match what it should do or no, it doesn't look right. And if it doesn't look right, it then gives you a supporting, this is what we think the problem is and how to solve it. And it's evolved quite a bit since that first version.

  • Rufus Grig

    We'll come on to what you've done to evolve it later. But in terms of, you know, That first version, what sort of success did you have when you turned it on?

  • Kyle Ansari

    We reduced our intervention of tickets by 97%. So we took it from 30 minutes to 60 seconds per alarm, and it filtered out massive amounts of data for us and kind of really gave us a true picture of what was going on inside the environment. And within that first sort of month of us having it up and running, and being able to reduce all of that noise we did actually start to pick up real issues with inside the team's infrastructure. And we started to detect errors sort of quicker than they were being notified out on sort of Azure's issues tracker. And there's a couple of tickets we got into Microsoft before they knew themselves that they had a problem in their infrastructure.

  • Rufus Grig

    Wow. So, I mean, that 60 minutes to 30 seconds, I mean, that's not a bad ROI stat if you're looking to demonstrate that this was a worthwhile project, is it?

  • Kyle Ansari

    Yeah. Yeah.

  • Rufus Grig

    So, I mean, that sounds brilliant, but you said you then went on to evolve it. I mean, what was there to improve in that?

  • Kyle Ansari

    We started to work out that we could train it on more data that wasn't just to do with voice reporting. So we took all of Microsoft's error codes that they have published and put that into the training stack. So when an alarm comes in now and I'm getting a Microsoft fault code come back, we can now also go and look up Microsoft's fault code against ours and tie the whole piece together to go, actually. I can see that, yes, we alarmed, yes, there was a problem, but the Diag code comes back because the Sitcom customer side didn't connect. So we're completely crawling out our infrastructure by being able to look up Microsoft's codes coming back to us. That was the next big bit of training that we did. We then kind of kicked it up another notch that for any particular call that we were seeing that was having a potential problem, we were getting the internal call IDs back from the team service. And because we're using bot recording, we were then able to start looking up that data with the insider Microsoft CQDs and then tracing the full call flow. Did the call genuinely happen? Was there an issue with someone's audio? And it just gave us a level of investigation that we just couldn't have previously. So it's just made it even faster and more accurate as we've gone along.

  • Rufus Grig

    Wow. You've presumably been through sort of multiple iterations. One of the things people do say about fine tuning is that it can be expensive. It takes a lot of time to prepare the data. But from what you're saying in your particular application, it really has paid off. It's also, I suppose, not the sort of innate, it's not the sort of thing that the models are going to have come across in their foundational training.

  • Kyle Ansari

    No, absolutely. It was expensive. Was it as expensive as having a whole team of people go through 30 minutes a ticket? No, we've saved a massive amount on being able to use AI to go through these tickets and pull this data together for us. We're having to train and I'm doing a refi in tune currently every six months. I do see drift in it. You do see some kind of like hallucination responses still, but I think that'll get better over time. So we are still testing it and we're still having to stress test it and put. deliberately force data in there just to see that it is still working as intended, then it will get better over time.

  • Rufus Grig

    So, I mean, Kyle, you've been using this for, what, a year? Is it more than a year it's been in place now?

  • Kyle Ansari

    More than a year it's been in place now, yeah.

  • Rufus Grig

    So, I mean, I think, give the date away, and I know that Ed and Kim, our producers, generally don't want us to say what the day is, but I think it is actually Chachi PT's second birthday today. And so, I mean, this really is probably one of the earliest genuine production events. applications that you've been running for a long time. So, I mean, it's really fascinating to speak to somebody who really was in at it quite early on. Are you starting to see any other potentials for generative AI in compliance in general? And what are you doing about those?

  • Kyle Ansari

    Yeah, there is a whole other application we have for generative AI and just AI in general in compliance. There is a whole stack of development we have planned for next year. which is revolving around another form that I've been working on. I can't go into too much detail just yet about what that is, just for legal and compliant reasons right now. But there is actually a issue that we're solving that primarily is something that used to take hours of people power to go and get a certain task done. And using AI in our tests and sort of all of our kind of alpha work that we've been doing on it. again, we're reducing it down to minutes over hours worth of work. And it's allowing us to look at sort of remote locations. It's allowing us to work on multiple offices at the same time. So there's just a huge improvement that we can bring in in the next year using generative AI.

  • Rufus Grig

    Brilliant. Well, look, when you get through the hush-hush secret squirrels page of this, you will come back and tell us about it, won't you?

  • Kyle Ansari

    Absolutely. And I'd love to talk about it when I can.

  • Rufus Grig

    Brilliant. Brilliant. Thanks, Kyle. Well, look, that's been fascinating. David. If we can turn to you, you've been using generative AI to sort of augment an existing solution too. Just give us a little bit of the background about the problem you were setting out to solve.

  • David Groves

    So Curve is one of the biggest partners in the UK, and for that matter elsewhere, with a company called Genesis who do contact center systems. And they're one of the leading contact center platforms that you can get. And like all contact centers, some people want to take voice calls, some people want to take emails, some people want to take web chats. And then you've got a whole gamut of other different media types, whether it's WhatsApp, Messenger, and the like. And Well, in this particular use case, we had a customer who was using it for customer service. So it's basically an aftermarket or supporting a customer type thing. And they were trying to get into lots of different territories across Europe. And it's very expensive to go and hire 20 contact center people who speak Norwegian if you want to get into Norway. And then another 20 that speak Swedish. And then another bunch of people that speak every single language. you can imagine in Europe. And there are a lot of languages in Europe as well. So one of the things that we were looking to do was basically allow this customer, and for that matter others, to handle customer service in whatever language they needed to using a single pool of people speaking whatever languages they do speak to provide customer services initially throughout Europe, but now worldwide. So basically what we did was we... put automated translation in place right at the heart of the contact center, initially to handle web chat and WhatsApp messaging and those sorts of different media types, and now email as well.

  • Rufus Grig

    Okay. I mean, that sounds like, you know, translation has been a fairly high profile application of generative AI, I suppose, and you've been deploying it. in a way that delivers very real benefits into a contact centre, allowing a contact centre to access new markets, allowing people who might not necessarily speak the same language as the manufacturer or the service provider. It sounds absolutely on the money for this sort of technology. So how did it work? I mean, what sort of challenges did you hit along the way? Give us a sense of what happened.

  • David Groves

    So I think the interesting thing is that we're speaking English and we're all native English speakers. And that makes us very, very spoiled when it comes to... um understanding the nuances of other languages for example english doesn't have male and female words unlike german where they've got male female and neuter words and the way that you would say something is pretty much the same whether you happen to be male or female but that's definitely not the case in other languages and also you find that idioms are really hard to translate well so if i start telling you that you made a pig's ear of something You as an English speaker completely understand what I'm saying, but someone who doesn't have that exposure to English, if you translate every single word into another language, it sounds like complete gobbledygook. And I remember reading a message that someone had flagged for us that was actually in Spain saying that someone had thrown the house out of their own window, which actually means treating something as if it was ridiculously expensive, I believe. That sort of idiom just... does not translate terribly well. And it's important because you're dealing with consumers who are just using their language the way they used to, to be able to get a sense of what they actually mean to the other person that doesn't necessarily speak that language. So we found that was a real issue.

  • Rufus Grig

    I guess if I think about, you know, translation services have been around for a long time. I remember trying to tell my children that they shouldn't be using Google Translate to do their French homework when they, you know, years and years and years ago. But I guess those original translation services were a lot more literal, it's almost a word for word. So does generative AI manage those idioms, that sort of natural sense of language, better than the more traditional translation services?

  • David Groves

    Yes, they're all getting better all the time. But depending on which service you use, they're not all as good as each other. So you actually do need to go and check. to see whether the quality of the translation for the particular use case you've got is going to be what you need. And a lot of the differences are now around handling of idiom and also their ability to handle context. So one of the other examples is we've got one customer who's commonly talking about hand cream and doesn't really want the word cream translated as dairy cream because that's not really what they're talking about. And so the same word in English is used for both, but it's not the same word in other languages. And you need to make sure that the translation engine has got some context. That's actually quite hard to do. It's something that the large language models are doing quite well, far better than the previous generations.

  • Rufus Grig

    So is that something that you, I mean, you as the developer, the end user is completely oblivious to this, but you as the developer, you're taking care of this in the prompt, are you? You're saying... This is a conversation between a customer service agent and a customer and they're talking about, you know, a cosmetic hand cream product and that steers the LLM to give you a better translation. I mean, I'm sure I'm oversimplifying it, but that sort of thing.

  • David Groves

    There are several ways to do it. So you can do it in the prompt. You're providing context. The more context you provide, the more tokens you're using, and therefore the more expensive it gets. So there is a trade-off to be had between how much of this background information you want to provide every single time and how much you want to pay for it. There are other ways of doing it as well, but many first language services have got a concept of something called glossaries, which allow you to have that context about the subject that you're talking about. The thing is that if you just type something into a web chat and you simply take that and ask a translation engine to translate it, there's an awful lot of context that you're just not giving it. It's just that one sentence that you've typed without any of the history of what's happened before or what this company provides or the kind of voice that they would like to use or any of those things. And the new developments are about getting that context into the conversation to make the translation and the communication just work better.

  • Rufus Grig

    Okay, interesting. I'm also quite interested in how the different languages perform, because one of the things that we talked about in other episodes on the series is some of the inherent biases that you get with the models, because, as you said, we speak English. A lot of the internet is in English, and therefore you understand, you know, there are some minority languages which the language models would have had a lot less sort of source material to train on. Do you find that... Things like Spanish to English have a much higher percentage of accuracy than perhaps less spoken, less common languages.

  • David Groves

    Yeah, Western European languages tend to perform better than most other languages. And probably I should add Chinese and Japanese to that list as well because there's a lot of written materials that is used to train these AIs that can translate between those languages. I think also, if you look at the resources that the people providing the... translation engines use. There's a massive amount of simultaneously translated text for all of the proceedings of the EU Parliament, and that's a fantastic resource for people to go and train their language models on.

  • Rufus Grig

    And the language models don't fall asleep while trying to read it either.

  • David Groves

    Well, there is that too. I have looked at some of this stuff, and it's not riveting. But equally, you've got the stuff that is done at the UN. If you are dealing with... Well, let's take Norwegian, which actually has two different scripts. So you can write it in two different ways using two different alphabets. There will be less translation to the less commonly used script, and that will just make the quality of the translation different. And the same is true in Serbian, for example, different scripts.

  • Rufus Grig

    Just all of our Norwegian and Serbian listeners to The Learning Curve, we know that you have absolutely, your languages are just as valid as everybody else's. Just one final thing, David, on how do you work out which language somebody is speaking? Do you have to pre-populate that? Is automatic language detection reliable? Are there nuances that you stumble on along the way?

  • David Groves

    You can automatically detect it. As we all know, languages have got a common root. So English, German, Dutch, Danish, all are Germanic languages. And the problem is that if... I just simply type the word hello, which is spelled the same way in three out of those four languages, I'm not going to be able to tell which language that is. H-A-L-L-O, I think, in most of those languages, and also that's allowable in English. You need a little bit more context to be able to work out which language that actually is. I think also customers are customers, and occasionally they just swap languages halfway through their web chat. We can try and tell them not to do that. But actually,

  • Rufus Grig

    I have bilingual nieces and nephews who do that all the time. They start a sentence in one language, migrate to another one halfway through and finish up in the first one.

  • David Groves

    Yeah. So actually, you need to be able to detect what language they're actually speaking. Because if you tell an AI to translate from French to German and give it Russian, it's not going to do a very good job of it. So you actually do need a layer in front of it to go, actually, that's not French. Treat it as Russian and it will do a much, much better job of it.
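
A hedged sketch of the detection layer David describes, using the open-source langdetect package as a stand-in; the episode does not say which detector the product actually uses, and the confidence threshold and fallback language here are invented for illustration.

```python
from langdetect import detect_langs  # pip install langdetect

def pick_language(message: str, declared_lang: str = "fr", min_confidence: float = 0.80) -> str:
    """Return the language to translate from.

    Falls back to the language the customer declared (or the channel default)
    when the detector is not confident, e.g. for one-word messages like "Hallo".
    """
    try:
        candidates = detect_langs(message)  # e.g. [ru:0.99] or [de:0.57, nl:0.42]
        best = candidates[0]
        if best.prob >= min_confidence:
            return best.lang
    except Exception:
        pass  # detection can fail on empty or emoji-only messages
    return declared_lang

print(pick_language("Здравствуйте, мой заказ не пришёл"))  # most likely detected as 'ru'
print(pick_language("Hallo"))  # short and ambiguous; may fall back to the declared language
```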

  • Rufus Grig

    Brilliant. And just give us a sense of how busy this tool is. How many words are you translating every month in this service now?

  • David Groves

    I haven't counted words, actually. We're doing about five million translations a month, and each of those will be anything from a single sentence up to an entire email. It's easily hundreds of millions of characters per month.

  • Rufus Grig

    But as with Kyle and his thousands of operations a day, that's a system absolutely operating at scale and really doing a lot of work. David, thank you. That's been really interesting. If we could turn to Dawn: both David and Kyle have described the data sources they're already using. Kyle's got his log files and his alarms and his reports, and David's got every utterance at the United Nations in the last 50 years, plus the messages that the customers and the agents exchange. But I mean, data is fundamental to making AI work properly, isn't it?

  • Dawn Kelso

    Yeah, absolutely. And there are almost two slices to it. There's the slice David and Kyle have been talking about, where you've got operational systems where the data is sitting there and you're applying AI to it within that system. And then there's the broader area, where you want to consolidate data from different data sources, unify it together and see what you can do with that data in a separate data platform.

  • Rufus Grig

    So, I mean, what's the problem if you don't get your data right?

  • Dawn Kelso

    Your results, your outcomes, are only ever going to be as good as the data. So there's that rubbish in, rubbish out concept in the world of data: if you put rubbish in, you will get rubbish out. So you can have distorted predictions, you can have bias. We hear a lot about discrimination in AI, particularly from when ChatGPT and other AI tools first launched into the market: they were being trained on not necessarily good quality data, and because of that they were starting to give poor outcomes, poor results. Sometimes that was abusing the end users, which a lot of people had fun trying to provoke, and sometimes it was more serious. For example, Amazon's recruitment team had issues where they were using AI to speed up the recruitment process, but they trained it on historical CVs, and they had more men applying and more males in technology roles, which is an issue we still have. Because of that, it started to pick up bias against women and started rejecting female applicants, and female CVs where people mentioned that they were in women-in-technology groups or whatever it may be. So you have to be really careful about the quality of the data that you are using, or you can really go off on tangents.

  • Rufus Grig

    That's really interesting, because I tend to think of quality data as each individual piece of data being good quality. In that Amazon example you gave, each of those individual pieces of data was fine on its own; it was all the bits that were missing that were the problem. So is the quality problem also about the extent, the size, or the completeness of the data set?

  • Dawn Kelso

    Yeah. So there are all sorts of different chunks that make up data quality. There's the accuracy; is the data compliant; is it appropriate; is it relevant; is it up to date? Are you labeling and classifying the data to help your AI models work through it as well? So there are all sorts of nuances. And the best way to find these nuances is by trying it out and breaking it and getting poor results, and then realizing, oh, just like Amazon: they obviously weren't trying to get their recruitment AI to recruit more men, they were just trying to get it to recruit good candidates. So there are lots and lots of different things that you have to think about. And there's a real key thing around something called feature selection: is the data that we're using contextually relevant for the outputs and the outcomes that we want to achieve? That really helps with focusing the kind of data that you're using on the outcomes that you're looking for.
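
As a rough illustration of the feature selection and completeness checks Dawn mentions, here is a small pandas sketch; the CSV file and column names are hypothetical, and a real pipeline would use dedicated fairness and data-quality tooling rather than a quick group-by.

```python
import pandas as pd

# Hypothetical training data for a CV-screening model.
df = pd.read_csv("applicants.csv")  # assumed columns: gender, age, skills, years_experience, hired

# Feature selection: keep only columns that are contextually relevant to the
# outcome, and set aside protected attributes that should not drive predictions.
features = ["skills", "years_experience"]
protected = ["gender", "age"]
X = df[features]   # model inputs
y = df["hired"]    # target for a later training step

# Quick completeness, balance and bias checks before any training happens.
print(X.isna().mean())                      # share of missing values per feature
print(y.value_counts(normalize=True))       # label balance
for col in protected:
    print(df.groupby(col)["hired"].mean())  # outcome rate by protected group; large gaps are a warning sign
```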

  • Rufus Grig

    And that to me really shows up the skills and experience involved. Are we talking about a data scientist? Is it a data scientist's role to make sure that you get that feature selection right?

  • Dawn Kelso

    Yeah, very much so. It can be. I mean, these days data job titles merge and change and shift, and they are continuing to merge and change and shift. But certainly a data scientist is a really good place to start, because that's somebody who's used to really drilling into the minutiae of data and where it's at. But equally, it's about having a really clear, straightforward objective, this is what we want to do, a really clear use case, before you get started.

  • Rufus Grig

    But it is a really specialist piece of expertise. I mean, if I think back to my early days, a data expert was somebody who could write a really complex SQL query and knew what an inner join was, but they didn't have to understand the world or the universe in which they were working. They had a very specific brief: discover this, produce this report. And now we're expecting people to have a broad, macroscopic understanding of the problem they're solving and the ability to interpret how to get the data right.

  • Dawn Kelso

    Yeah, absolutely. And as with all things in data, back when I was the girl writing the SQL queries, there was still always that challenge of making sure that the data was relevant and up to date, and that remains a major issue today, whether you're talking about AI or more traditional data and analytics. You need that review-and-monitor step. In the AI world, we talk about having a human in the loop: someone who's checking and revisiting whether this is still fit for purpose. Because you can end up in this data swamp, this AI swamp, where things self-perpetuate and get worse and worse and worse, very much so in the AI space. If you aren't making sure that the data is right, and honing and refining the results, you can really go off on an interesting tangent that may not be one that you want.

  • Rufus Grig

    Yeah, yeah. I mean, that's really interesting. I guess another aspect must be the security of the data. You know, what data am I training on? And some of the pitfalls of training on data that is private or confidential. Talk about that a little, if you would.

  • Dawn Kelso

    Yeah, absolutely. So it's about making sure that the data you're using is appropriate in terms of its content. If you have PII, personally identifiable information, in there: why are you using that? Do you have a legitimate reason to be using it? And particularly when it comes to biases in terms of gender, race and so on, these are all really sensitive data topics and potentially things that should be stripped out. Often there is a negative bias you can take out, or a positive bias you can deliberately put in. But absolutely, making sure you've got the right data in there is really key. And sensitive information, when it's exposed to an AI tool, and you may not realise you've exposed it to an AI tool, is where the real danger can come. If you set an AI tool over your SharePoint estate, for example, you will probably find CVs that people submitted years and years ago for roles that you were advertising. That's just one random example. When we're looking at HR, we find spreadsheets too. So that data is held all over the place, and people don't realise how accessible it actually is. When you point a tool at it, a tool that has that access switched on can go off and get all sorts of information that it really shouldn't. So there's a wider piece around the security of your data estate before you start setting any of these tools into motion.
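
To illustrate the kind of pre-flight check Dawn is describing, here is a small, hedged sketch that scans exported documents for obvious PII patterns before they are handed to an indexing or AI tool. The regexes, file types and folder name are illustrative only; a production estate would use a proper classification or DLP service.

```python
import re
from pathlib import Path

# Very rough PII indicators; these regexes are only illustrative.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "uk_phone": re.compile(r"\b(?:\+44|0)\d{9,10}\b"),
    "ni_number": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),  # UK National Insurance style
}

def flag_pii(folder: str) -> list[tuple[str, str]]:
    """Return (file, pattern_name) pairs for files that look like they contain PII."""
    hits = []
    for path in Path(folder).rglob("*.txt"):  # extend with readers for .docx, .xlsx, etc.
        text = path.read_text(errors="ignore")
        for name, pattern in PII_PATTERNS.items():
            if pattern.search(text):
                hits.append((str(path), name))
    return hits

for file, kind in flag_pii("./sharepoint_export"):
    print(f"Review before indexing: {file} (possible {kind})")
```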

  • Rufus Grig

    That's really interesting. And I guess there's also the fact that, going back to my ancient history, it's not all rows of data in neatly organised tables. You've talked about documents, CVs on SharePoint, Excel spreadsheets, and probably videos and call recordings. It's all data, yeah?

  • Dawn Kelso

    Absolutely, absolutely. In the world we're in now with data, you've got structured, unstructured and semi-structured data, so it's anything and everything. Should you have those CCTV recordings? What does your data retention policy look like? All of that feeds into what you are using AI for and how you're potentially exposing yourself if you're not careful.

  • Rufus Grig

    And how do you manage that? I mean, are there principles around it? You talked earlier about labeling and classifying. Whose role in an organisation is it? Is it a CISO role? Is it a privacy person's role? Is it a data person's role?

  • Dawn Kelso

    I think your InfoSec people will definitely have something to say about it, and your data protection officers will have something to say about it. In Europe, we're looking at the GDPR, so of course there are all sorts of key things in there about how long you should be keeping data for and what you should be using data for. So there's a responsibility there. There's also something about, as I said earlier, making sure that when you're pre-processing your data, you're using the right data sets. Pointing the AI tools that you're using at specific chunks of data is really key to making sure you're getting the results, the outcomes, that you want. So it varies dramatically from use case to use case. And that's where you can have some really nailed-down, beautifully clean applications and some really broad, messy ones, depending on what you're doing.

  • Rufus Grig

    Okay, really interesting. And I guess even with a more pure data and AI type of model, you've got to pick the right algorithm. I mean, in the same way that David found some services do a better job of some translations than others, how do you go about understanding even what type of AI algorithm or model you should be using to achieve a particular outcome?

  • Dawn Kelso

    There are different algorithms, and there are different off-the-shelf models. With things like Azure OpenAI, for example, there are language translation, image search, all sorts of tools where you can actually pick something off the shelf to use. There are things like AI Studio, which will let you test different algorithms, look at the outputs, and see which is the right fit for you based on those outputs and the test cases you put through. Or the internet is full of decision trees that take you through the options: what are you trying to achieve, what are the most important things for you, whether it's speed, accuracy, et cetera. And you can go through those decision trees to work it out. But yes, your mileage may vary. There are a lot of options.
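
A minimal sketch of the "try a few options against your own test cases" approach Dawn describes; the candidate functions, test set and exact-match metric are placeholders rather than any particular vendor's evaluation tooling.

```python
# Hypothetical candidates: each is a function from input text to output text.
# In practice these would wrap calls to different services or models.
def candidate_a(text: str) -> str: return text.lower()   # placeholder
def candidate_b(text: str) -> str: return text.upper()   # placeholder

CANDIDATES = {"model_a": candidate_a, "model_b": candidate_b}

# A tiny, hand-checked test set of (input, expected output) pairs.
TEST_CASES = [
    ("Hand cream for dry skin", "hand cream for dry skin"),
    ("Where is my order?", "where is my order?"),
]

def exact_match_score(model) -> float:
    """Crude accuracy metric; real evaluations would use task-appropriate metrics or human review."""
    correct = sum(model(src) == expected for src, expected in TEST_CASES)
    return correct / len(TEST_CASES)

scores = {name: exact_match_score(fn) for name, fn in CANDIDATES.items()}
best = max(scores, key=scores.get)
print(scores, "-> pick", best)
```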

  • Rufus Grig

    Yeah, I'm sure. And then once you have got the thing built and deployed, I'm guessing you don't just turn your back on it and move on to the next thing. There's a role in making sure that it continues to do the right thing. Can you just explain some of those aspects?

  • Dawn Kelso

    Yeah, absolutely. So review and monitor is a real key thing. When we were talking about machine learning, which was the first kind of AI that, certainly for me in the data platform world, became something real and tangible that we were using, the real key thing with machine learning was that it's learning. It's constantly learning. And you need to go back and re-evaluate it and make sure that the way it is processing data now is still the way that you want it to process data. So in the AI world, make sure you're reviewing the outputs and monitoring them. The things that David talked about earlier in terms of how data is being translated: are we looking at that? Are we checking? Have we got a human in the loop who's going, yes, this is still doing the right thing, or, hang on a minute, this is starting to veer off and we need to give this more thought? And of course, tools are constantly evolving, so there are constantly new versions coming out. If you're using cloud-based tools, then you don't generally have an option; you get the update whether you like it or not. But there are always things that we need to review. There are improvements, there are changes. So you're constantly revisiting what you're doing with AI tooling to make sure it is still fit for purpose and still doing the right thing. And having that human in the loop is a real key part of that. You can set AI off and let it do its thing, and you can monitor it with all sorts of tools, but every so often you need a person to have a look at it and just check.
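
As a hedged illustration of the review-and-monitor loop Dawn describes, here is a small sketch that routes low-confidence outputs, plus a random sample of everything else, to a human reviewer; the thresholds, queue and record structure are all invented for the example.

```python
import random
from dataclasses import dataclass

@dataclass
class Output:
    input_text: str
    output_text: str
    confidence: float  # assumed to be reported by the model or a separate evaluator

REVIEW_QUEUE: list[Output] = []  # stand-in for a ticketing system or review dashboard

def monitor(output: Output, confidence_floor: float = 0.7, sample_rate: float = 0.05) -> None:
    """Queue risky outputs, plus a random sample of all traffic, for a person to check."""
    if output.confidence < confidence_floor or random.random() < sample_rate:
        REVIEW_QUEUE.append(output)

# Simulated traffic.
for i in range(1000):
    monitor(Output(f"msg {i}", f"translated {i}", confidence=random.random()))

print(f"{len(REVIEW_QUEUE)} of 1000 outputs queued for human review")
```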

  • Rufus Grig

    Isn't it nice that there's still a role for a person after all this time? All of you, thank you. This has been brilliant. Thank you, Kyle. Thank you, David. And thank you, Dawn. It's been absolutely fascinating hearing from people who really do this stuff and really bring this AI stuff to life. If you've been interested in what we've had to say here on The Learning Kerv, then please do get in touch and tell us what you think. You can find out more about Kerv and about AI in general by visiting kerv.com. And in fact, in our final episode of this generative AI series, we're going to do something a bit different. I'm going to be joined by Will Dorrington, who was my guest right at the beginning of all of this, and we're going to look into some of the immediate future for generative AI, particularly looking at the roles of agents and agentic AI. But we're also going to keep a bit of time back to answer some of your questions. So if you have a question on AI, generative AI, copilots, hallucinations, ethics, privacy, environmental impact, or quite literally anything, then please do get in touch. You can put a comment wherever you get your podcast and we'll pick that up, or you can email us on hello@kerv.com. So do look out for that episode. If you subscribe and tell all your friends, hopefully you'll get it notified in the podcast player of your choice. Until then, thank you to my guests, thank you for listening, and until next time, goodbye.
