The Prompt Desk

The Importance of Decomposition in Prompt Engineering

16 min | 17/04/2024

Description

As AI language models become increasingly powerful, more and more people are discovering the potential of well-crafted prompts to boost productivity and solve complex problems. Justin and Brad share their expertise on designing effective prompts that adhere to rigorous engineering principles. They highlight real-world examples of prompts that have yielded impressive results and provide valuable insights into the art and science of prompt engineering. They discuss decomposing problems into manageable steps and offer strategies to optimize this process for maximum efficiency.

Check out the latest episode of The Prompt Desk to learn more about decomposition and prompts.

Continue listening to The Prompt Desk Podcast for everything LLM & GPT, Prompt Engineering, Generative AI, and LLM Security.
Check out PromptDesk.ai for an open-source prompt management tool.
Check out Brad’s AI Consultancy at bradleyarsenault.me
Add Justin Macorin and Bradley Arsenault on LinkedIn.

Please fill out our listener survey here to help us create a better podcast: https://docs.google.com/forms/d/e/1FAIpQLSfNjWlWyg8zROYmGX745a56AtagX_7cS16jyhjV2u_ebgc-tw/viewform?usp=sf_link


Hosted by Ausha. See ausha.co/privacy-policy for more information.

Transcription

  • Speaker #0

    Dive deep into the realm of large language models, prompt engineering, and best practices. With over 25 years of combined AI and product engineering experience, here are your hosts, Bradley Arsenault and Justin Macorin.

  • Speaker #1

    Good evening, Justin.

  • Speaker #2

    Hello, Brad.

  • Speaker #1

    Justin, you know, a situation came up recently where we had to make the choice between doing everything in one prompt or breaking it apart into six prompts. And there's this trade-off where if we do it all in one prompt, we're going to lose accuracy. It's not really the best practice, but it's a little bit faster. Whereas if we break it apart into the six prompts, you get better, more predictable results. And I was just thinking, this trade-off shows up in software a lot, doesn't it? We have pressures that push us to write this kind of crappy code, but we also have best practices of software design that are similar in prompt design as well: breaking things apart, focusing each function on one thing at a time. Have you found that as well, that there's a lot of this software design in prompt design?

  • Speaker #2

    Yeah, I think it's very similar. I think that, you know, that guy you introduced me to a while back, I love his YouTube videos. His name is Dave Farley. And he talks about these best practices, right? Don't repeat yourself, decomposition, making things smaller, making things iterative. And this is a concept that's not new. It's something that's been around for the last 60 years, right? IBM started it, you know, in the 1970s or even earlier. And I think it's still very applicable in 2024 when we're building very complex AI applications. And I actually don't think there's a way around it when building very complex AI applications. We must make sure that the components we build are small, testable, reusable. And I have a hard time seeing it any other way. And to your point, you know, when presented with a problem, six prompts or one, I tend to go down the six-prompt route because we'll be able to write them better, test them better, and iterate.

  • Speaker #1

    Along a lot of metrics, the six-prompt result is better. Only along a very narrow set of kind of incoherent metrics might you possibly choose the one-prompt design, and that's the weird design; the best practice is clearly to break it apart. And while you were talking there, I was just thinking: should we be thinking about these prompts literally as if they were functions in the code, just in a different programming language, the programming language of English?
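
To make that "prompts as functions" idea concrete, here is a minimal sketch of what the six-prompt route can look like in code. The `call_llm` helper, the prompt wording, and the function names are hypothetical stand-ins rather than anything specific from the episode; the point is that each small prompt has one job and the pipeline composes them, instead of one mega-prompt doing everything.

```python
# Hypothetical sketch: decomposing one "mega prompt" into small, single-purpose
# prompt functions that are composed like ordinary code.

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; wire this to whatever client you actually use."""
    raise NotImplementedError("connect to your LLM provider here")

def summarize(article: str) -> str:
    # One job: compress the article into a short summary.
    return call_llm(f"Summarize the following article in 3 sentences:\n\n{article}")

def extract_topics(summary: str) -> str:
    # One job: pull the main topics out of the summary.
    return call_llm(f"List the 3 main topics of this summary, comma-separated:\n\n{summary}")

def suggest_hashtags(topics: str) -> str:
    # One job: turn topics into hashtags.
    return call_llm(f"Suggest 5 hashtags for these topics: {topics}")

def article_to_hashtags(article: str) -> str:
    # The "program" is just small prompt functions chained together,
    # each one easy to test, swap, or rerun on its own.
    return suggest_hashtags(extract_topics(summarize(article)))
```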

  • Speaker #2

    Absolutely. You know, when we take a look at a function in source code, whether that's Java or JavaScript or C# or Python, we rarely look at a hundred-line function and say, wow, this is a great function. I think that as engineers we gravitate toward and prefer functions that are smaller in nature, because we can read them better, we understand them better, we know what the input is, we know what the output is, and the transformation that's happening in the function, that data transformation, is a little bit more clear. And I think that prompts, because they're mostly English, natural language, as humans we're much better suited to understand smaller sentences and paragraphs than larger ones. And as a result, we make better decisions.

  • Speaker #1

    Focus is important. We can only keep so much in our brains at the same time. And I guess the model is the same in a lot of ways. It also has a fixed capacity: whether you give it a larger problem or a smaller problem, there's only the same number of neurons in the neural network.

  • Speaker #2

    So I guess from a programming best practices standpoint, I mentioned, you know, don't repeat yourself: try, where we can, to reuse these prompts. I discussed a bit about, you know, decomposition, where one...

  • Speaker #1

    Hold on, let's go into that. So DRY. Your suggestion here is, say you have a prompt that comes up with hashtags. Do you think that prompt should be reused throughout your company, or your product, in different ways, like you would have one standard hashtag prompt? Let's go into that concept a little bit here.

  • Speaker #2

    I think so. I think that as an organization, or as an individual, or as a business, you're spending money to build source code. You're spending money to build functions. You're spending money to build AI features. If you build it and it works, why wouldn't we reuse it? Right? Engineering effort has already gone into it. The nuances, the scope, all that kind of hard work has already gone into it. Why aren't we reusing this kind of stuff? So I'm a big fan of prompt reusability.

  • Speaker #1

    There could be reasons not to reuse it, and that would be another important programming best practice, which is the single stakeholder rule, or the single responsibility rule: the code should only have one interested party, or serve only one master. And the reason is, let's say you have a prompt that does hashtags. The hashtags that are needed by your company's marketing department might be very different than what's needed by your HR department for their recruiting. And so if you have one master hashtag prompt, it might not be serving the needs of these different departments very well. So that's another kind of interesting programming best practice that could bleed into prompt design.
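
A small, hypothetical sketch of how that single-stakeholder idea can coexist with reuse: share the generic core of the hashtag prompt, but give each department its own thin prompt function so a change for marketing can't silently break HR. Again, `call_llm`, the base instructions, and the department names are illustrative assumptions, not anything from the show.

```python
# Hypothetical sketch: shared core instructions, but one prompt per stakeholder.

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    raise NotImplementedError("connect to your LLM provider here")

BASE_INSTRUCTIONS = "Suggest 5 short, relevant hashtags for the text below."

def marketing_hashtags(text: str) -> str:
    # Marketing-owned prompt: tuned for campaign reach, changed on marketing's schedule.
    return call_llm(f"{BASE_INSTRUCTIONS} Favor trending, brand-friendly hashtags.\n\n{text}")

def recruiting_hashtags(text: str) -> str:
    # HR-owned prompt: tuned for job posts, evolved independently of marketing.
    return call_llm(f"{BASE_INSTRUCTIONS} Favor hashtags that job seekers actually search for.\n\n{text}")
```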

  • Speaker #2

    Yeah, for sure. If there are two different business units that need very different hashtags, there's definitely a case there where we may need different functions or different prompts.

  • Speaker #1

    Here's the real challenge: in the beginning, they might need the same prompt. So let's say the same prompt works for both departments in the beginning. Then one of the departments, unbeknownst to the other, comes to you and says, hey, can we modify that prompt to serve our needs a little bit better? And suddenly the prompt is impacting this other department that just gets swept up in it because you've reused a prompt. And then, you know, various bureaucratic processes pull it towards the marketing department and piss off the HR department.

  • Speaker #2

    I think that's where versioning and source code control come in. I don't think that these prompts should be single-version. I think that there should be multiple versions, almost like Docker containers, right? Where we have the latest, or we can also go back and check out a specific Docker container image. Very similar to that. So I guess another component over here is single responsibility, which you just discussed, and keeping things small, which I think we also discussed. Is there anything else in this paradigm?
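
A toy illustration of that Docker-style versioning idea, assuming nothing about any particular prompt-management tool's API: callers can either follow "latest" or stay pinned to a version they have validated, which is one way to let marketing evolve a prompt without dragging HR along.

```python
# Toy prompt registry with versions, loosely in the spirit of Docker tags.
# This is a sketch of the idea, not any specific tool's API.

PROMPT_REGISTRY = {
    "hashtags": {
        "1.0": "Suggest 5 hashtags for the text below.\n\n{text}",
        "1.1": "Suggest 5 short, lowercase hashtags for the text below.\n\n{text}",
    }
}

def get_prompt(name: str, version: str = "latest") -> str:
    versions = PROMPT_REGISTRY[name]
    if version == "latest":
        # "latest" resolves to the highest version, like an untagged pull.
        version = max(versions)
    return versions[version]

# Marketing can ride "latest"; HR can stay pinned to the version it validated.
marketing_template = get_prompt("hashtags")            # resolves to "1.1"
recruiting_template = get_prompt("hashtags", "1.0")    # pinned
```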

  • Speaker #1

    I think we can dive a little further into decomposition. That's another element of this practice: decomposing things makes it a lot easier for the AI to execute and produce the result. So all of these different programming best practices seem to apply exactly the same in prompt engineering.

  • Speaker #2

    They really do. And I think that there are benefits here, right? We're not just doing this because it's fun. We're not decoupling these prompts or these functions because it's fun. I think there's tangible, real benefit to it, and I think it's been proven throughout history. So let's just go down this list of benefits. Are smaller functions easier to test than larger ones? Let's start off there.

  • Speaker #1

    Smaller prompts, smaller functions, same thing. Especially if the prompt output is short, because then I can just eyeball a bunch of outputs: good, good, good, good, oh, okay, that one was bad, okay, good, good, good. So it's nice when the outputs are really small.

  • Speaker #2

    And what about building the prompt? Are smaller prompts easier to build than really big ones?

  • Speaker #1

    Yeah, yeah, yeah. You get straight to the point, and it usually works. You don't have to do anything messy, you know, where you have to copy an instruction three times: keep it short, no, keep it short, keep it shorter, and eventually it listens. When you have smaller prompts, it just seems to work a lot better.

  • Speaker #2

    Okay. And so now we have prompts that are really easy to test. We have prompts that are really easy to build. What about monitoring? You mentioned monitoring small outputs, eyeballing the results. I assume that using maybe even a third-party library or a separate tool to do monitoring would be a lot easier with smaller outputs, smaller prompts.

  • Speaker #1

    Yes, I think so. It's going to be a lot easier to feed that data around to other systems, feed it into secondary models, if you want to classify the output to check for error conditions or weird outputs. Absolutely. Whereas if you have one big prompt, there's going to be a bottleneck there: you have to produce all this output, then you have to feed all of that into a secondary model. It doesn't streamline very well.
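
A hedged sketch of that monitoring step: because each small prompt's output is short, it is cheap to run it through a secondary check before it flows downstream. Here the check is a trivial rule, but the same hook could call a small classification model; `call_llm` and the function names are placeholders, not anything discussed in the episode.

```python
# Sketch: route each small prompt's output through a secondary check
# before passing it to the next stage.

def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    raise NotImplementedError("connect to your LLM provider here")

def generate_hashtags(text: str) -> str:
    return call_llm(f"Suggest 5 hashtags for:\n\n{text}")

def looks_valid(output: str) -> bool:
    # Secondary check: a simple rule here, but it could just as easily be a
    # small classifier judging "valid / weird / error" on the short output.
    tags = [token for token in output.split() if token.startswith("#")]
    return 3 <= len(tags) <= 8

def hashtags_with_monitoring(text: str) -> str:
    output = generate_hashtags(text)
    if not looks_valid(output):
        # Flag or retry rather than silently passing a weird output downstream.
        raise ValueError(f"suspicious hashtag output: {output!r}")
    return output
```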

  • Speaker #2

    One of the things I like to do a lot of is iterate, and I think everybody does a lot of iteration when they build their prompts. They write the first initial prompt, they see it work, then they'll go back and rerun it and rerun it and rerun it until they see it really work well. And that troubleshooting, for me personally, seems to work a lot better when I'm working with smaller prompts, because there are fewer words to tweak. You mentioned, you know, make it shorter, make it shorter, no, make it shorter. It's a lot easier to just say it once with a small prompt.

  • Speaker #1

    Also, it just takes a lot less time, because if you have a big prompt but you're only trying to modify one section of it, you still have to rerun that whole giant output, and these things are slow. That's the most annoying part. It's killing me right now how slow these things are, even with the turbo models.

  • Speaker #2

    These large language models are not fast. They are slow. They're magical, but they're also pretty slow. They're damn slow. One of the techniques over here to increase speed is fine-tuning. We could fine-tune our models, we could fine-tune classification models, we could fine-tune open-source models. And fine-tuning is also a lot easier to do when dealing with a very narrow subset, a very narrow scope of data.

  • Speaker #1

    Is it? Okay, so let's dive into that. What makes fine-tuning easier with the smaller data?

  • Speaker #2

    So if we take a look at binary classification, for example, or even multi-class classification, or let's say we take a look at named entity recognition, right? All these use cases are very focused on one very narrow scope of work, and that is to classify something for use case A or extract something for use case B. We're not trying to boil the ocean over here. We're doing one thing really, really well. And when we build these prompts in a really small way, we're able to log the input and output, monitor the input and output, measure it, and ultimately collect enough data over time to be able to fine-tune these models so that they not only potentially operate a lot faster, but also probably do a better job from an accuracy perspective.
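
A rough sketch of the data-collection side of that argument: if every narrow prompt logs its input/output pairs, a fine-tuning dataset accumulates as a side effect of normal operation. The record format and file path below are generic assumptions, not any particular provider's training schema.

```python
# Sketch: log each narrow prompt's input/output as JSONL so a fine-tuning
# dataset builds up over time. Generic format, not a vendor-specific schema.

import json
from datetime import datetime, timezone

LOG_PATH = "hashtag_prompt_log.jsonl"

def log_example(prompt_name: str, model_input: str, model_output: str) -> None:
    record = {
        "prompt": prompt_name,
        "input": model_input,
        "output": model_output,
        "ts": datetime.now(timezone.utc).isoformat(),
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Later, the accumulated file can be filtered (e.g. keep only outputs that
# passed monitoring) and used to train a smaller, faster specialized model.
```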

  • Speaker #1

    Yep. Yep. Absolutely. Everything's aligned with breaking the prompts apart.

  • Speaker #2

    So I guess, to conclude, all of that ultimately leads to higher quality.

  • Speaker #1

    Let's touch really quickly on why. Why aren't people following these principles? In what scenarios are people trying to make these super mega prompts, or not breaking things apart in terms of requirements? What leads them to not follow these best practices?

  • Speaker #2

    First of all, I love the term mega prompt. I think that sometimes, you know, as engineers, it's easy to be led in a certain direction, especially with the hype going on in the market right now and the different players, the companies that are coming out with new technologies and new tools every week that are kind of promising the world. And I think it's very easy to get into this space where we truly believe that everything's magic and we try to make magic happen with one of these mega prompts. And I don't think that this is very magical. I think that this is just pure software engineering. And as a result, I think that we shouldn't be swayed too much by other people's marketing efforts. I think that we should stay true to programming best practices, and that is to make sure that the things that we build are modular and functional in nature. And I don't know if you feel the same way. I'd actually love to hear your take on it.

  • Speaker #1

    I think it's actually interesting how much of our software best practices seem to also apply to prompt engineering. And I'm wondering, and we're kind of out of time here, but we should do another episode and discuss some other software concepts, like inheritance. Does inheritance apply to prompts? Give me your quick take before we close out here.

  • Speaker #2

    I think it does. I think that, you know, we've already discussed schemas and how prompts, or how functions, should be able to consume data. You're putting me on the spot over here with these inheritance questions, but I definitely think that many programming best practices also play a role in prompt engineering.

  • Speaker #1

    Let's end it there. All right, man.

  • Speaker #2

    Brad, always a pleasure to chat with you. Take care. Until next time.

  • Speaker #1

    Great chatting with you, Justin. Have a good evening. Bye-bye.

  • Speaker #0

    Thank you for joining us. If you've enjoyed today's episode, hit subscribe and stay updated on our latest content. We appreciate your support.

Description

As AI language models become increasingly powerful, more and more people are discovering the potential of well-crafted prompts to boost productivity and solve complex problems. Justin and Brad share their expertise on designing effective prompts that adhere to rigorous engineering principles. They highlight real-world examples of prompts that have yielded impressive results and provide valuable insights into the art and science of prompt engineering. We discuss decomposing problems into manageable steps, offering strategies to optimize this process for maximum efficiency.

Check out the latest episode of The Prompt Desk to learn more about decomposition and prompts.

Continue listening to The Prompt Desk Podcast for everything LLM & GPT, Prompt Engineering, Generative AI, and LLM Security.
Check out PromptDesk.ai for an open-source prompt management tool.
Check out Brad’s AI Consultancy at bradleyarsenault.me
Add Justin Macorin and Bradley Arsenault on LinkedIn.

Please fill out our listener survey here to help us create a better podcast: https://docs.google.com/forms/d/e/1FAIpQLSfNjWlWyg8zROYmGX745a56AtagX_7cS16jyhjV2u_ebgc-tw/viewform?usp=sf_link


Hosted by Ausha. See ausha.co/privacy-policy for more information.

Transcription

  • Speaker #0

    Dive deep into the realm of large language models, prompt engineering, and best practices. With over 25 years of combined AI and product engineering experience, here are your hosts, Bradley Arsenault and Justin Macaron.

  • Speaker #1

    Good evening, Justin.

  • Speaker #2

    Hello, Brad.

  • Speaker #1

    Justin, you know, a situation came up recently where... We had to make the choice between do we try to do everything all in one prompt or do we do break it apart into six prompts? And, you know, there's this like trade off where if we do it all in one prompt, we're going to lose accuracy. It's kind of like not the best practice, really, but it's a little bit faster. Whereas if we break it apart and do the six prompts, you get better results, more predictable results. And I was just thinking, this trade-off shows up in software a lot, doesn't it? Where we kind of have pressures that push us to write this kind of crappy code. But whereas we have best practices of software design. that are kind of similar in the prompt design as well, breaking things apart, kind of focusing on each function on one thing at a time. Like, have you found that as well, that there's a lot of this software design in prompt design as well?

  • Speaker #2

    Yeah, I think it's very similar. I think that, you know, that guy that introduced me to a while back, I love his YouTube videos. His name is Dave Farley. And he talks about these best practices, right? Do not repeat yourself, decomposition, making things smaller, you know, making things iterative. And this is a concept that's not new. It's something that's been around for the last 60 years, right? IBM started it, you know, in the 1970s or even earlier. And I think it's still very applicable in 2024 when we're building very complex AI applications. And I actually don't think there's a way. to build very complex AI applications around it. We must make sure that the components we build are small, testable, reusable. And I have a hard time seeing it any other way. And to your point, you know, when presenting with a problem, six prompts or one. I tend, you know, to go down the six prompt route because we'll be able to write them better, test them better, you know, and iterate.

  • Speaker #1

    Like along a lot of metrics, the six prompt result is better. And only along a very narrow set of kind of like incoherent metrics might you possibly choose the one prompt design. And it's like, that's the weird design. you know, the best practice is clearly to break it apart. And I would actually, while you were talking there, I was just thinking that should we be thinking about these prompts literally as, as if they were functions in the code, but just with a different programming language, it's like the programming language of like English.

  • Speaker #2

    Absolutely. I think I, I, you know, when we take a look at a function in source code, whether that's, you know, java or javascript or or c-sharp or python we rarely look at a hundred line function and say wow you know this is a great function we i i think that as engineers we kind of gravitate and prefer functions that are smaller in nature because we can read them better we understand them better we know what the input is we know what the output is the transformation that's happening in the function that that That data transformation is a little bit more clear. And I think that prompt, because they're, you know, mostly English, natural language, as humans, we're much better suited to understand smaller, you know, sentences, paragraphs than larger ones. And as a result, we make better decisions.

  • Speaker #1

    It focuses important. We can only keep so much in our brain at the same time. And I guess the model is the same in a lot of ways. It also has a fixed capacity that whether you give it a larger problem or a smaller problem, there's only the same number of neurons in the neural network.

  • Speaker #2

    So I guess from a programming best practice, I mentioned, you know, do not repeat yourself. Try. where we can kind of like reuse these props. I discussed a bit about, you know, decomposition where one...

  • Speaker #1

    Hold on, let's go into that. So dry. So you're... A suggestion here is like you have a prompt that maybe comes up with hashtags. Do you think that prompt should be reused throughout your company like does he do or your product and in different ways like you would have one standard hashtag prompt like let's go into that concept a little bit here.

  • Speaker #2

    I think so. I think that as an organization or as an individual or as a business, you're spending money to build source code. You're spending money to build functions. You're spending money to build AI features. If you build it and it works, why wouldn't we reuse it? right engineering effort has already gone into it the the the nuances the scope all that kind of of hard work has already gone into it why aren't we reusing this kind of stuff so i'm i'm a big fan of prompt reusability i yeah

  • Speaker #1

    there could be reasons not to reuse it and like that would be like another important programming best practice um you which is like the single stakeholder rule or the single responsibility rule, which is that the code should only have one interested party or like have serve only one master. And like the reason is, is that like, let's say you have a prompt that does hashtags, you know, the hashtags that are needed by your company's marketing department might be very different than what's needed by your HR department for their recruiting. And so if you have one hashtag master prompt, it might not be serving the needs of these different departments very well. So that's another kind of interesting programming best practice that could bleed into prompt design.

  • Speaker #2

    Yeah, no, for sure. If there are two different business units that need very different hashtags. there's definitely a case there where, where we may need different functions or different prompts.

  • Speaker #1

    Here's the real challenge is that in the beginning, they might need the same prompt. And then, so like, let's say the same prompt works for both departments in the beginning. Then one of the departments unbeknownst to the other comes to you and is like, Hey, can we modify that prompt and like to serve our needs a little bit better? And suddenly The prompt is impacting this other department that just gets swept in the ring because you've reused a prompt. And then, you know, various bureaucratic processes pull it towards the marketing department and pisses off the HR department.

  • Speaker #2

    I think that maybe, you know, versioning, that's where versioning and source code control come in. I don't think, you know, that these proms should be, you know, single version. I think that there should be multiple versions. Almost like Docker containers, right? Where we have the latest or we could also go back and like check out the, you know, specific Docker container images. Very similar to that. So I guess another component over here is. Single responsibility, which you just discussed, keeping things small that I think we all discussed. Is there anything else in this program?

  • Speaker #1

    I think we can dive a little further into decomposition. That's another element of that project, which is decomposing things makes it a lot easier for the AI to execute and produce the result. And so it's like all of these different programming best practices seem to apply exactly the same in prompt engineering.

  • Speaker #2

    they really do. And I think that there are benefits here, right? So we're not just doing this because it's fun. We're not just, you know, decoupling these prompts or these functions because it's fun. I think that there's tangible, real benefit to it. And I think it's been proven throughout history. And let's just go down this list of, you know, benefits. Are smaller functions easier to test than larger ones? Let's start off there.

  • Speaker #1

    smaller prompts smaller prompts smaller functions same thing especially if the prompt output is short because then i can just like eyeball a bunch of outputs i'm like good good good good good good oh okay that one was bad okay good good good so like it's nice when the the outputs are really small on

  • Speaker #2

    the prompt or are they easier to build are are smaller prompts easier to build than really big

  • Speaker #1

    yeah yeah yeah it's you get straight to the point it usually works like you don't have to do any messy you know when you have to like copy an instruction like three times it's like keep it short no keep it short keep it shorter you know eventually it listens like you when you have smaller prompts it just seems to work a lot better Okay.

  • Speaker #2

    And so now we have prompts that are really easy to test. We have prompts that are really easy to build. What about monitoring? I feel that the whole monitoring process, right? You mentioned monitoring small outputs, eyeballing the results. I assume that using maybe even a third-party library or a separate tool to do monitoring would be a lot easier to do with smaller outputs, smaller prompts.

  • Speaker #1

    Yes, I think so, that it's going to be a lot easier to feed that data around to other systems, feed that into secondary models, if you want to classify the output to, you know, check for error conditions or weird outputs. Absolutely. Whereas like, if you have one big prompt, there's, there's like going to be a bottleneck there, you have to produce all this output, then you have to feed all of that into a secondary model. You know, it doesn't, it doesn't streamline very well.

  • Speaker #2

    One of the things I like to do a lot of is iterate. So, and I think everybody does a lot of iteration when they build their prompts. They write, you know, the first initial prompt, they see it work, then they'll go back in time and they'll kind of like rerun it and rerun it and rerun it until they see it, you know, really work well. And that troubleshooting. for me personally, seems to work a lot better when I'm working with smaller prompts because there are fewer words to tweak. You mentioned, you know, make it shorter, make it shorter, no, make it shorter. It's a lot easier to just say it once with a small prompt. Also,

  • Speaker #1

    it just like takes like a lot less time because like if you have a big prompt, but you're just trying to modify one section of it, but you have to rerun that whole giant output. these things are slow. That's like the most annoying part. It's killing me right now how slow these things are, even with the turbo models. Yeah.

  • Speaker #2

    These large language models are not fast. They are slow. They're magical, but they're also pretty slow. They're damn slow. One of the techniques over here to increase speed is fine-tuning. We could fine-tune AR models. We could fine-tune classification models. We could fine-tune, you know, open-source models. And fine-tuning is also... a lot easier to do when dealing with a very narrow subset, with a very narrow scope of data. Is it?

  • Speaker #1

    Is it? Okay, so let's dive into that. What makes fine-tuning easier with the smaller data?

  • Speaker #2

    So if we take a look at binary classification, for example, or even multi-class classification, or let's say we take a look at named entity recognition, right? All these use cases are very focused. in one very narrow scope of work. And that is to classify something for use case A or extract something for use case B. We're not trying to boil the ocean over here. We're doing one thing really, really well. And when we build these prompts in a really small way, we're able to log the input and output. We're able to monitor the input and output, measure it, and ultimately collect enough data over time to be able to fine tune these models that one, not only potentially operates a lot faster, but also probably does a better job from an accuracy perspective.

  • Speaker #1

    Yep. Yep. Absolutely. Everything's aligned with the breaking the prompts apart.

  • Speaker #2

    So I guess to conclude, that ultimately leads to a higher quality.

  • Speaker #1

    Let's touch on really quickly why. why aren't people doing the following these principles? Like what, what in, in what scenarios are people just trying to make these like super mega prompts or they're not, they're not breaking things apart in terms of requirements. Like what, what leads them to not follow these best practices? First of all,

  • Speaker #2

    I love the term mega prompt. I, I, I think that sometimes, you know, as engineers, it's easy to be led in a certain direction, especially with the hype going on in the market right now and different, you know, players, companies that are coming out with new technologies, new tools every week that are kind of promising the world. And I think it's very easy to kind of... Get in this space where we truly believe that everything's magic and we try to make magic happen with one of these mega props. And I don't think that this is very magical. I think that this is just pure software engineering. And as a result, I think that we shouldn't be. swayed too, too much with other people's marketing efforts. And I think that we should stay true to true programming best practices. And that is truly to make sure that the things that we build are modular and functional in nature. And I don't know if you feel the same way. I'd actually love to hear your take on it.

  • Speaker #1

    I think that actually it's interesting how much of our software best practices seem to also apply to prompt engineering. And I'm wondering, and we're kind of out of time here, but we should go into another episode and we should discuss some other software concepts like inheritance. Does inheritance apply to prompts? Give me your quick take before we close out here.

  • Speaker #2

    I think it does. I think that, you know, we've already discussed schemas and, you know, how prompts should or how functions should be able to consume data. I think that, you know, you're getting me on the spot over here with all inheritance questions, but I definitely think that many programming best practices. also play a role in prompt engineer. All right.

  • Speaker #1

    Let's end it there. All right, man.

  • Speaker #2

    Brad, always a pleasure to chat with you. Take care. Until next time.

  • Speaker #1

    Great chatting with you, Justin. Have a good evening. Bye-bye.

  • Speaker #0

    Thank you for joining us. If you've enjoyed today's episode, hit subscribe and stay updated on our latest content. We appreciate your support.

Share

Embed

You may also like

Description

As AI language models become increasingly powerful, more and more people are discovering the potential of well-crafted prompts to boost productivity and solve complex problems. Justin and Brad share their expertise on designing effective prompts that adhere to rigorous engineering principles. They highlight real-world examples of prompts that have yielded impressive results and provide valuable insights into the art and science of prompt engineering. We discuss decomposing problems into manageable steps, offering strategies to optimize this process for maximum efficiency.

Check out the latest episode of The Prompt Desk to learn more about decomposition and prompts.

Continue listening to The Prompt Desk Podcast for everything LLM & GPT, Prompt Engineering, Generative AI, and LLM Security.
Check out PromptDesk.ai for an open-source prompt management tool.
Check out Brad’s AI Consultancy at bradleyarsenault.me
Add Justin Macorin and Bradley Arsenault on LinkedIn.

Please fill out our listener survey here to help us create a better podcast: https://docs.google.com/forms/d/e/1FAIpQLSfNjWlWyg8zROYmGX745a56AtagX_7cS16jyhjV2u_ebgc-tw/viewform?usp=sf_link


Hosted by Ausha. See ausha.co/privacy-policy for more information.

Transcription

  • Speaker #0

    Dive deep into the realm of large language models, prompt engineering, and best practices. With over 25 years of combined AI and product engineering experience, here are your hosts, Bradley Arsenault and Justin Macaron.

  • Speaker #1

    Good evening, Justin.

  • Speaker #2

    Hello, Brad.

  • Speaker #1

    Justin, you know, a situation came up recently where... We had to make the choice between do we try to do everything all in one prompt or do we do break it apart into six prompts? And, you know, there's this like trade off where if we do it all in one prompt, we're going to lose accuracy. It's kind of like not the best practice, really, but it's a little bit faster. Whereas if we break it apart and do the six prompts, you get better results, more predictable results. And I was just thinking, this trade-off shows up in software a lot, doesn't it? Where we kind of have pressures that push us to write this kind of crappy code. But whereas we have best practices of software design. that are kind of similar in the prompt design as well, breaking things apart, kind of focusing on each function on one thing at a time. Like, have you found that as well, that there's a lot of this software design in prompt design as well?

  • Speaker #2

    Yeah, I think it's very similar. I think that, you know, that guy that introduced me to a while back, I love his YouTube videos. His name is Dave Farley. And he talks about these best practices, right? Do not repeat yourself, decomposition, making things smaller, you know, making things iterative. And this is a concept that's not new. It's something that's been around for the last 60 years, right? IBM started it, you know, in the 1970s or even earlier. And I think it's still very applicable in 2024 when we're building very complex AI applications. And I actually don't think there's a way. to build very complex AI applications around it. We must make sure that the components we build are small, testable, reusable. And I have a hard time seeing it any other way. And to your point, you know, when presenting with a problem, six prompts or one. I tend, you know, to go down the six prompt route because we'll be able to write them better, test them better, you know, and iterate.

  • Speaker #1

    Like along a lot of metrics, the six prompt result is better. And only along a very narrow set of kind of like incoherent metrics might you possibly choose the one prompt design. And it's like, that's the weird design. you know, the best practice is clearly to break it apart. And I would actually, while you were talking there, I was just thinking that should we be thinking about these prompts literally as, as if they were functions in the code, but just with a different programming language, it's like the programming language of like English.

  • Speaker #2

    Absolutely. I think I, I, you know, when we take a look at a function in source code, whether that's, you know, java or javascript or or c-sharp or python we rarely look at a hundred line function and say wow you know this is a great function we i i think that as engineers we kind of gravitate and prefer functions that are smaller in nature because we can read them better we understand them better we know what the input is we know what the output is the transformation that's happening in the function that that That data transformation is a little bit more clear. And I think that prompt, because they're, you know, mostly English, natural language, as humans, we're much better suited to understand smaller, you know, sentences, paragraphs than larger ones. And as a result, we make better decisions.

  • Speaker #1

    It focuses important. We can only keep so much in our brain at the same time. And I guess the model is the same in a lot of ways. It also has a fixed capacity that whether you give it a larger problem or a smaller problem, there's only the same number of neurons in the neural network.

  • Speaker #2

    So I guess from a programming best practice, I mentioned, you know, do not repeat yourself. Try. where we can kind of like reuse these props. I discussed a bit about, you know, decomposition where one...

  • Speaker #1

    Hold on, let's go into that. So dry. So you're... A suggestion here is like you have a prompt that maybe comes up with hashtags. Do you think that prompt should be reused throughout your company like does he do or your product and in different ways like you would have one standard hashtag prompt like let's go into that concept a little bit here.

  • Speaker #2

    I think so. I think that as an organization or as an individual or as a business, you're spending money to build source code. You're spending money to build functions. You're spending money to build AI features. If you build it and it works, why wouldn't we reuse it? right engineering effort has already gone into it the the the nuances the scope all that kind of of hard work has already gone into it why aren't we reusing this kind of stuff so i'm i'm a big fan of prompt reusability i yeah

  • Speaker #1

    there could be reasons not to reuse it and like that would be like another important programming best practice um you which is like the single stakeholder rule or the single responsibility rule, which is that the code should only have one interested party or like have serve only one master. And like the reason is, is that like, let's say you have a prompt that does hashtags, you know, the hashtags that are needed by your company's marketing department might be very different than what's needed by your HR department for their recruiting. And so if you have one hashtag master prompt, it might not be serving the needs of these different departments very well. So that's another kind of interesting programming best practice that could bleed into prompt design.

  • Speaker #2

    Yeah, no, for sure. If there are two different business units that need very different hashtags. there's definitely a case there where, where we may need different functions or different prompts.

  • Speaker #1

    Here's the real challenge is that in the beginning, they might need the same prompt. And then, so like, let's say the same prompt works for both departments in the beginning. Then one of the departments unbeknownst to the other comes to you and is like, Hey, can we modify that prompt and like to serve our needs a little bit better? And suddenly The prompt is impacting this other department that just gets swept in the ring because you've reused a prompt. And then, you know, various bureaucratic processes pull it towards the marketing department and pisses off the HR department.

  • Speaker #2

    I think that maybe, you know, versioning, that's where versioning and source code control come in. I don't think, you know, that these proms should be, you know, single version. I think that there should be multiple versions. Almost like Docker containers, right? Where we have the latest or we could also go back and like check out the, you know, specific Docker container images. Very similar to that. So I guess another component over here is. Single responsibility, which you just discussed, keeping things small that I think we all discussed. Is there anything else in this program?

  • Speaker #1

    I think we can dive a little further into decomposition. That's another element of that project, which is decomposing things makes it a lot easier for the AI to execute and produce the result. And so it's like all of these different programming best practices seem to apply exactly the same in prompt engineering.

  • Speaker #2

    they really do. And I think that there are benefits here, right? So we're not just doing this because it's fun. We're not just, you know, decoupling these prompts or these functions because it's fun. I think that there's tangible, real benefit to it. And I think it's been proven throughout history. And let's just go down this list of, you know, benefits. Are smaller functions easier to test than larger ones? Let's start off there.

  • Speaker #1

    smaller prompts smaller prompts smaller functions same thing especially if the prompt output is short because then i can just like eyeball a bunch of outputs i'm like good good good good good good oh okay that one was bad okay good good good so like it's nice when the the outputs are really small on

  • Speaker #2

    the prompt or are they easier to build are are smaller prompts easier to build than really big

  • Speaker #1

    yeah yeah yeah it's you get straight to the point it usually works like you don't have to do any messy you know when you have to like copy an instruction like three times it's like keep it short no keep it short keep it shorter you know eventually it listens like you when you have smaller prompts it just seems to work a lot better Okay.

  • Speaker #2

    And so now we have prompts that are really easy to test. We have prompts that are really easy to build. What about monitoring? I feel that the whole monitoring process, right? You mentioned monitoring small outputs, eyeballing the results. I assume that using maybe even a third-party library or a separate tool to do monitoring would be a lot easier to do with smaller outputs, smaller prompts.

  • Speaker #1

    Yes, I think so, that it's going to be a lot easier to feed that data around to other systems, feed that into secondary models, if you want to classify the output to, you know, check for error conditions or weird outputs. Absolutely. Whereas like, if you have one big prompt, there's, there's like going to be a bottleneck there, you have to produce all this output, then you have to feed all of that into a secondary model. You know, it doesn't, it doesn't streamline very well.

  • Speaker #2

    One of the things I like to do a lot of is iterate. So, and I think everybody does a lot of iteration when they build their prompts. They write, you know, the first initial prompt, they see it work, then they'll go back in time and they'll kind of like rerun it and rerun it and rerun it until they see it, you know, really work well. And that troubleshooting. for me personally, seems to work a lot better when I'm working with smaller prompts because there are fewer words to tweak. You mentioned, you know, make it shorter, make it shorter, no, make it shorter. It's a lot easier to just say it once with a small prompt. Also,

  • Speaker #1

    it just like takes like a lot less time because like if you have a big prompt, but you're just trying to modify one section of it, but you have to rerun that whole giant output. these things are slow. That's like the most annoying part. It's killing me right now how slow these things are, even with the turbo models. Yeah.

  • Speaker #2

    These large language models are not fast. They are slow. They're magical, but they're also pretty slow. They're damn slow. One of the techniques over here to increase speed is fine-tuning. We could fine-tune AR models. We could fine-tune classification models. We could fine-tune, you know, open-source models. And fine-tuning is also... a lot easier to do when dealing with a very narrow subset, with a very narrow scope of data. Is it?

  • Speaker #1

    Is it? Okay, so let's dive into that. What makes fine-tuning easier with the smaller data?

  • Speaker #2

    So if we take a look at binary classification, for example, or even multi-class classification, or let's say we take a look at named entity recognition, right? All these use cases are very focused. in one very narrow scope of work. And that is to classify something for use case A or extract something for use case B. We're not trying to boil the ocean over here. We're doing one thing really, really well. And when we build these prompts in a really small way, we're able to log the input and output. We're able to monitor the input and output, measure it, and ultimately collect enough data over time to be able to fine tune these models that one, not only potentially operates a lot faster, but also probably does a better job from an accuracy perspective.

  • Speaker #1

    Yep. Yep. Absolutely. Everything's aligned with the breaking the prompts apart.

  • Speaker #2

    So I guess to conclude, that ultimately leads to a higher quality.

  • Speaker #1

    Let's touch on really quickly why. why aren't people doing the following these principles? Like what, what in, in what scenarios are people just trying to make these like super mega prompts or they're not, they're not breaking things apart in terms of requirements. Like what, what leads them to not follow these best practices? First of all,

  • Speaker #2

    I love the term mega prompt. I, I, I think that sometimes, you know, as engineers, it's easy to be led in a certain direction, especially with the hype going on in the market right now and different, you know, players, companies that are coming out with new technologies, new tools every week that are kind of promising the world. And I think it's very easy to kind of... Get in this space where we truly believe that everything's magic and we try to make magic happen with one of these mega props. And I don't think that this is very magical. I think that this is just pure software engineering. And as a result, I think that we shouldn't be. swayed too, too much with other people's marketing efforts. And I think that we should stay true to true programming best practices. And that is truly to make sure that the things that we build are modular and functional in nature. And I don't know if you feel the same way. I'd actually love to hear your take on it.

  • Speaker #1

    I think that actually it's interesting how much of our software best practices seem to also apply to prompt engineering. And I'm wondering, and we're kind of out of time here, but we should go into another episode and we should discuss some other software concepts like inheritance. Does inheritance apply to prompts? Give me your quick take before we close out here.

  • Speaker #2

    I think it does. I think that, you know, we've already discussed schemas and, you know, how prompts should or how functions should be able to consume data. I think that, you know, you're getting me on the spot over here with all inheritance questions, but I definitely think that many programming best practices. also play a role in prompt engineer. All right.

  • Speaker #1

    Let's end it there. All right, man.

  • Speaker #2

    Brad, always a pleasure to chat with you. Take care. Until next time.

  • Speaker #1

    Great chatting with you, Justin. Have a good evening. Bye-bye.

  • Speaker #0

    Thank you for joining us. If you've enjoyed today's episode, hit subscribe and stay updated on our latest content. We appreciate your support.

Description

As AI language models become increasingly powerful, more and more people are discovering the potential of well-crafted prompts to boost productivity and solve complex problems. Justin and Brad share their expertise on designing effective prompts that adhere to rigorous engineering principles. They highlight real-world examples of prompts that have yielded impressive results and provide valuable insights into the art and science of prompt engineering. We discuss decomposing problems into manageable steps, offering strategies to optimize this process for maximum efficiency.

Check out the latest episode of The Prompt Desk to learn more about decomposition and prompts.

Continue listening to The Prompt Desk Podcast for everything LLM & GPT, Prompt Engineering, Generative AI, and LLM Security.
Check out PromptDesk.ai for an open-source prompt management tool.
Check out Brad’s AI Consultancy at bradleyarsenault.me
Add Justin Macorin and Bradley Arsenault on LinkedIn.

Please fill out our listener survey here to help us create a better podcast: https://docs.google.com/forms/d/e/1FAIpQLSfNjWlWyg8zROYmGX745a56AtagX_7cS16jyhjV2u_ebgc-tw/viewform?usp=sf_link


Hosted by Ausha. See ausha.co/privacy-policy for more information.

Transcription

  • Speaker #0

    Dive deep into the realm of large language models, prompt engineering, and best practices. With over 25 years of combined AI and product engineering experience, here are your hosts, Bradley Arsenault and Justin Macaron.

  • Speaker #1

    Good evening, Justin.

  • Speaker #2

    Hello, Brad.

  • Speaker #1

    Justin, you know, a situation came up recently where... We had to make the choice between do we try to do everything all in one prompt or do we do break it apart into six prompts? And, you know, there's this like trade off where if we do it all in one prompt, we're going to lose accuracy. It's kind of like not the best practice, really, but it's a little bit faster. Whereas if we break it apart and do the six prompts, you get better results, more predictable results. And I was just thinking, this trade-off shows up in software a lot, doesn't it? Where we kind of have pressures that push us to write this kind of crappy code. But whereas we have best practices of software design. that are kind of similar in the prompt design as well, breaking things apart, kind of focusing on each function on one thing at a time. Like, have you found that as well, that there's a lot of this software design in prompt design as well?

  • Speaker #2

    Yeah, I think it's very similar. I think that, you know, that guy that introduced me to a while back, I love his YouTube videos. His name is Dave Farley. And he talks about these best practices, right? Do not repeat yourself, decomposition, making things smaller, you know, making things iterative. And this is a concept that's not new. It's something that's been around for the last 60 years, right? IBM started it, you know, in the 1970s or even earlier. And I think it's still very applicable in 2024 when we're building very complex AI applications. And I actually don't think there's a way. to build very complex AI applications around it. We must make sure that the components we build are small, testable, reusable. And I have a hard time seeing it any other way. And to your point, you know, when presenting with a problem, six prompts or one. I tend, you know, to go down the six prompt route because we'll be able to write them better, test them better, you know, and iterate.

  • Speaker #1

    Like along a lot of metrics, the six prompt result is better. And only along a very narrow set of kind of like incoherent metrics might you possibly choose the one prompt design. And it's like, that's the weird design. you know, the best practice is clearly to break it apart. And I would actually, while you were talking there, I was just thinking that should we be thinking about these prompts literally as, as if they were functions in the code, but just with a different programming language, it's like the programming language of like English.

  • Speaker #2

    Absolutely. I think I, I, you know, when we take a look at a function in source code, whether that's, you know, java or javascript or or c-sharp or python we rarely look at a hundred line function and say wow you know this is a great function we i i think that as engineers we kind of gravitate and prefer functions that are smaller in nature because we can read them better we understand them better we know what the input is we know what the output is the transformation that's happening in the function that that That data transformation is a little bit more clear. And I think that prompt, because they're, you know, mostly English, natural language, as humans, we're much better suited to understand smaller, you know, sentences, paragraphs than larger ones. And as a result, we make better decisions.

  • Speaker #1

    It focuses important. We can only keep so much in our brain at the same time. And I guess the model is the same in a lot of ways. It also has a fixed capacity that whether you give it a larger problem or a smaller problem, there's only the same number of neurons in the neural network.

  • Speaker #2

    So I guess from a programming best practice, I mentioned, you know, do not repeat yourself. Try. where we can kind of like reuse these props. I discussed a bit about, you know, decomposition where one...

  • Speaker #1

    Hold on, let's go into that. So dry. So you're... A suggestion here is like you have a prompt that maybe comes up with hashtags. Do you think that prompt should be reused throughout your company like does he do or your product and in different ways like you would have one standard hashtag prompt like let's go into that concept a little bit here.

  • Speaker #2

    I think so. I think that as an organization or as an individual or as a business, you're spending money to build source code. You're spending money to build functions. You're spending money to build AI features. If you build it and it works, why wouldn't we reuse it? right engineering effort has already gone into it the the the nuances the scope all that kind of of hard work has already gone into it why aren't we reusing this kind of stuff so i'm i'm a big fan of prompt reusability i yeah

  • Speaker #1

    there could be reasons not to reuse it and like that would be like another important programming best practice um you which is like the single stakeholder rule or the single responsibility rule, which is that the code should only have one interested party or like have serve only one master. And like the reason is, is that like, let's say you have a prompt that does hashtags, you know, the hashtags that are needed by your company's marketing department might be very different than what's needed by your HR department for their recruiting. And so if you have one hashtag master prompt, it might not be serving the needs of these different departments very well. So that's another kind of interesting programming best practice that could bleed into prompt design.

  • Speaker #2

    Yeah, no, for sure. If there are two different business units that need very different hashtags. there's definitely a case there where, where we may need different functions or different prompts.

  • Speaker #1

    Here's the real challenge is that in the beginning, they might need the same prompt. And then, so like, let's say the same prompt works for both departments in the beginning. Then one of the departments unbeknownst to the other comes to you and is like, Hey, can we modify that prompt and like to serve our needs a little bit better? And suddenly The prompt is impacting this other department that just gets swept in the ring because you've reused a prompt. And then, you know, various bureaucratic processes pull it towards the marketing department and pisses off the HR department.

  • Speaker #2

    I think that's where versioning and source code control come in. I don't think these prompts should be single-version. I think there should be multiple versions, almost like Docker containers, right? Where we have the latest, or we can go back and check out a specific Docker container image. Very similar to that. So I guess another component here is single responsibility, which you just discussed, and keeping things small, which I think we also discussed. Is there anything else on this programming best-practices list?
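    A rough sketch of that Docker-style idea, assuming prompts are stored as immutable tagged versions with a movable latest pointer; the registry shape and names here are invented for illustration.

    ```python
    # Hypothetical prompt "registry": immutable tagged versions plus a latest alias,
    # loosely mirroring Docker image tags.
    PROMPT_VERSIONS = {
        "hashtags:v1": "Suggest five relevant hashtags for: {text}",
        "hashtags:v2": "Suggest five short, lowercase hashtags (no spaces) for: {text}",
    }
    LATEST = {"hashtags": "hashtags:v2"}

    def get_prompt(name: str, tag: str = "latest") -> str:
        key = LATEST[name] if tag == "latest" else f"{name}:{tag}"
        return PROMPT_VERSIONS[key]

    marketing_prompt = get_prompt("hashtags")        # follows latest -> v2
    hr_prompt = get_prompt("hashtags", tag="v1")     # pinned to the version HR validated
    ```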

  • Speaker #1

    I think we can dive a little further into decomposition. That's another element of this practice: decomposing things makes it a lot easier for the AI to execute and produce the result. And so all of these different programming best practices seem to apply exactly the same way in prompt engineering.
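    As a concrete, made-up illustration of decomposition: the one-mega-prompt-versus-several idea reduced to a small chain, where each step is one narrow prompt whose output feeds the next. The step names and the `call_llm` helper are assumptions, not anything prescribed in the episode.

    ```python
    # Hypothetical sketch: a decomposed pipeline of small prompts.
    # call_llm() is again an assumed stand-in for a real LLM client.

    def call_llm(prompt: str) -> str:
        raise NotImplementedError  # placeholder for the real model call

    def summarize(article: str) -> str:
        return call_llm(f"Summarize this article in three sentences:\n{article}")

    def extract_topics(summary: str) -> str:
        return call_llm(f"List the three main topics in this summary, comma-separated:\n{summary}")

    def write_headline(summary: str, topics: str) -> str:
        return call_llm(f"Write one headline covering these topics ({topics}):\n{summary}")

    def pipeline(article: str) -> str:
        summary = summarize(article)        # small, testable step 1
        topics = extract_topics(summary)    # small, testable step 2
        return write_headline(summary, topics)  # small, testable step 3
    ```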

  • Speaker #2

    They really do. And I think that there are benefits here, right? We're not just doing this because it's fun. We're not decoupling these prompts or these functions because it's fun. There's tangible, real benefit to it, and I think it's been proven throughout history. So let's just go down this list of benefits. Are smaller functions easier to test than larger ones? Let's start off there.

  • Speaker #1

    Smaller prompts, smaller functions, same thing, especially if the prompt output is short, because then I can just eyeball a bunch of outputs: good, good, good, oh, okay, that one was bad, good, good. So it's nice when the outputs are really small.
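    That eyeballing can also be turned into a cheap automated check. A sketch assuming the hypothetical `generate_hashtags` helper from earlier and a pytest-style test; the specific assertions are arbitrary examples of the kind of structure you might verify.

    ```python
    # Hypothetical test: short, structured outputs are easy to assert on.
    def test_generate_hashtags_shape():
        tags = generate_hashtags("We are launching an open-source prompt management tool.")
        assert 1 <= len(tags) <= 5                    # small, predictable output size
        assert all(" " not in tag for tag in tags)    # each tag is a single token
    ```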

  • Speaker #2

    What about building them? Are smaller prompts easier to build than really big ones?

  • Speaker #1

    Yeah, yeah. You get straight to the point and it usually works. You don't have to do anything messy, you know, when you have to copy an instruction three times: keep it short, no, keep it short, keep it shorter, until eventually it listens. When you have smaller prompts, it just seems to work a lot better.

  • Speaker #2

    And so now we have prompts that are really easy to test, and prompts that are really easy to build. What about monitoring? You mentioned eyeballing the results of small outputs. I assume that using a third-party library or a separate tool to do monitoring would also be a lot easier with smaller outputs and smaller prompts.

  • Speaker #1

    Yes, I think so. It's going to be a lot easier to feed that data around to other systems, to feed it into secondary models if you want to classify the output to check for error conditions or weird outputs. Absolutely. Whereas if you have one big prompt, there's going to be a bottleneck: you have to produce all this output and then feed all of it into a secondary model. It doesn't streamline very well.
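    A rough sketch of that flow, reusing the hypothetical `generate_hashtags` helper from earlier and an invented `classify_output` checker standing in for whatever secondary model or moderation step would actually be used.

    ```python
    # Hypothetical sketch: route each small output through a secondary check before use.
    import logging

    logger = logging.getLogger("prompt_monitoring")

    def classify_output(text: str) -> str:
        """Assumed secondary check: returns e.g. 'ok' or 'empty'.
        A real version might call a small classifier model here."""
        return "empty" if not text.strip() else "ok"

    def monitored_hashtags(text: str) -> list[str]:
        tags = generate_hashtags(text)             # small prompt, small output
        verdict = classify_output(", ".join(tags))
        if verdict != "ok":
            logger.warning("hashtag prompt flagged: %s", verdict)
        return tags
    ```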

  • Speaker #2

    One of the things I like to do a lot is iterate, and I think everybody does a lot of iteration when they build their prompts. They write the first initial prompt, see it work, then go back and rerun it and rerun it and rerun it until they see it really work well. And that troubleshooting, for me personally, seems to work a lot better when I'm working with smaller prompts, because there are fewer words to tweak. You mentioned make it shorter, make it shorter, no, make it shorter. It's a lot easier to just say it once with a small prompt.

  • Speaker #1

    Also, it just takes a lot less time, because if you have a big prompt and you're only trying to modify one section of it, you still have to rerun that whole giant output. These things are slow. That's the most annoying part. It's killing me right now how slow these things are, even with the turbo models. Yeah.

  • Speaker #2

    These large language models are not fast. They are slow. They're magical, but they're also pretty slow. They're damn slow. One of the techniques here to increase speed is fine-tuning. We could fine-tune our models. We could fine-tune classification models. We could fine-tune open-source models. And fine-tuning is also a lot easier to do when dealing with a very narrow subset, a very narrow scope of data.

  • Speaker #1

    Is it? Okay, so let's dive into that. What makes fine-tuning easier with smaller data?

  • Speaker #2

    If we take a look at binary classification, for example, or even multi-class classification, or let's say named entity recognition, all these use cases are focused on one very narrow scope of work: classify something for use case A, or extract something for use case B. We're not trying to boil the ocean here. We're doing one thing really, really well. And when we build these prompts in a really small way, we're able to log the input and output, monitor it, measure it, and ultimately collect enough data over time to fine-tune models that not only potentially operate a lot faster, but also probably do a better job from an accuracy perspective.
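    A small sketch of the logging half of that, assuming JSONL records that could later feed a fine-tuning job; the file name, record shape, and example values are invented.

    ```python
    # Hypothetical sketch: log each narrow prompt's input/output pair for later fine-tuning.
    import json

    def log_example(task: str, prompt_input: str, model_output: str,
                    path: str = "finetune_examples.jsonl") -> None:
        record = {"task": task, "input": prompt_input, "output": model_output}
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    # Over time these narrow, consistent examples become training data for a smaller,
    # faster fine-tuned model that can replace the general-purpose prompt, e.g.:
    # log_example("hashtags", "We are hiring a data engineer...", "#hiring, #dataengineering")
    ```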

  • Speaker #1

    Yep, absolutely. Everything's aligned with breaking the prompts apart.

  • Speaker #2

    So I guess to conclude, that ultimately leads to higher quality.

  • Speaker #1

    Let's touch really quickly on why people aren't following these principles. In what scenarios are people just trying to make these super mega prompts, or not breaking things apart in terms of requirements? What leads them to not follow these best practices?

  • Speaker #2

    First of all, I love the term mega prompt. I think that sometimes, as engineers, it's easy to be led in a certain direction, especially with the hype going on in the market right now and the different players and companies coming out with new technologies and new tools every week that promise the world. It's very easy to get into this space where we truly believe that everything's magic, and we try to make magic happen with one of these mega prompts. And I don't think this is very magical. I think this is just pure software engineering. As a result, I don't think we should be swayed too much by other people's marketing efforts. We should stay true to programming best practices, and that is to make sure the things we build are modular and functional in nature. I don't know if you feel the same way; I'd actually love to hear your take on it.

  • Speaker #1

    I think that actually it's interesting how much of our software best practices seem to also apply to prompt engineering. And I'm wondering, and we're kind of out of time here, but we should go into another episode and we should discuss some other software concepts like inheritance. Does inheritance apply to prompts? Give me your quick take before we close out here.

  • Speaker #2

    I think it does. We've already discussed schemas and how prompts, or how functions, should be able to consume data. You're putting me on the spot here with the inheritance question, but I definitely think that many programming best practices also play a role in prompt engineering.

  • Speaker #1

    Let's end it there. All right, man.

  • Speaker #2

    Brad, always a pleasure to chat with you. Take care. Until next time.

  • Speaker #1

    Great chatting with you, Justin. Have a good evening. Bye-bye.

  • Speaker #0

    Thank you for joining us. If you've enjoyed today's episode, hit subscribe and stay updated on our latest content. We appreciate your support.
