Writing the right AI prompt for tools like ChatGPT can be the difference between unusable and usable code.
Whether it’s asking ChatGPT to refactor some code, or working alongside a dedicated coding assistant like GitHub Copilot, the capabilities of large language models (LLMs) are increasingly useful for software engineers. These tools can already help developers write, document, troubleshoot, and optimize their code, but only if you prompt them correctly.
We aren’t just talking about students and entry-level engineers tinkering with their side projects, either. Speaking at GitHub’s Universe conference last year, head of engineering at Shopify, Farhan Thawar, claimed that the company had already deployed almost a million lines of AI-written code.
Whether you’re using these tools professionally or not, it’s a good idea to learn how to write better AI prompts – known as prompt engineering – to make the most of LLM-based coding tools.
What are the four components of a good prompt?
To get good results from any AI tool, you need to give it a prompt that it can parse correctly. In his forthcoming book, AI-Assisted Programming: Using GitHub Copilot and ChatGPT for Coding, Tom Taulli identifies four main components of a good code prompt:
- Context sets the stage and specifies the role the AI is to play. For example, telling the AI to act like an experienced Python debugger, or a software engineer specializing in secure application development, will help push the tool in the right direction.
- Instructions are the meat of the prompt. They’re where you give the AI a single clear command, or an explanation of what you’d like it to do. Ideally, stick to one instruction per prompt. For example, you can ask it to write a function, summarize some text, recommend a bug fix, or carry out any other kind of coding task.
- Content is the information you want the AI to work on. If you are asking for some boilerplate code, you might not need to add any. However, if you are asking it to optimize a function, sort a list, or work with specific variables, you will need to provide the actual content. To separate it from the other parts of the prompt, it’s best to use delimiters like """ or ###.
- Format is a final request that tells the AI how you’d like the output presented. Depending on the prompt, you may want a table, a code snippet, a regular expression, or something else.
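As a rough illustration of how the four components fit together, here is a minimal Python sketch that assembles them into a single prompt string. The `build_prompt` helper, its parameter names, and the buggy `add` snippet are hypothetical and exist only for this example; they are not taken from Taulli’s book or from any particular tool.

```python
def build_prompt(context, instruction, content="", output_format=""):
    """Assemble a prompt from context, instruction, content, and format."""
    parts = [context.strip(), instruction.strip()]
    if content:
        # Wrap the content in delimiters so it stands apart from the instructions.
        parts.append('"""\n' + content.strip() + '\n"""')
    if output_format:
        parts.append(output_format.strip())
    return "\n".join(parts)


# Hypothetical snippet with a deliberate bug (subtracts instead of adding).
buggy_code = "def add(a, b):\n    return a - b"

prompt = build_prompt(
    context="You are an experienced Python debugger.",
    instruction="Identify the bug in this code snippet.",
    content=buggy_code,
    output_format="Please recommend an updated function that fixes the bug.",
)
print(prompt)
```

Leaving `content` or `output_format` empty simply drops that part, which matches the boilerplate case where no content is needed.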
Examples of generative AI prompts
If you are trying to debug a function, the prompt could be:
You are an experienced Python debugger. Identify the bug in this code snippet.
Code: “””print(“Hello World.”)”””
Please recommend an updated function that fixes the bug.
Or if you just need some boilerplate code, the prompt could be:
You are a WordPress plugin developer. Write the boilerplate necessary for a basic plugin.
What are the top tools for prompt engineering?
Coding-specific AI tools tend to be easier to prompt than general-purpose tools like ChatGPT. For example, since GitHub Copilot can be used in integrated development environments (IDEs) like Visual Studio and Visual Studio Code, it is able to pick up a lot more context from the rest of your project than a standalone tool. If you are working in a .py file, it understands that any code you ask it to generate needs to be Python, and since it also has access to your existing codebase, it can even call the correct variables and functions. Taulli typically uses the in-line prompting feature of Copilot when he wants AI-generated code.
On the other hand, if you are using Google Gemini or ChatGPT to generate code for a project, you will need to prompt it to use Python, as well as adapt any function and variable calls to fit the rest of your project. It’s far from an insurmountable problem if you just need help writing a few lines of code, but if you are looking for the easiest way to integrate AI into your workflow, dedicated tools are the way to go.
Be specific
Chatbots can be impressively conversational, which can make it tempting to send them vague instructions. While the LLM will respond as if it understands what you want, it is really just guessing based on the probabilities it has learned from the training data. Sometimes it will guess right, but sometimes the AI can go off on a complete tangent, or miss some key aspect of what you need.
For example, avoid vague prompts like:
- “Bug fix this code.”
- “Write a function that pulls phone numbers from text.”
- “Create a signup form.”
On the other hand, try something like:
Develop a Python function to parse phone numbers from a text string. The function should be able to identify cell phone numbers, toll-free numbers, and international numbers. If the phone number doesn’t have an international dialing code, it should add it. Include a script showing three examples of the function successfully extracting a phone number from a text string. Add comments to explain the logic of the function and give details on how to run the script.
This is far more likely to give you code that does what you want. Even if it doesn’t, the comments and examples will allow you to see where it has gone wrong so you can correct things.
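To make the target concrete, here is a hand-written sketch of the kind of function a prompt like this might produce. It is not actual model output. It assumes +1 as the default dialing code and only recognizes numbers with ten national digits, so real international formats would need a more thorough approach. The names `PHONE_PATTERN`, `DEFAULT_COUNTRY_CODE`, and `extract_phone_numbers` are made up for this example.

```python
import re

# Assumption for this sketch: numbers without a dialing code are US/Canada (+1).
DEFAULT_COUNTRY_CODE = "+1"

# Loose pattern: an optional "+" and 1-3 digit dialing code, followed by ten
# national digits grouped 3-3-4 with optional spaces, dots, dashes, or parentheses.
PHONE_PATTERN = re.compile(
    r"(?<!\d)(\+\d{1,3}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
)


def extract_phone_numbers(text):
    """Return every phone number found in text as "+" followed by digits only."""
    numbers = []
    for match in PHONE_PATTERN.finditer(text):
        raw = match.group(0)
        digits = re.sub(r"\D", "", raw)  # strip everything except digits
        if raw.startswith("+"):
            numbers.append("+" + digits)
        else:
            # No dialing code in the text, so prepend the assumed default.
            numbers.append(DEFAULT_COUNTRY_CODE + digits)
    return numbers


# Three examples: a cell number, a toll-free number, and an international number.
for sample in (
    "Call my cell on 555-123-4567.",
    "Our toll-free line is (800) 555-0199.",
    "The London office is on +44 207 946 0958.",
):
    print(sample, "->", extract_phone_numbers(sample))
```

Even in a sketch this small, the comments and examples make it easy to see which formats are covered and which are not.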
Work in steps
LLMs tend to work better if you use a single prompt to get them to do a single task, rather than giving them a multi-part prompt that requires multiple tasks.
For example, if you are using ChatGPT to write some code to handle your users updating their account settings, you could try asking it to write all the functions necessary for users to update their email address, password, and billing address in a single prompt. While ChatGPT will do its best, you’re asking it to perform so many tasks at once that something is likely to slip through the cracks.
On the other hand, if you break the problem down into multiple steps, you will get much better results from ChatGPT. Ask it to write a function that allows a user to update their email address and then, once that’s done, prompt it again for a function that allows a user to update their password, and it is far more likely to get each one right. You will also be able to steer it at each step, giving it the context needed to integrate with the rest of your project.
Taulli likens this approach to modular programming, where different functions of a program should be written as individual modules that are each responsible for a single feature. With AI coding tools, you can use the LLM to create a skeleton for your project, and then prompt it to fill in the necessary calls and functions one at a time until you have built what you need.
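The one-task-per-prompt idea can be sketched in a few lines of Python. Everything here is hypothetical: `run_in_steps` is an illustrative helper, and `fake_llm` is a stand-in for whatever function actually sends a prompt to your tool of choice and returns its reply.

```python
from typing import Callable, List


def run_in_steps(tasks: List[str], ask_llm: Callable[[str], str], context: str = "") -> List[str]:
    """Send each task as its own prompt instead of bundling them together."""
    responses = []
    for task in tasks:
        # Each prompt carries the shared context plus exactly one instruction.
        prompt = f"{context}\n{task}".strip()
        responses.append(ask_llm(prompt))
    return responses


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; echoes the instruction it was given."""
    return f"[response to: {prompt.splitlines()[-1]}]"


tasks = [
    "Write a function that lets a user update their email address.",
    "Write a function that lets a user update their password.",
    "Write a function that lets a user update their billing address.",
]
results = run_in_steps(tasks, fake_llm, context="You are a Python web developer.")
for result in results:
    print(result)
```

In practice you would review each response before moving on, which is where the opportunity to steer the model at each step comes from.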
Use multiple examples
In machine learning, a zero-shot learning task is one where an AI is asked to do something based only on its training data, with no additional examples. This is something that LLMs are getting pretty good at, but they still perform better on few-shot learning tasks, where they can draw on a few examples supplied in the prompt as well as their training data.
For example, you could prompt the LLM with something like “write a Python function that normalizes a given list of numbers” and get decent results. But Taulli says you will get even better results with a prompt that includes examples, such as:
Based on the following examples of normalizing a list of numbers to a range of [0, 1]:
1. Input: [2, 4, 6, 8] Output: [0, 0.3333, 0.6667, 1]
2. Input: [5, 10, 15] Output: [0, 0.5, 1]
3. Input: [1, 3, 2] Output: [0, 1, 0.5]
Generate a function in Python that takes a list of numbers as input and returns a list of normalized numbers.
Of course, this is a low-stakes example, but if you are asking an AI to write code that pulls information from your database, sign-up form, or any other specific source, you can get much better code by giving it enough relevant examples to work from.
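For reference, the three examples in that prompt are consistent with min-max normalization, where each value x becomes (x - min) / (max - min). A minimal sketch of the function such a prompt is aiming for might look like the following. Rounding to four decimal places and returning zeros when every value is identical are assumptions made for this sketch, not something the examples specify.

```python
def normalize(numbers):
    """Scale a list of numbers to the range [0, 1] using min-max normalization."""
    if not numbers:
        return []
    lowest, highest = min(numbers), max(numbers)
    if highest == lowest:
        # All values are identical, so avoid dividing by zero (assumed behavior).
        return [0.0 for _ in numbers]
    span = highest - lowest
    # Round to four decimal places to match the examples in the prompt.
    return [round((x - lowest) / span, 4) for x in numbers]


print(normalize([2, 4, 6, 8]))
print(normalize([5, 10, 15]))
print(normalize([1, 3, 2]))
```

The examples in the prompt double as test cases: if the generated function does not reproduce them, you know immediately that something has gone wrong.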
Ask for advice
As well as writing surprisingly decent code, most LLMs have been trained on a lot of help docs and how-to guides. As a result, they can be competent tutors if you are just starting out, learning a new programming language, or trying to expand your skills.
For example, as well as asking ChatGPT to generate boilerplate for a WordPress plugin, you can also ask it to explain what steps you need to take to integrate it with your website. Similarly, if you want it to write a script, you can ask for advice on what you need to do to run it. Or if you are integrating APIs from other services, you can ask what you need to do to get things working correctly.
Offload the awkward work
If you are trying to write a novel bit of code for an app you’re developing, prompting an LLM to write something usable might take more time than just writing it yourself. LLMs are great at solving the kinds of problems they’ve seen before, but not necessarily at developing brand-new code.
Even if you can’t rely on an LLM to write the most important code, there are often smaller, more boring, and generally awkward tasks you can offload, such as writing regular expressions and developing unit tests.
With a well-structured prompt, you can get ChatGPT or Copilot to write some regex that pulls off the exact text transformation you need. Similarly, by giving it some example code, you can ask it to create unit tests that check your app is functioning as expected.
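As an illustration of the kind of small task worth offloading, here is a sketch of a regex-based text transformation with a few tests alongside it. The task (rewriting MM/DD/YYYY dates as YYYY-MM-DD) and the names `DATE_PATTERN` and `to_iso_dates` are invented for this example. The tests are plain functions with `assert` statements, written so that a test runner such as pytest could collect them.

```python
import re

# Matches dates written as MM/DD/YYYY, capturing month, day, and year.
DATE_PATTERN = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")


def to_iso_dates(text):
    """Rewrite every MM/DD/YYYY date in text as YYYY-MM-DD."""
    return DATE_PATTERN.sub(r"\3-\1-\2", text)


def test_single_date():
    assert to_iso_dates("Due 03/15/2024.") == "Due 2024-03-15."


def test_multiple_dates():
    assert to_iso_dates("12/31/1999 and 01/01/2000") == "1999-12-31 and 2000-01-01"


def test_text_without_dates_is_unchanged():
    assert to_iso_dates("No dates here.") == "No dates here."
```

Note that the pattern only checks the shape of the date, not whether the month and day are valid, which is exactly the sort of gap a follow-up prompt or an extra test can expose.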
Tips and best practices for prompt engineering
- The code an LLM outputs is roughly an aggregate of the relevant code it was trained on. While it generally works, it sometimes isn’t very efficient, maintainable, or secure. However, this is a simple problem to solve. If you want code that is efficient, easy to maintain, or uses better security practices, ask the LLM to write it that way – or even to rewrite a block of code it has already generated.
- Be careful of creating cascading errors in larger projects. If you use an LLM to optimize and rewrite every function you encounter, you might accidentally change a key function call or delete a variable declaration that is relied upon elsewhere. All of a sudden your project will be throwing errors everywhere, and fixing them can be a massive pain.
- LLMs are only as good as their training data. With some languages like Python, they can generate really useful code. With older languages that aren’t as widely discussed online, like COBOL or FORTRAN, the quality of code they generate is likely to be lower.
- Every LLM has a cutoff date in its training data. For example, some GPT-4 models have a cutoff date of April 2023. They weren’t trained on any data after that, so any code they write is going to be based on what they saw before then. For most coding tasks, it’s not a problem, but you should be aware of this if you’re working with the latest features of fast-changing software packages and plugins.
With that in mind, Taulli says he’s had success giving ChatGPT the help docs for plugins it is unfamiliar with and then asking it to answer questions or write code based on them. If you’re struggling with poorly commented code or impenetrable docs, hand them over to ChatGPT and see if it can make sense of things for you.
An art and a science
Prompt engineering is both an art and a science. You need enough of an understanding of how LLMs work to steer them, but at the same time, you have to accept that there are no magic formulas that will work every time with every LLM. The best approach is to play around with your tool of choice, see what it can do, test the boundaries, and validate the results.