Coding with ChatGPT...the good, the bad, and the buggy

Human being curious

05 Jul 2024 — 6 min read

The more I learned about AI, the more I wanted to build things with it. There was just one problem—while I have a casual understanding of several programming languages, I don't have a deep computer science background or lots of syntax committed to memory. Generally speaking, I can envision more complex functionality than I can comfortably build on my own.

When I had a couple of ideas for AI-powered tools, I considered no-code development solutions like Bubble. But I like to know how and why things work, so I can fix them when they go wrong. That's why I decided to build my own code in collaboration with GPT-4. Especially at the beginning, this involved prompting the model to generate some rough code that I would then test, debug, and modify, asking lots of questions along the way. Any issue that I couldn't quickly work through with AI—and there were several—I would research using documentation and developer forums.

Today, I am close to completing version 1.0 of two web apps, both built with Django, Python, PostgreSQL, and JavaScript and incorporating Anthropic's Claude models via API. I did most of the work in about four weeks and learned a lot in the process—both about the technologies I'm working with and how to effectively "pair program" with large language models (LLMs).

The home screen for Good Bloggy, one of the AI-powered tools I've been working on. Logo art from Colorful Grays on Tess

This blog will summarize my experiences along with some helpful tips for getting good code from LLMs, especially if you're new to coding. Also, full disclosure, much of this material was produced by Good Bloggy, an AI wrapper that structures prompts into creative briefs.

1. Expect and embrace bugs.

When building apps with AI chatbots, encountering bugs is not just common—it's inevitable. Models like GPT-4 and Claude can produce code that looks correct at first glance but may contain subtle errors or inconsistencies. These issues can range from simple syntax errors to more complex logical flaws. According to a study by researchers at Stanford, programmers using AI assistants may actually produce code with more security issues and bugs.

And yet, as a fairly new programmer, I found working with AI was still worth it because it was just that much faster. To deal with the bugs, I suggest you:

Always test AI-generated code thoroughly before adding it to your project (see tip #7)
Use debugging tools and techniques to identify and fix issues (see tip #5)
Keep detailed notes on common errors and their solutions

And remember that every bug has something to teach you.

2. Look out for common AI-initiated bugs.

When working with AI-generated code, I noticed some patterns in the bugs I encountered. In particular, I found models were prone to including:

Random changes in variable or function names
Inconsistencies in code structure
Outdated or incompatible code snippets

I also noticed that AI models would become strangely attached to their errors. Even when asked to use, say, correct function names, they often persist in returning the same mistakes over multiple rounds of conversation. When this happened, it was often useful to start an entirely new thread, allowing AI to forget its previous sequence of responses.

3. Read error logs carefully.

While error logs can look intimidating at first, they can help you quickly find and fix bugs. Generally speaking, the better I could interpret error logs, the faster I was able to get code into a functional place.

I found Python error messages to be especially clear and informative. They typically include the error type, a brief description, and a traceback showing where the error occurred. The traceback provides a step-by-step path to how the error happened and directs you to the offending line of code.

Common error types in Python—all of which I've encountered—include:

SyntaxError: Occurs when the code violates Python's syntax rules
IndentationError: Results from incorrect indentation
NameError: Happens when you try to use a variable that hasn't been defined
TypeError: Arises when an operation is performed on an inappropriate data type

When working with AI-generated code, pay special attention to IndentationErrors and SyntaxErrors, as these are common issues in AI-produced Python snippets.

4. Don't skip the documentation.

AI models aren't always up-to-date with the latest developments in programming languages, frameworks, or APIs. This limitation can lead to challenges when integrating AI-generated code into your projects. In fact, even if you give models examples of correct and up-to-date code as part of your prompts, models often resist applying it. Or AI may confuse one framework or API for another.

For example, when I was trying to set up an API connection to Anthropic's Claude model, GPT-4 suggested message and response formats based on standards from OpenAI or older versions of Claude's documentation. I sorted this out by—you guessed it—checking out Anthropic's developer documentation and cookbook on GitHub. Ultimately, I ended up handling most of the API interfaces myself.

5. Treat AI as your debugging partner.

In my experience, asking AI directly to fix a bug often results in the model repeating the same bug verbatim, producing a new bug, or breaking something that does work. Instead of asking the AI for direct fixes, it's often more effective to use it as a debugging partner.

One helpful technique is using print statements to uncover issues. By strategically placing print statements throughout your code, you can track the flow of execution and identify where things go awry. It also helps to ask AI to write code as simply as possible. According to the old programming adage known as Kernighan's law, "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

Here's an example of how you can work through a bug with AI:

Me: "[Carefully reads error log.] I'm getting a TypeError in this function. [Provides code for function.] Can you help me understand why?"
AI: "Certainly. Let's add some print statements to track the variable types. [Provides code with print statements.]"
Me: [Runs code.] "Here's what appeared on the console while the function was running. [Shares results of print statements.]"
AI: "Based on the output, it looks like the variable is a string instead of an integer. Let's convert it using int() before the operation. [Shares updated code.]"
Me: "[Carefully reviews and tests the code.] Great, the new code works! Can you refactor it to be a simpler and shorter? And add comments explaining what each section does?"

This approach not only solves the immediate problem but also helps you understand why the bug occurred in the first place.

6. When all else fails, ask a human.

Generally speaking, if I can resolve a bug with AI, it will happen in five interactions or less. If you find yourself going in circles or repeating the same prompts to AI in the hopes of getting a better answer, it's time to take a break and research your problem through developer forums and other human-based sources.

Often, if you don't have a ton of experience with programming, all you'll need to do is search for your problem in forums like GitHub Discussions, Reddit, Stack Overflow, and others. This is because most common problems have been solved many times over. If you're lucky, you may even be able to find an existing library or step-by-solution.

For example, when I struggled to integrate an OpenAI voice into a SwiftUI app, I quickly addressed the issue by using a preexisting Swift package available on GitHub.

7. Test every single time you add AI-generated code to your project.

While AI can crank out code in seconds and save you minutes or hours, blindly cutting and pasting untested code into your project can seriously set you back. When ChatGPT hands me a new block of code, I check it to make sure it uses correct variable and function names and includes comments highlighting what each section does. Then I backup existing code, add the new code, and run a test.

This approach reduces the chances that the AI generated code will break something and prevents errors from compounding.

8. Don't ask AI to do more than one thing at a time.

Whenever I asked ChatGPT to fix multiple problems in one prompt (e.g., "Let's integrate a new text editor and add buttons for exporting to MS Word and PDF."), it would provide an unusable or incomplete response. Instead, giving AI one specific issue to address gets better results. This makes intuitive sense and is consistent with how human developers work.

Practice makes perfect

I've found using chatbots to write code is a great way to get started if you want to prototype an idea but have limited development skills. If you're patient, you will get your project done while learning a lot about different coding languages and how best to use them along the way.

I will be sharing my apps with human users within the next month or two, and I suspect this process will result in a new checklist of bugs to fix and features to build. When that happens, I'm sure I'll be building again with AI.

Reading list

Hackers Laws

The Top Bugs All AI Developer Tools Are Suffering From