From its initial appearance in the dev-tools space, GenAI has had an outsized impact on how developers approach day-to-day tasks (just ask any developer when they first started using GitHub Copilot).
While risks are still being evaluated, like the potential for introducing anti-patterns or inadvertently running afoul of compliance requirements, many engineering teams have successfully implemented GenAI with measurable gains in collaboration and productivity. In this blog we’ll examine the origins of GenAI, look at how organizations have successfully adapted it to their needs, and dig into its impact on productivity and developer experience.
The Rise of Generative AI in Development
Quite a few GenAI tools have already been meaningfully adopted in the dev tools space. The generative AI models that power them are defined by their ability to generate freeform, human-sounding text or code from a prompt. With these models under the hood, the tools cover a spectrum of use cases that broadly fall into two distinct buckets.
Auto and chat-complete AI
The first wave of GenAI tools was built to help with low-level tasks, like quickly auto-completing code based on context, using the “prompt completion” capability of the models noted above. These tools take some level of context (the current file, any open files in the editor, etc.) and make an intelligent guess at what the developer intends to write next.
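To make that concrete, here is a minimal sketch of how such a tool might assemble editor context into a single prompt. This is purely illustrative: the file names are invented, and the model call is a stub standing in for whatever completion API or hosted model a given product actually uses.

```python
# Illustrative sketch: assembling editor context for a code-completion request.
# The model call is a placeholder, not any particular vendor's API.

def build_completion_prompt(current_file: str, cursor_prefix: str, open_files: dict[str, str]) -> str:
    """Combine the code before the cursor with snippets from other open files."""
    context_blocks = [
        f"# File: {path}\n{contents[:500]}"  # truncate neighbors to stay within a token budget
        for path, contents in open_files.items()
    ]
    return (
        "\n\n".join(context_blocks)
        + f"\n\n# File: {current_file} (complete the code below)\n"
        + cursor_prefix
    )

def suggest_completion(prompt: str) -> str:
    """Placeholder for a call to a code-completion model."""
    return "    return [u for u in users if u.active]  # (model-generated suggestion)"

prompt = build_completion_prompt(
    current_file="service/users.py",
    cursor_prefix="def active_users(users):\n",
    open_files={"service/models.py": "class User:\n    active: bool\n"},
)
print(suggest_completion(prompt))
```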
Eventually these same models were tuned to provide longer-form, human-like responses to prompts, and were broadly adopted by developers for their ability to understand, debug, and write code. The most noteworthy early example is ChatGPT.
Task execution AI
In the other bucket of GenAI we find tools focused on higher-order tasks—like fully conversational agents tuned for collaboration, with access to their own tools and context. These agents can be built into tools that can directly access infrastructure telemetry to identify vulnerabilities based on error logs, or have read/write access to the filesystem to write applications from scratch (instead of just telling the user how to do so).
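As an illustration of the pattern (and not how any particular product is implemented), the core of such an agent is usually a loop that lets the model choose among a small set of tools, such as reading a log file or writing to disk. The sketch below uses a toy planner in place of the LLM so it runs end to end; the tool set, file names, and diagnosis are all hypothetical.

```python
from pathlib import Path

# Hypothetical tools an agent might be granted. Real products add sandboxing,
# guardrails, and human approval around anything that writes.
TOOLS = {
    "read_file": lambda path: Path(path).read_text(errors="replace")[:2000],
    "write_file": lambda path, content: f"wrote {Path(path).write_text(content)} chars",
}

def run_agent(task: str, plan_step, max_steps: int = 5) -> str:
    """Drive a simple tool-use loop. `plan_step` stands in for a model call that
    returns either {"tool": ..., "args": [...]} or {"answer": ...}."""
    transcript = [f"TASK: {task}"]
    for _ in range(max_steps):
        step = plan_step("\n".join(transcript))
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](*step["args"])
        transcript.append(f"TOOL {step['tool']} -> {result}")
    return "Stopped: step limit reached."

# Toy planner so the loop runs end to end; in practice this decision comes from the LLM.
Path("app.log").write_text("ERROR TLS handshake failed: certificate expired\n")

def toy_planner(transcript: str) -> dict:
    if "TOOL read_file" not in transcript:
        return {"tool": "read_file", "args": ["app.log"]}
    return {"answer": "The log points to an expired certificate; rotate it and redeploy."}

print(run_agent("Diagnose the elevated error rate", toy_planner))
```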
Increasingly, the lines between some of these categories have started to blur (some copilots now have chat capabilities, for example). But broadly, in ascending order from lower-order to higher-order tasks, some of the most popular tools in the dev tools space include:
GitHub Copilot: The original, and wildly popular, AI-powered code suggestions and automation tool, primarily built into IDEs like Microsoft’s VSCode.
Amazon CodeWhisperer: Amazon’s answer to GitHub Copilot, optimized for AWS services like EC2, Lambda, and S3.
GitLab Duo: AI-powered code suggestions and automation, with additional features that assist with code reviews, epic management, and more.
ChatGPT: A text-based, conversational AI tool; developers and engineering organizations increasingly pay for access to the more powerful GPT-4 to further enhance their workflows.
Google Gemini (formerly Bard): A text-based, conversational AI developed by Google, similar to ChatGPT, but with a tighter integration into Google’s product ecosystem.
Llama 2: An open-source large language model from Meta for various applications, including coding.
Perplexity: An LLM-powered search engine, combining traditional search with LLM-driven semantic understanding and summarization.
Auto-GPT: Self-described as "ChatGPT on steroids," it offers a high-level framework that can oversee multiple GPT instances and create reports summarizing “team” activities.
Devin, by Cognition Labs: Not yet released as of this blog’s publish date, but billed as “the first AI software engineer,” showing how these tools continue to move toward higher-level task coordination.
These tools all make (and largely, deliver on) the same promise to developers: they can accelerate the software development process. But how well are these tools serving the goals of improving developer productivity, and perhaps more importantly, the developer experience?
The promise for developer productivity
While still in their infancy, these tools have already had a monumental impact. A recent GitHub survey confirmed the widespread use of GenAI, while a McKinsey survey went a step further, quantifying that developers using GenAI completed coding tasks twice as fast. These developers were also twice as likely to report increased overall happiness, fulfillment, and flow state.
But what exactly is it about GenAI that's driving such significant productivity and satisfaction? Here are the advantages we see:
Automation of rote tasks: GenAI tools excel at automating repetitive and time-consuming tasks, like writing boilerplate code and tests. This automation frees developers to focus on more complex and creative aspects of software development, potentially increasing satisfaction and productivity.
Improved code quality: By providing code suggestions and highlighting potential errors, tools like Copilot and CodeWhisperer can improve the overall quality of code. This leads to more reliable software and fewer bugs, enhancing the debugging experience for the developer.
Enhanced problem-solving: GenAI tools bring a new dimension to problem-solving by offering a vast range of solutions and perspectives that a developer might not consider.
Collaboration by default: These tools act as an always-on collaborator, which can lead to more innovative solutions to complex problems than going it alone. They also give developers a “first-line” collaborator to work with before escalating to a human, enhancing the support they receive without overburdening colleagues.
Reduced context-switching: Tools that can provide critical context to developers needing answers quickly can have an enormous impact on "time to find," and "time to fix" (or more traditional measures like MTTR).
Of course, it's worth noting that GenAI does have some potential downsides companies should consider prior to implementation:
New classes of errors, including everything from generated code introducing security vulnerabilities to generated code that seems correct but doesn't align with business logic or user needs.
A false sense of security and correctness. There's always a risk that developers can over-rely on the AI's suggestions (i.e., skipping the step where they escalate to a human collaborator), leading to issues being overlooked during code review and testing phases.
The best strategy for mitigating these risks likely includes educating developers on each model’s and tool’s use cases and limitations, as well as firm usage playbooks maintained at the org level rather than by individuals.
How the most effective developer teams will adopt GenAI
So far, this blog has focused on present use cases, which primarily involve helping developers write executable code. However, we believe it will be the teams that can consistently apply these tools for repeatable meta-tasks in software development that will see the largest benefits.
If these tools can be treated as systems-level collaborators, or autonomous agents that can have meaningful parts of a task delegated to them, the step change can be significant. Elevating GenAI from a copilot that helps write a few lines of code to a collaborative agent that monitors previously subjective definitions of system health, suggests areas of improvement, and shortens otherwise lengthy issues, outages, or escalations will be a force multiplier for every developer.
Some example use cases might include:
Regulating design documents: Ensuring that infrastructure conforms to industry, organizational, security, and compliance standards by using an LLM’s ability to apply structured checks to semi-structured information (see the sketch after this list).
Infrastructure deployment: Automating processes for generating and interacting with infrastructure-as-code, templating, and scaffolding to support secure and reliable deployments.
Incident response: Automatically identifying incidents and suggesting fixes so developers can more quickly and confidently assess the right next steps without wasting time on information discovery or verification.
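As one hedged illustration of the first use case, a team might wrap an LLM call so that it grades a design document against an explicit checklist and returns machine-readable results. The checklist items below are invented for the example, and the model call is left as a placeholder for whichever API a team actually uses.

```python
import json

# Invented checklist items for illustration; real ones come from org policy.
CHECKLIST = [
    "All data stores are encrypted at rest",
    "Secrets are stored in a secrets manager, not in code",
    "Public endpoints sit behind an authenticated gateway",
]

def build_review_prompt(design_doc: str) -> str:
    items = "\n".join(f"- {c}" for c in CHECKLIST)
    return (
        "Review the design document against each checklist item.\n"
        'Respond with JSON: a list of {"item": str, "status": "pass|fail|unclear", "evidence": str}.\n\n'
        f"Checklist:\n{items}\n\nDesign document:\n{design_doc}"
    )

def review_design(design_doc: str, call_model) -> list[dict]:
    """`call_model` is a placeholder for whichever LLM API the team uses."""
    raw = call_model(build_review_prompt(design_doc))
    results = json.loads(raw)  # fail loudly if the model drifts from the JSON format
    return [r for r in results if r["status"] != "pass"]
```

The flagged items could then gate a CI check or be posted as review comments, keeping human sign-off in the loop.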
Looking Ahead
As we venture into the next 6 to 12 months, the landscape of GenAI in software development is poised for rapid evolution. To stay ahead of the curve, teams may want to monitor several key areas:
Toolchain evolution
Staying informed about what tools exist, and where they excel, is most of the challenge in monitoring how development toolchains evolve. Subscribing to developer newsletters, attending conferences, or simply staying genuinely curious are the best ways to stay in the know.
Interaction modes and prompt engineering
Beyond the tools, the way developers interact with AI tools is also evolving. From chat interfaces that offer conversational assistance to in-line code editor integrations that provide real-time suggestions, the interface landscape is diverse, with voice as a modality gradually becoming more common (e.g., Copilot, ChatGPT).
Building up the skill of how to prompt the tool (often referred to as prompt engineering) will pay dividends as the tools themselves become more powerful.
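As a hypothetical example of what that skill looks like in practice, compare the same request phrased as a vague one-liner and as a prompt that spells out role, constraints, and output format (the snippet of buggy code is invented for the example):

```python
# The same request, before and after deliberate prompt structure.
vague_prompt = "Fix this function."

structured_prompt = """You are reviewing a Python utility in a payments service.
Task: fix the bug in the function below and explain the change in one sentence.
Constraints:
- Keep the public signature unchanged.
- Do not add new dependencies.
Output format: the corrected function in a code block, then the explanation.

def total_cents(amounts):
    return sum(amounts) * 100  # amounts are already in cents
"""
```

The structured version gives the model the same things a human reviewer would want: context, scope, constraints, and the shape of a useful answer.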
Automation and productivity enhancements
Good automation is key for many software development workflows. Bringing infrastructure management and code generation tasks into this workstream, with the right AI tooling, will allow developers to improve reliability while reducing overhead.
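One hedged sketch of that pattern: treat generated infrastructure code like any other untrusted input and run it through existing validation before a human reviews or applies it. Here the generation step is a stub, and `terraform init`/`terraform validate` are used as the example gate (assuming Terraform is installed).

```python
import subprocess
import tempfile
from pathlib import Path

def generate_terraform(description: str) -> str:
    """Placeholder for an LLM call that drafts Terraform from a plain-language spec."""
    return 'resource "aws_s3_bucket" "logs" {\n  bucket = "example-logs-bucket"\n}\n'

def validated_draft(description: str) -> str:
    """Generate a draft, then gate it behind terraform init/validate before review."""
    draft = generate_terraform(description)
    with tempfile.TemporaryDirectory() as workdir:
        Path(workdir, "main.tf").write_text(draft)
        subprocess.run(["terraform", "init", "-backend=false"], cwd=workdir, check=True)
        subprocess.run(["terraform", "validate"], cwd=workdir, check=True)
    return draft  # still goes to code review; validation only catches syntax and provider errors
```

The point is the shape of the pipeline, not the specific tools: generated artifacts get the same (or stricter) automated checks as human-written ones before anyone signs off.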
Quality engineering and compliance
More GenAI tool use and functionality means more collaborators. And adding more proverbial “cooks in the kitchen” makes it even more important to enforce standards of use, including which AI-delegated tasks will still require consistent review and methodical iteration.
The integration of GenAI into software development is not just a trend; it's a fundamental shift that will reshape the industry. While the benefits of these tools are undeniable—from accelerating development cycles to enhancing code quality—they pose some unique and non-obvious challenges. As we look ahead, the key to getting the most out of these tools while avoiding potential pitfalls lies in adopting these technologies with a balanced approach that emphasizes security, reliability, and continuous learning.